IT: Infrastructure
Observability SRE
Associate
|
Job Description
|
Job/Group Overview:
Nomura is a global financial services group with an integrated global network spanning over 30 countries. Japan IT (Information Technology) is a diverse environment with employees of over 25 nationalities, who work on technical support, application development and implementation of system changes for Japan Retail Wealth Management Business and Global Wholesale (Global Markets and Investment Banking). Nomura provides competitive employee benefits, training and upskilling opportunities, and is committed to promoting diversity, equity and inclusion, employee health and well-being.
The Japan Observability SRE team sits within the Group Platform Services & Engineering division, which provides Nomura Group common services to Development, Infrastructure and Production Services. This is a technical position responsible for the support, operation, enhancement and integration of the Observability platform. The successful candidate will have a vital role in shaping future Observability strategy and direction within the Nomura Group.
The candidate must have native-level Japanese and be fluent in English.
This is a fantastic opportunity for somebody with 3+ years of experience to work with state-of-the-art technologies and deliver industry-leading solutions in the Telemetry, Observability and Monitoring space (known internally as TOM). The successful candidate will join a team of enthusiastic, forward-thinking SREs and engineers who are working to radically transform how the Nomura Group manages the operation of its estate. This will evolve into a full AIOps capability with automated anomaly detection, automated impact and root cause analysis, and machine-learning-generated resolutions. This is a global team of 20 members bringing change across the organisation.
The individual will be a part of our Japan based Observability SRE team and shall be responsible for the support & operational engineering for the various tools and technologies that make up our Observability suite. The candidate will work closely with their peers in other regions as well as other development teams to facilitate the strategic objectives of the team.
The Observability platform consists of vendor, open-source and in-house tools.
These tools include Grafana UI, Loki, Mimir, Tempo, Sloth, RightITNow, EverBridge and OpenTelemetry. Experience of Observability principles and the Grafana toolset is a must. Our solutions are deployed on a Linux backend, so intermediate knowledge of Linux is also a must. We also expect the candidate to have experience of development and engineering in some capacity.
The successful candidate must be a quick learner, able to understand and support a complex production ecosystem. To that end, they should be familiar with best practices for managing releases, changes, incidents and requests. We expect the individual to draw on their skills and production support experience to solve problems while minimising impact to production.
We are looking for an individual who can be innovative and find opportunities to improve, whether through process improvement or automation. They need to be self-motivated, able to motivate others, and a driver of change. They also need to be a team player who fosters a healthy and conducive environment in which everyone is supportive and respectful of others' opinions. |
|
Responsibilities: |
- Supporting the Observability tools, including Grafana UI, Loki, Mimir and Tempo
- Supporting a large user base as they manage their transition to modern observability tools and adoption of OpenTelemetry
- Managing updates, releases and testing in both production and non-production environments
- Driving adoption of best practices in Observability across the organisation
- Contributing to Observability standards and procedures
- Reporting locally to the Japan Observability Lead
- Reporting functionally to the Head of the Observability SRE Team
|
|
Requirements
|
Requirements: |
Mandatory: |
- Experience in supporting large enterprise systems (3+ years)
- Passionate about providing high-quality deliverables and learning new technologies
- Strong and confident communicator with good interpersonal skills
- Experience of collaboration tools such as Confluence / JIRA / Microsoft Teams & Office 365
- Able to take the initiative to investigate and follow up with various stakeholders to resolve issues
- Solid understanding of release, deployment, and change management processes
- Self-motivated individual, quality and improvement focused
- Self-starter and able to self-manage
- Must be able to take initiative to keep own skills up to date and to maintain awareness of current technology developments
- Good team player, able to work on a local, regional and global basis and as part of joint cross-location initiatives
|
Preferred: |
- SRE experience
- Experience with Observability tools such as OpenTelemetry, Grafana UI, Mimir, Loki, Tempo, Grafana Agent and Prometheus
- Experience of Continuous Integration / Deployment via DevOps tools such as GitLab, Jenkins, Ansible, Nexus etc.
- Experience supporting a medium- to large-scale production environment
- Knowledge of ITIL
- Decent understanding of DB platforms (Sybase / MySQL / MSSQL), general RDBMS concepts and SQL
- Knowledge of operating system fundamentals including monitoring of IO, Networks, CPU and Memory
- Basic knowledge of / familiarity with other infrastructure technologies such as Middleware, Web servers, Load balancers, System Services etc.
|
|
Location
|
Toyosu