onsite
Site Reliability Engineer - RBC
The Site Reliability Engineer builds and maintains highly available cloud services, automates infrastructure with Ansible, develops tooling in Golang, and supports production systems through monitoring, incident response, and performance optimization.
About the role
Key Responsibilities
- Design, implement, and operate scalable, fault‑tolerant services on cloud platforms.
- Develop automation scripts and tools using Golang and Ansible to streamline provisioning, configuration, and deployment.
- Monitor system health, respond to incidents, and perform root‑cause analysis to improve reliability.
- Collaborate with development teams to integrate observability, logging, and alerting into applications.
- Maintain Linux infrastructure, manage container orchestration, and apply security best practices.
Requirements
- 3+ years of experience in site reliability, DevOps, or systems engineering.
- Proficiency with Linux administration, cloud services (AWS, Azure, or GCP), and container technologies.
- Strong programming skills in Golang and familiarity with front‑end frameworks such as React for internal tooling.
- Hands‑on experience with configuration management tools, especially Ansible.
- Solid understanding of networking, monitoring, and incident management processes.