Onsite
Senior Software Engineer, Site Reliability Engineering, Distributed Cloud - Google
Software Engineer
This Senior Software Engineer role in Site Reliability Engineering focuses on building and operating large‑scale, fault‑tolerant distributed cloud services using Python, Go, Kubernetes, and modern cloud platforms.
About the role
Key Responsibilities
- Design, develop, and maintain highly available services that power large‑scale cloud infrastructure.
- Implement automation, monitoring, and alerting solutions to improve reliability and reduce manual toil.
- Collaborate with product and engineering teams to define service level objectives (SLOs) and error budgets.
- Lead incident response, root‑cause analysis, and post‑mortem processes for production issues.
- Mentor junior engineers and provide technical leadership across multiple projects.
Requirements
- Bachelor’s degree in Computer Science or a related field, with 5+ years of software development experience.
- 3+ years designing, analyzing, and troubleshooting large‑scale distributed systems.
- Proficiency in at least one programming language such as Python or Go.
- Hands‑on experience with container orchestration (e.g., Kubernetes) and cloud platforms.
- Demonstrated ability to lead projects and drive reliability improvements.