Remote
Site Reliability Engineer III - eBay
Site Reliability Engineer
Senior Site Reliability Engineer responsible for designing, automating, and operating scalable, highly available services on AWS using Kubernetes, Terraform, and modern CI/CD pipelines.
About the role
Key Responsibilities
- Design, implement, and maintain highly available, fault‑tolerant services on AWS infrastructure.
- Develop automation and orchestration solutions using Python, Go, Terraform, and Kubernetes.
- Build and improve CI/CD pipelines to accelerate delivery while ensuring reliability and security.
- Monitor system performance, troubleshoot incidents, and lead post‑mortem analyses.
- Collaborate with development, product, and security teams to embed reliability best practices throughout the software lifecycle.
Requirements
- 5+ years of experience in site reliability or production engineering roles.
- Strong programming skills in Python and Go.
- Deep experience with AWS services, Kubernetes, and infrastructure‑as‑code tools such as Terraform.
- Proficiency in Linux system administration and networking concepts.
- Hands‑on experience building CI/CD pipelines and implementing observability (monitoring, logging, tracing).
Skills
Python, Go, Kubernetes, AWS, Terraform, CI/CD, Linux