Remote
Senior Site Reliability Engineer - Morningstar
Site Reliability Engineer
Senior Site Reliability Engineer driving reliability and scalability of investment data platforms using Kubernetes, AWS, Terraform, and automation with Python and Go.
About the role
Key Responsibilities
- Design, implement, and operate highly available, scalable services that process and deliver investment data across global pipelines.
- Develop and maintain infrastructure-as-code using Terraform and automate deployments with CI/CD pipelines.
- Build observability solutions (metrics, logging, tracing) and lead incident response and post‑mortem processes.
- Collaborate with data engineering, product, and analyst teams to embed reliability, performance, and security best practices into new and existing systems.
- Drive continuous improvement through capacity planning, performance tuning, and adoption of cloud‑native technologies on AWS.
Requirements
- 5+ years of SRE or DevOps experience in large‑scale, data‑intensive environments.
- Strong programming skills in Python and Go for automation and tooling.
- Deep expertise with Kubernetes, container orchestration, and cloud platforms (AWS).
- Proficiency in infrastructure‑as‑code (Terraform) and CI/CD frameworks (Jenkins, GitLab CI, or similar).
- Hands‑on experience with monitoring, alerting, and incident management tools (Prometheus, Grafana, ELK, PagerDuty).
Skills
Python, Go, Kubernetes, AWS, Terraform, CI/CD