remote
Sr. Associate, Site Reliability Engineering - McKesson
Software Engineer
Senior Associate Site Reliability Engineer driving reliability, automation, and performance for cloud-native healthcare services using Kubernetes, Docker, CI/CD pipelines, AWS, and Python-driven monitoring solutions.
About the role
Key Responsibilities
- Design, implement, and maintain highly available, scalable infrastructure for mission‑critical healthcare applications on AWS.
- Build and manage Kubernetes clusters, container orchestration, and CI/CD pipelines to accelerate deployment cycles.
- Develop and maintain an observability stack (Prometheus, Grafana, ELK) for real‑time monitoring, alerting, and incident response.
- Automate configuration, provisioning, and scaling using IaC tools (Terraform, CloudFormation) and scripting (Python, Bash).
- Collaborate with development, security, and product teams to embed reliability best practices into the software development lifecycle.
Requirements
- 5+ years of experience in Site Reliability Engineering or DevOps roles.
- Proficiency with Kubernetes, Docker, and cloud-native tooling.
- Strong scripting skills in Python and experience with IaC (Terraform, CloudFormation).
- Hands‑on experience with AWS services (EC2, EKS, S3, CloudWatch).
- Excellent problem‑solving, communication, and teamwork abilities.
Skills
Kubernetes, Docker, CI/CD, AWS, Python