remote

Sr Site Reliability Engineer US Federal - Workday

Site Reliability Engineer

Senior Site Reliability Engineer driving reliability, scalability, and automation for federal cloud services using AWS, Kubernetes, Terraform, and Python, while leading incident response and continuous improvement initiatives.

About the role

Key Responsibilities

Design, implement, and maintain highly available, secure, and scalable infrastructure on AWS for federal workloads.
Automate deployment pipelines and configuration management using Terraform, Kubernetes, and CI/CD tools.
Develop and maintain monitoring, alerting, and logging solutions to ensure 99.9% uptime and rapid incident resolution.
Lead incident response, root cause analysis, and post‑mortem documentation to drive continuous improvement.
Collaborate with development, security, and compliance teams to enforce best practices and regulatory requirements.

Requirements

5+ years of SRE or DevOps experience in a cloud‑native environment.
Proficiency with AWS services (EC2, RDS, S3, IAM, CloudWatch) and Kubernetes orchestration.
Strong scripting skills in Python and experience with Terraform for IaC.
Hands‑on experience with monitoring tools (Prometheus, Grafana, Datadog) and log aggregation.
Excellent problem‑solving, communication, and collaboration skills in a fast‑paced environment subject to federal compliance requirements.

Skills

AWS, Kubernetes, Terraform, Python

Company: Workday

Department: Engineering

Location: Reston, VA, United States

Experience: 5+ years

Tenure: Full-time

Level: Senior

Salary: 264,000

Posted June 19, 2026