remote

Site Reliability Engineer - Metova Federal

The Site Reliability Engineer designs, deploys, and maintains highly available, scalable infrastructure on AWS, using Kubernetes, Docker, Terraform, and monitoring tools to support mission‑critical federal applications.

About the role

Key Responsibilities

Design, implement, and manage scalable, highly available Kubernetes clusters on AWS for mission‑critical workloads.
Automate infrastructure provisioning and configuration using Terraform, ensuring repeatable and auditable deployments.
Implement CI/CD pipelines with GitHub Actions or Jenkins to streamline application releases and rollbacks.
Monitor system health with Prometheus, Grafana, and CloudWatch, proactively identifying and resolving performance bottlenecks.
Collaborate with development teams to enforce best practices for observability, security, and cost optimization.

Requirements

3+ years of experience in site reliability or DevOps roles, preferably in federal or defense environments.
Proficient with Kubernetes, Docker, and AWS services (EKS, EC2, S3, CloudWatch).
Hands‑on experience with Terraform, CI/CD tooling, and monitoring/alerting stacks.
Strong scripting skills in Bash or Python for automation.
Excellent problem‑solving skills and ability to work in a fast‑paced, mission‑critical setting.

Skills

Kubernetes, Docker, AWS, Terraform

Company: Metova Federal

Department: Engineering

Location: Orlando, FL, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 19, 2026