remote

Senior Site Reliability Engineer (SRE) - Tradeweb

Site Reliability Engineer

Senior Site Reliability Engineer responsible for highly available, scalable infrastructure for a global electronic trading platform, using Kubernetes, Docker, AWS, and Terraform to maintain 99.99% uptime and respond rapidly to incidents.

About the role

Key Responsibilities

Design, implement, and maintain highly available, scalable infrastructure for a global electronic trading platform.
Lead incident response, root‑cause analysis, and post‑mortem documentation to continuously improve reliability.
Automate deployment pipelines with CI/CD tools, Terraform, and container orchestration (Kubernetes).
Monitor system health using Prometheus, Grafana, and custom alerts; optimize performance and cost.
Collaborate with development, security, and product teams to embed reliability best practices into the software lifecycle.

Requirements

5+ years of SRE or DevOps experience in a high‑frequency trading or financial services environment.
Proficient with Kubernetes, Docker, and cloud platforms (AWS preferred).
Strong scripting skills (Python, Bash) and experience with IaC tools (Terraform, CloudFormation).
Hands‑on experience with monitoring, alerting, and incident management tools (Prometheus, Grafana, PagerDuty).
Excellent communication, problem‑solving, and collaboration skills.

Skills

Kubernetes, Docker, AWS, Terraform

Company: Tradeweb

Department: Engineering

Location: United States

Experience: 5+ years

Tenure: Full-time

Level: Senior

Salary: 240,000

Posted June 19, 2026