THE ROLE
We're looking for a Senior Platform Engineer / SRE who can own complex infrastructure work, drive IaC and GitOps architecture, and set the standard for how we automate and operate systems at scale. You'll tackle hard problems such as multi-tenant isolation, self-service infrastructure, and reliability engineering, and you'll have the scope to solve them properly.
This is not a ticket-processing role. Senior engineers here identify problems before anyone asks them to, make architectural calls, mentor engineers, and raise the ceiling on what the platform can do.
WHAT YOU'LL WORK ON
- Lead IaC architecture — Terraform module design, state management, multi-account patterns, and setting the standards the rest of the team builds against
- Drive GitOps at scale — ArgoCD configuration, progressive delivery patterns, promotion workflows, and deployment reliability across multiple environments and tenants
- Architect and operate multi-tenant Kubernetes infrastructure on AWS EKS — tenant isolation, workload placement, cluster topology, and long-term scalability strategy
- Build self-service infrastructure automation — provisioning pipelines, configuration management, and platform capabilities that engineering teams can consume without manual intervention
- Apply agentic coding tools to infrastructure work — scaffolding new environments, generating and reviewing IaC, accelerating automation, and establishing patterns for the team
- Own reliability — SLO definitions, error budgets, incident response quality, and the feedback loop that turns incidents into platform improvements
- Set observability standards — trace coverage, alert quality, on-call ergonomics, and runbook culture
- Partner with security on zero-trust architecture, secrets management at scale, and infrastructure hardening
- Contribute to the technical roadmap and help the team prioritize the right work
- Mentor mid-level engineers — code review, design feedback, on-call shadowing
WHAT WE'RE LOOKING FOR
- 6+ years in platform engineering, SRE, or infrastructure — with meaningful time operating production systems at scale
- Deep IaC expertise — you design Terraform architectures, not just write modules; you've managed complex state and multi-account configurations in production
- Strong GitOps background — you understand declarative infrastructure management at depth and have opinions on how to do it well
- Deep Kubernetes knowledge — you've operated clusters in production, dealt with real failure modes, and understand the system at the control plane level
- Strong AWS background — networking, compute, IAM, storage, multi-account design
- Experience with multi-tenant infrastructure — isolation patterns, noisy neighbor mitigation, and tenant lifecycle management
- Automation-first thinking at a senior level — you design systems that elimi