Remote

Sr. SRE II - Filevine

Site Reliability Engineer

Senior Site Reliability Engineer driving reliability, automation, and scalability for a high‑growth Legal AI platform using Kubernetes, Docker, AWS, and modern observability tools.

About the role

Key Responsibilities

Design, implement, and maintain highly available, scalable infrastructure for a cloud‑native Legal AI platform.
Lead incident response, root‑cause analysis, and post‑mortem documentation to continuously improve reliability.
Build and maintain CI/CD pipelines, infrastructure as code (Terraform), and container orchestration (Kubernetes).
Implement monitoring, alerting, and observability with Prometheus, Grafana, and custom dashboards.
Collaborate with development, security, and product teams to enforce best practices and drive automation.

Requirements

5+ years of SRE or DevOps experience in a fast‑paced, cloud‑first environment.
Proficiency with Kubernetes, Docker, and AWS services (EKS, EC2, S3, CloudWatch).
Hands‑on experience with Terraform, CI/CD tooling (GitHub Actions, Jenkins, ArgoCD), and scripting (Python, Bash).
Strong knowledge of monitoring, alerting, and incident management tools (Prometheus, Grafana, PagerDuty).
Excellent communication skills and a collaborative mindset.

Skills

Kubernetes, Docker, CI/CD, AWS, Terraform, Prometheus, Grafana

Company: Filevine

Department: Engineering

Location: United States

Experience: 5+ years

Tenure: Full-time

Level: Senior

Posted June 20, 2026