remote

Senior Site Reliability Engineer - Adobe

Site Reliability Engineer

Senior Site Reliability Engineer responsible for building and maintaining scalable, highly available HTTP APIs and infrastructure for a creative AI platform, leveraging Kubernetes, Docker, CI/CD pipelines, AWS, and observability tools to ensure reliability and performance.

About the role

Key Responsibilities

Design, implement, and operate highly available Kubernetes clusters that host the Graph platform’s HTTP APIs and microservices.
Build and maintain CI/CD pipelines using GitHub Actions, Terraform, and Docker to automate deployments across multiple environments.
Implement observability with Prometheus, Grafana, and Loki, creating dashboards, alerts, and incident response playbooks.
Collaborate with backend and frontend teams to optimize API performance, reduce latency, and enforce security best practices.
Lead capacity planning, load testing, and cost optimization initiatives on AWS.
Mentor junior engineers and contribute to the SRE knowledge base and tooling improvements.

Requirements

5+ years of experience in site reliability or DevOps roles, with a strong focus on cloud-native technologies.
Proficient with Kubernetes, Docker, and Helm; hands‑on experience with Terraform or CloudFormation.
Deep knowledge of AWS services (EKS, EC2, S3, CloudWatch, IAM) and experience building scalable, secure APIs.
Strong scripting skills in Python or Go, and familiarity with CI/CD tooling.
Excellent problem‑solving, communication, and collaboration skills in a fast‑paced, cross‑functional environment.

Skills

kubernetes, docker, cicd, aws, python, terraform, prometheus

Company: Adobe

Department: Engineering

Location: San Jose, CA, United States

Experience: 9+ years

Tenure: full-time

Level: Senior

Salary: 301,600

Posted June 20, 2026