onsite

Staff Site Reliability Engineer, Cloud Management - Palo Alto Networks

Site Reliability Engineer

Lead the design, deployment, and operation of scalable cloud infrastructure, ensuring high availability, performance, and security for enterprise‑grade services using Kubernetes, AWS, and Terraform.

About the role

Key Responsibilities

Architect and maintain highly available, scalable cloud environments for mission‑critical applications.
Implement and manage Kubernetes clusters, CI/CD pipelines, and infrastructure as code with Terraform.
Design and enforce observability, monitoring, and alerting strategies to detect and resolve incidents quickly.
Collaborate with development, security, and product teams to embed reliability best practices into the software delivery lifecycle.
Lead post‑mortem analyses, root‑cause investigations, and continuous improvement initiatives.

Requirements

10+ years of experience in site reliability engineering or related roles.
Deep expertise in AWS, Kubernetes, and Terraform.
Strong scripting skills (Python, Bash) and familiarity with CI/CD tools (GitHub Actions, Jenkins).
Proven track record of building and operating large‑scale, highly available systems.
Excellent communication skills and a collaborative mindset.

Skills

Kubernetes, AWS, Terraform

Company: Palo Alto Networks

Department: Engineering

Location: Santa Clara, California, United States

Experience: 7+ years

Tenure: Full-time

Level: Lead

Salary: 169,225

Posted June 23, 2026