Remote
Principal Platform Engineer, Observability - Palo Alto Networks
DevOps Engineer
Lead the design and delivery of a scalable observability platform for cloud‑native security services, leveraging Go, Python, Kubernetes, and modern monitoring tools such as Prometheus and Grafana on AWS.
About the role
Key Responsibilities
- Architect and build a high‑performance, multi‑tenant observability platform that ingests, stores, and visualizes telemetry from security products.
- Define data models, pipelines, and alerting frameworks using Prometheus, Grafana, and custom exporters.
- Drive automation and infrastructure‑as‑code practices on AWS, employing Kubernetes, Helm, and CI/CD pipelines.
- Collaborate with product, security, and SRE teams to ensure reliability, low latency, and scalability of monitoring services.
- Mentor engineering teams, establish best practices, and contribute to open‑source observability components.
Requirements
- 10+ years of software engineering experience, with at least 5 years building large‑scale observability or monitoring platforms.
- Strong proficiency in Go and Python for backend services and data processing.
- Deep experience with Kubernetes-based container orchestration and cloud services (AWS preferred).
- Hands‑on expertise with Prometheus, Grafana, OpenTelemetry, and related metrics/trace collection tools.
- Proven track record of designing highly available, low‑latency systems and leading technical teams.
Skills
Go, Python, Kubernetes, Prometheus, Grafana, AWS