Remote
Senior Compute Platform Engineer - Zeta Global
DevOps Engineer
Senior Compute Platform Engineer responsible for designing, building, and operating scalable cloud‑native infrastructure that powers a large AI‑driven marketing platform, leveraging Kubernetes, Docker, and AWS services.
About the role
Key Responsibilities
- Architect, develop, and maintain highly available, container‑based compute platforms on AWS to support data‑intensive AI workloads.
- Design and implement CI/CD pipelines using Terraform, Jenkins/GitHub Actions, and automated testing frameworks.
- Collaborate with data engineering, ML, and product teams to optimize resource utilization, latency, and cost.
- Monitor, troubleshoot, and resolve performance and reliability issues across Kubernetes clusters and supporting services.
- Establish best practices for security, observability, and disaster recovery in a multi‑tenant environment.
Requirements
- 5+ years of experience building and operating large‑scale cloud infrastructure, preferably on AWS.
- Strong proficiency in Python and Java for automation and service development.
- Deep hands‑on experience with Kubernetes, Docker, and infrastructure‑as‑code tools such as Terraform.
- Proven track record implementing CI/CD pipelines and automated testing at scale.
- Solid understanding of networking, security, and monitoring tools (e.g., Prometheus, Grafana, CloudWatch).
Skills
python, java, kubernetes, docker, aws, terraform, cicd