On-site
Platform Engineer - Outmarket AI
DevOps Engineer
Platform Engineer responsible for building and maintaining a reliable, high‑performance, multi‑tenant AI platform on GCP, using Kubernetes, Terraform, CI/CD pipelines, observability tools, and Temporal orchestration to empower developers and optimize cost and performance.
About the role
Key Responsibilities
- Own and evolve the Kubernetes (GKE) cluster, ensuring high availability, scalability, and security for the AI platform.
- Design, implement, and maintain infrastructure-as-code with Terraform across the GCP footprint.
- Build and operate end‑to‑end CI/CD pipelines, integrating automated testing, linting, and deployment for rapid feature delivery.
- Implement comprehensive observability: logs, metrics, and traces to monitor performance, detect anomalies, and drive continuous improvement.
- Leverage Temporal for complex workflow orchestration, ensuring reliability and fault tolerance of long‑running AI processes.
- Collaborate with cross‑functional teams to define best practices, tooling, and documentation that enhance developer experience and productivity.
Requirements
- 3+ years of experience as a Platform or DevOps Engineer in a cloud‑native environment.
- Proficiency with Kubernetes (GKE), Terraform, and GCP services (Compute Engine, Cloud Storage, Pub/Sub, etc.).
- Hands‑on experience building CI/CD pipelines (GitHub Actions, Cloud Build, or similar) and implementing observability stacks (Prometheus, Grafana, Loki, Jaeger).
- Strong scripting skills (Python, Bash) and familiarity with workflow orchestration tools like Temporal.
- Excellent problem‑solving abilities, communication skills, and a passion for building reliable, scalable systems.
Skills
Kubernetes, Terraform, GCP, CI/CD