onsite
Staff AI Operations Architect - Checkr
Systems Engineer
Lead the design and execution of scalable AI operations infrastructure on AWS, orchestrating Kubernetes clusters, ML‑ops pipelines, and automated monitoring to ensure high availability, security, and compliance for enterprise‑grade AI verification services.
About the role
Key Responsibilities
- Architect and maintain an end‑to‑end AI operations platform on AWS, leveraging Kubernetes, Terraform, and CI/CD pipelines.
- Design and implement scalable data pipelines for model training, inference, and monitoring, ensuring low latency and high throughput.
- Define and enforce security, compliance, and governance policies across all AI workloads.
- Collaborate with data science, product, and engineering teams to translate business requirements into robust, production‑ready solutions.
- Lead incident response, performance tuning, and capacity planning for mission‑critical AI services.
Requirements
- 10+ years of experience in cloud architecture, with deep expertise in AWS services (EKS, ECS, S3, Lambda, SageMaker).
- Proven track record building and scaling Kubernetes‑based ML‑ops platforms.
- Strong knowledge of data engineering, ETL, and real‑time streaming technologies.
- Experience with security best practices, compliance frameworks (SOC, ISO, GDPR), and automated policy enforcement.
- Excellent communication skills and the ability to mentor cross‑functional teams.