remote
Staff AI TechOps Lead - 1Password
Software Engineer
Lead AI infrastructure and operations, designing scalable, secure cloud services with Kubernetes, CI/CD pipelines, and Python automation to support enterprise security and productivity solutions.
About the role
Key Responsibilities
- Architect and maintain highly available AI and machine‑learning workloads on AWS, ensuring performance, scalability, and cost efficiency.
- Design and implement CI/CD pipelines for model training, deployment, and monitoring using Kubernetes and GitOps practices.
- Collaborate with security, product, and data teams to embed robust identity, access, and data‑privacy controls across all AI services.
- Lead incident response and capacity planning, proactively identifying bottlenecks and optimizing resource utilization.
- Mentor and grow a high‑performing TechOps team, fostering a culture of continuous improvement and automation.
Requirements
- 10+ years of experience in cloud operations, with deep expertise in AWS, Kubernetes, and container orchestration.
- Proven track record of building and scaling AI/ML pipelines, including model training, inference, and monitoring.
- Strong scripting skills in Python and experience with CI/CD tools such as GitHub Actions or Argo CD.
- Solid understanding of security best practices, identity management, and compliance frameworks.
- Excellent communication skills and a collaborative mindset to work across engineering, product, and security teams.
Skills
Kubernetes, CI/CD, Python, AWS