onsite
DevOps Engineer - Agentic AI Platform - Advisor360 LLC
DevOps Engineer
Lead the engineering of a scalable Agentic AI platform, designing and maintaining CI/CD pipelines, container orchestration, and cloud infrastructure on AWS. Drive automation, reliability, and performance for AI workloads.
About the role
Key Responsibilities
- Design, implement, and maintain CI/CD pipelines for AI model training and deployment using Git, Jenkins, or GitHub Actions.
- Provision and manage Kubernetes clusters on AWS EKS, ensuring high availability and autoscaling for AI workloads.
- Automate infrastructure with Terraform, CloudFormation, and Docker, creating reusable modules and blueprints.
- Monitor system health, performance, and security using Prometheus, Grafana, and CloudWatch; troubleshoot incidents and implement proactive alerts.
- Collaborate with data scientists and backend teams to optimize model serving, data pipelines, and resource utilization.
Requirements
- 3+ years of DevOps experience in a cloud‑native environment.
- Proficient with Kubernetes, Docker, and AWS services (EKS, EC2, S3, RDS).
- Hands‑on experience with Terraform, CI/CD tooling, and scripting (Python, Bash).
- Strong understanding of monitoring, logging, and security best practices.
- Excellent problem‑solving skills and a collaborative mindset.
Skills
Kubernetes, Docker, AWS, Terraform, CI/CD, Python