onsite
AI Engineer - Cloud Infrastructure - Traversal
Lead the design and deployment of AI‑driven services on cloud platforms, ensuring high availability, scalability, and security. Leverage Python, ML frameworks, and AWS infrastructure to build resilient, automated pipelines for enterprise AI workloads.
About the role
Key Responsibilities
- Architect, develop, and maintain AI services on AWS, ensuring high availability and scalability.
- Implement CI/CD pipelines with GitHub Actions and Docker, and provision infrastructure with Terraform, to automate model training, testing, and deployment.
- Collaborate with data scientists to translate ML models into production‑ready microservices.
- Monitor production systems and troubleshoot incidents, applying SRE best practices to reduce mean time to recovery (MTTR).
- Optimize cost and performance of cloud resources through right‑sizing and autoscaling strategies.
Requirements
- 3+ years of experience building AI/ML solutions in a cloud environment.
- Proficiency in Python, TensorFlow/PyTorch, and container orchestration with Kubernetes.
- Hands‑on experience with AWS services (ECS/EKS, S3, SageMaker, Lambda) and IaC tools like Terraform.
- Strong understanding of CI/CD, monitoring, and incident response practices.
- Excellent problem‑solving skills and a collaborative mindset.
Skills
Python, machine learning, AWS, Kubernetes, Terraform, CI/CD