Remote
AI Systems & Platform Internals - Technical Architect - Accellor
Solutions Architect
Lead the design and implementation of scalable AI platform infrastructure, integrating advanced ML models with cloud-native services to deliver enterprise‑grade solutions across diverse industries.
About the role
Key Responsibilities
- Architect and build robust, cloud‑native AI platform components using Kubernetes, AWS services, and containerized Python microservices.
- Design and implement MLOps pipelines for model training, deployment, monitoring, and versioning at scale.
- Collaborate with data scientists and product teams to translate business requirements into scalable, secure infrastructure solutions.
- Ensure high availability, performance, and compliance of AI services through automated testing, CI/CD, and observability best practices.
- Document architecture, design decisions, and operational procedures for cross‑functional teams.
Requirements
- Strong experience with Python, Kubernetes, and AWS (EKS, S3, SageMaker).
- Hands‑on knowledge of MLOps tools such as MLflow or Kubeflow.
- Proficiency in designing scalable, secure cloud architectures and microservice patterns.
- Excellent problem‑solving skills and ability to work in a fast‑paced, collaborative environment.
- Effective communication skills for explaining technical concepts to non‑technical stakeholders.
Skills
Python, Kubernetes, AWS