Remote
Staff Software Engineer, Machine Learning Infrastructure
ML Engineer
Lead the design and implementation of scalable ML infrastructure, including data pipelines, API integrations, and automated code generation, to support experimentation and data quality across the organization.
About the role
Key Responsibilities
- Architect and develop end‑to‑end machine‑learning infrastructure that supports high‑throughput data pipelines and model training at scale.
- Design and maintain robust API integration layers for automated code generation and model deployment.
- Implement data quality and validation frameworks to ensure reliable inputs for experimentation.
- Collaborate with data scientists and product teams to translate research prototypes into production‑ready services.
- Drive continuous improvement of cloud‑native platforms using AWS services and Kubernetes orchestration.
Requirements
- 10+ years of software engineering experience, with at least 5 years focused on machine‑learning infrastructure.
- Strong proficiency in Python and deep‑learning frameworks such as TensorFlow or PyTorch.
- Hands‑on experience building scalable data pipelines and CI/CD workflows on AWS.
- Expertise with containerization, Kubernetes, and infrastructure‑as‑code tools.
- Proven ability to lead technical initiatives, mentor engineers, and deliver production‑grade ML systems.
Skills
Python, TensorFlow, PyTorch, AWS, Kubernetes