Description/Responsibilities
Our Senior Data Science & ML Ops Engineer position is a hands-on role focused on partnering with business leaders and technology teams to design, test, and deploy actionable machine learning solutions that drive measurable business outcomes. The role bridges data science, engineering, and operations, owning the full lifecycle from hypothesis and experimentation through production deployment and operationalization.
This position centers on applied machine learning: using proven, off-the-shelf algorithms and scalable AWS services to rapidly validate ideas, embed models into business workflows, and ensure they run reliably in production.
Business-Driven Experimentation & Model Ownership
- Partner directly with business stakeholders to identify opportunities where data and machine learning can improve decisions, efficiency, or outcomes
- Design experiments and hypotheses that can be validated quickly using available data and pragmatic modeling approaches
- Select and apply out-of-the-box machine learning algorithms (e.g., classification, regression, forecasting, clustering, optimization)
- Own models end-to-end, from data preparation and feature engineering through deployment, monitoring, and iteration based on real-world results
ML Implementation, Production & Operations
- Deploy ML models into production using AWS-native tooling and integrate them into operational workflows and downstream systems
- Implement ML training and inference workflows on Amazon SageMaker, including SageMaker Pipelines, endpoints, the model registry, and model monitoring
- Ensure production readiness through versioning, validation, rollback strategies, and performance monitoring
- Monitor model performance (accuracy, drift, stability, business KPIs) and iterate based on real-world impact
- Participate directly in diagnosing and resolving production issues that affect data pipelines or ML workloads
Data Platform & Engineering Collaboration
- Build and operate data ingestion and transformation pipelines across batch and event-driven workloads using AWS Glue, zero-ETL integrations, Step Functions, EventBridge, and related services
- Collaborate closely with IT, Security, and Platform Engineering teams to align with enterprise security, compliance, and operational standards
- Use infrastructure as code (Terraform, CDK, or CloudFormation) to create repeatable, scalable environments
Data Governance, Lake Architecture & Operational Excellence
- Own and operate S3-based data lake infrastructure, including Apache Iceberg tables, AWS Glue Data Catalog, and AWS Lake Formation
- Implement and enforce data zone architecture (e.g., raw, curated, and consumption zones) to support governed data access and lifecycle management
- Define and apply data access controls using Lake Formation permissions and IAM