hybrid

Sr. MLOps Engineer

Betterdata is seeking a Senior MLOps Engineer to transform cutting-edge research into production-ready services for synthetic data generation and optimize ML algorithms at enterprise scale. The role involves building and tuning end-to-end model pipelines, ensuring high performance, scalability, and reliability across diverse workloads and dataset sizes, with a focus on algorithm optimization, data handling at scale, and end-to-end orchestration.

About the role

Who We Are Looking For

We seek an experienced Senior Machine Learning Engineer to transform cutting-edge research into robust, production-ready services for synthetic data generation, and to optimize both deep learning and classical ML algorithms (e.g. tree-based models) at enterprise scale (billions of rows). You will build and tune model pipelines end to end, ensuring high performance, scalability, and reliability across diverse workloads and dataset sizes.

Key Responsibilities

Algorithm Optimization & Scaling

Identify and remove bottlenecks in deep generative models (e.g. transformers, diffusion models, GANs) to accelerate training and generation.
Implement distributed training of these models across multi-GPU clusters.
Optimize distributed training of traditional ML models (e.g. XGBoost, LightGBM, CatBoost) on billion-row datasets.
Establish best practices for memory management that maximize compute and memory utilization, enabling faster training at lower cost.

Data Handling at Scale

Collaborate with data engineers to design ETL/ELT workflows that handle terabyte- to petabyte-scale tabular and unstructured data.
Implement scalable feature engineering pipelines using distributed computing frameworks (e.g. Spark, Dask, or Ray).
Automate data validation (e.g. schema checks, anomaly detection) with rule-based and ML-driven frameworks.

End-to-End Orchestration

Build ML pipelines that turn research prototypes into reliable, production-grade workflows.
Package models into Docker containers and deploy using Kubernetes.
Build automated model and data quality monitoring and validation systems to ensure data integrity throughout the pipeline lifecycle.
Design robust error handling mechanisms, with automatic retries and data recovery in case of pipeline failures.
Implement logging, monitoring and alerting systems.

Qualifications

Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Software Engineering, Data Science or a related quantitative discipline.
5+ years of hands-on experience optimizing and scaling machine learning models in production environments.
Demonstrated track record of accelerating model training workflows (e.g., transformers, diffusion models, GANs) at multi-GPU scale.
Experience operating ETL/ELT pipelines that handle terabytes to petabytes of tabular and unstructured data using distributed computing tools (e.g. Apache Spark, Dask, Ray).
Demonstrated ability to translate research prototypes into reliable, production-grade ML pipelines with rigorous testing and validation.
Experience with ML orchestration tools (e.g. Airflow, Dagster).

Good to Have

Experience hosting models on scalable cloud infrastructure (AWS / Azure / GCP).
Experience containerising data pipelines and AI models with Docker and supporting orchestration tools (e.g. Kubernetes).
