onsite

Senior/Staff Machine Learning Engineer - Data Infrastructure

ML Engineer

Lead the design and deployment of scalable ML pipelines on data lake and warehouse platforms, orchestrating workflows with Apache Airflow and Spark while ensuring robust automated testing and production reliability.

About the role

Key Responsibilities

Architect and implement end‑to‑end machine learning pipelines on large‑scale data lake and warehouse environments.
Design and maintain Airflow DAGs to orchestrate data ingestion, feature engineering, model training, and deployment workflows.
Leverage Apache Spark for distributed data processing and model training at scale.
Develop and maintain automated testing suites (unit, integration, and performance) to guarantee pipeline reliability.
Collaborate with data engineering and data science teams to optimize data pipelines and model performance.
Document architecture, processes, and best practices for internal knowledge sharing.

Requirements

10+ years of experience in software engineering with a strong focus on machine learning and data infrastructure.
Proficiency in Python, Apache Airflow, Apache Spark, and SQL.
Hands‑on experience building and scaling data lakes and data warehouses (e.g., Snowflake, BigQuery, Redshift).
Deep understanding of automated testing frameworks and CI/CD pipelines for data workflows.
Excellent communication skills and the ability to mentor junior engineers.

Skills

machine learning, apache spark

Department: Research

Location: Shanghai, China

Experience: 7+ years

Tenure: full-time

Level: Lead

Posted June 22, 2026