remote
Data & Machine Learning Engineer (hybrid work)
ML Engineer
We are seeking a Data & Machine Learning Engineer to design and maintain data lakes, build robust data pipelines, and develop scalable machine‑learning models using Python, Spark, Airflow, and AWS services.
About the role
Key Responsibilities
- Design, implement, and optimize data lake architectures to support structured and unstructured data at scale.
- Develop, schedule, and monitor end‑to‑end data pipelines using Apache Airflow and Spark, ensuring data quality and reliability.
- Collaborate with data scientists to integrate machine‑learning models into production workflows and provide feature engineering support.
- Implement data modeling standards and maintain metadata catalogs for efficient data discovery.
- Leverage AWS services (S3, Redshift, Glue, EMR) to build secure, cost‑effective data solutions.
Requirements
- Strong proficiency in Python and SQL for data manipulation and analysis.
- Hands‑on experience with Apache Spark and Airflow for large‑scale data processing.
- Solid understanding of data lake concepts, data modeling, and ETL best practices.
- Familiarity with the AWS ecosystem (S3, Redshift, Glue, EMR) and infrastructure‑as‑code tools.
- Excellent communication skills in English and the ability to work effectively in a hybrid team environment.
Skills
Python, SQL, Apache Spark, AWS, machine learning