Remote, Onsite

Databricks Engineer - Persistent Systems

Software Engineer

Lead end‑to‑end data engineering on Databricks, building scalable Spark pipelines, Delta Lake data lakes, and MLflow‑driven ML workflows in a cloud environment.

About the role

Key Responsibilities

Design, develop, and maintain large‑scale Spark pipelines on Databricks for batch and streaming data.
Implement Delta Lake best practices for ACID transactions, schema evolution, and data versioning.
Integrate MLflow for experiment tracking, model packaging, and deployment pipelines.
Collaborate with data scientists to optimize feature engineering and model training workflows.
Automate data workflows using CI/CD pipelines and monitor job performance and data quality.

Requirements

3+ years of experience building data pipelines with Databricks and Apache Spark.
Strong proficiency in Python and SQL; experience with Scala or Java is a plus.
Hands‑on experience with Delta Lake, MLflow, and cloud data services (AWS, Azure, or GCP).
Solid understanding of data modeling, ETL best practices, and performance tuning.
Excellent problem‑solving skills and ability to work collaboratively in a fast‑paced environment.

Skills

Databricks, Apache Spark, Python, MLflow, AWS

Company: Persistent Systems

Department: Engineering

Location: India

Experience: 9+ years

Tenure: Full-time

Level: Mid-Level

Posted June 22, 2026