remote
Lead Data Engineer (Databricks) - Bridgenext
Data Engineer
Lead data engineering initiatives: design and scale end‑to‑end pipelines on Databricks using Spark, Python, and SQL, orchestrate workflows with Airflow, and build production‑grade solutions on cloud platforms (AWS or GCP).
About the role
Key Responsibilities
- Architect, develop, and maintain scalable data pipelines on Databricks using Apache Spark and Python.
- Design and implement robust data models and warehouses to support analytics and machine‑learning workloads.
- Orchestrate ETL/ELT workflows with Airflow, ensuring reliability, monitoring, and alerting.
- Collaborate with data scientists, analysts, and product teams to translate business requirements into technical solutions.
- Optimize performance and cost on cloud platforms (AWS or GCP) through proper resource provisioning and best‑practice configurations.
- Mentor junior engineers, enforce coding standards, and drive continuous improvement of data engineering practices.
Requirements
- 5+ years of hands‑on experience building data pipelines, with at least 2 years focused on Databricks and Spark.
- Proficiency in Python and advanced SQL for data transformation and analysis.
- Strong experience with workflow orchestration tools, preferably Apache Airflow.
- Deep understanding of cloud services (AWS or GCP) and best practices for data security, scalability, and cost management.
- Demonstrated ability to design data models, data lakes, and data warehouses, and to mentor technical teams.
Skills
Databricks, Apache Spark, Python, SQL, Airflow