Remote
Databricks Data Engineer - Palmetto Tech
Data Engineer
Design and optimize scalable data pipelines on Databricks, leveraging Apache Spark, PySpark, Python, SQL, Delta Lake and streaming technologies such as Kafka to build enterprise lakehouse and real‑time processing solutions.
About the role
Key Responsibilities
- Design, develop, and tune high‑performance data pipelines using Databricks, Apache Spark, PySpark, Python, and SQL.
- Implement lakehouse architectures with Delta Lake, Delta Live Tables, Unity Catalog, and Databricks SQL, following medallion (Bronze‑Silver‑Gold) patterns.
- Build batch and real‑time processing solutions using Structured Streaming, Kafka, Event Hubs, or Kinesis.
- Create reusable ingestion and transformation frameworks for structured, semi‑structured, and unstructured data sources.
- Collaborate with data scientists and analysts to deliver clean, governed data sets for downstream analytics and AI applications.
Requirements
- 3+ years of hands‑on experience with the Databricks and Apache Spark ecosystems.
- Proficiency in Python, PySpark, and SQL for data manipulation and pipeline development.
- Experience implementing Delta Lake and Delta Live Tables, and using Unity Catalog for data governance.
- Solid understanding of streaming platforms such as Kafka, Event Hubs, or Kinesis.
- Strong problem‑solving skills and ability to work in a remote, collaborative environment.
Skills
Databricks, Apache Spark, Python, SQL, Kafka