Remote
Senior Data Engineer - Sonny's Enterprises, Inc
Data Engineer
Lead the design, development, and optimization of structured data pipelines on Databricks, leveraging Spark, Python, and cloud services to deliver reliable, high‑performance data solutions.
About the role
Key Responsibilities
- Architect, build, and maintain scalable data pipelines on Databricks using Apache Spark and Python.
- Design and implement robust ETL processes to ingest, transform, and load structured data from diverse sources.
- Collaborate with data scientists, analysts, and product teams to define data requirements and ensure data quality.
- Optimize query performance and storage costs on AWS through partitioning, caching, and resource tuning.
- Implement workflow orchestration and monitoring using Airflow or similar tools, ensuring reliable job scheduling and alerting.
- Establish best practices for data governance, documentation, and version control.
Requirements
- 5+ years of professional experience in data engineering, with a focus on structured data pipelines.
- Strong expertise in Databricks, Apache Spark, and Python programming.
- Proficient in SQL and relational database design, with experience optimizing complex queries.
- Hands‑on experience with AWS, including services such as S3, Redshift, or Glue.
- Familiarity with workflow orchestration tools like Airflow and CI/CD practices for data pipelines.
Skills
Databricks, Apache Spark, Python, SQL, Airflow, AWS