remote
Data Engineer - Fabric - ClearCaptions, LLC
Data Engineer
Remote Data Engineer focused on building scalable data pipelines in Fabric, leveraging Python, Spark, and AWS services to transform and model data for real‑time analytics and machine learning.
About the role
Key Responsibilities
- Design, develop, and maintain robust data pipelines using Apache Spark and Python within the Fabric ecosystem.
- Implement data ingestion from diverse sources, ensuring high quality and consistency across the data lake.
- Collaborate with data scientists and product teams to model data for analytics, reporting, and ML workloads.
- Automate workflow orchestration with Airflow, monitoring job health and performance.
- Optimize query performance and storage costs on AWS, utilizing services such as S3, Redshift, and Glue.
Requirements
- 3+ years of experience as a data engineer or in a similar role.
Skills
Python, Apache Spark, AWS, Airflow, SQL