onsite
Data Engineer - 66degrees
Build scalable data pipelines on AWS using Python, SQL, Spark and Airflow, transforming raw data into actionable insights for AI initiatives.
About the role
Key Responsibilities
- Design, develop and maintain end‑to‑end data pipelines on AWS (S3, Redshift, Glue, EMR) to ingest, transform and load large volumes of structured and unstructured data.
- Implement data models and schemas using SQL and Spark, ensuring optimal performance and data quality.
- Automate workflow orchestration with Airflow, monitoring job health and troubleshooting failures.
- Collaborate with data scientists and product teams to provide clean, well‑documented datasets for AI and analytics projects.
- Optimize storage and compute costs through efficient data partitioning, compression and caching strategies.
Requirements
- 3+ years of experience in data engineering, with strong proficiency in Python and SQL.
- Hands‑on experience with AWS services (S3, Redshift, Glue, EMR) and Spark.
- Solid understanding of data modeling, ETL best practices and workflow orchestration.
- Excellent problem‑solving skills and ability to work in a fast‑paced, collaborative environment.
Skills
Python, SQL, AWS, Apache Spark, Airflow