Remote
Data Engineer - Hexaware Technologies, Inc
The Data Engineer is responsible for designing, building, and maintaining scalable data pipelines with Python, SQL, Spark, and Airflow on AWS, and for ensuring high data quality and performance for analytics and machine learning initiatives.
About the role
Key Responsibilities
- Design, develop, and maintain robust ETL pipelines to ingest, transform, and load data from diverse sources into data lakes and warehouses.
- Leverage Apache Spark and Python to process large datasets efficiently, optimizing performance and resource utilization.
- Implement and manage workflow orchestration with Apache Airflow, ensuring reliable scheduling and monitoring of data jobs.
- Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver actionable insights.
- Ensure data quality, lineage, and security compliance across all data assets.
Requirements
- 3+ years of experience in data engineering or related roles.
- Hands‑on experience with AWS services such as S3, Redshift, Glue, and EMR.
- Familiarity with Airflow or other workflow orchestration tools.
Skills
Python, SQL, Apache Spark, AWS