onsite
Data Engineer - Guidehouse
The Data Engineer builds scalable, cloud‑based data pipelines and analytics platforms using Python, SQL, AWS, Spark, Airflow, and CI/CD to deliver high‑quality, maintainable data solutions.
About the role
Key Responsibilities
- Design, develop, and maintain robust data pipelines and ETL processes for ingesting, transforming, and loading data from diverse sources.
- Build and optimize data workflows on AWS using services such as S3, Redshift, Glue, and Lambda.
- Implement and manage orchestration with Apache Airflow, ensuring reliable scheduling and monitoring.
- Collaborate with data scientists, analysts, and product teams to understand requirements and deliver scalable data solutions.
- Apply CI/CD practices to automate testing and deployment of data engineering artifacts, and keep them under version control.
- Document data models, pipeline logic, and best practices for maintainability and knowledge transfer.
Requirements
- 3+ years of experience in data engineering, with strong proficiency in Python and SQL.
- Hands‑on experience with AWS data services (S3, Redshift, Glue, Lambda) and big‑data frameworks (Spark).
- Proficiency in workflow orchestration tools such as Apache Airflow.
- Solid understanding of CI/CD pipelines, Git, and automated testing for data workflows.
- Excellent problem‑solving skills and ability to work collaboratively in cross‑functional teams.
Skills
Python, SQL, AWS, Apache Spark, CI/CD