Remote
Data Engineer - Jacobs
Drive the expansion of a modern data platform and AI capabilities across critical infrastructure, using Python, SQL, Spark, and AWS to build scalable data lakes and robust ETL pipelines.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Python, SQL, and Apache Spark to ingest, transform, and load data into a cloud-based data lake.
- Collaborate with data scientists and analysts to ensure data quality, consistency, and accessibility for advanced analytics and AI projects.
- Implement and optimize ETL processes, data models, and metadata management to support real-time and batch workloads.
- Leverage AWS services (S3, Glue, Redshift, Athena) to build and manage data infrastructure, ensuring high availability and performance.
- Monitor pipeline health, troubleshoot issues, and continuously improve data processing efficiency.
Requirements
- 3+ years of experience in data engineering with a strong focus on Python, SQL, and Spark.
- Hands‑on experience with AWS data services (S3, Glue, Redshift, Athena).
- Proven ability to design and implement data lakes and ETL pipelines at scale.
- Strong problem‑solving skills and a collaborative mindset.
- Excellent communication skills in English.
Skills
Python, SQL, Apache Spark, AWS