Remote
Associate Data Engineer III - Penske Truck Leasing
Data Engineer
Mid‑level data engineer responsible for designing, building, and maintaining scalable data pipelines and warehouses using Python, SQL, Spark, and AWS services to support analytics and business intelligence initiatives.
About the role
Key Responsibilities
- Design, develop, and maintain robust ETL pipelines that ingest, transform, and load data from diverse sources into cloud‑based data warehouses.
- Implement data models and schemas optimized for reporting, analytics, and machine‑learning workloads.
- Collaborate with data scientists, analysts, and product teams to understand data requirements and deliver reliable data solutions.
- Monitor pipeline performance, troubleshoot issues, and apply best practices for scalability, reliability, and security.
- Leverage AWS services (e.g., S3, Redshift, Glue, Lambda) and Apache Spark to process large‑volume datasets efficiently.
Requirements
- 3+ years of hands‑on experience building data pipelines using Python, SQL, and Spark.
- Strong knowledge of cloud platforms, preferably AWS, and related data services.
- Experience with data warehousing concepts, dimensional modeling, and performance tuning.
- Proficiency in version control (Git) and CI/CD practices for data engineering workflows.
- Excellent problem‑solving skills and the ability to work collaboratively in an agile environment.
Skills
Python, SQL, AWS, Apache Spark