Remote / Onsite
Lead Data Engineer - Presidio
Data Engineer
The Lead Data Engineer designs, builds, and scales end‑to‑end data pipelines on cloud platforms, using Python, Spark, and SQL to deliver reliable, high‑performance data solutions for enterprise customers.
About the role
Key Responsibilities
- Architect and implement scalable data pipelines and ETL processes on AWS using Python, Spark, and containerized workloads.
- Design logical and physical data models to support analytics, reporting, and machine‑learning initiatives.
- Lead a team of data engineers, providing technical guidance, code reviews, and mentorship.
- Collaborate with data scientists, product owners, and business stakeholders to translate requirements into robust data solutions.
- Establish best practices for data quality, monitoring, and performance optimization.
Requirements
- 5+ years of hands‑on experience building data pipelines with Python, SQL, and Apache Spark.
- Strong expertise in cloud services (AWS) and container technologies such as Docker.
- Proven ability to design and implement data models for large‑scale analytics environments.
- Experience leading technical teams and driving best‑practice adoption.
- Excellent problem‑solving skills and ability to communicate complex concepts to both technical and non‑technical audiences.
Skills
Python, SQL, Apache Spark, AWS, Docker