onsite
Data Engineer - New York City Department of Sanitation
The Data Engineer designs, builds, and maintains scalable data pipelines and warehouses, using Python, SQL, AWS, and Airflow to support analytics and operational reporting for a large municipal agency.
About the role
Key Responsibilities
- Design, develop, and maintain robust ETL pipelines to ingest, transform, and load data from diverse sources into a centralized data warehouse.
- Implement and manage data workflows using Apache Airflow, ensuring reliability, monitoring, and timely execution.
- Optimize data storage and query performance on cloud platforms (AWS Redshift, S3, Athena) and support data modeling for analytical use cases.
- Collaborate with analysts, engineers, and stakeholders to define data requirements, create data dictionaries, and ensure data quality.
- Develop automated data validation and testing frameworks using Python and SQL.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field, plus 3+ years of hands‑on data engineering experience.
- Proficiency in Python programming and advanced SQL for data manipulation and performance tuning.
- Experience building scalable ETL pipelines and orchestrating workflows with Apache Airflow or similar tools.
- Strong knowledge of AWS services (S3, Redshift, Lambda, Glue) and best practices for cloud‑based data solutions.
- Familiarity with big‑data technologies (e.g., Spark, Hadoop) and data modeling concepts.
Skills
Python, SQL, AWS, Airflow