onsite
Staff Data Engineer I
Data Engineer
Lead end‑to‑end pricing data pipelines on AWS, designing scalable ETL workflows with EMR, Airflow, and Spark. Drive data quality, performance, and algorithmic insights for pricing models.
About the role
Key Responsibilities
- Design, build, and maintain large‑scale pricing data pipelines on AWS using EMR, Spark, and Airflow.
- Collaborate with data scientists to implement pricing algorithms and ensure data integrity for model training.
- Optimize ETL performance, troubleshoot failures, and implement monitoring and alerting.
- Document architecture, data flows, and best practices for reproducibility and compliance.
- Mentor junior engineers and contribute to continuous improvement of data engineering processes.
Requirements
- 5+ years of experience in data engineering with a focus on pricing or financial data.
- Proficiency in AWS services (S3, EMR, Glue, Redshift) and big‑data frameworks (Spark, Hive).
- Strong programming skills in Python or Scala and experience orchestrating workflows with Apache Airflow.
- Solid understanding of data modeling, SQL, and performance tuning.
- Excellent problem‑solving skills and the ability to explain complex technical concepts to cross‑functional teams.