onsite
Data Engineer II - Yes Energy
Data Engineer
Mid‑level data engineer building real‑time pipelines and analytics platforms for electric power market data using Python, SQL, Spark, and AWS services.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data pipelines that ingest, transform, and store high‑velocity electric grid data.
- Implement ETL processes using Python, SQL, and Apache Spark to support real‑time trading analytics.
- Collaborate with data scientists and product teams to model complex energy datasets and enable advanced analytics.
- Deploy and manage data infrastructure on AWS, including S3, Redshift, Lambda, and EMR.
- Integrate streaming data sources such as Kafka to provide low‑latency market insights.
- Monitor pipeline performance, troubleshoot issues, and continuously improve data quality and reliability.
Requirements
- 2+ years of professional experience building data pipelines in a cloud environment.
- Proficiency in Python and SQL, with hands‑on experience in Apache Spark or similar distributed processing frameworks.
- Strong understanding of AWS services (S3, Redshift, Lambda, EMR) and infrastructure‑as‑code concepts.
- Experience with streaming platforms such as Kafka and with designing real‑time data flows.
- Solid data modeling skills and the ability to translate business requirements into scalable data solutions.
Skills
Python, SQL, Apache Spark, AWS, Kafka