onsite
Senior Data Engineer - Diversitech
Data Engineer
Design, build, and maintain scalable data platform infrastructure that supports analytics, data science, and AI initiatives, using cloud services, streaming pipelines, and modern ETL orchestration tools.
About the role
Key Responsibilities
- Architect and implement end‑to‑end data pipelines on cloud platforms to ingest, transform, and store large‑scale datasets.
- Develop and maintain data models, warehouses, and lakehouse solutions that support analytics, machine learning, and reporting needs.
- Design, configure, and optimize streaming solutions using technologies such as Kafka and Spark Structured Streaming.
- Automate workflow orchestration and job scheduling with Apache Airflow, ensuring reliability and observability.
- Collaborate with data scientists, analysts, and application teams to translate business requirements into robust data solutions.
Requirements
- 5+ years of hands‑on experience building data pipelines and platforms in a cloud environment (AWS preferred).
- Strong programming skills in Python and advanced SQL for data transformation and analysis.
- Proficiency with big‑data processing frameworks such as Apache Spark and streaming technologies like Kafka.
- Experience with workflow orchestration tools (e.g., Apache Airflow) and infrastructure‑as‑code practices.
- Solid understanding of data modeling, warehousing concepts, and best practices for performance tuning and security.
Skills
Python, SQL, Apache Spark, AWS, Kafka