onsite
Senior Data Infrastructure Engineer - Newsbreak
DevOps Engineer
Lead the design, implementation, and scaling of data pipelines and infrastructure for a high‑traffic content platform, leveraging Python, Spark, Kafka, and AWS to deliver reliable, real‑time data to AI‑driven products.
About the role
Key Responsibilities
- Architect, build, and maintain scalable data pipelines that ingest, process, and store billions of events daily.
- Design and optimize data models and warehouses to support real‑time recommendation and analytics workloads.
- Implement robust streaming pipelines with Kafka and batch processing jobs with Apache Spark.
- Develop and manage orchestration workflows with Apache Airflow, ensuring reliability and observability.
- Collaborate with data scientists, product engineers, and analytics teams to deliver high‑quality data for AI and ad‑tech services.
Requirements
- 5+ years of experience building large‑scale data infrastructure in cloud environments, preferably AWS.
- Strong proficiency in Python and SQL, with hands‑on experience in Apache Spark and Kafka.
- Deep understanding of data modeling, ETL/ELT design, and data warehousing concepts.
- Experience with workflow orchestration tools such as Apache Airflow.
- Proven ability to troubleshoot performance issues and implement monitoring, alerting, and best‑practice security controls.
Skills
Python, Apache Spark, Kafka, AWS, SQL, Airflow