onsite
Senior AI Data Pipeline Specialist
Software Engineer
Lead the design, implementation, and optimization of AI data pipelines on AWS, leveraging Airflow, Kafka, and Spark to deliver scalable, high‑performance data solutions for advanced analytics.
About the role
Key Responsibilities
- Architect and maintain end‑to‑end AI data pipelines on AWS, ensuring reliability, scalability, and performance.
- Design and orchestrate workflows in Apache Airflow that combine streaming ingestion from Kafka with batch processing in Apache Spark.
- Collaborate with data scientists and ML engineers to translate model requirements into robust data pipelines.
- Implement monitoring, alerting, and automated recovery mechanisms to maintain data quality and pipeline uptime.
- Optimize resource utilization and cost across AWS, applying best practices for serverless and containerized deployments.
Requirements
- 7+ years of experience building data pipelines in a cloud environment, preferably on AWS.
- Deep expertise in Apache Airflow, Kafka, and Spark, with hands‑on experience in production deployments.
- Strong programming skills in Python or Scala, and familiarity with CI/CD for data workflows.
- Proven ability to troubleshoot complex data flow issues and implement performance improvements.
- Excellent communication skills and a collaborative mindset for cross‑functional teams.