remote
Senior Software Engineer, Data Pipeline - Amplitude
Software Engineer
Senior Software Engineer building scalable data pipelines in Python and Spark on AWS, orchestrating workflows with Airflow, and ensuring high‑quality data for AI analytics.
About the role
Key Responsibilities
- Design, develop, and maintain large‑scale data pipelines using Python and Apache Spark to ingest, transform, and enrich data for AI analytics.
- Implement and manage Airflow DAGs to orchestrate complex ETL workflows, ensuring reliability and observability.
- Optimize data storage and query performance on AWS services (S3, Redshift, Athena) and maintain the data lake architecture.
- Collaborate with data scientists and product teams to translate business requirements into robust data solutions.
- Monitor pipeline health, troubleshoot failures, and continuously improve pipeline efficiency and scalability.
Requirements
- 5+ years of experience in data engineering with strong Python and Spark skills.
- Proven expertise in AWS data services and experience building data lakes.
- Hands‑on experience with Airflow or similar workflow orchestration tools.
- Strong SQL skills and familiarity with relational and columnar databases.
- Excellent problem‑solving abilities and a collaborative mindset.
Skills
Python, Apache Spark, AWS, Airflow, SQL