Remote
Staff Data Engineer - Boulevard
Data Engineer
Lead the design, development, and scaling of data pipelines and platforms using Python, Spark, Airflow, and AWS to deliver reliable, near‑real‑time analytics for a fast‑growing appointment‑based SaaS product.
About the role
Key Responsibilities
- Architect, build, and maintain robust, high‑performance data pipelines that ingest, transform, and store large volumes of transactional and event data.
- Design and implement scalable data models and warehouses on AWS (Redshift, S3, Glue) to support analytics, reporting, and machine‑learning use cases.
- Develop and operate orchestration workflows with Apache Airflow, ensuring reliability, observability, and automated recovery.
- Collaborate with product, analytics, and engineering teams to translate business requirements into technical specifications and deliver end‑to‑end data solutions.
- Implement streaming ingestion and processing using Kafka and Spark Structured Streaming for near‑real‑time insights.
- Establish best practices for data quality, testing, documentation, and security across the data platform.
Requirements
- 5+ years of professional experience building data pipelines and warehouses in a cloud environment, preferably AWS.
- Strong proficiency in Python and SQL, with hands‑on experience in Apache Spark (PySpark) for batch and streaming workloads.
- Deep knowledge of workflow orchestration tools such as Apache Airflow, including DAG design, monitoring, and scaling.
- Experience with data streaming technologies (Kafka, Kinesis) and real‑time processing frameworks.
- Solid understanding of data modeling, ETL/ELT design patterns, and data governance/security best practices.
Skills
Python, SQL, Apache Spark, AWS, Kafka