onsite
Data Engineer - Abridge
Build scalable pipelines that transform medical conversation data into structured clinical notes, using Python, SQL, Airflow, Spark, and AWS services to support real‑time AI‑powered documentation.
About the role
Key Responsibilities
- Design, develop, and maintain robust data pipelines that ingest, clean, and transform unstructured medical conversation data into structured formats for downstream AI models.
- Implement and optimize ETL workflows using Apache Airflow, ensuring reliability, scalability, and timely execution across production environments.
- Collaborate with data scientists and product teams to define data schemas, quality metrics, and performance benchmarks for clinical note generation.
- Leverage AWS services (S3, Redshift, Glue, Lambda) to build a secure, compliant data lake and warehouse architecture.
- Monitor pipeline health, troubleshoot issues, and continuously improve data processing efficiency and cost‑effectiveness.
Requirements
- 3+ years of experience as a Data Engineer in a fast‑paced, cloud‑native environment.