onsite
Data Engineer - Rad AI
Build and maintain scalable data pipelines for AI-driven radiology solutions, leveraging Python, SQL, Airflow, and cloud services to enable fast, reliable analytics on large medical imaging datasets.
About the role
Key Responsibilities
- Design, develop, and operate robust ETL pipelines that ingest, transform, and store massive radiology report datasets.
- Implement workflow orchestration using Apache Airflow to ensure reliable, scheduled data processing.
- Optimize data storage and query performance on AWS services such as S3, Redshift, and Athena.
- Collaborate with data scientists and product teams to provide clean, well‑documented data for AI model training and inference.
- Maintain containerized environments with Docker and manage infrastructure as code for reproducible deployments.
Requirements
- 3+ years of experience building data pipelines in Python and SQL.
- Hands‑on expertise with Apache Airflow, Apache Spark, and cloud platforms (AWS preferred).
- Strong understanding of data modeling, warehousing, and performance tuning.
- Familiarity with containerization (Docker) and CI/CD practices.
- Ability to work in a fast‑moving, interdisciplinary team focused on healthcare AI.
Skills
Python, SQL, AWS, Apache Spark, Docker