remote
Senior Staff Data Engineer - SmithRx
Data Engineer
Lead data engineering initiatives in healthcare, designing scalable pipelines, applying machine learning, and ensuring data quality across large biomedical datasets using Python, SQL, and Spark.
About the role
Key Responsibilities
- Architect and maintain end‑to‑end data pipelines for clinical and research datasets, ensuring high availability and performance.
- Collaborate with data scientists to deploy machine learning models into production, monitoring model drift and performance.
- Design and enforce data governance, security, and compliance standards for sensitive healthcare information.
- Mentor junior engineers, conduct code reviews, and promote best practices in data engineering and analytics.
- Evaluate and integrate emerging big‑data technologies to improve scalability and cost efficiency.
Requirements
- 10+ years of experience in data engineering, with a strong focus on healthcare or life sciences.
- Proficiency in Python, SQL, and distributed processing frameworks such as Apache Spark.
- Hands‑on experience with cloud platforms (AWS, Azure, or GCP) and data lake architectures.
- Deep understanding of machine learning workflows and model deployment pipelines.
- Excellent communication skills and a proven ability to lead cross‑functional teams.
Skills
Machine learning, Python, SQL, Apache Spark