onsite
Data Engineer - Axle
Data Engineer responsible for designing, building, and maintaining scalable data pipelines and warehouses using Python, SQL, AWS, and Spark to support biomedical research and analytics initiatives.
About the role
Key Responsibilities
- Design, develop, and maintain robust ETL pipelines that ingest, transform, and load large-scale biomedical and research data.
- Implement data models and warehouse solutions on cloud platforms, primarily AWS, ensuring high performance and reliability.
- Collaborate with data scientists, bioinformaticians, and software engineers to deliver clean, well‑documented data sets for analytics and machine‑learning workflows.
- Monitor, troubleshoot, and optimize data workflows using tools such as Apache Spark and Docker containers.
- Establish data governance, security, and compliance practices aligned with healthcare and research regulations.
Requirements
- Strong proficiency in Python and SQL for data manipulation and pipeline development.
- Hands‑on experience with cloud services (AWS) and infrastructure‑as‑code tools.
- Proven ability to build and optimize data pipelines using Apache Spark or similar distributed processing frameworks.
- Familiarity with containerization (Docker) and CI/CD practices for reproducible data environments.
- Background in biomedical, healthcare, or research data domains is a plus.
Skills
Python, SQL, AWS, Apache Spark, Docker