remote
Research Software Engineer - User Configurable AI-Ready Datasets - ECMWF
Software Engineer
Lead the design and implementation of scalable data infrastructure that enables AI‑ready, user‑configurable datasets for the DestinE initiative, using Python, ML pipelines, and containerized cloud services.
About the role
Key Responsibilities
- Architect and develop robust data pipelines that transform and expose ECMWF scientific datasets for machine‑learning workloads.
- Implement scalable, containerized services using Docker and Kubernetes, ensuring high availability and performance.
- Collaborate with data scientists to define schemas, metadata, and access patterns that support rapid experimentation.
- Integrate cloud storage and compute resources (AWS/GCP) to support large‑scale data ingestion and processing.
- Monitor, troubleshoot, and optimize data services, applying best practices in observability and CI/CD.
Requirements
- Strong proficiency in Python and experience building production‑grade data pipelines.
- Hands‑on experience with SQL and with relational and NoSQL databases for data storage and retrieval.
- Experience with containerization (Docker) and orchestration (Kubernetes) in a cloud environment.
- Familiarity with machine‑learning workflows and data preparation for AI models.
- Excellent problem‑solving skills and the ability to work collaboratively in a research‑driven team.
Skills
Python, machine learning, SQL, Docker, Kubernetes