onsite
Software Engineer, HPC / Deep Learning
The Software Engineer, HPC / Deep Learning will collaborate with research teams to integrate advanced technologies into the codebase and develop systems supporting the machine learning model lifecycle, particularly for foundation models. This role involves participating in design reviews, performing code reviews, contributing to documentation, and debugging system issues.
About the role
Responsibilities
- Collaborate with research teams to understand their technologies, then adapt and integrate them into the codebase.
- Develop and implement systems that support the lifecycle of machine learning models, especially foundation models, including data preprocessing, pre-training, post-training, and evaluation.
- Participate in or lead design reviews with peers and stakeholders to decide amongst available technologies.
- Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency).
- Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback.
- Triage product or system issues, then debug, track, and resolve them by analyzing their sources and their impact on hardware, network, or service operations and quality.
- Contribute to research papers and represent MBZUAI at industry conferences and events, showcasing the institution’s cutting-edge HPC and deep learning capabilities and establishing MBZUAI as a global leader in AI research and innovation.
- Perform all other duties as reasonably directed by the line manager that are commensurate with these functional objectives.
Skills
HPC, Deep Learning, Machine Learning