onsite
Lead LLM Efficiency Research
This role leads research into technologies that improve the efficiency of Large Language Models (LLMs), including novel architectures and pre-training methods. The Lead LLM Efficiency Researcher will design and implement NLP algorithms, integrate LLMs with other models such as computer vision models, and contribute to model optimization using frameworks such as PyTorch.
About the role
As a Lead LLM Efficiency Researcher, you will drive the investigation and implementation of novel technologies to enhance the efficiency of Large Language Models. This involves exploring new architectures, refining pre-training methods, and integrating LLMs with other AI models, such as computer vision models.
Responsibilities
- Lead research into technologies that improve the efficiency of Large Language Models (LLMs), such as novel architectures and improved pre-training, while preserving target capabilities or supporting a broad range of capabilities.
- Design and implement NLP algorithms for model training and prediction, leverage ML infrastructure, and contribute to model optimization and data processing using PyTorch or other frameworks.
- Integrate and improve LLM algorithms to work with other models such as computer vision models.
- Identify problems and gaps in existing technology, and engage other research teams, stakeholders, and leaders to expand efficient LLM technology.
- Collaborate with peers and stakeholders through design and code reviews to ensure best practices across available technologies.
- Write up results in design documents, technical reports, and papers for publication.
- Represent MBZUAI at industry conferences and events, showcasing the institution’s cutting-edge HPC and deep learning capabilities and establishing MBZUAI as a global leader in AI research and innovation.
- Perform all other duties as reasonably directed by the line manager that are commensurate with these functional objectives.
Skills
LLM, NLP algorithms, ML infrastructure, PyTorch, computer vision models, HPC, deep learning