Shaheen Nabi

Post-Training Research Engineer

https://www.opentalent.in/shaheen-nabi

Research Engineer · Post-Training · Reasoning & Thinking Models · Inference-Time Compute

Bengaluru, Karnataka, India

Key Strengths

Extensive practical experience in Generative AI, Large Language Models (LLMs), and Reinforcement Learning (RL) through numerous personal projects.
Demonstrated ability to implement complex AI architectures from scratch (e.g., OLMo, Qwen3).
Strong focus on post-training research, including SFT, preference learning, RLHF/RLVR, and inference-time scaling, directly aligning with the 'Post-Training Research Engineer' target role.
Proficiency in Python, Jupyter Notebook, and Docker, essential tools for AI/ML research and deployment.
Experience with multi-agent systems and instruction fine-tuning, indicating a deep understanding of advanced AI applications.

Cultural & Operational Fit

Cultural Fit Analysis

The candidate's portfolio is heavily focused on personal projects in advanced AI/ML, particularly LLMs and RL, which aligns well with a research-heavy environment. The diversity of projects within this niche (e.g., multi-AI agents, instruction fine-tuning, from-scratch implementations) suggests a strong passion and continuous learning mindset. However, the lack of team-based projects or professional experience makes it difficult to assess collaboration and broader organizational fit.

Soft Skills & Operational Fit

The candidate's project descriptions indicate a strong initiative and self-driven learning, crucial for a research-oriented role. The focus on 'production-ready' projects suggests an understanding of practical application and deployment considerations. However, without psychometric or English test scores, it's difficult to assess communication clarity, logical reasoning, work attitude, stress handling, or team collaboration.

Projects

open-posttraining-system

August 17, 2025 – Present

Open-source research engineering project for building the end-to-end post-training stack for reasoning language models, including SFT, preference learning, RLHF/RLVR, evaluation, inference-time scaling, and scalable systems for frontier-level reasoning.

Proximal-Policy-Optimization-PPO

August 6, 2025 – Present

Modular implementation of Proximal Policy Optimization (PPO), a policy gradient reinforcement learning algorithm introduced by OpenAI in 2017. PPO is designed to be a simpler, more stable, and more sample-efficient alternative to earlier policy gradient methods such as A3C and TRPO (Trust Region Policy Optimization).

Qwen3-from-scratch

August 2, 2025 – Present

This repository contains a lightweight PyTorch implementation of Qwen 3-style transformer components in qwen3_from_scratch.py.

Olmo3-from-scratch

August 2, 2025 – Present

A clean, from-scratch implementation of the OLMo architecture with KV caching, RoPE, and an efficient autoregressive inference pipeline. Designed as a minimal yet extensible foundation for post-training research, including RLHF, preference optimization, and reasoning-focused systems.

Production-Ready-LeafLogic-Multi-AI-Agents-Project

November 17, 2024 – March 29, 2025

🍃 Production-ready: Just upload a photo of any plant or crop, and the system takes care of the rest. Powered by advanced object detection and Multi-AI Agents, it identifies over 100 species and autonomously fetches detailed insights such as scientific name, history, health benefits and risks, ideal growing seasons, and market prices. 🌾

Production-Ready-TripPlanner-Multi-AI-Agents-Project

November 17, 2024 – March 4, 2025

✈️🌍 Production-Ready TripPlanner Multi-AI Agent Project: Transform your travel planning with AI-driven assistance! Discover dream destinations, create custom itineraries, explore nature, and find local attractions and beach spots 🔍💡, all powered by industry-ready AI tools. 🏨

Production-Ready-Instruction-Finetuning-of-Meta-Llama-3.2-3B-Instruct-Project

November 17, 2024 – January 30, 2025

Instruction fine-tuning of Meta Llama 3.2-3B Instruct on Kannada conversations. The project tailors the model to follow instructions in Kannada, improving its ability to generate relevant, context-aware responses to conversational inputs. Fine-tuning uses the Kannada Instruct dataset. Happy Finetuning 🎋

Multi-lingual-AI-Assistant-with-gTTS-and-Gemini-Pro

November 12, 2024 – August 2, 2025

An end-to-end AI assistant using gTTS for multi-lingual text-to-speech and Gemini Pro API for smart responses. Experience seamless voice interaction in various languages with continuous updates and improvements!

Generative-AI-Practices-and-Mini-Projects

October 14, 2024 – June 12, 2025

A hands-on repository of Generative AI mini-projects covering model building, fine-tuning, and RAG techniques. Includes experiments with open-source models like LLaMA and Gemma, plus deployments using the OpenAI and Google Gemini APIs.

Reinforcement-Learning-Zero-to-Hero

October 14, 2024 – Present

A hands-on guide to implementing Reinforcement Learning (RL) algorithms, from Markov Decision Processes (MDPs) to advanced methods like PPO and DDPG. Build smart agents, learn the math behind policies, and experiment with real-world applications!
