
Research Engineer · Post-Training · Reasoning & Thinking Models · Inference-Time Compute
open-posttraining-system
August 17, 2025 – Present
Open-source research engineering project for building the end-to-end post-training stack for reasoning language models, including SFT, preference learning, RLHF/RLVR, evaluation, inference-time scaling, and scalable systems for frontier-level reasoning.
Proximal-Policy-Optimization-PPO
August 6, 2025 – Present
Modular implementation of Proximal Policy Optimization (PPO), a policy gradient reinforcement learning algorithm introduced by OpenAI in 2017. PPO is designed to be a simpler, more stable, and more sample-efficient alternative to earlier policy gradient methods such as A3C and TRPO (Trust Region Policy Optimization).
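For context, the stability of PPO comes mainly from its clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The following is a minimal sketch of that loss in plain Python, not code from the repository; the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate loss (a quantity to minimize) over a batch.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
        ratio = math.exp(lp_new - lp_old)
        # Clamp the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Negate because the surrogate is maximized, while losses are minimized.
    return -total / len(advantages)
```

With a positive advantage, the gain from raising an action's probability is capped once the ratio exceeds `1 + clip_eps`; with a negative advantage, the unclipped (larger) penalty is kept, so harmful moves are never hidden by the clip.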
Qwen3-from-scratch
August 2, 2025 – Present
This repository contains a lightweight PyTorch implementation of Qwen 3-style transformer components in qwen3_from_scratch.py.
Olmo3-from-scratch
August 2, 2025 – Present
A clean, from-scratch implementation of the OLMo architecture with KV caching, RoPE, and an efficient autoregressive inference pipeline. Designed as a minimal yet extensible foundation for post-training research, including RLHF, preference optimization, and reasoning-focused systems.
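As background on the RoPE component mentioned above, rotary position embeddings rotate pairs of query/key features by an angle proportional to the token position, so that attention scores depend only on relative offsets. The sketch below is illustrative and is not taken from the repository; it assumes the interleaved pairing convention (some implementations split the vector into halves instead), and the name `apply_rope` and the default `base=10000.0` are assumptions.

```python
import math


def apply_rope(vec, position, base=10000.0):
    """Rotate consecutive pairs (vec[i], vec[i + 1]) by position-dependent angles."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even head dimension")
    out = []
    for i in range(0, dim, 2):
        # Pair k = i / 2 uses frequency base^(-2k / dim) = base^(-i / dim).
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2-D rotation of the pair (x, y) by angle theta.
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out
```

Because each pair is rotated rather than shifted, the dot product between a query at position m and a key at position n depends only on m - n, which is the property that makes the scheme compatible with a KV cache: cached keys never need to be re-encoded as the sequence grows.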
Production-Ready-LeafLogic-Multi-AI-Agents-Project
November 17, 2024 – March 29, 2025
🍃 Production-ready: Just upload a photo of any plant or crop, and the system takes care of the rest. Powered by advanced object detection and Multi-AI Agents, it identifies more than 100 species and autonomously fetches detailed insights such as scientific name, history, health benefits and risks, ideal growing seasons, and market prices. 🌾
Production-Ready-TripPlanner-Multi-AI-Agents-Project
November 17, 2024 – March 4, 2025
✈️🌍 Production-Ready TripPlanner Multi-AI Agent Project: Transform your travel planning with AI-driven assistance! From discovering dream destinations, creating custom itineraries, exploring avenues of nature, to finding local attractions and beach spots 🔍💡—all powered by industry-ready AI tools. 🏨🌍
Production-Ready-Instruction-Finetuning-of-Meta-Llama-3.2-3B-Instruct-Project
November 17, 2024 – January 30, 2025
Instruction Fine-Tuning of Meta Llama 3.2-3B Instruct on Kannada Conversations. Tailoring the model to follow specific instructions in Kannada, enhancing its ability to generate relevant, context-aware responses based on conversational inputs. Using the Kannada Instruct dataset for fine-tuning! Happy Finetuning 🎋
Multi-lingual-AI-Assistant-with-gTTS-and-Gemini-Pro
November 12, 2024 – August 2, 2025
An end-to-end AI assistant using gTTS for multi-lingual text-to-speech and Gemini Pro API for smart responses. Experience seamless voice interaction in various languages with continuous updates and improvements!
Generative-AI-Practices-and-Mini-Projects
October 14, 2024 – June 12, 2025
Generative AI Practices and Mini-Projects: A hands-on repository for Generative AI mini-projects! Explore model building, fine-tuning, and RAG techniques. Includes experiments with open-source models like LLaMA and Gemma, plus deployments using OpenAI and Google Gemini APIs.
Reinforcement-Learning-Zero-to-Hero
October 14, 2024 – Present
Reinforcement Learning (RL): a hands-on guide to implementing RL algorithms, from Markov Decision Processes (MDPs) to advanced methods like PPO and DDPG. Build smart agents, learn the math behind policies, and experiment with real-world applications!
Cultural Fit Analysis
The candidate's portfolio is heavily focused on personal projects in advanced AI/ML, particularly LLMs and RL, which aligns well with a research-heavy environment. The diversity of projects within this niche (e.g., multi-AI agents, instruction fine-tuning, from-scratch implementations) suggests a strong passion and continuous learning mindset. However, the lack of team-based projects or professional experience makes it difficult to assess collaboration and broader organizational fit.
Soft Skills & Operational Fit
The candidate's project descriptions indicate strong initiative and self-driven learning, both crucial for a research-oriented role. The focus on 'production-ready' projects suggests an understanding of practical application and deployment considerations. However, without psychometric or English test scores, it is difficult to assess communication clarity, logical reasoning, work attitude, stress handling, or team collaboration.