Generative AI Engineer with 4+ years in LLM Systems & RAG Pipelines
Senior Python AI Engineer with 4+ years of production experience designing and deploying AI agent systems, LLM-powered applications, and RAG pipelines that automate real business workflows at scale. Proven track record building multi-agent architectures with LangChain, LangGraph, and LlamaIndex; fine-tuning LLMs (Llama 3, GPT-4) with LoRA/QLoRA; and designing RESTful microservice APIs (FastAPI, Flask) serving 100K+ daily requests at sub-200ms latency. Delivered systems handling 50K+ monthly users and 2M+ document pipelines. Reduced inference latency by 25% and cloud costs by 30% through quantization and infrastructure optimization. Proficient with DeepEval and Ragas for LLM evaluation, Docker/Kubernetes for containerization, and AWS/Azure/GCP for cloud deployment. Strong communicator, comfortable collaborating across engineering, product, and data science teams.
Kohat University of Science & Technology
Bachelor of Science · Information Technology
October 1, 2021 – June 1, 2025
Vision Byte Technologies
Generative AI Engineer & Python Backend Developer
January 1, 2023 – August 1, 2025
Islamabad, Islamabad Capital Territory, Pakistan
Dot Coder
ML & Deep Learning Engineer
January 1, 2021 – November 1, 2022
Khyber Pakhtunkhwa, Pakistan
AI Travel Chatbot
June 24, 2026 – Present
Built an AI-powered travel planning chatbot with 9+ specialized parallel agents orchestrated via LangGraph, covering flight resolution, hotel search, activity planning, budget analysis, visa/weather advisory, and itinerary generation. Implemented Human-in-the-Loop Gen-UI with interactive cards inside the chat interface; the planning engine fires only after full context is captured via structured JSON outputs. Stack: Next.js, TailwindCSS 4, Framer Motion, Clerk, Convex (frontend) + FastAPI, LangGraph, Redis, Docker, PostgreSQL (backend) with streaming SSE responses.
Healthcare Conversational AI Platform
June 24, 2026 – Present
Built end-to-end conversational AI using fine-tuned LLMs, NER, and intent classification for patient engagement, processing 50K+ monthly interactions with a 4.6/5 satisfaction rating. Integrated a RAG pipeline over a healthcare knowledge base with 92% retrieval accuracy, evaluated with Ragas; implemented HIPAA-compliant data handling with encryption and audit logs.
LLM Fine-Tuning Pipeline
June 24, 2026 – Present
Built an end-to-end fine-tuning pipeline for Llama 3 using LoRA/QLoRA adapters and 4-bit quantization via Unsloth, enabling high-performance training on consumer hardware. Evaluated fine-tuned models using DeepEval and Ragas benchmarks, achieving a 35% accuracy improvement over baseline with automated regression tracking.
Voice Medical Assistant
June 24, 2026 – Present
Developed a HIPAA-compliant voice AI assistant combining RAG, computer vision, and speech recognition with custom TTS for hands-free medical workflow automation. Reduced manual clinical tasks by 45% through voice-driven interaction and document retrieval with real-time streaming responses.
AI Career Assistant Platform
June 24, 2026 – Present
Built a full-stack SaaS with an AI career counselor, resume analyzer (PDF parsing + scoring), ReactFlow-based roadmap generator, and personalized cover letter generator powered by Gemini AI. Used LlamaIndex for document ingestion and retrieval, Inngest for background AI agent orchestration, and structured JSON outputs for consistent LLM responses.
AI Course & Video Generator
June 24, 2026 – Present
Engineered a SaaS platform that converts any topic into a complete video course with animated slides, TTS narration, and auto-synced captions, fully automated end-to-end. Implemented parallel processing with ThreadPoolExecutor for concurrent slide, audio, and caption generation; used Inngest for background jobs and Redis for rate limiting. Built a Streamlit admin dashboard for internal monitoring of job queues, user activity, and model performance metrics.
Cultural Fit Analysis
The candidate's project diversity, ranging from AI Travel Chatbots and Healthcare Conversational AI to AI Career Assistants and Video Generators, showcases adaptability and a broad interest in applying AI across various domains. Their experience with multiple cloud providers (AWS, Azure, GCP) and a wide array of AI/ML frameworks and tools indicates a willingness to learn and integrate new technologies. The focus on production-grade systems, efficiency, and security aligns well with a high-performance engineering culture.
Soft Skills & Operational Fit
The candidate demonstrates strong collaboration skills, having worked cross-functionally with data scientists, product managers, and frontend developers. Their ability to explain technical concepts to non-technical stakeholders is a significant asset. The resume highlights a focus on reproducible and reliable model deployments, efficiency improvements, and adherence to security best practices (HIPAA compliance), indicating a strong operational fit for a senior role.