Data Science with 1+ years in Python & Machine Learning
Data-focused graduate engineer with a Post Graduate Diploma in Big Data Analytics and a strong foundation in Python, SQL, Java, and R. Hands-on experience designing ETL pipelines, building machine learning models, and deploying AI-powered applications. Proven ability to process large datasets, engineer features, and deliver measurable outcomes, including 89% model accuracy, a 15% error reduction, and sub-2-second retrieval. Detail-oriented collaborator eager to contribute to data engineering, analytics, or ML engineering teams.
Centre for Development of Advanced Computing (C-DAC)
Post Graduate Diploma · Big Data Analytics
August 1, 2025 – February 1, 2026
Greater Noida Institute of Technology
Bachelor of Technology · Mechanical Engineering
August 1, 2015 – June 1, 2019
RLSY College, Patna
Senior Secondary (XII)
April 1, 2013 – May 1, 2015
St. Karen's Secondary School, Patna
Secondary Education (X)
N/A – April 1, 2013
Vikasaarth Trust
Volunteer Teacher & Data Analysis Support
October 1, 2022 – February 1, 2024
Patna, Bihar, India
Medical Chatbot Using LLMs and RAG
January 1, 2026 – March 1, 2026
Critical healthcare information was buried across 1,000+ pages of unstructured PDF documents, making fast and accurate query resolution impossible for end users. Built an AI-powered medical chatbot to enable intelligent, context-aware retrieval of healthcare information from large unstructured document collections. Indexed 1,000+ medical PDF pages, generated and managed 10,000+ vector embeddings in Pinecone for semantic similarity search, integrated LLMs through a LangChain RAG pipeline, and deployed the backend with Flask on AWS. Reduced information retrieval time to under 2 seconds and improved contextual response relevance by 20%+ over keyword-based search.
Insurance Price Prediction
December 1, 2025 – February 1, 2026
Insurance premium estimation relied on manual, error-prone calculations with no scalable predictive system in place. Designed and built a full end-to-end ML pipeline to automate premium prediction from structured customer data. Processed 10,000+ records through ETL and preprocessing pipelines; implemented Linear Regression, Random Forest, and XGBoost with GridSearchCV hyperparameter tuning; conducted EDA and feature engineering; deployed an interactive Streamlit web app for real-time premium estimation. Achieved 89% prediction accuracy with XGBoost and reduced prediction error by 15% compared to the baseline model.
Cultural Fit Analysis
The candidate's academic projects showcase a diverse application of data science skills, from traditional ML for prediction to advanced AI with LLMs and RAG. The volunteer experience, though not directly technical, demonstrates a commitment to community and an ability to apply data analysis in a non-profit setting, indicating adaptability and a broader perspective. The target role of Data Science aligns well with the candidate's recent education and project focus, suggesting a strong interest and dedication to the field.
Soft Skills & Operational Fit
The candidate demonstrates strong problem-solving skills through project descriptions, addressing real-world challenges with technical solutions. The volunteer experience highlights a collaborative spirit and an ability to take ownership of data management and reporting, indicating good operational fit and a proactive attitude. The detailed project descriptions suggest good communication of technical work.