Sawan Kumar

Data Scientist

https://www.opentalent.in/sawan-kumar-1


Mumbai, Maharashtra, India

Member since June 16, 2026

Key Strengths

Demonstrated experience in various data science techniques including sentiment analysis, dimensionality reduction (SVD), and machine learning algorithms (Random Forest, Neural Networks).
Practical application of pre-trained word vectors (GloVe) for natural language processing tasks.
Experience with model evaluation metrics like AUC score and cross-validation.
Familiarity with Jupyter Notebook and R for data analysis and model development.

Cultural & Operational Fit

Cultural Fit Analysis

The candidate's projects primarily focus on data science and machine learning, which aligns well with a Data Scientist role. The diversity of projects, from video key-frame extraction to credit scoring, indicates a broad interest in applying data science techniques across different domains. However, the projects are all personal, which might suggest a lack of experience in collaborative or production environments. The absence of team-based projects or contributions to open-source initiatives limits the assessment of cultural fit in a team setting.

Soft Skills & Operational Fit

The provided data does not contain information to assess soft skills or operational fit. The psychometric test score is 0, indicating no assessment was completed.

Projects

localrepo

June 14, 2026 – Present

localrepo — GitHub repository


git-demo

June 13, 2026 – Present

This is my first Git repository.


Coursera-Course-Certificates

January 4, 2021 – January 7, 2021

Coursera-Course-Certificates — GitHub repository


Key-Frames-Extraction-from-Video

December 9, 2019 – December 14, 2020

Key frames are extracted from a video using color histograms, SVD, and a dynamic clustering method. The analysis can be used to identify the frames that make up a shot. The code is well documented.
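
The three steps named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: the frames are synthetic solid-color images, the helper names (`color_histogram`, `reduce_with_svd`, `dynamic_clusters`) are hypothetical, and "dynamic clustering" is read here as "a frame joins the current cluster while its cosine similarity to the cluster centroid stays above a threshold".

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Concatenate per-channel histograms, normalised by pixel count."""
    channels = [
        np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]
    return np.concatenate(channels) / frame[..., 0].size

def reduce_with_svd(features, rank):
    """Project the frame-by-feature matrix onto its top `rank` singular directions."""
    u, s, _ = np.linalg.svd(features, full_matrices=False)
    return u[:, :rank] * s[:rank]

def dynamic_clusters(points, threshold=0.9):
    """Group consecutive frames; start a new cluster when similarity drops (assumed rule)."""
    clusters = [[0]]
    centroid = points[0].copy()
    for i in range(1, len(points)):
        p = points[i]
        cos = p @ centroid / (np.linalg.norm(p) * np.linalg.norm(centroid))
        if cos >= threshold:
            clusters[-1].append(i)
            centroid = points[clusters[-1]].mean(axis=0)
        else:
            clusters.append([i])
            centroid = p.copy()
    return clusters

def solid_frame(r, g, b, size=16):
    """Synthetic stand-in for a decoded video frame (hypothetical data)."""
    frame = np.empty((size, size, 3), dtype=np.uint8)
    frame[...] = (r, g, b)
    return frame

# Two synthetic "shots": five reddish frames followed by five bluish frames.
frames = [solid_frame(200 + i, 20 + i, 20 + i) for i in range(5)]
frames += [solid_frame(20 + i, 20 + i, 200 + i) for i in range(5)]

features = np.array([color_histogram(f) for f in frames])
reduced = reduce_with_svd(features, rank=2)
clusters = dynamic_clusters(reduced)

# One representative per cluster: here simply the middle frame of each group.
key_frames = [c[len(c) // 2] for c in clusters]
print(key_frames)
```

Picking the middle frame as the representative is only one option; the frame closest to the cluster centroid would be an equally reasonable choice.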


Sentiment-Analysis-of-Twitter-Data-using-Pre-trained-Vector-and-Neural-Network

December 8, 2019 – January 7, 2021

The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. After basic cleaning and spelling correction, I used pre-trained GloVe vectors to find a 200D representation for each word in a tweet that appears in the GloVe vocabulary. I then summed the matching vectors to obtain a 200D feature vector for each tweet. Finally, I fitted a neural network with one hidden layer and obtained a ten-fold cross-validation AUC score of 0.81.
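
The feature-building step described above (sum the vectors of the words that have a pre-trained embedding, skip the rest) can be sketched like this. The lookup table `TOY_VECTORS` is a hypothetical stand-in with made-up 3D values; the project itself uses 200D GloVe vectors.

```python
from typing import Dict, List

# Hypothetical stand-in for a pre-trained GloVe lookup table.
# Real GloVe vectors would be loaded from file and have 200 dimensions here.
TOY_VECTORS: Dict[str, List[float]] = {
    "good": [0.5, 0.1, -0.2],
    "movie": [0.0, 0.3, 0.4],
    "bad": [-0.6, 0.2, 0.1],
}

def tweet_vector(tokens: List[str], vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Sum the embeddings of tokens found in the vocabulary; ignore the others."""
    total = [0.0] * dim
    for token in tokens:
        vec = vectors.get(token)
        if vec is None:
            continue  # out-of-vocabulary word contributes nothing
        for i, value in enumerate(vec):
            total[i] += value
    return total

# "lol" is not in the toy vocabulary, so only "good" and "movie" are summed.
features = tweet_vector("good movie lol".split(), TOY_VECTORS, dim=3)
print(features)
```

A tweet with no matching words yields the all-zero vector, which the downstream model has to tolerate.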


Sentiment-Analysis-of-Twitter-Data-using-Pre-trained-Vector-and-ML-Algo

December 8, 2019 – January 7, 2021

The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. After basic cleaning and spelling correction, I used pre-trained GloVe vectors to find a 200D representation for each word in a tweet that appears in the GloVe vocabulary. I then summed the matching vectors to obtain a 200D feature vector for each tweet. Finally, I fitted a Random Forest model and obtained a ten-fold cross-validation AUC score of 0.793.


Sentiment-Analysis-of-Twitter-Data-using-DTM-SVD-and-ML

December 8, 2019 – January 7, 2021

The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. My approach is to clean the tweets, correct spelling, lemmatize, remove stop words, create a document-term matrix (since all frequent words have already been removed), reduce dimensionality, and finally fit an ML algorithm. These approaches are fairly naive. With this approach I reached a ten-fold cross-validation AUC score of 0.775.
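
A rough sketch of the middle of that pipeline (stop-word removal, document-term matrix, SVD-based dimensionality reduction) is shown below. The tweets, the tiny `STOP_WORDS` set, and the function names are all hypothetical; spelling correction, lemmatization, and the final classifier are left out.

```python
from collections import Counter
import numpy as np

STOP_WORDS = {"the", "a", "is", "i", "this"}  # hypothetical, deliberately tiny

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS]

def document_term_matrix(docs):
    """Rows are documents, columns are vocabulary terms, entries are raw counts."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = np.array([[c[t] for t in vocab] for c in counts], dtype=float)
    return matrix, vocab

def reduce_with_svd(matrix, rank):
    """Keep the top `rank` singular directions as dense document features."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]

tweets = [
    "I love this phone",
    "love the camera",
    "battery is terrible",
    "terrible phone battery",
]

dtm, vocab = document_term_matrix(tweets)
reduced = reduce_with_svd(dtm, rank=2)
print(vocab)
print(reduced.shape)
```

The reduced rows would then be passed to whichever classifier is being evaluated.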


ML-Model-to-identify-Churning-Customer-

December 8, 2019 – December 8, 2019

The challenge is to obtain a ten-fold cross-validation AUC score above 0.893, given telecom data with 'Churn' as the target variable.
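
For readers unfamiliar with the metric, here is a small sketch of how a ten-fold cross-validation AUC score can be computed. Everything in it is illustrative: the data is synthetic, `fit_mean_gap` is a hypothetical stand-in for a real churn model, and the stratified fold assignment is an assumption (the project description does not say how folds were formed).

```python
def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(y, k):
    """Deal each class round-robin into k folds so every fold sees both classes."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, label in enumerate(y) if label == cls]
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds

def cross_validated_auc(X, y, fit, k=10):
    """Train on k-1 folds, score the held-out fold, and average the k AUC values."""
    per_fold = []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_scores = [model(X[i]) for i in test_idx]
        per_fold.append(auc([y[i] for i in test_idx], fold_scores))
    return sum(per_fold) / k

def fit_mean_gap(X_train, y_train):
    """Hypothetical one-feature model: distance from the midpoint of the class means."""
    pos = [x for x, label in zip(X_train, y_train) if label == 1]
    neg = [x for x, label in zip(X_train, y_train) if label == 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    midpoint = (mean_pos + mean_neg) / 2
    direction = 1.0 if mean_pos >= mean_neg else -1.0
    return lambda x: direction * (x - midpoint)

# Synthetic single-feature data: 20 non-churners and 20 churners with overlap.
X = [float(j) for j in range(20)] + [float(10 + j) for j in range(20)]
y = [0] * 20 + [1] * 20

mean_auc = cross_validated_auc(X, y, fit_mean_gap, k=10)
print(mean_auc)
```

The pairwise AUC above is quadratic in the fold size, which is fine for a sketch but slow on large data; a rank-based formula gives the same value faster.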


Creating-a-Credit-Scoring-Model-to-obtain-the-probability-of-default

December 8, 2019 – December 8, 2019

We have baseline and loan performance information for approximately 6000 loans. The target variable (BAD) is a binary variable indicating whether an applicant eventually defaulted or was seriously delinquent. We have 12 recorded variables for each applicant. Given this information, we want a predictive model that outputs the 'probability of default'. The model should be interpretable and statistically sound so that we can give reasons for rejections.
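
One common way to meet the "interpretable probability of default" requirement is logistic regression, where each coefficient shows how a variable pushes the predicted risk up or down. The project description does not state which model was used, so the sketch below is an assumption, and the two features (`debt_ratio`, `delinquencies`) and their values are hypothetical toy data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Plain batch gradient descent on the logistic loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(X, y):
            z = bias + sum(w * x for w, x in zip(weights, row))
            error = sigmoid(z) - label
            grad_b += error
            for j, x in enumerate(row):
                grad_w[j] += error * x
        bias -= lr * grad_b / len(X)
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
    return weights, bias

def probability_of_default(row, weights, bias):
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))

# Hypothetical applicants: [debt_ratio, delinquencies]; label 1 means BAD.
FEATURES = ["debt_ratio", "delinquencies"]
X = [[0.2, 0], [0.3, 0], [0.25, 1], [0.6, 2], [0.7, 3], [0.8, 2]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(X, y)
p_low = probability_of_default([0.2, 0], weights, bias)
p_high = probability_of_default([0.75, 3], weights, bias)

# The signed coefficients are what make rejection reasons explainable.
for name, w in zip(FEATURES, weights):
    print(name, round(w, 3))
print(round(p_low, 3), round(p_high, 3))
```

A production scorecard would also need significance tests and calibration checks to back the "statistically sound" part; those are outside this sketch.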


-AI-CL-688-Course-Project

November 9, 2015 – November 9, 2015

This contains the code for the course project.


