Tanvi Keswani | OpenTalent

Key Strengths

Extensive experience in data science and predictive analytics, with a strong background in machine learning model development and deployment.
Demonstrated ability to handle large datasets, perform data preprocessing, and apply various statistical and machine learning techniques (Logistic Regression, Random Forest, ANN, Gradient Boosting, ARIMA, K-Means Clustering).
Experience with ETL processes, migrating SQL to pySpark, and working with AWS, HiveQL, and PostgreSQL, indicating a solid understanding of data pipelines and infrastructure.
Proven track record in customer churn prediction and segmentation, which is highly relevant for data-driven business strategies.
Master's degree in Data Science from a reputable institution (IIT Kanpur) provides a strong theoretical foundation.

Cultural & Operational Fit

Cultural Fit Analysis

The candidate's diverse project portfolio, ranging from customer churn prediction to time series analysis and even a smart transportation system, indicates a broad interest and adaptability. Experience across multiple companies (ALDAR, Swvl, Anheuser-Busch InBev, Ericsson, CoreCompete) and roles (Senior Associate, Product Data Scientist, Mentor) suggests an ability to integrate into different organizational cultures. The focus on practical, problem-solving projects aligns well with a results-oriented environment. The numerous certifications, including those in Generative AI and MLOps, demonstrate a commitment to continuous learning and staying current with industry trends.

Soft Skills & Operational Fit

The candidate's experience as a Data Science Mentor and involvement in placement activities at IIT Kanpur suggest strong communication, leadership, and stakeholder management skills. The project descriptions indicate an ability to translate business requirements into technical solutions and work with diverse teams. The certifications in project management and MLOps also point to an understanding of operational aspects of data science projects.

About

I’m a Lead Data Scientist with 10+ years of experience (M.Tech, IIT Kanpur) building AI/ML and Generative AI solutions that solve real business problems across real estate, retail, and customer analytics domains.

Over the last few years, I’ve been particularly focused on applying Generative AI and LLMs to enterprise use cases, from auto-classification and narrative generation to operational intelligence and decision support systems. One of the recent solutions I worked on achieved ~96% accuracy on unseen data while reducing manual operational effort from approximately one week to a few hours.

What I enjoy most is combining strong technical problem-solving with business understanding, whether it’s improving customer engagement, predicting churn, building risk scoring frameworks, or helping leadership teams make faster and better decisions using data.

My experience spans the full AI/ML lifecycle:
• Problem framing & stakeholder discussions
• Data engineering & feature engineering
• Machine learning & predictive modeling
• Generative AI / LLM applications
• Model deployment & monitoring
• Driving adoption of AI solutions across business teams

Technically, I work across:
• Generative AI & LLMs (classification, summarization, narrative generation, prompt engineering)
• Machine Learning & Predictive Modeling (XGBoost, Random Forest, forecasting, churn/risk models)
• Customer Analytics (engagement scoring, segmentation, behavioral analytics)
• Python, SQL, Azure OpenAI, Databricks, Docker, Streamlit, Flask

I’ve also worked closely with business stakeholders, internal teams, and external vendors on requirement gathering, solution evaluation, and enterprise AI initiatives.

Passionate about building scalable AI solutions that create measurable business impact. Currently open to opportunities in Data Science, AI, and Generative A

Top Skills

Azure Databricks, Customer Lifecycle Management, Stakeholder Management, Prompt Engineering, Azure OpenAI, Predictive Analytics, Machine Learning, Data Science, Data Analysis, Business Analysis, Statistical Modeling, Research, Engineering, Financial Analysis, Data Mining, Statistics, Python, SQL, R, Microsoft Word, PowerPoint, HTML

Education

Indian Institute of Technology, Kanpur

Master’s Degree, Data Science

January 1, 2015 – January 1, 2017

Maharana Pratap University of Agriculture and Technology

Bachelor of Technology (B.Tech.), Electronics and Communication Engineering

January 1, 2011 – January 1, 2015

MDS Public School, Udaipur

High School

January 1, 2009 – January 1, 2011

St. Gregorios Sr. Sec. School

SSC

January 1, 2003 – January 1, 2009

Experience

ALDAR

Senior Associate - Data Scientist

June 1, 2022 – Present

United Arab Emirates

Swvl

Product Data Scientist

January 1, 2022 – May 1, 2022

Dubai, United Arab Emirates

Anheuser-Busch InBev

Senior Data Scientist - Analytics

November 1, 2021 – December 1, 2021

Bengaluru, Karnataka, India

Analytics India Magazine

Data Science Mentor

July 1, 2020 – August 1, 2021

Ericsson

Data Scientist

October 1, 2018 – November 1, 2021

Bengaluru, Karnataka, India

CoreCompete LLC

Associate Data Scientist

July 1, 2017 – September 1, 2018

Greater Hyderabad Area

CoreCompete LLC

Data Science Intern

May 1, 2016 – July 1, 2016

Hyderabad, Telangana, India

Indian Institute of Technology, Kanpur

DPC, IME M.Tech

April 1, 2016 – March 1, 2017

Airports Authority of India

Intern

June 1, 2014 – July 1, 2014

Udaipur

Projects

ETL Development for Global Auto Parts Retailer

August 1, 2017 – September 1, 2018

A US-based retail client operating more than 3,000 stores that carry millions of products across the US.

- Migrated Oracle SQL to pySpark to reduce the processing time required to create the Prediction Input Abstract base table, which is the input for demand forecasts, assortments, and replenishment forecasts.
- Coordinated with the business team to translate requirements into automated processes, covering data analysis and end-to-end technical implementation on the AWS platform.
- Loaded the output data of the different processes into a PostgreSQL database for easy use by end users.

Tools: pySpark, HiveQL, SQL, Shell scripting, Sqoop

Development of a Money Attitude and Financial Behavior Scale for Indians [M. Tech Thesis]

October 1, 2016 – May 1, 2017

- Developed a 36-question questionnaire in English and Hindi and gathered data from 625 respondents across 20 villages and 22 cities in India, using online surveys as well as personal interviews.
- Exploratory factor analysis yielded 6 factors, which were named (i) ‘Financial Prudence’, (ii) ‘Extravagance’, (iii) ‘Financial Knowledge’, (iv) ‘Financial Anxiety’, (v) ‘Importance Attached to Money’, and (vi) ‘Financial Support Network’.
- After obtaining the 6-factor model with eighteen items, conducted a confirmatory factor analysis of this model on the sample of individuals who had been administered the Hindi questionnaire.

Customer Churn Prediction and Segmentation using Python

May 1, 2016 – July 1, 2016

• Worked with a large dataset from the telecommunication sector.
• Applied data mining techniques to identify customers with a high likelihood of churn in the prepaid service base.
• Analysed, explored, and prepared the data: identified and treated outliers and missing values, then used the cleaned data for predictive modelling.
• Started with 578 variables; reduced them based on univariate analysis, low variance, and multicollinearity, then applied feature selection.
• Through an iterative modelling process and business understanding, selected 11 final predictors.
• Implemented various models in Python (Logistic Regression, Decision Tree, Naïve Bayes, Ensemble Methods).
• Validated the models using various evaluation metrics.
• Compared all the models and selected the best one.
• The best model was Gradient Boosting, with an accuracy of 92%, a misclassification rate of 8%, a cumulative lift of 5.75, a KS of 58.64, and an ROC of 91%.
• Profiled customers using K-Means Clustering to design the best service program for each customer segment.
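
For readers less familiar with the KS and lift figures quoted above, the following is a minimal sketch of how such metrics are commonly computed from model scores. It is not the project's actual code: the function name `ks_and_top_decile_lift`, the toy labels and scores, and the choice of the top decile as the lift depth are all assumptions made for illustration (the profile does not state the depth at which the lift of 5.75 was measured).

```python
def ks_and_top_decile_lift(y_true, scores):
    """Return (KS in percent, lift in the top 10% of scored customers).

    y_true: 1 for churners, 0 for non-churners (hypothetical toy input).
    scores: predicted churn scores; assumed distinct, since tied scores
            are simply kept in their input order by the stable sort.
    """
    # Rank customers from highest to lowest predicted churn score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(y_true)
    total_neg = len(y_true) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need both churners and non-churners")

    # KS: largest gap between the cumulative share of churners and the
    # cumulative share of non-churners while walking down the ranking.
    cum_pos = cum_neg = 0
    ks = 0.0
    for _, label in ranked:
        if label == 1:
            cum_pos += 1
        else:
            cum_neg += 1
        ks = max(ks, abs(cum_pos / total_pos - cum_neg / total_neg))

    # Lift: churn rate in the top decile divided by the overall churn rate.
    top_n = max(1, len(ranked) // 10)
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    base_rate = total_pos / len(ranked)
    return 100 * ks, top_rate / base_rate


# Toy data: 10 customers, 3 of whom churned.
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
preds = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
ks, lift = ks_and_top_decile_lift(labels, preds)
print(f"KS = {ks:.2f}, top-decile lift = {lift:.2f}")
# prints "KS = 85.71, top-decile lift = 3.33"
```

A higher KS means the scores separate churners from non-churners more cleanly, and a lift above 1 means the top-scored customers churn more often than the base as a whole.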

Time Series Analysis of Agricultural Production (Food Grains) in India

April 1, 2016 – Present

• Aimed to predict the production of various food grains, such as rice, wheat, coarse cereals, and pulses, using time series models.
• Collected production data from 1950 to 2014 from the RBI database.
• Used univariate (single-variable) time series analysis for the study.
• Forecast the production of each food grain based on previously observed values.
• The Autoregressive Integrated Moving Average (ARIMA) model was the best fit.

Customer Churn Management Program

March 1, 2016 – Present

• Aimed to develop a statistical model for predicting customer churn and to use the model to identify the most important drivers of churn.
• Performed initial pre-processing and cleaning of a dataset containing 71,047 records and 78 variables.
• Built models using Logistic Regression, Random Forest, and Artificial Neural Networks to delineate the key factors that lead to customer churn. The models were trained on a calibration dataset of 40,000 records.
• The predictive accuracy of the final model was around 69%.

Assessing the impact of mobile advertisement amongst consumers

March 1, 2016 – Present

• Aimed to find out how individuals’ attitudes towards shopping are correlated with mobile marketing.
• Aimed to check the reliability of mobile marketing.
• Conducted focus groups and interviews to perform initial exploratory research.
• Prepared a questionnaire to conduct a survey.
• Tested different hypotheses on the data obtained and analysed the responsible factors using Logistic Regression.

Analysis of the factors affecting Housing Prices

February 1, 2016 – Present

• Determined the various factors affecting housing prices.
• Used data on 546 observations of house sale prices in the city of Windsor, Ontario, Canada, for the year 1987.
• Carried out multiple regression on this secondary data to find relationships between housing prices and independent factors such as lot size, number of bedrooms, number of bathrooms, and central air conditioning.
• Formulated models depicting the effects of these factors.

Smart Transportation System

September 1, 2014 – June 1, 2015

• Designed a car that autonomously navigates a track by detecting lanes and centering itself between them, as well as detecting objects in front of it and avoiding collisions.

Certifications

What is Data Science?

IBM

June 25, 2026 – Present

Fraud Detection in Python

DataCamp

June 25, 2026 – Present

Visualizing Time Series Data in Python

DataCamp

June 25, 2026 – Present

Machine Learning for Time Series Data in Python

DataCamp

June 25, 2026 – Present

Data Science Methodology

IBM

June 25, 2026 – Present

Ask Questions to Make Data-Driven Decisions

Google

June 25, 2026 – Present

Introduction to Data Science Specialization

IBM

June 25, 2026 – Present

Introduction to Network Analysis in Python

DataCamp

June 25, 2026 – Present

Introduction to SQL

DataCamp

June 25, 2026 – Present

Tools for Data Science

IBM

June 25, 2026 – Present

Introduction to R

DataCamp

June 25, 2026 – Present

AI For Everyone

DeepLearning.AI

June 25, 2026 – Present

Essentials of MLOps with Azure: 3 Spark MLflow Projects on Databricks

June 25, 2026 – Present

Essentials of MLOps with Azure: 1 Introduction

June 25, 2026 – Present

The Employee's Guide to Sustainability

June 25, 2026 – Present

Prepare Data for Exploration

Google

June 25, 2026 – Present

How Google does Machine Learning

Google Cloud

June 25, 2026 – Present

Applied Text Mining in Python

University of Michigan

June 25, 2026 – Present

Digital Transformation

Boston Consulting Group (BCG)

June 25, 2026 – Present

Data Visualization with Python

IBM

June 25, 2026 – Present

Understanding and Visualizing Data with Python

University of Michigan

June 25, 2026 – Present

Data Frameworks for Generative AI

Fractal Analytics

June 25, 2026 – Present

Essentials of MLOps with Azure: 2 Databricks MLflow and MLflow Tracking

June 25, 2026 – Present

Introduction to Machine Learning in Production

Coursera

June 25, 2026 – Present

Foundations: Data, Data, Everywhere

Google

June 25, 2026 – Present

Introduction to Git and GitHub

Google

June 25, 2026 – Present

Intermediate Network Analysis in Python

DataCamp

June 25, 2026 – Present

Perform Sentiment Analysis with scikit-learn

Coursera

June 25, 2026 – Present

Intro to Time Series Analysis in R

Coursera

June 25, 2026 – Present

Practical Time Series Analysis

The State University of New York

June 25, 2026 – Present

Generative AI Essentials: A Comprehensive Introduction

Fractal Analytics

June 25, 2026 – Present

Using Databases with Python

University of Michigan

June 25, 2026 – Present

Develop Generative AI Applications: Get Started

IBM

June 25, 2026 – Present

MLOps Essentials: Model Development and Integration

June 25, 2026 – Present

What Is Generative AI?

June 25, 2026 – Present

Foundations of Project Management

Google

June 25, 2026 – Present

Analyze Data to Answer Questions

Google

June 25, 2026 – Present

Introduction to Portfolio Risk Management in Python

DataCamp

June 25, 2026 – Present

Applied Machine Learning in Python

University of Michigan

June 25, 2026 – Present

Databases and SQL for Data Science

IBM

June 25, 2026 – Present
