
Machine Learning at Netflix
Machine Learning Research and Engineering, with expertise in the following domains: 1. Personalization and recommendation AI research at Netflix. 2. Large-scale, end-to-end ML for hate speech/bullying detection and misinformation matching at Facebook/Meta. 3. Conversational AI and natural language understanding research during my time at Amazon Alexa.
University of Southern California
Master's degree, Computer Science
January 1, 2015 – January 1, 2017
Pune Institute of Computer Technology
Bachelor of Engineering (B.E.), Information Technology
January 1, 2011 – January 1, 2015
Netflix
Machine Learning Researcher
January 1, 2023 – Present
Los Gatos, California, United States · On-site
Jnana Prabodhini Foundation
Volunteer
November 1, 2019 – Present
United States · Remote
Meta
Machine Learning Researcher
November 1, 2018 – January 1, 2023
Menlo Park, California, United States
Amazon
Applied Scientist, Machine Learning and NLU
June 1, 2017 – November 1, 2018
Cambridge, Massachusetts
Amazon
Applied Scientist Intern
January 1, 2017 – April 1, 2017
Greater Boston Area
Cisco
Software Development Intern
June 1, 2016 – August 1, 2016
San Francisco Bay Area
BMC Software
Application Development Intern
July 1, 2014 – May 1, 2015
Pune, Maharashtra, India
Toutiao Q&A Recommendation System
September 1, 2016 – November 1, 2016
Finished #28 out of 1036 international teams in the following competition. Toutiao Q&A is an upcoming mobile social platform with around 530 million Toutiao users and a precise recommendation algorithm; it promotes short-form content creation and interaction on mobile devices in a Q&A format. The platform strives to match information with the right people, finding the best respondents for questions and the best readers for answers, based on each expert's area of expertise and the tags attached to the questions. Each data record includes expert tags, question data and question distribution data. Given a set of questions, the task was to forecast which experts are more likely to answer which questions. Specifically, for each question-expert pair, we had to estimate the probability of that expert answering the question. The competition used Normalized Discounted Cumulative Gain (NDCG) as the evaluation criterion, using the formula: NDCG@5 * 0.5 + NDCG@10 * 0.5
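The evaluation formula above can be sketched as follows. This is a minimal illustration that assumes binary relevance (1 if the expert answered, 0 otherwise), linear gain and a log2 discount; the competition's exact NDCG variant is not stated here, and the function names are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    # Discounted cumulative gain over the top-k items of a ranked list.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (descending) ordering.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def competition_score(relevances):
    # Weighted combination quoted above: NDCG@5 * 0.5 + NDCG@10 * 0.5.
    return 0.5 * ndcg_at_k(relevances, 5) + 0.5 * ndcg_at_k(relevances, 10)
```

Here `relevances` is the list of ground-truth labels for one question, ordered by the predicted answer probability of each expert.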
Sarcasm Detection in Hindi
March 1, 2016 – May 1, 2016
Detecting sarcasm can benefit many sentiment analysis NLP applications, such as review summarization, dialogue systems, opinion mining and review ranking systems. In this project, we formulate sarcasm detection as a classification task: given a text, the goal is to predict whether it is sarcastic or not. Twitter, as a micro-blogging platform, offers a diverse range of sarcastic and non-sarcastic tweets across multiple domains such as politics, sports, environment and regional topics. Cross-language text classification: we trained our classifier on tweets in Hindi, tested it on both Hindi and English tweets, and evaluated performance with comments on aspects of language conversion. The biggest challenge of this work lies in feature engineering. We wish to exploit different language features along with contextualized Twitter features to train classifiers. Existing work in the field emphasizes Naive Bayes (NB) and SVM classification over various feature formulations. None of the previous work has addressed the Hindi language or cross-language learning; our aim is to achieve both. We believe that such a project will help improve the accuracy of sentiment analysis across different languages.
Part of Speech Tagger for Catalan corpus
March 1, 2016 – Present
Implemented a Hidden Markov Model for POS tagging on a Catalan corpus, with the Viterbi decoding algorithm producing the output tag sequence. A final accuracy of 94.04% was achieved on test data.
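As an illustration of the decoding step, here is a minimal Viterbi sketch in log space. It is not the project's code: the three-tag model, its probabilities and the `floor` value used for unseen events are made-up assumptions for the example.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely state sequence under an HMM (log-space Viterbi)."""
    if not observations:
        return []

    def lp(p):
        # Floor zero probabilities so the logarithm stays finite.
        return math.log(max(p, floor))

    best = [{s: lp(start_p.get(s, 0.0)) + lp(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            prev, score = max(
                ((p, best[t - 1][p] + lp(trans_p[p].get(s, 0.0))) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lp(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Follow back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Toy model with hypothetical probabilities (not estimated from any corpus).
STATES = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2},
}
EMIT_P = {
    "DET": {"el": 0.9},
    "NOUN": {"gat": 0.8, "dorm": 0.05},
    "VERB": {"dorm": 0.9, "gat": 0.01},
}
```

With this toy model, `viterbi(["el", "gat", "dorm"], STATES, START_P, TRANS_P, EMIT_P)` returns `["DET", "NOUN", "VERB"]`. A real tagger would estimate the three tables from tagged training sentences.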
Content Enrichment in Big Data Text Retrieval
February 1, 2016 – April 1, 2016
The objective of the project was to significantly enrich the metadata and the automatically extracted text and entities from the TREC Polar Dataset, and to make the dataset easy to relate to and interact with. Key Steps:
1. Context Extraction Enrichment – We applied the Tag Ratios algorithm to identify text, and constructed a Tika parser to extract measurement mentions from text automatically.
2. Metadata Enrichment – We applied the GROBID journal parser with Tika to extract TEI metadata, and retrieved scientific publication metadata using the Google Scholar API to build a network of scientific publications related to the Polar dataset and to map publications to the data. In addition, we classified the data using a common Earth science domain model and ontology called SWEET, the Semantic Web for Earth and Environmental Terminology (http://sweet.jpl.nasa.gov/). We also created Digital Object Identifiers (DOIs) for the data.
3. Information Similarity and Clustering – We created clusters of the Polar data using the extracted measurements and the enriched metadata, and presented the results with Data-Driven Documents visualizations after ingesting the data into Apache Solr.
4. Named Entity Recognition (NER) – We applied geospatial NER using the GeoTopicParser in Apache Tika and the MEMEX GeoParser tools.
Truthful and Deceptive Hotel Reviews Detection
February 1, 2016 – Present
A naive Bayes classifier that identifies hotel reviews as either truthful or deceptive, and as either positive or negative. Word tokens were used as features for classification on real data from a hotel review corpus. Unknown words were handled using Laplace smoothing and set priors. An F1 score of 0.87 was achieved on the test data.
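A minimal sketch of a multinomial naive Bayes classifier with Laplace (add-alpha) smoothing follows. It is an illustration under assumptions, not the project's implementation: priors are estimated from class frequencies, tokens never seen in training are skipped at prediction time, and the class name `NaiveBayes` is hypothetical. One such classifier could be trained per label axis (truthful/deceptive, positive/negative).

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # alpha=1.0 gives Laplace (add-one) smoothing

    def fit(self, docs, labels):
        # docs: list of token lists; labels: one class label per document.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # assumption: ignore words unseen in training
                count = self.word_counts[label][tok]
                score += math.log((count + self.alpha) /
                                  (total_words + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Smoothing keeps a word that appears in only one class from driving the other class's likelihood to zero.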
Mime Diversity Analysis in Big Data
January 1, 2016 – March 1, 2016
This project employed concepts from MIME taxonomy and data similarity, along with byte-based fingerprinting of the data via Byte Frequency Analysis (BFA), Byte Frequency Distribution (BFD) Correlation, Byte Frequency Cross-Correlation (BFC), and File Header Trailer (FHT) analysis. We implemented a set of MIME diversity programs and applications that helped in better understanding unknown MIME types in a rich scientific domain. We then computed the BFA, BFC and FHT of these unknown (and other) Polar data types from the dataset, and built a system that allows visual interaction with and introspection of the MIME diversity in this dataset. Those classifications improved Tika's overall ability by suggesting new MIME magic for its database, and improved techniques for MIME detection in the Big Data present in the TREC-DD-Polar dataset.
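To make the idea of a byte-based fingerprint concrete, here is a small sketch that builds a byte frequency distribution for a blob of bytes and compares two fingerprints. The peak normalization and the use of a Pearson correlation as the similarity score are assumptions chosen for illustration; they are not necessarily the definitions used in the project or in Tika.

```python
import math

def byte_frequency_distribution(data):
    # Count occurrences of each of the 256 byte values, then scale by the
    # most frequent byte so every entry falls in [0, 1].
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    peak = max(counts)
    if peak == 0:
        return [0.0] * 256
    return [c / peak for c in counts]

def fingerprint_similarity(fp_a, fp_b):
    # Pearson correlation between two 256-bin fingerprints (illustrative choice).
    n = len(fp_a)
    mean_a, mean_b = sum(fp_a) / n, sum(fp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(fp_a, fp_b))
    var_a = sum((a - mean_a) ** 2 for a in fp_a)
    var_b = sum((b - mean_b) ** 2 for b in fp_b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)
```

Files of the same type tend to produce similar histograms, which is what makes such fingerprints useful for suggesting a MIME type for unlabeled data.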
Mancala Game Engine Development
October 1, 2015 – November 1, 2015
Implemented a Mancala game engine using the Minimax algorithm with alpha-beta pruning.
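The search technique can be sketched on an abstract game tree. In this illustration a tree is a nested list whose leaves are numeric evaluations, and players strictly alternate; both are simplifying assumptions. A Mancala engine would additionally need move generation, a board evaluation function, and handling of the extra-turn rule.

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    # Leaves carry a static evaluation; inner nodes are lists of children.
    if not isinstance(node, list):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will never allow this branch
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return value
```

Pruning returns the same value as plain minimax while skipping branches that cannot affect the result.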
Spell Suggest and Grammar Checker
September 1, 2015 – October 1, 2015
The spell suggest tool uses a unigram language model built from a dataset of words. Edit distances up to 2 are covered to correct spelling mistakes. In grammar checking, the given text is parsed into POS tags using the Stanford NLP POS tagger. Error detection is carried out using a POS tag model trained on a dataset of correct English.
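A minimal sketch of spelling suggestion with a unigram model and candidates within edit distance 2 is shown below. It follows the well-known candidate-generation approach popularized by Peter Norvig's spelling corrector, which may differ from the project's implementation; the lowercase English alphabet and the function names are assumptions.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    # All strings one edit away: delete, transpose, replace, or insert.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in ALPHABET]
    inserts = [l + c + r for l, r in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def suggest(word, counts):
    # counts: unigram frequencies, e.g. a Counter built from a word corpus.
    if word in counts:
        return word
    near = {w for w in edits1(word) if w in counts}
    if near:
        return max(near, key=counts.get)
    far = {w2 for w1 in edits1(word) for w2 in edits1(w1) if w2 in counts}
    if far:
        return max(far, key=counts.get)
    return word  # no known word within two edits
```

For example, with `counts = Counter("the quick brown fox".split())`, `suggest("teh", counts)` returns `"the"`.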
Octave Based Sentiment Analysis
August 1, 2015 – September 1, 2015
The application, built in Octave, trains SVM-based classifiers to predict the emotion present in a given text message. A logistic regression algorithm was also applied to chat applications. Text messages and word lists were used as features in designing the system.
Chess Game Development
July 1, 2014 – August 1, 2014
A GUI-based chess game developed in Visual Basic 6. The game provides the chess board and shows the player all valid moves at the current time. Rules such as en passant and promotion are supported.
Benchmarking-SQL-vs-NoSQL
July 1, 2014 – May 1, 2015
A Java-based tool for comparing database performance. To compare the two trends in databases, MySQL and MongoDB were chosen as representatives. Datasets of sizes varying from 100 to 1,000,000 were generated using Python and compared on various criteria. The benchmarking output was represented as graphs depicting the differences in performance.
Textile Industry Database Management System
July 1, 2014 – October 1, 2014
This business application was deployed in the textile industry. The entire business flow, from supply chain management to customer support, was designed and implemented as a Visual Studio based enterprise DBMS. The front end was developed in Visual Studio, with an Oracle database as the back end. Features included analysis of every expense and gain in the form of visually intuitive reports, and handling of a unique scheme for bill payments and reminders in which customers are allowed a period of debt.
Web Based Student Help Forum
February 1, 2014 – April 1, 2014
An interactive web portal for a student help forum. The website allowed students to ask questions, give answers and share knowledge, and included facilities for upvoting and downvoting an answer. It was developed using Java, JSP, JavaScript and HTML/CSS.
Machine Learning
Coursera
June 24, 2026 – Present
Cultural Fit Analysis
The candidate has worked at leading tech companies (Netflix, Meta, Amazon) known for fast-paced, innovative environments. The diverse range of personal projects, from NLP to game development and database management, indicates a broad interest and willingness to explore different technical domains. The volunteer experience suggests a community-oriented aspect. While the experience is heavily skewed towards Machine Learning Research, the target role is 'Data Analyst'. This represents a potential mismatch in primary focus, as the candidate's background is more advanced in ML/AI than typical data analysis roles, which might lead to overqualification or a desire for more advanced ML-focused tasks. The breadth of projects, however, shows adaptability.
Soft Skills & Operational Fit
The candidate's project history, particularly the volunteer work and diverse project portfolio, suggests a proactive and engaged individual. Experience in leading ML initiatives at Meta indicates strong problem-solving and potentially leadership skills. However, without specific psychometric test results, a definitive assessment of work attitude, stress handling, and team collaboration is not possible.