
Senior Data Scientist
I am an experienced data scientist who builds end-to-end data science packages covering data exploration, interactive visualization, model selection and training, and model deployment. I have worked with varied data sources, including medical images, financial documents, structured audit data and web articles. My core competencies span machine learning algorithms, NLP, cloud platforms, data visualization and web service design. I am keen to exchange ideas and happy to connect with interested people.
The University of Alabama in Huntsville
Master of Science in Engineering, Electrical and Computer Engineering
January 1, 2014 – January 1, 2016
Anna University Chennai
Bachelor of Engineering (BEng), Electronics and Communications Engineering
January 1, 2010 – January 1, 2014
Microsoft
Senior Security Data Scientist
October 1, 2024 – Present
Redmond, Washington, United States · Hybrid
Centific
Data Scientist
June 1, 2023 – September 1, 2024
Redmond, Washington, United States · Remote
Microsoft
Senior Data and Applied Scientist
June 1, 2020 – May 1, 2023
Greater Seattle Area · Hybrid
KPMG US
Senior Associate - Data Scientist
October 1, 2017 – June 1, 2020
New York City
KPMG US
Associate Software Engineer
November 1, 2016 – September 1, 2017
New York City
The University of Alabama in Huntsville
Graduate Teaching Assistant
August 1, 2015 – May 1, 2016
The University of Alabama in Huntsville
Office Assistant
September 1, 2014 – August 1, 2015
Regression Model Performance for Boston Housing Prices Prediction
May 1, 2016 – Present
This project implemented a Decision Tree Regressor, a kNN Regressor and an AdaBoost Regressor and compared their performance on the Boston housing dataset. Learning curves were generated by varying the training set size, and model complexity curves by varying the model complexity. A grid-search/cross-validation pipeline was built to choose the best model automatically.
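The grid-search/cross-validation idea can be illustrated with a minimal, standard-library-only sketch. It is not the project's code: it covers only a kNN regressor, and the function names (`knn_predict`, `cv_mse`, `grid_search_k`) and the interleaved fold assignment are assumptions made for illustration.

```python
def knn_predict(train_X, train_y, x, k):
    """Predict by averaging the targets of the k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), t)
        for row, t in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(t for _, t in nearest) / len(nearest)


def cv_mse(X, y, k, folds=5):
    """Mean squared error of kNN over interleaved cross-validation folds."""
    n = len(X)
    total = 0.0
    for f in range(folds):
        test_idx = set(range(f, n, folds))  # every folds-th sample is held out
        tr_X = [X[i] for i in range(n) if i not in test_idx]
        tr_y = [y[i] for i in range(n) if i not in test_idx]
        for i in sorted(test_idx):
            total += (knn_predict(tr_X, tr_y, X[i], k) - y[i]) ** 2
    return total / n


def grid_search_k(X, y, candidates, folds=5):
    """Return the candidate k with the lowest cross-validated MSE, plus all scores."""
    scores = {k: cv_mse(X, y, k, folds) for k in candidates}
    return min(scores, key=scores.get), scores
```

Calling `grid_search_k(X, y, [1, 2, 3, 5])` returns the selected `k` together with the per-candidate errors, which are the values a model complexity curve would plot.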
Handwritten Digit Recognition using Neural Networks
February 1, 2016 – Present
I set out to develop a module that implements the complete workflow of a neural network: setting up the network, running the feed-forward pass, and training it by backpropagation to recognize MNIST digit images, a 10-class classification task.
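The feed-forward and backpropagation steps can be sketched as follows. This is a minimal illustration, not the project's module: it assumes a single sigmoid hidden layer, a softmax output with cross-entropy loss, no bias terms, and one training example per update, and every function name is hypothetical.

```python
import math
import random


def init_weights(n_in, n_hidden, n_out, seed=0):
    """Small random weight matrices (rows = units in the receiving layer)."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    return W1, W2


def forward(W1, W2, x):
    """Feed-forward pass: sigmoid hidden layer, softmax output."""
    h = [1 / (1 + math.exp(-sum(w * xi for w, xi in zip(row, x)))) for row in W1]
    z = [sum(w * hi for w, hi in zip(row, h)) for row in W2]
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return h, [v / s for v in e]


def loss(W1, W2, x, label):
    """Cross-entropy loss for one example."""
    _, p = forward(W1, W2, x)
    return -math.log(p[label])


def backprop(W1, W2, x, label):
    """Gradients of the loss with respect to W1 and W2."""
    h, p = forward(W1, W2, x)
    dz = [pi - (1.0 if i == label else 0.0) for i, pi in enumerate(p)]
    gW2 = [[d * hi for hi in h] for d in dz]
    dh = [sum(W2[i][j] * dz[i] for i in range(len(W2))) for j in range(len(h))]
    da = [dh[j] * h[j] * (1 - h[j]) for j in range(len(h))]  # sigmoid derivative
    gW1 = [[d * xi for xi in x] for d in da]
    return gW1, gW2


def sgd_step(W1, W2, x, label, lr=0.1):
    """One in-place gradient descent update on a single example."""
    gW1, gW2 = backprop(W1, W2, x, label)
    for W, G in ((W1, gW1), (W2, gW2)):
        for row, grow in zip(W, G):
            for j, g in enumerate(grow):
                row[j] -= lr * g
```

For MNIST, `x` would be a flattened 784-pixel image and `n_out` would be 10, one output per digit class.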
A Sliding Window Based Algorithm for Compression of Data
November 1, 2015 – Present
In this project, I sought to implement the LZ77 compression algorithm, which underlies many widely used tools such as gzip. The program is built as a MATLAB function that can compress either a string or a text file. LZ77 has many implementation variations, and mine is one of them: users can choose between a growing and a fixed-size sliding window. Click on the project name above to access the function from the MATLAB File Exchange website.
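A minimal Python sketch of the sliding-window idea is shown here for illustration; it is not the MATLAB function itself. It assumes the classic (offset, length, next character) triple output, and the `window_size` parameter is a hypothetical stand-in for the growing versus fixed window option (`None` lets the window grow).

```python
def lz77_compress(data, window_size=None):
    """Encode a string as (offset, length, next_char) triples.

    window_size=None searches the whole prefix (growing window);
    an integer limits the search to the last window_size characters.
    """
    i, out = 0, []
    while i < len(data):
        start = 0 if window_size is None else max(0, i - window_size)
        best_off, best_len = 0, 0
        for j in range(start, i):
            length = 0
            # Always leave one literal character to emit after the match.
            while (i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out


def lz77_decompress(triples):
    """Rebuild the string by copying from the already decoded output."""
    out = []
    for off, length, ch in triples:
        for _ in range(length):
            out.append(out[-off])  # one character at a time, so overlaps work
        out.append(ch)
    return "".join(out)
```

A smaller fixed window trades compression ratio for a shorter search, while the growing window finds the longest possible matches at a higher search cost.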
Fisher score and Discriminating Coefficient for Feature Selection
September 1, 2015 – Present
Fisher Score and Discriminating Coefficient are two effective ranking methods for feature selection in data mining. I have developed a function that computes these scores for feature data and returns the ranks of the significant features. Click on the project name to access the code.
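As an illustration, the sketch below ranks features by one common definition of the Fisher score: between-class scatter divided by within-class scatter, computed per feature. The linked code may define the score differently, the Discriminating Coefficient is not covered, and the function names are assumptions.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Per-feature Fisher score for rows X with class labels y."""
    classes = sorted(set(y))
    scores = []
    for f in range(len(X[0])):
        col = [row[f] for row in X]
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def rank_features(X, y):
    """Feature indices ordered from most to least discriminative."""
    s = fisher_scores(X, y)
    return sorted(range(len(s)), key=lambda f: s[f], reverse=True)
```

A high score means the class means are far apart relative to the spread inside each class, so that feature separates the classes well on its own.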
Hu's Seven Invariant Image Moments
July 1, 2015 – Present
Hu's seven invariant moments of an image help characterize the patterns that occur in the image and can be used as features to model it. I have developed a set of functions that compute the seven Hu invariant moments of an image.
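For illustration, the seven invariants can be computed from normalized central moments as in the following sketch. It is not the original code: it assumes the image is a 2D list of non-negative intensities with at least one non-zero pixel, and the function name `hu_moments` is hypothetical.

```python
def hu_moments(img):
    """Return the seven Hu invariants of a 2D list of non-negative intensities."""
    pts = [(x, y, v) for y, row in enumerate(img)
           for x, v in enumerate(row) if v]
    m00 = sum(v for _, _, v in pts)
    xc = sum(x * v for x, _, v in pts) / m00  # centroid gives translation invariance
    yc = sum(y * v for _, y, v in pts) / m00

    def eta(p, q):
        # Central moment, normalized by total mass for scale invariance.
        mu = sum((x - xc) ** p * (y - yc) ** q * v for x, y, v in pts)
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a * a - 3 * b * b) + d * b * (3 * a * a - b * b),
        (n20 - n02) * (a * a - b * b) + 4 * n11 * a * b,
        d * a * (a * a - 3 * b * b) - c * b * (3 * a * a - b * b),
    ]
```

Shifting the image or rotating it by 90 degrees leaves all seven values unchanged up to floating-point error, which is what makes them usable as shape features.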
Performance of Feature Selection Methods for Machine Learning based Automatic Malarial Cell Recognition in Wholeslide Images (Master's thesis)
January 1, 2015 – March 1, 2016
There is a growing need worldwide to provide universal healthcare irrespective of the availability of resources. Machine Learning holds promise for automating healthcare in areas where trained manpower and equipment may not be available. Machine Learning for malaria diagnosis with wholeslide images is relevant since wholeslide images are very large. In this research work, the classification performance of an SVM classifier was studied when it was presented with features selected by different feature selection techniques. Feature selection is important since it helps reduce the training set size while keeping the data's essential characteristics. I found that, among the 6 feature selection techniques studied, the features selected by the Kullback-Leibler distance between the two classes, 'Infected' and 'Normal', gave the best accuracy of 95.5%. The wholeslide image samples were provided by the University of Alabama at Birmingham.
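One way to score features by the Kullback-Leibler distance between two classes is sketched below. The thesis does not state its estimator here, so the histogram binning, the smoothing constant `eps`, the symmetric form of the divergence, and all function names are assumptions made for illustration.

```python
import math


def histogram(values, lo, hi, bins, eps=1e-6):
    """Smoothed, normalized histogram of values over the range [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a constant feature
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values) + eps * bins
    return [(c + eps) / total for c in counts]


def kl(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_feature_scores(X, y, class_a, class_b, bins=8):
    """Symmetric KL distance between the two classes, one score per feature."""
    scores = []
    for f in range(len(X[0])):
        col = [row[f] for row in X]
        lo, hi = min(col), max(col)
        a = [v for v, label in zip(col, y) if label == class_a]
        b = [v for v, label in zip(col, y) if label == class_b]
        p, q = histogram(a, lo, hi, bins), histogram(b, lo, hi, bins)
        scores.append(kl(p, q) + kl(q, p))
    return scores
```

Features whose class-conditional histograms barely overlap receive high scores and would be kept, while features distributed identically in both classes score zero.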
Implementation of Golomb Encoder/Decoder to perform Image coding
November 1, 2014 – Present
Golomb coding is an algorithm that has played a seminal role in advancing data compression. This project involved developing an encoder and a decoder that function robustly and are carefully designed to always ensure unique decoding. I have uploaded the project to GitHub for use by the data compression community at large. Please click on the project title to get to the code.
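A minimal sketch of a Golomb encoder and decoder is given here for illustration; it is not the GitHub implementation. It assumes the standard construction (unary quotient followed by a truncated-binary remainder) for non-negative integers with parameter `m >= 1`, and represents bits as a string of "0" and "1" characters for readability.

```python
def _to_bits(value, width):
    """Fixed-width binary string; empty when width is 0."""
    return format(value, "b").zfill(width) if width > 0 else ""


def golomb_encode(n, m):
    """Golomb codeword of a non-negative integer n with parameter m >= 1."""
    q, r = divmod(n, m)
    b = (m - 1).bit_length()   # ceil(log2(m))
    cutoff = (1 << b) - m      # remainders below this use b - 1 bits
    bits = "1" * q + "0"       # quotient in unary
    if r < cutoff:
        return bits + _to_bits(r, b - 1)
    return bits + _to_bits(r + cutoff, b)


def golomb_decode(bits, m):
    """Decode a concatenation of Golomb codewords into a list of integers."""
    b = (m - 1).bit_length()
    cutoff = (1 << b) - m
    out, pos = [], 0
    while pos < len(bits):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the 0 that terminates the unary part
        r = 0
        if b > 0:
            r = int(bits[pos:pos + b - 1], 2) if b > 1 else 0
            pos += b - 1
            if r >= cutoff:  # long remainder: read one more bit
                r = 2 * r + int(bits[pos]) - cutoff
                pos += 1
        out.append(q * m + r)
    return out
```

Because the code is prefix-free, codewords can be concatenated without separators and still decode uniquely, for example `golomb_decode(golomb_encode(9, 4) + golomb_encode(2, 4), 4)`.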
A Segmentation based Approach to Lossless Compression of Medical Images
October 1, 2014 – December 1, 2014
This project aimed to present a method that bridges image segmentation and lossless compression of medical images. The SLIC algorithm is therefore the central method used. The original image is decomposed by segmentation into three constituent parts that require far fewer bits for transmission, which yields the compression. The reference paper for this project is: "A Segmentation-Based Lossless Image Coding Method for High-Resolution Medical Image Compression", L. Shen and R. M. Rangayyan.
Dictionary Learning for Efficient Image Sparse Modelling
August 1, 2013 – April 1, 2014
There has been growing interest in sparse modelling because it performs robustly in worst-case scenarios, such as when 80% of the pixels in an image are missing. Dictionaries for sparse modelling play a key role in determining the robustness of the sparse reconstruction algorithm's output. My project aimed to determine the most computationally robust algorithm for training dictionaries. My team and I worked on a detailed implementation of the K-SVD and Stagewise K-SVD algorithms without using any of the software packages provided by various researchers. This helped us learn the nuances of both algorithms and find solutions to computational issues. The dictionary training set consisted of patches extracted from 20 flower images, and the dictionaries were used to reconstruct the Lena and Barbara images. Based on a comparison of performance measures, we concluded that Stagewise K-SVD is a more efficient method for dictionary learning than K-SVD.
Neural Networks and Deep Learning
Coursera
June 24, 2026 – Present
Machine Learning (Stanford University)
Coursera
June 24, 2026 – Present
Data Engineering on Google Cloud Platform Specialization
Coursera
June 24, 2026 – Present
Neo4j Fundamentals
Neo4j
June 24, 2026 – Present
Microsoft Certified Azure Fundamentals
Microsoft
June 24, 2026 – Present
Text Retrieval and Search Engines
Coursera
June 24, 2026 – Present
Inferential Statistics
Coursera
June 24, 2026 – Present
Cultural Fit Analysis
The candidate has a strong background in data science and machine learning, with significant experience at large tech companies like Microsoft and KPMG. The project diversity, ranging from regression models to image processing and data compression, demonstrates a broad interest in data-related fields. The target role of 'Data Analyst' might be a slight mismatch given the candidate's extensive experience as a 'Data Scientist' and 'Senior Security Data Scientist', which typically involves more advanced modeling and engineering tasks than a traditional Data Analyst role. However, the foundational skills are highly relevant.
Soft Skills & Operational Fit
The candidate's experience descriptions highlight leadership in project development, partnership with engineering and threat intelligence teams, and mentoring new hires, indicating strong collaboration and communication skills. The ability to translate complex technical concepts into actionable business insights is also evident.