
To design, build, and deploy machine learning applications that solve real-world problems empirically, applying my knowledge for the development of mankind.
Revenue-Forecasting-for-Finance-dataset
July 9, 2019 – July 9, 2019
Revenue forecasting is a project on finance datasets in which forecasts are produced by an empirical model.
web-application-which-has-a-drop-down-to-select-one-file-and-one-or-more-fields-in-the-file-and-extr
May 13, 2019 – August 22, 2019
I have a bunch of files in CSV format (with the same set of fields), and to aid non-programmers I need to make them easy to access. Can you develop a web application which has a drop-down to select one file and one or more fields in the file and extract the information? Write both the front-end code and back-end code.
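The extraction step behind such a drop-down could look roughly like the sketch below. It is not the project's code: the function name `extract_fields` and the sample data are hypothetical, and the CSV content is passed as a string so the example runs without any files on disk.

```python
import csv
import io

def extract_fields(csv_text, fields):
    """Return one dict per row, keeping only the requested fields (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Reject field names that are not in the header row.
    missing = [f for f in fields if f not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"unknown fields: {missing}")
    return [{f: row[f] for f in fields} for row in reader]

# Made-up sample file content, for illustration only.
sample = "name,city,age\nAsha,Pune,31\nRavi,Delhi,28\n"
rows = extract_fields(sample, ["name", "age"])
print(rows)
```

A web back end would call a function like this with the file and fields chosen in the front-end drop-downs and return the rows as JSON or a table.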
FB-Profanity-check
May 13, 2019 – May 13, 2019
Imagine there is a file full of Facebook comments by various users and you are provided a set of words that signify profanity. Can you write a program which can indicate the degree of profanity for each sentence in the file?
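One simple way to define the "degree of profanity" is the fraction of words in a sentence that appear in the given word set. The sketch below assumes that definition; the function name, the placeholder word set, and the sample comments are all hypothetical, and the project itself may score sentences differently.

```python
import re

def profanity_degree(sentence, profane_words):
    """Fraction of words in the sentence that are in profane_words (assumed metric)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in profane_words)
    return hits / len(words)

# Placeholder word set and comments, for illustration only.
PROFANE = {"darn", "heck"}
comments = ["What the heck is this", "Nice photo", "darn darn"]
degrees = [profanity_degree(c, PROFANE) for c in comments]
for comment, degree in zip(comments, degrees):
    print(f"{degree:.2f}  {comment}")
```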
Stack-Overflow-Tag-Prediction
March 19, 2019 – March 19, 2019
Stack-Overflow-Tag-Prediction — GitHub repository
Taxi-demand-prediction-in-New-York-City
March 19, 2019 – March 19, 2019
Objectives:
Task 1: Incorporate Fourier features as features into regression models and measure MAPE.
Task 2: Perform hyper-parameter tuning for regression models.
2a. Linear Regression: Grid Search
2b. Random Forest: Random Search
2c. XGBoost: Random Search
Task 3: Explore more time-series features using Google search/Quora/Stack Overflow to reduce the MAPE to < 12%.
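As a rough illustration of Task 1, the sketch below extracts the dominant frequency and amplitude of a demand series with a plain discrete Fourier transform and defines MAPE. The helper names and the synthetic series are assumptions made for this example; the project presumably used a numerical library and real pickup counts.

```python
import cmath
import math

def fourier_features(series, k=3):
    """Return the k largest-amplitude non-DC DFT components as (frequency, amplitude) pairs."""
    n = len(series)
    comps = []
    for f in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * f * t / n)
                    for t, x in enumerate(series))
        comps.append((f / n, abs(coeff)))
    comps.sort(key=lambda p: p[1], reverse=True)
    return comps[:k]

def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no zero in actual."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Synthetic demand with a period of 8 time bins (illustrative data only).
demand = [10 + 5 * math.sin(2 * math.pi * t / 8) for t in range(32)]
top = fourier_features(demand, k=1)
print(top)
print(mape([100, 200], [110, 180]))
```

The resulting frequency and amplitude values can be appended to each sample's feature vector before fitting a regression model.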
K-Means-clustering-Hierarchical-Clustering-and-DBSCAN-to-Amazon-food-reviews-dataset
March 19, 2019 – March 19, 2019
Clustering algorithms are applied to the Amazon reviews dataset to cluster the reviews. Types of clustering: 1. K-Means clustering 2. Hierarchical clustering 3. DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
GBDT-and-RF-to-Amazon-reviews-dataset
March 19, 2019 – March 19, 2019
GBDT (Gradient Boosting Decision Tree) and RF (Random Forest) algorithms are applied to the Amazon reviews dataset to predict whether a review is positive or negative. The procedure is as follows:
• Step 1: Apply data pre-processing to the Amazon reviews dataset.
• Step 2: Split the data into train and test sets by time.
• Step 3: Apply feature generation techniques (BoW, TF-IDF, average Word2Vec, TF-IDF-weighted Word2Vec).
• Step 4: Apply the GBDT algorithm using each technique.
• Step 5: Apply the RF algorithm using each technique.
• Step 6: Find the number of base learners (m) using grid-search cross-validation for RF.
• Step 7: Find the number of base learners (m), depth, and learning rate (v) using grid-search cross-validation for GBDT.
Objective:
• To classify given reviews (positive (Rating of 4 or 5) & negative (rating of 1 or 2)) using GBDT(Gradient Boosting Decision Tree) and RF(Random Forest) algo
Decision-Trees-on-Amazon-reviews-data-set
March 19, 2019 – March 19, 2019
The Decision Tree algorithm is applied to the Amazon reviews dataset to predict whether a review is positive or negative. The procedure is as follows:
• Step 1: Apply data pre-processing to the Amazon reviews dataset, and take a sample of the data because of computational limitations.
• Step 2: Split the data into train and test sets by time.
• Step 3: Apply feature generation techniques (average Word2Vec, TF-IDF-weighted Word2Vec).
• Step 4: Apply the Decision Tree algorithm using each technique.
• Step 5: Find C (1/lambda) and gamma (= 1/sigma).
• Step 6: Compute Decision Tree feature importance using BoW and TF-IDF.
• Step 7: Export images of the Decision Tree in PNG format for the various vectorizations.
Objective:
• To classify the given reviews (positive: rating of 4 or 5; negative: rating of 1 or 2) using the Decision Tree algorithm.
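The time-based split in Step 2 can be sketched as follows: sort the reviews by timestamp and hold out the most recent ones for testing, so the model is never evaluated on reviews older than its training data. The record layout (a `time` key), the function name, and the 75/25 ratio are assumptions for illustration, not details taken from the project.

```python
def time_based_split(records, train_fraction=0.75):
    """Sort records by their 'time' key and split into older (train) and newer (test)."""
    ordered = sorted(records, key=lambda r: r["time"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Illustrative reviews with integer timestamps in shuffled order.
reviews = [{"time": t, "text": f"review {t}"} for t in [5, 1, 4, 2, 3, 7, 6, 8]]
train, test = time_based_split(reviews, train_fraction=0.75)
print([r["time"] for r in train])
print([r["time"] for r in test])
```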
Implement-SGD-for-linear-regression
March 19, 2019 – March 19, 2019
Assignment 6: Implement SGD for linear regression. The goal is to implement stochastic gradient descent to optimize linear regression on the Boston House Prices dataset, which is available in sklearn. The SGD algorithm is defined manually, and its results are compared with those of sklearn.linear_model.SGDRegressor. Linear regression is a technique for predicting real values. Stochastic gradient descent evaluates and updates the coefficients every iteration to minimize the error of a model on training data.
Objective: To implement stochastic gradient descent for linear regression on the Boston House Prices dataset.
• Implement SGD and apply it to the Boston House Prices dataset.
• Compare the results with sklearn.linear_model.SGDRegressor.
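A minimal sketch of hand-written SGD for linear regression with squared-error loss is shown below. It uses a small synthetic, noise-free dataset rather than the Boston House Prices data, and the function name, learning rate, and epoch count are illustrative assumptions rather than the assignment's actual settings.

```python
import random

def sgd_linear_regression(X, y, lr=0.01, epochs=200, seed=0):
    """Fit weights w and bias b by updating on one sample at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)  # visit samples in a new random order each epoch
        for i in idx:
            pred = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            err = pred - y[i]
            # Gradient of 0.5 * err**2 with respect to w is err * x, and to b is err.
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b

# Synthetic data generated from y = 3x + 2 (illustration only).
X = [[x / 10.0] for x in range(20)]
y = [3.0 * row[0] + 2.0 for row in X]
w, b = sgd_linear_regression(X, y, lr=0.05, epochs=500)
print(w, b)
```

On real data, features are usually standardized first so that a single learning rate suits every coefficient; the fitted coefficients can then be compared with those of a library implementation.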
Logistic-Regression-on-Amazon-reviews-data-set.
March 19, 2019 – March 19, 2019
The Logistic Regression algorithm is applied to the Amazon reviews dataset to predict whether a review is positive or negative. The procedure is as follows:
• Step 1: Apply data pre-processing to the Amazon reviews dataset, and take a sample of the data because of computational limitations.
• Step 2: Split the data into train and test sets by time.
• Step 3: Apply feature generation techniques (BoW, TF-IDF, average Word2Vec, TF-IDF-weighted Word2Vec).
• Step 4: Apply the Logistic Regression algorithm using each technique.
• Step 5: Find lambda using grid-search cross-validation and random-search cross-validation.
• Step 6: Apply L1 and L2 regularization.
• Step 7: L1 regularization: increase the lambda hyperparameter to induce sparsity in the weights. 1. Report performance metric 2. Report error 3. Report sparsity in "W*"
• Step 8: Feature importance for positive and negative reviews. 1. Most important feature 2. Bar plot of top 15 important features.
Objective:
• To classify given reviews (positive (Rating of 4 or 5
Cultural Fit Analysis
The candidate's personal projects demonstrate initiative and a strong interest in data science. The diversity of algorithms and datasets explored suggests a proactive learning approach. The projects are well-aligned with a data scientist role, indicating a good fit for a technically focused environment. However, the lack of team projects or contributions to open-source initiatives limits the assessment of collaborative cultural fit.
Soft Skills & Operational Fit
The candidate's project descriptions indicate a focus on problem-solving and algorithm implementation. However, there is no direct data to assess soft skills like teamwork, communication, or stress handling. The project descriptions are functional but lack detail on collaborative aspects or challenges overcome.