Jingzhi Pang

Data Analyst

https://www.opentalent.in/jingzhi-pang

Software Engineer

Google

Key Strengths

Extensive experience in machine learning model development and deployment, particularly in areas like content searching, recommendation, information extraction, and object detection.
Strong academic background with a Master's in Computational Biology from Carnegie Mellon University, indicating a solid foundation in data science and analytical methods.
Demonstrated ability to improve model accuracy and performance in various projects (e.g., Titanic survival prediction, active learning, pancreatic cancer recognition).
Experience with diverse machine learning techniques including KNN, Gradient Boosted Trees, SVM, Neural Networks, CNNs, and clustering methods like GMM and Affinity Propagation.
Proficiency in data preprocessing, feature engineering, and model optimization, as evidenced by multiple project descriptions.

Cultural & Operational Fit

Cultural Fit Analysis

The candidate's background is heavily skewed towards machine learning engineering and computational biology, with a strong research component. While the target role is 'Data Analyst', the candidate's experience is more aligned with advanced data science and ML engineering roles. The projects demonstrate a strong academic and research-driven approach, which might require adaptation to a purely analytical, business-focused data analyst role. The diversity of projects shows intellectual curiosity and a broad technical interest, but the direct relevance to typical data analyst tasks (e.g., SQL, dashboarding, A/B testing, business intelligence) is less apparent.

Soft Skills & Operational Fit

The candidate's project descriptions highlight problem-solving skills, a results-oriented approach (e.g., improving accuracy, reducing reading time), and experience in leading technical aspects of projects (e.g., 'group technical leader' for PyCorn). The experience at Google and Petuum suggests an ability to work in structured, product-focused environments. However, specific soft skills like collaboration, adaptability, or leadership are not explicitly detailed beyond project roles.

Experience

Google

Software Engineer

November 1, 2021 – Present

Petuum, Inc.

Software Engineer II (Machine Learning)

February 1, 2018 – October 1, 2021

San Francisco Bay Area

Agile SDE, LLC

Data Scientist

July 1, 2017 – February 1, 2018

San Francisco Bay Area

Carnegie Mellon University

Graduate Research Assistant - SCS - CBD - Langmead Lab

January 1, 2017 – March 1, 2017

Greater Pittsburgh Region

Amyris

Scientific Computing Intern

June 1, 2016 – August 1, 2016

Emeryville, CA

Projects

Survival Prediction on Kaggle Titanic

February 1, 2017 – March 1, 2017

· Implemented a Hadoop MapReduce version of a KNN classifier from scratch
· Performed data pre-processing and feature engineering on the Titanic dataset, creating 3 new features
· Improved the accuracy of the KNN classifier from a baseline of 74% to 90.43%, with substantial gains in precision and recall
· Built a Gradient Boosted Trees classifier with xgboost and tuned its parameters with grid search, improving accuracy from a baseline of 76.08% to 88.52%

Active Learning on Image Classification

October 1, 2016 – December 1, 2016

· Trained an active learner with Query by Committee (QBC) as the query strategy and an SVM with an RBF kernel as the classifier, and compared its performance with a baseline learner using a random sampling strategy
· Achieved more than 90% accuracy, matching the baseline learner, while using only 40% of the total samples

TSS Recognition Pipeline Construction-"PyCorn"

March 1, 2016 – May 1, 2016

· Participated, as group technical leader, in the design of "PyCorn", a genome-wide transcription start site (TSS) prediction pipeline for Zea mays
· Built a neural network (NN) with the scikit-learn package to predict whether an input genome sequence contains a TSS
· Improved prediction performance by tuning NN parameters such as the activation function and the number of hidden units
· Designed the input module of PyCorn to handle large input sequences
· Achieved 83.9% accuracy on TSS prediction

Neuroscience meets Deep Learning

February 1, 2016 – April 1, 2016

· Performed data pre-processing, transforming functional magnetic resonance imaging (fMRI) data from 9 subjects, associated with different words, into three-dimensional (3D) space
· Built a three-dimensional convolutional neural network (3D CNN) model, composed of convolution layers, max-pooling layers, a dense layer and a logistic regression layer, to predict the neural activation associated with different categories of words
· Achieved three times higher accuracy than a random classifier
· Compared the predictive ability of the 3D CNN with other basic machine learning models, such as a neural network and a random forest

Automated Recognition of Pancreatic Cancer

January 1, 2016 – May 1, 2016

· Built SVM classification models to determine from proteomic data whether a patient has pancreatic cancer
· Raised classification accuracy to above 80% by implementing feature extraction and false negative control
· Implemented unsupervised feature selection with a Gaussian Mixture Model (GMM)
· Applied Affinity Propagation, a clustering method, to find subtypes of pancreatic cancer
· Showed a statistical correlation between pancreatic cancer clusters and clinical symptoms with a chi-square test

Image Classification of CIFAR-10 dataset

November 1, 2015 – December 1, 2015

· Extracted visual features from the CIFAR-10 dataset, which consists of 5000 pictures in 10 classes
· Established classification models based on SVM, softmax regression, k-binary logistic regression, and the K-nearest neighbors algorithm
· Improved accuracy from 20% to above 50%

Integration and Application of Regulatory-Metabolic Network

December 1, 2014 – May 1, 2015

· Built an integrated metabolic-regulatory network for yeast based on a new automatic metabolic-regulatory integration algorithm, EGRIN-PROM
· Demonstrated the strong phenotype prediction ability of the EGRIN-PROM integrated network by comparing its Matthews correlation coefficient (MCC) with that of the YEASTRACT-PROM network
· Performed optimization and simulation of the flux in the yeast 7.00 metabolic network, aiming to improve the expression of acetoacetyl CoA
· Proposed gene modification strategies, including gene knockout and gene over-expression, that could improve the expression of acetoacetyl CoA
· Verified the accuracy of the modification strategies with pathway analysis

Synthetic Biology Software Design-“EASYBBK”

December 1, 2013 – October 1, 2014

· Participated in the functional design of “EASYBBK”, an assistant tool for the evaluation, visualization and simplification of BioBricks, based on a requirements survey
· Designed an assessment model for BioBricks based on their status, reliability, feedback and relevant publications
· Integrated the relevant information and data of all extant BioBricks in the official International Genetically Engineered Machine (iGEM) Registry
· Implemented sequence alignment against our own database using NCBI Stand-alone BLAST
· Won a gold medal in the iGEM competition as an EASYBBK software team member
Project Website: http://2014.igem.org/Team:SJTU-Software

Mechanism Exploration of Selectivity of PKB Inhibitor

October 1, 2012 – July 1, 2014

· Explored the selectivity mechanism of protein kinase B (PKB) inhibitors with molecular dynamics (MD) simulation
· Conducted 3D-QSAR modeling on PKB inhibitors and PKA inhibitors, demonstrating high predictive ability
· Identified possible methods to improve the selectivity of inhibitors based on the simulation results
