
Senior Deep Learning Architect at NVIDIA
I work at NVIDIA as a Deep Learning Architect in the GPU Compute Architecture team. I'm currently working on improving deep learning performance through hardware, software and network optimizations. Prior to NVIDIA, I worked at Texas Instruments India as a Field Applications Engineer where I provided application support to TI's key customers in North and West India. I was responsible for TI's embedded processing and wireless connectivity solutions. I graduated with a master's degree in Electrical and Computer Engineering from Carnegie Mellon University in 2017 and a bachelor's degree in Electronics and Communication Engineering from National Institute of Technology, Karnataka in 2013. My coursework at CMU focused on deep learning, accelerated computing and computer architecture.
Carnegie Mellon University
Master’s Degree, Electrical and Computer Engineering
January 1, 2016 – January 1, 2017
National Institute of Technology Karnataka
Bachelor of Technology (BTech), Electronics and Communications Engineering
January 1, 2009 – January 1, 2013
NVIDIA
Senior Deep Learning Architect
February 1, 2018 – Present
San Francisco Bay Area
NVIDIA
Deep Learning Architect Intern
May 1, 2017 – August 1, 2017
Santa Clara, California
Texas Instruments
Field Application Engineer
July 1, 2014 – July 1, 2016
Texas Instruments
Smart Grid Applications Engineer
January 1, 2014 – June 1, 2014
Texas Instruments
Digital Applications Associate
July 1, 2013 – December 1, 2013
Texas Instruments
Analog Applications Associate - Intern
May 1, 2012 – July 1, 2012
Greater Bengaluru Area
Hardware Accelerator for Bellman-Ford Single Source Shortest Path Algorithm using Reconfigurable Logic
December 1, 2016 – Present
Designed a hardware accelerator for the Bellman-Ford algorithm on the programmable fabric of a Xilinx Zynq-7000 SoC using Vivado High-Level Synthesis. Achieved ~2.1x better performance/Watt vs. a dual-core Intel Core i7 processor and ~15x better performance/Watt than the Zynq's own ARM Cortex-A9 CPU by optimizing DRAM-to-BRAM data transfers and hiding memory access latency.
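For readers unfamiliar with the algorithm being accelerated, the following is a minimal software sketch of textbook Bellman-Ford in Python. It is not the HLS design described above; the function name, the edge-list representation and the sample graph are assumptions made purely for illustration.

```python
from math import inf


def bellman_ford(num_vertices, edges, source):
    """Return shortest distances from source; raise on a negative cycle.

    edges is a list of (u, v, weight) tuples (an illustrative layout).
    """
    dist = [inf] * num_vertices
    dist[source] = 0
    # Relax every edge up to |V| - 1 times; stop early if a pass changes nothing.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # One extra pass: any further improvement means a reachable negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return dist


# Hypothetical four-vertex graph used only to exercise the function.
sample_edges = [(0, 1, 4), (0, 2, 5), (1, 2, -2), (2, 3, 3)]
distances = bellman_ford(4, sample_edges, 0)
```

The inner loop streams over the whole edge list on every pass, which is why the memory transfer and latency-hiding optimizations mentioned above matter for this workload.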
Optimizing the Training Time of a Recurrent Neural Network (RNN) for Auto-Generating Text
December 1, 2016 – Present
Optimized the batch size and RNN size to reach minimal training loss in a given amount of time by sizing the network and batch to efficiently use all the streaming multiprocessors on a GPU (tested on an NVIDIA Tesla K40m). Achieved ~5x speedup over an existing implementation's default settings for the same training loss and dataset size.
Concurrent Proxy Server
August 1, 2016 – Present
Designed a concurrent proxy web server that is capable of handling multiple concurrent HTTP/1.0 GET requests using pthreads and a counting semaphore to allocate work between threads.
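The work-allocation pattern described above can be sketched roughly as follows. This is a simplified Python analogue, not the original pthreads code: the `run_pool` function, the `None` shutdown sentinel and the string "responses" are all hypothetical, and no real network I/O is performed.

```python
import threading
from collections import deque


def run_pool(requests, num_workers=4):
    """Hand requests to worker threads through a queue guarded by a counting semaphore."""
    queue = deque()
    lock = threading.Lock()            # protects the queue and the results list
    items = threading.Semaphore(0)     # counts requests waiting in the queue
    results = []

    def worker():
        while True:
            items.acquire()            # block until at least one item is queued
            with lock:
                request = queue.popleft()
            if request is None:        # sentinel: no more work for this thread
                return
            response = f"handled {request}"  # stand-in for fetching the page
            with lock:
                results.append(response)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    # Enqueue the real requests, then one sentinel per worker.
    for request in list(requests) + [None] * num_workers:
        with lock:
            queue.append(request)
        items.release()                # signal that one more item is available
    for thread in threads:
        thread.join()
    return results


responses = run_pool(["GET /a", "GET /b", "GET /c"], num_workers=2)
```

Because the semaphore's count always matches the number of queued items, an idle worker sleeps instead of spinning, and each request is taken by exactly one thread.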
Dynamic Memory Allocator
July 1, 2016 – Present
Designed a dynamic memory allocator (similar to the malloc package, with malloc, free, realloc and calloc functions) with high utilization and high throughput, using segregated free lists that keep a separate free list for each block size class. The allocator combined first-fit and best-fit search to balance throughput against utilization. As a further optimization, footers were removed from allocated blocks to improve utilization.
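One way to picture a segregated free list with a hybrid first-fit/best-fit search is the Python sketch below. It is an assumption-laden illustration, not the original allocator: blocks are modelled as plain integer sizes, the power-of-two size classes, the `scan_limit` of three candidates and both function names are hypothetical choices.

```python
def size_class(size, min_shift=4, num_classes=10):
    """Map a block size to a free-list index using power-of-two buckets."""
    index = max(size - 1, 0).bit_length() - min_shift
    return min(max(index, 0), num_classes - 1)


def find_fit(free_lists, size, scan_limit=3):
    """Return and remove a free block of at least `size`, or None.

    Within each class, look at the first `scan_limit` blocks that fit and keep
    the smallest of them: a bounded best fit that degrades to first fit when
    scan_limit is 1.
    """
    start = size_class(size, num_classes=len(free_lists))
    for index in range(start, len(free_lists)):
        best = None
        seen = 0
        for block in free_lists[index]:
            if block >= size:
                if best is None or block < best:
                    best = block
                seen += 1
                if seen == scan_limit:
                    break
        if best is not None:
            free_lists[index].remove(best)
            return best
    return None


# Populate ten size classes with a few hypothetical free blocks.
free_lists = [[] for _ in range(10)]
for block in (48, 40, 64, 128):
    free_lists[size_class(block)].append(block)

first = find_fit(free_lists, 36)
second = find_fit(free_lists, 100)
third = find_fit(free_lists, 5000)
```

Starting the search in the matching size class keeps lookups short, while the bounded scan limits how much throughput is traded for a tighter fit.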
Basic UNIX Shell
June 1, 2016 – Present
Designed a basic command-line UNIX shell that uses processes, inter-process communication, I/O redirection and signal handling, and can also run built-in bash commands.
Low cost, long life LED driver
June 1, 2012 – July 1, 2013
Worked in the ECE department on the development of a low-cost, long-life LED driver under Dr. Ramesh Kini. The project was accepted into the finals of the IFEC 2013 contest organized by IEEE, held at Zhejiang University, China in July 2013. Presented a progress report at APEC 2013 in Long Beach, California on March 17th as part of the IFEC workshop, and participated in the finals at Zhejiang University, Hangzhou, China.
Product development for orthopaedic application
June 1, 2012 – May 1, 2013
Worked on the development of a remote-triggered, laser-based target-assisting system for the C-arm imaging units at Wenlock Hospital, Mangalore. Also developed a grip quantification prototype for the same institution.
Vehicular Data Logger with an Intel Atom
December 1, 2011 – Present
Worked in the ECE department under Prof. Ramesh Kini to develop an automobile data logger and dashboard unit using an Intel Atom, which handled the file storage, calculations and display. The Atom was connected to a number of sensors, such as an IMU, a tachometer and wheel speed sensors, each handled by MSP430 microcontrollers interfaced to the Atom via UART. The Atom used pthreads to concurrently manage the UART connections to the multiple MSP430 microcontrollers.
Lighting Automation System
May 1, 2011 – July 1, 2011
This project involved the development of a power-saving smart LED lighting product that could be controlled with an IR remote, or through an Ethernet gateway via a web page hosted on a Wiznet W5100 based server, using power-line communication. Each light was a standalone system with built-in motion and ambient light sensors, and would light up according to external conditions and the presence of a moving body underneath it. All the lights had power-line communication modules, allowing lights connected on the same phase to form a network, which was interfaced with an Ethernet module. The Ethernet module hosted a web page that could read the status of the lights on the network and control them. The system used TI MSP430 microcontrollers. The idea was entered into the TI Analog Design Contest 2012 under the mentorship of the HOD, Dr. Muralidhar Kulkarni. http://youtu.be/_tY_8JBnBb0
Cultural Fit Analysis
The candidate's background is heavily skewed towards hardware, embedded systems, and deep learning engineering roles, with significant experience at NVIDIA and Texas Instruments. While the projects demonstrate strong technical capabilities, the target role of 'Data Analyst' appears to be a significant pivot from their core expertise. The projects do not explicitly showcase typical data analyst skills such as statistical modeling, data visualization, SQL, or specific data analysis tools. This misalignment suggests a potential cultural fit challenge for a pure data analyst role, though their strong technical foundation could be leveraged in a data science or machine learning engineering role.
Soft Skills & Operational Fit
The candidate's project descriptions indicate a strong problem-solving aptitude and a results-oriented approach, evidenced by quantifiable achievements (e.g., ~2.1x better performance/Watt, ~5x speedup). The diverse range of projects suggests adaptability and a willingness to tackle complex technical challenges. However, without specific psychometric test results or interview data, a detailed assessment of soft skills like teamwork, communication, and stress handling is not possible.