Key Strengths

Extensive experience (18 years) in software development, with a significant tenure at Amazon.
Deep expertise in LLM inference optimization, PyTorch, Distributed Systems, and C++ from recent Senior SDE role at Amazon.
Experience leading inference optimization for foundation models, including GPU and custom silicon (Trainium), at frontier scale.
Proven ability to achieve state-of-the-art latency and unblock critical launches by building custom capabilities.
Patent holder in multimodal AI and LLM inference, demonstrating innovation and deep technical contribution.

Spurthi Sandiri

ML Engineer

https://www.opentalent.in/spurthi-sandiri

Sr SDE @ Amazon | LLM Inference Engineer — Driving SOTA Latency at Scale | Speculative Decoding · CUDA · Distributed ML Systems | undertheinferencehood.substack.com

About

I solve complex, ambiguous problems at the intersection of AI models, hardware, and distributed systems, and ship solutions that serve millions of customers. As a Senior SDE at Amazon, I lead inference optimizations for the Amazon Nova family of models, spanning GPU and custom silicon, from single-node to multi-node frontier scale.

LLM inference is where research and production speak different languages. A technique that shows gains in a paper often falls apart against real hardware constraints, feature cross-compatibility (multimodal, customization, constrained decoding), quantization trade-offs, and the tension of keeping accelerators maximally utilized while serving low-latency responses. Add external dependencies with uncertain timelines and multiple teams with conflicting hypotheses, and the right approach often doesn't exist yet.

That's where I thrive:

→ Led speculative decoding across Amazon Nova models: significant throughput improvements with multimodal, constrained decoding, and LoRA cross-compatibility
→ Unblocked a frontier-scale model launch by debugging accuracy regressions and delivering major throughput gains under multi-node constraints
→ Implemented speculative decoding for LoRA, enabling low-latency model customization for a critical product launch
→ Root-caused a critical latency regression, then delivered substantial performance improvements
→ Wrote custom CUDA kernels for performance-critical inference paths
→ Filed 3 patents in inference optimization

How I work:

→ End-to-end ownership from ambiguous requirements through production deployment
→ Bias for action: I make decisions with incomplete data, move fast, and course-correct
→ Cross-team alignment across model training, evals, runtime orchestration, and external partners
→ Force multiplier through mentoring, knowledge sharing, and scalable frameworks
→ T

Top Skills

PyTorch, LLM Inference, Test Automation, Perl, C++, Java, Selenium, Agile Methodologies, Distributed Systems, JUnit, Shell Scripting, REST, Testing, Web Services, XML, Unix, JavaScript, Software Development, SQL

Education

Gayatri Vidya Parishad College of Engineering (Autonomous)

Bachelor's Degree, Computer Science

N/A – Present

Experience

Amazon

Senior Software Development Engineer

April 1, 2021 – Present

Amazon

SDE-II, Alexa ML Data Platform at Amazon

April 1, 2016 – April 1, 2021

Amazon

SDET-II

December 1, 2013 – April 1, 2016

Amazon

SDET-II

April 1, 2013 – December 1, 2013

Amazon

Software Developer Engineer in Test

July 1, 2011 – April 1, 2013

Akamai Technologies

Software Engineer

September 1, 2009 – July 1, 2011

Bangalore

Yahoo!

Quality Engineer

June 1, 2008 – September 1, 2009

Cultural & Operational Fit

Cultural Fit Analysis

The candidate has a long tenure at Amazon, indicating stability and experience within a large, fast-paced tech environment. The progression from SDET to Senior SDE demonstrates adaptability and growth. However, the candidate has worked only at Amazon since 2011, with Akamai and Yahoo! before that, and lists no personal projects; this limits the assessment of broader cultural fit and adaptability to different organizational structures or startup environments. The target role of ML Engineer aligns well with their recent experience in LLM inference.

Soft Skills & Operational Fit

The candidate's experience at Amazon, particularly in leadership and cross-team collaboration, suggests strong operational fit and soft skills relevant to a senior role. Mentoring and unblocking critical launches indicate problem-solving and leadership capabilities. However, without specific psychometric or communication test results, a detailed assessment of soft skills like stress handling or team collaboration is limited.