onsite

SDE III Gen AI

As an SDE III Gen AI at InMobi Advertising, you will design, implement, and deploy production-ready generative AI applications for millions of users. This role involves building advanced RAG pipelines, developing multimodal AI systems, and architecting scalable microservices, all while leading technical design and mentoring junior engineers.

About the role

What You Will Be Doing

Design and implement production-ready generative AI applications that serve millions of users, from initial architecture through deployment and monitoring
Build advanced RAG (Retrieval-Augmented Generation) pipelines that combine vector databases, hybrid search, and intelligent caching to deliver sub-second response times
Develop multimodal AI systems that seamlessly integrate text, vision, and audio capabilities using state-of-the-art models
Architect scalable microservices that handle thousands of concurrent AI requests while optimizing for cost, latency, and reliability
Lead code reviews and technical design sessions, establishing best practices and architectural patterns that elevate the entire team's capabilities
Optimize large language models through fine-tuning techniques to achieve domain-specific performance improvements
Implement comprehensive MLOps practices including automated testing, model versioning, A/B testing frameworks, and real-time monitoring dashboards
Collaborate with product managers and stakeholders to translate complex business requirements into innovative AI solutions
Deploy AI models on cloud platforms (GCP) using containerization and orchestration technologies
Create and maintain technical documentation, runbooks, and architectural decision records that enable knowledge sharing across teams
Mentor junior engineers through pair programming, technical talks, and hands-on guidance to accelerate their growth
Research and prototype emerging AI technologies to identify opportunities for competitive advantage

Gen AI Responsibilities

Fine-tune and optimize state-of-the-art language models for specific business use cases, achieving significant improvements in accuracy and relevance
Design multi-agent AI systems using orchestration frameworks such as LangGraph and Google ADK to coordinate complex workflows and decision-making processes
Implement advanced prompt engineering strategies including Tree of Thoughts, ReAct patterns, and automatic prompt optimization to maximize model performance
Build production-grade embedding systems that handle billions of vectors, implementing efficient indexing strategies and hybrid search capabilities
Develop computer vision pipelines for tasks ranging from object detection to visual question answering
Create secure AI applications with robust safeguards against prompt injection, jailbreaking, and data leakage while maintaining compliance with AI governance standards
Optimize token usage and implement intelligent caching strategies to reduce costs by 50-70% while maintaining quality
Design and implement evaluation frameworks that go beyond traditional metrics, incorporating human feedback loops and domain-specific quality measures
Build real-time AI inference systems capable of processing streaming data with sub-100ms latency requirements
Integrate multiple foundation models into unified applications, implementing fallback mechanisms and load balancing for high availability
Develop custom tools and functions that extend LLM capabilities, enabling models to interact with databases, APIs, and external systems
Implement advanced RAG techniques including contextual embeddings, cross-encoder reranking, and Graph RAG for complex reasoning tasks
Create multimodal search systems that enable users to query across text, images, and documents using natural language
Build AI-powered data processing pipelines that automatically extract, transform, and enrich unstructured data at scale
Deploy edge AI solutions using frameworks like ONNX and TensorRT, optimizing models for resource-constrained environments

What We're Looking For

5+ years of hands-on experience building and deploying ML/AI systems, with at least 2 years focused on generative AI and LLMs
Expert-level Python programming skills with deep knowledge of async programming, multiprocessing, and performance optimization
Strong experience with modern AI frameworks including PyTorch, Transformers, LangChain, and vector databases
Proven track record of deploying AI applications to production environments serving real users at scale
Deep understanding of transformer architectures, attention mechanisms, and the latest advances in generative AI
Experience with cloud platforms (GCP) and containerization technologies (Docker, Kubernetes)
Excellent communication skills with the ability to explain complex AI concepts to both technical and non-technical audiences
Proven experience improving large-scale product search and discovery — including dense retrieval with bi-encoders, cross-encoder reranking, query understanding, and hybrid BM25 + vector search across catalogs of tens of millions of SKUs
Hands-on experience building and deploying production multi-agent systems using orchestration frameworks such as LangGraph and Google ADK — designing stateful, tool-augmented agents for complex, real-world workflows
Bachelor's degree in Computer Science, Mathematics, or a related field (Master's preferred, but not required with relevant experience)

Nice to Have

Published research papers or significant contributions to open-source AI projects
Experience with multimodal AI systems combining vision, language, and audio
Domain expertise in specific verticals (healthcare, finance, legal, e-commerce)
Knowledge of AI safety, alignment, and constitutional AI principles
Experience building AI infrastructure and platforms used by other engineers
Familiarity with emerging technologies like neural architecture search, mixture of experts, or neuromorphic computing
