onsite
SDE III Gen AI
As an SDE III Gen AI at InMobi Advertising, you will design, implement, and deploy production-ready generative AI applications for millions of users. This role involves building advanced RAG pipelines, developing multimodal AI systems, and architecting scalable microservices, all while leading technical design and mentoring junior engineers.
About the role
What You Will Be Doing
- Design and implement production-ready generative AI applications that serve millions of users, from initial architecture through deployment and monitoring
- Build advanced RAG (Retrieval-Augmented Generation) pipelines that combine vector databases, hybrid search, and intelligent caching to deliver sub-second response times
- Develop multimodal AI systems that seamlessly integrate text, vision, and audio capabilities using state-of-the-art models
- Architect scalable microservices that handle thousands of concurrent AI requests while optimizing for cost, latency, and reliability
- Lead code reviews and technical design sessions, establishing best practices and architectural patterns that elevate the entire team's capabilities
- Optimize large language models through fine-tuning techniques to achieve domain-specific performance improvements
- Implement comprehensive MLOps practices including automated testing, model versioning, A/B testing frameworks, and real-time monitoring dashboards
- Collaborate with product managers and stakeholders to translate complex business requirements into innovative AI solutions
- Deploy AI models on cloud platforms (GCP) using containerization and orchestration technologies
- Create and maintain technical documentation, runbooks, and architectural decision records that enable knowledge sharing across teams
- Mentor junior engineers through pair programming, technical talks, and hands-on guidance to accelerate their growth
- Research and prototype emerging AI technologies to identify opportunities for competitive advantage
Gen AI Responsibilities
- Fine-tune and optimize state-of-the-art language models for specific business use cases, achieving significant improvements in accuracy and relevance
- Design multi-agent AI systems using orchestration frameworks such as LangGraph and Google ADK to coordinate complex workflows and decision-making processes
- Implement advanced prompt engineering strategies including Tree of Thoughts, ReAct patterns, and automatic prompt optimization to maximize model performance
- Build production-grade embedding systems that handle billions of vectors, implementing efficient indexing strategies and hybrid search capabilities
- Develop computer vision pipelines for tasks ranging from object detection to visual question answering
- Create secure AI applications with robust safeguards against prompt injection, jailbreaking, and data leakage while maintaining compliance with AI governance standards
- Optimize token usage and implement intelligent caching strategies to reduce costs by 50-70% while maintaining quality
- Design and implement evaluation frameworks that go beyond traditional metrics, incorporating human feedback loops and domain-specific quality measures
- Build real-time AI inference systems capable of processing streaming data with sub-100ms latency requirements
- Integrate multiple foundation models into unified applications, implementing fallback mechanisms and load balancing for high availability
- Develop custom tools and functions that extend LLM capabilities, enabling models to interact with databases, APIs, and external systems
- Implement advanced RAG techniques including contextual embeddings, cross-encoder reranking, and Graph RAG for complex reasoning tasks
- Create multimodal search systems that enable users to query across text, images, and documents using natural language
- Build AI-powered data processing pipelines that automatically extract, transform, and enrich unstructured data at scale
- Deploy edge AI solutions using frameworks like ONNX and TensorRT, optimizing models for resource-constrained environments
What We're Looking For
- 5+ years of hands-on experience building and deploying ML/AI systems, with at least 2 years focused on generative AI and LLMs
- Expert-level Python programming skills with deep knowledge of async programming, multiprocessing, and performance optimization
- Strong experience with modern AI frameworks including PyTorch, Transformers, LangChain, and vector databases
- Proven track record of deploying AI applications to production environments serving real users at scale
- Deep understanding of transformer architectures, attention mechanisms, and the latest advances in generative AI
- Experience with cloud platforms (GCP) and containerization technologies (Docker, Kubernetes)
- Excellent communication skills with the ability to explain complex AI concepts to both technical and non-technical audiences
- Proven experience improving large-scale product search and discovery — including dense retrieval with bi-encoders, cross-encoder reranking, query understanding, and hybrid BM25 + vector search across catalogs of tens of millions of SKUs
- Hands-on experience building and deploying production multi-agent systems using orchestration frameworks such as LangGraph and Google ADK — designing stateful, tool-augmented agents for complex, real-world workflows
- Bachelor's degree in Computer Science, Mathematics, or a related field (Master's preferred, but not required for candidates with relevant experience)
Nice to Have
- Published research papers or significant contributions to open-source AI projects
- Experience with multimodal AI systems combining vision, language, and audio
- Domain expertise in specific verticals (healthcare, finance, legal, e-commerce)
- Knowledge of AI safety, alignment, and constitutional AI principles
- Experience building AI infrastructure and platforms used by other engineers
- Familiarity with emerging technologies like neural architecture search, mixture of experts, or neuromorphic computing