Software Engineer - ML Observability
Staff Software Engineer for the ML Observability team at Datadog, building tools to monitor and improve AI systems in production, particularly those built on Large Language Models and generative AI.
The ML Observability team builds cutting-edge tools to monitor, explain, and improve AI systems in production, particularly those leveraging Large Language Models (LLMs) and generative AI. We provide robust, scalable observability for AI workloads, including drift detection, model evaluation, and behavior tracing, enabling customers to ship AI with confidence.
As a Staff Engineer, you’ll lead the development of new features and foundational capabilities within Datadog’s LLM Observability product. You will shape product direction, drive experimentation, and apply your deep understanding of both AI systems and software engineering to solve open-ended problems in the fast-moving AI landscape. Your work will directly impact how our customers monitor, troubleshoot, and optimize LLM-based applications in production.
Join us in building the foundational tools that make AI systems observable, understandable, and reliable in the real world.
At Datadog, we place value in our office culture - the relationships and collaboration it builds and the creativity it brings to the table. We operate as a hybrid workplace to ensure our Datadogs can create a work-life harmony that best fits them.
What You’ll Do:
Drive design and implementation of LLM observability features.
Who You Are:
Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That's okay. If you’re passionate about technology and want to grow your skills, we encourage you to apply.
Posted June 7, 2026