Machine Learning Data Engineer
Synthesia is seeking an experienced Machine Learning Data Engineer to design, develop, and maintain data processing pipelines for large quantities of text and audio data. This role involves using machine learning techniques to prepare ready-to-train datasets for large models and contributing to the development of an LLM-based TTS system.
It is an exciting time to join Synthesia: we have reached a milestone by becoming a unicorn, having raised $90 million in Series C funding, and are now valued at $1 billion! ✨ 🦄
Synthesia is the world’s #1 AI video generation platform. Well, it’s actually a video production studio — in a browser. As in, no cameras or film crews at all. You simply choose an avatar, enter your script in one of 60 languages, and your video is ready in minutes. In Synthesia, you can build personalised on-the-fly videos, give your chatbot a human face or run 24/7 weather channels in different languages, to name just a few of the possibilities. 🎬
We believe the future of media is synthetic, and we are on a mission to turn cameras into code and make everyone a creator. To learn more, check out our brand video that explains what we’re doing at Synthesia.
We are looking for an experienced Machine Learning Data Engineer who loves dealing with large quantities of text and audio data. The successful candidate will be proficient in using machine learning techniques to build data processing pipelines, preparing ready-to-train datasets for large models.
If you are excited about the intersection of AI, Machine Learning, and Large Data, this role provides a unique opportunity to make a high-impact contribution. 💪🏻
Our aim is to make video content creation available to everyone, not only to production studios!
🧑🏼🔬 You will be someone who loves to code and build working systems. You are used to working in a fast-paced start-up environment. You will have experience with the software development life cycle, from ideation through implementation, to testing and release.
👩💼 You will join a group of more than 40 engineers in the R&D department and will have the opportunity to collaborate with multiple research teams across diverse areas. Our R&D research is guided by our co-founders, Prof. Lourdes Agapito and Prof. Matthias Niessner.
If you know and love Voicebox, Whisper, VALL-E, SPEAR-TTS and more, and you love machine learning and large data, then we would love to talk to you. We would also love to talk to you if this is the kind of work you dream of doing. 🤩
🚀 In this position, you'll join the team to help develop our LLM-based TTS system that will provide our customers with voice clones that are indistinguishable from real voices. You will also help us create high-quality, production-ready code and take ownership of production pipelines. This would include
Posted June 1, 2026