About Lingokids
Lingokids is a global leader in educational technology, helping over 185 million families worldwide raise amazing kids through Playlearning™, our unique approach that blends education with play. Our mission is to empower children with modern learning experiences, combining educational subjects with essential life skills to help them grow into confident, conscious, and resilient lifelong learners.
Beyond our award-winning app, we’ve built a multi-platform educational universe, including our “_Baby Bot_” and “_Baby Bot’s Backyard Tales_” shows, Podcasts, and Music Publishing. Our content, developed in collaboration with top education experts and Oxford University Press, ensures an engaging, high-quality learning experience in a safe, ad-free environment. This dedication to excellence has earned Lingokids multiple industry awards across app, podcast, and video categories, including Best Original Learning App by Kidscreen Awards, National Parenting Product Awards by NAPPA Awards, and Best Parenting Product by Good Housekeeping, among many others!
About the Role
As a Data Scientist (Applied ML Engineering & Recommendations) on the Product Engagement team, your mission is twofold: keep our recommendation infrastructure robust, scalable, and production-ready, and explore and validate more advanced recommendation algorithms that could take our personalization to the next level. Where the Data Scientist (Recommendations & Experimentation) designs the statistical logic, you are the person who makes sure it actually works in production - at scale, reliably, and fast - while also pushing the frontier of what our recommendation engine is capable of technically. Think of yourself as the engineering backbone and the technical innovator of the recommendations squad.
What you'll do
- Own the production recommendation infrastructure: maintain and improve the systems that serve personalized content to millions of users, ensuring reliability, low latency, and scalability as the catalog and user base grow.
- Research and prototype advanced recommendation algorithms: explore newer approaches - deep learning-based models, contextual bandits, session-based recommendations, graph-based methods - evaluate their potential, and run controlled experiments to validate uplift before production.
- Productionize ML models and pipelines: take prototypes (your own or from the team's Data Scientist) and turn them into production-grade, monitored, maintainable features integrated into the live recommendation engine.
- Design scalable infrastructure: anticipate bottlenecks and design systems that can handle larger catalogs, more complex segmentations, and higher traffic - including serving layer optimization, caching strategies, and pipeline orchestration.
- Build and maintain data pipelines in DBT and Databricks, ensuring clean transformations, data quality, and robust experimentation frameworks that the team can rely on.
- Monitor model health in production: define retraining strategies, detect drift, and ensure recommendation quality is measured and maintained over time.
- Collaborate closely with the Data Scientist and Senior Analyst to translate statistical insights and business requirements into engineering decisions.
What you'll bring
- Python for ML and infrastructure: strong Python skills applied to model training, evaluation, deployment, and pipeline scripting. You write production-quality, testable, version-controlled code - not just notebooks.
- SQL and DBT: solid SQL and hands-on DBT experience to build and maintain reliable transformation pipelines with clear data lineage and quality controls.
- ML production on AWS: hands-on experience deploying and monitoring ML models using AWS services (SageMaker, Lambda, ECS, Step Functions). You understand model drift, monitoring strategies, and retraining triggers.
- Batch ML model training and evaluation pipelines: experience designing, building, and maintaining scalable training and evaluation pipelines that support recommendation systems and related personalization use cases. This includes developing robust, well-monitored workflows for model development, deployment, and continuous improvement, and helping the recommendation infrastructure evolve toward more adaptive and responsive systems over time.
- Advanced ML algorithms: familiarity with recommendation techniques beyond collaborative filtering - e.g. neural approaches (two-tower models, transformers for sequences), contextual bandits, learning-to-rank. You know how to evaluate and compare them rigorously.
- Orchestration and CI/CD: experience with orchestration tools (Airflow, Prefect, or Dagster) for reliable, observable pipelines, and comfort with Git and CI/CD workflows for ML systems.
- Scalability and system design mindset: you can anticipate infrastructure bottlenecks, reason through architecture trade-offs (batch vs. streaming, horizontal vs. vertical scaling), and connect engineering decisions to business outcomes.
Nice to have
- Experience with real-time or low-latency serving layers (Redis, DynamoDB, or equivalent) - the system is currently batch-based, but session-level adaptation is a future direction.
- Experience with experimentation frameworks for ML systems, including online evaluation of recommendation algorithms (A/B tests, interleaving, counterfactual evaluation).
- Knowledge of modern data stack tools (Snowflake, BigQuery, Fivetran).
- Exposure to knowledge graph or content graph approaches for content-aware recommendation.
- Interest in balancing data-driven optimization with pedagogical or brand-driven constraints (e.g. content diversity goals, curated onboarding, character injection).
English is a must: We’re a multicultural team providing a service in English, so while certifications aren’t necessary, fluency is essential. As a fully remote company, clear and effective spoken and written communication, especially in asynchronous and long-form formats, is key to collaborating.