About the Role
As a Machine Learning Ops Engineer at Cloudbeds, you will be instrumental in building and implementing features that empower lodging customers to make data-driven pricing decisions. These features will utilize both heuristic data and advanced machine learning techniques to optimize revenue strategies. You will collaborate closely with product and engineering teams to identify improvement opportunities, develop innovative solutions, and drive revenue growth for hotels using the platform. Your primary focus will be on ensuring the reliability, scalability, and high quality of ML systems from development to production, establishing robust MLOps practices and rigorous testing processes across the entire ML lifecycle. From structuring data pipelines to implementing and validating ML models, you will own the end-to-end development of the revenue management application, ensuring hotels receive reliable, accurate insights to maximize success.
Our Machine Learning Team
Our machine learning team thrives on the unique challenge of revolutionizing guest experiences through AI-driven insights, transforming traditional hospitality with cutting-edge predictive algorithms. We foster collaborative innovation where data scientists, engineers, and product experts blend their expertise to prototype bold ideas and directly impact operational efficiency. We seek individuals who are passionate about continuous learning, unafraid to challenge conventions, and excited to work at the intersection of hospitality and deep technical work.
Responsibilities
- Develop and implement end-to-end machine learning features, emphasizing production readiness and system reliability, to enable customers to optimize their revenue strategies.
- Establish and maintain robust MLOps practices, including CI/CD for model training, testing, deployment, and monitoring.
- Design, build, and maintain highly reliable and well-tested data and ML pipelines to extract, transform, and structure large datasets for ML applications.
- Use Apache Airflow (or a similar orchestration tool such as Prefect or Dagster) to define, schedule, and monitor complex data and ML workflows (DAGs).
- Implement comprehensive software quality and testing processes for ML systems, covering unit, integration, and end-to-end testing for both code and data/model performance.
- Design, train, and rigorously test machine learning models as needed to improve pricing optimization, with a focus on statistical validation and production stability.
- Implement model performance monitoring (e.g., drift detection, data quality checks) to ensure deployed models maintain accuracy and relevance over time.
- Collaborate cross-functionally with product, engineering, and data science teams to define SLIs/SLOs for ML services and enhance system performance, stability, and usability.
- Conduct structured A/B testing and experimentation to validate model effectiveness and continuously improve performance, documenting results and sharing technical insights.
Requirements
- Bachelor's degree in Computer Science, Statistics, Mathematics, Data Science, or a related quantitative field.
- 3+ years of experience in a data engineering or machine learning role, with demonstrated success in MLOps and deploying models to production.
- Proven expertise in designing and implementing ML testing strategies (e.g., data validation, model correctness, performance testing).
- Expertise in deploying ML models at scale on AWS, with experience using MLflow or similar platforms.
- Strong Python programming skills and adherence to software engineering best practices (e.g., clean code, version control, code reviews).
- Expert-level SQL skills and experience working with large datasets for analysis and modeling.
- Strong problem-solving skills with the ability to apply creative, data-driven solutions to complex business challenges.
- Excellent communication and collaboration skills, with experience working cross-functionally with product and engineering teams.
Bonus Skills to Stand Out (Optional)
- Experience with CI/CD tooling (e.g., GitHub Actions, Jenkins) specifically for ML pipelines and Airflow DAG deployment.
- Experience with data quality monitoring tools and frameworks.
- Master’s or PhD in Computer Science, Data Science, or a related field.
- Relevant certifications (AWS, MLflow, or other data science/ML certifications).