onsite

Machine Learning Engineer - Computer Vision, Multimodal & Generative AI

ML Engineer

Develop and deploy cutting‑edge computer‑vision and multimodal generative AI systems, focusing on diffusion models, transformers, and efficient, production‑ready pipelines for virtual try‑on, video modeling, and smart sizing.

About the role

Key Responsibilities

Design, implement, and optimize multimodal AI pipelines that combine image, video, and generative components.
Research and adapt state‑of‑the‑art architectures such as diffusion models and transformers for photorealistic virtual try‑on and video‑based modeling.
Improve model controllability, compute efficiency, and scalability to meet production constraints.
Collaborate with applied research and engineering teams to translate prototypes into robust, deployable services.
Maintain code quality, documentation, and automated testing for all machine‑learning components.

Requirements

Strong programming skills in Python and experience with deep‑learning frameworks like PyTorch or TensorFlow.
Hands‑on experience building computer‑vision or generative AI models, particularly diffusion models or transformer‑based architectures.
Proven ability to optimize models for speed, memory, and inference cost in production environments.
Background in multimodal representation learning, including joint modeling of image and video data.
Strong problem‑solving skills and the ability to work at the intersection of research and engineering.

Skills

Python, PyTorch, TensorFlow, computer vision, generative AI

Department: Research

Location: San Francisco, California, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 23, 2026