onsite
Member of Technical Staff, Product Engineer - Plato
Software Engineer
Lead the design and implementation of scalable simulation environments for training AI agents, leveraging Python, C++, and GPU-accelerated reinforcement learning pipelines on AWS infrastructure.
About the role
Key Responsibilities
- Architect and develop high‑fidelity simulation environments that ingest real‑world data streams and generate training signals for reinforcement learning agents.
- Implement performance‑critical components in C++ and Python, optimizing GPU utilization and keeping data‑pipeline latency low.
- Collaborate with research and product teams to translate experimental ideas into production‑ready services, using Docker and Kubernetes for deployment.
- Integrate with AWS services (S3, EC2, SageMaker) to scale compute resources and manage data storage securely.
- Continuously benchmark and profile simulation workloads, proposing and applying architectural improvements.
Requirements
- 5+ years of software engineering experience in high‑performance computing or AI infrastructure.
- Proficiency in Python and C++, with a strong grasp of multithreading, memory management, and GPU programming.
- Hands‑on experience with reinforcement learning frameworks (e.g., RLlib, Stable Baselines) and simulation engines.
- Solid understanding of cloud architecture, especially AWS, and container orchestration with Docker/Kubernetes.
- Excellent problem‑solving skills and a passion for building scalable, reliable systems.
Skills
Python, C, reinforcement learning, AWS, Docker