Remote
Principal Software Engineer, Data Infrastructure - Roblox
Software Engineer
Lead the design and implementation of large‑scale data pipelines and infrastructure, driving performance, reliability, and scalability for millions of daily users across the platform.
About the role
Key Responsibilities
- Architect and build end‑to‑end data pipelines that ingest, transform, and serve terabyte‑scale datasets for real‑time analytics and machine learning workloads.
- Collaborate with cross‑functional teams to define data models, schema evolution strategies, and governance policies that ensure data quality and compliance.
- Optimize distributed processing frameworks (Spark, Hadoop) for cost, latency, and throughput, leveraging AWS services such as EMR, S3, and Redshift.
- Design and maintain resilient, scalable infrastructure on Kubernetes, implementing CI/CD pipelines and automated monitoring for production workloads.
- Mentor and guide junior engineers, fostering a culture of best practices, code quality, and continuous improvement.
Requirements
- 10+ years of software engineering experience with a focus on data engineering and distributed systems.
- Proficiency in Python and Scala, with deep knowledge of Apache Spark and Hadoop ecosystems.
- Hands‑on experience deploying and managing data workloads on AWS and Kubernetes.
- Strong SQL skills and familiarity with data warehousing solutions such as Redshift or Snowflake.
- Excellent communication skills and a proven ability to lead technical initiatives in a fast‑paced environment.
Skills
Python, Scala, Apache Spark, Hadoop, AWS, Kubernetes, SQL