onsite
Software Engineer, Data Infrastructure & Acquisition - Melbourne, Australia - Speechify
Build and maintain scalable data pipelines and acquisition systems for a high‑growth text‑to‑speech platform, leveraging Python, Java, AWS, Kafka, and SQL to deliver reliable, real‑time data infrastructure.
About the role
Key Responsibilities
- Design, develop, and operate robust data ingestion pipelines that collect and process user interaction data from mobile, web, and browser extensions.
- Implement real‑time streaming solutions using Kafka and batch workflows with Apache Airflow to ensure data availability for analytics and machine‑learning teams.
- Collaborate with product, engineering, and data science stakeholders to define data schemas, quality metrics, and monitoring dashboards.
- Optimize storage and query performance on AWS services (S3, Redshift, RDS) and maintain secure, cost‑effective data architectures.
- Write production‑grade code in Python and Java, enforce best practices, and conduct code reviews to uphold reliability and scalability.
Requirements
- 3+ years of experience building data pipelines or data‑platform services in a cloud environment.
- Strong proficiency in Python and Java, with hands‑on experience using AWS services such as S3, Redshift, Lambda, or EMR.
- Deep understanding of streaming technologies (Kafka) and orchestration tools (Airflow or similar).
- Solid SQL skills and experience designing relational and columnar data models for analytics.
- Ability to work autonomously in a distributed team, communicate clearly, and troubleshoot production issues quickly.
Skills
Python, Java, AWS, Kafka, SQL, Airflow