On-site
Software Engineer, Data Infrastructure & Acquisition - Ottawa, Canada - Speechify
Lead the design and maintenance of scalable data pipelines and infrastructure, leveraging Python, AWS, Spark, and SQL to ingest, transform, and store large volumes of content for Speechify’s text‑to‑speech platform.
About the role
Key Responsibilities
- Design, build, and maintain robust data pipelines that ingest content from diverse sources (PDFs, web pages, documents) into a unified data lake.
- Implement ETL processes using Python and Spark, ensuring data quality, consistency, and performance at scale.
- Collaborate with product and engineering teams to define data models, schemas, and metadata standards.
- Optimize storage and query performance on AWS services (S3, Redshift, Athena) and troubleshoot production issues.
- Automate monitoring, alerting, and documentation for data workflows.
Requirements
- 3+ years of experience in data engineering or related roles.
- Experience with CI/CD pipelines and infrastructure as code.
- Excellent problem‑solving skills and a passion for building reliable, scalable systems.