onsite
Software Engineer, Data Infrastructure & Acquisition - Jobgether
Software Engineer
Build and scale data pipelines and infrastructure that feed large‑scale machine learning models, leveraging Python, SQL, AWS, and data engineering best practices in a fully distributed environment.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data ingestion pipelines for large‑volume datasets used in AI training.
- Implement robust data quality, validation, and monitoring solutions across distributed systems.
- Collaborate with ML teams to optimize data flow and storage for model training and inference.
- Automate deployment and scaling of data services using AWS infrastructure and CI/CD pipelines.
- Continuously improve performance, reliability, and cost efficiency of data infrastructure.
Requirements
- Strong experience with Python and SQL for data processing and ETL.
- Hands‑on expertise in AWS services (S3, Redshift, EMR, Glue, Lambda).
- Proficiency in building and maintaining data pipelines with tools such as Airflow and Spark.
- Solid understanding of machine learning data needs and best practices.
- Excellent problem‑solving skills and ability to work independently in a distributed team.
Skills
Python, SQL, AWS, machine learning