remote
Data Engineer - ABBYY
As a Data Engineer, you will build scalable data pipelines and warehouses using Python, SQL, ETL tools, AWS services, and Apache Spark to deliver high‑quality data for analytics and AI.
About the role
Key Responsibilities
- Design, develop, and maintain robust data pipelines that ingest, transform, and load data from diverse sources into cloud data warehouses.
- Implement and optimize ETL processes using Python, SQL, and Spark to ensure data quality, performance, and reliability.
- Collaborate with data scientists and business analysts to understand data requirements and deliver actionable insights.
- Monitor pipeline health, troubleshoot issues, and continuously improve data workflows and infrastructure.
- Document data models, pipeline logic, and best practices for future maintenance and scalability.
Requirements
- Proven experience as a Data Engineer or in a similar role, with strong Python and SQL skills.
- Hands‑on experience with ETL tools and frameworks, preferably Apache Spark.
- Solid understanding of data warehousing concepts and cloud platforms, especially AWS (Redshift, S3, Glue).
- Strong analytical mindset and ability to translate business needs into technical solutions.
- Excellent communication skills and a collaborative approach to working with cross‑functional teams.
Skills
Python, SQL, AWS, Apache Spark