onsite
Data Engineer - Vforce Infotech
Data Engineer building scalable pipelines with Python, Scala, and Spark on AWS to integrate diverse data sources, ensuring high‑quality, accessible data for analytics and modeling.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Python, Scala, and Apache Spark.
- Integrate and transform data from multiple heterogeneous sources into a unified data warehouse on AWS.
- Optimize ETL processes for performance, reliability, and cost efficiency.
- Collaborate with data scientists and analysts to provide clean, high‑quality data for modeling and reporting.
- Implement data quality checks, monitoring, and alerting to ensure data integrity.
Requirements
- Strong experience with Python, Scala, and SQL for data engineering tasks.
- Proficiency in Apache Spark and related big‑data technologies.
- Hands‑on experience with AWS services (S3, Redshift, Glue, EMR).
- Solid understanding of data warehousing concepts and ETL best practices.
- Excellent problem‑solving skills and the ability to work in a fast‑paced environment.
Skills
Python, Scala, SQL, Apache Spark, AWS