On-site
DataOS Data Engineer - HP Inc.
Data Engineer
Design and implement scalable data ingestion, transformation, and integration pipelines using Python, Spark, and cloud services to enable business‑driven data products.
About the role
Key Responsibilities
- Develop, test, and maintain high‑performance data pipelines for ingesting, cleaning, and enriching large‑scale datasets.
- Design data models and storage solutions that support analytics and downstream applications.
- Implement real‑time streaming workflows using Kafka and batch processing with Apache Spark.
- Leverage AWS services (S3, Redshift, Glue, Lambda) to build secure, scalable, and cost‑effective data platforms.
- Collaborate with product owners, data scientists, and engineers to translate business requirements into robust data solutions.
- Mentor junior team members and promote best practices in code quality, testing, and CI/CD pipelines.
Requirements
- 3+ years of experience building data pipelines and ETL processes in a cloud environment.
- Proficiency in Python and SQL, with hands‑on experience in Apache Spark or similar distributed processing frameworks.
- Strong understanding of streaming technologies such as Kafka and event‑driven architectures.
- Experience with AWS data services (S3, Redshift, Glue, Lambda) and infrastructure‑as‑code tools.
- Solid grasp of data modeling, schema design, and performance optimization techniques.
Skills
Python, SQL, Apache Spark, Kafka, AWS, CI/CD