onsite

Data Engineer - P2PSoftTek Inc

Data Engineer responsible for designing, building, and maintaining scalable ETL/ELT pipelines with Spark, managing data lake solutions on S3, and implementing CDC workflows for Redshift and Snowflake environments.

About the role

Key Responsibilities

Design, develop, and maintain high‑performance ETL/ELT pipelines using Apache Spark (PySpark or Scala).
Build and operate data lake architectures on Amazon S3, using the Parquet file format and the Apache Iceberg table format for efficient storage and querying.
Implement Change Data Capture (CDC) pipelines to deliver real‑time or near‑real‑time data feeds into Amazon Redshift and Snowflake warehouses.
Collaborate with data analysts and scientists to understand data requirements and translate them into robust data models.
Monitor, troubleshoot, and optimize pipeline performance, ensuring data quality, reliability, and scalability.

Requirements

Strong experience with Apache Spark, including hands‑on development in PySpark or Scala.
Proficiency in building data lakes on Amazon S3 and working with Parquet files and Iceberg tables.
Hands‑on experience with Amazon Redshift and Snowflake, including data loading and performance tuning.
Knowledge of CDC techniques and tools for streaming or batch data replication.
Solid understanding of SQL, data modeling, and best practices for ETL/ELT pipeline design.

Skills

Apache Spark, Scala, Snowflake

Company: P2PSoftTek Inc

Department: Engineering

Location: Charlotte, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 26, 2026