Onsite

Data Engineer PySpark - IT First Source

Data Engineer

Seeking a Data Engineer to design, build, and maintain scalable data pipelines using PySpark, Python, and cloud services, enabling reliable data delivery for analytics and business intelligence.

About the role

Key Responsibilities

Design, develop, and optimize end‑to‑end data pipelines using PySpark and Python to ingest, transform, and store large‑scale datasets.
Implement and manage data workflows in Apache Airflow, ensuring reliable scheduling, monitoring, and error handling.
Collaborate with data analysts and data scientists to understand data requirements and deliver clean, well‑documented data assets.
Maintain and tune data storage solutions on AWS (e.g., S3, Redshift, RDS) for performance, cost efficiency, and security.
Apply best practices for data quality, lineage, and governance, including automated testing and validation.

Requirements

3+ years of professional experience in data engineering, with a focus on PySpark and Python.
Strong SQL skills and experience building data models in relational or columnar databases.
Hands‑on experience with AWS services such as S3, Redshift, Glue, or EMR.
Proficiency in orchestrating workflows using Apache Airflow or similar tools.
Solid understanding of data engineering concepts, including ETL/ELT design, data partitioning, and performance optimization.

Skills

Python, SQL, AWS

Company: IT First Source

Department: Engineering

Location: Charlotte, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 26, 2026