Remote
Data Platform Engineer - Strategic Staffing Solutions
Data Engineer
Design, build, and maintain scalable data pipelines and platforms using Python and PySpark, integrating cloud services and ensuring high‑performance data processing for analytics and reporting.
About the role
Key Responsibilities
- Develop and optimize end‑to‑end data pipelines using Python, PySpark, and SQL to ingest, transform, and load large‑scale datasets.
- Design and implement data models and schemas that support analytical and reporting workloads.
- Collaborate with data scientists, analysts, and engineering teams to define data requirements and ensure data quality.
- Deploy and manage data platform components on cloud infrastructure (e.g., AWS), leveraging services such as S3, Redshift, and EMR.
- Monitor pipeline performance, troubleshoot issues, and implement automation for reliability and scalability.
Requirements
- 3+ years of experience building data pipelines with Python and PySpark.
- Strong SQL skills and experience with relational and columnar databases.
- Hands‑on experience with cloud platforms, preferably AWS, and related data services.
- Proficiency in data modeling, ETL design, and performance tuning.
- Ability to work both on‑site and remotely, and to communicate effectively with cross‑functional teams.