Remote
Senior Data Scientist - Stellent IT LLC
Data Scientist
Senior Data Scientist responsible for designing scalable ETL/ELT pipelines, optimizing data processing with Databricks and Spark, and implementing lakehouse architecture on AWS to ensure high‑quality, reliable data for analytics.
About the role
Key Responsibilities
- Design, build, and maintain scalable ETL/ELT pipelines for both structured and unstructured data sources.
- Develop and optimize data processing workflows using Databricks, Apache Spark, and SQL.
- Implement data quality, validation, lineage, and monitoring frameworks to ensure reliable data delivery.
- Support and evolve the medallion/lakehouse architecture (bronze, silver, gold layers) on AWS cloud services.
- Collaborate with analytics and machine‑learning teams to provide clean, curated datasets for model development and reporting.
Requirements
- 5+ years of experience in data engineering or data science roles.
- Proficiency with Databricks, Apache Spark, and advanced SQL querying.
- Strong hands‑on experience with AWS services (S3, Glue, Redshift, Lambda, etc.).
- Demonstrated ability to design ETL/ELT pipelines and implement data quality controls.
- Solid programming skills in Python and familiarity with lakehouse/medallion concepts.
Skills
Databricks, Apache Spark, SQL, AWS, Python