Remote
Senior Data Scientist - LEO DOES IT INC
Data Scientist
Seeking a Senior Data Scientist with 5+ years of experience designing scalable ETL/ELT pipelines, using Databricks, Spark, SQL, and AWS to process structured and unstructured data while ensuring data quality and lineage.
About the role
Key Responsibilities
- Design, build, and maintain high‑performance ETL/ELT pipelines on Databricks and Apache Spark.
- Develop SQL queries and data models to transform raw data into analytics‑ready datasets.
- Use AWS services (S3, Glue, Redshift, Lambda) for storage, orchestration, and compute.
- Implement data quality checks, validation rules, lineage tracking, and monitoring dashboards.
- Collaborate with data scientists, analysts, and engineering teams to support machine‑learning model development and productionization.
Requirements
- 5+ years of hands‑on experience in data engineering or data science roles.
- Proficiency with Databricks, Apache Spark, and advanced SQL.
- Strong knowledge of AWS cloud services and best practices for data pipelines.
- Experience implementing data quality frameworks, monitoring, and lineage solutions.
- Excellent problem‑solving skills and ability to work in a fast‑paced, collaborative environment.
Skills
Databricks, Apache Spark, SQL, AWS