Remote
Senior Data Scientist - Gandiva Insights
Data Scientist
The Senior Data Scientist leads the design and implementation of scalable ETL pipelines on Databricks and AWS, handles structured and unstructured data, ensures data quality, and delivers analytics solutions.
About the role
Key Responsibilities
- Design, develop, and maintain high‑performance ETL/ELT pipelines using Databricks, Apache Spark, and SQL to ingest and transform both structured and unstructured data.
- Leverage AWS services (e.g., S3, Glue, Lambda, Redshift) to build cloud‑native data processing workflows and ensure scalability.
- Implement data quality, validation, lineage, and monitoring frameworks to guarantee reliable data delivery.
- Collaborate with analytics and product teams to translate business requirements into robust data models and feature pipelines.
- Optimize query performance and resource utilization, and conduct root‑cause analysis of bottlenecks.
Requirements
- 5+ years of experience in data engineering or data science, with a strong focus on building production‑grade pipelines.
- Proficiency in Python, SQL, and Spark programming.
- Hands‑on experience with Databricks and core AWS services for data processing.
- Demonstrated ability to implement data quality, monitoring, and lineage solutions.
- Excellent problem‑solving skills and ability to work autonomously in a fast‑paced environment.
Skills
Python, SQL, Apache Spark, Databricks, AWS