onsite
Data Scientist / Senior Data Scientist - B2B Data
Lead B2B data initiatives: cleanse and govern large datasets on AWS with Apache Spark, ensure data integrity, and deliver actionable insights for enterprise clients.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Apache Spark on AWS to ingest, cleanse, and transform large B2B datasets.
- Implement robust data governance frameworks, ensuring data quality, lineage, and compliance across all data assets.
- Collaborate with cross-functional teams to define data standards, metadata management practices, and data integrity checks.
- Analyze complex datasets to uncover patterns, trends, and insights that drive business decisions and product enhancements.
- Mentor junior data scientists and engineers, fostering best practices in data engineering and analytics.
Requirements
- 5+ years of experience in data science or data engineering roles, with a strong focus on data cleansing and governance.
- Proficiency in AWS services (EMR, S3, Glue, Redshift) and Apache Spark (PySpark/Scala).
- Deep understanding of data quality concepts, metadata management, and data integrity techniques.
- Strong analytical skills with experience in statistical modeling and machine learning.
- Excellent communication skills and the ability to translate technical findings into business insights.