Remote

Data Solutions Engineer - Abaka AI

Software Engineer

Lead end‑to‑end data pipeline development for AI workloads, leveraging Python, SQL, Spark, and AWS services to deliver scalable, high‑quality data solutions that power generative and embodied AI applications.

About the role

Key Responsibilities

Design, build, and maintain robust data pipelines that ingest, transform, and serve large volumes of structured and unstructured data for AI models.
Implement data quality checks, monitoring, and alerting to ensure pipeline reliability and performance.
Collaborate with data scientists and product teams to understand data requirements and translate them into scalable engineering solutions.
Optimize data workflows using Spark, SQL, and AWS services (S3, Redshift, Glue, Athena) for cost‑effective, high‑throughput processing.
Document architecture, data schemas, and best practices for future maintenance and onboarding.

Requirements

5+ years of experience in data engineering or related roles, with a strong focus on AI/ML data pipelines.
Hands‑on experience with AWS data services (S3, Redshift, Glue, Athena) and workflow orchestration tools like Airflow.
Solid understanding of data modeling, ETL best practices, and performance tuning.
Excellent problem‑solving skills and ability to work collaboratively in a fast‑paced, cross‑functional environment.

Skills

Python, SQL, AWS, Apache Spark, Airflow

Company: Abaka AI

Department: Engineering

Location: Mountain View, California, United States

Experience: 4+ years

Tenure: Full-time

Level: Mid-Level

Salary: 160,000

Posted June 21, 2026