remote

AI Data Engineer - Quantifind

Data Engineer

Experienced data engineer specializing in AI‑driven pipelines, knowledge graph construction, and large‑scale ingestion of structured and unstructured data using Python, Spark, Airflow, and cloud services.

About the role

Key Responsibilities

Design, build, and maintain high‑throughput data ingestion pipelines that transform raw sources into curated knowledge graphs.
Develop and orchestrate ETL workflows with Apache Airflow, ensuring reliability, scalability, and observability.
Implement data processing jobs using Spark and Python to handle both structured and unstructured datasets.
Integrate streaming data via Kafka and manage storage/compute resources on AWS (S3, Redshift, EMR, etc.).
Collaborate with product and research teams to define ontologies, data quality standards, and documentation frameworks.

Requirements

5+ years of professional experience in data engineering, with a focus on AI/ML‑enabled pipelines.
Proficiency in Python, SQL, and big‑data technologies such as Spark, Airflow, and Kafka.
Hands‑on experience building and deploying solutions on AWS cloud services.
Strong understanding of knowledge graph concepts, ontologies, and data provenance.
Excellent problem‑solving skills and a curiosity‑driven approach to data quality and value.

Skills

Python, SQL, Apache Spark, Kafka, AWS

Company: Quantifind

Department: Engineering

Location: Palo Alto, California, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Salary: 200,000

Posted June 24, 2026