Remote

Senior Software Engineer, Data Engineering - GitHub

Software Engineer

As a Senior Software Engineer in Data Engineering, you will design, build, and scale robust data pipelines and platforms using Python, Spark, Kafka, and AWS to support analytics and machine‑learning workloads.

About the role

Key Responsibilities

Design, develop, and maintain high‑performance data pipelines that ingest, transform, and store large‑scale event streams.
Collaborate with product, analytics, and machine‑learning teams to define data models and schema evolution strategies.
Implement and optimize ETL processes using Python, SQL, and Apache Spark on AWS services such as EMR, S3, and Redshift.
Build reliable streaming solutions with Kafka, ensuring low latency and fault tolerance.
Drive best practices for data quality, monitoring, and observability across the data platform.

Requirements

5+ years of professional experience in data engineering or backend software development.
Strong proficiency in Python, SQL, and distributed processing frameworks (e.g., Apache Spark).
Hands‑on experience with streaming technologies such as Kafka and cloud platforms, preferably AWS.
Demonstrated ability to design scalable data models and build end‑to‑end ETL pipelines.
Excellent problem‑solving skills and ability to work autonomously in a remote, collaborative environment.

Skills

Python, SQL, Apache Spark, Kafka, AWS

Company: GitHub

Department: Engineering

Location: United States

Experience: 5+ years

Tenure: Full-time

Level: Senior

Salary: 329,200

Posted June 24, 2026