Onsite
AI/ML Data Engineer - GCI Careers
Data Engineer
Design, build, and maintain scalable AI/ML data pipelines on cloud platforms, leveraging Python, Spark, and Airflow to support advanced analytics and model deployment for mission‑critical applications.
About the role
Key Responsibilities
- Develop and optimize end‑to‑end data pipelines for ingesting, processing, and storing large‑scale structured and unstructured data.
- Implement ETL workflows using Apache Spark and Airflow, ensuring reliability, performance, and observability.
- Collaborate with data scientists and ML engineers to provision feature stores and serve data for model training and inference.
- Design and manage cloud infrastructure on AWS (S3, Redshift, EMR, Lambda) to support scalable, secure data solutions.
- Apply best practices for data governance, security, and compliance, including handling classified data under TS/SCI requirements.
Requirements
- 5+ years of professional experience in data engineering, with strong Python and SQL programming skills.
- Hands‑on expertise with Apache Spark, Airflow, and cloud services (AWS preferred).
- Experience building and operationalizing machine‑learning pipelines and feature stores.
- Demonstrated ability to work in secure environments; an active TS/SCI clearance with polygraph is required.
- Solid understanding of data modeling, warehousing, and performance tuning.
Skills
Python, SQL, Apache Spark, Airflow, AWS, machine learning