Remote
Senior Data Engineer, AI - Caterpillar
Data Engineer
The Senior Data Engineer, AI designs and scales data pipelines, integrates machine‑learning models into them, and uses cloud services to support data‑driven decision making across the organization.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Python, Apache Spark, and Apache Airflow.
- Integrate and operationalize machine‑learning models into production workflows.
- Build and optimize data warehouses and data lakes on AWS (S3, Redshift, Glue).
- Collaborate with data scientists, analysts, and product teams to define data requirements and ensure data quality.
- Implement data governance, security, and performance monitoring best practices.
Requirements
- 5+ years of experience in data engineering or related fields.
- Strong proficiency in Python, SQL, and big‑data processing frameworks (Spark, Flink).
- Hands‑on experience with cloud platforms, preferably AWS, and orchestration tools such as Airflow.
- Demonstrated ability to deploy and support machine‑learning pipelines in production.
- Solid understanding of data modeling, ETL design, and performance tuning.
Skills
Python, SQL, Apache Spark, AWS, Machine Learning