Remote

Data Scientist / Data Engineer - Caterpillar

Data Scientist

Develop and deploy data pipelines and machine‑learning models on cloud platforms, turning raw data into actionable insights for industrial applications.

About the role

Key Responsibilities

Design, build, and maintain scalable data pipelines using Python, SQL, and Apache Spark.
Develop, train, and operationalize machine‑learning models to support predictive maintenance and optimization initiatives.
Collaborate with cross‑functional teams to translate business requirements into data solutions.
Implement cloud‑native services on AWS for data storage, processing, and model deployment.
Monitor data quality, performance, and model accuracy, and make continuous improvements.

Requirements

Strong proficiency in Python and SQL for data manipulation and analysis.
Experience with machine‑learning frameworks (e.g., scikit‑learn, TensorFlow, PyTorch).
Hands‑on experience building data pipelines with Apache Spark or similar big‑data technologies.
Practical knowledge of AWS services such as S3, Redshift, Lambda, and SageMaker.
Solid understanding of data modeling, ETL processes, and software engineering best practices.

Skills

Python, SQL, machine learning, AWS, Apache Spark

Company: Caterpillar

Department: Research

Location: ENG, United Kingdom

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Salary: 65,000

Posted June 24, 2026