remote, onsite
Lead Software Engineer - Data Ops - Caterpillar
Software Engineer
Lead a team building scalable data pipelines and platforms, leveraging Python, Java, Spark, Kafka, and AWS to enable real‑time analytics and data‑driven decision making.
About the role
Key Responsibilities
- Design, develop, and maintain high‑performance data ingestion and processing pipelines using Python, Java, Apache Spark, and Kafka.
- Architect and implement cloud‑native data platforms on AWS, ensuring scalability, reliability, and security.
- Lead a cross‑functional engineering team, providing technical guidance, code reviews, and mentorship.
- Collaborate with data scientists, analysts, and product owners to translate business requirements into robust data solutions.
- Establish best practices for data quality, monitoring, and observability across the data stack.
Requirements
- 5+ years of software engineering experience with a focus on data engineering or data operations.
- Strong proficiency in Python and Java, and hands‑on experience with Apache Spark and Kafka.
- Deep knowledge of AWS services (e.g., S3, EMR, Lambda, Kinesis) and infrastructure‑as‑code tools.
- Expertise in SQL and relational/NoSQL databases, with a solid understanding of data modeling.
- Proven leadership abilities, excellent communication skills, and a track record of delivering production‑grade data solutions.
Skills
Python, Java, Apache Spark, Kafka, AWS, SQL