Data Engineer, LLM AI Platforms
Principal Data Engineer role at CrowdStrike, focusing on LLM/AI Platforms, requiring expertise in Python, Machine Learning, and AWS.
As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.
About the Role:
CrowdStrike is looking for a Principal Data Engineer with deep expertise in Large Language Models (LLMs) and AI platforms to join our growing Data Science Platform Engineering Team. You will be a key leader, responsible for designing, building, and deploying cutting-edge data infrastructure that powers our next generation of AI-driven security products. This role requires significant hands-on experience in LLM integration, agentic workflows, and agent harnessing to deliver high-impact, scalable solutions. You will champion engineering excellence, focusing on shipping fast, writing elegant, high-quality code, and actively mentoring and strengthening the team's technical knowledge and capabilities.
Our systems and data are approaching exabyte scale. Experience with extremely large-scale systems, including DevSecOps patterns, practices, and standards, is important for this work.
What You'll Do:
Architect, implement, and optimize data platforms and pipelines specifically designed to support LLMs, Retrieval-Augmented Generation (RAG), and sophisticated agentic AI systems at exabyte scale.
Drive the adoption and deployment of agentic workflows and agent harnessing techniques to create autonomous, data-driven security features.
Design and implement highly scalable, fault-tolerant, and cost-effective data solutions, emphasizing rapid iteration and high-quality deployment.
Write elegant, production-ready code with a focus on performance, maintainability, and testing rigor, ensuring the ability to ship fast without compromising quality.
Provide technical leadership and deep expertise in data modeling, normalization, and semantic cataloging for AI/ML workloads.
Establish best practices for MLOps/DataOps surrounding LLMs, including monitoring, observability, and zero-touch recovery mechanisms for AI services.
Actively mentor engineers, conducting technical workshops, leading design reviews, and strengthening the team's knowledge in cutting-edge AI platform technologies.
Posted June 7, 2026