Remote
Senior Site Reliability Engineer - SentinelOne
Site Reliability Engineer
The Senior Site Reliability Engineer designs, automates, and operates highly available cloud infrastructure, drives reliability initiatives, and supports security‑focused services using Kubernetes, Terraform, Python, and AWS.
About the role
Key Responsibilities
- Design, build, and maintain scalable, secure infrastructure on AWS supporting AI‑driven security services.
- Develop and manage container orchestration platforms (Kubernetes) to ensure high availability and performance.
- Automate provisioning, configuration, and deployment pipelines using Terraform and Python.
- Implement monitoring, alerting, and incident response processes to meet stringent reliability SLAs.
- Collaborate with development and security teams to embed reliability and compliance into the software lifecycle.
Requirements
- 5+ years of SRE or DevOps experience in cloud environments, preferably AWS.
- Strong expertise with Linux systems, Kubernetes, and infrastructure‑as‑code tools such as Terraform.
- Proficiency in scripting/programming (Python) for automation and tooling.
- Hands‑on experience with monitoring, logging, and incident management frameworks.
- Solid understanding of security best practices and ability to work in a security‑focused organization.
Skills
Linux, Kubernetes, Terraform, Python, AWS