onsite

Founding DevOps Engineer / SRE - Cygrid GmbH

Site Reliability Engineer

Lead the design and operation of scalable, resilient infrastructure for a fast‑growing startup, driving automation, reliability, and performance using Kubernetes, CI/CD pipelines, and cloud services.

About the role

Key Responsibilities

Architect, deploy, and maintain production‑grade Kubernetes clusters on AWS, ensuring high availability and scalability.
Design and implement CI/CD pipelines with GitOps principles, automating code delivery from commit to production.
Build and manage an observability stack (Prometheus, Grafana, Loki) for real‑time monitoring, alerting, and incident response.
Implement infrastructure as code using Terraform and Helm, enforcing version control and reproducibility.
Collaborate with development teams to embed SRE practices, such as error budgets, blameless post‑mortems, and capacity planning.
Lead incident management, root‑cause analysis, and continuous improvement of reliability metrics.

Requirements

5+ years of experience in DevOps or SRE roles, with a strong background in cloud-native technologies.
Proficiency with Kubernetes, Docker, and container orchestration best practices.
Hands‑on experience with AWS services (EKS, EC2, S3, CloudWatch) and IaC tools (Terraform, Helm).
Solid scripting skills in Bash, Python, or Go for automation.
Excellent problem‑solving skills, strong communication, and a proactive ownership mindset.

Skills

Kubernetes, CI/CD, AWS, Terraform

Company: Cygrid GmbH

Department: Engineering

Location: Berlin, Germany

Experience: 3+ years

Tenure: full-time

Level: Mid-Level

Posted June 21, 2026