On-site
Site Reliability / DevOps Engineer - qbees GmbH
DevOps Engineer
Lead the design, deployment, and maintenance of scalable, highly available cloud infrastructure using Kubernetes, Docker, and AWS. Drive automation, monitoring, and incident response to ensure optimal system reliability and performance.
About the role
Key Responsibilities
- Architect, implement, and manage Kubernetes clusters and Docker containers across AWS environments.
- Design and maintain CI/CD pipelines with GitOps principles, ensuring rapid and reliable application delivery.
- Implement infrastructure as code using Terraform and automate configuration with Ansible or similar tools.
- Set up and tune monitoring, alerting, and logging solutions with Prometheus, Grafana, and the ELK stack.
- Lead incident response, root‑cause analysis, and post‑mortem documentation to continuously improve reliability.
- Collaborate with development teams to optimize application performance and scalability.
Requirements
- 3+ years of experience in site reliability engineering or DevOps roles.
- Strong proficiency with Kubernetes, Docker, and AWS services (EKS, EC2, S3, RDS).
- Hands‑on experience with Terraform, CI/CD tools (GitHub Actions, Jenkins, ArgoCD), and scripting in Python or Bash.
- Deep understanding of monitoring, alerting, and log aggregation (Prometheus, Grafana, ELK).
- Excellent problem‑solving skills and a proactive, collaborative mindset.
Skills
Kubernetes, Docker, CI/CD, AWS, Terraform, Prometheus, Grafana, Python