onsite

DevOps / SRE Engineer for Cloud Environments - AI Focus - Scopevisio AG

Site Reliability Engineer

Lead cloud operations and site reliability engineering, driving automation, scalability, and AI‑enabled observability across AWS and Kubernetes environments.

About the role

Key Responsibilities

Design, implement, and maintain highly available cloud infrastructure on AWS, ensuring performance, security, and cost efficiency.
Develop and manage CI/CD pipelines, automating deployments, rollbacks, and blue‑green strategies for microservices.
Implement robust monitoring, logging, and alerting solutions (Prometheus, Grafana, ELK) to detect and resolve incidents proactively.
Collaborate with data science teams to integrate MLOps workflows, covering model deployment and monitoring in production.
Lead incident response, root‑cause analysis, and post‑mortem documentation to continuously improve reliability.

Requirements

3+ years of experience in DevOps or SRE roles within cloud‑native environments.
Hands‑on expertise with AWS services (EKS, ECS, Lambda, CloudFormation) and Kubernetes orchestration.
Proficiency in scripting (Python, Bash), infrastructure as code (Terraform), and configuration management (Ansible).
Strong understanding of CI/CD tools (GitLab CI, Jenkins, ArgoCD) and observability stacks.
Experience with MLOps concepts and integrating AI models into production pipelines is a plus.

Skills

AWS, Kubernetes, CI/CD

Company: Scopevisio AG

Department: Engineering

Location: Bonn, Germany

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 21, 2026