Remote
Senior Site Reliability Engineer - Oowlish Technology
Site Reliability Engineer
Senior Site Reliability Engineer responsible for designing, deploying, and maintaining scalable, highly available systems on cloud platforms, automating infrastructure, and ensuring robust monitoring and incident response.
About the role
Senior Site Reliability Engineer at Oowlish Technology.
Key technologies: AWS, Kubernetes, Terraform.
Key Responsibilities
- Define and track SLOs, SLIs and error budgets
- Design and implement observability stacks (metrics, logging, tracing)
- Automate toil and improve system reliability through engineering
- Conduct post-mortems and drive blameless incident retrospectives
Requirements
- 5+ years of relevant experience in site reliability engineering
- Proficiency with monitoring tools (Prometheus, Grafana, Datadog)
- Strong programming skills for automation and tooling
Skills
Kubernetes, Docker, CI/CD, AWS, Prometheus, Python