Remote
Site Reliability Engineer - HI Technology & Innovation
Hands‑on Site Reliability Engineer role supporting a platform team transitioning to cloud‑native tooling, focusing on infrastructure automation, monitoring, and scalability using Kubernetes, Docker, AWS, Terraform, and CI/CD pipelines.
About the role
Site Reliability Engineer at HI Technology & Innovation.
Key technologies: AWS, Kubernetes, Terraform, Kafka.
Key Responsibilities
- Define and track SLOs, SLIs and error budgets
- Design and implement observability stacks (metrics, logging, tracing)
- Automate toil and improve system reliability through engineering
- Conduct post-mortems and drive blameless incident retrospectives
Requirements
- 3+ years of relevant experience in site reliability engineering
- Proficiency with monitoring tools (Prometheus, Grafana, Datadog)
- Strong programming skills for automation and tooling
Skills
Kubernetes, Docker, AWS, Terraform, Prometheus, Grafana, CI/CD, Linux