remote
Observability Engineer - Accommodations Plus International
Software Engineer
Observability Engineer building scalable monitoring solutions on Kubernetes and AWS, using Prometheus, Grafana, and Python to keep travel industry services highly available and performant.
About the role
Key Responsibilities
- Design, implement, and maintain an end‑to‑end observability stack for microservices running on Kubernetes.
- Develop Prometheus exporters and Grafana dashboards to surface key metrics and alerts.
- Automate infrastructure provisioning and configuration using Terraform and AWS CloudFormation.
- Collaborate with DevOps and SRE teams to troubleshoot incidents and improve system reliability.
- Document observability best practices and contribute to the knowledge base.
Requirements
- 3+ years of experience with Kubernetes, Prometheus, and Grafana.
- Strong scripting skills in Python and experience with IaC tools such as Terraform.
- Hands‑on experience with AWS services (EKS, CloudWatch, Lambda).
- Excellent problem‑solving skills and ability to work in a fast‑paced environment.
Skills
Prometheus, Grafana, Kubernetes, AWS, Python, Terraform