Remote
Staff Software Engineer - Platform, SysEng Canada - Grafana Labs
Software Engineer
Lead the design and delivery of scalable platform services for a cloud‑native observability stack, driving performance, reliability, and innovation across Kubernetes, AWS, and Grafana ecosystems.
About the role
Key Responsibilities
- Architect and implement high‑throughput, fault‑tolerant services in Go and Python that power Grafana Cloud’s observability platform.
- Own end‑to‑end feature development, from requirements gathering to production deployment, ensuring robust CI/CD pipelines on Kubernetes.
- Collaborate with cross‑functional teams to integrate OpenTelemetry, Prometheus, and Loki, enhancing data ingestion and query performance.
- Mentor junior engineers, conduct code reviews, and champion best practices in cloud‑native design and observability.
- Drive performance tuning, capacity planning, and incident response for mission‑critical services.
Requirements
- 10+ years of software engineering experience, including 5+ years in a senior or staff role.
- Deep expertise in Go, Python, and Kubernetes‑based microservices.
- Hands‑on experience with AWS services (EKS, S3, CloudWatch) and observability tooling (Grafana, Prometheus, Loki, OpenTelemetry).
- Strong background in distributed systems, scalability, and reliability engineering.
- Excellent communication skills and a passion for open‑source collaboration.
Skills
Python, Go, Kubernetes, AWS, Grafana