remote
Telecom Observability Engineer - OXIO
Software Engineer
This is a hands‑on Telecom Observability Engineer role responsible for real‑time monitoring, detection, and escalation of core network issues, working with engineering and support teams to ensure service reliability.
About the role
Key Responsibilities
- Design, implement, and maintain observability pipelines for telecom core network and infrastructure using tools such as Prometheus and Grafana.
- Develop automated alerting and escalation workflows to quickly surface performance degradations and outages to Level 1/2 support.
- Collaborate with core engineering and telecom teams to integrate telemetry from routers, switches, and virtual network functions.
- Write scripts and utilities (primarily in Python) for data collection, normalization, and dashboard creation.
- Participate in on‑call rotations, perform root‑cause analysis, and document incident response procedures.
Requirements
- Strong experience with Linux environments and network monitoring tools (Prometheus, Grafana, ELK, or similar).
- Proficiency in Python for automation, data processing, and API integration.
- Solid understanding of telecom protocols and core network components (e.g., MPLS, SIP, LTE/5G transport).
- Experience building alerting, dashboarding, and incident‑response workflows in a production environment.
- Excellent problem‑solving skills and ability to work cross‑functionally with engineering and support teams.
Skills
python, linux, prometheus, grafana