remote
Senior Engineer, Network Observability - CoreWeave Europe
Software Engineer
Lead the design and implementation of scalable network telemetry solutions using Python, Go, and Kubernetes, driving observability across cloud-native infrastructure with Prometheus, Grafana, and AWS services.
About the role
Key Responsibilities
- Architect and develop high‑throughput telemetry pipelines for network traffic, leveraging Python and Go to ingest, process, and store metrics at scale.
- Design and maintain Kubernetes‑native observability stacks, integrating Prometheus, Grafana, and custom exporters to provide real‑time visibility into network performance.
- Collaborate with cross‑functional teams to define data models, alerting rules, and dashboards that support proactive incident response and capacity planning.
- Implement automated deployment and CI/CD workflows for observability components, ensuring zero‑downtime updates and robust rollback mechanisms.
- Analyze and troubleshoot complex network anomalies, translating findings into actionable improvements for infrastructure and application teams.
Requirements
- 5+ years of experience in network engineering or observability, with a strong background in distributed systems.
- Proficiency in Python and Go, and hands‑on experience with Kubernetes, Prometheus, and Grafana.
- Deep understanding of network protocols (TCP/IP, BGP, MPLS) and experience with packet capture and flow analysis.
- Experience deploying observability solutions on AWS, including EC2, EKS, and CloudWatch.
- Excellent problem‑solving skills, strong communication, and a passion for building reliable, scalable monitoring platforms.
Skills
Python, Go, Kubernetes, Prometheus, Grafana, AWS