onsite

Senior Site Reliability Engineer - Optimum

Site Reliability Engineer

Lead the design, implementation, and operation of highly available, scalable connectivity services using Kubernetes, AWS, and Docker, while driving automation, monitoring, and incident response to ensure world‑class uptime and performance.

About the role

Key Responsibilities

Architect, deploy, and maintain large‑scale, highly available services on Kubernetes and AWS infrastructure.
Implement CI/CD pipelines, infrastructure as code, and automated testing to accelerate feature delivery.
Design and maintain robust monitoring, alerting, and logging solutions to detect and resolve incidents proactively.
Lead incident response, post‑mortem analysis, and continuous improvement initiatives to enhance reliability.
Collaborate with development, security, and product teams to embed reliability best practices across the software lifecycle.

Requirements

5+ years of experience in site reliability or DevOps roles, with a strong focus on cloud and container orchestration.
Proficiency with Kubernetes, AWS services (EC2, EKS, S3, CloudWatch), and Docker.
Hands‑on experience with CI/CD tools (GitHub Actions, Jenkins, ArgoCD) and IaC (Terraform, CloudFormation).
Deep understanding of monitoring, alerting, and log aggregation tools (Prometheus, Grafana, ELK/EFK).
Excellent problem‑solving skills, strong communication, and a passion for continuous learning and automation.

Skills

Kubernetes, AWS, Docker

Company: Optimum

Department: Engineering

Location: Bethpage, New York, United States

Experience: 5+ years

Tenure: full-time

Level: Senior

Salary: 164,689

Posted June 22, 2026