remote

Site Reliability Engineer SRE Specialist - NTT DATA

Site Reliability Engineer

Experienced Site Reliability Engineer with 8+ years of experience managing observability, defining SLIs/SLOs, and building alerting pipelines with New Relic and automation in Linux environments.

About the role

Key Responsibilities

Own the end‑to‑end observability stack, including New Relic APM, infrastructure monitoring, dashboards, and alerting.
Define, implement, and continuously refine Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to drive reliability goals.
Design and maintain automated alerting and incident response workflows, ensuring rapid detection and resolution of production issues.
Collaborate with development and operations teams to embed reliability best practices into CI/CD pipelines.
Develop and maintain automation scripts and tooling for configuration management, scaling, and performance tuning on Linux platforms.

Requirements

8+ years of hands‑on experience in Site Reliability Engineering or related roles.
Deep expertise with New Relic for application performance monitoring and infrastructure observability.
Proven ability to design and manage SLIs/SLOs and associated alerting strategies.
Strong scripting/automation skills (e.g., Bash, Python) on Linux systems.
Experience with incident management processes and a track record of improving system reliability.

Skills

New Relic, Linux

Company: NTT DATA

Department: Engineering

Location: Addison, Texas, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 24, 2026