onsite

Site Reliability Engineer - Intapp

Lead the design and operation of AI‑native infrastructure, automating incident response and anomaly detection to keep mission‑critical services running smoothly.

About the role

Key Responsibilities

Architect, deploy, and maintain highly available cloud infrastructure that supports AI agents and data pipelines.
Develop and refine automated incident response workflows, leveraging AI for anomaly detection and root‑cause analysis.
Collaborate with DevOps, security, and product teams to implement observability, monitoring, and alerting best practices.
Drive continuous improvement of reliability metrics, reducing mean time to recovery and toil across the platform.
Mentor and guide junior engineers on SRE principles, cloud operations, and AI‑powered tooling.

Requirements

5+ years of experience in site reliability engineering or cloud operations.
Proficiency with Kubernetes, container orchestration, and cloud platforms (AWS, GCP, or Azure).
Hands‑on experience building AI/ML pipelines for monitoring, anomaly detection, or incident automation.
Strong scripting skills (Python, Bash) and familiarity with CI/CD pipelines.
Excellent problem‑solving skills and a passion for building resilient, scalable systems.

Skills

Kubernetes, AWS

Company: Intapp

Department: Engineering

Location: Belfast, NIR, United Kingdom

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 19, 2026