remote

Junior Site Reliability Engineer - Fable

Site Reliability Engineer

Junior Site Reliability Engineer responsible for ensuring the reliability, performance, and scalability of Fable’s accessibility platform using Kubernetes, Docker, AWS, and modern monitoring tools, while collaborating with development teams to implement CI/CD pipelines and automation scripts.

About the role

Key Responsibilities

Maintain and improve the reliability and uptime of production services running on Kubernetes clusters in AWS.
Implement and manage CI/CD pipelines, ensuring automated testing, deployment, and rollbacks.
Configure and monitor the observability stack (Prometheus, Grafana, Loki) to detect and resolve performance bottlenecks.
Automate infrastructure provisioning and configuration using IaC tools and Python scripts.
Collaborate with developers to troubleshoot incidents, conduct post‑mortems, and implement preventive measures.

Requirements

1–2 years of experience in site reliability or DevOps roles.
Proficiency in scripting (Python or Bash) for automation tasks.
Familiarity with monitoring and alerting tools such as Prometheus, Grafana, and Loki.
Strong problem‑solving skills and a proactive approach to incident management.

Skills

kubernetes, docker, aws, cicd, python

Company: Fable

Department: Engineering

Location: ON, CA, United States

Experience: 2+ years

Tenure: full-time

Level: Junior

Salary: 90,000

Posted June 20, 2026