remote

Site Reliability Engineer - moneybird

The Site Reliability Engineer is responsible for designing, deploying, and maintaining highly available cloud infrastructure using AWS, Kubernetes, and Terraform, while ensuring performance, reliability, and continuous improvement through automation and observability tools.

About the role

Key Responsibilities

Design, implement, and manage scalable, highly available infrastructure on AWS using Terraform and Kubernetes.
Automate deployment pipelines with CI/CD tools, ensuring rapid, reliable releases.
Monitor system health with Prometheus, Grafana, and custom alerts; troubleshoot incidents and conduct post‑mortems.
Collaborate with development teams to embed reliability best practices into code and architecture.
Implement security, compliance, and cost‑optimization strategies across the stack.

Requirements

3+ years of experience in site reliability or DevOps roles.
Proficiency in Python scripting and automation.
Hands‑on experience with Kubernetes, Docker, and AWS services (EC2, RDS, S3, EKS).
Strong knowledge of Terraform, CI/CD pipelines, and monitoring/alerting tools.
Excellent problem‑solving skills and a proactive, collaborative mindset.

Skills

Python, Kubernetes, AWS, Docker, Terraform, Prometheus, Grafana, CI/CD

Company: moneybird

Department: Engineering

Location: United States

Experience: 3+ years

Tenure: full-time

Level: Mid-Level

Posted June 19, 2026