remote

Site Reliability Engineer - Bigstone Health Commission

Lead the design, deployment, and operation of scalable, highly available cloud services using Kubernetes, Docker, and AWS, while implementing robust CI/CD pipelines and monitoring solutions to ensure reliability and performance.

About the role

Key Responsibilities

Design, build, and maintain production-grade infrastructure on AWS, leveraging services such as EC2, EKS, RDS, and S3.
Implement and manage Kubernetes clusters, ensuring high availability, auto‑scaling, and secure networking.
Develop and maintain CI/CD pipelines with GitHub Actions, Jenkins, or ArgoCD to automate application deployments and rollbacks.
Monitor system health using Prometheus, Grafana, and CloudWatch; respond to incidents and conduct post‑mortem analyses.
Collaborate with development teams to enforce best practices for code quality, security, and observability.

Requirements

5+ years of experience in site reliability engineering or DevOps roles.
Proficient with Kubernetes, Docker, and AWS services.
Strong scripting skills in Python or Bash for automation.
Experience with CI/CD tooling and infrastructure as code (Terraform, CloudFormation).
Excellent problem‑solving skills and a proactive, collaborative mindset.

Skills

kubernetes, docker, aws, cicd, python

Company: Bigstone Health Commission

Department: Engineering

Location: Manhattan, NY, United States

Experience: 3+ years

Tenure: full-time

Level: Mid-Level

Salary: 200,000

Posted June 19, 2026