Remote
Senior Cloud Engineer & Site Reliability Engineer - UST
Site Reliability Engineer
This fully remote role calls for a Senior Cloud Engineer & SRE with deep AWS expertise who will drive platform reliability, automation, and infrastructure-as-code using Terraform, Kubernetes, CI/CD pipelines, and Python scripting.
About the role
Key Responsibilities
- Design, implement, and maintain highly available AWS infrastructure using Terraform and best‑practice IaC patterns.
- Lead site reliability initiatives, including monitoring, incident response, and capacity planning to ensure service uptime.
- Build and manage Kubernetes clusters, including container orchestration and the related networking and security configurations.
- Build and optimize CI/CD pipelines for automated build, test, and deployment workflows.
- Write Python scripts and automation tools to streamline operational tasks and improve efficiency.
- Collaborate with development and product teams to embed reliability and security into the software lifecycle.
Requirements
- 5+ years of hands‑on experience managing production‑grade AWS environments.
- Proficiency with Terraform (or similar IaC) and Kubernetes orchestration.
- Strong background in Linux system administration and networking.
- Experience building CI/CD pipelines using tools such as Jenkins, GitLab CI, or GitHub Actions.
- Solid scripting skills in Python and familiarity with automation frameworks.
Skills
AWS, Terraform, Kubernetes, CI/CD, Python, Linux