remote
Site Reliability Engineer - ALIQAN Technologies
Senior Site Reliability Engineer responsible for designing, deploying, and maintaining scalable cloud infrastructure, Kubernetes clusters, and CI/CD pipelines, ensuring high availability, performance, and observability across production environments.
About the role
Key Responsibilities
- Design, implement, and maintain scalable cloud infrastructure using Infrastructure as Code (IaC) principles across multiple cloud providers.
- Architect and manage Kubernetes clusters and containerized applications, ensuring high availability and efficient resource utilization.
- Develop and maintain automation scripts for infrastructure provisioning, configuration management, and deployment workflows.
- Implement and optimize CI/CD pipelines to enable rapid, reliable application releases.
- Design and implement monitoring, logging, and alerting solutions to ensure system reliability and rapid incident response.
- Collaborate with development teams to enforce best practices for security, performance, and cost optimization.
Requirements
- 4–6 years of experience in Site Reliability Engineering or DevOps roles.
- Proficiency with Kubernetes, Docker, and container orchestration best practices.
- Hands-on experience with IaC tools such as Terraform, CloudFormation, or Pulumi.
- Strong scripting skills (Python, Bash, or similar) for automation and tooling.
- Experience with CI/CD platforms (GitLab CI, Jenkins, Argo CD, etc.) and monitoring and logging solutions (Prometheus, Grafana, ELK, etc.).