Remote
Site Reliability Engineer - Technical Lead (12-month FTC) - BTG Pactual Europe
Engineering Manager
Technical Lead for Site Reliability Engineering, driving automation, cloud infrastructure, and observability using Python, Kubernetes, AWS, Terraform and CI/CD pipelines to ensure high‑availability services.
About the role
Key Responsibilities
- Design, implement, and maintain highly available, scalable infrastructure on AWS using Terraform and Kubernetes.
- Lead the development of automation scripts and tools in Python to streamline deployment, configuration, and incident response.
- Establish CI/CD pipelines and promote best practices for continuous integration, delivery, and testing.
- Implement robust monitoring, logging, and alerting solutions to proactively detect and resolve performance issues.
- Mentor and guide SRE team members, fostering a culture of reliability, automation, and continuous improvement.
Requirements
- 5+ years of experience in site reliability or DevOps engineering, with a strong focus on cloud platforms (AWS) and container orchestration (Kubernetes).
- Proficiency in Python for scripting and automation, and solid experience with infrastructure‑as‑code tools such as Terraform.
- Hands‑on experience building and maintaining CI/CD pipelines (e.g., Jenkins, GitLab CI, GitHub Actions).
- Deep understanding of Linux systems, networking, and performance tuning.
- Demonstrated ability to implement monitoring, logging, and alerting frameworks (Prometheus, Grafana, ELK, etc.) and to lead technical teams.
Skills
Python, Kubernetes, AWS, Terraform, CI/CD, Linux