Remote
Team Lead, Support Engineer - Heidi
Software Engineer
Lead a high‑performing support engineering team, driving incident resolution, process improvement, and product reliability for a global AI‑powered healthcare platform using Python, AWS, and Kubernetes.
About the role
Key Responsibilities
- Lead and mentor a distributed support engineering team, ensuring rapid incident triage and resolution for a global AI healthcare platform.
- Design and implement scalable monitoring, alerting, and incident response workflows using AWS CloudWatch, Prometheus, and PagerDuty.
- Collaborate with product, engineering, and data science teams to prioritize and ship reliability improvements and feature enhancements.
- Own post‑mortem processes, root‑cause analysis, and continuous improvement initiatives to reduce MTTR and improve customer satisfaction.
- Drive automation of build and deployment pipelines using GitHub Actions, Terraform, and Docker, and maintain Kubernetes clusters.
Requirements
- 5+ years of experience in support or reliability engineering, with at least 2 years in a leadership role.
- Proficiency in Python, AWS services (EC2, S3, RDS, Lambda), and Kubernetes orchestration.
- Strong background in monitoring, alerting, and incident management tools.
- Excellent communication skills and a customer‑centric mindset.
- Experience with CI/CD pipelines, Terraform, and containerization is a plus.
Skills
Python, AWS, Kubernetes, CI/CD, customer support