Onsite
Senior Software Engineer - Site Reliability - Freshworks
Software Engineer
Lead the design, deployment, and operation of highly available, scalable services using Kubernetes, Docker, and AWS. Drive automation, observability, and incident response to maintain high uptime and performance.
About the role
Key Responsibilities
- Architect, build, and maintain production-grade services on Kubernetes and AWS, ensuring high availability and scalability.
- Implement and manage CI/CD pipelines, automated testing, and deployment workflows to accelerate release cycles.
- Design and maintain monitoring, alerting, and logging solutions (Prometheus, Grafana, ELK) to detect issues early and resolve incidents quickly.
- Lead incident response, post‑mortem analysis, and continuous improvement initiatives to reduce MTTR and prevent recurrence.
- Collaborate with development teams to embed SRE best practices into code reviews, architecture decisions, and performance tuning.
Requirements
- 5+ years of experience in site reliability or DevOps roles, with a strong background in cloud-native technologies.
- Proficiency in Kubernetes, Docker, and AWS services (EC2, EKS, S3, CloudWatch).
- Solid scripting skills in Python or Bash for automation and tooling.
- Hands‑on experience with monitoring, alerting, and log aggregation platforms.
- Excellent problem‑solving, communication, and collaboration skills.
Skills
Kubernetes, Docker, AWS, Python, CI/CD