Remote
Software Engineer II, Backend Infrastructure Platform - Affirm
Software Engineer
Backend engineer building a next‑generation reliability platform for production systems, blending distributed systems expertise with AI‑assisted tooling to deliver a single, high‑quality view of service health.
About the role
Key Responsibilities
- Design, develop, and maintain a reliability platform that aggregates metrics, logs, and traces across distributed services.
- Implement AI‑assisted debugging tools to surface root causes and automate remediation workflows.
- Collaborate with cross‑functional teams to define observability standards and improve system resilience.
- Optimize platform performance and scalability using Kubernetes, AWS services, and efficient data pipelines.
- Write clean, well‑tested code and participate in code reviews to keep the platform reliable and maintainable.
Requirements
- 3+ years of backend engineering experience with Python or Go.
- Strong background in distributed systems, observability, and cloud infrastructure (AWS).
- Hands‑on experience with container orchestration (Kubernetes) and CI/CD pipelines.
- Proficiency in designing and implementing monitoring, alerting, and logging solutions.
- Excellent problem‑solving skills and a passion for building reliable, scalable systems.
Skills
Python, Go, Kubernetes, AWS