remote
IT Operations Engineer - 3Commas
Systems Engineer
IT Operations Engineer responsible for maintaining and scaling the 3Commas.io infrastructure and ensuring high availability of crypto‑trading services across multiple exchanges, using Python, AWS, Docker, Kubernetes, and advanced monitoring tools.
About the role
Key Responsibilities
- Design, deploy, and manage scalable infrastructure on AWS, ensuring 99.99% uptime for 24/7 trading bots.
- Automate deployment pipelines with CI/CD tools, containerizing services using Docker and orchestrating with Kubernetes.
- Implement and maintain monitoring, alerting, and log aggregation solutions (Prometheus, Grafana, ELK) to detect and resolve incidents proactively.
- Collaborate with development teams to optimize application performance, troubleshoot production issues, and enforce security best practices.
- Document infrastructure changes, runbooks, and incident post‑mortems to improve reliability and knowledge sharing.
Requirements
- 3+ years of experience in cloud operations, preferably with AWS and container orchestration.
- Strong scripting skills in Python and Bash for automation.
- Hands‑on experience with Docker, Kubernetes, and CI/CD pipelines.
- Proficiency in Linux system administration and network troubleshooting.
- Excellent problem‑solving skills and ability to work in a fast‑paced, high‑availability environment.
Skills
Python, AWS, Docker, Kubernetes, CI/CD, Linux