remote
Data Center Operations Systems Engineer, Dallas, TX - Lambda
Systems Engineer
Data Center Operations Systems Engineer responsible for deploying, configuring, and maintaining GPU‑centric server, storage, and networking infrastructure in a high‑density data center, ensuring reliability and performance for AI workloads.
About the role
Key Responsibilities
- Rack, label, cable, and configure new server, storage, and networking hardware in a high‑density GPU data center.
- Perform hardware and software troubleshooting for GPU nodes, networking gear, and storage arrays.
- Collaborate with site reliability and network teams to optimize infrastructure performance and uptime.
- Maintain documentation of rack layouts, cable management, and configuration changes.
- Participate in on‑call rotation and shift work to support 24/7 data center operations.
Requirements
- 3+ years of data center operations experience with GPU or high‑performance computing environments.
- Strong knowledge of Linux server administration and networking fundamentals.
- Hands‑on experience with rack mounting, cable management, and hardware diagnostics.
- Excellent problem‑solving skills and ability to work independently in a fast‑paced environment.
- Valid driver’s license and willingness to work on a shift schedule.
Skills
machine learning, Linux, Jira