onsite
Hardware Engineer - STN Inc
Embedded Systems Engineer
Hardware Engineer responsible for the end‑to‑end lifecycle of GPU and server infrastructure, covering health monitoring, firmware updates, RMA processes, and long‑term capacity planning to keep the compute fleet performant and reliable.
About the role
Key Responsibilities
- Monitor GPU and server health metrics, including thermal performance, error rates, and component failures.
- Lead firmware management, ensuring timely updates and compatibility across the compute fleet.
- Oversee RMA workflows, coordinating with vendors and internal teams to resolve hardware defects efficiently.
- Develop and maintain capacity planning models to forecast future infrastructure needs.
- Collaborate with cross‑functional teams to implement hardware upgrades and new platform integrations.
Requirements
- 3+ years of experience in hardware engineering, preferably with GPU or high‑performance compute systems.
- Hands‑on experience with hardware monitoring tools and performance analytics.
- Strong problem‑solving skills and the ability to manage RMA processes across multiple vendors.
- Excellent communication skills and a collaborative mindset for working with engineering and operations teams.
Skills
Python, Bash, Linux, electrical engineering