Position Overview
NVIDIA is looking for outstanding software engineers to help us expand our enterprise GPU management and monitoring tools. In this role, you will work closely with the broader NVIDIA team to design and build Linux-based management agents, CLI tools, and end-to-end integration solutions that connect GPUs with the rest of the data center software management ecosystem. You will also help maintain our containerized build environment, build process, CI/CD pipelines and infrastructure, and packaging.
We are focused on supporting NVIDIA products across HPC, cloud, and enterprise, on both bare metal and virtualized platforms, as the role of GPUs in all of these environments expands rapidly. Your contributions will span many aspects of GPU system integration, including telemetry and metrics, health checks, diagnostics, configuration, accounting, and policy. These tools serve both passive background monitoring and active online management, with a core emphasis on operational tran...