Position Overview
We are seeking researchers to investigate the acceleration of DL training and inference on GPU systems. This is an applied research position: the goal is to investigate a range of techniques and transfer the most promising ones to SW and HW products. The work entails investigation, prototyping, publication, and collaboration with researchers, DL practitioners, and engineering teams within and outside NVIDIA.
What you'll be doing:
- Numerics: research low-bit number representations and their effect on neural network inference and training accuracy. This includes the requirements of existing state-of-the-art neural networks, as well as co-design of future neural network architectures and optimizers.
- System performance: research parallelization approaches for large neural networks (both training and inference), communication patterns and their performance limiters on large GPU systems, and the overlap of communication with computation.
<...