NVIDIA's new Hopper H100 NVL accelerator can significantly speed up work with massive language models such as ChatGPT. It combines two H100 accelerators via the NVLink interface and delivers a 12-fold performance increase over the previous-generation A100 when running GPT-3.
The new accelerator enables all six HBM3 memory stacks on each GPU, bringing total capacity to 188 GB, 2.35 times more than its predecessor. NVIDIA also quotes a memory bandwidth of up to 7.8 TB/s and performance of 68 TFLOPS (FP64), 134 TFLOPS (FP64 Tensor Core), and 7,916 TOPS (INT8).
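As a quick sanity check on those figures, here is a minimal Python sketch; the 80 GB A100 baseline is an assumption on my part, since the article only says "predecessor", and the per-GPU values simply halve the dual-GPU totals:

```python
# Hypothetical sanity check of the quoted H100 NVL figures; the variable
# names are illustrative, not any NVIDIA API.
A100_MEMORY_GB = 80            # assumed baseline: the 80 GB A100
H100_NVL_MEMORY_GB = 188       # total across both GPUs of the dual-GPU card
H100_NVL_BANDWIDTH_TBS = 7.8   # aggregate HBM3 bandwidth
H100_NVL_FP64_TFLOPS = 68      # aggregate FP64 throughput

# Capacity ratio vs. the predecessor: 188 / 80 = 2.35
print(f"Memory ratio: {H100_NVL_MEMORY_GB / A100_MEMORY_GB:.2f}x")

# Per-GPU share of the dual-GPU totals: 94 GB, 3.9 TB/s, 34 TFLOPS
print(f"Per GPU: {H100_NVL_MEMORY_GB / 2:.0f} GB, "
      f"{H100_NVL_BANDWIDTH_TBS / 2:.1f} TB/s, "
      f"{H100_NVL_FP64_TFLOPS / 2:.0f} TFLOPS FP64")
```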
According to the comparison table published by the company, the new Hopper H100 NVL is well ahead of the existing H100 in both its SXM and PCIe versions. NVIDIA has not yet shared further details about the new accelerator but plans to launch it in the second half of 2023.
Last year, NVIDIA announced the A800, an export version of the A100 created to get around export restrictions. It had slightly lower NVLink bandwidth: 400 GB/s instead of 600 GB/s. NVIDIA is now putting a similar cut-down Hopper chip into mass production, mirroring what it did with the flagship Ampere part. The new chip carries the H800 model number and, like the A800, has a restricted NVLink interface. For comparison, the H100's NVLink offers 900 GB/s of bandwidth in the base SXM version.
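To put the NVLink numbers mentioned above side by side, here is a minimal sketch in plain Python; the H800's exact NVLink bandwidth is not stated in the article, so it is deliberately left out:

```python
# NVLink bandwidths cited in the article, in GB/s. The H800's figure is
# not given in the text, so it is omitted rather than guessed at.
nvlink_gbps = {
    "A100 (flagship Ampere)": 600,
    "A800 (export variant)": 400,
    "H100 SXM (base version)": 900,
}

# Express each part's bandwidth relative to the A100 baseline.
baseline = nvlink_gbps["A100 (flagship Ampere)"]
for name, bw in nvlink_gbps.items():
    print(f"{name}: {bw} GB/s ({bw / baseline:.0%} of the A100)")
```

Run as-is, this shows the A800's NVLink cut (67% of the A100) alongside the faster 900 GB/s link of the standard H100.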