Nvidia enhances Grace-Hopper ‘superchip’ with faster AI memory.

Nvidia Unveils Next-Generation “GH200” Grace Hopper Superchip

Nvidia CEO Jensen Huang on Tuesday showed off his company’s next iteration of the combined CPU and GPU, the “GH200” Grace Hopper “superchip.” The part boosts memory bandwidth to 5 terabytes per second to handle the increasing size of AI models.

Nvidia CEO Jensen Huang announced during his keynote address at the recent SIGGRAPH computer graphics show in Los Angeles that the company plans to ship an enhanced version of its “superchip” next year. The superchip, called the GH200 Grace Hopper, combines a CPU and GPU with faster memory to move data in and out of the chip’s circuitry more effectively. The initial version of the Grace Hopper chip is already being used in computers from Dell and other manufacturers.

The GH200 chip is an upgrade to the original Grace Hopper combo chip, offering several improvements. The most significant is the increase in memory. Where the initial version contains 96 gigabytes of HBM memory, the GH200 carries 140 gigabytes of HBM3e, the next version of the high-bandwidth-memory standard. The faster HBM3e also lifts memory bandwidth to 5 terabytes per second, up from 4 terabytes per second in the original Grace Hopper.

The GH200 is scheduled to follow the original Grace Hopper chip by approximately one year. Huang mentioned in May that the original chip was in full production, with plans to begin sampling the GH200 by the end of the year, followed by full production by the end of the second quarter of 2024.

One of the key features of the GH200, like its predecessor, is the combination of ARM-based CPU cores in the Grace chip with GPU cores in the Hopper GPU. The two chips are connected through NVLink, a high-speed, cache-coherent memory interface that lets the GPU access the CPU’s DRAM. Huang also highlighted the option of connecting two GH200 chips in a dual-configuration server, doubling the combined HBM3e memory bandwidth to a remarkable 10 terabytes per second.

The GH200 is the next version of the Grace Hopper superchip, which is designed to share the work of AI programs via a tight coupling of CPU and GPU.

Nvidia has a history of upgrading GPU memory speed; the move from HBM2 to HBM2e with the previous generation of GPU, the A100 “Ampere,” is one example of this trend. The adoption of HBM was originally driven by the growing memory demands of 4K displays in video game graphics. HBM is a stacked memory configuration, with memory dies stacked vertically and connected by “through-silicon vias” that run through each chip, enabling efficient data transfer.

AI programs, especially generative AI models like ChatGPT, require substantial memory resources. These programs must store a vast number of neural weights or parameters, which are the matrices that form the core functional units of a neural network. As generative AI models grow larger, with some aiming to reach a trillion parameters, the need for high-bandwidth memory becomes even more critical.
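A quick back-of-envelope calculation shows why memory is the bottleneck: just storing the weights of a trillion-parameter model at 16-bit precision takes roughly 2 terabytes, far more than the 140 gigabytes on a single GH200. A minimal sketch of that arithmetic (the model sizes and precisions below are illustrative assumptions, not figures from Nvidia):

```python
# Back-of-envelope estimate of the memory needed just to hold a
# model's weights at common numeric precisions. Ignores activations,
# optimizer state, and KV caches, which add substantially more.

def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Gigabytes required to store `num_params` weights."""
    return num_params * bytes_per_param / 1e9

# Illustrative model sizes: 7 billion, 70 billion, 1 trillion parameters
for params in (7_000_000_000, 70_000_000_000, 1_000_000_000_000):
    fp16 = weight_memory_gb(params, 2)  # 16-bit floats: 2 bytes each
    fp32 = weight_memory_gb(params, 4)  # 32-bit floats: 4 bytes each
    print(f"{params / 1e9:>6.0f}B params: {fp16:>6.0f} GB (fp16), {fp32:>6.0f} GB (fp32)")
```

Even at half precision, the largest models must be sharded across many chips, which is why both capacity and the bandwidth to shuttle weights between them matter.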

In addition to showcasing the GH200, Nvidia made several other announcements at SIGGRAPH. It introduced AI Workbench, a program that simplifies uploading neural net models to the cloud in containerized form; users can sign up for early access. Nvidia also unveiled new workstation configurations for generative AI, produced in collaboration with Dell, HP, Lenovo, and others under the “RTX” brand. These workstations can hold up to four RTX 6000 Ada GPUs, each with 48 gigabytes of memory, for a total of 192 gigabytes of GPU memory and up to 5,828 trillion floating-point operations per second (TFLOPS) of AI performance.

To learn more about Nvidia’s latest innovations and announcements, including the GH200 Grace Hopper superchip, you can watch the replay of Jensen Huang’s full keynote on the Nvidia website.