Nvidia went all out at this year’s GPU Technology Conference (GTC), showing off its next-gen Blackwell AI platform, a beast that’s set to give a huge boost to large language models and generative AI workloads. But that’s not all the company had up its sleeve. Let’s dive into the highlights.
Blackwell
The star of the show was Nvidia’s new Blackwell GPU, a massive AI accelerator built from two dies that operate together as a single chip on one package. It’s a dual-die monster designed to handle the toughest generative AI tasks without breaking a sweat.
But it’s not just about raw power. Nvidia says Blackwell can deliver up to 30 times more inference throughput and four times better training performance than the previous-generation H100 GPU, all while using less energy. Those gains come from a set of architectural upgrades, including a second-generation Transformer Engine with FP4 precision support and a dedicated high-speed decompression engine.
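To see why lower precision matters for throughput, here’s a minimal Python sketch of 4-bit quantization. It uses a simple integer scheme purely as an illustration; Nvidia’s FP4 is a 4-bit floating-point format handled in hardware by the Transformer Engine, so treat this as an analogy, not the actual implementation.

```python
import numpy as np

# Illustrative symmetric 4-bit quantization. This is NOT Nvidia's FP4
# Transformer Engine (a hardware floating-point format); it just shows
# how a tensor can be stored in 4 bits plus one scale factor.

def quantize_int4(x: np.ndarray):
    """Map float values to 4-bit integers in [-8, 7] with a per-tensor scale."""
    scale = np.abs(x).max() / 7.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the 4-bit integers."""
    return q.astype(np.float32) * scale

weights = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int4(weights)
print("max error:", np.abs(weights - dequantize_int4(q, scale)).max())
```

Roughly speaking, halving the bits per value doubles how many weights fit in memory and move across the chip per second, which is where much of the claimed inference speedup comes from.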
Nvidia didn’t stop there. It has also built Blackwell into a three-chip “Superchip” called the GB200, which pairs two Blackwell GPUs with a Grace Arm CPU to create a seriously powerful AI processor.
These GB200 Superchips are packed into the NVL72 rack, an AI powerhouse that holds 72 Blackwell GPUs and 36 Grace CPUs. Nvidia says this single rack can train models with up to 27 trillion parameters, letting companies take on the biggest generative AI challenges.
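As a rough sanity check on that figure (our own back-of-the-envelope arithmetic, not an Nvidia number): at 4-bit precision, 27 trillion parameters amount to about 13.5 TB of raw weights, which is rack-scale territory.

```python
# Back-of-the-envelope: raw weight storage for a 27-trillion-parameter
# model at 4-bit (FP4) precision. Ignores activations, optimizer state,
# and gradients, which make training far more memory-hungry in practice.
params = 27e12          # 27 trillion parameters
bytes_per_param = 0.5   # 4 bits = half a byte
print(f"{params * bytes_per_param / 1e12:.1f} TB of weights")  # -> 13.5 TB
```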
The impact of Blackwell’s performance could be huge. Nvidia thinks it could lead to breakthroughs in everything from data processing and engineering simulations to electronic design automation, computer-aided drug design, and quantum computing.
And it’s not just Nvidia that’s excited about Blackwell. Big tech companies including Amazon, Google, Meta, Microsoft, OpenAI, Oracle, and Tesla are already lining up behind it.
Nvidia has also revealed new versions of its DGX AI systems and SuperPOD rack-scale architecture to make the most of Blackwell. The DGX GB200 systems, built on the GB200 Superchips, promise a 15-fold increase in inference performance and triple the training speed of the previous DGX H100 systems. (Inference is the process of running new, live data through a trained machine-learning model to produce an output.)
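If the training/inference distinction is new to you, this tiny self-contained sketch shows both phases, using scikit-learn and made-up data as stand-ins:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training: fit a model on labeled historical data (synthetic here).
X_train = np.random.randn(100, 4)
y_train = (X_train[:, 0] > 0).astype(int)
model = LogisticRegression().fit(X_train, y_train)

# Inference: run new, unseen data through the trained model for an output.
X_live = np.random.randn(3, 4)
print(model.predict(X_live))
```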
The new liquid-cooled SuperPOD architecture, built with DGX GB200 systems, delivers a staggering 11.5 exaflops of AI computing power at FP4 precision, along with 240 terabytes of fast memory in its base configuration, scaling further with additional racks. This level of performance is designed to meet the demands of the most complex AI models and workloads.
Deepened cloud AI collaboration with Google and AWS
At the GTC event, Nvidia also highlighted its growing partnerships with cloud giants Google and Amazon Web Services (AWS). Google announced it’s adopting the Grace Blackwell AI computing platform and the NVIDIA DGX Cloud service, boosting its generative AI capabilities.
The two companies will also work together on optimizing open models like Gemma, supporting JAX on Nvidia GPUs, and using Nvidia’s NIM inference microservices to give developers a flexible, open platform for training and deploying AI models.
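In practical terms, supporting JAX on Nvidia GPUs means the standard CUDA-enabled JAX build targets these cards directly. A minimal sketch, assuming a machine with a CUDA GPU and `pip install -U "jax[cuda12]"`:

```python
import jax
import jax.numpy as jnp

# On a CUDA-enabled install this lists CudaDevice entries; on a
# CPU-only install it falls back to CpuDevice.
print(jax.devices())

# jit-compile a matrix multiply; it runs on the default (GPU) device.
x = jnp.ones((1024, 1024))
y = jax.jit(lambda a: a @ a)(x)
print(y.shape, y.dtype)
```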
NIM concept announced
One of the most interesting announcements at GTC was Nvidia Inference Microservices (NIM). NIM is designed to help customers deploy their generative AI applications in a secure, stable, and scalable way. Part of Nvidia AI Enterprise, each microservice bundles pre-packaged AI models, integration code, and a pre-configured Kubernetes Helm chart for deployment.
NIM could speed up the time-to-market for AI deployments, letting developers get models up and running in as little as 10 minutes.
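What does that look like in practice? Deployed NIM containers expose an OpenAI-style HTTP API, so calling one can be as simple as the sketch below; the endpoint URL and model name are hypothetical placeholders for your own deployment, not values from Nvidia’s announcement.

```python
import requests

# Hypothetical local NIM deployment; NIM containers expose an
# OpenAI-compatible chat completions endpoint.
response = requests.post(
    "http://localhost:8000/v1/chat/completions",   # placeholder URL
    json={
        "model": "meta/llama3-8b-instruct",        # placeholder model id
        "messages": [{"role": "user", "content": "Summarize GTC in one line."}],
        "max_tokens": 64,
    },
)
print(response.json()["choices"][0]["message"]["content"])
```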
Project GR00T and Jetson Thor
Nvidia also revealed Project GR00T (Generalist Robot 00 Technology), an initiative to develop general-purpose foundation models for humanoid robots. The goal is to give robots the ability to understand natural language, imitate human movements by watching them, and quickly pick up skills like coordination, dexterity, and real-world navigation. Nvidia CEO Jensen Huang showed off several GR00T-powered robots performing a variety of tasks, demonstrating the project’s potential.
To go along with Project GR00T, Nvidia introduced Jetson Thor, a new computer designed specifically for humanoid robots and built on the company’s Thor system-on-chip (SoC).
6G research platform
Nvidia also showed its commitment to advancing cutting-edge technologies like 6G wireless and industrial digital twins. The company revealed a 6G Research Cloud platform to help researchers develop the next generation of wireless technology and lay the groundwork for a hyper-intelligent, connected world.
Nvidia also announced the availability of Omniverse Cloud APIs, extending the reach of its Omniverse platform for creating industrial digital twin applications and workflows. This will let software makers easily integrate Omniverse technologies into their existing design and automation software, speeding up the adoption of digital twins across industries.
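Omniverse is built on OpenUSD, so USD scene description is the common currency these APIs work with. As a minimal sketch of that underlying format (using the open-source `usd-core` package, not the Cloud APIs themselves, with made-up scene names):

```python
from pxr import Usd, UsdGeom  # pip install usd-core

# Author a toy digital-twin scene in USD, the format Omniverse builds on.
stage = Usd.Stage.CreateNew("factory_twin.usda")   # hypothetical file name
UsdGeom.Xform.Define(stage, "/Factory")            # root transform
machine = UsdGeom.Cube.Define(stage, "/Factory/Machine")
machine.GetSizeAttr().Set(2.0)                     # placeholder geometry
stage.GetRootLayer().Save()
```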
Thor SoC expands footprint
Finally, Nvidia’s GTC event highlighted the company’s growing presence in the automotive industry. Several automakers, including BYD, Hyper (under GAC Aion), and Xpeng, announced plans to use Nvidia’s Drive Thor system-on-chip (SoC) for their future electric vehicle fleets.
Drive Thor promises to deliver an unprecedented 1,000 teraflops of performance while reducing overall system costs, letting automakers combine autonomous driving, in-cabin AI, and infotainment systems on a single platform.
The article originally appeared on Indian Express.