In a talk at this week’s Hot Chips event at Stanford University, Bill Dally, NVIDIA’s chief scientist and senior vice president of research, previewed a deep neural network (DNN) accelerator chip designed for efficient execution of natural language processing tasks.
The 5nm prototype achieves 95.6 TOPS/W and 1,711 inferences/s/W on BERT with only 0.7% accuracy loss, demonstrating a practical accelerator design for energy-efficient inference with transformers.
He explored a half dozen other techniques for tailoring hardware to specific AI tasks, often by defining new data types or operations.
Dally described ways to simplify neural networks, pruning synapses and neurons in an approach called structural sparsity, first adopted in NVIDIA A100 Tensor Core GPUs.
“We’re not done with sparsity,” he said. “We need to do something with activations and can have greater sparsity in weights as well.”
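To illustrate the idea behind structural sparsity: the 2:4 pattern supported by A100 Tensor Cores keeps at most two nonzero values in every group of four weights, zeroing the smallest-magnitude entries so hardware can skip them. A minimal NumPy sketch of that pruning step (`prune_2_4` is a hypothetical helper for illustration, not NVIDIA’s API):

```python
import numpy as np

def prune_2_4(weights):
    """Zero the 2 smallest-magnitude values in every group of 4 weights.

    A sketch of 2:4 structured-sparsity pruning; assumes the weight
    count is divisible by 4. Not NVIDIA's actual implementation.
    """
    w = weights.reshape(-1, 4).copy()
    # Indices of the two smallest-magnitude entries in each group of four
    idx = np.argsort(np.abs(w), axis=1)[:, :2]
    np.put_along_axis(w, idx, 0.0, axis=1)
    return w.reshape(weights.shape)

w = np.arange(1, 9, dtype=float).reshape(2, 4)
print(prune_2_4(w))  # [[0. 0. 3. 4.]
                     #  [0. 0. 7. 8.]]
```

Because every group of four retains exactly two nonzeros, the sparse matrix can be stored compactly and the zeroed multiplications skipped, which is what the Tensor Core hardware exploits.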
In a separate talk, Kevin Deierling, NVIDIA’s vice president of networking, described the unique flexibility of NVIDIA BlueField DPUs and NVIDIA Spectrum networking switches for allocating resources based on changing network traffic or user rules.
“Today with generative AI workloads and cybersecurity, everything is dynamic, things are changing constantly,” Deierling said. “So we’re moving to runtime programmability and resources we can change on the fly.”