Item: NVIDIA Ampere A100 GPU packs 6912 CUDA and 432 Tensor cores on 826mm² die
Rating: 0
Author: Vyncent Chan

NVIDIA Ampere A100 GPU packs 6912 CUDA and 432 Tensor cores on 826mm² die

NVIDIA held its GPU Technology Conference (GTC) 2020 earlier and launched the NVIDIA Ampere architecture there, although you will probably have to wait a bit longer for the gaming-oriented Ampere GPUs to be unveiled. The first Ampere product is the NVIDIA A100, a high-performance computing (HPC) GPU designed to accelerate AI and data analytics with unprecedented scalability.

The NVIDIA A100 is a massive 826mm² GPU packing 54.2 billion transistors. NVIDIA has also stepped up from last-gen’s 12nm process to TSMC’s 7nm process, which is what allows for these gains in transistor density. The monstrous GPU is mated to 40GB of HBM2 via a 5120-bit memory bus, good for 1.6 TB/s of memory bandwidth.

All that memory bandwidth should help keep the cores on the NVIDIA A100 busy. NVIDIA packed a whopping 6912 CUDA cores and 432 3rd Gen Tensor cores onto the A100 GPU. The four 3rd Gen Tensor cores in each SM offer 2x the raw fused multiply-add (FMA) throughput of the eight Tensor cores in each SM of the last-gen GV100 GPU, so you are looking at more than double the FP16 Tensor performance in the NVIDIA A100, despite it having fewer Tensor cores than its predecessor. The Tensor cores in NVIDIA Ampere also support more formats, namely TF32, bfloat16 and FP64. NVIDIA also introduced sparsity acceleration, where half of the values in a network’s weights can be pruned away to double the performance, with virtually no loss in inferencing accuracy.
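To put a number on the "more than double" claim, here is a rough back-of-the-envelope sketch in Python. The Tensor core counts for the A100 and its 1410 MHz boost clock come from this article. The per-core FMA rates (256 per clock for the A100, 64 per clock for the V100) and the V100's 640 Tensor cores and 1530 MHz boost clock are assumptions based on NVIDIA's published specifications, and `peak_tflops` is just an illustrative helper name.

```python
# Rough peak dense FP16 Tensor throughput:
#   Tensor cores x FMA per clock per core x 2 FLOPs per FMA x boost clock.
# The per-core FMA rates and all V100 figures are assumptions taken from
# NVIDIA's published specs, not from this article.

def peak_tflops(tensor_cores: int, fma_per_clock: int, boost_mhz: float) -> float:
    """Return peak TFLOPS, counting each fused multiply-add as 2 FLOPs."""
    flops = tensor_cores * fma_per_clock * 2 * boost_mhz * 1e6
    return flops / 1e12

a100 = peak_tflops(tensor_cores=432, fma_per_clock=256, boost_mhz=1410)
v100 = peak_tflops(tensor_cores=640, fma_per_clock=64, boost_mhz=1530)

print(f"A100: {a100:.0f} TFLOPS, V100: {v100:.0f} TFLOPS, ratio: {a100 / v100:.2f}x")
```

Under those assumptions the A100 lands at roughly 2.5x the dense FP16 Tensor throughput of the V100, and that is before sparsity acceleration is taken into account.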

For inter-GPU communications, the NVIDIA A100 supports NVLink 3, the latest iteration of the NVIDIA NVLink interconnect technology. Each A100 has 12 NVLink links for a total bandwidth of 600 GB/s, which allows for 8-GPU configurations where every GPU can communicate directly with every other GPU.
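A quick sketch of what those NVLink figures work out to, using only the numbers quoted above. The variable names are illustrative, and the pair count simply assumes a fully connected group of eight GPUs.

```python
from math import comb

NVLINKS_PER_GPU = 12        # from the article
TOTAL_BANDWIDTH_GBS = 600   # from the article, GB/s per GPU
GPUS = 8                    # from the article

# Bandwidth each individual link contributes to the 600 GB/s total.
per_link_gbs = TOTAL_BANDWIDTH_GBS / NVLINKS_PER_GPU

# Number of distinct GPU-to-GPU pairs in a fully connected group of 8.
gpu_pairs = comb(GPUS, 2)

print(f"{per_link_gbs:.0f} GB/s per link, {gpu_pairs} GPU pairs")
```

That is 50 GB/s per link, with 28 distinct GPU pairs to keep connected in an 8-GPU system.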

NVIDIA also added Multi-Instance GPU (MIG) with the NVIDIA A100, which lets you partition a GPU into up to seven instances. Multiple users can then run on their own slices of the GPU, on separate SMs with dedicated L2 cache and memory, which makes performance more predictable than with previous implementations. This should benefit cloud computing providers, as it allows for higher utilization of the GPUs. If you are interested in a more in-depth look at the NVIDIA Ampere architecture, you can head on to this blog post.

NVIDIA Ampere A100 Specs

A100, 108 SMs, TSMC 7nm
6912 CUDA cores
432 Tensor cores
Up to 1410 MHz Boost
40GB HBM2 @ 2.4 Gbps
5120-bit memory bus, 1.6 TB/s bandwidth
PCIe 4.0 x16
400W TDP
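The specs above hang together with some simple arithmetic. The sketch below derives the memory bandwidth from the bus width and per-pin data rate, and the per-SM core counts from the totals. All inputs are the figures listed above, and the variable names are illustrative only.

```python
# Figures from the spec list above.
SMS = 108
CUDA_CORES = 6912
TENSOR_CORES = 432
BUS_WIDTH_BITS = 5120
DATA_RATE_GBPS = 2.4   # per-pin data rate as listed, likely rounded

# Memory bandwidth: bus width (bits) x data rate (Gbit/s per pin) / 8 bits per byte.
bandwidth_gbs = BUS_WIDTH_BITS * DATA_RATE_GBPS / 8

# Cores per streaming multiprocessor.
cuda_per_sm = CUDA_CORES // SMS
tensor_per_sm = TENSOR_CORES // SMS

print(f"{bandwidth_gbs:.0f} GB/s, {cuda_per_sm} CUDA cores/SM, {tensor_per_sm} Tensor cores/SM")
```

With the listed 2.4 Gbps this comes to 1536 GB/s, a little short of the quoted 1.6 TB/s, which suggests the listed data rate is rounded down. The core counts divide evenly into 64 CUDA cores and 4 Tensor cores per SM.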

If you fancy yourself an NVIDIA Ampere GPU right now, you can already get them in the NVIDIA DGX A100 systems. Each system packs eight NVIDIA A100 GPUs, two 64-core AMD EPYC 7742 (Rome) CPUs, 1TB of RAM, 15TB of PCIe 4.0 storage and nine Mellanox ConnectX-6 200 Gbps network interfaces. Each one costs a cool $199K (~RM864K), so have fun. These are already shipping.

Pokdepinion: I wonder if those Tensor improvements will come to the consumer segment to boost DLSS performance?
