Unprecedented Acceleration for World’s Highest-Performing Elastic Data Centers
The NVIDIA A2 Tensor Core GPU provides entry-level inference with low power, a small footprint, and high performance for intelligent video analytics (IVA) or NVIDIA AI at the edge. Featuring a low-profile PCIe Gen4 card and a configurable thermal design power (TDP) of 40–60 watts (W), the A2 brings versatile inference acceleration to any server.
A2’s versatility, compact size, and low power meet the demands of edge deployments at scale, instantly upgrading existing entry-level CPU servers to handle inference. Servers accelerated with A2 GPUs deliver up to 20X higher inference performance versus CPUs and 1.3X more efficient IVA deployments than previous GPU generations — all at an entry-level price point.
NVIDIA-Certified Systems with the NVIDIA A2, A30, and A100 Tensor Core GPUs and NVIDIA AI — including NVIDIA Triton Inference Server, open-source inference serving software — deliver breakthrough inference performance across edge, data center, and cloud. They ensure that AI-enabled applications deploy with fewer servers and less power, resulting in easier deployments, faster insights, and dramatically lower costs.
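Triton serves models from a repository in which each model carries a short configuration file. A minimal, hypothetical `config.pbtxt` for an ONNX image classifier might look like the sketch below — the model name, backend, and tensor shapes are illustrative assumptions, not part of this datasheet:

```
name: "resnet50"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```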
| Specification | NVIDIA A2 |
|---|---|
| CUDA Cores | 1280 |
| Tensor Cores | Third generation |
| RT Cores | Second generation |
| GPU Memory | 16 GB GDDR6 with ECC |
| Memory Bandwidth | 200 GB/s |
| Thermal Solution | Passive |
| Peak FP32 | 4.5 TFLOPS |
| Peak TF32 Tensor Core | 9 TFLOPS (18 TFLOPS with sparsity) |
| Peak FP16 Tensor Core | 18 TFLOPS (36 TFLOPS with sparsity) |
| Peak INT8 | 36 TOPS (72 TOPS with sparsity) |
| Peak INT4 | 72 TOPS (144 TOPS with sparsity) |
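The INT8 and INT4 rows matter because inference runtimes typically quantize FP32 weights and activations into small integers before running them on the integer Tensor Core paths. The sketch below shows minimal symmetric per-tensor INT8 quantization; it is an illustrative simplification, not NVIDIA's quantization pipeline:

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization sketch: map floats into
    [-127, 127] with a single scale derived from the largest magnitude."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    dequant = [x * scale for x in q]  # approximate reconstruction
    return q, scale, dequant

q, scale, deq = quantize_int8([-1.0, -0.25, 0.0, 0.5, 2.0])
print(q)  # small integers, largest-magnitude input maps to 127
```

Real deployments refine this with per-channel scales and calibration data, but the core idea — trade a little precision for much higher integer throughput — is the same.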
Third-Generation NVIDIA Tensor Cores
The third-generation Tensor Cores in NVIDIA A2 support integer math down to INT4 and floating-point math up to FP32 to deliver high AI training and inference performance. A2’s NVIDIA Ampere architecture also supports TF32 and NVIDIA’s automatic mixed precision (AMP) capabilities.
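TF32 keeps FP32's 8-bit exponent (so the numeric range is unchanged) but stores only 10 mantissa bits instead of 23. The pure-Python sketch below simulates that storage format by truncating the low mantissa bits of an FP32 value — a simplification, since the hardware rounds to nearest rather than truncating:

```python
import struct

def tf32_truncate(x: float) -> float:
    """Simulate TF32 storage: keep FP32's 8-bit exponent but only the
    top 10 of its 23 mantissa bits (by truncation, for simplicity)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # zero the low 13 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(tf32_truncate(1.5))  # exactly representable, unchanged
print(tf32_truncate(0.1))  # slightly below 0.1: precision is lost
```

This is why TF32 can accelerate FP32 workloads with no code changes: values stay in FP32 range, and most deep learning workloads tolerate the reduced mantissa precision.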
Second-Generation RT Cores
The NVIDIA A2 GPU includes dedicated RT Cores for ray tracing and Tensor Cores for AI to power groundbreaking results at breakthrough speed. It delivers up to 2X the throughput of the previous generation and the ability to concurrently run ray tracing with either shading or denoising capabilities.
Structural Sparsity
Modern AI networks are big and getting bigger, with millions to billions of parameters. Not all of these parameters are needed for accurate predictions and inference. A2 provides up to 2X higher compute performance for sparse models compared to previous-generation GPUs. This feature readily benefits AI inference and can also be used to improve the performance of model training.
Compare GPUs For Virtualization
NVIDIA virtual GPU (vGPU) software runs on NVIDIA GPUs.
Match your needs with the right GPU below.
| GPU | Architecture | Memory Size | Virtualization Workload | vGPU Software Support |
|---|---|---|---|---|
| A100 | NVIDIA Ampere | 80 GB / 40 GB HBM2 | Highest-performance virtualized compute, including AI, HPC, and data processing; supports up to 7 MIG instances. Upgrade path for V100/V100S Tensor Core GPUs. | NVIDIA AI Enterprise |
| A30 | NVIDIA Ampere | 24 GB HBM2 | Mainstream virtualized compute and AI inference; supports up to 4 MIG instances. Upgrade path for T4. | NVIDIA AI Enterprise |
| L40 | NVIDIA Ada Lovelace | 48 GB GDDR6 with ECC | High-end 3D visualization applications and AI training and inference workloads. Upgrade path for Quadro® RTX 8000, Quadro RTX 6000, or T4. | NVIDIA RTX vWS, NVIDIA Virtual PC (vPC), NVIDIA Virtual Apps (vApps), NVIDIA AI Enterprise* |
| A40 | NVIDIA Ampere | 48 GB GDDR6 | Mid-range to high-end 3D design and creative workflows with NVIDIA RTX® Virtual Workstation (vWS). Upgrade path for Quadro RTX™ 8000, RTX 6000, or T4. | NVIDIA RTX vWS, vPC, vApps, NVIDIA AI Enterprise |
| RTX 6000 | NVIDIA Ada Lovelace | 48 GB GDDR6 | High-end design, real-time rendering, AI, and high-performance compute workflows. Upgrade path for Quadro RTX 6000. | NVIDIA RTX vWS, vPC, vApps, NVIDIA AI Enterprise* |
| A16 | NVIDIA Ampere | 64 GB GDDR6 (16 GB per GPU) | Office productivity applications, streaming video, and teleconferencing tools for graphics-rich virtual desktops accessible from anywhere. Upgrade path for M10 or T4. | NVIDIA RTX vWS, vPC, vApps, NVIDIA AI Enterprise |
| A2 | NVIDIA Ampere | 16 GB GDDR6 | Space-constrained environments or edge deployments of graphics or compute workloads. | NVIDIA RTX vWS, vPC, vApps |
| Architecture | NVIDIA Ampere |
|---|---|
| GPU solutions and applications | NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Virtual Workstation (vWS), NVIDIA Virtual Compute Server (vCS), visualization and AI, NVIDIA AI Enterprise for VMware |
| VR support | Yes |
| Cooling | Passive (server) |
| Memory type | GDDR6 |
| Card memory | 16 GB |
| Mounting and card profile | Low-profile |
| Warranty | 3 years |