NVIDIA DGX B200

Boost AI with NVIDIA DGX B200

Introducing the groundbreaking NVIDIA DGX B200, the world's first system powered by the revolutionary NVIDIA Blackwell architecture. This cutting-edge solution delivers unparalleled performance for the most complex AI tasks, such as generative AI, large language models (LLMs) and natural language processing (NLP).


Unrivalled Performance in...

Inference Capabilities

144 petaFLOPS of inference performance for maximum speed and efficiency

Acceleration

An eight-GPU configuration delivering a staggering 72 petaFLOPS of FP8 training performance

Next-Level Training

FP8 and new precisions deliver 3x faster training for large language models

Networking

5th-gen NVLink with 1.8TB/s of GPU-to-GPU interconnect and InfiniBand networking

New Era of Generative AI with NVIDIA DGX B200


Redefine AI Performance

Packed with 208 billion transistors, Blackwell-architecture GPUs are manufactured using TSMC's cutting-edge 4NP process. Each features two interconnected GPU dies, forming a unified GPU with a 10 TB/s chip-to-chip link. With the second-generation Transformer Engine and 4-bit floating-point (FP4) AI inference capabilities, Blackwell supports double the compute and model sizes, propelling AI performance to unprecedented heights. Fifth-generation NVLink delivers a mind-blowing 1.8 TB/s of bidirectional throughput per GPU.


Reliability Meets Security

Blackwell's dedicated RAS Engine ensures unparalleled reliability, availability and serviceability, while its AI-based preventative maintenance capabilities maximise system uptime and reduce operating costs. Advanced confidential computing features safeguard AI models and customer data without sacrificing performance. An integrated decompression engine accelerates database queries, delivering unmatched performance in data analytics and data science.

Benefits of NVIDIA DGX B200


Higher Inference Performance

Up to 15x faster inference for massive models like GPT-MoE-1.8T compared to the previous Hopper generation. Cutting-edge Blackwell Tensor Core technology, combined with TensorRT-LLM and NeMo framework innovations, delivers unmatched acceleration for LLM and mixture-of-experts models.


Transformer Engine

Experience 3x faster training of GPT-MoE-1.8T with the second-generation Transformer Engine, featuring groundbreaking 8-bit floating point (FP8) and new precisions, alongside fifth-generation NVLink interconnect (1.8 TB/s GPU-to-GPU bandwidth), high-speed InfiniBand networking and NVIDIA Magnum IO software.


A New Class of AI Superchip

Built on the Blackwell architecture with 208 billion transistors and a custom-built TSMC 4NP process, it features two reticle-limit GPU dies connected by a blazing 10 TB/s chip-to-chip link, forming a unified GPU of unparalleled power.


RAS Engine for Reliability

The DGX B200 is equipped with a dedicated engine for reliability, availability and serviceability (RAS). The latest Blackwell architecture incorporates AI-based preventative maintenance capabilities at the chip level to diagnose and forecast potential reliability issues, maximising system uptime for massive-scale AI deployments.


Secure AI for Confidentiality

Advanced confidential computing capabilities protect your AI models and customer data without compromising performance. New native interface encryption protocols are critical for privacy-sensitive industries like healthcare and financial services.


Decompression Engine

Accelerate database queries and achieve the highest performance in data analytics and data science with a dedicated decompression engine. Supporting the latest compression formats, this engine lets you process vast amounts of data efficiently.

FP4 Precision

FP4 precision is a headline feature of the groundbreaking Blackwell architecture. This innovative technology delivers a quantum leap in AI performance, enabling you to push the boundaries of what's possible.
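To make the idea of 4-bit inference concrete, here is a minimal NumPy sketch of FP4-style quantization. The level set (an E2M1-style 4-bit float) and the per-tensor scaling scheme are illustrative assumptions for this sketch, not NVIDIA's actual Blackwell implementation.

```python
import numpy as np

# Illustrative only: the eight non-negative magnitudes representable by an
# E2M1-style 4-bit float (1 sign bit, 2 exponent bits, 1 mantissa bit).
FP4_LEVELS = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x, scale=None):
    """Round each value to the nearest FP4 level after per-tensor scaling."""
    if scale is None:
        # Map the largest magnitude in the tensor onto the largest level.
        scale = np.max(np.abs(x)) / FP4_LEVELS[-1]
    magnitudes = np.abs(x) / scale
    # Nearest-level lookup: compare every element against all eight levels.
    idx = np.argmin(np.abs(magnitudes[..., None] - FP4_LEVELS), axis=-1)
    return np.sign(x) * FP4_LEVELS[idx] * scale

weights = np.array([0.01, -0.4, 0.75, -1.2, 2.1])
q = quantize_fp4(weights)
# q now contains only FP4 levels times the shared scale, which is what
# enables 4-bit storage and arithmetic at a small cost in precision.
```

The trade-off this illustrates is the usual one for low-precision inference: far fewer representable values per weight in exchange for much higher throughput and lower memory traffic.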

Petaflop-Scale AI Performance

With up to 144 petaFLOPS of AI performance at your disposal, this staggering computational prowess lets you tackle even the most complex and data-intensive AI challenges with ease.

Massive Memory Capacity 

The system offers 1.4 TB of GPU memory and 64 TB/s of aggregate memory bandwidth. With this unparalleled combination of memory capacity and bandwidth, you can handle the largest AI models and datasets with seamless scalability.
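As a rough back-of-the-envelope sketch (assuming eight GPUs and the per-GPU bandwidth quoted in the spec table, not a measured result), the aggregate figures follow directly from the per-GPU ones:

```python
# Sanity check of the aggregate memory figures quoted above.
# Assumptions: 8 GPUs, 8 TB/s of HBM3e bandwidth per GPU.
num_gpus = 8
per_gpu_bandwidth_tb_s = 8.0

aggregate_bandwidth_tb_s = num_gpus * per_gpu_bandwidth_tb_s  # 64 TB/s

total_memory_tb = 1.4  # system-wide GPU memory
# Time to sweep the entire memory pool once at full aggregate bandwidth:
sweep_time_s = total_memory_tb / aggregate_bandwidth_tb_s  # ~0.022 s
```

Reading the whole 1.4 TB pool in roughly 22 milliseconds is the kind of headroom that keeps very large models fed during inference.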


Technical Specifications

GPU: NVIDIA B200

FP4 Tensor Core: 18 petaFLOPS
FP8/FP6 Tensor Core: 9 petaFLOPS
INT8 Tensor Core: 9 petaOPS
FP16/BF16 Tensor Core: 4.5 petaFLOPS
TF32 Tensor Core: (not listed)
FP64 Tensor Core: 40 teraFLOPS
GPU Memory: Up to 192 GB HBM3e
Bandwidth: Up to 8 TB/s
Multi-Instance GPU (MIG): 7
Decompression Engine: Yes
Decoders: 2x 7 NVDEC, 2x 7 NVJPEG
Power: Up to 1,000W
Interconnect: 5th Generation NVLink: 1.8 TB/s; PCIe Gen6: 256 GB/s

Supercharge Gen AI with NVIDIA DGX B200


FAQ

Frequently asked questions about the NVIDIA DGX B200.

What is the NVIDIA B200 Card used for?

Based on the Blackwell architecture, the NVIDIA B200 card delivers a massive leap forward in speeding up inference workloads, making real-time performance a possibility for resource-intensive, multitrillion-parameter language models.

What is NVIDIA B200's cost?

The NVIDIA DGX B200 will be available in Q4 2024 and can be reserved on Hyperstack. Our team will then contact you to discuss pricing.

What is the inference performance of NVIDIA Blackwell B200 GPU?

The NVIDIA Blackwell B200 GPU offers a massively powerful 144 petaFLOPS inference performance, delivering unparalleled speed and efficiency for computationally intensive tasks.

Why is the NVIDIA Blackwell B200 ideal for LLM?

The NVIDIA Blackwell B200 enables AI training and real-time LLM inference for models scaling up to 10 trillion parameters. It is built with powerful technologies including:

  • Second-Generation Transformer Engine: Custom Tensor Core technology, combined with NVIDIA TensorRT-LLM and NeMo framework innovations, accelerates inference and training for LLMs, including mixture-of-experts models.
  • Secure AI: Advanced confidential computing capabilities protect AI models and customer data with uncompromised performance.
  • Fifth-Generation NVLink: To accelerate performance for multitrillion-parameter AI models, NVLink's latest iteration delivers a groundbreaking 1.8 TB/s of throughput per GPU, ensuring seamless high-speed communication among up to 576 GPUs for today's most complex large language models.

Is the NVIDIA DGX B200 good for LLM training?

Absolutely! The NVIDIA DGX B200 is a great choice for training LLMs. With its 72 petaFLOPS of training performance, this system offers unparalleled computational power to accelerate the demanding training processes of LLMs.

How much memory does the NVIDIA DGX B200 have?

The NVIDIA DGX B200 offers up to 192 GB of HBM3e memory per GPU, totalling 1.4 TB of GPU memory across the system.