NVIDIA Rubin GPU vs. NVIDIA Vera CPU
Written by
Technical Writer @ Civo
NVIDIA's Vera Rubin lineup is shaping up to be a promising release from the company in the first half of the year. In a previous blog, we discussed how the NVIDIA Vera Rubin stacks up against the Blackwell series.
The Vera Rubin release is exciting not just because the tech giant is releasing an even more capable GPU, but because it is the first time the company is producing a CPU specifically geared toward “agentic reasoning.”
In this blog, we take a look at the two sides of NVIDIA's latest release, Vera and Rubin, draw a distinction between the two, and provide some clarity on which you should be most excited for.
What is Vera Rubin?
Vera Rubin is NVIDIA's successor to the Blackwell generation. It is not a single chip, but a full data center platform made up of six co-designed chips: the Vera CPU, Rubin GPU, NVLink 6 switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet switch.

The design philosophy behind the platform is what NVIDIA calls "extreme co-design." Rather than optimizing each chip independently and assembling them into a server, NVIDIA architected the GPU, CPU, networking, security, and cooling together as a single system.
This is the same approach NVIDIA introduced with Blackwell NVL72, but Rubin takes it further with a new CPU, doubled NVLink bandwidth, and HBM4 memory.
In the Blackwell generation, NVIDIA paired its B200 GPUs either with its own ARM-based Grace CPU (as in the GB200 superchip) or with x86 host CPUs from other manufacturers in HGX-style servers.
With Vera Rubin, the GPU and CPU are designed together in-house as one platform. Rubin refers to the GPU architecture, while Vera refers to a new ARM-based CPU built to handle data movement, orchestration, and agentic workloads alongside the Rubin GPU, or independently as a standalone data center CPU.
What is the NVIDIA Vera CPU?

Vera is special because it is NVIDIA's first CPU designed as a standalone processor to compete directly with traditional data center-grade CPUs like Intel Xeon and AMD EPYC. While NVIDIA has shipped CPUs before with its Grace lineup, those were mostly deployed alongside NVIDIA GPUs.
NVIDIA designed Vera specifically for the demands of agentic AI, where the CPU no longer just supports a model but actively drives it. Agentic workloads require the CPU to handle orchestration, tool use, code execution, and data movement at scale.
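To make the CPU's role concrete, here is a minimal sketch of an agent loop. It is illustrative only: `call_model` and `run_tool` are hypothetical stand-ins (not NVIDIA or Civo APIs), and the "model" is a stub that asks for one tool call and then answers. The point is that every step between model calls (routing, tool execution, prompt assembly) runs on the CPU.

```python
import asyncio

# Hypothetical stand-ins: neither function is a real NVIDIA or Civo API.
async def call_model(prompt: str) -> dict:
    """Stub for GPU-side inference: requests a tool until it sees a tool result."""
    await asyncio.sleep(0)  # placeholder for network and GPU latency
    if "result=" in prompt:
        return {"type": "answer", "text": prompt.split("result=")[-1]}
    return {"type": "tool", "name": "add", "args": (2, 3)}

def run_tool(name: str, args: tuple) -> str:
    """CPU-side work: tool execution, parsing, sandboxed code, and so on."""
    tools = {"add": lambda a, b: a + b}
    return str(tools[name](*args))

async def agent(task: str, max_steps: int = 4) -> str:
    prompt = task
    for _ in range(max_steps):
        reply = await call_model(prompt)                 # GPU-bound step
        if reply["type"] == "answer":
            return reply["text"]
        result = run_tool(reply["name"], reply["args"])  # CPU-bound step
        prompt = f"{task} result={result}"               # CPU assembles the next prompt
    return "gave up"

async def run_many(n: int) -> list[str]:
    # Many agents run concurrently; the CPU schedules all of them.
    return await asyncio.gather(*(agent(f"task {i}") for i in range(n)))

if __name__ == "__main__":
    print(asyncio.run(run_many(3)))
```

With thousands of such loops in flight, the orchestration and tool-execution side can become the bottleneck, which is the workload this section describes.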
What is the NVIDIA Rubin GPU?

The Rubin GPU is designed for large-scale AI inference and training in data centers. It is the first NVIDIA GPU to use HBM4 memory, replacing the HBM3e used in the Blackwell generation. Additional specs from the Rubin GPU include:
The Rubin GPU exists because inference is getting harder. With reasoning models, AI inference is no longer a quick request and response; models now generate long chains of thought, try different approaches, and iterate before producing an answer.
NVIDIA's headline claim is that Rubin delivers one-tenth the cost per million tokens compared to Blackwell for agentic AI inference, and can train MoE models with one-fourth the number of GPUs. It achieves this through a new Transformer Engine with hardware-accelerated NVFP4 support.
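For a sense of what a cost-per-token claim implies, here is a rough back-of-the-envelope sketch. The hourly price and throughput figures are made-up placeholders, not NVIDIA or Civo numbers; only the "one-tenth" target comes from the claim above.

```python
def cost_per_million_tokens(price_per_gpu_hour: float, tokens_per_second: float) -> float:
    """Cost to generate one million tokens on a single GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return price_per_gpu_hour / tokens_per_hour * 1_000_000

def required_speedup(target_cost_reduction: float, price_ratio: float = 1.0) -> float:
    """Throughput gain needed to hit a cost reduction if the GPU-hour price changes.

    price_ratio is new price / old price (1.0 means the price is unchanged).
    """
    return target_cost_reduction * price_ratio

if __name__ == "__main__":
    # Hypothetical figures: $4.00 per GPU-hour, 1,000 vs 10,000 tokens/s.
    old = cost_per_million_tokens(4.00, 1_000)
    new = cost_per_million_tokens(4.00, 10_000)
    print(f"old ${old:.2f}, new ${new:.2f}, ratio {old / new:.1f}x")
    # If the newer GPU cost 1.5x as much per hour, a 10x cheaper token
    # would need a 15x throughput gain.
    print(required_speedup(10, price_ratio=1.5))
```

In other words, the cost per token only falls tenfold if throughput per dollar rises tenfold, so the claim is as much about price as it is about raw speed.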
Rubin does not ship as a standalone GPU. It is part of a broader platform. The flagship configuration is the Vera Rubin NVL72, a 100% liquid-cooled rack that combines 72 Rubin GPUs with 36 Vera CPUs, cooled with 45°C water.
A smaller configuration, the DGX Rubin NVL8, packs eight Rubin GPUs into a liquid-cooled 2U system for training, inference, and post-training workloads. Both systems are optimized for NVIDIA's software stack, including NIM inference microservices and the NeMo framework.
NVIDIA Rubin GPU vs. NVIDIA Vera CPU
So far, we have established some of the functional differences between the two devices, but what do the spec sheets say?
Other notable differences
Aside from pure specs, NVIDIA has made targeted optimizations across the Vera Rubin line:
Energy efficiency and cooling
- Vera CPU's LPDDR5X memory subsystem delivers 1.2 TB/s of bandwidth at under 50W of memory power, using SOCAMM modules.
- The Vera Rubin NVL72 is 100% liquid-cooled, using 45°C water with no chillers required.
- NVIDIA states this allows data centers to allocate up to 10% more power budget directly to compute instead of cooling.
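The figures above can be turned into rough efficiency numbers. This sketch uses only the values quoted in the list; the 1,000 kW facility size is a hypothetical example, and it assumes the "10%" refers to a share of the total facility power budget, which the claim does not spell out.

```python
def bandwidth_per_watt(bandwidth_gb_s: float, power_w: float) -> float:
    """Memory bandwidth delivered per watt of memory power, in GB/s per W."""
    return bandwidth_gb_s / power_w

def extra_compute_kw(facility_kw: float, freed_percent: float = 10) -> float:
    """Power that could move from cooling to compute.

    Assumption: the percentage applies to the whole facility budget.
    """
    return facility_kw * freed_percent / 100

if __name__ == "__main__":
    # 1.2 TB/s = 1,200 GB/s at under 50 W, so this is a lower bound.
    print(bandwidth_per_watt(1200, 50), "GB/s per watt (at least)")
    # Hypothetical 1,000 kW facility.
    print(extra_compute_kw(1000), "kW available for compute (at most)")
```

Because the memory power is quoted as "under 50W" and the power saving as "up to 10%", the first number is a floor and the second a ceiling.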
Precision and compute scope
- Both processors support FP8, but only the Rubin GPU supports FP4 through its Transformer Engine.
- FP4 inference is the primary way NVIDIA is driving down cost per token on the Rubin platform.
- Vera's native FP8 support allows it to handle lighter AI workloads like agentic tool execution and data preprocessing without offloading to a GPU.
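To show why 4-bit formats cut cost, here is a simplified sketch of block-scaled FP4 quantization. It assumes the common 4-bit E2M1 value grid and a 16-element block with one scale per block; NVIDIA's actual NVFP4 encoding is not described in this article, so treat the block size and scale handling as illustrative assumptions. The second function simply compares weight storage at different bit widths, ignoring scale-factor overhead.

```python
# Magnitudes representable by a 4-bit E2M1 float (sign bit handled separately).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumed block size for this sketch

def quantize_block(values: list[float]) -> list[float]:
    """Round one block to the E2M1 grid using a shared scale, then dequantize."""
    peak = max(abs(v) for v in values)
    scale = peak / E2M1_MAGNITUDES[-1] if peak else 1.0
    out = []
    for v in values:
        # Pick the nearest representable magnitude for the scaled value.
        mag = min(E2M1_MAGNITUDES, key=lambda g: abs(abs(v) / scale - g))
        out.append((mag if v >= 0 else -mag) * scale)
    return out

def quantize(values: list[float]) -> list[float]:
    """Apply quantize_block to consecutive blocks of BLOCK_SIZE values."""
    out = []
    for i in range(0, len(values), BLOCK_SIZE):
        out.extend(quantize_block(values[i:i + BLOCK_SIZE]))
    return out

def weight_gigabytes(num_params: int, bits_per_weight: int) -> float:
    """Storage for model weights alone, ignoring per-block scale overhead."""
    return num_params * bits_per_weight / 8 / 1e9

if __name__ == "__main__":
    print(quantize_block([6.0, 3.0, -1.0, 0.2]))
    # Hypothetical 70-billion-parameter model.
    print(weight_gigabytes(70_000_000_000, 8), "GB at FP8")
    print(weight_gigabytes(70_000_000_000, 4), "GB at FP4")
```

Halving the bits per weight halves the memory that has to be stored and moved for each token, at the price of coarser rounding (note how 0.2 collapses to 0.0 in the example block).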
Software stack
- Both are optimized for the NVIDIA AI Enterprise software suite, including NIM inference microservices and the NeMo framework.
- The Vera CPU is additionally positioned for reinforcement learning sandboxing, where NVIDIA claims a single 256-CPU rack can sustain over 22,500 concurrent RL or agent sandbox environments.
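As a quick sanity check on the sandboxing claim, the rack-level figure can be broken down per CPU. This sketch uses only the two numbers quoted above; the 100,000-environment target is a hypothetical example.

```python
import math

RACK_CPUS = 256
RACK_SANDBOXES = 22_500  # quoted as "over 22,500", so treat as a lower bound

def sandboxes_per_cpu() -> float:
    """Concurrent sandbox environments each CPU would need to sustain."""
    return RACK_SANDBOXES / RACK_CPUS

def racks_needed(target_sandboxes: int) -> int:
    """Whole racks required for a target number of concurrent sandboxes."""
    return math.ceil(target_sandboxes / RACK_SANDBOXES)

if __name__ == "__main__":
    print(f"{sandboxes_per_cpu():.1f} sandboxes per CPU")
    print(racks_needed(100_000), "racks for 100,000 sandboxes")
```

That works out to roughly 88 concurrent environments per CPU, which gives a feel for the per-socket concurrency the claim implies.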
Standalone availability
- The Vera CPU ships as a standalone product in dual and single-socket server configurations from Dell, HPE, Lenovo, and Supermicro.
- The Rubin GPU only ships as part of multi-GPU system configurations like the NVL72 or NVL8.
Summary
You’ve probably heard it a hundred times this week, but as AI adoption speeds up, organizations are looking for ways to optimize GPU usage and make inference cheaper. The Vera Rubin lineup aims to make this a reality.
In this post, we distinguished between the Vera CPU and the Rubin GPU, highlighting the differences and the reasons each one exists.
If you’re looking to learn more about previous generations of NVIDIA GPUs, here are some resources:

Technical Writer @ Civo
Jubril Oyetunji is a DevOps engineer and technical writer with a strong focus on cloud-native technologies and open-source tools. His work centers on creating practical tutorials that help developers better understand platforms such as Kubernetes, NGINX, Rust, and Go.
As a contract technical writer, Jubril authored an extensive library of technical guides covering cloud-native infrastructure and modern development workflows. Many of his tutorials achieved strong search rankings, helping developers around the world learn and adopt emerging technologies.