Every year, NVIDIA GTC offers a glimpse into the future of computing. But this year felt different. The conversations from the past few days point to something bigger than faster GPUs or larger models: the industry is shifting its mindset entirely.
GTC 2026 made it clear that the goalposts for AI haven’t just moved; they’ve been uprooted. We’re past the point of talking about "faster chips." Everything points to a wholesale shift in the industry’s DNA.
“In the good old days when I would say ‘Hopper’, I would hold up a chip… that’s just adorable. When we think ‘Vera Rubin’, we think the entire system, optimized as one giant system.” - Jensen Huang, Founder and CEO of NVIDIA (Session: GTC 2026 Keynote)
We’ve moved beyond individual components. We’re now in the era of the AI Factory: a complete system designed to do one thing extremely well: generate tokens and intelligence at scale. That shift was reinforced not just in how NVIDIA talks about full-stack systems, but in what the company is building. The most notable example is its move into purpose-built inference infrastructure through the integration of Groq’s LPU technology, a signal that even the underlying silicon is being redesigned around end-to-end AI production rather than general-purpose compute.

Events like NVIDIA GTC have always served as a pulse check for where AI infrastructure is heading. But this year highlighted something even bigger: NVIDIA’s evolution from a GPU vendor into a full AI infrastructure platform company. For organizations building AI products today, that shift matters. It means infrastructure decisions are becoming just as strategic as model choices, and access to the right compute environment is becoming a competitive advantage.
"This is how intelligence is made. A new kind of factory. Generator of tokens. The building blocks of AI." - Jensen Huang, Founder and CEO of NVIDIA (Session: GTC 2026 Keynote)
The era of AI factories
Across several GTC sessions, one theme kept appearing: the move away from traditional GPU clusters toward pre-engineered AI factory infrastructure. Rather than assembling clusters server by server, the future increasingly looks like rack-scale systems designed and optimized as a single unit. These architectures combine compute, networking, storage, cooling, and orchestration software into tightly integrated environments.
Systems like the NVIDIA DGX line are becoming the blueprint for these deployments, allowing organizations to stand up AI infrastructure faster while improving performance per watt and overall system efficiency.
From clusters to industrial-scale systems
What stood out in sessions like Build Gigascale AI Factories with Next-Generation Rack-Scale Systems was just how far this integration is being pushed. Modern AI factories are being engineered at a scale where a single system can contain hundreds of thousands of components, highlighting the operational complexity that pre-integrated designs are intended to abstract away.

Source: Build Gigascale AI Factories With Next-Generation Rack-Scale Systems
At the high end, rack-scale systems such as DGX SuperPOD configurations are already delivering tens of exaFLOPS of AI performance (e.g., ~50 exaFLOPS inference and ~35 exaFLOPS training in a single deployment), reinforcing the idea that AI infrastructure is now being built more like industrial capacity than traditional IT.
While Rubin represents the “north star” architecture for the coming years, the message across the conference was clear: the path toward that future is already being built on today’s hardware. AI factories are not a distant concept; they are rapidly becoming the default model for organizations looking to operationalize AI at scale.
Vera Rubin: The architecture defining the next generation of AI
NVIDIA officially pulled back the curtain on the NVIDIA Vera Rubin platform, a generational leap designed specifically for the era of Agentic AI.
The industry is moving beyond the phase where models are simply trained and deployed. Increasingly, models are expected to act, reason, and operate autonomously, which dramatically changes what infrastructure needs to support. It’s no longer just about raw power; it’s about the orchestration and efficiency needed to run thousands of autonomous agents without costs spiralling out of control.
| Component | Description |
|---|---|
| The Vera CPU | Designed specifically to handle the sequential reasoning patterns that modern AI agents rely on. By improving efficiency compared with traditional CPU designs, it helps eliminate a common bottleneck in AI systems: GPUs sitting idle while workloads move through orchestration and decision layers. |
| The Rubin GPU | The next step beyond the NVIDIA Blackwell architecture, with major improvements aimed at reducing the cost of large-scale inference. If the industry’s next challenge is deploying AI at massive scale, lowering inference cost becomes just as important as increasing training performance. |
| The Groq LPUs | This was the real curveball. By integrating Groq’s LPU technology into dedicated LPX racks, NVIDIA has added a specialized "turbocharger" for token generation. While the Rubin GPU handles the heavy compute, the Groq LPU handles the lightning-fast delivery. Fusing them together is how NVIDIA is hitting a massive 35x inference performance-per-watt jump. |
Together, these architectural changes point to something bigger than a new chip generation. Rubin represents a shift toward fully integrated AI infrastructure systems designed to support the next wave of AI workloads.
“Rubin arrives at exactly the right moment, as AI computing demand for both training and inference is going through the roof… With our annual cadence of delivering a new generation of AI supercomputers — and extreme codesign across six new chips — Rubin takes a giant leap toward the next frontier of AI.” - Jensen Huang, Founder and CEO of NVIDIA (Source: NVIDIA press release)
Preparing for Vera Rubin
While the NVIDIA Vera Rubin architecture captured much of the attention at NVIDIA GTC, the most important question many organizations are already asking is simple: How do we prepare for it?
“Vera Rubin is a generational leap — seven breakthrough chips, five racks, one giant supercomputer — built to power every phase of AI… The agentic AI inflection point has arrived with Vera Rubin kicking off the greatest infrastructure buildout in history.” - Jensen Huang, Founder and CEO of NVIDIA (Source: NVIDIA press release)
Rubin represents the next major evolution in AI infrastructure architecture, designed for rack-scale systems capable of running massive agentic workloads. But as with every generational shift in compute, adoption won’t happen overnight. Organizations that want to take advantage of Rubin will need to begin preparing their infrastructure strategy well before the hardware becomes broadly available.

At Civo, we’re already working closely with partners across the ecosystem to help customers plan for that transition.
Teams interested in deploying Rubin-powered infrastructure can now register their interest through Civo. This allows us to work closely with organizations planning for the next generation of AI workloads as Rubin becomes available.
While timelines for Rubin availability are still emerging, one thing is already clear: the workloads coming next (autonomous agents, real-time reasoning systems, and large-scale inference pipelines) will require a new generation of infrastructure to support them.
Preparing now ensures teams are ready when the next generation arrives.
Unlocking low-latency AI with NVIDIA and Groq
As Jensen Huang highlighted in his keynote, the integration of Groq's LPU technology is a key component of the Vera Rubin platform's capabilities. Incorporating Groq's low-latency, high-throughput LPUs into dedicated LPX racks gives the platform a processor dedicated purely to fast token delivery.

Source: Inside NVIDIA Groq 3 LPX
The Groq 3 LPX is designed to work in tandem with NVIDIA's Vera Rubin GPU in a heterogeneous inference architecture that combines the strengths of both processors: the Rubin GPU handles the heavy compute, while the Groq LPU accelerates latency-sensitive token generation. As Jensen Huang noted during his keynote, this pairing is what delivers the 35x jump in inference performance per watt.
For organizations running inference at scale, the practical result is a more efficient and more scalable path to real-time AI applications in latency-sensitive industries such as healthcare and finance, and a clear signal of where NVIDIA's inference stack is heading.
The rise of sovereign AI
As AI infrastructure scales, governments are increasingly recognizing that compute capacity is becoming as strategically important as energy, telecoms, or semiconductor supply chains. At this year’s GTC, several discussions focused on how countries are beginning to build sovereign AI infrastructure to support national innovation and economic resilience.
“I am fortunate to live in a time when this huge technological shift is taking place. I believe that if you can, look upon these revolutionary times with a positive mindset. All changes have their risks, but they usually also carry great opportunities. I think if one dares to think positively, and think about the long-term opportunities, that will be very helpful.” - Marcus Wallenberg, Chair of the Board of Directors, Skandinaviska Enskilda Banken (Session: Driving National Growth Through Sovereign AI Investment)
AI factories and growth zones: Building the UK’s sovereign foundation
The imperative is clear: sovereign AI is essential for economic growth, with the potential to increase global output by $15.7 trillion and offset labor shortages that could leave 85 million jobs unfilled by 2030. AI factories and growth zones are becoming a key part of this strategy, as seen in the UK's AI Growth Zones initiative, which aims to accelerate the deployment of AI infrastructure by simplifying data-centre approvals and improving grid connectivity.
"We want to make sure that Britain has been able to not just build that AI but build that AI in a durable way from an energy generation point of view and an energy generation effort to be benefited." - Kanishka Narayan (Session: AI Growth Zones and the UK’s AI Opportunities Action Plan)
For organizations building AI products in the UK, sovereign infrastructure isn't just a regulatory consideration; it's becoming a strategic advantage, allowing teams to keep data, models, and compute within national boundaries while still accessing cutting-edge infrastructure.
By investing in sovereign AI, countries can also unlock significant economic benefits, not least by reducing regulatory compliance costs. The EU, for example, loses a significant portion of its GDP annually to bureaucratic compliance overhead.
UK Sovereign Cloud for total control
100% UK-based. Data stays in the UK and under UK law. No hidden fees. No lock-in. Built on open standards.
👉 Find out more

“In Europe, the conversation around sovereign AI has moved very quickly in the last year. For years, legal frameworks such as the CLOUD Act have always been there but rarely used. The recent geopolitical tensions and policy shifts have made these risks more tangible, so technology is now used as part of strategic leverage between the different blocs.
In Europe, the question is no longer ‘should we care about sovereignty’, but ‘how exposed are we if we don’t’. From here, we understand why sovereign AI matters.” - Andres Desantes, 1MillionBot (Session: Model Builders at the Frontier of EMEA’s Open and Sovereign AI Movement)
From chatbots to "Agentic AI" with NemoClaw
One of the most interesting software announcements during the keynote was NemoClaw, NVIDIA’s enterprise platform built around the OpenClaw agent framework.
These systems represent the next stage of AI development: agentic AI. Rather than simply responding to prompts, these systems can autonomously plan, execute tasks, and coordinate across tools and services.
"Every company in the world today needs to have an OpenClaw strategy and an agentic system strategy. This is the new computer." - Jensen Huang, Founder and CEO of NVIDIA (Session: GTC 2026 Keynote)
When Jensen calls something "the new computer," you listen. But here’s the reality: these "always-on" agents can’t just run on a generic public cloud. They need isolated sandboxes and the kind of security you only get with sovereign GPU infrastructure.
At Civo, we’ve focused on providing the production-ready Kubernetes environments that make deploying these systems significantly simpler, turning what used to be months of infrastructure setup into something teams can launch in minutes.
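As an illustration of what "production-ready" means in practice, the sketch below shows a minimal Kubernetes Deployment that requests a single NVIDIA GPU through the standard `nvidia.com/gpu` resource exposed by the NVIDIA device plugin. The image name and workload name are placeholders for this example, not real Civo or NVIDIA artifacts; on a cluster with GPU nodes, a manifest of roughly this shape is all that's needed to schedule an agent or inference workload onto accelerated hardware.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-runtime               # hypothetical agent workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: agent-runtime
  template:
    metadata:
      labels:
        app: agent-runtime
    spec:
      containers:
        - name: agent
          image: registry.example.com/agent-runtime:latest  # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 1     # places the pod on a GPU node via the NVIDIA device plugin
```

Applied with `kubectl apply -f agent.yaml`, the scheduler handles GPU placement automatically; per-agent isolation can then be layered on with namespaces and network policies rather than bespoke infrastructure.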
Where Civo fits into this new AI landscape
As AI infrastructure becomes more specialized and complex, many organizations are looking for ways to access GPU compute without navigating the complexity or lock-in of hyperscale cloud environments.
This is where platforms like Civo come in, aiming to simplify the process of deploying and scaling AI workloads.
Building today, preparing for Rubin
In this industry, "waiting" is just another way of saying you're falling behind. While the buzz around Vera Rubin is justified, the reality is that the road to that future is paved with what’s on the shelf right now.
At Civo, we’ve already made the NVIDIA Blackwell B200 available because we know that innovation can't sit in a queue. The B200 isn't just a placeholder; it’s a powerhouse that delivers 15x the inference throughput of previous generations.
As interest in the NVIDIA Vera Rubin architecture grows, organizations can now register their interest with Civo to stay informed about early access opportunities as Rubin becomes available. The future of AI infrastructure is arriving fast, and the best way to be ready for it is to start building now.