NVIDIA debuts Vera CPU for agentic AI

At GTC 2026, NVIDIA announced the Vera CPU, a data centre processor purpose-built for agentic AI workloads: the AI agents, coding copilots and orchestration services that sit at the heart of modern AI factories.

Built around 88 custom Olympus Arm cores with spatial multithreading, the new CPU delivers up to 50 percent faster single-threaded sandbox performance than competing platforms, backed by 1.2 TB/s of memory bandwidth for highly concurrent, latency-sensitive AI control loops.
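Taken together, the announced figures imply a healthy bandwidth budget per core. A back-of-the-envelope split (an illustration only; the even division across cores is an assumption, and real allocation is dynamic):

```python
# Per-core share of memory bandwidth, using only figures from the
# announcement: 1.2 TB/s of memory bandwidth across 88 Olympus cores.
# An even split is an assumption for illustration.
memory_bandwidth_gbs = 1200   # 1.2 TB/s expressed in GB/s
core_count = 88

per_core_gbs = memory_bandwidth_gbs / core_count
print(f"~{per_core_gbs:.1f} GB/s of memory bandwidth per core")  # ~13.6 GB/s
```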

It offers twice the efficiency and 50 percent higher performance than traditional rack-scale CPUs for these tasks. Early partners include cloud providers such as Oracle Cloud Infrastructure, CoreWeave, Lambda, Nebius, and Nscale, and server makers such as Dell Technologies, HPE, Lenovo, and Supermicro.

Vera debuts as part of NVIDIA’s Rubin NV72 platform, where the CPU is tightly linked to NVIDIA GPUs over the NV-2C coherent interconnect, which delivers up to 1.8 TB/s of bandwidth, around seven times that of PCIe Gen 6, for rapid data sharing between CPU and GPU.
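The "around seven times" claim checks out against common PCIe figures. A quick sanity check, assuming a PCIe Gen 6 x16 link at roughly 256 GB/s in one direction (a widely cited figure, not one stated in the announcement):

```python
# Back-of-the-envelope comparison of NV-2C vs PCIe Gen 6 bandwidth.
# The 256 GB/s figure is the commonly cited one-directional throughput
# of a PCIe Gen 6 x16 link; it is an assumption here, not from NVIDIA.
nv2c_bandwidth_gbs = 1800     # NV-2C coherent interconnect, GB/s
pcie_gen6_x16_gbs = 256       # PCIe Gen 6 x16, GB/s (assumed)

ratio = nv2c_bandwidth_gbs / pcie_gen6_x16_gbs
print(f"NV-2C is roughly {ratio:.1f}x a PCIe Gen 6 x16 link")  # ~7.0x
```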

This level of coupling lets NVIDIA optimise the full pipeline from data ingestion and preprocessing to training, inference and agentic orchestration, turning the CPU from a commodity attach into a strategic control plane for AI systems.

NVIDIA’s reference designs can pack up to 256 Vera CPUs into a liquid-cooled rack, supporting more than 22,500 concurrent AI environments for reinforcement learning or agent evaluation, a configuration aimed squarely at the large-scale AI operations traditionally anchored on x86 servers.
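The rack figures line up almost exactly with one sandboxed environment per core (a back-of-the-envelope reading of the announced numbers, not a mapping NVIDIA has stated):

```python
# Core budget for a fully populated Vera rack, using only figures
# from the announcement: 256 CPUs x 88 Olympus cores each.
cpus_per_rack = 256
cores_per_cpu = 88

total_cores = cpus_per_rack * cores_per_cpu
print(f"{total_cores:,} cores per rack")  # 22,528 cores vs >22,500 environments
```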

“As intelligence becomes agentic — capable of reasoning and acting — the importance of the systems orchestrating that work is elevated. The CPU is no longer simply supporting the model; it’s driving it. With breakthrough performance and energy efficiency, Vera unlocks AI systems that think faster and scale further,” said Jensen Huang, Founder and CEO of NVIDIA.

Intel and AMD face CPU squeeze

For Intel and AMD, Vera is a direct shot at the head node and orchestration tiers of AI clusters where x86 CPUs have remained entrenched even as GPUs grabbed the spotlight.

NVIDIA has historically relied on Intel Xeon and AMD EPYC to front its GPU systems, but Vera gives hyperscalers a tightly integrated alternative that can displace x86 in high-margin, control-centric roles.

Vera’s design underscores a broader pivot in the CPU market from general-purpose compute towards AI-native architectures optimised for agentic workflows, dataflow-style analytics and low tail latencies.

Features such as a 10‑wide decode pipeline, neural branch prediction, custom prefetching for graph analytics and instruction paths tuned for frameworks like PyTorch point to a new class of CPUs tailored to AI-era control logic rather than legacy enterprise workloads.

If hyperscalers and enterprise AI factories standardise on Vera-like AI-native CPUs for orchestration, Intel and AMD may be compelled to accelerate their own AI-centric core designs and platform-level integration strategies or risk ceding the most strategic layers of the data centre to Arm-based rivals.
