NVIDIA’s Vera CPU targets agentic AI at scale

NVIDIA has launched Vera, its first CPU built specifically for agentic AI and reinforcement learning.

The new CPU targets enterprises that need AI systems to plan, execute and validate tasks rather than just generate responses.

According to NVIDIA, Vera delivers twice the efficiency and 50 percent faster performance than traditional rack-scale CPUs. It is already being tested or adopted by cloud providers, AI companies and infrastructure vendors.

Positioned as an active part of the AI stack, not just a support chip, Vera is designed for the workloads that agentic systems generate every day: tool use, code execution, orchestration, analytics, data processing, and multi-tenant inferencing across many simultaneous jobs.

It also brings 88 custom Olympus cores and up to 1.2TB/s of memory bandwidth to help sustain performance under heavy parallel demand.

“AI agents will be the largest users of computing. Vera is the first CPU designed for that future — built to run agentic AI at hyperscale with extraordinary performance, efficiency and programmability,” said Jensen Huang, Founder and CEO of NVIDIA.

For enterprises, relevant use cases include coding assistants, software development agents, workflow orchestration, real-time data streaming, database-heavy applications, and cloud services that run many AI tools at once.

Vera is also suitable for reinforcement learning, agentic inference, storage management, and high-performance computing, which broadens its appeal beyond pure AI model training.

The clearest early adopters are cloud infrastructure, AI-native software, streaming data platforms, supercomputing centres, and enterprise data centres. Among those exploring, using or collaborating on the CPU are Alibaba Cloud, ByteDance, Meta, Oracle Cloud Infrastructure, CoreWeave, Lambda, Nebius, and Nscale. National labs such as TACC and Los Alamos are also planning deployments.

For enterprises, Vera could lower the cost and latency of running agentic AI at scale while improving energy efficiency in dense data centres.

NVIDIA is also pairing Vera with its broader Rubin platform and NVLink-C2C interconnect, which means the CPU is meant to work as part of a tightly integrated AI factory rather than as a standalone processor. This could help enterprises deploy more concurrent AI agents, support faster internal automation and run larger multi-service AI applications without the same infrastructure overhead.
