NVIDIA "Vera" CPU Sets New Performance Benchmark in Data Center Computing

NVIDIA’s latest custom CPU, the "Vera" processor, has posted strong results in early benchmark testing, a sign that Arm-based CPUs are becoming serious contenders in the data center market. Recent independent tests show Vera not only competing with the latest processors from Intel and AMD but, in many cases, surpassing them.

Technical Highlights of the NVIDIA Vera CPU

The NVIDIA Vera CPU is built around 88 custom Armv9.2 "Olympus" cores and supports 176 threads, two per core, by physically partitioning core resources between them. The cores handle FP8 natively through a 6x128-bit SVE2 implementation, so AI workloads can run directly on the CPU. These capabilities are particularly useful for AI and machine learning applications that rely on low-precision arithmetic.
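
The core and vector figures above can be checked with some quick arithmetic. The Python sketch below is illustrative only: it assumes that "6x128-bit" means six 128-bit vector pipes per core that can all be busy in the same cycle, which the article does not spell out, and every name in it is invented for this example.

```python
# Figures quoted in the article.
CORES = 88
THREADS = 176
# Assumption: "6x128-bit SVE2" means six 128-bit vector pipes per core.
SVE2_PIPES = 6
PIPE_WIDTH_BITS = 128


def lanes_per_core(element_bits: int) -> int:
    """Peak elements one core can hold in flight per cycle if every pipe is busy."""
    return SVE2_PIPES * PIPE_WIDTH_BITS // element_bits


threads_per_core = THREADS // CORES
vector_bits_per_core = SVE2_PIPES * PIPE_WIDTH_BITS

print(f"threads per core: {threads_per_core}")
print(f"vector width per core: {vector_bits_per_core} bits")
for name, bits in [("FP8", 8), ("FP16", 16), ("FP32", 32), ("FP64", 64)]:
    lanes = lanes_per_core(bits)
    print(f"{name}: {lanes} lanes per core, {lanes * CORES} across the chip")
```

Under that assumption, FP8 gets 96 lanes per core against 24 for FP32, which illustrates why narrow formats are attractive for CPU-side AI work. These are peak lane counts, not measured throughput.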

Memory bandwidth is another standout feature: Vera delivers 1.2 TB/s and supports up to 1.5 TB of LPDDR5X memory on SOCAMM2 modules. The processor’s second-generation Scalable Coherency Fabric provides 3.4 TB/s of bisection bandwidth and connects all cores across a single monolithic die. This design avoids the cross-die latency penalties often seen in chiplet-based architectures and gives every core consistent, high-speed access to data.
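
To put the memory figures in per-core terms, the sketch below divides the quoted peak numbers evenly across cores and threads. It is a back-of-the-envelope estimate that assumes decimal units (1 TB = 1000 GB) and a perfectly even split, neither of which the article states; real workloads will not share bandwidth this neatly.

```python
# Figures quoted in the article (assumed decimal units: 1 TB = 1000 GB).
MEMORY_BW_TBS = 1.2
FABRIC_BW_TBS = 3.4
MAX_MEMORY_TB = 1.5
CORES = 88
THREADS = 176


def share_gb(total_tb: float, units: int) -> float:
    """Split a quantity given in TB (or TB/s) evenly across units, in GB."""
    return total_tb * 1000 / units


bw_per_core = share_gb(MEMORY_BW_TBS, CORES)
bw_per_thread = share_gb(MEMORY_BW_TBS, THREADS)
mem_per_core = share_gb(MAX_MEMORY_TB, CORES)
fabric_to_memory = FABRIC_BW_TBS / MEMORY_BW_TBS
full_sweep_seconds = MAX_MEMORY_TB / MEMORY_BW_TBS

print(f"memory bandwidth per core:   {bw_per_core:.2f} GB/s")
print(f"memory bandwidth per thread: {bw_per_thread:.2f} GB/s")
print(f"memory capacity per core:    {mem_per_core:.2f} GB")
print(f"fabric / memory bandwidth:   {fabric_to_memory:.2f}x")
print(f"time to read all memory once: {full_sweep_seconds:.2f} s")
```

On these peak numbers, the fabric's bisection bandwidth is a little under three times the memory bandwidth, and a fully populated 1.5 TB system could in principle be read end to end in about 1.25 seconds.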

Benchmark Results: Outperforming Industry Leaders

In comparative testing, the Vera CPU was evaluated against single and dual Intel Xeon "Granite Rapids" 6980P processors, as well as AMD EPYC "Turin" and "Turin Dense" models, including the EPYC 9755, 9575F, and 9475F. NVIDIA’s previous-generation "Grace" CPU, based on Arm Neoverse V2 cores, was also included for reference.

The benchmarks covered a range of standard workloads, including code compilation, STREAM memory bandwidth, video encoding, Python and Java execution, and database operations. Although the chip was pre-release hardware and the test suite was limited, Vera consistently led the performance charts. On average, it delivered nearly 11% higher performance than AMD’s top-tier designs and outpaced the best single-socket Intel Xeon by 55.3%. Notably, Vera also outperformed the dual-socket configurations, which points both to its efficiency and to the scaling limits of multi-socket systems.
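
The two average leads quoted above can be combined to estimate where AMD's best result sits relative to Intel's. The sketch below is a rough illustration, not a reported result: it assumes both averages were computed over the same benchmarks with the same kind of mean, and it treats "nearly 11%" as exactly 11%.

```python
# Average leads quoted in the article, as fractions.
VERA_OVER_AMD = 0.11     # "nearly 11%", treated here as exactly 11%
VERA_OVER_INTEL = 0.553  # lead over the best single-socket Xeon

# Normalize the best single-socket Xeon to 1.0 and place the others on that scale.
intel = 1.0
vera = intel * (1 + VERA_OVER_INTEL)
amd = vera / (1 + VERA_OVER_AMD)

# Implied (not reported) lead of AMD's best over the single-socket Xeon.
amd_over_intel = amd / intel - 1

print(f"Intel (baseline): {intel:.3f}")
print(f"AMD (implied):    {amd:.3f}")
print(f"Vera:             {vera:.3f}")
print(f"implied AMD lead over Intel: {amd_over_intel:.1%}")
```

Under these assumptions, AMD's top result lands roughly 40% ahead of the single-socket Xeon, so most of Vera's margin over Intel reflects a gap that already existed between the two x86 vendors in this test suite.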

Efficiency and Market Impact

The Vera CPU operates at a 450 W TDP, with an additional 50 W allocated for a 768 GB memory pool. These power figures are competitive given the performance delivered, especially in high-density data center environments.
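
For a sense of scale, the quoted power figures can be broken down per core and per gigabyte. The sketch below is simple arithmetic on the article's numbers; it assumes the 450 W TDP is shared evenly by all 88 cores and ignores uncore, fabric, and I/O power, so the per-core value is an upper-bound approximation rather than a measurement.

```python
# Figures quoted in the article.
CPU_TDP_W = 450
MEMORY_POWER_W = 50
MEMORY_GB = 768
CORES = 88
THREADS = 176

total_w = CPU_TDP_W + MEMORY_POWER_W
# Assumption: the whole TDP is attributed evenly to cores and threads.
cpu_w_per_core = CPU_TDP_W / CORES
cpu_w_per_thread = CPU_TDP_W / THREADS
memory_w_per_gb = MEMORY_POWER_W / MEMORY_GB
memory_share = MEMORY_POWER_W / total_w

print(f"CPU + memory power: {total_w} W")
print(f"CPU power per core:   {cpu_w_per_core:.2f} W")
print(f"CPU power per thread: {cpu_w_per_thread:.2f} W")
print(f"memory power per GB:  {memory_w_per_gb:.3f} W")
print(f"memory share of total: {memory_share:.0%}")
```

That works out to about 5 W per core and a combined 500 W for the CPU and its 768 GB of memory, with memory accounting for a tenth of the total.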

NVIDIA’s strategy with Vera and its predecessor, Grace, is already reshaping the CPU landscape. With projected sales of around $20 billion and a total addressable market estimated at $200 billion, NVIDIA is positioning itself as a major player in the standalone CPU market. The company’s partnerships with leading hyperscalers are driving widespread adoption, with Vera-powered racks being deployed for both internal infrastructure and third-party cloud offerings.

As deployments continue to expand, NVIDIA’s Vera CPU is poised to become a cornerstone of modern data center infrastructure, setting new standards for performance, efficiency, and scalability in Arm-based computing.