Tenstorrent Updates Blackhole P150 AI Accelerator Core Count

Tenstorrent has revised the specifications of its Blackhole P150 AI accelerators. The company, led by computer architect Jim Keller, has reduced the number of active "Tensix" cores on its P150a and P150b models from 140 to 120 per card. The change, detailed in the latest official documentation, cuts the core count by 20 cores, or approximately 14.3% relative to the original specification.

Details of the Blackhole P150 Hardware Revision

The Blackhole P150 series is designed for demanding AI workloads, featuring 32 GB of GDDR6 memory and a maximum power draw of 300 W in an actively cooled form factor suitable for desktop workstations. The P150a variant also includes four passive QSFP-DD 800G ports for high-speed connectivity. With the updated core count, the accelerators now ship with 120 operational Tensix cores, as opposed to the previously advertised 140.

According to Tenstorrent, this adjustment is intended to "present a unified interface to metal and other system software," as stated in the release notes for firmware v19.5.0 and later. The company assures users that typical workloads will experience only a minimal performance impact, estimated at around 1-2%.

Performance Impact and Industry Analysis

The reduction in core count directly lowers the accelerator's rated computational throughput. The original 140-core models were rated at 774 TeraFLOPS for BLOCKFP8 (8-bit floating-point) operations. With the new 120-core configuration, this figure drops to 664 TeraFLOPS at the same precision level, a decline of roughly 14% that tracks the reduction in core count. Beyond the release-note statement quoted above, Tenstorrent has not provided a detailed explanation for the change, and observers in the high-performance computing (HPC) community have suggested several plausible reasons:

  • Thermal Constraints: The 300 W thermal envelope may have limited the ability of the 140-core version to reach its full performance potential, making a lower core count more practical for sustained workloads.
  • Silicon Yield Optimization: Reducing the required number of fully functional cores per chip can significantly improve manufacturing yield, which is especially important as Tenstorrent prepares to scale its architecture to multi-chip systems.
  • Silicon Maturity: The change may also indicate that the current silicon is still maturing, and the company is shipping hardware that is not yet fully optimized, a common practice in the early stages of advanced chip development.
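As a quick sanity check on the figures quoted in this section, the short Python sketch below compares the relative drop in core count with the relative drop in rated BLOCKFP8 throughput. The core counts and TeraFLOPS ratings are taken from the article; the linear model (peak throughput proportional to the number of active Tensix cores) and the helper name `relative_drop` are illustrative assumptions, not something Tenstorrent has stated.

```python
# Figures quoted in the article. The linear-scaling model below is an
# assumption made for illustration, not a vendor-confirmed relationship.
ORIGINAL_CORES = 140
REVISED_CORES = 120
ORIGINAL_TFLOPS = 774.0  # rated BLOCKFP8 throughput, 140-core specification
REVISED_TFLOPS = 664.0   # rated BLOCKFP8 throughput, 120-core specification


def relative_drop(before: float, after: float) -> float:
    """Return the fractional reduction going from `before` to `after`."""
    return (before - after) / before


core_drop = relative_drop(ORIGINAL_CORES, REVISED_CORES)
tflops_drop = relative_drop(ORIGINAL_TFLOPS, REVISED_TFLOPS)

# Hypothetical linear model: peak throughput proportional to active cores.
per_core_tflops = ORIGINAL_TFLOPS / ORIGINAL_CORES
predicted_tflops = per_core_tflops * REVISED_CORES

print(f"Core count reduction:    {core_drop:.1%}")
print(f"Rated throughput drop:   {tflops_drop:.1%}")
print(f"Linear-model prediction: {predicted_tflops:.1f} TFLOPS")
```

Under that assumption, the model predicts about 663.4 TeraFLOPS for 120 cores, close to the revised 664 TeraFLOPS rating, so the rated peak appears to fall roughly in proportion to the core count. The 1-2% impact Tenstorrent cites applies to typical workloads, which is a separate measure from the rated peak.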

While Tenstorrent has not disclosed the exact motivation behind the core reduction, the update reflects the balance between performance, manufacturability, and reliability that AI accelerator vendors have to strike, particularly while their silicon is still maturing.