Google Cloud has officially announced the general availability (GA) of its sixth-generation Tensor Processing Unit (TPU), known as Trillium. According to the company, the AI accelerator is designed to meet the growing demands of large-scale artificial intelligence workloads, offering higher performance, better energy efficiency, and greater scalability than its predecessor.
Trillium was announced in May and is a key component of Google Cloud's AI Hypercomputer, a supercomputer architecture that utilizes a cohesive system of performance-optimized hardware, open-source software, leading machine learning frameworks, and adaptable consumption models.
With the GA of Trillium TPUs, Google enhanced the AI Hypercomputer's software layer, optimizing the XLA compiler and popular frameworks like JAX, PyTorch, and TensorFlow for better price performance in AI training and serving. Features like host-offloading with large host DRAM complement High Bandwidth Memory (HBM) for improved efficiency.
The company states that Trillium delivers more than four times the training performance and up to three times the inference throughput of the previous generation. With a 67% improvement in energy efficiency, Trillium is faster and greener, aligning with the increasing emphasis on sustainable technology. Its peak compute performance per chip is 4.7 times higher than its predecessor, making it suitable for computationally intensive tasks.
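To put those multipliers in perspective, the short sketch below converts them into relative time and energy for a fixed workload. It is only an illustration: it assumes the vendor-stated figures apply uniformly to the workload and that a "67% improvement in energy efficiency" means 1.67 times the work per unit of energy. The function and constant names are hypothetical and do not come from Google.

```python
def relative_cost(speedup: float) -> float:
    """Fraction of the baseline time (or energy) needed when throughput
    (or work per unit of energy) is multiplied by `speedup`."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


# Vendor-stated figures from the announcement, assumed to apply uniformly.
TRAINING_SPEEDUP = 4.0          # "more than four times" training performance
ENERGY_EFFICIENCY_GAIN = 1.67   # 67% improvement, read as 1.67x work per unit of energy

training_time = relative_cost(TRAINING_SPEEDUP)
energy = relative_cost(ENERGY_EFFICIENCY_GAIN)

print(f"Training time vs. previous generation: {training_time:.0%}")  # 25%
print(f"Energy for the same work: {energy:.0%}")                      # 60%
```

Under these assumptions, a training job would finish in about a quarter of the time, and the same amount of work would use roughly 60% of the energy, or about 40% less. Real workloads may see different ratios.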
Trillium TPUs were also used to train Google’s Gemini 2.0 AI model, with a correspondent on a Hacker News thread commenting:
Google silicon TPUs have been used for training for at least 5 years, probably more (I think it's 10 years). They do not depend on Nvidia GPUs for the majority of their projects. It took TPUs a while to catch up on some details, like sparsity.
This is followed by a comment that notes that TPUs have been used for training deep prediction models in ads since at least 2018, with TPU capacity now likely surpassing the combined capacity of CPUs and GPUs.
Currently, Nvidia holds between 70% and 95% of the AI data center chip market, leaving the remaining 5% to 30% to alternatives such as Google's TPUs. Google does not sell the chips directly but offers access through its cloud computing platform.
In a Reddit thread, a correspondent commented on Google's decision not to sell the chips:
That's right, but I think Google is more future-focused, and efficient AI will ultimately be much more valuable than chips.
In my country, we often say that we should make wood products rather than export wood because making furniture creates more value. I think this is similar: TPUs and AI create more value than the two things alone.
More details on pricing and availability can be found on the pricing page.