Recently, Google announced new Cloud TPU Virtual Machines (VMs), which provide direct access to TPU host machines. With these VMs, the company offers a new and improved user experience for developing and deploying TensorFlow, PyTorch, and JAX on Cloud TPUs.
Customers could already set up virtual instances in Google Cloud with TPU chipsets. However, this setup had drawbacks, as the instances did not run on the same host machines as the TPUs. The TPUs were instead reached remotely over a network connection, which reduced processing speed: applications had to send data over the network to a TPU and then wait for the processed data to be sent back.
With Cloud TPU VMs now in preview, customers can connect their TPU chipsets directly to their deployed instances, eliminating the network delay between applications running on Google Cloud instances and the TPU chipsets they use. Alexander Spiridonov, product manager at Google AI, stated in a blog post on the new Cloud TPU VMs:
This new Cloud TPU system architecture is simpler and more flexible. In addition to major usability benefits, you may also achieve performance gains because your code no longer needs to make round trips across the datacenter network to reach the TPUs. Furthermore, you may also see significant cost savings: If you previously needed a fleet of powerful Compute Engine VMs to feed data to remote hosts in a Cloud TPU Pod slice, you can now run that data processing directly on the Cloud TPU hosts and eliminate the need for the additional Compute Engine VMs.
Source: https://cloud.google.com/blog/products/compute/introducing-cloud-tpu-vms
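Because TPU VMs are host machines customers log into directly, provisioning at preview time went through the alpha track of the gcloud CLI. A minimal sketch of creating and connecting to a TPU VM follows; the VM name and zone are placeholders, and the exact flag values (accelerator type, runtime version) should be checked against the current documentation:

```shell
# Create a TPU VM (preview-era alpha surface); name and zone are placeholders.
gcloud alpha compute tpus tpu-vm create my-tpu-vm \
  --zone=us-central1-b \
  --accelerator-type=v3-8 \
  --version=v2-alpha

# SSH straight into the TPU host machine -- code run here has local
# access to the TPU chips, with no datacenter-network round trip.
gcloud alpha compute tpus tpu-vm ssh my-tpu-vm --zone=us-central1-b
```

This is the change Spiridonov describes: input pipelines and preprocessing can run on the TPU host itself, removing the need for a separate fleet of Compute Engine VMs to feed the accelerators.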
Google offers the Cloud TPU VMs in two variants: Cloud TPU v2, based on the second-generation TPU chipsets, and the newer Cloud TPU v3, based on the third generation. The difference between the two, according to Google Cloud, is performance: a Cloud TPU v2 can perform up to 180 teraflops, and a TPU v3 up to 420 teraflops.
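Taken at face value, the quoted peak-throughput figures put the v3 generation at roughly 2.3 times the raw compute of v2, as a quick check shows:

```python
# Peak throughput figures quoted by Google Cloud, in teraflops.
tpu_v2_tflops = 180
tpu_v3_tflops = 420

# Ratio of third-generation to second-generation peak throughput.
speedup = tpu_v3_tflops / tpu_v2_tflops
print(f"TPU v3 offers about {speedup:.2f}x the peak throughput of TPU v2")
```

Real-world gains depend on the workload, of course; peak teraflops only bound the best case.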
A use case for the Cloud TPU VMs is developing algorithms on the already existing Cloud TPU Pods, large clusters of AI servers based on TPUs that are particularly suitable for running very complex machine learning models. The fastest cluster, for example, offers a capacity of over 100 petaflops. This makes building algorithms on these clusters considerably cheaper: customers only pay to rent a pod slice, and incur migration costs to more powerful hardware only when going into production. In addition, Google Cloud plans to use the Cloud TPU VMs in its quantum computing plans.
Hugging Face, an AI company and community, stated in a tweet:
With the power of JAX/Flax & the new cloud TPU V3-8 now you can pre-train a masked LM in just 18hrs!
The Cloud TPU VMs in preview are currently available in the us-central1 and europe-west4 regions. These VMs start at $1.35 per hour per TPU host machine with Google's preemptible offerings; more details are available on the pricing page. Lastly, customers can quickly start training ML models using JAX, PyTorch, and TensorFlow on Cloud TPUs and Cloud TPU Pods by leveraging the documentation and the JAX, PyTorch, and TensorFlow quickstarts.
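The quickstarts all follow the same pattern: the framework discovers the locally attached TPU cores and the user's model code runs unchanged. A minimal sketch with JAX illustrates this, assuming JAX is installed (as it is on the TPU VM images); on a Cloud TPU VM `jax.devices()` reports the host's TPU cores, while on any other machine JAX falls back to CPU and the same code still runs:

```python
import jax
import jax.numpy as jnp

# On a Cloud TPU VM this lists the locally attached TPU cores
# (e.g. eight devices on a v3-8); elsewhere it lists CPU devices.
print(jax.devices())

# The same jit-compiled computation runs on whichever backend is present.
@jax.jit
def scaled_dot(x, y):
    return jnp.dot(x, y) * 2.0

x = jnp.ones((4, 4))
result = scaled_dot(x, x)  # each entry: (1*1 summed over 4 terms) * 2 = 8.0
print(result)
```

This portability is what makes developing against a small TPU VM and later migrating to a larger Cloud TPU Pod slice straightforward.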