Google Cloud has unveiled its new A4 virtual machines (VMs) in preview, powered by NVIDIA's Blackwell B200 GPUs, to address the increasing demands of advanced artificial intelligence (AI) workloads. The offering aims to accelerate AI model training, fine-tuning, and inference by combining Google's infrastructure with NVIDIA's hardware.
The A4 VM features eight Blackwell GPUs interconnected via fifth-generation NVIDIA NVLink, providing a 2.25x increase in both peak compute and high-bandwidth memory (HBM) capacity compared to the previous-generation A3 High VMs. This performance enhancement addresses the growing complexity of AI models, which require powerful accelerators and high-speed interconnects. Key features include enhanced networking, Google Kubernetes Engine (GKE) integration, Vertex AI accessibility, open software optimization, a hypercompute cluster, and flexible consumption models.
Thomas Kurian, CEO of Google Cloud, announced the launch on X, highlighting Google Cloud as the first cloud provider to bring the NVIDIA B200 GPUs to customers:
"Blackwell has made its Google Cloud debut by launching our new A4 VMs powered by NVIDIA B200. We're the first cloud provider to bring B200 to customers, and we can't wait to see how this powerful platform accelerates your AI workloads."
The A4 VMs use Google's Titanium ML network adapter and NVIDIA ConnectX-7 NICs, delivering 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE). The Jupiter network fabric supports scaling to tens of thousands of GPUs with 13 petabits per second of bisection bandwidth. Native integration with GKE, which supports up to 65,000 nodes per cluster, provides the basis for a robust AI platform. The VMs are accessible through Vertex AI, Google's unified AI development platform, powered by the AI Hypercomputer architecture. Google is also collaborating with NVIDIA to optimize JAX and XLA for efficient collective communication and computation on GPUs.
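To put the networking figure in more familiar units, the following is a minimal back-of-envelope sketch in Python. It uses only the 3.2 Tbps and eight-GPU figures quoted above; the even split of bandwidth across GPUs and the 140 GB payload are illustrative assumptions, not published specifications, and the result is an idealized lower bound rather than a measured value.

```python
# Back-of-envelope conversion of the quoted A4 networking figure.
# Assumption (not from the announcement): the 3.2 Tbps is shared
# evenly across the VM's eight GPUs.

GPUS_PER_VM = 8              # quoted in the article
VM_GPU_TRAFFIC_TBPS = 3.2    # quoted in the article, terabits per second


def tbps_to_gigabytes_per_s(tbps: float) -> float:
    """Convert terabits/s to gigabytes/s (decimal units, 8 bits per byte)."""
    return tbps * 1000 / 8


def ideal_transfer_seconds(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized lower bound: ignores latency, protocol overhead, and congestion."""
    return payload_gb / bandwidth_gb_per_s


vm_gb_per_s = tbps_to_gigabytes_per_s(VM_GPU_TRAFFIC_TBPS)
per_gpu_gb_per_s = vm_gb_per_s / GPUS_PER_VM

# Hypothetical payload: about 140 GB, roughly the size of a
# 70-billion-parameter model stored in 16-bit precision.
HYPOTHETICAL_PAYLOAD_GB = 140.0
transfer_s = ideal_transfer_seconds(HYPOTHETICAL_PAYLOAD_GB, vm_gb_per_s)

print(f"Per-VM GPU-to-GPU bandwidth: {vm_gb_per_s:.0f} GB/s")
print(f"Per-GPU share (assumed even): {per_gpu_gb_per_s:.0f} GB/s")
print(f"Ideal time to move {HYPOTHETICAL_PAYLOAD_GB:.0f} GB: {transfer_s:.2f} s")
```

Real collective operations such as all-reduce move more data than a single one-way copy and rarely reach line rate, so actual times would be higher than this estimate.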
Furthermore, a new hypercompute cluster system simplifies the deployment and management of large-scale AI workloads across thousands of A4 VMs. The system delivers high performance through co-location of resources, optimized scheduling with GKE and Slurm, reliability through self-healing capabilities, enhanced observability, and automated provisioning. Flexible consumption models are also available, including the Dynamic Workload Scheduler with its Flex Start and Calendar modes.
Sai Ruhul, an entrepreneur, highlighted on X analyst estimates that the Blackwell GPUs could be 10-100x faster than NVIDIA's Hopper (H100) and Ampere (A100) GPUs for large transformer model workloads requiring multi-GPU scaling. If those estimates hold, this would represent a significant leap in scale for accelerating "Trillion-Parameter AI" models.
In addition, Naeem Aslam, CIO at Zaye Capital Markets, posted on X:
"Google's integration of NVIDIA Blackwell GPUs into its cloud with A4 VMs could enhance computational power for AI and data processing. This partnership is likely to increase demand for NVIDIA’s GPUs, boosting its position in cloud infrastructure markets."
Overall, the release gives developers access to the latest NVIDIA Blackwell GPUs within Google Cloud's infrastructure, with the A4 VMs offering substantial performance improvements for AI applications.