Google recently launched three new cloud storage options: Cloud Storage FUSE, for artificial intelligence (AI) applications that require file system semantics; Parallelstore, a parallel file system for demanding AI and high-performance computing (HPC) workloads that use GPUs; and NetApp Volumes, for enterprise applications running in the cloud.
Cloud Storage FUSE was already available as an open-source project, allowing objects in Cloud Storage buckets to be accessed as files through a local file system mount. The company has now enhanced it with new levels of portability, reliability, performance, and integration targeting AI workloads. In a Google blog post, Marco Abela, product manager, and Akshay Ram, senior product manager at Google, explain:
The new Cloud Storage FUSE is particularly important for AI workloads. Because applications can access data directly (rather than downloading it locally), there’s no custom logic to implement and less idle time for valuable resources like TPUs and GPUs while the data is copied over. Further, a new Cloud Storage FUSE CSI driver for Google Kubernetes Engine (GKE) allows applications to mount Cloud Storage using familiar Kubernetes API, and it’s offered as a turn-key deployment managed by GKE.
Overview of Cloud Storage FUSE (Source: Google blog post)
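To illustrate the direct-access pattern the blog post describes, here is a minimal sketch: assuming a bucket has already been mounted locally with the gcsfuse CLI (for example, gcsfuse my-training-bucket /mnt/gcs, where the bucket name and mount point are hypothetical), an application reads objects with ordinary file I/O instead of downloading them through the Cloud Storage client library.

```python
import os

# Assumed mount point, e.g. created with:
#   gcsfuse my-training-bucket /mnt/gcs
# (bucket name and path are hypothetical)
MOUNT_POINT = "/mnt/gcs"

# Objects in the bucket appear as regular files and directories.
for name in sorted(os.listdir(MOUNT_POINT)):
    print(name)

# Read an object directly; Cloud Storage FUSE translates the read into
# Cloud Storage requests, so there is no explicit download step.
sample_path = os.path.join(MOUNT_POINT, "train", "sample-0001.tfrecord")
with open(sample_path, "rb") as f:
    data = f.read()
print(f"read {len(data)} bytes from {sample_path}")
```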
J.W. Davis, a cloud evangelist, commented in a tweet on the availability of FUSE file systems in GKE:
This seems destined for many disasters. Useful for a very narrow range of use cases; most will misuse it and suffer.
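For context on the GKE integration the blog post refers to, the sketch below shows what declaring a bucket as a volume through the Cloud Storage FUSE CSI driver could look like, expressed here with the Kubernetes Python client rather than a raw manifest. The bucket name, namespace, service account, and container image are assumptions; the annotation and driver name follow Google's documented conventions for the driver.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes kubectl is already configured for the GKE cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="gcsfuse-example",
        # Opt the Pod into the managed Cloud Storage FUSE sidecar.
        annotations={"gke-gcsfuse/volumes": "true"},
    ),
    spec=client.V1PodSpec(
        # Assumed Kubernetes service account bound to an identity with bucket access.
        service_account_name="gcsfuse-ksa",
        containers=[
            client.V1Container(
                name="reader",
                image="python:3.11-slim",
                command=["sleep", "infinity"],
                volume_mounts=[
                    client.V1VolumeMount(name="gcs-data", mount_path="/data", read_only=True)
                ],
            )
        ],
        volumes=[
            client.V1Volume(
                name="gcs-data",
                csi=client.V1CSIVolumeSource(
                    driver="gcsfuse.csi.storage.gke.io",
                    read_only=True,
                    volume_attributes={"bucketName": "my-training-bucket"},  # hypothetical bucket
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```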
In addition to Cloud Storage FUSE for AI applications, the company announced a private preview of Parallelstore, a parallel file system designed to keep precious GPU resources from sitting idle while waiting on storage I/O by providing high-performance parallel file storage for AI/ML and HPC workloads. The solution is based on the next-generation Intel Distributed Asynchronous Object Storage (DAOS) architecture.
Sameet Agarwal, VP/GM, Storage, and Sean Derrington, group product manager, Storage, explain in a Google blog post on the new cloud storage options:
Based on the next-generation Intel DAOS architecture, all compute nodes in a Parallelstore environment have equal access to storage, so VMs can get immediate access to their data. With up to 6.3x read throughput performance compared to competitive Lustre Scratch offerings, Parallelstore is well suited for cloud-based applications that require extremely high performance (IOPS and throughput) and ultra-low latency.
Finally, the third new cloud storage option is NetApp Volumes, a fully Google-managed, high-performance file storage service. It is designed for enterprises that want to move applications running on top of on-premises NetApp storage arrays to the cloud. The service lets users scale volumes from 100 GiB to 100 TiB, apply ONTAP data management to hybrid workloads, and run either Windows or Linux applications as virtual machines without refactoring.
When asked by InfoQ what is driving this investment from Google, Derrington said:
As AI has become instrumental in automating data management, organizations are turning to the cloud to deliver the right storage solution for the right application. With Google Cloud’s new AI-optimized storage offerings, Cloud Storage FUSE and Parallelstore, we’re helping customers adapt to complex AI workloads by offering tailored storage solutions that simplify operations, unlock innovation, reduce costs, and more.
Cloud Storage FUSE and NetApp Volumes are available through the Google Cloud console, while access to the Parallelstore private preview is arranged through a Google account manager.