Google Cloud has announced a significant update to its Cloud Storage service: the Hierarchical Namespace (HNS). Now available in preview, the feature lets users organize the data in their storage buckets in a hierarchical, file-system-like structure, improving performance, consistency, and manageability.
The hierarchical namespace enables users to organize their data more effectively by creating directories and nested subdirectories within storage buckets. This logical structuring mirrors traditional file systems, making it easier for users to manage and access their data. The hierarchical organization simplifies data management and improves performance, particularly for workloads requiring extensive directory and file operations.
Vivek Saraswat, a group product manager, and Zhihong Yao, a staff software engineer, both on the Cloud Storage team at Google, write:
A bucket with a hierarchical namespace has storage folder resources backed by an API, and the new ‘Rename Folder’ operation recursively renames a folder and its contents as a metadata-only operation. This ensures the process is fast and atomic, improving performance and consistency for folder-related operations compared to existing buckets.
In addition, Richard Seroter, chief evangelist at Google Cloud, tweeted:
.. create a more functional "tree" of objects. This improves how you interact with "folders," improves performance, and more.
Left: Cloud Storage bucket with a flat hierarchy and simulated folders. Right: Bucket with hierarchical namespace organized into a tree-like structure (Source: Google Cloud blog post)
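The difference between the two layouts can be illustrated with a toy, in-memory model. The sketch below is not Google's implementation and does not use the Cloud Storage API; the class names FlatBucket, HierarchicalBucket, and Folder, and the objects_touched counter, are hypothetical and exist only for illustration. It assumes that a flat bucket simulates folders through shared name prefixes, so a folder rename must touch every object, while a hierarchical bucket keeps real folder nodes, so a rename only re-links one node.

```python
from typing import Dict, Iterator, List


class FlatBucket:
    """Toy flat namespace: a 'folder' is only a shared object-name prefix."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.objects_touched = 0  # per-object rewrites caused by renames

    def put(self, name: str, data: bytes) -> None:
        self.objects[name] = data

    def rename_folder(self, src: str, dst: str) -> None:
        # Every object under the prefix gets a new name, one at a time.
        for name in [n for n in self.objects if n.startswith(src)]:
            self.objects[dst + name[len(src):]] = self.objects.pop(name)
            self.objects_touched += 1

    def list(self) -> List[str]:
        return sorted(self.objects)


class Folder:
    """A real folder node holding child folders and objects."""

    def __init__(self) -> None:
        self.folders: Dict[str, "Folder"] = {}
        self.objects: Dict[str, bytes] = {}


class HierarchicalBucket:
    """Toy hierarchical namespace: folders are first-class tree nodes."""

    def __init__(self) -> None:
        self.root = Folder()
        self.objects_touched = 0  # stays 0: renames never rewrite objects

    def _walk(self, parts: List[str], create: bool = False) -> Folder:
        node = self.root
        for part in parts:
            if create:
                node = node.folders.setdefault(part, Folder())
            else:
                node = node.folders[part]
        return node

    def put(self, name: str, data: bytes) -> None:
        *dirs, leaf = name.split("/")
        self._walk(dirs, create=True).objects[leaf] = data

    def rename_folder(self, src: str, dst: str) -> None:
        # Metadata-only: detach one subtree and attach it under a new name.
        *src_parent, src_name = src.strip("/").split("/")
        *dst_parent, dst_name = dst.strip("/").split("/")
        target = self._walk(dst_parent, create=True)
        if dst_name in target.folders:
            raise FileExistsError(dst)
        target.folders[dst_name] = self._walk(src_parent).folders.pop(src_name)

    def list(self) -> List[str]:
        def visit(node: Folder, prefix: str) -> Iterator[str]:
            for leaf in node.objects:
                yield prefix + leaf
            for name, child in node.folders.items():
                yield from visit(child, prefix + name + "/")

        return sorted(visit(self.root, ""))


names = ["raw/2024/a.mp4", "raw/2024/b.mp4", "raw/notes.txt"]
flat, tree = FlatBucket(), HierarchicalBucket()
for n in names:
    flat.put(n, b"")
    tree.put(n, b"")

flat.rename_folder("raw/", "archive/")
tree.rename_folder("raw/", "archive/")
print(flat.list() == tree.list(), flat.objects_touched, tree.objects_touched)
```

Both buckets end up listing the same object names, but the flat model rewrites all three objects while the tree model rewrites none. In this toy model, that single re-link is what makes the hierarchical rename fast and all-or-nothing.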
The introduction of HNS is particularly beneficial for scenarios that require high performance and manageability, such as big data analytics, content management systems, and large-scale application deployments. For example, a media company managing vast libraries of video files can use HNS to organize content by project, date, or type, improving accessibility and processing efficiency.
Users can create new buckets with HNS enabled; because the feature must be turned on when a bucket is created, moving existing data to a hierarchical namespace means transferring it into a new HNS-enabled bucket. Google Cloud provides documentation and tools to facilitate this transition. Users can enable HNS through the Google Cloud Console, command-line interface, or API, offering flexibility in managing their storage resources.
Patrick Haggerty, a director of Google Cloud learning at ROI Training, listed the pros and cons of the HNS feature in Google Cloud Storage in a LinkedIn post:
Pros:
- If you rename a folder, it doesn't have to move or rewrite files anymore.
- New API operations to manipulate folders.
- Faster initial QPS (x8) on read/write operations.
- Works with the managed folder thing for folder permissions.
Cons:
- Must be enabled when the bucket is created.
- No support for versioning, locks, retention, or file-level ACLs.
- Extra charge for the feature (pricing not announced).
Other hyperscalers such as Microsoft and AWS offer comparable hierarchical namespace features in their storage services. For instance, in Azure Data Lake Storage Gen2, HNS organizes objects/files within an account into a hierarchy of directories and nested subdirectories. Meanwhile, in Amazon S3, directory buckets organize data hierarchically into directories instead of the flat storage structure of general-purpose buckets.