Microsoft recently announced the general availability (GA) of Azure Managed Lustre, a managed file system for high-performance computing (HPC) and AI workloads.
Earlier, the company released a preview of the service as a managed offering so that customers could focus on their business goals, such as predicting weather patterns, researching genomic sequences for diseases, and drug discovery. With the GA release, customers get a managed platform service for consuming Lustre, an open-source parallel file system that began as a research project in 1999 and was designed for high-performance computing. The service provides a distributed parallel file system with hundreds of GBps of storage bandwidth and solid-state disk latency. In addition, it integrates with Azure services like Azure HPC Compute, Azure Kubernetes Service, and Azure Machine Learning.
Wolfgang De Salvador, an EMEA GBB HPC and AI senior specialist at Microsoft, explains in an Azure High-Performance Computing (HPC) blog post:
Azure Managed Lustre delivers all the performance and scalability benefits of Lustre, without the burden of managing the underlying infrastructure. Moreover, it features an integration through Lustre HSM with Azure Blob Storage for data retrieval and archival. This allows HPC/AI workloads to have access on the hot tier to the working datasets, keeping the remaining data in Azure Blob to minimize operational costs.
Azure Managed Lustre is provided in a hosted-on-behalf-of subscription and accessed through a straightforward interface within the customer's virtual network. Customers therefore do not need to worry about the complexities of deploying, managing, and operating the Lustre file system, including metadata servers/targets (MDS/MDT), management servers/targets (MGS/MGT), and object storage servers/targets (OSS/OST).
Service Topology of the Azure Managed Lustre Service (Source: Microsoft Compute Blog Post)
Jurgen Willis, vice president of product management, Azure Storage, explained in the announcement blog post the need for a managed service in Azure allowing customers to leverage Lustre:
Lustre, one of the most popular distributed parallel filesystems in the HPC world, has long been deployed on-premises serving scalable and high throughput storage needs of HPC workloads. As an open-source solution, it has enjoyed a thriving ecosystem of users and developers. With the recent explosive growth seen in generative AI and the need for a high throughput storage that keeps the expensive GPU cores from waiting on data, Lustre has found a renewed growth in its adoption.
The service offers two persistent, durable tiers based on solid-state drives (SSDs), differentiated by the throughput delivered per provisioned tebibyte (TiB) of capacity:
- Azure Managed Lustre File System (AMLFS) Standard – 125 MB/s per TiB
- AMLFS Premium – 250 MB/s per TiB
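Because throughput is quoted per provisioned tebibyte, the aggregate bandwidth of a file system follows from its size. The short Python sketch below illustrates that arithmetic. It assumes throughput scales linearly with provisioned capacity and ignores any minimum or maximum file system sizes and provisioning increments the service may impose; the function and dictionary names are hypothetical and not part of any Azure SDK.

```python
# Throughput per provisioned TiB in MB/s, as listed for the two tiers above.
# The tier keys are illustrative labels, not official SKU names.
THROUGHPUT_MBPS_PER_TIB = {"standard": 125, "premium": 250}


def aggregate_throughput_mbps(tier: str, capacity_tib: float) -> float:
    """Estimate total throughput (MB/s) for a given provisioned capacity.

    Assumes linear scaling with capacity, which is a simplification.
    """
    if capacity_tib <= 0:
        raise ValueError("capacity_tib must be positive")
    return THROUGHPUT_MBPS_PER_TIB[tier] * capacity_tib


def capacity_for_throughput_tib(tier: str, target_mbps: float) -> float:
    """Estimate the capacity (TiB) needed to reach a target throughput."""
    if target_mbps <= 0:
        raise ValueError("target_mbps must be positive")
    return target_mbps / THROUGHPUT_MBPS_PER_TIB[tier]


if __name__ == "__main__":
    # Compare both tiers at the same hypothetical 48 TiB capacity.
    for tier in THROUGHPUT_MBPS_PER_TIB:
        print(tier, aggregate_throughput_mbps(tier, 48), "MB/s")
    # Capacity needed on each tier for a hypothetical 4,000 MB/s target.
    for tier in THROUGHPUT_MBPS_PER_TIB:
        print(tier, capacity_for_throughput_tib(tier, 4000), "TiB")
```

Under these assumptions, a workload that is bound by bandwidth rather than capacity needs half as much provisioned storage on the Premium tier as on Standard to reach the same throughput; whether that is cheaper depends on the per-TiB prices on the pricing page.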
Users of Azure Managed Lustre can download the Lustre client packages from packages.microsoft.com for their desired Linux distribution and kernel version. Additionally, Microsoft supports HPC images prebuilt with Lustre client packages for Ubuntu (18.04, 20.04, and 22.04) and AlmaLinux 8.7.
Lastly, more details on the service are available through the documentation landing page, and pricing and availability details are on the pricing page.