
Microsoft Introduces Serverless GPUs on Azure Container Apps in Public Preview

At the recent Microsoft Ignite conference, the company announced the public preview of Azure Container Apps with serverless GPUs powered by NVIDIA. This feature allows customers to utilize NVIDIA A100 GPUs and NVIDIA T4 GPUs in a serverless environment, providing scaling and flexibility for real-time custom model inferencing and other machine-learning tasks.

Azure Container Apps is a fully managed serverless container service that lets developers deploy, run, and scale containerized applications without managing infrastructure. With serverless GPUs, they can run GPU-powered applications with scale-to-zero capabilities: resources scale dynamically with demand, reducing idle costs. In addition, they benefit from per-second billing for GPU usage, data governance that keeps information within container boundaries, a choice of NVIDIA A100 and T4 GPUs, and a managed serverless platform for deploying their own AI models.
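To illustrate, a deployment might look like the following Azure CLI sketch. The resource names and the workload-profile type strings are illustrative assumptions and should be checked against the current Azure Container Apps documentation, as the exact profile names and supported regions may change during the preview:

```shell
# Create a Container Apps environment with workload profiles enabled
# (names, region, and image are placeholders)
az containerapp env create \
  --name my-gpu-env \
  --resource-group my-rg \
  --location westus3 \
  --enable-workload-profiles

# Add a serverless GPU workload profile; the profile type string
# (e.g. an A100-based consumption profile) is an assumption to verify
az containerapp env workload-profile add \
  --name my-gpu-env \
  --resource-group my-rg \
  --workload-profile-name gpu-profile \
  --workload-profile-type Consumption-GPU-NC24-A100

# Deploy a container app on that profile with scale-to-zero
az containerapp create \
  --name my-inference-app \
  --resource-group my-rg \
  --environment my-gpu-env \
  --image myregistry.azurecr.io/inference:latest \
  --workload-profile-name gpu-profile \
  --min-replicas 0 \
  --max-replicas 3
```

With `--min-replicas 0`, the app scales to zero when idle, so GPU charges accrue only while requests are being served.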

According to the company, Azure’s serverless GPUs excel in use cases like real-time AI inferencing, machine learning model deployments, and high-performance computing tasks. The platform ensures smooth integration into existing Azure workflows.
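For the real-time inferencing case, the container image itself is an ordinary CUDA-enabled image; a minimal sketch might look like the Dockerfile below. The base-image tag, `app.py`, and the serving setup (a hypothetical FastAPI/uvicorn app on port 8000) are illustrative assumptions, not a prescribed layout:

```dockerfile
# Illustrative sketch of a GPU inference image; base-image tag is an assumption
FROM nvcr.io/nvidia/pytorch:24.10-py3

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .
# Port the container app's HTTP ingress would target
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```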

(Source: Azure Container Apps blog post)

During an Ignite session on Azure Functions Flex Consumption and GPUs, Simon Jakesch, principal product manager for Azure Container Apps at Microsoft, said:

Anyone who has used serverless or in combination with Azure Container Apps has found it to be extremely powerful. This technology brings the same power to GPU use, making GPUs easily accessible.

Microsoft is not the sole provider of GPU capabilities for accelerating workloads such as real-time AI inferencing and machine learning model deployments. Alternatives include Modal, RunPod, Replicate, Baseten, Koyeb, and Fal. Furthermore, Google Cloud Run supports NVIDIA L4 GPUs for real-time AI inferencing.

Lars Wurm, a platform leader in core infrastructure at Inter IKEA, posted on LinkedIn:

With the introduction of serverless GPUs using Azure Container Apps, several new workloads and usage scenarios are enabled, shaping the offering into a one-stop shop for container workloads. This is particularly beneficial when workloads do not rely on committed ACA instances.

And in an NVIDIA corporate blog post, Dave Salvator wrote:

Serverless GPUs allow development teams to focus more on innovation and less on infrastructure management. With per-second billing and scale-to-zero capabilities, customers pay only for the compute they use, helping ensure resource utilization is both economical and efficient. NVIDIA is also working with Microsoft to bring NVIDIA NIM microservices to serverless NVIDIA GPUs in Azure to optimize AI model performance.

Serverless GPUs are available in a select set of Azure regions during the public preview phase. More information is available directly on Azure's platform in documentation, tutorials, and pricing details.
