One year after its 1.0 release, Podman Desktop announced the Podman AI Lab plugin, which promises to help developers start working with large language models (LLMs) on their machines. Podman AI Lab streamlines LLM workflows, featuring generative AI exploration, a built-in recipe catalogue, curated models, local model serving, an OpenAI-compatible API, code snippets, and playground environments.
The plugin intends "to democratize" gen AI for application developers and to close the gap between "it works on my machine" and running it in production on hybrid clouds. Currently, the supported ecosystems are Kubernetes and Red Hat OpenShift.
Developers can install it from the extensions catalogue. It is available for Podman Desktop 1.10 or later.
To allow developers to build, test, and run gen AI-powered applications, the plugin promises to offer the "ingredients" to get you to your first gen AI "Hello World":
Educational applications (recipes catalogue): under the name of "recipes", the plugin provides sample applications that allow developers to discover and learn best practices for using gen AI in their applications. The recipes catalogue is a public repository to which you can contribute by submitting PRs.
Catalogue of curated models: an out-of-the-box list of ready-to-use open-source models. The plugin promises that the available models have been checked to ensure adherence to legal requirements (usually the Apache 2.0 or MIT open-source license). You can also import your own model files in GGUF (GPT-Generated Unified Format).
Local model serving: the plugin can generate code snippets for instant integration into developers' applications. To ease the transition between "online" and "local" models, it provides an OpenAI-compatible API (see the first sketch after this list). The plugin creates the inference server needed to interact with the model (based on llama.cpp) and warns the user when there are not enough resources to run the model locally. You can inspect the inference server like any other pod in the pod view: see the details and terminal output of each container and, if needed, SSH directly into them.
Playground environments: for testing or fine-tuning models. Whenever an application runs locally, Podman runs an inference server for its model in a container, and it displays all the already running applications (recipes). When starting a new playground environment, a prompt assists in finding the best model and settings. The playground allows you to select the temperature (which controls the randomness of the model's output, affecting its creativity and predictability), max tokens (which caps the length of the model's response, measured in tokens rather than words), and top-p (which restricts sampling to the most probable tokens). The plugin can also define the general context (instructions and guidelines) applied to each query; the second sketch below shows how these settings map onto an API call.
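To illustrate what the OpenAI-compatible endpoint looks like in practice, here is a minimal sketch in Python using the requests library. The host, port, and model name are placeholders chosen for illustration; Podman AI Lab displays the actual address of the inference server in the service details.

```python
import requests

# Hypothetical endpoint: Podman AI Lab shows the real host/port of the
# llama.cpp-based inference server in the service details of the plugin.
BASE_URL = "http://localhost:35001/v1"

payload = {
    # Placeholder identifier; use the name of the model you are serving.
    "model": "local-model",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

# Standard OpenAI-style chat completions request against the local server.
response = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=120)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```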
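The playground settings correspond directly to standard OpenAI-style sampling parameters. The sketch below, assuming the official openai Python client pointed at the local server (endpoint and model name are again placeholders), shows a query carrying a general-context system prompt together with temperature, max tokens, and top-p.

```python
from openai import OpenAI

# The local server needs no real API key; base_url is a placeholder for the
# address Podman AI Lab assigns to the inference server.
client = OpenAI(base_url="http://localhost:35001/v1", api_key="sk-no-key-needed")

response = client.chat.completions.create(
    model="local-model",  # placeholder name for the locally served model
    messages=[
        # System prompt: the general context (instructions and guidelines).
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain what top-p sampling does."},
    ],
    temperature=0.7,  # randomness: higher is more creative, lower more predictable
    max_tokens=256,   # caps the length of the response, in tokens
    top_p=0.9,        # restricts sampling to the most probable tokens
)
print(response.choices[0].message.content)
```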
To further understand the project's mission and direction, InfoQ sat down with Stevan Le Meur, product manager of the project:
Le Meur: With the rapid growth of Gen AI, AI-infused applications are now becoming the norm. Our mission is to provide application developers with the needed [local] tools to easily and cheaply develop and debug this new breed of application while keeping their data safe.
In our vision, a normal lifecycle of an application should be: start from an existing recipe, and try out different models until you find the one suitable for your use case. Tweak it by setting the needed parameters, or fine-tune it with InstructLab, even without ML experience. When you are happy with your outcome, you can make it "deployment-ready", transitioning from local to production with minimal differences between the two environments.
The new Podman Desktop plugin helps with local experimentation and with migrating LLMs to production smoothly. Further, the roadmap hints that the team will explore areas like GPU acceleration, function calling support, and local Retrieval Augmented Generation (RAG). Given the rapidly changing ecosystem, they encourage you to provide feedback or to contribute to the open-source project.