Large Language Models Content on InfoQ
-
Vertex AI in Firebase Aims to Simplify the Creation of Gemini-powered Mobile Apps
Currently available in beta, the Vertex AI SDK for Firebase enables the creation of apps that go beyond simple chat and text prompting. Google has also made available a Colab notebook to walk developers through the steps required to integrate the SDK into their apps.
-
Google Publishes LLM Self-Correction Algorithm SCoRe
Researchers at Google DeepMind recently published a paper on Self-Correction via Reinforcement Learning (SCoRe), a technique for improving LLMs' ability to self-correct when solving math or coding problems. Models fine-tuned with SCoRe achieve improved performance on several benchmarks compared to baseline models.
-
NVIDIA Unveils NVLM 1.0: Open-Source Multimodal LLM with Improved Text and Vision Capabilities
NVIDIA unveiled NVLM 1.0, an open-source multimodal large language model (LLM) that performs strongly on both vision-language and text-only tasks. NVLM 1.0 shows improved performance on text-based tasks after multimodal training, a point where many multimodal models tend to regress. The model weights are now available on Hugging Face, with the training code set to be released shortly.
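Because the weights are published on Hugging Face, they can be pulled with the transformers library. The sketch below assumes the nvidia/NVLM-D-72B repository name and the usual remote-code loading path; the model card has the authoritative instructions, including how to run multimodal inference.

```python
# Hedged sketch: load the published NVLM 1.0 weights from Hugging Face.
# The repository id and loading flags are assumptions; see the model card
# for the exact loading and inference instructions.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "nvidia/NVLM-D-72B"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 72B model needs multiple GPUs or CPU offloading
    device_map="auto",
    trust_remote_code=True,      # NVLM ships custom modeling code with the weights
)
print(model.config.architectures)
```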
-
Hugging Face Upgrades Open LLM Leaderboard v2 for Enhanced AI Model Comparison
Hugging Face has recently released Open LLM Leaderboard v2, an upgraded version of its benchmarking platform for large language models. The leaderboard was created to provide a standardized evaluation setup for reference models, ensuring reproducible and comparable results.
-
PayPal Adds GenAI Support with LLMs to Its Cosmos.AI MLOps Platform
PayPal extended its MLOps platform Cosmos.AI to support the development of generative AI applications using large language models (LLMs). The company incorporated support for vendor-provided, open-source, and self-tuned LLMs and provided capabilities around retrieval-augmented generation (RAG), semantic caching, prompt management, orchestration, and AI application hosting.
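The article does not detail Cosmos.AI internals, but the semantic-caching capability it mentions can be sketched generically: embed each prompt, and when a new prompt is close enough to one already answered, return the cached response instead of calling the LLM. The snippet below is a minimal illustration with an assumed similarity threshold and embedding model, not PayPal's implementation.

```python
# Minimal semantic-cache sketch (not PayPal's implementation): reuse an LLM
# response when a new prompt is semantically close to one answered before.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
cache: list[tuple[np.ndarray, str]] = []            # (prompt embedding, cached response)
THRESHOLD = 0.9                                      # assumed cosine-similarity cutoff

def cached_completion(prompt: str, call_llm) -> str:
    vec = embedder.encode(prompt, normalize_embeddings=True)
    for cached_vec, cached_response in cache:
        if float(np.dot(vec, cached_vec)) >= THRESHOLD:  # cosine similarity on unit vectors
            return cached_response                       # cache hit: skip the LLM call
    response = call_llm(prompt)                          # cache miss: call the model
    cache.append((vec, response))
    return response

# Usage: cached_completion("What is my account balance?", call_llm=my_llm_client)
```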
-
University of Chinese Academy of Sciences Open-Sources Multimodal LLM LLaMA-Omni
Researchers at the University of Chinese Academy of Sciences (UCAS) recently open-sourced LLaMA-Omni, an LLM that can operate on both speech and text data. LLaMA-Omni is based on Meta's Llama-3.1-8B-Instruct LLM and outperforms similar baseline models while requiring less training data and compute.
-
Meta Unveils Movie Gen, a New AI Model for Video Generation
Meta has announced Movie Gen, a new AI model designed to create high-quality 1080p videos with synchronized audio. The system enables instruction-based video editing and allows for personalized content generation using user-supplied images.
-
Intuit Engineering's Approach to Simplifying Kubernetes Management with AI
Intuit recently described how its engineers tackled the complexities of monitoring and debugging Kubernetes clusters using generative AI (GenAI). The GenAI experiments were aimed at streamlining detection, debugging, and remediation.
-
Anthropic Unveils Contextual Retrieval for Enhanced AI Data Handling
Anthropic has announced Contextual Retrieval, a technique for improving how AI systems retrieve information from large knowledge bases. It addresses the loss of context in Retrieval-Augmented Generation (RAG) systems by enriching text chunks with contextual information before they are embedded or indexed.
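A minimal sketch of the idea, assuming a small Claude model and a paraphrased prompt rather than Anthropic's exact recipe: ask the model to describe how each chunk fits into its source document, then prepend that description before embedding or indexing.

```python
# Hedged sketch of contextual retrieval: generate a short description of how a
# chunk fits into its document, then prepend it before embedding/indexing.
# Prompt wording and model choice are assumptions, not Anthropic's exact recipe.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def contextualize_chunk(document: str, chunk: str) -> str:
    prompt = (
        f"<document>\n{document}\n</document>\n"
        f"Here is a chunk from the document:\n<chunk>\n{chunk}\n</chunk>\n"
        "Give a short context situating this chunk within the overall document, "
        "to improve search retrieval of the chunk. Answer with the context only."
    )
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=150,
        messages=[{"role": "user", "content": prompt}],
    )
    context = response.content[0].text
    return f"{context}\n\n{chunk}"  # embed/index this enriched text instead of the raw chunk
```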
-
Uber Creates GenAI Gateway Mirroring OpenAI API to Support over 60 LLM Use Cases
Uber created a unified platform for serving large language models (LLMs), both from external vendors and self-hosted, and opted to mirror the OpenAI API to ease internal adoption. GenAI Gateway provides a consistent and efficient interface and serves over 60 distinct LLM use cases across many business areas.
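Mirroring the OpenAI API means existing client code mostly just needs a different base URL. The sketch below uses the standard openai Python SDK with a hypothetical internal gateway endpoint and model name; it illustrates the design choice rather than Uber's actual configuration.

```python
# Hedged sketch: because the gateway mirrors the OpenAI API, existing OpenAI client
# code only needs to point at a different base URL. The URL, token, and model name
# below are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://genai-gateway.internal.example.com/v1",  # hypothetical internal gateway
    api_key="internal-service-token",                          # whatever auth the gateway expects
)

response = client.chat.completions.create(
    model="llama-3-70b-instruct",  # the gateway routes the name to a vendor or self-hosted model
    messages=[{"role": "user", "content": "Summarize this trip feedback: ..."}],
)
print(response.choices[0].message.content)
```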
-
HelixML Announces Helix 1.0 Release
HelixML has announced that its Helix platform for generative AI is production-ready at version 1.0. Described as a "Private GenAI Stack," the platform provides an interface layer and applications that can connect to a variety of LLMs. It can be used to prototype apps, starting on a laptop, with all components version-controlled to ease subsequent deployment and scaling.
-
Leveraging the Transformer Architecture for Music Recommendation on YouTube
Google has described an approach to music recommendation that uses transformer models, the architecture that ignited the current generative AI boom. The approach, currently being applied experimentally on YouTube, aims to build a recommender that understands sequences of user actions while listening to music and can better predict user preferences based on context.
-
Alibaba Releases Two Open-Weight Language Models for Math and Voice Chat
Alibaba released two open-weight language model families: Qwen2-Math, a series of LLMs tuned for solving mathematical problems; and Qwen2-Audio, a family of multimodal LLMs that can accept voice or text input. Both families are based on Alibaba's Qwen2 LLM series, and all but the largest version of Qwen2-Math are available under the Apache 2.0 license.
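A minimal sketch of querying one of the Apache-2.0 Qwen2-Math checkpoints with the transformers library; the repository id is assumed, so check the Qwen model cards for the exact names.

```python
# Hedged sketch: query a Qwen2-Math instruct checkpoint from Hugging Face.
# The repository id is an assumption taken from the Qwen2-Math family naming.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-Math-7B-Instruct"  # assumed repo id (one of the Apache 2.0 sizes)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Solve for x: 2x + 6 = 20. Show your steps."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```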
-
Grok-2 Beta Version Released on X Platform
The Grok-2 language model has been released in beta on the X platform, alongside the smaller Grok-2 mini. Tested under the designation "sus-column-r" on the LMSYS leaderboard, the model achieved a higher Elo score than Claude 3.5 Sonnet and GPT-4-Turbo. Grok-2 mini is designed to offer a balance between speed and performance.
-
Microsoft Launches Open-Source Phi-3.5 Models for Advanced AI Development
Microsoft launched three new open-source AI models in its Phi-3.5 series: Phi-3.5-mini-instruct, Phi-3.5-MoE-instruct, and Phi-3.5-vision-instruct. Available under a permissive MIT license, these models offer developers powerful tools for various tasks, including reasoning, multilingual processing, and image and video analysis.
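Since the models are published on Hugging Face under MIT, a quick way to try Phi-3.5-mini-instruct is via the transformers pipeline. The loading flags below may vary with the transformers version, so treat this as a sketch rather than official usage.

```python
# Hedged sketch: run Phi-3.5-mini-instruct locally via the transformers pipeline.
# The repository id comes from the announcement; flags may differ by transformers version.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/Phi-3.5-mini-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # older transformers versions need the bundled modeling code
)

messages = [{"role": "user", "content": "Explain mixture-of-experts models in two sentences."}]
# Recent transformers versions accept chat messages and return the full conversation;
# the last message is the assistant's reply.
print(generator(messages, max_new_tokens=128)[0]["generated_text"][-1]["content"])
```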