Artificial Intelligence Content on InfoQ
-
Google Open-Sources AI Fine-Tuning Method Distilling Step-by-Step
A team from the University of Washington and Google Research recently open-sourced Distilling Step-by-Step, a technique for fine-tuning smaller language models. Distilling Step-by-Step requires less training data than standard fine-tuning and results in smaller models that can outperform few-shot prompted large language models (LLMs) that have 700x the parameters.
-
Nvidia Introduces Eureka, an AI Agent Powered by GPT-4 That Can Train Robots
Nvidia Research revealed that it has created a brand-new AI agent named Eureka that is driven by OpenAI's GPT-4 and is capable of teaching robots sophisticated abilities on its own.
-
AWS Announces the Preview of Amazon CodeWhisperer Customization Capability
Amazon Web Services has announced the preview of Amazon CodeWhisperer Customization Capability. This new functionality empowers users to fine-tune CodeWhisperer, enabling it to provide more precise suggestions by incorporating an organization's proprietary APIs, internal libraries, classes, methods, and industry best practices.
-
Google DeepMind Announces LLM-Based Robot Controller RT-2
Google DeepMind recently announced Robotics Transformer 2 (RT-2), a vision-language-action (VLA) AI model for controlling robots. RT-2 uses a fine-tuned LLM to output motion control commands. It can perform tasks not explicitly included in its training data and improves on baseline models by up to 3x on emergent skill evaluations.
-
Google Cloud Ops Agent Can Now Monitor Nvidia GPUs
Google Cloud announced that Ops Agent, the agent for collecting telemetry from Compute Engine instances, can now collect and aggregate metrics from NVIDIA GPUs on VMs.
-
GitHub Copilot Chat in Open Beta: Now Available for All Individuals in Visual Studio and VS Code
GitHub Copilot Chat is a chat interface that allows developers to ask and receive answers to coding-related questions directly within a supported IDE. It is currently in open beta and available for all GitHub Copilot individual users across Visual Studio and VS Code.
-
PlanetScale's Challenge to Oracle: Forking MySQL and Introducing Vector Search
PlanetScale recently announced its intention to fork MySQL and add vector search. While PostgreSQL has been the default open-source choice for vector search, the company behind the Vitess database wants to release a version of MySQL and PlanetScale with vector support.
-
Stability AI Releases Generative Audio Model Stable Audio
Harmonai, the audio research lab of Stability AI, has released Stable Audio, a diffusion model for text-controlled audio generation. Stable Audio is trained on 19,500 hours of audio data and can generate 44.1 kHz quality audio in real time using a single NVIDIA A100 GPU.
-
A Modern Compute Stack for Scaling Large AI, ML, & LLM Workloads at QCon SF
Jules Damji, a lead developer advocate at Anyscale Inc., discussed the difficulties data scientists encounter when managing infrastructure for machine learning models. He emphasized the necessity for a framework that supports the latest machine learning libraries, is easily manageable, and can scale to accommodate large datasets and models. Damji introduced Ray as a potential solution.
-
Defensible Moats: Unlocking Enterprise Value with Large Language Models at QCon San Francisco
In a recent presentation at QCon San Francisco, Nischal HP discussed the challenges enterprises face when building LLM-powered applications using APIs alone. These challenges include data fragmentation, the absence of a shared business vocabulary, data privacy concerns, and diverse objectives among stakeholders.
-
The Challenges of Producing Quality Code When Using AI-Based Generalistic Models
Using generalistic AI models for very specific tasks such as generating code can cause problems. Producing code with AI is like using code written by someone you don't know: it may not match your standards or quality requirements. Creating specialised or dedicated models can be a way out.
-
Practical Advice for Retrieval Augmented Generation (RAG), by Sam Partee at QCon San Francisco
At the recent QCon San Francisco conference, Sam Partee, principal engineer at Redis, gave a talk about Retrieval Augmented Generation (RAG). He discussed Generative Search, which combines large language models (LLMs) with vector databases to improve information retrieval. Partee also covered several techniques, such as Hypothetical Document Embeddings (HyDE) and semantic caching.
-
Generative AI: Shaping a New Future for Fraud Prevention, by Neha Narkhede at QCon San Francisco
At the recent QCon San Francisco conference, Neha Narkhede gave a keynote on how generative AI can help improve the state of the art in fraud prevention. She discussed the "knowledge fabric", which is able to capture all information and knowledge on current fraud methods. She also introduced six foundational pillars of AI Risk Decisioning.
-
GitHub's Learnings from Building Copilot, an Enterprise LLM Application
GitHub has published an article on the lessons it learned in building and scaling GitHub Copilot, an enterprise application using a large language model (LLM). In a post on GitHub's blog, AI product leader Shuyin Zhao describes how, over three years, the team broke the project down into three stages ("find it", "nail it", and "scale it") and successfully launched GitHub Copilot.
-
OpenAI Announces ChatGPT Voice and Image Features
OpenAI recently announced new voice and image features for ChatGPT. A new backend model, GPT-4V, will handle image inputs, and an updated DALL-E model will be integrated to generate images. In addition, users of the mobile ChatGPT app will be able to hold voice conversations with the chatbot.