Neural Networks Content on InfoQ

News

AI, ML & Data Engineering

Google Announces Video Generation LLM VideoPoet

Google Research recently published their work on VideoPoet, a large language model (LLM) that can generate video. VideoPoet was trained on 2 trillion tokens of text, audio, image, and video data, and in evaluations by human judges its output was preferred over that of other models.

Anthony Alford
on Jan 16, 2024
AI, ML & Data Engineering

OpenAI Publishes GPT Prompt Engineering Guide

OpenAI recently published a guide to Prompt Engineering. The guide lists six strategies for eliciting better responses from their GPT models, with a particular focus on examples for their latest version, GPT-4.

Anthony Alford
on Dec 26, 2023
AI, ML & Data Engineering

Microsoft Announces Small Language Model Phi-2

Microsoft Research announced Phi-2, a 2.7 billion-parameter Transformer-based language model. Phi-2 is trained on 1.4T tokens of synthetic data generated by GPT-3.5 and outperforms larger models on a variety of benchmarks.

Anthony Alford
on Dec 19, 2023
AI, ML & Data Engineering

Microsoft's Orca 2 LLM Outperforms Models That Are 10x Larger

Microsoft Research released its Orca 2 LLM, a fine-tuned version of Llama 2 that performs as well as or better than models that contain 10x the number of parameters. Orca 2 uses a synthetic training dataset and a new technique called Prompt Erasure to achieve this performance.

Anthony Alford
on Dec 12, 2023
AI, ML & Data Engineering

Stability AI Open-Sources Video Generation Model Stable Video Diffusion

Stability AI released the code and model weights for Stable Video Diffusion (SVD), a video generation AI model. When given an input image as context, the model can generate 25 video frames at a resolution of 576x1024 pixels.

Anthony Alford
on Dec 05, 2023
AI, ML & Data Engineering

Meta Announces Generative AI Models Emu Video and Emu Edit

Meta AI Research announced two new generative AI models: Emu Video, which can generate short videos given a text prompt, and Emu Edit, which can edit images given text-based instructions. Both models are based on Meta's Emu foundation model and exhibit state-of-the-art performance on several benchmarks.

Anthony Alford
on Nov 28, 2023
AI, ML & Data Engineering

Spotify Open-Sources Voyager Nearest-Neighbor Search Library

Spotify Engineering recently open-sourced Voyager, an approximate nearest-neighbor (ANN) search library. Voyager is based on the hierarchical navigable small worlds (HNSW) algorithm and is 10 times faster than Spotify's previous ANN library, Annoy.

Anthony Alford
on Nov 21, 2023
AI, ML & Data Engineering

Stability AI Releases Generative Audio Model Stable Audio

Harmonai, the audio research lab of Stability AI, has released Stable Audio, a diffusion model for text-controlled audio generation. Stable Audio is trained on 19,500 hours of audio data and can generate 44.1 kHz audio in real time on a single NVIDIA A100 GPU.

Anthony Alford
on Oct 10, 2023
AI, ML & Data Engineering

Meta's Voicebox Outperforms State-of-the-Art Models on Speech Synthesis

Meta recently announced Voicebox, a speech generation model that can perform text-to-speech (TTS) synthesis in six languages, as well as edit and remove noise from speech recordings. Voicebox is trained on over 50k hours of audio data and outperforms previous state-of-the-art models on several TTS benchmarks.

Anthony Alford
on Jul 25, 2023
AI, ML & Data Engineering

Meta's Open-Source Massively Multilingual Speech AI Handles over 1,100 Languages

Meta AI open-sourced the Massively Multilingual Speech (MMS) model, which supports automatic speech recognition (ASR) and text-to-speech synthesis (TTS) in over 1,100 languages and language identification (LID) in over 4,000 languages. MMS can outperform existing models and covers nearly 10x the number of languages.

Anthony Alford
on Jun 13, 2023
AI, ML & Data Engineering

Meta Open-Sources Computer Vision Foundation Model DINOv2

Meta AI Research open-sourced DINOv2, a foundation model for computer vision (CV) tasks. DINOv2 is pretrained on a curated dataset of 142M images and can be used as a backbone for several tasks, including image classification, video action recognition, semantic segmentation, and depth estimation.

Anthony Alford
on May 23, 2023
AI, ML & Data Engineering

Google's Universal Speech Model Performs Speech Recognition on Hundreds of Languages

Google Research announced Universal Speech Model (USM), a 2B-parameter automatic speech recognition (ASR) model trained on over 12M hours of speech audio. USM can recognize speech in over 100 languages, including low-resource languages, and achieves new state-of-the-art performance on several benchmarks.

Anthony Alford
on May 16, 2023
AI, ML & Data Engineering

Microsoft Open-Sources Weather Forecasting Deep Learning Model ClimaX

Researchers from Microsoft's Autonomous Systems and Robotics Research group have open-sourced ClimaX, a deep learning foundation model for weather and climate modeling. ClimaX can be fine-tuned for a variety of prediction tasks and performs as well as or better than state-of-the-art models on several benchmarks.

Anthony Alford
on Mar 14, 2023
AI, ML & Data Engineering

Stanford Researchers Develop Brain-Computer Interface for Speech Synthesis

Researchers from Stanford University have developed a brain-computer interface (BCI) that synthesizes speech from signals captured in a patient's brain and processed by a recurrent neural network (RNN). The prototype system can decode speech at 62 words per minute, 3.4x faster than previous BCI methods.

Anthony Alford
on Feb 21, 2023
AI, ML & Data Engineering

DeepMind Announces Minecraft-Playing AI DreamerV3

Researchers from DeepMind and the University of Toronto announced DreamerV3, a reinforcement-learning (RL) algorithm for training AI models for many different domains. Using a single set of hyperparameters, DreamerV3 outperforms other methods on several benchmarks and can train an AI to collect diamonds in Minecraft without human instruction.

Anthony Alford
on Jan 31, 2023
