Deep Learning Content on InfoQ
-
Google Announces 200M Parameter AI Forecasting Model TimesFM
Google Research announced TimesFM, a 200M parameter Transformer-based foundation model for time-series forecasting. TimesFM is trained on nearly 100B data points and has zero-shot forecasting performance comparable to or better than supervised-learning models.
-
Google Renames Bard to Gemini
Google announced that their Bard chatbot will now be called Gemini. The company also announced the launch of Gemini Advanced, a paid tier providing access to Ultra, the largest version of their Gemini language model, along with two new mobile apps for interacting with the model.
-
MIT Researchers Use Explainable AI Model to Discover New Antibiotics
Researchers from MIT's Collins lab used an explainable deep-learning model to discover chemical compounds that could fight MRSA bacteria. The model uses graph algorithms to identify compounds likely to have antibiotic properties, while additional models predict whether those compounds would be harmful to humans.
-
OpenAI Releases New Embedding Models and Improved GPT-4 Turbo
OpenAI recently announced several model updates, including two new embedding models and improved versions of GPT-4 Turbo and GPT-3.5 Turbo. The company also announced improvements to their free text-moderation tool and to their developer API management tools.
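The two new embedding models are text-embedding-3-small and text-embedding-3-large. A minimal sketch of calling them with the OpenAI Python SDK (assuming an OPENAI_API_KEY environment variable is set):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Embed a short text with the smaller of the two new models.
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="InfoQ covers deep learning news.",
)
print(len(response.data[0].embedding))  # 1536 dimensions by default

# The new models accept a dimensions parameter to shorten the vectors,
# trading some accuracy for lower storage and lookup cost.
short = client.embeddings.create(
    model="text-embedding-3-large",
    input="InfoQ covers deep learning news.",
    dimensions=256,
)
print(len(short.data[0].embedding))  # 256
```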
-
Stability AI Releases 1.6 Billion Parameter Language Model Stable LM 2
Stability AI released two sets of pre-trained model weights for Stable LM 2, a 1.6B-parameter language model. Stable LM 2 is trained on 2 trillion tokens of text data in seven languages and can run on common laptop computers.
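A minimal sketch of running the model locally with Hugging Face transformers; the checkpoint id and the trust_remote_code flag below are assumptions based on Stability AI's Hub releases:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id for the 1.6B base model on the Hugging Face Hub.
model_id = "stabilityai/stablelm-2-1_6b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("The future of small language models is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```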
-
Hugging Face and Google Cloud Announce Collaboration
Hugging Face and Google Cloud have announced a strategic partnership to advance machine learning and open AI research. The collaboration focuses on three areas: Google Cloud customers, Hugging Face Hub users, and open-source development. Google aims to make cutting-edge AI advances available through Hugging Face's open-source frameworks.
-
Mistral AI's Open-Source Mixtral 8x7B Outperforms GPT-3.5
Mistral AI recently released Mixtral 8x7B, a sparse mixture of experts (SMoE) large language model (LLM). The model contains 46.7B total parameters, but performs inference at the same speed and cost as models one-third that size. On several LLM benchmarks, it outperformed both Llama 2 70B and GPT-3.5, the model powering ChatGPT.
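The trick behind that efficiency is sparse routing: each token is processed by only two of eight expert feed-forward networks per layer, so only a fraction of the 46.7B parameters is active for any given token. Below is a simplified, illustrative top-2 MoE layer in PyTorch; a sketch of the idea, not Mistral's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Toy sparse mixture-of-experts feed-forward layer with top-2 routing."""

    def __init__(self, dim=512, hidden=2048, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):           # only the chosen experts run
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

print(MoEFeedForward()(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```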
-
Google Announces Video Generation LLM VideoPoet
Google Research recently published their work on VideoPoet, a large language model (LLM) that can generate video. VideoPoet was trained on 2 trillion tokens of text, audio, image, and video data, and in evaluations by human judges its output was preferred over that of other models.
-
Waymo Publishes Report Showing Lower Crash Rates Than Human Drivers
Alphabet's autonomous taxi company Waymo recently published a report showing that its autonomous driving software outperforms human drivers on several benchmarks. The analysis covers over seven million miles of driving with no human behind the wheel, with Waymo vehicles showing an 85% reduction in crashes involving an injury.
-
Stable Diffusion in Java (SD4J) Enables Generating Images with Deep Learning
Stable Diffusion in Java (SD4J) is a modified port of the Stable Diffusion C# implementation, with support for negative text inputs. Stable Diffusion is a deep-learning text-to-image model based on diffusion techniques. SD4J can be used to generate images via its GUI or programmatically from Java applications.
-
OpenAI Publishes GPT Prompt Engineering Guide
OpenAI recently published a guide to prompt engineering. The guide lists six strategies for eliciting better responses from their GPT models, with a particular focus on examples for their latest version, GPT-4.
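The six strategies are: write clear instructions, provide reference text, split complex tasks into simpler subtasks, give the model time to "think", use external tools, and test changes systematically. A small sketch of one tactic from the first strategy, using delimiters to clearly mark the input, via the OpenAI Python SDK:

```python
from openai import OpenAI

client = OpenAI()
article = "..."  # the text to summarize goes here

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        # Clear instructions in the system message; delimiters around input.
        {"role": "system",
         "content": "Summarize the text delimited by triple quotes in one sentence."},
        {"role": "user", "content": f'"""{article}"""'},
    ],
)
print(response.choices[0].message.content)
```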
-
Microsoft Announces Small Language Model Phi-2
Microsoft Research announced Phi-2, a 2.7-billion-parameter Transformer-based language model. Phi-2 is trained on 1.4T tokens of synthetic data generated by GPT-3.5 and outperforms larger models on a variety of benchmarks.
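A quick way to try the model locally is through Hugging Face transformers, using the microsoft/phi-2 checkpoint Microsoft published on the Hub:

```python
from transformers import pipeline

# trust_remote_code was required when the checkpoint first shipped;
# recent transformers versions support the architecture natively.
generator = pipeline("text-generation", model="microsoft/phi-2",
                     trust_remote_code=True)
result = generator("Explain backpropagation in one sentence:", max_new_tokens=60)
print(result[0]["generated_text"])
```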
-
Apple Open-sources Apple Silicon-Optimized Machine Learning Framework MLX
Apple's MLX combines familiar APIs, composable function transformations, and lazy computation to create a machine learning framework inspired by NumPy and PyTorch that is optimized for Apple Silicon. Implemented in Python and C++, the framework aims to provide a user-friendly and efficient solution to train and deploy machine learning models on Apple Silicon.
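A short sketch of those three ideas, the NumPy-like API, composable function transformations, and lazy evaluation, using mlx.core (requires an Apple Silicon Mac with the mlx package installed):

```python
import mlx.core as mx

def loss(w):
    x = mx.array([1.0, 2.0, 3.0])
    return mx.sum((w * x - 1.0) ** 2)  # NumPy-style array math

w = mx.array([0.5, 0.5, 0.5])
grad_fn = mx.grad(loss)  # composable transformation, analogous to jax.grad
g = grad_fn(w)           # builds the computation graph lazily
mx.eval(g)               # nothing is computed until evaluation is forced
print(g)
```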
-
Microsoft's Orca 2 LLM Outperforms Models That Are 10x Larger
Microsoft Research released its Orca 2 LLM, a fine-tuned version of Llama 2 that performs as well as or better than models that contain 10x the number of parameters. Orca 2 uses a synthetic training dataset and a new technique called Prompt Erasure to achieve this performance.
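Prompt Erasure works roughly like this: the teacher model is given a detailed system prompt spelling out a reasoning strategy, but that prompt is removed from the student's training data, so the student must learn to select the strategy on its own. An illustrative sketch of constructing such a training pair (the prompt strings below are hypothetical, not the actual Orca 2 prompts):

```python
# Hypothetical prompts for illustration only.
DETAILED_PROMPT = ("Solve the problem step by step: restate the question, "
                   "list the known facts, then reason before answering.")
GENERIC_PROMPT = "You are a helpful assistant."

def make_training_example(question: str, teacher_answer: str) -> dict:
    # The teacher saw DETAILED_PROMPT when producing teacher_answer;
    # the student only ever sees GENERIC_PROMPT (the detailed prompt is
    # "erased"), so it must internalize the strategy rather than be told it.
    return {
        "system": GENERIC_PROMPT,
        "user": question,
        "assistant": teacher_answer,
    }
```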
-
Google Launches New Multi-Modal Gemini AI Model
On December 6, 2023, Alphabet released the first phase of its next-generation AI model, Gemini. The effort was driven by Google DeepMind and overseen by CEO Sundar Pichai. Google claims Gemini is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding), one of the most popular benchmarks for testing the performance of language models.