Deep Learning Content on InfoQ

News

AI, ML & Data Engineering

Vesuvius Challenge Winners Use AI to Read Ancient Scroll

The Vesuvius Challenge recently announced the winners of their 2023 Grand Prize. The winning team used an ensemble of AI models to read text from a scroll of papyrus that was buried in volcanic ash nearly 2,000 years ago.

Anthony Alford
on Mar 19, 2024

AI, ML & Data Engineering

OpenAI Releases Transformer Debugger tool

OpenAI has unveiled a new tool called the Transformer Debugger (TDB), designed to provide insights into the inner workings of transformer models. The tool was developed by OpenAI's Superalignment team and combines automated interpretability techniques with sparse autoencoders.

Andrew Hoblitzell
on Mar 18, 2024

AI, ML & Data Engineering

RWKV Project Open-Sources LLM Eagle 7B

The RWKV Project recently open-sourced Eagle 7B, a 7.52B parameter large language model (LLM). Eagle 7B is trained on 1.1 trillion tokens of text in over 100 languages and outperforms other similarly-sized models on multilingual benchmarks.

Anthony Alford
on Mar 12, 2024

AI, ML & Data Engineering

Amazon Announces One Billion Parameter Speech Model BASE TTS

Amazon Science recently published their work on Big Adaptive Streamable TTS with Emergent abilities (BASE TTS). BASE TTS supports voice-cloning and outperforms baseline TTS models when evaluated by human judges. Further, Amazon's experiments show that scaling model and data size improves the subjective quality of the model's output.

Anthony Alford
on Mar 05, 2024

AI, ML & Data Engineering

Google Announces 200M Parameter AI Forecasting Model TimesFM

Google Research announced TimesFM, a 200M parameter Transformer-based foundation model for time-series forecasting. TimesFM is trained on nearly 100B data points and has zero-shot forecasting performance comparable to or better than supervised-learning models.

Anthony Alford
on Feb 27, 2024

AI, ML & Data Engineering

Google Renames Bard to Gemini

Google announced that their Bard chatbot will now be called Gemini. The company also announced the launch of Gemini Advanced, the largest version of their Gemini language model, along with two new mobile apps for interacting with the model.

Anthony Alford
on Feb 20, 2024

AI, ML & Data Engineering

MIT Researchers Use Explainable AI Model to Discover New Antibiotics

Researchers from MIT's Collins lab used an explainable deep-learning model to discover chemical compounds that could fight MRSA bacteria. The model uses graph algorithms to identify chemical compounds that are likely to have antibiotic properties. Additional models predict whether the chemicals would be harmful to humans.

Anthony Alford
on Feb 13, 2024

AI, ML & Data Engineering

OpenAI Releases New Embedding Models and Improved GPT-4 Turbo

OpenAI recently announced the release of several updates to their models, including two new embedding models and updates to GPT-4 Turbo and GPT-3.5 Turbo. The company also announced improvements to their free text moderation tool and to their developer API management tools.

Anthony Alford
on Feb 06, 2024

AI, ML & Data Engineering

Stability AI Releases 1.6 Billion Parameter Language Model Stable LM 2

Stability AI released two sets of pre-trained model weights for Stable LM 2, a 1.6B parameter language model. Stable LM 2 is trained on 2 trillion tokens of text data from seven languages and can be run on common laptop computers.

Anthony Alford
on Jan 30, 2024

AI, ML & Data Engineering

Hugging Face and Google Cloud Announce Collaboration

Hugging Face and Google Cloud have announced a strategic partnership to advance machine learning and open research in AI. The partnership has three main focuses: Google Cloud customers, Hugging Face Hub users, and open source. Google wants to make cutting-edge AI discoveries available through Hugging Face's open-source frameworks.

Daniel Dominguez
on Jan 30, 2024

AI, ML & Data Engineering

Mistral AI's Open-Source Mixtral 8x7B Outperforms GPT-3.5

Mistral AI recently released Mixtral 8x7B, a sparse mixture of experts (SMoE) large language model (LLM). The model contains 46.7B total parameters, but performs inference at the same speed and cost as models one-third that size. On several LLM benchmarks, it outperformed both Llama 2 70B and GPT-3.5, the model powering ChatGPT.

Anthony Alford
on Jan 23, 2024

AI, ML & Data Engineering

Google Announces Video Generation LLM VideoPoet

Google Research recently published their work on VideoPoet, a large language model (LLM) that can generate video. VideoPoet was trained on 2 trillion tokens of text, audio, image, and video data, and in evaluations by human judges its output was preferred over that of other models.

Anthony Alford
on Jan 16, 2024

AI, ML & Data Engineering

Waymo Publishes Report Showing Lower Crash Rates Than Human Drivers

Alphabet's autonomous taxi company Waymo recently published a report showing its autonomous driver software outperforms human drivers on several benchmarks. The analysis covers over seven million miles of driving with no human behind the wheel, in which Waymo cars had an 85% reduction in crashes involving an injury.

Anthony Alford
on Jan 02, 2024

Java

Stable Diffusion in Java (SD4J) Enables Generating Images with Deep Learning

Stable Diffusion in Java (SD4J) is a modified port of the Stable Diffusion C# implementation with support for negative text inputs. Stable Diffusion is a deep-learning text-to-image model based on diffusion. SD4J can be used to generate images, either via the GUI or programmatically in Java applications.

Johan Janssen
on Dec 29, 2023

AI, ML & Data Engineering

OpenAI Publishes GPT Prompt Engineering Guide

OpenAI recently published a guide to Prompt Engineering. The guide lists six strategies for eliciting better responses from their GPT models, with a particular focus on examples for their latest version, GPT-4.

Anthony Alford
on Dec 26, 2023
