Neural Networks Content on InfoQ
-
BigScience Research Workshop Releases AI Language Model T0
BigScience Research Workshop released T0, a series of natural-language-processing (NLP) AI models specifically trained for researching zero-shot multitask learning. T0 often outperforms models 6x its size on the BIG-bench benchmark and outperforms the 16x-larger GPT-3 on several other NLP benchmarks.
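For readers who want to experiment, the T0 checkpoints are published on the Hugging Face hub and are prompted with plain natural-language task descriptions. The sketch below assumes the "bigscience/T0_3B" checkpoint name and the standard transformers sequence-to-sequence API:

```python
# A minimal zero-shot inference sketch using the Hugging Face transformers
# library; the "bigscience/T0_3B" checkpoint name is assumed here and may
# differ from the exact published identifier.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "bigscience/T0_3B"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# T0 is prompted with a natural-language description of the task.
prompt = "Is this review positive or negative? Review: the food was bland and overpriced."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```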
-
Baidu Announces 11 Billion Parameter Chatbot AI PLATO-XL
Baidu recently announced PLATO-XL, an AI model for dialog generation, which was trained on over a billion samples collected from social media conversations in both English and Chinese. PLATO-XL achieves state-of-the-art performance on several conversational benchmarks, outperforming currently available commercial chatbots.
-
IBM Develops Hardware-Based Vector-Symbolic AI Architecture
IBM Research recently announced a memory-augmented neural network (MANN) AI system consisting of a neural-network controller and phase-change memory (PCM) hardware. By performing analog in-memory computation on high-dimensional (HD) binary vectors, the system learns few-shot classification tasks on the Omniglot benchmark with only a 2.7% accuracy drop compared to 32-bit software implementations.
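The retrieval idea behind such systems can be illustrated with a toy software sketch: class examples are stored as high-dimensional binary vectors, and a query is classified by similarity to the stored keys. The snippet below is a conceptual illustration only and does not reflect IBM's controller design or the PCM hardware:

```python
# Toy key-value memory with high-dimensional bipolar vectors: one stored key
# per class, queries classified by dot-product similarity. Conceptual sketch
# only; dimensions and the noise level are arbitrary choices.
import numpy as np

D = 10_000                        # hyperdimensional vector width
rng = np.random.default_rng(0)

def random_hd_vector():
    """Draw a random dense bipolar (+1/-1) vector."""
    return rng.choice([-1, 1], size=D)

# Few-shot "support set": one stored key per class.
class_keys = {label: random_hd_vector() for label in ["A", "B", "C"]}

def classify(query):
    """Return the class whose stored key is most similar to the query."""
    sims = {label: np.dot(query, key) / D for label, key in class_keys.items()}
    return max(sims, key=sims.get)

# A noisy copy of class "B" (20% of components flipped) still maps back to "B".
noisy = class_keys["B"] * np.where(rng.random(D) < 0.2, -1, 1)
print(classify(noisy))
```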
-
Google's Gated Multi-Layer Perceptron Outperforms Transformers Using Fewer Parameters
Researchers at Google Brain have announced Gated Multi-Layer Perceptron (gMLP), a deep-learning model that contains only basic multi-layer perceptrons. Using fewer parameters, gMLP outperforms Transformer models on natural-language processing (NLP) tasks and achieves comparable accuracy on computer vision (CV) tasks.
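The distinguishing component of gMLP is its Spatial Gating Unit, which mixes information across tokens with a learned linear projection over the sequence dimension instead of self-attention. Below is a minimal PyTorch sketch of the block based on the paper's description; layer names and initialization details are simplified assumptions, not Google's reference implementation:

```python
# Minimal sketch of a gMLP block with a Spatial Gating Unit (SGU).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialGatingUnit(nn.Module):
    def __init__(self, d_ffn, seq_len):
        super().__init__()
        self.norm = nn.LayerNorm(d_ffn // 2)
        # Linear projection across the *sequence* dimension (token mixing).
        self.spatial_proj = nn.Linear(seq_len, seq_len)
        nn.init.zeros_(self.spatial_proj.weight)   # gate starts near identity
        nn.init.ones_(self.spatial_proj.bias)

    def forward(self, x):                      # x: (batch, seq_len, d_ffn)
        u, v = x.chunk(2, dim=-1)              # split channels into two halves
        v = self.norm(v)
        v = self.spatial_proj(v.transpose(1, 2)).transpose(1, 2)
        return u * v                           # element-wise gating

class gMLPBlock(nn.Module):
    def __init__(self, d_model, d_ffn, seq_len):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.proj_in = nn.Linear(d_model, d_ffn)
        self.sgu = SpatialGatingUnit(d_ffn, seq_len)
        self.proj_out = nn.Linear(d_ffn // 2, d_model)

    def forward(self, x):
        shortcut = x
        x = F.gelu(self.proj_in(self.norm(x)))
        x = self.sgu(x)
        return shortcut + self.proj_out(x)
```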
-
Intel Loihi 2 and Lava Framework Aim to Advance Neuromorphic Computing Research
Intel introduced its second-generation neuromorphic chip, Loihi 2, with the aim of providing tools for research in the field of neuromorphic computing. In addition, Intel has released Lava, a software framework for building neuromorphic applications on both conventional and neuromorphic hardware.
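Neuromorphic hardware such as Loihi natively executes spiking neuron models such as the leaky integrate-and-fire (LIF) neuron. The toy simulation below illustrates that model in plain NumPy; it is a conceptual sketch and does not use the Lava framework's API:

```python
# Toy leaky integrate-and-fire (LIF) neuron: the membrane potential leaks,
# integrates input current, and emits a spike when it crosses a threshold.
import numpy as np

def simulate_lif(input_current, threshold=1.0, decay=0.9):
    """Return the spike train produced by a single LIF neuron."""
    v, spikes = 0.0, []
    for i in input_current:
        v = decay * v + i              # leaky integration of the input current
        if v >= threshold:             # fire when the potential crosses threshold
            spikes.append(1)
            v = 0.0                    # reset after a spike
        else:
            spikes.append(0)
    return spikes

rng = np.random.default_rng(0)
print(simulate_lif(rng.uniform(0, 0.5, size=20)))
```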
-
MIT Researchers Open-Source Approximate Matrix Multiplication Algorithm MADDNESS
Researchers at MIT's Computer Science & Artificial Intelligence Lab (CSAIL) have open-sourced Multiply-ADDitioN-lESS (MADDNESS), an algorithm that speeds up machine learning using approximate matrix multiplication (AMM). MADDNESS requires zero multiply-add operations and runs 10x faster than other approximate methods and 100x faster than exact multiplication.
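The general approach can be sketched with a simplified product-quantization example: rows of the input matrix are encoded as prototype indices per subspace, and the multiplication is replaced by table lookups and additions. MADDNESS itself replaces the encoding step with fast learned hash functions, so the snippet below is a conceptual illustration rather than the released algorithm:

```python
# Simplified lookup-table approximate matrix multiplication (AMM) sketch.
import numpy as np

rng = np.random.default_rng(0)
N, D, M, C, K = 1000, 64, 32, 4, 16      # A: N x D, B: D x M, C subspaces, K prototypes
A_train = rng.standard_normal((N, D))
B = rng.standard_normal((D, M))
sub = D // C                              # width of each subspace

# 1) Pick K prototype vectors per subspace (sampled rows here; a k-means fit is
#    typical in product quantization, and MADDNESS learns hash-based encoders).
prototypes = [A_train[rng.choice(N, K, replace=False), c*sub:(c+1)*sub] for c in range(C)]

# 2) Precompute lookup tables of prototype-times-B products, per subspace.
tables = [prototypes[c] @ B[c*sub:(c+1)*sub, :] for c in range(C)]   # each K x M

def approx_matmul(A):
    out = np.zeros((A.shape[0], M))
    for c in range(C):
        block = A[:, c*sub:(c+1)*sub]
        # 3) Encode each row as its nearest prototype index in this subspace.
        dists = ((block[:, None, :] - prototypes[c][None, :, :]) ** 2).sum(-1)
        codes = dists.argmin(axis=1)
        # 4) Replace multiply-adds with table lookups and additions.
        out += tables[c][codes]
    return out

exact = A_train @ B
print("relative error:", np.linalg.norm(approx_matmul(A_train) - exact) / np.linalg.norm(exact))
```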
-
Stanford Research Center Studies Impacts of Popular Pretrained Models
Stanford University recently announced a new research center, the Center for Research on Foundation Models (CRFM), devoted to studying the effects of large pretrained deep networks (e.g., BERT, GPT-3, CLIP) that are now widely used by machine-learning research institutions and startups.
-
Georgia Tech Researchers Create Wireless Brain-Machine Interface
Researchers from the Georgia Institute of Technology's Center for Human-Centric Interfaces and Engineering have created soft scalp electronics (SSE), a wearable wireless electroencephalography (EEG) device for reading human brain signals. By processing the EEG data using a neural network, the system allows users wearing the device to control a video game simply by imagining activity.
-
Facebook Open-Sources Computer Vision Model Multiscale Vision Transformers
Facebook AI Research (FAIR) recently open-sourced Multiscale Vision Transformers (MViT), a deep-learning model for computer vision based on the Transformer architecture. MViT contains several internal resolution-reduction stages and outperforms other Transformer vision models while requiring less compute power, achieving new state-of-the-art accuracy on several benchmarks.
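The key mechanism behind those resolution-reduction stages is pooling attention, in which the token sequence is spatially downsampled before attention so that deeper stages work at lower resolution. Below is a rough PyTorch sketch of the idea; it is a conceptual illustration, not FAIR's released implementation:

```python
# Rough sketch of pooling attention: keys and values are downsampled with
# strided pooling before attention. MViT additionally pools queries between
# stages and widens channels as resolution shrinks, which is omitted here.
import torch
import torch.nn as nn

class PoolingAttention(nn.Module):
    def __init__(self, dim, heads=8, kv_stride=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pool = nn.AvgPool1d(kernel_size=kv_stride, stride=kv_stride)

    def forward(self, x):                                   # x: (batch, tokens, dim)
        kv = self.pool(x.transpose(1, 2)).transpose(1, 2)   # downsample the sequence
        out, _ = self.attn(x, kv, kv)                       # attend to the pooled sequence
        return out
```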
-
PyTorch 1.9 Release Includes Mobile, Scientific Computing, and Distributed Training Updates
The PyTorch team has announced the release of version 1.9 of Facebook's open-source deep-learning framework, which includes improvements for scientific computing, mobile support, and distributed training. Overall, the new release contains more than 3,400 commits since the 1.8 release.
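Among the scientific-computing changes, the release highlights NumPy-style linear-algebra routines in the torch.linalg module. A short sketch of what that API looks like follows; consult the official release notes for the exact feature list:

```python
# A few torch.linalg routines, shown on small random matrices.
import torch

A = torch.randn(4, 4)
b = torch.randn(4)

x = torch.linalg.solve(A, b)             # solve the linear system Ax = b
print(torch.linalg.norm(A @ x - b))      # residual should be close to zero

U, S, Vh = torch.linalg.svd(A)           # singular value decomposition
print(S)

evals = torch.linalg.eigvalsh(A @ A.T)   # eigenvalues of a symmetric matrix
print(evals)
```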
-
OpenAI Announces 12 Billion Parameter Code-Generation AI Codex
OpenAI recently announced Codex, an AI model that generates program code from natural language descriptions. Codex is based on the GPT-3 language model and can solve over 70% of the problems in OpenAI's publicly available HumanEval test dataset, compared to 0% for GPT-3.
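HumanEval problems present a Python function signature and docstring that the model must complete, with the generated body checked against hidden unit tests. The example below is written in that style but is purely hypothetical, not taken from the dataset:

```python
# Hypothetical HumanEval-style problem: the model sees the signature and
# docstring and must generate the function body.
def count_vowels(s: str) -> int:
    """Return the number of vowels (a, e, i, o, u, case-insensitive) in s.

    >>> count_vowels("Hello")
    2
    >>> count_vowels("xyz")
    0
    """
    # --- a model-generated completion might look like this: ---
    return sum(1 for ch in s.lower() if ch in "aeiou")
```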
-
DeepMind Open Sources Data Agnostic Deep Learning Model Perceiver IO
DeepMind has open-sourced Perceiver IO, a general-purpose deep-learning model architecture that can handle many different types of inputs and outputs. Perceiver IO can serve as a "drop-in" replacement for Transformers that performs as well or better than baseline models, but without domain-specific assumptions.
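The architecture follows an encode-process-decode pattern: inputs of any modality are read into a small fixed-size latent array with cross-attention, processed with self-attention, and decoded by cross-attending from an output query array. The PyTorch sketch below illustrates that pattern; dimensions and layer choices are illustrative assumptions, not DeepMind's released code:

```python
# Minimal encode-process-decode sketch in the Perceiver IO style.
import torch
import torch.nn as nn

class PerceiverIOSketch(nn.Module):
    def __init__(self, input_dim, latent_dim=256, num_latents=64, out_dim=128):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, latent_dim))
        self.in_proj = nn.Linear(input_dim, latent_dim)
        self.encode = nn.MultiheadAttention(latent_dim, 8, batch_first=True)
        self.process = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(latent_dim, 8, batch_first=True), num_layers=4)
        self.decode = nn.MultiheadAttention(latent_dim, 8, batch_first=True)
        self.out_proj = nn.Linear(latent_dim, out_dim)

    def forward(self, inputs, output_queries):
        # inputs: (batch, M, input_dim) for any M
        # output_queries: (batch, O, latent_dim), one query per desired output
        x = self.in_proj(inputs)
        lat = self.latents.expand(inputs.size(0), -1, -1)
        lat, _ = self.encode(lat, x, x)                  # latents read the inputs
        lat = self.process(lat)                          # self-attention in latent space
        out, _ = self.decode(output_queries, lat, lat)   # queries read the latents
        return self.out_proj(out)
```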
-
MIT Demonstrates Energy-Efficient Optical Accelerator for Deep-Learning Inference
Researchers at MIT's Quantum Photonics Laboratory have developed the Digital Optical Neural Network (DONN), a prototype deep-learning inference accelerator that uses light to transmit activation and weight data. At the cost of a few percentage points of accuracy, the system can achieve a transmission-energy advantage of up to 1,000x over traditional electronic devices.
-
Google Announces 800M Parameter Vision-Language AI Model ALIGN
Google Research announced the development of A Large-scale ImaGe and Noisy-Text Embedding (ALIGN), an 800M-parameter deep-learning model pre-trained on a noisy dataset of 1.8B image-text pairs. The model can be applied to several downstream tasks and achieves state-of-the-art accuracy on several image-text retrieval benchmarks.
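ALIGN trains separate image and text encoders with a contrastive objective that aligns matching pairs in a shared embedding space. Below is a simplified sketch of that objective, assuming pre-computed embeddings and placeholder hyperparameters rather than Google's training setup:

```python
# Simplified dual-encoder contrastive loss: matched image/text pairs in a
# batch are pulled together, mismatched pairs pushed apart.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    # image_emb, text_emb: (batch, dim), produced by separate encoders
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature          # (batch, batch) similarities
    targets = torch.arange(image_emb.size(0), device=logits.device)  # diagonal = matches
    loss_i2t = F.cross_entropy(logits, targets)              # image -> text retrieval
    loss_t2i = F.cross_entropy(logits.t(), targets)          # text -> image retrieval
    return (loss_i2t + loss_t2i) / 2
```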
-
EleutherAI Open-Sources Six Billion Parameter GPT-3 Clone GPT-J
A team of researchers from EleutherAI has open-sourced GPT-J, a six-billion-parameter natural-language-processing (NLP) AI model based on the GPT-3 architecture. The model was trained on an 800GB open-source text dataset and performs comparably to a GPT-3 model of similar size.
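Because the weights are openly available, the model can be loaded through the Hugging Face transformers library. The sketch below assumes the "EleutherAI/gpt-j-6B" hub identifier; note that the six-billion-parameter weights require substantial memory to load:

```python
# Minimal text-generation sketch; the hub identifier is an assumption and the
# full-precision weights need a large amount of RAM or GPU memory.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "EleutherAI/gpt-j-6B"   # assumed hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "In a shocking finding, scientists discovered"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```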