Deep Learning Content on InfoQ
-
University Researchers Investigate Machine Learning Compute Trends
A team of researchers from the University of Aberdeen, MIT, and several other institutions has released a dataset of historical compute demands for machine learning (ML) models. The dataset contains the training compute requirements of 123 important models, and an analysis shows that the growth trend accelerated significantly around 2010, with compute demands growing exponentially since then.
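For readers curious how such a trend is typically quantified, the sketch below estimates a compute doubling time by log-linear regression; the (year, FLOPs) values are invented for illustration and are not drawn from the released dataset:

```python
import numpy as np

# Hypothetical (year, training FLOPs) pairs -- illustrative values only,
# not taken from the released dataset.
years = np.array([2012, 2014, 2016, 2018, 2020])
flops = np.array([1e17, 1e19, 1e21, 1e23, 1e24])

# Fit log2(FLOPs) as a linear function of time; the slope is doublings
# per year, so its inverse gives the doubling time.
slope, intercept = np.polyfit(years, np.log2(flops), 1)
print(f"doubling time: {12 / slope:.1f} months")
```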
-
AlphaCode: Competitive Code Synthesis with Deep Learning
The AlphaCode study reports promising results for goal-oriented code synthesis using deep sequence-to-sequence models. It builds on previous network architectures and releases a new dataset, CodeContests, to serve as a benchmark for future research.
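A key ingredient reported for AlphaCode is generating many candidate programs and then filtering them by executing the problem's example tests. The sketch below illustrates only that filtering step; the hard-coded candidates stand in for large-scale model sampling, and none of this is DeepMind's released code:

```python
import os
import subprocess
import tempfile

def passes_examples(program: str, examples: list[tuple[str, str]]) -> bool:
    """Run a candidate program against the problem's example I/O pairs."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(program)
        path = f.name
    try:
        for stdin, expected in examples:
            try:
                result = subprocess.run(
                    ["python", path], input=stdin,
                    capture_output=True, text=True, timeout=5,
                )
            except subprocess.TimeoutExpired:
                return False
            if result.stdout.strip() != expected.strip():
                return False
        return True
    finally:
        os.unlink(path)

# Candidates would come from large-scale sampling of the trained model;
# here they are hard-coded stand-ins for illustration.
candidates = ["print(sum(map(int, input().split())))"]
examples = [("1 2", "3"), ("5 7", "12")]
surviving = [p for p in candidates if passes_examples(p, examples)]
print(surviving)
```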
-
Tel-Aviv University Releases Long-Text NLP Benchmark SCROLLS
Researchers from Tel-Aviv University, Meta AI, IBM Research, and the Allen Institute for AI have released Standardized CompaRison Over Long Language Sequences (SCROLLS), a set of natural language processing (NLP) benchmark tasks operating on long text sequences drawn from many domains. Experiments with baseline NLP models show that current models have significant room for improvement.
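Loading the benchmark for experimentation might look like the following; note that the Hugging Face Hub path, config, and field names here are assumptions about the public release rather than details from the paper:

```python
from datasets import load_dataset

# Hub path ("tau/scrolls"), config name ("gov_report"), and field names
# ("input"/"output") are assumptions about the public release.
gov_report = load_dataset("tau/scrolls", "gov_report", split="validation")
sample = gov_report[0]
print(sample.keys())          # long source document plus reference output
print(len(sample["input"]))   # inputs run to tens of thousands of characters
```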
-
Waymo Releases Block-NeRF 3D View Synthesis Deep-Learning Model
Waymo released Block-NeRF, a deep-learning model for synthesizing large-scale 3D world views reconstructed from images collected by its self-driving cars. NeRF models encode a scene's surface and volume representation in the weights of a neural network.
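At the core of any NeRF variant is a neural network queried at 3D coordinates that are first lifted into a higher-dimensional space with a positional encoding so the network can fit high-frequency detail. A minimal sketch of that encoding, following the original NeRF formulation rather than Waymo's released code:

```python
import numpy as np

def positional_encoding(x: np.ndarray, num_freqs: int = 10) -> np.ndarray:
    """Map coordinates to sin/cos features at exponentially increasing
    frequencies, as in the original NeRF paper."""
    feats = [x]
    for i in range(num_freqs):
        feats.append(np.sin(2.0 ** i * np.pi * x))
        feats.append(np.cos(2.0 ** i * np.pi * x))
    return np.concatenate(feats, axis=-1)

# A 3D point becomes a 63-dim feature vector: 3 + 3 * 2 * 10
point = np.array([0.1, -0.4, 0.7])
print(positional_encoding(point).shape)  # (63,)
```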
-
Meta Open-Sources Multi-Modal AI Algorithm Data2vec
Meta AI recently open-sourced data2vec, a unified framework for self-supervised deep learning on images, text, and speech audio data. When evaluated on common benchmarks, models trained using data2vec perform as well as or better than state-of-the-art models trained with modality-specific objectives.
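The central idea described in the paper is that a student network, given a masked view of the input, regresses latent representations produced by a teacher whose weights are an exponential moving average of the student's. The fragment below sketches those two pieces in PyTorch; names and hyperparameters are illustrative, and in the paper the regression targets average the top-K teacher layers:

```python
import torch
import torch.nn.functional as F

def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               tau: float = 0.999) -> None:
    """After each student update, teacher weights track the student as an
    exponential moving average (the teacher is never trained directly)."""
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(tau).add_(s, alpha=1 - tau)

def data2vec_loss(student_out, teacher_targets, mask):
    """The student sees a masked view and regresses the teacher's
    contextualized representations at the masked positions."""
    return F.smooth_l1_loss(student_out[mask], teacher_targets[mask])
```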
-
DeepMind Open-Sources Quantum Chemistry AI Model DM21
Researchers at Google subsidiary DeepMind have open-sourced DM21, a neural network model for mapping electron density to chemical interaction energy, a key component of quantum mechanical simulation. DM21 outperforms traditional models on several benchmarks and is available as an extension to the PySCF simulation framework.
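For context, a baseline PySCF DFT calculation looks like the following; per the announcement, DM21 ships as an extension that takes the place of a hand-designed exchange-correlation functional in this workflow (the exact integration API lives in DeepMind's repository and is not reproduced here):

```python
from pyscf import gto, dft

# Standard PySCF restricted Kohn-Sham DFT calculation on a water molecule.
mol = gto.M(atom="O 0 0 0; H 0 0.757 0.587; H 0 -0.757 0.587",
            basis="cc-pvdz")
mf = dft.RKS(mol)
mf.xc = "b3lyp"   # a conventional functional; DM21 would stand in here
energy = mf.kernel()
print(f"total energy: {energy:.6f} Hartree")
```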
-
Alibaba Open-Sources AutoML Algorithm KNAS
Researchers from Alibaba Group and Peking University have open-sourced Kernel Neural Architecture Search (KNAS), an efficient automated machine learning (AutoML) algorithm that can evaluate proposed architectures without training. KNAS uses a gradient kernel as a proxy for model quality and requires an order of magnitude less compute power than baseline methods.
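A rough sketch of the gradient-kernel idea as described in the paper: candidate architectures are scored by the mean of the Gram matrix of per-example gradients, with no training involved. The exact formulation in the released code may differ:

```python
import torch
import torch.nn as nn

def gradient_kernel_score(model: nn.Module, samples, loss_fn) -> float:
    """Score an untrained architecture by the mean of the Gram matrix of
    per-example gradients, so candidates can be ranked without training."""
    grads = []
    for x, y in samples:
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        g = torch.cat([p.grad.flatten() for p in model.parameters()
                       if p.grad is not None])
        grads.append(g)
    G = torch.stack(grads)          # (n, num_params)
    return (G @ G.T).mean().item()  # mean pairwise gradient inner product

# Toy usage: rank a tiny architecture with a few random samples.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
samples = [(torch.randn(8), torch.randint(0, 2, (1,))[0]) for _ in range(4)]
score = gradient_kernel_score(model, samples, nn.CrossEntropyLoss())
```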
-
LambdaML: Pros and Cons of Serverless for Deep Network Training
A new study entitled "Towards Demystifying Serverless Machine Learning Training" provides an experimental analysis of training deep networks on serverless platforms. Training over FaaS is challenging because of its distributed nature and the aggregation step in the learning algorithms. Results indicate FaaS can be a faster, but not cheaper, alternative to IaaS for lightweight models.
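The aggregation bottleneck the study highlights stems from the fact that stateless functions cannot communicate directly and must exchange parameters through external storage. A minimal sketch of that pattern, with a hypothetical bucket name and key layout (gradients are assumed to be numpy arrays):

```python
import pickle
import boto3

s3 = boto3.client("s3")
BUCKET = "my-training-bucket"   # hypothetical bucket name

def publish_gradient(worker_id: int, step: int, grad) -> None:
    """Each stateless FaaS worker writes its gradient to shared storage."""
    s3.put_object(Bucket=BUCKET, Key=f"grads/{step}/{worker_id}",
                  Body=pickle.dumps(grad))

def aggregate(step: int, num_workers: int):
    """One worker (or a reducer function) averages all gradients for the step."""
    grads = []
    for w in range(num_workers):
        obj = s3.get_object(Bucket=BUCKET, Key=f"grads/{step}/{w}")
        grads.append(pickle.loads(obj["Body"].read()))
    return sum(grads) / len(grads)
```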
-
Evaluating Continual Deep Learning: a New Benchmark for Image Classification
Continual learning aims to preserve previously acquired knowledge as a deep network is trained on new data. A new benchmark entitled "The CLEAR Benchmark: Continual LEArning on Real-World Imagery" has recently been published. The goal of the study is to establish a consistent image classification benchmark that captures the natural evolution of objects over time, enabling a more realistic comparison of continual learning models.
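The evaluation protocol such a benchmark enables is the standard continual-learning one: train on time-ordered buckets in sequence and test on every bucket after each stage, so forgetting shows up in the resulting accuracy matrix. In the generic sketch below, train_on and evaluate_on are caller-supplied stand-ins rather than anything from the paper:

```python
def evaluate_continual(model, buckets, train_on, evaluate_on):
    """Generic continual-learning protocol: train on time-ordered data
    buckets in sequence, testing on every bucket after each stage so that
    forgetting (and forward transfer) appear in the accuracy matrix."""
    accuracy = []
    for train_bucket in buckets:
        model = train_on(model, train_bucket)
        accuracy.append([evaluate_on(model, b) for b in buckets])
    return accuracy  # accuracy[i][j]: trained through bucket i, tested on j
```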
-
OpenAI Announces Question-Answering AI WebGPT
OpenAI has developed WebGPT, an AI model for long-form question-answering based on GPT-3. WebGPT can use web search queries to collect supporting references for its response, and on Reddit questions its answers were preferred by human judges over the highest-voted answer 69% of the time.
-
Facebook Open-Sources Two Billion Parameter Multilingual Speech Recognition Model XLS-R
Facebook AI Research (FAIR) open-sourced XLS-R, a cross-lingual speech recognition (SR) AI model. XLS-R is trained on 436K hours of speech audio from 128 languages, an order of magnitude more data than the largest previous models, and outperforms the current state-of-the-art on several downstream SR and translation tasks.
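Pretrained XLS-R checkpoints can be used as speech feature extractors; the sketch below uses the Hugging Face transformers API, with the checkpoint name being an assumption based on the public release naming:

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

# Checkpoint name is an assumption based on the public release naming.
name = "facebook/wav2vec2-xls-r-300m"
extractor = Wav2Vec2FeatureExtractor.from_pretrained(name)
model = Wav2Vec2Model.from_pretrained(name)

waveform = torch.zeros(16000)  # one second of placeholder audio at 16 kHz
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state  # (1, frames, hidden)
```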
-
MLCommons Announces Latest MLPerf Training Benchmark Results
Engineering consortium MLCommons recently announced the results of the latest round of its MLPerf Training benchmark competition. More than 158 performance results for AI training jobs were submitted by 14 organizations, with the best results improving by up to 2.3x over the previous round.
-
Google Trains 280 Billion Parameter AI Language Model Gopher
Google subsidiary DeepMind announced Gopher, a 280-billion-parameter AI natural language processing (NLP) model. Based on the Transformer architecture and trained on a 10.5TB corpus called MassiveText, Gopher outperformed the current state-of-the-art on 100 of 124 evaluation tasks.
-
DeepMind Releases Weather Forecasting AI Deep Generative Models of Rainfall
DeepMind open-sourced a dataset and trained model snapshot for Deep Generative Models of Rainfall (DGMR), an AI system for short-term precipitation forecasts. In evaluations conducted by 58 expert meteorologists comparing it to existing methods, DGMR was ranked first in accuracy and usefulness in 89% of test cases.
-
MIT Researchers Investigate Deep Learning's Computational Burden
A team of researchers from MIT, Yonsei University, and the University of Brasilia has launched a new website, Computer Progress, which analyzes the computational burden of over 1,000 deep learning research papers. Data from the site show that computational burden is growing faster than theory predicts, suggesting there is still substantial room for algorithmic improvement.