Deep Learning Content on InfoQ
-
AI Conference Recap: Google, Microsoft, Facebook, and Others at ICLR 2021
At the recent International Conference on Learning Representations (ICLR), research teams from several tech companies, including Google, Microsoft, IBM, Facebook, and Amazon, presented nearly 250 papers out of a total of 860 on a wide variety of AI topics related to deep learning.
-
CMU Develops Algorithm for Guaranteeing AI Model Generalization
Researchers at Carnegie Mellon University's (CMU) Approximately Correct Machine Intelligence (ACMI) Lab have published a paper on Randomly Assign, Train and Track (RATT), an algorithm that uses noisy training data to provide an upper bound on the true error risk of a deep-learning model. Using RATT, model developers can determine how well a model will generalize to new input data.
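The general shape of the procedure can be sketched in a few lines. The snippet below uses a toy scikit-learn classifier and a deliberately simplified form of the bound; the paper's exact statement includes additional concentration terms.

    # Sketch of the RATT procedure: mix randomly-labeled examples into the training
    # set, train as usual, and track error on the clean vs. randomly-labeled portions.
    # The bound below is a simplified illustration, not the paper's exact statement.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Toy binary classification data (stand-in for real training data).
    X_clean = rng.normal(size=(1000, 20))
    y_clean = (X_clean[:, 0] + 0.5 * X_clean[:, 1] > 0).astype(int)

    # "Randomly assign": take extra (e.g. unlabeled) points and give them random labels.
    X_rand = rng.normal(size=(200, 20))
    y_rand = rng.integers(0, 2, size=200)

    # "Train": fit one model on the mixture of clean and randomly-labeled data.
    X_mix = np.vstack([X_clean, X_rand])
    y_mix = np.concatenate([y_clean, y_rand])
    model = LogisticRegression(max_iter=1000).fit(X_mix, y_mix)

    # "Track": measure training error separately on each portion.
    clean_err = 1.0 - model.score(X_clean, y_clean)
    rand_err = 1.0 - model.score(X_rand, y_rand)

    # Simplified RATT-style bound: if the model cannot fit the random labels
    # (rand_err near 0.5), the bound stays close to the clean training error;
    # if it memorizes them (rand_err near 0), the bound becomes vacuous.
    bound = clean_err + max(0.0, 1.0 - 2.0 * rand_err)
    print(f"clean error={clean_err:.3f}, random-label error={rand_err:.3f}, bound={bound:.3f}")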
-
Microsoft's ZeRO-Infinity Library Trains 32 Trillion Parameter AI Model
Microsoft recently open-sourced ZeRO-Infinity, an addition to their open-source DeepSpeed AI training library that optimizes memory use for training very large deep-learning models. Using ZeRO-Infinity, Microsoft trained a model with 32 trillion parameters on a cluster of 32 GPUs, and demonstrated fine-tuning of a 1 trillion parameter model on a single GPU.
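In practice, ZeRO-Infinity is enabled through DeepSpeed's configuration rather than changes to model code. The sketch below shows an illustrative configuration with ZeRO stage 3 and NVMe offload; key names follow DeepSpeed's documented schema, but the values, paths, and model are assumptions and may need adjusting for a given DeepSpeed version.

    # Illustrative DeepSpeed configuration turning on ZeRO stage 3 with parameter and
    # optimizer-state offload to NVMe, the mechanism ZeRO-Infinity builds on.
    # All values, paths, and sizes are examples only.
    import deepspeed
    import torch

    ds_config = {
        "train_batch_size": 16,
        "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
        "fp16": {"enabled": True},
        "zero_optimization": {
            "stage": 3,
            "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
            "offload_optimizer": {"device": "nvme", "nvme_path": "/local_nvme"},
        },
    }

    model = torch.nn.Linear(4096, 4096)  # stand-in for a very large transformer

    # deepspeed.initialize wraps the model with the ZeRO engine; training then
    # proceeds with engine.backward(loss) and engine.step() as usual.
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config
    )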
-
NVIDIA Announces AI Training Dataset Generator DatasetGAN
Researchers at NVIDIA have created DatasetGAN, a system for generating synthetic images with annotations to create datasets for training AI vision models. DatasetGAN can be trained with as few as 16 human-annotated images and performs as well as fully-supervised systems requiring 100x more annotated images.
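The core idea can be sketched as follows: train a tiny per-pixel label head on the feature maps of a pretrained generator using a handful of annotated generated images, then use that head to label everything else the generator produces. The feature tensors and sizes below are hypothetical stand-ins (the real system builds on StyleGAN features).

    # Illustrative sketch: fit a small per-pixel label head on GAN feature maps from a
    # handful of annotated generated images, then synthesize image + label pairs for free.
    # The feature/mask tensors here are toy stand-ins for real StyleGAN features.
    import torch
    import torch.nn as nn

    FEAT_DIM, NUM_CLASSES = 256, 5

    label_head = nn.Sequential(          # small MLP applied to each pixel's feature vector
        nn.Linear(FEAT_DIM, 128), nn.ReLU(), nn.Linear(128, NUM_CLASSES)
    )
    opt = torch.optim.Adam(label_head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    def train_on_annotated(features, masks, steps=100):
        """features: (N, H, W, FEAT_DIM) GAN features for a few generated images;
        masks: (N, H, W) human-drawn segmentation labels for those same images."""
        flat_feats = features.reshape(-1, FEAT_DIM)
        flat_masks = masks.reshape(-1)
        for _ in range(steps):
            opt.zero_grad()
            loss = loss_fn(label_head(flat_feats), flat_masks)
            loss.backward()
            opt.step()

    # Toy stand-ins for ~16 annotated generated images.
    features = torch.randn(16, 32, 32, FEAT_DIM)
    masks = torch.randint(0, NUM_CLASSES, (16, 32, 32))
    train_on_annotated(features, masks)

    # Afterwards, any newly generated image's features can be pushed through
    # label_head to produce a synthetic annotation, yielding an unlimited dataset.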
-
Researchers Publish Biologically Plausible AI Training Method
A team of researchers at Oxford University developed an algorithm called zero-divergence inference learning (Z-IL), an alternative to the backpropagation (BP) algorithm for training neural network AI models. Z-IL has been shown to exactly reproduce the results of BP on any neural network, but unlike BP does not violate known principles of brain function.
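Z-IL comes from the predictive-coding family of learning rules, in which each layer keeps local value nodes and prediction errors, activities relax during a short inference phase, and weights are then updated from purely local signals. The sketch below illustrates that general family on a toy two-layer network; it is not the exact Z-IL update schedule from the paper.

    # Minimal predictive-coding-style sketch of the family of algorithms Z-IL belongs to.
    # Illustration of the general idea only, not the paper's exact schedule.
    import numpy as np

    rng = np.random.default_rng(0)
    f = np.tanh
    df = lambda a: 1.0 - np.tanh(a) ** 2

    # Tiny network: 4 -> 8 -> 2; W[l] maps layer l to layer l+1.
    sizes = [4, 8, 2]
    W = [rng.normal(scale=0.1, size=(sizes[l + 1], sizes[l])) for l in range(2)]

    def train_step(x_in, target, lr=0.05, infer_steps=20, infer_lr=0.1):
        # Initialize value nodes with a forward pass, then clamp the output to the label.
        x = [x_in]
        for l in range(2):
            x.append(W[l] @ f(x[l]))
        x[-1] = target

        # Inference phase: relax the hidden activities to reduce local prediction errors.
        for _ in range(infer_steps):
            eps = [x[l + 1] - W[l] @ f(x[l]) for l in range(2)]
            x[1] += infer_lr * (-eps[0] + df(x[1]) * (W[1].T @ eps[1]))

        # Weight update: purely local (presynaptic activity times postsynaptic error).
        eps = [x[l + 1] - W[l] @ f(x[l]) for l in range(2)]
        for l in range(2):
            W[l] += lr * np.outer(eps[l], f(x[l]))

    # One toy update: map a random input to a fixed 2-d target.
    train_step(rng.normal(size=4), np.array([1.0, -1.0]))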
-
Facebook Announces ZionEX Platform for Training AI Models with 12 Trillion Parameters
A team of scientists at Facebook AI Research (FAIR) announced a system for training deep-learning recommendation models (DLRM) using PyTorch on a custom-built AI hardware platform, ZionEX. Using this system, the team trained models with up to 12 trillion parameters and achieved nearly an order-of-magnitude speedup in training time compared to other systems.
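For context, a DLRM combines embedding tables for sparse categorical features with MLPs over dense features and pairwise feature interactions; the trillion-parameter scale comes almost entirely from the embedding tables. The toy sketch below shows that structure only (sizes are illustrative, and this is not Facebook's production code).

    # Minimal sketch of a DLRM-style model: embedding tables for sparse features,
    # an MLP for dense features, pairwise dot-product interactions, and a top MLP
    # producing a click probability. Production tables hold billions of rows.
    import torch
    import torch.nn as nn

    class TinyDLRM(nn.Module):
        def __init__(self, num_dense=13, table_sizes=(1000, 1000, 1000), dim=16):
            super().__init__()
            self.tables = nn.ModuleList([nn.Embedding(n, dim) for n in table_sizes])
            self.bottom_mlp = nn.Sequential(nn.Linear(num_dense, dim), nn.ReLU())
            num_feats = len(table_sizes) + 1
            num_pairs = num_feats * (num_feats - 1) // 2
            self.top_mlp = nn.Sequential(nn.Linear(dim + num_pairs, 32), nn.ReLU(),
                                         nn.Linear(32, 1))

        def forward(self, dense, sparse):
            feats = [self.bottom_mlp(dense)] + [t(sparse[:, i]) for i, t in enumerate(self.tables)]
            stacked = torch.stack(feats, dim=1)                    # (B, num_feats, dim)
            inter = torch.bmm(stacked, stacked.transpose(1, 2))    # pairwise dot products
            i, j = torch.triu_indices(stacked.shape[1], stacked.shape[1], offset=1)
            x = torch.cat([feats[0], inter[:, i, j]], dim=1)
            return torch.sigmoid(self.top_mlp(x))

    model = TinyDLRM()
    out = model(torch.randn(4, 13), torch.randint(0, 1000, (4, 3)))
    print(out.shape)  # (4, 1)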
-
Open Source AI Can Predict Electrical Outages from Storms with 81% Accuracy
A team of scientists from Aalto University and the Finnish Meteorological Institute has developed an open-source AI model for predicting electrical outages caused by storm damage. The model predicts storm location to within 15 km and classifies the amount of transformer damage with 81% accuracy, allowing power companies to prepare for outages and repair them more quickly.
-
MIT Announces AI Benchmark ThreeDWorld Transport Challenge
A team of researchers from MIT and the MIT-IBM Watson AI Lab has announced the ThreeDWorld Transport Challenge, a benchmark task for embodied AI agents. The challenge is intended to advance research on AI agents that control a simulated mobile robot, guided by computer vision, as it picks up objects and moves them to new locations.
-
Perceiver: One Neural-Network Model for Multiple Input Data Types
Google’s DeepMind has recently released a state-of-the-art deep-learning model called Perceiver that receives and processes multiple types of input data, ranging from audio to images, in a manner loosely analogous to how the human brain perceives multimodal data. Perceiver can receive and classify several input modalities, namely point clouds, audio, and images.
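The central mechanism is a small learned latent array that repeatedly cross-attends to a large flattened input array, so the same network can consume any modality once it is serialized to a sequence of feature vectors. A simplified PyTorch sketch of that idea follows; it omits the model's Fourier positional encodings and other details.

    # Minimal sketch of the core Perceiver idea: a fixed-size latent array queries a
    # large, flattened input "byte array" via cross-attention, then refines itself
    # with latent self-attention. Simplified relative to the published model.
    import torch
    import torch.nn as nn

    class TinyPerceiver(nn.Module):
        def __init__(self, in_dim, latent_len=32, dim=64, depth=4, num_classes=10):
            super().__init__()
            self.latents = nn.Parameter(torch.randn(latent_len, dim))
            self.embed = nn.Linear(in_dim, dim)
            self.cross = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            self.self_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            self.depth = depth
            self.head = nn.Linear(dim, num_classes)

        def forward(self, x):                            # x: (batch, seq_len, in_dim)
            kv = self.embed(x)
            z = self.latents.expand(x.shape[0], -1, -1)
            for _ in range(self.depth):
                z = z + self.cross(z, kv, kv)[0]         # latent array queries the inputs
                z = z + self.self_attn(z, z, z)[0]       # latent self-attention
            return self.head(z.mean(dim=1))

    model = TinyPerceiver(in_dim=3)
    image_pixels = torch.randn(2, 50 * 50, 3)    # a 50x50 RGB image flattened to a sequence
    audio_frames = torch.randn(2, 8000, 3)       # any other modality with 3 channels works too
    print(model(image_pixels).shape, model(audio_frames).shape)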
-
Microsoft Releases AI Training Library ZeRO-3 Offload
Microsoft recently open-sourced ZeRO-3 Offload, an extension of their DeepSpeed AI training library that improves memory efficiency while training very large deep-learning models. ZeRO-3 Offload allows users to train models with up to 40 billion parameters on a single GPU and over 2 trillion parameters on 512 GPUs.
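As with ZeRO-Infinity above, ZeRO-3 Offload is driven by DeepSpeed configuration. The sketch below shows an illustrative config with CPU offload and the usual DeepSpeed training loop; key names follow DeepSpeed's schema, while the values, model, and loss are assumptions.

    # Illustrative ZeRO-3 configuration with CPU offload plus the standard DeepSpeed
    # training loop. Values are examples; launch with the deepspeed CLI on a GPU machine.
    import deepspeed
    import torch

    ds_config = {
        "train_batch_size": 8,
        "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
        "fp16": {"enabled": True},
        "zero_optimization": {
            "stage": 3,
            "offload_param": {"device": "cpu"},
            "offload_optimizer": {"device": "cpu"},
        },
    }

    model = torch.nn.Linear(2048, 2048)  # stand-in for a model too large for GPU memory alone
    engine, _, _, _ = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config
    )

    for step in range(10):
        batch = torch.randn(8, 2048, device=engine.device, dtype=torch.half)
        loss = engine(batch).float().pow(2).mean()   # toy objective
        engine.backward(loss)                        # DeepSpeed handles scaling and partitioning
        engine.step()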
-
Alibaba Announces 10 Billion Parameter Multi-Modal AI M6
Alibaba has created an AI model called Multi-Modality to Multi-Modality Multitask Mega-transformer (M6). The model contains 10 billion parameters and is pretrained on a dataset consisting of 1.9TB of images and 292GB of Chinese-language text. M6 can be fine-tuned for several downstream tasks, including text-guided image generation, visual question answering, and image-text matching.
-
Google's Apollo AI for Chip Design Improves Deep Learning Performance by 25%
Scientists at Google Research have announced APOLLO, a framework for optimizing AI accelerator chip designs. APOLLO uses evolutionary algorithms to select chip parameters that minimize deep-learning inference latency while also minimizing chip area. Using APOLLO, researchers found designs that achieved 24.6% speedup over those chosen by a baseline algorithm.
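The flavor of such a search can be illustrated with a toy evolutionary loop over a made-up parameter space; the parameter names and the cost model below are invented stand-ins, not Google's actual framework or objectives.

    # Toy evolutionary search over a discrete chip-parameter space, in the spirit of
    # the approach described for APOLLO. estimate_latency_and_area() is a stand-in
    # for a real hardware simulator.
    import random

    random.seed(0)
    SEARCH_SPACE = {
        "pe_rows": [4, 8, 16, 32],        # hypothetical processing-element grid
        "pe_cols": [4, 8, 16, 32],
        "sram_kb": [64, 128, 256, 512],   # hypothetical on-chip buffer size
    }

    def estimate_latency_and_area(cfg):
        # Stand-in cost model: more compute lowers latency but costs area.
        compute = cfg["pe_rows"] * cfg["pe_cols"]
        latency = 1e6 / compute + 2e3 / cfg["sram_kb"]
        area = compute * 0.1 + cfg["sram_kb"] * 0.02
        return latency, area

    def fitness(cfg):
        latency, area = estimate_latency_and_area(cfg)
        return latency + 0.5 * area   # scalarized multi-objective; weights are arbitrary

    def random_cfg():
        return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

    def mutate(cfg):
        child = dict(cfg)
        k = random.choice(list(SEARCH_SPACE))
        child[k] = random.choice(SEARCH_SPACE[k])
        return child

    population = [random_cfg() for _ in range(20)]
    for generation in range(30):
        population.sort(key=fitness)
        parents = population[:5]                      # keep the fittest designs
        population = parents + [mutate(random.choice(parents)) for _ in range(15)]

    print("best design:", min(population, key=fitness))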
-
Google DeepMind’s NFNets Offer Deep Learning Efficiency
DeepMind, Google’s AI research company, recently released NFNets, a family of normalizer-free ResNet image-classification models that train up to 8.7x faster than the current state-of-the-art EfficientNet. In addition, the approach helps the networks generalize better.
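A key ingredient of NFNets is adaptive gradient clipping, which rescales a gradient whenever its norm grows too large relative to the corresponding weight norm, standing in for batch normalization's stabilizing effect. A simplified per-tensor sketch follows; the paper applies the rule unit-wise.

    # Simplified sketch of adaptive gradient clipping (AGC): rescale a gradient when
    # its norm exceeds a fraction of the matching weight norm. Per-tensor here for
    # brevity; the published method works unit-wise.
    import torch

    def adaptive_grad_clip_(parameters, clip=0.01, eps=1e-3):
        for p in parameters:
            if p.grad is None:
                continue
            w_norm = p.detach().norm().clamp_min(eps)
            g_norm = p.grad.detach().norm()
            max_norm = clip * w_norm
            if g_norm > max_norm:
                p.grad.mul_(max_norm / (g_norm + 1e-6))

    # Usage: call between loss.backward() and optimizer.step().
    model = torch.nn.Linear(16, 4)
    loss = model(torch.randn(8, 16)).pow(2).mean()
    loss.backward()
    adaptive_grad_clip_(model.parameters())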
-
PyTorch 1.8 Release Includes Distributed Training Updates and AMD ROCm Support
The PyTorch team announced the release of version 1.8 of Facebook's open-source deep-learning framework, which includes updated APIs, improvements for distributed training, and support for the ROCm platform for AMD's GPU accelerators. New versions of the domain-specific libraries TorchVision, TorchAudio, and TorchText were also released.
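For reference, the snippet below shows a generic DistributedDataParallel training setup of the kind the release's distributed improvements target; the pattern itself predates 1.8 and is not specific to the new APIs.

    # Generic PyTorch DistributedDataParallel setup; launch with torchrun
    # (or torch.distributed.launch on 1.8-era installs).
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        dist.init_process_group("nccl")              # reads rank/world size from the launcher's env vars
        local_rank = int(os.environ.get("LOCAL_RANK", 0))
        torch.cuda.set_device(local_rank)

        model = torch.nn.Linear(32, 8).cuda(local_rank)
        model = DDP(model, device_ids=[local_rank])  # gradients are all-reduced across ranks
        opt = torch.optim.SGD(model.parameters(), lr=0.1)

        for _ in range(10):
            x = torch.randn(16, 32, device=torch.device("cuda", local_rank))
            loss = model(x).pow(2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()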
-
Stanford Publishes AI Index 2021 Annual Report
Stanford University’s Institute for Human-Centered Artificial Intelligence (HAI) has published its AI Index annual report. The underlying data for this year's report has been expanded compared to the previous year's, and the report includes several perspectives on the COVID-19 pandemic's impact on AI research and development.