Deep Learning Content on InfoQ
-
AI Conference Recap: Google, Microsoft, Facebook, and Others at ICLR 2021
At the recent International Conference on Learning Representations (ICLR), research teams from several tech companies, including Google, Microsoft, IBM, Facebook, and Amazon, presented nearly 250 papers out of a total of 860 on a wide variety of AI topics related to deep learning.
-
CMU Develops Algorithm for Guaranteeing AI Model Generalization
Researchers at Carnegie Mellon University's (CMU) Approximately Correct Machine Intelligence (ACMI) Lab have published a paper on Randomly Assign, Train and Track (RATT), an algorithm that uses noisy training data to provide an upper bound on the true error risk of a deep-learning model. Using RATT, model developers can determine how well a model will generalize to new input data.
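The general shape of the procedure can be sketched in a few lines. The snippet below uses a toy scikit-learn classifier and a deliberately simplified form of the bound; the paper's exact statement includes additional concentration terms.

    # Sketch of the RATT procedure: mix randomly-labeled examples into the training
    # set, train as usual, and track error on the clean vs. randomly-labeled portions.
    # The bound below is a simplified illustration, not the paper's exact statement.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Toy binary classification data (stand-in for real training data).
    X_clean = rng.normal(size=(1000, 20))
    y_clean = (X_clean[:, 0] + 0.5 * X_clean[:, 1] > 0).astype(int)

    # "Randomly assign": take extra (e.g. unlabeled) points and give them random labels.
    X_rand = rng.normal(size=(200, 20))
    y_rand = rng.integers(0, 2, size=200)

    # "Train": fit one model on the mixture of clean and randomly-labeled data.
    X_mix = np.vstack([X_clean, X_rand])
    y_mix = np.concatenate([y_clean, y_rand])
    model = LogisticRegression(max_iter=1000).fit(X_mix, y_mix)

    # "Track": measure training error separately on each portion.
    clean_err = 1.0 - model.score(X_clean, y_clean)
    rand_err = 1.0 - model.score(X_rand, y_rand)

    # Simplified RATT-style bound: if the model cannot fit the random labels
    # (rand_err near 0.5), the bound stays close to the clean training error;
    # if it memorizes them (rand_err near 0), the bound becomes vacuous.
    bound = clean_err + max(0.0, 1.0 - 2.0 * rand_err)
    print(f"clean error={clean_err:.3f}, random-label error={rand_err:.3f}, bound={bound:.3f}")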
-
Microsoft's ZeRO-Infinity Library Trains 32 Trillion Parameter AI Model
Microsoft recently open-sourced ZeRO-Infinity, an addition to their open-source DeepSpeed AI training library that optimizes memory use for training very large deep-learning models. Using ZeRO-Infinity, Microsoft trained a model with 32 trillion parameters on a cluster of 32 GPUs, and demonstrated fine-tuning of a 1 trillion parameter model on a single GPU.
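In practice, ZeRO-Infinity is enabled through DeepSpeed's configuration rather than changes to model code. The sketch below shows an illustrative configuration with ZeRO stage 3 and NVMe offload; key names follow DeepSpeed's documented schema, but the values, paths, and model are assumptions and may need adjusting for a given DeepSpeed version.

    # Illustrative DeepSpeed configuration turning on ZeRO stage 3 with parameter and
    # optimizer-state offload to NVMe, the mechanism ZeRO-Infinity builds on.
    # All values, paths, and sizes are examples only.
    import deepspeed
    import torch

    ds_config = {
        "train_batch_size": 16,
        "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
        "fp16": {"enabled": True},
        "zero_optimization": {
            "stage": 3,
            "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
            "offload_optimizer": {"device": "nvme", "nvme_path": "/local_nvme"},
        },
    }

    model = torch.nn.Linear(4096, 4096)  # stand-in for a very large transformer

    # deepspeed.initialize wraps the model with the ZeRO engine; training then
    # proceeds with engine.backward(loss) and engine.step() as usual.
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config
    )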
-
NVIDIA Announces AI Training Dataset Generator DatasetGAN
Researchers at NVIDIA have created DatasetGAN, a system for generating synthetic images with annotations to create datasets for training AI vision models. DatasetGAN can be trained with as few as 16 human-annotated images and performs as well as fully-supervised systems requiring 100x more annotated images.
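The core idea can be sketched as follows: train a tiny per-pixel label head on the feature maps of a pretrained generator using a handful of annotated generated images, then use that head to label everything else the generator produces. The feature tensors and sizes below are hypothetical stand-ins (the real system builds on StyleGAN features).

    # Illustrative sketch: fit a small per-pixel label head on GAN feature maps from a
    # handful of annotated generated images, then synthesize image + label pairs for free.
    # The feature/mask tensors here are toy stand-ins for real StyleGAN features.
    import torch
    import torch.nn as nn

    FEAT_DIM, NUM_CLASSES = 256, 5

    label_head = nn.Sequential(          # small MLP applied to each pixel's feature vector
        nn.Linear(FEAT_DIM, 128), nn.ReLU(), nn.Linear(128, NUM_CLASSES)
    )
    opt = torch.optim.Adam(label_head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    def train_on_annotated(features, masks, steps=100):
        """features: (N, H, W, FEAT_DIM) GAN features for a few generated images;
        masks: (N, H, W) human-drawn segmentation labels for those same images."""
        flat_feats = features.reshape(-1, FEAT_DIM)
        flat_masks = masks.reshape(-1)
        for _ in range(steps):
            opt.zero_grad()
            loss = loss_fn(label_head(flat_feats), flat_masks)
            loss.backward()
            opt.step()

    # Toy stand-ins for ~16 annotated generated images.
    features = torch.randn(16, 32, 32, FEAT_DIM)
    masks = torch.randint(0, NUM_CLASSES, (16, 32, 32))
    train_on_annotated(features, masks)

    # Afterwards, any newly generated image's features can be pushed through
    # label_head to produce a synthetic annotation, yielding an unlimited dataset.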
-
Researchers Publish Biologically Plausible AI Training Method
A team of researchers at Oxford University developed an algorithm called zero-divergence inference learning (Z-IL), an alternative to the backpropagation (BP) algorithm for training neural network AI models. Z-IL has been shown to exactly reproduce the results of BP on any neural network, but unlike BP does not violate known principles of brain function.
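Z-IL comes from the predictive-coding family of learning rules, in which each layer keeps local value nodes and prediction errors, activities relax during a short inference phase, and weights are then updated from purely local signals. The sketch below illustrates that general family on a toy two-layer network; it is not the exact Z-IL update schedule from the paper.

    # Minimal predictive-coding-style sketch of the family of algorithms Z-IL belongs to.
    # Illustration of the general idea only, not the paper's exact schedule.
    import numpy as np

    rng = np.random.default_rng(0)
    f = np.tanh
    df = lambda a: 1.0 - np.tanh(a) ** 2

    # Tiny network: 4 -> 8 -> 2; W[l] maps layer l to layer l+1.
    sizes = [4, 8, 2]
    W = [rng.normal(scale=0.1, size=(sizes[l + 1], sizes[l])) for l in range(2)]

    def train_step(x_in, target, lr=0.05, infer_steps=20, infer_lr=0.1):
        # Initialize value nodes with a forward pass, then clamp the output to the label.
        x = [x_in]
        for l in range(2):
            x.append(W[l] @ f(x[l]))
        x[-1] = target

        # Inference phase: relax the hidden activities to reduce local prediction errors.
        for _ in range(infer_steps):
            eps = [x[l + 1] - W[l] @ f(x[l]) for l in range(2)]
            x[1] += infer_lr * (-eps[0] + df(x[1]) * (W[1].T @ eps[1]))

        # Weight update: purely local (presynaptic activity times postsynaptic error).
        eps = [x[l + 1] - W[l] @ f(x[l]) for l in range(2)]
        for l in range(2):
            W[l] += lr * np.outer(eps[l], f(x[l]))

    # One toy update: map a random input to a fixed 2-d target.
    train_step(rng.normal(size=4), np.array([1.0, -1.0]))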
-
Facebook Announces ZionEX Platform for Training AI Models with 12 Trillion Parameters
A team of scientists at Facebook AI Research (FAIR) announced a system for training deep-learning recommendation models (DLRM) using PyTorch on a custom-built AI hardware platform, ZionEX. Using this system, the team trained models with up to 12 trillion parameters and achieved nearly an order-of-magnitude speedup in training time compared to other systems.
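For context, a DLRM combines embedding tables for sparse categorical features with MLPs over dense features and pairwise feature interactions; the trillion-parameter scale comes almost entirely from the embedding tables. The toy sketch below shows that structure only (sizes are illustrative, and this is not Facebook's production code).

    # Minimal sketch of a DLRM-style model: embedding tables for sparse features,
    # an MLP for dense features, pairwise dot-product interactions, and a top MLP
    # producing a click probability. Production tables hold billions of rows.
    import torch
    import torch.nn as nn

    class TinyDLRM(nn.Module):
        def __init__(self, num_dense=13, table_sizes=(1000, 1000, 1000), dim=16):
            super().__init__()
            self.tables = nn.ModuleList([nn.Embedding(n, dim) for n in table_sizes])
            self.bottom_mlp = nn.Sequential(nn.Linear(num_dense, dim), nn.ReLU())
            num_feats = len(table_sizes) + 1
            num_pairs = num_feats * (num_feats - 1) // 2
            self.top_mlp = nn.Sequential(nn.Linear(dim + num_pairs, 32), nn.ReLU(),
                                         nn.Linear(32, 1))

        def forward(self, dense, sparse):
            feats = [self.bottom_mlp(dense)] + [t(sparse[:, i]) for i, t in enumerate(self.tables)]
            stacked = torch.stack(feats, dim=1)                    # (B, num_feats, dim)
            inter = torch.bmm(stacked, stacked.transpose(1, 2))    # pairwise dot products
            i, j = torch.triu_indices(stacked.shape[1], stacked.shape[1], offset=1)
            x = torch.cat([feats[0], inter[:, i, j]], dim=1)
            return torch.sigmoid(self.top_mlp(x))

    model = TinyDLRM()
    out = model(torch.randn(4, 13), torch.randint(0, 1000, (4, 3)))
    print(out.shape)  # (4, 1)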
-
Open Source AI Can Predict Electrical Outages from Storms with 81% Accuracy
A team of scientists from Aalto University and the Finnish Meteorological Institute has developed an open-source AI model for predicting electrical outages caused by storm damage. The model predicts storm location to within 15 km and classifies the amount of transformer damage with 81% accuracy, allowing power companies to prepare for outages and repair them more quickly.
-
MIT Announces AI Benchmark ThreeDWorld Transport Challenge
A team of researchers from MIT and the MIT-IBM Watson AI Lab has announced the ThreeDWorld Transport Challenge, a benchmark task for embodied AI agents. The challenge is intended to advance research on AI agents that control a simulated mobile robot, guided by computer vision, as it picks up objects and moves them to new locations.
-
Perceiver: One Neural-Network Model for Multiple Input Data Types
Google’s DeepMind has recently released a state-of-the-art deep-learning model called Perceiver that receives and processes multiple types of input data, ranging from audio to images, in a manner loosely analogous to how the human brain perceives multimodal data. Perceiver can receive and classify several input modalities, namely point clouds, audio, and images.
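The central mechanism is a small learned latent array that repeatedly cross-attends to a large flattened input array, so the same network can consume any modality once it is serialized to a sequence of feature vectors. A simplified PyTorch sketch of that idea follows; it omits the model's Fourier positional encodings and other details.

    # Minimal sketch of the core Perceiver idea: a fixed-size latent array queries a
    # large, flattened input "byte array" via cross-attention, then refines itself
    # with latent self-attention. Simplified relative to the published model.
    import torch
    import torch.nn as nn

    class TinyPerceiver(nn.Module):
        def __init__(self, in_dim, latent_len=32, dim=64, depth=4, num_classes=10):
            super().__init__()
            self.latents = nn.Parameter(torch.randn(latent_len, dim))
            self.embed = nn.Linear(in_dim, dim)
            self.cross = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            self.self_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            self.depth = depth
            self.head = nn.Linear(dim, num_classes)

        def forward(self, x):                            # x: (batch, seq_len, in_dim)
            kv = self.embed(x)
            z = self.latents.expand(x.shape[0], -1, -1)
            for _ in range(self.depth):
                z = z + self.cross(z, kv, kv)[0]         # latent array queries the inputs
                z = z + self.self_attn(z, z, z)[0]       # latent self-attention
            return self.head(z.mean(dim=1))

    model = TinyPerceiver(in_dim=3)
    image_pixels = torch.randn(2, 50 * 50, 3)    # a 50x50 RGB image flattened to a sequence
    audio_frames = torch.randn(2, 8000, 3)       # any other modality with 3 channels works too
    print(model(image_pixels).shape, model(audio_frames).shape)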
-
Microsoft Releases AI Training Library ZeRO-3 Offload
Microsoft recently open-sourced ZeRO-3 Offload, an extension of their DeepSpeed AI training library that improves memory efficiency while training very large deep-learning models. ZeRO-3 Offload allows users to train models with up to 40 billion parameters on a single GPU and over 2 trillion parameters on 512 GPUs.
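As with ZeRO-Infinity above, ZeRO-3 Offload is driven by DeepSpeed configuration. The sketch below shows an illustrative config with CPU offload and the usual DeepSpeed training loop; key names follow DeepSpeed's schema, while the values, model, and loss are assumptions.

    # Illustrative ZeRO-3 configuration with CPU offload plus the standard DeepSpeed
    # training loop. Values are examples; launch with the deepspeed CLI on a GPU machine.
    import deepspeed
    import torch

    ds_config = {
        "train_batch_size": 8,
        "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
        "fp16": {"enabled": True},
        "zero_optimization": {
            "stage": 3,
            "offload_param": {"device": "cpu"},
            "offload_optimizer": {"device": "cpu"},
        },
    }

    model = torch.nn.Linear(2048, 2048)  # stand-in for a model too large for GPU memory alone
    engine, _, _, _ = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config
    )

    for step in range(10):
        batch = torch.randn(8, 2048, device=engine.device, dtype=torch.half)
        loss = engine(batch).float().pow(2).mean()   # toy objective
        engine.backward(loss)                        # DeepSpeed handles scaling and partitioning
        engine.step()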
-
Alibaba Announces 10 Billion Parameter Multi-Modal AI M6
Alibaba has created an AI model called Multi-Modality to Multi-Modality Multitask Mega-transformer (M6). The model contains 10 billion parameters and is pretrained on a dataset consisting of 1.9TB of images and 292GB of Chinese-language text. M6 can be fine-tuned for several downstream tasks, including text-guided image generation, visual question answering, and image-text matching.
-
Google's Apollo AI for Chip Design Improves Deep Learning Performance by 25%
Scientists at Google Research have announced APOLLO, a framework for optimizing AI accelerator chip designs. APOLLO uses evolutionary algorithms to select chip parameters that minimize deep-learning inference latency while also minimizing chip area. Using APOLLO, researchers found designs that achieved 24.6% speedup over those chosen by a baseline algorithm.
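The flavor of such a search can be illustrated with a toy evolutionary loop over a made-up parameter space; the parameter names and the cost model below are invented stand-ins, not Google's actual framework or objectives.

    # Toy evolutionary search over a discrete chip-parameter space, in the spirit of
    # the approach described for APOLLO. estimate_latency_and_area() is a stand-in
    # for a real hardware simulator.
    import random

    random.seed(0)
    SEARCH_SPACE = {
        "pe_rows": [4, 8, 16, 32],        # hypothetical processing-element grid
        "pe_cols": [4, 8, 16, 32],
        "sram_kb": [64, 128, 256, 512],   # hypothetical on-chip buffer size
    }

    def estimate_latency_and_area(cfg):
        # Stand-in cost model: more compute lowers latency but costs area.
        compute = cfg["pe_rows"] * cfg["pe_cols"]
        latency = 1e6 / compute + 2e3 / cfg["sram_kb"]
        area = compute * 0.1 + cfg["sram_kb"] * 0.02
        return latency, area

    def fitness(cfg):
        latency, area = estimate_latency_and_area(cfg)
        return latency + 0.5 * area   # scalarized multi-objective; weights are arbitrary

    def random_cfg():
        return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

    def mutate(cfg):
        child = dict(cfg)
        k = random.choice(list(SEARCH_SPACE))
        child[k] = random.choice(SEARCH_SPACE[k])
        return child

    population = [random_cfg() for _ in range(20)]
    for generation in range(30):
        population.sort(key=fitness)
        parents = population[:5]                      # keep the fittest designs
        population = parents + [mutate(random.choice(parents)) for _ in range(15)]

    print("best design:", min(population, key=fitness))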
-
Google DeepMind’s NFNets Offer Deep Learning Efficiency
DeepMind, Google’s AI research company, recently released NFNets, a family of normalizer-free ResNet image-classification models that train up to 8.7x faster than the current state-of-the-art EfficientNet. In addition, the approach helps the networks generalize better.
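A key ingredient of NFNets is adaptive gradient clipping, which rescales a gradient whenever its norm grows too large relative to the corresponding weight norm, standing in for batch normalization's stabilizing effect. A simplified per-tensor sketch follows; the paper applies the rule unit-wise.

    # Simplified sketch of adaptive gradient clipping (AGC): rescale a gradient when
    # its norm exceeds a fraction of the matching weight norm. Per-tensor here for
    # brevity; the published method works unit-wise.
    import torch

    def adaptive_grad_clip_(parameters, clip=0.01, eps=1e-3):
        for p in parameters:
            if p.grad is None:
                continue
            w_norm = p.detach().norm().clamp_min(eps)
            g_norm = p.grad.detach().norm()
            max_norm = clip * w_norm
            if g_norm > max_norm:
                p.grad.mul_(max_norm / (g_norm + 1e-6))

    # Usage: call between loss.backward() and optimizer.step().
    model = torch.nn.Linear(16, 4)
    loss = model(torch.randn(8, 16)).pow(2).mean()
    loss.backward()
    adaptive_grad_clip_(model.parameters())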
-
PyTorch 1.8 Release Includes Distributed Training Updates and AMD ROCm Support
The PyTorch team announced the release of version 1.8 of Facebook's open-source deep-learning framework, which includes updated APIs, improvements for distributed training, and support for the ROCm platform for AMD's GPU accelerators. New versions of the domain-specific libraries TorchVision, TorchAudio, and TorchText were also released.
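For reference, the snippet below shows a generic DistributedDataParallel training setup of the kind the release's distributed improvements target; the pattern itself predates 1.8 and is not specific to the new APIs.

    # Generic PyTorch DistributedDataParallel setup; launch with torchrun
    # (or torch.distributed.launch on 1.8-era installs).
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        dist.init_process_group("nccl")              # reads rank/world size from the launcher's env vars
        local_rank = int(os.environ.get("LOCAL_RANK", 0))
        torch.cuda.set_device(local_rank)

        model = torch.nn.Linear(32, 8).cuda(local_rank)
        model = DDP(model, device_ids=[local_rank])  # gradients are all-reduced across ranks
        opt = torch.optim.SGD(model.parameters(), lr=0.1)

        for _ in range(10):
            x = torch.randn(16, 32, device=torch.device("cuda", local_rank))
            loss = model(x).pow(2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()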
-
Stanford Publishes AI Index 2021 Annual Report
Stanford University’s Institute for Human-Centered Artificial Intelligence (HAI) has published its AI Index annual report. The underlying data for this year's report has been expanded compared to the previous year's, and the report includes several perspectives on the COVID-19 pandemic's impact on AI research and development.