Neural Networks Content on InfoQ
-
Google Open-Sources Token-Free Language Model ByT5
Google Research has open-sourced ByT5, a natural language processing (NLP) AI model that operates on raw bytes instead of abstract tokens. Compared to baseline models, ByT5 is more accurate on several benchmark tasks and is more robust to misspellings and noise.
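To make the byte-level idea concrete, here is a small Python sketch of what byte-level model input looks like; the id offset reserved for special tokens is an assumption for illustration, not ByT5's exact preprocessing.

```python
# Conceptual sketch of byte-level model input (not ByT5's exact preprocessing).
# A byte-level model sees the raw UTF-8 bytes of the text, so a misspelling
# changes only a few input positions instead of producing unknown tokens.

def to_byte_ids(text: str, offset: int = 3) -> list[int]:
    """Map each UTF-8 byte to an integer id; the offset (assumed here)
    reserves a few low ids for special tokens such as padding and EOS."""
    return [b + offset for b in text.encode("utf-8")]

print(to_byte_ids("InfoQ"))   # [76, 113, 105, 114, 84] -> byte values plus offset
print(to_byte_ids("InfQo"))   # a misspelling changes only two input positions
```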
-
Google Trains Two Billion Parameter AI Vision Model
Researchers at Google Brain announced a deep-learning computer vision (CV) model containing two billion parameters. The model was trained on three billion images and achieved 90.45% top-1 accuracy on ImageNet, setting a new state-of-the-art record.
-
CMU Develops Algorithm for Guaranteeing AI Model Generalization
Researchers at Carnegie Mellon University's (CMU) Approximately Correct Machine Intelligence (ACMI) Lab have published a paper on Randomly Assign, Train and Track (RATT), an algorithm that uses noisy training data to provide an upper bound on the true error risk of a deep-learning model. Using RATT, model developers can determine how well a model will generalize to new input data.
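As a rough illustration of the assign-train-track idea described above, the following sketch mixes randomly labeled examples into a training set and tracks how well the model fits each portion; the data, model, and bookkeeping are placeholders, not the paper's exact procedure or bound.

```python
# Illustrative sketch of "randomly assign, train, track"; placeholders only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_clean = rng.normal(size=(500, 20))
y_clean = (X_clean[:, 0] > 0).astype(int)      # "real" labels
X_rand = rng.normal(size=(100, 20))
y_rand = rng.integers(0, 2, size=100)          # randomly assigned labels

# Train on the union of cleanly labeled and randomly labeled data.
X_train = np.vstack([X_clean, X_rand])
y_train = np.concatenate([y_clean, y_rand])
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Track fit on each portion: high accuracy on the clean data combined with
# near-chance accuracy on the randomly labeled data is the kind of signal the
# paper uses to bound the model's true error.
acc_clean = model.score(X_clean, y_clean)
acc_rand = model.score(X_rand, y_rand)
print(f"clean-train accuracy: {acc_clean:.2f}, random-label accuracy: {acc_rand:.2f}")
```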
-
Microsoft's ZeRO-Infinity Library Trains 32 Trillion Parameter AI Model
Microsoft recently open-sourced ZeRO-Infinity, an addition to their DeepSpeed AI training library that optimizes memory use for training very large deep-learning models. Using ZeRO-Infinity, Microsoft trained a model with 32 trillion parameters on a cluster of 512 GPUs, and demonstrated fine-tuning of a 1 trillion parameter model on a single GPU.
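For readers curious what enabling this looks like in practice, below is a minimal sketch of a DeepSpeed configuration with ZeRO stage 3 and NVMe offload, the mechanism ZeRO-Infinity builds on; the exact field names, paths, and values should be checked against the DeepSpeed documentation before use.

```python
# Minimal sketch of a DeepSpeed config enabling ZeRO stage 3 with NVMe offload.
# Field names follow the DeepSpeed JSON config as commonly documented; verify
# options and defaults against the DeepSpeed docs for your version.
import json

ds_config = {
    "train_batch_size": 16,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
        "offload_optimizer": {"device": "nvme", "nvme_path": "/local_nvme"},
    },
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
# The resulting file is passed to deepspeed.initialize() or the deepspeed launcher.
```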
-
NVIDIA Announces AI Training Dataset Generator DatasetGAN
Researchers at NVIDIA have created DatasetGAN, a system for generating synthetic images with annotations to create datasets for training AI vision models. DatasetGAN can be trained with as few as 16 human-annotated images and performs as well as fully-supervised systems requiring 100x more annotated images.
-
Researchers Publish Biologically Plausible AI Training Method
A team of researchers at Oxford University developed an algorithm called zero-divergence inference learning (Z-IL), an alternative to the backpropagation (BP) algorithm for training neural network AI models. Z-IL has been shown to exactly reproduce the results of BP on any neural network, but unlike BP does not violate known principles of brain function.
-
Facebook Announces ZionEX Platform for Training AI Models with 12 Trillion Parameters
A team of scientists at Facebook AI Research (FAIR) announced a system for training deep-learning recommendation models (DLRM) using PyTorch on a custom-built AI hardware platform, ZionEX. Using this system, the team trained models with up to 12 trillion parameters and achieved nearly an order-of-magnitude speedup in training time compared to other systems.
-
Open Source AI Can Predict Electrical Outages from Storms with 81% Accuracy
A team of scientists from Aalto University and the Finnish Meteorological Institute have developed an open-source AI model for predicting electrical outages caused by storm damage. The model can predict storm location to within 15 km and classify the amount of transformer damage with 81% accuracy, allowing power companies to prepare for outages and repair the damage more quickly.
-
Perceiver: One Neural-Network Model for Multiple Input Data Types
Google’s DeepMind has recently released a state-of-the-art deep-learning model called Perceiver that receives and processes multiple types of input data, ranging from audio to images, similarly to how the human brain perceives multimodal information. Perceiver can receive and classify several input data types, including point clouds, audio, and images.
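The core architectural idea is a small set of learned latent vectors that cross-attends to a long, flattened input array, so the same module can ingest different modalities. The PyTorch sketch below illustrates that idea only; it is not DeepMind's implementation, and the sizes are arbitrary.

```python
# Minimal sketch of Perceiver-style cross-attention: learned latents query a
# long, flattened input of shape (batch, sequence, channels). Illustration only.
import torch
import torch.nn as nn

class TinyPerceiverBlock(nn.Module):
    def __init__(self, num_latents=64, dim=128):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, inputs):                          # inputs: (batch, seq_len, dim)
        q = self.latents.expand(inputs.size(0), -1, -1)
        attended, _ = self.cross_attn(q, inputs, inputs)  # latents query the input
        return attended + self.ff(attended)             # (batch, num_latents, dim)

# Any modality works once flattened: e.g. a 32x32 image with 128 channels.
x = torch.randn(2, 32 * 32, 128)
print(TinyPerceiverBlock()(x).shape)                    # torch.Size([2, 64, 128])
```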
-
Microsoft Releases AI Training Library ZeRO-3 Offload
Microsoft recently open-sourced ZeRO-3 Offload, an extension of their DeepSpeed AI training library that improves memory efficiency while training very large deep-learning models. ZeRO-3 Offload allows users to train models with up to 40 billion parameters on a single GPU and over 2 trillion parameters on 512 GPUs.
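Some back-of-the-envelope arithmetic shows why offloading matters at this scale; the bytes-per-parameter figures below are the commonly cited mixed-precision Adam estimates, used here purely for illustration.

```python
# Rough memory arithmetic for a 40-billion-parameter model under common
# mixed-precision assumptions: 2-byte fp16 weights and gradients, plus roughly
# 12 bytes per parameter of fp32 optimizer state for Adam.
params = 40e9
gpu_memory_gb = 40                                  # e.g. a single 40 GB GPU
weights_gb = params * 2 / 1e9
grads_gb = params * 2 / 1e9
optimizer_gb = params * 12 / 1e9
total_gb = weights_gb + grads_gb + optimizer_gb
print(f"weights {weights_gb:.0f} GB + grads {grads_gb:.0f} GB + optimizer "
      f"{optimizer_gb:.0f} GB = {total_gb:.0f} GB vs {gpu_memory_gb} GB on the GPU")
# Far more than fits on one device, which is why optimizer states and
# parameters are offloaded to CPU memory in ZeRO-3 Offload.
```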
-
Alibaba Announces 10 Billion Parameter Multi-Modal AI M6
Alibaba has created an AI model called Multi-Modality to Multi-Modality Multitask Mega-transformer (M6). The model contains 10 billion parameters and is pretrained on a dataset consisting of 1.9TB of images and 292GB of Chinese-language text. M6 can be fine-tuned for several downstream tasks, including text-guided image generation, visual question answering, and image-text matching.
-
Google's Apollo AI for Chip Design Improves Deep Learning Performance by 25%
Scientists at Google Research have announced APOLLO, a framework for optimizing AI accelerator chip designs. APOLLO uses evolutionary algorithms to select chip parameters that minimize deep-learning inference latency while also minimizing chip area. Using APOLLO, researchers found designs that achieved 24.6% speedup over those chosen by a baseline algorithm.
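As a toy illustration of evolutionary parameter search, the sketch below evolves a population of made-up accelerator configurations; the parameter space, fitness function, and mutation scheme are hypothetical and are not Google's APOLLO framework.

```python
# Toy evolutionary search over hypothetical accelerator parameters.
import random

SPACE = {"pe_rows": [8, 16, 32], "pe_cols": [8, 16, 32], "sram_kb": [256, 512, 1024]}

def fitness(cfg):
    # Hypothetical stand-in: reward compute (a proxy for lower latency),
    # penalize estimated chip area.
    compute = cfg["pe_rows"] * cfg["pe_cols"]
    area = compute * 0.5 + cfg["sram_kb"] * 0.1
    return compute / area

def mutate(cfg):
    key = random.choice(list(SPACE))
    return {**cfg, key: random.choice(SPACE[key])}

population = [{k: random.choice(v) for k, v in SPACE.items()} for _ in range(20)]
for _ in range(50):                                   # evolve for 50 generations
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                          # keep the fittest designs
    population = parents + [mutate(random.choice(parents)) for _ in range(15)]

print(max(population, key=fitness))
```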
-
Google DeepMind’s NFNets Offers Deep Learning Efficiency
Google’s DeepMind AI company recently released NFNets, a family of normalizer-free ResNet image-classification models that train up to 8.7x faster than the current state-of-the-art EfficientNet. In addition, the approach helps neural networks generalize better.
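A key ingredient of the NFNets work is adaptive gradient clipping (AGC), which rescales a gradient when it is large relative to the corresponding weights. The sketch below is a simplified per-tensor version (the paper applies clipping unit-wise), with an arbitrary threshold chosen for illustration.

```python
# Simplified per-tensor sketch of adaptive gradient clipping (AGC); the NFNets
# paper applies it unit-wise, and the threshold here is arbitrary.
import torch

def adaptive_grad_clip_(params, clipping=0.01, eps=1e-3):
    for p in params:
        if p.grad is None:
            continue
        w_norm = p.detach().norm().clamp(min=eps)
        g_norm = p.grad.detach().norm()
        max_norm = clipping * w_norm
        if g_norm > max_norm:              # clip only disproportionately large gradients
            p.grad.mul_(max_norm / (g_norm + 1e-6))

# Usage: call between loss.backward() and optimizer.step(), e.g.
# loss.backward(); adaptive_grad_clip_(model.parameters()); optimizer.step()
```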
-
PyTorch 1.8 Release Includes Distributed Training Updates and AMD ROCm Support
PyTorch, Facebook's open-source deep-learning framework, announced the release of version 1.8 which includes updated APIs, improvements for distributed training, and support for the ROCm platform for AMD's GPU accelerators. New versions of domain-specific libraries TorchVision, TorchAudio, and TorchText were also released.
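For context, the sketch below shows a generic DistributedDataParallel training step in PyTorch; it is not specific to the 1.8 changes, and it assumes the script is launched with one process per device via PyTorch's distributed launcher, which sets the required environment variables.

```python
# Generic DistributedDataParallel sketch (placeholder model and backend);
# launch with one process per device, e.g. via PyTorch's distributed launcher.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="gloo")      # "nccl" on GPU clusters
    rank = dist.get_rank()
    model = torch.nn.Linear(10, 1)
    ddp_model = DDP(model)                       # gradients are all-reduced across ranks

    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)
    x, y = torch.randn(8, 10), torch.randn(8, 1)
    loss = torch.nn.functional.mse_loss(ddp_model(x), y)
    loss.backward()
    opt.step()
    print(f"rank {rank} finished a step, loss={loss.item():.4f}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```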
-
Google Open-Sources AutoML Algorithm Model Search
A team from Google Research has open-sourced Model Search, an automated machine learning (AutoML) platform for designing deep-learning models. Experimental results show that the system produces models that outperform the best human-designed models, with fewer training iterations and model parameters.