Neural Networks Content on InfoQ
-
Google's BigBird Model Improves Natural Language and Genomics Processing
Researchers at Google have developed a new deep-learning model called BigBird that allows Transformer neural networks to process sequences up to 8x longer than previously possible. Networks based on this model achieved new state-of-the-art performance levels on natural-language processing (NLP) and genomics tasks.
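The key idea described in the BigBird paper is a sparse attention pattern combining sliding-window, global, and random attention, so that cost grows roughly linearly rather than quadratically with sequence length. The NumPy sketch below builds such a mask purely as an illustration of that pattern; the function name and parameters are ours, not Google's implementation.

    import numpy as np

    def sparse_attention_mask(seq_len, window=3, num_global=2, num_random=2, seed=0):
        """Toy BigBird-style mask: True at (i, j) means token i may attend to token j."""
        rng = np.random.default_rng(seed)
        mask = np.zeros((seq_len, seq_len), dtype=bool)
        for i in range(seq_len):
            # Sliding-window attention: each token sees its local neighbourhood.
            mask[i, max(0, i - window):min(seq_len, i + window + 1)] = True
            # Random attention: each token also sees a few random positions.
            mask[i, rng.choice(seq_len, size=num_random, replace=False)] = True
        # Global attention: a handful of tokens see, and are seen by, every position.
        mask[:num_global, :] = True
        mask[:, :num_global] = True
        return mask

    print(sparse_attention_mask(16).sum(), "allowed pairs out of", 16 * 16)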
-
PyTorch 1.6 Released; Microsoft Takes over Windows Version
Facebook has released version 1.6 of PyTorch, its open-source deep-learning framework, which includes new APIs and performance improvements. Alongside the release, Microsoft announced it will take over development and maintenance of the Windows version of the framework.
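One of the headline additions in 1.6 is native automatic mixed precision (AMP) in torch.cuda.amp. The snippet below is a minimal training-step sketch assuming a CUDA device; the model, data, and hyperparameters are placeholders.

    import torch
    from torch.cuda import amp

    model = torch.nn.Linear(128, 10).cuda()        # placeholder model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scaler = amp.GradScaler()                      # scales the loss to avoid fp16 underflow

    for _ in range(10):                            # placeholder training loop
        x = torch.randn(32, 128, device="cuda")
        y = torch.randint(0, 10, (32,), device="cuda")
        optimizer.zero_grad()
        with amp.autocast():                       # forward pass runs in mixed precision
            loss = torch.nn.functional.cross_entropy(model(x), y)
        scaler.scale(loss).backward()              # backward pass on the scaled loss
        scaler.step(optimizer)
        scaler.update()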
-
Google Open-Sources AI for Mapping Natural Language to Mobile UI Actions
Google has open-sourced its AI model for converting sequences of natural-language instructions into actions in a mobile device UI. The model is based on the Transformer deep-learning architecture and achieves 70% accuracy on a new benchmark dataset created for the project.
-
Microsoft's ZeRO-2 Speeds up AI Training 10x
Microsoft open-sourced Zero Redundancy Optimizer version 2 (ZeRO-2), a distributed deep-learning optimization algorithm that scales super-linearly with cluster size. Using ZeRO-2, Microsoft trained a 100-billion-parameter natural-language processing (NLP) model 10x faster than with previous distributed learning techniques.
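ZeRO-2 ships as part of Microsoft's DeepSpeed library, where stage 2 of the ZeRO optimizer partitions optimizer states and gradients across workers. The fragment below is a rough sketch of how that is enabled through the DeepSpeed configuration; the model, batch size, and learning rate are placeholders, and the exact deepspeed.initialize signature should be checked against the DeepSpeed documentation for your version.

    import torch
    import deepspeed

    model = torch.nn.Linear(1024, 1024)            # placeholder model

    # Minimal config sketch enabling ZeRO stage 2 (optimizer-state and gradient partitioning).
    ds_config = {
        "train_batch_size": 64,
        "fp16": {"enabled": True},
        "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
        "zero_optimization": {"stage": 2},
    }

    # DeepSpeed wraps the model and optimizer into a distributed training engine.
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )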
-
MIT and Toyota Release Autonomous Driving Dataset DriveSeg
Toyota's Collaborative Safety Research Center (CSRC) and MIT's AgeLab have released DriveSeg, a dataset for autonomous driving research. DriveSeg contains over 25,000 frames of high-resolution video with each pixel labelled with one of 12 classes of road object. DriveSeg is available free of charge for non-commercial use.
-
Facebook Announces TransCoder AI to Translate Code across Programming Languages
Facebook AI Research has announced TransCoder, a system that uses unsupervised deep learning to convert code from one programming language to another. TransCoder was trained on more than 2.8 million open-source projects and outperforms existing code translation systems that use rule-based methods.
-
Uber Open-Sources AI Abstraction Layer Neuropod
Uber open-sourced Neuropod, an abstraction layer for machine-learning frameworks that lets researchers build models in the framework of their choice while reducing integration effort, so the same production system can swap in models implemented in different frameworks. Neuropod currently supports several frameworks, including TensorFlow, PyTorch, Keras, and TorchScript.
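At inference time the intent is that serving code depends only on Neuropod, not on the underlying framework. The snippet below is a rough sketch of that usage based on the project's documented Python API; the import path, model file name, and tensor name "x" are assumptions for illustration.

    import numpy as np
    from neuropod.loader import load_neuropod      # assumed import path; see the Neuropod docs

    # Load a previously packaged model; the file name and tensor names are illustrative.
    model = load_neuropod("my_model.neuropod")

    # The same call works whether the packaged model was built with
    # TensorFlow, PyTorch, Keras, or TorchScript.
    outputs = model.infer({"x": np.array([[1.0, 2.0, 3.0]], dtype=np.float32)})
    print(outputs)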
-
OpenAI Announces GPT-3 AI Language Model with 175 Billion Parameters
A team of researchers from OpenAI recently published a paper describing GPT-3, a natural-language deep-learning model with 175 billion parameters, 100x more than its predecessor, GPT-2. The model is pre-trained on nearly half a trillion words and achieves state-of-the-art performance on several NLP benchmarks without fine-tuning.
-
OpenAI Introduces Microscope, Visualizations for Understanding Neural Networks
OpenAI has released Microscope, a collection of visualizations of every significant layer and neuron in eight leading computer-vision (CV) models that are often studied in interpretability research. The tool helps researchers analyze the features and other important attributes that form inside the neural networks powering these CV models.
-
OpenAI Approximates Scaling Laws for Neural Language Models
Artificial intelligence company OpenAI has studied empirical scaling laws for language-model performance, measured by cross-entropy loss, in order to determine the optimal allocation of a fixed compute budget.
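The paper reports that test loss follows approximate power laws in model size, dataset size, and training compute; roughly, with exponent values quoted approximately from the paper's empirical fits:

    \begin{align*}
    L(N) &\approx (N_c / N)^{\alpha_N}, & \alpha_N &\approx 0.076 \\
    L(D) &\approx (D_c / D)^{\alpha_D}, & \alpha_D &\approx 0.095 \\
    L(C_{\min}) &\approx (C_c^{\min} / C_{\min})^{\alpha_C^{\min}}, & \alpha_C^{\min} &\approx 0.050
    \end{align*}

Here N is the number of non-embedding parameters, D the dataset size in tokens, and C_min the training compute when it is allocated optimally.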
-
Google Releases Quantization Aware Training for TensorFlow Model Optimization
Google announced the release of the Quantization Aware Training (QAT) API for their TensorFlow Model Optimization Toolkit. QAT simulates low-precision hardware during the neural-network training process, adding the quantization error into the overall network loss metric, which causes the training process to minimize the effects of post-training quantization.
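In practice the API wraps an existing Keras model with fake-quantization operations. A minimal sketch, assuming a placeholder Keras model:

    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Placeholder Keras model; any Sequential or functional model works the same way.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(10),
    ])

    # quantize_model inserts fake-quantization ops so that training sees,
    # and learns to compensate for, the quantization error.
    q_aware_model = tfmot.quantization.keras.quantize_model(model)

    q_aware_model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    # After q_aware_model.fit(...), the model can be converted with the TFLite converter.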
-
Google's SEED RL Achieves 80x Speedup of Reinforcement-Learning
Researchers at Google Brain recently open-sourced their Scalable, Efficient Deep-RL (SEED RL) framework for reinforcement learning. SEED RL is a distributed architecture that achieves state-of-the-art results on several RL benchmarks at lower cost and up to 80x faster than previous systems.
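SEED RL's central design choice is to move policy inference off the actors: actors only step environments and stream observations to a central learner, which runs inference for all actors in one batch on an accelerator and streams actions back (over gRPC in the real system). The toy sketch below illustrates only that batching idea; the policy, environments, and data flow are stand-ins, not Google's code.

    import numpy as np

    NUM_ACTORS, OBS_DIM, NUM_ACTIONS = 8, 4, 3
    policy_weights = np.random.randn(OBS_DIM, NUM_ACTIONS)    # stand-in policy

    def learner_batched_inference(observations):
        """Central inference over one batch holding every actor's observation."""
        logits = observations @ policy_weights
        return logits.argmax(axis=1)

    def actors_step(actions):
        """Actors only execute environment steps; they keep no copy of the policy."""
        return np.random.randn(NUM_ACTORS, OBS_DIM)            # stand-in next observations

    observations = np.random.randn(NUM_ACTORS, OBS_DIM)
    for _ in range(5):                                         # a few environment steps
        actions = learner_batched_inference(observations)
        observations = actors_step(actions)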
-
TensorFlow Quantum Joins Quantum Computing and Machine Learning
TensorFlow Quantum (TFQ) brings together Google's quantum-computing framework Cirq and TensorFlow to enable the creation of quantum machine learning (ML) models.
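A typical TFQ workflow defines a parameterized circuit in Cirq and wraps it as a Keras layer whose trainable weights are the circuit parameters. The minimal sketch below follows that pattern; the one-qubit circuit and readout operator are illustrative.

    import cirq
    import sympy
    import tensorflow as tf
    import tensorflow_quantum as tfq

    # A one-qubit parameterized circuit built with Cirq; theta becomes a trainable weight.
    qubit = cirq.GridQubit(0, 0)
    theta = sympy.Symbol("theta")
    model_circuit = cirq.Circuit(cirq.rx(theta)(qubit))
    readout = cirq.Z(qubit)

    # tfq.layers.PQC exposes the circuit as a Keras layer whose output is the
    # expectation value of the readout operator.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(), dtype=tf.string),
        tfq.layers.PQC(model_circuit, readout),
    ])

    # Input data are circuits encoded as tensors; an empty circuit is a trivial example.
    print(model(tfq.convert_to_tensor([cirq.Circuit()])))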
-
Deep Learning Accelerates Scientific Simulations up to Two Billion Times
Researchers from several physics and geology laboratories have developed Deep Emulator Network SEarch (DENSE), a technique that uses deep learning to emulate scientific simulations in fields ranging from high-energy physics to climate science. Compared to conventional simulators, DENSE emulators achieved speedups ranging from 10 million to 2 billion times.
-
MIT CSAIL TextFooler Framework Tricks Leading NLP Systems
A team of researchers at the MIT Computer Science & Artificial Intelligence Lab (CSAIL) recently released a framework called TextFooler which successfully tricked state-of-the-art NLP models (such as BERT) into making incorrect predictions.
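TextFooler works by ranking words by how much deleting them changes the target model's confidence, then greedily replacing the most important words with semantically similar substitutes until the prediction flips. The toy sketch below illustrates only that loop; the classifier and synonym table are stand-ins, whereas the real framework uses counter-fitted word embeddings plus part-of-speech and semantic-similarity checks.

    def classify(words):
        """Stand-in classifier: fraction of words carrying 'positive' sentiment."""
        positive = {"good", "great", "enjoyable"}
        return sum(w in positive for w in words) / max(len(words), 1)

    def predict(words):
        return classify(words) > 0.5                   # True = positive, False = negative

    SYNONYMS = {"good": ["fine"], "great": ["big"], "enjoyable": ["bearable"]}  # stand-in table

    def textfooler_like_attack(words):
        original_label = predict(words)
        base = classify(words)
        # 1. Importance ranking: how much does deleting each word reduce confidence?
        importance = sorted(
            range(len(words)),
            key=lambda i: base - classify(words[:i] + words[i + 1:]),
            reverse=True,
        )
        # 2. Greedy substitution, most important word first, until the label flips.
        for i in importance:
            for candidate in SYNONYMS.get(words[i], []):
                trial = words[:i] + [candidate] + words[i + 1:]
                if predict(trial) != original_label:
                    return trial                       # adversarial example found
                if classify(trial) < classify(words):
                    words = trial                      # keep the substitution that hurts most
        return words

    print(textfooler_like_attack("good great enjoyable movie".split()))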