University Researchers Publish Results of NLP Community Metasurvey
Researchers from New York University, University of Washington, and Johns Hopkins University have published the results of the NLP Community Metasurvey, which compiles the opinions of 480 active NLP researchers about several issues in the natural language processing AI field. The survey also includes meta-questions about the perceived opinions of other researchers.
-
OpenAI Releases 1.6 Billion Parameter Multilingual Speech Recognition AI Whisper
OpenAI recently released Whisper, a 1.6 billion parameter AI model that can transcribe and translate speech audio from 97 different languages. Whisper was trained on 680,000 hours of audio data collected from the web and shows robust zero-shot performance on a wide range of automated speech recognition (ASR) tasks.
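As an illustration, the open-source openai-whisper Python package exposes a simple transcription API; a minimal sketch, assuming the package is installed and using an illustrative model size and file name:

```python
# Minimal sketch using the open-source openai-whisper package (pip install openai-whisper).
# The model size ("large") and the audio file name are illustrative choices.
import whisper

model = whisper.load_model("large")        # downloads pre-trained weights on first use
result = model.transcribe("speech.mp3")    # zero-shot transcription; language is auto-detected
print(result["text"])

# The same call translates non-English speech into English text:
translated = model.transcribe("speech.mp3", task="translate")
print(translated["text"])
```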
-
Microsoft Trains Two Billion Parameter Vision-Language AI Model BEiT-3
Researchers from Microsoft's Natural Language Computing (NLC) group announced the latest version of Bidirectional Encoder representation from Image Transformers: BEiT-3, a 1.9B parameter vision-language AI model. BEiT-3 models images as another language and achieves state-of-the-art performance on a wide range of downstream tasks.
-
Google Open-Sources Natural Language Robot Control Method SayCan
Researchers from Google's Robotics team have open-sourced SayCan, a robot control method that uses a large language model (LLM) to plan a sequence of robotic actions to achieve a user-specified goal. In experiments, SayCan generated the correct action sequence 84% of the time.
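The core of the method can be summarized as a scoring rule: each candidate skill is ranked by the product of the LLM's estimate that the skill is a useful next step and a learned affordance function's estimate that the skill can succeed in the current state. A minimal sketch of that rule follows; the function names are hypothetical stand-ins, not Google's released API:

```python
# Illustrative sketch of SayCan's combined scoring rule; llm_score and affordance are
# hypothetical stand-ins for the language-model likelihood and the learned value function.
def saycan_select(skills, llm_score, affordance, instruction, state):
    """Pick the next skill by maximizing p_LLM(skill | instruction) * p_success(skill | state)."""
    return max(skills, key=lambda skill: llm_score(instruction, skill) * affordance(skill, state))
```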
-
Amazon SageMaker Provides New Built-in TensorFlow Image Classification Algorithms
Amazon has announced a new built-in TensorFlow algorithm for image classification in Amazon SageMaker. The supervised learning algorithm supports transfer learning with many pre-trained models available on TensorFlow Hub.
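As a hedged sketch, such a built-in algorithm can be launched from the SageMaker Python SDK's JumpStart interface; the model ID, instance type, and S3 path below are illustrative assumptions, not verified catalog values:

```python
# Hedged sketch using the SageMaker Python SDK's JumpStart interface; the model ID,
# instance type, and S3 training path are illustrative assumptions.
from sagemaker.jumpstart.estimator import JumpStartEstimator

estimator = JumpStartEstimator(
    model_id="tensorflow-ic-imagenet-mobilenet-v2-100-224-classification-4",
    instance_type="ml.p3.2xlarge",
)
# Fine-tune (transfer learning) on labeled images stored in S3.
estimator.fit({"training": "s3://my-bucket/image-classification/train"})
```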
-
MIT Researchers Develop AI Model to Solve University-Level Mathematics Problems
Researchers at MIT have developed an AI model that can solve problems used in university-level mathematics courses. The system uses the OpenAI Codex engine to generate programs whose output is the problem solution, including graphs and plots, achieving 81% accuracy on the MATH benchmark dataset as well as on real problems from MIT courses.
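A hedged sketch of the generate-then-execute idea, not the authors' exact pipeline: prompt a code-generation model with the problem, then run the returned program. The openai.Completion API and Codex model name reflect that era's access and may differ today:

```python
# Hedged sketch of the generate-then-execute approach (not the authors' exact pipeline).
# The openai.Completion API and Codex model name reflect the era's access and may differ today.
import openai

problem = "Find the derivative of f(x) = x**3 + 2*x."
response = openai.Completion.create(
    model="code-davinci-002",
    prompt=f"# Write a Python program that prints the solution to:\n# {problem}\n",
    max_tokens=200,
    temperature=0,
)
program = response.choices[0].text
exec(program)  # run the generated program; its printed output is the candidate solution
```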
-
Near-Optimal Scaling of Large Deep Network Training on Public Cloud
A recently published study, MiCS, provides experimental evidence that the infrastructure used for model training should be taken into account, especially for large deep neural networks trained on the public cloud. The study shows that distributing model weights unevenly across GPUs decreases inter-node communication overhead on AWS V100 and A100 instances.
-
Stability AI Open-Sources Image Generation Model Stable Diffusion
Stability AI released the pre-trained model weights for Stable Diffusion, a text-to-image AI model, to the general public. Given a text prompt, Stable Diffusion can generate photorealistic 512x512 pixel images depicting the scene described in the prompt.
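A minimal sketch using Hugging Face's diffusers library, one common way to run the released weights; the model ID and prompt are illustrative:

```python
# Minimal sketch using Hugging Face's diffusers library (pip install diffusers transformers);
# the model ID and prompt are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

image = pipe("a photograph of an astronaut riding a horse").images[0]
image.save("astronaut.png")  # 512x512 output by default
```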
-
AWS Deep Graph Knowledge Embedding for Bond Trading Predictions
AWS developed the Deep Graph Knowledge Embedding Library (DGL-KE), a knowledge graph embedding library built on the Deep Graph Library (DGL), a scalable, high-performance Python library for deep learning on graphs. AWS used DGL-KE in the machine learning systems it developed with Trumid to build a credit trading platform.
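For context, a minimal DGL example, unrelated to the Trumid system itself, constructs a graph and attaches node features; the toy graph and feature size are illustrative:

```python
# Minimal DGL usage sketch (pip install dgl torch); the toy graph and feature size are illustrative.
import dgl
import torch

# Directed graph with 4 nodes and edges 0->1, 1->2, 2->3.
g = dgl.graph((torch.tensor([0, 1, 2]), torch.tensor([1, 2, 3])))
g.ndata["feat"] = torch.randn(g.num_nodes(), 8)  # attach 8-dimensional node features
print(g.num_nodes(), g.num_edges())
```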
-
Meta Open-Sources 175B Parameter Chatbot BlenderBot 3
Meta AI Research open-sourced BlenderBot 3, a 175B parameter chatbot that can learn from live interactions with users "in the wild." In evaluations by human judges, BlenderBot 3 achieves a 31% rating increase compared to the previous BlenderBot version.
-
Berkeley Researchers Announce Robot Training Algorithm DayDreamer
Researchers from the University of California, Berkeley recently announced DayDreamer, a reinforcement learning (RL) AI algorithm that uses a world model, allowing it to learn more quickly without interacting with a simulator. Using DayDreamer, the team trained several physical robots to perform complex tasks in only a few hours.
-
Amazon's AlexaTM 20B Model Outperforms GPT-3 on NLP Benchmarks
Researchers at Amazon Alexa AI have announced Alexa Teacher Models (AlexaTM 20B), a 20-billion-parameter sequence-to-sequence (seq2seq) language model that exhibits state-of-the-art performance on 1-shot and few-shot NLP tasks. AlexaTM 20B outperforms GPT-3 on the SuperGLUE and SQuADv2 benchmarks with fewer than one-eighth as many parameters.
-
Meta Develops Dataset Pruning Technique for Scaling AI Training
Researchers from Meta AI and Stanford University have developed a metric for pruning AI datasets that improves the scaling of training error with dataset size from a power law to an exponential decay. The metric uses self-supervised learning and performs comparably to existing metrics that require more compute.
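An illustrative sketch of metric-based dataset pruning, not Meta's released code: score each example with some difficulty metric, then train only on the hardest fraction. The function and its parameters are hypothetical:

```python
# Illustrative sketch of metric-based dataset pruning (hypothetical helper, not Meta's code):
# score each example with a difficulty metric, then keep only the hardest fraction.
import numpy as np

def prune_dataset(scores: np.ndarray, keep_frac: float = 0.8) -> np.ndarray:
    """Return indices of the top keep_frac of examples ranked by difficulty score."""
    k = int(len(scores) * keep_frac)
    return np.argsort(scores)[-k:]  # the k highest-scoring (hardest) examples
```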
-
Machine Learning Systems Vulnerable to Specific Attacks
The growing number of organizations creating and deploying machine learning solutions raises concerns about their intrinsic security, argues NCC Group in a recent whitepaper, Practical Attacks on Machine Learning Systems.
-
Meta's Genomics AI ESMFold Predicts Protein Structure 6x Faster Than AlphaFold2
Meta AI Research recently announced ESMFold, an AI model for predicting protein structure from an amino acid sequence. ESMFold is built on a 15B parameter Transformer model and achieves accuracy comparable to other state-of-the-art models with an order-of-magnitude speedup in inference time.
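A hedged sketch of single-sequence structure prediction using Meta's fair-esm package, which distributes ESMFold for inference; the amino acid sequence below is an illustrative fragment, not a real benchmark target:

```python
# Hedged sketch using Meta's fair-esm package (pip install "fair-esm[esmfold]");
# the amino acid sequence is an illustrative fragment.
import torch
import esm

model = esm.pretrained.esmfold_v1().eval().cuda()

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
with torch.no_grad():
    pdb_string = model.infer_pdb(sequence)  # predicted 3D structure in PDB format

with open("prediction.pdb", "w") as f:
    f.write(pdb_string)
```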