Computer Vision Content on InfoQ
-
Google ML Kit SDK Now Focuses on On-Device Machine Learning
Google has introduced a new ML Kit SDK that works in standalone mode, without the tight Firebase integration that the original ML Kit SDK required. It also provides limited support for replacing its default models with custom ones for image labeling and for object detection and tracking.
-
Google Open-Sources Computer Vision Model Big Transfer
Google Brain has released the pre-trained models and fine-tuning code for Big Transfer (BiT), a deep-learning computer vision model. The models are pre-trained on publicly available generic image datasets and can meet or exceed state-of-the-art performance on several vision benchmarks after fine-tuning on just a few samples.
-
Google's V8 Engine Adds Support for WebAssembly SIMD
The WebAssembly SIMD proposal has come to V8, Google's JavaScript engine, albeit still as an experimental feature. By exploiting data parallelism, V8's support for SIMD (single instruction, multiple data) aims to accelerate compute-intensive tasks such as audio/video processing and machine learning.
-
Apple Acquires Edge-Focused AI Startup Xnor.ai
Apple has acquired Xnor.ai, a Seattle-based startup that builds AI models that run on edge devices, for approximately $200 million.
-
Uber's Synthetic Training Data Speeds Up Deep Learning by 9x
Uber AI Labs has developed an algorithm called Generative Teaching Networks (GTN) that produces synthetic training data for neural networks, allowing the networks to be trained faster than with real data. Using this synthetic data, Uber sped up its neural architecture search (NAS) deep-learning optimization process by 9x.
-
Facebook AI Releases New Computer Vision Library Detectron2
Facebook AI Research (FAIR) has released Detectron2, a PyTorch-based computer vision library that brings a series of new research and production capabilities to the framework. While the first Detectron was written in Caffe2, Detectron2 represents a full rewrite of the original framework in PyTorch from the ground up, with several new object detection capabilities.
-
Google Announces Updates to AutoML Vision Edge, AutoML Video, and the Video Intelligence API
In a recent blog post, Google announced enhancements to three products in its Vision AI portfolio: AutoML Vision Edge, AutoML Video, and the Video Intelligence API.
-
Waymo Shares Autonomous Vehicle Dataset for Machine Learning
Waymo, the self-driving technology company, released a dataset containing sensor data collected by their autonomous vehicles during more than five hours of driving. The set contains high-resolution data from lidar and camera sensors collected in several urban and suburban environments in a wide variety of driving conditions and includes labels for vehicles, pedestrians, cyclists, and signage.
-
New Technique Speeds up Deep-Learning Inference on TensorFlow by 2x
Researchers at North Carolina State University recently presented a paper at the International Conference on Supercomputing (ICS) on their new technique, "deep reuse" (DR), which can speed up inference for deep-learning neural networks running on TensorFlow by up to 2x, with almost no loss of accuracy.
-
University Research Teams Open-Source Natural Adversarial Image Dataset for Computer-Vision AI
Research teams from three universities recently released a dataset called ImageNet-A, containing natural adversarial images: real-world images that are misclassified by image-recognition AI. When the dataset is used as a test set for several state-of-the-art pre-trained models, the models achieve an accuracy of less than 3%.
-
Researchers Develop Technique for Reducing Deep-Learning Model Sizes for Internet of Things
Researchers from Arm Limited and Princeton University have developed a technique that produces deep-learning computer-vision models for internet-of-things (IoT) hardware systems with as little as 2KB of RAM. By using Bayesian optimization and network pruning, the team is able to reduce the size of image recognition models while still achieving state-of-the-art accuracy.
-
AWS Enhances Deep Learning AMI, AI Services SageMaker Ground Truth, and Rekognition
Amazon Web Services (AWS) announced updates to their Deep Learning virtual machine image, as well as improvements to their AI services SageMaker Ground Truth and Rekognition.
-
Google Uses Mannequin Challenge Videos to Learn Depth Perception
Google AI Research published a paper describing their work on depth perception from two-dimensional images. Using a training dataset created from YouTube videos of the Mannequin Challenge, researchers trained a neural network that can reconstruct depth information from videos of moving people, taken by moving cameras.
-
Google Announces TensorFlow Graphics Library for Unsupervised Deep Learning of Computer Vision Model
At a presentation during Google I/O 2019, Google announced TensorFlow Graphics, a library for building deep neural networks for unsupervised learning tasks in computer vision. The library contains 3D rendering functions written in TensorFlow, as well as tools for learning with non-rectangular mesh-based input data.
-
OpenAI Introduces Sparse Transformers for Deep Learning of Longer Sequences
OpenAI has developed the Sparse Transformer, a deep neural-network architecture for learning sequences of data, including text, sound, and images. The networks can achieve state-of-the-art performance on several deep-learning tasks with faster training times.