Computer Vision Content on InfoQ
-
Google Releases Google-Landmarks-V2, a Large-Scale Dataset for Landmark Recognition & Retrieval
Google has released Google-Landmarks-v2, an improved dataset for Landmark Recognition & Retrieval, along with Detect-to-Retrieve, a TensorFlow codebase for large-scale instance-level image recognition. Two companion Kaggle competitions based on Google-Landmarks-v2 were also launched. With over 200,000 landmarks in 5 million images, Google-Landmarks-v2 is the largest landmark dataset ever published.
-
Salesforce Adds Intelligence to its Einstein Services Offering
In a recent press release, Salesforce announced additions to its Einstein platform that aim to bring AI solutions to Salesforce developers and admins through low-code, point-and-click configuration. The recent additions to the platform include Einstein Translation and Einstein Optical Character Recognition (OCR).
-
Microsoft Expands the Availability of Its Cognitive Services: Anomaly Detector and Custom Vision
Microsoft recently announced the public preview of Anomaly Detector and general availability of Custom Vision. With both services, Microsoft further expands its Cognitive Services offering for its customers.
-
AWS Marketplace Offers Machine Learning Algorithms and Model Packages
Amazon Web Services is offering machine learning algorithms and model packages on its AWS Marketplace. The offering was announced at the AWS re:Invent conference last week.
-
Face-api.js: JavaScript Face Recognition Leveraging TensorFlow.js
Face-api.js is a JavaScript API for face detection and face recognition in the browser. Built on top of the TensorFlow.js core API, it implements a series of convolutional neural networks (CNNs) optimized for the web and for mobile devices.
-
Introducing EmoPy: An Open Source Toolkit for Facial Expression Recognition
In a recent blog post, Angelica Perez shared information about a new open source project for an interactive film experience. The project is called EmoPy and focuses on Facial Expression Recognition (FER) by providing a toolkit that allows developers to accurately predict emotions based upon images passed to the service.
-
Dataiku's Latest Release Integrates Deep-Learning for Computer Vision
Collaborative data science platform Dataiku's latest release of its Data Science Studio includes pre-trained deep learning models for image processing. The DSS platform implements each step of a data-science project from data-sourcing and visualization to production deployment. Its machine-learning module supports standard libraries and it integrates with Hadoop and multiple Spark engines.
-
Facebook Releases Open Source "Detectron" Deep-Learning Library for Object Detection
Recent releases from Facebook and Google implement the most current deep-learning algorithms to take a crack at the challenging problem of machine object detection.
-
How Apple Uses Neural Networks for Object Detection in Point Clouds
Apple, which recently joined the field of autonomous vehicles, has created an end-to-end neural network that segments objects in point clouds obtained with a LIDAR sensor. The approach does not rely on hand-crafted features or on machine learning algorithms other than neural networks.
-
Start-up Vicarious Defeats CAPTCHA Security with AI Inspired by Brain’s Visual Cortex
Vicarious has improved on neural networks capable of solving CAPTCHA challenges by using a novel network layout called the Recursive Cortical Network (RCN). In contrast to a normal neural network, which starts without any knowledge before training, an RCN starts with knowledge of contours and surfaces. This prior knowledge facilitates model building and generalisability.
-
Teachable Machine: Teach a Machine Using Your Camera in Your Browser
Teachable Machine is a browser application that you can train with your webcam to recognize objects or expressions. In the demo, you use your webcam as input to recognize three different classes of objects or expressions. Based on your camera input, the site shows different GIFs, plays prerecorded sounds, or plays speech. The demo is available at teachablemachine.withgoogle.com
-
Facebook Publishes New Neural Machine Translation Algorithm
Facebook’s Artificial Intelligence Research team published research results using a new approach for neural machine translation (NMT). Their algorithm scores higher than any other system on three established machine translation tasks.
-
Android Things Brings TensorFlow-Based Machine Learning and Computer Vision to IoT Devices
The recently released Developer Preview 2 (DP2) for Android Things makes it easier to use TensorFlow for machine learning and computer vision on IoT devices. Additionally, it extends USB audio for several IoT platforms, adds Intel Joule support, and enables direct use of native drivers through a new Native PIO API.
-
Google Announces Development Kit for a Tablet with Advanced Vision Capabilities
Google has announced the availability of the Project Tango Development Kit, which should allow developers to make applications that track full 3-dimensional motion and capture surfaces in the environment. The Tango development kit, created in collaboration with NVIDIA, includes the new Tegra K1 mobile processor and aims at providing a platform designed for computer vision and 3D sensing.