Microsoft made the Face, Computer Vision, and Content Moderator APIs of its Cognitive Services generally available in late April.
Cognitive Services comprises many APIs and services that let developers add image recognition, speech, translation, and other artificial intelligence and machine learning capabilities to their applications without having to build those capabilities themselves.
The Face API can detect and identify human faces. It can determine whether two images show the same person (useful for firms like Uber, which use such technology to verify their drivers). Face also organizes people into groups according to visual similarity; one example use case is putting old and young people into different categories. Once a person has been tagged, Face can identify them in new images. Face also detects the emotion shown on each face.
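The same-person check can be sketched as a single REST call. The snippet below is a minimal illustration, not Microsoft's reference client: it assumes the v1.0 `verify` endpoint layout, a placeholder `westus` region, and face IDs obtained from an earlier detection call. The helper names `build_verify_request` and `same_person` are hypothetical. The request is only built, not sent.

```python
import json
import urllib.request

# Assumed v1.0 REST layout; the region in the URL is a placeholder.
VERIFY_URL = "https://westus.api.cognitive.microsoft.com/face/v1.0/verify"


def build_verify_request(face_id1, face_id2, subscription_key, url=VERIFY_URL):
    """Build (but do not send) a request comparing two previously detected faces."""
    body = json.dumps({"faceId1": face_id1, "faceId2": face_id2}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def same_person(verify_response, threshold=0.5):
    """Interpret a parsed verify response; the threshold is an illustrative choice."""
    identical = bool(verify_response.get("isIdentical", False))
    return identical and verify_response.get("confidence", 0.0) >= threshold


# Hand-written response in the assumed shape (values are made up for illustration).
print(same_person({"isIdentical": True, "confidence": 0.9}))
```

Sending the request with `urllib.request.urlopen` requires a valid subscription key. A driver-verification flow would likely pick a stricter threshold than the one shown here.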
The Computer Vision API can tag images according to their content. For example, a sample image on Microsoft's site received the tags "water", "sport", "swimming", and "pool". The API also detected that the image contains neither racy nor adult content.
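To make the tagging and content check concrete, here is a minimal sketch of how an application might request both features and reduce the response to a short summary. It assumes the v1.0 `analyze` endpoint, a placeholder `westus` region, and a response containing `tags` and `adult` fields. The helper names and the confidence cutoff are hypothetical, and the sample response is hand-written rather than real API output.

```python
import json
import urllib.parse
import urllib.request

# Assumed v1.0 REST layout; the region in the URL is a placeholder.
ANALYZE_URL = "https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze"


def build_analyze_request(image_url, subscription_key, endpoint=ANALYZE_URL):
    """Build (but do not send) a request asking for tags and adult-content flags."""
    query = urllib.parse.urlencode({"visualFeatures": "Tags,Adult"})
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(
        f"{endpoint}?{query}", data=body, headers=headers, method="POST"
    )


def summarize(analysis, min_confidence=0.5):
    """Reduce a parsed JSON response to tag names plus the two content flags."""
    tags = [
        tag["name"]
        for tag in analysis.get("tags", [])
        if tag.get("confidence", 0.0) >= min_confidence
    ]
    adult = analysis.get("adult", {})
    return {
        "tags": tags,
        "adult": bool(adult.get("isAdultContent", False)),
        "racy": bool(adult.get("isRacyContent", False)),
    }


# Hand-written sample in the assumed response shape; confidence values are made up.
sample = {
    "tags": [
        {"name": "water", "confidence": 0.99},
        {"name": "sport", "confidence": 0.95},
        {"name": "swimming", "confidence": 0.90},
        {"name": "pool", "confidence": 0.88},
    ],
    "adult": {"isAdultContent": False, "isRacyContent": False},
}
print(summarize(sample))
```

Keeping request construction separate from response handling lets the parsing logic be exercised without a subscription key or network access.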
The Computer Vision API also has two domain-specific models you can use to recognize landmarks or celebrities.
The Computer Vision service can also describe an image with a sentence, for example: "A person sitting on a bench". Microsoft also added handwriting recognition, which can detect, segment, and read handwritten text. Microsoft shows several use cases in which the API converts Post-It notes and "don't forget" lists into computer-readable text.
The Content Moderator API can be used to moderate both text and images submitted to your app. It recognizes offensive or unwanted images and finds offensive words within images. It can already detect profanity in written text in more than 100 languages. Content Moderator also searches for possible personally identifiable information (PII). Video moderation can detect adult content in videos, although this feature is still in preview.
The APIs have been online since April 2015, initially in an alpha phase. Many features were added on the way to general availability. These APIs are among the 25 Cognitive Services APIs in the domains of vision, language, speech, search, and knowledge. All APIs can be used for free with a limited amount of data. Online demo pages let you try the features of each API on your own images. Users who want to analyze more than 30,000 images per month will pay between €0.55/$0.65 and €1.27/$1.50 per 1,000 images.
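As a rough worked example of the pricing, the sketch below estimates a monthly bill in US dollars. It rests on a loud assumption that the article does not confirm: that once a user exceeds the free 30,000 images, every image in that month is billed at the per-1,000 rate. The function name `monthly_cost_range` is hypothetical.

```python
FREE_IMAGES_PER_MONTH = 30_000
# Lower and upper per-1,000-image prices in USD, as quoted above.
PRICE_RANGE_USD_PER_1000 = (0.65, 1.50)


def monthly_cost_range(images, free_limit=FREE_IMAGES_PER_MONTH,
                       price_range=PRICE_RANGE_USD_PER_1000):
    """Return (lowest, highest) estimated monthly cost in USD.

    Assumption: above the free limit, all images are billed, not just the overage.
    """
    if images <= free_limit:
        return (0.0, 0.0)
    thousands = images / 1000
    low, high = price_range
    return (round(thousands * low, 2), round(thousands * high, 2))


# 100,000 images is 100 blocks of 1,000, so between 100 * 0.65 and 100 * 1.50 USD.
print(monthly_cost_range(100_000))  # (65.0, 150.0)
```

If only the overage were billed instead, the `thousands` line would use `(images - free_limit) / 1000`.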