In a recent blog post, Amazon announced the addition of three new features to its Rekognition service. These features enable detection and recognition of text in images, detection of up to a hundred faces in a single picture, and real-time face recognition across ten million faces.
At AWS re:Invent 2016, Amazon introduced Rekognition, a service that allows developers to add image analysis to their applications. Amazon is now investing further in this cognitive service with these new features. An early adopter of the Amazon Rekognition service is Pinterest. Their CTO, Vanja Josifovski, said:
As a visually-driven platform, Pinterest relies heavily on the speed and quality of images, but the text behind those images is just as important, as it provides context and makes Pins actionable for our 200M+ active Pinners. In working with Amazon Rekognition Text in Image, we can better extract the rich text captured in images at scale and with low latency for the millions of Pins stored in Amazon S3. We look forward to continuing to develop the partnership with AWS for high quality and fast experiences for Pinners and businesses on Pinterest.
Cloudinary, a media management platform for web and mobile developers, has incorporated the new Rekognition features into its own services. Daniel Amitai, vice president of Business Development at Cloudinary, said:
Cloudinary has worked closely with Amazon Web Services (AWS) to bring this solution to life. Our integration with Amazon Rekognition takes this relationship to the next level, eliminating the tedious work of manually analyzing image content and allowing users to automatically categorize images as part of their existing image management workflow.
Amazon’s Rekognition API provides several operations for facial recognition and image analysis, including:
- DetectFaces to detect up to a hundred of the largest faces in an image. For each face, it provides details like age range, gender, and emotions.
- CompareFaces to compare facial features. A face in the source image is compared with each of up to a hundred of the largest faces detected in the target image.
- DetectText to detect and extract text from images. The text is returned as an array of elements.
- RecognizeCelebrities to identify up to a hundred celebrities in an image. For each recognized celebrity, it provides details like the name, URL links to additional information, and match confidence.
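As a rough illustration of two of the operations listed above, the following Python sketch calls DetectFaces and DetectText through a boto3-style Rekognition client and reduces the responses to the details the list mentions. The helper names (`summarize_faces`, `extract_lines`, `analyze_image`) are hypothetical and not part of any AWS SDK. The client is passed in as a parameter, so the sketch assumes the caller creates it separately.

```python
def summarize_faces(response):
    """Reduce a DetectFaces response to age range, gender, and top emotion."""
    summaries = []
    for face in response.get("FaceDetails", []):
        emotions = face.get("Emotions", [])
        # Pick the emotion with the highest confidence, if any were returned
        top = max(emotions, key=lambda e: e["Confidence"], default=None)
        age = face.get("AgeRange", {})
        summaries.append({
            "age_range": (age.get("Low"), age.get("High")),
            "gender": face.get("Gender", {}).get("Value"),
            "top_emotion": top["Type"] if top else None,
        })
    return summaries


def extract_lines(response):
    """Return the detected text of every LINE element in a DetectText response."""
    return [
        item["DetectedText"]
        for item in response.get("TextDetections", [])
        if item.get("Type") == "LINE"
    ]


def analyze_image(client, bucket, key):
    """Hypothetical helper: run both operations on one image stored in S3."""
    image = {"S3Object": {"Bucket": bucket, "Name": key}}
    # Attributes=["ALL"] requests the full attribute set, including emotions
    faces = client.detect_faces(Image=image, Attributes=["ALL"])
    text = client.detect_text(Image=image)
    return {"faces": summarize_faces(faces), "lines": extract_lines(text)}
```

With boto3 installed and AWS credentials configured, a call might look like `analyze_image(boto3.client("rekognition"), "my-bucket", "photo.jpg")`, where the bucket and key are placeholders.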
Customers can try out the API capabilities by logging in to their AWS account.
Developers can use the Rekognition API to analyze images that are either stored as objects in Amazon Simple Storage Service (S3) or passed directly as a byte array. Rekognition supports both JPEG and PNG formats, and images can be up to 15 MB when passed as an S3 object, or up to 5 MB as a byte array. The Rekognition API can be consumed in various languages such as Java, .NET, and Python through their respective AWS SDKs. Amazon Rekognition is currently available in US regions and Europe, and pricing details are available on the AWS site.
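The two input modes and their limits can be sketched as a pair of small Python helpers that build the `Image` parameter accepted by the Rekognition operations in boto3. The function and constant names are hypothetical, and the sketch assumes that "MB" means 1024 × 1024 bytes, which may not match exactly how the service counts the limit.

```python
# Limits as stated in the article; the byte interpretation of "MB" is an assumption
MAX_S3_BYTES = 15 * 1024 * 1024
MAX_INLINE_BYTES = 5 * 1024 * 1024

JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def image_from_s3(bucket, key, size_bytes=None):
    """Build the Image parameter for an object already stored in S3."""
    # The size check only runs if the caller knows the object size
    if size_bytes is not None and size_bytes > MAX_S3_BYTES:
        raise ValueError("S3 image exceeds the 15 MB limit")
    return {"S3Object": {"Bucket": bucket, "Name": key}}


def image_from_bytes(data):
    """Build the Image parameter for raw image bytes sent with the request."""
    if len(data) > MAX_INLINE_BYTES:
        raise ValueError("inline image exceeds the 5 MB limit")
    # Check the file signature so unsupported formats fail before the API call
    if not (data.startswith(JPEG_MAGIC) or data.startswith(PNG_MAGIC)):
        raise ValueError("only JPEG and PNG images are supported")
    return {"Bytes": data}
```

Either return value can be passed as the `Image` argument of a call such as `client.detect_text(Image=...)`. Validating locally is a design choice that avoids a round trip for inputs the service would reject anyway.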
Amazon, Google, and Microsoft compete with each other through their respective Artificial Intelligence (AI) offerings in an attempt to bring more customers to their platforms. Microsoft offers over 25 APIs with its Cognitive Services, including the Emotion and Computer Vision APIs, which provide image and face analysis capabilities. Google provides the Vision API, which enables detection of objects and faces in images. Both Google and Microsoft have had these recognition services available for some time, and Amazon is now pushing its own services harder in order to catch up. According to an Investopedia article citing Bloomberg:
Among the areas where Amazon falls short is the development of AI applications that will assist cloud computing clients in parsing data, understanding speech, and recognizing images. While Amazon remains the market leader in cloud computing, Microsoft and Google are closing the gap partly by offering AI applications that are inducing clients to switch.