Microsoft recently announced the public preview of a new version of the Computer Vision Image Analysis API, which makes all visual image features, from Optical Character Recognition (OCR) to object detection, available through a single endpoint.
The Computer Vision Image Analysis API is part of Microsoft's Azure Cognitive Services offering. With the API, customers can extract various visual features from their images. The latest version, 4.0, offers a new OCR feature optimized for image scenarios, which makes OCR easy to use in user interfaces and near real-time experiences. In addition, it now supports 164 languages, including languages written in Cyrillic script, Arabic, and Hindi. The feature recognizes printed and handwritten text in image files (supported formats are .JPEG, .JPG, .PNG, .BMP, and .TIFF).
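As a small illustration of the format list above, a client could screen file names before submitting images for OCR. The following is a minimal sketch; the helper name has_supported_extension is hypothetical, and the check only looks at the file name, not at the actual image content.

```python
from pathlib import Path

# Extensions the article lists as supported by the OCR feature.
SUPPORTED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".tiff"}


def has_supported_extension(filename: str) -> bool:
    """Return True if the file name ends in one of the listed extensions.

    Hypothetical helper: it compares the suffix case-insensitively and
    does not inspect the file's contents.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this is only a convenience for catching obvious mistakes early; the service itself remains the authority on what it accepts.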
Alongside the OCR feature, a "detect people in image" feature is also in preview. All of these features are available through one API.
Microsoft also uses the API in its own products: PowerPoint, Designer, Word, Outlook, Edge, and LinkedIn use Vision APIs to power design suggestions, alt text for accessibility, SEO, document processing, and content moderation.
Besides the OCR and people detection features of Image Analysis, the company is also previewing another feature: Spatial Analysis. With this feature, developers can create applications that count people in a room, understand dwell times in front of a retail display, and determine wait times in lines.
The upgrade of the Computer Vision API is also part of Microsoft’s Responsible AI process and follows its principles of fairness, inclusiveness, reliability, safety, transparency, privacy and security, and accountability. Other public cloud companies, such as Google, follow similar principles in their own responsible AI guidelines.
The company recommends using the new version of the API going forward. Developers can use Image Analysis through a client library SDK or by calling the REST API, or they can try out the feature with Vision Studio.
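To give a rough idea of what calling the REST API might look like, here is a minimal Python sketch that uses only the standard library. The URL path, the api-version value, the feature names "read" and "people", and the JSON body shape are assumptions based on the preview REST API and should be checked against the official documentation; the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: preview API version string; verify against the documentation.
API_VERSION = "2022-10-12-preview"


def build_analyze_request(endpoint, key, image_url, features=("read", "people")):
    """Build the URL, headers, and body for a single analyze call.

    Assumptions: the path, query parameters, and feature names below
    reflect the preview REST API and are not confirmed by the article.
    """
    query = urlencode(
        {"api-version": API_VERSION, "features": ",".join(features)},
        safe=",",  # keep the comma-separated feature list readable
    )
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # resource key header
        "Content-Type": "application/json",
    }
    body = json.dumps({"url": image_url}).encode("utf-8")
    return url, headers, body


def analyze(endpoint, key, image_url, features=("read", "people")):
    """Send the request and return the parsed JSON response."""
    url, headers, body = build_analyze_request(endpoint, key, image_url, features)
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Because all features sit behind one endpoint, OCR and people detection are requested together in a single call here by listing both in the features parameter. Calling analyze requires a real resource endpoint and key.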
More details on the Computer Vision API are available on the documentation landing page and FAQs. Price details and availability can be found on the pricing page.
Lastly, Andy Beatman, a senior product marketing manager at Azure AI, revealed what will come next in an Azure blog post:
We will continue to release breakthrough vision AI through this new API over the coming months, including capabilities powered by the Florence foundation model featured in this year’s premiere computer vision conference keynote at CVPR.