Google released its beta Cloud Natural Language API on July 20, joining the movement to bring advances in natural language processing (NLP) out of the small world of cutting-edge research and into the hands of everyday data scientists and software engineers. Google’s NLP API lets users take advantage of three core NLP features:
- Sentiment Analysis - Interpretation of the tone of language, for example, positive or negative.
- Entity Recognition - Recognition of different entities like people or organizations within language.
- Syntax Analysis - Recognition of parts of speech within language, for example, sentence x has 3 nouns.
These tools are powered by Google’s deep learning algorithms, which distinguish its API from home-grown data science efforts.
NLP software, programs built to understand human speech or text, has been making its way into mainstream use through an influx of developer-friendly APIs from tech giants like Google and IBM. In a Google blog post, developer Sara Robinson uses the API’s entity recognition feature to identify key people and locations in Harry Potter (since she had no spells on hand). She goes on to describe the contrast in effort between developing and maintaining all the software involved herself versus using an NLP API:
I could write my own algorithm to find the people and locations mentioned in this sentence, but that would be difficult. And it would be even more difficult if I wanted to gather more data on each of the mentioned entities, or analyze entities across thousands of sentences, while accounting for mentions of the same entity that are phrased differently.
NLP software is especially difficult to build from the ground up, as Robinson mentions, due to the sheer amount of attention needed for data gathering, prepping, and training before most of the real development work on tooling can even begin. NLP APIs like Google’s let users take advantage of powerful NLP analytics without shouldering the heavy overhead of mathematical, engineering, and data-modeling complexity.
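To give a sense of how lightweight the API route is, here is a minimal sketch of the JSON payload the Cloud Natural Language REST API expects for entity recognition (the `documents:analyzeEntities` method). The endpoint version and the sample sentence are assumptions for illustration; authentication and error handling are omitted.

```python
import json

# Illustrative endpoint for the beta entity-recognition method; check
# Google's current docs for the right version and authentication scheme.
API_URL = "https://language.googleapis.com/v1beta1/documents:analyzeEntities"

def build_entity_request(text):
    """Build the JSON request body for an analyzeEntities call."""
    return {
        "document": {
            "type": "PLAIN_TEXT",
            "content": text,
        },
        "encodingType": "UTF8",
    }

payload = build_entity_request("Harry Potter studied at Hogwarts.")
print(json.dumps(payload, indent=2))
```

Posting that payload (with valid credentials) returns the people and locations the model found, each with a salience score, which is exactly the per-sentence drudgery Robinson describes offloading.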
Another recent NLP API addition is Watson’s Conversation API. With Watson’s API you can use NLP to interpret user commands and relay them to different pieces of smart software in your house, like app-controlled lighting. The Watson API has interactive Swagger documentation where you can test out requests like “turn on the lights” and “what’s the weather?” While Google’s NLP API is built for all-purpose NLP usage, Watson’s is focused on facilitating human-to-machine communication through text or speech. It builds on the increasing popularity of IoT (Internet of Things) technology, where NLP can serve as the perfect medium for communicating with your smart car, or home, or even toilet paper roll.
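Under the hood, a conversational API’s first job is mapping a free-form utterance to an intent. The toy, rule-based matcher below illustrates that idea only; the intent names and patterns are invented for this sketch and bear no relation to Watson’s actual model, which learns from training examples rather than hand-written rules.

```python
import re

# Hypothetical intents mirroring the sample requests above.
INTENT_PATTERNS = {
    "turn_on_lights": re.compile(r"\bturn (on|up) the lights?\b", re.I),
    "get_weather":    re.compile(r"\bwhat'?s the weather\b", re.I),
}

def classify_intent(utterance):
    """Return the first intent whose pattern matches, else None."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(utterance):
            return intent
    return None

print(classify_intent("Hey, turn on the lights"))    # turn_on_lights
print(classify_intent("what's the weather today?"))  # get_weather
```

The appeal of a service like Watson’s is that it replaces this brittle pattern-matching with a trained model that generalizes across phrasings.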
Facebook has taken a more direct route in releasing its internal NLP code to the larger tech community, posting the entire source code for fastText, its word representation learning and sentence classification library, on GitHub in July. While developers won’t have the ease of a clean API, being able to branch off of Facebook’s code provides a higher level of inclusion with the existing NLP data science community. In the hands of that community, this library could easily spawn many more NLP APIs and libraries of its own. No matter the format, it’s clear that NLP is becoming more and more accessible to the masses.
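One core idea in fastText’s word representations is enriching each word with its character n-grams, including special boundary markers, so rare and misspelled words still get useful vectors. The snippet below is a simplified sketch of that subword extraction, not code from the library itself.

```python
def char_ngrams(word, n=3):
    """Character n-grams with fastText-style boundary markers < and >."""
    padded = "<" + word + ">"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

print(char_ngrams("where"))  # ['<wh', 'whe', 'her', 'ere', 're>']
```

In fastText, a word’s vector is built from the vectors of these n-grams, which is part of why the library trains quickly while staying robust to vocabulary it has never seen whole.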