AI helps us build human interfaces based on speaking and writing instead of using a keyboard or mouse; it allows humans to stay human. The biggest challenges are finding ways to tell systems why answers are unsatisfactory so they can learn, being transparent about what data is recorded and retained, and ensuring that diversity and inclusion are part of our training data to prevent bias in AI systems.
Christian Heilmann, senior program manager at Microsoft, spoke about building human interfaces powered by AI at Codemotion Berlin 2018. InfoQ is covering this conference with Q&A, summaries, and articles.
We’ve become more accustomed to computers being a part of our lives, argued Heilmann. This is probably the last generation of people who see a computer as a keyboard attached to a screen. Talking to computers that are always there for us is becoming the norm - for better or worse.
Heilmann stated that by using an AI approach and creating interfaces that allow us to do what we do as humans - speak, write and show emotions - we create properly helpful tools without the overhead of having to learn them. To enable our interfaces to be more human, we need to add human-understandable information to the data we accumulate and the sensor readings we get.
The big issue with AI is that it is surrounded by hype that promises science-fiction interfaces that work flawlessly. Having Siri or Cortana not understand you feels like much more of a let-down than getting no results when searching a dataset with a form, argued Heilmann. We promise our end users human interfaces, so we need to make sure that our code and training models allow for human randomness and errors.
InfoQ interviewed Heilmann about applying AI in human interfaces, the benefits that AI brings, and the main challenges of software development with AI.
InfoQ: Artificial intelligence seems to have become a hot thing in software development. Why is this?
Christian Heilmann: The topic of AI has been around for a very long time, but technical realities in the past kept it from flourishing. With the technological advancements we have these days, the computations necessary to do deep learning on massive datasets have come down from months of number crunching to seconds.
We accumulate more data than ever - either consciously by taking lots of photos and recording lots of video, or automatically via sensors in every device. Where in the past we wrote programs that had explicit instructions on how to handle data, the sheer amount of information we collect requires that systems learn from the data itself and find patterns to act on.
Humans are only there to point out the outliers and mistakes. No need for us to do the boring, repetitive task of detecting patterns and sorting information, when computers are faster and better at it.
InfoQ: How can we apply artificial intelligence in human interfaces?
Heilmann: This is already happening. Photo software automatically detects people and things in photos and adds the results as metadata for easier retrieval.
When you use, for example, Google Photos for a few weeks, and search for "food" in your photos, it will find photos containing food without you ever having to describe the images. There are two main ingredients to this convenience: a lot of data and a way to detect and categorise it automatically. This is where machine learning and deep learning come into play.
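To make that tagging step concrete, here is a minimal sketch assuming torchvision 0.13+ with its bundled ImageNet weights and a hypothetical local file dog.jpg; a real photo service would use its own models and index the resulting tags:

```python
# Minimal sketch: classify a photo with a pre-trained model and store the top
# labels as metadata, so the photo becomes searchable by content.
# Assumptions: torchvision >= 0.13 is installed and dog.jpg exists locally.
import torch
from PIL import Image
from torchvision.models import resnet18, ResNet18_Weights

weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights).eval()
preprocess = weights.transforms()

image = Image.open("dog.jpg").convert("RGB")   # hypothetical input file
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    scores = model(batch).softmax(dim=1)[0]

top = scores.topk(3)
labels = [weights.meta["categories"][int(i)] for i in top.indices]

# Attach the predicted labels as metadata; a photo service would index these.
photo_metadata = {"file": "dog.jpg", "tags": labels}
print(photo_metadata)
```

Searching for "food" then becomes a lookup over these stored tags rather than over anything the user typed when saving the photo.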
In most cases we use a hosted cloud service to train our systems, as the computational overhead is pretty high. However, recent innovations in chipsets, and in languages that can take advantage of the underlying computer architecture, even make it possible to do that on-device. Where in the past we had to take a photo and send it on to a cloud service to detect that it contains the Eiffel Tower, pre-existing datasets to compare against now make it possible for our cameras to do that in real time, without any third-party or connection-speed overhead.
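As a sketch of the on-device variant, the snippet below assumes onnxruntime and a hypothetical exported model file classifier.onnx shipped with the app; inference then happens locally, with no network round trip:

```python
# Minimal sketch of on-device inference: the trained model ships with the app,
# so recognition runs locally instead of calling a cloud service.
# Assumptions: onnxruntime is installed; classifier.onnx is a hypothetical
# exported image classifier expecting a 1x3x224x224 float input.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("classifier.onnx")   # model bundled with the device
input_name = session.get_inputs()[0].name

# A real camera frame would go here; we fake one with random pixels.
frame = np.random.rand(1, 3, 224, 224).astype(np.float32)

scores = session.run(None, {input_name: frame})[0]
print("predicted class index:", int(scores.argmax()))
```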
InfoQ: What are the benefits that AI brings us when it comes to developing human interfaces?
Heilmann: The main trick here is to allow humans to stay human. For decades computers were not exciting to use, as they required us to change our ways. We needed to click the right buttons in the right order to achieve a task. We needed to remember passwords and addresses and know which program to use for different tasks. In essence, we needed to condition ourselves to software and learn how to interface with it before we could enjoy it.
When you talk to Cortana, Siri or Google, you don’t need to use a keyboard or a mouse and you can ask questions like "what is the temperature today in the capital of Denmark?" without having to know what the capital is or tell the computer what "today" means.
We have a lot of data already out there and computers can analyse it without extra work from our side. That way the extra information the computer needs to give us the right results for the questions we ask gets added automatically.
The main change here is for people to start using computers in that fashion and not expect it to fail. I am always amazed at how intelligent interfaces already are, but I am conditioned to expect computers to be stupid. When you drag a photo into PowerPoint, it creates a human-readable description of it under the hood to explain the image to search engines and non-visual users alike. For example, when I use a photo of my dog, the description "a dog sitting on a sidewalk" is created automatically. This is amazing and we should build all of our systems this way. A form that expects users to ask questions in a certain format and fails to return any results when the user makes a typo is an anachronism. We should be better than that.
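As a small illustration of that last point, a forgiving search form can accept typos instead of returning nothing; the sketch below uses Python's difflib against a hypothetical list of known terms:

```python
# Minimal sketch of a typo-tolerant search form, using fuzzy matching from the
# standard library. KNOWN_TERMS and the query are hypothetical examples.
from difflib import get_close_matches

KNOWN_TERMS = ["copenhagen", "berlin", "weather", "temperature", "restaurant"]

def forgiving_search(query: str) -> list[str]:
    """Return the closest known term for each word, so typos still yield results."""
    matches = []
    for word in query.lower().split():
        # cutoff=0.7 is an arbitrary tolerance; tune it for your own vocabulary
        matches.extend(get_close_matches(word, KNOWN_TERMS, n=1, cutoff=0.7))
    return matches

print(forgiving_search("temprature in Copenhagn"))  # -> ['temperature', 'copenhagen']
```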
InfoQ: What are the main challenges of software development with AI?
Heilmann: There are a few challenges we still need to tackle. AI is all about size and speed. In order to get a good result from an intelligent system, you need a lot of data that the system has been properly trained on, and you need to ask precise questions to get sensible results. Humans, as a whole, aren’t good at asking the right questions, so an intelligent system will often give answers that are unsatisfactory. Instead of discarding the system as a failure right after that, we need to find a way to tell the system why the answer was unsatisfactory. Machines can’t have their feelings hurt, so telling them that something is flat-out wrong is as productive as saying it was right.
The main challenge I see, though, is that we wield a lot of power and we deal with people’s personal information and sometimes even parts of their identity. As a security- and privacy-conscious person, I am worried that people give out far too much of their information in exchange for convenience. An intelligent speaker in your house is very much like the hidden microphones in hotel rooms in old spy movies. But we’re OK with our lives being recorded 24/7 for the sake of being able to ask a ubiquitous computer what the weather is like outside. It is up to us as the providers of intelligent systems not only to give great results, but also to instill a sense of ownership in the users of these systems, and to be transparent about what data is being recorded, how much of it is retained and where it goes.
We also need to be careful not to have our biases amplified by machine learning. A facial-recognition model that has only been trained on white people will tell a person of colour that they are not allowed to use the system. This is not good. We need to ensure that diversity and inclusion are part of our training data and our interfaces, and not cater only to people like ourselves or the ones we’d love to reach.
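One simple, if partial, safeguard is to measure how the training data is distributed across groups before training. The sketch below uses hypothetical annotations and an arbitrary threshold:

```python
# Minimal sketch: count how labelled training data is distributed across groups
# so an underrepresented group is caught before the model ships.
# The annotation scheme and the 20% threshold are hypothetical examples.
from collections import Counter

# In a real pipeline these annotations would come from the dataset's metadata.
training_samples = [
    {"file": "img001.jpg", "skin_tone": "dark"},
    {"file": "img002.jpg", "skin_tone": "light"},
    {"file": "img003.jpg", "skin_tone": "light"},
    # ...thousands more...
]

distribution = Counter(sample["skin_tone"] for sample in training_samples)
total = sum(distribution.values())
for group, count in distribution.items():
    share = count / total
    print(f"{group}: {share:.1%}")
    if share < 0.2:  # arbitrary threshold; pick one that fits your domain
        print(f"warning: '{group}' is underrepresented in the training data")
```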
InfoQ: Where can people go if they want to learn more about using AI in software development?
Heilmann: This is a pretty open question and a good example of how hard such questions are to answer. Most big software companies have good portals that get you started on understanding the basics, but that also let you use pre-built datasets and APIs to benefit from deep learning without having to grasp it all. Here are a few that helped me:
I am keeping an ongoing list of resources for myself to look at, AI for humans, if you’re interested.