Demystifying Machine Learning for Development Teams and Children

QCon London 2018 opened on 5th March with a keynote by Rob Harrop, titled Artificial Intelligence and Machine Learning for Software Engineers. Machine learning (ML) expertise, according to Harrop, often sits behind a wall between siloed development and data science teams. These divisions can lead to models which are divorced from an understanding of the data and its underlying domain. In addition, such divisions, together with the aura of mysticism around ML, often prevent software teams from developing their own competence. Complementing this, Dale Lane spoke in the sponsor stream and demonstrated how he has been making ML accessible to children through declarative, approachable tools combined with practical coaching around the edge cases of ML.

Harrop is the CTO of Skipjaq and an original founder of SpringSource. His QCon London 2018 keynote, viewable at qcon.ai, highlighted the dangers of reintroducing handovers between specialist silos; this time between data specialists and teams wanting to apply ML capabilities. Harrop spoke of the need to avoid introducing bias when working with data specialists who lack a contextual business understanding of the development team's bounded context.

Lane, a developer at IBM, one of the conference's sponsors, presented a demonstration of ml-for-kids which offers web-based tools targeted at educating children in machine learning. ml-for-kids builds on MIT's Scratch, a visual platform used for teaching programming. It offers intuitive interfaces that enable children to create programmable flows which incorporate ML capabilities. A simple interface allows users to train models used for image recognition, natural language processing (NLP), sentiment analysis and detecting other patterns.

Lane discussed how, by working through practical examples, he has been able to make children conscious of data quality issues such as overfitting or introducing a data bias. Using the example of training a recommendation model for theme parks or funfairs, Lane presented his class with a dataset biased towards funfairs. Since the resulting model had been overfitted towards funfairs, Lane was able to challenge the children to think about the moral consequences for individual livelihoods and the success of both businesses. He then encouraged the children to consider the recommendation of life-saving medication, and told the story of how they became better aware of the moral significance of such data bias.
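
The effect Lane describes can be reproduced in a few lines. The following is a minimal sketch and not Lane's classroom exercise or the ml-for-kids implementation: the budget figures, the labels and the `recommend` function are all invented for illustration, and a simple nearest-neighbour vote stands in for whatever model the tool actually trains.

```python
from collections import Counter


def recommend(training, budget, k=5):
    """Majority vote among the k training examples closest to `budget`."""
    nearest = sorted(training, key=lambda example: abs(example[0] - budget))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical data: (budget in pounds, place the visitor enjoyed).
# Eight funfair examples against only two theme park examples.
biased = [
    (5, "funfair"), (8, "funfair"), (10, "funfair"), (12, "funfair"),
    (15, "funfair"), (18, "funfair"), (20, "funfair"), (25, "funfair"),
    (60, "theme park"), (70, "theme park"),
]

# The same kind of data with five examples of each label.
balanced = [
    (5, "funfair"), (10, "funfair"), (15, "funfair"),
    (20, "funfair"), (25, "funfair"),
    (50, "theme park"), (55, "theme park"), (60, "theme park"),
    (70, "theme park"), (80, "theme park"),
]

# A visitor with a budget of 65 sits right between the theme park examples.
print(recommend(biased, 65))    # funfair: two theme park votes cannot beat three
print(recommend(balanced, 65))  # theme park
```

With only two theme park examples in the biased set, a vote among five neighbours can never favour the theme park, however well the visitor matches it. The fault lies in the data, not in the algorithm.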

Harrop brought up the issue of the EU's General Data Protection Regulation (GDPR), which will soon make it illegal for organisations to use data which discriminates on the basis of personal beliefs, religious background, ethnicity, sexual orientation or political affiliation. He highlighted the danger that a model may still learn inherent patterns and apply a similar bias, even when the data has been pre-filtered. To address this, Harrop said that developers will have to engineer and test towards a solution which doesn't have an unwanted bias. He felt that of "all the sociological issues around machine learning, bias is the most important one."
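
One way to picture testing for unwanted bias is a check that compares a model's outcomes across groups, even when the protected attribute is never passed to the model. The sketch below is an assumption-laden illustration, not something shown in Harrop's talk: the `model` function is a hand-written stand-in for a trained model, the applicant records are invented, and the gap in approval rates (often called demographic parity difference) is only one of several possible fairness measures.

```python
def model(applicant):
    """Stand-in for a trained model. It never reads the "group" field."""
    return applicant["postcode"] == "N1"


def parity_gap(predict, records, attribute):
    """Largest difference in positive-outcome rate between any two groups."""
    outcomes = {}
    for record in records:
        outcomes.setdefault(record[attribute], []).append(predict(record))
    rates = [sum(results) / len(results) for results in outcomes.values()]
    return max(rates) - min(rates)


# Hypothetical applicants: the protected attribute has been filtered from the
# model's inputs, but postcode still correlates strongly with it.
applicants = [
    {"postcode": "N1", "group": "x"},
    {"postcode": "N1", "group": "x"},
    {"postcode": "N1", "group": "x"},
    {"postcode": "E2", "group": "x"},
    {"postcode": "N1", "group": "y"},
    {"postcode": "E2", "group": "y"},
    {"postcode": "E2", "group": "y"},
    {"postcode": "E2", "group": "y"},
]

gap = parity_gap(model, applicants, "group")
print(gap)  # 0.5: group x is approved 75% of the time, group y 25%
```

A check like this could run alongside ordinary unit tests, failing the build when the gap exceeds a threshold the team has agreed on. Choosing that threshold, and the measure itself, remains a judgement call for the team and its domain experts.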

Both Harrop and Lane talked about the mystification of machine learning in society. Harrop reminded the audience that while there is often a focus on the need for data specialists who understand the underlying theories, for most use cases ML is just another software engineering activity. Lane, when taking questions, described how the barrier to entry can be lowered even further: less-technical teachers were able to start to understand, teach and effectively utilise applied ML through the ml-for-kids framework.

Harrop summarised his talk by pointing out that machine learning is a key competitive advantage, but at its heart it is mostly software engineering. He warned the audience to avoid the mistake of old, stating:

Don't try to have a data science team and a software team. Co-locate those things together. Make sure everyone has an understanding of what everybody else is doing.

Lane shared a number of immediately accessible web-based ML platforms which children and adults can start experimenting with.
