Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics, by Brenda L. Dietrich, Emily C. Plachy, and Maureen F. Norton, is a collection of experiences from analytics practitioners at IBM.
The book opens with an introductory chapter, followed by nine chapters of case studies organized by functional unit, and closes with a chapter of reflections and future outlook. The authors interviewed more than 70 people, including executives and practitioners, for this book.
The case studies span a diverse range of functions, from finance to HR to marketing and Information Technology. What they all have in common is that they show, in easy-to-understand terms, why and how someone would use Big Data analytics to bring business value to their organization.
InfoQ spoke with the authors about the lessons learned from the book, IBM's arsenal of Big Data technologies, and the future of analytics.
InfoQ: What was the motivation to write this book?
Dietrich, Plachy & Norton: IBM employees are often interviewed for books or articles on analytics; after all, we provide many of the tools in this space. Additionally IBM employees have written articles and given conference presentations about specific analytics projects, but there was no single place where interested readers could see the range and variety of the work. We had developed a customer-focused presentation on IBM’s use of analytics to support its transformation, and in the presentation we saw the makings of a book. So we decided to tell the story, or at least the story so far.
InfoQ: In your book, you mention the four V's of Big Data: Volume, Variety, Velocity and Veracity. Of the four V's, which do you believe are the toughest to tackle with existing systems? Which are the most commonly observed?
Dietrich, Plachy & Norton: In the book we discuss dealing primarily with Variety and Veracity. IBM has dealt with Volume and Velocity for many years, typically through improvements in the underlying hardware and software products that collect, store, manage, and provide access to data. Volume and Velocity can usually be addressed outside of the specific application by simply distributing and accelerating the underlying atomic computing functions. Dealing with Variety and Veracity is harder, and the methods used may need to be customized for the data and the context in which the data is being analyzed. Human-generated text as a data source, with all of its errors, alternatives and ambiguities, is one area in which special care is needed when creating an application. And increasingly, IBM and other enterprises are using human-generated text to enrich the analysis of transactional data to provide better decision-making capability.
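To make the Veracity point concrete, consider the simplest version of the problem: mapping noisy, human-typed strings to canonical values before analysis. The sketch below is an illustration only, not IBM's method; the canonical names, the `normalize` helper, and the similarity cutoff are invented for the example, and real pipelines would use domain dictionaries and dedicated entity-resolution tooling.

```python
import difflib

# Hypothetical canonical names for this sketch; a real application would
# load a domain-specific dictionary instead.
CANONICAL = ["ThinkPad X1 Carbon", "ThinkPad T14", "IdeaPad Flex 5"]

def normalize(noisy, cutoff=0.6):
    """Map a noisy, human-typed string to its closest canonical form,
    or return None when no candidate is similar enough."""
    lowered = {c.lower(): c for c in CANONICAL}
    matches = difflib.get_close_matches(noisy.lower(), lowered, n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None

print(normalize("thinkpad x1 carbn"))     # a close misspelling resolves
print(normalize("completely unrelated"))  # no confident match -> None
```

The cutoff embodies the Veracity trade-off: set it too low and bad matches contaminate the analysis; set it too high and legitimate variants are discarded.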
InfoQ: In past years, data warehouses were kept separate from transactional data stores. With the emergence of NewSQL technologies, do you see this trend reversing? How do IBM's technologies fare in this new landscape?
Dietrich, Plachy & Norton: Yes, database technologies are emerging that lessen the need to have data warehouses separate from transactional databases. For example, IBM’s DB2 Cancun release (10.5.0.4) has a new feature called Shadow Tables that enables applications to create BLU shadow copies of existing row-organized tables. While the transactional workload continues to use the row-organized tables, complex analytical queries are automatically rerouted to the BLU shadow tables for much better reporting performance. This is an important step towards better coexistence of transaction processing and analytics in the same database.
InfoQ: In your book you describe several case studies of analyzing customer data to derive business decisions. How can we derive causation from correlation in these data sets? Do we care?
Dietrich, Plachy & Norton: The second question is key – do we care? In the era of big data and analytics, correlation is often enough. While we often want to understand causation, it is not always possible. An example from our book is the employee-retention case study in Chapter 2. Through predictive modeling, we are able to act on the correlation that we see in the data and reduce voluntary attrition without determining why an employee is likely to leave.
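The idea of acting on correlation without a causal story can be sketched in a few lines. The data, feature, and threshold below are all invented for illustration; this is not the book's actual retention model.

```python
# Toy illustration: act on a correlation between a feature and attrition
# without claiming causation. All numbers here are made up.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Historical records: (weekly overtime hours, left within a year: 1/0)
history = [(2, 0), (3, 0), (1, 0), (12, 1), (10, 1), (4, 0), (11, 1), (9, 1)]
overtime = [h for h, _ in history]
left = [l for _, l in history]

r = pearson(overtime, left)
print(f"correlation(overtime, attrition) = {r:.2f}")

# A strong correlation is enough to act: flag current employees above a
# threshold for retention outreach. No causal claim is needed.
if r > 0.5:
    current = {"A": 3, "B": 13, "C": 8}
    at_risk = [name for name, hrs in current.items() if hrs >= 9]
    print("flag for retention outreach:", at_risk)
```

The point of the sketch is that the intervention (a retention conversation) is triggered by the observed association alone, exactly as the answer describes: the business acts on the signal without first establishing why it exists.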
InfoQ: In your book you discuss how explaining to outside parties how the results were obtained helps drive adoption of the business recommendations. How can IBM's tools help in that direction?
Dietrich, Plachy & Norton: While people seem willing to accept other advances in technologies as something that just works to make life better, without feeling that it is necessary to understand the details behind how and why it works, their reaction to data or mathematically based computerized methods is sometimes different, especially when these methods are used to enhance something a human already does, like make decisions. We’ve found that in addition to presenting the “answer”, providing access to the evidence supporting the answer, and in some cases to the logic by which the answer was reached, helps the user build trust in the method that produces the answer. The Watson Q&A technology, which often produces multiple possible answers, with evidence and confidence associated with each, represents a major step forward. And the Watson Analytics offering, which allows intuitive navigation of the data as well as the analysis of the data, will make analytics accessible to a much wider set of decision makers.
InfoQ: Several authors, including you, suggest sharing and collaborating on analytics assets across business units to drive maximum value. What is the best way to convince different data silos in a corporation to share data with each other?
Dietrich, Plachy & Norton: We use the Analytics Quotient (AQ) assessment, which outlines several phases that companies go through on their analytics journey. The only way to attain the “master” level is by sharing and collaborating on analytics assets across business units. Some organizations use the AQ assessment to set goals and measure results in this direction. When people have a clear understanding of the path ahead, they are often more receptive to collaborating. As always, strong leadership is key to successfully creating an analytics culture in which assets are shared.
InfoQ: Cognitive computing is a hot topic and one that IBM definitely has a great deal of experience with. With Watson in the cloud, what are the areas in which you see greater potential for growth and insights?
Dietrich, Plachy & Norton: As Watson capability becomes accessible through cloud services, I expect that we will gain a much greater understanding of how individuals use information to gain understanding and make decisions. I can see Watson technology not only supporting the information exchange functions performed by companies, such as call centers, but also being used to gain a more complete understanding of the customers’ interests, needs, and priorities, and lead, over time, to better choices and better service.
InfoQ: Are there other books on the subject that you would recommend as a follow-up or complementary to yours?
Dietrich, Plachy & Norton: There are many books on the topic but few that provide an in-depth corporate wide perspective with real-world applications, which is what we set out to do. One that our department recently read that we found very useful is How to Measure Anything, by Douglas W. Hubbard, a great primer for expanding your thinking of what is possible.
About the Book Authors
Dr. Brenda L. Dietrich is an IBM Fellow and Vice President. Since joining IBM in 1984 she has worked with numerous business units, applying analytics to decision processes. She led Mathematical Sciences in IBM Research for over a decade. She was a president of INFORMS, is an INFORMS Fellow, and is a member of the National Academy of Engineering. She holds a BS in mathematics from UNC and an MS and PhD in operations research/information engineering from Cornell. Her personal research includes manufacturing scheduling and services resource management. She currently leads the emerging technologies team in the IBM Watson group.
Dr. Emily C. Plachy is a Distinguished Engineer in Business Analytics Transformation, responsible for leading the use of analytics across IBM. Since joining IBM in 1982, she has held a number of technical leadership roles in IBM, including development, research, technical sales, and services. Her technology skills include data and enterprise integration. She holds a BS in applied mathematics and a DSc in computer science from Washington University and an MSc in computer science from the University of Waterloo. She served as president of the IBM Academy of Technology. She is a member of WITI, SWE and INFORMS.
Maureen Fitzgerald Norton, MBA, JD is a Distinguished Market Intelligence Professional and Executive Program Manager in Business Analytics Transformation at IBM. She leads initiatives to bring more science to decision making. Maureen pioneered the development of an outcome-focused communications strategy to drive the culture change needed for analytics adoption. Maureen created analytics case studies and innovative learning exercises and taught big data and analytics workshops for MBA students in Europe and the Middle East. Previously Maureen led analytic project teams in public safety, global social services, commerce, and merchandising, specializing in cost/benefit analysis and return on investment.