Jim Johnson is the founder and chairman of the Standish Group, a globally respected source of independent primary research and analysis of IT project performance. He is best known for his research on why projects fail, as well as on system costs and availability. He is also a pioneer of modern research techniques such as virtual focus groups and case-based analytical technology.
The Standish Group's claim to fame is, of course, The CHAOS Chronicles, comprising 12 years of research - focus groups, in-depth surveys and executive interviews - on the project performance of over 50,000 completed IT projects. The objectives of CHAOS research are to document the scope of application software development project failures, the major factors behind failure, and ways to reduce failure. In 1994, The Standish Group made public its first CHAOS Report, documenting the billions of dollars wasted on software development projects that were never completed. It has been among the most frequently quoted reports in the industry ever since.
Taking time out from his vacation, Jim Johnson spoke with me this week about how this research came to be, how it is done, and the role of Agile in his findings. We were joined by Gordon Divitt, Executive Vice President of Operations for the Acxsys Corporation, a software industry veteran who has attended CHAOS University events since they began.
Deborah Hartmann for InfoQ: How did the first CHAOS Report come about?
Jim Johnson: We've been doing research for a long time; let me tell you how we did it.
The first study was triggered by an IBM class we were holding in Belgium - we had 100 people or so, we were trying to track sales of middleware, and it just didn't track right. I mean, if you sell a development package, you can expect to sell a certain number of runtime licenses. But we weren't finding what we expected, so we started asking people why they were not buying licenses. They said that their projects weren't getting done. At the time, the stories we heard indicated that around 40% were cancelled. People indicated a real problem.
So we started doing focus groups and the like to get feedback, so we could determine how to frame this. We did surveys and tested to see if the sample was right, tweaking and improving until we had a balanced sample crossing a variety of industries and organization sizes.
InfoQ: Do you think the CHAOS sample is representative of application development in general?
JJ: Yes.
InfoQ: So, does it include small software development shops, for example?
JJ: Well, no. Here's how it breaks down: government and commercial organizations only - no vendors, suppliers or consultants. So Microsoft isn't in our sample. And organizations of all sizes, down to roughly US$10 million, with a few exceptions, like Gordon.
Gordon Divitt: The CHAOS University events seem to me somewhat biased toward bigger organizations?
JJ: We do have big clients. But in the survey, we try to make sure we cover the industry. We're not simply interviewing clients of Standish services - we pay people to fill out the survey; it's completely separate. You know, like church and state :-)
InfoQ: You pay people to do your survey?
JJ: Oh, they come to focus groups and we pay them for their time. If they fill out a survey we pay them an honorarium or give them a gift. It's a tradition in industry studies. This way we are "clean" and not getting biased info. If you don't pay, you will get a bias toward those trying to influence the study. Paying them keeps it neutral. However, about 25% donate their honorarium to charity - most of the government people will not take the money.
We do get case info, where they pay us to do the analysis, but we can't use this data in our case write-ups.
The key to our ability to get this data is that it's never attributed to a company - it's always aggregated, completely confidential, we never give out the names. Otherwise we couldn't get the data.
GD: It's an asset of the company.
JJ: Right, and we wouldn't let anyone see real data.
InfoQ: Your resulting demographics information is spelled out on your site. But what I think is missing is this: how are the companies in your database selected? Since you are particularly aimed at understanding development failures, do you look for companies with spectacular failures, or lots of failures? Are you selecting customers with more or larger failures than the general IT community experiences? Or are they self-selecting?
Ok, the short version :-) : Is your sample representative of application development failures in general?
JJ: We might profile a big failure as a case study; we're definitely looking for instructive failures as case studies. But not as part of the data for a survey. We're not asking for 'failure' projects. Our original study was a broad-based mass mailing, perhaps somewhat random because of the low response level it received. [editor's note: the 1994 study represented over 8000 application development projects].
Now, we invite people to participate in our research using our SURF database, and we have certain entry criteria. Participants...
- must have access to certain project data,
- must already be running applications,
- must be running particular platforms.
The database currently has around 3000 active members. I suppose you might ask if they are self-selecting into SURF because they have problems - but I don't see it. I think they may be self-selecting to get to see the data, which is for members only. But I don't think there's any bias in there... I mean, we do a lot of looking around, cleaning up.
InfoQ: Cleaning up? What do you mean?
JJ: We'll look at the data, and phone up to clarify, or exclude something that doesn't look quite right, or that we're not sure has the veracity we want. We're not just filling a database; we want clean data.
InfoQ: As you know, I've been writing a news item on Robert Glass' questions about the CHAOS findings. Have your numbers been challenged before?
JJ: Not really. Most of the time I hear "the numbers are too optimistic" - people are somewhat surprised. Their response to the absolute failure rate of 18% (cancelled, or finished but not used) is that they find it not unreasonable - if people are pushing themselves to get ahead, a certain number of projects will fail, because they are on the edge.
People know that the more common scenario in our industry is still: over budget, over time, and with fewer features than planned. Most of the comments I get on that are: "I don't have ANY that come in on time..."
Our demographics have been presented in hundreds of cities around the world, and they're on the web. It doesn't take a genius to know our methodology; it's always been public.
GD: And no one from the business side has ever come back to challenge it - people have been depending on it, attending events and doing the survey for 12 years now. They keep coming back; that says something.
There's no doubt in my mind that the process is transparent... as a regular at CHAOS, I can attest to the openness of Standish. And, more importantly in my mind, to the acceptance of the data by their clients - as shown by the ongoing attendance of participants who have long associations with the group. If there were anything "rotten in Denmark", it would have been apparent well before now.
InfoQ: You've mentioned to me that people tend to roll "failed" and "challenged" into a single figure - I've been guilty of that one - and that you really don't recommend it. Can you say a few words on that?
JJ: I do think many people lump challenged and failed into the same bucket. This is misleading.
The problem is that you're lumping projects that could be of value in with those that are of no value. I like to think in terms of the value ratio. A project that overruns by $1,000,000 might still have provided value, because the new system might offer gains that exceed the overrun. On the other hand, there is a lot of waste within challenged projects. We are trying to distinguish project failures from project management failures, which may still deliver value. This is a subject of current interest to me, and our recent surveys include this topic specifically.
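[editor's note: a hypothetical illustration of the value ratio Johnson describes, not a Standish figure: a project budgeted at $2 million that overruns to $3 million but delivers $5 million in business benefit still returns roughly $1.67 of value per dollar spent, while a cancelled project returns nothing at all.]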
You can have a "success" that breaks all three of the parameters - but it's still a very vibrant, successful project. We're always asking: "How would a reasonable person categorize this project?" We don't want to penalize... but if it's really over, we'd call it challenged. It's difficult; it's not cut and dried.
InfoQ: Right, projects change so much over time, it must be difficult to get good information about original and final scope, for example?
JJ: We work hard to try to figure out what's a truly challenged project versus a successful project. Then there's sandbagging: overstating project budget to avoid failure - we have to watch for that, too.
InfoQ: Your work on project failure is really a search for "how to succeed", isn't it? I noticed your list of Project Success Factors includes #5: Agile Development. As in, Agile Software Development?
JJ: Oh, absolutely! I'm a big believer in Agile, having introduced the iterative process in the early '90s and followed up with the CHAOS reports on quick deliverables. We're a real flag-waver for small projects, small teams, and Agile process. Kent Beck has talked at CHAOS University, and I have talked at his seminars. I am a real fan of XP. My new book, "My Life is Failure", looks at Scrum, RUP and XP, and discusses those methods.
GD: Agile has helped by chunking projects into small pieces. Even if you start to get it wrong, you know it early on. Not like the old waterfall projects where you go into a cave and find out the bad news two years later...
JJ: I think that's the secret - incremental. I think that's why we're seeing improvement. [editor's note: here are some summary graphs which Jim uses in his talks, from the 2004 survey]
A big problem was project bloat, causing projects to go over time and over budget, and creating features and functions that aren't required - especially in government. Agile really helps with this - small increments. You know: they say "I need this", but by the next sprint: "it's not as important as I originally thought it was."
GD: Yes, prioritizing features helps... always asking: what's the business advantage? Identifying low value items and putting them at the end... and then maybe the project never gets there, so you didn't need them after all.
InfoQ: Have you collected any data in relation to Agile in your research?
JJ: Yes, we've tried that. We hope we can show some of that. We've been putting it in our surveys, but not everyone fills it out - it's difficult to get clean data on that. We started trying two surveys back... hopefully this time we can. In the next few days we'll be doing a last push to SURF members to close the 2006 survey.
InfoQ: Agile must bring in new issues: how do you say when "planned" scope is accomplished, for a project using adaptive planning?
JJ: That's a good question. With companies like Webex, Google, Amazon, eBay, it's a challenge - they're doing something called "pipelining" instead of releases. "Whatever is done in 2 weeks, we'll put that up." They're successful because users only get small, incremental changes. And their implementation is proven - remember, some projects only fail after the coding is finished, at the implementation step.
InfoQ: Pipelining? Sounds like variable scope...
JJ: Yes. It's very enlightening to work on variable scope - it makes people a lot happier to see things getting done. People like to see progress and results, and Agile embodies that.
GD: For those companies, their user groups are young, geeky early adopters - they love change, love to try a new thing. Big banks don't have the same appetite for change. That's my concern about that process... is it robust?
JJ: Not every process works for every project. Agile is difficult for some to implement because of their culture.
GD: One of the things I like about Agile process is that you do something, see it, users can say: "I like it," or "I don't, can you move this?" or "no, that's not the way to calculate it..." You have rapid feedback and the quality downstream is so much better. With waterfall even if you can get through the project and get it written as you thought, you have all these quality issues to search out afterwards. With Agile testing and feedback, quality is better now.
JJ: People think Agile is 'loosey goosey'... If people really looked at it, they'd see it's just as stringent as waterfall or anything James Martin could put together. To say there have not been major improvements in the last dozen years is just plain foolishness. We have made great progress.
InfoQ: Well, I promised that would be the last question :-) Who gets the last word?
GD: You know, Jim is very much a seeker of the truth, as is the whole gang he works with. That's what keeps me going back.
InfoQ: Jim, Gordon, thanks very much for taking time out to talk with me!
Related News:
- Standish CHAOS Report Methods Questioned, August 2006.
- Standish: Why were Project Failures Up and Cost Overruns Down in 1998? October 2006