Peter C. Rigby and Daniel M. German of the Software Engineering Group at the University of Victoria, working under the supervision of Margaret-Anne Storey, have released A case study of Apache peer review, which has also been submitted to FSE2007. The paper is a follow-up to A preliminary examination of code review processes in open source projects, a technical report Rigby and German released in 2006 that analyzed the code review processes of Linux, GCC, Mozilla, and Apache.
The new case study targets Apache specifically and seeks to answer the following research questions:
- Process: What types of review are performed and what are the processes for conducting those reviews?
- Frequency and Activity: How often are the reviews performed? Is the frequency of review related to the development activity?
- Participation: How many developers participate in the review? How much discussion is there during a review?
- Size: What is the size of the artifact under review?
- Interval: How long do reviews take to perform?
- Defects: How many reviews find defects?
In answering these questions, the paper presents both the CTR and RTC review processes used by the Apache project, as defined in the Apache glossary:
Commit-Then-Review (CTR): A policy governing code changes that permits developers to make changes at will, with the possibility of being retroactively vetoed. C-T-R is an application of decision making through lazy consensus. The C-T-R model is useful in rapid-prototyping environments, but because of the lack of mandatory review it may permit more bugs through than the R-T-C alternative.
Review-Then-Commit (RTC): A commit policy which requires that all changes receive consensus approval in order to be committed.
'Consensus approval' refers to a vote that has completed with at least three binding +1 votes and no vetoes.
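As a rough illustration only (not taken from the paper or from any Apache tooling), the consensus-approval rule can be expressed as a small check over a set of cast votes. The `Vote` representation and the treatment of a binding -1 as a veto are assumptions made for this sketch:

```python
from dataclasses import dataclass

@dataclass
class Vote:
    value: int     # +1, 0, or -1
    binding: bool  # True if the voter's vote is binding (assumption for this sketch)

def has_consensus_approval(votes: list[Vote]) -> bool:
    """Consensus approval: at least three binding +1 votes and no vetoes.

    Here a veto is modeled as a binding -1 vote; this is an assumption
    for illustration, not a statement of Apache's exact voting rules.
    """
    binding_plus_ones = sum(1 for v in votes if v.binding and v.value == +1)
    vetoed = any(v.binding and v.value == -1 for v in votes)
    return binding_plus_ones >= 3 and not vetoed

# Three binding +1 votes and no vetoes -> approved.
print(has_consensus_approval([Vote(+1, True), Vote(+1, True), Vote(+1, True)]))  # True
# A single binding -1 blocks the change regardless of how many +1s it has.
print(has_consensus_approval([Vote(+1, True), Vote(+1, True), Vote(+1, True), Vote(-1, True)]))  # False
```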
Included in the analysis of the review processes is a discussion of the strengths and weaknesses of each. Unlike the Apache project, commercial development environments tend to favor the RTC process for code reviews, often on risk-management grounds: it is considered better to lose a change than to let a bad change into the source base. This discussion could be used to make a convincing argument for the viability of CTR in commercial environments. The analysis of the CTR process is also valuable because data on that process is scarce.
Also under analysis is the size of the artifact being reviewed. Apache (and other projects) take a “Review Early, Review Often” approach, and Apache in particular “performs reviews at drastically higher frequency, shorter interval, and with smaller artifact size” in comparison with Porter et al.’s findings. The trade-offs associated with these smaller artifacts become an important topic of discussion.
Another critical discussion focuses on the mediation of defects:
With formal review techniques, the discussion centers around defects. A good mediator does not allow reviewers to start discussing anything but defects. The developer must fix the defect and report to the mediator when the problem is solved.
The paper then contrasts this with Apache’s methodology, in which
The reviewers are not interested in the defect, but in what caused the defect and how it can be fixed. The discussion immediately turns from “we found a defect” to “what is the fix for the defect”.
The paper concludes that the “ideal time to find a solution to a defect is when it is found because all participants understand the problem”. It also finds that “this is not possible in traditional inspection due to time constraints”, drawing a specific comparison with an example from Lessons from Three Years of Inspection Data by Edward F. Weller.
Although some question the data-harvesting methodology and the lack of interaction with Apache team members, the papers offer an interesting set of discussions comparing and contrasting various review methodologies and should be of use to anyone setting up a review process in either an open source or a commercial environment.