CI better but costlier than TDD

Tweet: Code inspection more effective than TDD at reducing software defects, but also more expensive

Tags: Code inspection, Test-driven development, Quasi-experiment, Student subjects

Paper: Jerod W. Wilkerson, Jay F. Nunamaker, Rick Mercer, "Comparing the Defect Reduction Benefits of Code Inspection and Test-Driven Development," IEEE Transactions on Software Engineering, vol. 38, no. 3, pp. 547-560, May-June 2012, doi:10.1109/TSE.2011.46

Background

B1. Code inspections have been studied extensively and shown to be effective at reducing software defects
B2. Agile proponents often claim test-driven development (TDD) is very effective
Problem: Previous research is unclear on whether TDD is a good complement or alternative to code inspection

Method

M1. Had 58 undergraduate computer science students implement a part of a spam filter in Java given a detailed specification
M2. Students were divided into four groups: Inspection, TDD, Inspection+TDD, and Control (neither Inspection nor TDD)
M3. Based on a pre-test of Java and OO programming, data were collected only for the 40 highest-scoring students; for various reasons, only 29 of them could be included in the final analysis
M4. Inspections (in the style of Fagan) were performed by the same 3 students using an online, collaborative inspection tool
M5. Scenario-based reading was used for inspection preparation with each inspector focusing on (but not limiting themselves to) a specific defect type: "missing functionality", "incorrect functionality", or "incorrect Java coding"
M6. All students had a lecture on JUnit and TDD and had a one-week programming assignment involving TDD prior to the experiment
M7. The number of remaining defects was then measured as the sum of failing acceptance tests and defects found in a measurement inspection (by 2 "new" inspectors)

Results

R1. Code inspection was more effective (-23%) than TDD (-11%) at reducing defects, compared to the control
R2. Code inspection + TDD reduced defects even more, but this result was not statistically significant
R3. Code inspection took 63% more man hours than the control, while TDD took 14% fewer
R4. TDD needs a clearer definition, since it currently permits too much variation for detailed comparisons

Limitations/Threats

L1. Students (undergrad) rather than professional developers
L2. Inspectors were also students without professional inspection experience

Raw numbers

D1. 40 undergraduate students (mostly junior students)
D2. 11 students excluded so only data for 29 students used in analysis
D3. An average of 550 lines of code was developed, with little variation between groups
D4. The same 3 students (not included among the developers) performed all code inspections after a 4-hour training session
D5. Mean inspection rate was 180 non-comment source statements per hour with a maximum of 395.
D6. The Control group had 16.15 remaining defects on average, TDD reduced this to 14.37 (-11%), Inspection reduced it to 12.37 (-23%), and Inspection+TDD to 8.21 (-49%).
D7. The Control group took 12.79 man hours on average, TDD reduced this to 11.01 (-14%), while Inspection increased it to 20.88 (+63%).
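
The percentages in D6 and D7 follow directly from the group means. A minimal sketch of that arithmetic, assuming each percentage is the relative change from the Control mean rounded to a whole percent (the helper name percent_change and the dictionary layout are illustrative, not taken from the paper):

```python
# Group means as reported in D6 (remaining defects) and D7 (man hours).
CONTROL_DEFECTS = 16.15
CONTROL_HOURS = 12.79

defects = {"TDD": 14.37, "Inspection": 12.37, "Inspection+TDD": 8.21}
hours = {"TDD": 11.01, "Inspection": 20.88}


def percent_change(baseline: float, value: float) -> int:
    """Relative change from the baseline, rounded to a whole percent.

    Hypothetical helper: assumes the summary's percentages are
    (value - baseline) / baseline, which matches the reported figures.
    """
    return round((value - baseline) / baseline * 100)


defect_change = {g: percent_change(CONTROL_DEFECTS, v) for g, v in defects.items()}
hours_change = {g: percent_change(CONTROL_HOURS, v) for g, v in hours.items()}

for group, change in defect_change.items():
    print(f"{group}: {change:+d}% remaining defects vs. Control")
for group, change in hours_change.items():
    print(f"{group}: {change:+d}% man hours vs. Control")
```

Under this assumption the computed changes match the figures quoted in D6 and D7 (-11%, -23%, -49% for defects; -14% and +63% for man hours).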
Tiny SE - latest SW Eng research in <140 chars