Carrying out empirical studies is slowly becoming widely held to be of importance by the software engineering community. A view perhaps less widely held is that experiments should be replicated externally to both verify and generalise the original results. This paper serves a number of purposes. The need for external replications is established and the role of replication in experimental software engineering is discussed. Without the confirming power of external replications, results in experimental software engineering should only provisionally be accepted, if at all. The paper then draws heavily on the authors' experiences in externally replicating three software engineering experiments ([Kor86, PVB95, KL96]) to provide guidance on three, relatively neglected areas: the improving of experimental 'recipes' during replication to either focus or generalise results, the use of alternative data analysis techniques, specifically rule induction, to seek alternative explanations when the results of replications differ, and packaging of experiments for replication. The facilitation of replication through properly constructed replication packages places formidable but largely unrewarding demands on the original experimenter. Yet without appropriately constructed and documented replication packages systematic and cumulative empirical investigation will be difficult to achieve. It is demonstrated that a modest generalisation of the characterisation scheme, or framework, proposed by Lott and Rombach [LR96], will serve well as the basis for the reporting, packaging and recipe improvement of software engineering experiments for the purposes of external replication. It also demonstrates the insights that can be achieved from seeking explanations beyond those provided by purely statistical analyses.
|
2489
|
Induction of Decision Trees
– Quinlan
- 1986
|
|
246
|
The TAME Project: Towards Improvement-oriented Software Environments
– Basili, Rombach
- 1988
|
|
185
|
The KDD Process for Extracting Useful Knowledge from Volumes of Data
– Fayyad, Piatetsky-Shapiro, et al.
- 1996
|
|
161
|
Specifying Software Requirements for Complex Systems: New Techniques and Their Applications
– Heninger
- 1980
|
|
115
|
Comparing Detection Methods for Software Requirements Inspections: A Replicated Experiment
– Porter, Votta, et al.
- 1995
|
|
107
|
Experimentation in Software Engineering
– Basili, Selby, et al.
- 1986
|
|
97
|
Building Knowledge through Families of Experiments
– Basili, Shull, et al.
- 1999
|
|
68
|
Should Computer Scientists Experiment More?
– Tichy
- 1998
|
|
64
|
Does Every Inspection Need a Meeting?
– Votta
- 1993
|
|
63
|
Needed: An empirical science of algorithms
– Hooker
- 1994
|
|
51
|
Studying programmer behavior experimentally: the problems of proper methodology
– Brooks
- 1980
|
|
49
|
A Pattern Recognition Approach for Software Engineering Data Analysis
– Briand, Basili, et al.
- 1992
|
|
48
|
A controlled experiment in program testing and code walkthroughs/inspections
– Myers
- 1978
|
|
48
|
Learning from examples: generation and evaluation of decision trees for software resource analysis
– Selby, Porter
- 1988
|
|
31
|
An Empirical Evaluation of Three Defect-Detection Techniques
– Lott, Christopher
- 1995
|
|
30
|
Measurement and experimentation in software engineering
– Curtis
- 1980
|
|
30
|
A Replicated Experiment to Assess Requirements Inspection Techniques, Empirical Software Engineering
– Fusaro, Lanubile, et al.
- 1997
|
|
30
|
Further Experiences with Scenarios and Checklists
– Miller, Wood, et al.
- 1998
|
|
25
|
Repeatable Software Engineering Experiments for Comparing Defect-Detection Techniques
– Lott, Rombach
- 1996
|
|
20
|
Guidelines for reporting results of computational experiments. Report of the ad hoc committee
– Jackson, Boggs, et al.
- 1991
|
|
19
|
Comparing Inspection Strategies for Software Requirements Specifications
– Cheng, Jeffery
- 1996
|
|
18
|
An empirical study of the object-oriented paradigm and software reuse
– Lewis, Henry, et al.
- 1991
|
|
17
|
On reporting computational experiments with mathematical software
– Crowder, Dembo, et al.
- 1979
|
|
16
|
The problem of statistical power
– Baroudi, Orlikowski
- 1989
|
|
14
|
Experience in the use of an inductive system in knowledge engineering
– Hart
- 1984
|
|
13
|
Critical review of quantitative assessment
– Kitchenham, Linkman, et al.
- 1994
|
|
7
|
Software measurement: A necessary scientific basis
– Fenton
- 1994
|
|
5
|
Inductive analysis applied to the evaluation of a CAI tutorial
– Brooks, Vezza
- 1989
|
|
5
|
Eras of software technology transfer
– Davis
- 1996
|
|
5
|
The reporting of computation-based results in statistics
– Hoaglin, Andrews
- 1975
|
|
5
|
Rigor in software complexity measurement experimentation
– MacDonell
- 1991
|
|
5
|
An empirical evaluation of defect detection techniques
– Roper, Wood, et al.
- 1997
|
|
4
|
IRIS integrated rule induction system
– Arisholm
- 1987
|
|
4
|
A survey regarding the reporting of simulation studies
– Hauck, Anderson
- 1984
|
|
4
|
Guidelines for reporting computational results
– Lee, Bard, et al.
- 1993
|
|
3
|
Betrayers of the Truth, page 17 and 81
– Broad, Wade
- 1986
|
|
3
|
Too Hot to Handle: The Story of the Race for Cold Fusion
– Close
- 1990
|
|
3
|
An external replication of Korson's experiment
– Daly, Brooks, et al.
- 1994
|
|
3
|
Verification of results in software maintenance through external replication
– Daly, Brooks, et al.
- 1994
|
|
3
|
Software's chronic crisis. Scientific
– Gibbs
- 1994
|
|
3
|
We are all scientists
– Huxley
- 1965
|
|
3
|
Structured flowcharts outperform pseudocode: An experimental comparison
– Scanlan
- 1989
|
|
3
|
Experimental investigations of the utility of detailed flowcharts in programming
– Shneiderman, Mayer, et al.
- 1977
|
|
2
|
Publication politics, experimenter bias and the replication process in social science research
– Bornstein
- 1991
|
|
2
|
Comparing the effectiveness of software testing techniques
– Basili, Selby
- 1987
|
|
2
|
An experimental analysis of program verification methods
– Hetzel
- 1976
|
|
2
|
Software-engineering research revisited
– Potts
- 1993
|
|
2
|
Replication in behavioural research
– Rosenthal
- 1991
|
|
1
|
Replication research: A "must" for the scientific advancement of psychology
– Amir, Sharon
- 1991
|
|
1
|
Statistical inference and data mining
– Glymour, Madigan, et al.
- 1996
|