The idea of reproducible research

Is the Internet nothing but a huge library? A stack of dusty documents and illicit images? Maybe today, but for computational researchers it can be much more: it can be a globally shared laboratory. I advocate that computational researchers - chemists, physicists, engineers, applied mathematicians, astronomers, geophysicists, or even meteorologists - change their practice and philosophy of publishing. Reading new material may never again be done in an armchair in front of the fireplace. Instead, research publications are to be explored by the researcher as well as by the researcher's computer. The researcher reads the article, a mere advertisement and exposition of ideas. The researcher's computer recomputes the article's results and verifies that the implementation (the details and the software) is complete and readily available to the reader.

I do not suggest that all researchers submit to some complicated standard of programming or that all researchers have to add cute interactivity to their publications. My goal is simpler, more general, and more easily achieved. I suggest that researchers make their computational results reproducible.

Reproducibility is the crux of scientific work and the key to the efficiency and productivity of the modern age. It is what Newton called standing on the shoulders of giants to see further. Scientific results owe their power to the principle that they apply anywhere and anytime and consequently that anyone can reproduce and apply them. A mathematical theorem or a simple computational algorithm can be described exhaustively in the summary of a printed journal.

Computational research is a form of experimental science. A given data set is a problem, a given algorithm expresses a scientific model, and the application of the algorithm to the data is an experiment. In publications, researchers present the idea of the algorithm and document the usefulness of the underlying model with a few results. While our computational experiments are fully reproducible in theory, in practice our published results hardly ever are.

The reproducibility of computational research experiments, however, is flawed. Modern computational experiments usually depend on a multitude of implementation details and parameters that cannot be communicated efficiently in a printed publication. A researcher Newton who reads about some computational results by a researcher Galileo might spend about as much time reimplementing and verifying Galileo's computations as it took Galileo in the first place. Consequently, in computational research the traditional publication is simply an advertisement for the research stored in implemented algorithms.

But the computer that allows us to do such complex computations in the first place also allows us to preserve and communicate them efficiently. Pseudo-standards develop in computational communities for programming languages (Matlab, C, or Fortran) and data formats (SEGY, CDF). Large disks, CD-ROMs, and the Internet facilitate easy data exchange and storage. The current bottleneck is to preserve and communicate the computational functionality of a project's pieces: the commands that combine the collection of programs and data files into an organic, functional entity and produce the desired results.

I suggest that research be published as electronic reproducible documents that combine a project's software and its scientific document and offer any reader three standard commands for every result:

View displays the result and offers it for scrutiny.
Build recomputes the result from a given set of source files (such as data files and computer programs).
Info lists any information important for the reproduction of the result, e.g. a time estimate for the recomputation of the figure or any commercial software needed.

If the recomputation of the result file creates intermediate files, an additional Clean button lets the reader delete all intermediate files. In a simpler version, the Build command deletes all intermediate files after it has recomputed the result files.
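
As a minimal sketch of this reader interface, the Python class below models a single result with View, Build, Info, and Clean operations. Everything in it is an assumption made for illustration: the class name `ReproducibleResult`, the `.tmp` and `.result` file suffixes, and the toy computation (a sum of squares) are hypothetical and do not describe any existing system.

```python
import tempfile
from pathlib import Path


class ReproducibleResult:
    """One result offering the reader interface: view, build, info, clean.

    Hypothetical sketch: names, suffixes, and the computation are invented.
    """

    def __init__(self, workdir, name, data):
        self.workdir = Path(workdir)
        self.name = name
        self.data = list(data)  # stands in for a source data file
        self.intermediate = self.workdir / f"{name}.tmp"
        self.result = self.workdir / f"{name}.result"

    def build(self):
        """Recompute the result from its sources via one intermediate file."""
        squares = [x * x for x in self.data]
        self.intermediate.write_text(" ".join(str(s) for s in squares))
        total = sum(int(tok) for tok in self.intermediate.read_text().split())
        self.result.write_text(str(total))
        return self.result

    def view(self):
        """Return the result for scrutiny, building it first if it is missing."""
        if not self.result.exists():
            self.build()
        return self.result.read_text()

    def info(self):
        """List facts a reader needs before attempting a rebuild."""
        return {
            "name": self.name,
            "sources": len(self.data),
            "time_estimate": "under a second",
            "requires": ["Python 3"],
        }

    def clean(self):
        """Delete intermediate files but keep the result file."""
        if self.intermediate.exists():
            self.intermediate.unlink()


with tempfile.TemporaryDirectory() as tmp:
    fig = ReproducibleResult(tmp, "fig1", [1, 2, 3])
    fig.build()
    print(fig.view())  # prints 14 (1 + 4 + 9)
    print(fig.info())
    fig.clean()
```

Whether View should trigger a build when the result is missing is a design choice; this sketch does so for convenience. The simpler variant mentioned above would call `clean()` at the end of `build()`.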

In an early paper, Schwab and Claerbout (1996), I describe the solution of my laboratory (Jon Claerbout's Stanford Exploration Project) based on C and Fortran programs, LaTeX, and makefiles. A variation of this system is used by William Symes at The Rice Inversion Project (1998). David Donoho (1996) developed a similar concept for his wavelet research using Matlab. Other researchers, for example Nagler (1995) (need NASA citation), recognize the need for reproducible research as well.

On first impression, reproducibility must sound ridiculously altruistic and inefficient to a researcher eager to churn out publications. In my experience, however, the author of a reproducible document reaps the largest benefits from it. Every researcher has sad stories of how his own programs and projects became alien and inaccessible over time. The proposed reproducibility commands facilitate a frequent and automatic rebuilding of any and all past research projects. With minimal effort the researcher can keep a portfolio of former projects well-oiled and ready for continued use. The size of the researcher's software library and his confidence in the library's quality soar. In that sense, an author is the first reader of her own research.

Furthermore, I claim that a researcher can easily add the three reproducibility commands to the standard computational experiment. During active development, the researcher ceaselessly repeats the computation. Given a set of short shell scripts or makefile rules (see below for examples), the author only needs to adhere to a few naming conventions to implement the three reproducibility commands. The implementations of the reproducibility commands may vary, but I suggest that the functionality of the three basic commands - the reader interface - is universal and standard.
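
One way such naming conventions could work is sketched here in Python. The convention itself is hypothetical: a target is written as `<result>.<command>`, and the file suffixes `.result`, `.tmp`, and `.info` are invented for this illustration rather than taken from any existing system. The sketch only plans actions and does not touch the file system.

```python
COMMANDS = ("view", "build", "info", "clean")


def parse_target(target):
    """Split a target such as 'fig1.build' into (result name, command)."""
    name, sep, command = target.rpartition(".")
    if not sep or not name or command not in COMMANDS:
        raise ValueError(f"not a reproducibility target: {target!r}")
    return name, command


def files_for(name):
    """File names implied by the (hypothetical) convention for one result."""
    return {
        "result": f"{name}.result",
        "intermediate": f"{name}.tmp",
        "notes": f"{name}.info",
    }


def plan(target):
    """Describe what a target would do, without touching the file system."""
    name, command = parse_target(target)
    files = files_for(name)
    if command == "view":
        return f"display {files['result']}"
    if command == "build":
        return f"recompute {files['result']} from its sources"
    if command == "info":
        return f"print {files['notes']}"
    return f"delete {files['intermediate']}"


for target in ("fig1.view", "fig1.build", "fig1.info", "fig1.clean"):
    print(target, "->", plan(target))
```

Because every file name is derived from the result name, the same few rules serve every result in a document; the author's only obligation is to name files consistently.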

Will the computational research community be revolutionized once we distribute reproducible research? Maybe. I believe every researcher would like to be more productive by using another researcher's software. In my experience, reproducible research documents using Fortran, C, LaTeX, and GNU make improved collaboration among the researchers of my laboratory. However, the distribution of our documents and software on CD-ROMs to sponsoring industrial research laboratories failed, since it required a rather difficult installation step on a UNIX system. A reader who considers using third-party software desires a stepwise adoption that lets him continuously evaluate the costs and benefits the software offers. A time-consuming download and installation step for a software package of dubious quality and documentation is unattractive. A third-party Java software package is much more attractive: Java's integration in Internet browsers enables the reader to test the software actively before downloading it. Java's portability eliminates the installation step. The test establishes that the advertised research is available, complete, and correct.

An author needs to offer software in self-contained units that integrate into a reader's environment. Traditionally, software fell into three categories by distribution:

a project's private, highly individual files (usually the core of the research project).
an author's standard repertoire of software tools that he shares among research projects and possibly among laboratory colleagues.
an author's tools that are common among readers of the author's research community.

To distribute their software, authors need to judge their readership and package the software accordingly. With Java, a reader's browser downloads the programs it needs to do the reader's bidding. The author does not need to prepackage the various programs.

In summary, I believe that the combination of Java and reproducible research documents may change the way the computational research community publishes. Java invites readers to explore programs without committing to them. Reproducibility commands standardize the exploration and organize a research project in functional units. Both technologies - Java and reproducible documents - can be used independently, but in combination they offer opportunities to any researcher seeking an audience.

Stanford Exploration Project
3/8/1999