Reproducible electronic documents
Matt Schwab and Jon Claerbout
We present our system for filing computational scientific
research: reproducible electronic documents.
These documents enable you - or anyone with access
to your files - to handily regenerate your results. Thus
your research and your software can be shared and reused.
Reproducible electronic documents rely on UNIX makefiles,
a few file naming conventions, and a small set of make
rules and definitions.
(These rules require GNU make 3.75 or higher.)
Universal rules for reproducible documents
- The White Paper
(html,
ps.gz)
explains reproducible electronic documents and their implementation.
This article is the best introduction to reproducible documents.
I am looking for a journal to publish this document.
- The software package (tar.gz)
that accompanies the article contains
a complete, reproducible document and a generic set of GNU make rules.
Use this package if you want a simple reproducible research example.
If you are considering adapting our rules for your purposes, I suggest you
start from the complete set of rules our laboratory developed (next
item).
- SEP's GNU make rules
contain all the rules our laboratory has developed over the years.
The Doc rules handle reproducible documents.
The Prog rules handle the compilation of programs.
If you are serious about building your own set of GNU make rules,
you may want to start from here. Look for the README file.
- The Installation of GNU make
rules is simple.
- Two pages of motivation and summary:
Why computational scientists need reproducible research documents.
- Their disappointment with CD-ROM
technology led Claerbout, Schwab, and Karrenbach to look
forward to the evolution of the web.
- Claerbout prepared a promotional blurb about
reproducible electronic documents
after publishing his first reproducible document in 1992.
- In their 1992 SEG presentation,
Claerbout and Karrenbach defined reproducible research
for the Society of Exploration Geophysicists meeting.
Archived reproducible electronic documents
- During my tenure as "editor" at
SEP,
we published over 3500 pages of reproducible research documents:
textbooks,
theses, and
sponsor reports.
Before the arrival of the internet,
I generated interactive
CD-ROMs
that contained SEP's reproducible documents.
- Of course,
I developed my own work as reproducible research documents:
sdi,
jest.
- I tested almost all published reproducible documents of SEP
by removing and rebuilding all the documents' easily reproducible figures.
The tests were done by automatic scripts and constituted SEP's sole
measure of software quality.
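The remove-and-rebuild test described in the last item can be sketched roughly as follows. This is a minimal illustration in Python, not SEP's actual test scripts: the names `remove_and_rebuild` and `make_target` are hypothetical, and the sketch assumes that each figure file is a make target in its own directory and that a rebuilt figure is simply a non-empty file.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Dict, Iterable


def make_target(target: Path) -> None:
    """Rebuild one target by running make in the target's directory.

    Assumption: the makefile in that directory has a rule for the file name.
    """
    subprocess.run(["make", target.name], cwd=target.parent, check=True)


def remove_and_rebuild(
    figures: Iterable[Path],
    build: Callable[[Path], None] = make_target,
) -> Dict[str, bool]:
    """Delete each figure, rebuild it, and report whether it came back."""
    results: Dict[str, bool] = {}
    for fig in figures:
        fig = Path(fig)
        # Remove the existing result so that a stale copy cannot pass the test.
        if fig.exists():
            fig.unlink()
        try:
            build(fig)
        except (subprocess.CalledProcessError, OSError):
            # A failed or missing build tool counts as "not reproduced".
            pass
        # The figure is considered reproduced if a non-empty file exists again.
        results[str(fig)] = fig.is_file() and fig.stat().st_size > 0
    return results


if __name__ == "__main__":
    # Figure paths are taken from the command line.
    for name, ok in remove_and_rebuild(Path(p) for p in sys.argv[1:]).items():
        print(("rebuilt " if ok else "FAILED  ") + name)
```

Passing the build step as a function keeps the check independent of any particular set of make rules; a script like this only confirms that a figure can be regenerated, not that it matches the published version.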
Software related to reproducible research
- From 1992 to 1995 we used cake. In 1995 I introduced GNU make at SEP.
Cake had served us well for many years.
Mr Somogyi
wrote cake before GNU make existed.
- With the
Xtpanel
(by Steve Cole and Dave Nichols)
scripting language we easily create graphical user interfaces for
our electronic reproducible documents.
- My first experiments
in delivering reproducible research documents via the internet.
Reproducible research elsewhere
If you create reproducible, electronic research documents,
please let us know and we will point to your web page.
- At the
Trip industry consortium,
Bill Symes adapted our rules to his needs and uses them happily.
- At the
Wavelet research
page, Jonathan Buckheit, Shaobing Chen, David Donoho, Iain Johnstone,
and Jeffrey Scargle are delivering reproducible research on the web.
They use Matlab.
Unlike ours, their reproducible research is not integrated
with its documentation.
matt@sep.stanford.edu