Reproducible documents on the Internet

So what is different about offering reproducible documents on the Internet? Obviously, we can reach many readers effortlessly. The readers can invoke our document's reproducibility commands and can download our source files. Theoretically, an author can give a reader as much functional access to a document as she has herself. Almost ...

A reproducibility command on a local computer can create a file on the local computer system; in the past, a reproducibility command executed by a browser could not. For security reasons, browsers restricted an applet's access to resources on the client machine. In particular, an applet was unable to read or write a file on the local file system. Consequently, applets were restricted to in-core computations, and a reproducible research document did not need a clean rule. Recent browser versions (e.g. Netscape 4.0, HotJava) allow a reader to grant applets access to local files. The reader indicates trusted web sites in the browser's security settings, and the browser verifies an applet's origin when loading it from the web. My example of a reproducible result for the web does not exploit this new flexibility in browser security.

A reproducibility command on a local computer ensures the up-to-dateness of all dependencies by recomputing what is out-of-date. An applet is traditionally limited to downloading and processing existing files from the server. Usually an applet neither verifies that a given file on the server is up-to-date nor invokes a different command when the file is out-of-date. I believe that such a mechanism is feasible, especially if we had a Java mechanism to update targets. The current web prototype does not offer an up-to-dateness check. Instead, the user has the simple choice between inspecting the finished result or recomputing it from scratch.
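
The kind of up-to-dateness check discussed above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the prototype's mechanism: the class UpToDateCheck and its methods are hypothetical names, and modification times are assumed to be available as milliseconds since the epoch, with 0 meaning "unknown" (the convention used by java.net.URLConnection.getLastModified()).

```java
import java.util.List;

// Hypothetical helper: a make-style freshness test on modification times.
final class UpToDateCheck {
    // 0 marks a missing file or an unknown timestamp (assumed convention).
    static final long UNKNOWN = 0L;

    // A target is out-of-date if it is missing, or if any dependency
    // is missing or newer than the target.
    static boolean isOutOfDate(long targetModified, List<Long> dependencyModified) {
        if (targetModified == UNKNOWN) {
            return true;
        }
        for (long dep : dependencyModified) {
            if (dep == UNKNOWN || dep > targetModified) {
                return true;
            }
        }
        return false;
    }

    // Choose between the two actions a reader might be offered.
    static String chooseAction(long targetModified, List<Long> dependencyModified) {
        return isOutOfDate(targetModified, dependencyModified) ? "recompute" : "view";
    }
}
```

With such a test, an applet could offer the finished result when it is current and recompute only when a dependency has changed. Treating an unknown timestamp as out-of-date is a conservative design choice: it costs an unnecessary recomputation rather than risking a stale result.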

A process issued on a local computer enjoys a larger bandwidth when accessing data on the server than a process issued by an applet executed by a remote browser. We can ease the effect on large computations by separating the downloading of the data from the interactive execution: the data can be distributed in advance, either on CD-ROMs or by ftp from a data server. The data will then be local on the client machine during the execution of an applet (this again requires that the applet has access to the local client file system). The distribution of large data sets still inhibits my vision of effortless access to remote reproducible results.
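
Separating the download from the execution amounts to preferring a local copy of a data file whenever one exists. The following is a minimal sketch of that lookup, assuming the code is permitted to read the local file system; the class DataLocator, the method locate, and the directory and server names passed to it are hypothetical.

```java
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: prefer pre-distributed local data over the server copy.
final class DataLocator {
    // Returns a file: URI if the data set was already copied to localDataDir
    // (e.g. from a CD-ROM or by ftp); otherwise a URI below serverBase.
    static URI locate(Path localDataDir, URI serverBase, String fileName) {
        Path local = localDataDir.resolve(fileName);
        if (Files.isRegularFile(local)) {
            return local.toUri();
        }
        // serverBase should end in '/' so that resolve() appends the name.
        return serverBase.resolve(fileName);
    }
}
```

The computation itself need not know where the data came from; only the returned URI differs between the local and the remote case.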

Stanford Exploration Project
3/8/1999