README.1st: Newcomer's short directory of written resources on SEP's computing environment






Intro

Many written resources describe how SEP's computers work, and they are generally easy to get from the Web. As I found out from my own experience, though, to really get going a newcomer needs this information all together, in one piece, or as a 'diet' with specific prescriptions for each stage of developing basic SEP skills. This short note attempts to provide such a single unified guide for the preliminary instruction of a person new to SEP. The information is available on the Web, and the software can be downloaded as well (from sepwww.stanford.edu/software/septour.html). Thus, if you have the time, you can get familiar with some of these things before coming here. This would be very helpful, since it would ease some of the effort inherent in adapting to a new place, meeting new people and learning new rules and, if you are an international student, immersing yourself in a whole new culture.

0. Why?

Because most of the computers in the world use either MS Windows or Apple's Mac OS as an operating system, chances are that you are most accustomed to those environments. They have the advantage of being user-friendly, but that is not a priority for people doing computations on really large datasets (seismic data, for example). The top priority is being able to control and optimize every single aspect of the computations, and to manage the machines' resources very well when the sheer volume of the computations demands it. Doing that means interacting with the machine at a level as low as pragmatically possible - somewhere between binary code and Windows 2000. This is the world of 32-processor machines, of Linux, emacs, Makefiles, TeX, Fortran 90 and C++. There may not be a friendly virtual ``Office Assistant'' around, but these tools will get your work done reliably. So they must be learned, no matter how `unnatural' it seems at first to type command-line statements instead of pushing colored buttons.

1. UNIX

This is where you start; you simply need it in order to manage your files. For now, knowing the basic commands and having learned how to write scripts and use pipes - and maybe some filters - should do. Advanced features can be learned later, as you need them. ``The UNIX Programming Environment'' by B. W. Kernighan and R. Pike, Prentice-Hall Software Series, 1984, is a good book (I used it to learn UNIX), and there are plenty of copies on the fourth floor of the Mitchell Building (that's where the offices of SEP students are); but actually, for the basic level, any book is OK. Many links to on-line documentation can be found at

sepwww.stanford.edu/internal/junk/linux/.
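
To give a feel for what ``scripts, pipes and filters'' means in practice, here is a minimal sketch in the shell. The function name count_words and the sample sentence below are made up for illustration; the commands themselves (tr, sort, uniq, head, awk) are standard UNIX filters.

```shell
# count_words [N]: print the N most frequent words read from standard input
# (N defaults to 3). The name is hypothetical, chosen for this example only.
# Each stage is a small filter; the pipe (|) feeds one stage's output
# into the next stage's input.
count_words() {
    tr -cs '[:alpha:]' '\n' |      # put every word on its own line
    tr '[:upper:]' '[:lower:]' |   # fold everything to lower case
    sort |                         # bring identical words together
    uniq -c |                      # collapse each group into "count word"
    sort -rn |                     # most frequent first
    head -n "${1:-3}" |            # keep only the top N lines
    awk '{print $2, $1}'           # reorder each line as "word count"
}
```

For example,

    printf 'The cat and the dog and the bird\n' | count_words 2

prints ``the 3'' and ``and 2'', one per line. Saved in a file, the same lines become a script you can reuse.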

2. Computing environment

The SEP computing environment is literally an environment: as in nature, everything works together, and every piece of hardware and software has its distinctive function. It is so integrated that it becomes more than the mere sum of its components; it is aimed at providing a coherent workflow. Depending on your background, you may or may not have worked in such an environment. If you have not, you should accept that a complex, team-based research process needs integration in order to run smoothly, and that this leads to a high level of standardization within the ``chain of production''. The result is world-class research; the side effect is an environment that is less user-friendly at every step (such as having to use emacs-edited TeX instead of your favorite WYSIWYG word processor). Ample detail on the components of the ``chain'' described above can be found at:

sepwww.stanford.edu/internal/computing/environ.html

sepwww.stanford.edu/internal/computing/misc.html

sepwww.stanford.edu/internal/computing/print.html

sepwww.stanford.edu/internal/computing/setup.html

sepwww.stanford.edu/software/ and its subordinate pages.

sepwww.stanford.edu/public/docs/sephelp/.

3. Makefiles

If you have done some programming, you may already have used something like this (in principle, not necessarily with this particular syntax or under Linux), probably by re-inventing the wheel because it was necessary. Here's how it works: you wrote a program to do some computation. That program needs some results from another program. You need yet another program to display the data. You also need some files with input data, and some files with the various parameters necessary for your computations. (This probably looks like the description of something you had to do for a small school assignment.) For small-scale projects, when you do not need to use the programs more than once, you probably keep the whole picture in your mind and go through the chain manually, doing things one by one. But what if you need to run the program a few hundred or a few thousand times (such as in a cycle of simulating something - visually comparing the result with the real thing - changing some parameter so that the next result will be closer to reality - simulating again)? You will need to be much more organized, and you will write a main file (let's call it a ``Makefile'') that invokes the whole sequence of separate programs for you - only the initial parameter must be modified manually.

What if you need to share your research with other people? You will use GNU make, a tool that reads a Makefile written in its own syntax and will do a lot of things for you, including cleaning up the unnecessary intermediate files when you tell it to. The output can be anything - a picture, a TeX document, etc. You just need to keep the source files, the parameter files and the Makefile in the same directory. Type ``make'' and the whole thing will come to life, keeping your processor busy and producing a nice output plus a lot of unnecessary intermediate files, which you can delete later by typing ``make clean''. That will not destroy your output; you need to type ``make burn'' for that.

The Makefile system even checks when the parameter files were last modified and, if all of them are older than the output file, does not waste time recomputing the output - it only displays it. It also does many other things. More detail can be found at

sepwww.stanford.edu/software/septour.html

sepwww.stanford.edu/internal/computing/makestuff.html
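
The time-stamp check described above can be sketched in a few lines of shell. This is only an illustration of the idea, not how make is actually implemented; the function name needs_rebuild and the file names in the usage comment are hypothetical, and the -nt (``newer than'') test is a common shell extension rather than part of the POSIX standard.

```shell
# needs_rebuild TARGET DEP...
# Succeeds (exit status 0) when TARGET is missing or older than any DEP,
# which is the question make asks before re-running a rule.
# Assumes a shell whose test command supports -nt (bash, dash, ksh, zsh).
needs_rebuild() {
    target=$1
    shift
    [ -e "$target" ] || return 0               # no output yet: must build
    for dep in "$@"; do
        [ "$dep" -nt "$target" ] && return 0   # a dependency changed later
    done
    return 1                                   # output is up to date
}

# Hypothetical usage, with made-up file names:
#   if needs_rebuild result.ps model.par input.dat; then
#       echo "recomputing result.ps"
#   else
#       echo "result.ps is up to date"
#   fi
```

A real Makefile states the same thing declaratively: you list each output together with the files it depends on, and make does the comparison for you.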

4. Reproducible research

Ethics and common sense tell us that all research must be reproducible, so that other people can build upon it or, sometimes, check the veracity of a piece of research. Before the era of electronic documents, that meant disclosing enough information in a paper for someone else to be able to redo the experiments or calculations involved and get the same result. With the increasing complexity of computing, it became possible and desirable to have the check done automatically, and to be able to modify some parameters and then rebuild the numerical or visual result. Of course, the person doing that should not have to rewrite and debug the necessary codes. How can that be done with the push of a button? The secret lies in having an organized file structure. The integrated computing environment and rules of SEP bring the element of order and standardization that makes it possible. Makefiles are very important in this respect. Read more about reproducible research at

sepwww.stanford.edu/research/redoc/ and the affiliated pages.

5. Word processing in TeX/LaTeX

TeX and LaTeX are a sort of cross between a programming language and text editing software. The good thing is that they allow automatic document generation and styling; they also have many other qualities, which you will discover with time. The bad thing is that they are not visual (while editing you see only plain text in a terminal window, not the output), and that there is a syntax to be learned. The .tex files are compiled in order to produce a nice-looking paper as a viewable .dvi or .ps file. More at:

sepwww.stanford.edu/public/docs/sephelp/

www.emerson.emory.edu/services/latex/latex2e/latex2e_toc.html

sepwww.stanford.edu/software/softtex/index.html

TeX and LaTeX documents, as well as some codes, are written using a text editor called Emacs. SEP uses GNU Emacs, the free version developed by the Free Software Foundation and downloadable from www.gnu.org, along with other goodies.

6. Ratfor

Ratfor stands for RATional FORtran; in its Ratfor90 version, it is a more object-oriented Fortran with a more concise syntax. For a description and download, see

sepwww.stanford.edu/public/docs/bei/rat/paper_html/node1.html

The "Programming utilities" chapter of SEP83: sepwww.stanford.edu/public/docs/sep83/index.html

sep.stanford.edu/software/ratfor90.html

Since Ratfor is not widely used, you are more likely to have the opportunity to learn Fortran wherever you are. That will do: all valid Fortran code is valid Ratfor, since the latter is an extension of the former.

7. SEPlib

SEPlib is the SEP-built, freely downloadable data processing software. A description of how it works can be found in the "SEPlib tour for new users" at:

sepwww.stanford.edu/public/docs/sephelp/septour/index.html

with an older version at

sepwww.stanford.edu/public/docs/sep73/joe1/paper_html/index.html

The whole SEPlib package can be downloaded from

sepwww.stanford.edu/software/seplib/index.html,

where you will find some other instructions as well.

Matlab, Mathematica, and C/C++ are good to know; people use them sometimes. ``And what else?!!'' you may burst out. Don't worry. Although everything mentioned so far probably seems like a whole lot, you can learn it without problems after your arrival in sunny California. But, of course, getting a head start by preparing in advance is a good thing to do, whenever one has the time and the resources.


© 2007, Stanford Exploration Project
Department of Geophysics
Stanford University

Modified: 06/07/07, 14:49:03 PDT, by bill
Page Maintainer: nick `AT' sep.stanford.edu