Intellectual map of the School of Earth Science

by Jon Claerbout (produced March 15, 2007)
http://sepwww.stanford.edu/data/media/public/sep/jon/banquet/

In the interest of optimally dividing the School of Earth Science, I prepared a questionnaire for Earth Science Faculty to identify colleagues closest to them in curriculum (or any measure they choose). This data was input to a "banquet seating" optimization program. With 50 professors there are 2500 possible relationships, so it's a job for a computer.

Results

Names cluster in the (x,y) plane. Names of respondents are preceded by a "+". Here's a current example.

This is not a unique answer but solutions from other starting locations look much the same.

Check yourself and tell me if I have not placed your neighbors reasonably. If I have not, I'll double-check my input of your data. (Unfortunately, data were not collected consistently for emeriti and Roughgarden; those among them who did not respond are shown by their initials.)

Participating

Faculty responding by the middle of the second week (March 14) are: Aydin, Chamber, Dunbar, Ernst, Fendorf, Francis, Gorelick, Graham, Hilley, Mahood, Matson, Miller, Moldowan, Payne, Pollard, Switzer, Arrigo, Beroza, Biondi, Claerbout, Harris, Klemperer, Knight, Segall, Sleep, Tabazadeh, Zebker, Zoback, Aziz, Durlofsky, Gerritsen, Horne, Journel, Kovscek, Orr, Tchelepi.

Faculty not yet responding are: Bird, Brown, Loague, Lowe, Mao, Paytan, Seto, Stebbins, Kovach, Mavko, Nur, Roughgard, Thompson, Caers.

Non-respondents wishing to respond could print and mail this copy of the original questionnaire. If you'd prefer to reply by email, it may be easiest to begin from this file. As long as more data comes in, I plan to rerun the software every Friday.

Method

The method is to define a penalty function, then, over a million iterations, select a random pair of people and swap their positions if that reduces the penalty. The penalty for each pair is proportional to the Euclidean distance of their separation. Each weighting factor is supplied by the respondents. Each pair is considered twice, A to B and B to A, which explains how non-respondents come to be placed.
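The swap procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for the maps here: the names `penalty` and `swap_descent`, the list-of-cells layout, and the choice to recompute the whole penalty after every trial swap (simple but slow) are all mine.

```python
import math
import random

def penalty(pos, w):
    """Sum over ordered pairs (a, b) of w[a][b] times the
    Euclidean distance between person a and person b."""
    n = len(pos)
    return sum(w[a][b] * math.dist(pos[a], pos[b])
               for a in range(n) for b in range(n) if a != b)

def swap_descent(w, pos, iterations=10000, seed=0):
    """Pick a random pair, swap their cells, keep the swap only
    if the total penalty goes down. (Hypothetical sketch.)"""
    rng = random.Random(seed)
    pos = list(pos)                      # pos[k] is the (x, y) cell of person k
    best = penalty(pos, w)
    for _ in range(iterations):
        i, j = rng.sample(range(len(pos)), 2)
        pos[i], pos[j] = pos[j], pos[i]  # trial swap
        trial = penalty(pos, w)
        if trial < best:
            best = trial                 # keep the improvement
        else:
            pos[i], pos[j] = pos[j], pos[i]  # undo the swap
    return pos, best
```

Here `w[a][b]` stands for the weight person a gave to person b, so the double loop visits each pair twice, once in each direction. Because only swaps that lower the penalty are kept, the result depends on the starting arrangement.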

Non responders

Some non responders made clear to me their non response was a matter of principle for them. Early on I understood mathematically and by simple examples that the nonresponders would be positioned by the responders. I had hoped I might be able to ignore the nonresponders, but ignoring nonresponders led to my discovery that they tend to end up on the periphery. As a geophysical data analyst this annoyed me deeply. In most geophysical surveys we have locations which are not or cannot be reached by the data collection. Yet our final map should show the earth itself without showing the footprint of data acquisition. We have no choice but to address the issue of missing data. Luckily I found a way to bring the nonresponders back into the interior. Here it is:

Please understand each row of the "attraction" matrix to represent weights given by one respondent. I have scaled their weights to sum to unity. Non respondents have a row of zeros. Now suppose Mary loves John but John did not respond to the questionnaire. Now I take the weight of John's love to match Mary's. Thus, I fill in zeros of the row of John by copying the entries from the transposed matrix. Then I normalize these new rows. If a group of non respondents happen to form a cluster in real life, they may find themselves not clustered in my results, because their positions have been determined by those who did respond. Their group could be the oil in the sandstone.
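As a small illustration of this fill-in, here is a sketch in Python. The function names and the three-person example are hypothetical, and I am assuming the copied entries come from the already scaled matrix, which is how I read the description above.

```python
def normalize_row(row):
    """Scale a row so its entries sum to unity; leave an all-zero row alone."""
    total = sum(row)
    return [x / total for x in row] if total > 0 else list(row)

def fill_nonrespondents(weights):
    """Give each non-respondent (all-zero row) the weights that
    respondents gave to them, then normalize that new row."""
    n = len(weights)
    scaled = [normalize_row(r) for r in weights]
    filled = []
    for i in range(n):
        if any(scaled[i]):
            filled.append(scaled[i])     # a respondent: keep the scaled row
        else:
            # Row i of the transposed matrix is column i of the original.
            column = [scaled[j][i] for j in range(n)]
            filled.append(normalize_row(column))
    return filled

# Hypothetical example: Mary (0) and Ann (2) responded, John (1) did not.
raw = [[0, 3, 1],
       [0, 0, 0],
       [1, 1, 0]]
attraction = fill_nonrespondents(raw)
```

In this example Mary's scaled row gives John 0.75 and Ann's gives him 0.5, so John's filled row is those two numbers rescaled to sum to unity. A non-respondent whom nobody mentions keeps a row of zeros.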

Non-uniqueness

The results seem reasonable but there are theoretical reasons for dissatisfaction. The penalty function (weighted distance) is a sum of (almost) convex functions. Normally we expect this specification to assure us that we directly descend to a unique minimum penalty. Unfortunately the checkerboard parameterization seems to destroy this assurance. I'm finding that different starting positions lead to different ending penalty minima. For each starting position, I log its minimum. I also use the method of simulated annealing, gradually lowering the "temperature". Very long cooling runs (20 minutes and 3 hours) lead to very slightly lower penalties and people maps that differ slightly. I do not regard this as significant. Much bigger changes are seen whenever a new data point arrives in the mail.
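For readers unfamiliar with simulated annealing, here is a minimal sketch of how it changes the swap loop. The details are assumptions on my part, not a record of the actual runs: I use the standard Metropolis acceptance rule and a geometric cooling schedule, and the names `anneal`, `t_start`, `t_end` and `steps` are hypothetical.

```python
import math
import random

def penalty(pos, w):
    """Weighted sum of Euclidean distances over ordered pairs."""
    n = len(pos)
    return sum(w[a][b] * math.dist(pos[a], pos[b])
               for a in range(n) for b in range(n) if a != b)

def anneal(w, pos, t_start=1.0, t_end=1e-3, steps=5000, seed=0):
    """Random swaps with Metropolis acceptance: a swap that raises the
    penalty by delta is still accepted with probability exp(-delta / t),
    and the temperature t is lowered geometrically. (Assumed schedule.)"""
    rng = random.Random(seed)
    pos = list(pos)
    current = penalty(pos, w)
    best_pos, best = list(pos), current
    cooling = (t_end / t_start) ** (1.0 / steps)
    t = t_start
    for _ in range(steps):
        i, j = rng.sample(range(len(pos)), 2)
        pos[i], pos[j] = pos[j], pos[i]          # trial swap
        trial = penalty(pos, w)
        delta = trial - current
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = trial                      # accept the swap
            if current < best:
                best_pos, best = list(pos), current
        else:
            pos[i], pos[j] = pos[j], pos[i]      # reject: undo the swap
        t *= cooling
    return best_pos, best
```

At high temperature the loop accepts many uphill swaps and can escape a poor local minimum; as the temperature falls it behaves more and more like the plain descent. The sketch returns the lowest-penalty arrangement seen along the way, which may differ from one seed to the next.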

Even if we were to find the lowest possible penalty, there would remain the more familiar issue of null space. Many different solutions could have the same minimum penalty. Some are easy to visualize. The entire blob of names can be translated or reflected. Non-participants who are not mentioned by participants can float to arbitrary positions. Other, more subtle solution non-uniqueness might arise.

Confidentiality

I (Jon Claerbout) do not intend to share the raw data except with others who might assist in data analysis directed to the goal expressed here. There is the problem that if John likes Mary but if Mary does not like John... I do not want to be the one to tell.

Conclusion

I have a small data analysis problem: different initial random locations descend to many different solutions. But they differ only in small details. All solutions lead to a few suggestions:
  1. Arrigo and Tabazadeh might like to leave Geophysics.
  2. Pollard is a good prospect to join Geophysics.

The split between Geophysics and Geology is a natural one that any student can explain to you: one with too much math, the other with too much mineralogy. Before I began this project I imagined the G&ES department might split along some such natural axis. I envisioned and heard about

  1. geography versus geology
  2. hard rock versus environmental
  3. external fund raisers versus internal
  4. old professors versus young professors
  5. the dean and her cabal versus all the rest of us

My opinions are no more valid than those of my readers. However, a group of faculty names was suggested to me to consider for the new department. To facilitate discussion I colored them in yellow. Clearly yellow forms a cluster, but some people on the yellow boundary (and elsewhere!) might like to join the yellows or leave them.

I do not wish to call yellow the "Dean's Cabal" so I will search for a name for yellow. Being here longer than anyone else, I believe it my duty to recall for you some Stanford history of naming. I arrived at the School of Earth Science to find four departments: Geology, Geophysics, Petroleum Engineering, and Mineral Engineering. Mineral Engineering was unhappy with their name so they chose another one, "Applied Earth Science". I do applied earth science in the Geophysics Department. I began seeing mail addressed to me in the new Applied Earth Science department. I did not like someone else usurping the name of what I do and what they do not do. I suspect professors Knight and Harris were also annoyed when another department took the word "Environment" into its name.

Why are California geophysicists required to get a state license? No geophysicist ever wanted the nuisance of a state license. It's a long sad story, but the bottom line is this. We had no choice after Civil Engineers and Geologists got themselves state licensed and defined themselves to include geophysics.

A name should make honest distinctions!

Just about everyone on campus can guess the difference between Biological Anthropology and Cultural Anthropology, but both were recently forced to merge into one. If we need a new division, we need division names no less clear than those in Anthropology. I suggest we reserve the word "Environmental" for the school name. Have we found clear and honest names for four divisions? I think not. I suggest:

With these names everyone will know what we do, and we'll not be grabbing one another's turf. I've learned the name Geo-ecology is not appealing to the yellow group. They prefer a phrase for a name, something like The Department of Earth Systems and Environmental Dynamics, which seems to me to cover much of what Geology and Geophysics already do.

Should the new division be a formal department? The answer to that question is not my department.

Acknowledgement

I would like to thank Biondo Biondi and Jerry Harris for helpful discussions on the analytic technique. I'd like to thank Howard Zebker for suggesting I color some names for a potential new department. I'd like to thank many others for helpful discussions and for helping me get a good data turnout.