# Short Note: Inversion shortcuts by model statistics

Author has no known email address

Various transforms have been proposed for data analysis that provide a sparse model space with intuitive or quantitative value for interpretation. For velocity analysis, or for understanding the source components captured passively by an array of geophones, a sparse model space is desired that need not exactly forward model the supplied data. This is the paramount difference between data analysis and data synthesis. Because the rate of convergence of an inversion scales with the size of the model and data spaces, 3D problems with large data sets and model domains are computationally intensive. The situation is often exacerbated by the expensive inversion algorithms, such as linear programming or BFGS, that are used to ensure sparsity in the model space.

Lloyd's algorithm (LA) is an iterative binning operation normally applied to the histogram of values within a data space. The algorithm is used to decimate the bandwidth of signals in an optimally representative manner. A common application is to quantize (downsample) the color values in images for display on graphics systems with limited memory or bandwidth.
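To make the binning operation concrete, the following is a minimal sketch of a Lloyd iteration on scalar values, written with NumPy. The function name `lloyd_1d`, the quantile-based starting levels, and the two-population test signal are assumptions chosen for illustration; they are not taken from this note.

```python
import numpy as np

def lloyd_1d(values, n_levels, n_iter=100):
    """Illustrative Lloyd iteration on scalar samples (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # Assumed initialisation: evenly spaced quantiles of the samples.
    levels = np.quantile(values, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(n_iter):
        # Assignment step: each sample goes to its nearest level.
        labels = np.argmin(np.abs(values[:, None] - levels[None, :]), axis=1)
        updated = levels.copy()
        for k in range(n_levels):
            members = values[labels == k]
            if members.size:  # an empty bin keeps its previous level
                updated[k] = members.mean()
        if np.allclose(updated, levels):
            break
        levels = updated
    # Final assignment so the labels match the returned levels.
    labels = np.argmin(np.abs(values[:, None] - levels[None, :]), axis=1)
    return levels, labels

# Synthetic stand-in for image intensities: two tight populations.
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(0.2, 0.01, 500),
                          rng.normal(0.8, 0.01, 500)])
levels, labels = lloyd_1d(samples, n_levels=2)
```

Each pass alternates between assigning samples to the nearest level and moving every level to the mean of its members, so a thousand samples are summarized here by two representative values.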

The hypothesis of this work is that the algorithm can optimally select a small number of model-space coordinates from an incomplete inversion. To test the hypothesis, I stop iterative inversions with linear and hyperbolic Radon transforms before convergence. I then translate the model space into a form usable by LA, which selects the model-space coordinates that best represent the incomplete inversion. The goal is a minimal representation of the important model-space parameters despite the lack of focus of the incomplete inversion.
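The idea of stopping an iterative inversion early can be sketched with conjugate gradients on the least-squares problem (CGLS). The dense random matrix `A` below is only a hypothetical stand-in for a Radon operator, and the iteration counts are arbitrary; the solver actually used in this work is not specified here.

```python
import numpy as np

def cgls(A, d, n_iter):
    """CGLS for min ||A m - d||, stopped after n_iter steps (illustrative)."""
    m = np.zeros(A.shape[1])
    r = d - A @ m        # data residual
    s = A.T @ r          # adjoint applied to the residual (gradient direction)
    p = s.copy()
    gamma = s @ s
    for _ in range(n_iter):
        if gamma == 0.0:  # already at a least-squares solution
            break
        q = A @ p
        alpha = gamma / (q @ q)
        m += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return m

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 20))   # hypothetical operator, not a Radon transform
m_true = np.zeros(20)
m_true[[4, 13]] = [1.0, -0.5]       # sparse "true" model with two spikes
d = A @ m_true

m_early = cgls(A, d, n_iter=3)      # incomplete inversion
m_full = cgls(A, d, n_iter=60)      # run well past convergence for comparison
r_early = np.linalg.norm(d - A @ m_early)
r_full = np.linalg.norm(d - A @ m_full)
```

The early-stopped model `m_early` fits the data only partially and spreads energy away from the two spikes; that unfocused model is the kind of input a selection step would then have to summarize.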

Data for LA consist of a set of N-dimensional parameters, from which the algorithm optimally selects a user-supplied number, or fewer, of combinations that best represent the set. The model space of a linear operator, however, contains a spanning parameter set differentiated by the amplitude at each location. Consider a model space defined as the Fourier transform of a trace containing two sinusoids of different frequencies. The output of the transform contains two frequencies with high amplitude and many with zero amplitude. Viewing the output, an interpreter can select the two frequencies with energy and discount the rest of the model space. Only two numbers are important to know, while the rest of the transformed space can be discarded. I introduce LA to make this selection optimally and automatically.
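The two-sinusoid example can be sketched numerically. One possible way to hand an amplitude-differentiated model space to LA, assumed here purely for illustration, is to treat each frequency as a coordinate weighted by its spectral amplitude. The helper `weighted_lloyd_1d`, the weighted-quantile initialisation, and the bin indices 30 and 100 are hypothetical choices, not the translation used in this work.

```python
import numpy as np

def weighted_lloyd_1d(coords, weights, n_centers, n_iter=50):
    """Lloyd iteration on 1-D coordinates weighted by amplitude (illustrative)."""
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Assumed initialisation: amplitude-weighted quantiles of the coordinates.
    cum = np.cumsum(weights) / weights.sum()
    start = np.searchsorted(cum, (np.arange(n_centers) + 0.5) / n_centers)
    centers = coords[np.minimum(start, coords.size - 1)]
    for _ in range(n_iter):
        labels = np.argmin(np.abs(coords[:, None] - centers[None, :]), axis=1)
        updated = centers.copy()
        for k in range(n_centers):
            w = weights[labels == k]
            if w.sum() > 0.0:  # a bin with no energy keeps its center
                updated[k] = np.sum(w * coords[labels == k]) / w.sum()
        if np.allclose(updated, centers):
            break
        centers = updated
    return centers

n = 256
t = np.arange(n)
k1, k2 = 30, 100                       # hypothetical frequency bins
trace = np.sin(2 * np.pi * k1 * t / n) + np.sin(2 * np.pi * k2 * t / n)
freqs = np.fft.rfftfreq(n, d=1.0)      # cycles per sample
amplitude = np.abs(np.fft.rfft(trace))
picked = weighted_lloyd_1d(freqs, amplitude, n_centers=2)
```

Because nearly all of the spectral amplitude sits in two bins, the two centers settle on those frequencies and the remaining coordinates contribute essentially nothing. If two starting centers coincide, one bin stays empty and fewer distinct values are returned, which mirrors the "or fewer" behavior described above.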

The transforms are cast within the framework of least-squares inversion with time-domain operators. I test the hypothesis on synthetic volumes, a shot gather from the Yilmaz data collection, passively collected telescopic solar observation data, and passively collected hydrophone data from the Valhall oil field in the North Sea.