
Theory

The goal of Linear Programming, like optimization in general, is to maximize an objective function subject to a set of constraints. In this case, the objective function is a linear function of the parameters $ {\bf x}$.

Linear programming developed out of the Operations Research community following World War II (Dantzig, 1963). It grew from earlier planning algorithms and optimization methods, emerging as one of the simplest tools that could optimally satisfy a set of competing equations. It is commonly used in business analytics, supply-chain management, and path planning. Variations of the concept have been applied to geophysical problems, including tomographic inversion (Berryman, 1989), but in general the geophysical inversion community has not widely adopted this body of techniques. This is unfortunate, because the methodology of constructing and navigating a feasible region of solutions enables both numerical optimization to find a global minimum and heuristic ``picking'' interpretations of other valid, non-minimum solutions that satisfy the problem constraints.

We have successfully mapped general-purpose geophysical optimization problems into this numerical framework, with particular emphasis on data-fitting with linear operators. Because linear programming is widely used in other fields, a variety of software tools exist for solving such problems, and many numerical programming environments provide linear programming solvers natively; we explored several of these environments and evaluated their solver implementations. The most promising of those we considered is the GNU Scientific Library together with the GNU Linear Programming Kit (GLPK), which has language bindings for C, Python, and FORTRAN. Tools and libraries are also available for GNU Octave and its commercial equivalent, MATLAB.

The setup for linear programming revolves around a generalized parameter space, $ {\bf x}$, which we seek to modify until an optimum is found with respect to an objective function defined by the coefficients $ {\bf c}$, subject to constraints bounded by $ {\bf b}$.

We have:

$\displaystyle x_i =$ the parameter space, (2)
$\displaystyle c_i =$ the objective coefficients, (3)
$\displaystyle a_{ij} =$ the constraint coefficients, (4)
$\displaystyle b_j =$ the constraint bounds, (5)

where $ i$ ranges from 1 to the number of parameters; and $ j$ ranges from 1 to the number of constraints.
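As a concrete illustration of this setup, the sketch below assembles a small $ {\bf c}$, $ a_{ij}$, $ b_j$ system and hands it to an off-the-shelf solver. The coefficients are invented purely for illustration, and scipy.optimize.linprog is used here as a stand-in for the GLPK bindings mentioned above.

# Minimal sketch of the c_i, a_ij, b_j setup described above, using
# scipy.optimize.linprog as a stand-in solver; the coefficients are
# illustrative only, not taken from a geophysical problem.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])          # objective coefficients c_i (one per parameter x_i)
A = np.array([[1.0, 1.0],         # constraint coefficients a_ij
              [3.0, 1.0]])
b = np.array([4.0, 6.0])          # constraint bounds b_j

# linprog minimizes c.x subject to A.x <= b and x >= 0; to maximize, negate c.
res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * len(c), method="highs")
print("optimal x:", res.x)        # parameter vector at the optimum
print("optimal objective:", -res.fun)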

Additional physical constraints are represented numerically by introducing auxiliary variables, denoted $ {\bf x_s}$, and adding an objective coefficient $ c_i$ for each introduced constraint. Minimization of $ \vert{\bf x_s}\vert$ is equivalent to optimal satisfaction of these physical constraints; this objective competes with the minimization of $ \vert{\bf x}\vert$. This trade-off between model- and data-fitting constraints is the basis of the optimization problem.
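One common way to realize this construction is L1 data fitting: the auxiliary variables bound the residual of a linear operator applied to the model, and their sum becomes the objective. The sketch below follows that pattern under stated assumptions; the operator G and data d are placeholders chosen only for illustration, and scipy.optimize.linprog again stands in for a dedicated LP solver.

# A sketch of introducing auxiliary variables x_s so that a data-fitting
# residual can be minimized as an L1 objective; G and d are placeholders.
import numpy as np
from scipy.optimize import linprog

G = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [1.0, -1.0]])       # linear operator relating model x to data
d = np.array([3.0, 3.0, 0.5])     # observed data

m, n = G.shape
# Unknowns are [x (n entries), x_s (m entries)]; minimizing sum(x_s) with
#   G x - x_s <= d   and   -G x - x_s <= -d
# forces x_s >= |G x - d| componentwise, so the objective is the L1 misfit.
c = np.concatenate([np.zeros(n), np.ones(m)])
A_ub = np.block([[ G, -np.eye(m)],
                 [-G, -np.eye(m)]])
b_ub = np.concatenate([d, -d])
bounds = [(None, None)] * n + [(0, None)] * m   # x free, x_s >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print("model x:", res.x[:n], " residual bounds x_s:", res.x[n:])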

General solutions to this optimization can be found with the Simplex Algorithm (Dantzig, 1963), whose design is rooted in topological graph theory. To find an optimal solution, the system is represented in augmented matrix form. A scalar value $ Z$ is constructed, representing the objective function minus the constraint weightings, for any given intermediate solution. At each iteration, the simplex solver evaluates the maximal gradient of $ Z$ with respect to a current set of basis equations, $ \begin{bmatrix}x \vert x_s\end{bmatrix}^T$.

This gradient provides a descent direction; the model $ x$ is updated accordingly, traversing the linear system until a vertex of the solution graph is found. At each vertex, a new basis set is calculated by matrix transvection (or by equivalent, but slower, Gaussian elimination). This allows the objective scalar $ Z$ to be expressed in terms of the new basis $ \begin{bmatrix}x \vert x_s\end{bmatrix}^T$ - in other words, by rotating the state matrix according to a transvection operator. (A transvection simply adds a scalar multiple of one row to another row.) By choosing the correct transvections, the next descent direction can be read directly from the coefficients of the matrix representation. It can be shown that an optimum solution, if one exists, must lie either at a vertex or along a single edge of the solution space as a set of equally optimal points, within the feasible region defined by the constraint equations.
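The deliberately simplified tableau solver below illustrates these mechanics for the standard form max $ {\bf c}^T{\bf x}$ subject to $ A{\bf x} \le {\bf b}$, $ {\bf x} \ge 0$ with $ {\bf b} \ge 0$: the entering column is chosen by the steepest improvement of $ Z$, the leaving row by a minimum-ratio test, and each pivot is carried out as a series of transvections. It omits anti-cycling rules and a general starting-basis phase, so it is a teaching sketch rather than a production solver.

# Illustrative tableau-based simplex for: max c.x  s.t.  A x <= b, x >= 0, b >= 0.
# Each pivot re-expresses Z in terms of the new basis via row transvections.
import numpy as np

def simplex(c, A, b, tol=1e-9):
    m, n = A.shape
    # Augmented tableau: [A | I | b] on top, [-c | 0 | 0] as the Z row.
    T = np.zeros((m + 1, n + m + 1))
    T[:m, :n] = A
    T[:m, n:n + m] = np.eye(m)       # slack variables form the initial basis
    T[:m, -1] = b
    T[-1, :n] = -c
    basis = list(range(n, n + m))

    while True:
        # Entering variable: most negative reduced cost (steepest improvement of Z).
        j = int(np.argmin(T[-1, :-1]))
        if T[-1, j] >= -tol:
            break                    # no improving direction: optimal vertex reached
        # Leaving variable: minimum-ratio test keeps the solution feasible.
        col = T[:m, j]
        ratios = np.full(m, np.inf)
        pos = col > tol
        ratios[pos] = T[:m, -1][pos] / col[pos]
        i = int(np.argmin(ratios))
        if not np.isfinite(ratios[i]):
            raise ValueError("objective is unbounded")
        # Pivot: normalize row i, then transvect it out of every other row.
        T[i] /= T[i, j]
        for r in range(m + 1):
            if r != i:
                T[r] -= T[r, j] * T[i]
        basis[i] = j

    x = np.zeros(n + m)
    x[basis] = T[:m, -1]
    return x[:n], T[-1, -1]          # optimal x and objective value Z

# Same toy problem as above: max x1 + 2*x2  s.t.  x1 + x2 <= 4, 3*x1 + x2 <= 6.
x_opt, z_opt = simplex(np.array([1.0, 2.0]),
                       np.array([[1.0, 1.0], [3.0, 1.0]]),
                       np.array([4.0, 6.0]))
print(x_opt, z_opt)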

