next up previous contents
Next: Conditional parameter spaces Up: The inversion algorithm Previous: Monte Carlo methods   Contents


The neighbourhood algorithm

The Voronoi decomposition of the parameter space is the base of an approximation of the misfit function $ L$ which is progressively refined during the inversion process. The approximation is set as constant inside each cell and the misfit value calculated at the central point is affected to the whole cell. A two-dimensional parameter space is given as an example in figure 2.2(a). The black dots are some model points for which a misfit is calculated.

The neighbourhood algorithm needs four tuning parameters:

$ it_{max}$ is the number of iteration to perform;
$ n_{s0}$ is the number of models chosen at random inside the parameter space at the beginning of the inversion;
$ n_s$ is the number of models to generate at each iteration;
$ n_r$ is the number of best cells (with the lowest misfit) where the $ n_s$ models are generated.

The inversion process is composed of the following phases:

  1. a set of $ n_{s0}$ models is randomly generated with a uniform probability in the parameter space;
  2. the misfit function is calculated for the most recently generated models;
  3. the $ n_r$ models with the lowest misfit of all models generated so far are selected;
  4. generate an average of $ \frac{n_s}{n_r}$ new samples with a uniform probability in each selected cell;
  5. add the $ n_s$ new samples to the previous ensemble of models and go back to (2).

Figure 2.2: Voronoi cells for a two-dimensional parameter space (from Sambridge (1999a)).
\includegraphics{fig_chapinv/voronoi.eps}

Figure 2.2(a) is an example of a two-dimensional parameter showing the models (black dots) and the limits of the Voronoi cells. $ n_{s0}$ (=9, in this case) models are generated and the grey cell has the lowest misfit. In this example, seven new models are generated in one cell ($ n_r=1$ , and $ n_s=7$ ). Figure 2.2(b) depicts the Voronoi geometry after the first iteration. The size of the original cell decreases as the sampling rate increases. If the cell with the grey outline has the lowest misfit, the density of sampling will not decrease systematically after each iteration. This is an interesting property of the Voronoi geometry that allows the centre of sampling to jump from place to place, whilst always sampling the most promising $ n_r$ regions simultaneously.

In the neighbourhood algorithm, a random walk (Gibbs sampler) is performed with a uniform probability density function inside the cell and zero outside. A walk is a sequence of perturbations to a model along all axis. The modified model is statistically independent of the original model. Asymptotically, the samples produced by this walk will be uniformly distributed inside the cell regardless of its shape. To confine the random walk inside a particular cell it is mandatory to calculate the multi-dimensional limits of the cell. Calculating the complete Voronoi geometry for high dimensional spaces becomes practically impossible when the number of models increases. Sambridge (1999a) proposed an original algorithm to compute only the limits along lines which are parallel to axis, in a precise and efficient way. These lines support the successive segments of the random walks.

There are only a few number of control parameters: $ n_{s0}$ , $ n_s$ , $ n_r$ , and $ it_{max}$ which is the maximum number of iterations. The neighbourhood algorithm is more exploratory if the $ n_s$ new samples are distributed on many cells and it optimizes more if they are restricted to the very few best cells. Typical values for the tuning parameters are 100 for $ n_{s0}$ , $ n_s$ , $ n_r$ . To force a better optimization, $ n_r$ may be set to 5, 10 or 50. Tests show that generally better misfits are obtained with fewer iterations if $ n_r$ is low, but the inversion is more trapped in local minima. The exploratory mode (e.g. $ n_r$ =100 and $ n_s$ =100) usually provides better final misfits if the inversion is conducted with a great number of iterations. The number of iterations ranges from 50 to 200. This makes a total of 5,000 to 20,000 generated models. Compared to linearized methods the number of forward computations is much greater. Consequently, the forward computation has to be correct for each parameter set without a visual check and it must be highly optimized to reduce the total computation time. These aspects are analysed in chapter 3 when designing the dispersion curve algorithm.

The neighbourhood algorithm like all other Monte Carlo techniques relies on a quasi or pseudo-random generator. A basic random generator on a computer is a series of numbers with a uniform probability, which is initialized by a special number called the random seed. The seed may take any integer value. Two inversion processes started with distinct seeds generate different models. However, if the problem is sufficiently constrained, the algorithm converges towards the same zone of the parameter space. For less constrained parameters, the investigated zones may be quite different. An interest of launching several inversion processes for the same case is to test the robustness of the fine result. All sets of models generated by separated processes can be merged to construct a more refined approximation of the misfit function.

The ensemble of models obtained from the neighbourhood algorithm has not the same statistical properties as the posterior probability density. Moreover, the statistical properties of the resulting ensemble strongly depend upon the tuning parameters. If lots of iterations are performed, the number of models near the best model is greater than for an inversion with less iterations. By the means of a resampling of the parameter space and approximating the posterior probability density with the misfit function, Sambridge (1999b) calculated the Bayesian integrals on an ensemble of models having the statistical properties corresponding to the posterior probability density. In our work, the algorithm we tested did not work properly, probably due to internal bugs. By the lack of time, we did not investigate more this approach but this second stage of the inversion is certainly valuable to measure the resolution and trade-off in a quantitative way.


next up previous contents
Next: Conditional parameter spaces Up: The inversion algorithm Previous: Monte Carlo methods   Contents
2007-03-15