2. PCA inversion
Let the spectral signal be $S(\lambda)$, which could represent the flux for a stellar spectral line, or any one of the Stokes parameters for a solar spectral line. For simplicity imagine for the moment that we are dealing with just a flux or intensity profile; generalisation to the other Stokes parameters will be discussed at the end of this section. Suppose the line is sampled at $N$ discrete wavelengths. Let $S_i$ denote the signal for the $i$th wavelength. Then the spectral profile can be written as an $N$-dimensional column vector, which we also call the signal vector $\mathbf{S}$,
$$\mathbf{S} = (S_1, S_2, \ldots, S_N)^T, \tag{1}$$
where $T$ denotes the matrix transpose.
For a particular atmospheric model $m$ one can generate a synthetic model profile $\mathbf{S}^m$. Inversion of an observed spectral line $\mathbf{S}^{\rm obs}$ involves finding the model which minimises the Euclidean distance
$$d = \|\mathbf{S}^{\rm obs} - \mathbf{S}^m\|. \tag{2}$$
This is a non-linear least squares problem which is usually solved by iterative adjustment of the model parameters (e.g. del Toro Iniesta & Ruiz Cobo 1996b). Our approach sidesteps the numerical difficulties of this method.
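Assume that we compute a database of model profiles
$$\mathbf{S}^j = (S_1^j, S_2^j, \ldots, S_N^j)^T, \quad j = 1, \ldots, M,$$
for $M$ models with parameters ranging over all likely values. Here $S_i^j$ is the signal for the $i$th wavelength and $j$th model. We use PCA to achieve a compact linear reconstruction of these model signals in terms of an orthonormal set of principal components, or eigenprofiles $\mathbf{e}_k$, $k = 1, \ldots, N$. We propose using the same reconstruction formula for an observed signal $\mathbf{S}^{\rm obs}$. This is equivalent to assuming the model is an adequate representation of reality.
In Sect. 3 we use the form
$$\mathbf{S} = \bar{\mathbf{S}} + \sum_{k=1}^{N} c_k \mathbf{e}_k, \quad c_k = (\mathbf{S} - \bar{\mathbf{S}}) \cdot \mathbf{e}_k,$$
where
$$\bar{\mathbf{S}} = \frac{1}{M} \sum_{j=1}^{M} \mathbf{S}^j$$
is the mean signal vector. In this case the eigenprofiles are the eigenvectors of the signal covariance, which we estimate using the entire model database as a training set. To do this we subtract the mean signal vector from each model signal vector and form the $N \times M$ matrix
$$\mathbf{X} = [\mathbf{S}^1 - \bar{\mathbf{S}}, \ldots, \mathbf{S}^M - \bar{\mathbf{S}}].$$
Ignoring a constant factor, the covariance estimate is
$$\mathbf{C} = \mathbf{X} \mathbf{X}^T.$$
Then we have
$$\mathbf{C} \mathbf{e}_k = \mu_k \mathbf{e}_k,$$
which can be solved by singular value decomposition (Press et al. 1988).
In Sect. 4.1 we use the alternative form,
$$\mathbf{S} = \sum_{k=1}^{N} c_k \mathbf{e}_k, \quad c_k = \mathbf{S} \cdot \mathbf{e}_k,$$
where the eigenprofiles are eigenvectors of the signal correlation. Now let $\mathbf{C}$ denote this correlation. Then $\mathbf{C}$, the eigenprofiles $\mathbf{e}_k$ and the eigenvalues $\mu_k$ are computed using the above formulae with
$$\mathbf{X} = [\mathbf{S}^1, \ldots, \mathbf{S}^M],$$
an $N \times M$ matrix composed of the synthetic signal vectors $\mathbf{S}^j$. This form is also used in Sect. 4.2, but the training data are the observations themselves, rather than the synthetic signals.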
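As an illustration only, the two forms can be sketched with NumPy. The function name `eigenprofiles`, the row-wise storage of the database and the toy Gaussian line profiles are assumptions made for this sketch, not part of the method as published. The sketch relies on the fact that if the data matrix has singular value decomposition U diag(s) V^T, then the columns of U are eigenvectors of the data matrix times its transpose, with eigenvalues s squared.

```python
import numpy as np

def eigenprofiles(signals, subtract_mean=True):
    """Return (mean, E, eigvals) for a database of model profiles.

    signals: array of shape (M, N), one synthetic profile per row
             (hypothetical layout chosen for this sketch).
    subtract_mean=True gives the covariance form, False the correlation form.
    """
    signals = np.asarray(signals, dtype=float)
    if subtract_mean:
        mean = signals.mean(axis=0)
    else:
        mean = np.zeros(signals.shape[1])
    X = (signals - mean).T  # N x M matrix whose columns are signal vectors
    # X = U diag(s) V^T implies X X^T = U diag(s^2) U^T, so the columns of U
    # are the eigenprofiles and s**2 the eigenvalues, in decreasing order.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return mean, U, s**2

# Toy database: Gaussian absorption lines with random depth and width
# (purely illustrative, not a physical atmosphere model).
rng = np.random.default_rng(0)
wavelengths = np.linspace(-1.0, 1.0, 50)
depths = rng.uniform(0.2, 0.8, size=200)
widths = rng.uniform(0.1, 0.4, size=200)
database = 1.0 - depths[:, None] * np.exp(-(wavelengths / widths[:, None]) ** 2)

mean, E, eigvals = eigenprofiles(database)
```

Working from the SVD of the data matrix avoids forming the covariance matrix explicitly, which is usually the better-conditioned route.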
Note that PCA achieves significant data compression. The principal components are indexed in decreasing order of significance in a least-squares sense, and good signal reconstruction can be obtained using only a small number $n$ of them, i.e. $n \ll N$. The appropriate cut-off can be estimated from a study of the eigenvalues (e.g. Murtagh & Heck 1987).
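The text does not prescribe a specific rule for the cut-off. One common heuristic, shown here purely as an assumption, is to keep the smallest number of components whose eigenvalues account for a chosen fraction of the total. The function name `choose_cutoff`, the threshold and the eigenvalues below are all hypothetical.

```python
import numpy as np

def choose_cutoff(eigvals, fraction=0.999):
    """Smallest n whose leading eigenvalues hold at least `fraction` of the total.

    eigvals must be sorted in decreasing order.
    """
    eigvals = np.asarray(eigvals, dtype=float)
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    # searchsorted returns the first index where cumulative >= fraction
    return int(np.searchsorted(cumulative, fraction) + 1)

# Made-up eigenvalue spectrum with a rapid fall-off.
eigvals = np.array([50.0, 30.0, 15.0, 4.0, 0.9, 0.1])
n = choose_cutoff(eigvals, fraction=0.9)
```

With these made-up eigenvalues the first three components hold 95 per cent of the total, so a 90 per cent threshold gives a cut-off of three.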
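Thus using PCA we can map the high-dimensional signal vector $\mathbf{S}$ onto a low-dimensional eigenfeature vector
$$\mathbf{c} = (c_1, c_2, \ldots, c_n)^T,$$
whose components $c_k$ are the coefficients of the first $n$ eigenprofiles in the reconstruction of $\mathbf{S}$.
Using the fact that the eigenprofiles are orthonormal, it follows that Eq. 2 is, to a good approximation, equivalent to
$$d \approx \|\mathbf{c}^{\rm obs} - \mathbf{c}^m\|,$$
where $d$ is the Euclidean distance of Eq. 2, and $\mathbf{c}^{\rm obs}$ and $\mathbf{c}^m$ are the eigenfeature vectors of the observed and model signals.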
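The step from signal space to eigenfeature space can be sketched as follows. The notation is assumed here for illustration: $\mathbf{e}_k$ are the orthonormal eigenprofiles, $c_k$ the corresponding expansion coefficients, $N$ the number of wavelengths and $n$ the number of retained components. If a mean signal is subtracted, it cancels in the difference between the two signals.

```latex
\begin{aligned}
\|\mathbf{S}^{\rm obs} - \mathbf{S}^{m}\|^{2}
  &= \Big\| \sum_{k=1}^{N} (c_k^{\rm obs} - c_k^{m})\, \mathbf{e}_k \Big\|^{2}
   = \sum_{k=1}^{N} (c_k^{\rm obs} - c_k^{m})^{2} \\
  &\approx \sum_{k=1}^{n} (c_k^{\rm obs} - c_k^{m})^{2}
   = \|\mathbf{c}^{\rm obs} - \mathbf{c}^{m}\|^{2}.
\end{aligned}
```

The second equality uses $\mathbf{e}_k \cdot \mathbf{e}_l = \delta_{kl}$. The only approximation is dropping the terms with $k > n$, which are small when both signals are well reconstructed by the first $n$ eigenprofiles.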
The eigenfeature vectors derived from the $M$ model signal vectors can be regarded as $M$ sample points on a hypersurface in $n$-dimensional eigenfeature space. We call this the model manifold, $\mathbf{c} = \mathbf{c}(m)$. Solving the inverse problem can be recast as finding the point on the model manifold closest to the observed eigenfeature vector $\mathbf{c}^{\rm obs}$, as shown schematically in Fig. 1 for a low-dimensional case. Thus the inverse problem has been reduced to a database search in a low-dimensional eigenfeature space.
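A minimal sketch of this database search follows, assuming a brute-force scan over all sample points rather than any particular fast search algorithm. The helper names `project` and `nearest_model`, the one-parameter family of Gaussian lines and the noise level are hypothetical choices for illustration.

```python
import numpy as np

def project(signal, mean, E, n):
    """First n coefficients of (signal - mean) on the eigenprofiles (columns of E)."""
    return (np.asarray(signal, dtype=float) - mean) @ E[:, :n]

def nearest_model(c_obs, model_features):
    """Index of, and distance to, the database point closest to c_obs."""
    d = np.linalg.norm(model_features - c_obs, axis=1)
    j = int(np.argmin(d))
    return j, float(d[j])

# Toy one-parameter model family: Gaussian lines of varying width.
rng = np.random.default_rng(1)
wavelengths = np.linspace(-1.0, 1.0, 60)
widths = np.linspace(0.1, 0.5, 300)
database = 1.0 - 0.6 * np.exp(-(wavelengths / widths[:, None]) ** 2)  # (M, N)

# Eigenprofiles from the SVD of the mean-subtracted database.
mean = database.mean(axis=0)
U, s, _ = np.linalg.svd((database - mean).T, full_matrices=False)

n = 4
features = project(database, mean, U, n)  # (M, n) samples of the model manifold

# Simulated observation: one model profile plus weak noise.
true_index = 137
observed = database[true_index] + rng.normal(0.0, 1e-4, size=wavelengths.size)
j, dist = nearest_model(project(observed, mean, U, n), features)
```

Each distance is now computed over `n` coefficients rather than over all wavelength samples, which is where the saving comes from.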
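Consider now the problem of inverting polarised spectra. Let $\mathbf{S}_I$, $\mathbf{S}_Q$, $\mathbf{S}_U$ and $\mathbf{S}_V$ be the profiles, or signal vectors, of the Stokes parameters $I$, $Q$, $U$ and $V$. Stokes inversion involves solving for the model (magnetic field and other atmospheric parameters) which minimises the composite Euclidean distance,
$$d^2 = \sum_{p \in \{I,Q,U,V\}} w_p \|\mathbf{S}_p^{\rm obs} - \mathbf{S}_p^m\|^2,$$
where the $w_p$ are constant weights. Applying PCA as above to each of the Stokes parameters one can readily show that
$$d^2 \approx \sum_{p \in \{I,Q,U,V\}} w_p \|\mathbf{c}_p^{\rm obs} - \mathbf{c}_p^m\|^2,$$
where the dimensions $n_p$ of the Stokes eigenfeature vectors $\mathbf{c}_p$ may differ. We now need to search four databases to find the closest points (with consistent model parameters) on the four model manifolds $\mathbf{c}_p(m)$.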
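One possible way to organise the four-database search is sketched below. It assumes that the four eigenfeature databases share a common model index, so that "consistent model parameters" simply means using the same row in each. The function name, the weights, the dimensions and the random stand-in features are all hypothetical.

```python
import numpy as np

def composite_distance_sq(obs_features, model_features, weights):
    """Weighted sum over Stokes parameters of squared eigenfeature distances.

    obs_features:   dict, Stokes label -> array of shape (n_p,)
    model_features: dict, Stokes label -> array of shape (M, n_p);
                    row j refers to the same model in every database
    weights:        dict, Stokes label -> float
    Returns an array of shape (M,), one composite squared distance per model.
    """
    total = 0.0
    for p, w in weights.items():
        diff = model_features[p] - obs_features[p]
        total = total + w * np.sum(diff**2, axis=1)
    return total

rng = np.random.default_rng(2)
M = 500
dims = {"I": 5, "Q": 3, "U": 3, "V": 4}              # eigenfeature dimension per Stokes parameter
weights = {"I": 1.0, "Q": 2.0, "U": 2.0, "V": 1.5}   # arbitrary illustrative weights

# Random stand-ins for the four eigenfeature databases.
model_features = {p: rng.normal(size=(M, n)) for p, n in dims.items()}

# Simulated observation: model 42 with a small perturbation.
obs_features = {
    p: model_features[p][42] + 1e-3 * rng.normal(size=n) for p, n in dims.items()
}

best = int(np.argmin(composite_distance_sq(obs_features, model_features, weights)))
```

Summing the weighted distances before taking the minimum guarantees a single model index for all four Stokes parameters, at the cost of scanning every model.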
Any observed spectrum will produce a model solution using this approach, so one must check that the solution is physically meaningful. One way to test this would be to reject solutions where the distance to each model manifold exceeds some prescribed threshold. There is, of course, the risk that the model itself is inadequate, but that is a problem faced by any inversion method. An interesting possibility might be to use the solution found by PCA inversion to initialise a non-linear least squares inversion procedure. Assuming the PCA solution is near optimality, one might expect rapid convergence. However, this is not guaranteed, given the unknown topology of the error surface.
As shown in Sects. 3 and 4, the eigenfeatures are smooth functions of the model parameters, so it should be easy to augment the eigenfeature databases by dense re-sampling using spline interpolation. This is the method adopted by Nene & Nayar (1996) in their computer vision application of PCA. A potential advantage of this smoothness is that it may not be necessary to compute extremely large training databases; all that might be needed is to choose model parameters which lead to eigenfeature vectors that are well distributed on the model manifold. Nene & Nayar describe a fast binary search algorithm for handling large databases.
© European Southern Observatory (ESO) 2000
Online publication: March 9, 2000