          Astron. Astrophys. 329, 863-872 (1998)

## 3. Principal component analysis

Principal component analysis (PCA) is a mathematical tool that reduces a multi-dimensional data set to a small number of linearly independent variables. A general description of the PCA method can be found in Kendall & Stuart (1976) or Jolliffe (1986). The special case of PCA applied to AGN spectra is described by Mittaz et al. (1990) and, for object-to-object variations, by Francis et al. (1992).

### 3.1. Description of the method

The analysis we applied to the spectra of each object was done as follows. We constructed a data matrix containing the fluxes of the m observations in the n wavelength bins of 1 Å width between 1229 and 1948 Å. These 720 bins cover nearly the entire SWP range, except the geocoronal Lyα emission below 1229 Å. The fluxes in the individual wavelength bins form a base of an n-dimensional space, in which the m observations are represented by m points.

The aim of the PCA is to form a new base in which only a small number of the n base vectors are needed to describe the variations. This new base is the one in which the covariance matrix of the data matrix is diagonal. Its vectors are thus the eigenvectors of the covariance matrix. They are called components, since they describe linearly independent forms of variations and can be represented as spectra (each component is a linear combination of the n initial wavelength bins). For a given component k, the corresponding eigenvalue of the covariance matrix is equal to the flux variance in this component. The contribution of component k to the total variability is given by its relative importance, defined as its eigenvalue divided by the sum of all n eigenvalues. The principal component is thus the component with the highest relative importance, and its direction in the n-space follows the maximal elongation of the m observation points.
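As a rough illustration of the procedure described above, the following sketch diagonalizes the covariance matrix of a synthetic flux matrix with NumPy and computes the relative importance of each component. The synthetic data, the function names `pca_components` and `relative_importance`, and all variable names are assumptions made for this example. They are not taken from the original analysis.

```python
import numpy as np

def pca_components(flux):
    """flux: (m, n) array of m observations in n wavelength bins.
    Returns the eigenvalues in descending order and the matching
    unit-norm eigenvectors (components) as columns."""
    cov = np.cov(flux, rowvar=False)        # (n, n) covariance between bins
    eigval, eigvec = np.linalg.eigh(cov)    # eigh: for symmetric matrices
    order = np.argsort(eigval)[::-1]        # largest variance first
    return eigval[order], eigvec[:, order]

def relative_importance(eigval):
    """Eigenvalue of each component divided by the sum of all eigenvalues."""
    return eigval / eigval.sum()

# Synthetic test data (hypothetical): one varying "continuum" plus noise.
rng = np.random.default_rng(0)
m, n = 40, 30
continuum = rng.normal(10.0, 2.0, size=m)   # flux level of each observation
shape = np.linspace(1.0, 0.5, n)            # fixed spectral shape
flux = continuum[:, None] * shape[None, :] + rng.normal(0.0, 0.1, size=(m, n))

eigval, eigvec = pca_components(flux)
importance = relative_importance(eigval)
print("relative importance of the principal component:", importance[0])
```

Because a single source of variation dominates the synthetic data, almost all of the variance ends up in the first (principal) component.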

The data matrix expressed in the new base contains the fluxes of the m observations in the n "pseudo-wavelengths" defined by the components. We therefore have the n lightcurves of the components, from which we derive the mean flux and the flux variance in each component i. The raw components are normalized to unity and their sign is undefined. In order to be able to add or subtract components, we have to normalize them properly. Two possible additive normalizations are (a) to multiply the components by their mean flux or (b) to multiply the squared components by their flux variance. The total component (i.e. the sum of all components) is then equal to (a) the mean spectrum or (b) the variance spectrum (i.e. respectively the mean flux and the flux variance in each wavelength bin).
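The two additive normalizations can be checked numerically. The sketch below, again on synthetic data with hypothetical variable names, builds the component lightcurves, applies normalizations (a) and (b), and sums over all components. It assumes the sample variance with `ddof=1`, which matches the default of `np.cov`.

```python
import numpy as np

# Synthetic flux matrix (hypothetical): m observations in n wavelength bins.
rng = np.random.default_rng(1)
m, n = 50, 20
flux = (rng.normal(5.0, 1.0, size=(m, n))
        + rng.normal(0.0, 2.0, size=(m, 1)) * np.linspace(0.2, 1.0, n))

cov = np.cov(flux, rowvar=False)
_, eigvec = np.linalg.eigh(cov)          # columns are unit-norm components

# Lightcurves: flux of every observation in every component.
lightcurves = flux @ eigvec              # shape (m, n)
mean_flux = lightcurves.mean(axis=0)     # mean flux per component
var_flux = lightcurves.var(axis=0, ddof=1)   # flux variance per component

# (a) components multiplied by their mean flux
mean_normalized = eigvec * mean_flux     # column k scaled by mean_flux[k]
# (b) squared components multiplied by their flux variance
var_normalized = eigvec**2 * var_flux

total_mean = mean_normalized.sum(axis=1)     # sum over all components
total_var = var_normalized.sum(axis=1)

print(np.allclose(total_mean, flux.mean(axis=0)))
print(np.allclose(total_var, flux.var(axis=0, ddof=1)))
```

Both normalizations also remove the sign ambiguity: flipping the sign of a raw component flips the sign of its mean flux, so their product is unchanged, and the squared component does not depend on the sign at all.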

In our analysis, we concentrate on only two components: the principal component and the rest component, which is constructed by subtracting the principal from the total component and is therefore the sum of the minor components. In Fig. 1, we show the decomposition into these two components of (a) the mean and (b) the variance spectrum taken around the C IV λ1549 line of NGC 3783. The lightcurve of the total component is obtained by a vectorial addition of the lightcurves in the n components (i.e. for each observation the squared flux in the total component is the sum of the squared fluxes in the n components).

Fig. 1. Principal component analysis of the C IV λ1549 line in NGC 3783. (a) decomposition of the mean spectrum (points) into the principal component (solid line) and the rest component (dotted line). (b) similar decomposition of the variance spectrum. The flux is expressed in

### 3.2. Properties of the components

The power of the PCA lies in the diagonalization of the covariance matrix of the data matrix. A diagonal covariance matrix ensures that all structures constituting a single PCA component vary simultaneously, while the different components have variations that are completely uncorrelated at zero lag.
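The zero-lag property can be verified directly: the covariance matrix of the component lightcurves should be diagonal, with the eigenvalues on the diagonal. The following sketch uses a synthetic flux matrix built from three hypothetical, independently varying spectral shapes. None of the names or numbers come from the original analysis.

```python
import numpy as np

# Synthetic data (hypothetical): three independently varying spectral
# shapes plus a small amount of noise.
rng = np.random.default_rng(2)
m, n = 60, 8
flux = (rng.normal(size=(m, 3)) @ rng.normal(size=(3, n))
        + rng.normal(0.0, 0.05, size=(m, n)))

cov = np.cov(flux, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)

# Covariance of the lightcurves expressed in the base of the components.
lightcurves = flux @ eigvec
cov_components = np.cov(lightcurves, rowvar=False)

off_diagonal = cov_components - np.diag(np.diag(cov_components))
max_off_diagonal = np.abs(off_diagonal).max()
print("largest off-diagonal covariance:", max_off_diagonal)
```

The off-diagonal terms vanish to machine precision. This only constrains the correlation at zero lag: two component lightcurves can still be correlated once one of them is shifted in time.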

We applied the PCA to the widest possible spectral range to include a significant fraction of the continuum in the analysis. This ensures that the principal component reflects the main continuum variations, since the continuum is known to vary with larger amplitudes than the emission lines. Fig. 1b illustrates that in NGC 3783 nearly all continuum variations are described by the principal component alone. Under this assumption, we can physically interpret the line profiles in the two components. The line profile in the principal component shows which line-part varies simultaneously with the continuum, whereas the line profile in the rest component shows line-parts that do not vary in step with the continuum.
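A toy model can make this interpretation concrete. The sketch below assumes a flat continuum that varies strongly and a Gaussian emission line that varies weakly and independently of the continuum. It then splits the variance spectrum into the principal and the rest component. The model, its parameters and the variable names are all hypothetical, and real spectra are of course more complex.

```python
import numpy as np

# Toy spectra (hypothetical): strongly varying flat continuum plus a
# weakly and independently varying Gaussian emission line.
rng = np.random.default_rng(3)
m, n = 80, 60
wave = np.arange(n)
line = np.exp(-0.5 * ((wave - 30) / 3.0) ** 2)   # line profile
cont_lc = 10.0 + 3.0 * rng.normal(size=m)        # continuum lightcurve
line_lc = 5.0 + 0.5 * rng.normal(size=m)         # line lightcurve
flux = (cont_lc[:, None] * np.ones(n) + line_lc[:, None] * line
        + rng.normal(0.0, 0.05, size=(m, n)))

cov = np.cov(flux, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)
order = np.argsort(eigval)[::-1]                 # principal component first
eigval, eigvec = eigval[order], eigvec[:, order]

# Variance normalization: squared components times their flux variance.
var_parts = eigvec**2 * eigval                   # column k = component k
total_var = var_parts.sum(axis=1)                # variance spectrum
principal_var = var_parts[:, 0]
rest_var = total_var - principal_var             # sum of minor components

# Fraction of the variance in line-free bins carried by the principal component.
continuum_bins = np.abs(wave - 30) > 12
continuum_fraction = (principal_var[continuum_bins].sum()
                      / total_var[continuum_bins].sum())
print("continuum variance in principal component:", continuum_fraction)
```

In this toy case the principal component carries nearly all of the continuum variance, while the rest component peaks at the position of the independently varying line.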

To test this interpretation and to further investigate the relationship between the continuum and our two components, we applied the PCA method to only the well-sampled observations of the 1989 monitoring campaign on NGC 5548 (Clavel et al. 1991). In Fig. 2a, we compare the principal and the rest component lightcurves with the original continuum lightcurve at 1350 Å from Clavel et al. (1991) in units of .

Fig. 2. PCA study of the 1989 monitoring campaign on NGC 5548. (a) logarithmic lightcurves of the continuum flux at 1350 Å from Clavel et al. (1991) (star points), of the principal component (filled points) and of the rest component (open points). (b) cross-correlations of the principal component (filled points) and the rest component (open points) with the continuum lightcurve at 1350 Å. (c) PCA decomposition of the C IV line as in Fig. 1a.

We analysed the correlation of the two components with the UV continuum using the interpolated cross-correlation function (ICCF) of Gaskell & Sparke (1986) as described by White & Peterson (1994). The results show that the principal component variations are indeed strongly correlated with those of the continuum flux at zero lag, while the rest component variations are completely uncorrelated at zero lag with those of the continuum. However, the cross-correlation between the rest component and the UV continuum suggests that the variable line-parts in the rest component follow the continuum with a delay of the order of 25 days. This lag is roughly twice as large as the lag between the continuum and the integrated Lyα and C IV lines (8-16 days) (Clavel et al. 1991), which means that the PCA divides the line into two parts: a line-part that responds to the continuum with a very small delay (≲ 5 days) and a line-part that responds with a much longer delay (∼25 days).
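The idea behind an interpolated cross-correlation can be sketched in a few lines. The version below is a simplified, one-sided variant that linearly interpolates only the first series. It is not the implementation of Gaskell & Sparke (1986) or White & Peterson (1994), and the function name `iccf`, the synthetic lightcurves and the 24-day input lag are assumptions made for the example.

```python
import numpy as np

def iccf(t_a, a, t_b, b, lags):
    """Correlation coefficient between a(t) and b(t + lag) for each lag,
    obtained by linearly interpolating series a at the times t_b - lag.
    A positive lag means that b follows a."""
    r = np.full(len(lags), np.nan)
    for i, lag in enumerate(lags):
        t_shift = t_b - lag
        # use only epochs where a can be interpolated, not extrapolated
        ok = (t_shift >= t_a[0]) & (t_shift <= t_a[-1])
        if ok.sum() > 2:
            a_interp = np.interp(t_shift[ok], t_a, a)
            r[i] = np.corrcoef(a_interp, b[ok])[0, 1]
    return r

# Synthetic lightcurves (hypothetical): the "line" repeats the "continuum"
# 24 days later.
t = np.arange(0.0, 240.0, 4.0)                       # days
true_lag = 24.0
continuum = np.sin(2 * np.pi * t / 90) + 0.3 * np.sin(2 * np.pi * t / 37)
line = (np.sin(2 * np.pi * (t - true_lag) / 90)
        + 0.3 * np.sin(2 * np.pi * (t - true_lag) / 37))

lags = np.arange(-40.0, 41.0, 2.0)
r = iccf(t, continuum, t, line, lags)
best_lag = lags[np.nanargmax(r)]
print("lag of maximum correlation:", best_lag)
```

With noisy, unevenly sampled data the peak is broader and its position less certain than in this noise-free example.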

The C IV line profile of these two parts is shown in Fig. 2c. The full width at half maximum (FWHM) of the line-part in the rest component is much narrower (1 850 km s⁻¹) than in the principal component (7 300 km s⁻¹). However, the full width at 1/4 maximum of the line-part in the rest component (6 650 km s⁻¹) shows that high velocities are also present.
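For reference, a width at a given fraction of the maximum can be measured from a sampled profile and converted to velocity through the Doppler relation. The sketch below is only one possible way of doing this, and the text does not state how the quoted widths were obtained. The function name `full_width`, the Gaussian test profile and the assumption of a single-peaked, continuum-subtracted line that falls below the chosen level inside the window are all hypothetical.

```python
import numpy as np

C_KMS = 299792.458   # speed of light in km/s

def full_width(wave, profile, fraction, rest_wave):
    """Full width (km/s) of a single-peaked profile at `fraction` of its
    maximum, using linear interpolation between neighbouring bins.
    Assumes the profile drops below the level on both sides."""
    level = fraction * profile.max()
    above = np.where(profile >= level)[0]
    lo, hi = above[0], above[-1]
    # wavelength at which the profile crosses the level on each side
    left = np.interp(level, [profile[lo - 1], profile[lo]],
                     [wave[lo - 1], wave[lo]])
    right = np.interp(level, [profile[hi + 1], profile[hi]],
                      [wave[hi + 1], wave[hi]])
    return C_KMS * (right - left) / rest_wave

# Test profile (hypothetical): Gaussian line at 1549 Å with sigma = 5 Å,
# sampled in 1 Å bins.
wave = np.arange(1500.0, 1600.0, 1.0)
sigma = 5.0
profile = np.exp(-0.5 * ((wave - 1549.0) / sigma) ** 2)

fwhm = full_width(wave, profile, 0.5, 1549.0)    # full width at half maximum
fwqm = full_width(wave, profile, 0.25, 1549.0)   # full width at 1/4 maximum
print(fwhm, fwqm)
```

For a Gaussian the width at 1/4 maximum is larger than the FWHM by a factor of √2, so a much larger ratio, as reported for the rest component, indicates a narrow core with broad wings.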

This analysis suggests that the line-part in the principal component is emitted in the inner broad-line region (BLR), since it responds with large amplitudes and nearly zero lag to the continuum. The line-part in the rest component, by contrast, consists of both a less variable contribution from the outer parts of the BLR, which respond to the continuum with a greater delay, and a non-varying contribution from the narrow-line region (NLR).

The lag of 25 days found for the rest component in NGC 5548 is comparable to the lag of the C III] λ1909 line (26-32 days) (Clavel et al. 1991) and to the Balmer-line lags (∼20 days for Hβ) (e.g. Peterson et al. 1991). It is therefore possible that the C IV line-part in the rest component is emitted at about the same distance from the ionizing continuum as the C III] or the Hβ line.

In other objects, the temporal sampling of the observations does not allow us to determine a clear lag between the two components and the continuum. However, this will become possible with the data from other monitoring campaigns, which will soon be available in the IUEFA.

© European Southern Observatory (ESO) 1998

Online publication: December 16, 1997 