Astron. Astrophys. 362, 851-864 (2000)

## 2. Deriving the FP

The choice of the best fitting procedure in the study of astronomical data is generally not a trivial question: the appropriate method should always be suggested by the scientific problem to be analyzed (see the discussion in Isobe et al. 1990, hereafter IFA90).

The derivation of the FP is usually based on least squares methods: the ordinary least squares (OLS) fits (e.g. Lucey et al. 1991, Kjærgaard et al. 1993, Hudson et al. 1997), in which the root mean square (rms) of the residuals relative to one of the variables is minimized; the orthogonal least squares (ORLS) fit (e.g. Busarello et al. 1997), in which the rms of the residuals perpendicular to the plane is minimized; and other methods derived from these, such as the bisector fit (e.g. Graham & Colless 1996) and the arithmetic mean of the OLS coefficients (e.g. Faber et al. 1987). To reduce the effect of outliers, the fits are often performed by robust procedures, in which the sum of absolute residuals is minimized (see JFK96).
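The difference between the OLS and ORLS criteria can be illustrated with a minimal sketch. The code below is not taken from any of the cited works: the function names, the plane coefficients (1.2, 0.3, -8.0) and the scatter value are arbitrary assumptions chosen only for illustration. The OLS fit minimizes residuals along the third variable, while the orthogonal fit takes the plane normal as the eigenvector of the sample covariance matrix with the smallest eigenvalue.

```python
import numpy as np

def ols_plane(x1, x2, x3):
    """OLS fit of x3 = a*x1 + b*x2 + c, minimizing residuals along x3."""
    design = np.column_stack([x1, x2, np.ones_like(x1)])
    coeffs, *_ = np.linalg.lstsq(design, x3, rcond=None)
    return coeffs  # array (a, b, c)

def orls_plane(x1, x2, x3):
    """Orthogonal fit: minimize residuals perpendicular to the plane."""
    data = np.column_stack([x1, x2, x3])
    mean = data.mean(axis=0)
    # The plane normal is the eigenvector of the sample covariance matrix
    # with the smallest eigenvalue (eigh sorts eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(np.cov(data, rowvar=False))
    n = eigvecs[:, 0]
    a, b = -n[0] / n[2], -n[1] / n[2]
    c = mean[2] - a * mean[0] - b * mean[1]  # plane passes through the centroid
    return np.array([a, b, c])

# Hypothetical simulated sample: scatter added along x3 only.
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
x2 = rng.normal(size=2000)
x3 = 1.2 * x1 + 0.3 * x2 - 8.0 + rng.normal(scale=0.3, size=2000)

ols_fit = ols_plane(x1, x2, x3)
orls_fit = orls_plane(x1, x2, x3)
print("OLS :", ols_fit)
print("ORLS:", orls_fit)
```

On data without scatter the two functions return the same plane; once scatter is present they generally differ, which is the method dependence discussed in this section.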

Fitting methods based on multivariate analysis techniques, like principal component analysis (e.g. Bender et al. 1992), have sometimes been adopted. By their nature, however, these methods are not suitable for the determination of best-fit coefficients, so they will not be considered in the following.

The most natural interpretation of the FP is in terms of a `mean relation' between global quantities of E galaxies, with respect to which they scatter in the space of the observed properties. Taking a hint from AkB96, we will translate this interpretation into mathematical terms by introducing a statistical model for the FP (Sect. 2.1). The corresponding fitting procedures will also be derived.

We will then discuss (Sect. 2.2) the problem of recovering the mean relation and illustrate the origin of the dependence of the FP coefficients on the fitting method.

### 2.1. A statistical model: the MIST fits

Let us introduce three random variables describing the distribution of global quantities of E galaxies. We will assume that they satisfy a linear identity, Eq. (2), in which one variable is expressed through the other two by means of two slopes and a zero point: these are the coefficients of the mean relation.

In the following, we will use capitals to indicate random variables (RVs) and the corresponding lower-case letters for their realizations. Given two RVs A and B, we will make use of their covariance, their variances and their expected values. The estimators of a given quantity (i.e. the RVs used to approximate the quantity we look for) will be marked with a caret. An estimator is said to be `unbiased' if its expected value coincides with the quantity to be estimated.

In modeling the scatter around the plane defined by Eq. (2), we have to consider not only the measurement errors on the variables but also an intrinsic dispersion.

To this end, let us introduce two sets of RVs with zero expected values, describing the intrinsic dispersion and the scatter due to measurement errors respectively, and let us write each observed quantity as the sum of the corresponding variable of Eq. (2), its intrinsic dispersion term and its measurement error term (Eqs. 3). The resulting RVs describe the distribution of the observed quantities of E galaxies.

The problem is then to determine unbiased estimators of the two slopes and of the zero point of Eq. (2).

From Eq. (2), one obtains the following identities:

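where the second-order moments are the components of the covariance matrix (CM) of the three variables of Eq. (2). Assuming that these variables, the intrinsic dispersion terms and the measurement error terms are mutually uncorrelated (hereafter the `non-correlation hypothesis') and making use of Eqs. (2) and (3), we can express the first and second order moments of the observed variables as (see App. A for a direct proof and compare AkB96):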

Eliminating the moments of the unobserved variables from Eqs. (4-6) by means of Eqs. (7-9), we obtain a linear system of three equations in the two slopes and the zero point, whose solution is:

where we put:

As the sample size increases, the expressions obtained by replacing the moments of the observed variables in the previous equations with their unbiased estimates will approximate the `true coefficients' of the mean relation more and more closely. These expressions thus furnish unbiased, asymptotically normal estimates of the slopes and of the zero point. The analytical formulae of the corresponding CM components are derived in App. B.

If the observed variables are normally distributed, the unbiased estimators of their moments are given by the following usual expressions:

The fitting procedure based on Eqs. (10-13) and (14-16) will hereafter be indicated by the acronym MIST (Measurement errors and Intrinsic Scatter Three-dimensional fit).
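The logic of a moment-based fit of this kind can be sketched as follows. This is not a transcription of Eqs. (10-16): it is a minimal illustration under the assumption that each observed variable is the sum of the corresponding mean-relation variable and of uncorrelated scatter terms, so that the covariance matrix of the scatter can simply be subtracted from the sample covariance matrix. The function name `moment_corrected_plane`, the argument `scatter_cm` and all numerical values are hypothetical.

```python
import numpy as np

def moment_corrected_plane(y, scatter_cm, dep=2):
    """Plane fit from the moments of the observed variables.

    y          : (N, 3) array of observed quantities.
    scatter_cm : (3, 3) assumed covariance matrix of intrinsic dispersion
                 plus measurement errors (hypothetical input).
    dep        : index of the variable treated as dependent.
    Returns (slopes, zero_point) for y[dep] = slopes . y[others] + zero_point.
    """
    y = np.asarray(y, dtype=float)
    ind = [i for i in range(3) if i != dep]
    mean = y.mean(axis=0)
    # Unbiased sample covariance (N - 1 in the denominator), minus the
    # assumed scatter, so that the moments refer to the mean relation.
    cm = np.cov(y, rowvar=False) - np.asarray(scatter_cm, dtype=float)
    # Solve the 2x2 linear system for the slopes; the scatter variance
    # along the dependent variable, scatter_cm[dep, dep], never enters.
    slopes = np.linalg.solve(cm[np.ix_(ind, ind)], cm[ind, dep])
    zero_point = mean[dep] - slopes @ mean[ind]
    return slopes, zero_point

# Hypothetical simulation: arbitrary plane and scatter values.
rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=(n, 2))
x3 = 1.2 * x[:, 0] + 0.3 * x[:, 1] - 8.0
sigma = np.array([0.3, 0.3, 0.2])             # assumed scatter per variable
y = np.column_stack([x, x3]) + rng.normal(size=(n, 3)) * sigma
scatter_cm = np.diag(sigma**2)

slopes, zero_point = moment_corrected_plane(y, scatter_cm)
naive_slopes, _ = moment_corrected_plane(y, np.zeros((3, 3)))
print("corrected:", slopes, zero_point)
print("no correction:", naive_slopes)
```

Passing a zero matrix for `scatter_cm` reduces the sketch to an OLS fit on the chosen dependent variable, whose slopes are biased when the independent variables carry scatter.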

### 2.2. Deriving the coefficients of the FP

To discuss the application of Eqs. (10-16) to the derivation of the FP, it is important to note the following points:

• The variances of the intrinsic dispersion and of the measurement errors along the dependent variable do not appear in Eqs. (10-13). The estimators of the slopes and of the zero point are therefore independent of the scatter along the dependent variable.

• Setting the other components of the CMs of the intrinsic dispersion and of the measurement errors equal to zero in Eqs. (10-13) and using Eqs. (14-16), we obtain exactly the expressions of the OLS estimators for the chosen dependent variable. If, on the other hand, every variable is affected by dispersion, the OLS quantities would give a biased estimate of the slopes, as shown by Eqs. (5, 6). To correct for the bias, the CM components of the intrinsic dispersion and of the measurement errors are needed (see AkB96).

• As follows from the points above, to recover the mean relation by Eqs. (10-13) we have ten degrees of freedom altogether, given by five elements each of the CMs of the intrinsic dispersion and of the measurement errors.

• Since Eqs. (3) have the same dependence on the three variables, the best-fit coefficients obtained by means of Eqs. (10-13) are independent of the choice of the dependent variable.

• New unbiased estimators of the coefficients in Eq. (2) can be defined by taking some average of the slopes of the three planes determined by Eqs. (10-13) using each of the three variables as the dependent variable. We can define a bisector plane (see also Graham & Colless 1996), whose slopes are given by the vectorial sum of the normal vectors to the three planes obtained by Eqs. (10-13), and whose zero point is obtained by Eq. (12). The advantage of the `bisector' fit lies in the greater efficiency (i.e. smaller variances for a given sample size) of the corresponding estimators (see Sect. 3.2, and IFA90 for the two-dimensional case).

• There are often large systematic inhomogeneities between different samples of data, due to differences in the procedures adopted to derive them (see e.g. Smith et al. 1997). To account for such inhomogeneities, the least squares fits (ORLS and OLS) are often performed using robust estimators. Robust statistics can also be applied to the MIST regression, by using robust estimators for the moments of the observed variables in Eqs. (10-13).
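The bisector construction mentioned in the list above, in which the normal vectors of the three planes are summed, can be sketched as follows. This is only an illustration under stated assumptions: the three planes are taken to be already rewritten with the third variable as dependent, each normal is normalized to unit length before summing, and the zero point is fixed by forcing the plane through a given centroid. The function name and all numerical values are hypothetical.

```python
import numpy as np

def bisector_plane(planes, centroid):
    """Combine planes x3 = a*x1 + b*x2 + c by summing their unit normals.

    planes   : iterable of (a, b) slope pairs, all expressed with x3 as
               the dependent variable (convention assumed by this sketch).
    centroid : length-3 array; the combined plane is forced through it
               (assumption made here to fix the zero point).
    """
    normals = np.array([[a, b, -1.0] for a, b in planes])
    # Unit normals give each plane the same weight; the common sign of the
    # third component keeps all normals on the same side of the plane.
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    n = normals.sum(axis=0)
    a, b = -n[0] / n[2], -n[1] / n[2]
    c = centroid[2] - a * centroid[0] - b * centroid[1]
    return a, b, c

# Hypothetical slopes from three fits and a hypothetical centroid.
planes = [(1.20, 0.30), (1.35, 0.33), (1.25, 0.36)]
centroid = np.array([0.0, 20.0, 0.5])
a_bis, b_bis, c_bis = bisector_plane(planes, centroid)
print(a_bis, b_bis, c_bis)
```

With this normalization the bisector slopes are weighted means of the input slopes, so they always lie between the smallest and the largest of the three values.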

From the derivation of the equations in the previous section, it follows that, in order to recover the coefficients of the mean relation, some requirements must be satisfied:

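• the variables of the mean relation, the intrinsic dispersion terms and the measurement error terms are not mutually correlated;

• estimates of the CM components of the measurement errors are known;

• estimates of the CM components of the intrinsic dispersion are available;

• unbiased estimates of the CM components of the observed variables are known.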

The hypothesis of non-correlation is practically equivalent to assuming that the measurement errors and the intrinsic scatter on the FP variables do not depend on the `location' on the FP. This is not the case for the measurement uncertainties on the observed parameters and, a priori, it might not be true for the intrinsic dispersion either. However, Eqs. (8-13) remain valid if the CM components of the intrinsic dispersion and of the measurement errors are simply replaced with their expected values as a function of the `position' on the plane (see App. A and B for details).

Concerning the measurement uncertainties, Eqs. (10-13) can still be used by adopting averaged values of the CM components of the measurement errors. To this end, we suggest proceeding as follows. The variances can be estimated as the square of the mean error on each parameter. As for the covariances, the only term that does not vanish is due to the correlation between the uncertainties in two of the parameters. These covariances are never given in the literature, so that one is forced to make some approximations, which we base on the relation between the two uncertainties reported by JFK96. The uncertainties introduced by this approximation will be discussed in Sect. 4.

The last two requirements cannot be satisfied because we do not have a physical model of the probability distribution of E galaxies in the space of the observed quantities. It is thus necessary to introduce some simplifying assumptions:

• for the intrinsic dispersion, we can only assign all of it to the dependent variable;

• for the moments of the observed variables, the only possible simple assumption is that these variables are normally distributed (see Eqs. 14-16).

Both these assumptions introduce a `bias' in the estimate of the slopes and of the zero point of the mean relation, so that the coefficients a, b and c in Eq. (1) do not necessarily coincide with them. Moreover, because all the intrinsic dispersion is assigned to the dependent variable, the estimates obtained by Eqs. (10-13) with different choices of the dependent variable do not correspond to the same statistical model and so do not define the same plane.

In order to better understand the bias introduced by these two assumptions, we used numerical simulations of the FP. The simulations were constructed by adding a scatter around the plane along one of the variables by means of a normal RV (see Sect. 3.1 for details). The variance of this RV was varied over the range of values obtained for the scatter of different samples of galaxies (see Sect. 4.1). The FP coefficients were derived for each simulation by the MIST fits under the two assumptions above, using each of the three variables in turn as the dependent variable, and by the bisector fit. We used the estimates of the CM of the measurement errors obtained from typical values of the uncertainties on the FP parameters (0.03, 0.1 and 0.03 for the three parameters).

In Fig. 1, we plot the coefficient a against the variance of the scatter (similar results are obtained for b and c). The values of the FP coefficients obtained by the various fitting methods turn out to be systematically different, by an amount that increases with the scatter. An estimate of this difference for the MIST fits will be given in Sect. 4 by comparing the FPs of different clusters of galaxies.
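The qualitative behaviour described here can be reproduced with a toy simulation. The sketch below is not the simulation of Sect. 3.1 and uses plain OLS fits rather than the MIST fits: it assumes an arbitrary plane (coefficients 1.2, 0.3, -8.0), independent unit-variance distributions for the first two variables and Gaussian scatter along the third. All scatter is attributed to whichever variable is chosen as dependent, and each fit is rewritten in the common form x3 = a x1 + b x2 + c so that the slopes can be compared.

```python
import numpy as np

def ols_slopes_as_x3(data, dep):
    """OLS with column `dep` as dependent variable, returned as the slopes
    (a, b) of the equivalent plane x3 = a*x1 + b*x2 + c."""
    ind = [i for i in range(3) if i != dep]
    design = np.column_stack([data[:, ind], np.ones(len(data))])
    coef, *_ = np.linalg.lstsq(design, data[:, dep], rcond=None)
    # Implicit form n . x + const = 0, with n[dep] = -1.
    n = np.zeros(3)
    n[ind] = coef[:2]
    n[dep] = -1.0
    return -n[0] / n[2], -n[1] / n[2]

def simulate(scatter, n=50_000, seed=0):
    """Hypothetical plane with Gaussian scatter along the third variable."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    x3 = 1.2 * x1 + 0.3 * x2 - 8.0 + rng.normal(scale=scatter, size=n)
    return np.column_stack([x1, x2, x3])

results = {}
for scatter in (0.1, 0.3, 0.5):
    data = simulate(scatter)
    # Slope a obtained with x1, x2 and x3 in turn as the dependent variable.
    results[scatter] = [ols_slopes_as_x3(data, dep)[0] for dep in range(3)]
    print(scatter, np.round(results[scatter], 3))
```

In this toy setup the fit with the third variable as dependent recovers the input slope, while the fit with the first variable as dependent returns a steeper slope, and the gap between the two grows with the scatter.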

Fig. 1. Coefficient a of the FP simulations against the variance of the scatter. The symbols correspond to the different fitting methods, as shown in the lower left.

The normality assumption likely introduces the same amount of bias in the various fits. To illustrate this point, let us consider an example in which the CM components of the intrinsic dispersion are `known'. We will assign given values to some of these components and assume that the other CM terms of the intrinsic dispersion and of the measurement errors vanish. We now create a FP simulation and derive a, b and c by the MIST fits, using the assigned values of the CM components of the intrinsic dispersion and of the measurement errors, and thus making only the normality assumption. The results of the fits are shown in Table 1: the FP coefficients are practically independent of the fitting method.

Table 1. Values of a, b and c of a FP simulation with intrinsic dispersion `known' (see text). Column 1: FP coefficients a, b and c. Columns 2, 3, 4 and 5: values obtained by the different MIST fits (see subscripts in the top row) with the corresponding uncertainties (1σ intervals).

We thus conclude that the differences in Fig. 1 are due only to the assignment of all the intrinsic dispersion to the dependent variable: the lack of a model for the intrinsic dispersion of the FP variables is at the origin of the dependence of the FP coefficients on the fitting procedure. The larger the scatter around the plane, the larger the bias due to the fitting method.

© European Southern Observatory (ESO) 2000

Online publication: October 30, 2000