
Astron. Astrophys. 362, 851-864 (2000)


2. Deriving the FP

The choice of the best fitting procedure in the study of astronomical data is generally not a trivial question: the appropriate method should always be dictated by the scientific problem to be analyzed (see the discussion in Isobe et al. 1990, hereafter IFA90).

The derivation of the FP is usually based on least squares methods: the ordinary least squares (OLS) fits (e.g. Lucey et al. 1991, Kjærgaard et al. 1993, Hudson et al. 1997), in which the root mean square (rms) of residuals relative to one of the variables is minimized; the orthogonal least squares (ORLS) fit (e.g. Busarello et al. 1997), in which the rms of residuals perpendicular to the plane is minimized; and other methods derived from these, such as the bisector fit (e.g. Graham & Colless 1996) and the arithmetic mean of the OLS coefficients (e.g. Faber et al. 1987). To reduce the effect of outliers, the fits are often performed by robust procedures, in which the sum of absolute residuals is minimized (see JFK96).
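The difference between the OLS and ORLS fits can be illustrated with a minimal sketch. The function names (`ols_plane`, `orthogonal_plane`), the plane written as z = a*x + b*y + c, and the synthetic data are assumptions made here for illustration only; they are not taken from the works cited above.

```python
import numpy as np

def ols_plane(x, y, z):
    """OLS fit of z = a*x + b*y + c, minimizing the residuals in z only."""
    design = np.column_stack([x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    return coeffs  # array (a, b, c)

def orthogonal_plane(x, y, z):
    """Orthogonal fit: minimizes the residuals perpendicular to the plane."""
    pts = np.column_stack([x, y, z])
    centroid = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)
    # eigh returns eigenvalues in ascending order: the eigenvector of the
    # smallest eigenvalue is the direction of least variance, i.e. the normal.
    _, evecs = np.linalg.eigh(cov)
    n = evecs[:, 0]
    a, b = -n[0] / n[2], -n[1] / n[2]
    c = centroid[2] - a * centroid[0] - b * centroid[1]
    return np.array([a, b, c])

# Illustrative synthetic data (all values are arbitrary assumptions).
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.normal(size=500)
z = 1.2 * x - 0.8 * y + 0.5 + rng.normal(scale=0.1, size=500)
print("OLS :", ols_plane(x, y, z))
print("ORLS:", orthogonal_plane(x, y, z))
```

On noise-free data the two fits coincide; with scatter they generally differ, because each minimizes the residuals along a different direction.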

Fitting methods based on multivariate analysis techniques, such as principal component analysis (e.g. Bender et al. 1992), have sometimes been adopted. By their nature, however, these methods are not suitable for the determination of best-fit coefficients, so they will not be considered in the following.

The most natural interpretation of the FP is in terms of a `mean relation' between global quantities of E galaxies, around which the galaxies scatter in the space of the observed properties. Taking a hint from AkB96, we will cast this interpretation in mathematical terms by introducing a statistical model for the FP (Sect. 2.1). The corresponding fitting procedures will also be derived.

We will then discuss (Sect. 2.2) the problem of recovering the mean relation and illustrate the origin of the dependence of the FP coefficients on the fitting method.

2.1. A statistical model: the MIST fits

Let us introduce three random variables [FORMULA], describing the distribution of global quantities of E galaxies. We will assume that [FORMULA] satisfy the following identity:


where [FORMULA] and [FORMULA] are the slopes and [FORMULA] is the zero point of the mean relation.

In the following, we will use capital letters to indicate random variables (RVs) and the corresponding lower-case letters for their realizations. Given two RVs A and B, we will indicate with [FORMULA] their covariance, with [FORMULA] and [FORMULA] their variances and with [FORMULA] and [FORMULA] their expected values. The estimators of a given quantity (i.e. the RVs used to approximate the quantity we look for) will be marked with a caret (e.g. [FORMULA], [FORMULA] and [FORMULA]). An estimator will be called `unbiased' if its expected value coincides with the quantity to be estimated.

In modeling the scatter around the plane defined by Eq. (2), we have to consider not only measurement errors on the variables but also an intrinsic dispersion.

To this end, let us introduce two sets of RVs, [FORMULA] and [FORMULA], with zero expected values, that describe respectively the intrinsic dispersion and the scatter due to measurement errors, and let us consider the following relations:


where [FORMULA] are the RVs that describe the distribution of the observed quantities of E galaxies (e.g. [FORMULA][FORMULA], [FORMULA][FORMULA], [FORMULA][FORMULA]).

The problem is then to determine unbiased estimators of [FORMULA], [FORMULA] and [FORMULA].

From Eq. (2), one obtains the following identities:


where [FORMULA] and [FORMULA] are the components of the covariance matrix (CM) of [FORMULA]. Assuming that [FORMULA], [FORMULA] and [FORMULA] are mutually uncorrelated (hereafter `hypothesis [FORMULA]') and making use of Eqs. (2) and (3), we can express the first and second order moments of [FORMULA] as (see App. A for a direct derivation and compare AkB96):


Eliminating [FORMULA] and [FORMULA] from Eqs. (4-6) by means of Eqs. (7-9), we obtain a linear system of three equations in [FORMULA], [FORMULA] and [FORMULA], whose solution is:


where we put:


As the sample size increases, the expressions obtained by substituting in the previous equations the terms [FORMULA] with their unbiased estimates [FORMULA] (hereafter [FORMULA], [FORMULA] and [FORMULA]) will approximate more and more closely the `true coefficients' [FORMULA], [FORMULA] and [FORMULA]. The quantities [FORMULA], [FORMULA] and [FORMULA] will thus furnish unbiased, asymptotically normal, estimates of [FORMULA], [FORMULA] and [FORMULA]. In App. B the analytical formulae of the corresponding CM components are also derived.

If [FORMULA] are normally distributed, the unbiased estimators for the moments of [FORMULA] are defined by the following usual expressions:


The fitting procedure based on Eqs. (10-13) and (14-16) will be indicated hereafter by the acronym MIST (Measurement errors and Intrinsic Scatter Three-dimensional fit).
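The general idea of such a fit can be sketched as a method-of-moments correction: the CM components of the intrinsic dispersion and of the measurement errors are subtracted from the sample covariances before solving for the slopes. The sketch below is an assumption-laden illustration of that idea and not a transcription of Eqs. (10-13); the function name `moment_corrected_plane`, its arguments and the synthetic numbers are hypothetical.

```python
import numpy as np

def moment_corrected_plane(data, cm_intrinsic=None, cm_errors=None):
    """Sketch of a moment-corrected plane fit (hypothetical helper).

    data : (n, 3) array; columns 0 and 1 are the independent variables,
           column 2 is the dependent variable.
    cm_intrinsic, cm_errors : optional (3, 3) covariance matrices of the
           intrinsic dispersion and of the measurement errors.
    Returns the slopes and zero point of x3 = alpha*x1 + beta*x2 + gamma.
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    sample_cm = np.cov(data, rowvar=False)  # unbiased (n - 1) estimator
    scatter_cm = np.zeros((3, 3))
    if cm_intrinsic is not None:
        scatter_cm = scatter_cm + np.asarray(cm_intrinsic, dtype=float)
    if cm_errors is not None:
        scatter_cm = scatter_cm + np.asarray(cm_errors, dtype=float)
    corrected = sample_cm - scatter_cm
    # Only five independent elements are used: the variance of the
    # dependent variable (element [2, 2]) never enters the solution.
    alpha, beta = np.linalg.solve(corrected[:2, :2], corrected[:2, 2])
    gamma = mean[2] - alpha * mean[0] - beta * mean[1]
    return alpha, beta, gamma

# Illustrative data: errors of known variance on the first variable only.
rng = np.random.default_rng(1)
true_1 = rng.normal(size=20000)
true_2 = rng.normal(size=20000)
dep = 1.2 * true_1 - 0.8 * true_2 + 0.5
observed = np.column_stack([true_1 + rng.normal(scale=0.3, size=20000),
                            true_2, dep])
errors = np.zeros((3, 3))
errors[0, 0] = 0.3 ** 2
print("uncorrected:", moment_corrected_plane(observed))
print("corrected  :", moment_corrected_plane(observed, cm_errors=errors))
```

With both matrices set to zero the sketch reduces to an ordinary least squares fit on the chosen dependent variable.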

2.2. Deriving the coefficients of the FP

To discuss the application of Eqs. (10-16) to the derivation of the FP, it is important to note the following points:

  • [FORMULA]. The terms [FORMULA] and [FORMULA] do not appear in Eqs. (10-13). The estimators of [FORMULA], [FORMULA] and [FORMULA] are therefore independent of the scatter along the dependent variable.

  • [FORMULA]. Setting the other components of the CMs of [FORMULA] and [FORMULA] equal to zero in Eqs. (10-13) and using Eqs. (14-16), we obtain exactly the expressions of the [FORMULA] estimators, where the subscript [FORMULA] is used to indicate the dependent variable in the fit. If, on the other hand, every variable is affected by dispersion, the quantities [FORMULA] would give a biased estimate of [FORMULA], as shown by Eqs. (5, 6). To correct for the bias, the CM components of [FORMULA] and [FORMULA] are needed (see AkB96).

  • [FORMULA]. As follows from [FORMULA] and [FORMULA], to recover the mean relation by Eqs. (10-13) we have altogether ten degrees of freedom, given by five elements of the CMs of [FORMULA] and [FORMULA] respectively.

  • [FORMULA]. Since Eqs. (3) have the same dependence on the three variables, the best-fit coefficients obtained by means of Eqs. (10-13) are independent of the choice of the dependent variable.

  • [FORMULA]. New unbiased estimators of the coefficients in Eq. (2) can be defined by taking some average of the slopes of the three planes determined by Eqs. (10-13) using each of the [FORMULA] as dependent variable. We can define a bisector plane (see also Graham & Colless 1996), whose slopes are given by the vector sum of the normal vectors to the three planes obtained by Eqs. (10-13), and whose zero point is obtained by Eq. (12). The advantage of the `bisector' fit (hereafter [FORMULA] fit) consists in the higher efficiency (i.e. smaller variances for a given sample size) of the corresponding estimators (see Sect. 3.2, and IFA90 for the two-dimensional case).

  • [FORMULA]. There are often large systematic inhomogeneities between different samples of data, due to differences in the procedures adopted to derive them (see e.g. Smith et al. 1997). To account for such inhomogeneities, the least squares fits (ORLS and OLS) are often performed using robust estimators. Robust statistics can also be applied to the MIST regression, by using robust estimators for the moments of [FORMULA] in Eqs. (10-13).
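The bisector construction mentioned in the list above (the vector sum of the normals to the three planes) can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, the normals are normalized to unit length and oriented consistently before summing, and the zero point is fixed by requiring the plane to pass through the centroid of the data, which is assumed here to be what Eq. (12) expresses.

```python
import numpy as np

def normal_from_fit(slopes, dep):
    """Unit normal of the plane x_dep = s0*x_i + s1*x_j + const.

    slopes : the two slopes, ordered by increasing index of the two
             independent variables; dep : index (0, 1 or 2) of the
             dependent variable. Hypothetical helper for illustration.
    """
    others = [k for k in range(3) if k != dep]
    n = np.empty(3)
    n[others] = slopes
    n[dep] = -1.0
    n /= np.linalg.norm(n)
    # Orient all normals the same way (negative third component),
    # otherwise the vector sum could partially cancel.
    return -n if n[2] > 0 else n

def bisector_plane(normals, centroid):
    """Plane x3 = a*x1 + b*x2 + c whose normal is the sum of the inputs."""
    n = np.sum(normals, axis=0)
    a, b = -n[0] / n[2], -n[1] / n[2]
    c = centroid[2] - a * centroid[0] - b * centroid[1]
    return a, b, c

# Usage with three slightly different planes (arbitrary numbers).
normals = [normal_from_fit((0.45, 0.52), dep=0),
           normal_from_fit((2.1, -1.05), dep=1),
           normal_from_fit((1.9, -0.95), dep=2)]
print(bisector_plane(normals, centroid=np.array([1.0, 1.0, 4.0])))
```

If the three input planes coincide, the bisector returns that same plane, which is a quick consistency check on the orientation convention.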

From the derivation of the equations in the previous section, it follows that, in order to recover the coefficients of the mean relation, [FORMULA], [FORMULA] and [FORMULA], some requirements must be satisfied:

  • [FORMULA]: [FORMULA], [FORMULA] and [FORMULA] are not mutually correlated;

  • [FORMULA]: estimates of the CM components of [FORMULA] are known;

  • [FORMULA]: estimates of the CM components of [FORMULA] are available;

  • [FORMULA]: unbiased estimates of the CM components of [FORMULA] are known.

The hypothesis [FORMULA] is practically equivalent to assuming that the measurement errors and the intrinsic scatter on the FP variables do not depend on the `location' on the FP. This is not the case for the measurement uncertainties on the observed parameters and, a priori, it might not be true even for the intrinsic dispersion. However, Eqs. (8-13) remain valid if the CM components of [FORMULA] and [FORMULA] are simply replaced with their expected values as a function of the `position' on the plane (see App. A and B for details).

Concerning the measurement uncertainties (hypothesis [FORMULA]), Eqs. (10-13) can still be used by adopting averaged values of [FORMULA]. To this end, we suggest proceeding as follows. The quantities [FORMULA] can be estimated as the square of the mean error on the parameters [FORMULA]. As for [FORMULA], the only term that does not vanish is [FORMULA], due to the correlation between the uncertainties in [FORMULA] and [FORMULA]. The quantities [FORMULA] are never given in the literature, so that one is forced to make some approximations. Since [FORMULA] with [FORMULA] (see JFK96), we can set [FORMULA]. The uncertainties introduced by this approximation will be discussed in Sect. 4.

Assumptions [FORMULA] and [FORMULA] cannot be satisfied because we do not have a physical model of the probability distribution of E galaxies in the space of the observed quantities. It is thus necessary to introduce some simplifying assumptions:

  • [FORMULA]: on the basis of [FORMULA] we can only assign all the intrinsic dispersion to the dependent variable.

  • [FORMULA]: the only possible simple assumption is that [FORMULA] are normally distributed (see Eqs. 14-16).

Both these assumptions introduce a `bias' in the estimate of [FORMULA], [FORMULA] and [FORMULA], so that the coefficients a, b and c in Eq. (1) do not necessarily coincide with [FORMULA], [FORMULA] and [FORMULA]. Moreover, because of [FORMULA], the estimates obtained by Eqs. (10-13) with a different choice of the dependent variable do not correspond to the same statistical model and so do not define the same plane.

In order to better understand the bias introduced by [FORMULA] and [FORMULA], we used numerical simulations of the FP. The simulations were constructed by adding a scatter around the plane in the [FORMULA] direction, drawn from a normal RV (see Sect. 3.1 for details). The variance of [FORMULA] was varied over the range of values obtained for the [FORMULA] scatter of different samples of galaxies (see Sect. 4.1). The FP coefficients were derived for each simulation by the MIST fits, with assumptions [FORMULA] and [FORMULA], using each of [FORMULA] as the dependent variable (hereafter [FORMULA] fits), and by the [FORMULA]. We used the estimates of [FORMULA] obtained from typical values of the uncertainties on FP parameters (0.03 in [FORMULA], 0.1 in [FORMULA] and 0.03 in [FORMULA]).
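A toy version of this kind of experiment can be sketched as follows. It is not the simulation of Sect. 3.1: the scatter is assumed here to act along the third variable, plain least squares fits stand in for the MIST fits (no measurement-error terms), and the plane coefficients, sample size and scatter values are arbitrary. The names `simulate_plane` and `fit_with_dependent` are hypothetical.

```python
import numpy as np

def simulate_plane(n, a, b, c, sigma, rng):
    """Points on x3 = a*x1 + b*x2 + c with normal scatter sigma along x3."""
    x1 = rng.normal(0.0, 1.0, n)
    x2 = rng.normal(0.0, 1.0, n)
    x3 = a * x1 + b * x2 + c + rng.normal(0.0, sigma, n)
    return np.column_stack([x1, x2, x3])

def fit_with_dependent(data, dep):
    """Least squares fit with column `dep` as the dependent variable,
    re-expressed in the common form x3 = a*x1 + b*x2 + c."""
    others = [k for k in range(3) if k != dep]
    design = np.column_stack([data[:, others], np.ones(len(data))])
    (s0, s1, k), *_ = np.linalg.lstsq(design, data[:, dep], rcond=None)
    # Fitted plane written as n . x + k = 0, with n[dep] = -1.
    n = np.zeros(3)
    n[others] = [s0, s1]
    n[dep] = -1.0
    return -n[0] / n[2], -n[1] / n[2], -k / n[2]

rng = np.random.default_rng(42)
for sigma in (0.05, 0.1, 0.2, 0.3):
    sample = simulate_plane(5000, 1.2, -0.8, 0.5, sigma, rng)
    a_values = [fit_with_dependent(sample, dep)[0] for dep in range(3)]
    print(f"sigma = {sigma:4.2f}  a (dep = x1, x2, x3):",
          np.round(a_values, 3))
```

In this toy setup the fit that takes the scattered variable as dependent recovers the input slope, while the fit with the first variable as dependent drifts away from it as sigma grows.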

In Fig. 1, we plot the coefficient a against [FORMULA] (similar results are obtained for b and c). The values of the FP coefficients obtained by the various fitting methods turn out to be systematically different, by an amount that increases with the scatter. An estimate of this difference for the MIST fits will be given in Sect. 4 by comparing the FPs of different clusters of galaxies.

[FIGURE] Fig. 1. Coefficient a of FP simulations against the [FORMULA] variance, [FORMULA]. The size of the simulations is [FORMULA]. The symbols correspond to different fitting methods as shown in the lower left.

The assumption [FORMULA] likely introduces the same amount of bias in the various fits. To illustrate this point, let us consider an example in which the CM components of the intrinsic dispersion are `known'. We will assume [FORMULA] (a scatter of [FORMULA] in [FORMULA] and [FORMULA]) and that the other CM terms of [FORMULA] and [FORMULA] vanish. We now create an FP simulation and derive a, b and c by the MIST fits, using the assigned values of the CM components of [FORMULA] and [FORMULA], and therefore making only the hypothesis [FORMULA]. The results of the fits are shown in Table 1: the FP coefficients are practically independent of the fitting method.


Table 1. Values of a, b and c of a FP simulation with intrinsic dispersion `known' (see text). The sample size is [FORMULA]. Column 1: FP coefficients a, b and c. Columns 2, 3, 4 and 5: values obtained by the different MIST fits (see subscripts in the top row) with corresponding uncertainties (1[FORMULA] intervals).

We then conclude that the differences in Fig. 1 are due only to the assumption [FORMULA]: the lack of a model for the intrinsic dispersion of the FP variables is thus at the origin of the dependence of the FP coefficients on the fitting procedure. The larger the scatter around the plane, the larger the bias due to the fitting method.


© European Southern Observatory (ESO) 2000

Online publication: October 30, 2000