## 2. Deriving the FP

The choice of the best fitting procedure in the study of astronomical data is generally not a trivial question: the appropriate method should always be dictated by the scientific problem under analysis (see the discussion in Isobe et al. 1990, hereafter IFA90). The derivation of the FP is usually based on least squares methods: the ordinary least squares (OLS) fits (e.g. Lucey et al. 1991, Kjærgaard et al. 1993, Hudson et al. 1997), in which the root mean square (rms) of the residuals relative to one of the variables is minimized; the orthogonal least squares (ORLS) fit (e.g. Busarello et al. 1997), in which the rms of the residuals perpendicular to the plane is minimized; and other methods derived from these, such as the bisector fit (e.g. Graham & Colless 1996) and the arithmetic mean of the OLS coefficients (e.g. Faber et al. 1987). To reduce the effect of outliers, the fits are often performed by robust procedures, in which the sum of the absolute residuals is minimized (see JFK96). Fitting methods based on multivariate analysis techniques, such as principal component analysis (e.g. Bender et al. 1992), have sometimes been adopted. By their nature, however, these methods are not suitable for the determination of best-fit coefficients, and they will not be considered in the following.

The most natural interpretation of the FP is in terms of a `mean relation' between global quantities of E galaxies, with respect to which they scatter in the space of the observed properties. Taking a hint from AkB96, we translate this interpretation into mathematical terms by introducing a statistical model for the FP (Sect. 2.1). The corresponding fitting procedures are also derived. We then discuss (Sect. 2.2) the problem of recovering the mean relation and illustrate the origin of the dependence of the FP coefficients on the fitting method.

## 2.1. A statistical model: the MIST fits

Let us introduce three random variables , describing the distribution of global quantities of E galaxies. We will assume that verify the following identity: where and are the slopes and is the zero point of the mean relation. In the following, we will use capitals to indicate random variables
(RV) and the corresponding lower-case letters for their outcomes. Given two RVs

In modeling the scatter around the plane defined by Eq. (2), we have to consider not only the measurement errors on the variables but also an intrinsic dispersion. To this aim, let us introduce two sets of RVs, and , with zero expected values, that describe respectively the intrinsic dispersion and the scatter due to measurement errors, and let us consider the following relations: where are the RVs that describe the distribution of the observed quantities of E galaxies (e.g. , , ). The problem is then to determine unbiased estimators of , and .

From Eq. (2), one obtains the following identities: where and are the components of the covariance matrix (CM) of . Assuming that , and are mutually uncorrelated (hereafter `hypothesis ') and making use of Eqs. (2) and (3), we can express the first and second order moments of as (see App. A for a direct proof and compare AkB96): Eliminating and from Eqs. (4-6) by means of Eqs. (7-9), we obtain a linear system of three equations in , and , whose solution is:

As the sample size increases, the expressions obtained by replacing in the previous equations the terms with their unbiased estimates (hereafter , and ) approximate the `true coefficients' , and more and more closely. The quantities , and thus furnish unbiased, asymptotically normal estimates of , and . The analytical formulae of the corresponding CM components are derived in App. B. If are normally distributed, the unbiased estimators for the moments of are defined by the following usual expressions: The fitting procedure based on Eqs. (10-13) and (14-16) will
be indicated hereafter by the acronyms

## 2.2. Deriving the coefficients of the FP

To discuss the application of Eqs. (10-16) to the derivation of the FP, it is important to note the following points:

- The terms and do not appear in Eqs. (10-13). The estimators of , and are therefore independent of the scatter along the dependent variable.
- Setting the other components of the CMs of and equal to zero in Eqs. (10-13) and using Eqs. (14-16), we obtain exactly the expressions of the estimators, where the subscript is used to indicate the dependent variable in the fit ^{3}. If, on the other hand, every variable is affected by dispersion, the quantities would give a biased estimate of , as shown by Eqs. (5, 6). To correct for the bias, the CM components of and are needed (see AkB96).
- As follows from and , to recover the mean relation by Eqs. (10-13) we have altogether ten degrees of freedom, given by five elements from each of the CMs of and .
- Since Eqs. (3) depend on the three variables in the same way, *the best-fit coefficients obtained by means of Eqs. (10-13) are independent of the choice of the dependent variable*.
- New unbiased estimators of the coefficients in Eq. (2) can be defined by taking some average of the slopes of the three planes determined by Eqs. (10-13) using each of the as dependent variable. We can define a *bisector plane* (see also Graham & Colless 1996), whose slopes are given by the vector sum of the normal vectors to the three planes obtained by Eqs. (10-13), and whose zero point is obtained by Eq. (12). The advantage of the `bisector' fit (hereafter fit) lies in the higher efficiency (i.e. smaller variances for a given sample size) of the corresponding estimators (see Sect. 3.2, and IFA90 for the two-dimensional case).
- There are often large systematic inhomogeneities between different data samples, due to differences in the procedures adopted to derive them (see e.g. Smith et al. 1997). To account for such inhomogeneities, the least squares fits (ORLS and OLS) are often performed using robust estimators. Robust statistics can also be applied to the MIST regression, by using robust estimators for the moments of in Eqs. (10-13).
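The points above can be illustrated with a short Python sketch. Because the explicit form of Eqs. (10-13) is not reproduced in this text, the code assumes a generic moment-based estimator: an assumed covariance matrix of the total scatter (intrinsic plus measurement, passed as the hypothetical argument `cm_scatter`) is subtracted from the sample covariance of the observed variables, and the slopes follow from the resulting 2x2 linear system. The function names, the argument layout, the unit normalization of the normal vectors in the bisector, and the choice of a zero point that makes the plane pass through the sample mean are all assumptions made for illustration, not the formulae of the paper.

```python
import numpy as np


def moment_plane_fit(y, cm_scatter=None, dep=2):
    """Fit y[:, dep] = a * y[:, i] + b * y[:, j] + c from sample moments.

    y          : (n, 3) array of observed quantities.
    cm_scatter : assumed 3x3 covariance matrix of the total scatter
                 (intrinsic + measurement); None means all zeros,
                 which reduces the fit to ordinary least squares.
    dep        : column index of the dependent variable.
    Returns (a, b, c), where i < j are the two remaining columns.
    """
    y = np.asarray(y, dtype=float)
    if cm_scatter is None:
        cm_scatter = np.zeros((3, 3))
    indep = [k for k in range(3) if k != dep]
    mean = y.mean(axis=0)
    # Sample covariance minus the scatter CM estimates the covariance of
    # the underlying variables, assuming mutually uncorrelated components.
    cov = np.cov(y, rowvar=False) - np.asarray(cm_scatter, dtype=float)
    # cov[dep, dep] is never used below, so the scatter along the
    # dependent variable does not enter the estimators.
    slopes = np.linalg.solve(cov[np.ix_(indep, indep)], cov[indep, dep])
    zero_point = mean[dep] - slopes @ mean[indep]
    return slopes[0], slopes[1], zero_point


def bisector_plane(y, cm_scatters, target=2):
    """Combine the three fits (one per dependent variable) into one plane.

    cm_scatters : sequence of three scatter CMs, one per fit (assumption).
    target      : column in terms of which the final plane is expressed.
    """
    y = np.asarray(y, dtype=float)
    mean = y.mean(axis=0)
    normal_sum = np.zeros(3)
    for dep in range(3):
        a, b, _ = moment_plane_fit(y, cm_scatters[dep], dep)
        i, j = [k for k in range(3) if k != dep]
        normal = np.zeros(3)
        normal[[i, j, dep]] = (a, b, -1.0)  # a*y_i + b*y_j - y_dep + c = 0
        normal /= np.linalg.norm(normal)    # unit normal (assumption)
        if normal[target] > 0:              # orient all normals the same way
            normal = -normal
        normal_sum += normal
    indep = [k for k in range(3) if k != target]
    slopes = -normal_sum[indep] / normal_sum[target]
    zero_point = mean[target] - slopes @ mean[indep]
    return slopes[0], slopes[1], zero_point
```

Calling `moment_plane_fit(y)` without a scatter matrix gives the ordinary least squares plane for the chosen dependent variable, while passing the scatter CM removes the bias caused by dispersion on the independent variables. When the same, correct CM is used for all three choices of `dep`, the three planes agree up to sampling noise, and the bisector then mainly serves to reduce the variance of the estimates.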
From the derivation of the equations in the previous section, it follows that, in order to recover the coefficients of the mean relation, , and , some requirements must be satisfied:

- , and are not mutually correlated;
- estimates of the CM components of are known;
- estimates of the CM components of are available;
- unbiased estimates of the CM components of are known.
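To make explicit why these covariance components are needed, the moment relations behind the fit can be sketched as follows. The notation is an assumption introduced here for illustration only: $X_i$ for the underlying variables, $\Delta_i$ and $\Sigma_i$ for the intrinsic and measurement scatter, $Y_i$ for the observed variables, $d_{ij}$ and $s_{ij}$ for the CM components of the two scatter terms, $V$ for the variance and $C$ for the covariance. The block is not a reproduction of the numbered equations of the paper.

```latex
\begin{align*}
X_3 &= \alpha X_1 + \beta X_2 + \gamma, \qquad
Y_i = X_i + \Delta_i + \Sigma_i \quad (i = 1, 2, 3), \\
E(Y_3) &= \alpha\, E(Y_1) + \beta\, E(Y_2) + \gamma, \\
C(Y_i, Y_j) &= C(X_i, X_j) + d_{ij} + s_{ij}
  \quad \text{(uncorrelated components)}, \\
C(Y_1, Y_3) - d_{13} - s_{13} &=
  \alpha \left[ V(Y_1) - d_{11} - s_{11} \right]
  + \beta \left[ C(Y_1, Y_2) - d_{12} - s_{12} \right], \\
C(Y_2, Y_3) - d_{23} - s_{23} &=
  \alpha \left[ C(Y_1, Y_2) - d_{12} - s_{12} \right]
  + \beta \left[ V(Y_2) - d_{22} - s_{22} \right].
\end{align*}
```

In this form the last two lines are a linear system in $\alpha$ and $\beta$, and the zero point follows from the relation between the expected values. Only five components of each scatter CM appear ($d_{11}, d_{12}, d_{22}, d_{13}, d_{23}$ and the corresponding $s_{ij}$), while $d_{33}$ and $s_{33}$ do not, which is consistent with the ten degrees of freedom counted in the text.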
The hypothesis is practically equivalent to assuming that the measurement errors and the intrinsic scatter on the FP variables do not depend on the `location' on the FP. This is not the case for the measurement uncertainties on the observed parameters and,

Concerning the measurement uncertainties (hypothesis ), Eqs. (10-13) can still be used by adopting averaged values of . To this aim, we suggest proceeding as follows. The quantities can be estimated as the square of the mean error

Assumptions and cannot be satisfied because we do not have a physical model of the probability distribution of E galaxies in the space of the observed quantities. It is thus necessary to introduce some simplifying assumptions:

- on the basis of we can only assign all the intrinsic dispersion to the dependent variable.
- the only possible simple assumption is that are normally distributed (see Eqs. 14-16).
Both these assumptions introduce a `bias' in the estimate of , and , so that

In order to better understand the bias introduced by and , we used numerical simulations of the FP. The simulations were constructed by adding a scatter around the plane in the direction, drawn from a normal RV (see Sect. 3.1 for details). The variance of was varied over the range of values obtained for the scatter of different samples of galaxies (see Sect. 4.1). The FP coefficients were derived for each simulation by the MIST fits, with assumptions and , using each of as the dependent variable (hereafter fits), and by the . We used the estimates of obtained from typical values of the uncertainties on the FP parameters (0.03 in , 0.1 in and 0.03 in ). In Fig. 1, we plot the coefficient
The assumption likely introduces the same amount of bias in the various fits. To illustrate this point, let us consider an example in which the CM components of the intrinsic dispersion are `known'. We will assume (a scatter of in and )

We then conclude that the differences of Fig. 1 are due only to the assumption :

© European Southern Observatory (ESO) 2000. Online publication: October 30, 2000