Astron. Astrophys. 362, 851-864 (2000)
2. Deriving the FP
Choosing the best fitting procedure for astronomical data is
generally not a trivial question: the appropriate
method should always be suggested by the scientific problem to be
analyzed (see the discussion in Isobe et al. 1990, hereafter
IFA90).
The derivation of the FP is usually based on least squares methods:
the ordinary least squares (OLS) fits (e.g. Lucey et al. 1991,
Kjærgaard et al. 1993, Hudson et al. 1997), in which the root
mean square (rms) of the residuals relative to one of the variables is
minimized; the orthogonal least squares (ORLS) fit (e.g. Busarello et
al. 1997), in which the rms of the residuals perpendicular
to the plane is minimized; and other methods derived from these, such as the
bisector fit (e.g. Graham & Colless 1996) and the arithmetic mean
of the OLS coefficients (e.g. Faber et al. 1987). To reduce the effect
of outliers, the fits are often performed by robust procedures, in
which the sum of absolute residuals is minimized (see JFK96).
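The difference between the OLS and ORLS fits described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`to_slopes`, `ols_plane`, `orthogonal_plane`), the synthetic sample, and the plane coefficients (1.2, 0.3, -8.0) are all hypothetical choices made for illustration, and every fit is re-expressed in the common form x3 = a*x1 + b*x2 + c so that the results can be compared.

```python
import numpy as np

def to_slopes(normal, offset):
    """Convert the plane normal . x + offset = 0 into x3 = a*x1 + b*x2 + c."""
    a = -normal[0] / normal[2]
    b = -normal[1] / normal[2]
    c = -offset / normal[2]
    return a, b, c

def ols_plane(x, dep):
    """Ordinary least squares with column `dep` of x (shape (n, 3)) as dependent variable."""
    indep = [k for k in range(3) if k != dep]
    design = np.column_stack([x[:, indep], np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(design, x[:, dep], rcond=None)
    # Rewrite x_dep = p*x_i + q*x_j + r as normal . x + r = 0.
    normal = np.zeros(3)
    normal[indep] = coef[:2]
    normal[dep] = -1.0
    return to_slopes(normal, coef[2])

def orthogonal_plane(x):
    """Orthogonal least squares: the normal is the eigenvector of the
    sample covariance matrix with the smallest eigenvalue."""
    mean = x.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(x, rowvar=False))
    normal = eigvec[:, 0]  # eigh returns eigenvalues in ascending order
    return to_slopes(normal, -normal @ mean)

# Hypothetical synthetic sample: an exact plane plus equal scatter on all variables.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(0.0, 1.0, n)
x2 = rng.normal(0.0, 1.0, n)
x3 = 1.2 * x1 + 0.3 * x2 - 8.0
data = np.column_stack([x1, x2, x3]) + rng.normal(0.0, 0.3, (n, 3))

for dep in range(3):
    print(f"OLS, dependent x{dep + 1}:", np.round(ols_plane(data, dep), 3))
print("orthogonal:", np.round(orthogonal_plane(data), 3))
```

On scatter-free data all four fits return the same plane; once scatter is added to every variable, the slopes depend on which residuals are minimized.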
Fitting methods based on multivariate analysis techniques, such as
principal component analysis (e.g. Bender et al. 1992), have
sometimes been adopted. By their nature, however, these methods are not
suitable for the determination of best-fit coefficients, so
they will not be considered in the following.
The most natural interpretation of the FP is in terms of a `mean
relation' between global quantities of E galaxies, around
which they scatter in the space of the observed properties. Taking a
cue from AkB96, we will cast this interpretation in mathematical
terms by introducing a statistical model for the FP (Sect. 2.1).
The corresponding fitting procedures will also be derived.
We will then discuss (Sect. 2.2) the problem of recovering the
mean relation and illustrate the origin of the dependence of the FP
coefficients on the fitting method.
2.1. A statistical model: the MIST fits
Let us introduce three random variables
, describing the distribution of
global quantities of E galaxies. We will assume that
verify the following identity:
![[EQUATION]](img10.gif)
where and
are the slopes and
is the zero point of the mean
relation.
In the following, we will use capitals to indicate random variables
(RV) and the corresponding lowercase letters for their realizations. Given two
RVs A and B, we will indicate with
their covariance, with
and
their variances and with
and
their expected values. The
estimators of a given quantity (i.e. the RVs used to approximate the
quantity we look for) will be marked with a caret (e.g.
,
and ). An estimator is said to be
`unbiased' if its expected value coincides with the quantity to be
estimated.
In modeling the scatter around the plane defined by Eq. (2),
we must consider not only measurement errors on the variables but
also an intrinsic dispersion.
To this end, let us introduce two sets of RVs,
and
, with zero expected values, that
describe respectively the intrinsic dispersion and the scatter due to
measurement errors, and let us consider the following relations:
![[EQUATION]](img24.gif)
where are the RVs that describe
the distribution of the observed quantities of E galaxies (e.g.
![[FORMULA]](img26.gif) ,
![[FORMULA]](img28.gif) ,
![[FORMULA]](img30.gif) ).
The problem is then to determine unbiased estimators of
,
and .
From Eq. (2), one obtains the following identities:
![[EQUATION]](img32.gif)
where and
are the components of the covariance
matrix (CM) of . Assuming that
,
and are mutually uncorrelated
(hereafter `hypothesis ') and making
use of Eqs. (2) and (3), we can express the first and second
order moments of as (see App. A for a
direct proof and compare AkB96):
![[EQUATION]](img38.gif)
Eliminating and
from Eqs. (4-6) by means of
Eqs. (7-9), we obtain a linear system of three equations in
,
and , whose solution is:
![[EQUATION]](img41.gif)
where we put:
![[EQUATION]](img42.gif)
On increasing the sample size, the expressions obtained by
substituting in the previous equations the terms
with their unbiased estimates
(hereafter
,
and ), will approximate more and more
closely the `true coefficients' ,
and
. The quantities
,
and will thus furnish unbiased,
asymptotically normal, estimates of ,
and
. In App. B the analytical formulae
for the corresponding CM components are also derived.
If are normally distributed, the
unbiased estimators for the moments of
are defined by the following usual
expressions:
![[EQUATION]](img46.gif)
The fitting procedure based on Eqs. (10-13) and (14-16) will
be referred to hereafter by the acronym MIST (Measurement
errors and Intrinsic Scatter Three-dimensional fit).
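The idea behind a moment-based fit that accounts for measurement errors and intrinsic scatter can be sketched in Python as follows. This is not a transcription of Eqs. (10-13); it is a generic moment-correction estimator in the spirit of AkB96, written under these assumptions: the mean relation has the form x3 = alpha*x1 + beta*x2 + gamma, the total covariance matrix of the scatter (intrinsic plus measurement, here called `scatter_cov`) is known, and the scatter is uncorrelated with the underlying variables. The function name, the symbols and the numerical values are hypothetical.

```python
import numpy as np

def moment_corrected_plane(y, scatter_cov):
    """Estimate (alpha, beta, gamma) of x3 = alpha*x1 + beta*x2 + gamma.

    y           : (n, 3) array of observed quantities
    scatter_cov : (3, 3) assumed covariance matrix of the total scatter
    """
    mean = y.mean(axis=0)
    # Unbiased sample covariance of the observed data minus the scatter term
    # approximates the covariance of the underlying variables.
    cov_true = np.cov(y, rowvar=False) - scatter_cov
    # Normal equations of the mean relation written with the corrected moments.
    alpha, beta = np.linalg.solve(cov_true[:2, :2], cov_true[:2, 2])
    gamma = mean[2] - alpha * mean[0] - beta * mean[1]
    return alpha, beta, gamma

# Hypothetical simulation with a known plane and a known scatter covariance.
rng = np.random.default_rng(42)
n = 5000
alpha_true, beta_true, gamma_true = 1.2, 0.3, -8.0
x1 = rng.normal(0.0, 1.0, n)
x2 = rng.normal(0.0, 1.0, n)
x3 = alpha_true * x1 + beta_true * x2 + gamma_true
scatter_cov = np.diag([0.3, 0.3, 0.3]) ** 2
noise = rng.multivariate_normal(np.zeros(3), scatter_cov, n)
y = np.column_stack([x1, x2, x3]) + noise

corrected = moment_corrected_plane(y, scatter_cov)
uncorrected = moment_corrected_plane(y, np.zeros((3, 3)))  # reduces to OLS on x3
print("corrected:  ", np.round(corrected, 3))
print("uncorrected:", np.round(uncorrected, 3))
```

In this sketch the variance of the scatter along the dependent variable, `scatter_cov[2, 2]`, never enters the solution, and setting `scatter_cov` to zero gives back the ordinary least squares fit with x3 as the dependent variable.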
2.2. Deriving the coefficients of the FP
To discuss the application of Eqs. (10-16) to the derivation
of the FP, it is important to note the following points:
-
. The terms
and
do not appear in Eqs. (10-13).
The estimators of ,
and
are therefore independent of the
scatter along the dependent variable.
-
. Setting to zero the other
components of the CMs of and
in Eqs. (10-13) and using
Eqs. (14-16), we obtain exactly the expressions of the
estimators, where the subscript
is used to indicate the dependent
variable in the
fit 3. If, on the
other hand, every variable is affected by dispersion, the quantities
would give a biased estimate of
as shown by Eqs. (5, 6). To
correct for the bias the CM components of
and
are needed (see AkB96).
-
. As follows from
and
, to recover the mean relation by
Eqs. (10-13) we have altogether ten degrees of freedom, given by
five elements of the CMs of and
respectively.
-
. Since Eqs. (3) have the same
dependence on the three variables, the best-fit coefficients
obtained by means of Eqs. (10-13) are independent of the choice
of the dependent variable .
-
. New unbiased estimators of the
coefficients in Eq. (2) can be defined by taking some average of
the slopes of the three planes determined by Eqs. (10-13) using
each of the as dependent variable.
We can define a bisector plane (see also Graham & Colless
1996), whose slopes are given by the vectorial sum of the normal
vectors to the three planes obtained by Eqs. (10-13), and whose
zero point is obtained by Eq. (12). The advantage of the
`bisector' fit (hereafter fit)
lies in the higher efficiency (i.e. smaller variances for a given
sample size) of the corresponding estimators (see Sect. 3.2, and IFA90
for the two-dimensional case).
-
. There are often large systematic
inhomogeneities between different samples of data, due to differences
in the procedures adopted to derive them (see e.g. Smith et al. 1997).
To account for such inhomogeneities, the least squares fits (ORLS and
OLS) are often performed using robust estimators. Robust
statistics can also be applied to the MIST regression, by using
robust estimators for the moments of
in Eqs. (10-13).
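The construction of the bisector plane mentioned in the list above (slopes from the vector sum of the normals of the three planes) can be sketched in Python. The sketch makes several assumptions that are not stated in the text: each input plane is already expressed as x3 = a*x1 + b*x2 + c, the normals are normalized to unit length before being summed, and the zero point is fixed by requiring the plane to pass through the sample centroid (the paper uses its Eq. (12), which is not reproduced here). The function name and the input numbers are hypothetical.

```python
import numpy as np

def bisector_plane(slopes, centroid):
    """Combine several planes x3 = a*x1 + b*x2 + c into a bisector plane.

    slopes   : iterable of (a, b) pairs, one per fitted plane
    centroid : (mean_x1, mean_x2, mean_x3) of the sample (assumed zero-point rule)
    """
    # Normal of x3 = a*x1 + b*x2 + c is (a, b, -1); the common sign of the
    # third component keeps all normals on the same side of the plane.
    normals = np.array([[a, b, -1.0] for a, b in slopes])
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    total = normals.sum(axis=0)
    a = -total[0] / total[2]
    b = -total[1] / total[2]
    c = centroid[2] - a * centroid[0] - b * centroid[1]
    return a, b, c

# Hypothetical slopes from three fits, one per choice of dependent variable.
slopes = [(1.10, 0.28), (1.28, 0.27), (1.20, 0.36)]
print(np.round(bisector_plane(slopes, centroid=(0.0, 0.0, -8.0)), 3))
```

If the three input planes coincide, the function returns that same plane, which is a quick sanity check on the orientation convention.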
From the derivation of the equations in the previous section, it
follows that, in order to recover the coefficients of the mean
relation, ,
and
, some requirements must be
satisfied:
The hypothesis is practically
equivalent to assuming that the measurement errors and the intrinsic
scatter on the FP variables do not depend on the `location' on the FP.
This is not the case for the measurement uncertainties on the observed
parameters and, a priori, it might not hold even for the
intrinsic dispersion. However, Eqs. (8-13) continue to be valid by
simply substituting the CM components of
and
with their expected values as a
function of the `position' on the plane (see App. A and B for
details).
Concerning the measurement uncertainties (hypothesis
), Eqs. (10-13) can still be
used by adopting averaged values of .
To this end, we suggest proceeding as follows. The quantities
can be estimated as the square of
the mean error 4
on the parameters . As for
, the only term that does not vanish,
is due to the correlation between
the uncertainties in and
. The quantities
are never given in the literature, so
that one is forced to make some approximations. Since
with
(see JFK96), we can set
. The uncertainties introduced by
this approximation will be discussed in Sect. 4.
Assumptions and
cannot be satisfied because we do
not have a physical model of the probability distribution of E
galaxies in the space of the observed quantities. It is thus necessary
to introduce some simplifying assumptions:
Both these assumptions introduce a `bias' in the estimate of
,
and , so that the
coefficients a, b and c in
Eq. (1) do not necessarily coincide with
,
and . Moreover, because of
, the estimates obtained by
Eqs. (10-13) with a different choice of the dependent variable do
not correspond to the same statistical model and so do not define the
same plane.
In order to better understand the bias introduced by
and
, we used numerical simulations of
the FP. The simulations were constructed by adding scatter around the
plane in the direction by means of a normal
RV (see Sect. 3.1 for details). The variance of
has been varied in the range of
values obtained for the scatter of
different samples of galaxies (see Sect. 4.1). The FP
coefficients have been derived for each simulation by the MIST fits,
with assumptions and
, using each of
as the dependent variable (hereafter
fits), and by the
. We used the estimates of
obtained by typical values of the
uncertainties on FP parameters (0.03 in
, 0.1 in
and 0.03 in
).
In Fig. 1, we plot the coefficient a against
(similar results are obtained for
b and c). The values of the FP coefficients obtained by
the various fitting methods turn out to be systematically different,
by an amount that increases with the scatter. An estimate of this
difference for the MIST fits will be given in Sect. 4 by
comparing the FPs of different clusters of galaxies.
![[FIGURE]](img86.gif)
Fig. 1. Coefficient a of FP simulations against the variance, . The size of the simulations is . The symbols correspond to different fitting methods as shown in the lower left.
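The trend described above, a method-dependent difference in the coefficient a that grows with the scatter, can be reproduced qualitatively with a short Python sketch. It does not use the MIST fits or the simulation setup of Sect. 3.1: as a stand-in, it compares plain OLS fits with each variable taken in turn as the dependent one, and it assumes that the scatter is applied along x3 only. The function names, the plane coefficients, the sample size and the scatter values are hypothetical.

```python
import numpy as np

def slope_a(x, dep):
    """OLS fit with column `dep` as dependent variable, returned as the
    x1 slope of the common form x3 = a*x1 + b*x2 + c."""
    indep = [k for k in range(3) if k != dep]
    design = np.column_stack([x[:, indep], np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(design, x[:, dep], rcond=None)
    normal = np.zeros(3)
    normal[indep] = coef[:2]
    normal[dep] = -1.0
    return -normal[0] / normal[2]

def spread_of_a(sigma, n=5000, seed=1):
    """Range of the coefficient a over the three OLS fits for a simulated
    plane with normal scatter of standard deviation `sigma` along x3."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(0.0, 1.0, n)
    x2 = rng.normal(0.0, 1.0, n)
    x3 = 1.2 * x1 + 0.3 * x2 - 8.0 + rng.normal(0.0, sigma, n)
    data = np.column_stack([x1, x2, x3])
    values = [slope_a(data, dep) for dep in range(3)]
    return max(values) - min(values)

sigmas = (0.1, 0.2, 0.4)
spreads = [spread_of_a(s) for s in sigmas]
for s, d in zip(sigmas, spreads):
    print(f"scatter {s:.1f}: spread in a = {d:.4f}")
```

The spread between the fits increases with the assumed scatter, which mirrors the behaviour reported for Fig. 1 without claiming to reproduce its values.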
The assumption likely introduces
the same amount of bias in the various fits. To illustrate this point,
let us consider an example in which the CM components of the intrinsic
dispersion are `known'. We will assume
(a scatter of
in
and
) 5
and that the other CM terms of and
vanish. We now create an FP
simulation, and derive a, b and c by the MIST
fits, using the assigned values of the CM components of
and
, and making therefore only
the hypothesis . The results of the
fits are shown in Table 1: the FP coefficients are practically
independent of the fitting method.
![[TABLE]](img95.gif)
Table 1. Values of a, b and c of an FP simulation with intrinsic dispersion `known' (see text). The sample size is . Column 1: FP coefficients a, b and c. Columns 2, 3, 4 and 5: values obtained by the different MIST fits (see subscripts in the top row) with corresponding uncertainties (1σ intervals).
We therefore conclude that the differences in Fig. 1 are due only
to the assumption : the lack of a
model for the intrinsic dispersion of the FP variables is thus at the
origin of the dependence of the FP coefficients on the fitting
procedure. The larger the scatter around the plane, the larger
the bias due to the fitting method.
© European Southern Observatory (ESO) 2000
Online publication: October 30, 2000