3. Uncertainties on the FP coefficients
Two kinds of methods are usually adopted to estimate the uncertainties on fit coefficients: theoretical methods and re-sampling techniques.
The `theoretical uncertainties' are obtained from the analytical expression of the variances of the estimators (e.g. IFA90 and Feigelson & Babu 1992). By their nature, these estimates are valid only asymptotically, i.e. for large sample sizes.
When analytical formulae are not available, or when the sample is small, re-sampling procedures are adopted. The statistic of interest is calculated for various `pseudo-samples' drawn from the original data set, and the uncertainties are then derived from the distribution of pseudo-values. The main re-sampling procedures are known as the `jackknife' (see Quenouille 1949 and Tukey 1958) and the `bootstrap' (see Efron 1979 and Efron & Tibshirani 1986). In the jackknife, one point at a time is removed from the original data set, so that a number of pseudo-samples equal to the sample size is constructed. In the bootstrap, random samples are drawn with replacement from the actual data set.
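The two re-sampling schemes can be illustrated with a minimal sketch. The code below is not the procedure used in this work; it applies the jackknife and the bootstrap to a toy data set, with the sample mean standing in for the statistic of interest.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=2.0, size=50)  # toy data set

def statistic(x):
    # the statistic of interest; here simply the sample mean
    return x.mean()

# Jackknife: delete one point at a time -> as many pseudo-samples as data points
n = len(data)
jack = np.array([statistic(np.delete(data, i)) for i in range(n)])
jack_err = np.sqrt((n - 1) / n * np.sum((jack - jack.mean()) ** 2))

# Bootstrap: draw n points with replacement -> many pseudo-samples
boot = np.array([statistic(rng.choice(data, size=n, replace=True))
                 for _ in range(1000)])
boot_err = boot.std(ddof=1)
```

For the sample mean both estimates converge to the usual standard error, which makes the toy case easy to check; for more complex statistics (such as plane coefficients) only the re-sampling machinery carries over.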
Fig. 2 shows the relative uncertainties on the FP coefficients a and b, as drawn from the literature, as a function of the sample size N. The large scatter of the values is indicative of the inconsistency between the methods used in the different works to estimate the errors.
In the following we analyze the performance of the methods used to estimate the FP uncertainties, with special regard to the rôle of the sample size. The analysis is performed by numerical simulations, which are described in the next section.
3.1. The simulation algorithm
The simulations consist of distributions of points extracted from a common `parent distribution'. To derive the parent population we relied on the distribution in the parametric space of the galaxies in the Coma cluster (hereafter the `template' sample). This choice is mandatory, since Coma is the only cluster with FP parameters available for a large number of galaxies.
The simulation algorithm consists of the following steps.
One of the simulated samples is compared to the template in Fig. 3, showing that the simulations reproduce the distribution of Coma galaxies very well.
3.2. Estimating the uncertainties
We derived the `true uncertainties' on the FP coefficients as a function of the sample size in the following way.
FP simulations of fixed size N were constructed, and the coefficients a, b and c were determined for each sample. The `true uncertainties' δa, δb and δc were determined from the distributions of a, b and c by using 1σ standard intervals. To obtain estimates independent of the `fitting scale', we divided δa, δb and δc by the `true' coefficients of the FP, derived from a simulation of large size. In Fig. 4 we plot these relative uncertainties against the logarithm of the sample size N for the various MIST fits. To allow a direct comparison with Fig. 2, the same range of N is plotted.
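The Monte Carlo procedure above can be sketched as follows. This is a simplified stand-in, not the MIST fitting code: the `true' coefficients, the parameter ranges, and the scatter are illustrative assumptions, and an ordinary least-squares plane fit replaces the MIST fits.

```python
import numpy as np

rng = np.random.default_rng(1)
a0, b0, c0 = 1.25, 0.32, -8.9   # illustrative 'true' FP coefficients (assumed)
scatter = 0.05                   # assumed rms scatter along the dependent variable

def simulate_and_fit(n):
    # draw the two independent FP parameters from broad, toy ranges
    s = rng.uniform(1.8, 2.6, n)
    i = rng.uniform(2.5, 4.5, n)
    r = a0 * s + b0 * i + c0 + rng.normal(0.0, scatter, n)
    # least-squares plane fit (stand-in for a MIST fit)
    A = np.column_stack([s, i, np.ones(n)])
    coeff, *_ = np.linalg.lstsq(A, r, rcond=None)
    return coeff  # (a, b, c)

def true_uncertainties(n, n_sim=500):
    # 1-sigma widths of the coefficient distributions over many realizations
    fits = np.array([simulate_and_fit(n) for _ in range(n_sim)])
    return fits.std(axis=0)

da, db, dc = true_uncertainties(n=50)
```

Repeating the last call for a grid of N values reproduces the kind of curves plotted in Fig. 4, with the uncertainties shrinking as N grows.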
It is apparent that the true intervals depend on the fitting method, and that the most effective fit (i.e. the one with lower values of δa and δb for fixed N) is in agreement with what was found by IFA90 for the bisector line, while the other fits yield larger variances. It is also worth noticing that, by changing the scatter in the simulations, the curves shown in the figure undergo just a translation in the y direction, without any change in shape. In the case shown in Fig. 4 we adopted the MV of the rms of the residuals of various clusters of galaxies (see Sect. 4).
Fig. 4 can be used as a ready tool to state the number of galaxies necessary to achieve a given accuracy on the FP.
Concerning the zero point of the FP, it is important to remark that the uncertainties plotted in Fig. 4 do not represent the estimates of usual interest. For all the applications of the FP zero point (i.e. distance determinations, constraints on cosmological and evolutionary parameters), the uncertainties on c are derived under the hypothesis that the FP slopes are exactly determined. The errors on c are then given by the rms of the residuals of the dependent variable divided by the square root of the sample size. Such estimates are generally smaller than δc.
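A short sketch makes the fixed-slope case concrete. With the slopes held at assumed values, the zero point is just the mean of the dependent-variable residuals, and its standard error is the rms of those residuals over the square root of the sample size (all numbers below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
a, b = 1.25, 0.32          # slopes assumed to be exactly known
n = 100
s = rng.uniform(1.8, 2.6, n)
i = rng.uniform(2.5, 4.5, n)
r = a * s + b * i - 8.9 + rng.normal(0.0, 0.05, n)   # toy FP sample

residuals = r - a * s - b * i            # dependent-variable residuals at fixed slopes
c_hat = residuals.mean()                 # zero-point estimate
dc = residuals.std(ddof=1) / np.sqrt(n)  # standard error: rms / sqrt(N)
```

Because the slope uncertainties are ignored here, `dc` is smaller than the full uncertainty on c obtained when a, b and c are fitted simultaneously.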
For this reason, in the following we will focus our analysis on the uncertainties of the FP slopes.
Although the comparison of Fig. 2 and Fig. 4 does not show an evident disagreement, for small sample sizes the values reported in the literature appear almost as a scatter diagram. Starting from this remark, we now analyze the performance of the different methods used to estimate the uncertainties.
3.2.1. Theoretical methods
Although statistics allows one to prove the asymptotic validity of variance estimators, it does not furnish an estimate of the `minimum sample size' above which the theoretical formulae are valid. Since such an estimate will generally depend on the `shape' of the parent population, it should be obtained case by case by using simulation methods.
To test the performance of theoretical variance estimators for the FP coefficients, we apply the results of Sect. 2. FP simulations of fixed size N were constructed, and the theoretical relative uncertainties (1σ standard intervals) on a and b were derived for each sample by Eqs. (B17, 10 - 16). In Fig. 5, the MVs of the theoretical estimates, with corresponding `error bars', are plotted against the logarithm of N. The error bars were obtained by connecting the 5th and 95th percentiles of the distributions of the estimates. For comparison, we also plot the `true uncertainties' on the slopes as derived from the simulations. As expected, the larger the sample size, the better the agreement between theoretical and actual values. For small sample sizes, however, we see that the theoretical formulae (I) show a large scatter and (II) increasingly underestimate the `true' values (see also IFA90 and Feigelson & Babu 1992).
To discuss point (II) in more detail, we calculated for each sample the coefficient a, the theoretical uncertainty δa, and the `discrepancy' between a and the actual value of the coefficient, in units of δa, where the actual value was derived by fitting a simulation of large size. We then determined the fraction of simulations with a discrepancy greater than a fixed number of standard intervals. The calculation was iterated by varying the sample size.
In Fig. 6, we plot this fraction against the sample size for two different thresholds: the value that, for a normal distribution, defines a confidence level (CL) of 68.3%, and the value corresponding to a 95.4% CL. If Eqs. (14 - 16) worked well, on average, for every value of N, the fraction of non-consistent samples would be independent of N and determined by the CL corresponding to the chosen threshold. Fig. 6 shows that for large samples the fraction is in good agreement (within 5%) with the expected CLs. For smaller sample sizes, the theoretical formulae do not furnish reliable estimates of the desired CLs: the curves of Fig. 6 increase steeply. The same result was obtained for the coefficient b, and by varying the MIST fit. We also found that the estimate of the minimum sample size is largely independent of the simulation parameters: the same estimate was obtained using the FP coefficients and the dispersions of the samples studied in Sect. 4.
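The coverage test just described can be sketched in a few lines. As a deliberately simplified stand-in for the FP case, the toy below uses a straight-line least-squares fit (not the MIST fits) with a theoretical slope error from the OLS covariance matrix, and counts the simulations whose fitted slope is inconsistent with the true one at a given number of sigma:

```python
import numpy as np

rng = np.random.default_rng(3)
a0 = 1.25   # illustrative 'true' slope

def coverage_fraction(n, n_sim=1000, n_sigma=2.0):
    """Fraction of simulated samples whose fitted slope differs from the
    true one by more than n_sigma theoretical standard intervals."""
    misses = 0
    for _ in range(n_sim):
        x = rng.uniform(0.0, 1.0, n)
        y = a0 * x + rng.normal(0.0, 0.1, n)
        A = np.column_stack([x, np.ones(n)])
        coeff, res, *_ = np.linalg.lstsq(A, y, rcond=None)
        # theoretical 1-sigma slope error from the OLS covariance matrix
        s2 = res[0] / (n - 2)
        cov = s2 * np.linalg.inv(A.T @ A)
        da = np.sqrt(cov[0, 0])
        if abs(coeff[0] - a0) / da > n_sigma:
            misses += 1
    return misses / n_sim
```

In this well-behaved Gaussian toy the fraction stays close to the nominal CL for all N; the point of the test in the text is precisely that, for the FP fits, it does not.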
Fig. 6 suggests that, in order to obtain a given confidence level, the uncertainties on the FP coefficients of small samples should be estimated using a standard interval that depends on N. For instance, to obtain the desired CL with a small sample, an effective interval of 2.6σ should be used. However, it turns out that the smaller the sample size, the stronger is the dependence of this effective interval on the adopted fitting method and, what is more critical, on the simulation parameters. For instance, the fraction of non-consistent samples varies significantly by varying the scatter in the simulations.
We conclude that for small samples the theoretical formulae are not reliable. Although the desired confidence intervals can be roughly obtained by using effective, suitably tested, standard intervals, the individual estimates can differ significantly from the true values.
3.2.2. Re-sampling procedures
The hypothesis underlying the use of re-sampling methods is that the available data set furnishes a good approximation to the parent population. The statistic of interest is calculated for various pseudo-samples drawn from the actual data set. If this `sampling hypothesis' holds, the distribution of pseudo-values coincides with the `true' one, and the confidence intervals can be accurately estimated at the cost of some computing time (see e.g. Efron & Tibshirani 1986). However, the smaller the sample size, the larger the probability that the actual sample gives a poor representation of the parent population. In order to derive a `minimum sample size' for the re-sampling methods to be reliable, numerical simulations have to be employed.
On the basis of the analysis of the previous section, we studied the performance of re-sampling uncertainties on the FP coefficients by testing the use of the bootstrap method. The results that follow were found to be largely independent of the simulation parameters, of the actual FP coefficients, and of the fitting method.
FP simulations of fixed size N were constructed. For each sample, we determined the FP slopes and the bootstrap uncertainties by using 1σ standard intervals on a large number of pseudo-samples (further increasing the number of pseudo-samples does not change the following results). In Fig. 7, the MVs of the bootstrap relative uncertainties, with corresponding error bars (see Sect. 3.2.1), are plotted against the logarithm of the sample size and compared to the true values. The error bars are given by percentile intervals.
As shown in the figure, the bootstrap method gives a good measure of the average uncertainties, but a very poor, largely scattered, representation of the actual errors.
Only for small sample sizes do the MVs of the re-sampling standard errors appear to differ from the true uncertainties. As a matter of fact, it turns out that this difference is due to the use of 1σ standard intervals to estimate the bootstrap confidence intervals. For small samples, in fact, the distribution of pseudo-values is significantly different from a normal one, so that the desired CLs must be obtained by non-parametric estimates (see Efron & Tibshirani 1986 and Efron 1987). To illustrate this point, we tested the use of bootstrap percentile intervals, proceeding as in the previous section (Fig. 6). For each sample size, we derived the fraction of simulations that are not consistent with the `true' FP slopes. The calculation was iterated by adopting different percentile intervals of the pseudo-values. To have a direct comparison with Fig. 6, we chose percentile intervals corresponding to the same CLs. In Fig. 8, the fraction of non-consistent samples is plotted against the logarithm of the sample size. It turns out that, on average, the bootstrap gives very accurate estimates of the true intervals: for every sample size, differences of only a few percent are found with respect to the desired CLs.
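A percentile bootstrap interval of the kind used here can be sketched as follows. Again this is a toy stand-in: a straight-line least-squares fit replaces the FP fits, and the sample, confidence level and number of pseudo-samples are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def bootstrap_percentile_ci(x, y, n_boot=1000, cl=95.0):
    """Non-parametric percentile interval for the slope of y = a x + c,
    obtained by re-fitting pseudo-samples drawn with replacement."""
    n = len(x)
    slopes = np.empty(n_boot)
    for k in range(n_boot):
        idx = rng.integers(0, n, n)          # resample (x, y) pairs with replacement
        A = np.column_stack([x[idx], np.ones(n)])
        slopes[k], _ = np.linalg.lstsq(A, y[idx], rcond=None)[0]
    half = (100.0 - cl) / 2.0
    # read the CL directly off the pseudo-value distribution, no normality assumed
    return np.percentile(slopes, [half, 100.0 - half])

x = rng.uniform(0.0, 1.0, 40)
y = 1.25 * x + rng.normal(0.0, 0.1, 40)      # toy sample with 'true' slope 1.25
lo, hi = bootstrap_percentile_ci(x, y)
```

Because the interval is read directly from the percentiles of the pseudo-values, it remains meaningful even when their distribution is far from normal, which is the situation described above for small samples.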
On the other hand, by looking at Fig. 7, we notice that for small samples the bootstrap uncertainties have a very large dispersion with respect to the true values, and the scatter increases steadily as the sample size decreases. Below a minimum sample size, the FP parent population is poorly represented, so that the single bootstrap estimates can be very unsatisfactory.
To have a comparison with the theoretical methods, we compared the widths of the error bars shown in Fig. 5 and Fig. 7. In Fig. 9 we plot the difference of the error bars against the logarithm of the sample size.
While for large samples theoretical and bootstrap uncertainties have a similar scatter, for small samples the bootstrap errors become increasingly less accurate, with the scatter growing markedly as N decreases.
The conclusions are thus the following.
For large samples, both theoretical and bootstrap methods give accurate estimates of the uncertainties.
For small samples, both methods give estimates that can differ significantly from the true values. The bootstrap is accurate on average, but the uncertainties have a very large scatter. The theoretical methods give values that are more precise, but systematically underestimated.
© European Southern Observatory (ESO) 2000
Online publication: October 30, 2000