Appendix A: The Gaussian kernel method
Consider a sample of size $n$, $(X_1, \ldots, X_n)$, of independent, identically distributed random variables. We intend to estimate the probability density function of the generic variable $X$, namely $f$, defined by the property:
$$ P(a \le X \le b) = \int_a^b f(x)\,dx . $$
If we estimate the probability density function through the derivative of the empirical distribution function, we obtain a sum of Dirac delta functions.
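To make this step explicit, here is a short sketch in standard notation. The symbols $F_n$ (empirical distribution function), $\mathbf{1}\{\cdot\}$ (indicator function) and $\delta$ (Dirac delta) are introduced here for illustration only, with the sample points written $X_1, \ldots, X_n$:
$$ F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{X_i \le x\}, \qquad \frac{dF_n}{dx}(x) = \frac{1}{n} \sum_{i=1}^{n} \delta(x - X_i) . $$
Each sample point contributes a spike of weight $1/n$, so this derivative is not usable as a density estimate as it stands.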
However, if we convolve these Dirac functions with a kernel $K$ which has `good' properties, the resulting function is continuous and is an estimator of the probability density function (Devroye 1986). The estimator of the density function of the sample is denoted $\hat{f}_n$, and is given, after convolution, by:
$$ \hat{f}_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h_n}\right) , $$
where $h_n$ is a parameter which is fixed together with the choice of the kernel. This parameter smooths the shape of the curve: $h_n$ has to tend to zero when $n$ goes to infinity, but not too fast:
$$ \lim_{n \to \infty} h_n = 0, \qquad \lim_{n \to \infty} n h_n = \infty . $$
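The `good' properties of $K$ are:
$$ K(u) \ge 0, \qquad \int_{-\infty}^{+\infty} K(u)\,du = 1, \qquad \sup_u K(u) < \infty, \qquad \lim_{|u| \to \infty} |u|\,K(u) = 0 , $$
which ensure the convergence of $\hat{f}_n$ in probability to the true density function $f$ of the random variable $X$. Therefore:
$$ \hat{f}_n(x) \xrightarrow[n \to \infty]{P} f(x) . $$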
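A useful kernel is the Gaussian kernel which, moreover, has the property of being $C^\infty$ (infinitely differentiable):
$$ K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right) , $$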
where , and is the standard deviation.
In our case, , and .
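As an illustration of the estimator described in this appendix, the following minimal Python sketch evaluates a Gaussian kernel density estimate on a synthetic sample. It is not the authors' code. Several things are assumptions made here for the example: the function names `gaussian_kernel` and `kernel_density`, the synthetic standard-normal sample, the standard (unit-variance) form of the Gaussian kernel, and the bandwidth $h = \sigma n^{-1/5}$, which is one choice that tends to zero while $n h$ tends to infinity and is not necessarily the value used in this paper.

```python
import math
import random
import statistics


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density(x, sample, h):
    """Kernel estimate (1 / (n h)) * sum_i K((x - X_i) / h) at the point x."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)


# Synthetic sample (assumption: 200 draws from a standard normal law).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
n = len(sample)

# Hypothetical bandwidth: sample standard deviation times n^(-1/5).
# It satisfies h -> 0 and n * h -> infinity, but it is only an example.
sigma = statistics.stdev(sample)
h = sigma * n ** (-1.0 / 5.0)

# Evaluate the estimate on a regular grid covering [-8, 8].
step = 0.02
grid = [-8.0 + step * k for k in range(801)]
density = [kernel_density(x, sample, h) for x in grid]

# Riemann-sum check that the estimate integrates to about 1.
area = step * sum(density)
print(f"h = {h:.3f}, integral of the estimate = {area:.4f}")
```

Because the kernel is non-negative and integrates to one, the estimate is itself a probability density, which the final Riemann sum checks numerically. A smaller `h` gives a spikier curve and a larger `h` a smoother one.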
© European Southern Observatory (ESO) 2000
Online publication: December 5, 2000