Astron. Astrophys. 340, 335-342 (1998)

## 3. Higher order statistics via the wavelet coefficients

### 3.1. DWT decomposition and 2nd order statistics

It is generally believed that the cosmic mass (or number) density distribution $\rho(x)$ can be treated mathematically as a homogeneous random field. It is often more convenient to express $\rho(x)$ in terms of its Fourier transform, $\hat{\delta}(k)$, which is the Fourier coefficient of the density contrast $\delta(x) = [\rho(x) - \bar{\rho}]/\bar{\rho}$.

One reason for expressing the density distribution in terms of its Fourier transform is that, for Gaussian random fields, all the statistical features of $\delta(x)$ can be completely described by the amplitudes of the Fourier coefficients. In this case, the phase of $\hat{\delta}(k)$ is not important, and the power spectrum of the perturbations, $P(k)$, or equivalently the two-point correlation function, is all that is necessary to describe the statistical behavior of the density distribution. However, if the field is non-Gaussian, the phases of the Fourier coefficients are essential for a full statistical description of the field.

As is well known, it is difficult, and in practice nearly impossible, to recover information about the phases of the Fourier coefficients as soon as there is some computational noise. The lack of phase information makes the description incomplete: we might know the scales k of the structures, but nothing about their positions. A possible way of simultaneously describing the scale and position of structures is provided by the discrete wavelet transform (DWT) (Daubechies 1992; Meyer 1993). The DWT analysis has been applied successfully to problems in turbulence (Yamada & Ohkitani 1991; Farge 1992) and high energy physics (Huang et al. 1996). Our previous studies also show that the DWT analysis is useful for the study of large scale structure (Pando & Fang 1996; Fang & Pando 1997, 1998; Pando et al. 1998; Fang, Deng & Xia 1998; Xu, Fang & Wu 1998).

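Let us consider a 1-D mass density contrast $\delta(x)$, which covers a spatial range $0 \le x \le L$. The expansion of the field $\delta(x)$ in terms of the DWT basis is given by

$$
\delta(x) = \sum_{j=0}^{\infty} \sum_{l=0}^{2^j-1} \tilde{\delta}_{j,l}\, \psi_{j,l}(x), \tag{2}
$$

where $\psi_{j,l}(x)$, $j = 0, 1, \ldots$, $l = 0, \ldots, 2^j - 1$, are the basis functions of the DWT. The DWT basis is orthogonal and complete. The wavelet function coefficient (WFC), $\tilde{\delta}_{j,l}$, is computed by

$$
\tilde{\delta}_{j,l} = \int_0^L \delta(x)\, \psi_{j,l}(x)\, dx. \tag{3}
$$

The wavelet transform basis functions $\psi_{j,l}(x)$ are generated from the basic wavelet $\psi(x/L)$ by a dilation, $2^j$, and a translation $l$, i.e.

$$
\psi_{j,l}(x) = \left(\frac{2^j}{L}\right)^{1/2} \psi\!\left(\frac{2^j x}{L} - l\right). \tag{4}
$$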

The basic wavelet $\psi$ is designed to be continuous, admissible and localized. Unlike the Fourier basis functions $e^{ikx}$, which are non-local in physical space, the wavelet basis functions $\psi_{j,l}(x)$ are localized in both physical space and Fourier (scale) space. In physical space, $\psi_{j,l}(x)$ is centered at position $lL/2^j$, and in Fourier space it is centered at wavenumber $2\pi \times 2^j/L$. Therefore, the WFCs, $\tilde{\delta}_{j,l}$, have two subscripts j and l, which describe, respectively, the scale and position of the density perturbations.
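The roles of the two indices can be seen in a small numerical sketch. The code below is an illustration only: it assumes a field sampled on $N = 2^J$ bins of unit width (so that $L = N$) and uses the Haar wavelet as the basic wavelet. Haar is localized but not continuous, so it is a simplification that does not meet the continuity requirement stated above. The function name `haar_wfcs` is hypothetical. Each coefficient is obtained by projecting the sampled field directly onto $\psi_{j,l}$.

```python
import math


def haar_wfcs(delta):
    """Return {j: [WFC for l in range(2**j)]} by direct projection.

    Assumptions: len(delta) = N = 2**J unit-width bins (L = N), and the
    Haar wavelet (+1 on the first half of its support, -1 on the second).
    """
    n = len(delta)
    J = n.bit_length() - 1
    if n < 2 or 2 ** J != n:
        raise ValueError("len(delta) must be a power of two, at least 2")
    wfcs = {}
    for j in range(J):
        width = n // 2 ** j              # support of psi_{j,l}, in bins
        half = width // 2
        norm = 1.0 / math.sqrt(width)    # (2^j / L)^(1/2) with L = N
        coeffs = []
        for l in range(2 ** j):
            start = l * width            # support starts at l * L / 2^j
            left = sum(delta[start:start + half])
            right = sum(delta[start + half:start + width])
            coeffs.append(norm * (left - right))
        wfcs[j] = coeffs
    return wfcs


# A single spike in bin 5 of an 8-bin field.
spike = [0.0] * 8
spike[5] = 1.0
wfcs = haar_wfcs(spike)
nonzero = {j: [l for l, c in enumerate(cs) if c != 0.0] for j, cs in wfcs.items()}
print(nonzero)  # -> {0: [0], 1: [1], 2: [2]}
```

On every scale, only the one coefficient whose support contains the spike is non-zero, which is the sense in which $l$ labels position and $j$ labels scale.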

A clearer picture of how the transforms work can be seen in the phase space $(x, k)$. A complete, orthogonal basis set resolves the whole phase space into "elements" of size $\Delta x$ and $\Delta k$. Each mode corresponds to an element in the phase space. For the Fourier transform, the elements have size $\Delta k = 0$ and $\Delta x = \infty$. For the wavelet transform, both $\Delta x$ and $\Delta k$ are finite, and the corresponding area of the element is as small as $\Delta x\, \Delta k \simeq 2\pi$. That is, the DWT is able to resolve an arbitrary function simultaneously in terms of x and k up to the limit of the uncertainty principle. The DWT thus decomposes the density fluctuation field into domains in phase space, and the area occupied by each mode is as small as the uncertainty principle allows.

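The WFC $\tilde{\delta}_{j,l}$ and its intensity $|\tilde{\delta}_{j,l}|^2$ describe, respectively, the fluctuation of the density and its power on scale $L/2^j$ at position $lL/2^j$. As with the Fourier basis, Parseval's theorem holds for the DWT basis. It is (Fang & Pando 1997; Pando & Fang 1998)

$$
\frac{1}{L} \int_0^L |\delta(x)|^2\, dx = \sum_{j=0}^{\infty} \frac{1}{L} \sum_{l=0}^{2^j-1} |\tilde{\delta}_{j,l}|^2. \tag{5}
$$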

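It is possible to define the power spectrum of the density perturbation on scale j by the variance of the WFCs as

$$
P_j = \frac{1}{2^j} \sum_{l=0}^{2^j-1} \left(\tilde{\delta}_{j,l} - \overline{\tilde{\delta}_j}\right)^2, \tag{6}
$$

where

$$
\overline{\tilde{\delta}_j} = \frac{1}{2^j} \sum_{l=0}^{2^j-1} \tilde{\delta}_{j,l}. \tag{7}
$$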

It has been shown that the DWT power spectrum of Eq. (6) can be converted to the Fourier power spectrum; that is, as second order statistical descriptions, the DWT and the Fourier transform are equivalent.
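A minimal sketch of the power spectrum of Eqs. (6) and (7), under stated assumptions: the field is sampled on $N = 2^J$ unit-width bins, and the Haar wavelet stands in for the basic wavelet so that the WFCs can be obtained with the standard pairwise sum-and-difference (pyramid) scheme. The helper names `haar_pyramid` and `dwt_power_spectrum` are hypothetical. The last lines check the discrete analogue of Parseval's theorem for a mean-subtracted field.

```python
import math


def haar_pyramid(delta):
    """Haar WFCs {j: [coefficient for each l]} of a field on N = 2**J unit bins."""
    n = len(delta)
    J = n.bit_length() - 1
    if n < 2 or 2 ** J != n:
        raise ValueError("len(delta) must be a power of two, at least 2")
    approx = [float(v) for v in delta]
    wfcs = {}
    for j in range(J - 1, -1, -1):  # finest scale first
        pairs = [(approx[2 * i], approx[2 * i + 1]) for i in range(2 ** j)]
        wfcs[j] = [(a - b) / math.sqrt(2.0) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
    return wfcs


def dwt_power_spectrum(wfcs):
    """P_j: variance over the position index l of the WFCs at each scale j."""
    spectrum = {}
    for j, coeffs in wfcs.items():
        mean = sum(coeffs) / len(coeffs)
        spectrum[j] = sum((c - mean) ** 2 for c in coeffs) / len(coeffs)
    return spectrum


field = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
wfcs = haar_pyramid(field)
power = dwt_power_spectrum(wfcs)

# Parseval check: fluctuation power in x-space equals summed WFC power.
mean = sum(field) / len(field)
lhs = sum((v - mean) ** 2 for v in field)
rhs = sum(c ** 2 for coeffs in wfcs.values() for c in coeffs)
```

For this single-spike field both `lhs` and `rhs` equal 0.875 up to rounding. The scale $j = 0$ has only one coefficient, so its variance is zero by construction; in practice the estimate is only meaningful on scales with enough positions $l$.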

### 3.2. One-point distribution of WFCs and non-Gaussianity

The cosmic density field is usually assumed to be ergodic: the average over an ensemble is equal to the spatial average taken over one realization. This is the so-called "fair sample hypothesis" (Peebles 1980). A homogeneous Gaussian field with a continuous spectrum is certainly ergodic (Adler 1981). In some non-Gaussian cases, such as homogeneous and isotropic turbulence (Vanmarcke 1983), ergodicity also holds approximately. Roughly, the ergodic hypothesis is reasonable if spatial correlations decrease sufficiently rapidly with increasing separation. In this case, volumes separated by large distances are approximately statistically independent and can be treated as independent realizations. Note that the $\psi_{j,l}(x)$ are orthogonal with respect to the position index l, and therefore, for an ergodic field, the WFCs, $\tilde{\delta}_{j,l}$, $l = 0, \ldots, 2^j - 1$, at a given j should be statistically independent. Thus the WFCs at fixed j and different l can be treated as independent measures of the density fluctuation field. The $2^j$ WFCs, $\tilde{\delta}_{j,l}$, from one realization of $\delta(x)$ can be employed as a statistical ensemble. In this way, when the fair sample hypothesis holds, an average over the ensemble can be well estimated by averaging over l, i.e. $\langle \tilde{\delta}_{j,l} \rangle \simeq \overline{\tilde{\delta}_j}$, where $\langle \ldots \rangle$ denotes the ensemble average.

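Consequently, the distribution of the $\tilde{\delta}_{j,l}$ is actually the one-point distribution of the WFCs at a given scale j. The non-Gaussianity of the density field can be measured directly by the deviation of the one-point distribution from a Gaussian distribution. For this purpose, we can calculate the cumulant moments defined by (Carruthers 1991; Carruthers, Eggers & Sarcevic 1991)

$$
I_j^{(2)} = M_j^{(2)}, \qquad
I_j^{(3)} = M_j^{(3)}, \qquad
I_j^{(4)} = M_j^{(4)} - 3 \left[M_j^{(2)}\right]^2, \tag{8}
$$

where

$$
M_j^{(n)} \equiv \left\langle \left(\tilde{\delta}_{j,l} - \overline{\tilde{\delta}_j}\right)^n \right\rangle
= \frac{1}{2^j} \sum_{l=0}^{2^j-1} \left(\tilde{\delta}_{j,l} - \overline{\tilde{\delta}_j}\right)^n.
$$

From Eqs. (6), (7) and (8), one sees that the second order cumulant moment is the DWT power spectrum on the scale j, i.e.

$$
I_j^{(2)} = P_j.
$$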

For Gaussian fields all the cumulant moments of order higher than 2 are zero. Thus one can measure the non-Gaussianity of the density field by $I_j^{(n)}$ with $n > 2$. By analogy with $I_j^{(2)}$ being called the DWT power spectrum, we will call $I_j^{(n)}$ the DWT spectrum of the n-th cumulant. The cumulant measures $I_j^{(3)}$ and $I_j^{(4)}$ are related to the better known skewness and kurtosis, respectively. Thus, the non-Gaussianity of $\delta(x)$ can be detected by the DWT skewness and kurtosis spectra defined as

$$
S_j \equiv \frac{I_j^{(3)}}{\left[I_j^{(2)}\right]^{3/2}}, \qquad
K_j \equiv \frac{I_j^{(4)}}{\left[I_j^{(2)}\right]^{2}}.
$$

$S_j$ and $K_j$ are the basic statistical measures employed in this paper.
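A minimal sketch of the skewness and kurtosis spectra, assuming the WFCs at one scale j are already available as a plain list over the position index l. The function name is hypothetical. The ensemble average is replaced by the average over l, as the fair sample hypothesis allows, and the cumulants are the usual third central moment and the fourth central moment minus three times the squared variance.

```python
def dwt_skewness_kurtosis(coeffs):
    """Return (S_j, K_j) for the WFCs at a single scale j (one value per l)."""
    n = len(coeffs)
    mean = sum(coeffs) / n

    def central_moment(order):
        # Average over the position index l in place of the ensemble average.
        return sum((c - mean) ** order for c in coeffs) / n

    i2 = central_moment(2)                     # second cumulant = P_j
    i3 = central_moment(3)                     # third cumulant
    i4 = central_moment(4) - 3.0 * i2 ** 2     # fourth cumulant
    if i2 == 0.0:
        raise ValueError("WFCs have zero variance; S_j and K_j are undefined")
    return i3 / i2 ** 1.5, i4 / i2 ** 2


# Toy input: one large coefficient among zeros, and an alternating pattern.
s_spiky, k_spiky = dwt_skewness_kurtosis([0.0, 0.0, 0.0, 4.0])
s_flat, k_flat = dwt_skewness_kurtosis([-1.0, 1.0, -1.0, 1.0])
```

Evaluating the function once per scale j gives the two spectra. With only $2^j$ coefficients per scale, the estimates on the largest scales (small j) rest on very few samples and should be read with that in mind.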

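For comparison, the definitions of the "standard" skewness S and kurtosis K for a 1-D distribution $\delta(x)$ covering the range $L$ in N bins are given as follows

$$
S = \frac{1}{N \sigma^3} \sum_{i=1}^{N} \left[\delta(x_i) - \bar{\delta}\right]^3
$$

and

$$
K = \frac{1}{N \sigma^4} \sum_{i=1}^{N} \left[\delta(x_i) - \bar{\delta}\right]^4 - 3,
$$

where the variance $\sigma^2$ is given by

$$
\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left[\delta(x_i) - \bar{\delta}\right]^2.
$$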

Obviously, no scale information is given by S and K.

### 3.3. Central limit theorem and the DWT analysis

It is well known that not all one-point distributions can detect non-Gaussianities. This is due to the constraints imposed by the central limit theorem. For instance, if the universe consists of a large number of dense clumps with a non-Gaussian probability distribution function (PDF), the one-point distributions of the real and imaginary components of each individual Fourier mode are still Gaussian due to the central limit theorem (Ivanov & Leonenko 1989). Further, even when the non-Gaussian clumps are correlated, the central limit theorem still holds if the two-point correlation function of the clumps approaches zero sufficiently fast (Fan & Bardeen 1995). For these reasons, the one-point distribution function of Fourier modes is not sensitive enough to detect deviations from Gaussian behavior. Even for samples with strong non-linear evolution, the one-point distribution function of Fourier modes is found to be consistent with a Gaussian distribution (Suginohara & Suto 1991). It should be pointed out that the inefficiency of the Fourier mode one-point distribution in detecting non-Gaussianity is not because the Fourier transform loses information about the distribution $\delta(x)$. The Fourier coefficients contain all the information on non-Gaussianity, but the information is mainly contained in the phases of the Fourier coefficients. As mentioned in Sect. 3.1, it is very difficult to detect the distribution of the phases of Fourier coefficients.

On the other hand, the wavelet coefficients are not subject to the central limit theorem. In this respect, the DWT analysis is similar to the count in cell (CIC) method (Hamilton 1985; Alimi, Blanchard & Schaeffer 1990; Gaztañaga & Yokoyama 1993; Bouchet et al. 1993; Kofman et al. 1994; Gaztañaga & Frieman 1994). The CIC is not subject to the central limit theorem as it is based on spatially localized window functions (Adler 1981). The DWT basis functions are also localized. If the scale of a clump in the universe is d, Eq. (3) shows that the WFCs, $\tilde{\delta}_{j,l}$, with $j = \log_2(L/d)$, are determined only by the density field in a range containing no more than one clump. That is, for scale j the wavelet coefficients are not given by a superposition of a large number of clumps, but are determined by at most one of them. Therefore, the DWT avoids the constraints imposed by the central limit theorem.

This point can also be shown from the orthonormal basis being used for the expansion of the distribution $\delta(x)$. A basic condition of the central limit theorem is that the modulus of the basis be less than $C/\sqrt{L}$, where C is a constant (Ivanov & Leonenko 1989). Obviously, all Fourier-related orthonormal bases satisfy this condition because $|(1/\sqrt{L}) \sin kx| < C/\sqrt{L}$ and $|(1/\sqrt{L}) \cos kx| < C/\sqrt{L}$, where C is independent of the coordinates in both physical space x and scale space k. On the other hand, the normalized wavelet functions have [Eq. (4)]

$$
|\psi_{j,l}(x)| \sim \left(\frac{2^j}{L}\right)^{1/2} O(1),
$$

because the magnitude of the basic wavelet $\psi$ is of order 1. The condition $|\psi_{j,l}(x)| < C/\sqrt{L}$ therefore no longer holds for a constant C independent of the scale variable j. Hence, the one-point distribution of wavelet coefficients on scale j should be a good estimator for the PDF of a distribution.
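The contrast between the two bases can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for illustration: clumps are modeled as unit point masses placed uniformly at random in $[0, L)$, the Haar wavelet stands in for the basic wavelet, and the function names and parameter values are hypothetical. For each realization the code records one Fourier cosine mode and one wavelet coefficient at fixed $(j, l)$, then compares the excess kurtosis of the two one-point distributions over the ensemble.

```python
import math
import random


def excess_kurtosis(samples):
    """Fourth central moment over squared variance, minus 3 (zero for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / m2 ** 2 - 3.0


def haar(eta):
    """Haar basic wavelet, supported on [0, 1)."""
    if 0.0 <= eta < 0.5:
        return 1.0
    if 0.5 <= eta < 1.0:
        return -1.0
    return 0.0


def simulate(n_real=2000, n_clumps=50, j=10, l=0, k_index=5, L=1.0, seed=0):
    """Ensemble kurtosis of one Fourier mode and one wavelet coefficient."""
    rng = random.Random(seed)
    k = 2.0 * math.pi * k_index / L
    norm = math.sqrt(2 ** j / L)
    fourier, wavelet = [], []
    for _ in range(n_real):
        xs = [rng.uniform(0.0, L) for _ in range(n_clumps)]
        # Every clump contributes to the Fourier mode ...
        fourier.append(sum(math.cos(k * x) for x in xs) / math.sqrt(L))
        # ... but only clumps inside the support of psi_{j,l} reach the WFC.
        wavelet.append(norm * sum(haar(2 ** j * x / L - l) for x in xs))
    return excess_kurtosis(fourier), excess_kurtosis(wavelet)


k_fourier, k_wavelet = simulate()
```

With these settings the wavelet support $L/2^j$ is much shorter than the mean clump separation, so each coefficient sees at most about one clump. The excess kurtosis of the Fourier mode should then come out close to zero, while that of the wavelet coefficient should be strongly positive.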

© European Southern Observatory (ESO) 1998

Online publication: November 9, 1998