## 3. Mathematical formulation

## 3.1. The precise definition of `radial velocity'

When aiming at sub-km s

The above approach to the interpretation of lineshift measurements
is conceptually rather different from the traditional way in which
radial velocities are determined. The aim of the latter is (usually)
to determine in some sense the `true' radial components of the space
motions of the stars, by removing all other sources of spectroscopic
shifts. This can be done, at least for solar-type stars and up to a
certain degree, through comparison with the solar spectrum, directly
or indirectly via minor planets. The pioneering efforts by Griffin et
al. (1988) and Gunn et al. (1988) to obtain accurate radial velocities
for stars in the Hyades cluster may be cited as an example of this
classical approach. We think, however, that current and future
spectroscopic measurements of much higher accuracy will require a more
stringent definition, in which the observable quantity (represented by
the radial-velocity measure

The determination of astrometric radial velocities is not affected by factors such as the transverse Doppler effect and gravitational redshift. Nevertheless we need to state explicitly what we mean by radial velocity, in order to compare our astrometric results with spectroscopic determinations. The main point to consider is the light-time effect due to the finite speed of light. As the star moves through space, the time interval from light emission at the object () to the arrival at the solar system barycentre () changes, and the corresponding stretching or compression of the time scale naturally affects the observations. A rigorous treatment of this problem is beyond the scope of the present paper. The residual effect is however very small provided a single time scale (such as ) is consistently used to describe the phenomena. Since proper motions are defined as the time derivatives of direction with respect to (, where is the barycentric direction to the star) we adopt the convention that the (astrometric) radial velocity is defined as . In the absence of relativistic effects this is related to the Doppler shift by .

## 3.2. The maximum-likelihood method

The maximum-likelihood (ML) method is a well-known technique for parameter estimation described in most textbooks on probability theory and statistics (Kendall & Stuart 1979; Casella & Berger 1990). The following brief introduction provides some of the mathematical framework, notations and terminology required for subsequent sections.

Application of the ML estimation method requires that the observed
quantities ( In the following we use diacritic marks to distinguish between
different realisations (or versions) of the same random variable. For
the generic variable

The observations provide a unique realisation of the random variable . The problem is to find the `best' estimate of consistent with the observed data. We use the principle of maximum likelihood to obtain this estimate. The likelihood function is defined as and the ML estimate is the set of parameters maximising the likelihood or, equivalently, the log-likelihood function .

The curvature of in the vicinity of its maximum is a measure of the sharpness (precision) of the ML estimate. Statistical theory provides an estimate of the covariance of in the form of a lower bound, known as the Cramér-Rao inequality (Kendall & Stuart 1979). Subject to regularity conditions this bound can be written, for the vector-valued parameter , (Silvey 1970). Here E is the statistical expectation operator and the prime denotes matrix transposition. Although Eq. (1) formally only provides a lower bound to the covariance, it is in practice often quite accurate. However, it is recommended that its validity is always checked by means of Monte Carlo simulations (Sect. 4).

A complete formulation of the moving-cluster problem thus requires
specification of the model parameters
(), the observables
(), and the probability density
function . Additionally, the
mathematical formulation includes

## 3.3. Cluster model parameters

The stars in a cluster are distinguished by the subscript

The three-dimensional position (in pc) of a star can be written
, where
is the direction (unit vector)
towards the star and the parallax in
mas. We regard as error-free, i.e.
belonging to the category of auxiliary data, and
as a parameter of the model. The

Let be the centroid velocity, i.e. the mean velocity of the cluster member stars. The equatorial Cartesian components of constitute three more elements of the parameter vector .

The astrometric measurements are accurate enough to detect the deviations of the individual velocities from the centroid velocity, i.e. the tangential components of the peculiar velocities . Thus, a statistical description of the peculiar velocities is needed, in the form of a parametrised pdf for . We assume that the peculiar motions are Maxwellian (i.e., Gaussian in the rectangular components), and thus fully described by a dispersion tensor . We take this to be isotropic and independent of stellar mass and position in the cluster: The internal velocity dispersion is thus another element of the parameter vector .

However, we shall not a priori exclude the possibility of systematic velocity patterns in the cluster such as rotation and (non-isotropic) dilation. To a first approximation such patterns may be described as a linear velocity field, represented by the tensor introduced in Appendix A of Paper I. The expected space velocity of a star at position is then where is the centroid position.
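
As a rough illustration of the velocity model described above, the following sketch evaluates the expected space velocity under a linear velocity field and draws one isotropic Gaussian peculiar velocity. The explicit form `v0 + T (b - b0)`, the function names and the units (positions in pc, velocities in km/s, tensor elements in km/s per pc) are assumptions made for this example only; they are not taken from the paper's equations.

```python
import random

def expected_velocity(v0, T, b, b0):
    """Expected space velocity at position b for an assumed linear field.

    v0 : centroid velocity (3 components)
    T  : 3x3 velocity-gradient tensor (nested lists)
    b  : position of the star; b0 : centroid position
    """
    d = [b[k] - b0[k] for k in range(3)]  # offset from the centroid
    return [v0[j] + sum(T[j][k] * d[k] for k in range(3)) for j in range(3)]

def sample_velocity(v0, T, b, b0, sigma_v, rng):
    """Expected velocity plus an isotropic Gaussian peculiar velocity."""
    u = expected_velocity(v0, T, b, b0)
    # Each rectangular component gets the same standard deviation sigma_v.
    return [u[j] + rng.gauss(0.0, sigma_v) for j in range(3)]

if __name__ == "__main__":
    rng = random.Random(42)
    v0 = [1.0, 2.0, 3.0]                      # arbitrary illustrative values
    T = [[0.0, 0.0, 0.0] for _ in range(3)]   # no internal systematic motion
    print(sample_velocity(v0, T, [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], 0.3, rng))
```

Setting `T` to zero corresponds to the case where internal systematic velocities are neglected, so that only the centroid velocity and the dispersion describe the stellar motions.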
The components of are uniquely determined by Eq. (4) and the assumed expansion rate , since . is the angular velocity of the cluster, while represents (non-isotropic) dilation. The components of and are additional elements of the general parameter vector . The complete parameter vector is, therefore,
. The total number of parameters is
. Although this is our most general
cluster model, we shall normally assume that the internal systematic
velocities are negligible, in which case only the
parameters in
are estimated. We refer to this
restricted parameter set as the

In summary, the pdf for the space velocity of star where is the mathematical constant.

## 3.4. Observation model

For each star the observables are the trigonometric parallax
() and the proper motion components
in right ascension

The actually observed values are in the arrays . It is assumed that the observations are unbiased, with known covariance matrices The observational errors for the different stars, on the other hand, are assumed to be uncorrelated: [This assumption does not hold strictly e.g. for Hipparcos data. We discuss this further in Sect. 5.4.] Gaussian error distributions are assumed. The pdf for the observables, conditional upon their true values, is then

Astrometry also provides the barycentric right ascension () and declination () of each star for a certain epoch. For the present purpose the positional data can be regarded as error-free and defining the unit vector from the solar system barycentre towards the star. Two more auxiliary unit vectors, tangent to the unit sphere at , are needed: in the direction of increasing right ascension (local `East'), and in the direction of increasing declination (local `North'). , and form a right-handed orthogonal coordinate frame known as the `normal triad' at with respect to the equatorial frame (Murray 1983). The explicit formulae for these vectors are given in Eq. (A.2).

Given the position and velocity of a star, the `true' observables are calculated as where
km yr s

In the presence of a non-zero velocity dispersion, however, is itself a random variable with pdf according to Eq. (5). The joint pdf of the observables with the velocity is since the observational errors are assumed to be independent of the random velocities. The pdf of the observables is then obtained as the marginal density

This integral can be evaluated analytically after insertion of
and
from Eqs. (5) and (10) in
Eq. (12). Since the product of two normal probability density
functions is normal, and the marginal density of a normal pdf is also
normal, it follows that and, using the isotropic dispersion tensor from Eq. (2),

## 3.5. The likelihood function

Since the observational errors and random velocities of the individual stars are assumed to be statistically independent, the pdf of the whole set of observables equals the product of the individual pdfs. The log-likelihood function is, therefore, where and depend on through Eqs. (15) and (16). The ML estimate is obtained by finding the maximum of , or, equivalently, the minimum of The practical algorithm to find the maximum of is discussed in Appendix A.

## 3.6. Radial-velocity estimates

Given the set of estimated parameters
the astrometric radial velocity of
star

It should be noted that the error in this quantity is the sum of two statistically independent components: (1) the radial component of the estimation error in , and (2) the radial component of the peculiar velocity of the star, ; see Eq. (A.18).

## 3.7. Rejection of outliers

Our formulation of the cluster model does not take into account that the observational material may include field stars which do not share the common (mean) space velocity of the cluster. The ML method requires that the model provide a statistically correct description of the data. In particular, it must only be applied to the actual members of the cluster, or rather to members whose mean space motion during the observing period agrees with the model. In practice this rules out also a number of close binaries, even if they are members of the cluster, since the short-term motions of their photocentres may deviate significantly from the motion of their centres of mass. Because of the high frequency of duplicity and the wide distribution of separations and periods, the subset of astrometrically detectable binaries blends continuously with the non-perturbed, single member stars. The elimination of outliers, whether members or not, is therefore an equally important and delicate part of the application of the moving-cluster method.

Outliers can be detected by computing a suitable goodness-of-fit statistic for each star in the solution. The quantity , where is defined by Eq. (19), is a quadratic measure of the distance between the observed and fitted vector , weighted by the inverse of the expected covariance of the difference. Therefore, can be used for detection of outliers. In order to define a suitable rejection criterion it is desirable to know, at least approximately, the distribution of in the nominal case when the data behave according to the model.

The quadratic form in Eq. (19) and the assumption of Gaussian
errors suggest that should
approximately follow a chi-square distribution. There are
observables and
parameters in the basic cluster
model, and consequently degrees of
freedom, or degrees of freedom for
each (if where is a scaling factor to be determined by the simulations (Sect. 3.8). For a given level of significance the star should therefore be rejected if , where In simulations of Hyades data (Sect. 4.1.3) we find , so that a 1 per cent significance level requires . As discussed in Sect. 4.2, it is possible to derive an optimal value for if the distribution of peculiar velocities can be properly modelled.

A complication with this rejection procedure is that the goodness-of-fit statistics depend on the estimated through Eq. (16). Eliminating outliers will however decrease the estimated velocity dispersion and consequently increase the values. This, in turn, will in general cause other stars to fall beyond the adopted acceptance limit. It is not obvious how to find the maximum subset of stars for which all , or if this subset is unique or even exists. Testing each of the possible subsets is obviously not a viable method for .

As a practical (if not necessarily optimal) solution we have adopted a sequential rejection procedure, in which the one star with the largest () is removed from the sample. A new solution is then computed, including new values. The process is repeated, removing the star with the largest and computing a new solution, until all . In some cases it may happen that the solution becomes unstable before this criterion is satisfied. In those cases where this happens when has been reduced to practically zero, the number of model parameters could be reduced, e.g. by assuming .

## 3.8. Use of numerical simulations

Monte Carlo simulation of the ML estimation problem is essential for studying the efficiency and convergence of the adopted procedure, as well as the precision and possible bias of the resulting estimates. In a Monte Carlo simulation, a set of `true' parameters is assumed and from this, many realisations of hypothetical (`observed') data are generated according to the adopted model.
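
The general Monte Carlo idea can be illustrated with a deliberately simple toy problem. In the sketch below, the "model" is a single Gaussian with known scatter and the "estimator" is the sample mean; both are stand-ins chosen for this example and are not the cluster model or the ML algorithm of this paper. All function names and numerical values are hypothetical. The point is only the workflow: assume true parameters, generate many synthetic data sets, apply the estimator to each, and summarise the spread of the estimates.

```python
import math
import random

def simulate_data(true_mean, sigma, n_obs, rng):
    """One synthetic ('observed') data set drawn from the assumed model."""
    return [rng.gauss(true_mean, sigma) for _ in range(n_obs)]

def estimate(data):
    """Toy estimator (sample mean), standing in for a full ML solution."""
    return sum(data) / len(data)

def monte_carlo(true_mean, sigma, n_obs, n_real, seed=1):
    """Return (bias, rms scatter) of the estimator over n_real realisations."""
    rng = random.Random(seed)  # fixed seed makes the experiment repeatable
    estimates = [estimate(simulate_data(true_mean, sigma, n_obs, rng))
                 for _ in range(n_real)]
    mean_est = sum(estimates) / n_real
    bias = mean_est - true_mean
    rms = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / (n_real - 1))
    return bias, rms

if __name__ == "__main__":
    bias, rms = monte_carlo(true_mean=5.0, sigma=2.0, n_obs=25, n_real=2000)
    # For this toy case the rms scatter should be close to sigma / sqrt(n_obs).
    print(f"bias = {bias:+.4f}, rms = {rms:.4f}")
```

In this toy case the empirical scatter can be compared with the analytical value `sigma / sqrt(n_obs)`, which mirrors how simulations are used to check a theoretical covariance bound.
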
Random observational errors and other variations, in our case due to the internal velocity dispersion, are simulated by means of a random number generator. Applying the estimation algorithm to each hypothetical data set results in an estimated parameter vector . From the assembly of these vectors one can determine various statistics, in particular the bias and rms scatter of the individual parameter .

Synthetic cluster data are generated according to the following
general recipe. First, the overall characteristics of the cluster and
observations are specified: the number of stars

We use the following terminology: an

© European Southern Observatory (ESO) 2000. Online publication: April 17, 2000