Astron. Astrophys. 325, 693-699 (1997)

## 3. Definition of the linear regression model

Malyuto and Shvelidze (1989) developed a mathematical apparatus for automated quantitative spectral classification that uses classification criteria based on Abastumani objective prism spectra. The approach includes the statistical method of stepwise linear regression, which enables us to define the analytical dependences between the classification criteria and the main physical parameters. The method has been successfully applied to Strömgren photometric data for F-stars (Malyuto, 1994) and to spectrophotometric data obtained with the RUBIKON device for M-stars (Malyuto, Oestreicher, Schmidt-Kaler, 1996).

Multiple linear regressions for analytical representation of various dependences are widely used in astronomy (see, e.g., Heck 1983) as well as in other natural sciences. We used the regression model:

$$Y = a_0 + \sum_{i=1}^{q} a_i X_i ,$$

where $Y$ is the dependent variable (in this context, either of the two luminosity-class-sensitive classification indices described in the previous section), $X_i$ are the independent variables (the spectral-class-sensitive classification index, the luminosity code L, and/or functions of these parameters), $a_0$ and $a_i$ are the regression coefficients, and $q$ is the number of independent variables.
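As an illustration of a model of this kind, the following minimal sketch fits a linear model of the form Y = a0 + a1 X1 + ... + aq Xq by ordinary least squares and evaluates the multiple correlation coefficient R. The function name `fit_linear_model`, the coefficient notation, and the synthetic data are assumptions made for this example; they are not taken from the paper or from its software.

```python
import numpy as np

def fit_linear_model(X, y):
    """Least-squares fit of y = a0 + sum_i a_i * X[:, i].

    Returns the coefficients (a0 first) and the multiple
    correlation coefficient R of the fit.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # prepend the constant term
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)                 # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    R = float(np.sqrt(max(0.0, 1.0 - ss_res / ss_tot)))
    return coef, R

# Synthetic illustration only (not the paper's data)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0.0, 0.01, size=50)
coef, R = fit_linear_model(X, y)
```

Here R is computed from the residual and total sums of squares, which for a model with a constant term equals the correlation between the observed and fitted values.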

Stepwise linear regression starts from a deliberately redundant set of independent variables and applies a stepwise procedure to choose the best subset of them. Several statistical packages containing stepwise linear regression algorithms have been developed and applied to various statistical tasks (see, e.g., Afifi, Azen, 1979). In astronomy, one may refer to Heck and Merch (1980), who used the well-known package BMDP to select the most significant variables of a linear model for predicting spectral classifications from photometric observations. Another example of applying the stepwise linear regression technique may be found in Schuster and Nissen (1989), where the package MINITAB was applied to Strömgren photometric data.
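The selection idea can be sketched as a greedy forward procedure: at each step the candidate variable with the largest partial F statistic is added, and the procedure stops when no remaining candidate exceeds a threshold. This is a simplified illustration, not the algorithm of BMDP, MINITAB, or any other package named here; the function names and the threshold `f_to_enter=4.0` are assumptions, and full stepwise schemes usually also test whether already selected variables should be removed.

```python
import numpy as np

def _rss(A, y):
    """Residual sum of squares of the least-squares fit of y on A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return float(r @ r)

def forward_stepwise(X, y, f_to_enter=4.0):
    """Greedy forward selection; returns the selected column indices in order."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, q = X.shape
    selected = []
    A = np.ones((n, 1))          # start from the constant-only model
    rss = _rss(A, y)
    while len(selected) < q:
        best = None
        dof = n - (len(selected) + 2)   # constant + selected + one new term
        if dof <= 0:
            break
        for j in range(q):
            if j in selected:
                continue
            rss_j = _rss(np.column_stack([A, X[:, j]]), y)
            # Partial F statistic for adding column j to the current model
            F = (rss - rss_j) / (max(rss_j, 1e-300) / dof)
            if best is None or F > best[1]:
                best = (j, F, rss_j)
        if best is None or best[1] < f_to_enter:
            break
        selected.append(best[0])
        A = np.column_stack([A, X[:, best[0]]])
        rss = best[2]
    return selected

# Synthetic illustration only: columns 1 and 3 carry the signal
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(0.0, 0.1, size=200)
selected = forward_stepwise(X, y)
```

The partial F statistic used here is a monotonic function of the squared partial correlation coefficient, F = r² · dof / (1 − r²), so ranking candidates by F is equivalent to ranking them by the magnitude of their partial correlation with the dependent variable.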

In defining our linear regression model we use the software package CTATEC-5 (Tiits et al., 1986), which is based on Jennrich's (1977) scheme of calculating partial correlation coefficients. Some details, as well as an example, may be found in Malyuto (1994). Our initial redundant set of independent variables was rather arbitrary and was inspired by the visual appearance of Figs. 7 and 8. We tried the full polynomial with powers up to the third order together with their products (the spectral-class-sensitive index and the luminosity code L, each to the first, second and third power, and their cross products; nine terms in total), and we found the approach to be quite satisfactory.
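The redundant candidate set can be sketched as a design matrix with one column per term. The variable name `s` for the spectral-class-sensitive index, the function name `candidate_terms`, and the exact composition and ordering of the cross products are assumptions; they are one plausible reading of a full third-order polynomial in two variables with nine non-constant terms.

```python
import numpy as np

def candidate_terms(s, L):
    """Build nine candidate regressors from the index s and luminosity code L.

    Assumed set: s, s^2, s^3, L, L^2, L^3, s*L, s^2*L, s*L^2.
    Returns (names, matrix) with one column per term.
    """
    s = np.asarray(s, dtype=float)
    L = np.asarray(L, dtype=float)
    names = ["s", "s^2", "s^3", "L", "L^2", "L^3", "s*L", "s^2*L", "s*L^2"]
    cols = [s, s**2, s**3, L, L**2, L**3, s * L, s**2 * L, s * L**2]
    return names, np.column_stack(cols)

# Two made-up stars, for illustration only
names, X = candidate_terms([1.0, 2.0], [3.0, 4.0])
```

A matrix of this shape is what a stepwise procedure would then prune to the subset of significant terms.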

The method of stepwise linear regression has been applied to each of the diagrams in Figs. 7 and 8 from Sect. 2. As a result we have obtained the following analytical model (Fig. 7):

with the multiple correlation coefficient R=0.949. The respective partial correlation coefficients for each term are given in the second row.

The analytical model for the data in Fig. 8 is

with the multiple correlation coefficient R=0.985. All terms are significant at the 95% significance level.

We see from Fig. 7 that the index is good for luminosity classification of stars of luminosity classes V and IV and for their segregation from stars of higher luminosities (with ). However, the model lines for giants and supergiants are intermixed, so reliable luminosity classification is impossible in that region of the diagram. In Fig. 8, by contrast, the scatter is somewhat larger for dwarfs, but the model lines are well separated among giants and supergiants, and the index is good for luminosity classification in that region of the diagram (again with ). A combination of the two indices therefore seems to be effective for luminosity classification of all luminosity classes.

The model lines in Figs. 7 and 8 were used for graphical inversion of the model, in other words to calculate luminosity codes from the measured indices for our standard stars. From the diagram in Fig. 8, where the luminosity-sensitive index was used, the rms difference (published minus calculated luminosity codes) was 0.81 luminosity codes (41 stars, ). As expected, the scatter is larger for stars of luminosity classes IV and V (see the data comparison in Fig. 9).
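A numerical analogue of the graphical inversion can be sketched as a grid search: for a star with a measured spectral-class-sensitive index and a measured luminosity-sensitive index, the luminosity code is taken as the grid value whose model prediction is closest to the measurement. The model `toy_model`, its coefficients, the code range 1 to 5, and all names below are hypothetical and serve only to illustrate the idea; they are not the fitted models of this section.

```python
import numpy as np

def invert_for_luminosity(model, s, y_obs, L_grid):
    """Return the L on L_grid whose predicted index is closest to y_obs."""
    pred = model(s, L_grid)  # model evaluated along the luminosity grid
    return float(L_grid[np.argmin(np.abs(pred - y_obs))])

def toy_model(s, L):
    """Hypothetical regression model, for illustration only."""
    return 0.2 + 0.5 * s + 0.1 * L - 0.02 * s * L

L_grid = np.linspace(1.0, 5.0, 401)       # assumed range of luminosity codes
y_measured = toy_model(1.0, 3.0)          # pretend measurement for a star with L = 3
L_est = invert_for_luminosity(toy_model, 1.0, y_measured, L_grid)
```

Such an inversion is well defined only where the model lines for different luminosity codes are well separated at the given spectral index; where the lines are intermixed, several grid values give nearly the same prediction and the recovered code is unreliable.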

Fig. 9. Comparison of the published luminosity codes with the codes calculated from the index (crosses) for all stars and with the codes calculated from the index (circles) for the stars with (luminosity classes IV and V). Only the stars with are considered.

From the diagram in Fig. 7 the luminosity codes were calculated with the use of the luminosity index (only for stars with , where stars of luminosity classes IV and V are located). If we combine these data with the data from Fig. 8 for the remaining stars, the respective rms difference is only 0.41 luminosity codes (41 stars, ).
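The comparison can be sketched as follows: codes from one diagram are used for a flagged subset of stars, codes from the other diagram for the rest, and the rms difference from the published codes is evaluated before and after combining. The function names, the boolean flag `use_a`, the made-up codes, and the rms definition (square root of the mean squared difference, without mean subtraction) are assumptions of this example.

```python
import numpy as np

def rms_difference(published, calculated):
    """Root mean square of (published - calculated)."""
    d = np.asarray(published, dtype=float) - np.asarray(calculated, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

def combine_codes(codes_a, codes_b, use_a):
    """Take codes_a where use_a is True and codes_b elsewhere."""
    return np.where(np.asarray(use_a, dtype=bool),
                    np.asarray(codes_a, dtype=float),
                    np.asarray(codes_b, dtype=float))

# Made-up luminosity codes for five stars, for illustration only
published = [5.0, 5.0, 4.0, 3.0, 1.0]
codes_a = [5.0, 5.0, 4.0, 1.0, 3.0]     # assumed reliable for the first three stars
codes_b = [4.0, 6.0, 5.0, 3.0, 1.0]     # assumed reliable for the last two stars
use_a = [True, True, True, False, False]

combined = combine_codes(codes_a, codes_b, use_a)
rms_b_only = rms_difference(published, codes_b)
rms_combined = rms_difference(published, combined)
```

In this toy case the combined codes reproduce the published ones exactly, whereas either source alone does not, which mirrors the reduction in scatter reported when each diagram is used only where its model lines are well separated.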

In Fig. 9 the calculated luminosity codes are plotted against the published ones, with different symbols for the codes determined from Fig. 8 (crosses) and from Fig. 7 (circles). We conclude that the combination of the diagrams in Figs. 7 and 8 provides a refined luminosity classification for G-K stars of all luminosity classes.

© European Southern Observatory (ESO) 1997

Online publication: April 28, 1998