Discrete regularized discriminant analysis

Statistics and Computing

Abstract

A method of regularized discriminant analysis for discrete data, denoted DRDA, is proposed. This method is related to the regularized discriminant analysis conceived by Friedman (1989) in a Gaussian framework for continuous data. Here, we are concerned with discrete data and consider the classification problem using the multinomial distribution. DRDA is designed for the small-sample, high-dimensional setting. The method occupies an intermediate position between multinomial discrimination, the first-order independence model and kernel discrimination. DRDA is characterized by two parameters, the values of which are calculated by minimizing a sample-based estimate of future misclassification risk by cross-validation. The first parameter is a complexity parameter which provides class-conditional probabilities as a convex combination of those derived from the full multinomial model and the first-order independence model. The second parameter is a smoothing parameter associated with the discrete kernel of Aitchison and Aitken (1976). The optimal complexity parameter is calculated first; then, holding this parameter fixed, the optimal smoothing parameter is determined. A modified approach, in which the smoothing parameter is chosen first, is discussed. The performance of the method is compared with that of other classical methods through applications to data.
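
The two-parameter scheme described in the abstract can be illustrated with a minimal Python sketch for binary predictors. The abstract does not give the exact formulas, so several choices here are assumptions made purely for illustration: the function names, the binary form of the Aitchison and Aitken kernel (weight `lam ** (p - d) * (1 - lam) ** d` for Hamming distance `d`, with `lam` in [0.5, 1]), the way the two parameters are combined (the kernel-smoothed multinomial estimate is mixed with the independence estimate), the use of leave-one-out cross-validation, and the choice of `lam = 1` while the complexity parameter is being selected.

```python
import numpy as np

def independence_prob(x, X):
    """First-order independence estimate: product of per-variable frequencies."""
    theta = X.mean(axis=0)  # estimated P(variable j = 1)
    return float(np.prod(np.where(x == 1, theta, 1.0 - theta)))

def kernel_prob(x, X, lam):
    """Binary Aitchison-Aitken kernel estimate, lam in [0.5, 1].

    lam = 1 reproduces the full multinomial estimate (cell relative frequency);
    lam = 0.5 gives the uniform distribution over the 2**p cells.
    """
    p = X.shape[1]
    d = np.sum(X != x, axis=1)  # Hamming distance from x to each sample
    return float(np.mean(lam ** (p - d) * (1.0 - lam) ** d))

def drda_prob(x, X, alpha, lam):
    """ASSUMED combination: convex mix of the smoothed multinomial and
    independence estimates, with complexity parameter alpha in [0, 1]."""
    return (1.0 - alpha) * kernel_prob(x, X, lam) + alpha * independence_prob(x, X)

def loo_error(X, y, alpha, lam):
    """Leave-one-out misclassification rate (assumes every class keeps at
    least one sample after one observation is removed)."""
    n = len(y)
    classes = np.unique(y)
    errors = 0
    for i in range(n):
        keep = np.arange(n) != i
        Xt, yt = X[keep], y[keep]
        scores = []
        for c in classes:
            Xc = Xt[yt == c]
            prior = len(Xc) / (n - 1)  # class proportion in the reduced sample
            scores.append(prior * drda_prob(X[i], Xc, alpha, lam))
        errors += int(classes[int(np.argmax(scores))] != y[i])
    return errors / n

def select_parameters(X, y, alphas, lams):
    """Two-stage grid search: alpha first (ASSUMED lam = 1), then lam."""
    alpha = min(alphas, key=lambda a: loo_error(X, y, a, 1.0))
    lam = min(lams, key=lambda l: loo_error(X, y, alpha, l))
    return alpha, lam
```

Calling `select_parameters(X, y, alphas, lams)` with a 0/1 data matrix `X`, a label vector `y` and two candidate grids returns one value from each grid. Swapping the two `min` lines would give a rough analogue of the modified approach in which the smoothing parameter is chosen first.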

References

  • Aitchison, J. and Aitken, C. G. G. (1976) Multivariate binary discrimination by the kernel method. Biometrika, 63, 413–420.

  • Anderson, J. A. (1972) Separate sample logistic discrimination. Biometrika, 59, 19–35.

  • Bahadur, R. R. (1961) A representation of the joint distribution of responses to n dichotomous items, in Studies in Item Analysis and Prediction, H. Solomon (ed.), Stanford University Press, Stanford, CA, pp. 158–168.

  • Bowman, A. W. (1980) A note on consistency of kernel method for the analysis of categorical data. Biometrika, 67, 682–684.

  • Bowman, A. W., Hall, P. and Titterington, D. M. (1984) Cross-validation in nonparametric estimation of probabilities and probability densities. Biometrika, 71, 341–351.

  • Brown, P. J. and Rundell, P. W. K. (1985) Kernel estimates for categorical data. Technometrics, 27, 293–299.

  • Butler, W. J. and Kronmal, R. A. (1985) Discrimination with polychotomous predictor variables using orthogonal functions. Journal of the American Statistical Association, 80, 443–448.

  • Cox, D. R. (1972) Regression models and life tables (with discussion). Journal of the Royal Statistical Society B, 34, 187–220.

  • Dillon, W. R. and Goldstein, M. (1978) On the performance of some multinomial classification rules. Journal of the American Statistical Association, 73, 305–313.

  • Efron, B. (1983) Estimating the error rate of a prediction rule: improvement on cross-validation. Journal of the American Statistical Association, 78, 316–331.

  • Friedman, J. H. (1989) Regularized discriminant analysis. Journal of the American Statistical Association, 84, 165–175.

  • Glick, N. (1972) Sample-based classification procedures derived from density estimates. Journal of the American Statistical Association, 67, 116–121.

  • Goldstein, M. and Dillon, W. R. (1978) Discrete Discriminant Analysis. Wiley, New York.

  • Hall, P. (1981a) On nonparametric multivariate binary discrimination. Biometrika, 68, 287–294.

  • Hall, P. (1981b) Optimal near neighbour estimator for use in discriminant analysis. Biometrika, 68, 572–575.

  • Hall, P. and Wand, P. (1988) Nonparametric discrimination using density differences. Biometrika, 75, 541–547.

  • Hand, D. J. (1982) Kernel Discriminant Analysis. Research Studies Press/Wiley, Chichester.

  • Hand, D. J. (1983) A comparison of two methods of discriminant analysis applied to binary data. Biometrics, 39, 683–694.

  • Hand, D. J. (1986) Recent advances in error rate estimation. Pattern Recognition Letters, 4, 335–346.

  • Hills, M. (1967) Discrimination and allocation with discrete data. Applied Statistics, 16, 237–250.

  • Krzanowski, W. J. (1975) Discrimination and classification using both binary and continuous variables. Journal of the American Statistical Association, 70, 782–790.

  • Martin, D. C. and Bradley, R. A. (1972) Probability models, estimation, and classification for multivariate dichotomous populations. Biometrics, 28, 203–222.

  • Ott, J. and Kronmal, R. A. (1976) Some classification procedures for binary data using orthogonal functions. Journal of the American Statistical Association, 71, 391–399.

  • Titterington, D. M. (1980) A comparative study of kernel-based density estimates for categorical data. Technometrics, 22, 259–268.

  • Titterington, D. M. and Bowman, A. W. (1985) A comparative study of smoothing procedures for ordered categorical data. Journal of Statistical Computation and Simulation, 21, 291–312.

  • Titterington, D. M., Murray, G. D., Murray, L. S., Spiegelhalter, D. J., Skene, A. M., Habbema, J. D. F. and Gelpke, G. J. (1981) Comparison of discrimination techniques applied to a complex data set of head injured patients. Journal of the Royal Statistical Society A, 144, 145–175.

  • Tutz, G. (1986) An alternative choice of smoothing for kernel-based density estimates in discrete discriminant analysis. Biometrika, 73, 405–411.

  • Tutz, G. (1989) On cross-validation for discrete kernel estimation in discrimination. Communications in Statistics - Theory and Methods, 18 (11), 4145–4162.

Cite this article

Celeux, G., Mkhadri, A. Discrete regularized discriminant analysis. Stat Comput 2, 143–151 (1992). https://doi.org/10.1007/BF01891206

  • DOI: https://doi.org/10.1007/BF01891206
