Abstract
A method of regularized discriminant analysis for discrete data, denoted DRDA, is proposed. This method is related to the regularized discriminant analysis conceived by Friedman (1989) in a Gaussian framework for continuous data. Here, we are concerned with discrete data and consider the classification problem using the multinomial distribution. DRDA is designed for the small-sample, high-dimensional setting. The method occupies an intermediate position between multinomial discrimination, the first-order independence model and kernel discrimination. DRDA is characterized by two parameters, the values of which are chosen by minimizing a cross-validated, sample-based estimate of future misclassification risk. The first parameter is a complexity parameter, which provides class-conditional probabilities as a convex combination of those derived from the full multinomial model and those derived from the first-order independence model. The second parameter is a smoothing parameter associated with the discrete kernel of Aitchison and Aitken (1976). The optimal complexity parameter is calculated first; then, holding this parameter fixed, the optimal smoothing parameter is determined. A modified approach, in which the smoothing parameter is chosen first, is discussed. The performance of the method is compared with that of other classical methods through applications to data.
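The two-parameter construction described above can be illustrated with a minimal Python sketch for binary variables. It is not the authors' implementation, and several details are assumptions that the abstract does not state: the function names, the grids, the way the Aitchison and Aitken kernel estimate takes the place of the raw multinomial frequencies inside the convex combination, the use of leave-one-out cross-validation, and the choice of no smoothing (lam = 1) while the complexity parameter alpha is selected.

```python
import numpy as np

def independence_prob(X, x):
    """First-order independence estimate of P(x | class) for binary data."""
    x = np.asarray(x)
    theta = X.mean(axis=0)  # per-variable frequency of ones in this class
    return float(np.prod(np.where(x == 1, theta, 1.0 - theta)))

def kernel_prob(X, x, lam):
    """Aitchison-Aitken kernel estimate of P(x | class), 1/2 <= lam <= 1.

    With lam = 1 this reduces to the full multinomial (cell frequency) estimate.
    """
    x = np.asarray(x)
    d = X.shape[1]
    dist = (X != x).sum(axis=1)  # Hamming distance from x to each training row
    return float(np.mean(lam ** (d - dist) * (1.0 - lam) ** dist))

def drda_prob(X, x, alpha, lam):
    """Convex combination of the two estimates (assumed form, see lead-in)."""
    return alpha * kernel_prob(X, x, lam) + (1.0 - alpha) * independence_prob(X, x)

def classify(class_samples, priors, x, alpha, lam):
    """Assign x to the class with the largest prior-weighted probability."""
    scores = [p * drda_prob(X, x, alpha, lam) for X, p in zip(class_samples, priors)]
    return int(np.argmax(scores))

def loo_error(class_samples, alpha, lam):
    """Leave-one-out misclassification rate (each class needs >= 2 rows)."""
    n_total = sum(len(X) for X in class_samples)
    priors = [len(X) / n_total for X in class_samples]  # simplification: not re-estimated
    errors = 0
    for k, X in enumerate(class_samples):
        for i in range(len(X)):
            held_out = [np.delete(Xc, i, axis=0) if c == k else Xc
                        for c, Xc in enumerate(class_samples)]
            errors += classify(held_out, priors, X[i], alpha, lam) != k
    return errors / n_total

def select_parameters(class_samples, alphas, lams):
    """Two-stage search: alpha first (assumed lam = 1), then lam with alpha fixed."""
    best_alpha = min(alphas, key=lambda a: loo_error(class_samples, a, 1.0))
    best_lam = min(lams, key=lambda l: loo_error(class_samples, best_alpha, l))
    return best_alpha, best_lam

# Hypothetical toy data: two classes, three binary variables.
class_0 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]])
class_1 = np.array([[1, 1, 1], [1, 1, 0], [1, 0, 1]])
alpha, lam = select_parameters([class_0, class_1],
                               alphas=[0.0, 0.5, 1.0], lams=[0.6, 0.8, 1.0])
```

Setting alpha = 0 recovers the first-order independence model, while alpha = 1 with lam = 1 recovers the full multinomial model, which is the sense in which the method sits between the classical approaches. The modified approach mentioned in the abstract would swap the order of the two stages in `select_parameters`.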
References
Aitchison, J. and Aitken, C. G. G. (1976) Multivariate binary discrimination by the kernel method. Biometrika, 63, 413–420.
Anderson, J. A. (1972) Separate sample logistic discrimination. Biometrika, 66, 19–35.
Bahadur, R. R. (1961) A representation of the joint distribution of responses to n dichotomous items, in Studies in Item Analysis and Prediction, H. Salomon (ed), Stanford University Press, Stanford, CA, pp. 158–168.
Bowman, A. W. (1980) A note on consistency of kernel method for the analysis of categorical data. Biometrika, 67, 682–684.
Bowman, A. W., Hall, P. and Titterington, D. M. (1984) Cross-validation in nonparametric estimation of probabilities and probability densities. Biometrika, 71, 341–351.
Brown, P. J. and Rundell, P. W. K. (1985) Kernel estimates for categorical data. Technometrics, 27, 293–299.
Butler, W. J. and Kronmal, R. A. (1986) Discrimination with polychotomous predictor variables using orthogonal functions. Journal of American Statistical Association, 80, 305–313.
Cox, D. R. (1972) Regression models and life tables (with discussion). Journal of the Royal Statistical Society B, 32, 443–448.
Dillon, W. R. and Goldstein, M. (1978) On the performance of some multinomial classification rules. Journal of American Statistical Association, 73, 305–313.
Efron, B. (1983) Estimating the error rate of a prediction rule: improvement on cross-validation. Journal of American Statistical Association, 78, 316–331.
Friedman, J. H. (1989) Regularized discriminant analysis. Journal of American Statistical Association, 84, 165–175.
Glick, N. (1972) Sample-based classification procedures derived from density estimates. Journal of American Statistical Association, 67, 116–121.
Goldstein, M. and Dillon, W. R. (1978) Discrete Discriminant Analysis. Wiley, New York.
Hall, P. (1981a) On nonparametric multivariate binary discrimination. Biometrika, 68, 287–294.
Hall, P. (1981b) Optimal near neighbour estimator for use in discriminant analysis. Biometrika, 68, 572–575.
Hall, P. and Wand, P. (1988) Nonparametric discrimination using density differences. Biometrika, 75, 541–547.
Hand, D. J. (1982) Kernel Discriminant Analysis. Research Studies Press/Wiley, Chichester.
Hand, D. J. (1983) A comparison of two methods of discriminant analysis applied to binary data. Biometrics, 39, 683–694.
Hand, D. J. (1986) Recent advances in error rate estimation. Pattern Recognition Letters, 4, 335–346.
Hills, M. (1967) Discrimination and allocation with discrete data. Applied Statistics, 16, 237–250.
Krzanowski, W. J. (1975) Discrimination and classification using both binary and continuous variables. Journal of American Statistical Association, 70, 782–790.
Martin, D. C. and Bradley, R. A. (1972) Probability models, estimation, and classification for multivariate dichotomous populations. Biometrics, 28, 203–222.
Ott, J. and Kronmal, R. A. (1976) Some classification procedures for binary data using orthogonal functions. Journal of American Statistical Association, 71, 391–399.
Titterington, D. M. (1980) A comparative study of kernel-based density estimates for categorical data. Technometrics, 22, 259–268.
Titterington, D. M. and Bowman, A. W. (1985) A comparative study of smoothing procedures for ordered categorical data. Journal of Statistical Computation and Simulation, 21, 291–312.
Titterington, D. M., Murray, G. D., Murray, L. S., Spiegelhalter, D. J., Skene, A. M., Habbema, J. D. F. and Gelpke, G. J. (1981) Comparison of discrimination techniques applied to a complex data set of head injured patients. Journal of the Royal Statistical Society A, 144, 145–175.
Tutz, G. (1986) An alternative choice of smoothing for kernel-based density estimates in discrete discriminant analysis. Biometrika, 73, 405–411.
Tutz, G. (1989) On cross-validation for discrete kernel estimation in discrimination. Communication in Statistics-Theory and Methods, 18 (11), 4145–4162.
Cite this article
Celeux, G., Mkhadri, A. Discrete regularized discriminant analysis. Stat Comput 2, 143–151 (1992). https://doi.org/10.1007/BF01891206