Summary
Friedman (1989) proposed a regularised discriminant function (RDF) as a compromise between the normal-based linear and quadratic discriminant functions, by considering alternatives to the usual maximum likelihood estimates for the covariance matrices. These alternatives are characterised by two (regularisation) parameters, the values of which are customised to individual situations by jointly minimising a sample-based (cross-validated) estimate of future misclassification risk. This technique appears to provide considerable gains in classification accuracy in many circumstances, although it is computationally intensive.
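As a rough sketch of what the two regularisation parameters control, the form of the covariance estimator usually quoted from Friedman (1989) can be written in a few lines of plain Python. The function name, the argument names and the use of class sizes as weights are assumptions made for illustration, not details given in this summary. Here `lam` blends each class scatter matrix with the pooled one, and `gamma` shrinks the result towards a multiple of the identity matrix.

```python
def regularised_covariance(scatter_k, n_k, scatter_pooled, n_total, lam, gamma):
    """Two-parameter regularised covariance estimate for one class (sketch).

    scatter_k      : p x p sum-of-squares matrix of class k about its own mean
    scatter_pooled : p x p sum of the class scatter matrices
    n_k, n_total   : class sample size and total sample size (used as weights)
    lam   in [0, 1]: 0 keeps the class estimate, 1 gives the pooled estimate
    gamma in [0, 1]: 0 leaves the blend as is, 1 gives (trace / p) * identity
    """
    p = len(scatter_k)
    # Step 1: blend class and pooled scatter, then divide by the blended weight.
    weight = (1.0 - lam) * n_k + lam * n_total
    sigma = [
        [((1.0 - lam) * scatter_k[i][j] + lam * scatter_pooled[i][j]) / weight
         for j in range(p)]
        for i in range(p)
    ]
    # Step 2: shrink towards the identity scaled by the average variance.
    avg_var = sum(sigma[i][i] for i in range(p)) / p
    return [
        [(1.0 - gamma) * sigma[i][j] + (gamma * avg_var if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]
```

With `lam = 0, gamma = 0` this reduces to the per-class maximum likelihood estimate (the quadratic rule), and with `lam = 1, gamma = 0` to the pooled estimate (the linear rule); intermediate values give the compromise described above.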
Because of the computational burden inherent in the RDF, and in view of the criticisms of the technique by Rayens and Greene (1991), we investigated whether information about appropriate values of the two regularisation parameters could be obtained by examining the behaviour of the Bhattacharyya distance between the various populations. This distance measure is found to give information that leads to the selection of unique and generally appropriate values for the regularisation parameters.
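For reference, the Bhattacharyya distance between two multivariate normal populations has a standard closed form, sketched below in plain Python. The function names and the small Gaussian-elimination helper are illustrative assumptions. How the distance is turned into a choice of regularisation parameters is not stated in this summary, so the sketch only computes the distance for a given pair of means and covariance matrices (which could be regularised estimates).

```python
import math


def solve_and_logdet(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination; also return log|det|."""
    n = len(matrix)
    # Augmented matrix [A | b] as floats, so the inputs are not modified.
    a = [list(map(float, row)) + [float(rhs[i])] for i, row in enumerate(matrix)]
    logdet = 0.0
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        a[col], a[pivot] = a[pivot], a[col]
        logdet += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x, logdet


def bhattacharyya_distance(mean1, cov1, mean2, cov2):
    """Bhattacharyya distance between N(mean1, cov1) and N(mean2, cov2)."""
    p = len(mean1)
    avg = [[0.5 * (cov1[i][j] + cov2[i][j]) for j in range(p)] for i in range(p)]
    diff = [m1 - m2 for m1, m2 in zip(mean1, mean2)]
    x, logdet_avg = solve_and_logdet(avg, diff)
    _, logdet1 = solve_and_logdet(cov1, [0.0] * p)
    _, logdet2 = solve_and_logdet(cov2, [0.0] * p)
    # (1/8) * diff' * avg^{-1} * diff: separation of the means.
    mean_term = 0.125 * sum(d * xi for d, xi in zip(diff, x))
    # (1/2) * ln(|avg| / sqrt(|cov1| |cov2|)): difference in covariance structure.
    cov_term = 0.5 * (logdet_avg - 0.5 * (logdet1 + logdet2))
    return mean_term + cov_term
```

The first term grows with the separation of the means and the second with the difference between the covariance matrices, so the distance is zero only when the two normal populations coincide.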
References
Aeberhard, S., Coomans, D. and de Vel, O. (1994). Comparative analysis of statistical pattern recognition methods in high dimensional settings. Pattern Recognition, 27(8), 1065–1077.
Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis. Second Edition. New York: Wiley.
Bhattacharyya, A. (1946). On a measure of divergence between two multinomial populations. Sankhya, A7, 401–406.
Efron, B. (1983). Estimating the error rate of a prediction rule: improvement on cross-validation. J. Amer. Statist. Assoc., 78, 316–331.
Friedman, J. H. (1989). Regularized discriminant analysis. J. Amer. Statist. Assoc., 84, 165–175.
Fukunaga, K. and Hayes, R. R. (1989). Effects of sample size in classifier design. IEEE Trans. Pattern Anal. Machine Intell., PAMI-11, 873–885.
Ganeshanandam, S. and Krzanowski, W. J. (1990). Error-rate estimation in two-group discriminant analysis using the linear discriminant function. J. Statist. Comput. Simul., 36, 157–175.
Greene, T. and Rayens, W. (1989). Partially pooled covariance matrix estimation in discriminant analysis. Comm. Statist. Theory Meth., 18(10), 3679–3702.
Hong, Z. Q. and Yang, J. Y. (1991). Optimal discriminant plane for a small number of samples and design method of classifier on the plane. Pattern Recognition, 24, 317–324.
Jain, A. K. (1976). On an estimate of the Bhattacharyya distance. IEEE Trans. Syst. Man Cybern., SMC-6, 763–766.
Kailath, T. (1967). The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans. Commun. Tech., COM-15, 52–60.
Koolaard, J. P. and Lawoko, C. R. O. (1993). Estimating error rates in discriminant analysis with correlated training observations: a simulation study. J. Statist. Comput. Simul., 48, 81–99.
Koolaard, J. P., Lawoko, C. R. O. and Ganesalingam, S. (1996). Regularized discriminant (classification) analysis involving Bhattacharya distance measure. Proceedings of the 8th Australasian Remote Sensing Conference, Canberra, Australia (March 1996). Volume 2, Poster, pp 35–43.
Koolaard, J. P. and Lawoko, C. R. O. (1996). The linear and Euclidean discriminant functions: a comparison via asymptotic expansions and simulation study. Commun. Statist.-Theory Meth., 25(12), 2989–3011.
Koolaard, J. P. (1997). Some aspects of covariance regularisation in discriminant analysis. Unpublished PhD thesis, Massey University, New Zealand.
Koolaard, J. P., Ganesalingam, S. and Lawoko, C. R. O. (1998). Comparison of regularised discriminant analysis with the standard discrimination methods. Comp. Statist., 13(4), 495–509.
Lindsey, J. C., Herzberg, A. M. and Watts, D. G. (1987). A method for cluster analysis based on projections and quantile-quantile plots. Biometrics, 43, 327–341.
Marco, V. R., Young, D. M. and Turner, D. W. (1987). The Euclidean distance classifier: an alternative to the linear discriminant function. Commun. Statist.-Simula., 16, 485–505.
Morant, G. M. (1923). A first study of the Tibetan skull. Biometrika, 14, 193–260.
Rayens, W. and Greene, T. (1991). Covariance pooling and stabilization for classification. Comput. Statist. Data Anal., 11, 17–42.
Reaven, G. M. and Miller, R. G. (1979). An attempt to define the nature of chemical diabetes using a multidimensional analysis. Diabetologia, 16, 17–24.
Cite this article
Koolaard, J.P., Ganesalingam, S. & Lawoko, C.R.O. The Use of a Distance Measure in Regularised Discriminant Analysis. Computational Statistics 17, 185–202 (2002). https://doi.org/10.1007/s001800200101
DOI: https://doi.org/10.1007/s001800200101