
Computational Statistics, Volume 17, Issue 2, pp 185–202

The Use of a Distance Measure in Regularised Discriminant Analysis

  • J. P. Koolaard
  • S. Ganesalingam
  • C. R. O. Lawoko

Summary

Friedman (1989) proposed a regularised discriminant function (RDF) as a compromise between the normal-based linear and quadratic discriminant functions, by considering alternatives to the usual maximum likelihood estimates for the covariance matrices. These alternatives are characterised by two (regularisation) parameters, the values of which are customised to individual situations by jointly minimising a sample-based (cross-validated) estimate of future misclassification risk. This technique appears to provide considerable gains in classification accuracy in many circumstances, although it is computationally intensive.
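
To make the role of the two regularisation parameters concrete, the sketch below (a minimal Python illustration, not code from the paper; the variable names S_k, n_k, S and n are our own) shows the usual form of Friedman's regularised covariance estimate for class k: a parameter lambda shrinks the class scatter matrix towards the pooled scatter, and a parameter gamma then shrinks the result towards a multiple of the identity.

```python
import numpy as np

def regularised_covariance(S_k, n_k, S, n, lam, gamma):
    """Regularised covariance estimate for class k (Friedman-style blend).

    S_k   : scatter matrix of class k (sum of outer products of centred obs.)
    n_k   : number of training observations in class k
    S     : pooled scatter matrix (sum of the S_k over all classes)
    n     : total number of training observations
    lam   : shrinkage towards the pooled estimate (0 ~ quadratic, 1 ~ linear rule)
    gamma : shrinkage towards a multiple of the identity matrix
    """
    p = S_k.shape[0]
    # Blend class and pooled scatter, normalising by the blended sample size.
    sigma_lam = ((1.0 - lam) * S_k + lam * S) / ((1.0 - lam) * n_k + lam * n)
    # Shrink towards (average eigenvalue of sigma_lam) times the identity.
    return (1.0 - gamma) * sigma_lam + (gamma / p) * np.trace(sigma_lam) * np.eye(p)
```

Friedman selects the pair (lambda, gamma) over a grid of candidate values by minimising a cross-validated estimate of the misclassification risk; re-estimating and inverting the covariance matrices at every grid point for every left-out observation is the source of the computational burden referred to above.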

Because of the computational burden inherent in the RDF, and in view of the criticisms of the technique made by Rayens and Greene (1991), we investigated whether information about appropriate values of the two regularisation parameters could be obtained by examining the behaviour of the Bhattacharyya distance between the various populations. This distance measure is found to provide information that leads to the selection of unique and generally appropriate values for the regularisation parameters.
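
For reference, the Bhattacharyya distance between two multivariate normal populations N(mu1, Sigma1) and N(mu2, Sigma2) has a closed form, evaluated in the illustrative Python fragment below (again not the authors' code). In the setting above, the population covariances would be replaced by regularised estimates computed at candidate values of the regularisation parameters, so that the behaviour of the distance can be examined across the (lambda, gamma) grid; how the resulting distance profiles translate into a specific choice of parameter values is the subject of the paper itself.

```python
import numpy as np

def bhattacharyya_distance(mu1, sigma1, mu2, sigma2):
    """Bhattacharyya distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    sigma = 0.5 * (sigma1 + sigma2)      # average covariance matrix
    diff = mu1 - mu2
    # Mahalanobis-type term measuring the separation of the means.
    term_means = 0.125 * diff @ np.linalg.solve(sigma, diff)
    # Log-determinant term measuring the difference in covariance structure.
    term_covs = 0.5 * (np.linalg.slogdet(sigma)[1]
                       - 0.5 * (np.linalg.slogdet(sigma1)[1]
                                + np.linalg.slogdet(sigma2)[1]))
    return term_means + term_covs
```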

Keywords

Regularised discriminant function · regularisation parameter · Bhattacharyya distance

Notes

Acknowledgement

Preliminary results from this project were reported in a paper presented at the 8th Australasian Remote Sensing Conference, Canberra, 1996 (see Koolaard, Lawoko and Ganesalingam (1996)).

References

  1. Aeberhard, S., Coomans, D. and de Vel, O. (1994). Comparative analysis of statistical pattern recognition methods in high dimensional settings. Pattern Recognition, 27(8), 1065–1077.
  2. Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis. Second Edition. New York: Wiley.
  3. Bhattacharyya, A. (1946). On a measure of divergence between two multinomial populations. Sankhya, A7, 401–406.
  4. Efron, B. (1983). Estimating the error rate of a prediction rule: improvement on cross-validation. J. Amer. Statist. Assoc., 78, 316–331.
  5. Friedman, J. H. (1989). Regularized discriminant analysis. J. Amer. Statist. Assoc., 84, 165–175.
  6. Fukunaga, K. and Hayes, R. R. (1989). Effects of sample size in classifier design. IEEE Trans. Pattern Anal. Machine Intell., PAMI-11, 873–885.
  7. Ganeshanandam, S. and Krzanowski, W. J. (1990). Error-rate estimation in two-group discriminant analysis using the linear discriminant function. J. Statist. Comput. Simul., 36, 157–175.
  8. Greene, T. and Rayens, W. (1989). Partially pooled covariance matrix estimation in discriminant analysis. Comm. Statist. Theory Meth., 18(10), 3679–3702.
  9. Hong, Z. Q. and Yang, J. Y. (1991). Optimal discriminant plane for a small number of samples and design method of classifier on the plane. Pattern Recognition, 24, 317–324.
  10. Jain, A. K. (1976). On an estimate of the Bhattacharyya distance. IEEE Trans. Syst. Man Cybern., SMC-6, 763–766.
  11. Kailath, T. (1967). The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans. Commun. Tech., COM-15, 52–60.
  12. Koolaard, J. P. and Lawoko, C. R. O. (1993). Estimating error rates in discriminant analysis with correlated training observations: a simulation study. J. Statist. Comput. Simul., 48, 81–99.
  13. Koolaard, J. P., Lawoko, C. R. O. and Ganesalingam, S. (1996). Regularized discriminant (classification) analysis involving Bhattacharya distance measure. Proceedings of the 8th Australasian Remote Sensing Conference, Canberra, Australia (March 1996), Volume 2, Poster, pp 35–43.
  14. Koolaard, J. P. and Lawoko, C. R. O. (1996). The linear and Euclidean discriminant functions: a comparison via asymptotic expansions and simulation study. Commun. Statist.-Theory Meth., 25(12), 2989–3011.
  15. Koolaard, J. P. (1997). Some aspects of covariance regularisation in discriminant analysis. Unpublished PhD thesis, Massey University, New Zealand.
  16. Koolaard, J. P., Ganesalingam, S. and Lawoko, C. R. O. (1998). Comparison of regularised discriminant analysis with the standard discrimination methods. Comp. Statist., 13(4), 495–509.
  17. Lindsey, J. C., Herzberg, A. M. and Watts, D. G. (1987). A method for cluster analysis based on projections and quantile-quantile plots. Biometrics, 43, 327–341.
  18. Marco, V. R., Young, D. M. and Turner, D. W. (1987). The Euclidean distance classifier: an alternative to the linear discriminant function. Commun. Statist.-Simula., 16, 485–505.
  19. Morant, G. M. (1923). A first study of the Tibetan skull. Biometrika, 14, 193–260.
  20. Rayens, W. and Greene, T. (1991). Covariance pooling and stabilization for classification. Comput. Statist. Data Anal., 11, 17–42.
  21. Reaven, G. M. and Miller, R. G. (1979). An attempt to define the nature of chemical diabetes using a multidimensional analysis. Diabetologia, 16, 17–24.

Copyright information

© Physica-Verlag 2002

Authors and Affiliations

  • J. P. Koolaard (1)
  • S. Ganesalingam (2)
  • C. R. O. Lawoko (3)
  1. Crop and Food Research Limited, Palmerston North, New Zealand
  2. Institute of Information Sciences and Technology, Massey University, Palmerston North, New Zealand
  3. Predictive Marketing, National Australia Bank, Melbourne, Australia
