Feature Weighted Minimum Distance Classifier with Multi-class Confidence Estimation

  • Mamatha Rudrapatna
  • Arcot Sowmya
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4304)


In many recognition tasks, a simple discrete class label is not sufficient and a ranking of the classes is desirable; in others, a numeric score representing the confidence of class membership is required for multiple classes. Differential diagnosis in medical domains and terrain classification in surveying are prime examples. The Minimum Distance Classifier is a well-known, simple and efficient scheme for producing multi-class probabilities. However, when features contribute unequally to the classification, noisy and irrelevant features can distort the distance function. We enhance the minimum distance classifier with feature weights, leading to the Feature Weighted Minimum Distance classifier. We empirically compare the minimum distance classifier and its feature weighted version with a number of standard classifiers. We also present preliminary results on medical images with acceptable performance and better interpretability.
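The idea described in the abstract can be sketched as follows: represent each class by its training centroid, compute a feature-weighted distance from a query point to every centroid, and map those distances to per-class confidence scores. This is a minimal illustration, not the authors' implementation: the paper's weighting scheme and its exact distance-to-confidence mapping are not given in the abstract, so the softmax over negative squared distances below is an assumed scoring rule, and the function names are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """Compute one mean vector (centroid) per class label."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_confidences(X, classes, centroids, weights=None):
    """Weighted squared Euclidean distance to each centroid, turned into
    per-class confidence scores that sum to 1 for every sample."""
    if weights is None:
        weights = np.ones(X.shape[1])  # all-ones weights = plain minimum distance classifier
    # diff has shape (n_samples, n_classes, n_features)
    diff = X[:, None, :] - centroids[None, :, :]
    d2 = (weights * diff ** 2).sum(axis=2)
    # Closer centroid -> higher confidence. A softmax over negative distances
    # is one plausible mapping (assumption; not specified in the abstract).
    z = -d2
    s = np.exp(z - z.max(axis=1, keepdims=True))
    return classes, s / s.sum(axis=1, keepdims=True)
```

Down-weighting a noisy feature (small entry in `weights`) shrinks its contribution to `d2`, which is exactly how irrelevant features are kept from distorting the distance function.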




  1. Duda, R., Hart, P., Stork, D.: Pattern Classification, 2nd edn. Wiley Interscience, Chichester (2001)
  2. Dabney, A.R.: Classification of microarrays to nearest centroids. Bioinformatics 21, 4148–4154 (2005)
  3. Wettschereck, D., Aha, D.W., Mohri, T.: A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms. Artificial Intelligence Review 11, 273–314 (1997)
  4. Domingos, P.M.: Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier. Machine Learning 29, 103–130 (1997)
  5. Zhang, H., Su, J.: Naive Bayesian Classifiers for Ranking. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, pp. 501–512. Springer, Heidelberg (2004)
  6. Zadrozny, B., Elkan, C.: Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In: ICML, pp. 609–616 (2001)
  7. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian Network Classifiers. Machine Learning 29, 131–163 (1997)
  8. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann (1993)
  9. Provost, F., Fawcett, T., Kohavi, R.: The case against accuracy estimation for comparing induction algorithms. In: ICML, pp. 445–453 (1998)
  10. Provost, F., Domingos, P.: Tree induction for probability-based ranking. Machine Learning 52, 199–215 (2003)
  11. Kohavi, R.: Scaling up the accuracy of naïve Bayes classifiers: A decision-tree hybrid. In: KDD, pp. 202–207 (1996)
  12. Margineantu, D.D., Dietterich, T.G.: Improved class probability estimates from decision tree models. Lecture Notes in Statistics, vol. 171, pp. 169–184 (2002)
  13. Ling, C., Yan, J.: Decision tree with better ranking. In: ICML, pp. 480–487 (2003)
  14. Delany, S., Cunningham, P., Doyle, D., Zamolotskikh, A.: Generating Estimates of Classification Confidence for a Case-Based Spam Filter. In: Muñoz-Ávila, H., Ricci, F. (eds.) ICCBR 2005. LNCS (LNAI), vol. 3620, pp. 177–190. Springer, Heidelberg (2005)
  15. Dietterich, T.G.: Ensemble Learning Methods. In: Handbook of Brain Theory and Neural Networks, 2nd edn. MIT Press, Cambridge (2001)
  16. Zadrozny, B., Elkan, C.: Transforming Classifier Scores into Accurate Multiclass Probability Estimates. In: KDD, pp. 259–268 (2002)
  17. Niculescu-Mizil, A., Caruana, R.: Predicting Good Probabilities With Supervised Learning. In: ICML, pp. 625–632 (2005)
  18. Lin, H., Venetsanopoulos, A.: A Weighted Minimum Distance Classifier for Pattern Recognition. In: Canadian Conference on Electrical and Computer Engineering, pp. 904–907 (1993)
  19. Kononenko, I.: Estimating attributes: analysis and extensions of Relief. In: ECML, pp. 171–182 (1994)
  20. Newman, D.J., Hettich, S., Blake, C.L., Merz, C.J.: UCI Repository of machine learning databases. University of California, Irvine, http://www.ics.uci.edu/~mlearn/MLRepository.html
  21. Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann (2005)
  22. Bradley, A.P.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30, 1145–1159 (1997)
  23. Provost, F., Domingos, P.: Well-trained PETs: Improving probability estimation trees. CeDER Working Paper #00-04-IS, Stern School of Business, NYU, NY 10012 (2000)
  24. Hand, D.J., Till, R.J.: A simple generalization of the area under the ROC curve to multiple class classification problems. Machine Learning 45, 171–186 (2001)
  25. Brier, G.W.: Verification of forecasts expressed in terms of probability. Monthly Weather Review 78, 1–3 (1950)
  26. DeGroot, M., Fienberg, S.: The comparison and evaluation of forecasters. Statistician 32, 12–22 (1982)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Mamatha Rudrapatna (1)
  • Arcot Sowmya (1, 2)
  1. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
  2. Division of Engineering, Science and Technology, UNSW, Singapore
