Probability Based Metrics for Locally Weighted Naive Bayes

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4509)

Abstract

Locally weighted naive Bayes (LWNB) is a successful instance-based classifier that first finds the neighbors of a test instance using the Euclidean metric and then builds a naive Bayes model in that local neighborhood. However, the Euclidean metric is not the best choice for LWNB. For nominal attributes, it must either impose an order and a numbering on the attribute values or merely judge whether two values are identical. For numeric attributes, it does not account for differences in attribute scale and variability, and it is affected by outlying attribute values when the values are normalized. In this paper, we systematically study probability based metrics, including the Interpolated Value Difference Metric (IVDM), the Extended Short and Fukunaga Metric (SF2), SF2 calibrated by logarithm (SF2LOG), and the Minimum Risk Metric (MRM), and apply them to LWNB. These metrics avoid the above problems of the Euclidean metric because they evaluate the distance between instances from the difference between probabilities. We conduct experiments on UCI datasets to compare the performance of LWNB classifiers using the Euclidean metric with that of LWNB classifiers using probability based metrics. The results show that LWNB classifiers using IVDM outperform those using the Euclidean metric and those using the other probability based metrics. We also observe that SF2, SF2LOG and MRM do not perform well because of their inaccurate probability estimates. We then build an artificial dataset by logical sampling in a Bayesian network, for which accurate probability estimates can be produced, and repeat the experiment on it. The results show that SF2, SF2LOG and MRM, when given accurate probability estimates, perform better than the Euclidean metric and IVDM in LWNB.
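To make the two-step idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It selects neighbors with a plain value difference metric (VDM) on nominal attributes, which compares the class probabilities conditioned on each attribute value, and then fits a Laplace-smoothed naive Bayes model on those neighbors. Several points are assumptions made for illustration: the function names, the squared-difference form of the distance, the uniform fallback for unseen values, the toy data, and the omission of both the interpolation that IVDM uses for numeric attributes and any distance-based weighting of the neighbors.

```python
from collections import Counter, defaultdict
import math


def fit_vdm(X, y):
    """Estimate P(class | attribute = value) from counts, per nominal attribute."""
    classes = sorted(set(y))
    n_attrs = len(X[0])
    counts = [defaultdict(Counter) for _ in range(n_attrs)]
    for row, label in zip(X, y):
        for a, v in enumerate(row):
            counts[a][v][label] += 1
    probs = []
    for a in range(n_attrs):
        table = {}
        for v, ctr in counts[a].items():
            total = sum(ctr.values())
            table[v] = [ctr[c] / total for c in classes]
        probs.append(table)
    return classes, probs


def vdm_distance(x, z, classes, probs):
    """Sum over attributes of squared differences between class probabilities."""
    # Assumption: values never seen in training fall back to a uniform distribution.
    uniform = [1.0 / len(classes)] * len(classes)
    d = 0.0
    for a, (xv, zv) in enumerate(zip(x, z)):
        p = probs[a].get(xv, uniform)
        q = probs[a].get(zv, uniform)
        d += sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return d


def local_naive_bayes(X, y, query, k=3):
    """Classify `query` with naive Bayes trained on its k nearest neighbors."""
    classes, probs = fit_vdm(X, y)
    order = sorted(range(len(X)),
                   key=lambda i: vdm_distance(query, X[i], classes, probs))
    idx = order[:k]
    # Number of distinct values per attribute, used for Laplace smoothing.
    n_values = [len({row[a] for row in X}) for a in range(len(query))]
    best, best_score = None, -math.inf
    for c in classes:
        members = [X[i] for i in idx if y[i] == c]
        # Smoothed log prior of class c inside the neighborhood.
        score = math.log((len(members) + 1) / (len(idx) + len(classes)))
        for a, v in enumerate(query):
            match = sum(1 for row in members if row[a] == v)
            score += math.log((match + 1) / (len(members) + n_values[a]))
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical toy data: two nominal attributes and a binary class.
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
     ("rainy", "mild"), ("sunny", "hot"), ("rainy", "mild")]
y = ["no", "no", "yes", "yes", "no", "yes"]
print(local_naive_bayes(X, y, ("sunny", "mild"), k=3))
```

Because the distance is computed from estimated probabilities rather than from the raw values, nominal values need no ordering or numbering. The same property means the neighborhood is only as good as the probability estimates behind it, which is the sensitivity the abstract reports for SF2, SF2LOG and MRM.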

Editor information

Ziad Kobti Dan Wu

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Wang, B., Zhang, H. (2007). Probability Based Metrics for Locally Weighted Naive Bayes. In: Kobti, Z., Wu, D. (eds) Advances in Artificial Intelligence. Canadian AI 2007. Lecture Notes in Computer Science, vol 4509. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72665-4_16

  • DOI: https://doi.org/10.1007/978-3-540-72665-4_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72664-7

  • Online ISBN: 978-3-540-72665-4

  • eBook Packages: Computer Science, Computer Science (R0)
