An Unsupervised Bayesian Distance Measure

Kontkanen, Petri; Lahtinen, Jussi; Myllymäki, Petri; Tirri, Henry

doi:10.1007/3-540-44527-7_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1898)

Included in the following conference series:

European Workshop on Advances in Case-Based Reasoning

Abstract

We introduce a distance measure based on the idea that two vectors are considered similar if they lead to similar predictive probability distributions. The suggested approach avoids the scaling problem inherent to many alternative techniques as the method automatically transforms the original attribute space to a probability space where all the numbers lie between 0 and 1. The method is also flexible in the sense that it allows different attribute types (discrete or continuous) in the same consistent framework. To study the validity of the suggested measure, we ran a series of experiments with publicly available data sets. The empirical results demonstrate that the unsupervised distance measure is sensible in the sense that it can be used for discovering the hidden clustering structure of the data.
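The core idea can be illustrated with a minimal sketch. It is not the authors' definition, which the abstract does not spell out. It assumes a hypothetical two-component mixture over three binary attributes with hand-picked parameters (`WEIGHTS`, `THETA`), and it compares two vectors by the predictive distribution each one induces for every attribute given the remaining attributes. The total variation distance and the averaging over attributes are also illustrative choices.

```python
from math import prod

# Hypothetical toy model (an assumption, not taken from the paper):
# WEIGHTS[k] = P(Z = k), THETA[k][i][v] = P(X_i = v | Z = k).
WEIGHTS = [0.5, 0.5]
THETA = [
    [[0.9, 0.1], [0.8, 0.2], [0.9, 0.1]],
    [[0.2, 0.8], [0.1, 0.9], [0.2, 0.8]],
]


def predictive(x, i):
    """Return P(X_i | all attributes of x except i) under the toy mixture."""
    # Unnormalised posterior weight of each component given the other attributes.
    post = [
        w * prod(THETA[k][j][x[j]] for j in range(len(x)) if j != i)
        for k, w in enumerate(WEIGHTS)
    ]
    total = sum(post)
    n_values = len(THETA[0][i])
    # Mix the per-component distributions of attribute i by those weights.
    return [
        sum(post[k] * THETA[k][i][v] for k in range(len(WEIGHTS))) / total
        for v in range(n_values)
    ]


def total_variation(p, q):
    """Total variation distance between two discrete distributions, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def predictive_distance(x, y):
    """Average, over attributes, of the distance between predictive distributions."""
    n = len(x)
    return sum(
        total_variation(predictive(x, i), predictive(y, i)) for i in range(n)
    ) / n


if __name__ == "__main__":
    print(predictive_distance((0, 0, 0), (0, 0, 1)))
    print(predictive_distance((0, 0, 0), (1, 1, 1)))
```

Because every comparison is made between probability distributions, the result lies between 0 and 1 whatever the scale of the original attributes. That is the scaling property the abstract describes.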


Author information

Authors and Affiliations

Complex Systems Computation Group (CoSCo), P.O. Box 26, Department of Computer Science, University of Helsinki, FIN-00014, Finland
Petri Kontkanen, Jussi Lahtinen, Petri Myllymäki & Henry Tirri

Editor information

Editors and Affiliations

Centre for Scientific and Technological Research (ITC-irst), Istituto Trentino di Cultura, via Sommarive 18, 38050, Povo (Trento), Italy
Enrico Blanzieri
DISTA, University of Eastern Piedmont “Amedeo Avogadro”, C.so Borsalino 54, 15100, Alessandria, Italy
Luigi Portinale

About this paper

Cite this paper

Kontkanen, P., Lahtinen, J., Myllymäki, P., Tirri, H. (2000). An Unsupervised Bayesian Distance Measure. In: Blanzieri, E., Portinale, L. (eds) Advances in Case-Based Reasoning. EWCBR 2000. Lecture Notes in Computer Science, vol 1898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44527-7_14

DOI: https://doi.org/10.1007/3-540-44527-7_14
Published: 14 January 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67933-2
Online ISBN: 978-3-540-44527-2
eBook Packages: Springer Book Archive
