Abstract
Unsupervised learning algorithms are designed to extract structure from data samples on the basis of a cost function for structures. For a reliable and robust inference process, the unsupervised learning algorithm has to guarantee that the extracted structures are typical of the data source. In particular, it has to reject all structures for which inference is dominated by the arbitrariness of the sample noise and which, consequently, amount to overfitting in unsupervised learning. This paper summarizes an inference principle called Empirical Risk Approximation, which allows us to quantitatively measure the overfitting effect and to derive a criterion as a safeguard against it. The crucial condition for learning is met if (i) the empirical risk of learning converges uniformly towards the expected risk and (ii) the hypothesis class retains a minimal variety for consistent inference. Parameter selection of learnable data structures is demonstrated for the case of k-means clustering, and Monte Carlo simulations are presented to support the selection principle.
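The overfitting effect the abstract refers to can be illustrated with a minimal sketch (not the paper's Empirical Risk Approximation criterion itself, which requires the uniform-convergence bound from the paper): a plain Lloyd-style k-means is fit on a small training sample, and the empirical risk (average distortion on the training data) is compared with an estimate of the expected risk (distortion on a large held-out sample drawn from the same source). The data source, sample sizes, and the `kmeans`/`risk` helper names below are all illustrative assumptions.

```python
import random

random.seed(0)


def kmeans(data, k, iters=50):
    # Illustrative 1-D Lloyd's algorithm: alternate assignment and
    # centroid-update steps for a fixed number of iterations.
    centers = random.sample(data, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: (x - centers[j]) ** 2)
            clusters[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers


def risk(data, centers):
    # Average squared distortion of a sample w.r.t. the given centers:
    # the empirical risk when `data` is the training sample, an estimate
    # of the expected risk when `data` is a large held-out sample.
    return sum(min((x - c) ** 2 for c in centers) for x in data) / len(data)


# Toy source: a mixture of two well-separated 1-D Gaussians.
train = [random.gauss(m, 0.3) for m in (0.0, 5.0) for _ in range(50)]
test = [random.gauss(m, 0.3) for m in (0.0, 5.0) for _ in range(2000)]

for k in (1, 2, 4, 8):
    centers = kmeans(train, k)
    print(f"k={k}  empirical risk={risk(train, centers):.3f}  "
          f"held-out risk={risk(test, centers):.3f}")
```

The empirical risk keeps shrinking as k grows, since extra centers can always fit the sample noise, while the held-out risk stops improving once k matches the number of true sources; the widening gap between the two is precisely the overfitting that a model-selection criterion has to detect.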
© 1999 Springer-Verlag London Limited
Cite this paper
Buhmann, J.M., Held, M. (1999). Unsupervised Learning without Overfitting: Empirical Risk Approximation as an Induction Principle for Reliable Clustering. In: Singh, S. (eds) International Conference on Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-0833-7_17
Print ISBN: 978-1-4471-1214-3
Online ISBN: 978-1-4471-0833-7