Abstract
Unsupervised learning algorithms are designed to extract structure from data samples on the basis of a cost function for structures. For a reliable and robust inference process, the unsupervised learning algorithm has to guarantee that the extracted structures are typical of the data source. In particular, it has to reject all structures for which inference is dominated by the arbitrariness of the sample noise and which, consequently, amount to overfitting in unsupervised learning. This paper summarizes an inference principle called Empirical Risk Approximation, which allows us to quantitatively measure the overfitting effect and to derive a criterion as a safeguard against it. The crucial condition for learning is met if (i) the empirical risk of learning converges uniformly towards the expected risk and (ii) the hypothesis class retains a minimal variety for consistent inference. Parameter selection of learnable data structures is demonstrated for the case of k-means clustering, and Monte Carlo simulations are presented to support the selection principle.
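The overfitting effect the abstract refers to can be illustrated with a minimal sketch (not the paper's Empirical Risk Approximation criterion itself, which requires the uniform-convergence bound from the paper): a plain Lloyd-style k-means is fit on a small training sample, and the empirical risk (average distortion on the training data) is compared with an estimate of the expected risk (distortion on a large held-out sample drawn from the same source). The data source, sample sizes, and the `kmeans`/`risk` helper names below are all illustrative assumptions.

```python
import random

random.seed(0)


def kmeans(data, k, iters=50):
    # Illustrative 1-D Lloyd's algorithm: alternate assignment and
    # centroid-update steps for a fixed number of iterations.
    centers = random.sample(data, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: (x - centers[j]) ** 2)
            clusters[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers


def risk(data, centers):
    # Average squared distortion of a sample w.r.t. the given centers:
    # the empirical risk when `data` is the training sample, an estimate
    # of the expected risk when `data` is a large held-out sample.
    return sum(min((x - c) ** 2 for c in centers) for x in data) / len(data)


# Toy source: a mixture of two well-separated 1-D Gaussians.
train = [random.gauss(m, 0.3) for m in (0.0, 5.0) for _ in range(50)]
test = [random.gauss(m, 0.3) for m in (0.0, 5.0) for _ in range(2000)]

for k in (1, 2, 4, 8):
    centers = kmeans(train, k)
    print(f"k={k}  empirical risk={risk(train, centers):.3f}  "
          f"held-out risk={risk(test, centers):.3f}")
```

The empirical risk keeps shrinking as k grows, since extra centers can always fit the sample noise, while the held-out risk stops improving once k matches the number of true sources; the widening gap between the two is precisely the overfitting that a model-selection criterion has to detect.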
© 1999 Springer-Verlag London Limited
Cite this paper
Buhmann, J.M., Held, M. (1999). Unsupervised Learning without Overfitting: Empirical Risk Approximation as an Induction Principle for Reliable Clustering. In: Singh, S. (eds) International Conference on Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-0833-7_17
Print ISBN: 978-1-4471-1214-3
Online ISBN: 978-1-4471-0833-7