Abstract
We are interested in the following questions. Given a finite data set \(\mathcal {S}\), with neither labels nor side information, and an unsupervised learning algorithm \(\mathsf {A}\), can the generalization of \(\mathsf {A}\) be assessed on \(\mathcal {S}\)? Similarly, given two unsupervised learning algorithms, \(\mathsf {A}_1\) and \(\mathsf {A}_2\), for the same learning task, can one assess whether one will generalize “better” on future data drawn from the same source as \(\mathcal {S}\)? In this paper, we develop a general approach to answering these questions in a reliable and efficient manner under mild assumptions on \(\mathsf {A}\). We first propose a concrete generalization criterion for unsupervised learning that is analogous to prediction error in supervised learning. We then develop a computationally efficient procedure that realizes the generalization criterion on finite data sets, and propose an extension for comparing the generalization of two algorithms on the same data set. We validate the overall framework on algorithms for clustering and dimensionality reduction (linear and nonlinear).
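The criterion described above parallels prediction error in supervised learning: fit an unsupervised model on part of the data and score it on held-out points with a task-appropriate loss. The sketch below is not the paper's procedure, only a minimal illustration of the idea, using hold-out reconstruction error to compare two PCA models (rank 1 vs. rank 2); all function names and the toy data are assumptions for this example.

```python
import numpy as np

def pca_fit(X, k):
    """Fit a rank-k PCA model: mean plus top-k principal directions."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def pca_reconstruction_error(X, model):
    """Mean squared reconstruction error of X under a fitted PCA model."""
    mu, V = model
    Z = (X - mu) @ V.T          # project onto the principal subspace
    Xhat = Z @ V + mu           # reconstruct in the original space
    return np.mean(np.sum((X - Xhat) ** 2, axis=1))

def holdout_generalization(X, fit, loss, train_frac=0.7, seed=0):
    """Fit on a random training split, report loss on the held-out split."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_tr = int(train_frac * len(X))
    model = fit(X[idx[:n_tr]])
    return loss(X[idx[n_tr:]], model)

# Toy data lying near a 2-D subspace of R^5, plus small isotropic noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 5)) \
    + 0.05 * rng.normal(size=(500, 5))

err_k1 = holdout_generalization(X, lambda S: pca_fit(S, 1),
                                pca_reconstruction_error)
err_k2 = holdout_generalization(X, lambda S: pca_fit(S, 2),
                                pca_reconstruction_error)
# On this data the rank-2 model should attain lower held-out error,
# i.e. "generalize better" under the reconstruction-error criterion.
```

The same template applies to other unsupervised tasks by swapping the loss, e.g. quantization error for clustering, making the comparison of two algorithms on one data set a matter of comparing held-out losses.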
Keywords
- Principal Component Analysis
- Supervised Learning
- Reconstruction Error
- Unsupervised Learning
- Locally Linear Embedding
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Abou-Moustafa, K.T., Schuurmans, D. (2015). Generalization in Unsupervised Learning. In: Appice, A., Rodrigues, P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science(), vol 9284. Springer, Cham. https://doi.org/10.1007/978-3-319-23528-8_19
Print ISBN: 978-3-319-23527-1
Online ISBN: 978-3-319-23528-8