Semi-Supervised Learning on Riemannian Manifolds
We consider the general problem of utilizing both labeled and unlabeled data to improve classification accuracy. Under the assumption that the data lie on a submanifold in a high-dimensional space, we develop an algorithmic framework to classify a partially labeled data set in a principled manner. The central idea of our approach is that classification functions are naturally defined only on the submanifold in question rather than the total ambient space. Using the Laplace-Beltrami operator, one produces a basis (the Laplacian Eigenmaps) for a Hilbert space of square-integrable functions on the submanifold. To recover such a basis, only unlabeled examples are required. Once such a basis is obtained, training can be performed using the labeled data set.
Our algorithm models the manifold using the adjacency graph for the data and approximates the Laplace-Beltrami operator by the graph Laplacian. We provide details of the algorithm, its theoretical justification, and several practical applications for image, speech, and text classification.
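The pipeline described above can be sketched in a few lines of NumPy/SciPy. This is a minimal illustration, not the paper's reference implementation: it assumes a symmetric 0/1 k-nearest-neighbor adjacency graph, the unnormalized graph Laplacian L = D − W as the discrete stand-in for the Laplace-Beltrami operator, and a least-squares fit of the ±1 label function in the truncated eigenbasis; the function name and parameters are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh, lstsq

def laplacian_eigenmap_classifier(X, y_labeled, labeled_idx, k=5, p=3):
    """Classify a partially labeled data set via graph-Laplacian eigenfunctions.

    X           : (n, d) array of all points, labeled and unlabeled together
    y_labeled   : +/-1 labels for the points indexed by labeled_idx
    labeled_idx : indices into X of the labeled points
    k           : number of neighbors in the adjacency graph
    p           : number of eigenfunctions kept as the basis
    """
    n = len(X)
    D = cdist(X, X)
    # Build a symmetric k-nearest-neighbor adjacency graph with 0/1 weights.
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]  # skip the point itself at position 0
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)
    # Unnormalized graph Laplacian L = D - W approximates Laplace-Beltrami.
    L = np.diag(W.sum(axis=1)) - W
    # The smoothest functions on the graph are the eigenvectors with the
    # smallest eigenvalues; keep the first p as the basis.
    _, E = eigh(L)
    E = E[:, :p]
    # Fit the expansion coefficients by least squares on the labeled rows only.
    a, *_ = lstsq(E[labeled_idx], y_labeled)
    # Evaluate the recovered function at every point and threshold its sign.
    return np.sign(E @ a)
```

Note that the eigenbasis is computed from all n points, so the unlabeled data shape the basis even though only the labeled rows enter the fit; this is exactly how the unlabeled examples contribute.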
Volume 56, Issue 1-3, pp. 209-239
- Kluwer Academic Publishers-Plenum Publishers
- semi-supervised learning
- manifold learning
- graph regularization
- Laplace operator
- graph Laplacian