Abstract
This paper develops a theory of clustering and coding that combines a geometric model with a probabilistic model in a principled way. The geometric model is a Riemannian manifold with a Riemannian metric, \({g}_{ij}(\textbf{x})\), which we interpret as a measure of dissimilarity. The probabilistic model consists of a stochastic process with an invariant probability measure that matches the density of the sample input data. The link between the two models is a potential function, \(U(\textbf{x})\), and its gradient, \(\nabla U(\textbf{x})\). We use the gradient to define the dissimilarity metric, which guarantees that our measure of dissimilarity will depend on the probability measure. Finally, we use the dissimilarity metric to define a coordinate system on the embedded Riemannian manifold, which gives us a low-dimensional encoding of our original data.
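The abstract's link between the probabilistic and geometric models can be illustrated numerically. The sketch below assumes the standard Langevin convention \(p(\textbf{x}) \propto e^{-2U(\textbf{x})}\), so that \(\nabla U(\textbf{x}) = -\tfrac{1}{2}\nabla \log p(\textbf{x})\); this convention, the `kde_score` helper, and the bandwidth choice are illustrative assumptions, not details taken from the paper. The score of the sample density is estimated with a Gaussian kernel density estimate, and the potential gradient is read off from it:

```python
import numpy as np

def kde_score(x, samples, h=0.5):
    """Gradient of log of a Gaussian KDE at a point x of shape (d,).

    For kernel bandwidth h:
        grad log p(x) = sum_i w_i (x_i - x) / (h^2 sum_i w_i),
    where w_i are the Gaussian kernel weights.
    """
    diffs = samples - x                                   # (n, d)
    w = np.exp(-np.sum(diffs**2, axis=1) / (2 * h**2))    # kernel weights
    return (w @ diffs) / (h**2 * w.sum())

rng = np.random.default_rng(0)
# Toy 1-D data: samples from N(2, 1) stand in for the "sample input data".
samples = rng.normal(loc=2.0, scale=1.0, size=(5000, 1))

x = np.array([0.0])
score = kde_score(x, samples)      # estimate of grad log p at x
grad_U = -0.5 * score              # potential gradient under the assumed convention

# For N(2, 1) the true score at x = 0 is (2 - 0)/1 = 2; the KDE smooths the
# density (variance 1 + h^2), so the estimate lands somewhat below that.
print(score, grad_U)
```

Note that the kernel estimate targets the smoothed density \(p * K_h\), so the recovered score is biased toward zero by a factor of roughly \(1/(1+h^2)\) for Gaussian data; any practical use would tune the bandwidth accordingly.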
Data Availability
Source code for the examples in Section 5 will be made available in a repository yet to be determined.
Ethics declarations
Conflicts of interest
The author has no relevant financial or non-financial interests to disclose.
Cite this article
McCarty, L.T. Clustering, coding, and the concept of similarity. Ann Math Artif Intell (2024). https://doi.org/10.1007/s10472-024-09929-7