
Abstract

This paper develops a theory of clustering and coding that combines a geometric model with a probabilistic model in a principled way. The geometric model is a Riemannian manifold equipped with a metric, \({g}_{ij}(\textbf{x})\), which we interpret as a measure of dissimilarity. The probabilistic model consists of a stochastic process with an invariant probability measure that matches the density of the sample input data. The link between the two models is a potential function, \(U(\textbf{x})\), and its gradient, \(\nabla U(\textbf{x})\). We use the gradient to define the dissimilarity metric, which guarantees that our measure of dissimilarity depends on the probability measure. Finally, we use the dissimilarity metric to define a coordinate system on the embedded Riemannian manifold, which yields a low-dimensional encoding of the original data.
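The construction in the abstract can be illustrated with a minimal, hypothetical sketch: estimate a density \(p(\textbf{x})\) from sample data, define the potential \(U(\textbf{x}) = -\log p(\textbf{x})\), and use its gradient to build a data-dependent dissimilarity. The gradient-weighted metric below is an illustrative choice for exposition, not the paper's actual definition of \(g_{ij}(\textbf{x})\); all function names here are assumptions.

```python
# Hypothetical sketch of the abstract's pipeline: density -> potential U(x)
# -> gradient of U -> gradient-dependent dissimilarity measure.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))       # sample input data (500 points in R^2)
kde = gaussian_kde(X.T)             # density estimate p(x); expects (dims, n)

def U(x):
    """Potential function U(x) = -log p(x), from the estimated density."""
    return float(-np.log(kde(np.atleast_2d(x).T) + 1e-12)[0])

def grad_U(x, h=1e-4):
    """Central-difference estimate of the gradient of U at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (U(x + e) - U(x - e)) / (2 * h)
    return g

def dissimilarity(a, b):
    """Illustrative gradient-weighted dissimilarity: Euclidean distance
    scaled by the mean magnitude of grad U at the endpoints, so that
    displacements through low-density regions count as more dissimilar."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    w = 1.0 + 0.5 * (np.linalg.norm(grad_U(a)) + np.linalg.norm(grad_U(b)))
    return w * np.linalg.norm(a - b)
```

Because the weight grows with \(\|\nabla U\|\), points separated by sparsely populated regions are judged more dissimilar than equidistant points inside a dense cluster, which is the qualitative behavior the abstract's coupling of metric and measure is meant to guarantee.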



Data Availability

Source code for the examples in Section 5 will be made available in a repository yet to be determined.



Author information


Corresponding author

Correspondence to L. Thorne McCarty.

Ethics declarations

Conflicts of interest

The author has no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

McCarty, L.T. Clustering, coding, and the concept of similarity. Ann Math Artif Intell (2024). https://doi.org/10.1007/s10472-024-09929-7

