
Abstract

This paper develops a theory of clustering and coding that combines a geometric model with a probabilistic model in a principled way. The geometric model is a Riemannian manifold equipped with a metric, \({g}_{ij}(\textbf{x})\), which we interpret as a measure of dissimilarity. The probabilistic model consists of a stochastic process with an invariant probability measure that matches the density of the sample input data. The link between the two models is a potential function, \(U(\textbf{x})\), and its gradient, \(\nabla U(\textbf{x})\). We use the gradient to define the dissimilarity metric, which guarantees that our measure of dissimilarity depends on the probability measure. Finally, we use the dissimilarity metric to define a coordinate system on the embedded Riemannian manifold, which yields a low-dimensional encoding of the original data.
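The construction in the abstract can be illustrated with a minimal, hypothetical sketch: estimate a density \(p(\textbf{x})\) from sample data, define the potential \(U(\textbf{x}) = -\log p(\textbf{x})\), and use its gradient to build a data-dependent dissimilarity. The gradient-weighted metric below is an illustrative choice for exposition, not the paper's actual definition of \(g_{ij}(\textbf{x})\); all function names here are assumptions.

```python
# Hypothetical sketch of the abstract's pipeline: density -> potential U(x)
# -> gradient of U -> gradient-dependent dissimilarity measure.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))       # sample input data (500 points in R^2)
kde = gaussian_kde(X.T)             # density estimate p(x); expects (dims, n)

def U(x):
    """Potential function U(x) = -log p(x), from the estimated density."""
    return float(-np.log(kde(np.atleast_2d(x).T) + 1e-12)[0])

def grad_U(x, h=1e-4):
    """Central-difference estimate of the gradient of U at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (U(x + e) - U(x - e)) / (2 * h)
    return g

def dissimilarity(a, b):
    """Illustrative gradient-weighted dissimilarity: Euclidean distance
    scaled by the mean magnitude of grad U at the endpoints, so that
    displacements through low-density regions count as more dissimilar."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    w = 1.0 + 0.5 * (np.linalg.norm(grad_U(a)) + np.linalg.norm(grad_U(b)))
    return w * np.linalg.norm(a - b)
```

Because the weight grows with \(\|\nabla U\|\), points separated by sparsely populated regions are judged more dissimilar than equidistant points inside a dense cluster, which is the qualitative behavior the abstract's coupling of metric and measure is meant to guarantee.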



Data Availability

Source code for the examples in Section 5 will be made available in a repository yet to be determined.



Author information


Corresponding author

Correspondence to L. Thorne McCarty.

Ethics declarations

Conflicts of interest

The author has no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

McCarty, L.T. Clustering, coding, and the concept of similarity. Ann Math Artif Intell (2024). https://doi.org/10.1007/s10472-024-09929-7

