
Visualizing non-metric similarities in multiple maps

  • Open Access
  • Published: 17 December 2011
  • Machine Learning, volume 87, pages 33–55 (2012)

  • Laurens van der Maaten¹ &
  • Geoffrey Hinton²

Abstract

Techniques for multidimensional scaling visualize objects as points in a low-dimensional metric map. As a result, the visualizations are subject to the fundamental limitations of metric spaces. These limitations prevent multidimensional scaling from faithfully representing non-metric similarity data such as word associations or event co-occurrences. In particular, multidimensional scaling cannot faithfully represent intransitive pairwise similarities in a visualization, and it cannot faithfully visualize “central” objects. In this paper, we present an extension of a recently proposed multidimensional scaling technique called t-SNE. The extension aims to address the problems of traditional multidimensional scaling techniques when these techniques are used to visualize non-metric similarities. The new technique, called multiple maps t-SNE, alleviates these problems by constructing a collection of maps that reveal complementary structure in the similarity data. We apply multiple maps t-SNE to a large data set of word association data and to a data set of NIPS co-authorships, demonstrating its ability to successfully visualize non-metric similarities.
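
The limitation on intransitive similarities can be made concrete with a small check of the triangle inequality. The sketch below uses hypothetical toy dissimilarities (the words and numbers are illustrative assumptions, not data from the article): one word is close to two others that are far from each other. Any single metric map must satisfy d(a, c) ≤ d(a, b) + d(b, c), so a triple that violates this bound cannot be placed faithfully in one map.

```python
import itertools

# Hypothetical toy dissimilarities (illustrative only, not from the article):
# "tie" is close to both "tuxedo" and "rope", but "tuxedo" and "rope" are far apart.
dissim = {
    frozenset(("tie", "tuxedo")): 1.0,
    frozenset(("tie", "rope")): 1.0,
    frozenset(("tuxedo", "rope")): 5.0,
}


def d(a, b):
    """Symmetric lookup of the toy dissimilarity between two words."""
    return dissim[frozenset((a, b))]


def triangle_violations(items):
    """Return triples (a, b, c) for which d(a, c) > d(a, b) + d(b, c).

    The condition a < c keeps one orientation of each triple.
    """
    violations = []
    for a, b, c in itertools.permutations(items, 3):
        if a < c and d(a, c) > d(a, b) + d(b, c):
            violations.append((a, b, c))
    return violations


words = ["tie", "tuxedo", "rope"]
violations = triangle_violations(words)
print(violations)
```

A non-empty result means no arrangement of these three points in a metric space reproduces the dissimilarities exactly. This is the kind of structure that a collection of maps is meant to accommodate, for example by letting the central word appear in more than one map.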



Author information

Authors and Affiliations

  1. Pattern Recognition and Bioinformatics Laboratory, Delft University of Technology, Mekelweg 4, Delft, 2628 CD, The Netherlands

    Laurens van der Maaten

  2. Department of Computer Science, University of Toronto, 6 King’s College Road, M5S 3G4, Toronto, ON, Canada

    Geoffrey Hinton


Corresponding author

Correspondence to Laurens van der Maaten.

Additional information

Editor: Paolo Frasconi.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

van der Maaten, L., Hinton, G. Visualizing non-metric similarities in multiple maps. Mach Learn 87, 33–55 (2012). https://doi.org/10.1007/s10994-011-5273-4

  • Received: 25 October 2010

  • Accepted: 17 November 2011

  • Published: 17 December 2011

  • Issue Date: April 2012

  • DOI: https://doi.org/10.1007/s10994-011-5273-4

Keywords

  • Multidimensional scaling
  • Embedding
  • Data visualization
  • Non-metric similarities