How to Visualize Large Data Sets?

Hammer, Barbara; Gisbrecht, Andrej; Schulz, Alexander

doi:10.1007/978-3-642-35230-0_1

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 198)

Abstract

We address novel developments in dimensionality reduction for data visualization. We consider nonlinear nonparametric techniques such as t-distributed stochastic neighbor embedding (t-SNE) and discuss the difficulties they encounter on large data sets, in contrast to parametric approaches such as the self-organizing map. We focus on three questions that arise in this context: (i) How can dimensionality reduction be realized efficiently, in at most linear time? (ii) How can nonparametric approaches be extended to provide an explicit mapping? (iii) How can these techniques be extended to incorporate auxiliary information such as class labels?
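
As a rough illustration of questions (i) and (ii), the sketch below embeds only a fixed-size subsample and then fits an explicit map that can be applied to every remaining point. The abstract does not say how the chapter constructs such a map, so everything here is an assumption: the map is a normalized Gaussian kernel expansion fitted by least squares, PCA stands in for t-SNE as the subsample embedding to keep the example dependency-free, and the names `normalized_gaussian_kernel`, `fit_kernel_map`, `apply_kernel_map`, and the bandwidth `sigma` are hypothetical.

```python
import numpy as np


def normalized_gaussian_kernel(X, centers, sigma):
    """Row-normalized Gaussian kernel values between points X and the centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    return K / K.sum(axis=1, keepdims=True)


def fit_kernel_map(X_sub, Y_sub, sigma):
    """Least-squares coefficients so the map reproduces Y_sub on the subsample."""
    K = normalized_gaussian_kernel(X_sub, X_sub, sigma)
    return np.linalg.pinv(K) @ Y_sub


def apply_kernel_map(X, X_sub, alpha, sigma):
    """Explicit mapping: cost grows linearly with the number of rows in X."""
    return normalized_gaussian_kernel(X, X_sub, sigma) @ alpha


rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))           # synthetic high-dimensional data
idx = rng.choice(len(X), size=100, replace=False)
X_sub = X[idx]

# Stand-in embedding of the subsample (assumption: PCA instead of t-SNE).
X_centered = X_sub - X_sub.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
Y_sub = X_centered @ Vt[:2].T

sigma = 2.0                               # arbitrary bandwidth for this sketch
alpha = fit_kernel_map(X_sub, Y_sub, sigma)
Y_all = apply_kernel_map(X, X_sub, alpha, sigma)
print(Y_all.shape)
```

Under these assumptions the expensive embedding step only ever sees the 100 subsampled points, so its cost does not depend on the size of the full data set, while mapping the remaining points costs one kernel evaluation per point and subsample member.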

Author information

Authors and Affiliations

CITEC Centre of Excellence, University of Bielefeld, Bielefeld, Germany
Barbara Hammer, Andrej Gisbrecht & Alexander Schulz

Corresponding author

Correspondence to Barbara Hammer.

Editor information

Editors and Affiliations

Department of Electrical Engineering, University of Chile, Av. Tupper 2007, Santiago, 8370451, Chile
Pablo A. Estévez
Computational NeuroEngineering Lab., University of Florida, 216 Larsen Hall, Gainesville, 32611-6200, USA
José C. Príncipe
Facultad de Ingeniería y, Universidad de los Andes, San Carlos de Apoquindo Nº 2200, Santiago, 8370451, Chile
Pablo Zegers

About this paper

Cite this paper

Hammer, B., Gisbrecht, A., Schulz, A. (2013). How to Visualize Large Data Sets?. In: Estévez, P., Príncipe, J., Zegers, P. (eds) Advances in Self-Organizing Maps. Advances in Intelligent Systems and Computing, vol 198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35230-0_1

DOI: https://doi.org/10.1007/978-3-642-35230-0_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35229-4
Online ISBN: 978-3-642-35230-0
eBook Packages: Engineering, Engineering (R0)
