Abstract
We address novel developments in the context of dimensionality reduction for data visualization. We consider nonlinear nonparametric techniques such as t-distributed stochastic neighbor embedding and discuss the difficulties encountered when dealing with large data sets, in contrast to parametric approaches such as the self-organizing map. We focus on the following topics, which arise in this context: (i) how can dimensionality reduction be realized efficiently in at most linear time, (ii) how can nonparametric approaches be extended to provide an explicit mapping, and (iii) how can techniques be extended to incorporate auxiliary information as provided by class labeling?
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hammer, B., Gisbrecht, A., Schulz, A. (2013). How to Visualize Large Data Sets?. In: Estévez, P., Príncipe, J., Zegers, P. (eds) Advances in Self-Organizing Maps. Advances in Intelligent Systems and Computing, vol 198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35230-0_1
DOI: https://doi.org/10.1007/978-3-642-35230-0_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35229-4
Online ISBN: 978-3-642-35230-0
eBook Packages: Engineering, Engineering (R0)