A Method for Comparative Visualization of Labeled Multidimensional Data and Its Application to Machine Learning Data

Kosaka, Karen; Itoh, Takayuki

doi:10.1007/978-3-031-46549-9_9

Part of the book series: Studies in Computational Intelligence (SCI, volume 1126)

Abstract

This chapter proposes a method for visualizing differences among labeled multidimensional datasets. The method places multiple multidimensional datasets in the same screen space by applying the same dimensionality reduction scheme to each, and then displays the samples of each label as a semi-transparent group drawn in that label's color. This representation makes it easy to observe which labels are most similar or most different across the datasets, and which labels tend to produce outliers. The chapter presents an application example using MNIST, USPS, and CIFAR10, which are representative sample datasets for machine learning, and discusses the effectiveness and open issues of the proposed method.
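
The general idea in the abstract (one shared projection, label-specific colors, semi-transparent drawing) can be illustrated with a minimal sketch. This is not the chapter's implementation: the abstract does not name the dimensionality reduction scheme, so PCA is used here purely as a stand-in, and the function names `fit_shared_pca`, `project`, and `plot_comparison` are hypothetical. The sketch also assumes that all datasets already share the same feature dimensionality and that labels are small non-negative integers.

```python
import numpy as np


def fit_shared_pca(datasets, n_components=2):
    """Fit one PCA basis on the row-wise concatenation of all datasets.

    PCA is an assumed stand-in for the (unspecified) shared
    dimensionality reduction scheme.
    """
    stacked = np.vstack(datasets)
    mean = stacked.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(stacked - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(data, mean, components):
    """Map samples into the shared low-dimensional space."""
    return (data - mean) @ components.T


def plot_comparison(embeddings, labels, names, alpha=0.3):
    """Overlay all datasets in one axes: color = label, marker = dataset."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    cmap = plt.get_cmap("tab10")
    markers = ["o", "^", "s", "D"]
    fig, ax = plt.subplots()
    for i, (emb, lab, name) in enumerate(zip(embeddings, labels, names)):
        colors = [cmap(int(v) % 10) for v in lab]
        ax.scatter(emb[:, 0], emb[:, 1], c=colors, alpha=alpha,
                   marker=markers[i % len(markers)], s=12, label=name)
    ax.legend()
    return fig


# Synthetic stand-ins for two labeled datasets with 16 features each.
rng = np.random.default_rng(0)
data_a = rng.normal(size=(200, 16))
data_b = rng.normal(loc=0.5, size=(150, 16))
labels_a = rng.integers(0, 10, size=200)
labels_b = rng.integers(0, 10, size=150)

mean, components = fit_shared_pca([data_a, data_b])
emb_a = project(data_a, mean, components)
emb_b = project(data_b, mean, components)
```

Calling `plot_comparison([emb_a, emb_b], [labels_a, labels_b], ["A", "B"])` would then draw both datasets in one plot. Fitting the projection once on all datasets, rather than once per dataset, is what keeps the positions comparable across datasets in this sketch.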

Author information

Authors and Affiliations

Ochanomizu University, Tokyo, Japan
Karen Kosaka & Takayuki Itoh

Corresponding author

Correspondence to Takayuki Itoh.

Editor information

Editors and Affiliations

Dept. of Computer Science, Central Washington University, Ellensburg, WA, USA
Boris Kovalerchuk
Darmstadt University of Applied Sciences, Darmstadt, Germany
Kawa Nazemi
Dept. of Computer Science, Central Washington University, Ellensburg, WA, USA
Răzvan Andonie
ISEL, Polytechnic Institute of Lisbon, Lisboa, Portugal
Nuno Datia
Department of Informatics, London South Bank University, London, UK
Ebad Bannissi

About this chapter

Cite this chapter

Kosaka, K., Itoh, T. (2024). A Method for Comparative Visualization of Labeled Multidimensional Data and Its Application to Machine Learning Data. In: Kovalerchuk, B., Nazemi, K., Andonie, R., Datia, N., Bannissi, E. (eds) Artificial Intelligence and Visualization: Advancing Visual Knowledge Discovery. Studies in Computational Intelligence, vol 1126. Springer, Cham. https://doi.org/10.1007/978-3-031-46549-9_9

DOI: https://doi.org/10.1007/978-3-031-46549-9_9
Published: 25 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46548-2
Online ISBN: 978-3-031-46549-9
eBook Packages: Intelligent Technologies and Robotics (R0)
