A Method for Comparative Visualization of Labeled Multidimensional Data and Its Application to Machine Learning Data

Artificial Intelligence and Visualization: Advancing Visual Knowledge Discovery

Part of the book series: Studies in Computational Intelligence (SCI, volume 1126)

Abstract

This chapter proposes a method for visualizing differences among labeled multidimensional datasets. The method places multiple multidimensional datasets in the same screen space by applying the same dimensionality reduction scheme to each, and then draws the samples of each label as a semi-transparent group in that label's own color. This representation makes it easy to see which labels are most similar or most different across the datasets, and which labels tend to produce outliers. The chapter presents an application example using MNIST, USPS, and CIFAR10, which are representative sample datasets for machine learning, and discusses the effectiveness and limitations of the proposed method.
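The idea of a shared projection with label-colored, semi-transparent samples can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the dimensionality reduction scheme, so PCA is used here purely as a stand-in, and the function names, the synthetic data, the palette, and the alpha value are all hypothetical.

```python
import numpy as np


def fit_shared_pca(datasets, n_components=2):
    """Fit one PCA basis on the row-wise concatenation of all datasets.

    PCA is an assumed stand-in for the unspecified reduction scheme.
    """
    stacked = np.vstack(datasets)
    mean = stacked.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(stacked - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(data, mean, components):
    """Map samples into the shared low-dimensional screen space."""
    return (data - mean) @ components.T


def label_colors(labels, palette, alpha=0.3):
    """Return one RGBA row per sample: the label's color plus a fixed alpha."""
    return np.array([palette[int(label)] + (alpha,) for label in labels])


# Synthetic stand-ins for two labeled datasets (3 labels, 16 dimensions).
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 16))


def make_dataset(shift, n_per_label=50):
    labels = np.repeat(np.arange(3), n_per_label)
    noise = rng.normal(size=(labels.size, 16))
    return centers[labels] + shift + noise, labels


data_a, labels_a = make_dataset(shift=0.0)
data_b, labels_b = make_dataset(shift=0.5)

# The same projection is applied to both datasets so positions are comparable.
mean, components = fit_shared_pca([data_a, data_b])
xy_a = project(data_a, mean, components)
xy_b = project(data_b, mean, components)

# Hypothetical palette: one RGB triple per label.
palette = {0: (0.89, 0.10, 0.11), 1: (0.22, 0.49, 0.72), 2: (0.30, 0.69, 0.29)}
rgba_a = label_colors(labels_a, palette)
rgba_b = label_colors(labels_b, palette)
```

The resulting coordinates and RGBA rows could then be handed to any scatter-plot routine that accepts per-point colors, for example matplotlib's `scatter(x, y, c=rgba)`, drawing both datasets on the same axes. Fitting the basis on the concatenated data is one way to keep the datasets in a common space; other choices are possible.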


Notes

  1. http://yann.lecun.com/exdb/mnist/

  2. https://paperswithcode.com/dataset/usps

Corresponding author

Correspondence to Takayuki Itoh.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Kosaka, K., Itoh, T. (2024). A Method for Comparative Visualization of Labeled Multidimensional Data and Its Application to Machine Learning Data. In: Kovalerchuk, B., Nazemi, K., Andonie, R., Datia, N., Bannissi, E. (eds) Artificial Intelligence and Visualization: Advancing Visual Knowledge Discovery. Studies in Computational Intelligence, vol 1126. Springer, Cham. https://doi.org/10.1007/978-3-031-46549-9_9
