Agreement Analysis of Quality Measures for Dimensionality Reduction

Conference paper
Part of the Mathematics and Visualization book series (MATHVISUAL)

Abstract

High-dimensional data sets commonly occur in various application domains. They are often analysed using dimensionality reduction methods, such as principal component analysis or multidimensional scaling. To determine the reliability of a particular embedding of a data set, users need to analyse its quality. For this purpose, the literature offers numerous quality measures. Most of these measures concentrate on a single aspect, such as the preservation of relative distances, while others aim to balance multiple aspects, such as intrusions and extrusions in k-neighbourhoods. Because these quality measures have different ranges and different value distributions, it is challenging to decide which aspects of a data set an embedding preserves best. We propose an algorithm based on persistent homology that permits the comparative analysis of different quality measures on a given embedding, regardless of their ranges. Our method ranks quality measures and provides local feedback about which aspects of a data set are preserved by an embedding in certain areas. We demonstrate the use of our technique by analysing quality measures on different embeddings of synthetic and real-world data sets.
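
To make the notion of a local quality measure concrete, the following minimal sketch computes, for each point, the fraction of its k nearest neighbours that are the same in the original space and in the embedding. It is only an illustration of the kind of k-neighbourhood measure mentioned above, not the persistent-homology algorithm proposed in the paper. The function names `k_nearest` and `neighbourhood_preservation`, the toy data, and the one-dimensional embedding are all assumptions made for this example.

```python
import math


def k_nearest(points, i, k):
    """Return the indices of the k nearest neighbours of points[i] (Euclidean), excluding i."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()
    return {j for _, j in dists[:k]}


def neighbourhood_preservation(high, low, k):
    """Per-point fraction of k nearest neighbours shared by the original space and the embedding."""
    return [
        len(k_nearest(high, i, k) & k_nearest(low, i, k)) / k
        for i in range(len(high))
    ]


# Toy data (hypothetical): four 3-D points and a 1-D embedding in which
# the last point has been placed next to the second one.
high = [(0, 0, 0), (1, 0, 0), (0, 5, 0), (1, 5, 0)]
low = [(0.0,), (1.0,), (5.0,), (1.5,)]

scores = neighbourhood_preservation(high, low, k=1)
print(scores)  # [1.0, 0.0, 1.0, 0.0]
```

A score of 1.0 means that the neighbourhood of a point is fully preserved, while 0.0 means that none of its original neighbours remain. Values of this kind are bounded in [0, 1], whereas distance-based measures may be unbounded, which illustrates why comparing measures with different ranges is not straightforward.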

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. IWR, Heidelberg University, Heidelberg, Germany
