
Using Discriminative Dimensionality Reduction to Visualize Classifiers

Published in: Neural Processing Letters

Abstract

Although automated classifiers are a standard tool in many application areas, there is hardly a generic way to directly inspect their behavior beyond the mere classification of (sets of) data points. In this contribution, we propose a general framework for visualizing a given classifier and its behavior on a given data set in two dimensions. More specifically, we use modern nonlinear dimensionality reduction (DR) techniques to project a given set of data points together with their relation to the classification decision boundaries. Furthermore, since data are usually intrinsically more than two-dimensional and hence cannot be projected to two dimensions without information loss, we propose to use discriminative DR methods, which shape the projection according to a given class labeling, as is available in a classification setting. Given a data set, this framework can be used to visualize any trained classifier that provides a probability or certainty of the classification together with the predicted class label. We demonstrate the suitability of the framework for different dimensionality reduction techniques, for different attention foci of the visualization, and for different classifiers to be visualized.
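The pipeline described in the abstract (project the data to 2D, sample the projection plane, map samples back to the data space, and color the plane by the classifier's certainty) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: PCA stands in for a discriminative DR method (PCA has an exact inverse mapping, which nonlinear DR methods would have to approximate), and an SVM stands in for an arbitrary classifier with certainty outputs.

```python
# Sketch of the visualization pipeline: project data to 2D, sample a grid in
# the projection plane, map the grid back to data space, and evaluate the
# classifier's certainty there to obtain a 2D "decision landscape".
# PCA and SVC are stand-ins (assumptions), not the paper's specific choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Any trained classifier providing a certainty per prediction would do.
clf = SVC(probability=True, random_state=0).fit(X, y)

# Project to 2D; a discriminative DR method would shape this by the labels.
proj = PCA(n_components=2).fit(X)
Z = proj.transform(X)

# Sample a regular grid in the 2D projection plane.
xs = np.linspace(Z[:, 0].min(), Z[:, 0].max(), 50)
ys = np.linspace(Z[:, 1].min(), Z[:, 1].max(), 50)
grid = np.array([[gx, gy] for gy in ys for gx in xs])

# Map grid points back to the data space (exact for PCA; approximate inverse
# mappings are needed for nonlinear DR).
X_back = proj.inverse_transform(grid)

# Classifier certainty at each grid point; contour-plotting this array over
# the projected data reveals the decision boundaries in 2D.
cert = clf.predict_proba(X_back).max(axis=1).reshape(len(ys), len(xs))
print(cert.shape)  # (50, 50)
```

Plotting `cert` as a contour map beneath the scatter of `Z` then shows where the classifier is confident and where the projected decision boundary runs.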

Figs. 1–19 (images not included in this extract)


Notes

  1. We use the estimator \(\hat{h}_{rot}\) provided in the literature to specify this parameter, see e.g. [36].
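The rule-of-thumb estimator \(\hat{h}_{rot}\) referenced here is, in the kernel density estimation literature surveyed in [36], commonly Silverman's bandwidth \(h = 1.06\,\hat{\sigma}\,n^{-1/5}\) for a Gaussian kernel; whether the authors use exactly this constant is an assumption, but the following sketch shows the standard form.

```python
# Hedged sketch of the rule-of-thumb bandwidth h_rot for Gaussian kernel
# density estimation (Silverman's rule); the exact variant used by the
# authors is an assumption based on the cited survey [36].
import numpy as np

def h_rot(x):
    """Rule-of-thumb bandwidth: 1.06 * sample std * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    return 1.06 * x.std(ddof=1) * len(x) ** (-1 / 5)

rng = np.random.RandomState(0)
print(h_rot(rng.normal(size=1000)))
```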

References

  1. Aupetit M, Catz T (2005) High-dimensional labeled data analysis with topology representing graphs. Neurocomputing 63:139–169


  2. Baudat G, Anouar F (2000) Generalized discriminant analysis using a kernel approach. Neural Comput 12:2385–2404


  3. Bunte K, Biehl M, Hammer B (2012) A general framework for dimensionality reducing data visualization mapping. Neural Comput 24(3):771–804


  4. Bunte K, Schneider P, Hammer B, Schleif F-M, Villmann T, Biehl M (2012) Limited rank matrix learning, discriminative dimension reduction and visualization. Neural Netw 26:159–173


  5. Caragea D, Cook D, Wickham H, Honavar V (2008) Visual methods for examining SVM classifiers. In: Simoff SJ, Böhlen MH, Mazeika A (eds) Visual data mining: theory, techniques and tools for visual analytics (Lecture Notes in Computer Science), vol 4404. Springer, pp 136–153

  6. Chang C-C, Lin C-J (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:27:1–27:27. http://www.csie.ntu.edu.tw/cjlin/libsvm. Accessed 1 July 2012

  7. Cohn D (2003) Informed projections. In: Becker S, Thrun S, Obermayer K (eds) NIPS. MIT Press, Cambridge, pp 849–856


  8. Dhillon IS, Modha DS, Spangler WS (2002) Class visualization of high-dimensional data with applications. Comput Stat Data Anal 41(1):59–90


  9. dos Santos Amorim EP, Brazil EV, II JD, Joia P, Nonato LG, Sousa MC (2012) iLAMP: exploring high-dimensional spacing through backward multidimensional projection. In: IEEE VAST, IEEE Computer Society, pp 53–62

  10. Frank A, Asuncion A (2010) UCI machine learning repository. http://archive.ics.uci.edu/ml. Accessed 1 July 2012

  11. Gisbrecht A, Hammer B Data visualization by nonlinear dimensionality reduction. WIREs Data Min Knowl Discov

  12. Gisbrecht A, Hofmann D, Hammer B (2012) Discriminative dimensionality reduction mappings. In: Hollmén J, Klawonn F, Tucker A (eds) IDA (Lecture Notes in Computer Science), Springer, pp 126–138

  13. Gisbrecht A, Schulz A, Hammer B (2015) Parametric nonlinear dimensionality reduction using kernel t-SNE. Neurocomputing 147:71–82 (special issue: selected papers from the Workshop on Self-Organizing Maps 2012, WSOM 2012)

  14. Goldberger J, Roweis S, Hinton G, Salakhutdinov R (2004) Neighbourhood components analysis. In: Advances in neural information processing systems vol 17. MIT Press, pp 513–520

  15. Hammer B, Hasenfuss A (2010) Topographic mapping of large dissimilarity datasets. Neural Comput 22(9):2229–2284


  16. Hammer B, Hofmann D, Schleif F-M, Zhu X (2013) Learning vector quantization for (dis-)similarities. Neurocomputing 131:43–51. doi:10.1016/j.neucom.2013.05.054

  17. Hernandez-Orallo J, Flach P, Ferri C (2011) Brier curves: a new cost-based visualisation of classifier performance. In: International conference on machine learning

  18. Hofmann D, Schleif F-M, Hammer B (2013) Learning interpretable kernelized prototype-based models. Neurocomputing 141:84–96. doi:10.1016/j.neucom.2014.03.003

  19. House TW (2012) Big data research and development initiative. http://www.whitehouse.gov/blog/2012/03/29/big-data-big-deal. Accessed 1 July 2012

  20. Jakulin A, Možina M, Demšar J, Bratko I, Zupan B (2005) Nomograms for visualizing support vector machines. In: Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining, KDD ’05. NY, USA, ACM, New York, pp 108–117

  21. Kohonen T, Hynninen J, Kangas J, Laaksonen J, Torkkola K (1996) LVQ_PAK: the learning vector quantization program package. Report A30, Helsinki University of Technology, Laboratory of Computer and Information Science

  22. Kothari R, Dong M (2001) Decision trees for classification: a review and some new results. Pattern Recognit 171:169–184


  23. Kreßel UH-G (1999) Pairwise classification and support vector machines. In: Thompson JG (ed) Advances in kernel methods. MIT Press, Cambridge


  24. Lee JA, Verleysen M (2007) Nonlinear dimensionality reduction. Springer, New York


  25. Ma B, Qu H, Wong H (2007) Kernel clustering-based discriminant analysis. Pattern Recognit 40(1):324–327


  26. Melnik O (2002) Decision region connectivity analysis: a method for analyzing high-dimensional classifiers. Mach Learn 48(1–3):321–351


  27. Otte C (2013) Safe and interpretable machine learning: a methodological review. In: Moewes C, Nürnberger A (eds) Computational intelligence in intelligent data analysis. Studies in computational intelligence. Springer, Berlin, Heidelberg, pp 111–122


  28. Peltonen J, Klami A, Kaski S (2004) Improved learning of riemannian metrics for exploratory analysis. Neural Netw 17:1087–1100


  29. Poulet F (2005) Visual SVM. In: Chen C-S, Filipe J, Seruca I, Cordeiro J (eds) ICEIS 2:309–314

  30. Roweis S (2012) Machine learning data sets. http://www.cs.nyu.edu/~roweis/data.html. Accessed 1 July 2012

  31. Rüping S (2006) Learning interpretable models. PhD thesis, Dortmund University

  32. Schneider P, Biehl M, Hammer B (2009) Adaptive relevance matrices in learning vector quantization. Neural Comput 21:3532–3561


  33. Schulz A, Gisbrecht A, Hammer B (2013) Using nonlinear dimensionality reduction to visualize classifiers. In: Rojas I, Caparrós GJ, Cabestany J (eds) IWANN (1) (Lecture Notes in Computer Science), vol 7902. Springer, pp 59–68

  34. Seo S, Obermayer K (2003) Soft learning vector quantization. Neural Comput 15(7):1589–1604


  35. Simoff SJ, Böhlen MH, Mazeika A (eds) (2008) Visual data mining: theory, techniques and tools for visual analytics (Lecture Notes in Computer Science), vol 4404. Springer

  36. Turlach BA (1993) Bandwidth selection in kernel density estimation: a review. In: CORE and Institut de Statistique, pp 23–493

  37. van der Maaten L (2013) Barnes-Hut-SNE. CoRR abs/1301.3342

  38. van der Maaten L, Hinton G (2008) Visualizing high-dimensional data using t-SNE. J Mach Learn Res 9:2579–2605


  39. van der Maaten L, Postma E, van den Herik H (2009) Dimensionality reduction: a comparative review. Technical report, Tilburg University Technical Report, TiCC-TR 2009–005

  40. Vapnik VN (1995) The nature of statistical learning theory. Springer-Verlag New York Inc, New York


  41. Vellido A, Martin-Guerroro J, Lisboa P (2012) Making machine learning models interpretable. In: ESANN’12

  42. Venna J, Peltonen J, Nybo K, Aidos H, Kaski S (2010) Information retrieval perspective to nonlinear dimensionality reduction for data visualization. J Mach Learn Res 11:451–490


  43. Wang X, Wu S, Wang X, Li Q (2006) Svmv - a novel algorithm for the visualization of svm classification results. In: Wang J, Yi Z, Zurada J, Lu B-L, Yin H (eds) Advances in neural networks: ISNN 2006 (Lecture Notes in Computer Science), vol 3971. Berlin/Heidelberg, Springer, pp 968–973

  44. Ward M, Grinstein G, Keim DA (2010) Interactive data visualization: foundations, techniques, and applications. A K Peters Ltd, Natick


  45. Yang Z, Peltonen J, Kaski S (2013) Scalable optimization of neighbor embedding for visualization. In: Dasgupta S, Mcallester D (eds) Proceedings of the 30th International Conference on Machine Learning (ICML-13), vol 28, pp 127–135. JMLR Workshop and Conference Proceedings


Acknowledgments

Funding by the DFG under grant number HA2719/7-1 and by the CITEC center of excellence is gratefully acknowledged. We would also like to thank the reviewers for many helpful comments and ideas concerning the evaluation.

Author information


Correspondence to Alexander Schulz.


Cite this article

Schulz, A., Gisbrecht, A. & Hammer, B. Using Discriminative Dimensionality Reduction to Visualize Classifiers. Neural Process Lett 42, 27–54 (2015). https://doi.org/10.1007/s11063-014-9394-1
