Discriminative Dimensionality Reduction for the Visualization of Classifiers

  • Andrej Gisbrecht
  • Alexander Schulz
  • Barbara Hammer
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 318)

Abstract

Modern nonlinear dimensionality reduction offers powerful techniques to directly inspect high-dimensional data in the plane. Since the task of data projection is generally ill-posed and information loss cannot be avoided while projecting, the quality and meaningfulness of the outcome are not clear. In this contribution, we argue that discriminative dimensionality reduction, i.e. the concept of enhancing a dimensionality reduction technique with supervised label information, offers a principled way to shape the outcome of a dimensionality reduction technique. We demonstrate the capacity of this approach on benchmark data sets. In addition, based on discriminative dimensionality reduction, we propose a pipeline to visualize the function of general nonlinear classifiers in the plane. We demonstrate this approach by providing a generic visualization of the function of support vector machine classifiers.
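The pipeline described in the abstract can be sketched roughly as follows: learn a label-informed (discriminative) projection of the data to 2D, train the classifier in the original space, and shade the plane by evaluating the classifier on points mapped back from 2D to the data space. The snippet below is a minimal sketch of this idea using scikit-learn; it substitutes Neighbourhood Components Analysis for the Fisher-metric-based projection and a k-nearest-neighbour regressor for the inverse mapping, and uses the Iris data set. These choices are illustrative stand-ins, not the constructions used in the paper.

```python
# Minimal sketch (not the paper's exact method): a discriminative 2-D projection
# plus an approximate inverse mapping, used to shade the plane with the output
# of an SVM trained in the original data space.
# Assumptions: NCA stands in for the Fisher-metric projection, k-NN regression
# stands in for the paper's inverse mapping.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

# 1) Discriminative dimensionality reduction: project to 2-D using the labels.
nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=0)
Z = nca.fit_transform(X, y)

# 2) Train the classifier to be visualized in the original space.
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)

# 3) Approximate inverse mapping from the 2-D plane back to the data space.
inv = KNeighborsRegressor(n_neighbors=5).fit(Z, X)

# 4) Sample a grid in the plane, map it back, and evaluate the classifier there.
xx, yy = np.meshgrid(
    np.linspace(Z[:, 0].min() - 1, Z[:, 0].max() + 1, 200),
    np.linspace(Z[:, 1].min() - 1, Z[:, 1].max() + 1, 200),
)
grid_2d = np.c_[xx.ravel(), yy.ravel()]
labels = clf.predict(inv.predict(grid_2d)).reshape(xx.shape)

# 5) Shade the induced class regions and overlay the projected data points.
plt.contourf(xx, yy, labels, alpha=0.3)
plt.scatter(Z[:, 0], Z[:, 1], c=y, edgecolors="k")
plt.title("SVM decision regions seen through a discriminative 2-D projection")
plt.show()
```

In the paper itself, the projection is nonlinear (based on the Fisher information metric) and the inverse mapping is constructed explicitly rather than approximated by nearest neighbours; the sketch only conveys the overall structure of the visualization pipeline.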

Keywords

Dimensionality reduction · Fisher information metric · Classifier visualization · Evaluation

Acknowledgments

Funding by the DFG under grant numbers HA 2719/7-1 and HA 2719/6-2, and by the CITEC centre of excellence, is gratefully acknowledged.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Andrej Gisbrecht (1)
  • Alexander Schulz (1)
  • Barbara Hammer (1)

  1. University of Bielefeld, CITEC Centre of Excellence, Bielefeld, Germany
