Visualizations of Deep Neural Networks in Computer Vision: A Survey

  • Chapter
Transparent Data Mining for Big and Small Data

Part of the book series: Studies in Big Data (SBD, volume 32)

Abstract

In recent years, Deep Neural Networks (DNNs) have been shown to outperform the previous state of the art in multiple areas, such as visual object recognition, genomics and speech recognition. Because they encode information in a distributed fashion, DNNs are hard to understand and interpret. To address this, visualizations have been used to understand how deep architectures work in general, what different layers of the network encode, what the limitations of the trained model are, and how to interactively collect user feedback. In this chapter, we provide a survey of visualizations of DNNs in the field of computer vision. We define a classification scheme describing visualization goals and methods as well as the application areas. This survey gives an overview of what can be learned from visualizing DNNs and which visualization methods were used to gain which insights. We found that most papers use Pixel Displays to show neuron activations. Recently, however, more sophisticated visualizations such as interactive node-link diagrams have been proposed. The presented overview can serve as a guideline when applying visualizations while designing DNNs.

Notes

  1. Tools available at http://yosinski.com/deepvis and https://github.com/bruckner/deepViz, last accessed 2016-09-08.

Author information

Corresponding author

Correspondence to Christin Seifert.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Seifert, C. et al. (2017). Visualizations of Deep Neural Networks in Computer Vision: A Survey. In: Cerquitelli, T., Quercia, D., Pasquale, F. (eds) Transparent Data Mining for Big and Small Data. Studies in Big Data, vol 32. Springer, Cham. https://doi.org/10.1007/978-3-319-54024-5_6

  • DOI: https://doi.org/10.1007/978-3-319-54024-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54023-8

  • Online ISBN: 978-3-319-54024-5

  • eBook Packages: Engineering, Engineering (R0)
