Abstract
There are good arguments that deep neural networks (DNNs) learn better feature representations than hand-crafted feature engineering, which leads to significant performance improvements. In this paper, we take a small step towards understanding how feature representations evolve across layers. Specifically, we model the class separation of intermediate representations in pre-trained DNNs as the evolution of communities in dynamic graphs. We then introduce modularity, a generic metric from graph theory, to quantify the evolution of these communities. In a preliminary experiment, we find that modularity tends to increase with layer depth, and that degradation and plateaus arise when the model complexity is high relative to the dataset. Through an asymptotic analysis, we show that modularity can be used broadly across applications. For example, modularity provides new insights for quantifying the difference between feature representations. More crucially, we demonstrate that degradation and plateaus in modularity curves indicate redundant layers in DNNs, and that these layers can be pruned with minimal impact on performance, which provides theoretical guidance for layer pruning. Our code is available at https://github.com/yaolu-zjut/Dynamic-Graphs-Construction.
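To make the idea of modularity over layer-wise graphs concrete, here is a minimal sketch. The abstract does not say how the graphs are built, so the construction below is an assumption: each sample is a node, edges connect each sample to its k nearest neighbours by cosine similarity, and class labels serve as the communities. The helper names `knn_graph` and `modularity`, the value of `k`, and the toy "shallow" and "deep" features are all hypothetical and are not taken from the authors' repository. The modularity formula itself is the standard Newman definition.

```python
import numpy as np


def knn_graph(features, k=3):
    """Symmetric, unweighted k-NN adjacency from cosine similarity (assumed construction)."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)               # a node is never its own neighbour
    nearest = np.argsort(-sim, axis=1)[:, :k]    # k most similar samples per row
    adj = np.zeros_like(sim)
    rows = np.repeat(np.arange(len(x)), k)
    adj[rows, nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)                # make the graph undirected


def modularity(adj, labels):
    """Newman modularity: Q = (1/2m) * sum_ij (A_ij - k_i*k_j/2m) * [c_i == c_j]."""
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()
    same_class = labels[:, None] == labels[None, :]
    expected = np.outer(degrees, degrees) / two_m
    return float(((adj - expected) * same_class).sum() / two_m)


# Toy stand-ins for two layers (hypothetical data, not real DNN activations).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
shallow = rng.normal(size=(40, 8))               # classes overlap completely
deep = shallow + 10.0 * np.eye(8)[labels]        # each class shifted along its own axis

q_shallow = modularity(knn_graph(shallow), labels)
q_deep = modularity(knn_graph(deep), labels)
print(f"modularity, shallow layer: {q_shallow:.3f}")
print(f"modularity, deep layer:    {q_deep:.3f}")
```

Under these assumptions, the better-separated "deep" features should yield the higher modularity, which mirrors the trend the abstract reports. Computing the same quantity for every layer of a real network would give a modularity curve of the kind the abstract refers to.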
Acknowledgements
This work was supported in part by the Key R&D Program of Zhejiang under Grant 2022C01018, by the National Natural Science Foundation of China under Grants U21B2001, 61973273, 62072406, 11931015, U1803263, by the Zhejiang Provincial Natural Science Foundation of China under Grant LR19F030001, by the National Science Fund for Distinguished Young Scholars under Grant 62025602, by the Fok Ying-Tong Education Foundation, China under Grant 171105, and by the Tencent Foundation and XPLORER PRIZE. We also sincerely thank Jinhuan Wang, Zhuangzhi Chen and Shengbo Gong for their excellent suggestions.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lu, Y. et al. (2022). Understanding the Dynamics of DNNs Using Graph Modularity. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13672. Springer, Cham. https://doi.org/10.1007/978-3-031-19775-8_14
DOI: https://doi.org/10.1007/978-3-031-19775-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19774-1
Online ISBN: 978-3-031-19775-8
eBook Packages: Computer Science (R0)