Deep neural networks (DNNs) are widely argued to capture better feature representations than hand-crafted feature engineering, which leads to significant performance improvements. In this paper, we take a small step towards understanding the dynamics of feature representations over layers. Specifically, we model the process of class separation of intermediate representations in pre-trained DNNs as the evolution of communities in dynamic graphs. We then introduce modularity, a generic metric in graph theory, to quantify the evolution of communities. In preliminary experiments, we find that modularity tends to increase as the layers go deeper, and that degradation and plateaus arise when the model complexity is high relative to the dataset. Through an asymptotic analysis, we show that modularity can be broadly used for different applications. For example, modularity provides new insights for quantifying the difference between feature representations. More crucially, we demonstrate that the degradation and plateaus in modularity curves indicate redundant layers in DNNs, which can be pruned with minimal impact on performance; this provides theoretical guidance for layer pruning. Our code is available at https://github.com/yaolu-zjut/Dynamic-Graphs-Construction.
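As a rough illustration of the idea (not the implementation in the repository above), the following sketch builds a k-nearest-neighbour graph over the features of one layer and uses Newman's modularity to score how well the class labels form communities in that graph. The cosine-similarity k-NN construction, the helper names `knn_graph` and `modularity`, the choice `k=5`, and the synthetic "shallow" and "deep" features are all assumptions made for this example.

```python
import numpy as np


def knn_graph(features, k):
    """Symmetric, unweighted k-NN graph from cosine similarity (an assumed construction)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own neighbour
    n = sim.shape[0]
    nearest = np.argsort(-sim, axis=1)[:, :k]  # indices of the k most similar samples
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # make the graph undirected


def modularity(adj, labels):
    """Newman modularity: Q = (1/2m) * sum_ij (A_ij - k_i k_j / 2m) * [c_i == c_j]."""
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()
    same_class = labels[:, None] == labels[None, :]
    expected = np.outer(degrees, degrees) / two_m
    return float(((adj - expected) * same_class).sum() / two_m)


# Synthetic stand-ins for two layers of a network (hypothetical data):
# "shallow" features carry no class signal, "deep" features are well separated.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 30)
class_means = 10.0 * np.eye(16)[:2]
shallow = rng.normal(size=(60, 16))
deep = rng.normal(size=(60, 16)) + class_means[labels]

q_shallow = modularity(knn_graph(shallow, k=5), labels)
q_deep = modularity(knn_graph(deep, k=5), labels)
print(f"modularity (shallow): {q_shallow:.3f}")
print(f"modularity (deep):    {q_deep:.3f}")
```

Under these assumptions, the separated "deep" features should score a higher modularity than the unstructured "shallow" ones, mirroring the trend described above. Evaluating the same quantity layer by layer on a real pre-trained model would yield a modularity curve.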
This work was supported in part by the Key R&D Program of Zhejiang under Grant 2022C01018, by the National Natural Science Foundation of China under Grants U21B2001, 61973273, 62072406, 11931015, U1803263, by the Zhejiang Provincial Natural Science Foundation of China under Grant LR19F030001, by the National Science Fund for Distinguished Young Scholars under Grant 62025602, by the Fok Ying-Tong Education Foundation, China under Grant 171105, and by the Tencent Foundation and XPLORER PRIZE. We also sincerely thank Jinhuan Wang, Zhuangzhi Chen and Shengbo Gong for their excellent suggestions.
