Abstract
Overfitting reduces the generalizability of convolutional neural networks (CNNs). It is typically detected by comparing the accuracies and losses on the training and validation data, where the validation set is held out from the training data; however, such detection methods are ineffective for pretrained networks distributed without their training data. In this paper, we therefore propose a method, inspired by the dropout technique, that detects overfitting of CNNs using only the trained network weights. Dropout prevents CNNs from overfitting by randomly invalidating neurons during training. It has been hypothesized that dropout works by restraining co-adaptations among neurons; this hypothesis implies that overfitting results from co-adaptations among neurons and can therefore be detected by investigating the inner representation of a CNN. The proposed persistent homology-based overfitting measure (PHOM) constructs clique complexes on CNNs from the trained network weights and investigates co-adaptations among neurons via one-dimensional persistent homology. In addition, we extend PHOM to the normalized PHOM (NPHOM) to mitigate fluctuations in PHOM caused by differences in network structure. We applied the proposed methods to CNNs trained for classification on the CIFAR-10, Street View House Numbers (SVHN), Tiny ImageNet, and CIFAR-100 datasets. Experimental results demonstrate that PHOM and NPHOM indicate the degree of overfitting of CNNs, suggesting that these measures enable us to filter out overfitted CNNs without requiring the training data.
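The weight-based filtration underlying the abstract can be illustrated with a minimal, self-contained sketch. Everything here is our own illustration, not the paper's implementation: the function name, the toy edge weights, and the restriction to zero-dimensional persistence (component merges tracked with union-find) are assumptions made to keep the example dependency-free. The actual PHOM method computes one-dimensional persistent homology over clique complexes, for which a library such as GUDHI would be used.

```python
def zero_dim_persistence(num_neurons, weighted_edges):
    """Sweep edges in order of decreasing |weight| (edges with larger
    trained-weight magnitude enter the filtration earlier) and record
    the filtration value at which two connected components merge."""
    parent = list(range(num_neurons))

    def find(x):
        # Find the component root, with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Larger |w| enters the filtration first, so sort descending.
    edges = sorted(weighted_edges, key=lambda e: -abs(e[2]))
    merge_values = []  # filtration values where components merge
    for u, v, w in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            merge_values.append(abs(w))
        # An edge whose endpoints are already connected closes a loop,
        # which is what one-dimensional homology would track instead.
    return merge_values

# Toy example: 4 neurons, edges labeled with trained-weight magnitudes.
edges = [(0, 1, 0.9), (1, 2, 0.5), (2, 3, 0.8), (0, 3, 0.1)]
print(zero_dim_persistence(4, edges))  # → [0.9, 0.8, 0.5]
```

The edge (0, 3) creates a cycle rather than a merge; cycles of this kind are exactly the features that the one-dimensional persistent homology in PHOM measures.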
Availability of data and materials
The datasets are available as referenced in the paper.
Notes
We used the notation from https://towardsdatascience.com/convolutional-neural-networks-mathematics-1beb3e6447c0 with modifications based on our understanding.
The source code and models used in the evaluation can be accessed at https://github.com/satoru-watanabe-aw/phom/.
Acknowledgements
We are grateful to Hitachi, Ltd. for the tuition subsidy. The funder had no role in either the study design or the technical investigation in this paper.
Funding
This research did not receive any funding support.
Author information
Contributions
The authors proposed a persistent homology-based overfitting measure that detects the overfitting of convolutional neural networks without relying on the training data.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest or competing interests regarding this paper.
Code availability
The source code is available as referenced in the paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Watanabe, S., Yamana, H. Overfitting measurement of convolutional neural networks using trained network weights. Int J Data Sci Anal 14, 261–278 (2022). https://doi.org/10.1007/s41060-022-00332-1