Overfitting measurement of convolutional neural networks using trained network weights

  • Regular Paper
  • Published in: International Journal of Data Science and Analytics

Abstract

Overfitting reduces the generalizability of convolutional neural networks (CNNs). Overfitting is generally detected by comparing the accuracies and losses on the training and validation data, where the validation data are formed from a portion of the training data; however, such detection methods are inapplicable to pretrained networks distributed without their training data. In this paper, we therefore propose a method, inspired by the dropout technique, to detect overfitting of CNNs using only the trained network weights. Dropout prevents CNNs from overfitting by randomly invalidating neurons during training. It has been hypothesized that dropout works by restraining co-adaptations among neurons, which implies that the overfitting of CNNs results from such co-adaptations and can be detected by investigating the inner representation of CNNs. The proposed persistent homology-based overfitting measure (PHOM) constructs clique complexes from CNNs using the trained network weights and investigates co-adaptations among neurons via one-dimensional persistent homology. In addition, we extend PHOM to the normalized PHOM (NPHOM) to mitigate fluctuations in PHOM caused by differences in network structure. We applied the proposed methods to CNNs trained for classification on the CIFAR-10, Street View House Numbers (SVHN), Tiny ImageNet, and CIFAR-100 datasets. Experimental results demonstrate that PHOM and NPHOM indicate the degree of overfitting of CNNs, suggesting that these methods enable us to filter out overfitted CNNs without requiring the training data.
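
To make the construction concrete, the sketch below illustrates one plausible reading of a PHOM-style pipeline: edge weights between neurons are converted into distances (larger absolute weights give smaller distances), a clique (Vietoris–Rips/flag) complex is built over the neurons, and the total lifetime of the one-dimensional persistence intervals serves as the score. The distance transform, the restriction to a single fully connected layer, the summed-lifetime scoring, and the function name `phom_style_score` are our assumptions for illustration, not the exact construction in the paper; the authors' reference implementation is linked in the Notes below.

```python
# A minimal, hypothetical sketch of a PHOM-style score (not the authors' exact
# construction). Assumptions: one fully connected layer's weight matrix defines
# neuron-to-neuron connection strengths, larger |weight| means "closer" neurons,
# and the score is the total lifetime of one-dimensional persistence intervals.
import numpy as np
import gudhi  # pip install gudhi


def phom_style_score(weights: np.ndarray, max_edge_length: float = 1.0) -> float:
    """Compute a PHOM-like score from a layer's weight matrix of shape (out, in)."""
    n_out, n_in = weights.shape
    n = n_in + n_out

    # Turn connection strengths into distances in [0, 1]; unconnected neuron pairs
    # (same-layer pairs) get a distance above max_edge_length so no edge is created.
    strength = np.abs(weights) / (np.abs(weights).max() + 1e-12)
    dist = np.full((n, n), 2.0)
    np.fill_diagonal(dist, 0.0)
    dist[:n_in, n_in:] = 1.0 - strength.T   # input-neuron to output-neuron distances
    dist[n_in:, :n_in] = 1.0 - strength     # symmetric counterpart

    # Vietoris-Rips complex (the clique/flag complex of the thresholded graph),
    # built up to dimension 2 so that one-dimensional holes can be detected and filled.
    lower_tri = [[float(dist[i, j]) for j in range(i)] for i in range(n)]
    rips = gudhi.RipsComplex(distance_matrix=lower_tri, max_edge_length=max_edge_length)
    simplex_tree = rips.create_simplex_tree(max_dimension=2)
    simplex_tree.persistence()  # compute persistent homology over all dimensions

    # Sum of (death - birth) over one-dimensional intervals as the overfitting indicator.
    intervals = simplex_tree.persistence_intervals_in_dimension(1)
    return float(sum(d - b for b, d in intervals if np.isfinite(d)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_weights = rng.normal(size=(64, 128))  # stands in for one trained layer
    print("PHOM-style score:", phom_style_score(toy_weights))
```

Under these assumptions, a larger score indicates more one-dimensional cycles with long lifetimes in the weight-induced clique complex, which the paper associates with stronger co-adaptations among neurons.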


Availability of data and materials

The datasets are available as referenced in the paper.

Notes

  1. We used the notation from https://towardsdatascience.com/convolutional-neural-networks-mathematics-1beb3e6447c0 with modifications based on our understanding.

  2. The source code and models used in the evaluation can be accessed at https://github.com/satoru-watanabe-aw/phom/.


Acknowledgements

We are grateful to Hitachi, Ltd. for the tuition subsidy. The funder had no role in the study design or the technical investigation reported in this paper.

Funding

This research did not receive any funding support.

Author information


Contributions

The authors proposed a persistent homology-based overfitting measure that detects the overfitting of convolutional neural networks without relying on the training data.

Corresponding author

Correspondence to Satoru Watanabe.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest or competing interests regarding this paper.

Code availability

The source code is available as referenced in the paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Watanabe, S., Yamana, H. Overfitting measurement of convolutional neural networks using trained network weights. Int J Data Sci Anal 14, 261–278 (2022). https://doi.org/10.1007/s41060-022-00332-1
