
The Mode-Fisher pooling for time complexity optimization in deep convolutional neural networks

  • Original Article
  • Published: Neural Computing and Applications

Abstract

In this paper, we aim to improve the performance, time complexity, and energy efficiency of deep convolutional neural networks (CNNs) by combining hardware and specialization techniques. Since the pooling step contributes significantly to CNN performance, we propose the Mode-Fisher pooling method. This form of pooling offers very promising results in terms of feature extraction performance. The proposed method significantly reduces data movement in the CNN and saves up to 10% of total energy, without any performance penalty.


Notes

  1. A computer file that contains an uncompressed image; it is not directly viewable by most computer systems.

  2. In the literature, the most commonly used filters do not exceed \(5 \times 5\).

  3. Max pooling: \(y = \max(x_{ij})\).

  4. Average pooling: \(y = \operatorname{mean}(x_{ij})\), where \(y\) is the output and \(i\) and \(j\) are the row and column indices of the pooling region. A short code sketch of both operators follows these notes.

  5. All data sets are summarized in Table 1.
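
To make notes 3 and 4 concrete, the following is a minimal NumPy sketch of standard max and average pooling over non-overlapping regions. It illustrates only the baseline pooling operators that the paper builds on, not the proposed Mode-Fisher method; the function and variable names are ours.

```python
import numpy as np

def pool2d(x, k=2, mode="max"):
    """Non-overlapping k x k pooling of a 2-D feature map x.

    mode="max"  implements note 3: y = max(x_ij)
    mode="mean" implements note 4: y = mean(x_ij)
    """
    h, w = x.shape
    # Trim the map so it divides evenly into k x k pooling regions.
    x = x[: h - h % k, : w - w % k]
    # Reshape so axes 1 and 3 index positions inside each region.
    regions = x.reshape(x.shape[0] // k, k, x.shape[1] // k, k)
    if mode == "max":
        return regions.max(axis=(1, 3))
    return regions.mean(axis=(1, 3))

# Example: a 4x4 feature map pooled down to 2x2.
fmap = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(fmap, 2, "max"))   # [[ 5.  7.] [13. 15.]]
print(pool2d(fmap, 2, "mean"))  # [[ 2.5  4.5] [10.5 12.5]]
```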


Acknowledgements

We thank the anonymous reviewers for their very useful comments and suggestions.

Author information


Corresponding author

Correspondence to Dou El Kefel Mansouri.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Mansouri, D.E.K., Kaddar, B., Benkabou, SE. et al. The Mode-Fisher pooling for time complexity optimization in deep convolutional neural networks. Neural Comput & Applic 33, 6443–6465 (2021). https://doi.org/10.1007/s00521-020-05406-4

