
The Mode-Fisher pooling for time complexity optimization in deep convolutional neural networks

  • Original Article
  • Published: Neural Computing and Applications

Abstract

In this paper, we aim to improve the performance, time complexity, and energy efficiency of deep convolutional neural networks (CNNs) by combining hardware and specialization techniques. Since the pooling step contributes significantly to CNN performance, we propose the Mode-Fisher pooling method. This form of pooling offers very promising results in terms of feature extraction performance. The proposed method significantly reduces data movement in the CNN and saves up to 10% of total energy, without any performance penalty.


Notes

  1. A computer file that contains an uncompressed image; it is not directly viewable by most computer systems.

  2. In the literature, the most commonly used filters do not exceed \(5 \times 5\).

  3. Max pooling: \(y = \max(x_{ij})\).

  4. Average pooling: \(y = \operatorname{mean}(x_{ij})\), where \(y\) is the output and \(i\) and \(j\) are the row and column indices of the pooling region. A short code sketch of both operators follows these notes.

  5. All data sets are summarized in Table 1.
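
To make notes 3 and 4 concrete, the following is a minimal NumPy sketch of standard max and average pooling over non-overlapping regions. It illustrates only the baseline pooling operators that the paper builds on, not the proposed Mode-Fisher method; the function and variable names are ours.

```python
import numpy as np

def pool2d(x, k=2, mode="max"):
    """Non-overlapping k x k pooling of a 2-D feature map x.

    mode="max"  implements note 3: y = max(x_ij)
    mode="mean" implements note 4: y = mean(x_ij)
    """
    h, w = x.shape
    # Trim the map so it divides evenly into k x k pooling regions.
    x = x[: h - h % k, : w - w % k]
    # Reshape so axes 1 and 3 index positions inside each region.
    regions = x.reshape(x.shape[0] // k, k, x.shape[1] // k, k)
    if mode == "max":
        return regions.max(axis=(1, 3))
    return regions.mean(axis=(1, 3))

# Example: a 4x4 feature map pooled down to 2x2.
fmap = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(fmap, 2, "max"))   # [[ 5.  7.] [13. 15.]]
print(pool2d(fmap, 2, "mean"))  # [[ 2.5  4.5] [10.5 12.5]]
```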


Acknowledgements

We thank the anonymous reviewers for their very useful comments and suggestions.

Author information


Corresponding author

Correspondence to Dou El Kefel Mansouri.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Mansouri, D.E.K., Kaddar, B., Benkabou, SE. et al. The Mode-Fisher pooling for time complexity optimization in deep convolutional neural networks. Neural Comput & Applic 33, 6443–6465 (2021). https://doi.org/10.1007/s00521-020-05406-4

