
Bi-linearly weighted fractional max pooling

An extension to conventional max pooling for deep convolutional neural network

Abstract

In this paper, we propose to extend the flexibility of the commonly used 2 × 2 non-overlapping max pooling in Convolutional Neural Networks. We name the method Bi-linearly Weighted Fractional Max-Pooling. It enables max pooling with a stride below 2 and is computed from four bi-linearly weighted neighboring input activations. In a conventional 2 × 2 non-overlapping max pooling operation, the spatial size is halved in both the x and y directions, so three-quarters of the activations in each feature map are discarded. Because this reduction is so abrupt, the number of such pooling operations within a Convolutional Neural Network is very limited: adding further pooling operations leaves too few activations for subsequent layers. With our proposed pooling method, the spatial size reduction can be more gradual and adjusted flexibly. We applied several combinations of the proposed pooling method to a 50-layer ResNet and a 19-layer VGGNet with a reduced number of filters, and evaluated them on the FGVC-Aircraft, Oxford-IIIT Pet, STL-10 and CIFAR-100 datasets. Even with reduced memory usage, the proposed method showed a reasonable improvement in classification accuracy with the 50-layer ResNet. Additionally, exploiting the flexibility of the proposed pooling method, we changed the reduction rate dynamically at every training iteration, and the evaluation results indicated a potential regularization effect.
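The abstract does not give the formal definition, so the following is only a minimal NumPy sketch of one plausible reading of the idea: each output location maps back to a fractional position in the input feature map, the four surrounding activations are weighted bilinearly, and the maximum of the four weighted values is kept. The function name, the `ratio` parameter, and the single-channel treatment are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def bilinear_weighted_fractional_max_pool(x, ratio=1.5):
    """Sketch of bi-linearly weighted fractional max pooling on one 2-D feature map.

    x     : 2-D array (H, W) of input activations.
    ratio : per-axis spatial reduction factor, may be fractional (1 < ratio <= 2);
            ratio = 2 corresponds to the usual halving of spatial size.

    NOTE: this follows one plausible reading of the paper's abstract and is not
    the authors' reference implementation.
    """
    h, w = x.shape
    out_h, out_w = int(np.floor(h / ratio)), int(np.floor(w / ratio))
    out = np.empty((out_h, out_w), dtype=x.dtype)

    for oy in range(out_h):
        for ox in range(out_w):
            # fractional sampling position in the input grid
            fy, fx = oy * ratio, ox * ratio
            y0, x0 = int(np.floor(fy)), int(np.floor(fx))
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = fy - y0, fx - x0

            # the four neighboring activations, each scaled by its bilinear weight
            candidates = [
                (1 - dy) * (1 - dx) * x[y0, x0],
                (1 - dy) * dx       * x[y0, x1],
                dy       * (1 - dx) * x[y1, x0],
                dy       * dx       * x[y1, x1],
            ]
            # keep the largest weighted contribution
            out[oy, ox] = max(candidates)
    return out

# Example: a ratio of 1.5 reduces a 6x6 map to 4x4, more gradually than the
# 3x3 produced by conventional 2x2 non-overlapping max pooling.
feature_map = np.arange(36, dtype=np.float32).reshape(6, 6)
pooled = bilinear_weighted_fractional_max_pool(feature_map, ratio=1.5)
print(pooled.shape)  # (4, 4)
```

Because the reduction factor is a free parameter, a gentler ratio such as √2 could be applied twice to achieve the same overall halving as a single stride-2 pooling, which is one way the spatial size reduction can be made more gradual as described above.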


Acknowledgements

We would like to thank MEXT KAKENHI, Grant-in-Aid for Challenging Exploratory Research, Grant Number 15K12027 for partial support of our work.

Author information

Correspondence to Siang Thye Hang.

About this article

Cite this article

Hang, S.T., Aono, M. Bi-linearly weighted fractional max pooling. Multimed Tools Appl 76, 22095–22117 (2017). https://doi.org/10.1007/s11042-017-4840-5

Keywords

  • Fractional max pooling
  • Convolutional neural network
  • Deep learning