Abstract
Convolutional neural networks (CNNs) have become an essential tool for solving many machine vision and machine learning problems. A central element of these networks is the convolution operator, which essentially computes the inner product between a weight vector and the vectorized image patches extracted by sliding a window over the image planes of the previous layer. In this paper, we propose two classes of surrogate functions for the inner product operation inherent in the convolution operator, and thus obtain two generalizations of the convolution operator. The first is based on the class of positive definite kernel functions, whose application is justified by the kernel trick. The second is based on the class of similarity measures defined in terms of a distance function; we justify this choice by tracing it back to the basic idea behind the neocognitron, the ancestor of CNNs. Both methods are further generalized by allowing a monotonically increasing function (possibly depending on the weight vector) to be applied subsequently. Like any trainable parameter in a neural network, the template pattern and the parameters of the kernel/distance function are trained with the back-propagation algorithm. As an aside, we use the proposed framework to justify the use of the sine activation function in CNNs. Additionally, we discover a family of generalized convolution operators based on the convex combination of the dot-product and the negative squared Euclidean distance functions. Our experiments on the MNIST dataset show that the performance of ordinary CNNs can be matched by generalized CNNs based on weighted L1/L2 distances, demonstrating the applicability of the proposed generalization of convolutional neural networks.
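As an illustration only (not the paper's exact implementation), the convex combination of the dot product and the negative squared Euclidean distance mentioned in the abstract can be sketched in NumPy; the function name and the mixing parameter `alpha` are hypothetical:

```python
import numpy as np

def generalized_conv_response(patches, w, alpha):
    """Surrogate for the inner product inside a convolution: a convex
    combination of the dot product and the negative squared Euclidean
    distance. alpha = 1 recovers the ordinary convolution response;
    alpha = 0 gives a purely distance-based similarity.

    patches: (n, d) matrix whose rows are vectorized image patches
    w:       (d,)  template (weight) vector
    alpha:   mixing coefficient in [0, 1]
    """
    dot = patches @ w
    neg_sq_dist = -np.sum((patches - w) ** 2, axis=1)
    return alpha * dot + (1.0 - alpha) * neg_sq_dist
```

In a generalized CNN, `alpha` could itself be treated as a trainable parameter and updated by back-propagation along with `w`.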
Notes
To understand these details at the level of code, the reader is referred to the implementation of ConvolutionLayer in Caffe.
For example, in our experiments on the MNIST dataset, we have 12 planes in the first convolution layer which, considering a window of size 5, induces a dimensionality of \(12\times 5\times 5=300\) on the input of the second convolution layer.
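This dimensionality can be checked with a minimal im2col-style patch extraction, a simplified sketch of the kind of vectorization Caffe's ConvolutionLayer performs internally; the input shape here is illustrative:

```python
import numpy as np

def im2col(planes, k):
    """Vectorize all k-by-k patches from a stack of image planes.

    planes: (c, h, w) array of c image planes; returns a
    (c*k*k, n_patches) matrix whose columns are the vectorized
    patches the convolution takes inner products with.
    """
    c, h, w = planes.shape
    cols = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            cols.append(planes[:, i:i + k, j:j + k].ravel())
    return np.stack(cols, axis=1)

# 12 planes and a 5x5 window: each patch is a 12*5*5 = 300-dim vector
x = np.random.randn(12, 24, 24)
print(im2col(x, 5).shape[0])  # 300
```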
Initialization algorithms usually normalize the variance to 1 [11]. However, we experimentally measured the variance at the output of convolution layers in a network initialized by the Xavier method [6] on the MNIST dataset and found that the standard deviation at the output of the first layer is approximately 0.5.
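A Monte-Carlo sketch of such a measurement is shown below. It assumes the Xavier rule Var(w) = 1/fan_in, a hypothetical fan-in of 25 (a 5×5 window on a single input plane), and an assumed input standard deviation of 0.5; these numbers stand in for the paper's actual MNIST setup rather than reproduce it:

```python
import numpy as np

def conv_output_std(fan_in, input_std, n_samples=100_000, seed=0):
    """Estimate the standard deviation at a convolution output when
    weights are drawn with variance 1/fan_in (one common form of the
    Xavier rule) and inputs are zero-mean with the given std."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, np.sqrt(1.0 / fan_in), size=fan_in)
    x = rng.normal(0.0, input_std, size=(n_samples, fan_in))
    return (x @ w).std()
```

Since E[||w||^2] = 1 under this rule, the output std tracks the input std, which is consistent with measuring a value well below 1 when the inputs themselves are not unit-variance.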
See https://github.com/yihui-he/resnet-cifar10-caffe for the details about the resnet-20 network. Resnet networks were introduced by He et al. [7].
References
Chandar S, Khapra MM, Larochelle H, Ravindran B (2016) Correlational neural networks. Neural Comput 28(2):257–285
Fletcher G, Hinde C (1994) Learning the activation function for the neurons in neural networks. In: ICANN94. Springer, pp 611–614
Fukushima K (1975) Cognitron: A self-organizing multilayered neural network. Biol Cybern 20(3–4):121–136
Fukushima K (1980) Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 36(4):193–202
Fukushima K (1988) Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Netw 1(2):119–130
Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. AISTATS 9:249–256
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 770–778
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366
Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: Proceedings of The 32nd international conference on machine learning. pp 448–456
Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T (2014) Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM International Conference on Multimedia, MM’14, Orlando, Florida, USA. ACM, New York, NY, pp 675–678. https://doi.org/10.1145/2647868.2654889
Krähenbühl P, Doersch C, Donahue J, Darrell T (2016) Data-dependent initializations of convolutional neural networks. In: International conference on learning representations
Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Master’s thesis, Department of Computer Science, University of Toronto
LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551
LeCun Y, Bottou L, Bengio Y, Haffner P (1998a) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
LeCun Y, Bottou L, Orr GB, Müller KR (1998b) Efficient backprop. In: Neural networks: tricks of the trade. pp 9–50
Li P (2016) Two classes of linear equations of discrete convolution type with harmonic singular operators. Complex Var Elliptic Equ 61(1):67–75
Li P (2017a) Generalized convolution-type singular integral equations. Appl Math Comput 311:314–323
Li P (2017b) Some classes of singular integral equations of convolution type in the class of exponentially increasing functions. J Inequal Appl 2017(1):307
Li P, Ren G (2016) Some classes of equations of discrete type with harmonic singular operator and convolution. Appl Math Comput 284:185–194
Lin M, Chen Q, Yan S (2014) Network in network. In: International conference on learning representations
Mairal J (2016) End-to-end kernel learning with supervised convolutional kernel networks. In: Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R (eds) Advances in neural information processing systems 29. Curran Associates Inc., Red Hook, pp 1399–1407
Mairal J, Koniusz P, Harchaoui Z, Schmid C (2014) Convolutional kernel networks. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems. pp 2627–2635. https://papers.nips.cc/book/advances-in-neural-information-processing-systems-27-2014
Mishkin D, Matas J (2016) All you need is a good init. In: International conference on learning representations
Nakagawa M (1995) An artificial neuron model with a periodic activation function. J Phys Soc Jpn 64(3):1023–1031
Nalaie K, Ghiasi-Shirazi K, Akbarzadeh-T MR (2017) Efficient implementation of a generalized convolutional neural networks based on weighted euclidean distance. In: 2017 7th international conference on computer and knowledge engineering (ICCKE). pp 211–216. https://doi.org/10.1109/ICCKE.2017.8167877
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis (IJCV) 115(3):211–252
Schölkopf B, Smola A (2002) Learning with kernels: support vector machines, regularization, optimization and beyond. MIT Press, Cambridge
Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T (2007) Robust object recognition with cortex-like mechanisms. IEEE Trans Pattern Anal Mach Intell 29(3):411–426
Sopena JM, Romero E, Alquezar R (1999) Neural networks with periodic and monotonic activation functions: a comparative study in classification problems. In: 9th international conference on artificial neural networks: ICANN ’99, IET. pp 323–328
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 1–9
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536
Acknowledgements
The author wishes to express appreciation to the Research Deputy of Ferdowsi University of Mashhad for supporting this project under Grant No. 2/43037. The author also thanks the anonymous reviewers and his colleagues Ahad Harati and Ehsan Fazl-Ersi for their valuable comments.
Cite this article
Ghiasi-Shirazi, K. Generalizing the Convolution Operator in Convolutional Neural Networks. Neural Process Lett 50, 2627–2646 (2019). https://doi.org/10.1007/s11063-019-10043-7