Convolutional neural networks (CNNs) are widely used in computer vision problems. We introduce two methods that reduce the computational complexity of convolution and thereby accelerate CNN processing. The first method replaces the structure of a convolutional layer with a separable one, which is computationally simpler. Experiments show that this structure speeds up the convolutional layer by up to 5.6 times for 11 × 11 convolutional filters without loss of recognition accuracy. The second method uses 1 × 1 fusing convolutions to increase the number of convolution outputs while decreasing the number of filters. It reduces the computational complexity of convolution and, in experiments, increases processing speed by 11% for large convolutional filters. Both methods preserve accuracy when tested on the recognition of Russian letters and of CIFAR-10 and MNIST images.
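The idea behind the separable structure can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single-channel image and a rank-1 filter built as the outer product of two hypothetical 1D vectors, `col` and `row`, and it checks that one vertical pass followed by one horizontal pass reproduces the direct 2D result. The helper names `conv2d_valid` and `separable_conv2d_valid` are illustrative only.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Direct 'valid' 2D cross-correlation (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    # Accumulate one shifted, weighted copy of the image per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return out

def separable_conv2d_valid(image, col, row):
    """Equivalent result for kernel = outer(col, row), using two 1D passes."""
    tmp = conv2d_valid(image, col[:, None])  # vertical K x 1 pass
    return conv2d_valid(tmp, row[None, :])   # horizontal 1 x K pass

# Assumed toy setup: random 32 x 32 image and a random rank-1 11 x 11 filter.
rng = np.random.default_rng(0)
K = 11
image = rng.standard_normal((32, 32))
col = rng.standard_normal(K)
row = rng.standard_normal(K)
kernel = np.outer(col, row)

direct = conv2d_valid(image, kernel)
separable = separable_conv2d_valid(image, col, row)
max_abs_diff = float(np.max(np.abs(direct - separable)))

# Idealized multiplication counts per output pixel (memory traffic ignored).
mults_direct = K * K          # 121 for an 11 x 11 filter
mults_separable = 2 * K       # 22 for the two 1D passes
theoretical_ratio = mults_direct / mults_separable
```

Under these assumptions the two results agree up to floating-point rounding, and the idealized multiplication count drops from 121 to 22 per output pixel, a ratio of 5.5. This is a rough operation count only; the 5.6-fold figure in the abstract is the authors' measured speed-up, and filters learned by a network are not rank-1 in general, so a real layer has to be trained or approximated in separable form.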
Elena Limonova was born in the city of Dolgoprudny, Moscow region, Russia. In 2017 she graduated from the Moscow Institute of Physics and Technology in applied mathematics and physics. In the same year she entered the postgraduate course at the FRC IC RAS. Scientific interests: image processing, pattern recognition on mobile devices.
Alexander Sheshkus was born in Stary Oskol, Belgorod region, Russia. He entered the Moscow Institute of Physics and Technology, where he studied physics, discrete mathematics, and computer science. He received his master’s degree in 2011. Since 2013, he has worked as a developer for the Smart Engines company. Areas of scientific interest: machine learning, pattern recognition, image processing, image segmentation.
Alena Ivanova was born in Myatlevo, Kaluga region, Russia. She graduated in 2014 from the Moscow State University of Design and Technology, where she studied applied mathematics, physics, and computer-aided design. In 2014, she entered the postgraduate course at the Institute for Information Transmission Problems, and since 2015 she has been working in Laboratory No. 11, Vision Systems, of the IITP RAS. Scientific interests: machine learning, data analysis, image processing.
Dmitry Nikolaev was born in Moscow, USSR. He entered Moscow State University, where he studied physics and computer science. He received his master’s degree in 2000 and defended his PhD thesis in 2014. Since 2007 he has served as head of the Vision Systems Laboratory of the Institute for Information Transmission Problems. Research interests: image processing, image recognition, fast algorithms, computer vision, color consistency.
Cite this article
Limonova, E., Sheshkus, A., Ivanova, A. et al. Convolutional Neural Network Structure Transformations for Complexity Reduction and Speed Improvement. Pattern Recognit. Image Anal. 28, 24–33 (2018). https://doi.org/10.1134/S105466181801011X
- convolutional neural networks
- computational complexity
- separable filters
- fusing convolutions