Abstract
This paper considers fine-tuning an ensemble of deep neural networks to recognize 60 event types in a set of 60,000 images from the WIDER database. The ensemble consists of two deep convolutional neural networks (CNNs) with the GoogLeNet architecture, pretrained on other image datasets: ImageNet and Places. The recognition accuracy for a subset of 10 events was analyzed separately: “Car Racing”, “Ceremony”, “Concert”, “Demonstration”, “Football”, “Meeting”, “Picnic”, “Swimming”, “Tennis” and “Traffic”. During ensemble training, the output layer of each deep CNN is replaced with a layer of 10 or 60 neurons, respectively, and only the weights connecting the output layer to the previous layer are tuned. The classification accuracy on the WIDER image database averages 83.22% for 10 event classes and 50.4% for 60 event classes. In addition, the approach based on automatic feature formation with deep CNNs provided much better recognition quality for social events than manually chosen features (LBP, LDP or HOG) classified by a support vector machine. The testing time of the developed ensemble allows the classifier to be used in practical event recognition applications at a processing speed of up to 20 frames per second.
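The fine-tuning scheme described in the abstract (a frozen pretrained network, a replaced output layer, and an ensemble of two such networks) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: `FrozenBackbone` is a hypothetical stand-in for a pretrained GoogLeNet body, the toy dimensions and data are invented for illustration, and averaging the class probabilities of the two networks is an assumption, since the abstract does not state how the ensemble outputs are combined.

```python
import math
import random

N_CLASSES = 10   # size of the new output layer (60 for the full task)
IN_DIM = 12      # toy input size (assumption, not an image resolution)
FEAT_DIM = 8     # toy feature size (assumption)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class FrozenBackbone:
    """Hypothetical stand-in for a pretrained CNN body; weights are never updated."""

    def __init__(self, seed):
        rng = random.Random(seed)
        self.weights = [[rng.gauss(0.0, 1.0) for _ in range(IN_DIM)]
                        for _ in range(FEAT_DIM)]

    def features(self, x):
        # Linear projection followed by ReLU, then L2 normalisation.
        raw = [max(0.0, sum(w * v for w, v in zip(row, x)))
               for row in self.weights]
        norm = math.sqrt(sum(r * r for r in raw)) or 1.0
        return [r / norm for r in raw]


class OutputLayer:
    """Replacement output layer: the only trainable parameters."""

    def __init__(self, n_classes):
        self.weights = [[0.0] * FEAT_DIM for _ in range(n_classes)]
        self.bias = [0.0] * n_classes

    def predict_proba(self, feats):
        logits = [sum(w * f for w, f in zip(row, feats)) + b
                  for row, b in zip(self.weights, self.bias)]
        return softmax(logits)

    def sgd_step(self, feats, label, lr=0.5):
        # The cross-entropy gradient w.r.t. the logits is (p - one_hot).
        probs = self.predict_proba(feats)
        for k, row in enumerate(self.weights):
            grad = probs[k] - (1.0 if k == label else 0.0)
            for j, f in enumerate(feats):
                row[j] -= lr * grad * f
            self.bias[k] -= lr * grad


def fine_tune(backbone, head, samples, epochs=5):
    """Train only the head; the backbone just extracts features."""
    for _ in range(epochs):
        for x, label in samples:
            head.sgd_step(backbone.features(x), label)


def ensemble_proba(members, x):
    """Average the class probabilities of all (backbone, head) pairs."""
    all_probs = [head.predict_proba(backbone.features(x))
                 for backbone, head in members]
    return [sum(col) / len(all_probs) for col in zip(*all_probs)]


def make_toy_data(n_per_class=5, seed=0):
    """Synthetic samples: one random prototype per class plus small noise."""
    rng = random.Random(seed)
    prototypes = [[rng.gauss(0.0, 1.0) for _ in range(IN_DIM)]
                  for _ in range(N_CLASSES)]
    return [([p + rng.gauss(0.0, 0.1) for p in proto], label)
            for label, proto in enumerate(prototypes)
            for _ in range(n_per_class)]


samples = make_toy_data()
members = [(FrozenBackbone(seed=1), OutputLayer(N_CLASSES)),
           (FrozenBackbone(seed=2), OutputLayer(N_CLASSES))]
for backbone, head in members:
    fine_tune(backbone, head, samples)

probs = ensemble_proba(members, samples[0][0])
predicted = max(range(N_CLASSES), key=probs.__getitem__)
print("predicted class:", predicted)
```

Only `OutputLayer` holds trainable parameters in this sketch, which mirrors the statement that just the weights between the output layer and the previous layer are tuned. In a real framework the same effect is usually obtained by disabling gradient updates for every layer except the new one.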
Acknowledgment
This work was carried out under the grant of the President of the Russian Federation for state support of young Russian scientists № MK-3130.2017.9 (contract № 14.Z56.17.3130-MK) on the theme “Recognition of road conditions on images using deep learning”.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Yudin, D., Zeno, B. (2018). Event Recognition on Images by Fine-Tuning of Deep Neural Networks. In: Abraham, A., Kovalev, S., Tarassov, V., Snasel, V., Vasileva, M., Sukhanov, A. (eds) Proceedings of the Second International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’17). IITI 2017. Advances in Intelligent Systems and Computing, vol 679. Springer, Cham. https://doi.org/10.1007/978-3-319-68321-8_49
DOI: https://doi.org/10.1007/978-3-319-68321-8_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68320-1
Online ISBN: 978-3-319-68321-8
eBook Packages: Engineering, Engineering (R0)