Abstract
This paper proposes a kernel-blending connection approximated by a neural network (KBNN) for image classification. A kernel mapping connection structure, guaranteed by the function approximation theorem, is devised to blend feature extraction and feature classification through neural network learning. First, a feature extractor learns features from the raw images. Next, an automatically constructed kernel mapping connection maps the feature vectors into a feature space. Finally, a linear classifier is used as an output layer of the neural network to provide classification results. Furthermore, a novel loss function involving a cross-entropy loss and a hinge loss is proposed to improve the generalizability of the neural network. Experimental results on three well-known image datasets illustrate that the proposed method has good classification accuracy and generalizability.
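The abstract describes the blended loss only at a high level, so the exact formulation and weighting are not given here. The sketch below assumes the simplest reading: a weighted sum of a softmax cross-entropy term and a Crammer–Singer multiclass hinge term, with a hypothetical blending weight `lam`; it is an illustration of the idea, not the authors' implementation.

```python
import numpy as np

def cross_entropy(logits, y):
    """Softmax cross-entropy, averaged over the batch."""
    z = logits - logits.max(axis=1, keepdims=True)          # stabilize exp
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

def multiclass_hinge(logits, y, margin=1.0):
    """Crammer-Singer multiclass hinge: penalize the worst margin violation."""
    correct = logits[np.arange(len(y)), y][:, None]
    margins = np.maximum(0.0, logits - correct + margin)
    margins[np.arange(len(y)), y] = 0.0                      # ignore the true class
    return margins.max(axis=1).mean()

def blended_loss(logits, y, lam=0.5):
    """Hypothetical blend: cross-entropy plus a weighted hinge term."""
    return cross_entropy(logits, y) + lam * multiclass_hinge(logits, y)
```

Intuitively, the cross-entropy term drives calibrated class probabilities while the hinge term enforces a margin between the correct class score and the others, which is the usual motivation for pairing the two in a hybrid classifier.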
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61972227 and 61672018), the Natural Science Foundation of Shandong Province (Grant No. ZR2019MF051), the Primary Research and Development Plan of Shandong Province (Grant No. 2018GGX101013), and the Fostering Project of Dominant Discipline and Talent Team of Shandong Province Higher Education Institutions.
Author information
Xinxin Liu received her B.E. degree from the School of Computer Science and Technology, North China Institute of Science and Technology, Langfang, China, in 2016. She is currently working toward her M.S. degree at Shandong Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics. Her research interests include particle swarm optimization, machine learning, and image processing.
Yunfeng Zhang received his B.E. degree in computational mathematics and application software from Shandong University of Technology, Jinan, China, in 2000, his M.S. degree in applied mathematics from Shandong University in 2003, and his Ph.D. degree in computational geometry from Shandong University in 2007. He is now a professor in Shandong Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics. His current research interests include computer-aided geometric design, digital image processing, computational geometry, and function approximation.
Fangxun Bao received his M.Sc. degree from the Department of Mathematics of Qufu Normal University, China, in 1994, and his Ph.D. degree from the Department of Mathematics of Northwest University, Xi’an, China, in 1997. His current position is full professor in the Department of Mathematics, Shandong University. His research interests include computer-aided geometric design and computation, computational geometry, and function approximation.
Kai Shao received his B.E. degree from the School of Computer Science and Technology at Shandong University of Finance and Economics in 2018. He is currently working toward his M.S. degree at Shandong Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics. His research interests include medical image processing and deep learning.
Ziyi Sun received her B.E. degree from the School of Computer Science and Technology at Shandong University of Finance and Economics in 2018. She is currently working toward her M.S. degree at Shandong Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics. Her research interests include image processing and deep learning.
Caiming Zhang is a professor and doctoral supervisor of the School of Computer Science and Technology at Shandong University. He is now also the dean and professor of the School of Computer Science and Technology at Shandong Economic University. He received his B.S. and M.E. degrees in computer science from Shandong University in 1982 and 1984, respectively, and his Dr.Eng. degree in computer science from Tokyo Institute of Technology, Japan, in 1994. From 1997 to 2000, he held a visiting position at the University of Kentucky, USA. His research interests include CAGD, CG, information visualization, and medical image processing.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Liu, X., Zhang, Y., Bao, F. et al. Kernel-blending connection approximated by a neural network for image classification. Comp. Visual Media 6, 467–476 (2020). https://doi.org/10.1007/s41095-020-0181-9