Abstract
Image annotation is a challenging problem because of the semantic gap between images and their corresponding tags, and it has attracted intensive attention recently. However, most existing works neglect the imbalanced distribution of classes and the internal correlations across modalities. To address these issues, we propose a multiple kernel learning method based on weak learners for image annotation, which captures these semantic correlations to predict the tags of a given image. More specifically, we first employ a convolutional neural network to extract semantic features of images, and use an oversampling technique to generate new samples of minority classes to address the imbalance problem. Our proposed multiple kernel learning method is then applied to obtain the internal correlations between images and tags. To further improve prediction performance, we combine a boosting procedure with the multiple kernel learning. We evaluate the proposed method on two benchmark datasets. The experimental results show that our method outperforms several state-of-the-art methods.
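Two of the ingredients named in the abstract, oversampling of minority classes and the combination of several kernels, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of a linear and an RBF base kernel, the fixed kernel weights, and the toy data are all assumptions made for illustration. The CNN feature extraction and the boosting step are omitted, and a real multiple kernel learning method would learn the kernel weights rather than fix them.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    return math.exp(-gamma * math.dist(x, y) ** 2)


def combined_kernel(x, y, weights=(0.5, 0.5)):
    """Convex combination of two base kernels. The weights are fixed
    here (an assumption); multiple kernel learning would learn them."""
    return weights[0] * linear_kernel(x, y) + weights[1] * rbf_kernel(x, y)


# Hypothetical 2-D feature vectors for one minority class.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like_oversample(minority, n_new=4)

# Gram matrix of the combined kernel over the original minority samples.
gram = [[combined_kernel(a, b) for b in minority] for a in minority]
```

Because each synthetic point lies on a segment between two existing minority samples, the new samples stay inside the region the class already occupies. The Gram matrix could then be passed to any kernel classifier that accepts a precomputed kernel.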
Acknowledgments
This work was supported by the State Key Program of the National Natural Science Foundation of China (U1301253), the Science and Technology Planning Key Project of Guangdong Province (2015B010110006), the Fundamental Research Funds for the Central Universities (DUT2017TB02), and the National Natural Science Foundation of China (61672123).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhong, H., Yuan, X., Chen, Z., Zhong, F., Leng, Y. (2018). Multiple Kernel Learning Based on Weak Learner for Automatic Image Annotation. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10736. Springer, Cham. https://doi.org/10.1007/978-3-319-77383-4_6
DOI: https://doi.org/10.1007/978-3-319-77383-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77382-7
Online ISBN: 978-3-319-77383-4
eBook Packages: Computer Science (R0)