Multi-view multi-label learning for image annotation

Multimedia Tools and Applications

Abstract

Image annotation is posed as a multi-class classification problem, and pursuing higher accuracy remains a long-standing yet still open challenge in the field. To further improve annotation accuracy, we propose a multi-view multi-label (MVML) learning algorithm that takes multiple features (i.e., views) and ensemble learning into account simultaneously. Doing so exploits the complementarity among both the views and the base learners of the ensemble, leading to higher annotation accuracy. To handle different distributions of positive and negative training examples, we propose two versions of MVML: a Boosting version and a Bagging version. The former is suited to learning over balanced examples, while the latter applies to the opposite (imbalanced) scenario. In addition, the weights of the base learners are evaluated on validation data rather than on training data, which improves the generalization ability of the final ensemble classifiers. Experimental results show that MVML is superior to a single-view ensemble SVM.
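The abstract names three ingredients: one set of base learners per view, bagging, and base-learner weights measured on held-out validation data. The following is a minimal sketch of how those pieces could fit together for a single binary label. It is not the authors' algorithm: the threshold "stump" base learner, the use of validation accuracy as the weight, the function names, and the toy data are all assumptions made for illustration, and the Boosting version is not covered.

```python
import random

def train_stump(xs, ys):
    """Fit a one-feature threshold classifier (hypothetical stand-in for a real base learner)."""
    best = None
    for t in sorted(set(xs)):
        for sign in (1, -1):
            preds = [1 if sign * (x - t) >= 0 else 0 for x in xs]
            acc = sum(p == y for p, y in zip(preds, ys)) / len(ys)
            if best is None or acc > best[0]:
                best = (acc, t, sign)
    _, t, sign = best
    return lambda x: 1 if sign * (x - t) >= 0 else 0

def accuracy(clf, xs, ys):
    """Fraction of examples the classifier labels correctly."""
    return sum(clf(x) == y for x, y in zip(xs, ys)) / len(ys)

def fit_multiview_bagging(train_views, train_y, val_views, val_y, n_bags=5, seed=0):
    """Train n_bags bootstrap learners per view; weight each by validation accuracy.

    train_views / val_views: one list of scalar features per view (assumed layout).
    """
    rng = random.Random(seed)
    n = len(train_y)
    ensemble = []  # entries are (view_index, classifier, weight)
    for v, (xs_tr, xs_val) in enumerate(zip(train_views, val_views)):
        for _ in range(n_bags):
            idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
            clf = train_stump([xs_tr[i] for i in idx], [train_y[i] for i in idx])
            # Weight comes from validation data, not from the training sample.
            ensemble.append((v, clf, accuracy(clf, xs_val, val_y)))
    return ensemble

def predict(ensemble, example_views):
    """Weighted vote over all views; example_views holds one scalar per view."""
    total = sum(w for _, _, w in ensemble)
    score = sum(w * clf(example_views[v]) for v, clf, w in ensemble)
    return 1 if score >= 0.5 * total else 0

# Toy data (made up): two views of the same eight training examples.
TRAIN_VIEWS = [[0.20, 0.25, 0.30, 0.40, 0.60, 0.65, 0.70, 0.80],
               [2.0, 2.5, 3.0, 3.5, 6.0, 6.5, 7.0, 8.0]]
TRAIN_Y = [0, 0, 0, 0, 1, 1, 1, 1]
VAL_VIEWS = [[0.0, 0.1, 0.15, 0.85, 0.9, 1.0],
             [0.5, 1.0, 1.5, 8.5, 9.0, 9.5]]
VAL_Y = [0, 0, 0, 1, 1, 1]

model = fit_multiview_bagging(TRAIN_VIEWS, TRAIN_Y, VAL_VIEWS, VAL_Y)
print(predict(model, [2.0, 20.0]), predict(model, [-1.0, -1.0]))
```

For multi-label annotation, one such ensemble could be trained per label. Weighting by validation performance means a base learner that merely memorized its bootstrap sample contributes little to the final vote, which is the generalization argument the abstract makes.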

Acknowledgment

This work is supported in part by the National Basic Research Program (973 Program) of China under Grant No. 2011CB302305, the National Natural Science Foundation of China under Grant No. 61232004. The authors appreciate the valuable suggestions from the anonymous reviewers and the Editors.

Author information

Correspondence to Yu Liu.

About this article

Cite this article

Zou, F., Liu, Y., Wang, H. et al. Multi-view multi-label learning for image annotation. Multimed Tools Appl 75, 12627–12644 (2016). https://doi.org/10.1007/s11042-014-2423-2

  • DOI: https://doi.org/10.1007/s11042-014-2423-2
