Science China Information Sciences, Volume 54, Issue 12, pp 2508–2521

Stable multi-label boosting for image annotation with structural feature selection

Research Papers Special Focus

Abstract

Automatically annotating images with multiple appropriate tags is very important for image retrieval and image understanding. High-dimensional heterogeneous visual features can be extracted from real-world images to describe various aspects of their visual characteristics, such as color, texture, and shape. Different kinds of heterogeneous features have different intrinsic discriminative power for image understanding. Selecting groups of discriminative features for particular semantics is therefore crucial to making image understanding more interpretable. This paper proposes an approach to image annotation called stable multi-label boosting with structural feature selection (S-MtBFS). S-MtBFS comprises two steps: structural feature selection for each label, and stable multi-label boosting by curds and whey. In the first step, a (structural) sparse selection model is learned to identify subgroups of homogeneous features for predicting a given label. In the second step, a stable multi-label boosting method with a re-sampling policy is employed to exploit the correlations among multiple tags. Extensive experiments on public image datasets show that the proposed approach achieves better and more stable multi-label image annotation performance and yields a readily interpretable model for image understanding.
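To make the idea of selecting whole groups of features concrete, the following is a minimal sketch of group soft-thresholding, the shrinkage step commonly used with group-lasso-style penalties. It is an illustration under stated assumptions, not the S-MtBFS model itself: the abstract does not give the exact penalty, and the function name `group_soft_threshold`, the example weights, the grouping into three feature types, and the threshold `lam` are all hypothetical.

```python
import numpy as np

def group_soft_threshold(w, groups, lam):
    """Shrink each group of weights toward zero by lam in Euclidean norm.

    This is the proximal operator of lam * sum_g ||w_g||_2. A group whose
    norm is at most lam is zeroed out entirely, which is what produces
    selection at the level of feature groups rather than single features.
    """
    out = np.zeros_like(w, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * w[idx]
    return out

# Hypothetical weights for one label over three feature groups
# (for example color, texture, and shape); values are made up.
w = np.array([3.0, 4.0, 0.1, -0.2, 0.0, 2.0])
groups = [[0, 1], [2, 3], [4, 5]]

shrunk = group_soft_threshold(w, groups, lam=1.0)

# Indices of the groups that survive shrinkage for this label.
selected = [g for g, idx in enumerate(groups) if np.any(shrunk[idx] != 0)]
print(shrunk, selected)
```

In this toy setting the second group has a small norm and is discarded as a whole, while the other two are kept with reduced magnitude. Reading off which groups survive for each label is what makes a model of this kind interpretable.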

Keywords

image annotation, structural feature selection, multi-label boosting, stability

Supplementary material

Supplementary material, approximately 9.71 MB.

Copyright information

© Science China Press and Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • YueTing Zhuang (1)
  • YaHong Han (1)
  • Fei Wu (1)
  • JiaCheng Yang (1)
  1. College of Computer Science, Zhejiang University, Hangzhou, China
