Automatic Image Annotation Using a Semi-supervised Ensemble of Classifiers

  • Heidy Marin-Castro
  • Enrique Sucar
  • Eduardo Morales
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4756)


Automatic image annotation consists of automatically labeling images, or image regions, with a pre-defined set of keywords, which are regarded as descriptors of the high-level semantics of the image. In supervised learning, a set of previously annotated images is required to train a classifier. Annotating a large quantity of images by hand is a tedious and time-consuming process, so an alternative is to label a small subset of images manually and exploit the remaining unlabeled images under a semi-supervised approach. In this paper, a new semi-supervised ensemble of classifiers, called WSA, is proposed for automatic image annotation. WSA uses naive Bayes as its base classifier; a set of these classifiers is combined in a cascade based on the AdaBoost technique. However, when training the ensemble of Bayesian classifiers, each stage also incorporates the unlabeled images: these are annotated by the classifier from the previous stage and then used to train the next classifier. The unlabeled instances are weighted according to a confidence measure based on their predicted probability values, while the labeled instances are weighted according to the classifier error, as in standard AdaBoost. WSA has been evaluated on benchmark data sets and two sets of images, with promising results.
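The training scheme described above can be sketched in code. The following is a minimal, hypothetical illustration (not the authors' implementation): a weighted Gaussian naive Bayes base classifier, an AdaBoost-style cascade that reweights labeled instances by classifier error, and pseudo-labeled unlabeled instances weighted by the previous stage's predicted-probability confidence. All names (`WeightedGaussianNB`, `wsa_train`, `wsa_predict`) and the specific weighting constants are assumptions for illustration.

```python
import math

class WeightedGaussianNB:
    """Gaussian naive Bayes that accepts per-instance weights (illustrative sketch)."""

    def fit(self, X, y, w):
        self.classes = sorted(set(y))
        self.stats, self.priors = {}, {}
        total = sum(w)
        for c in self.classes:
            idx = [i for i, yi in enumerate(y) if yi == c]
            wc = sum(w[i] for i in idx)
            self.priors[c] = wc / total
            feats = []
            for j in range(len(X[0])):
                mean = sum(w[i] * X[i][j] for i in idx) / wc
                var = sum(w[i] * (X[i][j] - mean) ** 2 for i in idx) / wc + 1e-9
                feats.append((mean, var))
            self.stats[c] = feats
        return self

    def predict_proba(self, x):
        # Log-likelihood under a per-feature Gaussian, normalized to probabilities.
        logp = {}
        for c in self.classes:
            lp = math.log(self.priors[c])
            for (m, v), xj in zip(self.stats[c], x):
                lp += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
            logp[c] = lp
        mx = max(logp.values())
        exp = {c: math.exp(l - mx) for c, l in logp.items()}
        z = sum(exp.values())
        return {c: e / z for c, e in exp.items()}

    def predict(self, x):
        p = self.predict_proba(x)
        return max(p, key=p.get)


def wsa_train(X_lab, y_lab, X_unl, stages=3):
    """Semi-supervised AdaBoost-style cascade in the spirit of WSA (sketch)."""
    w_lab = [1.0 / len(X_lab)] * len(X_lab)
    classifiers, alphas = [], []
    pseudo_y, pseudo_w = [], []  # labels/weights for unlabeled data, from previous stage
    for _ in range(stages):
        X = X_lab + X_unl if pseudo_y else X_lab
        y = y_lab + pseudo_y if pseudo_y else y_lab
        w = w_lab + pseudo_w if pseudo_y else w_lab
        clf = WeightedGaussianNB().fit(X, y, w)
        # Classifier error measured on the labeled instances only.
        err = sum(w_lab[i] for i in range(len(X_lab))
                  if clf.predict(X_lab[i]) != y_lab[i])
        err = min(max(err, 1e-9), 0.4999)
        alpha = 0.5 * math.log((1 - err) / err)
        classifiers.append(clf)
        alphas.append(alpha)
        # Reweight labeled instances by classifier error (standard AdaBoost rule).
        w_lab = [wi * math.exp(-alpha if clf.predict(xi) == yi else alpha)
                 for wi, xi, yi in zip(w_lab, X_lab, y_lab)]
        z = sum(w_lab)
        w_lab = [wi / z for wi in w_lab]
        # Annotate unlabeled instances with this stage's classifier and weight
        # them by the confidence (predicted probability) of that annotation.
        pseudo_y, pseudo_w = [], []
        for xi in X_unl:
            p = clf.predict_proba(xi)
            yi = max(p, key=p.get)
            pseudo_y.append(yi)
            pseudo_w.append(p[yi] / len(X_unl))
    return classifiers, alphas


def wsa_predict(classifiers, alphas, x):
    """Weighted vote over the cascade, as in AdaBoost."""
    votes = {}
    for clf, a in zip(classifiers, alphas):
        c = clf.predict(x)
        votes[c] = votes.get(c, 0.0) + a
    return max(votes, key=votes.get)
```

On a toy separable problem, labeling only a few points and passing the rest as unlabeled, the cascade converges to the obvious decision boundary; the confidence weights keep uncertain pseudo-labels from dominating later stages.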


Keywords: Automatic image annotation · Semi-supervised learning · Ensembles · AdaBoost



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Heidy Marin-Castro (1)
  • Enrique Sucar (1)
  • Eduardo Morales (1)

  1. National Institute of Astrophysics, Optics and Electronics, Computer Science Department, Luis Enrique Erro 1, 72840 Tonantzintla, México
