Bayesian Learning of Hierarchical Multinomial Mixture Models of Concepts for Automatic Image Annotation

  • Conference paper
Image and Video Retrieval (CIVR 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4071)

Abstract

We propose a novel Bayesian learning framework for hierarchical mixture models that incorporates prior hierarchical knowledge into the concept representations of multi-level concept structures in images. Characterizing image concepts by mixture models is one of the most effective techniques in automatic image annotation (AIA) for concept-based image retrieval. However, it also poses problems when large-scale models are needed to cover the wide variations in image samples. To alleviate the difficulties of estimating too many parameters from insufficient training images, we treat the mixture model parameters as random variables characterized by a joint conjugate prior density. This facilitates a statistical combination of the likelihood function of the available training data and the prior density of the concept parameters into a well-defined posterior density, whose parameters can then be estimated via a maximum a posteriori criterion. Experimental results on the Corel image dataset with a set of 371 concepts indicate that the proposed Bayesian approach achieves a maximum F1 measure of 0.169, which outperforms many state-of-the-art AIA algorithms.
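
The idea of combining a likelihood with a conjugate prior and taking the posterior mode can be illustrated on a single multinomial distribution, for which the Dirichlet is the conjugate prior. The sketch below is not the paper's algorithm: it omits the mixture components and any iterative estimation, and the toy counts, the `parent_probs` values, the `strength` pseudo-count, and the choice of deriving the prior from a parent concept are all assumptions made for illustration.

```python
from typing import List, Sequence


def ml_multinomial(counts: Sequence[float]) -> List[float]:
    """Maximum-likelihood estimate: relative frequencies of the counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [n / total for n in counts]


def map_multinomial(counts: Sequence[float], alpha: Sequence[float]) -> List[float]:
    """MAP estimate of multinomial probabilities under a Dirichlet(alpha) prior.

    Uses the posterior mode (n_k + alpha_k - 1) / (N + sum(alpha) - K).
    Every alpha_k must be >= 1 so that no numerator is negative.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    if any(a < 1.0 for a in alpha):
        raise ValueError("this sketch assumes every alpha_k >= 1")
    numerators = [n + a - 1.0 for n, a in zip(counts, alpha)]
    total = sum(numerators)
    if total <= 0:
        raise ValueError("posterior mode is undefined for these inputs")
    return [x / total for x in numerators]


# Hypothetical toy data: 3 visual-token types, only 4 observations
# available for a sparsely trained (child) concept.
child_counts = [3, 1, 0]

# Hypothetical prior: a parent concept's distribution, converted into
# Dirichlet hyperparameters with an assumed strength of 10 pseudo-counts.
parent_probs = [0.5, 0.3, 0.2]
strength = 10.0
alpha = [1.0 + strength * p for p in parent_probs]

ml_estimate = ml_multinomial(child_counts)
map_estimate = map_multinomial(child_counts, alpha)

print("ML :", ml_estimate)
print("MAP:", map_estimate)
```

With so few observations, the maximum-likelihood estimate assigns zero probability to the unseen third token type, while the MAP estimate is pulled toward the prior and keeps it positive. This is the kind of regularization the abstract refers to when training images are scarce.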

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shi, R., Chua, TS., Lee, CH., Gao, S. (2006). Bayesian Learning of Hierarchical Multinomial Mixture Models of Concepts for Automatic Image Annotation. In: Sundaram, H., Naphade, M., Smith, J.R., Rui, Y. (eds) Image and Video Retrieval. CIVR 2006. Lecture Notes in Computer Science, vol 4071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11788034_11

  • DOI: https://doi.org/10.1007/11788034_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-36018-6

  • Online ISBN: 978-3-540-36019-3

  • eBook Packages: Computer Science, Computer Science (R0)
