Bayesian Learning of Hierarchical Multinomial Mixture Models of Concepts for Automatic Image Annotation

Shi, Rui; Chua, Tat-Seng; Lee, Chin-Hui; Gao, Sheng

doi:10.1007/11788034_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4071)

Included in the following conference series:

International Conference on Image and Video Retrieval

Abstract

We propose a novel Bayesian learning framework for hierarchical mixture models that incorporates prior hierarchical knowledge into the concept representations of multi-level concept structures in images. Characterizing image concepts by mixture models is one of the most effective techniques in automatic image annotation (AIA) for concept-based image retrieval. However, it poses problems when large-scale models are needed to cover the wide variations in image samples. To alleviate the difficulty of estimating too many parameters from insufficient training images, we treat the mixture model parameters as random variables characterized by a joint conjugate prior density. This allows the likelihood function of the available training data and the prior density of the concept parameters to be combined statistically into a well-defined posterior density, whose parameters can then be estimated via a maximum a posteriori (MAP) criterion. Experimental results on the Corel image dataset with a set of 371 concepts indicate that the proposed Bayesian approach achieves a maximum F₁ measure of 0.169, which outperforms many state-of-the-art AIA algorithms.
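
As a rough illustration of the maximum a posteriori idea in the abstract, the sketch below computes the posterior mode of a single multinomial distribution under a Dirichlet conjugate prior and compares it with the maximum-likelihood estimate. It is a simplified, single-component example and not the hierarchical mixture model of the paper; the function names, the toy counts, and the prior values are assumptions chosen for illustration only.

```python
from typing import List, Sequence


def ml_multinomial(counts: Sequence[float]) -> List[float]:
    """Maximum-likelihood estimate: relative frequencies of the counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [n / total for n in counts]


def map_multinomial(counts: Sequence[float], alpha: Sequence[float]) -> List[float]:
    """Posterior mode of multinomial probabilities under a Dirichlet(alpha) prior.

    Assumes every alpha_i >= 1, so that the mode (n_i + alpha_i - 1) / Z
    is well defined and non-negative.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    if any(a < 1.0 for a in alpha):
        raise ValueError("this sketch assumes alpha_i >= 1 for all i")
    numer = [n + a - 1.0 for n, a in zip(counts, alpha)]
    total = sum(numer)
    if total <= 0:
        raise ValueError("posterior mode is undefined for these inputs")
    return [x / total for x in numer]


# Toy data (hypothetical): three feature bins observed in very few training images.
counts = [3, 0, 1]
# Hypothetical prior; in a hierarchical setting it might be derived from a
# parent concept, but that derivation is not shown here.
alpha = [2.0, 2.0, 2.0]

ml_estimate = ml_multinomial(counts)
map_estimate = map_multinomial(counts, alpha)
```

With sparse counts, the maximum-likelihood estimate assigns zero probability to the unseen bin, while the prior keeps the MAP estimate strictly positive there. That is the kind of smoothing a conjugate prior provides when training images are scarce.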

Author information

Authors and Affiliations

School of Computing, National University of Singapore, 117543, Singapore
Rui Shi & Tat-Seng Chua
School of ECE, Georgia Institute of Technology, Atlanta, GA, 30332, USA
Chin-Hui Lee
Institute for Infocomm Research, 119613, Singapore
Sheng Gao

Editor information

Editors and Affiliations

Arts, Media and Engineering Program, Arizona State University, 85281, Tempe, AZ, USA
Hari Sundaram
Intelligent Information Management Department, IBM T.J. Watson Research Center, 19 Skyline Drive, 10532, Hawthorne, NY, USA
Milind Naphade
Intelligent Information Management Department, IBM T.J. Watson Research Center, 19 Skyline Drive, 10532, Hawthorne, NY, USA
John R. Smith
Microsoft Corporation, Microsoft China R&D Group, 49 Zhichun Road, 100080, Beijing, China
Yong Rui

About this paper

Cite this paper

Shi, R., Chua, TS., Lee, CH., Gao, S. (2006). Bayesian Learning of Hierarchical Multinomial Mixture Models of Concepts for Automatic Image Annotation. In: Sundaram, H., Naphade, M., Smith, J.R., Rui, Y. (eds) Image and Video Retrieval. CIVR 2006. Lecture Notes in Computer Science, vol 4071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11788034_11

DOI: https://doi.org/10.1007/11788034_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36018-6
Online ISBN: 978-3-540-36019-3
eBook Packages: Computer Science, Computer Science (R0)
