A Novel Multi-modal Integration and Propagation Model for Cross-Media Information Retrieval

  • Conference paper
Advances in Multimedia Modeling (MMM 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7131)

Abstract

In this paper, we present a novel aspect model based on Probabilistic Latent Semantic Analysis (PLSA) and decompose cross-media retrieval into two parts: multi-modal integration and correlation propagation. We first use multivariate Gaussian distributions to model continuous quantities in PLSA, avoiding information loss when matching feature instances to the real world. Multi-modal correlations are learned asymmetrically, giving better control over the respective influence of each modality in the latent space. We then propose a new propagation pattern that refines the multi-modal correlations by efficiently exploiting the complementary information across modalities. Experimental results demonstrate that our method is accurate and robust for cross-media information retrieval.
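
As an illustration of the first step only, the following is a minimal sketch of a generic PLSA-style aspect model in which each latent aspect emits continuous feature vectors from a Gaussian, fitted with EM. It is not the authors' model: diagonal covariances are assumed for brevity, the asymmetric multi-modal learning and the propagation step are not reproduced, and the function name `gaussian_plsa` and all of its parameters are hypothetical.

```python
import numpy as np


def gaussian_plsa(X, doc_ids, n_docs, n_aspects, n_iter=50, seed=0):
    """EM for a PLSA-style aspect model with diagonal Gaussian emissions.

    X        : (n, f) array of continuous feature vectors
    doc_ids  : (n,) integer array, the document each vector belongs to
    Returns P(z|d) of shape (n_docs, n_aspects), plus the aspect means and
    variances, each of shape (n_aspects, f).
    Assumes every document owns at least one feature vector.
    """
    rng = np.random.default_rng(seed)
    n, f = X.shape
    # Initialise means from random data points, variances from the data.
    mu = X[rng.choice(n, size=n_aspects, replace=False)].copy()
    var = np.tile(X.var(axis=0) + 1e-6, (n_aspects, 1))
    p_z_d = np.full((n_docs, n_aspects), 1.0 / n_aspects)

    for _ in range(n_iter):
        # E-step: responsibility of aspect z for each feature vector,
        # proportional to P(z|d) * N(x; mu_z, var_z), computed in log space.
        diff = X[:, None, :] - mu[None, :, :]                  # (n, K, f)
        log_pdf = -0.5 * (
            np.sum(diff ** 2 / var[None], axis=2)
            + np.sum(np.log(2.0 * np.pi * var), axis=1)[None]
        )
        log_r = np.log(p_z_d[doc_ids] + 1e-12) + log_pdf
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: per-document aspect weights P(z|d).
        counts = np.zeros((n_docs, n_aspects))
        np.add.at(counts, doc_ids, r)
        p_z_d = counts / counts.sum(axis=1, keepdims=True)

        # M-step: shared Gaussian parameters of each aspect.
        nk = r.sum(axis=0) + 1e-12
        mu = (r.T @ X) / nk[:, None]
        diff = X[:, None, :] - mu[None, :, :]
        var = np.einsum("nk,nkf->kf", r, diff ** 2) / nk[:, None] + 1e-6

    return p_z_d, mu, var


# Toy usage: two documents whose feature vectors come from different regions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(10.0, 1.0, (50, 2))])
doc_ids = np.repeat([0, 1], 50)
p_z_d, mu, var = gaussian_plsa(X, doc_ids, n_docs=2, n_aspects=2)
print(p_z_d)
```

Modelling the features directly with Gaussians means no vector quantization into a discrete vocabulary is needed before fitting, which is the kind of information loss the abstract refers to.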

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lin, W., Lu, T., Su, F. (2012). A Novel Multi-modal Integration and Propagation Model for Cross-Media Information Retrieval. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, CW., Andreopoulos, Y., Breiteneder, C. (eds) Advances in Multimedia Modeling. MMM 2012. Lecture Notes in Computer Science, vol 7131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27355-1_78

  • DOI: https://doi.org/10.1007/978-3-642-27355-1_78

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27354-4

  • Online ISBN: 978-3-642-27355-1

  • eBook Packages: Computer Science, Computer Science (R0)
