A Novel Multi-modal Integration and Propagation Model for Cross-Media Information Retrieval

Lin, Wanxia; Lu, Tong; Su, Feng

doi:10.1007/978-3-642-27355-1_78

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7131)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

In this paper, we present a novel aspect model based on Probabilistic Latent Semantic Analysis (PLSA) and decompose cross-media retrieval into two parts: multi-modal integration and correlation propagation. We first use multivariate Gaussian distributions to model continuous quantities in PLSA, avoiding the information loss that arises between feature-instance matching and real-world matching. Multi-modal correlations are learned asymmetrically, which gives better control over the respective influence of each modality in the latent space. We then propose a new propagation pattern that refines the multi-modal correlations by efficiently exploiting the complementary information across modalities. Experimental results demonstrate that our method is accurate and robust for cross-media information retrieval.
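
As a rough illustration of the first step, one common way to write a PLSA aspect model with Gaussian emissions is sketched below. The notation is an assumption made here for illustration and is not taken from the paper: d is a document, z_k is one of K latent aspects, x is a D-dimensional continuous feature vector, and mu_k and Sigma_k are the mean and covariance of aspect k. The abstract does not state the paper's exact formulation, which may differ.

```latex
P(x \mid d) = \sum_{k=1}^{K} P(z_k \mid d)\, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad
\mathcal{N}(x \mid \mu_k, \Sigma_k)
  = \frac{1}{(2\pi)^{D/2}\, |\Sigma_k|^{1/2}}
    \exp\!\Big( -\tfrac{1}{2} (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k) \Big)
```

In a model of this form, a feature vector is scored directly by a continuous density instead of first being quantized into a discrete vocabulary, which is presumably the kind of information loss the abstract refers to.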

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210093, China
Wanxia Lin, Tong Lu & Feng Su
Jiangyin Institute of Information Technology, Nanjing University, China
Tong Lu

Editor information

Editors and Affiliations

Institute of Information Technology, Alpen-Adria-Universität Klagenfurt, Universitätsstr. 65-67, 9020, Klagenfurt, Austria
Klaus Schoeffmann
EURECOM, 2229 Route des Crêtes, BP 193, 06904, Sophia Antipolis Cedex, France
Bernard Merialdo
School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave, 15213-3890, Pittsburgh, PA, USA
Alexander G. Hauptmann
Department of Computer Science, City University of Hong Kong, Tat Chee Ave, Kowloon, Hong Kong
Chong-Wah Ngo
Department of Electronic and Electrical Engineering, University College London, Roberts Building, Torrington Place, WC1E 7JE, London, UK
Yiannis Andreopoulos
Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstrasse 9-11 188/2, 1040, Vienna, Austria
Christian Breiteneder

About this paper

Cite this paper

Lin, W., Lu, T., Su, F. (2012). A Novel Multi-modal Integration and Propagation Model for Cross-Media Information Retrieval. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, CW., Andreopoulos, Y., Breiteneder, C. (eds) Advances in Multimedia Modeling. MMM 2012. Lecture Notes in Computer Science, vol 7131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27355-1_78

DOI: https://doi.org/10.1007/978-3-642-27355-1_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27354-4
Online ISBN: 978-3-642-27355-1
eBook Packages: Computer Science, Computer Science (R0)
