Discovering Image-Text Associations for Cross-Media Web Information Fusion
The diverse and distributed nature of information published on the World Wide Web makes it difficult to collate and track information on specific topics. Whereas most existing work on web information fusion has focused on multi-document summarization, this paper presents a novel approach for discovering associations between images and text segments, which can subsequently be used to support cross-media web content summarization. Specifically, we employ a similarity-based multilingual retrieval model and adopt a vague transformation technique for measuring the similarity between visual features and textual features. Experimental results on a terrorism-domain document set suggest that combining visual and textual features is a promising approach to image and text fusion.
Keywords: Textual Feature, Text Segment, Document Summarization, Word Space, Linear Mixture Model
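The cross-media matching described in the abstract can be illustrated with a minimal sketch. The paper's actual vague transformation and retrieval model are not specified here, so the example below substitutes a simple linear mapping learned by least squares: visual feature vectors are projected into the textual word space, and image-text pairs are scored by cosine similarity. All array shapes and names are illustrative assumptions.

```python
import numpy as np

# Sketch only: stand in for the paper's transformation between feature
# spaces with a linear map W learned from co-occurring image/text pairs.
rng = np.random.default_rng(0)

# Toy training data (assumed aligned pairs): 50 images with 8 visual
# features each, and their associated text segments with 12 term weights.
V = rng.normal(size=(50, 8))   # visual feature matrix
T = rng.normal(size=(50, 12))  # textual feature matrix (word space)

# Learn W by minimizing ||V @ W - T||_F (ordinary least squares).
W, *_ = np.linalg.lstsq(V, T, rcond=None)

def cross_media_similarity(v, t):
    """Cosine similarity between an image mapped into word space
    and a text segment's feature vector."""
    mapped = v @ W
    return float(mapped @ t / (np.linalg.norm(mapped) * np.linalg.norm(t)))

sim = cross_media_similarity(V[0], T[0])
```

In a real system the text vectors would come from term weighting over the document's segments and the visual vectors from image descriptors; candidate image-text associations would then be ranked by this similarity score.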