Multimodal Queries to Access Multimedia Information Sources: First Steps

Martínez, Ángel; Lana Serrano, Sara; Martínez-Fernández, José L.; Martínez, Paloma

doi:10.1007/978-3-642-35145-7_5

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 60)

Included in the following conference series:

International Conference on User Centric Media

Abstract

This position paper deals with queries that go beyond text and mix several kinds of multimedia content: audio, video, image and text. Search approaches combining some of these formats have been studied, including query-by-example techniques for situations where only one format is considered. Most of this research, however, does not deal with text content. A new approach is proposed that allows users to enter multimodal queries and explore multimedia repositories. For this purpose, the different ranked result lists must be combined to produce the final results shown for a given query. The main goal of the proposal is to reduce the semantic gap between low-level features and high-level concepts in multimedia content. To that end, the paper proposes using qualitative data that gives more relevance to text content, together with machine learning methods, to combine the results of monomodal retrieval systems. Although it is too soon to report experimental results, a prototype implementing the approach is under development and evaluation.
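
The combination of ranked lists from monomodal systems can be illustrated with a minimal late-fusion sketch. This is not the authors' method, which the abstract does not specify beyond the use of qualitative data and machine learning; it is a plain weighted score sum under stated assumptions. The function names, the min-max normalization, the example scores and the modality weights (0.7 for text, 0.3 for image) are all hypothetical, chosen only to show how text content might be given more relevance than other modalities.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the [0, 1] range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_ranked_lists(results_by_modality, weights):
    """Weighted sum of normalized scores across monomodal result lists.

    results_by_modality: {modality: {doc_id: raw_score}}
    weights: {modality: weight}; both are illustrative assumptions.
    Returns (doc_id, fused_score) pairs, best first.
    """
    fused = defaultdict(float)
    for modality, scores in results_by_modality.items():
        w = weights.get(modality, 0.0)
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += w * s
    # Sort by descending score; break ties by document id for stability.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical outputs of a text engine and an image engine for one query.
results = {
    "text": {"d1": 12.0, "d2": 7.0, "d3": 2.0},
    "image": {"d2": 0.9, "d4": 0.6, "d1": 0.3},
}
weights = {"text": 0.7, "image": 0.3}  # assumed: text weighted higher

ranking = fuse_ranked_lists(results, weights)
for doc, score in ranking:
    print(f"{doc}\t{score:.2f}")
```

In a learned variant, the fixed weights would be replaced by parameters estimated from relevance judgments, which is closer in spirit to the machine learning combination the abstract mentions.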

This work has been partially supported by the Spanish Center for Industry Technological Development (CDTI, Ministry of Industry, Tourism and Trade), through the project Buscamedia (CEN-20091026). The authors would like to thank all Buscamedia partners for their knowledge and contributions.

Author information

Authors and Affiliations

DAEDALUS, Data, Decisions And Language, S.A., Avda. de la Albufera, 321, 28031, Madrid, Spain
Ángel Martínez, Sara Lana Serrano & José L. Martínez-Fernández
Advanced Databases Group, Universidad Carlos III de Madrid, Avda. de la Universidad, 30, 28911, Leganés, Spain
José L. Martínez-Fernández & Paloma Martínez

Editor information

Editors and Affiliations

E.T.S. Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, Avenida Complutense 30, 28040, Madrid, Spain
Federico Alvarez
CREATE-NET, Via alla Cascata 56/D, 38123, Povo, Trento, Italy
Cristina Costa

About this paper

Cite this paper

Martínez, Á., Lana Serrano, S., Martínez-Fernández, J.L., Martínez, P. (2012). Multimodal Queries to Access Multimedia Information Sources: First Steps. In: Alvarez, F., Costa, C. (eds) User Centric Media. UCMEDIA 2010. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 60. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35145-7_5

DOI: https://doi.org/10.1007/978-3-642-35145-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35144-0
Online ISBN: 978-3-642-35145-7
eBook Packages: Computer Science, Computer Science (R0)