Abstract
This position paper addresses queries that go beyond text and mix several multimedia formats: audio, video, image and text. Search approaches combining some of these formats have been studied, including query-by-example techniques for situations where only one format is considered. Most of this research, however, does not deal with text content. A new approach is proposed that lets users enter multimodal queries and explore multimedia repositories. For this purpose, the different ranked result lists must be combined to produce the final results shown for a given query. The main goal of the proposal is to reduce the semantic gap between low-level features and high-level concepts in multimedia content. To combine the results of monomodal retrieval systems, the proposal uses qualitative data that gives more relevance to text content, together with machine learning methods. Although it is too soon to report experimental results, a prototype implementing the approach is under development and evaluation.
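The combination of ranked lists described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a simple weighted-sum late fusion over min-max normalised scores, with a fixed, hand-picked weight that favours text and stands in for whatever the machine learning component would learn. The function names, modality labels, document identifiers and scores are all hypothetical.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(results_by_modality, weights):
    """Weighted-sum late fusion of monomodal result lists.

    results_by_modality: {modality: {doc_id: raw_score}}
    weights: {modality: weight}; modalities without an entry get weight 1.0.
    Returns (doc_id, fused_score) pairs sorted by descending fused score,
    with ties broken by doc_id.
    """
    fused = defaultdict(float)
    for modality, scores in results_by_modality.items():
        w = weights.get(modality, 1.0)
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += w * s
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical outputs of two monomodal retrieval systems for one query.
results = {
    "text": {"d1": 12.0, "d2": 8.0, "d3": 2.0},
    "image": {"d2": 0.9, "d3": 0.8, "d4": 0.1},
}
# Assumed weights: text counts twice as much as image.
weights = {"text": 2.0, "image": 1.0}

ranking = fuse(results, weights)
for doc, score in ranking:
    print(doc, round(score, 3))
```

Normalising before summing keeps one modality's score scale from dominating the others. In a learned variant, the fixed weights would be replaced by parameters fitted on relevance judgements.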
This work has been partially supported by the Spanish Center for Industry Technological Development (CDTI, Ministry of Industry, Tourism and Trade), through the project Buscamedia (CEN-20091026). The authors would like to thank all Buscamedia partners for their knowledge and contributions.
Copyright information
© 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Martínez, Á., Lana Serrano, S., Martínez-Fernández, J.L., Martínez, P. (2012). Multimodal Queries to Access Multimedia Information Sources: First Steps. In: Alvarez, F., Costa, C. (eds) User Centric Media. UCMEDIA 2010. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 60. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35145-7_5
DOI: https://doi.org/10.1007/978-3-642-35145-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35144-0
Online ISBN: 978-3-642-35145-7
eBook Packages: Computer Science (R0)