Learning User Queries in Multimodal Dissimilarity Spaces

Bruno, Eric; Moenne-Loccoz, Nicolas; Marchand-Maillet, Stéphane

doi:10.1007/11670834_14

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3877)

Included in the following conference series:

International Workshop on Adaptive Multimedia Retrieval

Abstract

Different strategies to learn user semantic queries from dissimilarity representations of audio-visual content are presented. When dealing with large corpora of video documents, using a feature representation requires the on-line computation of distances between all documents and a query. Hence, a dissimilarity representation may be preferred because its offline computation speeds up the retrieval process. We show how distances computed from the visual and audio features of videos can be used directly to learn complex concepts from a set of positive and negative examples provided by the user. Based on the idea of dissimilarity spaces, we derive three algorithms to fuse modalities and thereby enhance the precision of retrieval results. The evaluation of our technique is performed on artificial data and on the annotated TRECVID corpus.
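
The general idea of a dissimilarity space can be illustrated with a small sketch. It is not one of the three algorithms of the paper, which the abstract does not detail. It assumes one precomputed distance matrix per modality, represents every document by its distances to the user's positive and negative examples, concatenates these vectors across modalities as one simple fusion choice, and ranks documents with a nearest-mean score. The function names, the scoring rule, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def pairwise_distances(features):
    """Euclidean distance matrix, computed once offline for one modality."""
    diff = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(diff, axis=2)


def dissimilarity_features(dist_matrices, prototype_idx):
    """Describe each document by its distances to the prototype documents.

    The per-modality distance vectors are concatenated (an assumed,
    simple fusion scheme, not necessarily the one used in the paper).
    """
    return np.hstack([D[:, prototype_idx] for D in dist_matrices])


def rank_documents(dist_matrices, pos_idx, neg_idx):
    """Rank all documents from the user's positive and negative examples."""
    prototypes = list(pos_idx) + list(neg_idx)
    X = dissimilarity_features(dist_matrices, prototypes)
    # Nearest-mean scoring in the dissimilarity space (illustrative choice):
    # a document scores high when it is close to the positive mean
    # and far from the negative mean.
    mu_pos = X[list(pos_idx)].mean(axis=0)
    mu_neg = X[list(neg_idx)].mean(axis=0)
    score = np.linalg.norm(X - mu_neg, axis=1) - np.linalg.norm(X - mu_pos, axis=1)
    order = np.argsort(-score)
    return order, score


def make_demo_corpus(seed=0):
    """Synthetic corpus: documents 0-9 match the concept, 10-19 do not."""
    rng = np.random.default_rng(seed)
    labels = np.array([0] * 10 + [1] * 10)
    visual = rng.normal(loc=labels[:, None] * 5.0, scale=0.5, size=(20, 3))
    audio = rng.normal(loc=labels[:, None] * 5.0, scale=0.5, size=(20, 3))
    return [pairwise_distances(visual), pairwise_distances(audio)]


if __name__ == "__main__":
    dist_matrices = make_demo_corpus()
    order, score = rank_documents(dist_matrices, pos_idx=[0, 1], neg_idx=[10, 11])
    print("Top 10 documents:", order[:10])
```

Because the distance matrices are computed offline, answering a query in this sketch only needs column lookups and a few vector operations, which is the speed-up the abstract attributes to dissimilarity representations.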

Author information

Authors and Affiliations

Viper group, Computer Vision and Multimedia Laboratory, University of Geneva, Switzerland
Eric Bruno, Nicolas Moenne-Loccoz & Stéphane Marchand-Maillet

Editor information

Editors and Affiliations

Laboratoire d’Informatique de Paris 6, France
Marcin Detyniecki
Department of Computer Science, University of Glasgow, 17 Lilybank Gardens, G12 8QQ, Glasgow, UK
Joemon M. Jose
Fakultät für Informatik, Otto-von-Guericke Universität Magdeburg, Universitätsplatz 2, 39106, Germany
Andreas Nürnberger
Department of Computing Science, University of Glasgow, G12 8QQ, Glasgow, UK
C. J. van Rijsbergen

About this paper

Cite this paper

Bruno, E., Moenne-Loccoz, N., Marchand-Maillet, S. (2006). Learning User Queries in Multimodal Dissimilarity Spaces. In: Detyniecki, M., Jose, J.M., Nürnberger, A., van Rijsbergen, C.J. (eds) Adaptive Multimedia Retrieval: User, Context, and Feedback. AMR 2005. Lecture Notes in Computer Science, vol 3877. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11670834_14

DOI: https://doi.org/10.1007/11670834_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32174-3
Online ISBN: 978-3-540-32175-0
eBook Packages: Computer Science, Computer Science (R0)