Learning User Queries in Multimodal Dissimilarity Spaces

  • Conference paper
Adaptive Multimedia Retrieval: User, Context, and Feedback (AMR 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3877)

Abstract

Different strategies to learn user semantic queries from dissimilarity representations of audio-visual content are presented. When dealing with large corpora of video documents, using a feature representation requires the on-line computation of distances between all documents and a query. Hence, a dissimilarity representation may be preferred because its offline computation speeds up the retrieval process. We show how distances related to the visual and audio features of videos can be used directly to learn complex concepts from a set of positive and negative examples provided by the user. Based on the idea of dissimilarity spaces, we derive three algorithms to fuse modalities and thereby enhance the precision of retrieval results. The technique is evaluated on artificial data and on the annotated TRECVID corpus.
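The general idea of a dissimilarity space can be illustrated with a minimal sketch. It is not one of the three algorithms of the paper, which the abstract does not detail. It assumes that one distance matrix per modality has been computed offline, represents each document by its distances to the user's positive and negative examples, fuses the modalities by simple concatenation, and ranks documents with a nearest-mean rule. The function names, the fusion scheme, the scoring rule, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance matrix between the rows of x (the offline step)."""
    return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)


def rank_documents(dist_by_modality, positives, negatives):
    """Rank all documents, most relevant first, from precomputed distances.

    dist_by_modality: list of (n, n) distance matrices, one per modality.
    positives, negatives: indices of the user's example documents.
    """
    examples = list(positives) + list(negatives)
    # Dissimilarity space: each document is described by its distances
    # to the example documents. Concatenating the per-modality vectors
    # is an assumed, simple form of fusion.
    feats = np.hstack([d[:, examples] for d in dist_by_modality])
    pos_mean = feats[positives].mean(axis=0)
    neg_mean = feats[negatives].mean(axis=0)
    # Nearest-mean score (assumed): positive when a document lies closer
    # to the positive centroid than to the negative one.
    score = (np.linalg.norm(feats - neg_mean, axis=1)
             - np.linalg.norm(feats - pos_mean, axis=1))
    return np.argsort(-score)


# Toy data (hypothetical): documents 0-9 belong to the target concept,
# documents 10-19 do not, in both the visual and the audio modality.
rng = np.random.default_rng(0)
centers = np.repeat([[0.0, 0.0], [10.0, 10.0]], 10, axis=0)
visual = centers + rng.normal(scale=0.5, size=(20, 2))
audio = centers + rng.normal(scale=0.5, size=(20, 2))

# Distances are computed once, offline; only the matrices are used at query time.
dist_visual = pairwise_distances(visual)
dist_audio = pairwise_distances(audio)

ranking = rank_documents([dist_visual, dist_audio],
                         positives=[0, 1], negatives=[10, 11])
print(ranking[:10])
```

At query time only columns of the stored matrices are read, which is the speed argument made in the abstract: no feature-space distance has to be computed on-line.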

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bruno, E., Moenne-Loccoz, N., Marchand-Maillet, S. (2006). Learning User Queries in Multimodal Dissimilarity Spaces. In: Detyniecki, M., Jose, J.M., Nürnberger, A., van Rijsbergen, C.J. (eds) Adaptive Multimedia Retrieval: User, Context, and Feedback. AMR 2005. Lecture Notes in Computer Science, vol 3877. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11670834_14

  • DOI: https://doi.org/10.1007/11670834_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32174-3

  • Online ISBN: 978-3-540-32175-0

  • eBook Packages: Computer Science, Computer Science (R0)
