A Multimodal Information Collector for Content-Based Image Retrieval System

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7064)

Abstract

Explicit relevance feedback requires the user to explicitly refine the search queries for content-based image retrieval. This may become laborious or even impossible due to the ever-increasing volume of digital databases. We present a multimodal information collector that can unobtrusively record and asynchronously transmit the user’s implicit relevance feedback on a displayed image to the remote CBIR server for assisting in retrieving relevant images. The modalities of user interaction include eye movements, pointer tracks and clicks, keyboard strokes, and audio including speech. The client-side information collector has been implemented as a browser extension using the JavaScript programming language and has been integrated with an existing CBIR server. We verify its functionality by evaluating the performance of the gaze-enhanced CBIR system in on-line image tagging tasks.
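The record-then-transmit behaviour described above can be illustrated with a minimal JavaScript sketch. It is not the authors' implementation: the function name `createCollector`, the event fields, the batch size, and the `send` callback are all hypothetical, chosen only to show how interaction events from several modalities might be buffered and handed to an asynchronous transport without blocking the page.

```javascript
// Hypothetical sketch only: names, event shape, and batch size are
// assumptions for illustration, not the paper's actual collector.
function createCollector(send, maxBatch = 50) {
  let buffer = [];

  // Hand all buffered events to the transport and report how many were sent.
  async function flush() {
    if (buffer.length === 0) return 0;
    const batch = buffer;
    buffer = []; // start a fresh buffer so new events are not lost
    await send(batch); // `send` may return a promise (e.g. a network call)
    return batch.length;
  }

  // Record one interaction event for the image currently displayed.
  function record(modality, imageId, data, timestamp = Date.now()) {
    buffer.push({ modality, imageId, data, timestamp });
    // Transmit once the batch is full; otherwise keep collecting.
    return buffer.length >= maxBatch ? flush() : Promise.resolve(0);
  }

  return { record, flush, pending: () => buffer.length };
}

// Usage with an in-memory stand-in for the remote CBIR server.
const sentBatches = [];
const collector = createCollector(async (batch) => {
  sentBatches.push(batch);
}, 2);

collector.record("pointer", "img-001", { x: 10, y: 20 }, 1);
collector.record("gaze", "img-001", { x: 12, y: 18, durationMs: 240 }, 2);
```

In a browser extension, the `send` callback would wrap whatever asynchronous request mechanism the extension uses to reach the server; the sketch keeps it as a parameter so the buffering logic stays independent of the transport.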

Keywords

  • Implicit relevance feedback
  • JavaScript
  • gaze tracking
  • content-based image retrieval
  • image tagging

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, H., Sjöberg, M., Laaksonen, J., Oja, E. (2011). A Multimodal Information Collector for Content-Based Image Retrieval System. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7064. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24965-5_83

  • DOI: https://doi.org/10.1007/978-3-642-24965-5_83

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24964-8

  • Online ISBN: 978-3-642-24965-5

  • eBook Packages: Computer Science, Computer Science (R0)