MirBot: A Multimodal Interactive Image Retrieval System

  • Antonio Pertusa
  • Antonio-Javier Gallego
  • Marisa Bernabeu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7887)

Abstract

This study presents MirBot, a multimodal interactive image retrieval system for smartphones. The application is designed as a collaborative game in which users categorize photographs according to the WordNet hierarchy. After taking a picture, the user can select the region of interest of the target, and the image information is sent to a server together with a set of metadata to classify the object. The user can then validate the category proposed by the system, which improves future queries. The result is a labeled database with a structure similar to ImageNet, but with contents selected by the users, fully marked with regions of interest, and with novel metadata that can be useful to constrain the search space in future work. The MirBot app is freely available on the Apple App Store.
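
The query-and-validate loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `Query` and `LabeledDatabase`, the plain feature vectors standing in for image descriptors, the nearest-neighbor rule, and the category strings standing in for WordNet categories. The metadata is stored with each entry but is not used for classification here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Query:
    """Hypothetical client request: descriptor, region of interest, metadata."""
    features: List[float]              # stand-in for descriptors computed inside the ROI
    roi: Tuple[int, int, int, int]     # (x, y, width, height) selected by the user
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class LabeledDatabase:
    """Hypothetical server-side store of (query, category) pairs."""
    entries: List[Tuple[Query, str]] = field(default_factory=list)

    def classify(self, query: Query) -> str:
        """Propose the category of the nearest stored entry (1-NN, an assumption)."""
        if not self.entries:
            return "unknown"

        def distance(entry: Tuple[Query, str]) -> float:
            stored, _ = entry
            return sum((a - b) ** 2 for a, b in zip(stored.features, query.features))

        _, category = min(self.entries, key=distance)
        return category

    def validate(self, query: Query, category: str) -> None:
        """Store the user-confirmed or corrected category for future queries."""
        self.entries.append((query, category))


# Seed the database with two validated examples (made-up values).
db = LabeledDatabase()
db.validate(Query([0.9, 0.1], (10, 10, 50, 50), {"flash": "off"}), "dog")
db.validate(Query([0.1, 0.8], (0, 0, 64, 64), {"flash": "on"}), "mug")

# A new query is classified, then the user's validation is fed back.
new_query = Query([0.85, 0.2], (5, 5, 40, 40), {"flash": "off"})
proposed = db.classify(new_query)
db.validate(new_query, proposed)
```

Each validated query enlarges the labeled database, which is how user feedback can improve later proposals in a design of this kind.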

Keywords

Image retrieval, multimodality, interactive labeling

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Antonio Pertusa (1)
  • Antonio-Javier Gallego (1)
  • Marisa Bernabeu (1)
  1. DLSI, University of Alicante, Spain
