Indexing heterogeneous features with superimages

  • Qingjun Luo
  • Shiliang Zhang
  • Tiejun Huang
  • Wen Gao
  • Qi Tian (corresponding author)
Regular Paper


As an important procedure in image retrieval, off-line indexing organizes relevant images together and largely determines the efficiency, accuracy, and memory cost of the retrieval system. Because an image contains multi-level visual and semantic clues, a good indexing strategy should reflect this multi-level relevance. However, most existing indexing strategies treat database images individually and consider only partial relevance, i.e., relevance reflected by either local or global features. To overcome these issues and design a better indexing strategy, we propose to package semantically relevant images into superimages and then index superimages instead of single images. A superimage packages multiple images into one new unit and hence significantly decreases the number of units to be indexed, which reduces both memory cost and retrieval time. To make the final index file discriminative to both visual and semantic relevance, we extract local descriptors from superimages and index them with an inverted file. During online retrieval, we only need to extract local descriptors from queries, yet we obtain semantic-aware retrieval results. This is because, during the off-line indexing stage, both semantically and visually relevant images are organized together by indexing heterogeneous features in superimages. Our approach therefore compares favorably with many online retrieval fusion algorithms in terms of retrieval efficiency and memory consumption. Moreover, extensive experiments on multiple retrieval tasks demonstrate the promising accuracy of our approach.
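The indexing idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each image has already been quantized into visual-word ids and already carries a semantic group id, and it scores superimages with a simple histogram intersection, which is a stand-in chosen for brevity. The function names (`build_superimages`, `build_inverted_file`, `query`) and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_superimages(image_words, image_group):
    """Pool the visual words of all images that share a semantic group id.

    image_words: image id -> list of visual-word ids (assumed precomputed)
    image_group: image id -> group id (assumed given, e.g. by some clustering)
    """
    superimages = defaultdict(Counter)
    members = defaultdict(list)
    for image_id, words in image_words.items():
        group = image_group[image_id]
        superimages[group].update(words)  # accumulate word counts per superimage
        members[group].append(image_id)
    return superimages, members


def build_inverted_file(superimages):
    """Map each visual word to the superimages containing it, with term counts."""
    index = defaultdict(dict)
    for superimage_id, counts in superimages.items():
        for word, tf in counts.items():
            index[word][superimage_id] = tf
    return index


def query(index, query_words):
    """Rank superimages by histogram intersection with the query's words."""
    scores = Counter()
    for word, q_tf in Counter(query_words).items():
        for superimage_id, tf in index.get(word, {}).items():
            scores[superimage_id] += min(q_tf, tf)
    return [superimage_id for superimage_id, _ in scores.most_common()]


# Toy data (hypothetical): four images packaged into two superimages.
image_words = {"a1": [1, 2, 3], "a2": [3, 4], "b1": [7, 8], "b2": [8, 9, 9]}
image_group = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

superimages, members = build_superimages(image_words, image_group)
index = build_inverted_file(superimages)
ranking = query(index, [3, 4, 8])
```

In this sketch the inverted file holds two superimage entries instead of four image entries, which is the source of the memory and time savings the abstract describes. A query only needs its own visual words, and the images returned for a top-ranked superimage can be read from `members`.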


Image indexing, Large-scale visual search, Superimage, Semantic feature



This work was supported in part by awards to Dr. Qi Tian: ARO grant W911NF-12-1-0057, a Faculty Research Award from NEC Laboratories of America, and a 2012 UTSA START-R Research Award. It was also supported in part by NSFC grant 61128007.



Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  • Qingjun Luo (1)
  • Shiliang Zhang (2)
  • Tiejun Huang (1)
  • Wen Gao (1)
  • Qi Tian (2), corresponding author
  1. School of EECS, Peking University, Beijing, China
  2. Department of Computer Science, University of Texas at San Antonio, San Antonio, USA
