Fast Search of Binary Codes with Distinctive Bits

  • Yanping Ma
  • Hongtao Xie
  • Zhineng Chen
  • Qiong Dai
  • Yinfei Huang
  • Guangrong Ji
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8879)


Although the distance between binary codes can be computed quickly in Hamming space, linear search is not practical for large-scale datasets. Attention has therefore turned to efficient approximate nearest neighbor search, for which Hierarchical Clustering Trees (HCT) is the state-of-the-art method. However, HCT builds its index from the whole binary codes, which degrades search performance. In this paper, we first propose an algorithm that compresses binary codes by extracting distinctive bits according to the standard deviation of each bit. We then propose a new index that uses the compressed binary codes and is based on hierarchical decomposition of binary spaces. Experiments conducted on reference datasets and on a dataset of one billion binary codes demonstrate the effectiveness and efficiency of our method.
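
To make the compression step concrete, the following is a minimal sketch of bit selection by standard deviation, followed by a linear Hamming search over the compressed codes. It is not the authors' implementation: the abstract does not give the selection rule, so the sketch assumes that the k bits with the largest standard deviation are the distinctive ones. The function names, the representation of codes as Python integers, and the parameter k are all hypothetical, and the hierarchical index itself is not modeled.

```python
import math


def bit_std(codes, n_bits):
    """Standard deviation of each bit position over the dataset.

    For a 0/1 variable with mean p, the population standard
    deviation is sqrt(p * (1 - p)).
    """
    n = len(codes)
    stds = []
    for i in range(n_bits):
        p = sum((c >> i) & 1 for c in codes) / n
        stds.append(math.sqrt(p * (1 - p)))
    return stds


def select_distinctive_bits(codes, n_bits, k):
    """ASSUMPTION: keep the k bit positions with the largest std."""
    stds = bit_std(codes, n_bits)
    ranked = sorted(range(n_bits), key=lambda i: stds[i], reverse=True)
    return sorted(ranked[:k])


def compress(code, positions):
    """Pack the selected bit positions of `code` into a shorter code."""
    out = 0
    for j, i in enumerate(positions):
        out |= ((code >> i) & 1) << j
    return out


def hamming(a, b):
    """Hamming distance: XOR, then count the set bits."""
    return bin(a ^ b).count("1")


def nearest(query, compressed_db):
    """Linear scan baseline: index of the closest compressed code."""
    return min(range(len(compressed_db)),
               key=lambda idx: hamming(query, compressed_db[idx]))
```

In this sketch a bit that is constant across the dataset has a standard deviation of zero and is dropped first, since it cannot help distinguish one code from another. The query must be compressed with the same bit positions as the database before calling `nearest`.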


binary codes, approximate nearest neighbor search, binary indexing





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Yanping Ma (1)
  • Hongtao Xie (2)
  • Zhineng Chen (3)
  • Qiong Dai (2)
  • Yinfei Huang (4)
  • Guangrong Ji (1)

  1. School of Information Science and Engineering, Ocean University of China, Qingdao, China
  2. National Engineering Laboratory for Information Security Technologies, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
  3. Interactive Digital Media Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  4. Shanghai Stock Exchange, Shanghai, China
