A Fast Image Retrieval Method Based on A Quantization Tree

  • Xiaochun Wang
  • Xiali Wang
  • Don Mitchell Wilkes


Traditional content-based image retrieval describes image content with low-level features, which gives rise to the “semantic gap” problem. Fusing the visual spatial semantic content of images into content-based retrieval algorithms is therefore important for improving retrieval accuracy. To strengthen image analysis and understanding in retrieval algorithms, and thereby improve their accuracy, this chapter proposes a new image retrieval methodology. It begins by partitioning images into small overlapping image patches, from which feature vectors in the form of color histograms are extracted. Next, feature vectors extracted from a small number of training images are clustered to obtain a knowledge base of visual vocabulary words; every image in the database is then parsed with this knowledge base, and its size-reduced version is stored separately and subsequently used as an index. Finally, the query image is partitioned and parsed, its similarity to each indexed image in the database is calculated, and the most similar images are returned. The focus of this chapter is the Quantization Tree, a fast visual vocabulary tree (a fast approximate nearest neighbor search tree) that parses query images quickly and so partially bridges the semantic gap between semantic image content and low-level image features. Extensive experiments on object recognition tasks in both indoor and outdoor environments demonstrate the effectiveness and efficiency of the Quantization Tree.
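
The pipeline described above can be illustrated with a minimal sketch. It is not the chapter's implementation: plain k-means stands in for the MST-based clustering, a generic median-split tree without backtracking stands in for the Quantization Tree, and histogram intersection is assumed as the similarity measure. All function names, parameter values (patch size, stride, bin count, vocabulary size), and the synthetic images are hypothetical.

```python
import numpy as np

def patch_histograms(image, patch=8, stride=4, bins=4):
    """One L1-normalized color histogram (bins**3 entries) per overlapping patch."""
    h, w, _ = image.shape
    q = (image.astype(np.int64) * bins) // 256           # quantize each 8-bit channel
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    feats = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            counts = np.bincount(idx[y:y + patch, x:x + patch].ravel(),
                                 minlength=bins ** 3)
            feats.append(counts / counts.sum())
    return np.array(feats)

def build_vocabulary(features, k=16, iters=10, seed=0):
    """Assumption: k-means replaces the MST-based clustering used in the chapter."""
    rng = np.random.default_rng(seed)
    words = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((features[:, None, :] - words[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):                              # keep old word if cluster is empty
                words[j] = members.mean(axis=0)
    return words

def build_tree(words, ids=None, leaf_size=2):
    """Assumption: a median-split tree as a stand-in for the Quantization Tree."""
    if ids is None:
        ids = np.arange(len(words))
    if len(ids) <= leaf_size:
        return {"ids": ids}
    dim = words[ids].var(axis=0).argmax()                 # split on the most spread-out bin
    vals = words[ids, dim]
    thr = np.median(vals)
    left, right = ids[vals <= thr], ids[vals > thr]
    if len(right) == 0:                                   # all values equal: make a leaf
        return {"ids": ids}
    return {"dim": dim, "thr": thr,
            "left": build_tree(words, left, leaf_size),
            "right": build_tree(words, right, leaf_size)}

def parse(tree, words, feature):
    """Descend to one leaf, then pick the closest word there (approximate search)."""
    node = tree
    while "ids" not in node:
        node = node["left"] if feature[node["dim"]] <= node["thr"] else node["right"]
    ids = node["ids"]
    return ids[((words[ids] - feature) ** 2).sum(axis=1).argmin()]

def index_image(image, tree, words):
    """Size-reduced representation: normalized frequency of each vocabulary word."""
    ids = np.array([parse(tree, words, f) for f in patch_histograms(image)])
    hist = np.bincount(ids, minlength=len(words)).astype(float)
    return hist / hist.sum()

def retrieve(query_index, db_indexes, top=3):
    """Rank database images by histogram intersection with the query index."""
    scores = np.array([np.minimum(query_index, d).sum() for d in db_indexes])
    return np.argsort(-scores)[:top], scores

# Hypothetical data: six random 32x32 RGB images, each biased toward its own color range.
rng = np.random.default_rng(1)
database = [rng.integers(40 * i, 40 * i + 50, size=(32, 32, 3), dtype=np.uint8)
            for i in range(6)]
train_feats = np.vstack([patch_histograms(img) for img in database[::2]])
words = build_vocabulary(train_feats)
tree = build_tree(words)
db_indexes = [index_image(img, tree, words) for img in database]
query_index = index_image(database[3].copy(), tree, words)
ranked, scores = retrieve(query_index, db_indexes)
print("most similar database images:", ranked)
```

Because the tree visits a single leaf per patch, each lookup compares against only a few vocabulary words rather than the whole vocabulary; the trade-off is that the returned word is not guaranteed to be the true nearest neighbor.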


Color histogram, Image patch, MST-based clustering algorithm, Vocabulary tree, Random quantization tree



Copyright information

© Xi'an Jiaotong University Press 2020

Authors and Affiliations

  • Xiaochun Wang (1, corresponding author)
  • Xiali Wang (2)
  • Don Mitchell Wilkes (3)

  1. School of Software Engineering, Xi’an Jiaotong University, Xi’an, China
  2. School of Information Engineering, Chang’an University, Xi’an, China
  3. Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, USA
