EBS k-d Tree: An Entropy Balanced Statistical k-d Tree for Image Databases with Ground-Truth Labels

  • Grant J. Scott
  • Chi-Ren Shyu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2728)


In this paper we present a new image database indexing structure, the Entropy Balanced Statistical (EBS) k-d tree. This indexing mechanism uses the statistical properties and ground-truth labeling of image data for efficient and accurate searches. It is particularly valuable in medical and biological image database retrieval, where ground-truth labels are available and archived with the images. The EBS k-d tree extends the statistical k-d tree and attempts to optimize a multi-dimensional decision tree based on the fundamental principles from which it is constructed. Our approach is to develop and validate the notion of an entropy-balanced, statistically based decision tree. We show that making balanced split decisions while the tree is grown improves the average search depth and usually improves the worst-case search depth dramatically. Furthermore, we developed a method for linking the tree leaves into a non-linear structure to increase the accuracy of n-nearest neighbor similarity searches. We have applied the index to a large-scale medical diagnostic image database and have shown increases in search speed and accuracy over both an ordinary distance-based search and the original statistical k-d tree index.
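The abstract does not state the split criterion, so the following is only an illustrative sketch of what a label-aware, balance-seeking split decision could look like, not the authors' method. It assumes that a candidate split is scored by the weighted class entropy of the two children plus a penalty for unequal child sizes. The names `entropy`, `choose_split`, and `balance_weight` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of ground-truth class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def choose_split(points, labels, balance_weight=0.5):
    """Pick the (dimension, threshold, score) with the lowest score.

    Assumed scoring rule (not taken from the paper):
        score = weighted child entropy
                + balance_weight * |n_left - n_right| / n
    A low score favors children that are both pure and similar in size.
    Returns None when no dimension has two distinct values to split on.
    """
    n = len(points)
    best = None
    for dim in range(len(points[0])):
        values = sorted({p[dim] for p in points})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [lab for p, lab in zip(points, labels) if p[dim] <= threshold]
            right = [lab for p, lab in zip(points, labels) if p[dim] > threshold]
            impurity = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            imbalance = abs(len(left) - len(right)) / n
            score = impurity + balance_weight * imbalance
            if best is None or score < best[2]:
                best = (dim, threshold, score)
    return best


# Toy feature vectors with two ground-truth classes.
points = [(0.0, 5.0), (1.0, 4.0), (2.0, 7.0), (3.0, 6.0)]
labels = ["a", "a", "b", "b"]
best = choose_split(points, labels)
```

Applying such a rule recursively to each child would grow a tree whose leaves tend to sit at similar depths, which is the kind of effect on average and worst-case search depth that the abstract describes. The relative weight of purity and balance is a free parameter in this sketch.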


Statistical k-d tree, entropy, multi-dimensional index, image database, content-based image retrieval (CBIR)





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Grant J. Scott (1)
  • Chi-Ren Shyu (1)
  1. Department of Computer Engineering and Computer Science, University of Missouri-Columbia, Columbia, USA
