The Onion-Tree: Quick Indexing of Complex Data in the Main Memory

  • Caio César Mori Carélo
  • Ives Renê Venturini Pola
  • Ricardo Rodrigues Ciferri
  • Agma Juci Machado Traina
  • Caetano Traina-Jr.
  • Cristina Dutra de Aguiar Ciferri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5739)

Abstract

Searching a dataset for elements that are similar to a given query element is a core problem in applications that handle complex data, and it is typically supported by a metric access method (MAM). A growing number of these applications require indices that can be built faster and multiple times, in addition to providing shorter response times for similarity queries. Moreover, the growing capacity and falling cost of main memory motivate the use of memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that performs a hierarchical division of the metric space into disjoint subspaces. The Onion-tree is very compact, requiring only a small fraction of the main memory (e.g., at most 4.8%). Comparisons of the Onion-tree with a memory-based version of the Slim-tree and with the memory-based MM-tree showed that the Onion-tree always took the least elapsed time to build the index. Our experiments also showed that the Onion-tree produced the best query performance, followed by the MM-tree, which in turn outperformed the Slim-tree. Compared with the MM-tree, its closest competitor, the Onion-tree reduced the number of distance calculations by 1% to 11% in range queries and by 16% to 64% in k-NN queries. It also reduced the elapsed time by 12% to 39% in range query processing and by 40% to 70% in k-NN query processing. The Onion-tree source code is available at http://gbd.dc.ufscar.br/download/Onion-tree .
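
For readers unfamiliar with the two query types and the cost measure named in the abstract, the following is a minimal sketch of a range query and a k-NN query answered by a plain linear scan, with a wrapper that counts distance calculations. It is not the Onion-tree and does not reproduce any result from the paper; it only illustrates the baseline that a MAM tries to beat by pruning distance computations. All names (`CountingMetric`, `range_query`, `knn_query`) and the choice of the Euclidean distance are assumptions made for illustration.

```python
import heapq


class CountingMetric:
    """Wraps a distance function and counts how many times it is called."""

    def __init__(self, dist):
        self.dist = dist
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.dist(a, b)


def euclidean(a, b):
    """Euclidean distance, used here only as an example metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def range_query(data, query, radius, d):
    """Return every element within `radius` of `query` (linear scan)."""
    return [x for x in data if d(query, x) <= radius]


def knn_query(data, query, k, d):
    """Return the k elements closest to `query` (linear scan)."""
    # One distance calculation per element, then keep the k smallest.
    scored = [(d(query, x), x) for x in data]
    return [x for _, x in heapq.nsmallest(k, scored, key=lambda t: t[0])]


data = [(0, 0), (1, 0), (0, 2), (3, 4), (5, 5)]
d = CountingMetric(euclidean)

in_range = range_query(data, (0, 0), 2.0, d)
range_calls = d.calls

nearest = knn_query(data, (0, 0), 2, d)
knn_calls = d.calls - range_calls
```

A linear scan always performs one distance calculation per stored element for each query. An index that divides the space into subspaces aims to discard whole subspaces without computing distances to their elements, which is why the number of distance calculations is a natural measure for comparing access methods.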

Keywords

metric access method; complex data; similarity search

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Caio César Mori Carélo (1)
  • Ives Renê Venturini Pola (1)
  • Ricardo Rodrigues Ciferri (2)
  • Agma Juci Machado Traina (1)
  • Caetano Traina-Jr. (1)
  • Cristina Dutra de Aguiar Ciferri (1)

  1. Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil
  2. Departamento de Computação, Universidade Federal de São Carlos, São Carlos, Brazil