The Onion-Tree: Quick Indexing of Complex Data in the Main Memory
Searching a dataset for elements that are similar to a given query element is a core problem in applications that handle complex data, and it is typically carried out with the aid of a metric access method (MAM). A growing number of these applications require indices that can be built quickly and repeatedly, in addition to providing short response times for similarity queries. Moreover, the growth in main memory capacity and its falling cost motivate the use of memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that performs a hierarchical division of the metric space into disjoint subspaces. The Onion-tree is very compact, requiring a small fraction of the main memory (e.g., at most 4.8%). Comparisons of the Onion-tree, a memory-based version of the Slim-tree, and the memory-based MM-tree showed that the Onion-tree always required the least elapsed time to build the index. Our experiments also showed that the Onion-tree produced the best query performance, followed by the MM-tree, which in turn outperformed the Slim-tree. Compared with the MM-tree, the Onion-tree reduced the number of distance calculations by 1% to 11% in range queries and by 16% to 64% in k-NN queries. The Onion-tree also significantly reduced the elapsed time relative to the MM-tree, its closest competitor: by 12% to 39% in range query processing and by 40% to 70% in k-NN query processing. The Onion-tree source code is available at http://gbd.dc.ufscar.br/download/Onion-tree .
Keywords: metric access method, complex data, similarity search
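The abstract measures query cost in distance calculations for two kinds of similarity query, range and k-NN. As a point of reference, the following is a minimal sketch of both queries answered by a plain linear scan, which evaluates the distance once per stored element. It is not the Onion-tree algorithm or any other MAM; it only illustrates the baseline cost that such indices aim to reduce. The names CountingMetric, range_query and knn_query, and the choice of Euclidean distance, are assumptions made for illustration.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


class CountingMetric:
    """Wraps a distance function and counts how often it is evaluated."""

    def __init__(self, dist: Metric) -> None:
        self.dist = dist
        self.calls = 0

    def __call__(self, a: Point, b: Point) -> float:
        self.calls += 1
        return self.dist(a, b)


def euclidean(a: Point, b: Point) -> float:
    # Euclidean distance is one example of a metric; any function that
    # satisfies the metric axioms could be used instead.
    return math.dist(a, b)


def range_query(data: List[Point], query: Point, radius: float,
                d: Metric) -> List[Point]:
    """Return every element whose distance to the query is at most radius."""
    return [x for x in data if d(x, query) <= radius]


def knn_query(data: List[Point], query: Point, k: int,
              d: Metric) -> List[Tuple[float, Point]]:
    """Return the k elements closest to the query as (distance, element)."""
    # Compute each distance exactly once, then pick the k smallest.
    pairs = [(d(x, query), x) for x in data]
    return heapq.nsmallest(k, pairs, key=lambda p: p[0])


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
    metric = CountingMetric(euclidean)
    print(range_query(points, (0.0, 0.0), 1.5, metric))
    print(knn_query(points, (0.0, 0.0), 2, metric))
    print("distance calculations:", metric.calls)
```

With a linear scan, each query costs exactly one distance calculation per stored element; the percentage reductions reported in the abstract are relative to the MM-tree, not to this baseline.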