Abstract
Clustering-based methods for searching in metric spaces partition the space into a set of disjoint clusters. When solving a query, some clusters are discarded without comparing their objects with the query object, and the clusters that cannot be discarded are searched exhaustively. In this paper we propose a new strategy and algorithms for clustering-based methods that avoid the exhaustive search within clusters that cannot be discarded, at the cost of storing some extra information in the index. This new strategy is based on progressively reducing the cluster until it can be discarded from the result. We refer to this approach as cluster reduction. We present the algorithms for range and kNN search. The results obtained in an experimental evaluation with synthetic and real collections show that the search cost can be reduced by approximately 13% to 25% with respect to existing methods.
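To make the baseline concrete, the following is a minimal sketch of the conventional clustering-based range search that the abstract takes as its starting point, not the cluster reduction method itself, whose details are not given here. It assumes each cluster is represented by a center and a covering radius, and that a cluster is discarded through the triangle inequality when d(q, c) - r_c > r. The names `build_clusters` and `range_search`, the Euclidean distance, and the sample data are illustrative assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
# Hypothetical cluster layout: (center, covering radius, member objects).
Cluster = Tuple[Point, float, List[Point]]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    # Any metric would do; Euclidean distance is only an assumption here.
    return math.dist(a, b)


def build_clusters(points: Sequence[Point], centers: Sequence[Point],
                   dist: Distance = euclidean) -> List[Cluster]:
    """Assign each object to its nearest center and record covering radii."""
    members: List[List[Point]] = [[] for _ in centers]
    radii = [0.0] * len(centers)
    for p in points:
        ds = [dist(p, c) for c in centers]
        i = min(range(len(centers)), key=ds.__getitem__)
        members[i].append(p)
        radii[i] = max(radii[i], ds[i])
    return [(c, rc, m) for c, rc, m in zip(centers, radii, members)]


def range_search(clusters: Sequence[Cluster], q: Point, r: float,
                 dist: Distance = euclidean) -> Tuple[List[Point], int]:
    """Return the objects within distance r of q and the number of
    distance evaluations performed."""
    result: List[Point] = []
    evaluations = 0
    for center, covering_radius, objs in clusters:
        d_qc = dist(q, center)
        evaluations += 1
        # Triangle inequality: no member can lie within r of q.
        if d_qc - covering_radius > r:
            continue  # cluster discarded without visiting its objects
        # The cluster cannot be discarded, so it is searched exhaustively.
        for p in objs:
            evaluations += 1
            if dist(q, p) <= r:
                result.append(p)
    return result, evaluations


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
clusters = build_clusters(points, centers=[(0.0, 0.0), (10.0, 10.0)])
found, cost = range_search(clusters, q=(0.5, 0.0), r=1.0)
```

In this sketch the far cluster is discarded after a single distance evaluation, while the near cluster is scanned in full. That exhaustive scan is the cost the proposed strategy aims to avoid.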
This work has been partially funded by “Ministerio de Ciencia y Innovación” (PGE and FEDER) refs. TIN2009-14560-C03-02, TIN2010-21246-C02-01, and ref. AP2010-6038 (FPU Program) for Alberto Ordóñez Pereira, and by “Xunta de Galicia” refs. 2010/17 (Fondos FEDER), and 10SIN028E.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ares, L.G., Brisaboa, N.R., Ordóñez Pereira, A., Pedreira, O. (2012). Efficient Similarity Search in Metric Spaces with Cluster Reduction. In: Navarro, G., Pestov, V. (eds) Similarity Search and Applications. SISAP 2012. Lecture Notes in Computer Science, vol 7404. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32153-5_6
DOI: https://doi.org/10.1007/978-3-642-32153-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32152-8
Online ISBN: 978-3-642-32153-5
eBook Packages: Computer Science, Computer Science (R0)