Abstract
Efficient and effective processing of the distance-based join query (DJQ) is of great importance in spatial databases because of the wide range of applications that may pose such queries (mapping, urban planning, transportation planning, resource management, etc.). The most representative and most studied DJQs are the K Closest Pairs Query (KCPQ) and the ε Distance Join Query (εDJQ). These spatial queries involve two spatial data sets and a distance function that measures the degree of closeness, along with either a given number of pairs in the final result (K) or a distance threshold (ε). In this paper, we propose four new plane-sweep-based algorithms for KCPQs and their extensions for εDJQs in the context of spatial databases, without the use of an index for either of the two disk-resident data sets (since building and using indexes does not always benefit processing performance). They employ a combination of plane-sweep algorithms and space partitioning techniques to join the data sets. Finally, we present the results of an extensive experimental study that compares the efficiency and effectiveness of the proposed algorithms for KCPQs and εDJQs. This performance study, conducted on medium and big spatial data sets (real and synthetic), validates that the proposed plane-sweep-based algorithms are very promising in terms of both efficiency and effectiveness measures when neither input is indexed. Moreover, the best of the new algorithms is experimentally compared to the best algorithm that is based on the R-tree (a widely accepted access method), for KCPQs and εDJQs, using the same data sets. This comparison shows that the new algorithms outperform R-tree-based algorithms in most cases.
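To make the two query types concrete, the following is a minimal in-memory sketch of a classic plane-sweep evaluation of a KCPQ and an εDJQ over two unindexed point sets. It is not one of the four algorithms proposed in the paper, and it ignores disk residency and space partitioning. The function names (`dist`, `sweep_pairs`, `kcpq`, `edjq`), the representation of points as `(x, y)` tuples, and the use of the Euclidean distance are assumptions made only for illustration.

```python
import heapq
from math import hypot, inf

def dist(p, q):
    """Euclidean distance between two 2-D points given as (x, y) tuples."""
    return hypot(p[0] - q[0], p[1] - q[1])

def sweep_pairs(P, Q, bound):
    """Yield (distance, p, q) candidates, sweeping left to right along x.

    bound() returns the current pruning distance; it is re-read on every
    step, so a caller may tighten it while the sweep is running.
    """
    P, Q = sorted(P), sorted(Q)  # order both sets on the sweep axis
    i = j = 0
    while i < len(P) and j < len(Q):
        # The leftmost unprocessed point of either set becomes the reference.
        if P[i][0] <= Q[j][0]:
            ref, others, start, ref_in_p = P[i], Q, j, True
            i += 1
        else:
            ref, others, start, ref_in_p = Q[j], P, i, False
            j += 1
        for t in range(start, len(others)):
            other = others[t]
            # Points further right are at least this far away on x alone.
            if other[0] - ref[0] > bound():
                break
            p, q = (ref, other) if ref_in_p else (other, ref)
            yield dist(p, q), p, q

def kcpq(P, Q, k):
    """K closest pairs (hypothetical helper): the k smallest (distance, p, q)."""
    if k <= 0:
        return []
    heap = []  # max-heap on distance, stored as negated values

    def bound():
        # Until k pairs are known, nothing can be pruned.
        return -heap[0][0] if len(heap) == k else inf

    for d, p, q in sweep_pairs(P, Q, bound):
        if len(heap) < k:
            heapq.heappush(heap, (-d, p, q))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, p, q))
    return sorted((-nd, p, q) for nd, p, q in heap)

def edjq(P, Q, eps):
    """ε distance join (hypothetical helper): all pairs with distance <= eps."""
    return sorted((d, p, q) for d, p, q in sweep_pairs(P, Q, lambda: eps) if d <= eps)
```

Calling `kcpq(P, Q, 3)` returns the three closest pairs in ascending order of distance, while `edjq(P, Q, 0.5)` returns every pair whose distance is at most 0.5. The only difference between the two queries in this sketch is the pruning bound: for the εDJQ it is the fixed threshold ε, whereas for the KCPQ it is the distance of the K-th closest pair found so far, which shrinks as the sweep proceeds.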
References
Roumelis G, Vassilakopoulos M, Corral A, Manolopoulos Y (2014) A new plane-sweep algorithm for the k-closest-pairs query. In: SOFSEM conference, pp 478–490
Güting RH (1994) An introduction to spatial database systems. VLDB J 3(4):357–399
Shekhar S, Chawla S (2003) Spatial databases - a tour. Prentice Hall
Gaede V, Günther O (1998) Multidimensional access methods. ACM Comput Surv 30(2):170–231
Corral A, Manolopoulos Y, Theodoridis Y, Vassilakopoulos M (2000) Closest pair queries in spatial databases. In: SIGMOD conference, pp 189–200
Corral A, Manolopoulos Y, Theodoridis Y, Vassilakopoulos M (2004) Algorithms for processing k-closest-pair queries in spatial databases. Data Knowl Eng 49(1):67–104
Preparata FP, Shamos MI (1985) Computational geometry - an introduction. Springer
Hinrichs K, Nievergelt J, Schorn P (1988) Plane-sweep solves the closest pair problem elegantly. Inf Process Lett 26(5):255–261
Jacox EH, Samet H (2007) Spatial join techniques. ACM Trans Database Syst 32(1):7
Shin H, Moon B, Lee S (2003) Adaptive and incremental processing for distance join queries. IEEE Trans Knowl Data Eng 15(6):1561–1578
Beckmann N, Kriegel H-P, Schneider R, Seeger B (1990) The R*-tree: an efficient and robust access method for points and rectangles. In: SIGMOD conference, pp 322–331
Jacox EH, Samet H (2003) Iterative spatial join. ACM Trans Database Syst 28(3):230–256
Arge L, Procopiuc O, Ramaswamy S, Suel T, Vitter JS (1998) Scalable sweeping-based spatial join. In: VLDB conference, pp 570–581
Gurret C, Rigaux P (2000) The sort/sweep algorithm: a new method for R-tree based spatial joins. In: SSDBM conference, pp 153–165
Roumelis G, Corral A, Vassilakopoulos M, Manolopoulos Y (2014) New plane-sweep algorithms for distance-based join queries in spatial databases, Tech. Rep. TR-01-2014, Data Eng. Lab, AUTH, Greece, http://delab.csd.auth.gr/~michalis/TR-01-2014.pdf
Hjaltason GR, Samet H (1998) Incremental distance join algorithms for spatial databases. In: SIGMOD conference, pp 237–248
Rigaux P, Scholl M, Voisard A (2002) Spatial databases - with applications to GIS. Elsevier, San Francisco
Samet H (2007) Foundations of multidimensional and metric data structures. Morgan Kaufmann, San Francisco
Nobari S, Tauheed F, Heinis T, Karras P, Bressan S, Ailamaki A (2013) TOUCH: in-memory spatial join by hierarchical data-oriented partitioning. In: SIGMOD conference, pp 701–712
Sowell B, Salles MAV, Cao T, Demers AJ, Gehrke J (2013) An experimental analysis of iterated spatial joins in main memory. PVLDB 6(14):1882–1893
Sidlauskas D, Jensen CS (2014) Spatial joins in main memory: implementation matters! PVLDB 8(1):97–100
Zhang H, Chen G, Ooi BC, Tan K, Zhang M (2015) In-memory big data management and processing: a survey. IEEE Trans Knowl Data Eng 27(7):1920–1948
Mamoulis N, Papadias D (2001) Multiway spatial joins. ACM Trans Database Syst 26(4):424–475
Brinkhoff T, Kriegel H-P, Seeger B (1993) Efficient processing of spatial joins using R-trees. In: SIGMOD conference, pp 237–246
Guttman A (1984) R-trees: a dynamic index structure for spatial searching. In: SIGMOD conference, pp 47–57
Lo M-L, Ravishankar CV (1996) Spatial hash-joins. In: SIGMOD conference, pp 247–258
Patel JM, DeWitt DJ (1996) Partition based spatial-merge join. In: SIGMOD conference, pp 259–270
Smid M (2000) Closest-point problems in computational geometry. In: Sack J-R, Urrutia J (eds) Handbook of computational geometry. Elsevier, Ch 20, pp 877–935
Corral A, Almendros-Jiménez JM (2007) A performance comparison of distance-based query algorithms using R-trees in spatial databases. Inf Sci 177(11):2207–2237
Kim YJ, Patel JM (2010) Performance comparison of the R*-tree and the quadtree for kNN and distance join queries. IEEE Trans Knowl Data Eng 22(7):1014–1027
Gutiérrez G, Sáez P (2013) The k closest pairs in spatial databases when only one set is indexed. GeoInformatica 17(4):543–565
Weber R, Schek H-J, Blott S (1998) A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: VLDB conference, pp 194–205
Koudas N, Sevcik KC (2000) High dimensional similarity joins: algorithms and performance evaluation. IEEE Trans Knowl Data Eng 12(1):3–18
Chan EPF (2003) Buffer queries. IEEE Trans Knowl Data Eng 15(4):895–910
Yang C, Lin K-I (2002) An index structure for improving nearest closest pairs and related join queries in spatial databases. In: IDEAS conference, pp 140–149
Angiulli F, Pizzuti C (2005) An approximate algorithm for top-k closest pairs join query in large high dimensional data. Data Knowl Eng 53(3):263–281
Corral A, Vassilakopoulos M (2005) On approximate algorithms for distance-based queries using r-trees. Comput J 48(2):220–238
Shan J, Zhang D, Salzberg B (2003) On spatial-range closest-pair query. In: SSTD conference, pp 252–269
U LH, Mamoulis N, Yiu ML (2008) Computation and monitoring of exclusive closest pairs. IEEE Trans Knowl Data Eng 20(12):1641–1654
Cheema MA, Lin X, Wang H, Wang J, Zhang W (2011) A unified approach for computing top-k pairs in multidimensional space. In: ICDE conference, pp 1031–1042
Choi D, Chung C, Tao Y (2014) Maximizing range sum in external memory. ACM Trans Database Syst 39(3):21:1–21:44
Shou Y, Mamoulis N, Cao H, Papadias D, Cheung DW (2003) Evaluation of iceberg distance joins. In: SSTD conference, pp 270–288
Böhm C, Krebs F (2004) The k-nearest neighbour join: turbo charging the KDD process. Knowl Inf Syst 6(6):728–749
Zhang J, Mamoulis N, Papadias D, Tao Y (2004) All-nearest-neighbors queries in spatial databases. In: SSDBM conference, pp 297–306
Bryan B, Eberhardt F, Faloutsos C (2008) Compact similarity joins. In: ICDE conference, pp 346–355
Graefe G (1993) Query evaluation techniques for large databases. ACM Comput Surv 25(2):73–170
Aggarwal A, Vitter JS (1988) The input/output complexity of sorting and related problems. Commun ACM 31(9):1116–1127
Leutenegger ST, Edgington JM, Lopez MA (1997) STR: a simple and efficient algorithm for R-tree packing. In: ICDE conference, pp 497–506
Acknowledgments
Work of all authors funded by the Development of a GeoENvironmental information system for the region of CENtral Greece (GENCENG) project (SYNERGASIA 2011 action, supported by the European Regional Development Fund and Greek National Funds); project number 11SYN 8 1213. Work of Antonio Corral also supported by the MINECO research project [TIN2013-41576-R] and the Junta de Andalucia research project [P10-TIC-6114].
Additional information
A preliminary partial version of this work appeared in [1].
Cite this article
Roumelis, G., Corral, A., Vassilakopoulos, M. et al. New plane-sweep algorithms for distance-based join queries in spatial databases. Geoinformatica 20, 571–628 (2016). https://doi.org/10.1007/s10707-016-0246-1
DOI: https://doi.org/10.1007/s10707-016-0246-1