
Index-based query processing on distributed multidimensional data

GeoInformatica

Abstract

This work introduces decentralized query processing techniques based on MIDAS, a novel distributed multidimensional index. In particular, MIDAS implements a distributed k-d tree, where leaves correspond to peers and internal nodes dictate message routing. MIDAS requires peers to maintain little network information, and features mechanisms that support fault tolerance and load balancing. The proposed algorithms process point and range queries over the multidimensional indexed space in only O(log n) hops in expectation, where n is the network size. For nearest neighbor queries, two processing alternatives are discussed. The first, termed eager processing, has low latency (expected value of O(log n) hops) but may involve a large number of peers. The second, termed iterative processing, has higher latency (expected value of O(log² n) hops) but involves far fewer peers. A detailed experimental evaluation demonstrates that our query processing techniques outperform existing methods both on real spatial data and on high-dimensional synthetic data.
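To make the k-d tree picture concrete, the following is a minimal, centralized sketch of a space partition in which every leaf stands in for one peer and internal nodes decide which way a query is forwarded. It is not the MIDAS protocol itself: the class and function names (`Node`, `build`, `lookup`, `range_query`), the round-robin choice of split axis, the midpoint split, and the rule that a joining peer splits a uniformly random leaf are all assumptions made for illustration. The number of tree levels descended is used only as a rough stand-in for the hop count.

```python
import random

class Node:
    """Node of a k-d tree over the unit hypercube; each leaf stands in for a peer."""
    def __init__(self, lo, hi, depth=0):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.axis = self.split = None
        self.left = self.right = None

    def is_leaf(self):
        return self.left is None

    def split_leaf(self):
        """Halve this leaf's zone along an axis chosen round-robin by depth (assumption)."""
        self.axis = self.depth % len(self.lo)
        self.split = (self.lo[self.axis] + self.hi[self.axis]) / 2
        left_hi = list(self.hi)
        left_hi[self.axis] = self.split
        right_lo = list(self.lo)
        right_lo[self.axis] = self.split
        self.left = Node(self.lo, tuple(left_hi), self.depth + 1)
        self.right = Node(tuple(right_lo), self.hi, self.depth + 1)

def leaves(node):
    """Return all leaves (peers) below node, left to right."""
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)

def build(n_peers, dims, rng):
    """Grow the tree until it has n_peers leaves; each join splits a random leaf."""
    root = Node((0.0,) * dims, (1.0,) * dims)
    peers = [root]
    while len(peers) < n_peers:
        victim = peers.pop(rng.randrange(len(peers)))
        victim.split_leaf()
        peers += [victim.left, victim.right]
    return root

def lookup(root, point):
    """Point query: descend to the leaf whose zone contains point; return (leaf, steps)."""
    node, steps = root, 0
    while not node.is_leaf():
        node = node.left if point[node.axis] < node.split else node.right
        steps += 1
    return node, steps

def range_query(node, lo, hi):
    """Range query: return every leaf whose zone intersects the closed box [lo, hi]."""
    if any(node.hi[i] < lo[i] or node.lo[i] > hi[i] for i in range(len(lo))):
        return []  # this subtree's zone is disjoint from the query box
    if node.is_leaf():
        return [node]
    return range_query(node.left, lo, hi) + range_query(node.right, lo, hi)

if __name__ == "__main__":
    rng = random.Random(42)
    root = build(n_peers=1024, dims=2, rng=rng)
    peer, steps = lookup(root, (0.25, 0.75))
    print("zone:", peer.lo, peer.hi, "levels descended:", steps)
    hit = range_query(root, (0.2, 0.2), (0.3, 0.3))
    print("peers touched by range query:", len(hit))
```

In a real deployment no peer holds the whole tree; each peer would keep only a few links and forward messages along them, so this sketch illustrates the partitioning and pruning logic rather than the distributed routing.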



Notes

  1. Peers periodically inform their backlinks about their load.

  2. In our implementation, a timeout process is initiated. Note that missed messages do not affect the algorithm’s correctness, since the global guarantee is correctly computed (see proof of Lemma 9) on the set G of retrieved local guarantees.

  3. http://www.rtreeportal.org

Author information

Corresponding author

Correspondence to Dimitris Sacharidis.

About this article

Cite this article

Tsatsanifos, G., Sacharidis, D. & Sellis, T. Index-based query processing on distributed multidimensional data. Geoinformatica 17, 489–519 (2013). https://doi.org/10.1007/s10707-012-0163-x

  • DOI: https://doi.org/10.1007/s10707-012-0163-x
