Optimal Data-Space Partitioning of Spatial Data for Parallel I/O

Distributed and Parallel Databases

Abstract

It is desirable to design partitioning methods that minimize the I/O time incurred during query execution in spatial databases. This paper explores optimal partitioning of two-dimensional data for a class of queries and develops multi-disk allocation techniques that maximize the degree of I/O parallelism obtained in each case. We show that hexagonal partitioning has optimal I/O performance for circular queries among all partitioning methods that use convex non-overlapping regions. We also analyze this result and extend it to all possible partitioning techniques. For rectangular queries, we show that hexagonal partitioning has better overall I/O performance for a general class of range queries, except for rectilinear queries, in which case rectangular grid partitioning is superior. Parallel storage and retrieval algorithms for hexagonal partitioning can be constructed from current algorithms for rectangular grid partitioning. Some of these results carry over to circular partitioning of the data, which is an example of partitioning with non-convex regions.
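The comparison between hexagonal and square cells for circular queries can be illustrated with a small calculation. The sketch below is not the paper's derivation. It assumes an unbounded tiling by congruent convex cells, a query disk whose center is uniformly random, and it uses the standard Steiner formula for the area of a convex region grown by a disk. Under those assumptions the expected number of cells a query touches is (A + P·r + πr²)/A, where A is the cell area, P the cell perimeter, and r the query radius. The function names are hypothetical and chosen only for this illustration.

```python
import math


def square_perimeter(area: float) -> float:
    """Perimeter of a square cell with the given area."""
    return 4.0 * math.sqrt(area)


def hexagon_perimeter(area: float) -> float:
    """Perimeter of a regular hexagonal cell with the given area.

    A regular hexagon with side s has area (3 * sqrt(3) / 2) * s**2.
    """
    side = math.sqrt(2.0 * area / (3.0 * math.sqrt(3.0)))
    return 6.0 * side


def expected_cells(area: float, perimeter: float, radius: float) -> float:
    """Expected number of cells intersected by a random circular query.

    Assumption (not taken from the paper): a disk of the given radius meets
    a convex cell exactly when its center lies in the cell grown by that
    radius, whose area is area + perimeter * radius + pi * radius**2.
    Dividing by the cell area gives the expected count per query.
    """
    return (area + perimeter * radius + math.pi * radius ** 2) / area


if __name__ == "__main__":
    cell_area = 1.0  # equal-area cells, so both layouts use the same page size
    for r in (0.25, 0.5, 1.0, 2.0):
        sq = expected_cells(cell_area, square_perimeter(cell_area), r)
        hx = expected_cells(cell_area, hexagon_perimeter(cell_area), r)
        print(f"r={r:4.2f}  square={sq:7.3f}  hexagon={hx:7.3f}")
```

Because the area and the πr² term are the same for both layouts, the difference comes entirely from the perimeter term, and a regular hexagon has a smaller perimeter than a square of equal area. Fewer cells touched per query means fewer pages to fetch, which is the quantity the abstract's optimality claim concerns.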

Cite this article

Ferhatosmanoğlu, H., Agrawal, D., Eğecioğlu, Ö. et al. Optimal Data-Space Partitioning of Spatial Data for Parallel I/O. Distributed and Parallel Databases 17, 75–101 (2005). https://doi.org/10.1023/B:DAPD.0000045550.56749.75
