Skip to main content

GBI: A Generalized R-Tree Bulk-Insertion Strategy

  • Conference paper
  • First Online:
Advances in Spatial Databases (SSD 1999)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1651))

Included in the following conference series:

Abstract

A lot of recent work has studied strategies related to bulk loading of large data sets into multidimensional index structures. In this paper, we address the problem of bulk insertions into existing index structures with particular focus on R-trees — which are an important class of index structures used widely in commercial database systems. We propose a new technique, which as opposed to the current technique of inserting data one by one, bulk inserts entire new incoming datasets into an active R-tree. This technique, called GBI (for Generalized Bulk Insertion), partitions the new datasets into sets of clusters and outliers, constructs an R-tree (small tree) from each cluster, identifies and prepares suitable locations in the original R-tree (large tree) for insertion, and lastly performs the insertions of the small trees and the outliers into the large tree in bulk. Our experimental studies demonstrate that GBI does especially well (over 200% better than the existing technique) for randomly located data as well as for real datasets that contain few natural clusters, while also consistently outperforming the alternate technique in all other circumstances.

This work was supported in part by the University of Michigan ITS Research Center of Excellence grant (DTFH61-93-X-00017-Sub) sponsored by the U.S. Dept. of Transportation and by the Michigan Dept. of Transportation. Dr. Rundensteiner thanks IBM for the Corporate IBM partnership award, and Li Chen thanks IBM for the Corporate IBM fellowship as well as mentoring from the IBM Toronto Labora.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. L. Arge, K. Hinrichs, J. Vahrenhold, and J. Vitter. Efficient Bulk Operations on Dynamic R-trees. Workshop on Algorithm Engineering and Experimentation ALENEX 99, pages 92–103, 1999.

    Google Scholar 

  2. M. R. Anderberg. Probability and Mathematical Statistics. Academic Press, New York, San Francisco, London, 1973.

    Google Scholar 

  3. C.H. Ang and T.C. Tan. New Linear Node Splitting Algorithm for R-trees. Advances in Spatial Databases, pages 339–349, 1997.

    Google Scholar 

  4. N. Beckmann, H.P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: An Efficient and Robust Access Method for Points and Rectangles. Proceedings of SIGMOD, pages 322–331, 1990.

    Google Scholar 

  5. J. Bercken, P. Widmayer, and B. Seeger. A Generic Approach to Bulk Loading Multidimensional Index Structures. International Conference on Very Large Data Bases, pages 406–415, 1997.

    Google Scholar 

  6. L. Chen, R. Choubey, and E. A. Rundensteiner. Bulk Insertions into R-trees using the Small-Tree-Large-Tree Approach. Proceedings of ACM GIS Workshop, pages 161–162, 1998.

    Google Scholar 

  7. L. Chen, R. Choubey, and E.A. Rundensteiner. Bulk Insertions into R-Trees. WPI, Tech. Rep. CS-WPI-98-05, February 1998.

    Google Scholar 

  8. R. Choubey, L. Chen, and E. A. Rundensteiner. GBI: A Generalized R-Tree Bulk Insertion Strategy. WPI Technical Report TR-98-15-STLT, 1998.

    Google Scholar 

  9. W. Chen. Programming with Logical Queries, Bulk Updates, and Hypothetical Reasoning. IEEE Transactions on Knowledge and Data Engineering, pages 587–599, July 1997.

    Google Scholar 

  10. R. Choubey. R-Tree Bulk Insertion Strategies. Master Thesis in progress, Worcester Polytechnic Institute, 1999.

    Google Scholar 

  11. P. Ciaccia and M. Patella. Bulk Loading the M-tree. Proceedings of the Australasian Database Conference, February 1998.

    Google Scholar 

  12. B. Duran and P. Odell.Cluster Analysis-A Survey. Springer-Verlag, Berlin, Heidelberg, New York, 1974.

    MATH  Google Scholar 

  13. A. Guttman. R-trees: A Dynamic Index Structure for Spatial Searching. Proceedings of SIGMOD, pages 47–57, 1984.

    Google Scholar 

  14. Y.W. Huang, N. Jing, and E.A. Rundensteinder. Spatial Joins Using R-trees: Breadth-First Traversal with Global Optimizations. International Conference on Very Large Data Bases, pages 396–405, 1997.

    Google Scholar 

  15. Y.W. Huang, M. Jones, and E.A. Rundensteiner. Symbolic Intersect Detection: A Method for Improving Spatial Intersect Joins. Proc. of the International Symposium on Spatial Databases, pages 165–177, 1997.

    Google Scholar 

  16. Y.W. Huang, N Jing, and E.A. Rundensteiner. A cost model for estimating the performance of spatial joins using R-tree. International Working Conference on Scientific and Statistical Database Management, pages 30–38, August 1997.

    Google Scholar 

  17. Informix Corporation (“http://www.informix.com”). Informix. 108

  18. I. Kamel and C. Faloutsos. On Packing R-trees. Proceedings of International Conference on Information and Knowledge Management, pages 490–499, November 1993.

    Google Scholar 

  19. S. Leutenegger, M. Lopez, and J. Edgigton. STR: A Simple and Efficient Algorithm for R-tree Packing. Proceedings of IEEE International Conference on Data Engineering, pages 497–506, 1997.

    Google Scholar 

  20. S. Leutenegger, M. Lopez, and Y. Garcia. A Greedy Algorithm for Bulk Loading R-trees. Technical report, University of Denver Computer Science Technical Report #97-02, 1997.

    Google Scholar 

  21. S. Leutenegger and D. Nicol. Efficient Bulk-Loading of Gridles. IEEE Transactions on Knowledge and Data Engineering, pages 410–420, May 1997.

    Google Scholar 

  22. J. Li, D. Rotem, and J. Srivastave. Algorithms for Loading Parallel Gridles. Proceedings of SIGMOD, pages 347–356, 1993.

    Google Scholar 

  23. MapInfo Corporation. SpatialWare-http://www.mapinfo.com/spatialware/spatial20.html.

  24. A. Moitra. Spatio-Temporal Data Management Using R-trees. International Journal of Geographic Information Systems, 1993.

    Google Scholar 

  25. H.Charles Romesburg. Cluster Analysis for Researchers. Lifetime Learning Publications, Belmont, California, 1984.

    Google Scholar 

  26. N. Roussopoulos, M. Roussopoulos, and Y. Kotidis. Cubetree: Organization of and Bulk Incremental Updates on the Data Cube. Proceedings of SIGMOD, pages 89–99, 1997.

    Google Scholar 

  27. H. Spath. Cluster Analysis Algorithms for Data Reduction and Classification of Objects. Ellis Horwook Publishers, Chichester, 1982.

    Google Scholar 

  28. Y. Theodoridis and T. Sellis. Optimization Issues in R-tree Construction (extended abstract). Lecture Notes in Computer Science, pages 270–273, 1994.

    Google Scholar 

  29. J. H. Ward. Hierarchical Grouping Analysis For Applications. Journal of American Statistics Association, 58:236–244, 1963.

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Choubey, R., Chen, L., Rundensteiner, E.A. (1999). GBI: A Generalized R-Tree Bulk-Insertion Strategy. In: Güting, R.H., Papadias, D., Lochovsky, F. (eds) Advances in Spatial Databases. SSD 1999. Lecture Notes in Computer Science, vol 1651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48482-5_8

Download citation

  • DOI: https://doi.org/10.1007/3-540-48482-5_8

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66247-1

  • Online ISBN: 978-3-540-48482-0

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics