Implementing a General Spatial Indexing Library for Relational Databases of Large Numerical Simulations

  • Gerard Lemson
  • Tamás Budavári
  • Alexander Szalay
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6809)

Abstract

Large multi-terabyte numerical simulations of different physical systems consist of billions of particles or grid points and hundreds to thousands of snapshots. Increasingly these data sets are stored in large object-relational databases. Most statistical analyses involve extracting various spatio-temporal subsets. Existing built-in spatial indexes in commercial systems lack essential features required for many applications in the physical sciences. We describe a library that we have implemented in several languages and platforms (Java/Oracle, C#/SQL Server) based on generic space-filling curves, implemented as plug-ins. The index provides a mapping of higher dimensional space into the standard linear B-tree index of any relational database. The architecture allows intersections with different geometric primitives. The library has been used for cosmological N-body simulations and isotropic turbulence, providing sub-second response time over datasets exceeding several tens of terabytes. The library can also address complex space-time challenges, like temporal look-back into past light-cones of cosmological simulations.
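
The mapping from a higher-dimensional space onto a linear B-tree key can be illustrated with a minimal sketch. It assumes a Z-order (Morton) curve as one example of a space-filling curve, and a sorted Python list as a stand-in for the database's B-tree index. The names `morton_key`, `box_key_ranges`, `query_box` and the resolution `BITS` are hypothetical and are not the library's API. A box query is decomposed into contiguous key ranges, each of which would correspond to one range scan on the index.

```python
from bisect import bisect_left, bisect_right

BITS = 4  # assumed resolution: 2**4 = 16 cells per axis (illustrative only)


def morton_key(ix, iy, iz, bits=BITS):
    """Interleave the bits of three cell indices into one Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def box_key_ranges(lo, hi, bits=BITS):
    """Cover the inclusive cell box [lo, hi] with sorted, merged key ranges.

    The cube is split recursively into octants. An octant that lies fully
    inside the box maps to one contiguous interval of Z-order keys.
    """
    ranges = []

    def visit(origin, level):
        size = 1 << level
        top = tuple(o + size - 1 for o in origin)
        if any(top[d] < lo[d] or origin[d] > hi[d] for d in range(3)):
            return  # octant is disjoint from the box
        if all(lo[d] <= origin[d] and top[d] <= hi[d] for d in range(3)):
            start = morton_key(*origin, bits)
            end = start + (1 << (3 * level)) - 1
            if ranges and ranges[-1][1] + 1 == start:
                ranges[-1] = (ranges[-1][0], end)  # merge adjacent ranges
            else:
                ranges.append((start, end))
            return
        half = size >> 1
        # Octants are visited in increasing key order (x is the lowest bit).
        for octant in range(8):
            child = (
                origin[0] + half * (octant & 1),
                origin[1] + half * ((octant >> 1) & 1),
                origin[2] + half * ((octant >> 2) & 1),
            )
            visit(child, level - 1)

    visit((0, 0, 0), bits)
    return ranges


def query_box(sorted_keys, lo, hi, bits=BITS):
    """Return all stored keys whose cells fall inside the box [lo, hi]."""
    hits = []
    for start, end in box_key_ranges(lo, hi, bits):
        i = bisect_left(sorted_keys, start)   # range scan on the "index"
        j = bisect_right(sorted_keys, end)
        hits.extend(sorted_keys[i:j])
    return hits


if __name__ == "__main__":
    # A few occupied cells, indexed by their Z-order key.
    cells = [(1, 2, 3), (4, 4, 4), (9, 0, 15), (5, 6, 7)]
    index = sorted(morton_key(*c) for c in cells)
    print(query_box(index, (0, 0, 0), (7, 7, 7)))
```

In a relational database the same idea would amount to storing the key in an indexed integer column and translating each range into a `BETWEEN` predicate. Other curves or other geometric primitives would change only the key function and the inside/outside tests in this sketch.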

Keywords

spatial indexing, numerical simulations, relational databases

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Gerard Lemson (1)
  • Tamás Budavári (2)
  • Alexander Szalay (2)

  1. MPA, Germany
  2. JHU, USA
