An Efficient Retrieval Method for Astronomical Catalog Time Series Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11334)

  • The original version of this chapter was revised: The grant numbers of the Joint Research Fund in Astronomy were incorrect in the acknowledgement on p. 297. The correction to this chapter is available at https://doi.org/10.1007/978-3-030-05051-1_45

Abstract

Astronomical catalog time series data are data collected at different times; they provide a comprehensive understanding of celestial objects’ attributes and expose various astronomical phenomena. Their retrieval is indispensable to astronomy research. However, existing time series data retrieval methods involve a lot of manual work and are extremely time-consuming. The exponential growth of observation data will further increase this complexity. In this paper, we propose an automatic and efficient retrieval method for astronomical catalog time series data. To identify the time series data of the same celestial object automatically, a cross-match scheme is designed, which labels each record matched with the datum catalog with a unique MatchID. To accelerate the matching process, an in-memory index structure based on Redis is specially designed, which makes matching 1.67 times faster than with MySQL on massive amounts of data. Moreover, Catalog-Mongo, an improved database based on MongoDB, is presented, in which a Data Blocking Algorithm is proposed to improve the data partitioning of MongoDB and accelerate queries. The experimental results show that its query speed is about 2 times that of MongoDB and 7.6 to 8.7 times that of MySQL.
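The cross-match idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain Python dictionary stands in for the Redis-based in-memory index, and the grid cell size, match radius, and all function names (`cell_of`, `build_index`, `match`) are hypothetical choices made only for illustration. Each datum catalog record receives a MatchID, and a new observation is given the MatchID of the nearest datum record within the match radius.

```python
import math
from collections import defaultdict

# Hypothetical parameters, not taken from the paper.
CELL_DEG = 0.01      # side length of a grid cell, in degrees
RADIUS_DEG = 0.001   # maximum angular separation for a match, in degrees


def cell_of(ra, dec):
    """Map a position (degrees) to the integer grid cell that contains it."""
    return (math.floor(ra / CELL_DEG), math.floor(dec / CELL_DEG))


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees, computed with the haversine formula."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def build_index(datum):
    """Index the datum catalog by grid cell; the list position is the MatchID.

    The dictionary is an assumption standing in for an in-memory store.
    """
    index = defaultdict(list)
    for match_id, (ra, dec) in enumerate(datum):
        index[cell_of(ra, dec)].append((match_id, ra, dec))
    return index


def match(index, ra, dec):
    """Return the MatchID of the nearest datum record within RADIUS_DEG, or None.

    Only the 3x3 block of neighbouring cells is searched. This sketch ignores
    the RA wrap-around at 0/360 degrees and the shrinking of RA cells near the
    poles, both of which a real implementation would have to handle.
    """
    cx, cy = cell_of(ra, dec)
    best_id, best_sep = None, RADIUS_DEG
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for match_id, dra, ddec in index.get((cx + dx, cy + dy), ()):
                sep = angular_sep_deg(ra, dec, dra, ddec)
                if sep <= best_sep:
                    best_id, best_sep = match_id, sep
    return best_id


# Usage: two datum objects, then three later observations.
datum = [(10.0, 20.0), (10.5, 20.5)]
index = build_index(datum)
observations = [(10.0002, 20.0001), (10.5, 20.5003), (11.0, 21.0)]
match_ids = [match(index, ra, dec) for ra, dec in observations]
```

Observations that share a MatchID can then be grouped into the time series of one object; an observation with no datum record inside the radius is left unmatched (`None`).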

Change history

  • 05 April 2019

    The original version of the chapter starting on p. 284 was revised. The grant numbers of the Joint Research Fund in Astronomy were incorrect in the acknowledgement on p. 297. The original chapter was corrected.

Acknowledgements

This work is supported by the Joint Research Fund in Astronomy (U1531111, U1731243, U1731125) under cooperative agreement between the National Natural Science Foundation of China (NSFC) and Chinese Academy of Sciences (CAS), and by the National Natural Science Foundation of China (11573019, 61602336).

Author information

Corresponding authors

Correspondence to Ce Yu or Shanjiang Tang.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, B. et al. (2018). An Efficient Retrieval Method for Astronomical Catalog Time Series Data. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11334. Springer, Cham. https://doi.org/10.1007/978-3-030-05051-1_20

  • DOI: https://doi.org/10.1007/978-3-030-05051-1_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05050-4

  • Online ISBN: 978-3-030-05051-1

  • eBook Packages: Computer Science, Computer Science (R0)
