Development of a Polystore Data Management System for an Evolving Big Scientific Data Archive

  • Manoj Poudel
  • Rashmi P. Sarode
  • Shashank Shrestha
  • Wanming Chu
  • Subhash Bhalla
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11721)


Handling large datasets is a major challenge for most astronomical data repositories. Many of them manage images, text, key-value data, and graphs, which together make up the enormous volume of data available in the astronomical domain. The Palomar Transient Factory (PTF/iPTF) is one such project: it holds relational data, image data, lightcurve data sets, graphs, and text data. Organizing all of these data in a single data management system can lead to poor performance and efficiency. We therefore demonstrate a prototype system that manages such heterogeneous data across multiple storage units using a polystore-based approach. The prototype supports a set-theoretic query language for access to cloud-based data resources.
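As a rough illustration of the idea, the sketch below keeps two kinds of data in two different stores and combines their answers with set operations. It is not the prototype's implementation or query language: the table layout, the `lightcurves` dictionary, the helper names, and all sample values are hypothetical, chosen only to show how per-store results might be merged set-theoretically.

```python
import sqlite3

# Hypothetical relational store: one row per detected source (assumed schema).
relational = sqlite3.connect(":memory:")
relational.execute(
    "CREATE TABLE sources (source_id INTEGER PRIMARY KEY, ra REAL, dec REAL)"
)
relational.executemany(
    "INSERT INTO sources VALUES (?, ?, ?)",
    [(1, 10.5, -3.2), (2, 150.1, 22.4), (3, 151.0, 23.0)],
)

# Hypothetical key-value store: source_id -> list of (epoch, magnitude) points.
lightcurves = {
    2: [(55000.1, 18.2), (55001.1, 18.4)],
    3: [(55000.2, 17.9)],
    4: [(55002.0, 19.1)],
}


def ids_in_ra_range(lo, hi):
    """Ask the relational store for sources whose right ascension is in [lo, hi]."""
    rows = relational.execute(
        "SELECT source_id FROM sources WHERE ra BETWEEN ? AND ?", (lo, hi)
    )
    return {row[0] for row in rows}


def ids_with_min_epochs(n):
    """Ask the key-value store for sources with at least n lightcurve points."""
    return {sid for sid, points in lightcurves.items() if len(points) >= n}


# Each store answers in its own way; the results meet as plain sets of IDs.
in_region = ids_in_ra_range(150.0, 152.0)
well_sampled = ids_with_min_epochs(2)

both = in_region & well_sampled         # intersection: in region AND well sampled
either = in_region | well_sampled       # union: satisfies at least one condition
only_region = in_region - well_sampled  # difference: in region but sparsely sampled
```

Here each store is queried through its native interface and only identifiers cross the boundary, which is one simple way to keep heterogeneous back ends independent of each other.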


Astronomical data, Multi data stores, Query management system, PTF/iPTF data, ZTF data



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Manoj Poudel (1), corresponding author
  • Rashmi P. Sarode (1)
  • Shashank Shrestha (1)
  • Wanming Chu (1)
  • Subhash Bhalla (1)
  1. University of Aizu, Aizu-Wakamatsu, Japan
