GpDL: A Spatially Aggregated Data Layout for Long-Term Astronomical Observation Archive

  • Zhen Li
  • Ce Yu
  • Chao Sun
  • Shanjiang Tang
  • Jie Yan
  • Xiangfei Meng
  • Yang Zhao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11335)

Abstract

Many notable results in astronomy are built on historical observation data, so a long-term astronomical observation archive is of great value to astronomical research. At the observation site, data from different sky areas captured during a consecutive time period are stored on one disk, so the original data layout is temporally aggregated and spatially scattered. After an observation cycle, the data are backed up into the long-term archive, from which astronomers request data. The original layout, however, does not match the spatial locality of these requests: a single request focuses on a specific sky area over a time period. An archive that adopts the original layout therefore tends to spread each request across many disks, which consumes a large amount of energy and shortens disk life. A reorganized, spatially aggregated data layout is thus needed for the archive. The challenge is to aggregate observation data from nearby sky areas onto one disk while keeping disk capacity utilization high. In this paper, we propose GpDL, a spatially aggregated data layout for long-term astronomical observation archives based on HEALPix and graph partitioning. GpDL is generated from the original data layout, whose distribution is known, before the observation data are backed up into the archive. GpDL reduces the archive's resource consumption while keeping disk capacity utilization of up to 91%. In simulation experiments, compared with TaDL (the original temporally aggregated data layout) and AmrDL (another spatially aggregated data layout, based on the idea of Adaptive Mesh Refinement), GpDL effectively reduces the number of open disks and the energy cost for the same requests.
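The mismatch between a temporally aggregated layout and spatially local requests can be illustrated with a toy model. The sketch below is not GpDL: it replaces HEALPix pixels with plain integer sky-cell ids and replaces graph partitioning with a simple sort-and-fill packing, and the disk capacity, archive size, and helper names (`pack`, `open_disks`) are hypothetical values chosen only for illustration. It counts how many disks one request has to open under each layout.

```python
from itertools import product

DISK_CAPACITY = 8  # hypothetical: number of equal-sized files per disk


def pack(files, key):
    """Sort files by `key`, then fill disks sequentially.

    Returns a dict mapping each file to its disk index. This is a
    stand-in for a real layout algorithm, not the paper's method.
    """
    ordered = sorted(files, key=key)
    return {f: i // DISK_CAPACITY for i, f in enumerate(ordered)}


def open_disks(layout, request):
    """Number of distinct disks holding at least one requested file."""
    return len({layout[f] for f in request})


# Toy archive: 16 nights x 16 sky cells, one file per (night, cell).
files = list(product(range(16), range(16)))

# Night-major order mimics the temporally aggregated original layout.
temporal = pack(files, key=lambda f: (f[0], f[1]))
# Cell-major order mimics a spatially aggregated layout.
spatial = pack(files, key=lambda f: (f[1], f[0]))

# One request: sky cells 4 and 5 over all 16 nights.
request = [(night, cell) for night in range(16) for cell in (4, 5)]

temporal_count = open_disks(temporal, request)
spatial_count = open_disks(spatial, request)
print(temporal_count, spatial_count)  # prints "16 4"
```

In this toy setting both layouts fill every disk completely, so the only difference is how many disks a request touches. With real data the per-cell volume is uneven, which is where the trade-off between spatial aggregation and capacity utilization described in the abstract arises.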

Keywords

  • Spatially aggregated
  • Data layout
  • Astronomical observation
  • Long-term archive
  • Energy cost

Notes

Acknowledgments

This work is supported by the Joint Research Fund in Astronomy (U1531111, U1731423, U1731125) under a cooperative agreement between the National Natural Science Foundation of China (NSFC) and the Chinese Academy of Sciences (CAS), and by the National Natural Science Foundation of China (11573019, 61602336).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Computer Science and Technology, Tianjin University, Tianjin, China
  2. National Supercomputer Center in Tianjin, Tianjin, China
