AQUAdex: A Highly Efficient Indexing and Retrieving Method for Astronomical Big Data of Time Series Images

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9529)

Abstract

In the era of Big Data, scientific research is challenged by the need to handle massive data sets. To take real advantage of Big Data, the key problem is to retrieve the desired cup of data from the ocean, as most applications need only a fraction of the entire data set. Because an indexing and retrieving method is intrinsically tied to the specific features of the data set and the goal of the research, a universal solution is hardly possible. This paper proposes AQUAdex, a new spatial indexing and retrieving method designed for efficiently querying Big Data in astronomical time-domain research, which extracts Time Series Images from Astronomical Big Data. By mapping images to tiles (pixels) on the celestial sphere, AQUAdex can complete queries 9 times faster, as shown by theoretical analysis and experimental results. AQUAdex is especially suitable for Big Data applications because of its excellent scalability: the query time increases by only 59% while the data size grows 14 times larger.
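
To make the idea of mapping images to tiles concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a simple equal-angle grid of 1-degree tiles in right ascension and declination (a real system would more likely use an equal-area pixelization of the sphere), rectangular image footprints that do not wrap around RA = 0, and hypothetical names (`Image`, `TILE_DEG`, `tiles_covering`, `build_index`, `query`). The index maps each tile to the images overlapping it, so a point query inspects one tile instead of scanning every image.

```python
from collections import defaultdict
from dataclasses import dataclass

TILE_DEG = 1.0                   # hypothetical tile size in degrees
N_COLS = int(360 / TILE_DEG)     # tiles along right ascension
N_ROWS = int(180 / TILE_DEG)     # tiles along declination


@dataclass(frozen=True)
class Image:
    """Hypothetical image record with a rectangular sky footprint."""
    name: str
    ra_min: float
    ra_max: float
    dec_min: float
    dec_max: float
    timestamp: float


def _col(ra):
    # Column of the tile containing this right ascension, clamped to the grid.
    return max(0, min(int(ra // TILE_DEG), N_COLS - 1))


def _row(dec):
    # Row of the tile containing this declination (shifted from [-90, 90]).
    return max(0, min(int((dec + 90.0) // TILE_DEG), N_ROWS - 1))


def tiles_covering(img):
    """Return the IDs of all tiles overlapped by the image footprint."""
    return [
        r * N_COLS + c
        for r in range(_row(img.dec_min), _row(img.dec_max) + 1)
        for c in range(_col(img.ra_min), _col(img.ra_max) + 1)
    ]


def build_index(images):
    """Build an inverted index: tile ID -> images, in time order."""
    index = defaultdict(list)
    for img in sorted(images, key=lambda i: i.timestamp):
        for tile in tiles_covering(img):
            index[tile].append(img)
    return index


def query(index, ra, dec):
    """Return the time series of images whose footprint contains (ra, dec)."""
    tile = _row(dec) * N_COLS + _col(ra)
    # Only images registered under this one tile are candidates; an exact
    # footprint test then removes images that merely share the tile.
    return [
        img for img in index.get(tile, [])
        if img.ra_min <= ra <= img.ra_max and img.dec_min <= dec <= img.dec_max
    ]
```

Under these assumptions, `query(index, ra, dec)` touches only the images registered under a single tile, which is why the cost of a lookup depends on how many images overlap that tile rather than on the size of the whole archive.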

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (NSFC) through grants 61303021, U1531111 and U1231108. The data set used in the experiments was provided by the AST3 team of NAOC. The authors wish to express their gratitude to Ms. Yiyi Gao and Ms. Xingyu Xu for their insightful suggestions. Sincere thanks also go to Mr. Jie Wen for helping to put the final touches in place.

Author information

Corresponding author

Correspondence to Ce Yu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Hong, Z. et al. (2015). AQUAdex: A Highly Efficient Indexing and Retrieving Method for Astronomical Big Data of Time Series Images. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9529. Springer, Cham. https://doi.org/10.1007/978-3-319-27122-4_7

  • DOI: https://doi.org/10.1007/978-3-319-27122-4_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27121-7

  • Online ISBN: 978-3-319-27122-4

  • eBook Packages: Computer Science, Computer Science (R0)
