Generic Distributed In Situ Aggregation for Earth Remote Sensing Imagery

  • Ramon Antonio Rodriges Zalipynis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11179)


Earth remote sensing imagery comes from satellites, unmanned aerial vehicles, airplanes, and other sources. National agencies, commercial companies, and individuals across the globe collect enormous amounts of such imagery daily. Array DBMSs are among the prominent tools for managing and processing large volumes of geospatial imagery. The core data model of an array DBMS is an N-dimensional array. We recently presented a geospatial array DBMS, ChronosDB, which outperforms SciDB by up to 75× on average. We are about to launch a Cloud service running our DBMS. SciDB is the only freely available distributed array DBMS to date. Remote sensing imagery is traditionally stored in files of sophisticated formats, not in databases. Unlike SciDB, ChronosDB does not require importing files into an internal DBMS format and works with imagery “in situ”: directly in its native file formats. This is one of the many virtues of ChronosDB. It already has certain aggregation capabilities, but this paper focuses on more advanced aggregation queries, which constitute a large portion of a typical workload applied to remote sensing imagery. We integrate the aggregation types into the data model, present the respective algorithms to perform aggregations in a distributed fashion, and thoroughly compare the performance of our technique with SciDB. We carried out experiments on real-world data on 8- and 16-node clusters in Microsoft Azure Cloud.
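
The abstract does not spell out the aggregation algorithms, so the following is only a generic sketch of how an aggregate can be computed in a distributed fashion, not the method used by ChronosDB. It assumes that each node holds some tiles of one array, that missing cells are marked with `None`, and that every node reduces its tiles to a small mergeable partial result before a coordinator combines them. All names (`Partial`, `aggregate_tile`, `merge`, `distributed_aggregate`) are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Partial:
    """Mergeable partial aggregate for one tile or one node (hypothetical type)."""
    total: float = 0.0
    count: int = 0
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def aggregate_tile(cells: Iterable[Optional[float]]) -> Partial:
    """Reduce the cells of a single tile, skipping missing (None) cells."""
    total, count = 0.0, 0
    lo: Optional[float] = None
    hi: Optional[float] = None
    for value in cells:
        if value is None:
            continue
        total += value
        count += 1
        lo = value if lo is None else min(lo, value)
        hi = value if hi is None else max(hi, value)
    return Partial(total, count, lo, hi)


def _pick(func, x: Optional[float], y: Optional[float]) -> Optional[float]:
    """Apply min/max while treating None as 'no value seen yet'."""
    if x is None:
        return y
    if y is None:
        return x
    return func(x, y)


def merge(a: Partial, b: Partial) -> Partial:
    """Combine two partial results; associative, so merge order does not matter."""
    return Partial(
        a.total + b.total,
        a.count + b.count,
        _pick(min, a.minimum, b.minimum),
        _pick(max, a.maximum, b.maximum),
    )


def mean(p: Partial) -> Optional[float]:
    """Final mean, or None when no valid cells were seen."""
    return p.total / p.count if p.count else None


def distributed_aggregate(
    tiles_per_node: List[List[List[Optional[float]]]],
) -> Partial:
    """Simulate the two phases: local reduction per node, then a global merge."""
    per_node = [
        reduce(merge, (aggregate_tile(tile) for tile in tiles), Partial())
        for tiles in tiles_per_node
    ]
    return reduce(merge, per_node, Partial())
```

Calling `distributed_aggregate` with one list of tiles per node returns a single `Partial`, from which `mean`, `minimum`, and `maximum` can be read. Only the compact partial results travel between nodes in this sketch, which is the usual motivation for decomposing an aggregate this way.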


ChronosDB, SciDB, Cloud computing, Array DBMS, Satellite imagery, In Situ, Big data, Landsat



This work was partially supported by Russian Science Foundation (grant №17-11-01052).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. National Research University Higher School of Economics, Moscow, Russia
