Infrastructures and services for remote sensing data production management across multiple satellite data centers

Cluster Computing

Abstract

As the number of satellite sensors and data centers continues to grow, managing and processing massive remote sensing (RS) data from multiple distributed sources is becoming a trend. However, combining multiple satellite data centers for collaborative processing of massive RS data still faces many challenges. To reduce the large volume of data migration and improve the efficiency of multi-datacenter collaborative processing, this paper presents the infrastructures and services for data management and workflow management in massive remote sensing data production. A dynamic data scheduling strategy was employed to reduce duplicated data requests and duplicated data processing. By combining remote sensing spatial metadata repositories with the Gfarm grid file system, unified management of the raw data, intermediate products, and final products was achieved during co-processing. In addition, multi-level task order repositories and workflow templates were used to construct the production workflow automatically, and specific heuristic scheduling rules allowed the production tasks to be executed quickly. Finally, the Multi-datacenter Collaborative Process System (MDCPS) was implemented for large-scale remote sensing data production, building on this management of data and workflow. Experiments with MDCPS in a test environment showed that these strategies could significantly enhance the efficiency of co-processing across multiple data centers.
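The abstract does not spell out how the dynamic data scheduling strategy avoids duplicated data requests. As a purely illustrative sketch, and not the MDCPS implementation, one simple way to get that effect is to coalesce requests for the same scene and destination data center so that only one transfer is issued. The class `TransferPlanner`, its methods, and the scene and center identifiers below are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TransferPlanner:
    """Hypothetical planner that coalesces duplicate data requests.

    Illustration only: this is an assumed design, not taken from the paper.
    """

    # Maps (scene_id, destination center) to the task ids waiting on it.
    pending: dict = field(default_factory=dict)

    def request(self, task_id, scene_id, dest):
        """Register a request; return True only if a new transfer is needed."""
        key = (scene_id, dest)
        is_new = key not in self.pending
        self.pending.setdefault(key, []).append(task_id)
        return is_new

    def transfers(self):
        """Return the distinct (scene_id, dest) transfers, in sorted order."""
        return sorted(self.pending)


planner = TransferPlanner()
requests = [
    ("task-1", "scene-A", "center-1"),
    ("task-2", "scene-A", "center-1"),  # same scene, same destination
    ("task-3", "scene-B", "center-1"),
    ("task-4", "scene-A", "center-2"),  # same scene, different destination
]
# One flag per request: True where a new transfer had to be scheduled.
issued = [planner.request(*r) for r in requests]
```

In this sketch the second request piggybacks on the transfer already scheduled for the first, so four task requests result in three transfers. A real system would also need to handle transfer completion, failures, and cache eviction, none of which is modeled here.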



Acknowledgments

Dr. Yan Ma’s work is supported by the National High Technology Research and Development Program of China (“863” Program) (No. 2013AA12A301). The authors would also like to acknowledge the editors and anonymous reviewers for their valuable comments on the manuscript.

Author information

Corresponding author

Correspondence to Yan Ma.

About this article

Cite this article

Zhang, J., Yan, J., Ma, Y. et al. Infrastructures and services for remote sensing data production management across multiple satellite data centers. Cluster Comput 19, 1243–1260 (2016). https://doi.org/10.1007/s10586-016-0577-6

  • DOI: https://doi.org/10.1007/s10586-016-0577-6
