Abstract
With growing interest in large-scale, high-resolution, and real-time geographic information system (GIS) applications and in spatial big data processing, traditional GIS is not efficient enough to handle the required loads because of its limited computational capabilities. Various attempts have been made to address these challenges by adopting high-performance computing techniques from other applications, such as advanced architecture designs, data partitioning strategies, and direct parallelization of spatial analysis algorithms. This paper surveys the current state of parallel GIS with respect to parallel GIS architectures, parallel processing strategies, and related topics. We present the general evolution of GIS architecture, including the two main parallel GIS architectures, which are based on high-performance computing clusters and Hadoop clusters. We then summarize current spatial data partitioning strategies, key methods for realizing parallel GIS from the viewpoint of data decomposition, and progress on special parallel GIS algorithms. We use the parallel processing of GRASS as a case study. We also identify key problems and potential future research directions for parallel GIS.




Acknowledgments
This study was supported by the National Natural Science Foundation of China (41301028).
Cite this article
Zhao, L., Chen, L., Ranjan, R. et al. Geographical information system parallelization for spatial big data processing: a review. Cluster Comput 19, 139–152 (2016). https://doi.org/10.1007/s10586-015-0512-2
DOI: https://doi.org/10.1007/s10586-015-0512-2