Abstract
Big data technologies are growing rapidly and benefit a wide range of science and engineering domains, yet many barriers still prevent the remote sensing community from fully exploiting them. To help overcome these barriers, this paper presents the in-depth experience gained when adopting a distributed computing framework, Hadoop HBase, for the storage, indexing, and integration of large-scale, high-resolution laser scanning point cloud data. Four data models were conceptualized, implemented, and rigorously investigated to explore the advantageous features of distributed, key-value database systems. In addition, comparing the four models allowed several well-known point cloud management techniques, originally developed for traditional computing environments, to be reassessed in the new context of a distributed, key-value database. The four models were derived from two row-key designs and two column structures, thereby demonstrating various considerations in developing a data solution for a high-resolution, city-scale aerial laser scan of a portion of Dublin, Ireland. This paper presents lessons learned from the data model design and its implementation for spatial data management in a distributed computing framework. The study is a step towards full exploitation of powerful emerging computing assets for dense spatio-temporal data.
References
Vo, A.V., Laefer, D.F., Bertolotto, M.: Airborne laser scanning data storage and indexing: state of the art review. Int. J. Remote Sens. 37(24), 6187–6204 (2016). https://doi.org/10.1080/01431161.2016.1256511
Kitchin, R., McArdle, G.: What makes Big Data, Big Data? exploring the ontological characteristics of 26 datasets. Big Data Soc. 3(1), 1–10 (2016). https://doi.org/10.1177/2053951716631130
Ghemawat, S., Gobioff, H., Leung, S.T.: The Google file system. In: Proceedings of the 19th ACM Symposium Operating Systems Principles, New York, pp. 29–43 (2003)
Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2004). https://doi.org/10.1145/1327452.1327492
White, T.: Hadoop: The Definitive Guide, 4th edn. O’Reilly, Massachusetts (2015)
Chang, F., et al.: Bigtable: a distributed storage system for structured data. ACM Trans. Comput. Syst. 26(2), 4 (2008). https://doi.org/10.1145/1365815.1365816
George, L.: HBase: The Definitive Guide, 1st edn. O’Reilly, Massachusetts (2011)
Middleton, W., Spilhaus, A.: The measurement of atmospheric humidity. In: Meteorological Instruments, Toronto, pp. 105–111 (1953)
Shepherd, E.C.: Laser to watch height: New Scientist, vol. 6, no. 437, p. 33 (1965)
van Oosterom, P., et al.: Massive point cloud data management: design, implementation and execution of a point cloud benchmark. Comput. & Graph. 49, 92–125 (2015). https://doi.org/10.1016/j.cag.2015.01.007
Cura, R., Perret, J., Paparoditis, N.: A scalable and multi-purpose point cloud server (PCS) for easier and faster point cloud data management and processing. ISPRS J. Photogramm. Remote Sens. 127, 39–56 (2017). https://doi.org/10.1016/j.isprsjprs.2016.06.012
Krishnan, S., Baru, C., Crosby, C.: Evaluation of MapReduce for gridding LIDAR data. In: 2010 IEEE Second International Conference on Cloud Computing Technology and Science, pp. 33–40 (2010). https://doi.org/10.1109/cloudcom.2010.34
Li, Z., Hodgson, M.E., Li, W.: A general-purpose framework for parallel processing of large-scale LiDAR data. Int. J. Digit. Earth 11(1), 26–47 (2017). https://doi.org/10.1080/17538947.2016.1269842
Rizki, P.N.M., Eum, J., Lee, H., Oh, S.: Spark-based in-memory DEM creation from 3D LiDAR point clouds. Remote Sens. Lett. 8(4), 360–369 (2017). https://doi.org/10.1080/2150704X.2016.1275053
Hamraz, H., Contreras, M.A., Zhang, J.: A scalable approach for tree segmentation within small-footprint Airborne LiDAR data. Comput. Geosci. 8(4), 360–369 (2017). https://doi.org/10.1080/2150704X.2016.1275053
Aljumaily, H., Laefer, D.F., Cuadra, D.: Urban point cloud mining based on density clustering and MapReduce. J. Comput. Civ. Eng. 31(5) (2017). https://doi.org/10.1061/(asce)cp.1943-5487.0000674
Moler, C.: Matrix computation on distributed memory multiprocessors. In: Hypercube Multiprocessors 1986, pp. 181–195 (1987)
Baumann, P., et al.: Big Data analytics for Earth sciences: the EarthServer approach. Int. J. Digit. Earth 9(1), 3–29 (2015). https://doi.org/10.1080/17538947.2014.1003106
Boehm, J., Liu, K.: NoSQL for storage and retrieval of large LiDAR data collections. In: ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XL-3/W3, pp. 577–582, La Grande Motte (2015)
Martinez-Rubi, O., et al.: Benchmarking and improving point cloud data management in MonetDB. SIGSPATIAL Special - Big Spatial 6(2), 11–18 (2014). https://doi.org/10.1145/2744700.2744702
Gertz, M., Renz, M., Zhou, X., Hoel, E., Ku, W.-S., Voisard, A., Zhang, C., Chen, H., Tang, L., Huang, Y., Lu, C.-T., Ravada, S. (eds.): SSTD 2017. LNCS, vol. 10411. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-64367-0
Mosa, A.S.M., Schön, B., Bertolotto, M., Laefer, D.F.: Evaluating the benefits of octree-based indexing for LiDAR data. Photogramm. Eng. Remote Sens. 78(9), 927–934 (2012). https://doi.org/10.14358/PERS.78.9.927
Ramsey, P.: LiDAR in PostgreSQL with PointCloud. In: FOSS4G, Nottingham (2013)
Nandigam, V., Baru, C., Crosby, C.: Database design for high-resolution LIDAR topography data. In: Gertz, M., Ludäscher, B. (eds.) SSDBM 2010. LNCS, vol. 6187, pp. 151–159. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13818-8_12
Murray, C., et al.: Oracle Spatial and Graph - developer’s guide, 12c Release 1 (2017). https://docs.oracle.com/database/121/SPATL/toc.htm
Vo, A.-V.: Spatial data storage and processing strategies for urban laser scanning. Ph.D. thesis. University College Dublin (2017). https://doi.org/10.13140/rg.2.2.12798.48962
Haverkort, H., van Walderveen, F.: Locality and bounding-box quality of two-dimensional space-filling curves. Comput. Geom. 43(2), 131–147 (2008). https://doi.org/10.1016/j.comgeo.2009.06.002
Wang, J., Shan, J.: Space-filling curve based point clouds index. In: Proceedings of the 8th International Conference on GeoComputation, Michigan (2005)
Psomadaki, S., van Oosterom, P.J.M., Tijssen, T.P.M., Baart, F.: Using a space filling curve approach for the management of dynamic point clouds. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, IV-2/W1, pp. 107–118 (2016). https://doi.org/10.5194/isprs-annals-iv-2-w1-107-2016
Towns, J., Cockerill, T., Dahan, M., Foster, I., Gaither, K., Grimshaw, A., Hazlewood, V., Lathrop, S., Lifka, D., Peterson, G.D., Roskies, R., Scott, J.R., Wilkins-Diehr, N.: XSEDE: accelerating scientific discovery. Comput. Sci. Eng. 16(5), 62–74 (2014). https://doi.org/10.1109/mcse.2014.80
Acknowledgments
The Hadoop cluster used for the work presented in this paper was provided through allocation TG-CIE170036 of the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562 [30]. The authors would like to thank the staff at the Pittsburgh Supercomputing Center for the truly outstanding technical support provided while setting up the tests. This research also made use of data collected with funding from the European Research Council grant ERC-2012-StG 20111012 “RETURN - Rethinking Tunnelling in Urban Neighbourhoods” Project 307836.
The dataset is available from the NYU Spatial Data Repository: https://doi.org/10.17609/N8MQ0N.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Vo, AV., Konda, N., Chauhan, N., Aljumaily, H., Laefer, D.F. (2018). Lessons Learned with Laser Scanning Point Cloud Management in Hadoop HBase. In: Smith, I., Domer, B. (eds) Advanced Computing Strategies for Engineering. EG-ICE 2018. Lecture Notes in Computer Science, vol 10863. Springer, Cham. https://doi.org/10.1007/978-3-319-91635-4_13
DOI: https://doi.org/10.1007/978-3-319-91635-4_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91634-7
Online ISBN: 978-3-319-91635-4
eBook Packages: Computer Science (R0)