Lessons Learned with Laser Scanning Point Cloud Management in Hadoop HBase
While big data technologies are growing rapidly and benefit a wide range of science and engineering domains, many barriers remain for the remote sensing community to fully exploit the benefits of these powerful and rapidly developing technologies. To help overcome these barriers, this paper presents the in-depth experience gained when adopting a distributed computing framework, Hadoop HBase, for storage, indexing, and integration of large-scale, high-resolution laser scanning point cloud data. Four data models were conceptualized, implemented, and rigorously investigated to explore the advantageous features of distributed, key-value database systems. In addition, comparing the four models facilitated the reassessment, in the new context of a distributed, key-value database, of several well-known point cloud management techniques established in traditional computing environments. The four models were derived from two row-key designs and two column structures, thereby demonstrating various considerations in developing a data solution for a high-resolution, city-scale aerial laser scan of a portion of Dublin, Ireland. This paper presents lessons learned from the data model design and its implementation for spatial data management in a distributed computing framework. The study is a step towards full exploitation of powerful emerging computing assets for dense spatio-temporal data.
Keywords: LiDAR, Point cloud, Big Data, Spatial data management, Hadoop HBase, Distributed database
The Hadoop cluster used for the work presented in this paper was provided by allocation TG-CIE170036 from the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. The authors would like to thank the staff at the Pittsburgh Supercomputing Center for the truly outstanding technical support provided while setting up the tests. This research also made use of data collected with funding from the European Research Council grant ERC-2012-StG 20111012 “RETURN - Rethinking Tunnelling in Urban Neighbourhoods” Project 307836.
The dataset is available from NYU Spatial Data Repository https://doi.org/10.17609/N8MQ0N.