Abstract
Images are widely used to visualize and document the as-built status of built infrastructure. Images captured on site during construction contain rich semantic information, such as object categories, materials, and topological relationships, which is useful for many applications, including progress monitoring, crack detection, quality control, and safety management. Recent advances in deep learning and convolutional neural networks can effectively extract various types of semantic information from images, but they require annotated datasets for model training. Most existing scene parsing datasets contain annotated images captured at project closeout, and a dataset suitable for construction site scene understanding is lacking. To support construction scene understanding, we present Construction Scene Parsing (CSP), an annotated image dataset that contains over 150 construction scenes with image segmentation labelled by experts. The CSP dataset makes two primary contributions: 1) it provides a hierarchical semantic structure, rather than a unitary label, for each image to deal with the incomplete and changing components present in construction images; 2) it provides pixel-wise annotations for every scene and can support various types of scene understanding tasks, such as object recognition, semantic segmentation, instance segmentation, and panoptic segmentation. The dataset can be accessed at https://github.com/yugitw/Construction-Scene-Parsing.
Keywords
- Semantic segmentation
- Deep learning
- Convolutional neural network
- Construction
- Images
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wei, Y., Akinci, B. (2021). Construction Scene Parsing (CSP): Structured Annotations of Image Segmentation for Construction Semantic Understanding. In: Toledo Santos, E., Scheer, S. (eds) Proceedings of the 18th International Conference on Computing in Civil and Building Engineering. ICCCBE 2020. Lecture Notes in Civil Engineering, vol 98. Springer, Cham. https://doi.org/10.1007/978-3-030-51295-8_80
DOI: https://doi.org/10.1007/978-3-030-51295-8_80
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51294-1
Online ISBN: 978-3-030-51295-8
eBook Packages: Engineering, Engineering (R0)