Construction Scene Parsing (CSP): Structured Annotations of Image Segmentation for Construction Semantic Understanding

Part of the Lecture Notes in Civil Engineering book series (LNCE, volume 98)


Images are ubiquitous in visualizing and documenting the as-built status of built infrastructure. Images captured on-site during construction contain rich semantic information, such as object categories, materials, and topological relationships, which is useful for many applications, including progress monitoring, crack detection, quality control, and safety management. Recent advances in deep learning and convolutional neural networks can effectively extract various types of semantic information from images, but they require annotated datasets for model training. Most existing scene parsing datasets contain annotated images captured at project closeout, and no dataset is available for construction site scene understanding. To support construction scene understanding, we present Construction Scene Parsing (CSP), an annotated image dataset that contains over 150 construction scenes with image segmentation labelled by experts. The CSP dataset makes two primary contributions: 1) it provides a hierarchical semantic structure, rather than a unitary label, for each image to deal with the incomplete and changing components present in construction images; 2) it provides pixel-wise annotations for every scene and can support various types of scene understanding tasks, such as object recognition, semantic segmentation, instance segmentation, and panoptic segmentation. The dataset can be accessed at


  • Semantic segmentation
  • Deep learning
  • Convolutional neural network
  • Construction
  • Images
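
The abstract describes two ideas that are easier to picture with a small example: a hierarchical label (coarse to fine) in place of a single label, and one pixel-wise annotation that can feed semantic, instance, and panoptic segmentation tasks. The sketch below is illustrative only. The class names, the `HIERARCHY` table, the toy 3x3 annotation, and the panoptic offset of 1000 are all assumptions made for this example; they are not the CSP taxonomy or file format.

```python
# Hypothetical hierarchy: leaf label -> path from coarse to fine.
# These names are illustrative and are NOT taken from the CSP dataset.
HIERARCHY = {
    "background": ("background",),
    "drywall": ("building", "wall", "drywall"),
    "stud": ("building", "wall", "stud"),
    "duct": ("building", "mep", "duct"),
}

# Stable integer id per leaf label (alphabetical order).
LEAF_IDS = {name: i for i, name in enumerate(sorted(HIERARCHY))}


def label_at_level(leaf, level):
    """Return the label of `leaf` at a hierarchy level (0 = coarsest).

    Paths shorter than `level` fall back to their finest label.
    """
    path = HIERARCHY[leaf]
    return path[min(level, len(path) - 1)]


# Toy pixel-wise annotation: each pixel is (leaf_label, instance_id).
# instance_id 0 marks regions that are not counted as separate objects.
annotation = [
    [("background", 0), ("stud", 1), ("stud", 2)],
    [("drywall", 0), ("stud", 1), ("stud", 2)],
    [("drywall", 0), ("duct", 1), ("duct", 1)],
]


def semantic_map(ann, level):
    """Semantic segmentation target: one class label per pixel."""
    return [[label_at_level(leaf, level) for leaf, _ in row] for row in ann]


def instance_masks(ann):
    """Instance segmentation target: pixel set per (label, instance_id)."""
    masks = {}
    for r, row in enumerate(ann):
        for c, (leaf, inst) in enumerate(row):
            if inst > 0:
                masks.setdefault((leaf, inst), set()).add((r, c))
    return masks


def panoptic_map(ann, offset=1000):
    """Panoptic target: class and instance packed into one integer.

    The offset encoding is an arbitrary choice for this sketch.
    """
    return [[LEAF_IDS[leaf] * offset + inst for leaf, inst in row] for row in ann]


if __name__ == "__main__":
    for row in semantic_map(annotation, level=1):
        print(row)
    print(sorted(instance_masks(annotation)))
    print(panoptic_map(annotation))
```

Reading the same annotation at `level=1` merges studs and drywall into a single "wall" class, which is one plausible way a hierarchy could accommodate components that are only partly built when the image is taken.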

  • DOI: 10.1007/978-3-030-51295-8_80
  • Chapter length: 10 pages


Author information

Corresponding author

Correspondence to Yujie Wei.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Wei, Y., Akinci, B. (2021). Construction Scene Parsing (CSP): Structured Annotations of Image Segmentation for Construction Semantic Understanding. In: Toledo Santos, E., Scheer, S. (eds) Proceedings of the 18th International Conference on Computing in Civil and Building Engineering. ICCCBE 2020. Lecture Notes in Civil Engineering, vol 98. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-51294-1

  • Online ISBN: 978-3-030-51295-8

  • eBook Packages: Engineering, Engineering (R0)