Abstract
State-of-the-art Internet of Things (IoT) and mobile monitoring systems promise to help gather real-time progress information from construction sites. However, on remote sites the adoption of these technologies is frequently difficult due to a lack of infrastructure and often harsh, dynamic environments. Visual inspection by experts, on the other hand, usually allows a quick assessment of a project's state. In some fields, drones are already commonly used to capture aerial footage for the purpose of state estimation by domain experts.
We propose a two-stage model for progress estimation that leverages images taken on site. Stage 1 is dedicated to extracting possible visual cues, such as vehicles and resources. Stage 2 is trained to map these visual cues to specific project states. Compared to an end-to-end learning task, we aim for an interpretable representation after the first stage (e.g. which objects are present, and later what their spatial and semantic relationships are). We evaluated possible methods for the pipeline in two use-case scenarios: (1) road construction and (2) wind turbine construction.
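The two-stage decomposition described above can be sketched as a simple function composition. This is a minimal illustration, not the paper's implementation: the cue names and the stage-2 rule below are hypothetical placeholders, and a real stage 1 would run a trained detector or segmentation model.

```python
# Minimal sketch of the two-stage pipeline, assuming stage 1 yields an
# interpretable mapping from visual-cue names to counts. All cue names
# and the stage-2 rule are hypothetical placeholders.

def stage1_extract_cues(image):
    """Stage 1: extract interpretable visual cues from an image.

    A real implementation would run an object detector (e.g. YOLOv3-SPP)
    and/or a semantic segmentation model; this stub returns fixed cues.
    """
    return {"excavator": 2, "gravel_pile": 1}

def stage2_estimate_state(cues):
    """Stage 2: map visual cues to a project state.

    Placeholder rule; the paper trains a classifier for this step.
    """
    if cues.get("excavator", 0) > 0:
        return "earthworks"
    return "unknown"

def estimate_progress(image):
    """Compose the two stages into a single progress estimator."""
    return stage2_estimate_state(stage1_extract_cues(image))

print(estimate_progress(None))  # prints "earthworks" for the stubbed cues
```

Because the intermediate cue dictionary is human-readable, a domain expert can inspect why a given state was predicted, which is the motivation for preferring this decomposition over an end-to-end model.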
We evaluated YOLOv3-SPP for object detection and compared various methods for image segmentation, such as encoder-decoder architectures and DeepLab v3. For progress state estimation, a simple decision-tree classifier was used in both scenarios. Finally, we tested progress estimation with a sentence-classification network operating on free-text descriptions provided for the images.
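To make the stage-2 step concrete, the following is a hedged sketch of a decision-tree classifier over object-count features, using scikit-learn's `DecisionTreeClassifier` (scikit-learn is cited in the references). The cue vocabulary, toy training data, and state labels are invented for illustration and do not reflect the paper's datasets.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical cue vocabulary produced by stage 1 (object detection).
CUES = ["excavator", "roller", "paver", "gravel_pile", "asphalt_surface"]

def cues_to_features(detections):
    """Turn a list of detected object labels into a count vector over CUES."""
    return [detections.count(c) for c in CUES]

# Toy training set: stage-1 detections per image, with annotated states.
train_detections = [
    ["excavator", "gravel_pile"],
    ["excavator", "excavator", "gravel_pile"],
    ["roller", "gravel_pile"],
    ["paver", "roller", "asphalt_surface"],
    ["asphalt_surface"],
]
train_states = ["earthworks", "earthworks", "compaction", "paving", "finished"]

X = [cues_to_features(d) for d in train_detections]
clf = DecisionTreeClassifier(random_state=0).fit(X, train_states)

# Stage-2 inference on a new image's stage-1 output:
state = clf.predict([cues_to_features(["excavator", "gravel_pile"])])[0]
print(state)  # prints "earthworks"
```

A decision tree is a natural fit here: it operates directly on the interpretable count features, and the learned splits (e.g. "is a paver present?") can themselves be read off and checked by a domain expert.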
This work has been partly funded by the Federal Ministry of Education and Research of Germany (BMBF) within the framework of the project ConWearDi (project number 02K16C034).
References
Anwar, N., Izhar, M.A., Najam, F.A.: Construction monitoring and reporting using drones and unmanned aerial vehicles (UAVs). In: The Tenth International Conference on Construction in the 21st Century (CITC-10) (2018)
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
Bucchiarone, A., et al.: Smart construction: remote and adaptable management of construction sites through IoT. IEEE Internet Things Mag. 2(3), 38–45 (2019). https://doi.org/10.1109/IOTM.0001.1900044
Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)
Congress, S.S.C., Puppala, A.J.: Novel methodology of using aerial close range photogrammetry technology for monitoring the pavement construction projects. In: International Airfield and Highway Pavements Conference 2019, pp. 121–130. American Society of Civil Engineers (2019). https://doi.org/10.1061/9780784482476.014
Drath, R., Horch, A.: Industrie 4.0: hit or hype? [Industry Forum]. IEEE Ind. Electron. Mag. 8(2), 56–58 (2014)
Jocher, G., et al.: ultralytics/yolov3: Rectangular Inference, Conv2d + Batchnorm2d Layer Fusion (2019). https://doi.org/10.5281/zenodo.2672652
Kestur, R., et al.: UFCN: a fully convolutional neural network for road extraction in RGB imagery acquired by remote sensing from an unmanned aerial vehicle. J. Appl. Remote Sens. 12(1), 016020 (2018). https://doi.org/10.1117/1.JRS.12.016020
Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746–1751. Association for Computational Linguistics, Doha, Qatar, October 2014. https://doi.org/10.3115/v1/D14-1181, https://www.aclweb.org/anthology/D14-1181
Kopsida, M., Brilakis, I., Vela, P.: A review of automated construction progress monitoring and inspection methods. In: Proceedings of the 32nd CIB W78 Conference on Construction IT (2015)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Navon, R., Shpatnitsky, Y.: Field experiments in automated monitoring of road construction. J. Constr. Eng. Manage. 131(4), 487–493 (2005). https://doi.org/10.1061/(ASCE)0733-9364(2005)131:4(487)
Navon, R., Shpatnitsky, Y.: A model for automated monitoring of road construction. Constr. Manage. Econ. 23(9), 941–951 (2005). https://doi.org/10.1080/01446190500183917
Otto, A., Agatz, N., Campbell, J., Golden, B., Pesch, E.: Optimization approaches for civil applications of unmanned aerial vehicles (UAVs) or aerial drones: a survey. Networks 72(4), 411–458 (2018). https://doi.org/10.1002/net.21818
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Peng, C., Zhang, X., Yu, G., Luo, G., Sun, J.: Large kernel matters – improve semantic segmentation by global convolutional network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4353–4361 (2017)
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263–7271 (2017)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y
Valada, A., Vertens, J., Dhall, A., Burgard, W.: AdapNet: adaptive semantic segmentation in adverse environmental conditions. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 4644–4651. IEEE (2017)
Vick, S., Brilakis, I.: Road design layer detection in point cloud data for construction progress monitoring. J. Comput. Civ. Eng. 32(5) (2018). https://doi.org/10.1061/(ASCE)CP.1943-5487.0000772
Vinyals, O., Toshev, A., Bengio, S., Erhan, D.: Show and tell: a neural image caption generator. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3156–3164 (2015)
Wu, W., et al.: Coupling deep learning and UAV for infrastructure condition assessment automation. In: 2018 IEEE International Smart Cities Conference (ISC2), pp. 1–7. IEEE, 16–19 September 2018. https://doi.org/10.1109/ISC2.2018.8656971
Xiao, X., Wang, L., Ding, K., Xiang, S., Pan, C.: Deep hierarchical encoder-decoder network for image captioning. IEEE Trans. Multimed. 21(11), 2942–2956 (2019)
Yao, T., Pan, Y., Li, Y., Mei, T.: Exploring visual relationship for image captioning. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 711–727. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9_42
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Cite this paper
Hevesi, P. et al. (2021). Towards Construction Progress Estimation Based on Images Captured on Site. In: Peñalver, L., Parra, L. (eds) Industrial IoT Technologies and Applications. Industrial IoT 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 365. Springer, Cham. https://doi.org/10.1007/978-3-030-71061-3_9
Print ISBN: 978-3-030-71060-6
Online ISBN: 978-3-030-71061-3