Abstract
With the continuous development of technology, computer vision is becoming increasingly important in the cost monitoring of construction projects. Using computer vision, construction companies can monitor site progress, material usage, and the allocation of human resources in real time, and thereby control project costs more effectively. The technology can improve the efficiency of construction projects and reduce human error and waste, which matters for the successful completion of projects. Construction companies should therefore actively adopt computer vision to improve the efficiency and accuracy of cost monitoring. In recent years, China's economy has continued to grow, driving the development of infrastructure and manufacturing. At the same time, the production safety situation in China has become increasingly severe, and safety accidents continue to occur. One of the main causes is unsafe human behavior, such as workers wearing uniforms and safety helmets improperly, making phone calls, falling, or squatting for long periods during construction. In response, this article studies how the SSD (Single Shot MultiBox Detector), CNN (Convolutional Neural Network), and YOLO (You Only Look Once) algorithms can be applied to behavior detection of construction workers, and compares the proposed approach with existing object detection algorithms through experiments. The experimental data show that the AP (Average Precision), AP50, and AP75 values of the SSD-CNN algorithm in this article are 51.29%, 69.85%, and 54.81%, respectively, placing it among the top-ranked algorithms on all three indicators.
1 Introduction
The environment at construction sites is usually harsh: production materials are stored in a disorderly manner, and workers, equipment, and materials move through the same space. Accidents therefore occur easily. Many construction accidents are collisions, such as workers being hit by mobile equipment on site. Even if workers wear high-visibility clothing in accordance with existing safety regulations and standards, accidents may still occur. Adding further safety measures to protect construction workers is therefore of great research significance.
The purpose of this article is to explore and evaluate the feasibility and effect of applying computer vision technology in the cost monitoring of construction projects. Through the introduction of computer vision technology, functions such as automatic identification of workers, attendance records, working time tracking, work efficiency evaluation, and safety behavior monitoring can be realized. This computer vision-based cost monitoring method has many advantages, including reducing human error, improving data accuracy, real-time monitoring and evaluation of worker behavior, and improving safety.
Section 3 introduces the CNN algorithm, the principle of object detection algorithms, and project schedule management. Section 4 presents an experimental analysis of a construction worker object detection model based on the SSD-CNN algorithm. Finally, Section 5 summarizes the article.
2 Related Works
Experts have long conducted specialized research on cost monitoring for construction projects. Alizadehsalehi S established a digital twin-based approach to achieve real-time monitoring of construction progress [1]. Elghaish F found that drones can be combined with 4D Building Information Modeling (BIM) to evaluate project progress and check compliance with geometric design models [2]. Keskin B systematically explored how building information models can change complex digital infrastructure environments (such as airports) by improving the connections and collaboration between key stakeholders and building technology solutions. It was found that applying BIM can improve the utilization of the construction technology ecosystem and strengthen process interconnection among the participating entities [3]. Dallasega P analyzed the advantages and disadvantages of production planning and management methods in construction enterprises through a systematic review of the relevant literature [4]. Parsamehr M established a BIM-based decision-making system for construction enterprises and pointed out shortcomings that future research needs to address [5]. Rafsanjani H N compared virtual design and construction (VDC) with digital twins, revealed the important roles of the two technologies, analyzed the development trends of the construction industry, and made cost predictions [6]. Nafe Assafi M adopted a 4D construction information model to reduce manual intervention, human error, and project delays. He used the Autodesk Navisworks design software to build a 4D-BIM system and used it to simulate a project under construction that had been delayed by design errors and inefficient planning; the simulation of the actual project achieved good results [7]. Crowther J studied the role of 4D-BIM in construction projects.
The study identified eight ways in which 4D-BIM can support project performance. It also found that BIM coordinators lacked shared responsibility, that understanding of and training in 4D-BIM were severely lacking, and that effective execution required a complex process [8]. Chen X provided an overview of the technologies used in the construction industry and the benefits they bring [9]. Han Y reviewed existing research to summarize the current state of the prefabricated building supply chain and predict its future development trends [10]. Hou L reviewed how innovative technological and collaborative models can be introduced into the traditionally open construction industry. The collaborators involved include architects, engineers, architectural experts, real estate managers, and providers of building materials, software, production equipment, assembly equipment, and more [11]. Oke A E evaluated the application of the Internet of Things in construction projects with the aim of improving its level of adoption in the construction industry [12]. Namian M set out to identify and analyze the hazards, and the associated safety risks, that may threaten human life and property in construction projects. The research shows that the use of unmanned aerial vehicles can bring many hazards to construction projects that industry professionals are not aware of. The three most serious safety hazards are “collision with property”, “collision with people”, and “distraction” [13]. Statsenko L conducted systematic research on the construction industry, constructed a key technology system for Construction 4.0 (C4.0), and conducted empirical research on its application in the construction industry. C4.0 is suitable for energy conservation, prefabricated buildings, sustainable development, safety and environmental protection, indoor comfort, and efficient asset utilization [14].
To investigate the application of cloud computing for sustainable development in Nigeria's construction industry, Oke A E used exploratory factor analysis for an empirical study. The survey results show that cloud computing has significant advantages in information storage (location independence), situational awareness, team collaboration, compatibility with advanced production equipment, and optimization of engineering plans [15]. Traditional construction project cost monitoring often relies on manual input and processing of data and lacks real-time cost information, making it difficult to detect and solve cost problems in a timely manner. Moreover, it may fail to identify the reasons for cost fluctuations in time, so that cost control measures are ineffective. Construction projects involve a large number of workers and labor resources, and managing and monitoring the actual working conditions, attendance, and working hours of these workers is a complex task. These projects also carry safety risks, and workers need to comply with safety regulations and operating procedures to reduce the risk of accidents and personal injuries. It is essential to monitor the safety behavior and compliance of workers and to detect and resolve potential safety hazards in a timely manner. In addition, construction projects involve a large amount of material and equipment resources, and improper use or abuse of these resources by workers may lead to waste and increased costs. Monitoring and controlling workers’ use of resources, and preventing and reducing waste, is an important challenge.
3 Methods
3.1 CNN Algorithm
A CNN is essentially a multi-layer perceptron that can take images directly as input without complex image preprocessing. Its advantages are local connections and shared weights [16, 17]. Local connection changes how the input units of the previous layer connect to the hidden units: the input region connected to each hidden unit is its receptive field. Each hidden unit connects only to a small neighboring region of the previous layer, which greatly reduces computational complexity [18]. Parameter sharing is based on the assumption that the same parameters are equally useful at different points in the image. The units in the same feature plane therefore share one set of weights and biases, which has two benefits. On the one hand, a repeated pattern can be recognized regardless of its position in the field of view. On the other hand, the number of parameters is greatly reduced, which lowers computational complexity and time consumption. Weight sharing makes a CNN resemble a biological neural network, so convolutional neural networks have significant advantages in reducing the number of weights, especially when the input is a multi-dimensional image, and are well suited to machine image recognition [19, 20].
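The effect of local connections and shared weights on the parameter count can be illustrated with a small calculation. The sketch below is only an illustration: the image size (224 × 224 × 3), the 64 hidden units or output planes, and the 3 × 3 kernel are assumed values, not settings reported in this article, and the helper names are hypothetical.

```python
# Illustrative parameter counts; the layer sizes are assumptions, not values from the paper.

def dense_params(height, width, channels, hidden_units):
    """Weights plus biases when every hidden unit sees every input pixel."""
    return height * width * channels * hidden_units + hidden_units


def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a convolution: each output plane shares one
    kernel_size x kernel_size x in_channels filter and one bias."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


fully_connected = dense_params(224, 224, 3, 64)
convolutional = conv_params(3, 3, 64)
print(fully_connected, convolutional)  # 9633856 1792
```

Under these assumptions the convolutional layer needs several thousand times fewer parameters, and its count does not depend on the image size at all.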
3.2 Principle of Object Detection Algorithm
Object detection applies computer vision to locate and recognize objects in an image. It generates candidate boxes in the image and then classifies the object in each candidate box. The image is first fed into the trained model, which generates candidate boxes and classifies the objects in them; the NMS (Non-Maximum Suppression) algorithm then removes redundant candidate boxes. Figure 1 shows the identification process.
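The NMS step of this pipeline can be sketched in a few lines. This is a minimal greedy version written for illustration, not the implementation used in the experiments; the box format (x1, y1, x2, y2), the 0.5 overlap threshold, and the function names are assumptions.

```python
# Minimal greedy NMS sketch; box format and threshold are illustrative assumptions.

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the distant third box survives.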
Commonly used object detection models include region-based R-series models and regression-based models. R-series models use an end-to-end convolutional network for recognition and detection, which ensures a certain degree of accuracy, but their timeliness and detection speed need to be improved. Regression models transform the detection task into a regression problem, which improves detection speed and meets real-time requirements. This article uses YOLOv3, a regression-based model, for object detection.
3.3 Project Progress Management
Project schedule management mainly manages the contract period goals of engineering projects. To ensure that a project is completed as expected, a reasonable and scientific schedule plan must be prepared and then strictly followed during implementation. If the actual progress deviates from the planned progress, the reasons must be identified and adjustments or corrections made to keep the project on track. Schedule management is a core part of project management and holds the same status as cost management and quality management. Good schedule management not only allows for early completion, but also saves a great deal of manpower and material resources.
Reducing safety accidents and strengthening the safety management of workers has become a complex and important task. Firstly, this article proposes a method to obtain the bounding box of each person from the human skeleton information extracted by a CNN. The SSD algorithm then tracks the state of each person's box in the image in real time and assigns each person a distinct ID, so that people with improperly worn equipment or abnormal movements can be located quickly by ID. Finally, a cropped image of each person is extracted according to the size of the box, and the image is sent to the server together with the corresponding skeleton and ID information for recognition. The server then reports the individuals with problems back to the client.
The SSD (Single Shot MultiBox Detector) network mainly includes the following core features:
(1) Using multi-scale feature maps for object detection
The six feature maps used by SSD have different sizes. Moving from shallow to deep layers, convolution or pooling progressively reduces the feature map size, and predictions are made on these feature maps of different resolutions. Shallow feature maps, which are downsampled less, have relatively small receptive fields and are generally used to predict small objects; deep feature maps, which are downsampled more, have larger receptive fields and are generally used to predict large objects.
(2) Using convolution for detection
Some other single-stage network models, such as the original YOLO, use fully connected layers to predict the coordinates of bounding boxes and their corresponding classes. SSD instead applies convolution directly to feature maps of different scales to produce its detection results.
(3) Prior boxes with different scales and aspect ratios
The SSD algorithm sets prior boxes of different specifications on its feature maps of different scales, and the scale of the prior boxes increases linearly as the feature map size decreases. The prior box scale of each layer is given by Eq. (1).
$$ S_k = S_{\min } + \frac{S_{\max } - S_{\min }}{m - 1}(k - 1),\quad k \in [1,m] $$(1)
Among them, \(S_k\) represents the ratio of the prior box size on the k-th feature map to the image size, while \(S_{\min }\) and \(S_{\max }\) are the scales of the lowest and highest feature maps, 0.2 and 0.9, corresponding to the second and sixth feature maps. m represents the number of prediction feature layers, which would be 6 here; however, since the first feature map is set separately, the value used in the calculation is 5. The scale of the first feature map is generally set to \(S_{\min } /2\), which is 0.1.
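As a worked example, the prior box scales can be computed directly from the values given above. The sketch assumes the standard SSD linear rule \(S_k = S_{\min } + (S_{\max } - S_{\min })(k - 1)/(m - 1)\) for k = 1, …, m, with the first feature map handled separately as \(S_{\min } /2\); the function name is hypothetical.

```python
# Worked example of the prior box scales, assuming the standard SSD linear rule.

def prior_scales(s_min=0.2, s_max=0.9, m=5):
    """Scales for the six feature maps: the first is s_min / 2, the remaining
    m are spaced linearly between s_min and s_max."""
    scales = [s_min / 2]
    for k in range(1, m + 1):
        scales.append(s_min + (s_max - s_min) * (k - 1) / (m - 1))
    return scales


print([round(s, 3) for s in prior_scales()])
# [0.1, 0.2, 0.375, 0.55, 0.725, 0.9]
```

With these defaults the step between consecutive scales is (0.9 − 0.2)/4 = 0.175, so the second to sixth feature maps receive 0.2, 0.375, 0.55, 0.725, and 0.9.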
To reduce the computational cost of convolution operations, most mainstream lightweight neural network models currently use group convolution or depthwise separable convolution, but these designs still carry considerable computational cost, and grouping alone blocks the flow of information between groups. Permuting the channels after group convolution restores the information flow between groups, thereby enhancing the feature expression ability of the model. The essence of channel permutation is to achieve information exchange between grouped convolutional channels without increasing computational complexity. On this basis, the paper replaces traditional 3 × 3, 1 × 1, and other convolutional modules with group convolution and channel permutation, thereby achieving model compression.
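The two ideas in this paragraph can be sketched without any deep learning library. The code below is an illustration under assumed sizes (64 input and output channels, 4 groups), not the network configuration used in this article; the function names are hypothetical, and the permutation follows the usual reshape, transpose, and flatten pattern of channel shuffling.

```python
# Sketch of group convolution weight savings and channel permutation (shuffle).
# Channel counts and group numbers are illustrative assumptions.

def group_conv_weights(kernel_size, in_channels, out_channels, groups=1):
    """Number of weights (no biases) in a grouped convolution."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return kernel_size * kernel_size * (in_channels // groups) * (out_channels // groups) * groups


def channel_shuffle(channels, groups):
    """Interleave channels across groups: view the list as (groups, per_group),
    transpose it, and flatten it again."""
    if len(channels) % groups:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


print(group_conv_weights(3, 64, 64, groups=1))  # 36864
print(group_conv_weights(3, 64, 64, groups=4))  # 9216
print(channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2))
# ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

After the shuffle, each group of the next grouped convolution receives channels that originated in different groups, and the permutation itself adds no multiplications.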
The mean and variance used during testing are based on the moving averages accumulated during training, as follows:
(1) When testing or predicting, only a single sample may be passed at a time, so the model uses global statistics instead of batch statistics.
(2) When training on each batch, a pair (mean, variance) is obtained.
(3) The global statistics are the mathematical expectations of these batch means and variances, calculated as follows:
$$ E[x] \leftarrow E[\mu_i ] $$(2)
$$ Var[x] \leftarrow \frac{m}{m - 1}E[\sigma_i^2 ] $$(3)
Among them, \(\mu_i\) and \(\sigma_i\) respectively represent the mean and standard deviation saved for the i-th batch, m is the batch size, and the coefficient \(m/(m - 1)\) gives an unbiased estimate of the variance. At this point, BN(x) becomes:
$$ BN(x_i ) = \gamma \frac{x_i - E[x_i ]}{{\sqrt {Var[x_i ] + \varepsilon } }} + \beta $$(4)
Among them, \(\gamma\) and \(\beta\) are the learnable scale and shift parameters, and \(\varepsilon\) is a small constant that prevents division by zero.
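Equations (2) to (4) can be traced with a small numeric sketch. The batch statistics below are made-up numbers chosen for easy arithmetic, the function names are hypothetical, and gamma and beta are treated as the learned scale and shift of standard batch normalization.

```python
# Numeric sketch of Eqs. (2)-(4); all input numbers are made up for illustration.
import math


def global_statistics(batch_means, batch_vars, m):
    """Eq. (2) and Eq. (3): average the per-batch means and variances and apply
    the m / (m - 1) correction, where m is the batch size."""
    mean = sum(batch_means) / len(batch_means)
    var = (m / (m - 1)) * sum(batch_vars) / len(batch_vars)
    return mean, var


def batch_norm_inference(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Eq. (4): normalize one value with the global statistics."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta


mean, var = global_statistics(batch_means=[1.0, 3.0], batch_vars=[3.0, 5.0], m=4)
print(mean, round(var, 4))                     # 2.0 5.3333
print(batch_norm_inference(2.0, mean, var))    # 0.0
```

Because the statistics are fixed after training, a single sample can be normalized at test time without needing a batch.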
4 Results and Discussion
Commonly used measures for detection include Recall, Precision, Average Precision (AP), and Intersection over Union (IoU). Precision is the proportion of true positives (TP) among all detected targets, and Recall is the proportion of TP among all targets in the test set. The IoU threshold is set to 0.5: when the IoU between a predicted box and a ground-truth box is greater than or equal to 0.5, the detection is counted as correctly recognizing its class, and the most suitable box is then selected and output through non-maximum suppression. After training the model on a cluster for 500 epochs, good weight parameters were obtained. The Precision, Recall, and F1 values of the trained model for each category in the dataset are shown in Fig. 2.
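These measures can be sketched for a single image and a single class. The example is a simplified illustration, not the evaluation code behind Fig. 2: the boxes are invented, detections are assumed to be already sorted by confidence, each ground-truth box may be matched at most once, and the function names are hypothetical.

```python
# Simplified precision / recall / F1 computation with IoU matching.
# Boxes are (x1, y1, x2, y2); all data are invented for illustration.

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(detections, ground_truths, iou_threshold=0.5):
    matched, tp = set(), 0
    for det in detections:  # assumed sorted by descending confidence
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(ground_truths):
            overlap = iou(det, gt)
            if j not in matched and overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            tp += 1
    fp = len(detections) - tp      # detections with no matching target
    fn = len(ground_truths) - tp   # targets that were missed
    precision = tp / (tp + fp) if detections else 0.0
    recall = tp / (tp + fn) if ground_truths else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall > 0 else 0.0
    return precision, recall, f1


gts = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
dets = [(0, 0, 10, 10), (21, 21, 31, 31), (100, 100, 110, 110)]
print(precision_recall_f1(dets, gts))  # two TP, one FP, one FN: all three values are about 0.667
```

Here two of the three detections match a target and one target is missed, so precision, recall, and F1 all equal 2/3.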
From Fig. 2, it can be seen that the precision, recall, and F1 of worker behavior detection based on the SSD and CNN algorithms are 80%–95%, 60%–75%, and 70%–82.5%, respectively. Precision is therefore consistently higher than recall: most reported detections are correct, but a notable share of worker behaviors is missed. Improving recall is the main direction for further improving the model's ability to identify worker behavior.
Table 1 reports AP, AP50, and AP75 for each algorithm. AP50 is the AP computed when a detection counts as correct only if its IoU with the ground truth is at least 0.5, and AP75 uses the stricter IoU threshold of 0.75. AP50 therefore reflects performance under a lenient localization requirement, while AP75 rewards more precise localization; using them together with AP gives a more comprehensive evaluation of the model. The values of AP, AP50, and AP75 for the SSD-CNN algorithm in this article are 51.29%, 69.85%, and 54.81%, respectively. Among all the algorithms compared, these values rank near the top on all three indicators, indicating that the algorithm achieves strong detection performance under different evaluation criteria.
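How an AP value is obtained from ranked detections can be sketched as follows. The sketch assumes the common convention that AP50 and AP75 are the same computation with true positives decided at IoU thresholds of 0.5 and 0.75, and it uses all-point interpolation of the precision-recall curve; the input data and the function name are invented for illustration and are not the evaluation code behind Table 1.

```python
# Sketch of average precision from ranked detections (all-point interpolation).
# Each detection is (confidence, is_true_positive); the data are invented.

def average_precision(detections, num_ground_truth):
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Replace each precision by the best precision at equal or higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the resulting step curve.
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap


detections = [(0.9, True), (0.8, False), (0.7, True)]
print(round(average_precision(detections, num_ground_truth=2), 4))  # 0.8333
```

Re-labelling the same detections at a stricter IoU threshold can only turn true positives into false positives, which is why AP75 is never higher than AP50 for the same model.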
Figure 3 shows the histogram values when abnormal behavior occurs in a video recording. The video consists of 700 frames: the first 600 frames correspond to normal behavior and frames 600–700 to abnormal behavior. Normal behavior is marked with squares and abnormal behavior with circles. It can be seen that when a normal action becomes an abnormal action, human movement becomes more irregular and the histogram value increases. After frame 700, the person has left the camera's field of view. The histogram value detection algorithm based on the SSD and CNN algorithms can accurately detect the time at which abnormal behavior occurs.
From Fig. 4, it can be seen that the response times of SSD-CNN, Mask R-CNN, R-CNN, and FCOS are 1 s, 1.1 s, 1.1 s, and 0.2 s, respectively, the shortest among all the algorithms compared.
5 Conclusion
Traditional construction site inspection relies mainly on manual inspections and surveillance cameras, which are inefficient, costly, and error-prone. Detection technology based on artificial intelligence has therefore become a new solution. The SSD algorithm, as an advanced object detection algorithm, offers fast detection, high accuracy, and strong adaptability, and is widely used in construction site detection. By performing multi-scale convolution operations on images, the SSD algorithm discussed in this article can effectively detect various targets on construction sites, such as construction personnel, mechanical equipment, and material stacks. Compared with traditional detection methods, the SSD algorithm can not only achieve comprehensive monitoring of construction sites but also accurately identify and locate different targets, providing strong technical support for the management and monitoring of construction sites. Building on computer vision technology, this paper designs and implements a prototype cost monitoring system. The research results show that computer vision can realize automatic identification of workers, tracking of working hours, work efficiency evaluation, and safety behavior monitoring, and can provide a novel, automated, and accurate cost monitoring method. The results also emphasize the importance of technological innovation for improving construction project management. The introduction of computer vision has brought new ideas and methods to the cost monitoring of construction projects and promoted innovation and development in the field of project management. In the cost monitoring of construction projects, many specific issues still need to be studied in depth, for example, how to accurately identify workers’ work status and behavior, and how to effectively evaluate work efficiency and safety behavior. Future research can investigate these problems in depth and propose more specific solutions.
References
Alizadehsalehi, S., Yitmen, I.: Digital twin-based progress monitoring management model through reality capture to extended reality technologies (DRX). Smart and Sustainable Built Environment 12(1), 200–236 (2021)
Elghaish, F., Matarneh, S., Talebi, S., et al.: Toward digitalization in the construction industry with immersive and drones technologies: a critical literature review. Smart and Sustainable Built Environment 10(3), 345–363 (2021)
Keskin, B., Salman, B., Ozorhon, B.: Airport project delivery within BIM-centric construction technology ecosystems. Eng. Constr. Archit. Manag. 28(2), 530–548 (2021)
Dallasega, P., Marengo, E., Revolti, A.: Strengths and shortcomings of methodologies for production planning and control of construction projects: a systematic literature review and future perspectives. Production Planning & Control 32(4), 257–282 (2021)
Parsamehr, M., Perera, U.S., Dodanwala, T.C., et al.: A review of construction management challenges and BIM-based solutions: perspectives from the schedule, cost, quality, and safety management. Asian Journal of Civil Engineering 24(1), 353–389 (2023)
Rafsanjani, H.N., Nabizadeh, A.H.: Towards digital architecture, engineering, and construction (AEC) industry through virtual design and construction (VDC) and digital twin. Energy and Built Environment 4(2), 169–178 (2023)
Nafe Assafi, M., Hossain, M.M., Chileshe, N., et al.: Development and validation of a framework for preventing and mitigating construction delay using 4D BIM platform in Bangladeshi construction sector. Constr. Innov. 23(5), 1255–1278 (2023)
Crowther, J., Ajayi, S.O.: Impacts of 4D BIM on construction project performance. Int. J. Constr. Manag. 21(7), 724–737 (2021)
Chen, X., Chang-Richards, A.Y., Pelosi, A., et al.: Implementation of technologies in the construction industry: a systematic review. Eng. Constr. Archit. Manag. 29(8), 3181–3209 (2022)
Han, Y., Yan, X., Piroozfar, P.: An overall review of research on prefabricated construction supply chain management. Eng. Constr. Archit. Manag. 30(10), 5160–5195 (2023)
Hou, L., Tan, Y., Luo, W., et al.: Towards a more extensive application of off-site construction: a technological review. Int. J. Constr. Manag. 22(11), 2154–2165 (2022)
Oke, A.E., Arowoiya, V.A.: Evaluation of internet of things (IoT) application areas for sustainable construction. Smart and Sustainable Built Environment 10(3), 387–402 (2021)
Namian, M., Khalid, M., Wang, G., et al.: Revealing safety risks of unmanned aerial vehicles in construction. Transp. Res. Rec. 2675(11), 334–347 (2021)
Statsenko, L., Samaraweera, A., Bakhshi, J., et al.: Construction 4.0 technologies and applications: a systematic literature review of trends and potential areas for development. Construction Innovation 23(5), 961–993 (2023)
Oke, A.E., Kineber, A.F., Al-Bukhari, I., et al.: Exploring the benefits of cloud computing for sustainable construction in Nigeria. J. Eng. Desi. Technol. 21(4), 973–990 (2023)
Deepalakshmi, P., Lavanya, K., Srinivasu, P.N.: Plant leaf disease detection using CNN algorithm. Int. J. Info. Sys. Model. Desi. (IJISMD) 12(1), 1–21 (2021)
Wu, J.M.T., Li, Z., Herencsar, N., et al.: A graph-based CNN-LSTM stock price prediction algorithm with leading indicators. Multimedia Syst. 29(3), 1751–1770 (2023)
Permana, S.D.H., Saputra, G., Arifitama, B., et al.: Classification of bird sounds as an early warning method of forest fires using convolutional Neural Network (CNN) algorithm. J. King Saud Uni.-Comp. Info. Sci. 34(7), 4345–4357 (2022)
Chen, D.J.I.Z.: Automatic vehicle license plate detection using K-means clustering algorithm and CNN. J. Electr. Eng. Automat. 3(1), 15–23 (2021)
Liu, J., Ban, W., Chen, Y., et al.: Multi-dimensional CNN fused algorithm for hyperspectral remote sensing image classification. Zhongguo Jiguang/Chinese Journal of Lasers 48(16), 1–11 (2021)
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
About this paper
Cite this paper
Ou, X. (2024). Computer Vision Technology in Cost Monitoring of Construction Projects. In: Bieliatynskyi, A., Komyshev, D., Zhao, W. (eds) Proceedings of Conference on Sustainable Traffic and Transportation Engineering in 2023. CSTTE 2023. Lecture Notes in Civil Engineering, vol 603. Springer, Singapore. https://doi.org/10.1007/978-981-97-5814-2_50
DOI: https://doi.org/10.1007/978-981-97-5814-2_50
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5813-5
Online ISBN: 978-981-97-5814-2
eBook Packages: Engineering, Engineering (R0)