
1 Introduction

Construction sites are usually harsh environments: production materials are stored in a disorderly manner, and workers, equipment, and materials are constantly moving around one another. Accidents therefore occur easily. Many of them are collisions, such as workers being struck by mobile equipment. Even when workers wear high-visibility clothing in accordance with existing safety regulations and standards, accidents may still occur. Research into additional measures that protect construction workers is therefore of great significance.

The purpose of this article is to explore and evaluate the feasibility and effect of applying computer vision technology to the cost monitoring of construction projects. With computer vision, functions such as automatic identification of workers, attendance recording, working-time tracking, work-efficiency evaluation, and safety-behavior monitoring can be realized. This computer vision-based cost monitoring method has several advantages: it reduces human error, improves data accuracy, enables real-time monitoring and evaluation of worker behavior, and improves safety.

Section 3 introduces the CNN algorithm, the principle of the object detection algorithm, and project schedule management. Section 4 presents an experimental analysis of a construction worker object detection model based on the SSD-CNN algorithm. Section 5 summarizes the article.

2 Related Works

Researchers have long studied cost monitoring for construction projects. Alizadehsalehi S established a digital twin based approach to achieve real-time monitoring of construction progress [1]. Elghaish F found that drones can be combined with 4D Building Information Modeling (BIM) to evaluate project progress and check compliance with geometric design models [2]. Keskin B systematically explored how building information models can change complex digital infrastructure environments (such as airports) by improving the connections and collaboration between key stakeholders and building technology solutions. The study found that applying BIM can improve the utilization of the construction technology ecosystem and enhance process interconnection among the participating entities [3]. Dallasega P analyzed the advantages and disadvantages of production planning and management methods in construction enterprises through a systematic review of the relevant literature [4]. Parsamehr M established a BIM-based decision-making system for construction enterprises and pointed out shortcomings to be addressed in future research [5]. Rafsanjani H N compared virtual design and manufacturing with digital twins to reveal the important roles of the two technologies, analyzed the development trends of the construction industry, and made cost predictions [6]. Nafe Assafi M adopted a 4D construction information model to reduce manual intervention, human error, and project delays. A 4D-BIM system was designed with Autodesk Navisworks and used to simulate a project under construction that had been delayed by design errors and inefficient planning; the simulation of this real project gave good results [7]. Crowther J studied the role of 4D-BIM in construction projects.
Eight ways in which 4D-BIM can support project performance were identified. It was also found that BIM coordinators lacked shared responsibility, that understanding of and training in 4D-BIM were severely lacking, and that the process for effective execution was complex [8]. Chen X provided an overview of the technologies used in the construction industry and the benefits they bring [9]. Han Y reviewed existing research to summarize the current state of the prefabricated building supply chain and predict its future development trends [10]. Hou L brought the traditional, open construction industry into innovative technological and collaborative models. The collaborating designers include architects, engineers, architectural experts, real estate managers, and providers of building materials, software, production equipment, assembly equipment, and more [11]. Oke A E evaluated the application of the Internet of Things in construction projects with the aim of improving its level of adoption in the construction industry [12]. To identify and analyze the hazards and associated safety risks that may threaten human life and property in construction projects, Namian M showed that the use of unmanned aerial vehicles can introduce many hazards that industry professionals are not aware of. The three most serious safety hazards are “collision with property”, “collision with people”, and “distraction” [13]. Statsenko L conducted systematic research on the construction industry, constructed a key technology system for Construction 4.0 (C4.0), and studied its application in the construction industry empirically. C4.0 is suitable for energy conservation, prefabricated buildings, sustainable development, safety and environmental protection, indoor comfort, and efficient asset utilization [14].
To investigate the application of cloud computing to sustainable development in Nigeria's construction industry, Oke A E carried out an empirical analysis using exploratory factor analysis. The survey results show significant advantages in terms of extensive application in information storage (location independence), high situational awareness, team collaboration, compatibility with advanced production equipment, and optimization of engineering plans [15].

Traditional construction project cost monitoring often relies on manual data entry and processing and lacks real-time cost information, which makes it difficult to detect and solve cost problems in time. It may also fail to identify the reasons for cost fluctuations promptly, so that cost control measures are ineffective. Construction projects involve a large number of workers and labor resources, and managing and monitoring their actual working conditions, attendance, and working hours is a complex task. These projects carry safety risks, and workers need to comply with safety regulations and operating procedures to reduce the risk of accidents and personal injuries. It is therefore essential to monitor the safety behavior and compliance of workers and to detect and resolve potential safety hazards in time. In addition, construction projects involve large amounts of material and equipment, and improper use or abuse of these resources by workers may lead to waste and increased costs. Monitoring and controlling workers’ use of resources to prevent and reduce waste is an important challenge.

3 Methods

3.1 CNN Algorithm

A CNN is in essence a multi-layer perceptron that can take images as direct input without complex preprocessing. Its main advantages are local connections and shared weights [16, 17]. Local connection changes how the units of the input layer are connected to the hidden units: each hidden unit is connected only to a small region of adjacent input units, called its receptive field, which greatly reduces computational complexity [18]. Weight sharing is based on the assumption that a given parameter is equally important at every position in the image. All units in the same feature plane therefore share one set of weights and biases, which has two benefits. First, a repeated pattern can be recognized regardless of its position in the field of view. Second, the number of parameters is greatly reduced, which lowers computational complexity and running time. Weight sharing also makes a CNN resemble the way biological vision analyzes images. Convolutional neural networks therefore have significant advantages in reducing the number of weights, especially for multi-dimensional image input, and are well suited to machine image recognition [19, 20].
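To make the effect of local connections and weight sharing concrete, the following sketch counts the weights needed to map one image to one feature plane under three connection schemes. The 32 × 32 grayscale input, the 5 × 5 receptive field, and all function names are illustrative assumptions, not values taken from this article.

```python
def fully_connected_params(height, width, hidden_units):
    """Each hidden unit connects to every input pixel, plus one bias per unit."""
    return hidden_units * (height * width) + hidden_units


def locally_connected_params(hidden_units, field):
    """Each hidden unit sees only a field x field patch but keeps its own weights."""
    return hidden_units * (field * field) + hidden_units


def shared_weight_params(field):
    """All hidden units in one feature plane share a single kernel and one bias."""
    return field * field + 1


# Assumed toy setting: 32 x 32 grayscale input, 5 x 5 receptive field,
# stride 1, no padding, so the feature plane has 28 x 28 hidden units.
H, W, FIELD = 32, 32, 5
hidden = (H - FIELD + 1) * (W - FIELD + 1)

fc = fully_connected_params(H, W, hidden)
local = locally_connected_params(hidden, FIELD)
shared = shared_weight_params(FIELD)
print(fc, local, shared)
```

Under these assumptions the count drops from 803,600 (fully connected) to 20,384 (local connections only) and to 26 (local connections with shared weights), which is the reduction the paragraph above describes.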

3.2 Principle of Object Detection Algorithm

Object detection is a computer vision method that locates and classifies the objects in an image. It generates candidate boxes in the image and then classifies the object in each candidate box. The image is first fed into the trained model, which generates candidate boxes and classifies the objects they contain; the NMS (Non-Maximum Suppression) algorithm then removes redundant candidate boxes. Figure 1 shows the detection process.

Fig. 1. Object detection process
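The last step of the detection process, removing redundant candidate boxes with NMS, can be sketched as follows. This is a minimal greedy version written for illustration; the box format `(x1, y1, x2, y2)`, the function names, and the default threshold of 0.5 are assumptions, not details given in this article.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical candidates: two boxes on the same worker, one on another object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

In this example the second box overlaps the first with an IoU of about 0.82 and is suppressed, so only the first and third boxes remain.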

Commonly used object detection models fall into two groups: region-based R-series models and regression-based models. R-series models use an end-to-end convolutional network for recognition and detection and achieve good accuracy, but their detection speed and timeliness need to be improved. Regression-based models transform the detection task into a regression problem, which improves detection speed and meets real-time requirements. This article uses the YOLO V3 algorithm, a regression-based model, for object detection.

3.3 Project Progress Management

Project schedule management mainly manages the contract-period goals of an engineering project. To ensure that the project is completed as expected, a reasonable and scientific schedule must be prepared and then strictly followed during implementation. If the actual progress deviates from the planned progress, the reasons must be identified and adjustments or corrections made so that the project proceeds smoothly. Schedule management is one of the most important parts of project management and has the same status as cost management and quality management. Good schedule management not only makes early completion possible but also saves considerable manpower and material resources.

Reducing safety accidents and strengthening the safety management of workers has become a complex and important task. This article first proposes a method that obtains the bounding box of each person from the human skeleton information extracted by a CNN. The SSD algorithm then tracks the state of each person's bounding box in the image in real time and assigns each person a distinct ID. Based on the ID, people whose wear is improper or whose movements are abnormal can be located quickly. Finally, an image of each person is cropped according to the size of the bounding box, and the person images, together with the corresponding skeleton and ID information, are sent to the server for recognition. The server reports the individuals with problems back to the client.
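The first step, deriving a person's bounding box from skeleton keypoints, might look like the sketch below. The keypoint format `(x, y, confidence)`, the 10% margin, the confidence cut-off, and the function name are hypothetical choices made for illustration; the article does not specify them.

```python
def skeleton_to_box(keypoints, image_size, margin=0.1, min_confidence=0.3):
    """Bounding rectangle around confident keypoints, padded and clipped to the image.

    keypoints: list of (x, y, confidence) tuples in pixel coordinates.
    image_size: (width, height) of the frame.
    Returns (x1, y1, x2, y2), or None if no keypoint is confident enough.
    """
    pts = [(x, y) for x, y, c in keypoints if c >= min_confidence]
    if not pts:
        return None
    xs, ys = zip(*pts)
    x1, x2, y1, y2 = min(xs), max(xs), min(ys), max(ys)
    # Pad the tight box so that limbs and helmets are not cut off.
    pad_x = margin * (x2 - x1)
    pad_y = margin * (y2 - y1)
    width, height = image_size
    return (max(0.0, x1 - pad_x), max(0.0, y1 - pad_y),
            min(float(width), x2 + pad_x), min(float(height), y2 + pad_y))


# Hypothetical skeleton: three confident joints and one low-confidence outlier.
joints = [(100, 50, 0.9), (120, 200, 0.8), (80, 120, 0.7), (300, 300, 0.1)]
box = skeleton_to_box(joints, image_size=(640, 480))
print(box)
```

The low-confidence joint is ignored, so the box spans only the three reliable joints plus the margin. A tracker can then attach an ID to each such box and crop the person image from it.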

The SSD (Single Shot MultiBox Detector) network mainly includes the following core features:

  1. Using multi-scale feature maps for object detection

    The six feature maps of SSD differ in size. Convolution or pooling progressively reduces the feature map size from shallow to deep layers, and predictions are made on these feature maps of different resolutions. Shallow feature maps, which have been downsampled little, have a relatively small receptive field and are generally used to predict small objects; deep feature maps have a larger receptive field and are generally used to predict large objects.

  2. Using convolution for detection

    Other single-stage models such as YOLO use fully connected layers to predict the coordinates of bounding boxes and their classes. SSD instead uses convolution on feature maps of different scales to produce its predictions.

  3. Prior boxes with different scales and aspect ratios

    The SSD algorithm sets prior boxes of different specifications on its feature maps of different scales, and the scale of the prior boxes increases linearly as the feature map size decreases. The prior box scale of each layer is given by Eq. (1).

$$ S_k = S_{\min } + \frac{{S_{\max } - S_{\min } }}{m - 1}(k - 1),k \in [1,m] $$
(1)

Here, \(S_k\) is the ratio of the prior box size on the k-th feature map to the image size, and \(S_{\min }\) and \(S_{\max }\) are the smallest and largest scales, 0.2 and 0.9, which correspond to the second and sixth feature maps. m is the number of feature maps used for prediction, which would be 6; however, because the scale of the first feature map is set separately, m = 5 in the calculation. The scale of the first feature map is generally set to \(S_{\min } /2\), that is, 0.1.
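As a worked check of Eq. (1), the sketch below evaluates the scale for every feature map using the values stated above (\(S_{\min } = 0.2\), \(S_{\max } = 0.9\), m = 5, first feature map set to \(S_{\min } /2\)). The function name is an assumption introduced for illustration.

```python
def prior_box_scales(s_min=0.2, s_max=0.9, m=5):
    """Scale of the separately set first feature map, then S_k for k = 1..m from Eq. (1)."""
    scales = [s_min + (s_max - s_min) / (m - 1) * (k - 1) for k in range(1, m + 1)]
    return [s_min / 2] + scales


scales = prior_box_scales()
print([round(s, 3) for s in scales])
```

With these values the six feature maps receive the scales 0.1, 0.2, 0.375, 0.55, 0.725, and 0.9, so the step between consecutive layers covered by Eq. (1) is 0.175.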

To reduce the computational cost of convolution, most mainstream lightweight neural networks use group convolution or depthwise separable convolution, although these operations still carry a considerable computational cost. Permuting (shuffling) the channels after a group convolution lets information flow between groups, which enhances the feature expression ability of the model. In essence, channel permutation achieves information exchange between the channels of different groups without increasing computational complexity. On this basis, this article replaces conventional convolution modules such as 3 × 3 and 1 × 1 convolutions with group convolution followed by channel permutation, thereby compressing the model.
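The two ideas in this paragraph can be illustrated with a small sketch: counting the weights saved by group convolution, and permuting channels so that the next grouped layer sees channels from every group. The channel counts, the number of groups, and the function names are illustrative assumptions; the article does not state the configuration it used.

```python
def conv_weights(kernel, c_in, c_out, groups=1):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * (c_in // groups) * c_out


def channel_shuffle(channels, groups):
    """Interleave channels so each group of the next layer sees every previous group.

    channels: list of per-channel items, ordered group by group.
    groups: number of groups used by the preceding group convolution.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Equivalent to reshaping to (groups, per_group), transposing, and flattening.
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


# Assumed example: a 3 x 3 convolution with 64 input and 64 output channels.
standard = conv_weights(3, 64, 64)
grouped = conv_weights(3, 64, 64, groups=4)
print(standard, grouped)

# Six channels from two groups, a* and b*, before and after the permutation.
shuffled = channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2)
print(shuffled)
```

In this example the grouped layer needs a quarter of the weights of the standard layer, and the permutation itself only reorders channels, so it adds no multiplications.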

During testing, the mean and variance are taken from the statistics accumulated during training, as follows:

  1. When testing or predicting, only a single sample may be passed in at a time, so the model uses global statistics instead of batch statistics.

  2. During training, each batch yields one (mean, variance) pair.

  3. The global statistics are the mathematical expectations of these means and variances:

    $$ E[x] \leftarrow E[\mu_i ] $$
    (2)
    $$ Var[x] \leftarrow \frac{m}{m - 1}E[\sigma_i^2 ] $$
    (3)

    Here, \(\mu_i\) and \(\sigma_i\) are the mean and standard deviation recorded for the i-th batch, m is the batch size, and the coefficient \(m/(m - 1)\) gives an unbiased estimate of the variance. BN(x) then becomes:

    $$ BN(x_i ) = \gamma \frac{x_i - E[x]}{{\sqrt {Var[x] + \varepsilon } }} + \beta $$
    (4)

    Here, \(\gamma\) and \(\beta\) are the learnable scale and shift parameters, and \(\varepsilon\) is a small constant that prevents division by zero.
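Eqs. (2) to (4) can be traced with a few lines of code. The sketch below averages per-batch statistics into global ones and applies them to a single value at test time; the function names, the example statistics, and the default \(\varepsilon = 10^{-5}\) are assumptions chosen for illustration.

```python
import math


def global_statistics(batch_means, batch_vars, m):
    """Eqs. (2) and (3): average the per-batch means and variances; m is the batch size."""
    mean = sum(batch_means) / len(batch_means)
    var = m / (m - 1) * sum(batch_vars) / len(batch_vars)
    return mean, var


def batch_norm_inference(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Eq. (4): normalize one value with the global statistics, then scale and shift."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta


# Hypothetical statistics saved from two training batches of size 4.
mean, var = global_statistics(batch_means=[1.0, 3.0], batch_vars=[3.0, 5.0], m=4)
print(mean, var)
print(batch_norm_inference(2.0, mean, var, gamma=2.0, beta=0.5))
```

Because the input equals the global mean in the last call, the normalized term vanishes and the output is just the shift \(\beta\), which shows that \(\beta\) acts as an offset and not as a step size.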

4 Results and Discussion

Commonly used metrics for detection include Recall, Precision, Average Precision (AP), and Intersection over Union (IoU). Precision is the proportion of true positives (TP) among all detected targets, and Recall is the proportion of true positives among all targets in the test set. The IoU threshold is set to 0.5: when the IoU between a predicted box and a ground-truth box is greater than or equal to 0.5, the target is considered to be recognized as the class it belongs to, and the most suitable box is then selected and output through non-maximum suppression. After the model had been trained on a cluster for 500 epochs, good weight parameters were obtained. The Precision, Recall, and F1 of the trained model for each category in the dataset are shown in Fig. 2.

Fig. 2. Precision, recall, and F1 of the trained categories

From Fig. 2, it can be seen that the precision, recall, and F1 of worker behavior object detection based on the SSD and CNN algorithms are 80%–95%, 60%–75%, and 70%–82.5%, respectively. The model therefore detects worker behavior with high precision, but its recall is relatively low, so recall needs further improvement before worker behavior can be identified reliably.
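For reference, precision, recall, and F1 follow directly from the counts of true positives (TP), false positives (FP), and false negatives (FN). The sketch below uses hypothetical counts, not the data behind Fig. 2, and the function name is an assumption.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical category: 60 correct detections, 10 false alarms, 30 missed targets.
precision, recall, f1 = detection_metrics(tp=60, fp=10, fn=30)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

A category with few false alarms but many missed targets, as in this example, shows the same pattern as the reported results: precision is clearly higher than recall, and F1 lies between the two.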

Table 1. Comparison of Target Detection Performance in Construction Scenarios (%)

The following conclusions can be drawn from the data in Table 1. AP50 is the AP computed at an IoU threshold of 0.5, that is, a detection counts as correct only when its IoU with a ground-truth box is at least 0.5; AP75 uses the stricter threshold of 0.75. AP50 is commonly used to evaluate object detection models on different datasets and can be combined with AP values at other thresholds (such as AP75 and AP90) to evaluate a model comprehensively. The AP, AP50, and AP75 of the SSD-CNN algorithm in this article are 51.29%, 69.85%, and 54.81%, respectively. All three values rank among the highest of the compared algorithms, which indicates that the algorithm achieves excellent detection performance under different evaluation metrics.
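How the IoU threshold changes the AP value can be seen in a small sketch. It computes AP as the area under the interpolated precision-recall curve for one class; the input format, the function name, and the simplifying assumption that each ground-truth box is matched by at most one detection are illustrative and are not the evaluation code used for Table 1.

```python
def average_precision(detections, num_ground_truth, iou_threshold):
    """AP at one IoU threshold.

    detections: list of (score, iou_with_matched_ground_truth) pairs; assumes
    each ground-truth box is matched by at most one detection.
    num_ground_truth: number of ground-truth boxes (must be positive).
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, overlap in ranked:
        if overlap >= iou_threshold:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Replace each precision by the best precision reached at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap


# Hypothetical detections for one class with four ground-truth boxes.
dets = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.3), (0.6, 0.9)]
ap50 = average_precision(dets, num_ground_truth=4, iou_threshold=0.5)
ap75 = average_precision(dets, num_ground_truth=4, iou_threshold=0.75)
print(ap50, ap75)
```

The detection with IoU 0.6 counts as correct at the 0.5 threshold but not at 0.75, which is why AP75 is lower than AP50 for the same set of detections.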

Fig. 3. Change curve of entropy

Figure 3 shows the entropy values of the histogram when abnormal behavior occurs in a video recording. The video consists of 700 frames; the first 600 frames correspond to normal behavior and frames 600–700 to abnormal behavior. Normal behavior is plotted with squares and abnormal behavior with circles. When an ordinary action turns into an abnormal one, human motion becomes more irregular and the value increases. After frame 700, the person has already left the camera's field of view. The histogram value detection algorithm based on the SSD and CNN algorithms can accurately detect the time at which abnormal behavior occurs.
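The article does not state how the plotted value is computed. One plausible reading, given the caption of Fig. 3, is the Shannon entropy of a per-frame histogram (for example of motion directions), with a frame flagged as abnormal once the entropy exceeds a threshold. The sketch below follows that assumption; the histogram contents, the threshold, and the function names are all hypothetical.

```python
import math


def histogram_entropy(counts):
    """Shannon entropy (in bits) of a histogram given as a list of bin counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


def first_abnormal_frame(entropies, threshold):
    """Index of the first frame whose entropy exceeds the threshold, or None."""
    for index, value in enumerate(entropies):
        if value > threshold:
            return index
    return None


# Hypothetical histograms: regular motion concentrates in one bin,
# irregular motion spreads over all bins.
regular = histogram_entropy([8, 0, 0, 0])
irregular = histogram_entropy([2, 2, 2, 2])
print(regular, irregular)
print(first_abnormal_frame([regular, regular, irregular], threshold=1.0))
```

Under this reading, regular motion gives low entropy and irregular motion gives high entropy, which matches the rise of the curve when an ordinary action turns into an abnormal one.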

Fig. 4. Response time (s) for object detection using different algorithms

From Fig. 4, it can be seen that the response times of SSD-CNN, Mask R-CNN, R-CNN, and FCOS are only 1 s, 1.1 s, 1.1 s, and 0.2 s, respectively, which makes them the fastest of all the algorithms compared.

5 Conclusion

Traditional construction site inspection relies mainly on manual inspections and surveillance cameras, which suffer from low efficiency, high cost, and proneness to error. Detection technology based on artificial intelligence has therefore become a new solution. The SSD algorithm, an advanced object detection algorithm, offers fast detection, high accuracy, and strong adaptability, and is widely used in construction site detection. By performing multi-scale convolution operations on images, the SSD algorithm discussed in this article can effectively detect various targets on construction sites, such as construction personnel, mechanical equipment, and material stacks. Compared with traditional detection methods, the SSD algorithm not only achieves comprehensive monitoring of construction sites but also accurately identifies and locates different targets, providing strong technical support for site management and monitoring.

Using computer vision technology, this paper designs and implements a prototype cost monitoring system. The results show that computer vision can realize automatic identification of workers, tracking of working hours, work efficiency evaluation, and safety behavior monitoring, and thus provides a novel, automated, and accurate cost monitoring method. The results also emphasize the importance of technological innovation for improving construction project management: computer vision brings new ideas and methods to the cost monitoring of construction projects and promotes innovation and development in project management.

Many specific issues in the cost monitoring of construction projects still need to be studied in depth, for example how to accurately identify workers’ work status and behavior and how to effectively evaluate work efficiency and safety behavior. Future research can examine these problems in depth and propose more specific solutions.