1 Introduction

Traditional three-dimensional (3D) human body models are usually captured with structured-light or laser scanners. Although these scanners produce high-precision personalized 3D human body models, they are expensive and complicated to operate [1]. The Kinect device launched by Microsoft uses infrared technology to acquire three-dimensional information quickly and at low cost. This breakthrough has greatly promoted applications of 3D technology such as Kinect-based human action recognition, skeleton modeling, face recognition, and 3D scene reconstruction, which have become research hotspots in the related fields [2,3,4]. Because of its low cost and simple operation, the Kinect depth camera is also used as a scanner to rapidly build personalized 3D human body models in real time [5, 6]. However, the point cloud of a model scanned by Kinect is too dense, so the resulting 3D human model is difficult to apply widely. For example, in crowd simulation, each human body must be rendered quickly in real time because the crowd is large. Applying the 3D human body model scanned by Kinect directly to crowd simulation would increase the system cost and reduce the efficiency of the simulation. Therefore, simplifying the model while preserving its necessary details and keeping it highly similar to the original model is a meaningful problem.

At present, the simplification method commonly used for three-dimensional models is the one provided in CGAL (Computational Geometry Algorithms Library), which is written in C++. To simplify a 3D mesh, it uses edge collapse [7, 8], with the Lindstrom-Turk method to calculate the collapse cost of each edge. Although its efficiency is high, its accuracy is not, and the effective detail features of the model cannot be retained. The simplification method based on fillet surface reconstruction proposed by Peng et al. [9] simplifies fillet surfaces; it is applicable to models with fillet characteristics but is not universal. Among other simplification methods for 3D models, the algorithm for textured 3D models proposed by Feng and Zhou [10] considers the geometric information of the model and the geometric error of the texture information. Zhou and Chen [11] put forward a mesh model simplification algorithm based on polygon vertex normal vectors, which optimizes for visual features. Quan et al. [12] proposed a geometric model simplification method based on region segmentation. To keep the detail features, this method borrows the region segmentation principle from image processing: the mesh model is segmented by curvature-based region growing, and each region is then simplified according to the ratio of the number of triangles in the region. Zhang et al. [13] put forward a simplification method for terrain models based on a divergence function, which combines discrete particle swarm theory with the hierarchical structure of an implicit quadtree. An improved mesh simplification algorithm based on edge collapse was introduced in [14]; in this algorithm, the quadric error metric was used to compute the vertex significance and control the sequence of edge collapses.

Sanchez et al. [15] use an estimated local density of the point cloud to simplify it. In this method, the points are clustered with the expectation maximization algorithm according to their local distribution, and a linear programming model is then applied to reduce the cloud. Han et al. [16] proposed a point cloud simplification method that preserves edge points, since these points have more significant properties than non-edge points. First, a least squares plane is constructed from the topological relationship of the points and the normal vectors, and each point in the vicinity is projected onto the fitted plane. Next, the edge points are extracted according to the homogeneity of the projection. The detected edge points are retained during decimation, while the least important non-edge points are deleted until the predesigned simplification rate is reached. In [17], the incremental segmentation and triangulation of planar segments from dense point clouds are studied to enhance quality and efficiency. The authors proposed a point-based triangulation algorithm to improve planar segment decimation and triangulation in a gradually expanding point cloud map, as well as a polygon-based triangulation algorithm. Both algorithms produce more accurate and simpler planar triangulations. Although the above methods can simplify models, they cannot keep the effective details of the model, and they are inefficient when dealing with models that contain a large number of points.

When the whole 3D body model is collected by Kinect, the amount of data is huge, and the cost of later processing offsets the convenience of acquisition. The huge amount of 3D data is mainly manifested in three aspects: dense points, dense edges, and dense surfaces. Thus, the simplification of a 3D model mainly operates on points, edges, and surfaces. In this paper, we first use Kinect to capture the head, which carries the most personalized features, and then apply an edge collapse simplification method based on edge curvature and area error. In addition, an interactive approach is used to preserve the detail features while simplifying the model. Furthermore, we use an improved FCF (fusion control function) [18] model fusion method to realize the seamless fusion of models automatically. With this method, complete character models can be reconstructed to build a database of LOD (levels of detail) character models. The simplified 3D body models reduce system overhead and improve system efficiency when they are applied to crowd simulation.

2 The simplification and fusion method

The 3D human reconstruction algorithm proposed in this paper involves three-dimensional scanning, model denoising, model simplification, model fusion, and a simulation experiment. Firstly, the head model was obtained with a low-cost Kinect depth camera, and a smooth model was then obtained through preprocessing operations such as removing fragments and denoising. Secondly, we simplified the complex head model with the feature-preserving simplification method based on edge curvature and area error proposed in this paper. Finally, the improved FCF model fusion method was used to fuse the head model with the body model, and a personalized hierarchical model library was built and applied to crowd simulation.

2.1 Preprocessing

The denoising of the initial model can be divided into two steps: One is to delete the fragments generated during scanning. In this paper, we used DFS (depth first search) algorithm to delete fragments. The other is to use the weighted Laplace smoothing algorithm [19] to remove the tiny noise on the model and make the model as smooth as possible. The Laplace smoothing algorithm has a low computational complexity and can control the details of the model very well in operation. Therefore, this paper used the Laplace smoothing algorithm to remove the small noise in the model.

The topological structure of the model was mapped to a graph, and the connectivity of the whole graph can be obtained by using the method of DFS. The largest connected graph was the structure that needed to be retained, and the rest of the disconnected graph was deleted. Figure 1a is the graph before deleting the fragment, and Fig. 1b is the graph after the fragment is deleted.
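The fragment removal described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the mesh is given as a list of index triples, and the function names `largest_component` and `remove_fragments` are hypothetical.

```python
from collections import defaultdict


def largest_component(num_vertices, triangles):
    """Return the vertex indices of the largest connected component."""
    # Map the mesh topology to an undirected graph (vertex adjacency).
    adjacency = defaultdict(set)
    for a, b, c in triangles:
        adjacency[a].update((b, c))
        adjacency[b].update((a, c))
        adjacency[c].update((a, b))

    seen = set()
    best = set()
    for start in range(num_vertices):
        if start in seen:
            continue
        # Iterative depth-first search; an explicit stack avoids
        # Python's recursion limit on large scans.
        component = set()
        stack = [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            component.add(v)
            for w in adjacency[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if len(component) > len(best):
            best = component
    return best


def remove_fragments(num_vertices, triangles):
    """Keep only the triangles that belong to the largest component."""
    keep = largest_component(num_vertices, triangles)
    # All three vertices of a triangle lie in the same component,
    # so testing one vertex is enough.
    return [t for t in triangles if t[0] in keep]
```

For example, with two connected triangles and one isolated triangle, `remove_fragments(7, [(0, 1, 2), (1, 2, 3), (4, 5, 6)])` keeps only the first two.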

Fig. 1 Remove fragment

The specific operation of the Laplace algorithm is as follows: The 3D position of the vertex is moved to the center of gravity position of the surrounding vertexes to minimize the difference between the vertex and the surrounding vertexes. For every point in the model, the space position of the point is recalculated according to the position information of the surrounding vertexes. The Laplace smoothing formula is as follows:

$$ \overline{x_i}=\frac{1}{N}\sum \limits_{j=1}^N{x}_j $$
(1)

N is the number of vertexes around the current point, \( {x}_j \) is the coordinate of the j-th surrounding vertex, and \( \overline{x_i} \) is the new coordinate of the i-th vertex.

In practice, N cannot be too large, or the details may be lost; if N is too small, the smoothing effect will not be achieved. Therefore, an improved Laplace smoothing algorithm with Gaussian weighting was used in this paper: a point far away from the center point has less influence on it, and a point near the center point has more. Figure 2a is the model before denoising, and Fig. 2b is the model after denoising. The preprocessing method effectively removes the small noise on the model.
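One way to realize the Gaussian-weighted variant of Eq. (1) is sketched below. The exact weighting used in the paper is not given, so the weight \( \exp(-d^2/2\sigma^2) \), the parameter `sigma`, and the function name `gaussian_laplace_smooth` are assumptions made for illustration.

```python
import math


def gaussian_laplace_smooth(points, neighbors, sigma=1.0):
    """One smoothing pass; returns new positions, the input is unchanged.

    points    -- list of (x, y, z) tuples
    neighbors -- neighbors[i] lists the indices of vertices around vertex i
    sigma     -- assumed width of the Gaussian weight
    """
    smoothed = []
    for i, p in enumerate(points):
        total_weight = 0.0
        acc = [0.0, 0.0, 0.0]
        for j in neighbors[i]:
            q = points[j]
            dist2 = sum((a - b) ** 2 for a, b in zip(p, q))
            # Nearby neighbors get a larger weight than distant ones.
            w = math.exp(-dist2 / (2.0 * sigma ** 2))
            total_weight += w
            for k in range(3):
                acc[k] += w * q[k]
        if total_weight == 0.0:
            # No usable neighbors: leave the vertex where it is.
            smoothed.append(tuple(p))
        else:
            smoothed.append(tuple(c / total_weight for c in acc))
    return smoothed
```

With equal weights this reduces to the plain average of Eq. (1). Writing the result to a new list, rather than updating in place, keeps each vertex's new position independent of the order in which vertices are visited.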

Fig. 2 Denoising

2.2 Detail features preserving

According to the geometric element that is removed, mesh model simplification methods are divided into vertex deletion, edge collapse, triangle deletion, and patch deletion. The edge collapse operation is based on the half-edge structure: the edge with the smallest collapse cost is collapsed, which deletes triangular patches and thereby simplifies the model.

When a mesh model is simplified by a method based on geometric elements, the higher the simplification rate, the more serious the loss of detail, and the detail features of the model cannot be preserved. The purpose of model simplification is to maintain the detail features of the model to the maximum extent while reducing the scale of the model. Based on the edge collapse method, an interactive method was proposed in this paper, which effectively retains the detail features of the model and makes the model more identifiable.

The detail features of the model can be reflected by the edge curvature and the area change of the triangular patch on the model. If the curvature of the edge is large in a region, the features of the model are obvious here. The change of the area of the related triangle patch caused by the collapse edge also reflects the features of the model here. The smaller the change, the smoother. Therefore, the edge collapse cost can be defined as two errors: one is the curvature of the edge, and the other is the change of the area brought by the deletion point.

As shown in Fig. 3, the way to calculate the edge curvature is:

$$ {E}_c=\beta \left(1-\cos \alpha \right)=\left(\frac{l}{h_1}+\frac{l}{h_2}\right)\left(1-{n}_1\cdot {n}_2\right) $$
(2)

Fig. 3 Edge curvature

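Among them, \( l=\left\Vert q-p\right\Vert \) is the length of the edge e with end points p and q, and \( {h}_i=d\left({v}_i,e\right) \) is the distance from the vertex \( {v}_i \) to e.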
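Equation (2) can be evaluated with a few vector operations, as in the sketch below. It assumes that p and q are the end points of the edge, that v1 and v2 are the vertices opposite the edge in its two adjacent triangles, and that n1 and n2 are the consistently oriented unit normals of those triangles; these readings come from the formula and are not confirmed details of Fig. 3. The function name `edge_curvature` is hypothetical.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def unit(a):
    n = norm(a)
    return (a[0] / n, a[1] / n, a[2] / n)


def edge_curvature(p, q, v1, v2):
    """E_c = (l/h1 + l/h2) * (1 - n1 . n2) for a non-degenerate edge."""
    e = sub(q, p)
    l = norm(e)
    # Distance from a point v to the line through p and q: |e x (v - p)| / |e|.
    c1 = cross(e, sub(v1, p))
    c2 = cross(sub(v2, p), e)   # reversed order keeps the orientation consistent
    h1 = norm(c1) / l
    h2 = norm(c2) / l
    n1 = unit(c1)               # normal of triangle (p, q, v1)
    n2 = unit(c2)               # normal of triangle (q, p, v2)
    return (l / h1 + l / h2) * (1.0 - dot(n1, n2))
```

On a flat pair of triangles the normals coincide and the cost is zero; for a right-angle fold with unit edge length and unit heights the cost is 2.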

As shown in Fig. 4, t′ is the triangular patch that replaces the triangular patch t when t is deleted. This process can be regarded as rotating the triangular patch t by an angle. When the rotation angle is θ, the error Qt resulting from the deletion of one surface t is as follows:

$$ {Q}_t={l}_t\times \theta $$
(3)
Fig. 4 Area error

Among them, lt = (A1 + A2)/2, where A1 is the area of the triangle t = (v0, v1, v2) and A2 is the area of the triangle t′ = (v, v1, v2). Computing the angle θ involves trigonometric operations, so the computation is slow. In order to improve the computation speed, let \( \theta =1-{n}_t\cdot {n}_{t^{\prime }} \), in which ni is the normal vector of the surface i. As shown in Fig. 4, the area error Es caused by deleting the edge is:

$$ {E}_s=\sum \limits_{t\in P}{Q}_t $$
(4)

P is the set of all triangle patches connected to v0. Therefore, the error of the edge collapse is as follows:

$$ E={E}_c+{E}_s $$
(5)
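Equations (3) to (5) can be combined into a short routine, sketched here under stated assumptions: the collapse moves v0 to v, the fan P is passed as a list of (v1, v2) pairs, unit normals are used, and triangles that degenerate to zero area after the collapse are skipped. The skip rule and the function names are illustrative choices, not part of the paper.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def area_and_normal(a, b, c):
    """Return (area, unit normal) of a triangle, or (0.0, None) if degenerate."""
    n = cross(sub(b, a), sub(c, a))
    length = math.sqrt(dot(n, n))
    if length == 0.0:
        return 0.0, None
    return 0.5 * length, (n[0] / length, n[1] / length, n[2] / length)


def area_error(v0, v, fan):
    """E_s: sum of Q_t = l_t * theta over the triangles (v0, v1, v2) in the fan."""
    total = 0.0
    for v1, v2 in fan:
        a1, n_t = area_and_normal(v0, v1, v2)       # t  = (v0, v1, v2)
        a2, n_t2 = area_and_normal(v, v1, v2)       # t' = (v,  v1, v2)
        if n_t is None or n_t2 is None:
            continue                                 # assumed: skip degenerate triangles
        theta = 1.0 - dot(n_t, n_t2)                 # cheap stand-in for the rotation angle
        total += 0.5 * (a1 + a2) * theta             # l_t = (A1 + A2) / 2
    return total


def collapse_cost(edge_curvature_value, area_error_value):
    """E = E_c + E_s, Eq. (5)."""
    return edge_curvature_value + area_error_value
```

If v coincides with v0, every triangle keeps its normal and the area error is zero, which matches the intuition that smooth regions are cheap to collapse.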

Although the detail features of the model can be retained by using this method, with the improvement of the rate of simplification, the detail features of the model will also be lost gradually. In order to keep the detail features of the model, an interactive method was used in this paper to retain the detail features of the model. The simplified operation to preserve the detail features is shown in Algorithm 1:

figure a

The computational cost of the algorithm is dominated by the calculation of the folding cost of each edge. The folding cost consists of two parts, the edge curvature and the area error, and both must be calculated for every edge. In general, the whole model can be simplified by sorting all edges by folding cost and folding them in that order. However, the simplified model obtained in this way cannot preserve specific details, so its realism cannot be guaranteed. Therefore, we introduce an interactive method in which the area that needs to be preserved is marked manually when choosing the folding edges, so as to retain local features of special significance. In this paper, a random weight (rand() + a) is set for the folding cost of each edge in the reserved region, and the average folding cost of all edges is added, so that the edges of the reserved region obtain a larger folding cost and the edges of the feature area are retained effectively. For the head model collected by Kinect, the value of a can be adjusted to control the simplification rate of the reserved area. The function rand() is introduced to reduce the effect of noise on the model. When a is large, the details of the feature area are completely preserved; when a is small, the simplification ratio of the reserved area becomes larger and the details of the model are not preserved. When a lies between these two extremes, the reserved area is simplified to a suitable level.
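The weighting of the reserved region can be sketched as follows. The text leaves the exact formula open, so this sketch assumes one possible reading: for a reserved edge, the cost is multiplied by (rand() + a) and the average cost of all edges is then added, with rand() taken as uniform on [0, 1). The names `weighted_costs` and `collapse_order` are hypothetical.

```python
import heapq
import random


def weighted_costs(costs, reserved, a=1.0, rng=None):
    """Raise the folding cost of edges inside the manually marked region.

    costs    -- dict mapping edge -> folding cost E
    reserved -- set of edges inside the reserved (feature) region
    a        -- user-chosen offset; larger values protect the region more
    """
    rng = rng or random.Random(0)          # seeded for repeatable runs
    mean_cost = sum(costs.values()) / len(costs)
    adjusted = {}
    for edge, cost in costs.items():
        if edge in reserved:
            # Assumed reading: random weight on the cost, plus the mean cost.
            adjusted[edge] = (rng.random() + a) * cost + mean_cost
        else:
            adjusted[edge] = cost
    return adjusted


def collapse_order(costs):
    """Return the edges sorted from cheapest to most expensive to fold."""
    heap = [(cost, edge) for edge, cost in costs.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

A full simplifier would also recompute the costs of the edges around each collapsed edge and update the queue; the static ordering here only shows how the reserved edges are pushed to the back.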

Experiments show that the proposed algorithm effectively retains the important detail features of the model, even when the simplification ratio is high.

2.3 Fusion of models

The simplified head model also needs to be fused with the body model to obtain a personalized human model. Kanai et al. proposed an FCF-based mesh model fusion method. This method maps the models to two-dimensional space and then maps the result back to 3D space according to the FCF, so as to achieve the fusion between models. However, its high computational complexity makes it unsuitable for fusing large models, and the fusion area has to be chosen manually, so the fusion between models cannot be automated. Therefore, in this paper, an improved FCF fusion method based on boundary edges was applied to achieve the automatic fusion of two large-scale models. The first step was to determine the fusion area according to the boundary edges of the two models. As shown in Fig. 5, the fusion areas F1 and F2 were determined and then mapped to the two-dimensional spaces H1 and H2, which achieves the mapping from three dimensions to two dimensions. By using the method in [20], the locations of all points on H1 and H2 can be calculated approximately. The second step was to merge H1 and H2 to obtain Hc. Hc included two kinds of vertexes: original vertexes directly inherited from F1 and F2, and new points generated by cross computing. The third step was to reconstruct the blending surface Fc from the points in Hc. Each point in Hc contained two pieces of three-dimensional information: one on F1 and the other on F2. The transition surface Fc between F1 and F2 can then be obtained by the FCF method. As shown in Fig. 6, we define a fusion control function f(s) (0 ≤ f(s) ≤ 1), based on non-uniform cubic B-spline curve interpolation, to control the fusion of the models.

Fig. 5 An overview of mesh fusion

Fig. 6 Fusion control function

The coordinate of the vertex \( {v}^c \) in Fc is as follows:

$$ {v}^c=f(s){v}^1+\left(1-f(s)\right){v}^2 $$
(6)

Among them, \( {v}^1 \) represents the coordinate of \( {v}^c \) on F1, \( {v}^2 \) represents the coordinate of \( {v}^c \) on F2, and s = 1 − l/L. The definitions of l and L are shown in Fig. 6: l is the distance from the point to the bottom boundary of Hc, and L is the distance between the upper and lower boundaries. The value of s is therefore closer to 1 when the distance between a point and the lower boundary is smaller, and closer to 0 when that distance is larger.
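The blending of Eq. (6) can be illustrated with a short sketch. The paper builds f(s) from a non-uniform cubic B-spline whose control points are not given, so the cubic smoothstep used below is only a stand-in with the same range, 0 ≤ f(s) ≤ 1; the names `smoothstep` and `blend_vertex` are hypothetical.

```python
def smoothstep(s):
    """Stand-in fusion control function: cubic, monotone, f(0)=0, f(1)=1."""
    s = min(1.0, max(0.0, s))
    return s * s * (3.0 - 2.0 * s)


def blend_vertex(v1, v2, l, L, f=smoothstep):
    """v_c = f(s) * v1 + (1 - f(s)) * v2 with s = 1 - l / L.

    v1, v2 -- positions of the merged vertex on F1 and F2
    l      -- distance from the point to the lower boundary of H_c
    L      -- distance between the upper and lower boundaries (L > 0)
    """
    s = 1.0 - l / L
    w = f(s)
    return tuple(w * a + (1.0 - w) * b for a, b in zip(v1, v2))
```

At the lower boundary (l = 0) the blended vertex coincides with its position on F1, at the upper boundary (l = L) with its position on F2, and in between the function f controls how quickly one surface gives way to the other.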

3 Discussions and result analysis

The algorithm was implemented in C++ on the Microsoft Visual Studio platform. The test computer has an Intel Core i5-2520M CPU (central processing unit) with a base frequency of 2.5 GHz, 8 GB of memory, and an Intel HD Graphics 3000 GPU (graphics processing unit).

Figure 7 shows the original model. Figure 7b, c shows the points and triangles of the model in Fig. 7a, which has 123,801 points and 41,267 triangles. Figure 8 compares the original model with the simplified models. Figure 8a is the original model, with 20,921 points and 41,261 triangle patches. Figure 8b–d were produced with the detail-preserving simplification method proposed in this paper, retaining the detail features around the eyes and nose. Figure 8b has 3640 points and 7986 triangle patches, and the rate of simplification is 80%. Figure 8c has 1916 points and 3799 triangle patches, and the rate of simplification is 90%. Figure 8d has 1014 points and 1995 triangle patches, and the rate of simplification is 95%. Figure 8e was produced without the detail-preserving method; it has 1243 points and 1905 triangles, and the rate of simplification is 95%. Figure 8b–d shows that the detail features of the eyes and nose are retained effectively even when the simplification ratio is very high. As can be seen from Fig. 8d, e, although the simplification ratio of the two models is the same, the detail features of the eyes and nose in Fig. 8d are more obvious than those in Fig. 8e and are more identifiable.
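The quoted simplification rates can be checked from the triangle counts. The sketch below assumes that the rate is the fraction of triangle patches removed relative to the 41,261 patches of Fig. 8a; the paper does not state the definition explicitly, and the function name `simplification_rate` is hypothetical.

```python
def simplification_rate(triangles_before, triangles_after):
    """Fraction of triangles removed, between 0 and 1."""
    if triangles_before <= 0:
        raise ValueError("triangles_before must be positive")
    return 1.0 - triangles_after / triangles_before


ORIGINAL_TRIANGLES = 41261  # Fig. 8a

# Triangle counts reported for the simplified models of Fig. 8.
for label, remaining in [("8b", 7986), ("8c", 3799), ("8d", 1995), ("8e", 1905)]:
    rate = simplification_rate(ORIGINAL_TRIANGLES, remaining)
    print(f"Fig. {label}: {100.0 * rate:.1f}% of triangles removed")
```

Under this definition the computed values land close to the rounded 80%, 90%, and 95% figures given in the text.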

Fig. 7 The original model

Fig. 8 The simplified model

As shown in Fig. 9, we use our proposed algorithm and the cost-driven algorithm to simplify the original model shown in Fig. 7. Figure 9a is the model simplified by our algorithm, with 818 points and 1248 triangular patches. Figure 9b is the model simplified by the cost-driven algorithm, with 1036 points and 1491 triangular patches. The boundary of the model obtained with our algorithm is smoother than that obtained with the cost-driven algorithm. Moreover, Fig. 9 shows that the simplification algorithm in this paper retains the detail features of the model more effectively.

Fig. 9 Model fusion

Figure 10 compares the efficiency of different simplification algorithms on head models of different sizes. The cost-driven algorithm is the one described in [7, 8], and the JADE (just another decimator) algorithm is the one described in [21]. Figure 10 shows that the efficiency of the algorithm in this paper is better than that of the JADE algorithm and slightly better than that of the cost-driven algorithm. The simplification algorithm for constructing hierarchical models proposed in this paper is therefore efficient for head models of different sizes.

Fig. 10 Comparison of different simplification algorithms

As shown in Fig. 11, this paper uses a semicircular simulated human body with a hole at the top to fuse with a simplified head model; the result is shown in Fig. 11c. After fusion, the detail features of the model are not lost, and the original boundary edge area becomes very smooth. For the fusion method based on the boundary edge, the size of the fusion region of the model is controllable, and the new vertexes produced by the fusion have little effect on the complexity of the model and can be ignored. As shown in Fig. 11c, the model has 3740 points before fusion and 3934 points after fusion, so fewer than 200 points are generated. The fusion method in this paper does not change the complexity of the model much, and the model maintains roughly its scale before fusion.

Fig. 11 Model fusion

As shown in Table 1, the original FCF-based mesh model fusion method is compared with the improved FCF method. When F1 and F2 contain 420 and 398 points, respectively, the original method generates 2058 points through fusion, whereas the improved method generates 1023 points, which is far fewer. According to the data in Table 1, the improved fusion algorithm produces a simpler transition model than the original fusion algorithm. The fusion results in Fig. 12 show that, although the improved fusion algorithm makes the transition model simpler, there is little difference in the effect of fusion.

Table 1 Comparison of different fusion algorithms
Fig. 12 Comparison of different fusion results

The hierarchical human body model library can be constructed by this method and applied to crowd evacuation simulation [22,23,24,25,26]. As shown in Fig. 13, an emergency evacuation of the crowd in a teaching building is simulated. In Fig. 13a, the LOD method is used: a fine human body model (with 43,261 triangular patches) is used when the viewpoint distance is less than 40, a medium-precision model (with 3216 triangular patches) is used when the viewpoint distance is between 40 and 100, and the rough model (with 1046 triangular patches) is used when the viewpoint distance is greater than 100. The most complex model is used throughout the scene in Fig. 13b. The average frame rate using the LOD method is 22 FPS (frames per second), while the frame rate using the most complex model is only 7.5 FPS. Building a hierarchical model library for crowd simulation therefore effectively improves the performance of the system.
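The distance-based model selection described above can be sketched as a simple lookup. The thresholds and triangle counts come from the text; the treatment of the exact boundary values 40 and 100 (assigned here to the medium model) and the names `select_lod` and `total_triangles` are assumptions.

```python
# Triangle counts of the three levels of detail reported in the text.
TRIANGLES = {"fine": 43261, "medium": 3216, "rough": 1046}


def select_lod(distance):
    """Pick a level of detail from the viewpoint distance."""
    if distance < 40:
        return "fine"
    if distance <= 100:      # assumed: boundary values use the medium model
        return "medium"
    return "rough"


def total_triangles(distances):
    """Triangles submitted for rendering for characters at the given distances."""
    return sum(TRIANGLES[select_lod(d)] for d in distances)
```

For three characters at distances 10, 70, and 150, `total_triangles` gives 43,261 + 3216 + 1046 triangles, compared with three times 43,261 if the fine model were used for all of them, which is the kind of saving behind the frame-rate difference reported for Fig. 13.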

Fig. 13 The application of simplified model in school scene

4 Conclusions

Existing 3D human body models lack variety, and producing them is complex, time-consuming, and laborious. The Kinect depth camera launched by Microsoft can quickly scan a human body together with its structural features, so as to reconstruct a 3D model of the human body. However, a 3D human body model reconstructed with Kinect contains a large amount of data and is computationally expensive, so it cannot be applied directly to crowd simulation and needs to be simplified. At the same time, the effective detail features of the model must be kept while simplifying it. The main contributions of this paper are therefore as follows. First, an interactive method for preserving the detail features of models is proposed, based on a simplification method that uses edge curvature and area error. Second, the FCF-based fusion method is improved so that the fusion between models is automatic. The simplification method and the improved fusion method proposed in this paper are robust across different models. Through these methods, personalized character models suitable for crowd simulation can be reconstructed, and a hierarchical character model database can be constructed, which improves the realism and efficiency of crowd simulation.