1 Introduction

A digital twin, a digital replica of the physical world, has become increasingly common in today’s life. It is a platform for digitally representing physical objects in the real world. Urban areas are among the most complex man-made environments; therefore, urban modeling and digitalization have received great attention from the scientific and technological community. While urban objects, such as buildings and traffic infrastructure, have been modeled in many studies (Bolkas et al. 2020; Oniga et al. 2022; Park and Guldmann 2019; Preka and Doulamis 2016), mines, with their special natural and man-made structures above and below the ground surface, are among the most distinctive features of the real world but have received less attention in 3D modeling. Therefore, an optimal 3D data model needs to be established to represent mines, which can then be transformed or integrated into other 3D digital forms of the world, such as mining digital twins, to improve mine operations and management.

Nowadays, the City Geography Markup Language (CityGML) standard developed by the Open Geospatial Consortium (2021) is commonly used to encode most virtual semantic 3D city models (Nguyen and Kolbe 2021). With its flexible exchange of geometric and semantic information, CityGML is well suited to sharing data for visualization and analysis applications (Groeger and Pluemer 2012). Normally, city objects are represented in CityGML with classes and relations based on their semantic, geometric, topological, and appearance properties (Kolbe and Donaubauer 2021). The boundary representation (B-Rep) method is used to describe the geometry of city objects (Groeger and Pluemer 2012). B-Rep is one of the common surface models with several structures, such as triangular irregular networks and tetrahedronised irregular networks, which consist of 2D and 3D simplexes, respectively (Graciano et al. 2018). GML3 represents 3D geometries as a set of geometric primitives: Point, Curve, Surface, and Solid. Specifically, 0D points form 1D curves, which in turn bound a 2D surface, and 2D surfaces or a MultiSurface bound a 3D solid (Löwner et al. 2016). This is why many complex geometries, such as complexes, composites, and aggregates, can be formed by combining these geometric primitives (Nguyen and Kolbe 2021). Using MultiSurface, a 3D object can be reconstructed regardless of its 3D geometry (Ying et al. 2017). With these capabilities, CityGML shows potential for representing features in the mining sector with high accuracy and a high level of detail.

The mining industry plays an important role in the economy of a country as it provides materials to other industries and jobs to the labor market. However, the mining industry also has many negative impacts on the environment, such as pollution, subsidence, landslides, and the depletion of natural resources, and mining activities carry high risks to occupational safety. In order to improve work efficiency and safety in mine production and to mitigate the effects of mining operations on the environment, the management of mines with the support of a 3D mine information model is necessary. Regarding the significance of 3D models in mining, several studies have suggested that 3D models can be utilized for many purposes, including visualization (Jin et al. 2011). There have also been several studies on utilizing 3D information models for mine development (Duncan and Abdul Rahman 2015; Jinhu and Na 2010; Li et al. 2021). They proposed workflows in which building 3D models of features is the main task. The building of 3D models is often based on survey data collected by both traditional methods (total station, Global Navigation Satellite System (GNSS)) and advanced methods (Unmanned Aerial Vehicles (UAV), Light Detection and Ranging (LIDAR), and other integrated technologies).

During the last decade, rapid development in geospatial technologies, such as UAV with Structure from Motion (UAV-SfM) and Terrestrial Laser Scanning (TLS), has facilitated data collection in both scientific and industrial sectors. This is also true for the mining industry, where changes occur rapidly. The demand for quick and accurate data collection has been growing with the increasingly common use of large heavy machines and advanced technologies in mine excavation. It has been shown that UAV-SfM has several advantages, including low cost, effectiveness, and safety, while TLS is able to provide accurate and high-density datasets. Both technologies outperform conventional ones in gathering high-density datasets because they can produce millions of accurate 3D points in a short time. UAV-SfM and TLS dense point clouds can be used to extract features of mine engineering structures at a high level of detail and accuracy. This is valuable in the 3D modeling of complex engineering structures, which are common in mines. However, each has some drawbacks. For example, both TLS and UAV-SfM methods produce data gaps caused by shadow effects, owing to restrictions on instrument position and ongoing work activity. Meanwhile, TLS is limited in collecting point clouds of top surfaces, such as building roofs. Although a laser scanner can be mounted on a UAV platform, this could lead to high risks to equipment and labor safety.

From the aforementioned analysis, an optimal approach for dense point cloud collection of mine engineering structures could be the integration of UAV-SfM and TLS. In this approach, UAV is used to collect data from the top and over large surrounding areas, while TLS is used for scanning complex structures in small indoor spaces. However, the challenging issue of how to fuse the two datasets collected by UAV and TLS must be considered when using this approach. There has been some research on UAV-SfM and TLS point cloud fusion. For example, Zang et al. (2019) proposed a novel registration algorithm that accurately registers TLS stations to UAV dense image points in order to detect deformation occurring in chaotic hillside areas. Rugged topography with vegetation covering the earth surface is also challenging for a single high-resolution topography technique, but this can be overcome by using the UAV and TLS data fusion methodology, which includes the co-registration of overlapping areas through the Iterative Closest Point (ICP) algorithm (Cucchiaro et al. 2020). Other applications of UAV-SfM and TLS fusion have been made in cultural heritage environments and archaeological research (Balletti et al. 2015; Chatzistamatis et al. 2018; Chiabrando et al. 2016; T. Luhmann et al. 2019) and in environmental monitoring (Šašak et al. 2019; Tilly et al. 2016; Tong et al. 2015; Zieher et al. 2018). The ICP algorithm has been applied in many of these studies, which suggests that ICP is a robust method of point cloud fusion. In addition, a variety of objects have been surveyed using the UAV-SfM and TLS fusion approach, from simple to complex surfaces, and from natural to man-made features. However, to the best of our knowledge, there has been no research on building 3D models using the CityGML standard for special mining engineering structures, such as mine shafts, based on extracted UAV-SfM and TLS point clouds.

CityGML supports representing 3D models in five levels of detail, namely LoDs 0, 1, 2, 3, and 4, of which LoDs 3 and 4 are the highest. According to the literature review, many studies have focused on building 3D models of cities in LoD 2 (Agugiaro et al. 2018; Boeters et al. 2015; Jayaraj and Anandakumar 2018; Preka and Doulamis 2016), while several studies have addressed 3D models of buildings in LoD 3 (Biljecki et al. 2016; Ying et al. 2017). LoD 2 models have the highest level of detail that is commonly available in practice and a very wide range of applications (Boeters et al. 2015), whereas LoD 3 contains architectural exterior models and describes complex structures accurately and in more detail than LoD 2. Although engineering structures such as office buildings and factory buildings can be found in many fields, structures such as mine shafts with hoistrooms, headframes, and sheave wheels are rarely found outside the mining industry. These structures are common in mining, and they are unique and complex, with different sizes, shapes, and materials. Therefore, when building their 3D models using the CityGML standard, the question is whether these structures can be modeled in LoD 3 using UAV-SfM and TLS point clouds.

In this study, we propose a workflow for building 3D CityGML models of mine structures in LoD 3, in which both UAV-SfM and TLS technologies are utilized. The proposed procedure comprises the following main steps: conducting UAV and TLS surveys, processing data to generate point clouds, filtering noise, fusing point clouds, extracting features, and building 3D CityGML models.

2 Method and data acquisition

2.1 Study objects

The selected study objects include 41 buildings and structures located on the 12-hectare ground area of the Nui Beo underground coal mine, in Quang Ninh province, Vietnam. This area is close to Ha Long city, one of the most rapidly developing cities in Vietnam. The objects comprise two mine shafts surrounded by 5–6 story buildings in the northern part of the study area and several structures with only roofs and pillars in the southern part (Fig. 1). Both mine shafts were constructed from an elevation of around −400 m. On the ground, at an elevation of 35 m, each mine shaft has its hoistroom, a 50 m tall winding tower or headframe, sheave wheels, and other supporting parts. From an architectural perspective, this area is typical of the Vietnamese mining industry, with brick-built office buildings and metal-built factory houses.

Fig. 1

Study area: the green boundary lines for UAV surveys and the red boundary circle for TLS surveys

2.2 Process description

Building 3D models of complex engineering structures in mines normally requires good survey preparation and planning. This is because many different survey instruments may be employed for data collection, and the survey site is often busy with large heavy machines in operation. Good survey preparation helps ensure that there are no conflicts between activities and no failures of important survey activities. In addition, the processing stage comprises many steps related to a variety of survey data. Therefore, a precise workflow is necessary (Fig. 2). The workflow considers all aspects and possible issues, from careful survey planning and preparation to the production of 3D models. It also arranges the important steps in a logical sequence: control network establishment, UAV flight mission planning, TLS station positioning, ground control point (GCP) and checkpoint (CP) measurements, UAV flights and TLS scanning, data processing, and UAV and TLS integration. Post-processing of the UAV and TLS point clouds, which includes extracting points of important features, is followed by the last step of building 3D models of the features.

Fig. 2

Workflow of building 3D models for mine structures using UAV-SfM and TLS

2.3 Data collection

2.3.1 UAV survey

In the data collection phase, a DJI Phantom 4 Advanced drone and a FARO FOCUS3D X130 laser scanner were employed for the UAV and TLS surveys, respectively, while a Leica FlexLine TS09 total station was used for direct coordinate measurement. These instruments were technically checked and confirmed to be ready for use.

The UAV camera is crucially important as it partly determines the accuracy of the 3D models. In this study, the Phantom 4 Advanced carries an onboard Sony camera (Fig. 3a). The camera has a 1-inch 20-megapixel CMOS sensor, a lens with a focal length of 8.8 mm (f/2.8 to f/11), and a field of view (FOV) of 84° (Fig. 3b). In addition, several checkpoints were deployed to evaluate the accuracy of UAV-generated products, such as orthophotos and point clouds. The coordinates of these checkpoints were measured using a Leica TS09 Plus total station (Table 1).

Fig. 3

Unmanned Aerial Vehicle: a Phantom 4 Advanced drone; b Sony camera

Table 1 Specifications of the total station and laser scanner

The first task was to measure the ground control points (GCPs), checkpoints (CPs), and TLS targets. These points were well distributed in the study area (Fig. 4). A total of 31 ground points were measured with a Leica TS09 Plus and a FOIF A90 GNSS receiver (Table 1), 10 of which were CPs for the accuracy assessment of the UAV survey. The positional accuracy of these points was 0.9 cm (total station) and 2.5 cm (GNSS/RTK).

Fig. 4

Distribution of GCPs and CPs for a the whole study area; b the second mine shaft

Several GCPs and CPs were marked with cross-shaped or chessboard markers placed on the ground surface, whereas chessboard markers were also placed on feature facades (Figs. 5a–c). Some GCPs and CPs were concrete road markings (Fig. 5d), and their coordinates were measured by GNSS/RTK. The total station was used in non-prism mode to measure the coordinates of façade markers and in prism mode to measure those of ground markers. In addition, GCPs that were captured in both the UAV and TLS surveys were used for the later alignment process. GCPs should be well distributed on the site. According to Thomas Luhmann et al. (2020), without GCPs in the upper part of the building, the overall accuracy of the model degrades with increasing height and distance. For a better layout, more GCPs should be placed closer to buildings and on the upper parts of buildings that will be represented in LoD 3. This could improve the accuracy of UAV point clouds.

Fig. 5

Markers placed on a The ground for UAV, b The ground for TLS, and c, d The façade

The planning and operation of UAV flights to collect data were undertaken using Pix4DCapture software installed on a smartphone. Several flight plans were created and performed in both automatic and manual modes, as well as in linear and circular orbits. The camera was oriented at 45° to acquire oblique images, which supplement the nadir image blocks and reduce shadow effects as much as possible. A camera angle of 45° has been used and recommended in many studies, for example, to study extreme topography (Agüera-Vega et al. 2018) and urban areas (Fritsch and Rothermel 2013; Rupnik et al. 2015). While the linear flight plan was flown in automatic mode to acquire images over the whole study area (Fig. 6), the circular one was flown in manual mode to capture images of the center area with office buildings and the above-ground parts of the second mine shaft (Fig. 7). The former collects data for a large area with reasonable accuracy, whereas the latter supplements the former’s data, improves spatial accuracy and precision, and reduces data gaps and systematic errors in the final point cloud (Nesbit and Hugenholtz 2019).

Fig. 6

Planning and performing the linear flight with oblique cameras for image acquisition

Fig. 7

Performing the circular flight capturing images of the winding tower

For the automatic mode, several important parameters, such as flight height, image overlaps, and camera optical axis angles, were set up for the flight plan. For manually controlled circular flights, however, the flight radius, i.e., the distance from the center of the circular orbit to the drone, was considered because it defines the ground sample distance (GSD) of the images. Each circular flight was performed at a different radius and height. In the automatic linear flight mode, the GSD of images is defined by the flight altitude. In addition to the automatic operation of the drone, a stop-and-go mode, in which the drone hovers before shooting each image, can help to avoid blurred images. This could, therefore, improve the subsequent photogrammetric results. Table 2 compiles these important parameters and summarises the result of UAV data collection.
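The dependence of GSD on flight altitude (or, for circular flights, on the distance to the object) can be illustrated with the usual pinhole relation GSD = sensor width × distance / (focal length × image width). The snippet below is only a sketch: the 8.8 mm focal length comes from the text, whereas the 13.2 mm sensor width and the 5472-pixel image width are assumed nominal values for a 1-inch 20-megapixel sensor, and the distances are hypothetical rather than taken from Table 2.

```python
def ground_sample_distance(sensor_width_mm, focal_length_mm, distance_m, image_width_px):
    """Return the ground sample distance in metres per pixel (pinhole model)."""
    return (sensor_width_mm * distance_m) / (focal_length_mm * image_width_px)


# Assumed nominal camera values (not reported in the text): 13.2 mm, 5472 px.
SENSOR_WIDTH_MM = 13.2
IMAGE_WIDTH_PX = 5472
FOCAL_LENGTH_MM = 8.8  # stated in the text

# Hypothetical camera-to-object distances in metres.
for distance_m in (50.0, 100.0):
    gsd_m = ground_sample_distance(SENSOR_WIDTH_MM, FOCAL_LENGTH_MM, distance_m, IMAGE_WIDTH_PX)
    print(f"distance {distance_m:.0f} m -> GSD {gsd_m * 100:.2f} cm/pixel")
```

Because the relation is linear in the distance, halving the flight radius or altitude halves the GSD, which is why close circular orbits capture finer detail than the high linear flight.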

Table 2 Flight plan parameters and UAV data

2.3.2 TLS survey

Eight external TLS scan stations were set up at ground level around the mine shaft (Fig. 8). As a result, some parts of the mine shaft towers, such as the roofs of the hoistrooms and winding towers, could not be captured. The TLS data of the eight scan stations were processed using Faro SCENE software. The resulting point cloud contains over 32 million points (Fig. 9), with a resolution of one point per 7.6 mm at 10 m. The point cloud was georeferenced to the local coordinate system (VN-2000) through survey targets (Fig. 5) with a root mean square error (RMSE) of 3.7 cm.

Fig. 8

Preparation of the FARO FOCUS3D X130 for scanning the study area

Fig. 9

TLS point cloud displayed in SCENE software

2.4 Denoising operations

Owing to the limitations of TLS and UAV sensors, devices, and the environment, noise and outliers in the resulting point clouds are inevitable. Therefore, raw point clouds need to be filtered to improve their accuracy before further processing. In this study, the statistical outlier removal (SOR) filter, one of the most commonly used methods, was employed. This method includes the following steps:

  1. Users define the number of neighbors (k) for each point; these neighbors are found using the k-nearest neighbor algorithm.

  2. The average distance between each point and its k nearest neighbors is calculated.

  3. The mean of these average distances over all points in the point cloud is computed, together with their standard deviation.

  4. A point is identified as an outlier when its average distance is larger than the mean from step 3 plus n times the standard deviation, where n is set by users.

As the above steps show, two input parameters are defined by users. These parameters are important as they influence the accuracy of noise reduction and the effectiveness of the filter operation. As the number of neighboring points (k) increases, the accuracy of noise reduction increases; however, the processing time also increases. In contrast, the accuracy of noise reduction improves when the value of n decreases; nevertheless, normal points can then be wrongly classified as outliers (Chen et al. 2018). In this study, the mine shaft tower consists of complex objects that are mainly steel structures. k = 6 and n = 1 were adopted for denoising the UAV point cloud, while k = 8 and n = 4 were set for denoising the TLS point cloud.
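The four SOR steps can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the neighbor search is brute force (suitable only for small clouds), the function name `statistical_outlier_removal` is hypothetical, and the synthetic cloud with one distant point is invented for the demonstration.

```python
import math
import random
import statistics


def statistical_outlier_removal(points, k=6, n_sigma=1.0):
    """Return (kept_points, keep_flags) following the four SOR steps.

    Requires len(points) > k. Brute-force search: O(N^2), demo use only.
    """
    mean_knn = []
    for i, p in enumerate(points):
        # Steps 1-2: k nearest neighbours of p and their mean distance.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn.append(sum(dists[:k]) / k)
    # Step 3: mean and standard deviation of the per-point mean distances.
    mu = statistics.fmean(mean_knn)
    sigma = statistics.pstdev(mean_knn)
    # Step 4: flag points whose mean distance exceeds mu + n * sigma.
    threshold = mu + n_sigma * sigma
    keep = [d <= threshold for d in mean_knn]
    return [p for p, ok in zip(points, keep) if ok], keep


# Synthetic demo data (hypothetical): a tight cluster plus one far-away point.
random.seed(0)
cloud = [tuple(random.gauss(0.0, 0.05) for _ in range(3)) for _ in range(200)]
cloud.append((5.0, 5.0, 5.0))

filtered, keep = statistical_outlier_removal(cloud, k=6, n_sigma=1.0)
print(len(cloud), "points before filtering,", len(filtered), "after")
```

Raising `n_sigma` loosens the threshold and keeps more points, which mirrors the trade-off between noise removal and the risk of discarding valid points described above.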

Once both the UAV and TLS point clouds had been denoised, they were ready for integration. Data fusion can be performed at three levels: low, intermediate, and high (Bastonero et al. 2014; Cucchiaro et al. 2020). In this study, we integrated the UAV and TLS datasets at the low level, as the UAV and TLS point clouds are combined to form one UAV-TLS point cloud. The ICP algorithm was used to merge the two point clouds.

2.5 ICP algorithms

The ICP method was proposed by Besl and McKay (1992). In this method, a least-squares adjustment is applied iteratively to minimize the error between matching points, measured by the Euclidean distance. Each point of the secondary point cloud is paired with its closest point in the primary point cloud (the reference point cloud), i.e., the point with the smallest Euclidean distance to it (see Eq. (1)). These pairs of points are used to determine the transformation parameters by the least-squares method (see Eq. (2)). The process is repeated to improve the co-registration accuracy of the two point clouds. Figure 10 describes the main steps of the ICP method.

$$\hat{i} = \mathop{\arg\min}\limits_{i} \left\| \vec{p}_{i} - \vec{q}_{j} \right\|^{2}$$
(1)
$$\hat{R},\hat{\vec{t}} = \mathop{\arg\min}\limits_{R,\vec{t}} \sum\limits_{i = 1}^{N} \left\| \left( R\vec{p}_{i} + \vec{t} \right) - \vec{q}_{i} \right\|^{2}$$
(2)

where \(\hat{i}\) is the index of the point with the smallest Euclidean distance, \(\vec{p}_{i}\) and \(\vec{q}_{j}\) are vectors of points in the primary and secondary clouds, respectively, and \(\hat{R}\) and \(\hat{\vec{t}}\) are the estimated transformation parameters (rotation and translation).
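The loop formed by Eqs. (1) and (2) can be sketched as follows. To stay within the Python standard library, this illustration is a deliberately simplified 2D version with a closed-form rotation angle; a 3D implementation would normally solve Eq. (2) with a singular value decomposition instead. The function names, the brute-force matching, and the small test shape are assumptions made for the example and are not taken from the study.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst (Eq. 2)."""
    n = len(src)
    scx, scy = sum(x for x, _ in src) / n, sum(y for _, y in src) / n
    dcx, dcy = sum(x for x, _ in dst) / n, sum(y for _, y in dst) / n
    cross = dot = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - scx, sy - scy
        bx, by = dx - dcx, dy - dcy
        cross += ax * by - ay * bx
        dot += ax * bx + ay * by
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dcx - (c * scx - s * scy), dcy - (s * scx + c * scy)


def apply_rigid_2d(points, theta, tx, ty):
    """Rotate by theta and translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(moving, reference, iterations=30, tol=1e-12):
    """Iteratively align `moving` to `reference`; returns (aligned_points, mean_sq_error)."""
    current = list(moving)
    prev_err = float("inf")
    err = prev_err
    for _ in range(iterations):
        # Eq. (1): pair every moving point with its closest reference point.
        matched = [min(reference, key=lambda r: math.dist(p, r)) for p in current]
        # Eq. (2): least-squares rigid transform for the current pairs.
        theta, tx, ty = best_rigid_2d(current, matched)
        current = apply_rigid_2d(current, theta, tx, ty)
        err = sum(math.dist(p, m) ** 2 for p, m in zip(current, matched)) / len(current)
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return current, err


# Hypothetical asymmetric shape, slightly rotated and shifted to form the moving cloud.
reference = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (0, 2), (3, 1)]
moving = apply_rigid_2d(reference, math.radians(3.0), 0.05, -0.03)
aligned, err = icp_2d(moving, reference)
print(f"mean squared error after ICP: {err:.3e}")
```

As with any ICP variant, convergence to the correct alignment depends on a reasonable initial position; here the initial offset is small compared with the point spacing, so the first set of correspondences is already correct.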

Fig. 10

Workflow of ICP method

2.6 Selection of LoD for 3D models

As the final products of this study are 3D models of structures, it is important to define the level of detail (LoD) of these models first; the point cloud extraction of features is then performed to ensure this LoD. In this study, the definition of LoD in the CityGML standard was used. There are five levels of detail: LoD 0 (region and landscape), LoD 1 (city and region), LoD 2 (city, city district, and project), LoD 3 (city district, exterior architectural model, and landmark), and LoD 4 (landmark and interior architectural model), with structural complexity and accuracy increasing at the higher levels (Liu et al. 2017). Based on the quality of the acquired data, the 3D model of the mine shaft’s above-ground parts was built in LoD 3, whereas the surrounding objects in the study area were built in LoD 2.

CityGML defines LoD 2 as containing basic roof shape and orientation. In other words, a building in LoD 2 has differentiated roof structures and thematically differentiated boundary surfaces (Gröger et al. 2012). The LoD 2 model derives prototypical roofs by analyzing the planarity and orientation of the point clouds (Park and Guldmann 2019). LoD 3 denotes architectural models with detailed wall and roof structures, potentially including doors and windows (Gröger et al. 2012).

2.7 Feature extraction and 3D data modeling

Point clouds were extracted both manually and automatically. To automatically extract the point clouds of engineering structures on the ground, such as office and factory buildings and mine shafts, the cloth simulation filter (CSF) was applied. A detailed description of CSF is provided by Zhang et al. (2016). This step was expected to extract all objects above the ground. The point clouds extracted with CSF were then used directly to build 3D models in LoD 2. For LoD 3 modeling, however, further manual extraction of point clouds is needed. In this study, the steel-structured winding tower was modeled in LoD 3, and the fused TLS and UAV point cloud of the mine shaft was manually separated into its components. These include the winding towers or headframes, the sheave wheel, and the hoistrooms.

Based on the extracted point clouds, the following building information can be exploited: roof type, height, and other sizes of buildings. This information is necessary to design the exterior of the buildings in the study area in LoD 2, and the mine shafts’ headframe and sheave wheel in LoD 3. The Trimble Sketchup software was used to draw walls, roofs, and steel frames and bars. In addition, textures acquired from UAV oblique photos were applied to the buildings’ surfaces.

3 Results and discussion

3.1 Accuracy assessment of UAV point clouds

The accuracy of the UAV survey was evaluated by computing the RMSE along each axis (RMSEX, RMSEY, and RMSEZ), together with the planimetric or positioning error RMSEXY. Table 3 shows that the accuracy of the UAV survey was at the centimeter level, as the RMSE values in the horizontal and vertical directions are 2.6 and 1.7 cm, respectively. This level of accuracy indicates that the UAV survey meets the accuracy requirement for making CityGML models at all LoDs (Gröger et al. 2012).
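The RMSE values reported for the checkpoints follow the usual definitions, RMSE along an axis being the square root of the mean squared residual and RMSEXY the root sum of squares of RMSEX and RMSEY. The short sketch below shows the computation; the residuals are hypothetical values in metres chosen for illustration and are not the entries of Table 3.

```python
import math


def rmse(values):
    """Root mean square of a sequence of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def checkpoint_rmse(residuals):
    """residuals: list of (dX, dY, dZ), model coordinate minus surveyed coordinate."""
    dx, dy, dz = zip(*residuals)
    rx, ry, rz = rmse(dx), rmse(dy), rmse(dz)
    # Planimetric error combines the two horizontal components.
    return {"X": rx, "Y": ry, "Z": rz, "XY": math.hypot(rx, ry)}


# Hypothetical checkpoint residuals in metres (not the values of Table 3).
demo_residuals = [(0.020, -0.010, 0.015), (-0.015, 0.020, -0.020), (0.010, 0.015, 0.010)]
for axis, value in checkpoint_rmse(demo_residuals).items():
    print(f"RMSE_{axis} = {value * 100:.2f} cm")
```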

Table 3 The residuals of 10 CPs with RMSEs along each axis

According to Gröger et al. (2012), the positional and height accuracy of points at LoD 2 and LoD 3 are 2 m and 0.5 m, respectively. However, they also noted that these are absolute accuracies, that they are debatable, and that they should be regarded as proposals for discussion. For a further quantitative assessment of the UAV-SfM point clouds, a direct comparison of the TLS and UAV-SfM point clouds was carried out. Several buildings covered by both TLS and UAV-SfM point clouds were selected for the accuracy assessment. Their TLS and UAV-SfM point clouds were used to extract sub-clouds of facades and roofs. Figure 11 shows the position of these sub-clouds in the study area, with roofs (a, b, c) and facades (d, e, f, g), and Fig. 12 describes their shape and size. These sub-clouds were in the same reference system, and all points outside the overlapping areas were removed. The TLS sub-clouds were used as the reference, while the corresponding UAV-SfM sub-clouds were the compared ones. The comparisons were performed in CloudCompare software (version 2.9.1), in which the deviations between the TLS and UAV-SfM sub-clouds were calculated for different areas of the facades and roofs. The deviation computation, which uses the nearest neighbor algorithm of CloudCompare, includes two steps: (1) extraction of UAV-SfM and TLS sub-clouds for roofs and facades; (2) cloud-to-cloud (C2C) distance computation between the UAV-SfM and TLS sub-clouds for each roof or façade. The computed deviations indicate how consistent the UAV-SfM and TLS point clouds of these objects are: large deviations indicate low accuracy of the points, and vice versa.
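The nearest-neighbor form of the C2C distance used in step (2) can be sketched as below: for every point of the compared cloud, the distance to the closest point of the reference cloud is taken. This is a brute-force illustration rather than CloudCompare's own octree-based implementation, and the two small planar clouds, offset by 1 cm, are hypothetical.

```python
import math
import statistics


def c2c_distances(compared, reference):
    """Distance from each compared point to its nearest reference point (brute force)."""
    return [min(math.dist(p, r) for r in reference) for p in compared]


# Hypothetical data: a flat reference patch (standing in for TLS) and a copy
# raised by 0.01 m (standing in for UAV-SfM).
reference_cloud = [(0.1 * i, 0.1 * j, 0.0) for i in range(10) for j in range(10)]
compared_cloud = [(x, y, 0.01) for x, y, _ in reference_cloud]

deviations = c2c_distances(compared_cloud, reference_cloud)
print(f"mean deviation: {statistics.fmean(deviations) * 1000:.1f} mm")
print(f"max deviation:  {max(deviations) * 1000:.1f} mm")
```

Because the distance is measured to the nearest sampled point rather than to the underlying surface, the result depends on the density of the reference cloud, which is one reason the denser TLS sub-clouds serve as the reference here.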

Fig. 11

Position of roof and facade sub-clouds

Fig. 12

C2C distance maps and histograms of roofs (a–c) and facades (d–g)

The deviations between the UAV-SfM and TLS sub-clouds of roofs and facades in different areas are presented using a colormap and histograms (Fig. 12). In general, the TLS and UAV-SfM sub-clouds agree better for roofs than for façades: the deviations of the roof sub-clouds averaged 5 to 14 mm with a maximum of 14.2 cm, compared with an average of 6–46 mm and a maximum of 29 cm for the façade sub-clouds. A possible reason is the lower image coverage of building facades than of roofs, which leads to greater difficulties in image orientation. In addition, Fig. 12c, e show that there were data gaps due to the shadow effect of TLS stations.

The CSF method was applied to the UAV point cloud to extract the point cloud of buildings, which was later used for generating their LoD 2 models, and to the integrated UAV and TLS point cloud to extract the point cloud of the mine shaft, mainly its above-ground parts (e.g., hoistroom and headframe). There are three important parameters to set when operating the CSF, i.e., the type of terrain surface (steep slope, relief, or flat), the cloth resolution (grid size), and the threshold for classifying off-ground points. In this study, these parameters were set with reference to Oniga et al. (2022) and Zhang et al. (2016). Because the study area is flat, the type of terrain surface was set to “flat”, and the cloth resolution and the threshold were set to 2 m and 0.5 m, respectively. The resulting off-ground point cloud contains all objects above ground, including vegetation and buildings. Therefore, one more step of excluding vegetation was performed (Fig. 13).

Fig. 13

Extracting UAV point clouds of buildings using the CSF method: a Original point cloud; b CSF-based extracted point cloud; c Finally extracted point cloud of buildings

3.2 3D CityGML modeling

3.2.1 Creation of LoD 2 models of buildings

The main task of this step is the generation of planes for the facades and roofs of buildings. There are three types of roofs: flat, cross gable, and arc roofs. In addition, some buildings have a mix of flat and cross gable roofs. The ‘Fit Plane’ tool offered by CloudCompare software can be used to approximate planes. The roof and façade planes were edited in Trimble Sketchup software before being converted to CityGML using FME software (https://www.safe.com). Although it contains only two basic drawing elements, polylines and polygons, which form polygons and solids, respectively, Sketchup provides an easy-to-use tool for modeling and editing polygons and volumes, and its models are scalable and flexible. In Sketchup, geometry modeling was conducted first by importing point clouds as a reference, creating outlines of objects such as buildings and steel frames, and then extruding a preliminary framework and modifying the details. Texture modeling was then performed by importing photos captured by UAV or digital cameras. Figure 14 illustrates several 3D models of buildings, whereas Fig. 15 shows part of the CityGML files of the models.
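Approximating a roof plane from extracted points can be illustrated with an ordinary least-squares fit of z = ax + by + c. This is a swapped-in, simplified technique for illustration only: it is not necessarily what CloudCompare's tool computes internally, and it cannot represent vertical facades, for which an orthogonal-distance fit would be needed. The function names and the sample roof points are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares coefficients (a, b, c) of z = a*x + b*y + c via the normal equations."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    rhs = [sxz, syz, sz]
    d = det3(normal)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side.
        m = [row[:] for row in normal]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


def slope_degrees(a, b):
    """Inclination of the plane relative to the horizontal."""
    return math.degrees(math.atan(math.hypot(a, b)))


# Hypothetical roof points lying exactly on z = 0.5x - 0.25y + 3.
roof = [(float(x), float(y), 0.5 * x - 0.25 * y + 3.0) for x in range(5) for y in range(5)]
a, b, c = fit_plane(roof)
print(f"a={a:.3f}, b={b:.3f}, c={c:.3f}, slope={slope_degrees(a, b):.1f} deg")
```

The fitted slope and orientation are the kind of information that distinguishes flat from gable roof faces when the planes are later redrawn in the modeling software.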

Fig. 14

Creating LoD 2 models of office and factory buildings

Fig. 15

A part of the converted CityGML files from FME

Figure 15 illustrates a fragment of the CityGML file, enriched with information about the buildings from the relational database. The GML document uses markup to record the identifier, LoD, and coordinates of a specific building.
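To make the structure of such a fragment concrete, the sketch below writes a minimal building with an LoD 2 MultiSurface using Python's standard XML library. It assumes the CityGML 2.0 and GML namespaces; the building identifier, the coordinates, and the helper names are hypothetical, and the actual files produced by FME contain considerably more information (for example thematic surfaces and attributes).

```python
import xml.etree.ElementTree as ET

# Assumed namespaces (CityGML 2.0 core and building modules, GML 3.1.1).
NS = {
    "core": "http://www.opengis.net/citygml/2.0",
    "bldg": "http://www.opengis.net/citygml/building/2.0",
    "gml": "http://www.opengis.net/gml",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Qualified tag name in ElementTree's {namespace}tag notation."""
    return f"{{{NS[prefix]}}}{tag}"


def lod2_building(building_id, rings):
    """Build a CityModel with one Building whose LoD 2 geometry is a MultiSurface."""
    model = ET.Element(q("core", "CityModel"))
    member = ET.SubElement(model, q("core", "cityObjectMember"))
    building = ET.SubElement(member, q("bldg", "Building"), {q("gml", "id"): building_id})
    lod2 = ET.SubElement(building, q("bldg", "lod2MultiSurface"))
    multi = ET.SubElement(lod2, q("gml", "MultiSurface"))
    for ring in rings:
        closed = list(ring)
        if closed[0] != closed[-1]:
            closed.append(closed[0])  # GML linear rings repeat the first vertex
        surface = ET.SubElement(multi, q("gml", "surfaceMember"))
        polygon = ET.SubElement(surface, q("gml", "Polygon"))
        exterior = ET.SubElement(polygon, q("gml", "exterior"))
        linear = ET.SubElement(exterior, q("gml", "LinearRing"))
        pos = ET.SubElement(linear, q("gml", "posList"), {"srsDimension": "3"})
        pos.text = " ".join(f"{c:.3f}" for point in closed for c in point)
    return model


# Hypothetical flat roof polygon (coordinates invented for the example).
flat_roof = [(0.0, 0.0, 10.0), (5.0, 0.0, 10.0), (5.0, 4.0, 10.0), (0.0, 4.0, 10.0)]
city_model = lod2_building("bldg_demo_001", [flat_roof])
print(ET.tostring(city_model, encoding="unicode"))
```

Each polygon becomes one `surfaceMember`, which reflects how a MultiSurface can describe a building regardless of the shape of its individual faces.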

3.2.2 Creation of LoD 3 models of the mine shafts’ above-ground parts

Compared to LoD 2, all models of objects built in LoD 3 are at a considerably higher level of detail (Fig. 17). To meet this requirement, the quality of the point cloud is an important factor. The UAV and TLS point clouds were fused to generate a point cloud of the mine shaft that is both accurate and highly detailed. The 3D models of the parts of the mine shafts, such as the winding tower or headframe, the hoistroom, and the sheave wheel, were created manually, as these objects are complex, with many steel bars of various shapes and sizes (Figs. 16b–d, 17). Figure 16a is a photo of the mine shaft captured by the UAV camera. The result of converting the Sketchup models (see Fig. 18) to CityGML using the SAFE FME software is shown in Fig. 19.

Fig. 16

Sketchup 3D model of the mine shaft’s above-ground parts

Fig. 17

Establishing LoD 3 models of various parts of the mine shaft

Fig. 18

Sketchup 3D models of the surface plant of the underground coal mine

Fig. 19

CityGML 3D building models within the study area shown in FZKViewer (www.iai.kit.edu)

4 Conclusions

In this study, we have utilized UAV and TLS technologies to collect data over the surface plant area of an underground coal mine and created the 3D CityGML models of buildings and mining industrial structures in this area. The acquired data was processed to generate point clouds, including UAV-SfM and TLS point clouds. While the UAV-SfM point cloud was used to build 3D models at the LoD 2 for most of the buildings in the study area, the integrated UAV-SfM and TLS point cloud of the mine shaft’s above-ground parts were used to create the 3D models of these structures at the LoD 3.

The accuracy of the UAV-SfM point cloud was assessed using ten checkpoints with coordinates measured by a Leica TS09 total station and GNSS/RTK. The result shows centimeter-level accuracy for the UAV-SfM point cloud in all three directions, which indicates that this point cloud meets the accuracy requirement for building 3D CityGML models in LoD 3. In addition, comparisons between the UAV-SfM and TLS sub-clouds of roofs and facades showed that the absolute deviations of the roof and facade sub-clouds do not exceed 14.2 cm and 29 cm, respectively.

However, the mine shaft is a complex engineering structure with many steel objects above the ground, such as hoistrooms, winding towers, and sheave wheels. Therefore, building LoD 3 models of these objects requires point clouds with a higher level of detail and accuracy. In this study, the TLS point cloud meets this requirement, and the UAV-SfM point clouds of top objects such as roofs have an average deviation of 5–14 mm in comparison with the TLS point clouds. This indicates that the missing TLS data of the top parts can be filled with the UAV-SfM point cloud after integrating the two point clouds. The integrated UAV-SfM and TLS point clouds were successfully used to build 3D CityGML models of complex structures such as mine shafts in LoD 3.

Each technology has its advantages and disadvantages; however, both UAV and TLS technologies have proved their potential in data collection for building 3D models in mining industrial areas, especially when they are employed together.