Introduction

In recent years, real scene 3D technology has developed rapidly and become an active research topic. Real scene 3D modeling transforms real-world objects or scenes into 3D models using digital technology. It integrates techniques and methods from multiple disciplines, including computer vision, graphics, metrology, and geographic information systems, enabling the simultaneous recording and display of textures and geometric shapes1. Previous research has focused primarily on two types of targets: modern buildings in urban settings2,3,4, and sculptures and ancient buildings in cultural heritage conservation scenarios5,6. Similar studies have also addressed the 3D modeling of karst cave landscapes7,8. This paper emphasizes the integration of different technical methods to improve model integrity and the reproduction of the color and texture of karst cave landscapes.

The name 'Karst' comes from the Slovenian Karst Plateau9,10. Karst can be defined as terrain with distinctive hydrology and landforms that arise from a combination of high rock solubility and well-developed secondary (fracture) porosity11. Karst is found not only on the surface but also below it, including dissolution features such as caves, voids, sinking springs, and underground springs and rivers. Carbonate rocks, such as limestone, are the most typical lithology in which karst landscapes are created, due to their solubility9. In most cases, carbonate rocks are dissolved by slightly acidic water infiltrating the rock. The acidity primarily derives from CO₂ present in the air and soil, which slowly dissolves into meteoric water, reducing its pH and increasing its corrosion capability10. An estimated 15.2% of the global ice-free continental surface is characterized by soluble carbonate rock, representing potential karst aquifers with surface or near-surface exposure. China is the country with the largest karst surface area, 2.55 million km², corresponding to 26.5% of its land surface12,13,14.

Caves contain unique and irreplaceable geological evolutionary landscapes and cultural historical relics, offering significant value for both observation and scientific research15. For instance, the rock structures and mineral deposits in some caves provide geologists with crucial clues about crustal movements and ancient climates. The cultural relics found in caves, such as traces of ancient human habitation and rock paintings, are invaluable for archaeological and historical studies16. With the growth of tourism, more natural caves are being developed as tourist attractions. This not only allows the public to experience nature and the magical beauty of caves but also boosts local economies. However, the development and use of caves also face the challenge of balancing protection and exploitation. Excessive commercialization can cause irreversible damage to the natural environment and cultural relics of caves. The karst disturbance index (KDI) is a method for systematically and holistically measuring human disturbance to karst environments17,18,19.

In this context, the development of high-precision real scene 3D models becomes particularly important. An accurate and vivid digital replica not only offers visitors a new way to explore caves, reducing the impact on their natural environment, but also provides scientists with an effective tool for studying and analyzing them. Scientists can use these data to study cave formation processes or for other purposes. Therefore, creating high-precision real scene 3D models of cave landscapes, based on laser scanning and photogrammetry, holds significant practical and scientific value.

Traditional building modeling is characterized by regular geometric shapes with fixed dimensions, such as length, width, height, and curvature, allowing 3D models to be constructed from a few simple geometric parameters20,21. The color and texture of these models are often monochromatic or feature regular patterns, making them easy to replicate. In contrast, cave systems, having evolved naturally over extended periods, present irregular spatial extensions and complex surface features. Their interiors, filled with stalagmites and stalactites, have unique structures whose continuously varying geometry traditional measurement methods cannot capture. Moreover, the dark lighting conditions, along with the presence of sediments and moisture, complicate the reproduction of their complex textural features22. Compared to traditional measurement techniques, 3D laser scanning offers the advantages of being non-contact, efficient, and highly automated. It can continuously collect point cloud data from the surface of the target, accurately reflecting the complex geometric information of the measured object and rapidly acquiring the three-dimensional data of the cave. By integrating 3D laser scanning with photogrammetry, it is possible to accurately obtain both the interior spatial and exterior topographic geometric information of the caves, as well as capture the surface texture of the real objects, adding color and realism to the model15. In addition, it can facilitate the semi-automatic extraction of discontinuities for structural and stability analysis23,24,25. At the same time, the reliability of these analysis results needs to be evaluated.

The aim of this study is to establish a high-precision real scene 3D model of cave landscapes based on point cloud and image data. For the cave interior, a mobile 3D laser scanner was employed to collect point cloud data. To address the lack of Global Navigation Satellite System (GNSS) signal inside the cave, we first used real-time kinematic (RTK) positioning to collect known coordinates at the cave entrance. Then, we employed a total station to measure the coordinates of control points within the cave. To address the difficulty of collecting texture information in the cave's dark interior22, a new solution was proposed, resulting in a real scene 3D model of the cave interior with an accuracy better than 4 cm. For the aboveground exterior cave landform, unmanned aerial vehicle (UAV) oblique photogrammetry was used for aerial surveying, collecting exterior images and establishing a real scene 3D model with an accuracy better than 2 cm. Subsequently, the 3D models of the interior and exterior landscapes of the cave were integrated, forming a complete set of high-precision techniques for constructing real scene 3D models of cave landscapes.

Study site

The Wulong Karst Cave, located within the Longmenshan National Geopark in Sichuan Province, China, is a significant geological heritage site in the park (Fig. 1).

Fig. 1 Location of Wulong Karst Cave26.

It developed within a peak cluster landform formed from carbonate rocks. Peak cluster landforms are a distinctive type of karst topography, primarily characterized by numerous isolated, steep peaks. These peaks are typically separated by flat or depressed valleys. This geomorphological feature is particularly prevalent in southern China. Within the area marked in Fig. 1, there are over ten variously sized karst caves, with the Wulong Karst Cave being the most representative.

The entrance to the Wulong Karst Cave is situated at an altitude of 1164 m, with the entrance measuring 10 m in height and 7 m in width. The cave extends to a depth of 120 m and encompasses an erosional space exceeding 10,000 m³. Inside the cave, there is an underground river, and at the end of the cave, there is a circular karst window formed through dissolution and collapse processes. The Wulong Karst Cave was chosen as the subject of this study due to its moderate size, complete developmental characteristics, and existing tourism development, making it suitable for this research.

Methods

This study primarily investigates the methods of high-precision, real scene 3D modeling of karst cave landscapes based on laser scanning and photogrammetry technology. It integrates the internal spatial structure of the cave with the overlying geomorphological landscape to establish a unified model. The main technical workflow includes data collection, data processing, and model construction (Fig. 2).

Fig. 2 Flowchart.

The modeling process is divided into two parts: the internal structure model of the cave and the external geomorphology model. For the internal structure of the cave: mobile 3D laser scanners are used to collect point cloud data, which are particularly suitable in the complex spatial environment of caves. Additionally, RTK and a total station are used to set Ground Control Points (GCPs), ensuring the accuracy of the point cloud data. A Digital Single Lens Reflex (DSLR) camera is used to capture photos of different areas inside the cave for extracting texture information. After data collection, the collected point cloud data are preprocessed through resampling and denoising using a computer, extracting the cave point cloud. A Triangulated Irregular Network (TIN) model is established based on the cave point cloud, followed by model repair and optimization. Texture mapping is then performed to create a real scene 3D model of the cave's interior. For the external geomorphology of the cave: UAV oblique photography technology is adopted, which is widely used for rapid construction of 3D surface models. The data collected by UAV include imagery and Position and Orientation System (POS) information. Using GCPs for correction, the accuracy can reach the centimeter level. In real scene 3D modeling software, a dense point cloud is obtained based on the principles of aerial triangulation, further establishing a 3D surface model. This paper integrates the internal and external models of the cave, constructing a complete, high-precision, real scene 3D model of the cave landscape.

Data collection, processing and results

Internal structure

Light Detection and Ranging (LiDAR) is commonly used for making high-resolution maps and has applications in geodesy, geomatics, archaeology, geography, geology, geomorphology, forestry and airborne laser swath mapping27,28. Unlike traditional point-by-point measurement methods, 3D laser scanning acquires a dense 3D point cloud from the surface of the object being measured. This technology enables on-site scanning in complex environments, directly recording the three-dimensional data of large, intricate, and irregular entities, thereby rapidly reconstructing the 3D model of the scanned object. However, collecting point cloud data inside caves presents several major challenges22:

  • The complex structure and severe occlusions within the cave inevitably lead to numerous scanning blind spots, resulting in missing point cloud data and affecting the quality of the model;

  • The cave's interior is dark with poor lighting conditions, making it impractical in most cases to illuminate the entire cave with artificial lighting. Furthermore, the presence of deposits and humid vapor within the cave obscures its true colors, complicating the collection of texture data;

  • The lack of GNSS signals inside the cave prevents the accurate positioning of the scanning instruments, affecting the precision of the point cloud data.

Data collection

The collection of point cloud data inside the cave in this study utilized the GVI LiGrip V100 handheld rotating laser scanner (Fig. 3). The effective laser scanning range of this device is 100 m. It is characterized by its simple design, compact body, and ease of handling. It can quickly capture extensive scene data, combining LiDAR and SLAM algorithms to achieve integrated measurement of multiple indoor spaces. This approach contrasts with previous studies15,16 that primarily used stationary laser scanners, which are limited to scanning from fixed angles and require the establishment of multiple stations to acquire point cloud data from different perspectives. Processing such data involves calibrating information from various stations, which is less efficient. The mobile laser scanner allows for continuous measurement, offering the freedom to choose data collection routes and scanning angles while in motion. This significantly reduces scanning blind spots, shortens data collection time, and avoids the errors associated with calibrating point cloud data collected from different stations by stationary laser scanners15,16,29.

Fig. 3 GVI LiGrip V100 handheld rotating laser scanner.

Because there is no GNSS signal inside the cave, the instrument's built-in positioning sensor cannot obtain positioning information22, so we employed a total station. By setting up control traverse lines from the entrance of the cave into its interior, we were able to acquire coordinate information within the cave (Fig. 4). We marked the six GCPs using chalk-drawn cross symbols, which we consider a simple yet reliable marking method. The marks are easy to clean and do not damage the cave landscape, and they avoid the positional errors that can occur with movable markers. Points P1 and P2, located on open ground at the cave entrance, were measured with RTK to obtain absolute coordinates. Points P3 to P6, positioned inside the cave, had their absolute coordinates calculated from the total station measurements. This allowed the point cloud data collected by the scanner to be converted from relative to absolute coordinates, and the coordinate errors accumulated during the movement of the laser scanner to be corrected.

Fig. 4 Total station traverse of Wulong Karst Cave.

In the process of data collection, it is crucial to plan the route in advance to ensure comprehensive scanning of every part of the cave, thereby minimizing data blind spots22. While moving forward with handheld instruments, it is important to maintain a uniform and slow pace to ensure optimal data collection quality. Starting from the exterior to the interior of the cave, upon reaching each control point, the position of the instrument should be precisely aligned with the control point. This ensures accurate recording of the instrument's relative coordinate information, preparing for registration to absolute coordinates and error correction during data processing. After completing data collection, the obtained raw point cloud data consists of approximately 160 million points (Fig. 5).

Fig. 5 Raw point cloud data and instrument path of Wulong Karst Cave.

During the export of raw data, control points are used to calibrate the point cloud accuracy, aligning the instrument's recorded relative coordinates with the total station's absolute coordinates. Upon calculation, the error at all six control points was found to be less than 4 cm (Table 1).
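The alignment of scanner-frame coordinates to total-station coordinates can be illustrated with a small least-squares fit. The sketch below is not the procedure implemented in the authors' software: it assumes, for simplicity, that the scanner frame is already level, so that only a rotation about the vertical axis (yaw) and a translation are needed. The function names and any coordinates passed to them are hypothetical.

```python
import math

def fit_yaw_translation(local_pts, world_pts):
    """Least-squares yaw angle plus the centroids of both control-point sets."""
    n = len(local_pts)
    cl = [sum(p[i] for p in local_pts) / n for i in range(3)]  # local centroid
    cw = [sum(q[i] for q in world_pts) / n for i in range(3)]  # world centroid
    s_dot = s_cross = 0.0
    for p, q in zip(local_pts, world_pts):
        px, py = p[0] - cl[0], p[1] - cl[1]
        qx, qy = q[0] - cw[0], q[1] - cw[1]
        s_dot += px * qx + py * qy      # favours cos(yaw)
        s_cross += px * qy - py * qx    # favours sin(yaw)
    yaw = math.atan2(s_cross, s_dot)
    return yaw, cl, cw

def to_world(yaw, cl, cw, p):
    """Rotate a local point about the vertical axis and shift it to the world frame."""
    c, s = math.cos(yaw), math.sin(yaw)
    x, y, z = p[0] - cl[0], p[1] - cl[1], p[2] - cl[2]
    return (c * x - s * y + cw[0], s * x + c * y + cw[1], z + cw[2])

def control_point_residuals(local_pts, world_pts):
    """3D distance between each transformed control point and its surveyed position."""
    yaw, cl, cw = fit_yaw_translation(local_pts, world_pts)
    return [math.dist(to_world(yaw, cl, cw, p), q)
            for p, q in zip(local_pts, world_pts)]
```

With six control points, `control_point_residuals` would return six distances that can be compared against a tolerance such as the 4 cm reported here. A full 3D rigid or similarity transform would be needed if the scanner frame were tilted.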

Table 1 GCPs accuracy correction.

Extract karst cave points

During 3D laser scanning, factors such as the scanning equipment, the surrounding environment, and the characteristics of the target itself inevitably introduce noise and redundant points into the point cloud data. Additionally, the presence of tourists and facilities within the cave can lead to the inclusion of non-cave points in the scan. This not only increases the data volume but also reduces the efficiency of data processing. Consequently, data preprocessing is a critical prerequisite for high-precision 3D modeling based on point cloud data29,30,31, as well as for the extraction of karst cave points.

We conducted data preprocessing using two point cloud processing software packages, LiDAR 360 and Cyclone 3DR, in the following four steps (Fig. 6).

  • Resampling: the point cloud is resampled with a minimum point spacing of 1 cm. This approach reduces the redundancy in dense areas of the point cloud without compromising the quality in sparse regions, thereby enhancing data processing efficiency;

  • Noise removal: points that are noise due to sensors or other interference sources are removed. For each point, the algorithm calculates the average distance D to its 10 nearest neighboring points. If D exceeds the threshold Dmax, the point is considered noise and removed. The threshold is calculated as Dmax = M(D) + S × M(S), where M(D) is the median of the average distances over all points, S is their standard deviation, and M(S) is the standard deviation multiplier, which controls how far above the median a point's average distance may lie before the point is rejected. The number of neighboring points and the standard deviation multiplier can be set manually; we set them to 10 and 5, respectively;

  • Outlier removal: points not belonging to the main structure of the karst cave and typically caused by people moving inside the cave are removed. This can be achieved using a distance-based segmentation algorithm to identify and remove points that are more than 0.1 m away from the main structure of the cave;

  • Manual removal of non-cave points: finally, some non-cave points that cannot be automatically identified and removed by algorithms remain. These points, often connected to the main body of the cave and resulting from stationary objects during data collection, need to be manually eliminated through human–computer interaction methods.
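The first two steps above can be sketched in a few lines. This is a minimal illustration, not the implementation used by LiDAR 360 or Cyclone 3DR: the resampling is approximated with a voxel grid (one point per 1 cm cell), the neighbor search is brute force and therefore only practical for small samples, and the population standard deviation is an assumption because the text does not say which estimator is used. All function names are hypothetical.

```python
import math
import statistics

def resample_min_spacing(points, spacing=0.01):
    """Keep at most one point per cubic cell of edge `spacing` (metres)."""
    kept = {}
    for p in points:
        cell = tuple(math.floor(c / spacing) for c in p)
        kept.setdefault(cell, p)  # the first point seen in a cell is retained
    return list(kept.values())

def mean_knn_distance(points, k=10):
    """Average distance D from each point to its k nearest neighbours."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        result.append(sum(nearest) / len(nearest))
    return result

def remove_noise(points, k=10, multiplier=5.0):
    """Drop points whose D exceeds Dmax = median(D) + multiplier * std(D)."""
    if len(points) < 2:
        return list(points)
    d = mean_knn_distance(points, k)
    d_max = statistics.median(d) + multiplier * statistics.pstdev(d)
    return [p for p, di in zip(points, d) if di <= d_max]
```

On real scans of this size a spatial index (for example a k-d tree) would replace the brute-force search; the thresholding logic stays the same.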

Fig. 6 Extract karst cave points: (a) original point cloud, (b) noise removal, (c) outlier removal, (d) manual removal of non-cave points.

After preprocessing, the karst cave point cloud data contains approximately 67 million points, which is 42% of the original data, meeting the precision requirements for 3D modeling.

TIN modeling

The core concept of TIN modeling involves triangulating point cloud data to divide it into numerous small triangles. This process uses these triangles to approximate the real object's surface. The Delaunay triangulation algorithm is commonly employed for this purpose because it maximizes the minimum interior angle of the triangles, avoiding long, thin triangles and keeping them as close to equilateral as possible, which enhances the quality of the TIN model32. Once the triangulation is complete, the TIN model can be constructed based on the three-dimensional data of each triangle's vertices, forming a continuous triangular mesh. Triangular mesh modeling was performed using Geomagic Wrap software. Geomagic Wrap is a professional 3D reverse engineering software that converts point cloud data into precise polygonal models.
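The property that defines a planar Delaunay triangulation is the empty-circumcircle criterion: no point may lie inside the circumcircle of any triangle. The sketch below shows that test in 2D only, as an illustration of the principle; it is not how Geomagic Wrap builds surfaces from 3D point clouds, and the function names are hypothetical.

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc; positive when counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circumcircle of triangle abc."""
    rows = []
    for p in (a, b, c):
        dx, dy = p[0] - d[0], p[1] - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (ax, ay, aw), (bx, by, bw), (cx, cy, cw) = rows
    det = (ax * (by * cw - bw * cy)
           - ay * (bx * cw - bw * cx)
           + aw * (bx * cy - by * cx))
    # The sign of the determinant flips with the orientation of abc,
    # so multiply by the orientation to make the test order-independent.
    return det * orient(a, b, c) > 0
```

A triangulation algorithm would flip the shared edge of two adjacent triangles whenever this test returns `True` for the opposite vertex.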

Initial TIN models often present issues such as erroneous meshes and holes, coupled with large data volumes that reduce computational efficiency31. Thus, repairing and simplifying the model to achieve a complete and ideal TIN is essential (Fig. 7).

  • Removal of erroneous meshes: incorrect edge connections can lead to displaced triangles or irregular shapes, such as isolated sharp points and overlapping surfaces in the triangular mesh. These anomalies need to be identified and removed. Approaches include re-triangulating the affected areas or adjusting vertex positions to restore mesh integrity.

  • Hole filling: holes are areas in the TIN model not covered by triangles. Repairing these gaps involves analyzing the curvature characteristics of the surrounding areas and replicating similar features in the repair region to ensure seamless integration with the existing structure, maintaining overall geometric consistency.

  • Model simplification: the aim of model simplification is to reduce the number of nodes and triangles in the model, lowering data processing and storage requirements while preserving key features and accuracy. Simplification algorithms typically involve assessing the importance of each vertex and its impact on the overall geometric shape of the model. Less significant vertices are then removed or merged, followed by corresponding re-triangulation. The challenge lies in balancing significant complexity reduction with the preservation of critical topographical details.
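Before holes can be filled they have to be located. One common way to do this, shown here as a sketch rather than as the routine used in Geomagic Wrap, is to look for edges that belong to exactly one triangle. The function name and the index-triple mesh representation are assumptions made for illustration.

```python
from collections import Counter

def boundary_edges(triangles):
    """Return edges used by exactly one triangle, as sorted vertex-index pairs."""
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1  # ignore edge direction
    return sorted(edge for edge, n in counts.items() if n == 1)
```

In a closed surface every edge is shared by two triangles, so the result is empty. In a cave mesh, legitimate openings such as the entrance and the karst window also produce boundary edges, so the loops found this way still need to be classified before any are filled.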

Fig. 7 Karst cave model repair: (a) original TIN model, (b) repaired TIN model, (c) close-up view of the model repair process, sequentially showing from 1 to 4: the original meshes, removal of incorrect meshes, hole filling, and model simplification.

The TIN model of the cave, constructed from the point cloud, comprises 42 million triangles (Fig. 8) and preserves the structural features of the cave with high precision33. The point cloud data were denoised and optimized during processing to ensure data quality. Care was taken to preserve accuracy during the conversion of the point cloud into a 3D model, so that the mesh replicates geological features of the cave such as rocks, fissures, and other structural details. For ease of use in other software and applications, the 3D model was exported in the commonly used OBJ format, which allows the model to be opened and further processed in a range of 3D modeling software.
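For readers unfamiliar with OBJ, the format is plain text: each `v` line lists one vertex and each `f` line lists the 1-based vertex indices of one face. The helper below is a minimal sketch of a geometry-only export; the function name is hypothetical, and texture coordinates and material references are omitted.

```python
def mesh_to_obj(vertices, triangles):
    """Serialise a triangle mesh to Wavefront OBJ text (geometry only)."""
    lines = [f"v {x:.4f} {y:.4f} {z:.4f}" for x, y, z in vertices]
    # OBJ face indices start at 1, whereas Python lists start at 0.
    lines += [f"f {a + 1} {b + 1} {c + 1}" for a, b, c in triangles]
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file with the `.obj` extension gives a mesh that most 3D packages can open.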

Fig. 8 TIN model of karst cave before texture mapping.

Texture mapping

While the TIN model accurately represents the geometric structure of caves, it lacks the texture information of real scenes, failing to meet the requirements for a visually realistic depiction of caves. To address this, texture mapping is employed to restore the information of the real scene. We implemented this functionality using 3ds MAX software.

Texture mapping, a computer graphics technique, involves adding images to the surfaces of 3D models to yield more realistic visual effects. In this study, the texture data used were directly extracted from on-site photographs taken inside the cave (Fig. 9).

Fig. 9 Texture extraction: (1) karst cave walls, (2) karst window, (3) boulders, (4) stone walls, (5) brick walls, (6) stairs, (7) sculptures, (8) flagstones, (9) ceramic tiles, (10) stone bricks, (11) cobblestone floor, (12) river channel.

Due to the dark conditions inside the cave, often compounded by the presence of sediments and moist vapor, the real textures are obscured, hindering the acquisition of complete and ideal cave textures22. Consequently, the cave was segmented based on textural differences into 12 components, including cave walls, karst window, river channel, and sculptures, encompassing both natural and man-made features (Fig. 10). For each category of feature, typical areas were selected, and local images were captured using DSLR cameras. Small-area scenes were adequately lit by the built-in flash of the cameras, facilitating the capture of ideal photos for texture extraction.

Fig. 10 Structural partitioning and texture mapping: the content indicated by the numbers corresponds to that in Fig. 9.

The creation of ideal textures involved seamless processing and color balance. Seamless processing was employed to avoid noticeable seams in the texture mapping process, while color balance ensured that textures collected under varying lighting conditions maintained consistent brightness and contrast, preventing excessive disparities in light and dark areas. These textures were then mapped onto different structural divisions of the cave. Each part was endowed with clear and realistic textures, significantly enhancing the visual realism of the cave model.
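One simple way to bring textures photographed under different lighting to a consistent brightness and contrast is a linear rescaling of grey values toward a common mean and standard deviation. The sketch below illustrates that idea only; the text does not state which tool or algorithm was used for color balance, so the approach, the function name, and the 0 to 255 value range are all assumptions.

```python
import statistics

def match_brightness_contrast(pixels, target_mean, target_std):
    """Linearly rescale grey values so their mean and std match the targets."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)

    def clip(v):
        # Keep results inside the assumed 8-bit intensity range.
        return min(255.0, max(0.0, v))

    if std == 0:
        # A perfectly flat patch has no contrast to scale; only shift it.
        return [clip(target_mean) for _ in pixels]
    gain = target_std / std
    return [clip((v - mean) * gain + target_mean) for v in pixels]
```

Applying the same targets to every texture patch would reduce visible jumps in brightness between neighboring parts of the model, at the cost of flattening genuine differences between materials.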

The 3D model of the karst cave interior, completed with texture mapping, not only depicts the detailed spatial structural features of the cave but also showcases the refined textural effects (Fig. 11).

Fig. 11 Interior of the Wulong Karst Cave real scene 3D model.

External landform

Oblique photography is a low-altitude remote sensing technology based on UAVs34 that has developed rapidly over the past decade. It is a photogrammetric method used for acquiring geographic information and creating real scene 3D models. Unlike traditional vertical photogrammetry, oblique photography employs either a single swinging lens or multiple lenses capturing from different angles to photograph ground targets. This approach provides a more comprehensive perspective, capturing the sides and structural details of terrestrial objects. Oblique photography typically offers high-resolution geographic information, making it a potent tool for cartography, surveying, and Geographic Information System (GIS) applications. In urban planning, oblique photography helps planners better understand city layouts and the conditions of existing buildings. In the field of cultural heritage conservation, this technology can be used to create detailed 3D models of ancient sites and historic buildings, thereby aiding in their preservation and restoration. This paper applies oblique photography to the real scene 3D modeling of karst landforms, exploring its modeling quality in areas of complex terrain.

Data collection

For the construction of the external geomorphic model of the cave, the UAV system used was the Feima Robotics D20 UAV, which has a horizontal and vertical positioning accuracy better than 2 cm. It is equipped with a D-OP3000 five-lens camera, each lens having an effective pixel count of 24.3 million (Fig. 12). Moreover, the UAV can perform high-precision terrain-following flights, maintaining a stable flight altitude in complex terrains, thus ensuring the consistency and accuracy of the imagery.

Fig. 12 Feima Robotics D20 UAV and D-OP3000 five-lens camera system.

The planning of flight paths is one of the critical steps before data collection35. It involves determining the flight or photography path of the UAV, taking into account multiple factors such as flight altitude and weather conditions. Professional route planning software or tools are essential to ensure that the collected image data satisfy the requirements for precision and accuracy, and that the flight mission is completed successfully.

This flight mission includes 11 flight lines, also referred to as shooting lines (Fig. 13), with the designated flight altitude set at 200 m. At this altitude, the ground resolution of the collected images is approximately 3.1 cm per pixel, which not only meets the requirements for precision but also ensures flight safety.
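For a downward-looking lens over level ground, the standard relation between flying height and ground sampling distance (GSD) can be written as below. The symbols H (height above ground), p (physical pixel size), and f (focal length) are introduced here for illustration. The text reports only H and the resulting resolution, so the ratio p/f is inferred from those two values and is not a stated camera specification.

```latex
\mathrm{GSD} = \frac{H\,p}{f}
\quad\Longrightarrow\quad
\frac{p}{f} = \frac{\mathrm{GSD}}{H} \approx \frac{0.031\ \text{m}}{200\ \text{m}} = 1.55 \times 10^{-4}
```

Because GSD scales linearly with H, terrain-following flight keeps the resolution roughly constant over steep karst relief, whereas a fixed-altitude flight would not.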

Fig. 13 Flight path planning and GCPs.

Additionally, the overlap rate, which refers to the degree of overlap between photos, was taken into consideration. Both the forward and side overlap rates were set at 70%. This means that 70% of the area captured in one image overlaps with its adjacent images, both along and across the flight trajectory. Such coverage ensures that each terrain section is documented by multiple photographs, thereby reducing the risk of data loss or distortion and enhancing the reliability of the 3D mapping process.
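The overlap rate translates directly into the distance between consecutive exposures and between adjacent flight lines. The sketch below shows the arithmetic for a downward-looking lens over flat ground. The image dimensions of 6000 × 4000 pixels in the usage note are a hypothetical value chosen to be close to the stated 24.3 million pixels; the text does not give the actual sensor dimensions. Function names are also hypothetical.

```python
def ground_footprint(gsd_m, pixels):
    """Ground length covered by one image side, in metres."""
    return gsd_m * pixels

def exposure_spacing(footprint_m, overlap):
    """Distance between neighbouring image centres for a given overlap rate."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_m * (1.0 - overlap)
```

Under the assumed image size, `exposure_spacing(ground_footprint(0.031, 6000), 0.70)` gives the spacing associated with the longer image side and `exposure_spacing(ground_footprint(0.031, 4000), 0.70)` the spacing associated with the shorter side. Which of the two applies along the flight direction depends on how the camera is mounted.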

In oblique photography, the placement of ground control points is crucial as they are key to enhancing the accuracy and precision of the resulting oblique photography models. The area covered by this aerial photography mission was 190,000 m². Based on the on-site conditions, three ground control points were set up at the cave entrance, the road, and the mountaintop (Fig. 13). Coordinates were collected with RTK for subsequent precision correction35.

The data collection mission was carried out in weather with soft lighting to avoid shadows on the ground, ensuring the capture of optimal image data. After completing the preparations, the UAV commenced the flight mission, capturing surface images at 241 different locations along the flight lines. At each location, five images were taken from different angles, resulting in a total of 1205 photos. This process was completed within 1 hour. These photos will be used in subsequent data processing to create the oblique photography model.

Real scene 3D reconstruction

The real scene 3D reconstruction is based on Structure from Motion (SfM) algorithms36,37. In oblique photography, cameras or sensors capture images from different angles, and identical feature points in these images can be matched to establish the triangulation relationship between the camera positions and the feature points (Fig. 14).

Fig. 14 The adjustment calculation of aerial triangulation.

This process involves several key steps and concepts: initially, the algorithm identifies and matches the same feature points appearing in different images by detecting unique textures or shapes in the images, such as the corners of buildings or windows. Then, by calculating the relative positions of the same feature points in different images, it is possible to estimate the position and orientation of the camera at the time each photo was taken. Once the camera's position and attitude are determined, the actual geographical coordinates of the feature points can be calculated using the principles of triangulation. During this calculation process, the internal parameters of the camera, such as focal length and the size of the image sensor, are crucial for accurately computing the coordinates of the feature points. These parameters help to transform the two-dimensional image data into coordinates in three-dimensional space.
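The triangulation step can be illustrated with two viewing rays. Once the camera centres and the directions toward a matched feature are known, the feature's position is estimated where the rays meet or, with noisy data, where they pass closest to each other. The sketch below uses the midpoint method for two rays; it is a simplified stand-in for the multi-image bundle adjustment performed by photogrammetry software, and the function names are hypothetical.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between lines c1 + s*d1 and c2 + t*d2."""
    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom   # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom   # parameter of the closest point on ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]
```

The ray directions themselves come from the pixel coordinates and the camera's internal parameters, which is why the focal length and sensor size matter for the final coordinates.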

The collected imagery and POS information are imported into ContextCapture software for the adjustment calculation of aerial triangulation. To enhance the accuracy of these calculations, control points are used for error correction. Typically, multiple iterations of aerial triangulation adjustments are required to achieve the desired level of precision. In this specific adjustment calculation, the correction errors for three control points were all within 2 cm (Table 2).

Table 2 GCPs accuracy correction.

After the completion of the adjustment calculation for aerial triangulation, a dense point cloud is generated. The next step in the process of real scene 3D reconstruction involves controlling the model boundaries using vector ranges and dividing the model into multiple tiles for processing. This is crucial because real scene 3D reconstruction is a process that demands high computational power. Appropriately sizing these tiles enables the computer to model in segments, reducing the computational load per operation and lowering the overall performance requirements. Upon completion of the real scene 3D reconstruction, the accuracy report indicates an average reprojection error of 0.48 pixels for all feature points, approximately equivalent to 1.5 cm.
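The conversion from a reprojection error in pixels to a distance on the ground presumably uses the ground resolution of about 3.1 cm per pixel reported for the 200 m flight altitude. Under that assumption the arithmetic is:

```latex
e_{\text{ground}} \approx e_{\text{px}} \times \mathrm{GSD}
= 0.48\ \text{px} \times 3.1\ \text{cm/px}
\approx 1.5\ \text{cm}
```

This is an image-space consistency measure scaled to ground units, so it complements rather than replaces the control point errors listed in Table 2.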

Similar to the internal TIN models of caves, the initially constructed real scene 3D models of the external landform inevitably exhibit issues such as geometric deformation and incorrect texture mapping. These issues manifest as image stretching, holes, floating objects, and incomplete model structures (e.g., roads, walls), along with extraneous objects like vehicles38, detracting from the visual quality of the model. To improve model quality, we repaired and refined the model. The repair of the oblique photography model encompasses two main aspects: the triangular network and the texture. Examples include the removal of vehicles and the individualization of buildings. After repairing the triangular network, texture remapping is performed to achieve the desired model quality (Fig. 15).

Fig. 15 Oblique photogrammetric model repair: (a) vehicle removal, (b) building individualization, (c) model data reduction.

Furthermore, the issue of reducing data size in oblique photography models is currently one of the hot topics in research. This is due to the high precision and rich information content of real scene 3D models, which typically come with large original data sizes. These models are heavily dependent on high-performance computers and pose challenges in terms of data transmission39. In our study, we converted the external topography real scene 3D model into the universal OBJ format, aligning it with the format of the internal structure model of the cave. We then performed data size reduction on the model, decreasing its size from the original 7.51 GB to 2.84 GB, which is 38% of the original size (Fig. 15). During the model simplification process, we prioritized maintaining the morphological features and accuracy of the model, ensuring that there was no significant visible degradation in quality. For areas with minimal curvature changes, such as flat grounds and regular buildings, we simplified the triangular mesh to reduce data volume. However, in areas with significant curvature changes, such as complex trees, we endeavored to retain the number of triangular meshes to preserve accuracy. This strategy not only reduced the data volume but also enhanced processing efficiency while maintaining the model’s precision.

Model integration

After both the interior structure model and the exterior geomorphological model of the cave have been constructed, the next step is to integrate the two models within 3ds Max software to create a complete, integrated 3D model of the cave landscape (Fig. 16). This process involves several key steps:

  • Coordinate system unification and precise registration: both the interior structure model and the exterior geomorphological model of the cave, whether built by laser scanning or by photogrammetry, relied on precise ground control points to ensure data accuracy. This allowed the interior and exterior cave models to be created in real-world coordinates with centimeter-level accuracy. Placing both models in the same world coordinate system ensures that they align with each other;

  • Triangular mesh joining: the main body of the cave is situated below the ground surface, with only the entrance and skylights connecting to the surface model. Therefore, it is only necessary to join the triangular meshes at the cave entrance and skylight positions to ensure a seamless integration of the two models. During the joining process, the interior cave model and the exterior geomorphological model each maintain their respective accuracies. The key task is to naturally handle the triangular meshes at the entrance and skylight seams. Additionally, it is crucial to minimize damage to the original models, ensuring that the triangular meshes in the joining areas blend naturally with the surrounding meshes, consistent with real-world surface characteristics;

  • Seam processing: after the joining process is completed, the triangular meshes at the seam positions must be remapped with new textures. For the texture mapping at the seams, we selected color and texture data adjacent to the seams, at the same resolution as the surrounding areas. This keeps the textures in the joining areas consistent with the surrounding textures, thereby enhancing the naturalness and realism of the entire model;

  • Quality check and optimization: after the initial assembly of the model, a detailed quality check and optimization of the entire cave landscape real scene 3D model is conducted. This includes checking the geometric information accuracy of the model, ensuring the richness and realism of texture information, and fine-tuning the overall appearance of the model to closely resemble the real scene.
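
Two of the steps above lend themselves to a small illustration: locating the open rim of a mesh where joining takes place, and checking registration against surveyed points. The following is a minimal sketch under stated assumptions, not the procedure carried out in 3ds Max. The function names are hypothetical, the mesh is assumed to be given as vertex-index triples, and the root-mean-square error (RMSE) is used only as one common accuracy measure.

```python
import math
from collections import Counter


def boundary_edges(triangles):
    """Return the edges that belong to exactly one triangle.

    `triangles` is an iterable of (i, j, k) vertex-index triples. In an open
    mesh, these edges form the rim of an opening (for example a cave entrance
    or skylight), which is where two meshes would be joined.
    Each edge is returned as a (smaller_index, larger_index) tuple.
    """
    counts = Counter()
    for i, j, k in triangles:
        for a, b in ((i, j), (j, k), (k, i)):
            counts[(min(a, b), max(a, b))] += 1
    return sorted(edge for edge, n in counts.items() if n == 1)


def checkpoint_rmse(model_pts, reference_pts):
    """3D RMSE between paired model and reference coordinates (same units)."""
    if not model_pts or len(model_pts) != len(reference_pts):
        raise ValueError("need two non-empty point lists of equal length")
    squared = [
        sum((m - r) ** 2 for m, r in zip(p, q))
        for p, q in zip(model_pts, reference_pts)
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Two triangles sharing the edge (0, 2): the four outer edges are the rim.
    print(boundary_edges([(0, 1, 2), (0, 2, 3)]))
    # Hypothetical checkpoints in meters, measured in the model and surveyed.
    model = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    survey = [(0.0, 0.0, 0.03), (1.0, 1.04, 1.0)]
    print(f"RMSE = {checkpoint_rmse(model, survey):.3f} m")
```

Running the same checkpoint comparison on both the interior and exterior models, in the shared world coordinate system, gives a simple numerical check before the seams are processed.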

Fig. 16. High-precision real scene 3D model of karst cave landscape: the internal structure model of the cave is located below the external terrain model, which is displayed semi-transparently. The inset displays a side view of the middle part of the model.

Through the above steps, we obtain a complete high-precision real scene 3D model of the karst cave landscape (Fig. 16). This model possesses not only accurate spatial geometric information but also rich texture information from the real world, recreating the real scene. It improves the preservation of geological landscape information of caves and offers a more vivid and realistic visual experience for applications in scientific research, education, and tourism.

Discussion

Previous research on 3D modeling of karst cave landscapes primarily focused on the detailed depiction of the internal spaces of caves, serving specific academic studies. For instance, De Waele et al. collected point cloud data from a cave in northern Italy using ground-based laser scanning, constructing a cave model to study the cave's geological evolution based on its structural features15. Similarly, Marsico et al. used ground-based laser scanning to create a 3D model of a cave for the conservation of geological heritage. This cave, which has yielded artifacts from both the Paleolithic and Neolithic eras, holds both geological and archaeological significance16. Šupinský et al. explored how to use 3D laser scanning to produce high-scale cave maps, employing two-dimensional images to represent three-dimensional cave structures33. Based on the research presented in this article, we can affirm that detailed 3D models significantly enhance studies and applications related to caves. We summarize three main aspects:

Academic research: 3D models provide detailed structural and morphological features of caves, which are key to understanding their geological evolution and formation. With laser scanning technology, we can capture these features with centimeter-level precision. These morphometric parameters not only describe the underground topography but also reveal the processes of cave formation and their relationship with karst phenomena, greatly enhancing morphological analysis capabilities40;

Tourism development: by assessing the development scale of caves, their internal natural landscapes (such as stalactites), and geographic location, we can better understand their potential value as tourist attractions. This helps us evaluate the economic potential of caves in the tourism industry on a broader scale, providing a scientific basis for tourism planning and development;

Spatial utilization: as naturally formed underground spaces, caves have great potential for use in civil defense. In special circumstances, caves can be directly utilized as shelters or for other emergency purposes, avoiding the time and economic costs of constructing civil defense facilities. This area of research is not only significant for public safety but also offers a new perspective on the multifunctionality of caves. However, it is important to emphasize that this type of utilization can cause irreversible damage to the cave landscapes and heritage. Therefore, it should be considered as an option only in emergency situations.
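
As an illustration of the morphometric parameters mentioned under academic research above, the following minimal sketch computes two basic quantities from a triangle mesh: surface area and enclosed volume. The function name and input layout are hypothetical. The volume is meaningful only under the assumption that the mesh is closed and consistently oriented, so a cave model with open entrances would first need those openings capped.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def mesh_area_volume(vertices, triangles):
    """Return (surface_area, enclosed_volume) of a triangle mesh.

    `vertices` is a list of (x, y, z) tuples and `triangles` a list of
    (i, j, k) index triples. Area is the sum of triangle areas. Volume is
    the absolute sum of signed tetrahedron volumes spanned by each triangle
    and the origin, which is valid only for a closed mesh whose triangles
    are consistently oriented.
    """
    area = 0.0
    signed_volume = 0.0
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        normal = _cross(_sub(b, a), _sub(c, a))
        area += 0.5 * math.sqrt(_dot(normal, normal))
        signed_volume += _dot(a, _cross(b, c)) / 6.0
    return area, abs(signed_volume)


if __name__ == "__main__":
    # Unit right tetrahedron as a stand-in for a closed cave chamber mesh.
    verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
    area, volume = mesh_area_volume(verts, tris)
    print(f"area = {area:.4f}, volume = {volume:.4f}")
```

With coordinates in meters, the results are in square and cubic meters; ratios such as surface area to volume can then be derived for comparison between chambers.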

The high-precision real scene 3D model of cave landscapes in this study not only meets research needs in academic study, tourism development, and spatial utilization, but also includes texture information, allowing it to replicate real-world scenes more effectively. Combined with the external topography model, this model offers greater completeness and satisfies a higher standard of visual appeal. When integrated with virtual reality technology, the model can also offer greater interactivity and a more immersive experience41, which is of significant value for the promotion of cave tourism and knowledge dissemination.

In this study, reproducing the color and texture of cave walls in a dark environment required only limited texture data collection for texture mapping, because the caves are small and the color and texture of their walls vary little. In larger caves, however, changes in rock type or different geological processes may cause different parts of the cave walls to exhibit varied colors and textures. For such caves, it is necessary to collect a broader range of texture information to restore the cave's color and texture more accurately.

Conclusions

With the continuous development of real scene 3D technology, its application in various fields is increasing, and the demand for precision and quality of 3D models is also rising. In the fields of physical geography and geology, such as karst cave research, there is an urgent need for the development of high-precision real scene 3D models.

This paper combines ground remote sensing and UAV remote sensing technologies, based on laser scanning and photogrammetry, to construct a complete method for building high-precision real scene 3D models of karst cave landscapes. The accuracy of the internal model of the cave is better than 4 cm, and the external model is better than 2 cm. This demonstrates the effectiveness of the method presented in this paper in real scene 3D modeling of karst cave landscapes. In the construction process of the karst cave model, the main challenges are the complexity of the cave's internal environment, insufficient lighting conditions, and the difficulty in restoring texture information. To address these issues, this paper proposes a new solution, significantly enhancing the realism and detail of the internal landscape of the cave.

In terms of application value, the "Integrated High-Precision Real Scene 3D Modeling of Karst Cave Landscape" developed in this paper facilitates the subsequent extraction of morphological features of karst caves15,16,33, enhancing morphological analysis capabilities and providing a reference for scientific research on cave evolution and genesis. The 3D data from this paper can also be explored further to extract discontinuities in karst caves23,24,25, thereby improving the reliability of structural and stability analyses.

While ensuring the accuracy of the model's 3D information, we focused more on color and texture reproduction and on the realistic expression of target features, so that the model replicates real-world scenes more effectively. Integrating the interior and exterior of the cave made the model more complete and information-rich. In public dissemination, this type of real scene 3D model is easier for the general public to understand and accept, which supports the promotion and marketing of karst cave tourism and attracts tourists' interest more effectively. Combined with virtual reality technology, it also enables immersive tours to be conducted remotely, with applications in the field of education.

The successful application of this methodology also provides valuable experience and a technical reference for subsequent 3D modeling of other physical geographical landscapes. With further refinement and optimization of the technology, this field holds great potential for further development and research.