1 Introduction

The rapid growth of information technology, coupled with the increasing complexity of products across diverse industries, has stimulated significant interest and research in digital twin technology and its applications. The technology has become a focal point of investigation because of its potential to drive the ongoing technological revolution and the opportunities it offers across various sectors (Wang, 2022). The construction industry in particular has paid close attention to it in recent years, with countries and regions such as the UK, the US, and the EU issuing a range of policies, plans, standards, and other authoritative documents (Table 1). In China, the National Development and Reform Commission, the Ministry of Science and Technology, the Ministry of Industry and Information Technology, and other bodies have issued corresponding policy documents.

Table 1 Policies and plans to promote the development of digital twin cities

The origins of the digital twin concept can be traced back to the Mirrored Spaces Model (MSM), which comprises three key elements: real space, virtual space, and connected information (Grieves, 2005; Glaessgen & Stargel, 2012). Tuegel (2012) proposed a four-layer model consisting of a data assurance layer, a modeling computation layer, a digital twin function layer, and an immersive experience layer, delineating the process from data collection to application. Tao et al. (2018a, 2018b) proposed a comprehensive five-dimensional model of the digital twin, comprising physical entities, virtual entities, connections, twin data, and services. Following this trend, the guidelines for digital twin implementation are illustrated in Fig. 1.

Fig. 1 Digital twin five-dimensional model and application guidelines

Driven by national strategic requirements and the progress of information technology in recent years, digital twin technology has seen notable advancement and application in civil engineering. With the drastic reduction of available land, most cities and towns have incorporated underground space into their planning and construction (Shao & Wang, 2022). Underground space, as a significant resource for urban development, therefore demands the integration of digital technology (Zhang & Pan, 2019). Wang (2022) divided the process of combining underground space development with digital twin technology into five distinct stages (Table 2), outlining the pivotal role of information technology in the evolution of underground space from a “point” to a “network”.

Table 2 Evolutionary history and characteristics of underground space

Diverse real-time sensing and monitoring techniques in underground engineering enable the acquisition of vast amounts of real-time, realistic, high-fidelity data (Alam & Saddik, 2017). These data form the fundamental basis for establishing digital twins. Furthermore, Building Information Modeling (BIM) technology has provided a comprehensive 3D modeling platform that extends the boundaries of information modeling. Consequently, hyper-realistic, multi-scale, and computable digital representations of underground space can be realized (Costin et al., 2018). Progress in high-performance computing has propelled the dynamic evolution of underground engineering models by enabling continuous feedback between real physical entities and their digital counterparts. Combined with reinforcement learning, knowledge mapping, and machine learning techniques, these computational advances reveal previously indistinct relationships among different features and responses in underground engineering (Schleich et al., 2017). A layered digital twin approach using data collection, digital modeling, and data services has been applied to underground utility tunnels (Lee et al., 2023).

Research on the application of new models and algorithms in digital twins has been fruitful in recent years, especially for underground engineering infrastructure. Shao and Wang (2022) conducted intelligent holistic planning for underground spaces through hierarchical analysis and a complexity evolution model. They verified the feasibility of applying digital twin technology to underground spaces by conducting experiments and simulations in order to mitigate the disaster process of underground infrastructures. Wu et al. (2022) integrated multiple-channel monitoring videos in tunnel scenarios with tunnel geometric models based on topological structures, creating a tunnel panoramic digital twin scene with 3D reality fusion. Ye et al. (2023) developed a safety warning platform for tunnel construction processes based on the concept of Data Technology (DT), ensuring timely interaction of information across the physical layer, digital layer, and service layer. Yu et al. (2021) defined tunnel twin data and developed a rule-based inference engine for rapid decision-making on tunnel electromechanical equipment failures based on the Construction-Operation Building Information Exchange standard.

Nevertheless, the rapid urbanization witnessed in China over the past decades has given rise to certain challenges, primarily stemming from the constraints of information technology development and the absence of a widely adopted scientific approach to early-stage underground space design. Li et al. (2017) observed a notable lack of comprehensive and systematic standards for the nomenclature, codes, data types, and precision formats associated with underground space data. This deficiency hampers the efficient utilization of data and limits the subsequent data collection and processing, platform establishment, and the analysis and application of urban underground space informatization. To address the issue of independent modeling and computational analysis within underground engineering, Yao et al. (2018) made initial strides by developing interfaces for Revit and ANSYS. However, integrating engineering construction visualization with engineering calculation remains challenging. Consequently, shortcomings persist in the theoretical and technical means for perceiving, expressing, and analyzing the authentic characteristics of underground engineering (Boje et al., 2020).

The convergence of conventional civil engineering disciplines with emerging intelligent algorithms, facilitated by spatiotemporal digital empowerment, is important for achieving operational efficiency and control of urban infrastructure systems through digital twin technology. This study focuses on urban underground infrastructure enabled by digital twin technology through the combination of civil engineering and artificial intelligence. To illustrate this concept, a project in Wenzhou serves as an exemplary case of digital twin empowerment and intelligent management of a municipal infrastructure system.

2 Methodology of enabling digital twin technology in underground engineering

2.1 General framework

To apply digital twin technology for the construction of underground infrastructure, a comprehensive exploration of the multidimensional data generated by the target object throughout its lifecycle is essential. This exploration requires the integration of mechanical principles and fundamental concepts from the field of civil engineering with artificial intelligence algorithms for analysis and identification, ultimately leading to the formulation of appropriate disposal strategies. This process needs a broad scope, quick prediction, and enhanced support for urban infrastructure operation and management. In the process of attempting to digitize underground infrastructure, it was found that, due to the greater concealment and complexity of underground engineering data, early structured storage of databases and the integration of data from multiple heterogeneous sources are crucial. Furthermore, guiding engineering practices by integrating data analysis results with the digital model of underground facilities is equally important. Based on this, we reviewed the latest research results in relevant fields and, through discussions among research members, put forward a comprehensive framework.

The attainment of these objectives can be accomplished through a systematic approach consisting of five key steps: “data collection”, “mining and analysis”, “predictive computation”, “iterative correction”, and “risk recognition”, as shown in Fig. 2. Finally, the application of this framework was validated through specific engineering cases.

Fig. 2 Overall thinking framework for digital empowerment

2.1.1 Data collection: deep digital mining of the target structure to obtain multi-dimensional digital and physical indicators

In digital twin applications, data collection is the first and one of the most important steps, with a profound influence on the quality and dependability of the digital twin model. However, traditional monitoring techniques such as stress and strain measurements are inadequate for the escalating complexity of the engineering geological environment encountered in large-scale underground projects. Given the advances in modern monitoring methodologies and their widespread implementation in underground engineering, extracting valuable information from the extensive “big data” obtained during underground engineering monitoring has become critical (Luo et al., 2019). As shown in Fig. 3, real-time monitoring, safety alerts, and reliability forecasting, achieved through intelligent data processing and management techniques, have been used in the data collection process (Chang et al., 2007; Wang, 2013). Consequently, progress in monitoring technologies has made it feasible to merge vast quantities of data with digital twin models.

Fig. 3 Multiple sources of heterogeneous data from different paths

2.1.2 Mining and analysis: analyze the massive data during the operation period, and predict the data values at future time points through artificial intelligence algorithms

Data mining methods for underground engineering can be broadly classified into deterministic and statistical approaches, both aiming to extract valuable information from raw data (Zhou, 2014). Once large datasets have been acquired, selecting appropriate data mining and feature processing methods therefore becomes a crucial step in establishing an intelligent management system. Currently, data analysis methods closely integrated with digital twin technology predominantly revolve around the combination of artificial intelligence and big data analysis. Some researchers have employed machine learning algorithms to evaluate the safety of underground structures and to propose prevention and control measures based on the identified causes (Panda et al., 2015; Sun, 2009; Zhao, 2019). Furthermore, deep learning has brought a significant revolution to machine learning, catalyzing profound changes in intelligent planning, design, construction, and maintenance in civil engineering (Farrar & Worden, 2012). Similar concepts have been applied to bridges (Li et al., 2018; Wei et al., 2017), structures (Bao et al., 2019), and roads and pavements (Li et al., 2022a, 2022b).

2.1.3 Digital modeling: giving physical connotations to key nodes, digitally modeling the actual infrastructure, and constructing a digital twin model

In geotechnical engineering, numerical simulation software such as Abaqus, Ansys, PLAXIS, and FLAC 3D is commonly employed to run computational models for the mechanical analysis of various projects, including tunnels, slopes, and road foundations (Ren et al., 2017; Xie et al., 2019). However, these processes require substantial amounts of data and computational resources for simulation and analysis, along with specialized skills and knowledge to operate the digital twin system. Hence, the integration of digital intelligence technology to empower urban spaces relies on utilizing civil engineering theory to validate machine learning prediction outcomes.

In the domain of digital twin model construction, Jiang et al. (2022) developed a highway modeling approach that combines structures from BIM with surroundings from a Geographical Information System (GIS) by integrating multiple data sources such as point clouds and satellite imagery. Koch et al. (2017) developed a comprehensive information model for tunnel engineering by integrating four interconnected sub-domain models and correlating engineering performance data, thus applying the concept of BIM. Fabozzi et al. (2021) used four software packages from the Bentley suite to perform 4D modeling of tunnels (Fig. 4).

Fig. 4 Mapping physical nodes to digital twin models (Fabozzi et al., 2021)

2.1.4 Iterative correction: using continuously collected and updated data brought into the algorithm to verify and correct the AI prediction accuracy

The accuracy of machine learning predictions depends mainly on the availability of a substantial volume of data, and the dimension of the database can significantly affect that accuracy (Demir, 2008). Wang and Cha (2018) developed a long short-term memory (LSTM) network using monitoring data to model the health status of bridges. Similarly, Saeed et al. (2022) introduced a self-tuning brain-based emotional learning intelligent controller (ST-BELBIC) algorithm that enables the controller to adaptively adjust its parameters and gains through the integration of a fuzzy monitor and tuner. Consequently, dynamically updating the machine learning algorithm as new data are continuously acquired, and using monitoring data (Fig. 5) to optimize the algorithm automatically, play crucial roles in improving the accuracy and performance of the model. These steps are essential for keeping the artificial intelligence predictions accurate and for assessing the health status of structures.

Fig. 5 A neural network algorithm

It should be noted that new datasets continually emerge during the operation of the underground space and are also critical for prediction. Firstly, to ensure the flexibility of the database, an independent modular design can be applied to semi-structured and unstructured data, ensuring the relative independence of new data types. Secondly, adopting Web Service technology makes it possible to define corresponding business data interface standards; once a service is published, other Web Service applications can search for and invoke it. Finally, automated machine learning (AutoML) for time-series data analysis can also be employed to achieve automatic learning and iteration on dynamic data during the data analysis process (Cui, 2022; Tian, 2023; Yang & Shami, 2022).
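The iterative correction idea can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the class `RollingTrendModel`, the window length, and the settlement readings are all hypothetical, and a linear trend refitted on a sliding window stands in for whatever learning algorithm is actually deployed. Each new reading is first used to score the current model and is then added to the training window before refitting.

```python
from collections import deque
from statistics import mean


def fit_line(ts, ys):
    """Ordinary least-squares fit y = a + b * t; returns (a, b)."""
    t_bar, y_bar = mean(ts), mean(ys)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    b = sxy / sxx
    return y_bar - b * t_bar, b


class RollingTrendModel:
    """Hypothetical predictor that is refitted on the latest `window` readings."""

    def __init__(self, window=4):
        self.buffer = deque(maxlen=window)  # oldest readings drop out automatically
        self.coef = None                    # (intercept, slope) once fitted
        self.errors = []                    # absolute one-step-ahead errors

    def predict(self, t):
        if self.coef is None:
            return None
        a, b = self.coef
        return a + b * t

    def update(self, t, y):
        """Verify the current model against the new reading, then correct it."""
        pred = self.predict(t)
        if pred is not None:
            self.errors.append(abs(pred - y))
        self.buffer.append((t, y))
        if len(self.buffer) >= 2:
            ts, ys = zip(*self.buffer)
            self.coef = fit_line(ts, ys)


# Hypothetical settlement readings (mm); the trend steepens after t = 5.
readings = [(t, 1.0 * t) for t in range(6)]
readings += [(t, 5.0 + 3.0 * (t - 5)) for t in range(6, 12)]

model = RollingTrendModel(window=4)
for t, y in readings:
    model.update(t, y)
```

In this toy run the prediction error rises when the trend changes and falls back towards zero once the window contains only readings from the new regime, which is the behaviour the verification and correction loop is meant to provide.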

2.1.5 Risk recognition: theoretical calculation of the critical value, comparison between predicted results and critical values in the future to give judgment indicators

Digital twins find application in real-time monitoring and prediction of various civil engineering structures, such as buildings, bridges, and roads. By combining simplified civil engineering theory with digital twins, more accurate simulations and predictions can be achieved. For example, Qin et al. (2015) derived a simplified expression for crack opening displacement and external load, considering the distribution form of crack cohesion. This approach enables the complete calculation of fracture processes and accurate prediction of the ultimate bearing capacity of members with cracks. Tang (2023) assumed the tunnel as an infinitely long Euler–Bernoulli beam resting on the Vlasov foundation and proposed a simplified calculation method for predicting the effect of pit excavation on the vertical deformation of the lower tunnel. Wei et al. (2022) simplified the pile foundation as an Euler–Bernoulli continuous beam with equivalent flexural stiffness placed on the Winkler foundation. Most of the previous research focuses on presenting simplified predictions, while the ultimate objective of digital twins is to present civil engineering theoretical determination concepts using digital visualization technology, making them accessible to operations and maintenance (O&M) managers who may not have civil engineering backgrounds.

2.2 First step: data structure construction

Among these aspects shown in Fig. 2, the establishment of a well-structured database serves as the fundamental requirement. The data collection process should be based on actual projects and aligned with subsequent monitoring needs, with a focus on establishing a structured data form for essential information. To digitally represent underground urban space, it is imperative to address the functional requirements of different infrastructures. Thus, the initial step involves constructing a database that houses underground space information. The geological information database primarily comprises fundamental geographic, geologic, engineering geologic, hydrogeologic, and environmental geologic data. Topographic and geological information is typically obtained through professional observations and geological explorations and is stored and expressed in textual format. Numerical data related to basic physical and mechanical properties of soil, hydrogeology, and environmental geology are sourced from field or laboratory tests, such as borehole investigations, physical explorations, chemical analyses, in-situ testing, geotechnical testing, and environmental assessments.

Furthermore, using the Revit API, data from the database or solid models can be accessed with programming languages such as C# or Python. Data analysis software can then be employed to generate and calculate relevant information, enabling visualization and dynamic warnings based on the fundamental layer data. Table 3 presents an overview of the essential information encompassed in the geological environment survey.

Table 3 Example of structured engineering geology data

After establishing the basic project information data, it is also necessary to establish a dynamic structured database according to the construction and functional requirements of different technical facilities, such as shield tunnels, foundation pits, pipe corridors, rivers, roads, etc. Taking the construction process of the foundation pit project and tunnel project as an example, the data table shown in Table 4 is established to connect the multi-source data such as monitoring items, construction progress, shield data, and advancement speed to the digital control system for management.

Table 4 Structured data for monitoring and warning

Once the fundamental geological database is established, different structured data sets must be configured for the unique functions of each facility, taking the corresponding monitoring content into account. For the foundation pit, the initial step is to set up data features aligned with the project’s construction design process to facilitate progress management. These features encompass construction progress, plans, measurement point layout, construction logs, and other text-based data. Sourced mainly from construction and design personnel records, this content primarily serves data storage and management functions. Additionally, numerical data features are designed for monitoring items, such as pile foundation, column, precipitation, and support monitoring data. These data sets serve subsequent data processing, analysis, and modeling needs; they flow in from the upstream monitoring system via data interfaces and flow out to the downstream data processing and analysis systems.

In the case of shield tunneling data, similar geological information should be collected, considering both shield machine information and the construction details of the tunnel structure. For the tunnel structure, the key features encompass design parameters, functional design, tunnel formation, and tunnel progress. These aspects are stored, queried, and managed in the form of data, drawings, and text-based formats. Conversely, the shield machine data set primarily includes real-time parameters such as geometric parameters, performance parameters, cutter parameters, and mud and water balance. Stored predominantly as numerical data, these parameters facilitate dynamic warning and management functions through the identification of monitoring systems and data processing systems, respectively.
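A minimal sketch of such a structured monitoring database is given below, using Python's built-in `sqlite3` module. The table names, column names, point identifiers, readings, and warning limits are all hypothetical and are not taken from Tables 3 and 4; the sketch only shows how numerical monitoring data might be stored alongside their warning thresholds so that a downstream system can query for exceedances.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for illustration only
conn.executescript(
    """
    CREATE TABLE monitoring_point (
        point_id   TEXT PRIMARY KEY,
        facility   TEXT NOT NULL,   -- e.g. 'foundation_pit' or 'shield_tunnel'
        item       TEXT NOT NULL,   -- monitored quantity
        unit       TEXT NOT NULL,
        warn_limit REAL NOT NULL    -- warning threshold, in `unit`
    );
    CREATE TABLE reading (
        point_id    TEXT NOT NULL REFERENCES monitoring_point(point_id),
        measured_at TEXT NOT NULL,  -- ISO 8601 timestamp
        value       REAL NOT NULL
    );
    """
)

# Hypothetical monitoring points and readings.
conn.executemany(
    "INSERT INTO monitoring_point VALUES (?, ?, ?, ?, ?)",
    [
        ("FP-01", "foundation_pit", "support_axial_force", "kN", 1200.0),
        ("ST-07", "shield_tunnel", "horizontal_convergence", "mm", 30.0),
    ],
)
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [
        ("FP-01", "2023-05-01T08:00:00", 1100.0),
        ("FP-01", "2023-05-02T08:00:00", 1150.0),
        ("ST-07", "2023-05-01T08:00:00", 28.0),
        ("ST-07", "2023-05-02T08:00:00", 31.5),
    ],
)


def latest_warnings(connection):
    """Return points whose most recent reading exceeds the warning limit."""
    query = """
        SELECT p.point_id, p.item, r.value, p.warn_limit
        FROM monitoring_point AS p
        JOIN reading AS r ON r.point_id = p.point_id
        WHERE r.measured_at = (
            SELECT MAX(measured_at) FROM reading WHERE point_id = p.point_id
        )
          AND r.value > p.warn_limit
        ORDER BY p.point_id
    """
    return connection.execute(query).fetchall()


warnings = latest_warnings(conn)
```

Keeping the threshold in the point table rather than in application code is one possible design choice; it lets the warning rule change per facility without touching the query.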

2.3 Second step: data analysis and processing

From the fundamental database presented in Tables 3 and 4, it becomes apparent that the intelligent operation and maintenance system yields a vast amount of heterogeneous data from diverse monitoring processes. Multi-source heterogeneous data include subsurface spatial data, data on existing underground structures, and data on planned underground structures. They cover not only engineering geology, hydrogeology, and environmental geology data but also extensive data related to structure surveys, design, and construction. With the development of monitoring techniques and artificial intelligence, the sources of such data have become more complex and refined. Intelligent algorithms such as the back-propagation (BP) neural network can be employed to ensure accuracy and reliability (Ju et al., 2006; Liu et al., 2019; Qiu et al., 2018; Xiao et al., 2020; Zhang et al., 2022).

Subsequently, data analysis plays a vital role within the digital twin framework, as it seeks to process the data further to enable assessment and prediction of the existing infrastructure condition. However, previous data analysis approaches have often relied on conventional civil engineering theories, with certain limitations:

1) The mechanical problems in civil engineering lie in exploring the theoretical relationships among complex factors, with each assessment indicator often encompassing the influence of numerous variables. To simplify calculations for theoretical studies, equations were initially simplified based on various assumptions, which deviate from the real-world complexities encountered in practical scenarios.

2) As research and computational capabilities advance, the current trajectory of research involves incorporating additional indicators to enhance the fundamental physical equations. The traditional path of data processing tools based on civil engineering theory encounters challenges in directly interfacing with the O&M system and falls short in attaining real-time data processing and updating capabilities (Ahmed, 2018).

With the substantial advances in contemporary computing power and data collection capabilities, machine learning algorithms are being employed increasingly widely and efficiently. Machine learning, including deep learning, can discern underlying patterns directly from data, enabling the approximation of arbitrary continuous functions that link input and output spaces with high precision. Concurrently, rapid progress in computational hardware, such as graphics processing units (GPUs) and tensor processing units (TPUs), facilitates the deployment of intelligent models with computational efficiency that far surpasses traditional methods (Lv & Song, 2016).

Within machine learning, two primary types of problems are addressed: classification and regression. Regression methods are supervised learning algorithms for predicting and modeling continuous numerical variables, whereas classification methods are used to model or predict discrete variables in a supervised learning context (Phoon & Zhang, 2023). At present, the commonly employed fundamental machine learning algorithms include linear regression, support vector machine (SVM), k-nearest neighbor (KNN), logistic regression, decision tree, random forest, naive Bayes, and neural network algorithms, among others. A comparison of the characteristics and conditions of use for these algorithms is provided in Table 5.

Table 5 Basic machine learning algorithms comparison of advantages and disadvantages

The progressive development of weak learners has given rise to ensemble learning algorithms, such as Bagging and Boosting, as well as deep learning approaches built upon simple machine learning algorithms, including convolutional networks, recurrent networks, and generative adversarial networks (Liu et al., 2022). Ensemble learning refers to a class of machine learning algorithms that construct and combine multiple learners to accomplish a learning task, often yielding better generalization performance than a single learner (Valiant, 1984). Among the prevailing models in contemporary machine learning, the Gradient Boosting Decision Tree (GBDT) holds a prominent position and finds wide-ranging applications in various industries. Notably, it achieves favorable training results while mitigating the risk of overfitting (Ke et al., 2017). This investigation focuses on dynamic data analysis of risks in excavation pits and tunnels, exploring the use of the GBDT algorithm to classify and predict segment diseases through digital analysis.
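To make the boosting idea concrete, the following sketch implements plain gradient boosting with squared-error loss and one-split trees (stumps) on a single feature. It is a teaching example under explicit assumptions, not the model used in this study: it performs regression rather than classification, it omits the histogram-based and leaf-wise techniques of production GBDT libraries, and the function names and toy data are hypothetical. Each round fits a weak learner to the current residuals, which are the negative gradient of the squared loss, and adds a shrunken copy of it to the ensemble.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    for thr in sorted(set(xs))[:-1]:  # candidate thresholds between distinct values
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    return best[1:]


def fit_gbdt(xs, ys, n_trees=100, learning_rate=0.1):
    """Boost stumps on the residuals of the running prediction."""
    base = sum(ys) / len(ys)  # initial constant prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient
        thr, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((thr, left_mean, right_mean))
        preds = [
            p + learning_rate * (left_mean if x <= thr else right_mean)
            for x, p in zip(xs, preds)
        ]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, x):
    total = model["base"]
    for thr, left_mean, right_mean in model["stumps"]:
        total += model["learning_rate"] * (left_mean if x <= thr else right_mean)
    return total


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical toy data: a nonlinear response to one feature.
xs = [float(i) for i in range(10)]
ys = [x ** 2 for x in xs]

model = fit_gbdt(xs, ys)
baseline_mse = mse(ys, [model["base"]] * len(ys))
boosted_mse = mse(ys, [predict(model, x) for x in xs])
```

The learning rate shrinks each stump's contribution, so many small corrections accumulate; this shrinkage is one of the mechanisms by which boosting limits overfitting.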

For example, to predict tunnel segment diseases, a dataset comprising 281,658 samples with 12 feature columns was selected. Of these features, 11 columns are categorical variables used to predict whether a segment is diseased or not. Because of the substantial imbalance between positive and negative samples, the negative samples were down-sampled. Experimental comparisons showed that a positive-to-negative ratio of 1:2 gave better results. The training and test sets were divided in an 8:2 ratio, and commonly employed classification metrics, namely AUC (Area Under the Curve), Precision, Recall, and F1-score, were adopted to assess the classification performance. The prediction results for tunnel segment diseases are presented in Table 6.

Table 6 Prediction results for tunnel lining disease
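The sampling and evaluation procedure described above can be sketched as follows. The code is a minimal illustration in pure Python, assuming a hypothetical list of `(features, label)` pairs in place of the real dataset; the function names, random seeds, and sample counts are assumptions. It down-samples negatives to a 1:2 positive-to-negative ratio, splits the result 8:2, and computes Precision, Recall, F1-score, and a pairwise AUC.

```python
import random


def downsample(samples, neg_per_pos=2, seed=0):
    """Keep every positive and a random subset of negatives at the given ratio."""
    rng = random.Random(seed)
    pos = [s for s in samples if s[1] == 1]
    neg = [s for s in samples if s[1] == 0]
    keep = min(len(neg), neg_per_pos * len(pos))
    return pos + rng.sample(neg, keep)


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle and split into (train, test)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def precision_recall_f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced dataset: 50 diseased segments out of 1000.
dataset = [((i,), 1 if i < 50 else 0) for i in range(1000)]
balanced = downsample(dataset)
train, test = train_test_split(balanced)

metrics = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
auc_value = auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

The pairwise AUC is quadratic in the number of samples and is used here only because it is short and easy to verify; a rank-based formulation would be preferable for a dataset of the size reported above.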

To conduct an interpretability analysis of tunnel hazard prediction, the LightGBM algorithm was employed to determine the importance of the features that influence tunnel hazards (Fig. 6). By extracting the features with higher importance, the entire tree structure was visualized. Additionally, the SHAP (SHapley Additive exPlanations) machine learning interpretation framework was applied in conjunction with visualization methods to illustrate the discriminatory power of specific prediction values and of the entire sample set. Furthermore, a rule table for the classification of tunnel segment hazards was generated using a rule representation learning algorithm, enhancing the effectiveness of the underground engineering model and improving the interpretability and applicability of the prediction outcomes. The feature importance analysis of LightGBM is presented in Fig. 6, in which the year information and conv-value exhibit the highest importance. The decision tree-splitting process of LightGBM is illustrated in Fig. 7.

Fig. 6 Feature importance of LightGBM

Fig. 7 LightGBM decision tree (example with depth = 3)
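The Shapley values on which SHAP is based can be computed exactly for a small model, which helps to show what the attribution means. The sketch below is an illustration only: it uses a hypothetical three-feature toy function rather than the trained model, and it replaces absent features with a single baseline vector, which is a simplification of how SHAP handles background data. Each feature's value is its marginal contribution averaged over all orderings in which features can be added.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features not yet added to the coalition keep their baseline value.
    The cost grows factorially, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]            # add feature i to the coalition
            value = predict(current)
            phi[i] += value - previous   # marginal contribution of feature i
            previous = value
    return [p / factorial(n) for p in phi]


def toy_model(features):
    """Hypothetical model: one additive term and one interaction term."""
    return 2 * features[0] + features[1] * features[2]


phi = shapley_values(toy_model, x=[1, 2, 3], baseline=[0, 0, 0])
```

Here the additive feature receives its full effect (2.0), while the two interacting features split their joint effect equally (3.0 each), and the three values sum to the difference between the prediction and the baseline prediction.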

2.4 Third step: judgment of prediction results

It is crucial to integrate civil engineering theory effectively with the recognition analysis and prediction results, as this forms the cornerstone of empowering underground infrastructure in the digital realm. For instance, tunnel deformation patterns can be predicted and tunnel health risk levels assessed by using civil engineering theory to estimate the forces on a tunnel subjected to stacking loads. Fang (2022) proposed a methodology for predicting the lateral deformation of shield tunnels in the Shanghai subway system under varying stacked load conditions, using a representative subway tunnel in Shanghai as a case study. A selection of tunnels (Fig. 8) with distinct stratigraphic configurations was chosen, and horizontal convergence values were then calculated for different stacking ratios H/C and stacking distances d using a beam-nonlinear spring model. The physical indices of the strata were assigned recommended values from the Shanghai Foundation Design Standards (DGJ08-11-2018), as outlined in Table 7. For a specific type of shield tunnel, with the stacking width held constant at b = 30 m and the stacking length at l = 60 m, the stacking ratio H/C and stacking distance d were systematically varied to compute the horizontal convergence values of the tunnel under different combinations of working conditions.

Fig. 8 Schematic diagram of the tunnel assignment conditions

Table 7 Stratigraphic parameters

Figure 9 illustrates a rapid prediction of the horizontal convergence for a typical shield tunnel in Shanghai subjected to a stacking load (Fang, 2022). The stacking ratio H/C is represented along the horizontal axis, while the stacking distance d is depicted along the vertical axis. For a given pair of horizontal and vertical coordinates, the incremental horizontal convergence Δδ resulting from surface stacking can be approximately predicted by interpolating the contour values at that point. Using the horizontal convergence value as an indicator, the tunnel’s health condition can be classified into distinct categories: “healthy” (Δδ < 5‰D), “diseased” (5‰D < Δδ < 15‰D), “severely diseased” (15‰D < Δδ < 25‰D), and “dangerous” (Δδ > 25‰D). This approach enables a quick assessment of the potential risks associated with stacking activities, providing a foundation for subsequent digital modeling and decision-making on measures for handling abnormal conditions.
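The classification rule above translates directly into a small function. The sketch below makes two assumptions that are not stated in the text: D is taken to be the tunnel diameter, and a value lying exactly on a class boundary is assigned to the more severe class. The function name and the 6200 mm diameter in the usage lines are likewise illustrative.

```python
# Upper bounds of each class, in per mille of D, taken from the categories above.
HEALTH_CLASSES = [
    (5.0, "healthy"),
    (15.0, "diseased"),
    (25.0, "severely diseased"),
]


def classify_tunnel_health(delta_mm, diameter_mm):
    """Classify tunnel health from the horizontal convergence increment.

    delta_mm    -- incremental horizontal convergence, in mm
    diameter_mm -- tunnel diameter D, in mm (assumed meaning of D)
    """
    if delta_mm < 0 or diameter_mm <= 0:
        raise ValueError("delta_mm must be >= 0 and diameter_mm must be > 0")
    ratio = 1000.0 * delta_mm / diameter_mm  # convergence in per mille of D
    for upper_bound, label in HEALTH_CLASSES:
        if ratio < upper_bound:
            return label
    return "dangerous"


# Illustrative values for an assumed diameter of 6200 mm.
labels = [classify_tunnel_health(d, 6200.0) for d in (20.0, 62.0, 124.0, 186.0)]
```

With these inputs the convergence ratios are roughly 3.2‰, 10‰, 20‰, and 30‰ of D, so the four calls fall into the four categories in order.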

Fig. 9 Predicted tunnel deformation under stacked load (Fang, 2022)

As an illustrative instance, Fig. 9(a) shows the risk prediction for a ④-④ tunnel subjected to stacking. In the presence of a positive overburden (d = −23.1 m), the tunnel is deemed “healthy” when the overburden ratio H/C is below 0.15. When the ratio falls within the range 0.15 < H/C < 0.35, the tunnel is classified as “diseased”. If the ratio lies within 0.35 < H/C < 0.55, the tunnel is “severely diseased”. Finally, when H/C exceeds 0.55, the tunnel enters a “dangerous” condition. When H/C equals 0.5, the stacking distance must be at least 0 m for the tunnel to remain “healthy”, indicating that the stack should lie outside the tunnel profile.

In this scenario, the concept of rapid evaluation theory is applied to the digital twin system, utilizing a comprehensive database encompassing diverse geological conditions and varying tunnel depths. This enables the quick identification and assessment of lateral tunnel deformation.
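One way such a rapid lookup could be implemented in the digital twin system is bilinear interpolation over a precomputed table of convergence values indexed by H/C and d. The sketch below is an assumption about implementation, not the method of Fang (2022), and the grid values are invented for illustration rather than read from Fig. 9.

```python
from bisect import bisect_right


def bilinear(xs, ys, grid, x, y):
    """Interpolate grid[j][i], defined at (xs[i], ys[j]), at the point (x, y).

    xs and ys must be strictly increasing; queries outside the table are rejected
    rather than extrapolated.
    """
    if not (xs[0] <= x <= xs[-1] and ys[0] <= y <= ys[-1]):
        raise ValueError("query lies outside the precomputed table")
    # Index of the cell containing the query (clamped at the upper edge).
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    j = min(bisect_right(ys, y) - 1, len(ys) - 2)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    low = grid[j][i] * (1 - tx) + grid[j][i + 1] * tx
    high = grid[j + 1][i] * (1 - tx) + grid[j + 1][i + 1] * tx
    return low * (1 - ty) + high * ty


# Hypothetical table: stacking ratio H/C (columns) and stacking distance d in m (rows).
ratios = [0.0, 0.5, 1.0]
distances = [-20.0, 0.0, 20.0]
convergence_mm = [
    [0.0, 40.0, 80.0],  # d = -20 m
    [0.0, 20.0, 40.0],  # d = 0 m
    [0.0, 4.0, 8.0],    # d = 20 m
]

delta = bilinear(ratios, distances, convergence_mm, x=0.25, y=-10.0)
```

A separate table would be stored for each combination of geological condition and tunnel depth, and the interpolated value can then be passed to the health classification.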

2.5 Last step: digital model building and data interfacing

Once database construction, data analysis, and theoretical discrimination are complete, the final outcome is a comprehensive visual digital model. Numerous digital visualization software options are available, each serving distinct functions, summarized as follows:

① iS3 system: A 3D geological model can be generated and seamlessly integrated into the engineering information management system. The system facilitates the extraction of management, data, and analysis models from the information flow level. Additionally, it offers an open interface for integrating other modeling software, extending the capabilities of data collection, processing, unified data modeling, and information sharing from an information flow perspective. As a result, a wide range of analysis and integrated decision-making services can be provided on the iS3 platform (Zhu et al., 2018);

② Cesium: Cesium is a cross-platform and cross-browser JavaScript library designed for displaying 3D globes and maps. It supports various forms of map display, including 2D, 2.5D, and 3D representations. The library enables the creation of diverse data visualization displays, such as different geometries, highlighted areas, and even 3D models. Furthermore, programs can be developed against the library in an editor such as Visual Studio Code, and it supports dynamic data presentation based on a timeline (Stockdonf et al., 2002);

③ Shanhaibi visualization: Shanhaibi is data visualization software designed for large-screen editing. It adopts a CS (Client–Server) mode, which reduces the cost of private on-premises deployment, while remaining compatible with the Web properties of the BS (Browser–Server) mode. The software intelligently identifies and processes imported data and offers visualization capabilities for large-screen editing. Its features include multiple data source formats, a wide range of visualization components, flexible project delivery methods, and an optimized operating experience, all contributing to the visualization of digital twins.

Considering the iS3 system as an example, it utilizes the data of a Geographic Information System (GIS) to depict spatial entity objects by integrating graphic (geometric) data and attribute data. The geometric and attribute data are linked through a unique code, enabling the establishment of a corresponding relationship between each graphic element constituting a spatial object and the attribute describing it. This integration facilitates the realization of the digital twin model depicting geological conditions (see Fig. 10).

Fig. 10 Digital twin model for geological conditions
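The linkage between graphic elements and attribute records through a unique code can be sketched as a keyed join. This is an illustrative sketch only and does not reproduce the iS3 data model: the class name, the codes such as "STR-001", the coordinates, and the attribute fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GraphicElement:
    code: str                                  # unique code shared with the attribute record
    vertices: Tuple[Tuple[float, float], ...]  # 2D outline (hypothetical coordinates)


def link_by_code(
    elements: List[GraphicElement],
    attributes: Dict[str, dict],
) -> Tuple[Dict[str, Tuple[GraphicElement, dict]], List[str]]:
    """Pair each graphic element with its attribute record via the shared code.

    Returns the linked pairs and the codes of elements with no attribute record.
    """
    linked: Dict[str, Tuple[GraphicElement, dict]] = {}
    unmatched: List[str] = []
    for element in elements:
        record = attributes.get(element.code)
        if record is None:
            unmatched.append(element.code)
        else:
            linked[element.code] = (element, record)
    return linked, unmatched


# Hypothetical sample data: three stratum outlines, two attribute records
elements = [
    GraphicElement("STR-001", ((0.0, 0.0), (10.0, 0.0), (10.0, -5.0), (0.0, -5.0))),
    GraphicElement("STR-002", ((0.0, -5.0), (10.0, -5.0), (10.0, -12.0), (0.0, -12.0))),
    GraphicElement("STR-003", ((0.0, -12.0), (10.0, -12.0), (10.0, -20.0), (0.0, -20.0))),
]
attributes = {
    "STR-001": {"name": "layer A", "thickness_m": 5.0},
    "STR-002": {"name": "layer B", "thickness_m": 7.0},
}

linked, unmatched = link_by_code(elements, attributes)
```

Reporting unmatched codes separately makes it easy to spot graphic elements that were drawn but never described, which is a common consistency check when geometry and attributes are maintained by different teams.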

The next step is to acquire structured information and comprehensive details about real engineering projects, encompassing roads, bridges, rivers, pipelines, integrated corridors, and other urban infrastructure systems. Based on on-site surveys and design drawings, the city is systematically decomposed, and its components are classified accordingly. Modeling software such as 3Dmax, Blender, and SketchUp is then used to build digital representations of the solid elements through precise mapping techniques (refer to Fig. 11).

Fig. 11 Actual model build (3Dmax city model)

At the level of spatiotemporal data correlation, dynamic data binding can be accomplished by building on a unified data interface model. Taking into account the deformations caused by foundation pit and tunnel construction, structured monitoring data can be dynamically tracked, analyzed, and issued in early-warning formats. Eventually, by combining the database (data storage level), digital twin modeling (digital representation level), and the dynamic data API (data collection and analysis levels) within the visualization platform, each pivotal node of the urban infrastructure system acquires physical significance. As a result, the platform goes beyond simple data presentation: its digital models carry a wider array of physical parameters (Fig. 12).

Fig. 12 Dynamic data update settings
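The idea of binding dynamic monitoring data to a model node and raising an early warning can be sketched as follows. This is a minimal sketch under stated assumptions, not the platform's actual interface: the class name, the node identifier, the callback signature, and the 30 mm warning limit are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TwinNode:
    """A model node that receives monitoring values and notifies its listeners."""

    node_id: str
    warning_limit_mm: float  # hypothetical deformation limit for early warning
    latest_mm: float = float("nan")
    listeners: List[Callable[[str, float, bool], None]] = field(default_factory=list)

    def push(self, value_mm: float) -> bool:
        """Store a new reading, notify listeners, and report whether it exceeds the limit."""
        self.latest_mm = value_mm
        exceeded = abs(value_mm) >= self.warning_limit_mm
        for listener in self.listeners:
            listener(self.node_id, value_mm, exceeded)
        return exceeded


# A listener stands in for the visualization layer; here it only records events
events = []
node = TwinNode(node_id="pit-A/point-3", warning_limit_mm=30.0)
node.listeners.append(lambda node_id, value, exceeded: events.append((node_id, value, exceeded)))

first = node.push(12.5)    # below the limit
second = node.push(-31.0)  # magnitude above the limit
```

Keeping the listeners separate from the node means the same reading can drive both the on-screen model and a warning log without either knowing about the other.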

3 Practice of using digital twin technology: a case study

3.1 Case background

This study is centered around a project located in Wenzhou, covering an expansive planning area of 130 square kilometers. The project encompasses various municipal facilities, including 60 roads with a total length of 77,460 m and an area of 2.857 million square meters, 60 bridges spanning a total area of 186,192 square meters, 6 municipal integrated pipeline corridors with a combined length of 13.50 km, and 10 rivers extending over a total length of 32.4 km. Therefore, this project focuses on medium-scale urban infrastructure exploration, incorporating roads, bridges, rivers, municipal pipeline networks, and landscapes as its primary elements.

Situated in a coastal region, the site is underlain by soft marine silt. Survey data show that the foundation is divided into four layers, with burial depths ranging from shallow to 46.20 m. From top to bottom, the layers consist of silty powder clay (with flow plasticity and no laminae but containing occasional shell debris distributed throughout the area), powder sand silt (with flow plasticity and no laminae but featuring micro-laminae distributed throughout the area), silt (with flow plasticity and no stratification, occasionally containing shell debris distributed throughout the area), and silty clay (with flow plasticity and scaly distribution throughout the area).

3.2 Data collection and storage

The underground space data system mirrors the underground space structure system and comprises several database subsystems: the basic geological database, the foundation pit database, the tunnel database, the pipeline database, the river database, and the bridge database. The engineering database is established based on a predefined classification table (as depicted in Fig. 13). To illustrate the Database Collection component of the conceptual framework, deep foundation and tunnel construction processes are taken as examples. The data fields follow Table 3 and Table 4. As discussed in Sect. 2.2, a database that comprehensively covers underground space information must also keep its structure flexible. Therefore, in this case study, a NoSQL database is employed and organized as independently designed modules, so that the original database remains stable whenever new data is generated. These data are interconnected within the digital control system to facilitate efficient management. Additionally, the data set must possess a certain level of scalability to accommodate future requirements.

Fig. 13 Underground spatial data system (with geological data set as an example)
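The flexibility argument for a document-style (NoSQL) store can be illustrated with a small in-memory stand-in. This sketch is not the database used in the project, which the text does not name: the `DocumentStore` class, the collection names, and the field names are hypothetical, and plain Python dictionaries replace a real database engine.

```python
from collections import defaultdict
from copy import deepcopy
from typing import Any, Dict, List


class DocumentStore:
    """In-memory stand-in for a document database with one collection per subsystem."""

    def __init__(self) -> None:
        self._collections: Dict[str, List[Dict[str, Any]]] = defaultdict(list)

    def insert(self, collection: str, document: Dict[str, Any]) -> None:
        # Documents are schema-free: no table definition has to change first
        self._collections[collection].append(deepcopy(document))

    def find(self, collection: str, **criteria: Any) -> List[Dict[str, Any]]:
        """Return copies of documents whose fields equal all given criteria."""
        return [
            deepcopy(doc)
            for doc in self._collections[collection]
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


store = DocumentStore()

# Original records for two independent modules (hypothetical fields)
store.insert("tunnel", {"ring_id": 12, "settlement_mm": 3.1})
store.insert("foundation_pit", {"point_id": "CX-3", "lateral_disp_mm": 8.4})

# A later record carries a new field; earlier documents are left untouched
store.insert("tunnel", {"ring_id": 13, "settlement_mm": 3.4, "grouting_pressure_kpa": 250})

old_record = store.find("tunnel", ring_id=12)[0]
new_record = store.find("tunnel", ring_id=13)[0]
```

In a relational design the new field would require altering the table; here the earlier tunnel record and the separate foundation pit collection are unaffected, which is the stability property the paragraph describes.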

3.3 Digital modeling

Visual representation plays a pivotal role within information management and control systems, enabling the presentation of data in rich and diverse formats. In this project, software such as 3Dmax, Blender, and SketchUp is employed to create visually engaging models. Specifically, Fig. 14 shows a model built for the construction conditions of the first single-line shield tunnel and of the pit with reinforced concrete support.

Fig. 14 Actual model build

The engineering case study follows the digital model building step (Sect. 2.5). Initially, leveraging the capabilities of the iS3 platform, the geological model incorporates information regarding stratum distribution, which is then integrated into a numerical model. Subsequently, detailed models of the tunnel and excavation pit shown in Fig. 14 are constructed in 3Dmax, and its physics engine is employed to dynamically showcase the construction process of the underground engineering. Additionally, SketchUp is used to establish a regionalized overall model of the underground space, as depicted in Fig. 11, providing an efficient visualization from a holistic perspective.

3.4 Machine learning algorithm analysis and data iteration interface

In this phase, the Mining and Analysis components of the framework are integrated with the other parts through computer programming. The LightGBM algorithm depicted in Fig. 7 is trained on a large volume of monitoring data. Simultaneously, an adaptive algorithm is embedded so that the model keeps learning as the monitoring data are continuously updated. The predictive outcomes are then input into the theoretical recognition layer.
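The pattern of adaptive learning under continuously updated monitoring data can be sketched with a deliberately simple stand-in model. The text does not specify the adaptive algorithm, and this sketch does not use LightGBM: it substitutes a straight-line least-squares fit that is refitted over a sliding window each time a reading arrives. The class name and the window size are hypothetical.

```python
from collections import deque


class SlidingWindowTrend:
    """Refit a least-squares line to the most recent readings on every update."""

    def __init__(self, window: int = 20) -> None:
        if window < 2:
            raise ValueError("window must hold at least two readings")
        self._points = deque(maxlen=window)  # oldest readings drop out automatically
        self.slope = 0.0
        self.intercept = 0.0

    def update(self, t: float, value: float) -> None:
        """Add one monitoring reading and refit the model."""
        self._points.append((t, value))
        n = len(self._points)
        mean_t = sum(p[0] for p in self._points) / n
        mean_v = sum(p[1] for p in self._points) / n
        sxx = sum((p[0] - mean_t) ** 2 for p in self._points)
        sxy = sum((p[0] - mean_t) * (p[1] - mean_v) for p in self._points)
        # With a single point or identical times, fall back to a flat line
        self.slope = sxy / sxx if sxx > 0 else 0.0
        self.intercept = mean_v - self.slope * mean_t

    def predict(self, t: float) -> float:
        """Predict the monitored value at time t from the current fit."""
        return self.intercept + self.slope * t


# Steady trend: readings follow value = 2 * t + 1
steady = SlidingWindowTrend(window=20)
for t in range(5):
    steady.update(float(t), 2.0 * t + 1.0)

# Regime change: the short window forgets the old trend once readings level off at 10
adaptive = SlidingWindowTrend(window=3)
for t in range(5):
    adaptive.update(float(t), 2.0 * t + 1.0)
for t in range(5, 8):
    adaptive.update(float(t), 10.0)
```

The window length controls the trade-off between stability and responsiveness; a production system would wrap the same update-then-predict loop around the trained gradient-boosting model rather than a straight line.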

To achieve data interaction among the static database, real-time monitoring data, numerical analysis data, machine learning algorithm data, and the system’s built-in physical model, data integration, storage, and processing are crucial for the digital twin. Various technologies, including APIs, XML, JSON, and Web services, can be employed to realize data interaction within the digital twin. Shanhaibi supports a wide range of data sources, including files (Excel, CSV, JSON), databases (MySQL, SQL Server, PostgreSQL, MongoDB, Oracle), and Internet of Things (IoT) devices using the Modbus TCP protocol and API interfaces.

In this paper, IoT devices connected via the Modbus protocol serve as an example of establishing data connections within the Shanhaibi Data Manager. A connection is established by specifying the appropriate “host” and “port” parameters. The created Modbus protocol application can be viewed and managed on the Shanhaibi Data Manager homepage, offering options to open or delete the application. More complex data source processing can be achieved through API and Hub applications.
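What travels over such a host-and-port connection can be illustrated at the protocol level. The sketch below builds and parses Modbus TCP frames for the "read holding registers" function (code 0x03) using only the Python standard library. It does not show how Shanhaibi handles the connection internally, which the text does not describe; the function names, register addresses, and sample values are hypothetical.

```python
import struct
from typing import List

READ_HOLDING_REGISTERS = 0x03


def build_read_request(transaction_id: int, unit_id: int,
                       start_address: int, quantity: int) -> bytes:
    """Build a Modbus TCP request frame for reading holding registers."""
    if not 1 <= quantity <= 125:
        raise ValueError("quantity must be between 1 and 125 registers")
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_address, quantity)
    # MBAP header: transaction id, protocol id (0), remaining length, unit id
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_read_response(frame: bytes) -> List[int]:
    """Extract 16-bit register values from a read-holding-registers response."""
    if len(frame) < 9:
        raise ValueError("frame too short to be a Modbus TCP response")
    function_code = frame[7]
    if function_code & 0x80:
        # The device signals an error by setting the high bit of the function code
        raise IOError(f"Modbus exception code {frame[8]}")
    byte_count = frame[8]
    data = frame[9:9 + byte_count]
    if len(data) != byte_count or byte_count % 2:
        raise ValueError("register data is truncated or malformed")
    return list(struct.unpack(f">{byte_count // 2}H", data))


# Request two registers starting at address 0 from unit 1
request = build_read_request(transaction_id=1, unit_id=1, start_address=0, quantity=2)

# A hand-written reply carrying the register values 10 and 20
sample_response = bytes.fromhex("000100000007010304000a0014")
registers = parse_read_response(sample_response)
```

To talk to a real device, the request bytes would be sent over a TCP socket opened on the configured host and port (502 is the port conventionally used by Modbus TCP), and the reply passed to `parse_read_response`.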

3.5 Real-time risk detection in operations management

The basic information interface (Fig. 15) implements the Risk Recognition component of the framework, which applies simplified civil engineering theories. Within this interface, the digital twin mapping model of each element of the Wenzhou project is displayed, and each node is interactive: clicking it displays its basic information and the details of its sub-nodes. Through the data interface described above, the tunnel risk identification theories presented in Sect. 2.4 are integrated into the system, enabling the visualization of diverse risk outcomes within the interface.

Fig. 15 Basic information interface

Within the digital twin framework, each sub-node is accompanied by a comprehensive sub-interface (Fig. 16). For instance, the real-time interface of the foundation pit encompasses the monitoring of construction progress at each site and provides real-time data from monitoring points. Similarly, the real-time interface of the tunnel offers feedback data and tracks the travel speed of the shield machine. Leveraging the mechanical analysis model proposed in this study, data undergoes processing and analysis. The system’s built-in data update interface facilitates the provision and display of warning results within the visualization interface, enabling effective operational management of smart city concerns.

Fig. 16 Sub-node detail interface

4 Conclusion and prospective output

The intelligent infrastructure software system represents an initial exploration of digital twin technology for underground infrastructure. Taking a project in Wenzhou as a case study, the establishment of a digital twin model relies on a visualization platform, enabling the interrelated operation of the entire system. This study draws the following conclusions:

  • A framework for empowering underground space infrastructure with digital twin technology is summarized from five different aspects. Data mining analysis algorithms are utilized to predict data values at future time nodes. Digital modeling assigns physical connotations to key nodes, and the iterative update of data, in conjunction with monitoring technology, is incorporated into the algorithm to refine the accuracy of artificial intelligence predictions. Lastly, judgment indicators such as hazard values and critical values are provided based on the theory of civil engineering.

  • The development of new technologies is indispensable for the implementation of digital twins in underground engineering. Advancements in monitoring technology enable comprehensive sensing and real-time transmission, facilitating the holistic perception of digital twins. The emergence of technologies such as iS3 and BIM provides a broad foundation for three-dimensional modeling in the context of underground engineering.

  • A preliminary practice of the digital twin system for underground infrastructure has been conducted, combining the proposed digital twin framework with information technology in the civil engineering field. Using the “excavation pit” and “tunnel” facilities from a real-life engineering project in Wenzhou as a case study, a comprehensive database covering the entire project lifecycle was established. Subsequently, the physical model was visualized through digital modeling techniques, and physical information of the model was assigned using data analysis software. Finally, integration with the visualization platform enabled dynamic operation and maintenance management.

The integration of digital technology with underground spatial facilities, as a cutting-edge interdisciplinary field, is currently in the preliminary stages of research. This platform, as an initial attempt, has certain limitations. (1) With current monitoring methods, some data can be obtained directly, but this places significant demands on the accuracy of both the monitoring methods and the data. (2) The complexity of combining multi-source heterogeneous data affects the accuracy of digital model predictions; however, standards and data exchange methods for driving models with multiple data types are currently hard to find. (3) Previous research on digital twins for underground spaces has mainly focused on algorithm-driven approaches, simulation of digital models, and real-time monitoring data, but how to achieve intrinsic collaborative operation among these three aspects needs to be investigated in depth.