1 Introduction

Efforts are underway worldwide to transition from run-to-failure to condition-based preventive structural maintenance policies [1, 2]. This paradigm shift is driven by the rising awareness among citizens and policymakers of the significant socio-economic impacts of aging infrastructure. These derive from the high risks associated with deficient maintenance, which can lead to catastrophic collapses and loss of human lives. Exacerbated by the faster rates of material degradation induced by climate change, experts estimate that damage to critical infrastructures in Europe could increase ten-fold by the end of the century (from the current €3.4 billion to €34 billion) [3]. In this context, Structural Health Monitoring (SHM) systems are being increasingly adopted as an effective means of providing continuous health assessment of structures, enabling precise interventions to extend their lifespan, reduce costs, and prevent collapses. Cultural heritage structures are particularly critical in this regard, as strategic assets for sustainability in Europe [4]. Vibration-based SHM systems are especially well-suited for these constructions of invaluable historical and cultural significance owing to their non-destructive and non-intrusive nature, global damage assessment, and relatively easy automation [5, 6]. However, while these systems excel at detecting damage, their use for achieving higher identification levels such as localization and quantification is still evolving and subject to scrutiny. These identification levels become critical in the aftermath of disruptive events, such as earthquakes, facilitating the organization of emergency services and the prioritization of interventions.

Damage identification in SHM is typically structured as a hierarchical problem with increasing levels of complexity [7]: I—Detection; II—Localization; III—Classification; IV—Extension; and V—Prognosis. This problem can be addressed through either unsupervised learning (UL) or supervised learning (SL) techniques [8]. Data-driven UL approaches directly analyze monitoring data to detect the appearance of damage in the form of anomalies affecting the structural performance [9,10,11]. In this light, techniques such as statistical pattern recognition and quality control charts have gained popularity, as they are independent of structural models and their associated uncertainties. Additionally, these techniques seamlessly integrate into continuous SHM [12,13,14]. However, a key limitation of UL is that it is generally restricted to damage detection (Level I), being able to locate and quantify defects only in some specific cases [15]. To achieve higher damage identification levels, SL techniques often become imperative. These methods, often referred to as structural identification (St-Id) or model updating, stand as the most efficient means to achieve complete damage identification. The goal of model updating is to adjust the parameters of a certain structural model of the monitored asset with the purpose of minimizing discrepancies between the model’s theoretical predictions and actual experimental observations [16,17,18]. Under this approach, an emerging damage condition can be translated into a variation in the mechanical properties of some specific structural members. Nonetheless, the inverse calibration of structural models of real-world civil engineering structures is often ill-conditioned [19], which refers to the lack of convexity in the associated optimization problem. As a solution, several studies in the literature have proposed the use of global optimization algorithms such as genetic [20,21,22] or particle swarm optimization (PSO) algorithms [23, 24], as well as robust probabilistic Bayesian St-Id [25, 26].

The connection between St-Id and the modern concept of Digital Twins (DT) is evident. Broadly, a DT serves as a digital counterpart of a physical asset characterized by virtual-real interactions [27,28,29]. In the context of SHM, a DT involves a physics-based or machine learning model that continuously utilizes monitoring data to deduce and categorize the health status of the physical asset [30]. This, in turn, enables the autonomous triggering of specific operational, inspection, and maintenance actions [31]. The use of St-Id for constructing DTs, however, represents a formidable challenge. This is primarily due to the large computational burden of numerical models of real-world civil engineering structures, as well as the considerable volume of model evaluations required by global optimization algorithms. As a solution to reconcile St-Id with real-time SHM schemes, a large variety of surrogate models (SMs) have been reported in the literature to bypass computationally intensive numerical models in an efficient way [32,33,34]. In this light, Cabboi et al. [35] adopted a second-order response surface model to meta-model the modal predictions of a 3D FEM of a stone-masonry tower, the San Vittore bell-tower in Varese, Italy. These authors adopted a SM-based deterministic St-Id approach, demonstrating its effectiveness under various synthetic damage scenarios. Similarly, García-Macías et al. [36] proposed the use of a SM combining adaptive polynomial chaos expansion (PCE) and Kriging modeling for continuous Bayesian damage identification of a historic masonry civic tower, the Sciri Tower in Perugia, Italy. These experiences evidence the potential of SMs to conduct quasi real-time damage identification.

While SMs hold significant potential for developing structural DTs, the inherent ill-posed nature of the St-Id problem remains a major limitation. Ill-posedness refers to the lack of uniqueness or stability in the solutions of an inverse problem [37]. To address this issue, established approaches in the literature include regularization and parameterization techniques [19]. Common regularization approaches encompass various adaptations of the classical Tikhonov regularization [38], as well as the natural regularization capabilities of Bayesian St-Id [39]. An equally crucial aspect is the selection of an appropriate model parameterization. Essentially, this entails choosing the parameters that exert the most significant influence on the model’s output. To this aim, linear sensitivity analysis is the simplest and most intuitive approach [40], although more advanced methods are also available in the literature, such as variance-based global sensitivity analysis [41], sensitivity-based parameter clustering [42], and others [26].

Nevertheless, model parameterization becomes a challenging issue in the context of damage identification, with no general procedure being available in the literature. In these applications, the regions or members affected by a particular pathology are not strictly related to the sensitivity of the undamaged configuration. Instead, they depend on the specific damage mechanism, which is governed by factors such as the loading configuration or the load-bearing capacity of the structural members. This implies that, if a classical sensitivity-based parameterization is adopted, elements with limited (undamaged) sensitivity may be omitted in the St-Id regardless of their susceptibility to a certain damage mechanism. Moreover, in conditions of limited observability, as is common in St-Id of civil engineering structures, ill-conditioning induces considerable uncertainties in the damage localization task, especially when combining parameters with different sensitivities [43]. In this context, it seems reasonable to believe that engineering knowledge can be injected into the St-Id problem by narrowing the parameter search space to regions/members affected by certain structural pathologies frequently observed in similar structures. This idea has prompted some researchers to pursue St-Id approaches involving not just a single model but a family of competing models, representing diverse damage mechanisms that a structure may experience. Building on this approach, when structural damage occurs, the model class that most accurately represents the damage-induced effects can be identified through a model class selection approach, providing intrinsic information on the damage localization. Additionally, once selected, the corresponding fitting parameters offer more detailed information on both the localization/extension and severity of the damage. This problem aligns with the field of model class selection in statistics, referring to the task of choosing the best model from a model class to represent a set of data [44, 45].

To address the model selection problem, the two most commonly adopted approaches are the use of information criteria (IC) and Bayesian evidence [46]. Information criteria are simple metrics representing a trade-off between the uncertainty in the model (prediction error) and its complexity (number of fitting parameters). Among the different IC available in the literature, the most widely recognized ones include the Akaike (AIC) and the Bayesian (BIC) criteria, along with their various extensions [47]. Bayesian evidence techniques represent a more sophisticated, albeit more computationally intensive, approach for model selection. These techniques estimate the model evidence as the likelihood of the observed data integrated across the parameter space of the model. Such an integral is as high-dimensional as the number of model parameters; therefore, Markov chain Monte Carlo (MCMC) sampling techniques are usually required, with nested sampling and transitional MCMC (TMCMC) being the most popular approaches [48]. Among the competing models, these techniques allow for the selection of the most plausible one using Bayes factors (evidence ratios), while also extracting the probability distributions of the fitting parameters as a by-product. In either case, both IC and Bayesian evidence techniques encapsulate the spirit of Occam’s razor or the principle of parsimony [49]: when competing models explain the data with comparable accuracy, the simplest one is the most plausible. Overall, model selection techniques have found wide applicability across various disciplines such as epidemiology, chemometrics, astrophysics, ecology and evolution [50, 51], although their applicability in civil engineering and SHM has been scarcely investigated. This can be primarily attributed to the formidable computational challenges inherent in the numerical models used for St-Id. Among the few experiences in the literature, it is worth mentioning the work by Mthembu and co-authors [52], who adopted the nested sampling algorithm proposed by Skilling [53] for the model updating of a theoretical H-beam and a laboratory airplane model, the Garteur SM-AG19 structure. Another significant contribution came from Qian and Zheng [54], who introduced an evolutionary nested sampling algorithm for both model updating and model selection, illustrating the potential of this approach with two numerical examples: a clamped beam and a truss structure. Despite these promising results, to the best of the authors’ knowledge, the use of model selection techniques for continuous supervised damage identification of real-world civil engineering structures remains unexplored in the literature.

To address the previously mentioned gap in the literature, this work presents a novel multi-model or multi-class SL damage identification approach based on surrogate modeling and IC for model selection. Unlike traditional single-model St-Id, this work proposes the use of multiple finite element models (FEMs) with fitting parameters tailored to replicate the different damage mechanisms a structure may experience. On this basis, if an anomaly is detected, the most probable damage mechanism being activated is identified by a model selection approach based on the assessment of the BIC. To make the inverse calibration of all the model classes compatible with continuous SHM, we propose the use of Kriging meta-modeling to produce computationally light SMs of the forward FEMs. In this light, the proposed approach allows constructing multi-class digital models composed of populations of competing structural models, providing a quasi real-time, interpretable, and comprehensive health assessment of instrumented civil engineering structures. Specifically, we focus in this paper on vibration-based SHM data, exploiting modal data in the inverse calibration and using a deterministic St-Id approach. To illustrate the potential of this approach, two case studies are presented: (i) a numerical planar truss structure; and (ii) a real instrumented cultural heritage structure, the Muhammad Tower in the Alhambra fortress located in Granada, Spain. In the latter, a series of simulated damage scenarios is presented to evaluate the damage identification capabilities of the proposed approach.

2 Theoretical fundamentals

The proposed approach for multi-class SL damage identification comprises three key components, presented hereafter together with the overall methodology. Section 2.1 introduces the general framework of the proposed methodology, Sect. 2.2 explores the fundamental concepts used in FEM updating, and Sect. 2.3 overviews the principles for the construction of Kriging SMs. Finally, Sect. 2.4 presents the adopted BIC formulation for model selection.

2.1 General framework: populations of digital twins

The overarching goal of the proposed approach is to develop multi-class digital models composed of a population of competing structural models reproducing potential damage mechanisms. When implemented into a continuous vibration-based SHM system, the DT is used to conduct supervised damage identification through St-Id. The process iteratively acquires experimental data from the physical asset, performs automated Operational Modal Analysis (OMA), and conducts St-Id by inverse calibration of the different SMs. Once calibrated, a simple model selection approach using the BIC is adopted to select the model that best fits the data, thus achieving not only the identification of the activated pathology but also of its severity. The general workflow is sketched in Fig. 1 and comprises the following five consecutive steps:

Fig. 1: Flowchart of the proposed multi-class digital models for continuous supervised damage identification of structures

(a) Automated modal identification (online phase) - A SHM system periodically acquires ambient vibration data and stores them in separate computer files of a certain time duration. On this basis, using automated OMA, the time series of modal characteristics are estimated, including resonant frequencies \(f_j\), mode shapes \(\varphi _j\), and damping ratios \(\zeta _j\). The impact of benign fluctuations caused by environmental and operational conditions (EOC) on the identified modal signatures is mitigated through the application of statistical pattern recognition.

(b) Model parameterization (offline phase) - In this step, several FEMs are defined to represent the various failure mechanisms that the structure may encounter. Each of these models is parameterized taking into account the zones possibly affected by the specific pathology, considering their local elastic properties as the fitting parameters. These models are defined on the basis of a reference (healthy) FEM, calibrated using the modal properties extracted from an initial Ambient Vibration Test (AVT).

(c) Surrogate modeling (offline phase) - Using the previously defined FEMs, computationally efficient SMs are created as black-box functions that establish a correspondence between the corresponding model parameters (\({\textbf {x}}\)) and the modal signatures of the structure.

(d) Inverse model calibration (online phase) - This step establishes the St-Id of the asset by conducting the inverse calibration of all the created SMs. This process involves solving a specific optimization problem with an objective function denoted as \(J\left( {\textbf {x}}\right)\), which quantifies the disparity between the theoretical predictions and the previously identified experimental modal signatures. Consequently, distinct time series of the estimated fitting parameters are collected for all the considered models in the DT.

(e) Model selection (online phase) - The model selection process is carried out using the BIC, which evaluates the goodness of fit and complexity of the different SMs forming the DT. If an anomaly in the structural performance is detected, the main outcome of this process is the identification of the most plausible damage mechanism that has been activated, along with its severity (location and quantification). This information is derived from the inference results obtained from the parameterized zones in the FEM used to construct the selected SM. A schematic view of how these five steps chain together is sketched below.
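The following minimal sketch, in Python, summarizes how the five steps chain together in the online phase. All names (shm_stream, automated_oma, remove_eoc_effects, calibrate, bic, surrogate_models) are hypothetical placeholders for the components detailed in Sects. 2.2-2.4, not part of any specific implementation:

```python
# Schematic online loop for steps (a)-(e); every function/object here is a
# hypothetical placeholder for the components described in Sects. 2.2-2.4.
for record in shm_stream():                          # (a) new ambient vibration file
    f_exp, phi_exp = automated_oma(record)           #     automated OMA
    f_exp = remove_eoc_effects(f_exp)                #     statistical pattern recognition
    scores = {}
    for name, sm in surrogate_models.items():        # (b)-(c) one SM per damage mechanism
        x_hat = calibrate(sm, f_exp, phi_exp)        # (d) PSO minimization of J(x)
        scores[name] = (x_hat, bic(sm, x_hat, f_exp, phi_exp))
    best = min(scores, key=lambda name: scores[name][1])  # (e) lowest BIC wins
```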

2.2 Finite element model updating using modal data

Model updating refers to the process of calibrating certain model parameters \(x_i\), \(i=1,...,m\), organized in a vector \({\textbf {x}} = \left[ x_1,x_2, \ldots ,x_m \right] ^\textrm{T}\), with the aim of minimizing the mismatch between experimental data and theoretical predictions. Each of these variables is constrained to a specific, physically meaningful range \(\left[ a_i,b_i\right]\), in such a way that \({\textbf {x}}\) spans the m-dimensional design space \({\mathbb {D}}=\left\{ {{\textbf {x}} \in {\mathbb {R}}^m:a_i \le x_i \le b_i }\right\}\). Note that the numerical models of large-scale structures generally contain a significant amount of uncertainty stemming from different assumptions, idealizations and spatial discretization, as well as epistemic uncertainties associated with the material and the connectivity between structural elements. To minimize such uncertainties, FEM updating is formulated as an optimization problem seeking the model parameters that minimize the differences between the theoretical and experimental results. The definition of the objective function is thus the first step in FEM updating. Since this work focuses on vibration-based SHM, we introduce an objective function \(J({\textbf {x}})\) accounting for the relative differences between the l target modes of vibration determined experimentally and their theoretical counterparts as [18, 55]:

$$\begin{aligned} J({\textbf {x}}) = \sum _{i=1}^{l} \left[ \alpha \varepsilon _i ({\textbf {x}})+\beta \delta _i ({\textbf {x}})\right] +\eta {\mathcal {R}}({\textbf {x}}), \end{aligned}$$
(1)

with

$$\begin{aligned} \varepsilon _i ({\textbf {x}}) = \frac{\left| f_{exp}^i-f_{model}^i\right| }{f_{exp}^i}, \quad \delta _i ({\textbf {x}}) = 1-MAC_i ({\textbf {x}}), \quad {\mathcal {R}}({\textbf {x}}) = \sum _{i=1}^m\left( x_i^0-x_i\right) ^2, \end{aligned}$$
(2)

where \(\left| \cdot \right|\) denotes the absolute value, and \(\alpha\), \(\beta\) and \(\eta\) are weighting coefficients. The first error term, \(\varepsilon _i ({\textbf {x}})\), represents the relative error between the i-th experimental \(f_{exp}^i\) and numerical \(f_{model}^i\) resonant frequencies, while \(MAC_i({\textbf {x}})\) stands for the Modal Assurance Criterion (MAC) value between the i-th experimental and numerical mode shapes. The last term in Eq. (2), \({\mathcal {R}}({\textbf {x}})\), is a classical Tikhonov (\(L^2\)) regularization term, which serves to mitigate ill-conditioning in the calibration process by penalizing solutions that deviate significantly from the reference (healthy) parameter values \(x_i^0\). On this basis, the procedure can be articulated as the following constrained non-linear minimization problem:

$$\begin{aligned} \overline{{\textbf {x}}}=\text {arg}\,\underset{{\textbf {x}} \in {\mathbb {D}}}{\text {min}} \, J\left( {\textbf {x}}\right) . \end{aligned}$$
(3)

The optimization problem in Eq. (3) is generally non-convex and, consequently, it is recommended to adopt a global optimization algorithm for its resolution. For this purpose, the PSO algorithm implemented in the open-source Python library Pymoo [56] has been adopted.
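As an illustration, the minimization in Eq. (3) could be set up with Pymoo as sketched below. This is a minimal sketch assuming hypothetical surrogate predictors sm_freqs and sm_modes (stand-ins for the Kriging SMs of Sect. 2.3) that return the l resonant frequencies and the l mode shapes for a given parameter vector:

```python
# Minimal sketch of the inverse calibration in Eq. (3) using Pymoo's PSO.
import numpy as np
from pymoo.core.problem import ElementwiseProblem
from pymoo.algorithms.soo.nonconvex.pso import PSO
from pymoo.optimize import minimize

def mac(a, b):
    # Modal Assurance Criterion between two mode-shape vectors
    return (a @ b) ** 2 / ((a @ a) * (b @ b))

class ModelUpdating(ElementwiseProblem):
    def __init__(self, sm_freqs, sm_modes, f_exp, phi_exp, x0,
                 alpha=1.0, beta=1.0, eta=0.0, lb=0.7, ub=1.1):
        m = x0.size
        super().__init__(n_var=m, n_obj=1, xl=lb * np.ones(m), xu=ub * np.ones(m))
        self.sm_freqs, self.sm_modes = sm_freqs, sm_modes
        self.f_exp, self.phi_exp, self.x0 = f_exp, phi_exp, x0
        self.alpha, self.beta, self.eta = alpha, beta, eta

    def _evaluate(self, x, out, *args, **kwargs):
        f_num = self.sm_freqs(x)      # surrogate frequencies, shape (l,)
        phi_num = self.sm_modes(x)    # surrogate mode shapes, shape (l, n_DOF)
        eps = np.abs(self.f_exp - f_num) / self.f_exp                 # Eq. (2)
        delta = 1.0 - np.array([mac(a, b) for a, b in zip(self.phi_exp, phi_num)])
        reg = np.sum((self.x0 - x) ** 2)                              # Tikhonov term
        out["F"] = self.alpha * eps.sum() + self.beta * delta.sum() + self.eta * reg

# problem = ModelUpdating(sm_freqs, sm_modes, f_exp, phi_exp, x0=np.ones(3))
# res = minimize(problem, PSO(pop_size=40), ("n_gen", 200), seed=1)
```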

2.3 Non-intrusive surrogate modeling: Kriging

The FEMs of large-scale civil engineering structures are often computationally intensive, which critically undermines the efficiency of iterative optimization algorithms. To tackle this challenge, computationally efficient SMs using Kriging are adopted to replicate the predictions of the forward FEM with significantly reduced computational overhead. After selecting the design variables, the process of constructing a SM typically involves three consecutive steps, namely: (i) Sampling of the design space, (ii) Generation of the training population, and (iii) Construction of the SM.

Consider the previously introduced set of m design variables \(x_i \in {\mathbb {R}}\) for \(i=1,\ldots ,m\) to be calibrated. As anticipated in Sect. 2.1, it is crucial for the selected model parameters to capture the effects of potential damage on the structure’s investigated response y. In this context, a SM serves as a computationally efficient means of establishing a functional relationship between the selected damage-sensitive parameters \({\textbf {x}}\) and the response \(y\in {\mathbb {R}}\), as predicted by the FEM of the structure. When considering a non-intrusive SM, a training population of \(N_s\) individuals is necessary, mapping the design space \({\mathbb {D}}\) to the output y, often referred to as the experimental design (ED). This is achieved by uniformly sampling \({\mathbb {D}}\) and forming a matrix of design sites \({\textbf {X}}=[{\textbf {x}}^1,\ldots ,{\textbf {x}}^{N_s} ] \in {\mathbb {R}}^{m \times N_s}\). Corresponding outputs \(y^i\) are obtained through direct Monte Carlo simulations (MCS) using the forward FEM and compiled in an observation vector \({\textbf {Y}}=\left[ y^1,\ldots ,y^{N_s} \right] ^\text {T}\). In this work, the damage-sensitive design variables pertain to the elastic moduli of particular regions within the FEM, referred to hereafter as macro-elements. It is important to emphasize that such a simplified damage model is valid as long as the structure continues to behave as a linear time-invariant system after the appearance of damage. The outputs considered are the modal properties derived from a linear modal analysis of the FEM. Consequently, individual SMs must be constructed for each natural frequency and modal displacement associated with all the vibration modes considered in the model calibration. Specifically, if there are l selected vibration modes and \(n_{DOF}\) degrees of freedom characterizing the mode shapes, a total of \(l\left( 1+n_{DOF}\right)\) SMs (per model parameterization) need to be developed.
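For concreteness, the assembly of one ED could proceed as in the sketch below, here using SciPy's Sobol sampler (as adopted later in Sect. 3.2); fem_modal is a hypothetical wrapper that runs one linear modal analysis of the forward FEM and returns the stacked modal outputs:

```python
# Sketch of the experimental design (ED) generation for one model class.
import numpy as np
from scipy.stats import qmc

m, n_s = 3, 160                                   # design variables, training samples
sampler = qmc.Sobol(d=m, scramble=True, seed=0)
X = qmc.scale(sampler.random(n_s),                # design sites in D
              l_bounds=[0.7] * m, u_bounds=[1.1] * m)
Y = np.array([fem_modal(x) for x in X])           # one forward FEM run per sample
```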

From the extensive array of non-intrusive SMs available in the literature, the Kriging model is adopted in this work due to its exceptional adaptability to a diverse range of applications [57]. The Kriging interpolator conceptualizes the function of interest \(y({\textbf {x}})\) as the sum of a linear regression term \(y_r \left( {\textbf {x}}\right)\) and a zero-mean stochastic process \({\mathcal {Z}}\left( {\textbf {x}}\right)\) as [58]:

$$\begin{aligned} y\left( {\textbf {x}}\right) =y_r \left( {\textbf {x}}\right) + {\mathcal {Z}}\left( {\textbf {x}}\right) . \end{aligned}$$
(4)

Essentially, \(y_r \left( {\textbf {x}}\right)\) serves as a global approximation for the entire design space, while \({\mathcal {Z}}\left( {\textbf {x}}\right)\) models localized deviations. The regression function \(y_r ({\textbf {x}})\) relies on a set of p regression parameters, \(\varvec{\kappa }=\left[ \kappa _1,\ldots ,\kappa _p\right] ^\text {T}\), and certain user-defined regression functions, \(f({\textbf {x}})=\left[ f_1 ({\textbf {x}}),\ldots ,f_p ({\textbf {x}})\right] ^\text {T}, \; f_i:{\mathbb {R}}^m \rightarrow {\mathbb {R}}\), as \(y_r ({\textbf {x}})=f({\textbf {x}})^\text {T} \varvec{\kappa }\) [59]. On the other hand, the stochastic process \({\mathcal {Z}}\left( {\textbf {x}}\right)\) is characterized by its covariance function \(\text {Cov}\left[ {\mathcal {Z}}({\textbf {x}}_i){\mathcal {Z}}({\textbf {x}}_j)\right]\), which quantifies the correlation between any two arbitrary data points \({\textbf {x}}_i\) and \({\textbf {x}}_j\) as:

$$\begin{aligned} \text {Cov}\left[ {\mathcal {Z}}({\textbf {x}}_i){\mathcal {Z}}({\textbf {x}}_j)\right] =\sigma ^2 r\left( {\textbf {x}}_i,{\textbf {x}}_j,\varvec{\theta } \right) . \end{aligned}$$
(5)

Here \(\sigma ^2\) represents the variance of \({\mathcal {Z}}\left( {\textbf {x}}\right)\), and \(r\left( {\textbf {x}}_i,{\textbf {x}}_j,\varvec{\theta } \right)\) is a spatial correlation function, which is reliant on a set of hyper-parameters denoted as \(\varvec{\theta }\). On this basis, the Kriging predictions \({\widehat{y}} \left( {\textbf {x}}\right)\) for the response \(y \left( {\textbf {x}}\right)\) at any designated design site \({\textbf {x}}\) are defined as follows:

$$\begin{aligned} {\widehat{y}}({\textbf {x}})=f({\textbf {x}})^\text {T} \varvec{\kappa }+r({\textbf {x}})^\text {T} {\textbf {R}}^{-1} \left[ {\textbf {Y}}-f({\textbf {x}})^\text {T} \varvec{\kappa } \right] , \end{aligned}$$
(6)

where \(r({\textbf {x}})\) is a vector that contains the correlations between the design sites and \({\textbf {x}}\), defined as:

$$\begin{aligned} r({\textbf {x}})^\text {T}=\left[ r\left( \varvec{\theta },{\textbf {x}}_1,{\textbf {x}}\right) ,\ldots ,r\left( \varvec{\theta },{\textbf {x}}_{N_s},{\textbf {x}}\right) \right] ^\text {T}, \end{aligned}$$
(7)

and \({\textbf {R}}\) is a \(N_s \times N_s\) positive definite matrix with components \(R_{ij}= r\left( {\textbf {x}}_i,{\textbf {x}}_j,\varvec{\theta } \right)\).

Note in Eq. (6) that, once the regression model and the correlation function have been chosen, the Kriging interpolator is exclusively governed by the regression parameters \(\varvec{\kappa }\) and the correlation parameters \(\varvec{\theta }\). In the scope of this study, second-order polynomial regression functions are used to delineate the trend term, while Gaussian correlation functions are adopted for the stochastic term as [60]:

$$\begin{aligned} r\left( {\textbf {x}}_i,{\textbf {x}}_j,\varvec{\theta } \right) = \prod _{k=1}^{m} \exp \left[ -\theta _k \left( x_i^{(k)}-x_j^{(k)} \right) ^2\right] . \end{aligned}$$
(8)

The hyper-parameters \(\theta _k\) in Eq. (8) play a pivotal role in shaping the correlation function, potentially introducing anisotropy along the dimensions of \({\textbf {x}}\). Nevertheless, for the sake of simplicity, isotropic correlations are assumed in this study, i.e., \(\theta _k=\theta\) for all dimensions in \(1 \le k \le m\).

With the hyper-parameters \(\varvec{\theta }\) known, it becomes possible to compute the trend parameters \(\varvec{\kappa }\left( \varvec{\theta } \right)\) and the variance \(\sigma ^2 \left( \varvec{\theta } \right)\) as closed-form functions of \(\varvec{\theta }\) using the empirical best linear unbiased estimator (BLUE). Further details on this approach can be found elsewhere (refer e.g. to [58, 59]). On the other hand, estimating the hyper-parameters \(\varvec{\theta }\) often requires solving a non-linear optimization problem, with maximum likelihood estimation being a common approach. In this work, the iterative pattern search optimization algorithm implemented in the open-source Python library pydacefit [61] has been adopted.
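For illustration, fitting one such SM could look as sketched below, assuming the DACE-style interface of pydacefit, with the second-order polynomial trend and the Gaussian correlation of Eq. (8), and a single (isotropic) hyper-parameter \(\theta\) optimized by pattern search within the given bounds:

```python
# Sketch of one Kriging SM (e.g. for the first resonant frequency), assuming
# pydacefit's DACE-style interface.
from pydacefit.dace import DACE
from pydacefit.regr import regr_quadratic   # second-order polynomial trend
from pydacefit.corr import corr_gauss       # Gaussian correlation, Eq. (8)

sm_f1 = DACE(regr=regr_quadratic, corr=corr_gauss,
             theta=1.0, thetaL=1e-5, thetaU=100.0)  # isotropic theta and its bounds
sm_f1.fit(X, Y[:, 0])            # X: design sites, Y[:, 0]: FEM frequencies
f1_hat = sm_f1.predict(X_new)    # quasi-instantaneous surrogate prediction
```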

2.4 Model selection by Bayesian information criterion

The optimization problem outlined above critically depends on the selection of the fitting parameters in \({\textbf {x}}\). However, as mentioned in the introduction, there is no general procedure for making such a selection. Ideally, this selection should be conducted taking into account the potential failure mechanisms the structure may experience. Nevertheless, real-world structures may undergo a large variety of failures, and there is no general model that can reproduce all the different pathologies a structure may present. Therefore, specific models should be created for specific failure mechanisms. These models, organized into different model classes \(MC_i\), should be calibrated through the St-Id approach outlined in Eq. (3). Afterward, a proper model selection approach needs to be adopted to identify the most plausible model class. In this work, we propose the use of the BIC metric as a simple yet efficient approach. It facilitates model comparison and selection by simultaneously calibrating multiple structural models, each representing a different damage mechanism, and evaluating the models’ plausibility based on their goodness of fit to the experimental data and their complexity (number of fitting parameters). The BIC is formally defined for the i-th model class as [62]:

$$\begin{aligned} BIC_i = -2\ln \left( {\hat{L}}_i\right) + c\ln (l), \end{aligned}$$
(9)

where \({\hat{L}}_i\) is the maximized value of the likelihood function of the model class \(MC_i\), i.e. \({\hat{L}}_i = \text {p}({\textbf {y}}~\big |~\hat{{\textbf {{x}}}},MC_i)\) with \(\hat{{\textbf {{x}}}}\) the parameter set that maximizes the likelihood function, \({\textbf {y}}\) is the observed data, and c denotes the complexity of the model, that is, the number of parameters to be estimated (\(c=m\)). Under the assumption that the model errors in terms of frequencies and mode shapes are independent and identically distributed according to Gaussian distributions with equal variances across all the modes, the BIC in Eq. (9) can be rewritten as:

$$\begin{aligned} BIC = l \ln \left( \frac{RSS^f}{l}\right) + \left( n_{DOF}\cdot l\right) \ln \left( \frac{\sum _{j=1}^{n_{DOF}}RSS^{\varphi }_j}{n_{DOF}\cdot l}\right) + c\ln (l). \end{aligned}$$
(10)

The terms \(RSS^f\) and \(RSS^{\varphi }_j\) in Eq. (10), corresponding to the residual sum of squares error in terms of frequencies and modal displacements, respectively, can be expressed as:

$$\begin{aligned} RSS^f = \sum _{i=1}^{l} \left( f^{i}_{exp} - f^{i}_{model} \right) ^{2}, \quad RSS^{\varphi }_j = \sum _{i=1}^{l} \left( \varphi ^{i}_{j,exp} - \psi _i \, \varphi ^{i}_{j,model} \right) ^{2}, \end{aligned}$$
(11)

with \(\varphi ^{i}_{j,exp}\) and \(\varphi ^{i}_{j,model}\) denoting the j-th components of the i-th experimental and numerical mode shapes, respectively. The term \(\psi _i\) is a normalizing constant between the experimental and numerical mode shapes given by:

$$\begin{aligned} \psi _i = \frac{ \left( \varphi ^{i}_{model} \right) ^\text {T} \varphi ^{i}_{exp}}{ \left( \varphi ^{i}_{exp} \right) ^\text {T} \varphi ^{i}_{exp}}. \end{aligned}$$
(12)

Note that the BIC metric in Eq. (10) increases as both the error in the predictions and the number of fitting parameters increase. Therefore, after obtaining the BIC values for all the competing models or classes, the optimal model can be selected as the one with the lowest value. This model selection approach aligns with Occam’s Razor principle [63], which posits that among a population of competing models, the one with the fewest assumptions and the least complexity is the most plausible.
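A direct implementation of Eqs. (10)-(12) reduces to a few lines. The sketch below assumes the experimental and model modal data are arranged as NumPy arrays of shapes (l,) for frequencies and (l, n_DOF) for mode shapes:

```python
# Sketch of the BIC evaluation in Eqs. (10)-(12) for one calibrated model class.
import numpy as np

def bic(f_exp, f_mod, phi_exp, phi_mod, c):
    l, n_dof = phi_exp.shape
    rss_f = np.sum((f_exp - f_mod) ** 2)                       # Eq. (11), frequencies
    # normalizing constants psi_i between paired mode shapes, Eq. (12)
    psi = (np.einsum("ij,ij->i", phi_mod, phi_exp)
           / np.einsum("ij,ij->i", phi_exp, phi_exp))
    rss_phi = np.sum((phi_exp - psi[:, None] * phi_mod) ** 2)  # Eq. (11), summed over j
    return (l * np.log(rss_f / l)
            + n_dof * l * np.log(rss_phi / (n_dof * l))
            + c * np.log(l))                                   # Eq. (10)

# The most plausible model class is the one with the lowest returned value.
```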

3 Numerical results and discussion

The effectiveness of the proposed methodology is illustrated through two different case studies: a theoretical and a real full-scale structure. The first one is a planar truss structure serving as a control case study used to appraise the effectiveness and limitations of the proposed model selection approach, as well as its robustness to measurement noise. The second case study is the Muhammad Tower in the Alhambra fortress, which was recently instrumented by García-Macías et al. [64] with a long-term dynamic-based SHM system. This case study offers a realistic scenario to assess the proposed approach in terms of damage identification effectiveness and computational efficiency.

3.1 Case study I: 2-D truss structure

The investigated 25-bar planar truss structure (Fig. 2) is a benchmark case study for damage identification, previously utilized by Thanh Cuong-Le et al. [65]. The FEM of the structure is defined using planar two-node truss elements implemented in Python, leading to a total of 24 degrees of freedom. The mass density and the modulus of elasticity of the material are defined as 7500 kg/m\(^3\) and 210 GPa, respectively. All the elements of the structure are defined with a cross-section area of 18 cm\(^2\).

To discretize the mode shapes, five sensors monitoring the x- and y-directions are defined at nodes n\(_2\), n\(_4\), n\(_6\), n\(_9\) and n\(_{10}\) (\(n_{DOF}=10\)), as indicated with red arrows in Fig. 2. To assess the implications of limited experimental observability, analyses have been conducted considering eight (\(l=8\)) and sixteen (\(l=16\)) modal signatures. In both cases, all mode shapes have been normalized to maximum unit displacement.

Fig. 2: Case study I: truss structure. Red arrows indicate the position and direction of the sensors

Fig. 3: Ranking of the truss members of case study I according to mean sensitivity in terms of resonant frequency

Fig. 4: Model classes MC\(_1\) (a) to MC\(_4\) (d) for case study I: truss structure

To assess the effectiveness of the proposed model selection approach, different classes of competing models with increasing numbers of fitting parameters have been defined. To this aim, the bars of the truss structure have been ranked according to a preliminary sensitivity analysis reported in Fig. 3. In this analysis, the stiffness of each bar element was sequentially perturbed by 5%, calculating the mean sensitivities in terms of frequencies (\(\overline{S^f}\)) across the eight considered modes. The results in Fig. 3 guided the definition of different model classes MC\(_i\), \(i=1,\ldots ,4\), incorporating an increasing number of parameters of decreasing mean sensitivity. Specifically, four different model classes have been defined as depicted in Fig. 4, with the stiffness multipliers \(k_i\) of the bars highlighted in blue as the fitting parameters (\(x_i\)). Note that these model classes represent nested models, wherein each subsequent class includes the parameters of the previous class. For the FEM calibration in Eq. (3), the range of variation of the stiffness multipliers \(k_i\) is set to [0.7, 1.1], and the weighting factors \(\alpha\), \(\beta\) and \(\eta\) have been set to 1, 1, and 0, respectively, after manual tuning. Note that, although a value of \(k_i\) greater than one (\(x_i^0=1\)) may lack physical meaning in the context of damage identification (damage typically does not increase the stiffness of an element), this upper limit has been extended to address potential ill-conditioning limitations in the solution and prevent the algorithm from becoming stuck at the upper bound. The non-linear minimization problem in Eq. (3) is solved using a PSO algorithm with a population size of 40 particles, and the convergence threshold is set at 1E-5 for error tolerance. It is important to remark that, since the heuristic PSO optimization algorithm is adopted, no explicit use of the sensitivity matrix is required, the previous analysis being limited to ranking the bar elements.

Table 1 Definition of the bars affected by the four defined model classes, as well as the synthetic damage scenarios for Case Study I: Truss structure

A dense set of synthetic experimental data has been generated to assess the effectiveness of the proposed approach. Specifically, 150 simulations have been performed for every model class, randomly affecting the corresponding bars with stiffness reductions uniformly distributed in the range of 0 to 20% (\(k_i \sim {\mathcal {U}}\left( 0.8,1.0\right)\), see Table 1). Similarly to reference [54], to evaluate the robustness of the proposed approach to the presence of noise in the measurements, the modal displacements are contaminated with uniformly distributed random noise of increasing magnitude, \({\overline{\phi }}_{ij} = \phi _{ij}\left( 1+\eta \, {\mathcal {U}}\left( -1,1\right) \right)\), where \(\phi _{ij}\) and \({\overline{\phi }}_{ij}\) are the ij-th noise-free and noisy modal responses, respectively, and \(\eta\) denotes the noise level. Three noise levels \(\eta = 0\%\), \(10\%\), and \(20\%\) are considered in this study. On this basis, the proposed model selection approach is applied to all the simulated tests, extracting in every analysis the most plausible model class.
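For reference, the generation of one synthetic test follows the sketch below, where truss_mode_shapes is a hypothetical wrapper around the truss FEM returning the (l, n_DOF) matrix of modal displacements:

```python
# Sketch of one synthetic test: random stiffness reductions and noise.
import numpy as np
rng = np.random.default_rng(0)

n_params = 3                                  # fitting parameters of the sampled class
k = rng.uniform(0.8, 1.0, size=n_params)      # stiffness multipliers, Table 1
phi = truss_mode_shapes(k)                    # noise-free modal displacements
eta = 0.10                                    # noise level: 0.0, 0.10 or 0.20
phi_noisy = phi * (1.0 + eta * rng.uniform(-1.0, 1.0, size=phi.shape))
```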

Fig. 5: Successful classification rates [%] considering increasing noise levels (\(\eta = 0\%\), \(10\%\), and \(20\%\)) and experimental evidence (8 and 16 modes) for case study I: truss structure

Fig. 6: Unit-normalized probability distribution functions (\({\overline{PDF}}\)) of the fitting errors in terms of the objective function \(J({\textbf {x}})\) considering increasing noise levels (\(\eta = 0\%\), \(10\%\), and \(20\%\)) and experimental evidence (8 and 16 modes) for case study I: truss structure

The obtained classification results are depicted in Fig. 5 for the three considered noise levels, assuming observability of 8 or 16 vibration modes. Notably, the classification rate consistently improves as the number of fitting parameters decreases. This trend is attributed to the fact that the BIC penalizes model complexity, which becomes dominant over the prediction error when dealing with similar model parameterizations, as in this case with nested models. In such circumstances of limited experimental observability, adhering to Occam’s Razor, simpler models with fewer parameters that offer comparable prediction accuracy to more complex models are preferably selected. Indeed, note that as the number of experimental observations increases from the top (8 modes) to the bottom (16 modes) rows in Fig. 5, the classification results for the four model classes consistently improve. These results demonstrate that, in the context of limited experimental evidence, achieving accurate classification becomes increasingly challenging as model complexity increases. Furthermore, the results in this figure highlight the role of noise contamination in the measurements, with minimum successful classification rates dropping from 52% to 34% when eight modes are observable, and from 63% to 49% when sixteen modes are considered in the analysis.

Finally, Fig. 6 presents the unit-normalized probability distribution functions (\({\overline{PDF}}\)) of the fitting errors evaluated in terms of the objective function in Eq. (1). In this figure, the global errors obtained by selecting the most plausible model at every iteration are also depicted with thick gray lines. Note that, in most cases, regardless of the noise level and the number of observable modes, the proposed model selection approach tends to yield errors concentrated at low values. These results suggest the potential of the proposed multi-class St-Id approach for comprehensive model-driven damage identification, especially in structures prone to different failure mechanisms requiring multiple parameterizations.

3.2 Case study II: Muhammad tower

Fig. 7: Views of the Muhammad Tower (left: photo taken from the Mirador del Rey Chico; right: photo taken from the Patio de la Madraza de los Príncipes; source: legadonazari.blogspot.com, accessed December 17th 2022) (a). Plan and elevation views of the tower and sensor layout (b)

Fig. 8: Modal tracking of the first three global resonant frequencies of the Muhammad Tower from January 10th until March 31st 2022 (a), and the experimental mode shapes (b), from reference [64]

The second case study applies the methodology proposed in Sect. 2 to one of the towers of the Alhambra fortress, the Muhammad Tower (Fig. 7a) in Granada, Spain. The tower, originally constructed in the 13th century under the rule of Muhammad II, served a defensive purpose, controlling access to the palaces. It is seamlessly integrated into the walls of the Alhambra fortress, positioned between the Tower of the Cube and the Mexuar Palace. The structure has 1.3–1.9 m thick walls and rises to a height of 11.6 m above the floor, featuring an approximately rectangular cross-section (6.6 × 9.0 m). Constructed primarily with rammed earth (RE) and brick masonry, the tower has two vaulted floors and a rooftop terrace enclosed by a 0.80 m tall parapet and 1.2 m tall battlements. The three levels of the tower are connected by masonry staircases at the south-west façade. The tower’s foundations rest upon a geological formation of conglomerates with intercalated sands and clays from the Pliocene and Lower Pleistocene, known as the Alhambra Formation. Although there is evidence of multiple alterations over the centuries, official documentation of restoration efforts did not commence until the 1950s, when extensive restoration work was carried out after a prolonged period of neglect. This included the underpinning and stabilization of the tower’s foundations, a project led by the architect Francisco Prieto-Moreno Pardo in 1975.

Within a research project devoted to the risk assessment of the tower after the seismic swarm that occurred between February and August 2021, the tower was instrumented by the authors with a continuous vibration-based SHM system from January until March 2022. The monitoring system comprised 8 high-sensitivity piezoelectric accelerometers (PCB 393B31, 10.0 V/g ±5%, broadband resolution: 1 \(\upmu\)g rms, range ±0.5 g pk) installed on the three main levels of the tower, labeled A1 to A8 as sketched in Fig. 7b. This configuration was intended to characterize the rigid diaphragm motions of the floors and the global torsional rotations of the tower. The acceleration signals were recorded by a data acquisition system (DAQ) model LMS SCADAS, and stored in separate data files containing 30-min-long records with an acquisition frequency of 200 Hz. Environmental data were also acquired every 10 min by a nearby meteorological station, including the air temperature, relative humidity, wind speed, and atmospheric pressure.

Fig. 9: Three-dimensional FEM of the Muhammad Tower (a), and y-z (b) and x-z (c) sections (originally developed in [64])

Fig. 10: Crack patterns simulated by non-linear static analyses: Scenario 1 - pushover analysis in the N–S direction (a), and Scenario 2 - differential vertical settlement of the foundation (b)

Through automated OMA, the modal signatures of the tower were continuously extracted, as reported in [64]. Figure 8a displays the frequency tracking of the first three global modes of the tower during the monitoring period from January 10th to March 31st, 2022, comprising a total of 3233 acceleration records. This study focuses solely on the first three global modes of the structure (Fx, Fy and Tz), with local modes induced by the motion of the battlements of the terrace being disregarded (refer to reference [64] for an in-depth discussion). Fy and Fx correspond to first-order bending modes in the N–S and W–E directions of the tower, respectively, while Tz is the first torsional mode of the tower, as shown in Fig. 8b. The detailed views in Fig. 8a also reveal the presence of significant daily oscillations, suggesting the influence of environmental factors on the tower’s global modes, especially on Fx. Furthermore, it is worth noting that there were periods when the monitoring system was disrupted due to electrical power shortages, occurring from mid-January to mid-February and twice in March.

For the generation of the multi-class digital model of the tower, we retrieved the ABAQUS FEM (Fig. 9) developed and calibrated in our previous work [64]. Specifically, the model encompasses the main body of the tower with linear springs representing the constraints exerted by the surrounding walls of the fortress. The elastic modulus and mass density of the material were calibrated through linear sensitivity analysis of a comprehensive FEM that considered the surrounding walls, serving as the base model for developing the more computationally efficient FEM of the main body of the tower adopted herein. In the latter, the stiffness of the linear springs, replacing the surrounding walls, was tuned manually (for further details, refer to Section 6.2 in [64]). The resulting FEM yielded a maximum relative error in terms of resonant frequencies of 2.11%, as reported below in Table 2, which is deemed sufficient for the generation of the SMs in this work without requiring further (computationally intensive) heuristic calibration. The model was meshed using solid C3D8 linear elements with a mean dimension of about 20 cm, resulting in a total of 105,922 nodes and 555,883 elements. Note that this numerical model is computationally intensive, requiring approximately 3 min to complete a linear modal analysis. Therefore, the use of a computationally efficient SM becomes imperative to solve the inverse model calibration problem in Eq. (3).

Table 2 Frequency decays and MAC values with respect to the undamaged condition for the simulated damage scenarios in the Muhammad Tower. The experimental modal signatures were obtained by automated OMA of the first 30 min of ambient vibrations recorded on January 10th 2022, 10:00 a.m.

The initial reference FEM is used to generate different model classes representing distinct failure mechanisms. To this aim, two different synthetic damage scenarios have been defined through non-linear simulation analyses. In both cases, the simulation consists of four sequential steps involving: (i) gravity loading; (ii) incremental imposed displacements; (iii) release of imposed displacements; and (iv) modal analysis using linear perturbation. The first damage scenario represents a pushover analysis along the N–S direction. In this case, a parabolic profile of imposed displacements is applied to the tower until a maximum top displacement of 1.7 cm is reached. The second scenario simulates a condition of differential foundation settlement. In this case, a linear profile of displacements covering an area of 6.3 m\(^2\) (12.3%) is imposed at the foundation until achieving a maximum settlement of 0.7 cm. The non-linear behavior of the RE is simulated using the Concrete Damage Plasticity (CDP) model implemented in ABAQUS, considering the constitutive properties reported in [64]. This model allows reproducing the cracking- and crushing-induced losses of stiffness through two scalar fields \(d_t\) and \(d_c\) (\(d_c,d_t=0\) undamaged, and \(d_c,d_t=1\) fully damaged), as well as reproducing the damage-induced effects upon the modal properties of the tower through linear perturbation analysis. In this work, since tensile cracking dominates over compression crushing, compression-induced stiffness losses are disregarded. The crack patterns obtained for the two simulated scenarios are reported in Fig. 10a and b for the pushover analysis and the differential settlement simulation, respectively. Table 2 shows the frequency decays and MAC values with respect to the initial (undamaged) FEM for each damage scenario.

Fig. 11: Model classes defined for the FEM of the Muhammad Tower: (a) MC\(_1\), (b) MC\(_2\), and (c) MC\(_3\)

Three different model classes, each with three fitting parameters, have been defined as reported in Fig. 11. The first model class MC\(_1\) represents a general-purpose parameterization (equivalent to the single parameterization used in [64]), in which the main body of the tower is discretized through three macro-elements M\(_i\) along its height (Fig. 11a). The other two parameterizations are defined based on the two non-linear simulation analyses outlined above. Specifically, the model classes MC\(_{2}\) and MC\(_{3}\) reproduce the earthquake-induced damage in the N–S direction of the tower and the differential settlement of the foundation, as illustrated in Fig. 11b, c, respectively. In these cases, the macro-elements M\(_i\) have been defined by grouping sets of elements affected by the corresponding damage scenario (\(d_t>0.9\)) and forming the main developed cracks. Across these three discretizations into macro-elements M\(_i\), the fitting parameters have been defined as stiffness multipliers \(k_i\), \(i=1,\ldots,3\), affecting the elastic moduli of the corresponding elements.

Given the large computational cost of the FEM parameterizations, it becomes indispensable to replace them with Kriging SMs compatible with continuous St-Id. The input and output variables of the Kriging meta-models are the stiffness multipliers \(k_i\) of the corresponding macro-elements and the modal signatures (resonant frequencies and mode shapes) of the tower, respectively. The stiffness multipliers are assumed to be uniformly distributed within domains of variation \({\mathbb {D}} = \left\{ {\textbf {x}} \in {\mathbb {R}}^3: 0.7 \le k_i \le 1.1 \right\}\), \({\mathbb {D}} = \left\{ {\textbf {x}} \in {\mathbb {R}}^3: 0.5 \le k_i \le 1.1 \right\}\), and \({\mathbb {D}} = \left\{ {\textbf {x}} \in {\mathbb {R}}^3: 0.4 \le k_i \le 1.1 \right\}\) for the three model classes MC\(_1\), MC\(_2\), and MC\(_3\), respectively, as defined above in Fig. 11. Note that these variation ranges are considerably large, with the lower bounds of 0.7, 0.5, and 0.4 meaning reductions of 30%, 50%, and 60% of the elastic modulus of the affected macro-elements, respectively. The maximum value in the three domains of variation (\({\mathbb {D}}\)) has been set to 1.1 in the three model classes. Similar to the previous case study, this value is established to address potential issues related to ill-conditioning. Additionally, having an upper limit above 1.0 (nominal stiffness, \(x_i^0=1\), \(i=1,\ldots,3\)) becomes fundamental to accommodate the EOC-induced oscillations in the resonant frequencies of the tower, as previously reported in Fig. 8a.

Fig. 12: Comparison between the predictions of the forward FEM and the Kriging SMs for the Muhammad Tower: MC\(_{1}\) (a), MC\(_{2}\) (b), MC\(_{3}\) (c)

Random samples have been uniformly generated across the domains of variation (\({\mathbb {D}}\)) using the Sobol quasi-random sequence. Specifically, two EDs have been generated: one of 160 samples for training the SMs, and an independent validation set of 200 samples. The sizes of these EDs were determined based on the convergence analyses reported in our previous work (refer to [64]). For each individual in the EDs, the modal signatures are obtained by performing a forward evaluation of the 3D FEM. As indicated above, only the first three global modes of the tower are considered in the analysis. Consequently, a total of 27 SMs are constructed for each model class (3 resonant frequencies plus \(8 \times 3\) modal displacements). The comparison between the predictions of the SMs and the forward FEMs is shown in Fig. 12. This figure depicts the forward evaluations of the first three resonant frequencies by the 3D FEM versus the predictions of the SMs for the three model classes. In this figure, to appraise the quality of the SMs in estimating the mode shapes of the tower, a metric \(J_{MAC,i}\), which accounts for the median value of the \(1-\text {MAC}\) values between the \(i-\)th experimental mode shape \(\varphi ^i_{exp}\) and the prediction by the SMs \(\varphi ^i_{model}\) in the validation set, is introduced as:

$$\begin{aligned} J_{\text {MAC},i} = \text {med} \left\{ 1 - \text {MAC} \left( \varphi ^i_{exp},\varphi ^i_{model} \right) \right\} . \end{aligned}$$
(13)

The minimal dispersion of data points in Fig. 12 around the diagonal lines confirms that the SMs reproduce the forward FEM with a high degree of accuracy. This is further supported by the coefficients of determination \(R^2\), which are all close to one, as well as by the low values of the root-mean-squared errors (RMSE) and \(J_{MAC,i}\) metrics, on the order of \(10^{-4}\) for all the model classes. It is important to emphasize that the average evaluation time of the resulting multi-class digital model is 1.68 min (covering the three model classes), representing a 44% reduction compared to that of the forward FEM. Such high computational efficiency is critical for applicability within a continuous SHM scheme, as shown hereafter.
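For completeness, the validation metrics reported in Fig. 12 can be computed as sketched below; phi_fem_i and phi_sm_i are hypothetical lists holding, for mode i, the FEM and SM mode shapes over the 200 validation samples:

```python
# Sketch of the SM validation metrics: R^2, RMSE, and J_MAC,i of Eq. (13).
import numpy as np

def r2_rmse(y_fem, y_sm):
    # coefficient of determination and root-mean-squared error
    res = y_fem - y_sm
    r2 = 1.0 - np.sum(res ** 2) / np.sum((y_fem - y_fem.mean()) ** 2)
    return r2, np.sqrt(np.mean(res ** 2))

def mac(a, b):
    # Modal Assurance Criterion between two mode-shape vectors
    return (a @ b) ** 2 / ((a @ a) * (b @ b))

# J_MAC,i: median 1-MAC over the validation set for mode i, Eq. (13)
j_mac_i = np.median([1.0 - mac(p_fem, p_sm)
                     for p_fem, p_sm in zip(phi_fem_i, phi_sm_i)])
```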

Once the SMs have been accurately constructed, the multi-model SL damage identification approach introduced in Sect. 2.1 is applied to the experimental time series of modal features extracted from January 10th until March 31st 2022. Prior to the St-Id, the multiple linear regression (MLR) model previously defined in reference [64] has been retrieved and used to minimize the effects of EOC on the time series of resonant frequencies previously reported in Fig. 8. In that work, correlation analyses with environmental data revealed that the most effective combination of predictors for the MLR model comprised air temperature AT, humidity H, and derived quantities, including \(AT^2\), \(H^2\), and moving averages of AT with time windows of 48 (1 day) and 1344 (1 month) data points. For the training of the MLR model, and given the restricted quantity of monitoring data available, the training period was set from January 10th until February 27th 2022 (2200 data points). After construction, the cleansed time series of resonant frequencies were determined by adding the average resonant frequencies within the training period to the residuals between the experimental data and the MLR model predictions (for further details, readers are referred to Section 6.1 in [64]).
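A minimal sketch of this cleansing step is given below, assuming the frequency track f and the predictors at and h are pandas Series sharing a common 30-min datetime index; the predictor set and training window follow [64]:

```python
# Sketch of the MLR-based EOC cleansing of one frequency time series.
import numpy as np
import pandas as pd

X = pd.DataFrame({"AT": at, "H": h, "AT2": at ** 2, "H2": h ** 2,
                  "AT_ma1d": at.rolling(48, min_periods=1).mean(),    # 1-day window
                  "AT_ma1m": at.rolling(1344, min_periods=1).mean()}) # 1-month window
X.insert(0, "const", 1.0)                       # intercept
tr = f.index <= "2022-02-27"                    # training period (2200 points)
beta, *_ = np.linalg.lstsq(X[tr].values, f[tr].values, rcond=None)
f_clean = f[tr].mean() + (f - X.values @ beta)  # training mean plus residuals
```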

Fig. 13: Time series of identified stiffness multipliers \(k_i\) of macro-elements M\(_i\), \(i=1,\ldots ,3\) for the three model classes, MC\(_1\) (a), MC\(_2\) (b), and MC\(_3\) (c), defined in the Muhammad Tower under the synthetic damage Scenario 1. The subfigures on the right-hand side depict the corresponding probability density functions (PDFs) of the data points during the training period (dashed lines) and the damage period (solid lines)

Fig. 14: Time series of BIC values for the three model classes MC\(_i\) defined in the Muhammad Tower for two synthetic damage scenarios: pushover-induced damage (a), and differential foundation settlement (b)

Since the tower experienced no damage during the monitoring period, a set of synthetic damage scenarios has been generated to test the proposed multi-model SL damage identification approach. These scenarios are defined based on the previously reported frequency decays in Table 2, which were determined from the non-linear simulation of the forward FEMs. Specifically, the frequency decays are incorporated into the time series of resonant frequencies after March 7th 2022 (beyond the training period of the MLR model). Given that the considered damage scenarios have minimal impact on the tower’s mode shapes (as indicated by the MAC values in Table 2), the time series of experimental modal displacements remained unaltered. On this basis, the non-linear minimization problem outlined in Eq. (3) is solved using the PSO algorithm with a swarm of 50 particles and an error tolerance of 1E-5. In the regularization term from Eq. (1), the reference vector of design variables is defined as \({\textbf {x}}^0=[1,1,1]^\text {T}\) for the three model classes. This reference vector, \({\textbf {x}}^0\), characterizes the scenario where the macro-elements M\(_i\) possess their undamaged nominal Young’s moduli (i.e. \(k_i=1\), \(i=1,\ldots ,3\)). In the inverse model calibration, the weighting coefficients \(\alpha\), \(\beta\) and \(\eta\) in Eq. (1) in Sect. 2.2 have been set to 1, 2 and 2, respectively, after manual tuning.

Figure 13 displays the time series of the identified stiffness multipliers \(k_i\) for the three model classes under the first synthetic damage scenario. Furthermore, this figure also furnishes the corresponding probability density functions (PDFs) of the data points in both the training and the damaged periods. It is evident in these results that the stiffness multipliers \(k_i\) of the three model classes exhibit sharp decays after the introduction of damage. These results highlight the risks associated with single-model St-Id applications, which provide no means for assessing the plausibility of the damage identification results. This can lead to misinterpretations of the affected regions, undermining the subsequent decision-making. In contrast, the proposed model selection approach allows for the comparison of competing models, providing a measure of plausibility of the different considered failure mechanisms. To illustrate this, the time series of BIC values obtained from January 10th to March 31st, 2022, for the three model classes under the synthetic damage scenarios 1 and 2 are shown in Fig. 14a and b, respectively. It is evident in these figures that, prior to introducing the damage-induced frequency decays, the BIC values are nearly identical across the three model classes, indicating that the three models are equally probable. Nevertheless, once the damage-induced frequency decays are introduced after March 7th 2022, the model classes exhibit noticeably distinct BIC values. Specifically, it is clear that for the first damage scenario (Fig. 14a), MC\(_2\) exhibits the lowest BIC value, which agrees with the forward FEM used for generating the synthetic damage. Similarly, MC\(_3\) is selected as the most plausible model, with the lowest BIC value, for damage scenario 2 (Fig. 14b). Therefore, these results demonstrate the effectiveness of the proposed procedure for conducting multi-class damage identification.

4 Concluding remarks

This work has presented a novel supervised damage identification approach that exploits the concept of multi-class digital models. In contrast to classical single-model FEM updating, the proposed approach simultaneously calibrates several models, each one reproducing a distinct failure mechanism. Then, a straightforward model selection approach, which analyzes the BIC values of the competing models, allows for the identification of the most plausible model to explain the damage condition. The efficacy of the proposed approach has been appraised through two different case studies: a numerical benchmark and a real historical tower. In both case studies, the presented numerical results and discussion have evidenced the potential of the proposed methodology for identifying multiple damage pathologies. The first case study has been used to assess the effectiveness of the proposed approach in the presence of noise. Additionally, a discussion on the implications of limited experimental evidence has been included. The second case study has illustrated the feasibility of the field implementation of the proposed methodology through the use of computationally efficient surrogate models (SMs). The presented results and discussion have demonstrated the potential of the proposed approach for long-term full damage identification (detection of the damage pathology, localization, and quantification) through a series of synthetic damage scenarios generated via non-linear simulations. The key findings of this work can be summarized as follows:

  • The investigated benchmark case study has highlighted that the use of BIC values as the criterion for model selection tends to favor simpler model parameterizations with a reduced set of parameters, as a direct consequence of Occam’s Razor. Future developments should address the consideration of more sophisticated model selection approaches, possibly accounting for the probability of occurrence of the considered failure mechanisms.

  • The simultaneous calibration of multi-class digital models is a powerful technique for improving damage identification, providing not only an assessment of the model parameters but also a means to gauge the plausibility of the inference. The compatibility of the St-Id of multiple competing models has been made possible thanks to the consideration of computationally inexpensive SMs.

  • Through synthetic damage scenarios generated via non-linear simulations, the presented results have demonstrated the potential of the proposed methodology for full damage identification. This includes the detection of the damage pathology by measuring its plausibility, precise localization of damage within the structure, and accurate quantification of its extent.

The proposed multi-class supervised damage identification approach has the potential to significantly influence decision-making in structural maintenance, representing an important breakthrough for the extensive technology transfer of St-Id. Future research should focus on incorporating larger parameter spaces and exploring larger populations of competing models. Additionally, an exciting opportunity to extend this concept lies in the utilization of advanced Bayesian techniques. Incorporating these approaches has the potential to provide more robust frameworks for solving the model selection problem, accounting for the intrinsic uncertainties in the model parameters. This may ultimately bolster the reliability of the damage identification, providing the subsequent decision-making process with a probabilistic assessment of the location and extension of the damage.