1 Motivation

Modern societies are highly dependent on a broad variety of complex capital goods, including aircraft engines, industrial plants, and infrastructure systems (Salomon et al. 2020). Aircraft engines, for example, are of paramount importance for both private mobility and the industrial transportation sector of these societies. For economic and safety reasons, it is vital that such complex systems are as reliable as possible (Salomon et al. 2021b). To ensure this efficiently and sustainably, the Collaborative Research Center “Regeneration of Complex Capital Goods” (CRC 871) investigates scientific fundamentals for the maintenance, repair and overhaul (MRO) of these complex capital goods, especially in the field of civil aviation, as proposed in Denkena et al. (2019). These systems are exposed to various threats, and it is extremely challenging to identify all possible critical impacts and prevent them accordingly. Therefore, recent developments focus not only on enhancing the reliability and robustness of these systems, but also on increasing their recoverability. This has led to the concept of resilience, which comprises all of these aspects, cf. Salomon et al. (2020).

The information required as a basis for the design, maintenance, and repair of systems is commonly governed by uncertainties (Salomon et al. 2021b). Thus, for decision making in such processes it is critical to have tools capable of efficiently performing resilience and reliability analyses of complex systems while taking precisely these uncertainties into account comprehensively. A further major concern in MRO processes is not only the identification of direct influences of individual components, but, more importantly, of complex and elusive interaction effects among multiple components and their impact on the key performance measures of the capital good under investigation (Salomon et al. 2021a). Global sensitivity analyses are a well-established tool in this specific context.

The current work addresses the development of a computationally efficient theoretical and algorithmic framework for evaluating the resilience and reliability of complex capital goods under consideration of uncertainty, in order to support decision making in MRO processes. Correspondingly, the following guiding principles were defined:

  • guarantee of resilience before, during and after regeneration, and in particular on the functionality of the complex capital good;

  • consideration of monetary and technical constraints;

  • quantification of uncertainties during regeneration;

  • identification of regeneration paths improving resilience such that technical and economic risks are minimized.

Typically, complex physical models are employed to derive conclusions in system engineering. A large number of model evaluations is required to analyze and optimize the performance of a system and to guarantee it continuously. However, the utilization of a physical model for repeated evaluations with varying model parameters is often accompanied by an enormous computational burden. Thus, the derivation of a function-based model from the complex physical model, mapping the core properties of interest and thereby reducing computational effort, is proposed in the current work, as illustrated in Fig. 1. After the generation of such a functional model, the developed resilience assessment framework is applied to derive additional information for the decision making process.

Fig. 1
A flow diagram starts with a representative system model that maps to resilience and risk analysis for all regeneration paths, identification of resilient and low-risk regeneration paths, and additional basis for decision making.

Objectives and corresponding work flow

2 Scope of the Paper

Given the principles above, four key objectives are formulated and addressed in the subproject D5 “Resilience-based Decision Criteria for Optimal Regeneration” of the CRC 871:

  1. the establishment of a comprehensive function-based modular system modeling approach of the overall engine for resilience and reliability assessment;

  2. efficient dynamic system modeling in dependence of operating states via the concept of survival signature for enhanced computational efficiency;

  3. the development of models for mixed – aleatoric and epistemic – uncertainty and the utilization of simulation methods reducing computational effort for sampling;

  4. the identification of resilient regeneration paths.

More precisely, this means that at first a representative system model is extracted from a physical simulation model, e.g., of an aircraft engine, by the utilization of a sensitivity analysis. The corresponding findings are presented in Sect. 3. As illustrated in Fig. 1, the resulting functional model is the basis for further in-depth analysis and investigation.

Given the functional model of an arbitrary, complex capital good, the comprehensive resilience analysis includes the parts illustrated in Fig. 2. At the top level a resilience analysis forms the fundamental frame for the resilience assessment of complex capital goods, evaluating all possible regeneration paths. Subsequently, limiting technical and monetary constraints are taken into account and a reduced set of acceptable resilient and low-risk regeneration paths is identified.

Fig. 2
A flow diagram starts with resilience analysis that maps to limiting constraints, reliability analysis, uncertainty analysis, and additional information for decision-making.

Work flow in the analysis framework

The reliability analysis based on the concept of survival signature, introduced in Coolen and Coolen-Maturi (2013) for enhanced computational efficiency, especially in the case of repeated model evaluations, is integrated into the resilience analysis for each regeneration path considered. This leads to a significantly reduced computational effort during resilience analyses of the considered complex capital goods. The additional uncertainty analysis allows the consideration of diverse uncertainties by utilizing newly developed, highly efficient algorithms, see Salomon et al. (2021b), that reduce the sample size tremendously.

The resilience analysis is introduced in Sect. 4 and the consideration of monetary constraints is demonstrated in Sect. 5. Further, an efficient approach for the integrated reliability analysis is proposed in Sect. 6. The uncertainty analysis considered in Sect. 7 forms the last part and allows for a computationally efficient uncertainty quantification when it comes to mixed uncertainties. As a result, an additional basis for decision making in the virtual level of the regeneration process management is obtained taking into account systemic interactions and uncertain data.

3 Functional Modeling Approach

In the current section, developments concerning various functional models and their generation, as illustrated in Fig. 1, are presented. In the context of the CRC 871, a fundamental procedure was established to generate a functional model based on a sensitivity analysis utilizing Sobol indices, see Miro et al. (2019). However, these developments focused on binary-state systems. In the current work, the approach proposed in Miro et al. (2019) is further developed for the consideration of multi-state systems. In addition, an alternative sensitivity measure is considered, allowing for the incorporation of interdependencies between various input parameters and enabling a more comprehensive and realistic system modeling. Once derived from the physical model, the functional model is investigated in the analysis framework that was outlined in Sect. 2 and is presented in the subsequent sections.

3.1 Extraction of Structure Functions Based on Sensitivity Analyses

As a fundamental step, the methodology to derive a functional model from a physical simulation is presented. Within the CRC 871, Miro et al. (2019) proposed a procedure to extract a functional model from a performance model of a multistage axial compressor. A multistage compressor combines multiple rotor and stator blade rows in an alternating series of connected stages. It was shown that various performance measures depend on the blade roughness. Following Miro et al., the blade surface roughness is considered as input variable for further analysis. The four-stage high-speed axial compressor of the Institute of Turbomachinery and Fluid Dynamics at Leibniz University Hannover is the baseline compressor of this study, consisting of four stator rows S1 - S4 and four rotor rows R1 - R4.

Miro et al. established a functional model based on the results of a sensitivity analysis considering Sobol indices, see Sobol (2001), of a one-dimensional aerodynamic simulation model of that axial compressor. They chose a variance threshold of 25% based on expert knowledge, see Fig. 3. Correspondingly, the system is considered to fail due to roughness-related effects if a 25% total variation of the system performance measure, estimated via Monte Carlo Simulation (MCS), is reached.

Fig. 3
A bar graph plots the index value versus rotor and stator blade roughness. The index value for R 1 is 0.12, R 2 is 0.119, R 3 is 0.2, R 4 is 0.4, S 1 is 0.002, S 2 is 0.002, S 3 is 0.00625, and S 4 is 0.001. All values are approximate.

Component importance measure with threshold of 25%, adopted from Miro et al. (2019)

The functional model developed by Miro et al. describes the dependence of the overall compressor performance, i.e., the total-to-total isentropic efficiency, on the roughness of the rotor and stator blades as binary-state structure function in the form of a Reliability Block Diagram (RBD), see Fig. 4. According to the concept of RBDs, the system functions if there exists a connection between start and end node and fails if this connection is interrupted, corresponding with a performance variation of at least 25%.

Fig. 4
A block diagram. R 1 connects to R 2, which are parallel to R 3, merge to R 4, to S 1, S 2, S 3, and S 4, which are parallel to each other.

Functional model of the multistage high-speed axial compressor

In the approach proposed by Miro et al., each row is assigned one of four component types ci for i = 1, … , 4. Components of the same type follow identical distributions describing degradation while being independent of each other. The classification of the component types and the arrangement of the components in the RBD are chosen based on the sensitivity of the component blade roughness with respect to the total-to-total isentropic efficiency. Figure 3 shows the sensitivity results that are based on Sobol indices. Components with similar sensitivity values are assigned to one component type. In the current work, the component type allocation suggested by Miro et al. is adopted. Correspondingly, the stator and rotor rows are assigned as (R1, c1), (R2, c1), (R3, c2), (R4, c3), (S1, c4), (S2, c4), (S3, c4), (S4, c4).

The arrangement of the components is established as follows: If the sensitivity value of a single component exceeds the threshold, it is set in series with the other components exceeding the threshold, due to its significant importance to the overall system performance; if only a sum of sensitivity values exceeds the threshold, the corresponding components are set in parallel and then linked in series. For example, component R4 exceeds the threshold alone and is therefore considered the most important component. Thus, the system should fail if component R4 fails, i.e., if R4 exceeds a critical roughness and, due to that, the roughness-related performance variation of the system exceeds the threshold. Further, R1 or R2 only exceed the threshold in sum with R3. Correspondingly, the functioning of this subsystem is described as \((R1 \vee R3)\; \wedge \;(R2\; \vee \;R3)\; = \;(R1\; \wedge \;R2)\; \vee \;R3\), as shown in Fig. 4, where R1, R2, R3 ∈ {0, 1}. Following this idea, all stators should be arranged in parallel, as each of them has only a small impact. This parallel block S = S1 \(\vee\) S2 \(\vee\) S3 \(\vee\) S4 is again in series with the R1, R2, R3 block as well as with R4. Miro et al. argued, based on expert knowledge, that the parallel stator block should be allocated in series as shown in Fig. 4, even though the sum of their sensitivity values does not reach the threshold. To summarize, the entire system and its functional state is described by F = ((R1 \(\wedge\) R2) \(\vee\) R3) \(\wedge\) R4 \(\wedge\) (S1 \(\vee\) S2 \(\vee\) S3 \(\vee\) S4) with F ∈ {0, 1} and the system components R1, R2, R3, R4, S1, S2, S3, S4 ∈ {0, 1}.

In the current work, this approach is adapted to a multi-state system and multi-state component consideration in order to prove the applicability of the functional modeling approach in the context of partial functionality. Thereby, suppose the system is functioning in state j or above if the j-th rule is satisfied for components in state j or above, with j = 1, … , J. For illustrative purposes, four rules are defined as structure functions represented by RBDs, and thus J = 4. Correspondingly, four thresholds are determined as the basis to generate four structure functions corresponding to four levels, see Fig. 5.

Fig. 5
A bar graph plots the index value versus rotor and stator blade roughness plots the value of R 1 as 0.12, R 2 as 0.119, R 3 as 0.2, R 4 as 0.4, S 1 as 0.002, S 2 as 0.002, S 3 as 0.00625, and S 4 as 0.001. All values are approximate.

Component importance measure with thresholds of 2.5, 7.5, 15 and 30%

Figure 6a shows the structure function via an RBD for the system state of perfect functioning j = J = 4. The corresponding threshold is set to 2.5%. The components R1, R2, R3, R4 and S3 exceed this threshold, while components S1, S2 and S4 only go beyond the threshold if summed up. Thus, R1, R2, R3, R4 and S3 are connected in series, while S1, S2 and S4 are connected in parallel and then set in series. This seems reasonable as the components S1, S2 and S4 do not have a critical impact, while the components R1, R2, R3, R4 and S3 definitely harm the perfect state due to their significant influence on the system performance variation.

Fig. 6
Four diagrams present functional models for level j equals 4, j equals 3, j equals 2, and j equals 1.

RBDs for different performance levels

Figure 6b shows the structure function via an RBD for the intermediate system state j = 3. The corresponding threshold is set to 7.5%. The components R1, R2, R3 and R4 exceed this threshold, while components S1, S2, S3 and S4 only go beyond the threshold if summed up. Thus, R1, R2, R3 and R4 are connected in series, while S1, S2, S3 and S4 are connected in parallel and then set in series.

Figure 6c shows the structure function via an RBD for the intermediate system state j = 2. The corresponding threshold is set to 15%. The components R3 and R4 exceed this threshold. Thus, R3 and R4 are connected in series. In contrast, the components R1 and R2 as well as the parallel stator block S only exceed the threshold if at least two of them are summed up. Based on the series connection of R3 and R4, the system hence functions in state j = 2 if at least two of R1, R2 and S take a value of 1, i.e., function in state j = 2 or above. Consequently, this relationship is modeled via an at-least-2-out-of-3-connection.

Figure 6d shows the structure function via an RBD for the last system state j = 1 before complete failure. The corresponding threshold is set to 30%. Note that all thresholds are set arbitrarily and only for illustrative purpose. The component R4 goes beyond this threshold and is connected in series. The components R1, R2 and S only exceed this threshold if summed up with R3. In case of functioning it holds that (R1 \(\vee\) R3) \(\wedge\) (R2 \(\vee\) R3) \(\wedge\) (S \(\vee\) R3) = (R1 \(\wedge\) R2 \(\wedge\) S) \(\vee\) R3.
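The four level rules above can be collected into a small sketch that returns the highest satisfied level. This is a hedged illustration under the simplifying assumption that each component argument is passed as a 0/1 indicator of being in state j or above; the function names are freely chosen here, not part of the original approach.

```python
# Sketch of the four multi-state level rules (Fig. 6) as Boolean structure
# functions. Inputs are 0/1 indicators of components being in state j or
# above (a simplifying assumption); names are illustrative.

def level_rule(j, R1, R2, R3, R4, S1, S2, S3, S4):
    S = S1 or S2 or S3 or S4                 # parallel stator block
    if j == 4:  # perfect functioning, threshold 2.5%
        return R1 and R2 and R3 and R4 and S3 and (S1 or S2 or S4)
    if j == 3:  # threshold 7.5%
        return R1 and R2 and R3 and R4 and S
    if j == 2:  # threshold 15%: at-least-2-out-of-3 of {R1, R2, S}
        two_of_three = (R1 and R2) or (R1 and S) or (R2 and S)
        return R3 and R4 and two_of_three
    if j == 1:  # threshold 30%
        return R4 and ((R1 and R2 and S) or R3)
    raise ValueError("level j must be in {1, 2, 3, 4}")

def system_state(*components):
    """Highest level j whose rule holds; 0 denotes complete failure."""
    for j in (4, 3, 2, 1):
        if level_rule(j, *components):
            return j
    return 0

# Intact system is in the perfect state
print(system_state(1, 1, 1, 1, 1, 1, 1, 1))  # 4
```

For example, under this sketch a failure of S3 alone degrades the system from level 4 to level 3, while a failure of R4 alone drives it to complete failure, mirroring the series positions in Figs. 6a-d.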

The obtained binary-state structure functions represented in Figs. 4 and 6 are utilized for multi-state system reliability analysis. The corresponding findings are presented in Sect. 6.3.

3.2 Kucherenko Indices

Typically, Sobol indices are utilized for conducting sensitivity analysis as, e.g., proposed in Miro et al. (2019). These variance-based indices display effects of single input variables on output variables (first-order effect indices), and interaction effects between several input variables and their impact on the output variables (total-effect indices). The Sobol indices, as well as most other sensitivity analysis tools, are based on the assumption that all input variables are independent of each other. However, this assumption rarely applies in reality, and in various engineering fields, input variables are correlated, see e.g., Jacques et al. (2006); Keitel and Dimmig-Osburg (2010). Therefore, in this work, a sensitivity analysis of the above mentioned steady-state performance model for an aircraft engine is conducted by applying a generalized form of the Sobol indices according to Kucherenko et al. (2012), hereinafter referred to as Kucherenko indices. These indices are capable of taking into account dependencies between input variables and are therefore more suitable for addressing real world problems.

In practice, an analytical determination of the Kucherenko indices is often not feasible. Therefore, Kucherenko et al. presented Monte Carlo estimators for their indices in their work, Kucherenko et al. (2012). Both estimators require a conditional sampling. Conditional sampling, however, might be tedious or even impossible for some models due to computational demand, such as for the jet engine iteration matching model, considered in this work. Therefore, in Marelli et al. (2019), Marelli et al. provide sample-based Monte Carlo estimators for both Kucherenko indices.
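For intuition: in the special case of independent inputs, the first-order Kucherenko index reduces to the classical first-order Sobol index, which admits a simple pick-freeze Monte Carlo estimator avoiding explicit conditional sampling. The following sketch illustrates that idea only; the toy model and all names are assumptions made here, and this is not the sample-based estimator of Marelli et al.

```python
import numpy as np

# Pick-freeze sketch of a first-order sensitivity index for independent
# uniform inputs (where the Kucherenko index coincides with the Sobol
# index). Model f and dimensions are illustrative assumptions.

def first_order_index(f, d, i, n=100_000, rng=None):
    rng = np.random.default_rng(rng)
    A = rng.random((n, d))            # first independent input sample
    B = rng.random((n, d))            # second independent input sample
    B_Ai = B.copy()
    B_Ai[:, i] = A[:, i]              # "freeze" coordinate i taken from A
    yA, yB, yAB = f(A), f(B), f(B_Ai)
    f0 = np.mean(np.concatenate([yA, yB]))
    var = np.var(np.concatenate([yA, yB]))
    return (np.mean(yA * yAB) - f0**2) / var

# Additive toy model Y = X0 + 2*X1 on [0,1]^2: analytically S0 = 1/5, S1 = 4/5
f = lambda X: X[:, 0] + 2.0 * X[:, 1]
s1 = first_order_index(f, d=2, i=1, rng=0)   # close to the analytic 4/5
```

For dependent inputs this pick-freeze construction is no longer valid, which is precisely why the conditional and sample-based estimators of Kucherenko et al. and Marelli et al. are needed.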

As an example, consider the V2500-A1 jet engine, a two-spool turbofan with a fan, low-pressure compressor (LPC), high-pressure compressor (HPC), high-pressure turbine (HPT), low-pressure turbine (LPT) and a common thrust nozzle. This jet engine was considered in Salomon et al. (2021a) to show the applicability of a sophisticated sensitivity measure for an entire jet engine. In cooperation with the subproject D6 of the CRC 871, concerning module interactions and the overall system behavior, the first-order effects for the six considered output quantities with respect to the five varying input efficiencies of the main turbomachines are determined by utilizing the sample-based Monte Carlo estimators. The corresponding results are shown in Fig. 7. It can be seen that, among all of the direct effects on output variances, the variance of the HPT efficiency is the dominant factor for four to five of the six output quantities. This is because the HPT exhibits the highest index value and therefore constitutes the main influence on the system performance. In this manner, the approach is applicable for the assessment of an entire jet engine under consideration of interdependencies between engine components.

Fig. 7
A compound bar graph plots first order effect Kucherenko indices. H P T has the highest exhaust gas temperature, specific fuel consumption, surge margin H P C, and pressure with 0.67, 0.6, 0.48, and 0.45, H P C has the highest temperature of 0.86, and fan has the highest spool speed of 0.65.

First-order effect Kucherenko indices

In Salomon et al. (2021a), Salomon et al. additionally compute the total-effect Kucherenko indices and both results are discussed in detail. To summarize, the study shows that simple correlations are not sufficient to explain the influence of combined module variances and find the causes of deterioration. Therefore, sensitivity analyses under consideration of dependent variables by means of Kucherenko indices and digital performance twins are powerful tools to determine the influence on a scientific basis. For an overall view, however, the change in capacity and work must also be examined at different operating points with an engine pressure ratio regulation. It shall be noted, that these results can be utilized as a basis for a detailed reliability analysis by developing a functional model according to Miro et al. (2019) and Eryilmaz and Tuncel (2016) of the V2500-A1 aircraft engine performance model.

4 Resilience Analysis

The resilience analysis, developed in Salomon et al. (2020), forms the first phase and basis of the analysis framework illustrated in Fig. 2 and is presented in the current section. First, a fundamental notion of resilience and a corresponding metric are suggested. Subsequently, a resilience decision making framework is developed, consisting of two key ingredients, an adapted systemic risk measure and a sophisticated resilience metric, enabling the systematic computation of the resilience for various endowment configurations. These endowment configurations can be interpreted in a variety of ways, e.g., as different regeneration paths of the considered complex capital good. Finally, the grid search algorithm and its advantageous properties in terms of computational efficiency are presented. For illustrative purposes, the developed algorithmic framework is then applied to the functional model established in Sect. 3.1; however, it is not limited to this particular use case but can be applied to a variety of system models. Illustrative results are presented in combination with the second phase of the analysis framework in Sect. 5.

4.1 Resilience Metric

Given a system being exposed to a disruptive event and recovering its functionality afterwards, three essential phases classifying the system states can be defined, as illustrated in Fig. 8: (i) The original stable state, whose duration relates to the reliability of the system, forms the first phase. (ii) The second phase is the loss of performance after the occurrence of a disruptive event. This loss depends on the vulnerability or robustness of the system; the robustness of the system is interpreted as the resistance to a loss of performance. (iii) The disrupted state of the system and its recovery to a new stable state is the last phase and is governed by the recoverability. In general, the new stable state may differ from the original state and, accordingly, its performance may be higher or lower. The majority of resilience metrics available in the current literature is based on system performance, i.e., on the three states and their transitions shown in Fig. 8. Consequently, a quantitative measure of resilience depends on the specific choice and definition of system performance, see e.g., Ayyub (2015). Performance-based approaches may be ratio-based, integral-based, or both.

Fig. 8
An area graph plots Q t versus time. The first phase is reliability, vulnerability, robustness comes next after the decline starting from a disruptive event, and recoverability when the line rises.

The three resilience phases before and after a disruptive event; adapted from Henry and Ramirez-Marquez (2012)

In the current work, the probabilistic resilience metric by Ouyang et al. (2012) is utilized. The metric is denoted by Res and defined as the expected ratio of the integral of the system performance Q(t) over the time interval [0, T] to the integral of the target system performance TQ(t) over the same time interval:

$$Res = E\left[ Y \right], \,\, {\text{ where}} \,\, Y = \frac{{\mathop \smallint \nolimits_{0}^{T} Q\left(t \right)dt}}{{\mathop \smallint \nolimits_{0}^{T} {\mathcal{T}}Q\left(t \right)dt}}.$$
(1)

The system performance Q(t) is described as a stochastic process. In general, TQ(t) might be considered as a stochastic process as well, but for expediency it is assumed to be a non-random constant TQ in this work. The resilience metric takes values between 0 and 1 when the recovered performance is limited to at most the original performance. The value Res = 1 indicates a system performance corresponding to the target performance, while Res = 0 captures that the system is not working at all during the considered time period.
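A minimal Monte Carlo sketch of Eq. (1), assuming a constant target performance TQ and a freely invented toy performance process (stable, disrupted, then a one-step recovery), might look as follows; all names and parameter values are assumptions for illustration.

```python
import numpy as np

# Monte Carlo sketch of the resilience metric of Eq. (1) with constant
# target performance TQ: Res = E[ ∫ Q(t) dt / (TQ * T) ].
# The toy performance process is an illustrative assumption.

def resilience(sample_performance, TQ, T, u, n_samples=10_000, rng=None):
    rng = np.random.default_rng(rng)
    t = np.linspace(0.0, T, u)
    ratios = []
    for _ in range(n_samples):
        Q = sample_performance(t, rng)
        integral = np.sum((Q[1:] + Q[:-1]) / 2.0 * np.diff(t))  # trapezoid rule
        ratios.append(integral / (TQ * T))
    return float(np.mean(ratios))

def toy_process(t, rng):
    """Performance 1.0 until a random disruption, 0.4 until recovery."""
    Q = np.ones_like(t)
    t_dis = rng.uniform(0.2, 0.5) * t[-1]    # random disruption time
    t_rec = t_dis + 0.2 * t[-1]              # fixed recovery duration
    Q[(t >= t_dis) & (t < t_rec)] = 0.4
    return Q

res = resilience(toy_process, TQ=1.0, T=10.0, u=200)
```

For this toy process the performance deficit is 0.6 over 20% of the horizon, so the estimate lies near 1 − 0.6 · 0.2 = 0.88, illustrating how Res aggregates reliability, robustness and recoverability into one number.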

4.2 Adapted Systemic Risk Measure

As proposed in Salomon et al. (2020), the resilience metric presented in Sect. 4.1 is integrated into an adapted systemic risk measure, enabling the systematic assessment of various system configurations, which might represent, e.g., regeneration paths. In particular, technical systems for which a meaningful system performance Q(t) can be determined are considered.

Assume that the system encompasses l system components. Each component is characterized by its type and n relevant properties that influence the overall system performance. For convenience, apply matrix notation. A component i ∈ {1, ..., l } can be characterized by a row vector

$$\left({a}_{i};{j}_{i}\right)=\left({\eta }_{i1},{\eta }_{i2}, \dots ,{\eta }_{in};{j}_{i}\right) \,\, \in \,\, {R}^{\left(1\times n\right)}\times {\mathbb{N}}$$
(2)

where ηi1, ηi2, ..., ηin represent the numerical values of the n properties and ji ∈ {1, 2, … , b} \(\subseteq \mathbb{N}\) defines its type. The system is described by a pair consisting of the matrix A ∈ \(\mathbb{R}\)(l×n) and the column vector z ∈ \(\mathbb{N}\)l that captures the types of the components:

$$(A;z)\; = \;\left({\begin{array}{*{20}c} {\eta_{11} } & {\eta_{12} } & \ldots & {\eta_{1n} ;} & {z_{1} } \\ {\eta_{21} } & {\eta_{22} } & \ldots & {\eta_{2n} ;} & {z_{2} } \\ \vdots & \vdots & \; & \vdots & \vdots \\ {\eta_{l\,1} } & {\eta_{l\,2} } & \ldots & {\eta_{l\,n} ;} & {z_{l} } \\ \end{array} } \right)$$
(3)

The input-output model \(Y = Y_{(A;z)}\) is evaluated for these pairs. Below, a corresponding adapted systemic risk measure is constructed. As a specific example, choose the acceptance set

$${\mathbb{A}}\; = \;\{ X\; \in \;{\mathbb{X}}\;{|}\;{\text{E[X]}}\; \ge \;\alpha \} \;\;\text{with}\;\;\alpha \; \in [0,1]$$
(4)

The risk measure is defined as

$$R(Y;\;K)\; = \;R(Y;(K;\;z))\; = \;\{ A\; \in {\mathbb{R}}^{l \times n} \;|\;Y_{(K+A;z)} \; \in \;\mathbb{A}\} ,$$
(5)

that is the set of all allocations of modified system properties A that are added to the base properties K for which the altered system (K+ A; z) exhibits a resilience greater or equal to α. Without loss of generality but to keep the notation simple, set K = 0, and R(Y; 0) is written as R(Y).

Practical applications might require imposing restrictions on the structure of the matrix in Eq. (3). For instance, components of a specific type might require an equivalent configuration, i.e., the corresponding row vectors ai must possess equal values. Following Feinstein et al. (2017), such constraints can be captured by monotonically increasing functions g: \(\mathbb{R}\)p → \(\mathbb{R}\)(l×n), a’ ↦ (A; z), where z ∈ \(\mathbb{N}\)l indicates the types of the components. Such a function maps a lower-dimensional set of parameters a’ ∈ \(\mathbb{R}\)p to the system description.

4.3 Grid Search Algorithm

In accordance with Feinstein et al. (2017), a set-valued systemic risk measure as presented in Sect. 4.2 can be computed via a combination of the so-called grid search algorithm and stochastic simulation. In two dimensions, a box-shaped subset of endowment properties is subdivided by a grid of equidistant points.

The algorithm proceeds as follows. The search starts at the origin of the search space; assume that the origin is outside of R(Y). In a successive manner, the acceptance criterion is evaluated for each adjacent grid point on the grid diagonal along the direction (1, 1, … , 1)T. Typically, in each evaluation stochastic simulation is performed. The search along the diagonal terminates as soon as a grid point that meets the acceptance criterion is identified. Given the monotonicity of the input-output model and the properties associated with the acceptance criterion (cf. Feinstein et al. (2017)), all grid point configurations in the box-shaped subset with the first accepted one as the bottom left corner are acceptable as well and consequently belong to R(Y). Analogously, all endowments in the box-shaped subset with the first accepted one as the top right corner are rejected. Thus, these points belong to R(Y)C that is the complement of the systemic risk measure. Precisely this monotonicity property makes the algorithm efficient.

Each neighboring pair of diagonally adjacent points, with one of these points meeting the requirements and the other not, defines a sub-box. In the next step, the algorithm checks the remaining corners of this sub-box, assigning a status to dominating and dominated endowments, respectively. Subsequently, the next neighboring pairs of points can be determined. The algorithm terminates as soon as all points on the grid have an assigned acceptance status. Finally, the risk measure R(Y) is determined as a discrete grid approximation. This algorithm, combined with the methods proposed in Sects. 4.1 and 4.2, allows decisions to be made regarding the optimal trade-off between resilience-enhancing endowments for complex capital goods.
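A simplified variant of such a grid search, exploiting only the monotonicity argument (the diagonal search and sub-box refinement of the full algorithm are omitted here), can be sketched as follows; the acceptance criterion is a placeholder assumption standing in for a stochastic simulation checking E[Y] ≥ α.

```python
import numpy as np

# Simplified grid search over a 2-D endowment grid for a monotone
# acceptance criterion (Sect. 4.2). Each accepted point makes every
# dominating point acceptable; each rejected point makes every dominated
# point rejected, saving expensive criterion evaluations.

def grid_search(accept, grid_x, grid_y):
    """Classify every grid point as acceptable (True) or not (False)."""
    status = {}
    for ix, x in enumerate(grid_x):
        for iy, y in enumerate(grid_y):
            if (ix, iy) in status:
                continue                     # status already known via dominance
            ok = accept(x, y)                # in practice: stochastic simulation
            status[(ix, iy)] = ok
            if ok:                           # everything up-and-right accepted
                for jx in range(ix, len(grid_x)):
                    for jy in range(iy, len(grid_y)):
                        status.setdefault((jx, jy), True)
            else:                            # everything down-and-left rejected
                for jx in range(ix + 1):
                    for jy in range(iy + 1):
                        status.setdefault((jx, jy), False)
    return status

# Placeholder monotone criterion: accepted once x + y exceeds a level.
status = grid_search(lambda x, y: x + y >= 0.95,
                     np.linspace(0.0, 1.0, 11), np.linspace(0.0, 1.0, 11))
accepted = sum(status.values())              # discrete approximation of R(Y)
```

The set of accepted grid points is the discrete grid approximation of R(Y); the dominance propagation is exactly the monotonicity property that makes the full algorithm efficient.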

5 Constrained Resilience Analysis of an Axial Compressor

In the current section, the methodology presented in Sect. 4 is demonstrated for illustrative scenarios, while the procedure for considering monetary constraints is elaborated, see phase one and two in Fig. 2. The method can be applied to assess a variety of complex capital goods. In Salomon et al. (2019) and Salomon et al. (2020), Salomon et al. proved the applicability of the proposed approach for a wide range of complex systems, e.g., flow networks, an axial compressor and the Berlin metro network.

5.1 Resilience Analysis Setting

In the context of the CRC 871, consider the functional model of the axial compressor illustrated in Fig. 4, developed in Miro et al. (2019) and presented in Sect. 3.1. Again, an interruption between start and end represents system failure, i.e., a roughness-related performance variation of the physical system of at least 25%. The system functionality is utilized as the meaningful system performance Q(t) that was claimed in Sect. 4.2 for the subsequent application of the resilience decision making procedure. The system performance is evaluated at each point in time \(t_{h}\) and equals 1 if there is a connection from start to end and 0 if the connection is interrupted.

Components i ∈ {1, … , 8} of the functional model represent stator blade rows and rotor blade rows. In this example, each of them is assumed to have the same component type, i.e., it holds that ji = 1 ∀i ∈ {1, … , 8}. For simplicity, denote (ai; ji) = (ai; 1) = ai ∀i ∈ {1, … , 8}. Suppose that each row, i.e., each component, is characterized by two endowment properties, namely, a roughness resistance re and a recovery improvement r*. Then, the component is described by ai = (rei, ri*). It holds that rei = rei′ and ri* = ri′* if ji = ji′; consequently, the endowment pair (rei, ri*) has equal numerical values for all components. Each of these configuration pairs might represent a particular regeneration path.

After evaluating the system performance at time step \(t_{h}\), each component can fail randomly. A failed component is removed from the model and no longer contributes to the system performance at time \(t_{h+1}\). The component remains in the failed state until its full recovery. The failure probability of component i is assumed to be constant in the time interval \((t_{h}, t_{h+1})\). For illustrative purposes, it is given by

$$P\{ {\text{Component}}\;i\;{\text{fails}}\;{\text{during}} \,\, (t_{h} ,t_{h + 1})\} = \Delta t\;\cdot\lambda_{i}$$
(6)

with

$$\lambda _{i} \; = \;0.8\; - \;0.03\; \cdot \;re_{i} ,$$
(7)

where λi is the time-independent failure rate. This single-step failure model corresponds to a simple approach for considering reliability and robustness. A consideration of system reliability in multiple states, where the system passes through several intermediate states before failure, as presented in a subsequent section, is one possibility for a more comprehensive modeling approach.

Suppose that a failed component instantly recovers to the original performance level after a certain number of time steps has passed. Then, the component recovery is described by

$$r_{i} \; = \;r_{\max \;} - r_{i}^{*} \;\;\text{with}\;\;r_{i}^{*} \; < \;r_{\max \;} ,$$
(8)

where rmax denotes the maximum number of time steps required for recovery and ri* is a reduction depending on the current endowment of the component i. Since each time step is of length \(t\; = \;\frac{T}{u}\), with T denoting the investigated duration and u the number of considered time steps, the duration of the recovery process is \(r_{i} \; \cdot \;\frac{T}{u}\). In accordance with Ayyub (2014) and Ayyub (2015), this simple recovery model corresponds to a one-step recovery profile; however, various other characteristic profiles of recovery in time are conceivable.

Note that in this setting increasing the roughness resistance of a blade row, i.e., a component i, mitigates the degradation of the surface, i.e., counteracts the roughening process, and correspondingly reduces the failure rate \(\lambda_{i}\). If the component i fails, its functionality is fully recovered after ri time steps specified via Eq. (8).
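The single-component failure and recovery dynamics of Eqs. (6)–(8) can be sketched as a simple time-stepping simulation. Parameter defaults and function names below are illustrative, not those of the original implementation:

```python
import random

def simulate_component(re_i, rstar_i, u=200, T=10.0, r_max=21, seed=0):
    """Simulate one component over u time steps of length t = T/u.
    Returns the per-step availability (1 = working, 0 = failed/recovering)."""
    rng = random.Random(seed)
    t = T / u
    lam = 0.8 - 0.03 * re_i          # failure rate, Eq. (7)
    r_i = r_max - rstar_i            # recovery duration in time steps, Eq. (8)
    availability, down = [], 0
    for _ in range(u):
        if down > 0:                 # component is still recovering
            availability.append(0)
            down -= 1
        else:
            availability.append(1)
            if rng.random() < t * lam:   # Eq. (6): fails during (t_h, t_{h+1})
                down = r_i
    return availability

avail = simulate_component(re_i=8, rstar_i=13)
```

Averaging such availability trajectories over many runs (and all eight components) would yield an estimate of the system performance Q(t) underlying the resilience measure.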

5.2 Costs of Endowment Properties

A certain endowment relates to the property quality of one or more components. In general, a higher quality of components results in a more resilient system. However, an increase in quality is typically associated with an increase in costs. Consequently, it is essential to take monetary aspects into account for an expedient decision making procedure. In accordance with Mettas (2000), assume that increasing the reliability of components in complex systems corresponds to an exponential increase in their costs.

Assume that the cost associated with improving the endowment property roughness resistance is given by

$$cost^{re} \; = \;\sum\limits_{i = 1}^{8} {price^{re} \; \cdot \;1.2^{{(re_{i} - 1)}} ,}$$
(9)

where rei is the roughness resistance value of component i. Further, pricere is a common basic price independent of i in the current case study. Analogously, assume an exponential relationship for the costs associated with the recovery improvement ri*:

$$cost^{r*} \; = \;\sum\limits_{i = 1}^{8} {price^{*} \; \cdot \;1.2^{{(r_{i}^{*} - 1)}} .}$$
(10)

The total cost of an endowment results from the sum of these costs:

$$cost\; = \;cost^{re} \;+\;cost^{r*} .$$
(11)

This cost function is subsequently utilized to determine the cost of a certain endowment. Consequently, the endowment pair with minimum cost can be identified. The combination of the adapted systemic risk measure developed in Sect. 4.2, including the corresponding acceptance set, with the cost function allows the evaluation of optimal endowment pairs regarding resilience and monetary constraints.
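Eqs. (9)–(11) can be sketched as follows; the base prices are placeholders, since the actual values from Table 1 are not reproduced here:

```python
def endowment_cost(re, rstar, price_re=100.0, price_rstar=100.0):
    """Total endowment cost, Eqs. (9)-(11).  `re` and `rstar` hold one
    endowment value per component; base prices are illustrative placeholders."""
    cost_re = sum(price_re * 1.2 ** (re_i - 1) for re_i in re)          # Eq. (9)
    cost_rstar = sum(price_rstar * 1.2 ** (r_i - 1) for r_i in rstar)   # Eq. (10)
    return cost_re + cost_rstar                                         # Eq. (11)

# Eight identical components at the lowest quality level re_i = r_i* = 1:
total = endowment_cost([1] * 8, [1] * 8)  # 800 + 800 = 1600
```

The factor 1.2 per quality level reproduces the assumed exponential cost–quality relationship.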

5.3 Scenario and Numerical Results in a Two-Dimensional Setting

Below, the decision making method for identifying resilience-enhancing endowments under consideration of monetary constraints is demonstrated for the multi-stage high-speed axial compressor presented in Fig. 4 in Sect. 3. For illustrative purposes, the model parameters and simulation parameter values, shown in Table 1, are considered.

Table 1 Parameter values for the resilience decision making method for the functional model of the multi-stage high-speed axial compressor

Assume a resilience acceptance threshold of α = 0.8, an arbitrarily selected number of u = 200 time steps, a constant failure rate of λ = 0.8 as well as an arbitrarily selected time step length of t = 0.05. The first step in the analysis is to determine the set of all acceptable endowments that correspond to a resilience value of at least Res = 0.8 over the time period under consideration. In practice, any improvement of the axial compressor blades is associated with costs. Consequently, the second step is to identify the least expensive acceptable endowment, denoted by \(\hat{A}\). The grid search algorithm described in Sect. 4.3 explores the roughness resistance re and the recovery improvement r* over rei ∈ {1, ..., 20}, ri* ∈ {1, ..., 20} ∀ i ∈ {1, …, l}. Increasing a property value of a component i is interpreted as increasing its quality level. The roughness resistance values are interpreted as various quality levels of coatings applied to the blades. In terms of recovery, increasing the quality reduces the recovery time of the components, from a maximum of 20 time steps (for ri* = 1) to a minimum of one time step (for ri* = 20), given rmax = 21.

Figure 9 shows the results of the grid search algorithm. The acceptable pairs of component properties, i.e., roughness resistance and recovery improvement, are depicted as blue, filled dots. Clearly, the quality of recovery improvement and the quality of the blade coatings can be compared regarding their impact on the system resilience. For instance, given recovery improvement values with ri* ≥ 15, the minimum roughness resistance value of rei = 1 is already sufficient to achieve the desired level of resilience.

Fig. 9

Numerical results of the grid search algorithm for the functional model of the axial compressor with explored roughness resistance/recovery improvement values

For the determination of R(Y), only about 10% of all possible endowment pairs had to be evaluated due to the grid search algorithm presented in Sect. 4.3. More precisely, only the endowment pairs on the diagonal plus a number of pairs equal to the size of the set of pairs with minimum acceptable resilience, i.e., the boundary or Pareto front of R(Y), had to be evaluated. Taking into account the base prices in Table 1, the most cost-efficient endowment among the boundary set is characterized by a roughness resistance of rei = 8 and a recovery improvement of ri* = 13 for each of the eight components. In Fig. 9 the corresponding pair is highlighted by a green circle. According to Eq. (11), the total cost equals 136 930€. Based on these results, the decision maker is advised to realize rei = 8 and ri* = 13.
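The grid search algorithm of Sect. 4.3 is not reproduced here; the following sketch merely illustrates why tracing the acceptance boundary of a criterion that is monotone in both endowment properties requires on the order of 2n instead of n² evaluations. The acceptance predicate is a synthetic stand-in, not the resilience measure itself:

```python
def trace_boundary(is_acceptable, n=20):
    """For each roughness resistance value re in 1..n, find the minimal
    recovery improvement r* such that (re, r*) is acceptable, assuming
    acceptability is monotone non-decreasing in both properties.
    Returns the boundary and the number of predicate evaluations."""
    boundary, evals, r = {}, 0, n
    for re in range(1, n + 1):
        # the minimal acceptable r* can only decrease as re grows,
        # so continue walking r downwards from the previous boundary
        while r >= 1:
            evals += 1
            if not is_acceptable(re, r):
                break
            r -= 1
        boundary[re] = r + 1   # a value of n + 1 would mean: no acceptable pair
    return boundary, evals

# Synthetic stand-in for the resilience acceptance check:
boundary, evals = trace_boundary(lambda re, r: re + r >= 20)
# evals stays far below the 400 evaluations of an exhaustive 20 x 20 scan
```

Each predicate evaluation either advances r* or closes one re column, which bounds the total effort linearly in the grid width.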

Note that in the case of analyzing regeneration paths, resilience applies to the regeneration paths in two ways: first, as a part of the overall performance over the entire life cycle of the complex capital good, and second, as a property of the regeneration path itself. Clusters of similar, equally acceptable regeneration paths are formed or identified to which, in the event of a problem, it is possible to switch without great effort.

5.4 Scenario and Numerical Results in Multiple Dimensions

As shown in Salomon et al. (2019), the methodology developed in Salomon et al. (2020) can be utilized in the multidimensional case as well. The current subsection demonstrates the applicability in a four-dimensional setting given the model of the multistage high-speed axial compressor. The model parameter and simulation parameter values shown in Table 2 are considered. Assume the recovery improvement r* to be fixed for all components, regardless of their type, ri* = 11 ∀ i ∈ {1, … , l}, while the roughness resistance re is explored over rei ∈ {1, … , 20} ∀ i ∈ {1, … , l}. Again, the roughness resistance values can be interpreted as increasing quality levels of coatings. In this scenario, the four component types suggested in Sect. 3.1 are adopted. Correspondingly, the first and second rotor blade rows are assigned as c1, the third and fourth as c2 and c3, respectively, while all stator blade rows are assigned as c4. The set of all acceptable endowments leading to a system resilience value of at least Res = 0.85 over the time period under consideration is determined via the grid search algorithm. Then, the most cost-efficient endowment, denoted by \(\hat{A}\), is identified (Table 2).

Table 2 Parameter values for the resilience decision making method for the functional model of the multistage high-speed axial compressor

Figure 10 shows the corresponding results. In Fig. 10a, all combinations with a satisfactory system resilience of at least Res = 0.85 are depicted, corresponding to phase one in the analysis framework, see Fig. 2. This is the set of roughness resistance endowment pairs contained in R(Y). In fact, the roughness resistance of the fourth rotor blade row (c3) has the highest impact on the system resilience compared to the other rows. This can be concluded as only pairs with a high roughness resistance quality for this type are acceptable. Regardless of the endowment property values of all other component types ci, i ∈ {1, …, 4}, endowment pairs with coating qualities of rei ≤ 15 for c3 are not sufficient to provide an acceptable level of system resilience. In contrast, the roughness resistance of the four stators (c4) has minor influence on the system resilience compared to all other types. Even endowments with (rei, 4) = 1, i.e., a minimum coating quality level, are sufficient to achieve acceptable resilience values. The same holds true for the rotors of types c1 and c2. However, in comparison to the stators, components of types c1 and c2 require significantly higher levels of coating quality to compensate for small, i.e., other than maximum, roughness resistance values for c3.

Fig. 10

Numerical results for the multi-dimensional setting: a numerical results with explored roughness resistance values, b numerical results considering a budget threshold

For decision making, it is crucial to be able to take into account monetary constraints. Therefore, Fig. 10b shows the endowment pairs contained in R(Y) that lead to a satisfying system resilience of Res = 0.85 considering a budget threshold that is set to costremax = 50 000€ for illustrative purposes, corresponding to phase two in the analysis framework, see Fig. 2.

The results illustrated in Fig. 10b show that only configurations with low coating quality levels for all stators (c4) are below the cost limit. Firstly, this is the case due to their low influence on system resilience, and secondly, due to the high costs for increasing the quality levels of the stators, caused by their number and the exponential cost-quality behavior. In contrast, only configurations that provide the highest quality levels of (rei, 3) ≥ 17 for the rotor of type c3 are acceptable and below the price limit simultaneously. The roughness resistance of the rotor of type c3 has a critical influence on the system resilience. Consequently, compensating lower quality levels for c3 by higher quality levels of the remaining blade rows exceeds the given budget threshold. Even though the roughness resistance of the rotor of type c2 has a lower influence on the system resilience than that of c3, minimum quality levels for c2 cannot be compensated by high qualities of the other components either. Correspondingly, at least (rei, 2) = 4 is required to meet the acceptance criterion.

Considering the base prices in Table 2, the most cost-efficient endowment is characterized by the pair with roughness resistances of (rei, 1) = 4, (rei, 2) = 14, (rei, 3) = 19 and (rei, 4) = 3. The corresponding configuration is highlighted in blue in Fig. 10b. Via Eq. (11) the total cost is obtained as \(cost_{{(\hat{A};z)}} \; = \;cost^{{re}} \;+\;cost^{r*} \;\) = 42 604€ + 35 664€ = 78 268€.

The numerical effort for the computation of R(Y) was reduced by about 98% due to the grid search algorithm compared to a naive evaluation of the search space. Correspondingly, only 2% of all possible combinations of roughness resistance values had to be evaluated. Note that the application of this methodology to higher-dimensional problems is only limited by constraints of computational memory and time.

6 Reliability Analysis

The reliability analysis follows the resilience analysis and the reduction, due to technical and monetary restrictions, of all system configurations, i.e., regeneration paths, that are acceptable in terms of system resilience. It thus forms the third phase in the analysis framework, see Fig. 2.

6.1 Repeated Evaluation of the Survival Function

For all remaining endowment pairs of interest for decision makers, a system reliability analysis is conducted to evaluate the system failure probability at a given time t. The system reliability is evaluated based on the stochastic properties of the system components, represented as probability distribution functions that describe the probability of failure due to degradation or a disruptive event. Each model evaluation leads to the so-called survival function, i.e., the probability that the system is still functioning at time t, performing its predefined task. In Fig. 11, three different survival functions are shown as Paths A, B and C for illustrative purposes.

Fig. 11

Reliability analysis based on the concept of survival signature considering multiple regeneration paths

Different endowment pairs evaluated in the grid search algorithm correspond to various component properties due to different regeneration paths and lead to different survival functions. The three survival functions in Fig. 11 correspond to three different regeneration paths, i.e., endowment pairs. In addition, there might exist minimum and maximum requirements of system reliability due to practical experiences, customer requirements, e.g., budget limitations, or other circumstances. An efficient procedure is required to realize the large number of repeated model evaluations, i.e., computations of survival functions, with changing component properties.

6.2 Concept of Binary-State Survival Signature

Figure 12 illustrates the concept of survival signature introduced in Coolen and Coolen-Maturi (2013). The most beneficial attribute of this approach is its separation property: the system structure is separated from the probability structure of the system describing the component failure behavior. This leads to a significant reduction of the computational effort, since once the typically costly-to-determine system structure has been computed, any possible characterization of the probabilistic part can be tested without the need to recompute the structure. This means that any number of system configurations, i.e., regeneration paths, can be simulated and analyzed, since these only affect the probability structure and usually not the system structure. At the same time, the survival signature radically condenses the information on the topological reliability of systems with multiple component types k = 1, …, K, with K being the number of component types. The failure times of components of one type are assumed to be independent and identically distributed (iid) or exchangeable. For more information on the exchangeability assumption in practice, see Coolen and Coolen-Maturi (2016) and Salomon et al. (2021b).

Fig. 12

Illustration of the advantageous properties of the concept of survival signature

For a deeper understanding of this concept, consider a coherent system with a given binary-state structure function defining the system state to be either 0 or 1 for each binary-state vector out of the set of all possible state vectors. The binary-state vector specifies the states of n = \(\sum\nolimits_{k = 1}^{K} {n_{k} }\) components in total; there are \(\left({\begin{array}{*{20}c} n \\ l \\ \end{array} } \right)\) state vectors x with exactly l working components, referred to as the set Sl. In the case of K ≥ 2, the survival signature summarizes the probability that the system is working as a function of the number of working components lk for each type k = 1,..., K. Assume the failure times within a component type to be iid or exchangeable. Consequently, all possible state vectors are equally likely to occur. Then, the survival signature is defined as

$$\Phi(l_{1} ,\;l_{2} ,\; \ldots ,l_{K} )\; = \;\left[ {\prod\limits_{k = 1}^{K} {\left({\begin{array}{*{20}c} {n_{k} } \\ {l_{k} } \\ \end{array} } \right)^{ - 1} } } \right]\sum\limits_{{x \in S_{{l_{1} ,l_{2} , \ldots ,l_{K} }} }} {\phi (x),}$$
(12)

where \(\left({\begin{array}{*{20}c} {n_{k} } \\ {l_{k} } \\ \end{array} } \right)\) corresponds to the total number of state vectors xk of type k and \(S_{{l_{1} ,\;l_{2} ,\; \ldots ,l_{K} }}\) denotes the set of all state vectors of the entire system for which \(l_{k} \; = \;\sum\nolimits_{i = 1}^{{n_{k} }} {x_{k,i} }\). The survival signature Φ(l1, l2, …, lK) thus characterizes the probability that the system is working given that exactly lk of its components of type k are working. Note that the survival signature depends only on the topological reliability of the system, independent of the time-dependent failure behavior of its components, namely, the probability structure. Note further that the concept of survival signature with its separation property shown in Fig. 12 is differentiated from the mathematical object survival signature itself given in Eq. (12).
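For small systems, Eq. (12) can be evaluated by brute-force enumeration of all state vectors. The following sketch uses an illustrative toy structure function and is not the implementation employed in this work:

```python
from itertools import product

def survival_signature(phi, n_types):
    """Brute-force survival signature, Eq. (12).  `phi` is the binary-state
    structure function, taking one 0/1 tuple per component type; `n_types`
    lists the number of components n_k of each type k."""
    sig = {}
    for ls in product(*(range(n + 1) for n in n_types)):
        total, count = 0, 0
        # enumerate all state vectors and keep those with exactly l_k
        # working components of each type k
        for states in product(*(product((0, 1), repeat=n) for n in n_types)):
            if all(sum(s) == l for s, l in zip(states, ls)):
                count += 1
                total += phi(*states)
        sig[ls] = total / count   # count equals the product of binomials in Eq. (12)
    return sig

# Toy system: two type-1 components in parallel, in series with one type-2 component
phi = lambda x1, x2: max(x1) * x2[0]
sig = survival_signature(phi, [2, 1])
# sig[(1, 1)] = 1.0: one working parallel component plus the series component suffice
```

Since the enumeration grows exponentially in n, practical implementations exploit the system structure; the sketch only mirrors the definition.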

Further, assume the probability distribution for the failure times of type k to be known with Fk(t), denoting the corresponding cumulative distribution function. Then,

$$\begin{aligned} P\left({\bigcap\limits_{k = 1}^{K} {\{ C_{k} (t)\; = \;l_{k} \} } } \right)\; & = \;\prod\limits_{k = 1}^{K} {P(C_{k} (t)\; = \;l_{k} )} \\ & = \;\prod\limits_{k = 1}^{K} {\left({\begin{array}{*{20}c} {n_{k} } \\ {l_{k} } \\ \end{array} } \right)[F_{k} (t)]^{{n_{k} - l_{k} }} [1 - F_{k} (t)]^{{l_{k} }} } \\ \end{aligned}$$
(13)

describes the probability structure of the system, regardless of its topology. Ck(t) ∈ {0, 1, … , nk} represents the number of components of type k in a working state at time t.

Eqs. 12 and 13 together form the concept of survival signature illustrated in Fig. 12. The concept is integrated into the proposed framework, see phase three in Fig. 2, to leverage its salient beneficial properties for the repeated model evaluations that are required for comprehensive MRO decision making.
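Combining the survival signature of Eq. (12) with the probability structure of Eq. (13) yields the survival function P(Ts > t). A sketch for a toy single-type series system with exponential failure times (illustrative parameters, not from the case study):

```python
from itertools import product
from math import comb, exp

def survival_function(sig, n_types, cdfs, t):
    """P(T_s > t) from the survival signature `sig` and the failure-time
    CDFs F_k of the component types, combining Eqs. (12) and (13)."""
    p = 0.0
    for ls in product(*(range(n + 1) for n in n_types)):
        w = 1.0
        for n, l, F in zip(n_types, ls, cdfs):
            Fk = F(t)
            w *= comb(n, l) * Fk ** (n - l) * (1.0 - Fk) ** l   # Eq. (13)
        p += sig[ls] * w
    return p

# Two-component series system of one type: P(T_s > t) = exp(-2 * lam * t)
lam = 0.5
sig_series = {(0,): 0.0, (1,): 0.0, (2,): 1.0}
F_exp = lambda t: 1.0 - exp(-lam * t)
p = survival_function(sig_series, [2], [F_exp], 1.0)
```

The separation property is visible here: changing the CDFs, i.e., the regeneration path, reuses `sig_series` unchanged.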

6.3 Concept of Multi-state Survival Signature

While a binary-state consideration of systems and their components is state-of-the-art, further research on multi-state systems with multi-state components is inevitable for a more realistic and comprehensive assessment of system reliability. Eryilmaz and Tuncel (2016) proposed a generalized concept of survival signature in the context of non-repairable homogeneous multi-state systems. In accordance with the approach presented in Coolen and Coolen-Maturi (2013), the survival function for multiple types k = 1, … , K can be derived as

$$\begin{aligned} P\{ T^{ \ge J} \; > \;t\} \; & = \;\mathop {\sum \ldots \sum }\limits_{{i^{1} \ge \cdots \ge i^{J} }} \Phi^{\ge J} \left({i_{1}^{1} ,\; \ldots ,\;i_{k}^{j} ,\; \ldots ,\;i_{K}^{J}} \right) \\ & \times \;P\left\{ {C_{1}^{1} (t) = i_{1}^{1} ,\; \ldots ,\;C_{k}^{j} (t)\; = \;i_{k}^{j} , \ldots ,\;C_{K}^{J} (t) = i_{K}^{J} } \right\}. \\ \end{aligned}$$
(14)

with maximum system and component state level J and T≥J denoting the system failure time with respect to state J, i.e., the time until the system drops below state J. Thereby, \(\Phi^{ \ge J} (i_{1}^{1} ,\; \ldots ,\;i_{k}^{j} , \ldots ,i_{K}^{J} )\) represents the j-th level survival signature for level J, i.e., the probability that the system is working in state J or above if \(i_{k}^{j}\) components are working in state j or above for types k = 1, … , K and states j = 1, … , J, with \(i_{k}^{j - 1} \; \ge i_{k}^{j}\). The total number of state vectors given \(i_{k}^{j}\) components of type k functioning in state j or above is

$$\upsilon_{{n_{1} , \ldots ,n_{K} }} (i_{1}^{1} ,\, \ldots ,i_{k}^{j} ,\, \ldots ,i_{K}^{J} ) = \;\prod\limits_{j = 1}^{J} {\left({\begin{array}{*{20}c} {n_{1} - i_{1}^{j + 1} } \\ {i_{1}^{j} \; - \;i_{1}^{j + 1} } \\ \end{array} } \right)\; \ldots \;\left({\begin{array}{*{20}c} {n_{K} - i_{K}^{j + 1} } \\ {i_{K}^{j} \; - \;i_{K}^{j + 1} } \\ \end{array} } \right),}$$
(15)

where \(i_{1}^{J\,+\,1} \; = \; \ldots \; = \;i_{K}^{J\,+\,1} \; = \;0\) and nk denotes the total number of components of type k. In analogy to Eq. (12), the j-th level survival signature for level J for multiple types is given as

$$\Phi^{ \ge J} (i_{1}^{1} ,\; \ldots ,\;i_{k}^{j} ,\; \ldots ,\;i_{K}^{J} )\; = \;\left[ {\upsilon_{{n_{1} , \ldots ,n_{K} }} (i_{1}^{1} ,\; \ldots ,\;i_{K}^{J} )} \right]^{ - 1} \sum\limits_{{x \in S_{{i_{1}^{1} , \ldots ,i_{K}^{J} }} }} {\phi^{ \ge J} (x).}$$
(16)

Again, the j-th level survival signature and the survival function for multiple component types are derived similarly to Eqs. (16) and (14), respectively.

The approach proposed in Eryilmaz and Tuncel (2016) allows the computation of the reliability of multi-state systems for J binary-state structure functions with components following a Markov degradation process with minor failures. Given the four binary-state structure functions established in Sect. 3.1 and the probability structure describing the component degradation from state to state, the reliability of the multi-state axial compressor being in one of four states can be evaluated. For illustrative purposes, the component degradation model was established as suggested in Eryilmaz and Tuncel (2016) with arbitrarily selected instantaneous degradation rates. Figure 13 shows the survival functions of the levels j = 1, …, 4 of the multi-state axial compressor previously introduced. The combination of this multi-state system consideration with the developed analysis framework allows the assessment of the resilience of multi-state systems with multi-state components.

Fig. 13

j-th level survival functions for j ∈ {1,..., 4}

7 Uncertainty Analysis

In reality, the information on a complex capital good and its behavior is subject to aleatoric, so-called irreducible, uncertainties, but typically also to epistemic uncertainties, so-called imprecision. For instance, this is the case due to estimates of distribution parameters based on expert knowledge, measurement errors or a simple lack of data. In the context of the CRC 871, the influence of a regeneration measure on the survival behavior of a complex capital good might not be precisely known, i.e., the distribution parameters describing the failure behavior can only be estimated. Thus, the models and corresponding simulations are also governed by these uncertainties. However, for comprehensive decision making, existing uncertainties need to be considered in the analysis, and therefore beneficial approaches to implement them are an important research topic, see Beer and Ferson (2012) and Beer et al. (2013). Consequently, the novel uncertainty analysis developed in this work constitutes the fourth and final phase of the analysis framework, see Fig. 2.

7.1 Imprecision and its Implementation via Fuzzy Probability

Figure 14 shows the concept of survival signature with an adaption of the probability structure via fuzzy probabilities, cf. Salomon et al. (2021b). Thus, the imprecision is propagated through the model. This allows the advantageous properties of this concept to be exploited while accounting for imprecision.

Fig. 14

Fuzzy probabilities included in the concept of survival signature

The result is not a sharp survival function, as seen in Fig. 15a, but an imprecise survival function enclosing a region, as seen in Fig. 15b. The imprecise survival functions are computed on the basis of stochastic input variables described via fuzzy probabilities. The reliability analysis under consideration of imprecision can be simplified by considering various discrete alpha-cuts with α ∈ [0, 1], as illustrated in Fig. 15b for α ∈ {0, 0.2, 0.4, 0.6, 0.8}.
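For a single exponential failure-time parameter, this alpha-cut propagation reduces to interval arithmetic, since the survival function is monotone in the rate. A minimal sketch with an illustrative triangular fuzzy failure rate (not a parameter from the case study):

```python
from math import exp

def alpha_cut(triangular, alpha):
    """Alpha-cut [a_lo, a_hi] of a triangular fuzzy number (left, peak, right)."""
    left, peak, right = triangular
    return (left + alpha * (peak - left), right - alpha * (right - peak))

def survival_bounds(t, lam_interval):
    """Bounds on S(t) = exp(-lam * t) when the failure rate lam is only known
    to lie in an interval.  S is decreasing in lam, so the bounds sit at the
    interval endpoints."""
    lo, hi = lam_interval
    return exp(-hi * t), exp(-lo * t)    # (lower, upper) survival bound

# Illustrative fuzzy failure rate (0.8, 1.0, 1.3); alpha = 0 gives the widest cut
low, up = survival_bounds(1.0, alpha_cut((0.8, 1.0, 1.3), 0.0))
```

Repeating this over a set of alpha levels reproduces the nested bands of Fig. 15b; for non-monotone dependencies, an optimization over the cut would replace the endpoint evaluation.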

Fig. 15

Precise and imprecise survival functions

The survival function generated with α = 0 represents the maximum level of uncertainty. For example, an expert specifies the parameter interval corresponding to an alpha level of α = 0 of the fuzzy probability as the maximum degree of uncertainty, i.e., the parameters will certainly not violate the interval limits. This might be the case in design and maintenance if, e.g., only insufficient information on the installed components has been collected so far and only an educated guess of an expert is available. In contrast, an alpha level of α = 1 corresponds to a precise survival function that is typically not known. Note that this is only the case for triangular fuzzy probabilities as presented in Salomon et al. (2021b); there also exist fuzzy probabilities that describe an interval of parameters for α = 1. Depending on the budget, gathering precise information for each component type, e.g., via experimental campaigns, might not be feasible, impeding proper reliability analyses. In fact, a complete elimination of imprecision is in most cases neither necessary nor cost-efficient. Therefore, a method for determining a critical level of imprecision is crucial for cost-effective decision making that balances imprecision against the cost associated with reducing it. In Salomon et al. (2021b), a comprehensive decision making procedure for uncertainty reduction was developed. Integral parts are target values for the system reliability at certain points in time. If the current setting of parameters modeled via fuzzy probabilities fails to ensure these target reliabilities, the imprecision in the underlying data should be reduced. The procedure is cost-efficient, since it proceeds successively from α = 0 up to α = 1 until the reliability requirements are met.
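The successive procedure can be sketched as follows. The fuzzy parameter, target value and alpha levels are illustrative, and the actual procedure in Salomon et al. (2021b) operates on the full system survival function rather than on a single exponential component:

```python
from math import exp

def required_alpha(target, t, fuzzy_lam, levels=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)):
    """Smallest alpha level whose lower survival bound meets the target
    reliability at time t, proceeding successively from alpha = 0 upwards.
    Returns None if even the precise case (alpha = 1) fails the target."""
    left, peak, right = fuzzy_lam        # triangular fuzzy failure rate
    for alpha in levels:
        worst = right - alpha * (right - peak)   # upper end of the alpha-cut
        if exp(-worst * t) >= target:            # lower survival bound vs. target
            return alpha
    return None

# Illustrative fuzzy rate (0.8, 1.0, 1.3): how much imprecision reduction
# is needed for a target reliability of 0.3 at t = 1?
alpha_req = required_alpha(0.3, 1.0, (0.8, 1.0, 1.3))
```

Stopping at the first admissible alpha level avoids paying for more data collection than the reliability requirement demands.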

7.2 Efficient Simulation Algorithm Under Consideration of Imprecision

The consideration of both irreducible uncertainty and imprecision requires adequate treatment in systems analysis. A frequently implemented approach is a two-stage simulation, commonly known as a “double-loop” approach. Correspondingly, variables describing the imprecision of parameters are sampled in an “outer loop”, and variables representing irreducible uncertainty and depending on the imprecise parameters, such as the failure times of components, are sampled in an “inner loop”, cf. Hofer et al. (2002); or vice versa, aleatory variables are sampled in the “outer loop” and epistemic uncertainty is treated in the “inner loop”, cf. Alvarez (2006). Clearly, for complex systems this naive sampling approach leads to an extremely large sample size and consequently a high computational cost, see, e.g., Sarkar et al. (2015). Consequently, simulation approaches that increase computational efficiency and yield high accuracy at minimal sample size are desirable. Recently, Non-Intrusive Stochastic Simulation (NISS), a promising approach to efficiently compute imprecise structural models with significantly reduced sample size, was introduced in Wei et al. (2019). The method comprises two basic approaches, Local Extended Monte Carlo Simulation (LEMCS) and Global Extended Monte Carlo Simulation (GEMCS), which provide different advantages in terms of accuracy and variation.

In Salomon et al. (2021b), a novel methodology was developed in the context of imprecise system reliability analysis by adapting LEMCS and GEMCS and combining them with the concept of survival signature. The imprecision of parameters is modeled via fuzzy probabilities. This imprecision is then propagated efficiently through the analysis framework by means of the new method, as illustrated in Fig. 16. The amalgamation brings together the advantages of both the concept of survival signature and the NISS concepts: a significant memory reduction of topological information and large efficiency benefits in repeated model evaluations, combined with a comprehensive consideration of uncertainties with only one required stochastic simulation, thus drastically reducing the sample size. This combination leads to beneficial synergy effects, increasing the efficiency even more. The savings of the new methodology in terms of sampling effort compared with the naive double-loop approach are illustrated in Fig. 17a and b.

The most attractive aspect of both the LEMCS and GEMCS algorithms is the fact that only a single stochastic simulation is necessary to account for the imprecision. Therefore, the traditionally employed “double loop” simulation can be circumvented. In both LEMCS and GEMCS, the interval analysis and the stochastic analysis have been decoupled successfully, and the computational expense is mainly driven by the single stochastic simulation performed. Moreover, the stochastic analysis has been separated from the system topology by merging it with the survival signature, so that only one reliability analysis in terms of topology is required to generate the survival signature. In addition to these beneficial features of the survival signature, it is exactly the single stochastic simulation required that gives the proposed methodology its efficiency and clearly differentiates it from traditional approaches. Thanks to this approach, the imprecise stochastic analysis used to estimate the bounds on the system survival function has been greatly simplified.
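Neither the NISS estimators nor their coupling with the survival signature are reproduced here. The following sketch only illustrates the underlying extended Monte Carlo idea in its simplest form: samples drawn once at a nominal parameter value are reweighted by a likelihood ratio to estimate the survival function at any other parameter value, so no outer sampling loop over the parameters is needed:

```python
import random
from math import exp

def reweighted_survival(samples, lam0, lam, t):
    """Estimate P(T > t) under rate `lam` from exponential failure times
    drawn once at the nominal rate `lam0`, via likelihood-ratio weights
    w = f(x; lam) / f(x; lam0).  One stochastic simulation thus serves
    all parameter values of interest."""
    weights = [(lam / lam0) * exp(-(lam - lam0) * x) for x in samples]
    return sum(w for x, w in zip(samples, weights) if x > t) / len(samples)

# Single stochastic simulation at the nominal rate lam0 = 1.0
rng = random.Random(42)
samples = [rng.expovariate(1.0) for _ in range(200_000)]

# Re-evaluate the survival function at a different rate without resampling
est = reweighted_survival(samples, 1.0, 1.2, 1.0)   # close to exp(-1.2)
```

Sweeping `lam` over an alpha-cut interval with the same sample set is what replaces the outer loop of the naive double-loop scheme; the variance of such estimators grows as the evaluated parameter moves away from the nominal value, which is where the local and global NISS variants differ.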

Fig. 16

NISS method and the LEMCS estimator

Fig. 17

Sampling via the “double loop” approach and the novel uncertainty analysis developed in this work

In Salomon et al. (2021b), the new methodology was demonstrated, among others, on the functional model of the axial compressor presented in Sect. 3 and shown in Fig. 4. The “double loop” approach is conducted with 5 000 samples (failure times) in the inner loop and 1 000 samples (model parameters) in the outer loop, i.e., a total of 5 000 000 samples. For both LEMCS and GEMCS, only one simulation with 100 000 generated samples (failure times for 100 000 different model parameters) was required, while even improving the quality of the results. Correspondingly, only 1/50th of the sample size of the “double loop” approach was required.

8 Conclusions

A decision making process has been developed that allows the identification of optimal trade-offs among numerous resilience-enhancing features and measures for complex capital goods of various types. During the period of the CRC 871, the approach has been applied to models of an axial compressor, of a flow network, of an arbitrary complex system, and of the Berlin metro network. The consideration of monetary and technical constraints in the decision making process is realized. The broad applicability of all developed methods is ensured, i.e., there are no limitations to a specific system type. A reduction in computational effort has been achieved, mainly due to the separation property of the survival signature, i.e., once the system structure has been computed, any possible characterization of the probabilistic part can be evaluated without the need to recompute the structure. The integration of uncertainties in the reliability analysis is enabled, and the sample size is drastically reduced due to the adapted NISS methods requiring only a single stochastic simulation, avoiding the tedious “double loop” simulation traditionally applied.

It could be shown that functional models are a good and effective approach to represent physically complex systems. This approach was further developed to not only consider dependencies in the input parameters but also to include a time dependency of the sensitivities by means of importance indices. Still challenging is the merging of the several developed functional models of subsystems within the overall jet engine into an encompassing representative overall model. This is very computationally expensive due to the costly sensitivity analysis for complex systems with numerous input and output parameters. Further in-depth research on this topic is required to allow the generation of such an extensive model.

The comprehensive and encompassing analysis framework developed in this work, consisting of resilience analysis, consideration of technical and monetary constraints, reliability analysis, and uncertainty analysis, provides decision makers with an additional basis for decision making in MRO processes, enabling sophisticated decisions on an efficient basis. Thereby, resilience applies to regeneration paths not exclusively as a part of the overall life cycle performance of the complex capital good, but also as an important property of the regeneration path itself. Clusters of similar, equally acceptable regeneration paths are identified to which, in the event of a problem, it is possible to switch without great effort, leading to a variety of resilient regeneration paths. Furthermore, strict attention was given to ensuring that all developed phases within the analysis framework are applicable to complex capital goods of any kind and in every field of application, e.g., design processes, optimization processes, etc.

The outlook that goes beyond the scope of this project is the combination of all presented approaches into a single encompassing methodology in order to be able to take even greater advantage of the excellent synergy effects between them.