1 Introduction

The structural integrity and functionality of critical infrastructure, such as oil, gas, water, and sewage pipelines, as well as roads, tunnels, and bridges in the aftermath of an earthquake is decisive for the management of response actions by the civil protection authorities and heavily influences the seismic resilience of communities (Casari and Wilkie 2005; Fragiadakis et al. 2015; Kilanitis and Sextos 2019; Mazumder et al. 2020). A potential failure may result in injuries and human fatalities, environmental pollution, as well as significant direct and indirect economic losses (Somerville 1995; Basöz et al. 1999; Bird and Bommer 2004; Steinberg and Cruz 2004; Nair et al. 2018). Even though strict standards are applied during the design, construction, operation, and maintenance of these critical infrastructures, failures are still occurring. Among the most catastrophic earthquake-induced actions is the fault offset in case of large-magnitude earthquakes affecting the overlying structures. Fault offset is the differential displacement along a fault plane in the earth’s crust, appearing wherever the causative fault rupture propagates up to the ground surface. Any structure subjected to fault offset has to follow the ground displacement by developing excessive deformations. This has been studied for buried pipelines (Girgin and Krausmann 2016), above-ground pipelines (Honegger et al. 2004), pipeline networks (O’Rourke 2010; O’Rourke et al. 2014), tunnels (Roy and Sarkar 2017), and bridges (Anastasopoulos and Gazetas 2007; Yang and Mavroeidis 2018).

The seismic resilience of critical lifelines and infrastructure against tectonic faulting can be secured within the framework of Performance-Based Earthquake Engineering (Cornell and Krawinkler 2000), which requires at first the quantification of the fault displacement hazard at the crossing site. The most appropriate methodology to do so is the Probabilistic Fault Displacement Hazard Analysis [PFDHA (Youngs et al. 2003; Moss and Ross 2011; Petersen et al. 2011; Valentini et al. 2021)]. A comprehensive framework for the performance assessment of buried pipelines at fault crossings has been presented by Melissianos et al. (2017), which was further refined and focused on the fault displacement hazard at lifeline–fault crossings on an engineering basis by Melissianos et al. (2023). PFDHA aims at quantifying the mean annual frequency of exceeding arbitrary fault displacement levels at the lifeline crossing site, considering the dimensions and the seismological properties of the fault along with the location of the crossing lifeline on the fault trace (i.e., the crossing site). However, this is an advanced analysis with complicated probabilistic calculations based on a set of specialized seismological data [see for example the site-specific analysis for the Milun Fault in Taiwan recently published by Gao et al. (2022)] and thus unsuitable for being incorporated “as is” in code provisions.

In response, a code-compatible statistical approximation is developed for estimating the design fault displacement for application across Europe. A large number of PFDHAs was executed considering the pertinent uncertainties within a logic tree framework (Bommer and Scherbaum 2008) to handle the seismological and geometrical properties of faults obtained from the 2020 European Fault-Source Model [EFSM20 (Basili et al. 2022)] that was used for the development of the 2020 European Seismic Hazard Model [ESHM20 (Danciu et al. 2021)]. Further, the PFDHA results were statistically analyzed and a procedure for estimating the fault displacement was developed.

The main outcome of this study is a set of empirically derived equations that establish a link between the fault displacement and key variables, such as the fault seismic productivity, the fault mechanism, the fault length, and the crossing point on the fault trace. These expressions constitute an engineering-oriented and code-compatible methodology that allows the estimation of the design fault displacement for lifelines crossing active tectonic faults. This methodology is a structure-independent and hazard-consistent approach that is applicable by engineers, who are typically not familiar with detailed hazard calculations or with specialized seismological and geophysical data. The proposed methodology has been adopted in prEN 1998-4:2022 (European Committee for Standardisation 2022) as an informative Annex. It is important to emphasize that the methodology described should not be considered a substitute for a comprehensive site-specific assessment of fault displacement hazard, particularly in cases involving high-risk infrastructure or when such an assessment is explicitly mandated by the owner of the infrastructure or regulatory authorities.

2 Proposed methodology

The proposed methodology for estimating the design fault displacement, referred to as the EN1998-4 approach hereinafter, is implemented as follows:

1st step: The fault mechanism, the fault length, and the crossing point are determined for the lifeline-fault crossing at hand. In more detail:

  • The fault mechanism determines the mechanical response of the fault-crossing structure. For example, the deformation of a buried pipeline subjected to normal, reverse, and strike-slip faulting is depicted in Fig. 1. Normal faulting causes pipe bending and elongation, reverse faulting causes pipe bending and shortening, while strike-slip faulting causes pipe bending (O’Rourke and Liu 2012). At this point, it shall be noted that the effect of fault trace uncertainty related to the propagation of rupture through soil until reaching the ground surface (e.g., Anastasopoulos et al. 2007, 2008; Loukidis et al. 2009) is not considered.

  • The fault (subsurface) length (\({L}_{F}\)) can be obtained from a geological map or defined by an appropriate survey.

  • The crossing point (\({X}_{L}\)) stands for the ratio of the distance along the fault trace of the lifeline–fault crossing point to the closest fault-end over the fault trace length as per Fig. 2; naturally \(0<{X}_{L}\le 0.50\). The crossing point itself results from the lifeline route selection procedure.
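The definition of the crossing point can be illustrated with a minimal sketch. The function name and its arguments are hypothetical and introduced here only for illustration; the sketch assumes that the distance of the crossing from one fault end, measured along the trace, is known from the route selection.

```python
def crossing_point_ratio(distance_from_end_km: float, fault_length_km: float) -> float:
    """Return X_L: distance to the closest fault end over the fault trace length."""
    if fault_length_km <= 0:
        raise ValueError("fault length must be positive")
    if not 0 < distance_from_end_km < fault_length_km:
        raise ValueError("the crossing must lie strictly inside the fault trace")
    # The closest fault end governs, so X_L never exceeds 0.50.
    closest_km = min(distance_from_end_km, fault_length_km - distance_from_end_km)
    return closest_km / fault_length_km
```

Measuring from either fault end gives the same ratio, which is why the admissible range is limited to 0.50.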

Fig. 1 Fault mechanisms and corresponding pipe deformation (the right block is the stationary, while the left is the moving one)

Fig. 2 Lifeline–fault crossing plan view

2nd step: The recurrence rate (\({v}_{F}\)) of the fault, representing the earthquake occurrence, is the average annual rate of all earthquakes above a minimum magnitude and is derived either from an available source model or defined by a specialized seismological study. The minimum earthquake magnitude considered is 5.5, assuming that lower magnitudes do not cause enough fault displacement to endanger the lifeline integrity. Alternatively, the recurrence rate can be approximated (\({v}_{F,approx}\)) via the proposed methodology presented in Sect. 5 using the fault length (\({L}_{F}\)) and the reference spectral acceleration at period \({T}_{\beta }=1\, {\text{s}}\) corresponding to a return period of \({T}_{R}=475\) years (as estimated via ESHM20) at the crossing site. The recurrence rate is classified into two categories as per Table 1.

Table 1 Recurrence rate (\({v}_{F}\)) classification

3rd step: The return period (\({T}_{R}\)) of exceeding a selected fault displacement level at the lifeline–fault crossing is estimated as:

$${T}_{R}\left({\Delta }_{F}\right)=\frac{1}{{C}_{F}{v}_{F}{f}_{L}\left({\Delta }_{F},{L}_{F},{X}_{L}\right)}$$
(1)

with \(0.25\, {\text{m}}\le {\Delta }_{F}\le 4.00\, {\text{m}}\), where \({C}_{F}\) is the confidence factor estimated after Eq. (3), \({v}_{F}\) is the recurrence rate obtained from the 2nd step, and \({f}_{L}\left({\Delta }_{F},{L}_{F},{X}_{L}\right)\) is a function that depends on the fault mechanism, fault length, and crossing point; it is estimated for the selected fault displacement after Eq. (2) based on the recurrence rate classification of Table 1.

$${f}_{L}\left({\Delta }_{F},{L}_{F},{X}_{L}\right)=\mathrm{exp}\left[\begin{array}{c}{a}_{1}+{a}_{2}\mathrm{ln}{L}_{F}+{a}_{3}{X}_{L}+{a}_{4}{\left(\mathrm{ln}{L}_{F}\right)}^{2}+{a}_{5}{X}_{L}\mathrm{ln}{L}_{F}+\\ {a}_{6}{{X}_{L}}^{2}+{a}_{7}{\left(\mathrm{ln}{L}_{F}\right)}^{3}+{a}_{8}{{X}_{L}\left(\mathrm{ln}{L}_{F}\right)}^{2}+{a}_{9}{{X}_{L}}^{2}\mathrm{ln}{L}_{F}\end{array}\right]$$
(2)

where \(\mathrm{ln}\left(\bullet \right)\) is the natural logarithm of its argument and the coefficients \({a}_{1}\), \({a}_{2}\), …, \({a}_{9}\) differ per recurrence rate class and \({\Delta }_{F}\) value, as listed in Table A1 for normal fault, in Table A2 for reverse fault, and in Table A3 for strike-slip fault mechanism.

Table 2 Fault displacement deterministic cap (\({\Delta }_{F,det.cap}\))
Table 3 Logic tree formulation for the b-value based on \({v}_{F}\left(M>5.5\right)\) classification

The confidence factor \({C}_{F}\) is estimated as:

$${C}_{F}=\left\{\begin{array}{ll}1.00& \quad \text{for }{v}_{F}\\ \text{Eq. }\text{ (12) }&\quad\text{for }{v}_{F,approx}\end{array}\right.$$
(3)
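As an illustration, Eqs. (1) and (2) can be evaluated with a few lines of code. The following is a minimal sketch, not a reference implementation: the function names are hypothetical, and the nine coefficients must be taken from Tables A1 to A3 for the relevant fault mechanism, recurrence rate class, and \({\Delta }_{F}\) value. No tabulated coefficients are reproduced here.

```python
import math

def f_l(coeffs, fault_length_km, x_l):
    """Eq. (2): exponential of a cubic polynomial surface in ln(L_F) and X_L."""
    a1, a2, a3, a4, a5, a6, a7, a8, a9 = coeffs
    ln_l = math.log(fault_length_km)  # natural logarithm, L_F in km
    exponent = (a1 + a2 * ln_l + a3 * x_l + a4 * ln_l**2 + a5 * x_l * ln_l
                + a6 * x_l**2 + a7 * ln_l**3 + a8 * x_l * ln_l**2
                + a9 * x_l**2 * ln_l)
    return math.exp(exponent)

def return_period(coeffs, v_f, fault_length_km, x_l, c_f=1.0):
    """Eq. (1): T_R = 1 / (C_F * v_F * f_L), in years when v_F is per year."""
    return 1.0 / (c_f * v_f * f_l(coeffs, fault_length_km, x_l))
```

Calling `return_period` once per tabulated \({\Delta }_{F}\) value, each time with the matching coefficient set, yields the discrete return period curve for the crossing.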

The following remarks should be additionally considered:

  1. If there is uncertainty about the crossing point, then \({X}_{L}=0.50\) should be considered as the worst-case scenario because it yields a higher mean annual frequency (MAF) of exceeding any fault displacement (Melissianos et al. 2023).

  2. Design applications typically require the fault displacement that corresponds to a given design return period, which is the inverse of what Eq. (1) provides. In that case, linear interpolation in \(\left[{\Delta }_{F},\mathrm{ln}{T}_{R}\left({\Delta }_{F}\right)\right]\) space may be employed among the values estimated after Eq. (1) as per Fig. 3.

  3. If the obtained fault displacement value is outside the range of Eq. (1), namely \({\Delta }_{F}<0.25\, {\text{m}}\ {\text{or}}\ {\Delta }_{F}>4.00\ {\text{m}}\), then linear extrapolation in the \(\left[{\Delta }_{F},1/\mathrm{ln}{T}_{R}\left({\Delta }_{F}\right)\right]\) space may be employed as a conservative option (Fig. 4). If values lower than 0.10 m are obtained, then it is suggested to consider \({\Delta }_{F}=0.10\ {\text{m}}\) as a minimum design value for safety reasons. If values significantly higher than 4.00 m are estimated, then a more detailed site-specific seismological study should be performed.

  4. If the approximated recurrence rate has been employed (see Sect. 5), then the design fault displacement value should be the minimum between the one obtained via the interpolation of Fig. 3 and the deterministic cap of Table 2. The deterministic cap is roughly the 90th percentile of the empirical fault scaling relations of Leonard (2014) that relate the average fault displacement with the fault length.

  5. Only principal faulting is considered, while any additional differential displacement appearing at sites away from the fault trace due to distributed/secondary faulting (i.e., displacements, shear, fractures, etc. located up to a few kilometers from the principal fault) is neglected.
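The inversion described in the second remark, together with the optional deterministic cap of the fourth remark, can be sketched as follows. The function name, the data layout, and the numerical values in the usage note are hypothetical. The sketch assumes that the return period increases monotonically with the fault displacement, and it deliberately does not implement the extrapolation of the third remark.

```python
import math

def design_displacement(curve, target_tr, cap=None):
    """Interpolate linearly in (delta_F, ln T_R) space.

    curve: iterable of (delta_F in m, T_R in years) pairs obtained from Eq. (1).
    cap:   optional deterministic cap in m (relevant when v_F is approximated).
    """
    points = sorted(curve)
    for (d0, t0), (d1, t1) in zip(points, points[1:]):
        if t0 <= target_tr <= t1:
            # Fractional position of the target between the two bracketing points.
            frac = (math.log(target_tr) - math.log(t0)) / (math.log(t1) - math.log(t0))
            delta = d0 + frac * (d1 - d0)
            break
    else:
        raise ValueError("target return period is outside the tabulated range")
    return delta if cap is None else min(delta, cap)
```

For example, with the placeholder curve `[(0.25, 100.0), (0.50, 1000.0), (1.00, 10000.0)]`, a target return period of 1000 years falls exactly on the second point, so the function returns 0.50 m.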

Fig. 3 Determination of the \(\mathrm{ln}{T}_{R}\left({\Delta }_{F}\right)\) versus \({\Delta }_{F}\) relationship via linear interpolation (\({T}_{R}\) in years)

Fig. 4 Linear extrapolation of the \(1/\mathrm{ln}{T}_{R}\left({\Delta }_{F}\right)\) versus \({\Delta }_{F}\) relationship for fault displacement \({\Delta }_{F}<0.25\ {\text{m}} \, {\text{and}} \, {\Delta }_{F}>4.00\ {\text{m}}\) (\({T}_{R}\) in years)

3 Methodology background

3.1 Fault displacement hazard calculation

The (mostly) rate-independent function \({f}_{L}\left({\Delta }_{F},{L}_{F},{X}_{L}\right)\) of Eq. (2) has been derived from the statistical processing of a large number of PFDHA results that were carried out using the baseline engineering approach developed by Melissianos et al. (2017, 2023). In particular, PFDHA yields the mean annual frequency (MAF) of exceeding a predefined fault displacement (\({\delta }_{F}\)) at the lifeline crossing site, calculated as:

$${\lambda }_{{\Delta }_{F}}\left({\delta }_{F}\right)={v}_{F}PoE$$
(4)

where the probability of exceeding (\(PoE\)) a given fault displacement value is:

$$PoE={\sum }_{i}P\left({\Delta }_{F}>{\delta }_{F}|{m}_{i}\right){P}_{M}\left({m}_{i}\right)$$
(5)

Here, \(P\left({\Delta }_{F}>{\delta }_{F}|{m}_{i}\right)\) is the conditional probability that the fault displacement \({\Delta }_{F}\) will exceed the value \({\delta }_{F}\) given that an earthquake of magnitude \({m}_{i}\) has occurred, and \({P}_{M}\left({m}_{i}\right)\) is the probability of the earthquake magnitude \(M\) being in a bin of \({m}_{i}\pm\Delta m\), provided that \(M\) ranges between a minimum (\({M}_{min}\)) and a maximum (\({M}_{max}\)) value. \({P}_{M}\left({m}_{i}\right)\) is estimated after the Gutenberg–Richter (G-R) bounded recurrence law (Gutenberg and Richter 1944):

$${P}_{M}\left({m}_{i}\right)=P\left(M<{m}_{i}+\Delta m|{M}_{min}\le {m}_{i}\le {M}_{max}\right)-P\left(M<{m}_{i}-\Delta m|{M}_{min}\le {m}_{i}\le {M}_{max}\right)$$

with:

$$P\left(M<{m}_{i}|{M}_{min}\le {m}_{i}\le {M}_{max}\right)=\frac{1-\mathrm{exp}\left[-b\left({m}_{i}-{M}_{min}\right)\right]}{1-\mathrm{exp}\left[-b\left({M}_{max}-{M}_{min}\right)\right]}$$
(6)

where the b-value is the slope of the curve that provides the “expected” future earthquake magnitudes and is a seismological property of the fault.
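The chain of Eqs. (4) to (6) can be sketched as below, taking the bin probability as the difference of the bounded cumulative distribution of Eq. (6) evaluated at the two bin edges. All function names are hypothetical. Two assumptions are made and should be checked against the adopted PFDHA formulation: the exponent parameter `beta` is passed explicitly (for a base-10 b-value as in Eq. (7), the usual conversion is `beta = b * ln(10)`), and the conditional exceedance probability is supplied by the caller, because it comes from empirical displacement models that are not reproduced here.

```python
import math

def gr_cdf(m, m_min, m_max, beta):
    """Eq. (6): bounded Gutenberg-Richter probability P(M < m)."""
    m = min(max(m, m_min), m_max)  # clamp to the admissible magnitude range
    return ((1.0 - math.exp(-beta * (m - m_min)))
            / (1.0 - math.exp(-beta * (m_max - m_min))))

def magnitude_bins(m_min, m_max, n_bins, beta):
    """Return (m_i, P_M(m_i)) pairs for equal bins of half-width dm."""
    dm = (m_max - m_min) / (2 * n_bins)
    bins = []
    for i in range(n_bins):
        m_i = m_min + (2 * i + 1) * dm  # bin centre
        p = gr_cdf(m_i + dm, m_min, m_max, beta) - gr_cdf(m_i - dm, m_min, m_max, beta)
        bins.append((m_i, p))
    return bins

def mean_annual_frequency(v_f, bins, p_exceed_given_m):
    """Eqs. (4)-(5): MAF = v_F * sum_i P(Delta_F > delta_F | m_i) * P_M(m_i)."""
    poe = sum(p_exceed_given_m(m_i) * p for m_i, p in bins)
    return v_f * poe
```

A quick sanity check is that the bin probabilities sum to one, so a conditional exceedance probability equal to one for all magnitudes returns the recurrence rate itself.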

The main input for performing the fault displacement hazard calculations is:

  • fault mechanism and length, which are available to the engineer from a geological map or a seismic source model,

  • the b-value of the G-R law and the maximum earthquake magnitude (\({M}_{max}\)), which are specialized seismological information, being estimated by other specialists and not engineers,

  • and the crossing point, which is defined from the lifeline route selection procedure (e.g., Seel et al. 2014; Hamid-Mosaku et al. 2020).

The fault mechanism, fault length, and the crossing point are direct input parameters for the proposed methodology (Sect. 2), while the b-value of the G-R law and the maximum earthquake magnitude were considered as epistemic uncertainties related to the model parameters and were handled through logic trees (Bommer and Scherbaum 2008).

3.2 Database of active faults

A selection from the EFSM20 database of seismically active faults (Basili et al. 2022) that was created for the development of ESHM20 (Danciu et al. 2021) was exploited through a data mining process. The aim was to identify the main properties of the faults in order to (1) define the range of parameters (fault mechanism, tectonic environment, fault length) for examination and (2) develop the appropriate logic trees for handling the uncertainties on variables. It is noted that for engineering purposes, we excluded from the analysis faults with length \({L}_{F}<10\ {\text{km}}\) and \({L}_{F}>300\ {\text{km}}\), as well as (blind) faults whose uppermost boundary is located deeper than 3 km from the surface.

3.2.1 Tectonic environment and fault mechanism

The faults examined are mapped in Fig. 5, distinguished by tectonic environment between Interplate (INT) and Stable Continental Region (SCR). A statistical analysis of the number of faults per tectonic environment and mechanism, separately, is shown in Fig. 6. The vast majority of faults are INT (94%) compared to SCR ones (6%). Regarding the fault mechanism, nearly half of the faults are normal, while reverse faults are the fewest.

Fig. 5 Map of faults classified per tectonic environment (INT: red, SCR: blue), a selection from the EFSM20 database (Danciu et al. 2021; Basili et al. 2022)

Fig. 6 Tectonic environment and mechanism of examined faults [a selection from the EFSM20 database (Danciu et al. 2021; Basili et al. 2022)]

3.2.2 Gutenberg–Richter law b-value

The G-R law is an integral part of the fault displacement hazard calculation (Kramer 1996) and even though it was developed in the 1940s, it remains a standard tool for estimating the magnitude of future earthquakes (Bommer 2002). The b-value is a seismogenic parameter and is the negative slope of the recurrence curve expressing the average ratio of exponentially distributed small and large magnitude earthquakes (Danciu et al. 2021) and affects the shape of the fault displacement hazard curve at the crossing site (Melissianos et al. 2023). In ESHM20, a single b-value is used for each active tectonic fault, since the sensitivity analysis revealed that the uncertainty of b-value has lower impact on the magnitude-frequency-distributions compared to the fault slip-rates and \({M}_{max}\) uncertainties. The b-values were calculated via a set of complex procedures presented in the documentation of ESHM20, being related to the declustering of catalogues of recorded earthquakes.

The distribution of b-values per tectonic environment is presented in Fig. 7, where it is revealed that the predominant value is \(b=1.00\), while the number of faults with \(b>1.00\) is much lower than those with \(b<1.00\). The same conclusions are drawn regarding the distribution of b-values per fault mechanism (Fig. 8).

Fig. 7 Histograms of G-R b-value per tectonic environment

Fig. 8 Histograms of G-R b-value per fault mechanism

3.2.3 Earthquake recurrence rate

The earthquake recurrence rate of a fault provides the average annual number of events above the minimum earthquake magnitude of engineering significance, \({M}_{min}\). The value \({M}_{min}=5.5\) is adopted on the basis of engineering and scientific judgement, since the resulting fault displacement for lower magnitude values is insignificant. The latter is based on the effect of the conditional probability of slip that is an integral part of PFDHA and represents the probability of the rupture reaching the surface, conditioned only on earthquake magnitude. For all three fault mechanisms, a magnitude \(M>5.50\) event is required to have at least 20% probability for the rupture to even reach the surface (Melissianos et al. 2023).

The rate of events with magnitude higher than \({M}_{min}\) is estimated via the formula:

$${v}_{F}\left(M>{M}_{min}\right)={10}^{a-b{M}_{min}}$$
(7)

where the a-value (representing the total seismic productivity of a given fault) and the b-value (Sect. 3.2.2) can be obtained from a site-specific geological study or (in our case) the EFSM20. The resulting rate for each fault of the database has been estimated and the results are presented in Fig. 9 per tectonic environment. As expected, the rate of SCR faults is much lower compared to the more active INT faults. The distribution of rates per fault mechanism is depicted in Fig. 10, indicating a significant scattering of values.
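Eq. (7) translates directly into code. The sketch below uses a hypothetical function name, and the a- and b-values in the usage note are placeholders, not entries of EFSM20.

```python
def recurrence_rate(a_value: float, b_value: float, m_min: float = 5.5) -> float:
    """Eq. (7): annual rate of events with magnitude above m_min."""
    return 10.0 ** (a_value - b_value * m_min)
```

For instance, placeholder values of a = 3.5 and b = 1.0 give a rate of about 0.01 events per year, i.e., one event of magnitude above 5.5 roughly every 100 years on average.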

Fig. 9 Histograms of earthquake recurrence rate values, \({v}_{F}\left(M>5.5\right)\), of faults per tectonic environment (top row: full range of rate values, bottom row: rates lower than \(0.05\, {\text{year}}^{-1}\))

Fig. 10 Histograms of earthquake recurrence rate values, \({v}_{F}\left(M>5.5\right)\), of faults per mechanism (top row: full range of rate values, bottom row: rates below \(0.05\, {\text{year}}^{-1}\))

3.3 Handling of uncertainties

The epistemic uncertainties related to the model parameters, namely the b-value of the G-R law and the maximum earthquake magnitude, are handled through logic trees. Additionally, provided that the SCR faults are few compared to the INT ones, the tectonic environment is also treated as an epistemic uncertainty due to the involvement of expert judgment and sometimes subjective definition. Every logic tree branch leads to an alternative scenario or, in other words, to a different mean annual frequency (MAF) of exceeding a predefined fault displacement.

3.3.1 Logic tree of Gutenberg–Richter b-value

The b-value is related to the earthquake recurrence rate \({v}_{F}\), as depicted in Fig. 11 for the active faults of the database. Therein, the earthquake recurrence rate is plotted on the horizontal axis and the b-value on the vertical one. The two recurrence rate classes (Table 1) are also presented, with their boundary defined empirically based on a direct search for optimal classification.

Fig. 11 b-value versus \({v}_{F}\left(M>5.5\right)\) and definition of recurrence rate classes

A further breakdown of Fig. 11 per tectonic environment and fault mechanism is provided in Fig. 12. Regarding the tectonic environment, SCR faults could be found only in the low recurrence rate class, as should be expected. Also, significant scattering of the values with respect to the fault mechanism was detected and consequently, it was decided to handle the b-values in each recurrence rate class independently per fault mechanism (Table 3). In both classes, distinct patterns of the relationship \({v}_{F}\left(M>5.5\right)\sim f\left(b\right)\) can be observed given the fault mechanism (Fig. 13).

Fig. 12 b-value versus \({v}_{F}\left(M>5.5\right)\) per tectonic environment and fault mechanism [the dashed vertical line separates the low from high recurrence rate class]

Fig. 13 Gutenberg–Richter b-value from ESHM20 versus \({v}_{F}\left(M>5.5\right)\) estimated after Eq. (7) using ESHM20 values; results are shown here per fault mechanism in both recurrence rate classes

The logic tree branches for the b-value were distinguished using the k-means clustering method (Mackay 2005) to near-optimally partition the b-values into discrete sets that minimize the within-cluster sum of squares. The clustering was applied separately for each recurrence rate class as per Table 4. Clustering for INT and SCR faults per fault mechanism for the low recurrence rate class is illustrated in Figs. 14 and 15, respectively. As expected, several clusters were required for INT faults, contrary to SCR faults. The appropriate clustering of b-values per fault mechanism for the high recurrence rate class is illustrated in Fig. 16. The limited number of faults with a high recurrence rate in the cases of normal and reverse faults drove the k-means algorithm to create a single cluster. Finally, all clusters of b-values (centroid and weight factor) are summarized in Table 4.
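The clustering step can be illustrated with a minimal one-dimensional k-means (Lloyd's algorithm) written in plain Python. This is a sketch under stated assumptions, not the procedure actually used to produce Table 4: the function name is hypothetical, the quantile-based initialization is chosen here only to make the result deterministic, and the weight factor of a branch is assumed to be the fraction of faults assigned to the corresponding cluster.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster scalar b-values; return (centroids, weights)."""
    data = sorted(values)
    # Deterministic start: centroids spread over the quantiles of the data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            # Assign each value to its nearest centroid.
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update each centroid to the mean of its members (keep it if empty).
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    weights = [len(c) / len(data) for c in clusters]
    return centroids, weights
```

Each centroid would then act as the b-value of one logic tree branch, with the branch weights summing to one by construction.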

Table 4 Clusters of b-value per recurrence rate [\({v}_{F}\left(M>5.5\right)\)] class, tectonic environment, and fault mechanism
Fig. 14 Clustering of b-values with respect to \({v}_{F}\left(M>5.5\right)\): INT faults at low recurrence rate class

Fig. 15 Clustering of b-values with respect to \({v}_{F}\left(M>5.5\right)\): SCR faults at low recurrence rate class

Fig. 16 Clustering of b-values with respect to \({v}_{F}\left(M>5.5\right)\): INT faults at high recurrence rate class

3.3.2 Logic tree of maximum earthquake magnitude

The G-R law for estimating the probability of future earthquake magnitudes is bounded between a minimum value of engineering significance and a “physics-based” upper bound value (\({M}_{max}\)). The latter is a seismogenic parameter that is usually unknown to the engineer/designer, being highly uncertain and typically related to the dimensions of the fault, namely length, width, area (Wells and Coppersmith 1994; Wang 2018). One can explore the \({M}_{max}\) values of the fault database, similarly to what was done for the b-values. Still, this could lead to multiple clusters of \({M}_{max}\) given fault length, resulting in a substantial increase in the size of the logic tree and the needed calculations. Instead, it was decided to take advantage of already available models, i.e., obtain the \({M}_{max}\) values from the fault scaling relations of Leonard (2014) based on the fault length. The empirical expressions of Leonard (2014) relating earthquake magnitude and fault length (mean value and standard deviation) are listed in Table 5. It is noted that the scaling relations of Leonard (2014) were utilized within EFSM20 and ESHM20.

Table 5 Empirical fault scaling relation \({M}_{max}=a+\beta {\mathrm{log}}_{10}{L}_{F}\) after Leonard (2014)

The uncertainty on the estimation of \({M}_{max}\) was handled through a logic tree formulation per tectonic environment and style-of-faulting. In each case, three branches were considered as per Table 6, where \({M}_{max,A}\) is the average value, while \({M}_{max,L}\), \({M}_{max,U}\) are the average minus/plus one standard deviation, respectively. The standard deviation is equal to \(\left({a}_{max}-{a}_{min}\right)/2\), where \({a}_{max}\) and \({a}_{min}\) are given in Table 5. The weight factors for each branch are also provided in Table 6, where \({w}_{m}\) was computed for each case so as to obtain the same standard deviation provided by Leonard (2014).

Table 6 Logic tree formulation for \({M}_{max}\)
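The three \({M}_{max}\) branches can be computed as in the sketch below. The function name is hypothetical, and the coefficients in the usage note are placeholders rather than the values of Leonard (2014) listed in Table 5. The branch weights of Table 6 are not reproduced and are left to the caller.

```python
import math

def mmax_branches(a_mean, a_min, a_max, beta, fault_length_km):
    """Return the lower, average, and upper M_max branch values.

    Uses M_max = a + beta * log10(L_F) with a standard deviation
    equal to (a_max - a_min) / 2, as described in the text.
    """
    sigma = (a_max - a_min) / 2.0
    m_avg = a_mean + beta * math.log10(fault_length_km)
    return {"L": m_avg - sigma, "A": m_avg, "U": m_avg + sigma}
```

With the placeholder inputs `a_mean=4.0`, `a_min=3.5`, `a_max=4.5`, `beta=1.5`, and a fault length of 100 km, the average branch is 7.0 and the lower and upper branches are 6.5 and 7.5.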

At this point it should be noted that the fault slip rate, which can be instrumentally monitored, is considered constant for a particular fault. Therefore, when the maximum earthquake magnitude is modified within a logic tree formulation as per Table 6, the number of events, as expressed via the recurrence rate \({v}_{F}\), has to be adjusted in order to be consistent with the constant slip rate. This is because, for example, an event of high magnitude leads to the release of more energy than a low magnitude one. Thus, a high \({M}_{max}\) that would allow such high-magnitude events to occur should be combined with an overall reduced rate of events in order for the energy balance to be stable. In other words, the energy released due to earthquakes should balance the energy introduced due to slip. This may be achieved by pairing a high estimate for \({M}_{max}\) with a low a-value for the G-R law and vice versa. Theoretically, an appropriate formula, such as the one proposed by Youngs and Coppersmith (1986), should be used to relate the earthquake magnitude, the slip rate, and the recurrence rate among other parameters. However, the proposed methodology is generic and such a formula cannot be practically implemented because the developed logic tree for \({M}_{max}\) (Table 6) has to deal with numerous different faults, rather than be optimized for a single one.

To overcome this hurdle and be consistent with the constant slip rate, a generic logic tree is adopted, following the actual trends but correcting via the mean value of many faults. Specifically, the hazard curves calculated with \({M}_{max,L}\) (low maximum magnitude value) were weighted by the \(\mathrm{mean}\left({a}_{GR,ML} \right)/\mathrm{mean}\left({a}_{GR,MA}\right)\) ratio, while the hazard curves calculated with \({M}_{max,U}\) (high maximum magnitude value) were multiplied with \(\mathrm{mean}\left({a}_{GR,MU} \right)/\mathrm{mean}\left({a}_{GR,MA}\right)\), where the corresponding means were taken over all faults within a “bin” of given tectonic environment, fault mechanism, and recurrence rate class. For the faults falling within each such bin, \(\mathrm{mean}\left({a}_{GR,ML}\right)\) is the mean of pertinent a-values of the G-R law from the fault database that correspond to the low magnitude value, \(\mathrm{mean}\left({a}_{GR,MA}\right)\) is the mean of a-values that correspond to the average magnitude value, and \(\mathrm{mean}\left({a}_{GR,MU}\right)\) is the mean of a-values that correspond to the high magnitude value. The final ratios are tabulated in Table 7.

Table 7 Ratios of mean a-values of the G-R law to account for the constant slip rate, as calculated per given recurrence rate class, tectonic environment, and fault mechanism
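The correction described above can be sketched as follows. The function name, the dictionary layout, and all numbers in the usage note are hypothetical. The sketch assumes that the branch hazard curves are combined as a weighted mean, which is the usual way of collapsing a logic tree but is not spelled out in the text, and that the ratios of Table 7 simply scale the curves of the lower and upper branches.

```python
def combine_mmax_branches(maf_by_branch, weights, a_ratio):
    """Weighted mean of M_max branch hazard curves with a-value scaling.

    maf_by_branch: {"L": [...], "A": [...], "U": [...]}, MAFs of exceedance
                   evaluated at the same fault displacement levels.
    weights:       {"L": w, "A": w, "U": w}, branch weights summing to one.
    a_ratio:       {"L": r_L, "U": r_U}, ratios of mean a-values (cf. Table 7).
    """
    scale = {"L": a_ratio["L"], "A": 1.0, "U": a_ratio["U"]}
    n_levels = len(maf_by_branch["A"])
    return [
        sum(weights[k] * scale[k] * maf_by_branch[k][i] for k in ("L", "A", "U"))
        for i in range(n_levels)
    ]
```

The average branch is left unscaled because both ratios are normalized by the mean a-value of that branch.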

3.3.3 Logic tree of tectonic environment

The tectonic environment of the fault should normally be a known property. Yet, it will most probably be unknown to a practitioner and code user. Thus, it becomes a source of epistemic uncertainty to be handled within a logic tree formulation for the low recurrence rate class, which is the only class where both INT and SCR faults coexist. In the developed single-level logic tree per fault mechanism, the weight factor of each one of the two branches, namely INT and SCR faults, equals the percentage of faults in each tectonic environment, as illustrated in Fig. 17.

Fig. 17 Percentage of faults with respect to the tectonic environment (INT versus SCR) per mechanism for the low recurrence rate class

3.3.4 Combined 2/3-level logic tree

In total, six logic trees are developed per recurrence rate class and fault mechanism, namely three logic trees (normal, reverse, and strike-slip fault mechanism) in the low recurrence rate class and three logic trees (normal, reverse, and strike-slip fault mechanism) in the high recurrence rate class. The combined 2/3-level logic trees are presented indicatively for the normal fault mechanism at the low recurrence rate class in Fig. 18 and at the high recurrence rate class in Fig. 19. It is recalled that only INT faults can be found in the high recurrence rate class (see Sect. 3.3.1 and Tables 3 and 4).

Fig. 18 Logic tree for normal fault at low recurrence rate class

Fig. 19 Logic tree for normal fault at high recurrence rate class

3.4 PFDHA results

PFDHAs were carried out following the analysis scheme presented in Table 8 for a range of fault lengths \(10\,{\text{km}}\le {L}_{F}\le 300\,{\text{km}}\) and for crossing points \(0.10\le {X}_{L}\le 0.50\), the latter with a step of 0.05. The probability of exceedance (\(PoE\)) after Eq. (5) was obtained from each analysis.

Table 8 Analysis scheme for each PFDHA

Polynomial surface fitting was carried out on the results for predefined fault displacement values of Table A1 through Table A3 with respect to fault length and crossing point. The statistical model of Eq. (2) was developed separately for each recurrence rate class and fault mechanism. Indicative results of the fitted surfaces for normal faults at the low and high recurrence rate classes are depicted in Figs. 20 and 21, respectively. The natural logarithm of the fault length (\(\mathrm{ln}{L}_{F}\)) with \(10\,\mathrm{km}\le {L}_{F}\le 300\,\mathrm{km}\) (see Sect. 3.2) and the crossing point (\({X}_{L}\)) with \(0\le {X}_{L}\le 0.50\) are shown in the two horizontal axes, while the natural logarithm of the probability of exceedance (\(\mathrm{ln}PoE\)) obtained after Eq. (5) is shown in the vertical axis. Essentially, the function \({f}_{L}\left({\Delta }_{F},{L}_{F},{X}_{L}\right)\) of Eq. (2) is \({f}_{L}\left({\Delta }_{F},{L}_{F},{X}_{L}\right)=PoE\). Note that although a sum-of-square-errors criterion was minimized for optimal fitting, this is not a regression, where overfitting and bias-variance considerations are important; instead, it is a curve-fitting operation, where there is only one valid outcome for every combination of inputs and ease of use is our only limitation in terms of the fitted form.

Fig. 20 Surface fitting for normal fault, low recurrence rate class, and various fault displacements (fault length \({L}_{F}\) in km)

Fig. 21 Surface fitting for normal fault, high recurrence rate class, and various fault displacements (fault length \({L}_{F}\) in km)

4 Methodology evaluation and application

4.1 Evaluation of results

The proposed methodology (abbreviated as the EN1998-4 approach) is applied to an indicative set of faults from the fault database (see Sect. 3.2), featuring different dimensions and seismological properties. The INT faults under examination are listed in Table 9 and the SCR faults in Table 10. The EN1998-4 approach is evaluated by comparing the obtained return period after Eq. (1) for the predefined fault displacement values of Table A1 through Table A3 to the one obtained from a full PFDHA after Eq. (4). It is noted that the maximum earthquake magnitude considered in a fault displacement hazard analysis affects significantly the resulting MAFs (Melissianos et al. 2023) or equivalently the return period. Thus, it is important to examine this effect when comparing the EN1998-4 approach with a full PFDHA. For the latter case, five values of \({M}_{max}\), obtained from the fault database, were considered for each fault: \({M}_{0.02}\) is the 2% value, \({M}_{0.05}\) is the 5% value, \({M}_{avg}\) is the average value, \({M}_{0.95}\) is the 95% value, and \({M}_{0.98}\) is the 98% value.

Table 9 Interplate (INT) faults under examination
Table 10 Stable continental region (SCR) faults under examination

The results for the INT faults are presented in Fig. 22 and for the SCR faults in Fig. 23. It is observed that the EN1998-4 approach is conservative in general, leading to lower return periods than a full PFDHA. It is recalled that the maximum earthquake magnitude considered in the development of the EN1998-4 approach was calculated using the empirical fault scaling relations of Leonard (2014); the mean estimated magnitude is shown in Table 9 and Table 10 as \({M}_{L2014}\) for comparison. Where the \({M}_{L2014}\) value is close to the lower percentiles of the magnitude values from the database, the return periods obtained from the EN1998-4 approach are considerably higher than those obtained from PFDHA, e.g., see the ITCF03K, GECF00F, PTCF00Y, and ESCF02C faults. These form the bulk of the rare cases where the EN1998-4 approach would end up being unconservative.

Fig. 22 INT faults: comparison of return periods for predefined fault displacement values between the EN1998-4 approach and PFDHA (5 different maximum earthquake magnitude values considered)

Fig. 23 SCR faults: comparison of return periods for predefined fault displacement values between the EN1998-4 approach and PFDHA (5 different maximum earthquake magnitude values considered)

4.2 Case studies

The fault displacement using the EN1998-4 approach was calculated for design return periods of 2500 years (\({\Delta }_{F,2500}\)) and 5000 years (\({\Delta }_{F,5000}\)) for a set of indicative faults in Europe (Table 11) that are located close to industrial areas, large cities, and important infrastructure. It is noted that the 2500-year and 5000-year return periods correspond to the Near Collapse limit state for consequence classes CC3-a and CC3-b, respectively, as dictated by prEN 1998-4:2022 (European Committee for Standardisation 2022). The crossing was assumed to be located at the middle of the fault (\({X}_{L}=0.50\)). Additionally, the median estimates of fault displacement by Leonard (2014) after Table 12 are presented for comparison. The aim is to showcase the difference in the obtained fault displacement values between the hazard-consistent EN1998-4 approach and a “seismicity-agnostic” deterministic one, where only the fault dimensions are taken into account. Note that the latter approach disregards the fault seismicity and the magnitude that would correspond to a given return period for each fault (Davis 2008), representing an engineering-level approximation of low fidelity.

Table 11 Case study faults
Table 12 Empirical estimates of \({\Delta }_{L2014}\sim f\left({L}_{F}\right)\) after Leonard (2014)

The case study faults are examined by country in Fig. 24 through Fig. 26.

Pyrenees: Three indicative faults in the Pyrenees at the France–Spain border were examined. It is observed that due to the significantly low recurrence rate (lower than 0.0005 events on average per year with magnitude \(M\ge 5.5\)), the resulting displacement values for both return periods are set equal to the minimum, namely \({\Delta }_{F}=0.10\,{\text{m}}\) (Fig. 24).

Fig. 24 Fault displacements obtained from the EN1998-4 approach for return periods of 2500 years (\({\Delta }_{F,2500}\)) and 5000 years (\({\Delta }_{F,5000}\)), compared against the “seismicity-agnostic” estimate (\({\Delta }_{L2014}\)): Pyrenees, France, and Germany

France: The tectonic environment in the northwest part of France is SCR. The recurrence rate is quite low and, consequently, the resulting fault displacements are set equal to the minimum value (\({\Delta }_{F}=0.10\,{\text{m}}\)) [Fig. 24].

Germany: Three normal faults were selected as a case study in Germany, one in the greater area of Aachen and the other two around Frankfurt. In all cases, the resulting fault displacements from the EN1998-4 approach are equal to the minimum value (\({\Delta }_{F}=0.10\, {\text{m}}\)), while the values obtained from the empirical fault scaling relations are high, especially for the very long DRCF000 fault (\({L}_{F}=165.70\,{\mathrm{km}}\)) [Fig. 24].

Austria: INT strike-slip and normal faults are located around Vienna. The evaluation of the analysis results indicates that the recurrence rate is the critical parameter driving the resulting fault displacements (Fig. 25).

Fig. 25 Fault displacements from the EN1998-4 approach for return periods of 2500 years (\({\Delta }_{F,2500}\)) and 5000 years (\({\Delta }_{F,5000}\)), compared against the “seismicity-agnostic” estimate (\({\Delta }_{L2014}\)): Austria, Portugal, and Slovenia

Portugal: In the central and north parts of Portugal there are SCR faults with all three mechanisms. They have in general low recurrence rates, leading to low fault displacement values, while the \({\Delta }_{L2014}\) values are considerably high due to the large length (Fig. 25).

Slovenia: Numerous faults are located in the northwest part of the Balkan Peninsula in Slovenia. Four indicative INT faults were selected and examined. One should notice the SICF004 and SICF00J strike-slip faults with a higher recurrence rate, compared to the others. The resulting fault displacement values for these faults are roughly equal to \({\Delta }_{F}=0.50\, {\text{m}}\) (Fig. 25).

Bulgaria: INT normal faults with various lengths and recurrence rates have been selected in the northeast part of Bulgaria and around the capital Sofia (Fig. 26).

Fig. 26 Fault displacements obtained from the EN1998-4 approach for return periods of 2500 years (\({\Delta }_{F,2500}\)) and 5000 years (\({\Delta }_{F,5000}\)), compared against the “seismicity-agnostic” estimate (\({\Delta }_{L2014}\)): Bulgaria, Italy, Greece, and Turkey

Italy: In the industrial area of Calabria in Italy (south part) there are INT faults with considerable recurrence rates and consequently very high fault displacements, as obtained from the EN1998-4 approach (Fig. 26).

Greece: In the Attica region and in particular north, west, northwest of Athens, there are INT normal faults with relatively short length but high recurrence rates. The obtained fault displacement values are very high, indicating a non-negligible threat for potential crossing lifelines (Fig. 26).

Turkey: The North Anatolian fault system in Turkey is a well-known and extensively studied seismically active area. Three indicative normal and strike-slip faults located south of the Marmara Sea were selected and studied. Regardless of the length, the considerably high recurrence rates lead to very high design values for the fault displacement for both return periods. In contrast, the displacement values derived from the empirical fault scaling relations are low (Fig. 26).

In all cases, the main observation is that the recurrence rate is a dominant parameter that clearly differentiates the resulting displacements for faults of similar length. On the other hand, the “seismicity-agnostic” deterministic approach that neglects the rate and only focuses on the dimensions and mechanism of the fault is bound to be overly conservative in low-recurrence-rate cases (e.g., cases examined in France and Portugal), while unconservative in cases of high recurrence rates (e.g., cases examined in Greece and Turkey).

5 Recurrence rates approximation

The earthquake recurrence rate of an active fault is a critical aspect of the hazard calculations and, in fact, is an external multiplier in Eq. (4), which is typically estimated by earth science experts, i.e., geologists, geophysicists, seismologists, etc. In the absence of such information, the correlation between the recurrence rate and the resulting ground shaking level in the vicinity of the fault can be used. It is preferable to employ engineering-level quantities that are already available in the relevant design code to accomplish this. Herein, we employ the fault length and a seismic design parameter, i.e., \({S}_{\beta ,475}\) estimated at the location of the fault, which is the reference spectral acceleration at period \({T}_{\beta }=1\,{\text{s}}\) corresponding to a return period of \({T}_{R}=475\) years. This is considered superior to the short period reference acceleration, \({S}_{\alpha ,475}\), as longer periods are better indicators of large magnitude seismicity that is of interest for fault displacements. Seismic hazard maps of the mean and median \({S}_{\beta ,475}\) values are provided in EN1998-1-1:2021 (European Committee for Standardisation 2021) based on the ESHM20 (Danciu et al. 2021). It is acknowledged that even on top of a fault, \({S}_{\beta ,475}\) values may incorporate contributions from all nearby faults and will be sensitive to ground motion models. Still, it can be conservatively assumed that the underlying fault under consideration contributes the most, indicating a strong positive correlation of \({S}_{\beta ,475}\) with \({v}_{F}\). \({S}_{\beta ,475}\) values on each fault trace of the database were obtained through appropriate interpolation on the \({S}_{\beta ,475}\) seismic hazard maps. To avoid any over-representation of locations, multiple \({S}_{\beta ,475}\) values were computed along each fault trace, and their average was taken as representative of the entire fault. 
Such averaged values are derived for both the mean and the median \({S}_{\beta ,475}\), and plotted versus \({v}_{F}\) in Figs. 27 and 28, respectively. No clear pattern emerges per tectonic environment or fault mechanism in either case. It was thus decided to build a regression model by considering all faults together.
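The trace-averaging step described above can be sketched as follows. This is only an illustrative outline, not the procedure actually used for the database: the equal arc-length resampling, the function names `resample_trace` and `trace_average`, and the toy hazard field `toy_field` are all assumptions introduced here. In practice the field would be an interpolator built on the \({S}_{\beta ,475}\) hazard map.

```python
import numpy as np

def resample_trace(xy, n_points):
    """Return n_points equally spaced by arc length along a polyline.

    Assumption: equal arc-length spacing is one way to avoid giving extra
    weight to densely digitized parts of a fault trace.
    """
    xy = np.asarray(xy, dtype=float)
    seg = np.hypot(*np.diff(xy, axis=0).T)        # segment lengths
    s = np.concatenate(([0.0], np.cumsum(seg)))   # cumulative arc length
    s_new = np.linspace(0.0, s[-1], n_points)
    x = np.interp(s_new, s, xy[:, 0])
    y = np.interp(s_new, s, xy[:, 1])
    return np.column_stack((x, y))

def trace_average(field, xy, n_points=50):
    """Average a scalar field(x, y) over equally spaced points on the trace."""
    pts = resample_trace(xy, n_points)
    return float(np.mean(field(pts[:, 0], pts[:, 1])))

def toy_field(x, y):
    """Hypothetical stand-in for an interpolated hazard map (values in g)."""
    return 0.10 + 0.02 * x

# Hypothetical trace with unevenly spaced nodes (coordinates in arbitrary units)
trace = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
s_avg = trace_average(toy_field, trace, n_points=401)

# Naive average over the digitized nodes only, for contrast
nodes = np.asarray(trace)
s_nodes = float(np.mean(toy_field(nodes[:, 0], nodes[:, 1])))
print(f"arc-length average: {s_avg:.4f} g, node average: {s_nodes:.4f} g")
```

In this toy case the two averages differ because the long second segment is represented by a single node pair, which is the kind of location over-representation that the averaging along the trace is meant to avoid.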

Fig. 27 Earthquake recurrence rate from the database, \({v}_{F}(M>5.5)\) in \({\text{year}}^{-1}\), versus mean \({S}_{\beta ,475}\) per tectonic environment and fault mechanism

Fig. 28 Earthquake recurrence rate, \({v}_{F}(M>5.5)\) in \({\text{year}}^{-1}\), versus median \({S}_{\beta ,475}\) per tectonic environment and fault mechanism

Equation (8) is the fitted polynomial expression of the model (Fig. 29) for the approximated recurrence rate (\({v}_{F,approx}\)). The model metrics are \({R}^{2}=0.7333\) and standard error \({\sigma }_{\varepsilon ,mean}=0.7539\) when considering the mean acceleration \({S}_{\beta ,475}\), versus \({R}^{2}=0.7097\) and \({\sigma }_{\varepsilon ,50}=0.7867\) when considering the median acceleration \({S}_{\beta ,475}\), indicating a fair but imperfect fit.

Fig. 29 Surface fitting for \({v}_{F}\) versus \({S}_{\beta ,475}\) and fault length (\({v}_{F}\) in \({\text{year}}^{-1}\), \({S}_{\beta ,475}\) in \({\text{g}}\), \({L}_{F}\) in km)

$$\mathrm{ln}{v}_{F,approx}={p}_{1}+{p}_{2}{S}_{\beta ,475}+{p}_{3}{{S}_{\beta ,475}}^{2}+{p}_{4}{S}_{\beta ,475}\mathrm{ln}{L}_{F}+{p}_{5}{\left(\mathrm{ln}{L}_{F}\right)}^{2}+{p}_{6}{{S}_{\beta ,475}}^{3}+{p}_{7}{S}_{\beta ,475}{\left(\mathrm{ln}{L}_{F}\right)}^{2}$$
(8)
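As a rough illustration of how a model with the functional form of Eq. (8) could be fitted, the sketch below builds the seven-column design matrix and solves an ordinary least-squares problem with NumPy. It is not the fitting procedure used in the paper: the data are synthetic, the coefficient vector `p_true` is hypothetical, and the degrees-of-freedom correction in the standard error is an assumption made here.

```python
import numpy as np

def design_matrix(s_beta, l_f):
    """Columns follow the seven terms of Eq. (8), in the order p1..p7."""
    s = np.asarray(s_beta, dtype=float)
    ln_l = np.log(np.asarray(l_f, dtype=float))
    return np.column_stack((
        np.ones_like(s),   # p1
        s,                 # p2 * S
        s**2,              # p3 * S^2
        s * ln_l,          # p4 * S * ln(L_F)
        ln_l**2,           # p5 * (ln L_F)^2
        s**3,              # p6 * S^3
        s * ln_l**2,       # p7 * S * (ln L_F)^2
    ))

def fit_eq8(s_beta, l_f, v_f):
    """Least-squares fit of ln(v_F); returns coefficients, R^2, standard error."""
    X = design_matrix(s_beta, l_f)
    y = np.log(np.asarray(v_f, dtype=float))
    p, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ p
    r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean())**2)
    # Assumption: standard error with n - 7 degrees of freedom
    sigma = np.sqrt(np.sum(resid**2) / (len(y) - X.shape[1]))
    return p, r2, sigma

# Synthetic demonstration data (NOT the fault database)
rng = np.random.default_rng(0)
s_beta = rng.uniform(0.02, 0.60, 300)    # S_beta,475 in g
l_f = rng.uniform(10.0, 300.0, 300)      # fault length in km
p_true = np.array([-9.0, 20.0, -30.0, 1.0, 0.05, 15.0, -0.1])  # hypothetical
ln_v = design_matrix(s_beta, l_f) @ p_true + rng.normal(0.0, 0.75, 300)

p_hat, r2, sigma = fit_eq8(s_beta, l_f, np.exp(ln_v))
print(f"R^2 = {r2:.3f}, standard error = {sigma:.3f}")
```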

The model of Eq. (8) can be rewritten in a more straightforward form:

$${v}_{F,approx}=\mathrm{exp}\left[\begin{array}{c}{p}_{1}+{p}_{2}{S}_{\beta ,475}+{p}_{3}{{S}_{\beta ,475}}^{2}+{p}_{4}{S}_{\beta ,475}\mathrm{ln}{L}_{F}+\\ {p}_{5}{\left(\mathrm{ln}{L}_{F}\right)}^{2}+{p}_{6}{{S}_{\beta ,475}}^{3}+{p}_{7}{S}_{\beta ,475}{\left(\mathrm{ln}{L}_{F}\right)}^{2}\end{array}\right]$$
(9)

In Eqs. (8) and (9), \(\mathrm{ln}\left(\bullet \right)\) is the natural logarithm of its argument and the coefficients \({p}_{1}\), \({p}_{2}\),…, \({p}_{7}\) are listed in Table 13 for \({L}_{F}\) in km and \({S}_{\beta ,475}\) in units of g.

Table 13 Coefficients for estimating the approximated recurrence rate (\({v}_{F,approx}\)) after Eq. (9), derived for \({L}_{F}\) in km and \({S}_{\beta ,475}\) in g
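For readers who wish to script Eq. (9), a minimal sketch is given below. The function name `recurrence_rate_approx` and the coefficient tuple `P_DEMO` are hypothetical; `P_DEMO` holds placeholder numbers for demonstration only and must be replaced by the values of Table 13 before any practical use.

```python
import math

def recurrence_rate_approx(s_beta_475, l_f_km, p):
    """Approximated recurrence rate v_F,approx (1/year) after Eq. (9).

    s_beta_475 : reference spectral acceleration at 1 s, in g
    l_f_km     : fault length in km
    p          : sequence (p1, ..., p7) of model coefficients
    """
    if len(p) != 7:
        raise ValueError("exactly seven coefficients p1..p7 are required")
    if s_beta_475 < 0.0 or l_f_km <= 0.0:
        raise ValueError("acceleration must be >= 0 and fault length > 0")
    p1, p2, p3, p4, p5, p6, p7 = p
    s = s_beta_475
    ln_l = math.log(l_f_km)
    ln_v = (p1 + p2 * s + p3 * s**2 + p4 * s * ln_l
            + p5 * ln_l**2 + p6 * s**3 + p7 * s * ln_l**2)
    return math.exp(ln_v)

# Placeholder coefficients, NOT the Table 13 values
P_DEMO = (-8.0, 10.0, -5.0, 0.5, 0.05, 1.0, -0.02)

v_demo = recurrence_rate_approx(0.20, 50.0, P_DEMO)
print(f"v_F,approx = {v_demo:.3e} events/year (placeholder coefficients)")
```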

The regression residuals are estimated as

$$\varepsilon =\frac{\mathrm{ln}{v}_{F}-\mathrm{ln}{v}_{F,approx}}{{\sigma }_{\varepsilon }}$$
(10)

Negative residuals (\(\varepsilon <0\)) reflect an overestimation of the recurrence rate, which is considered to be acceptable conservatism within the framework of a design code. Conversely, an underestimation (\(\varepsilon >0\)) requires some treatment. It is noted here that the aim is not to build an unbiased probabilistic model, but rather a conservative one, suitable for preliminary design.

The regression residuals (\(\varepsilon\)) versus the actual recurrence rate (\({v}_{F}\)) are illustrated in Fig. 30 along with the 5%, 16%, 50%, 84%, and 95% running quantiles (Haver and Winterstein 2009), employing a symmetric-neighbor window length of 20% of the sample. This is the traditional approach of inspecting a regression model, and it points to a less-than-optimal fit without homoscedastic residuals, indicating unconservative estimates at high actual rates. However, when looking to determine an appropriate confidence factor for correcting (conservatively biasing) the regression model output, the actual rates are not useful; they are not available to the user. Only the approximated rate, \({v}_{F,approx}\), can be employed.
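The normalized residuals of Eq. (10) and a running-quantile inspection of the kind described above can be sketched as follows. This is an illustrative reading rather than the exact scheme of Haver and Winterstein (2009): the way the symmetric window is truncated at the ends of the sorted sample, the function names, and the synthetic data are assumptions introduced here.

```python
import numpy as np

def normalized_residuals(v_actual, v_approx, sigma):
    """Eq. (10): (ln v_F - ln v_F,approx) / sigma_eps."""
    v_actual = np.asarray(v_actual, dtype=float)
    v_approx = np.asarray(v_approx, dtype=float)
    return (np.log(v_actual) - np.log(v_approx)) / sigma

def running_quantiles(x, eps, probs=(0.05, 0.16, 0.50, 0.84, 0.95),
                      window_frac=0.20):
    """Quantiles of eps within a symmetric-neighbor window sliding along x.

    The window spans about window_frac of the sample, centered on each point
    after sorting by x. Assumption: the window is simply truncated near the
    ends of the sample.
    """
    x = np.asarray(x, dtype=float)
    eps = np.asarray(eps, dtype=float)
    order = np.argsort(x)
    xs, es = x[order], eps[order]
    n = len(xs)
    half = max(1, int(round(window_frac * n / 2)))
    q = np.empty((n, len(probs)))
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        q[i] = np.quantile(es[lo:hi], probs)
    return xs, q

# Synthetic demonstration (NOT the fault database)
rng = np.random.default_rng(1)
ln_v_approx = rng.uniform(-9.0, -1.0, 200)
ln_v_actual = ln_v_approx + rng.normal(0.0, 0.75, 200)
eps_demo = normalized_residuals(np.exp(ln_v_actual), np.exp(ln_v_approx), 0.75)
x_sorted, q_demo = running_quantiles(ln_v_approx, eps_demo)
print(q_demo[len(q_demo) // 2])  # quantiles near the middle of the sample
```

Passing the approximated rate as `x`, as done here, corresponds to the user-side view in which only \({v}_{F,approx}\) is available; passing the actual rate instead gives the traditional residual plot.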

Fig. 30 Regression residuals (\(\varepsilon\)) versus the actual recurrence rate, \({v}_{F}\) (in \({\text{year}}^{-1}\))

Figure 31 offers this view, showing the residuals against \({v}_{F,approx}\). Attempting to envelop the underestimation area, a multiplicative confidence factor \({C}_{F}\) is employed to update/increase \({v}_{F,approx}\), introducing the needed conservative bias:

Fig. 31 Fitting residuals (\(\varepsilon\)) of the statistical model versus the approximated recurrence rate, \({v}_{F,approx}\) (in \({\text{year}}^{-1}\))

$${v}_{F,approx,u}={{C}_{F}v}_{F,approx}$$
(11)

Based on Fig. 31, a constant confidence factor \({C}_{F}>1.00\) is considered for \(\mathrm{ln}{v}_{F,approx}\le -3\) with a linear (in log–log space) ramp-down to 1.00 within \(-3\le \mathrm{ln}{v}_{F,approx}\le -1\) (see Fig. 32):

$${C}_{F}=\left\{\begin{array}{ll}\mathrm{exp}\left({a}_{CF}\right)& \quad\mathrm{ln}{v}_{F,approx}<-3\\ \mathrm{exp}\left[{a}_{CF}-{a}_{CF}(\mathrm{ln}{v}_{F,approx}+3)/2\right]&\quad -3\le \mathrm{ln}{v}_{F,approx}\le -1\\ 1.00&\quad -1<\mathrm{ln}{v}_{F,approx}\end{array}\right.$$
(12)

where \({v}_{F,approx}\) is in units of \({\text{year}}^{-1}\), \({a}_{CF}=1.2975\times {\sigma }_{\varepsilon ,mean}\approx 0.98\) for the mean \({S}_{\beta ,475}\), and \({a}_{CF}=1.3323\times {\sigma }_{\varepsilon ,50}\approx 1.05\) for the median \({S}_{\beta ,475}\). Note that the multiplier of the standard error is chosen as the residual value roughly conforming to the near-constant value of the 95% quantile for low recurrence rates in Fig. 31. It is noted that the quantiles should normally come closer to zero for high rates, yet the large window length of 20% employed in the running quantile scheme is influenced by nearby values and maintains higher absolute values even in the high-rate region.
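Equations (11) and (12) translate directly into a few lines of code. The sketch below uses the multipliers and standard errors quoted in the text; the function and constant names are hypothetical and introduced only for illustration.

```python
import math

# Standard errors and multipliers as quoted in the text
SIGMA_MEAN = 0.7539      # sigma_eps for the mean S_beta,475
SIGMA_MEDIAN = 0.7867    # sigma_eps for the median S_beta,475
A_CF_MEAN = 1.2975 * SIGMA_MEAN       # approximately 0.98
A_CF_MEDIAN = 1.3323 * SIGMA_MEDIAN   # approximately 1.05

def confidence_factor(v_approx, a_cf):
    """Confidence factor C_F after Eq. (12); v_approx in 1/year."""
    if v_approx <= 0.0:
        raise ValueError("the recurrence rate must be positive")
    ln_v = math.log(v_approx)
    if ln_v < -3.0:
        return math.exp(a_cf)                               # constant plateau
    if ln_v <= -1.0:
        return math.exp(a_cf - a_cf * (ln_v + 3.0) / 2.0)   # log-linear ramp
    return 1.0                                              # no correction

def updated_rate(v_approx, a_cf):
    """Updated (conservatively biased) rate v_F,approx,u after Eq. (11)."""
    return confidence_factor(v_approx, a_cf) * v_approx

for v in (1e-4, math.exp(-2.0), 0.5):
    print(f"v_approx = {v:.4g}: C_F = {confidence_factor(v, A_CF_MEAN):.3f}")
```

With the mean-based value of \({a}_{CF}\), the plateau of \({C}_{F}\) is \(\mathrm{exp}(0.98)\), i.e., about 2.66, so low approximated rates are increased by a factor of roughly 2.7 before entering the displacement calculation.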

Fig. 32 Confidence factor (\({C}_{F}\)) for the approximated recurrence rate \({v}_{F,approx}\) (in \({\text{year}}^{-1}\))

The recurrence rate values from the database and the (uncorrected/non-updated) approximated ones via Eq. (9) are compared in Fig. 33 for both estimates, using the mean and the median acceleration \({S}_{\beta ,475}\). Roughly 50% underestimation is detected in both cases, namely 50% of the values are below the dashed line, indicating that the actual recurrence rate of these faults is higher than the approximated one. Then, the effect of the confidence factor on the approximated recurrence rate is illustrated in Fig. 34, where the recurrence rate values from the fault database are plotted versus the updated approximated values after Eq. (11). The introduction of the confidence factor drives about 40% more values above the dashed line, yielding a coverage (\({v}_{F,approx,u}>{v}_{F}\)) above 90%.
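The coverage quoted above is simply the fraction of faults whose estimated rate exceeds the actual one. A minimal sketch of this check is given below; the function name `coverage` and the four-fault arrays are hypothetical and serve only to show the calculation.

```python
import numpy as np

def coverage(v_actual, v_estimate):
    """Fraction of cases where the estimated rate exceeds the actual rate."""
    v_actual = np.asarray(v_actual, dtype=float)
    v_estimate = np.asarray(v_estimate, dtype=float)
    return float(np.mean(v_estimate > v_actual))

# Hypothetical rates for four faults (1/year), for illustration only
v_actual = np.array([1.0e-3, 2.0e-3, 3.0e-3, 4.0e-3])
v_approx = np.array([2.0e-3, 1.0e-3, 4.0e-3, 3.0e-3])

print(coverage(v_actual, v_approx))        # before applying any factor
print(coverage(v_actual, 3.0 * v_approx))  # after a constant factor of 3
```

In this toy example the uncorrected estimates cover half of the cases, and a constant multiplicative factor of 3 raises the coverage to all four, mirroring qualitatively the shift from Fig. 33 to Fig. 34.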

Fig. 33 Comparison of actual recurrence rates (\({v}_{F}\)) versus the approximated ones (\({v}_{F,approx}\))

Fig. 34 Comparison of actual recurrence rates (\({v}_{F}\)) versus the updated approximated ones (\({v}_{F,approx,u}\))

A set of indicative faults with substantially different lengths and seismological properties is selected from Table 9 and Table 10 to evaluate the results of the recurrence rate approximation. The mean acceleration \({S}_{\beta ,475}\) is computed via interpolation from the seismic hazard map for several locations on the mapped fault trace. At each location, the updated approximated rate (\({v}_{F,approx,u}\)) is calculated via Eqs. (9), (11), and (12) and is plotted in Fig. 35 with dots. As expected, the rate changes along the fault trace due to the variation of the acceleration. The actual recurrence rate \({v}_{F}\) obtained from the fault database is plotted for comparison as a straight horizontal line. It is observed that the recurrence rate approximation is conservative, as intended, leading to higher rates and consequently higher fault displacement values for a given return period. This is the reason for introducing the deterministic cap [see note (4) in Sect. 2] in order to limit the displacement estimates with respect to the fault length.

Fig. 35 Comparison of the actual recurrence rate (\({v}_{F}\)) versus the updated approximated ones (\({v}_{F,approx,u}\)) calculated at different points along the fault using the mean \({S}_{\beta ,475}\); the distance is measured according to the fault trace node sequence, which is ordered following the right-hand rule (Aki and Richards 1980)

6 Summary and conclusions

Lifelines, tunnels, and bridges are vulnerable to permanent ground displacements, such as those resulting from tectonic fault movement. The estimation of the design fault displacement via empirical fault scaling relations may lead to a result of unknown safety and conservatism. On the other hand, a full Probabilistic Fault Displacement Hazard Analysis (PFDHA) is the most appropriate tool to incorporate fault productivity, in terms of events per year, in the calculations. PFDHA yields the mean annual frequency of exceeding predefined fault displacement values and is essentially the first step for the performance-based design of critical infrastructure. However, this approach is not compatible with code provisions due to the required specialized seismological data and the sophisticated calculations. To overcome these drawbacks and at the same time offer a code-compatible hazard-consistent approach, a simplified methodology for estimating the design fault displacement at lifeline–fault crossings was developed. A set of empirically derived equations for calculating the displacement is offered given the fault productivity (as represented by the recurrence rate), the fault mechanism, the fault length, and the crossing location of the lifeline on the fault (crossing site). These equations were developed from the statistical processing of PFDHA results, considering the pertinent epistemic uncertainties and a range of input variables, derived and obtained, respectively, from a selection of faults from the 2020 European Fault-Source Model database (Basili et al. 2022).

The fault displacement obtained from the proposed methodology was compared to the one from a full PFDHA for a set of relevant active faults in Europe, revealing a fair match, erring towards the conservative side. The proposed methodology has been adopted as an informative Annex in prEN 1998-4:2022 (European Committee for Standardisation 2022) and it may serve as a screening tool within the lifeline route selection procedure, or as a preliminary design tool to indicate whether a specialized seismological study is needed. Finally, the methodology may be easily implemented via the spreadsheet that accompanies the paper as an electronic supplement.

7 Supplementary material

The online version contains the supplementary material.