1 Introduction

According to the National Risk Assessment released by the Italian Department of Civil Protection in 2018 (DPC, 2018), in the last 50 years the economic and social losses caused by earthquakes in Italy have been estimated at around 180 billion Euros and 5000 victims. As revealed by the recent Italian earthquakes (Lagomarsino, 2012; Cattari et al., 2014; Dolce and Goretti, 2015; Masi et al., 2019a), these results mainly depend on the high vulnerability of the existing building stock, which was mostly designed considering only gravity loads. Therefore, an accurate evaluation of seismic vulnerability has a crucial role in the prioritization of strengthening and retrofit interventions on existing buildings within disaster risk reduction policies.

In the context of performance-based earthquake engineering (Porter et al., 2007), seismic vulnerability is assessed by means of fragility functions which provide the probability of exceeding certain limit states of a structure as a function of selected ground motion intensity measures (IM). Among the different approaches available to derive fragility curves (e.g. Calvi et al., 2006), analytical methods make use of computational analyses applied to a mechanical model of structure and aim at defining a mathematical relationship between earthquake ground motion intensity, structural response (quantified through engineering demand parameters) and expected damage level. A crucial aspect in the estimation of fragility curves using non-linear dynamic analyses (NLDA), besides the calibration of the non-linear structural model, is the selection of an appropriate suite of accelerograms to represent the seismic excitation. This involves the careful consideration of a number of issues ranging from the selection and scaling criteria and the required number of signals to the definition of the intensity measure or measures (see e.g. Bommer and Acevedo, 2004; Masi et al., 2011; NIST, 2011).

Taking advantage of the growing availability of strong motion databases, the use of records from real earthquakes, as opposed to artificial accelerograms, has become more attractive. Record selection is performed in such a way as to ensure compatibility of the selected set of ground motions with a target response spectrum. For this purpose, either a Uniform Hazard Spectrum (UHS), a Conditional Mean Spectrum (CMS, Baker, 2011) or a Conditional Spectrum (CS, Abrahamson and Al Atik, 2010) can be used as target. In some cases, the scarcity of ground motion recordings for specific scenarios, i.e. large magnitude events at short source-to-site distances, requires either the application of a suitable scaling procedure to the selected ground motions or the use of alternative approaches (e.g. simulated ground motions, see Paolucci et al. 2021; Petrone et al. 2021) to ensure compatibility with the target spectrum. Conventional approaches for the manipulation of recorded ground motions typically use constant factor scaling in the time domain (linear scaling), such as in REXEL (Iervolino et al., 2010) or in SigmaSpectra (Kottke and Rathje, 2010) or, less frequently, spectral matching techniques, where the records are adjusted to approach the target response spectrum either by adding wavelet components to the original time series, such as in the RSPMatch software (Abrahamson, 1998) and SeismoMatch (Seismosoft, 2019), or by iteratively scaling the Fourier spectrum amplitude of the original record. A comprehensive discussion of the advantages and drawbacks of both approaches can be found in the NIST report (2011).

The suitable number of signals to be used in NLDA is still an open issue (Haselton et al., 2012). Generally, the optimal number derives from a compromise between computational time and required accuracy in defining each considered damage state (NIST, 2011). As for design criteria, the minimum number of ground motions recommended by codes ranges from three signals according to ASCE/SEI 7–10 (ASCE, 2010) to five for ASCE/SEI 7–16 (ASCE, 2016), and to seven for Eurocode 8 (EC8; CEN, 2004). On the other hand, a much greater number of ground motions (e.g. at least 7 signals for each of eight intensity levels according to ATC-58–1 (FEMA, 2018)) is required for risk analyses, in order to span a wide range of intensity values representative of both low and high seismic hazard levels.

The number of signals also depends on the efficiency of the considered intensity measure (IM), defined as its capability to reduce the variability of the adopted earthquake demand parameter (Luco and Cornell, 2007). In this context, although Peak Ground Acceleration (PGA) and Spectral acceleration at the fundamental vibration period (Sa(T1)) have a limited correlation with non-linear structural response (e.g. Haselton et al., 2012; Masi and Vona 2012; Masi et al., 2020; O’Reilly 2021), they are the most used IMs in fragility analyses. Other IMs, such as the geometric mean of Sa across an appropriate period range (Kazantzi and Vamvatsikos, 2015) or integral measures, such as the Housner Intensity (Masi et al., 2011; 2015), appear to be more efficient but are difficult to use in practical applications because properly defined hazard maps in terms of these parameters are unavailable.

Within the 2019–2021 research agreement between the Italian Civil Protection Department (DPC) and the Network of University Laboratories for Earthquake Engineering (ReLUIS), the WP4 “Seismic Risk Maps—MARS” work package is devoted to updating the 2018 release of the Italian National Seismic Risk Assessment (Masi et al., 2021). Specifically, Task 4.2—Seismic input—is focused on the definition and implementation of a suitable approach to selecting arbitrarily large suites of real accelerograms, with no or limited scaling factors, to be used in NLDAs to derive fragility curves. In this context, easy-to-use software, namely S&M—Select & Match, has been developed for the selection and spectral matching of recorded ground motions approaching a target response spectrum in a broad range of vibration periods. To achieve broadband spectral compatibility, a dataset of high-quality digital accelerograms, reliable up to periods of about 10 s and covering the magnitude and distance range of interest for Italian sites as homogeneously as possible (SIMBAD-V06, following the first version introduced in Smerzini et al., 2014), has been considered. For the practical applications addressed in this work, it is worth highlighting that a narrower period range (i.e. up to 2.5 s) can be considered in structural analysis, because the most common building types in Italy are characterized by natural vibration periods below 2.5 s.

In this paper, the S&M code is first introduced by illustrating the main original features of the tool with respect to both the selection and spectral matching phases (Sect. 2). Then (Sect. 3), the feasibility of the proposed approach is tested within the MARS Project by selecting a large dataset of real accelerograms approaching the code-based Italian spectra defined for different return periods, in order to cover a wide range of ground motion intensity levels. In the context of the MARS Project (Masi et al., 2021), the selected accelerograms will be used for deriving site-independent fragility curves by using a Cloud-like approach (e.g. Jalayer et al., 2017). The dataset, consisting of unscaled and frequency-scaled accelerograms representative of both stiff and soft soil conditions, is described and analysed. In Sect. 4, some NLDAs are performed on an existing typical Italian RC building in order to check the capability of the dataset to reach the damage states generally adopted in fragility analyses. Results from the two datasets (stiff and soft soil) are compared by considering some of the most commonly adopted IMs. Finally, the effects on seismic response of some selection criteria permitted by the proposed tool are analysed.

2 Tool for ground motion selection: Select & Match

A new tool, S&M—Select & Match, has been used for selecting suites of recorded earthquake ground motions (EGMs) approaching a user-defined target spectrum. The tool allows for the search of both unscaled and spectrally-matched real accelerograms ensuring the compatibility of the resulting suite of signals in prescribed ranges of vibration periods. S&M is undergoing a significant development, especially in terms of the available accelerogram datasets that will further include NGA West2 (Ancheta et al. 2013) and the simulated BB-SPEEDset (Paolucci et al., 2021), allowing for a wider flexibility in the selection procedure. Furthermore, additional features for selection will be included, such as search for pulse-like ground motions as well as multi-component spectral compatibility. These new features will be presented in a future publication.

In the present version, S&M makes use of a dataset of worldwide earthquake recordings with response spectral ordinates that are reliable in a broad range of periods, namely up to about 10 s, which is usually regarded as the upper bound for vibration periods of engineering interest (Smerzini et al. 2014). This dataset, named SIMBAD (Selected Input Motions for displacement-Based Assessment and Design)-V06, consists of nearly 600 three-component accelerograms from moderate to large crustal earthquakes, with a moment magnitude MW mostly between 5 and 7.3, recorded at epicentral distances Repi less than about one fault length, mostly below 35 km. These characteristics are consistent with the maximum magnitude around MW 7 of historical earthquakes in Italy (Dolce and Di Bucci 2017).

Furthermore, most records are from digital instruments, making them reliable up to long periods, and are characterized by known site proxies (either code-based site class or average shear wave velocity in the top 30 m, VS30). The availability of records of interest for engineering applications in Italy minimizes the need for applying large scaling factors, which may significantly alter the amplitude of the original accelerogram, while keeping frequency-related, duration and other integral ground motion parameters unchanged. This may introduce some distortions in the natural correlation between peak and integral ground motion parameters, with potential bias in structural response modelling. In this way, all records are expected to be directly usable for structural time history analyses, without scaling or with limited scaling factors. For this purpose, Bommer and Acevedo (2004) suggest that limits on scaling typically range from factors of 2 to 4, while Du et al. (2019) recommend a scaling limit of 3 to 5 for general use when selecting ground motions from the NGA‐West2 database (Ancheta et al. 2013).

On the whole, the key feature of S&M is modularity and flexibility in handling the EGM selection in the two phases of analysis, as sketched in the flowchart of Fig. 1: (1) Selection (S) and (2) spectral matching (M).

Fig. 1 Flowchart of S&M code

Details on the algorithms implemented in the Selection and Matching phases are provided in the following.

2.1 S&M: Selection phase

The Selection (S) phase provides as output a set of N unscaled EGMs according to a ranking procedure consisting of the following steps:

  • (S.1) EGMs are filtered according to the minimum (Mmin) and maximum (Mmax) magnitude as defined by the user;

  • (S.2) For the ith EGM, extracted from the previous step, the mean (εm,i) and maximum (εmax,i) normalized errors (or mismatches) are computed and verified against user-defined threshold values (εm,t and εmax,t, respectively), as follows:

$$\varepsilon_{m,i} = \frac{1}{{N_{T} }}\mathop \sum \limits_{j = 1}^{{N_{T} }} \left| {\frac{{S_{e,i} \left( {T_{j} } \right) - S_{e,target} \left( {T_{j} } \right)}}{{S_{e,target} \left( {T_{j} } \right)}}} \right| \le \varepsilon_{{{\text{m}},{\text{t}}}}$$
(1)
$$\varepsilon_{\max ,i} = \mathop {\max }\limits_{j} \left| {\frac{{S_{e,i} \left( {T_{j} } \right) - S_{e,target} \left( {T_{j} } \right)}}{{S_{e,target} \left( {T_{j} } \right)}}} \right| \le \varepsilon_{\max ,t}$$
(2)

where NT is the number of vibration periods Tj, \({S}_{e,target}\) is the target 5%-damped (hereafter the 5% damping value is understood) elastic acceleration spectrum, and \({S}_{e,i}\) is the elastic acceleration response spectrum of the ith EGM.

Equations (1) and (2) ensure that the mean and maximum mismatch of the selected recordings are limited to selected thresholds and, thus, avoid the selection of candidate motions which are “spectrally far” from the target spectrum.

  • (S.3) The EGMs from step (S.2) are scored from the best (score = 1) to the worst (score = 100) for each of six different criteria (Ck), defined as follows:

C1: average spectral mismatch (according to Eq. (1)) in the whole period range, typically T ≤ 8–10 s;

C2: maximum spectral mismatch in the whole period range;

C3: average spectral mismatch in the primary period range [\({T}_{1,min}\), \({T}_{1,max}\)] (note that for specific applications, \({T}_{1,min}\) and \({T}_{1,max}\) may be defined based on the fundamental vibration period of the structure under consideration);

C4: average spectral mismatch in the secondary period range [\({T}_{2,min}\), \({T}_{2,max}\)], i.e., the period range complementary to [\({T}_{1,min}\), \({T}_{1,max}\)];

C5: Site class dependency. This includes three logical values (“strict”, “close” or “any site class”) to define the correspondence between the site class of the selected EGM and that of the target spectrum. The selection may be:

  • strictly site-class specific: i.e. the records are selected to be within the same site class as that of the target spectrum,

  • closely site-class specific: when records in a site class that is stiffer or softer than the target one are also allowed, or

  • site-class independent: if any record can be selected regardless of the corresponding site class;

C6: Closeness to target PGA. This criterion allows the user to enforce a closer match to the target PGA.

Finally, a global score (si) is computed for the ith EGM as the weighted sum of the individual scores associated with the afore-mentioned criteria as follows:

$${s}_{i}={\sum }_{k=1}^{6}{w}_{k}{s}_{i,k}$$
(3)

where sik is the score for the ith EGM and the kth criterion and wk is the weight (0 ≤ wk ≤ 1) associated with the kth criterion.
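The selection steps above can be sketched in a few lines of Python. This is only a minimal illustration of Eqs. (1) to (3), not the S&M implementation: the function names are hypothetical, and the linear mapping of ranks onto the 1 to 100 score range is an assumption, since the text does not specify how individual scores are assigned.

```python
import numpy as np


def spectral_mismatch(se, se_target):
    """Mean and maximum normalized mismatch of one record, Eqs. (1) and (2)."""
    se = np.asarray(se, dtype=float)
    se_target = np.asarray(se_target, dtype=float)
    rel = np.abs((se - se_target) / se_target)
    return rel.mean(), rel.max()


def rank_scores(values, best=1.0, worst=100.0):
    """Score records from best to worst for one criterion (lower value = better).

    ASSUMPTION: scores are spread linearly by rank between `best` and `worst`.
    """
    values = np.asarray(values, dtype=float)
    if values.size == 1:
        return np.array([best])
    ranks = np.argsort(np.argsort(values))  # 0 for the smallest value
    return best + (worst - best) * ranks / (values.size - 1)


def global_scores(criterion_scores, weights):
    """Eq. (3): weighted sum over criteria; rows are records, columns criteria."""
    return np.asarray(criterion_scores, dtype=float) @ np.asarray(weights, dtype=float)


# Illustrative numbers only: three candidate spectra sampled at three periods.
target = np.array([1.0, 2.0, 4.0])
candidates = np.array([[1.1, 1.8, 4.0], [0.5, 1.0, 2.0], [1.3, 2.4, 4.4]])
errors = np.array([spectral_mismatch(se, target) for se in candidates])
keep = (errors[:, 0] <= 0.3) & (errors[:, 1] <= 0.5)  # thresholds of step (S.2)
scores = np.column_stack([rank_scores(errors[keep, 0]), rank_scores(errors[keep, 1])])
print(global_scores(scores, weights=[0.75, 0.25]))
```

Only two criteria (mean and maximum mismatch) are scored here for brevity; the remaining criteria would add further columns to the score matrix, each with its own weight.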

2.2 Spectral matching

In the Matching phase (M), the accelerogram undergoes an iterative scaling procedure in the frequency domain, until its response spectrum approaches the target one within a prescribed tolerance. The spectral matching is achieved within a selected period range, namely from \({T}_{3,min}\) (typically 0.01 s) to \({T}_{3,max}\) (typically 2–2.5 s).

For each iteration n ≤ niter, the following steps are carried out:

M.1) The response spectral ratio (\({R}_{i,j}^{n}\)) between the target spectrum and that of the ith EGM is computed for each period Tj, through the following equation:

$${R}_{i,j}^{n}={S}_{e,target}\left({T}_{j}\right)/{S}_{e,i}^{n}({T}_{j})$$
(4)

The periods Tj are selected at sufficiently closely spaced intervals, i.e., with a step of around 0.005 s up to T = 0.05 s and of 0.05 s afterwards. Note that at the first iteration (n = 1), the signal is the original one, hence \({S}_{e,i}^{n=1}={S}_{e,i}\). The scaling factor is set to 1 outside the period range for the spectral matching [\({T}_{3,min}\), \({T}_{3,max}\)].

M.2) The Fourier amplitude spectrum of the accelerogram (\({FAS}_{i,jf}^{n}\)) is computed and multiplied in the frequency domain, at discrete frequencies fjf = 1/Tj, by a scaling factor equal to \({R}_{i,j}^{n}\):

$${FAS}_{i,jf}^{n+1}={R}_{i,j}^{n} \bullet {FAS}_{i,jf}^{n}$$
(5)

Scaling factors for the remaining discrete frequencies are computed by linear interpolation. Note that Eq. (5) modifies the modulus of the Fourier spectrum, while keeping the phases of the signal unchanged.

M.3) By applying the inverse Fourier transform to \({FAS}_{i,jf}^{n+1}\), the acceleration time history (\({acc}_{i}^{n+1}\)) is computed and provided as input for step M.1) in the next iteration.

Note that, generally, a total number of iterations niter below 5 is sufficient to get convergence of results.
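Steps M.1 to M.3 can be illustrated with the following Python sketch. It is a simplified reading of the procedure, not the S&M code: the function names are hypothetical, the response spectrum is computed with a frequency-domain SDOF solution (pseudo-acceleration, 5% damping) chosen here only for compactness, and a fixed number of iterations replaces the tolerance check.

```python
import numpy as np


def response_spectrum(acc, dt, periods, damping=0.05):
    """Pseudo-acceleration spectrum via the SDOF transfer function (assumed solver)."""
    n = 2 * len(acc)  # zero padding limits wrap-around of the free-vibration tail
    spec = np.fft.rfft(acc, n)
    w = 2.0 * np.pi * np.fft.rfftfreq(n, dt)
    sa = np.empty(len(periods))
    for k, period in enumerate(periods):
        wn = 2.0 * np.pi / period
        # Relative displacement response to ground acceleration
        h = -1.0 / (wn**2 - w**2 + 2j * damping * wn * w)
        u = np.fft.irfft(spec * h, n)
        sa[k] = wn**2 * np.max(np.abs(u))
    return sa


def spectral_match(acc, dt, periods, target, t3_min, t3_max, n_iter=5):
    """Iteratively scale the Fourier amplitudes towards the target spectrum."""
    acc = np.asarray(acc, dtype=float).copy()
    periods = np.asarray(periods, dtype=float)
    n = len(acc)
    freqs = np.fft.rfftfreq(n, dt)
    order = np.argsort(1.0 / periods)  # np.interp needs increasing frequencies
    fj = (1.0 / periods)[order]
    in_band = (periods >= t3_min) & (periods <= t3_max)
    for _ in range(n_iter):
        # M.1: spectral ratio, Eq. (4); equal to 1 outside [T3min, T3max]
        ratio = np.where(in_band, target / response_spectrum(acc, dt, periods), 1.0)
        # M.2: interpolate the ratio to all FFT frequencies and apply Eq. (5);
        # a real positive factor changes the modulus and leaves the phase unchanged
        scale = np.interp(freqs, fj, ratio[order], left=1.0, right=1.0)
        # M.3: back to the time domain for the next iteration
        acc = np.fft.irfft(np.fft.rfft(acc) * scale, n)
    return acc


# Synthetic example (NOT a recorded accelerogram): enveloped white noise.
rng = np.random.default_rng(0)
dt = 0.01
t = np.arange(2000) * dt
record = rng.standard_normal(t.size) * np.exp(-0.5 * ((t - 6.0) / 3.0) ** 2)
periods = np.linspace(0.05, 2.0, 40)
target = 1.5 * response_spectrum(record, dt, periods)
matched = spectral_match(record, dt, periods, target, 0.05, 2.0, n_iter=3)
```

The periods in this example are kept above twice the sampling interval so that the corresponding frequencies 1/Tj stay below the Nyquist frequency of the synthetic signal.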

As an explanatory example, Fig. 2 presents the effect of spectral matching on the ground motion recorded at NRC station (EW component) during the MW 6.5 30 October 2016 Norcia earthquake both in the time domain (left) and in terms of acceleration response spectra (right). Note that, according to the iterative procedure illustrated previously (steps M.1 to M.3), each Fourier spectral ordinate is subjected to a different scaling factor, the amplitude of which depends on the distance between the original response spectrum and the target one. In this application, the spectra of the original records have been taken close to the target ones especially in the long period range, that is mostly governed by the physical parameters of the earthquake (i.e., magnitude, duration). In this way, the resulting scaling preserves, with only minor modifications, the portion of ground motion constrained by such physical parameters, while larger scaling factors are applied to the high-frequency portion, mostly governed by small-scale effects poorly correlated with the earthquake source. Furthermore, the iterative scaling to approach the target spectrum automatically ensures a smooth transition between factors applied to different frequency intervals.

Fig. 2 Explanatory example of the spectral matching procedure implemented in S&M. Original (red) and matched (blue) ground motions (EW component) recorded at NRC station during the M6.5 30 October 2016 Norcia earthquake both in terms of acceleration, velocity and displacement time histories (left) and in terms of acceleration response spectra (right)

Because of the previous criteria for selection, as shown in Fig. 2, the resulting spectrally-matched ground motions provide displacement and velocity time histories almost indistinguishable from the original ones, while only the acceleration time history undergoes significant modifications.

The selection and scaling of EGMs through the S&M tool has enabled the fulfilment of some requirements made necessary by the application to numerical fragility analysis, such as:

  • Broadband spectral compatibility. S&M is based on the fundamental assumption that the set of selected EGMs approaches the target spectrum in a broad period range, up to at least 5 s. For this reason, as explained previously, the spectral matching is expected to alter only slightly the low frequency range of the ground motions, in order to preserve the physical information about earthquake magnitude and duration.

  • Simultaneous selection of two horizontal components. The code allows the selection of a set of N EGMs, each with two horizontal components. For this purpose, a simplified procedure is adopted: first, the selection is done on the primary horizontal component (i.e. the component with the best scores, named H1), and the corresponding perpendicular components (named H2) are then taken accordingly. Note that the spectral-matching procedure acts independently on the two horizontal components. In future developments of S&M, selection will optionally be performed also based on the geometric mean of horizontal components, according to the findings of recent studies (Baraschino et al. 2021).

  • Record-to-record variability. In its standard formulation, spectral matching may provide signals whose response spectra coincide almost exactly with the target spectrum, thus ending up with a suite of accelerograms characterized by a nearly vanishing dispersion of response spectral ordinates across the selected period range. To avoid this issue and keep some spectrum-to-spectrum variability (referred to as peak-to-trough variability in Stafford and Bommer 2010), so that the fragility curves are robustly defined, a lower and an upper tolerance (tolM,low and tolM,up) may be assigned in the spectral-matching phase (e.g. if tolerances of 10% and 30% are given for the lower and upper bound, respectively, the mean spectrum of the record set is allowed to vary between 90 and 130% of the target spectrum). To better highlight this point, Fig. 3 shows the outputs of S&M for a target code-based spectrum with return period TR = 1000 years for different matching criteria: left, no matching, i.e. selection of unscaled accelerograms (indeed, the average spectrum tends to fall below the lower bound of 90% of the target); center, matching tolerances tolM,up and tolM,low both set to 30% (“loose” matching); right, matching tolerances set to very low values, i.e., tolM,up = tolM,low = 1% (“strict” matching). Note that in Fig. 3, regardless of the tolerance assumed in the matching scheme, the target spectrum is shown together with its lower (90%) and upper (130%) bound levels, according to selection acceptability criteria recommended in the Italian building code (NTC18, 2018).
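The tolerance band described in the last item can be checked with a short Python sketch. The function name is hypothetical, and the use of the arithmetic mean of the spectra is an assumption made for illustration; the text does not state which average S&M adopts.

```python
import numpy as np


def mean_spectrum_within_band(spectra, target, tol_low, tol_up):
    """Check that the mean spectrum lies within [1 - tol_low, 1 + tol_up] x target.

    `spectra` has one row per record and one column per period.
    ASSUMPTION: the arithmetic mean across records is used.
    """
    mean_spectrum = np.mean(np.asarray(spectra, dtype=float), axis=0)
    ratio = mean_spectrum / np.asarray(target, dtype=float)
    return bool(np.all((ratio >= 1.0 - tol_low) & (ratio <= 1.0 + tol_up)))


# Illustrative values only: 10% lower and 30% upper tolerance (90-130% band).
target = np.array([1.0, 2.0])
suite = np.array([[0.8, 2.0], [1.2, 2.4]])
print(mean_spectrum_within_band(suite, target, tol_low=0.10, tol_up=0.30))
```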

Fig. 3 Outputs of S&M tool for different matching criteria: left, no matching, i.e. unscaled recordings; center: matching tolerances, tolM,up = tolM,low = 30% (“loose” matching); right: tolM,up = tolM,low = 1% (“strict” matching)

3 Definition of UHS-constrained input motions

The S&M tool has been applied to select a large set of recorded EGMs for deriving analytical fragility curves to be used to update the Italian seismic risk assessment (Masi et al. 2021). In this context, EGM selection needs to satisfy four criteria, namely: (i) to provide a sufficiently large set of accelerograms covering a wide range of ground motion intensity levels consistent with the national seismic hazard assessment for Italy (MPS04, Stucchi et al. 2011); (ii) to allow the derivation of fragility curves for different structural types (i.e. seismically and not-seismically designed) considering all the damage states according to the European Macroseismic Scale (EMS-98, Grunthal, 1998); (iii) to allow for site-independent analyses, such that fragility curves are not affected by the specific seismic hazard at the site; (iv) to take into account the possible influence of soil amplification effects in the derivation of fragility curves.

In the following, the different steps of the selection phases using the S&M tool are described and the selected EGMs are analysed. The dataset of signals is freely available at www.reluis.it.

3.1 Target spectra

In order to sample ground motion with intensity levels in a broad range without referring to a specific site, the selection of recordings has been based on the “proximity” to a set of code-conforming UHS spanning from very low to very large return periods. This ensures that the suite of ground motions is representative of a broad range of ground motion intensities gradually increasing from very low to very high levels.

For application to Italy, the target spectra are the 5%-damped elastic design spectra defined in the Italian code (NTC18, 2018) for eight return periods, namely, TR = 50, 100, 200, 475, 975, 2475, 5000, and 10,000 years, each one defined for one of the most hazardous Italian cities, L’Aquila (42.35°N, 13.40°E). This site has been chosen not to make a site-specific application but to ensure that the largest ground motion intensity levels expected in Italy for very long return periods could be reasonably covered by the selection. Furthermore, to check the influence of site conditions on the fragility curves, two different sets of target spectra and corresponding candidate ground motions are defined for two “macro” site categories, namely, A/B (for no- or low soil amplification conditions) and C/D (for mid- or high soil amplification conditions).

When UHS are not defined in NTC18, specifically for 5000 and 10,000 years return period, the target spectra are approximated by using the reliability differentiation relationship recommended by EC8 (CEN, 2004) through the scaling factor of the design spectrum \({\gamma }_{I}\), defined as follows:

$${\gamma }_{I}={\left(\frac{{T}_{LR}}{{T}_{L}}\right)}^{-1/k}$$
(6)

where \({T}_{LR}\) = 2475 years (upper bound return period for which target spectral ordinates are known, as provided by NTC18), \({T}_{L}\) = 5000 or 10,000 years (return period whose target spectral ordinates are unknown), and k (exponent depending on seismicity) = 3. Hence, \({\gamma }_{I}\) = 1.26 and \({\gamma }_{I}\) = 1.60 are obtained for TR = 5000 and 10,000 years, respectively. Furthermore, site amplification at long return periods is neglected because of non-linear site effects, i.e., Ss = 1, while the topographic amplification factor (ST) is set to 1. As regards the spectral shape for soft soils, the same factor Cc (which modifies the constant-velocity corner periods for softer site conditions) as obtained for TR = 2475 years is also adopted for the longer return periods (i.e. TR = 5000 and 10,000 years).
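As a quick numerical check of Eq. (6), the scaling factor can be evaluated with the short Python sketch below. The function name is hypothetical; the inputs are the values stated above (reference return period of 2475 years and k = 3), and the results should land close to the factors quoted in the text.

```python
def reliability_scaling_factor(t_lr, t_l, k=3.0):
    """Eq. (6): gamma_I = (T_LR / T_L) ** (-1 / k)."""
    return (t_lr / t_l) ** (-1.0 / k)


# Reference return period of 2475 years, extrapolated to 5000 and 10,000 years
for t_l in (5000.0, 10000.0):
    gamma = reliability_scaling_factor(2475.0, t_l)
    print(f"TR = {t_l:.0f} years: gamma_I = {gamma:.3f}")
```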

In Fig. 4, target elastic 5%-damped acceleration spectra are shown for all return periods, separately for class A/B (left) and C/D (right).

Fig. 4 Target spectra, based on NTC18, for both stiff (left) and soft (right) soils

3.2 Selection and spectral matching parameters

For each site category (i.e. A/B and C/D), a total number of 125 EGMs (about 15 EGMs for each considered target spectrum) have been selected according to the procedure described in Sects. 2.1 and 2.2. While referring to Table 1 for the complete list of input parameters used for S&M, some remarks, specific for the application under study, are provided below.

Table 1 Parameters of S&M with their explanations, allowable ranges, and values set for the current study

In step S.1), SIMBAD records with MW ≤ MWmax = 7.25 are selected. The value of MWmax has been selected to be consistent with the maximum magnitude assumed for deriving the currently available probabilistic seismic hazard map of Italy (MPS04, Stucchi et al. 2011). As SIMBAD includes records within a limited range of distances, no preliminary selection based on distance has been made.

Firstly, for each return period, the software performed the selection of N = 20 unscaled recorded ground motions approaching the target spectrum over a total period range of 0.1 to 8 s, according to the scoring approach of steps S.2 and S.3. The thresholds set for the mean and maximum spectral errors and the corner periods are indicated in Table 1. More specifically, [\({T}_{1,min}\), \({T}_{1,max}\)] = [0.1, 2.0] s, [\({T}_{2,min}\), \({T}_{2,max}\)] = [2.0, 8.0] s are considered and a larger weight is given to the primary period range because the fundamental vibration period of the prototype building considered for the fragility analysis (equal to around 1 s, see Sect. 4) falls within this interval.

Beyond the target period range (e.g. up to T = 2.5 s), only two additional period points are added, one at an intermediate period (T = 4.5 s) and one at a long period (T = 8.0 s), with the sole purpose of controlling the maximum mismatch. It is noted that S&M allows a flexible definition of the input periods at which the selection is carried out, so that different period ranges can be defined based on the requirements of the problem under consideration.

Finally, due to the lower number of records approaching the target spectra, especially for the higher return periods, and in order to ensure a similar number of signals for each return period, about 5 records (for each intensity level) with the worst ranking have been removed, thus retaining only the top 15 signals of the selection.

In Fig. 5, an illustrative example related to the top-scored 20 EGMs obtained from the selection phase for stiff sites considering TR = 5000 years is presented. Details of the selected accelerograms are given in Table 2.

Fig. 5 Response spectra of the suite of 20 unscaled accelerograms approaching the TR = 5000 years target spectrum (in red) for stiff sites, selected according to S&M procedure (left: H1 components, right: H2 components). Flatfile information of H1 components is given in Table 2

Table 2 Summary information for TR = 5000 years at stiff sites for the best performing 20 signals

Finally, when unscaled accelerograms (always the preferable option) were not found, typically because of the extremely high intensity levels that are not sufficiently covered by the SIMBAD dataset, spectral matching of recorded motions has been considered. In this way, sets corresponding to TR = 5000 and 10,000 years for stiff sites and TR = 2475, 5000 and 10,000 years for soft sites have been obtained through the spectral matching procedure.

For the same case study related to the unscaled signals (i.e., TR = 5000 years and stiff sites), an illustrative example of the output of the spectral matching step is shown in Fig. 6.

Fig. 6 Target and frequency-scaled elastic 5% damped acceleration spectra at stiff sites for hazard level of TR = 5000 years. Left: for H1, right: for H2

In the spectral matching, a prescribed tolerance, i.e. tolM,up = tolM,low = 30%, is set to achieve a dispersion similar to that obtained for the selection of unscaled records for smaller return periods. In general, the spectral matching procedure may require more iterations, in particular because of the high-frequency region (roughly up to T = 0.1 s). The number of iterations can be decided by the user, although it should not exceed 5 in order to limit the modifications to the original signal. In this application, 2 iterations have been found sufficient.

3.3 Overview of the selected input motions

In this section, an overview of the final set of EGMs is presented, focusing in particular on the distribution of selected IMs relevant for seismic fragility analyses, namely, PGA (Peak Ground Acceleration), PGV (Peak Ground Velocity), HI (Housner Intensity) and Sa (Spectral acceleration) at around 1 s. The latter corresponds to the fundamental vibration period of the structure considered in the example described in Sect. 4.

Figure 7 shows the MW-Repi distribution of the sets of 125 EGMs corresponding to stiff and soft soil conditions. Note that, for both sets, all EGMs have MW within the range [5.3, 7.1] and Repi mainly less than about 35 km.

Fig. 7 Distributions of magnitude-epicentral distance for the selected sets of EGMs

With few exceptions, the lower the TR, the lower the earthquake magnitude of the record. Although this might be considered an obvious remark, it should be noted that no specific instructions have been given to S&M in terms of a target magnitude range. As discussed in Smerzini et al. (2014), when the spectral compatibility is enforced on a broad period range, including periods longer than about 2 s, an increasing trend of magnitude is expected as long period spectral ordinates increase.

Figures 8 and 9 present the empirical cumulative frequency distribution (ECFD) of the aforementioned IMs for both horizontal components (referred to as H1 and H2, respectively) and soil conditions. Note that, for each IM level x, ECFD(x) is the proportion of the values in the data sample less than or equal to x. To complement these graphs, Table 3 lists the median and logarithmic standard deviations (β = σln) of the lognormal distribution best fitting the IM datasets. For each component, the unscaled and matched signals are denoted by different colours (orange and blue, respectively). The following remarks can be made:

  • The S&M approach, based on a broadband spectral proximity to a range of target spectra from low to very long return periods, naturally yields the selection of input motions characterized by different IMs, including peak parameters (PGA, PGV), Spectral acceleration (Sa(T)) and integral-type measure (HI), covering, at the same time, the range of intensities of interest for structural analyses with a sound statistical distribution;

  • the addition of spectrally matched EGMs for high intensity IMs (corresponding to long return periods) to the unscaled ones does not significantly alter the statistical distribution of the different IMs, either in terms of the overall average trend or of its variability;

  • when considering the perpendicular components (H2), a larger dispersion is typically found (see Table 3), especially for the subset of matched signals, because records do not comply with any selection criterion on this specific component;

  • the cumulative distribution of PGV, Sa and HI is strongly dependent on site conditions, the median values for stiff sites being lower than those for soft sites, as expected because of site amplification effects. On the other hand, for PGA, the statistical distributions are not well separated, especially for higher intensity values, because this IM, being mainly associated with high-frequency components, is less correlated with low-frequency soft-soil amplification. Furthermore, it is also possibly affected by non-linear site effects causing a de-amplification of motion, especially at high frequencies.
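The ECFD definition and the lognormal parameters of Table 3 can be illustrated with a minimal sketch. It assumes that the lognormal fit is obtained from the sample mean and standard deviation of ln(IM); the fitting method is not stated in the text, and the function names and PGA values below are hypothetical.

```python
import numpy as np

def ecfd(values):
    """Return sorted values x and ECFD(x), the fraction of samples <= x."""
    x = np.sort(np.asarray(values, dtype=float))
    # side="right" counts all samples <= x, so tied values share one ordinate
    p = np.searchsorted(x, x, side="right") / x.size
    return x, p

def lognormal_fit(values):
    """Median (theta) and logarithmic standard deviation (beta) estimated
    from the mean and standard deviation of ln(values) (assumed method)."""
    ln_v = np.log(np.asarray(values, dtype=float))
    theta = float(np.exp(ln_v.mean()))
    beta = float(ln_v.std(ddof=1))
    return theta, beta

# Hypothetical PGA values in g, for illustration only (not the selected sets)
pga = [0.05, 0.10, 0.20, 0.40, 0.80]
x, p = ecfd(pga)
theta, beta = lognormal_fit(pga)
```

Applied separately to the H1 and H2 components of each soil class, the same two functions would give the kind of curves and parameters discussed in the remarks above.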

Fig. 8 Empirical Cumulative Frequency Distribution (ECFD) of PGA, PGV, Sa(T = 1 s) computed on the selected sets of ground motions at both stiff (triangles) and soft (dots) sites. H1 are the horizontal components obtained by the S&M code according to the selection criteria, while H2 are the corresponding perpendicular components

Fig. 9 As in Fig. 8 but for components H2

Table 3 Median (θIM) and logarithmic standard deviation (βIM) of the lognormal distributions best-fitting the distributions of IMs of Figs. 8 and 9

4 Application to non-linear dynamic analyses

In order to verify the capability of the selected suite of recordings to reach the different damage states generally accounted for in seismic fragility studies, and to assess the effect of some selection criteria on the seismic response, NLDAs have been carried out on a reinforced concrete (RC) prototype building. For the EGMs related to stiff and soft soil types, relationships between a commonly used engineering demand parameter (EDP), i.e. the inter-storey drift ratio, and different IMs, i.e. PGA, PGV, HI and Sa(T), have been explored and the relevant results compared.

4.1 Description and modelling of the prototype structure

The building under study is a four-storey RC framed structure representative of the pre-code Italian building stock designed in the 1970s for vertical loads only. It has a rectangular plan (Fig. 10) with overall dimensions of 20.95 × 11.75 m2 (X and Y direction, respectively) and a constant inter-storey height of 3.05 m. In accordance with the design practice commonly used for gravity-load structures, the prototype has lateral load resisting frames only along one direction (Y direction, orthogonal to the slab orientation) with flexible beams, while rigid beams are present along the perimeter frames. The staircase sub-structure, with knee-type beams, is placed in a symmetric position with respect to the Y direction.

Fig. 10 In-plan layout of the building type under study (a) and 3D view of the model (b)

Cross-section dimensions and reinforcement details have been determined by simulated design (Masi, 2003) considering the code in force at the time (Ministerial Decree 30 May 1974), usual design practice and typical material properties, namely medium-quality concrete (i.e. Rbk250, admissible compressive stress equal to 59.5 kg/cm2, about 5.8 MPa) and deformed steel (i.e. FeB38k, characteristic yield strength greater than or equal to 380 MPa).

Safety verifications of the resisting members have been performed according to the allowable stress method. For the columns, only axial load and the minimum reinforcement requirements have been considered in the design. The beams have been designed using a simplified model of a continuous beam resting on simple supports. As a result, columns generally have cross-section dimensions of 0.30 × 0.30 m2, except for some elements at the lower storeys, which measure 0.30 × 0.40 m2. Flexible beams measure 0.7 × 0.25 and 1.0 × 0.25 m2, while rigid beams measure 0.3 × 0.5 m2.

As for infills, double-layer masonry infills made of hollow brick panels, 8 cm thick for the internal layer and 12 cm thick for the external layer, with a 10 cm thick empty cavity, have been considered as the type commonly used in the 1970s (Manfredi and Masi, 2018).

Structural analyses have been performed using the finite element code OpenSees (McKenna, 2011). A macro-modelling approach based on lumped plasticity has been adopted according to Ricci et al. (2019). Specifically, at both ends of each structural member a bending moment–rotation (M-θ) relationship has been defined by adopting the Ibarra, Medina and Krawinkler model (2005). When a brittle failure is predicted (as typically occurs in the short columns of the staircase structure), the M-θ relationship is modified by adopting a bending moment value calculated as a function of the ultimate shear capacity evaluated according to the Sezen and Moehle model (2004). On the basis of the mechanical properties of the constituent materials typically found in real buildings of the period under consideration (Ricci et al., 2011; Masi et al., 2013; Masi et al., 2019b), a mean concrete strength (fcm) of 20 MPa and a mean steel strength (fym) of 400 MPa were assumed in evaluating the structural capacity.

4.2 Results of NLDAs

The prototype described above has been subjected to both sets of accelerograms, i.e. those related to stiff and soft soils. NLDAs have been performed by simultaneously applying both in-plane components of each signal, specifically H1 along the X direction of the structural model and H2 along the Y one (see Fig. 10). For each analysis, the maximum inter-storey drift ratio (IDR, assumed as EDP) and the corresponding IM value have been processed considering the two orthogonal in-plane directions. It is worth noting that, owing to the small differences in IM values between the two orthogonal components (see Table 3) and the substantial in-plane symmetry of the structure, only slight differences in global structural response have been found in analyses performed by swapping the two components (i.e. H1 along the Y direction and H2 along the X one).
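As an illustration of how the maximum IDR may be extracted from an analysis, the sketch below takes the floor displacement histories in one direction and returns the largest absolute drift ratio over all storeys and time steps. The function name, the array layout and the displacement values are hypothetical; only the constant inter-storey height of 3.05 m comes from the building description.

```python
import numpy as np

def max_idr(floor_disp, storey_height=3.05):
    """Maximum inter-storey drift ratio over all storeys and time steps.

    floor_disp: array of shape (n_steps, n_floors) with floor displacements
    in metres relative to the ground, ordered from the first floor upwards.
    """
    u = np.asarray(floor_disp, dtype=float)
    # prepend the (zero) ground displacement so the first storey drift is u_1 - 0
    u = np.hstack([np.zeros((u.shape[0], 1)), u])
    drift_ratio = np.diff(u, axis=1) / storey_height
    return float(np.abs(drift_ratio).max())

# Hypothetical two-step displacement history of the four floors (m)
history = [[0.010, 0.020, 0.025, 0.030],
           [-0.030, -0.050, -0.060, -0.065]]
idr_max = max_idr(history)  # governed by the first storey at the second step
```

The function would be called once per direction (X and Y), each result being paired with the IM of the component applied along that direction.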

Figures 11 and 12 show the IDR vs IM relationship defined in logarithmic space for all the considered IMs (i.e. PGA, PGV, HI and Sa(T1), with T1 equal to about 1 s for the considered building) using the cloud approach (Jalayer et al., 2017) for stiff (Fig. 11) and soft (Fig. 12) soil accelerograms. The parameters of the best-fitting linear regression (i.e. a, b, and the logarithmic standard deviation β) are also reported. In order to distinguish the results relevant to the unscaled and the spectrally matched signals, different markers are used. Furthermore, the points related to dynamic instability cases, i.e. when deformation increases in an unlimited manner for small increments of ground motion intensity so that structural collapse can be assumed (e.g. Vamvatsikos and Cornell 2002; Villaverde 2007), are displayed with a black contour. Note that the dynamic instability cases are not included in the linear regression.
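The regression in logarithmic space can be sketched as follows. The sketch assumes the common cloud-analysis form ln(IDR) = ln(a) + b ln(IM), which is not written out explicitly in the text, and takes β as the standard deviation of the residuals with n − 2 degrees of freedom. The function name and the sample data are hypothetical.

```python
import numpy as np

def cloud_regression(im, idr, unstable=None):
    """Least-squares fit of ln(IDR) = ln(a) + b*ln(IM) (assumed form).

    unstable: optional boolean flags marking dynamic-instability cases,
    which are left out of the fit as stated in the text.
    Returns (a, b, beta), beta being the standard deviation of the
    log-space residuals computed with n - 2 degrees of freedom.
    """
    im = np.asarray(im, dtype=float)
    idr = np.asarray(idr, dtype=float)
    keep = np.ones(im.size, dtype=bool)
    if unstable is not None:
        keep = ~np.asarray(unstable, dtype=bool)
    x, y = np.log(im[keep]), np.log(idr[keep])
    b, ln_a = np.polyfit(x, y, 1)  # polyfit returns slope first, then intercept
    resid = y - (ln_a + b * x)
    beta = np.sqrt(np.sum(resid**2) / (x.size - 2))
    return float(np.exp(ln_a)), float(b), float(beta)

# Hypothetical cloud (PGA in g, IDR in %), for illustration only
pga = np.array([0.05, 0.10, 0.20, 0.40, 0.80])
idr = np.array([0.08, 0.15, 0.35, 0.60, 9.99])
collapsed = np.array([False, False, False, False, True])
a, b, beta = cloud_regression(pga, idr, unstable=collapsed)
```

Masking the instability cases rather than deleting them keeps each record aligned with its IM value, so the excluded points can still be plotted with a distinct marker.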

Fig. 11 Relationships between IDR and IM values (in terms of PGA a, PGV b, HI c, Sa(T = 1 s) d) for signals on stiff soil (black marker edge refers to dynamic instability cases). Dashed lines refer to ± one β-value

Fig. 12 Relationships between IDR and IM values (in terms of PGA a, PGV b, HI c, Sa(T = 1 s) d) for soft soil signals (black marker edge refers to dynamic instability cases). Dashed lines refer to ± one β-value

In the same figures, the IDR threshold values assigned to the five damage states DS1-DS5 considered in the EMS98 (Grünthal, 1998) are also displayed. The IDR threshold values (see Fig. 13) were defined in Masi et al. (2015), where more details can be found, for seismic fragility analyses of typical Italian RC buildings designed only for gravity loads, by considering both structural and non-structural components on the basis of relevant literature results (e.g. FEMA, 2012).

Fig. 13 Relationship between damage levels and interstorey drift values adopted for the building case study (from Masi et al., 2015)
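For reference, in cloud-based fragility studies an IDR threshold is commonly combined with the regression parameters a, b and β into a lognormal fragility function. The expression below is a sketch of that standard form, not necessarily the formulation adopted by the authors. It assumes the regression is written as ln(IDR) = ln(a) + b ln(IM), with Φ the standard normal cumulative distribution function and IDR_{DS_k} the threshold of the k-th damage state. It does not account for dynamic instability cases, which would need a separate treatment.

```latex
P\left(DS \ge DS_k \mid IM\right)
  = \Phi\!\left(\frac{\ln a + b\,\ln IM - \ln IDR_{DS_k}}{\beta}\right)
```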

Analysing the results in Fig. 11, the following remarks can be made:

  • IDR values clearly increase with increasing intensity values, whatever IM is considered;

  • IDR-IM points are quite evenly distributed across all damage states;

  • unscaled (real) signals essentially involve damage states up to DS4;

  • spectrally matched signals, which are associated with the longer return periods (i.e. 5000 and 10,000 years), involve the heaviest damage states, i.e. DS4 and DS5;

  • about 18% of the analyses reach dynamic instability;

  • as mentioned above, results obtained in case of dynamic instability have not been considered in the regression. Therefore, due to the lack of available data, the linear regression is poorly constrained at the higher IM values;

  • in terms of variability of results, the lowest dispersion is found when HI is selected as IM (β = 0.43), while the highest is found for PGA (β = 0.58). PGV (β = 0.47) also performs well, similarly to HI. This points to the higher efficiency, in representing the damage potential of ground motions, of IMs that reflect broader period ranges, such as HI, an integral measure, or PGV, a peak measure mostly affected by intermediate to short periods (see Paolucci and Smerzini, 2018);

  • additional analyses performed by considering further IMs (i.e. Arias intensity and Cumulative Absolute Velocity) provide similar or higher dispersion values.

Similar trends can also be found in the analyses where soft soil accelerograms are used (see Fig. 12), although some differences with respect to the stiff soil analyses need to be highlighted. Specifically, due to the higher seismic intensity of the soft soil signals, the share of dynamic instability cases increases to 38% (compared with 18% for the stiff soil analyses). Further, while the β values for soft soil are very close to those obtained for stiff soil for most IMs, β differs in the case of PGA (0.51 for soft soil and 0.58 for stiff soil). This latter aspect further emphasizes the poor efficiency of PGA as a damage predictor.

To better understand the possible effects of soil conditions on seismic response, the NLDA results from stiff and soft soil accelerograms have been compared. Specifically, in Fig. 14, the IDR-IM linear regressions obtained from the two sets of accelerograms are plotted in order to highlight the IDR differences at the same intensity values, i.e. those attributable to the inability of the specific IM to account for possible spectral/energy differences between the signals. Results show practically coincident trends when HI is considered, with negligible differences in the case of PGV and Sa(T1). On the contrary, a different trend can be observed for PGA, with higher IDR values in the case of soft soil, especially at lower intensities. As an example, for PGA = 0.1 g, IDR is on average 0.22% for soft soil and 0.15% for stiff soil. These results suggest that PGA alone is not sufficient to properly define the fragility curve of a structure, since additional information on the site conditions is needed, whereas this limitation does not apply when efficient IMs are adopted.
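The size of the PGA discrepancy can be made explicit with a short worked ratio. The first relation uses only the two values quoted above. The second is a sketch that assumes both regressions have the form ln(IDR) = ln(a) + b ln(IM), with subscripts introduced here for illustration. It shows that the gap between the two soil classes is independent of intensity only when the two slopes coincide.

```latex
\left.\frac{IDR_{soft}}{IDR_{stiff}}\right|_{PGA = 0.1\,g}
  = \frac{0.22\%}{0.15\%} \approx 1.47,
\qquad
\frac{IDR_{soft}(IM)}{IDR_{stiff}(IM)}
  = \frac{a_{soft}}{a_{stiff}}\; IM^{\,b_{soft} - b_{stiff}}
```

In other words, at PGA = 0.1 g the mean drift demand on soft soil is roughly 47% higher than on stiff soil for the same PGA.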

Fig. 14 Comparisons of the IDR values between stiff (in red) and soft (in blue) soil results for the considered IM parameters (PGA a, PGV b, HI c, Sa(T = 1 s) d). Dashed lines refer to ± one standard deviation value

As reported above, the S&M tool makes it possible to select spectrally matched real accelerograms approaching a given target spectrum by varying different selection criteria. In order to test the influence of the tolerance adopted in the spectral matching phase (see Sect. 2) on the seismic response, NLDAs have been carried out for two sets of accelerograms, the response spectra of which are displayed in Fig. 3. Each set is selected by using a different tolerance with respect to the TR = 1000 years design spectrum, namely “loose” (tolerance 30%) and “strict” (tolerance 1%) matching. Results have also been compared with those obtained from the original (unscaled) signals. The IM vs IDR relationship obtained from the NLDAs is plotted in Fig. 15 along with the lognormal probability density function (PDF) for both IM and IDR data. Since the same analyses underlie all the considered IMs, the IDR statistics do not depend on the IM: the IDR median value (θIDR) is 1.07% for the unscaled signals, while a similar, slightly lower value (about 0.8%) has been found for both sets of spectrally matched ones. As expected, the logarithmic standard deviation (βIDR) decreases from unscaled (0.96) to spectrally matched accelerograms, and it decreases further as the matching tolerance is tightened (i.e. 0.31 for the “loose” and 0.21 for the “strict” sub-set).

Fig. 15 IDR vs IM (in terms of PGA a, PGV b, HI c, Sa(T = 1 s) d) results obtained for unscaled TR = 1000 y accelerograms and the corresponding spectrally matched ones for two different values of matching tolerance (“loose” and “strict”). Lognormal PDF values for both IDR and IM data are also plotted. Note that θIDR and βIDR are identical for all the IMs considered and, consequently, they are reported only in Fig. 15a

For a given IM, almost coincident θIM values have been obtained from the three sets of signals. On the contrary, significant differences have been found in terms of βIM, for which the highest values refer to the unscaled signals while lower values have been calculated for the spectrally matched ones. For example, in the case of Sa(1 s), βIM is 0.54 for the unscaled set, while it decreases to 0.23 and 0.17 for the “loose” and “strict” sets, respectively. It is worth noting that, using HI as intensity measure, very large differences have been found between the two sets of spectrally matched data. Specifically, βIM is 0.11 for the “loose” matching criteria and 0.01 for the “strict” ones, because strict spectral matching implies an almost invariant area under the spectrum and hence nearly the same HI value.
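The link between matching tolerance and the dispersion of HI can be illustrated with a toy example. The sketch assumes the usual definition of the Housner intensity as the area under the pseudo-velocity spectrum between 0.1 and 2.5 s, and it perturbs a hypothetical target spectrum with independent random factors within ± the tolerance. The spectra of real records deviate from the target in a correlated way across periods, so the numbers produced here are not comparable with the βIM values quoted above. The target shape, the function names and the number of records are all assumptions.

```python
import numpy as np

def housner_intensity(periods, sa, t_min=0.1, t_max=2.5):
    """Area under PSV(T) = Sa(T) * T / (2*pi) between t_min and t_max,
    evaluated with the trapezoidal rule (assumed HI definition)."""
    t = np.asarray(periods, dtype=float)
    sa = np.asarray(sa, dtype=float)
    mask = (t >= t_min) & (t <= t_max)
    t = t[mask]
    psv = sa[mask] * t / (2.0 * np.pi)
    return float(np.sum(0.5 * (psv[1:] + psv[:-1]) * np.diff(t)))

def hi_dispersion(periods, target_sa, noise, tol):
    """Logarithmic standard deviation of HI for spectra lying within
    +/- tol of the target (noise holds factors in [-1, 1])."""
    spectra = target_sa * (1.0 + tol * noise)
    hi = np.array([housner_intensity(periods, s) for s in spectra])
    return float(np.std(np.log(hi), ddof=1))

periods = np.linspace(0.1, 2.5, 49)
target_sa = 2.5 / np.maximum(periods, 0.5)  # hypothetical target, m/s^2
rng = np.random.default_rng(0)
noise = rng.uniform(-1.0, 1.0, size=(20, periods.size))  # 20 toy records

beta_loose = hi_dispersion(periods, target_sa, noise, tol=0.30)
beta_strict = hi_dispersion(periods, target_sa, noise, tol=0.01)
```

Because HI integrates the spectrum over the whole period band, bounding the spectral ordinates within 1% of the target bounds HI within about 1% as well, which is why its dispersion nearly vanishes under strict matching.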

5 Concluding remarks

Within the 2019–2021 research agreement between the Civil Protection Department (DPC) and the Network of University Laboratories for Earthquake Engineering (ReLUIS), the WP4 “Seismic Risk Maps—MARS” work package is specifically devoted to updating the 2018 release of the Italian National Seismic Risk Assessment. Among others, different approaches in deriving the fragility curves of the building types mostly present in the Italian building stock are being adopted. In case of analytical approaches through non-linear dynamic analyses, accelerograms representative of the Italian seismic hazard have been selected using a tool named Select & Match—S&M. S&M is a user-friendly tool able to select both unscaled and spectrally matched accelerograms approaching a user-defined target spectrum in a broad period range by weighting different selection parameters in order to better comply with the user’s goals.

By taking advantage of the SIMBAD database covering the magnitude and distance range of interest for the seismic hazard at Italian sites, S&M has been used to select two sets of 125 accelerograms comprising the two horizontal components for both stiff and soft soil types, making it possible to derive site-independent fragility curves by using a cloud-like approach. The selected datasets mainly consist of unscaled (real) signals. However, in order to reach the higher damage levels for the building types at hand, it has proved necessary to also include a sub-set of frequency-scaled accelerograms.

After analyzing the signals in terms of four intensity measures (IMs), namely PGA, PGV, HI and Sa(1 s), the seismic response of a typical existing Italian four-storey RC frame building designed only for vertical loads has been evaluated through non-linear dynamic analyses. Results show that the selected signals were able to evenly populate all the EMS-98 damage levels, which is a key requirement for obtaining reliable estimates of the fragility curve parameters. Furthermore, with the sole exception of PGA, the considered IMs, i.e. PGV, HI and Sa(T1), showed only minor differences when comparing results from the two soil types, supporting their efficiency in the derivation of reliable fragility curves.

Finally, in order to check the influence on seismic response of the selection and matching criteria permitted by S&M, some statistical analyses have been carried out considering a sub-set of unscaled signals and the corresponding ones obtained by spectral matching using different tolerance levels (“loose” and “strict”, tolerance 30% and 1%, respectively). Results show that the median value of the considered engineering demand parameter (i.e. inter-storey drift ratio, IDR) is 1.07% for the unscaled signals, while a slightly lower value (about 0.8%) has been found for the two spectrally matched sets. Conversely, large differences have been found in terms of logarithmic standard deviation, whose values drop significantly from unscaled (0.96) to spectrally matched input motions (0.21 for the “strict” selection of signals). It can therefore be concluded that, moving from an unscaled selection of records to a strictly spectrally matched selection, the probability distribution of IDR changes only slightly in terms of median value, but can change sharply in terms of standard deviation. This appears to be the price to be paid for a site-independent definition of fragility curves, where the input ground motion is not scaled to fit either a specific hazard-based target spectrum or a specific spectral ordinate of a target spectrum. Instead, if the selection of input motions is constrained by the site-specific hazard, the resulting variability can decrease remarkably, but it would not be possible to consider the corresponding fragility curves as site-independent.