1 Introduction

Non-linear dynamic analysis has become the principal tool for the seismic assessment of engineering structures. Unlike response spectrum analysis, non-linear dynamic analysis takes into account energy absorption and force redistribution due to sequential deterioration in the inelastic range, as well as the contribution of higher modes. Non-linear dynamic analysis is generally understood to have two main components: first, a robust numerical model able to capture all possible sources of non-linearity (e.g., concentrated/distributed plasticity and second-order effects); second, a proper definition of the input excitation consistent with the seismic hazard level of the site of interest. The latter has been recognised in the literature as a major source of bias and uncertainty in the computation and interpretation of results [1, 2]. In this sense, relevant seismic codes, such as Eurocode 8 (EC8) [3], the American Standards ASCE/SEI 41-13 [4] and ASCE/SEI 7-10 [5], and the New Zealand Standard NZS 1170.5:2004 [6], provide simplified guidelines for selection and scaling. Furthermore, extensive research has been conducted on earthquake ground motion selection and spectral matching in the time and frequency domains; amplitude scaling based on Intensity Measures (IMs); and the use of ground motion simulations (e.g., source-based models, including deterministic, stochastic, and hybrid approaches, as well as site-based models) as input to non-linear dynamic analysis (see Rezaeian and Sun [7]).

Regarding technical guidance on selection and scaling, most seismic codes are similar in the number of accelerograms to be considered (the structural response can be estimated as the mean response when at least seven records are used, or as the maximum response when at least three records are used) and in the period matching range, established at 0.2T1–2.0T1 in EC8 [3] and 0.2T1–1.5T1 in both ASCE/SEI 41-13 [4] and ASCE/SEI 7-10 [5], with T1 the fundamental period of vibration of the structure under analysis. NZS 1170.5:2004 [6], however, proposes a different matching range, 0.4T1–1.3T1, and, more importantly, scaling factors 0.33 < k1 < 3.0 and 1.0 < k2 < 1.3 for individual records and the whole set, respectively. This was demonstrated to be more effective in reducing the record-to-record variability and individual mismatch, compared to uniform scaling [8]. Moreover, the target spectrum can be defined not only from code-based spectra at different return periods but also in terms of the Uniform Hazard Spectrum (UHS), which is one of the principal outputs of conventional Probabilistic Seismic Hazard Analysis (PSHA), obtained by evaluating hazard curves for various return periods [9, 10]. Disaggregation [11] can also be applied to the results of PSHA, leading to other targets such as the Conditional Mean Spectrum (CMS), which is closely related to IM-based selection and scaling [12, 13].
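As a rough illustration of the code provisions quoted above, the following Python sketch encodes the period-matching windows, the mean-of-seven versus maximum-of-three rule, and the NZS 1170.5 bounds on k1 and k2. The function names and dictionary keys are assumptions made for this sketch and do not come from any of the cited standards.

```python
import numpy as np

# Period-matching windows as multiples of T1, taken from the ranges quoted above.
# The dictionary keys are labels chosen for this sketch.
MATCHING_RANGE = {
    "EC8": (0.2, 2.0),
    "ASCE": (0.2, 1.5),
    "NZS1170.5": (0.4, 1.3),
}


def matching_window(code, t1):
    """Return the (lower, upper) period bounds in seconds for a given T1."""
    lo, hi = MATCHING_RANGE[code]
    return lo * t1, hi * t1


def design_response(peak_responses):
    """Mean of the peaks for seven or more records, maximum for three to six."""
    r = np.asarray(peak_responses, dtype=float)
    if r.size >= 7:
        return float(r.mean())
    if r.size >= 3:
        return float(r.max())
    raise ValueError("at least three records are required")


def nzs_factors_ok(k1, k2):
    """Check the NZS 1170.5 bounds on record (k1) and set (k2) scale factors."""
    return 0.33 < k1 < 3.0 and 1.0 < k2 < 1.3
```

For example, `matching_window("NZS1170.5", 1.0)` gives the bounds of the NZS window for a structure with T1 = 1.0 s.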

IMs can be interpreted as a link variable between seismology and earthquake engineering. In this regard, the statistical properties of such variables are first determined through PSHA, and then the structural response, conditioned on a specific level of IM, is estimated. The lognormal distribution has been found to be adequate for IMs [14]. Therefore, IM-based selection and scaling approaches usually seek to cover this theoretical distribution under the notions of efficiency and sufficiency [15]. Here, efficiency is associated with the level of variability or dispersion in the estimation of the structural response around the regression model for a given IM. Sufficiency, on the other hand, refers to the independence of a particular IM from other variables, such as magnitude and distance (Mw, R pairs), in estimating the mean prediction of an Engineering Demand Parameter (EDP). In other words, a selected IM should be efficient, leading to relatively small dispersion in the results of structural analysis, and sufficient, in such a way that the estimation of the structural response does not depend on variables that are commonly ignored (i.e., seismological characteristics). Furthermore, the use of simulated signals as an alternative to ground motion selection and scaling has been encouraged in recent years, mainly because of the large improvements in computational techniques for the simulation of site-specific earthquake scenarios [16, 17].
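The notions of efficiency and sufficiency can be made concrete with a minimal Python sketch. It assumes a linear regression in log space, ln(EDP) = a + b ln(IM), which is a common choice but is not prescribed by the text above. Efficiency is measured by the dispersion of the residuals, and sufficiency is probed through the slope of the residuals against magnitude. The function name and the returned quantities are illustrative assumptions; a formal sufficiency check would also examine the statistical significance of that slope.

```python
import numpy as np


def efficiency_sufficiency(im, edp, mw):
    """Fit ln(EDP) = a + b ln(IM) and inspect the residuals.

    Returns (a, b, dispersion, slope_mw), where `dispersion` is the standard
    deviation of the residuals (lower means a more efficient IM) and
    `slope_mw` is the trend of the residuals with magnitude (close to zero
    suggests the IM is sufficient with respect to magnitude).
    """
    ln_im = np.log(np.asarray(im, dtype=float))
    ln_edp = np.log(np.asarray(edp, dtype=float))
    b, a = np.polyfit(ln_im, ln_edp, 1)       # slope first, then intercept
    resid = ln_edp - (a + b * ln_im)
    dispersion = resid.std(ddof=2)            # two fitted parameters
    slope_mw, _ = np.polyfit(np.asarray(mw, dtype=float), resid, 1)
    return a, b, dispersion, slope_mw
```

The same residuals can be regressed against distance R in exactly the same way.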

Therefore, this literature review provides a comprehensive description and discussion of approaches to the selection and scaling of earthquake ground motions for engineering applications. The focus is on structural engineering applications and on works conducted in the last decade, since other review papers have covered research in the field from the early 1980s until 2010 [18, 19]. The aim is to provide a wider understanding of current research on the selection and scaling of earthquake motions for structural engineering applications, contributing to further investigations. The review is organised as follows: (1) Summary of preliminary works (research done before 2010); (2) Code-based selection and spectral matching; (3) Probabilistic assessment based on IMs; and (4) Use of simulated signals as an alternative to ground motion selection and scaling, which is covered only briefly.

2 Summary of Preliminary Works

This section aims to briefly describe the works that have made major contributions regarding selection and scaling methodologies before 2010. For better clarity, the preliminary works are categorised into two sub-sections focusing on selection and scaling on actual recorded motions and generation of artificial ground motions, respectively.

2.1 Selection and Scaling Based on Actual Recorded Motions

In the early 2000s, Kappos and Kyriakakis [20] studied the effects of scaling on elastic and inelastic spectra for strength and displacement. Various scaling methodologies were considered: Peak Ground Acceleration (PGA); Peak Ground Velocity (PGV); Arias Intensity (Ia); Root-Mean-Square (RMS) acceleration; Spectrum Intensity (SI), i.e., the area under the pseudovelocity spectrum; and Housner Intensity (HI). It was determined that SI scaling led to reasonably low scatter over the entire period range. The authors also analysed a 10-storey 3-bay Reinforced Concrete (RC) building using a scaling technique based on HI, finding covariation values between 10 and 40% in drift and member ductility, as well as reasonable scatter along the building height. Lee et al. [21] suggested a methodology to fit the response spectra of earthquake ground motions to a linear elastic design response spectrum. First, a pre-selection was performed according to the similarity between the recorded response spectra and the target design spectra. For this purpose, velocity over acceleration values (v/a) of 61.0, 91.5, and 112.0 cm/s/g were used for rock, stiff, and soft soil sites, according to Newmark and Hall parameters [22]. The fitting was carried out with respect to ground shaking intensities (PGA and PGV) and response characteristics, namely effective peak acceleration (EPA) and effective peak velocity (EPV). PGA calibration and spectral shape resulted in a better fit to the target design spectra over the full period range.
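To illustrate how some of the scaling quantities listed above are obtained from an accelerogram, the following Python sketch computes PGA, Arias Intensity and RMS acceleration from a uniformly sampled acceleration series in m/s². It uses the standard definition Ia = π/(2g) ∫ a(t)² dt with a trapezoidal integration. The function names and the dictionary layout are assumptions of this sketch, not taken from [20].

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def intensity_measures(acc, dt):
    """Return PGA, Arias Intensity and RMS acceleration of a record.

    `acc` is the acceleration series in m/s^2 and `dt` the time step in s.
    """
    a = np.asarray(acc, dtype=float)
    a2 = a ** 2
    # Trapezoidal integral of a(t)^2 over the record duration.
    integral = dt * (a2.sum() - 0.5 * (a2[0] + a2[-1]))
    duration = dt * (a.size - 1)
    return {
        "PGA": float(np.max(np.abs(a))),
        "Ia": float(np.pi / (2.0 * G) * integral),
        "RMS": float(np.sqrt(integral / duration)),
    }


def amplitude_factor(im_record, im_target, quadratic=False):
    """Amplitude scale factor that brings a record's IM to a target value.

    Set `quadratic=True` for measures that grow with the square of the
    amplitude, such as Arias Intensity.
    """
    ratio = im_target / im_record
    return float(np.sqrt(ratio)) if quadratic else float(ratio)
```

Multiplying the record by a factor k scales PGA and RMS acceleration by k but Arias Intensity by k², which is why the helper distinguishes the two cases.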

Malhotra [23] presented a method for selection and scaling of records for site-specific analysis in which smooth response spectra were matched with site response spectrum by scaling acceleration time series. The record was modified by an α factor (defined as the amplitude scaling factor) coherent with closer or distant events. The procedure was also claimed to work for obtaining ground motion pairs in an orthogonal direction. The work of Kurama and Farrow [24] tested the effectiveness of seven scaling methods in reducing scatter in peak lateral displacement demands considering non-linear single-degree-of-freedom (SDOF) and multi-degree-of-freedom (MDOF) models. It was demonstrated that proper scaling depends on multiple factors since methods which performed well for ground motions representative of stiff soil and far-field conditions lose their effectiveness for soft soil and near-field conditions for a wide range of structural characteristics. Then, Giovenale et al. [25] took advantage of the available results of non-linear analyses performed to study the spectral acceleration, Sa, at the fundamental period of the structure, T1, to prove the adequacy of alternative IMs. For that purpose, a regression of the results obtained from the original analyses was proposed, taking the alternative IM as an independent variable (direct method) and also rebuilding the probability density function of the Demand Measure (DM) given a defined value of the candidate IM by means of the total probability theorem.

Subsequently, Naeim et al. [26] employed genetic algorithms to preselect, among thousands of earthquake records, a sub-set of records that better match a design spectrum and, after that, to determine a scalar factor for fitting the response spectra against a site-specific design spectrum. Two practical examples were adopted, selecting seven scaled records to match a target spectrum from a dataset with 1496 records and selecting the appropriate scaling factors for a preselected set of seven (bi-directional) earthquake records. Bommer and Acevedo [27] studied the influence of geophysical and response spectral search criteria to provide guidelines for selecting real accelerograms in dynamic analysis. They highlighted the importance of achieving a good match with the design earthquake magnitude and the site classification of the project. In addition, it was recommended to perform a search within the results of the first sweep in terms of matching spectral shapes and spectral amplitudes. Later, Baker and Cornell [28] introduced a vector-valued IM based on spectral acceleration and epsilon ε (the difference between the spectral acceleration of a record and the mean of a ground motion model at the given period). The authors showed that ε is an indicator of spectral shape and is related to structural response. Generic structures and an RC moment frame building were analysed to demonstrate that the omission of ε value led to conservative estimates when computing the drift hazard. This research was subsequently expanded by adopting a conditional response spectrum with a level of Sa(T1) and its associated mean (disaggregation-based) causal magnitude, distance and ε value. It was argued that this spectrum is a more appropriate target for record selection, and the reductions in bias and variance of resulting structural response are comparable to the reductions achieved by using a vector-valued measure of earthquake intensity [12].
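For reference, ε is usually written in normalised form, and the conditional spectrum mentioned above follows from it. The notation below (μ, σ, ρ, and T* for the conditioning period, here T1) is introduced for illustration and is an assumption about the commonly used formulation rather than a quotation from [12, 28].

```latex
\varepsilon(T) = \frac{\ln S_a(T) - \mu_{\ln S_a}(M_w, R, T)}{\sigma_{\ln S_a}(T)}
\qquad
\mu_{\ln S_a(T_i)\,\mid\,\ln S_a(T^*)} = \mu_{\ln S_a}(M_w, R, T_i)
  + \rho(T_i, T^*)\,\varepsilon(T^*)\,\sigma_{\ln S_a}(T_i)
```

Here μ and σ are the mean and standard deviation of ln Sa predicted by a ground motion model, and ρ(Ti, T*) is the correlation between ε values at the two periods.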

Shome et al. [29] examined the influence of bins of records, grouped by magnitude Mw and distance R, on the non-linear response of a 5-storey steel structure. Among multiple alternatives, it was determined that scaling records to the 5% damped spectral acceleration at the fundamental frequency of the structure leads to better results (reduction in variance). Iervolino and Cornell [30] compared the non-linear response of three different SDOF systems and two moment-resisting frames, one of reinforced concrete and the other of steel, under two sets of records. The first set was selected to represent a specific magnitude and distance scenario, and the other was chosen randomly from a large catalogue. The study concluded that selecting records with respect to Mw and R values makes little difference to the results. Iervolino et al. [31] analysed the sensitivity of non-linear demand measures (ranging from displacement ductility ratio to equivalent number of cycles) to ground motion duration (small, moderate, and large duration), considering several case studies. The impact of duration on structural failure probability was evaluated through fragility curves, and the authors argued that the duration content of ground motion is statistically insignificant for displacement ductility and cyclic ductility demand. Dhakal et al. [32] proposed a method to identify critical earthquake ground motions for seismic performance assessment. The steps are as follows: (1) setting a suitable suite of ground motions and an appropriate IM; (2) developing the non-linear model of the structure and performing Incremental Dynamic Analysis (IDA); and (3) analysing the results in terms of 50th and 90th percentile performance bounds to identify the critical ground motions that are close to these defining probabilistic curves at ground motion intensities. The proposed methodology was applied to the assessment of an RC highway bridge.

Next, Beyer and Bommer [33] revised code provisions for the selection and scaling of ground motions in bi-directional analysis. They employed the geometric mean of the spectral ordinates of the two horizontal components to define the target spectrum and to select and scale ground motion records. It was shown that the structural response varies depending on the angle of incidence of the ground motion with respect to the structural axes. To examine the influence of strong-motion duration on inelastic structural response, Hancock and Bommer [34] analysed an 8-storey RC wall-frame building under the action of 30 accelerograms with different durations. The authors established that duration has no influence on damage measures based on peak response, such as inter-storey drift, but is correlated with cumulative damage measures, such as absorbed hysteretic energy and fatigue damage. Kottke and Rathje [35] introduced a method for the selection and scaling of ground motions that fits a target acceleration response spectrum while controlling the variability within the ground motion suite. In this method, motions are selected to match the target spectral shape, and the amplitude and standard deviation of the target are then fitted by adjusting the individual scaling factors of the motions. In addition, the method tries each possible suite of motions to find the one that provides an acceptable fit to the target spectrum. Iervolino et al. [36] investigated the possibility of finding unscaled records from the European Strong-Motion database (ESM) fulfilling, as much as possible, EC8 requirements. Records for A, B, and C site classes were found (Vs30 > 800 m/s; 800 m/s ≥ Vs30 > 360 m/s; and 360 m/s ≥ Vs30 > 180 m/s, respectively), while for very soft soil sites (Vs30 ≤ 180 m/s) it was not possible to find acceptable solutions. Moreover, it was found that unscaled record sets strictly matching EC8 spectra resulted in a large record-to-record variability in the spectral ordinates within the same set.
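The geometric mean of the two horizontal components used in [33] is simple to compute once both response spectra are available on the same period grid. A minimal Python sketch follows; the function name is chosen for this example.

```python
import numpy as np


def geometric_mean_spectrum(sa_x, sa_y):
    """Geometric mean of two horizontal spectra sampled at the same periods."""
    sa_x = np.asarray(sa_x, dtype=float)
    sa_y = np.asarray(sa_y, dtype=float)
    return np.sqrt(sa_x * sa_y)
```

Note that this version depends on the orientation of the recording axes, so rotating the two components changes the result.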

In order to estimate the number of scaled/matched accelerograms required for inelastic dynamic analysis, Hancock et al. [37] considered the response of an 8-storey RC structure to accelerograms linearly scaled or spectrally matched using five different techniques. The techniques include selecting real records based on seismological characteristics; scaling to the target spectral acceleration at the initial period; scaling to the target spectrum over a range of periods; using wavelet adjustments to match the target spectrum; and using wavelet adjustments to match multiple target spectra for multiple damping ratios. The authors determined that the number of required records and the degree of bias systematically decrease as one applies more constraints on the scaling and matching of accelerograms, especially when using wavelet adjustments to match multiple target spectra at multiple damping ratios. Finally, Iervolino et al. [38] presented a study on the selection of real accelerograms for seismic analysis of bridges according to EC8 as an extension of their previous work in [36]. Multiple sets were downloaded from the ESM dataset [39], including the vertical component of the ground motion, respecting the spectral matching requirements in broad period ranges as much as possible. Also, sets of normalised code-compatible horizontal accelerograms are considered to reduce the record-to-record variability of the spectra, and to obtain sets which are independent of the anchoring value of the code spectrum.

2.2 Generation of Artificial Ground Motions

Naeim and Lew [40] first pointed out significant problems with the usage of frequency-domain scaled design-spectrum-compatible ground motion time series. The authors computed a set of records for six pairs of horizontal ground motion time series using a self-developed computer tool and the WES-RASCAL code to reproduce synthetic motions [41], assuming in both cases a ratio of spectral acceleration areas (signal/target) less than 2% and average error in all frequencies lower than 5% to achieve convergence. The records were then applied for the analysis of a regular six-storey seismic-isolated hospital building. It was found that results from frequency-domain scaling were about 300% larger than time-domain scaling, and more than 10 times larger than real records. The findings were due to the energy content in the design spectrum, which is not seen in recorded time series.

In 2006, Hancock et al. [42] proposed an improvement to the wavelet-based program RspMatch, first developed by Abrahamson [43], through matching accelerograms to the pseudo-acceleration, displacement spectral ordinates, and absolute acceleration spectrum at several damping ratios independently. Atkinson [44] made use of the stochastic finite-fault method to generate earthquake accelerograms matching the National Building Code of Canada (2005 NBCC) UHS with a 2% chance of being exceeded in 50 years for site Classes A, C, D and E [45]. The definition of scaling factors was recommended to improve the match of the selected record to the UHS over the specified period range. Finally, the work of Giaralis and Spanos [46] proposed a wavelet-based technique to match artificial seismic accelerograms with a given displacement design/target spectrum from EC8. First, stochastic modelling is used to obtain a family of simulated non-stationary earthquake records whose response spectrum is, on average, in good agreement with the target spectrum, and then harmonic wavelets are employed for modifying the simulated records iteratively to satisfy the compatibility criteria for artificial accelerograms prescribed by EC8.

3 Code-Based Selection and Spectrum Matching

This section presents a comprehensive review of recent investigations into the selection and scaling of real records to match a target spectrum, which can be defined by code provisions or by other targets such as the UHS and CMS. These techniques are, in general, well known to practitioners in structural design and earthquake engineering. Throughout the literature, and within this review, the term spectral matching can also refer to special methodologies implemented to achieve convergence with given target spectra [42, 46].

A comparative analysis of current code requirements and common practices in several standards (United States, China, European Union, New Zealand, and Taiwan) for ground motion selection was conducted by Hachem et al. [47]. Although those codes exhibit marked differences in the definition of response spectra and seismic demands, it was concluded that their selection criteria for ground motion records have many similarities. Only the NZS differs in the minimum number of motion records to be used in the analysis and in the period range for matching, being less restrictive than EC8. Iervolino et al. [48], in turn, developed REXEL, a computational tool for code-based record selection. REXEL allows searching for suites of waveforms from the ESM that are compatible with reference spectra (user-defined or code-based). To optimise the selection process, the algorithm uses an expression based on the Sum of Squared Errors (SSE) to rank how much the spectrum of an individual record deviates from the reference spectrum over broad period ranges. The efficiency of REXEL was demonstrated through several examples considering the selection of multi-component GM suites according to the New Italian Building Code (NIBC) [49] and EC8 criteria for different site classes and limit states or return periods. An internet version of REXEL was presented afterwards, allowing the search of records in the ITalian ACcelerometric Archive (ITACA) repository [50]. Jayaram et al. [51] proposed an efficient algorithm to match the target response spectrum mean and variance. The algorithm generates multiple response spectra from a target distribution and then selects recorded motions whose response spectra individually match the simulated response spectra. A greedy optimisation [52] was implemented to improve the match between the target and the sample means and variances. Selection is performed by computing the SSE for each ground motion in the database and choosing the ground motion with the smallest SSE. Considering a scenario with a strike-slip mechanism, Mw 7.0, distance to rupture of 10 km, and Vs30 of 400 m/s, 40 records were selected. For SDOF and MDOF systems, it was observed that considering the variance of the response spectrum as an additional constraint had no significant effect on the median response but considerably increased its dispersion (logarithmic standard deviation).
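The SSE-based matching step described above can be sketched in Python as follows. The sketch assumes that spectra are compared in log space on a common period grid and that each database record may be used only once; both are simplifying assumptions of this example. It covers only the initial assignment of records to simulated spectra, not the subsequent greedy refinement of [51].

```python
import numpy as np


def select_by_sse(ln_sa_db, ln_sa_sim):
    """Assign to each simulated spectrum the unused record with the smallest SSE.

    ln_sa_db  : array (n_records, n_periods) of log spectral accelerations
    ln_sa_sim : array (n_select, n_periods) of simulated log spectra
    Returns the list of selected record indices, in order.
    """
    db = np.asarray(ln_sa_db, dtype=float)
    available = np.ones(db.shape[0], dtype=bool)
    chosen = []
    for sim in np.asarray(ln_sa_sim, dtype=float):
        sse = np.sum((db - sim) ** 2, axis=1)   # SSE against every record
        sse[~available] = np.inf                # skip records already used
        best = int(np.argmin(sse))
        chosen.append(best)
        available[best] = False
    return chosen
```

The simulated spectra themselves would typically be drawn from a multivariate normal distribution of ln Sa with the target mean and covariance, for instance with `numpy.random.Generator.multivariate_normal`.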

Sextos et al. [53] investigated the limitations of the EC8 earthquake ground motion selection framework in the assessment of an existing irregular RC building subjected to the 2003 Lefkada earthquake. Parametric analyses were conducted for different EC8-compliant sets of records to quantify the discrepancy in the structural response due to record-to-record and set-to-set variability. Significant intra-set scatter in the inelastic response of the case-study building (mostly in structural damage within the same set of seven pairs of records, for most ground-floor columns) led the authors to conclude that the implementation of the EC8 provisions for the selection of earthquake records may be misleading. The Harmony Search (HS) metaheuristic algorithm [54] was adapted to obtain input datasets compatible with the EC8 design spectra for different soil types [55]. The pool of motions consisted of 352 records selected from the Pacific Earthquake Engineering Research (PEER) Center strong-motion database [56]. The EC8-based selection was formulated as a constrained optimisation problem with multiple penalty functions. In a first approach, datasets for soil classes A, B, C, D, and E were obtained without matching a specific soil class. The second approach instead considered 186 and 166 ground motions for soil classes C and D, exclusively. It was observed that all the obtained ground motion datasets could satisfy all the given constraints. Similarly, Shahrouzi and Sazjini [57] implemented HS for optimal scaling and selection of accelerograms, with the main difference that a search refinement was also implemented. In this case, the selected records are fixed, while their scaling factors are further optimised, resulting in a minimal error of 5% in the spectral matching. The flowchart in Fig. 1 describes the proposed HS algorithm.

Fig. 1 Flowchart of the proposed HS algorithm (Source: Shahrouzi and Sazjini [57])
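To give a feel for how an HS-type search operates on continuous variables such as scaling factors, a generic textbook-style Harmony Search is sketched below in Python. It is not the implementation of [55] or [57]: the parameter names (`hms`, `hmcr`, `par`, `bw`), their default values and the bound handling are assumptions made for illustration, and constraints would have to be added to `objective` as penalty terms.

```python
import numpy as np


def harmony_search(objective, lower, upper, hms=10, hmcr=0.9, par=0.3,
                   bw=0.05, iters=2000, seed=0):
    """Minimise `objective` within box bounds using a basic Harmony Search.

    hms  : harmony memory size
    hmcr : probability of drawing a value from memory
    par  : probability of pitch-adjusting a value drawn from memory
    bw   : pitch-adjustment bandwidth as a fraction of the variable range
    """
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    memory = rng.uniform(lower, upper, size=(hms, dim))
    cost = np.array([objective(x) for x in memory])
    for _ in range(iters):
        new = np.empty(dim)
        for j in range(dim):
            if rng.random() < hmcr:
                # Memory consideration, optionally followed by pitch adjustment.
                new[j] = memory[rng.integers(hms), j]
                if rng.random() < par:
                    new[j] += bw * (upper[j] - lower[j]) * rng.uniform(-1.0, 1.0)
            else:
                # Random re-initialisation of this variable.
                new[j] = rng.uniform(lower[j], upper[j])
        new = np.clip(new, lower, upper)
        c = objective(new)
        worst = int(np.argmax(cost))
        if c < cost[worst]:
            # Replace the worst harmony when the new one is better.
            memory[worst] = new
            cost[worst] = c
    best = int(np.argmin(cost))
    return memory[best], float(cost[best])
```

In a record-scaling setting, the decision vector could hold one scaling factor per record, and `objective` could return the mismatch between the mean scaled spectrum and the target spectrum.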

In 2013, Katsanos et al. [58] introduced ISSARS as a MATLAB-based environment for ground motion selection coupled with structural analysis in SAP2000 to consider the variability of critical structural response parameters (bending moments and shear forces of structural elements and lateral storey displacements). A structural model of an RC building in the city of Thessaloniki, Greece, was used to demonstrate the applicability of ISSARS. A 15% tolerance bandwidth was established for the estimated median response, for which a δ parameter was set as a spectral matching indicator, and 280 bidirectional time history analyses were performed using the selected records afterwards. These authors discussed the importance of combining earthquake ground motion selection and structural analysis to meet code requirements while increasing reliability. Ergun and Ates [59] summarised basic methodologies and criteria for selecting and scaling ground motion time histories according to ASCE 7-05 and EC8. The usage of time-domain scaling was highlighted by arguing that this method only scales the amplitude of the seed motions and does not change its frequency content. The authors observed considerable variability with respect to the target spectrum for both averages of scaled far-field and near-field ground motions. Smerzini et al. [60] extended the previous research [48, 50] by selecting displacement-spectrum compatible earthquake ground motions from real accelerograms, REXEL-DISP. In this way, a strong-motion database was gathered, with recordings from shallow crustal earthquakes with epicentral distances of less than about 30 km, covering as homogeneously as possible the magnitude and distance ranges of interest for seismic hazard at Italian sites. REXEL-DISP software demonstrated, in most cases, to be able to obtain unscaled or lightly scaled record sets closely approaching the target spectrum in a broad period range.

Subsequently, Ye et al. [61] modified the HS metaheuristic by incorporating a fitness priority-based ranking method to handle additional constraints when minimising the difference between the design response spectrum from the Chinese GB50011-2010 [62] and the mean response spectrum of selected and scaled ground motions within a period range of interest. The search space consisted of a total of 704 records selected from the PEER database [56] according to magnitude, distance, and site conditions. Using the proposed technique, sets of seven ground motions were selected for structures with periods of 0.45, 0.90, and 1.80 s (typical short-period, medium-period, and long-period structures). The technique exhibited acceptable performance, fitting the design spectrum with RMS error values of 3.4% and 7.5% in the worst-case scenario. Later, Kaveh and Mahdavi [63] proposed a simple approach for spectral matching of ground motions utilising the wavelet transform and the colliding bodies metaheuristic optimisation. They proposed to decompose the original ground motion using the wavelet transform, with each level covering a special frequency range, and then to multiply each level by a variable. These variables are estimated with metaheuristic optimisation to minimise errors between the response and target spectra. The methodology was tested on a sample of 12 recorded accelerograms to be matched with the EC8 design spectrum for soil classes A and B. The same period matching range was defined as in [61], and the maximum number of iterations was limited to 300, for which the algorithm exhibited good performance. Ha and Han [64] proposed a simplified methodology for matching the target response spectrum mean and variance. First, a unique motion is selected from a library considering only the target response spectrum mean, without variance, and then motions and their scaling factors are selected according to the smallest SSE for mean and variance. The methodology was implemented to derive 40 motions from 8 libraries that were constructed with records from the Next Generation Attenuation relationships (NGA) database [65]. The numerical results revealed that the accuracy of the selection process increases with the number of records.

Araújo et al. [8] presented a broad study in which the EC8, ASCE41-13 and NZS1170.5:2004 criteria for record selection were compared, with emphasis on the influence of the number of real records selected on the estimation of the mean seismic response. The main contribution of this investigation was to propose an additional spectral mismatch limit (± 50%) for each individual record to reduce the record-to-record variability within the period range of matching. Considering the additional mismatch constraint, the records derived from each code-based methodology were used as input for the seismic assessment of a steel building composed of moment-resisting frames. Four configurations of the same building were considered to comply with different limits for the inter-storey drift according to EC8 criteria. Additionally, the records were linearly scaled to match spectra for the Damage Limitation (DL) and Near Collapse (NC) limit states. Pant and Maharjan [66] analysed the effects of selection and scaling of ground motions on the response of seismically isolated structures subjected to bidirectional excitation. The selection process was either random or based on matching the shape of the response spectrum to the target spectrum. According to ASCE 7, both the design earthquake and the Maximum Considered Earthquake (MCE) motions are needed for the design of isolated structures; thus, the MCE spectra were used to scale earthquakes. The authors determined that linear scaling performed better when pulse-like motions are considered, and that either linear scaling or spectral matching can be used for non-pulse-like motions.
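The individual mismatch limit can be illustrated with a short Python check. The sketch assumes that the mismatch is measured as the relative difference between the record spectrum and the target spectrum at every period inside the matching window, which is one possible reading of the ± 50% criterion; the exact definition used in [8] may differ. The function name and default window are chosen for this example.

```python
import numpy as np


def within_mismatch(sa_record, sa_target, periods, t1, limit=0.5,
                    window=(0.2, 2.0)):
    """True if the record stays within +/- `limit` of the target spectrum.

    Only periods between window[0]*t1 and window[1]*t1 are checked.
    """
    periods = np.asarray(periods, dtype=float)
    sa_record = np.asarray(sa_record, dtype=float)
    sa_target = np.asarray(sa_target, dtype=float)
    mask = (periods >= window[0] * t1) & (periods <= window[1] * t1)
    relative = sa_record[mask] / sa_target[mask] - 1.0
    return bool(np.all(np.abs(relative) <= limit))
```

Records failing this check would be discarded before the mean spectrum of the set is compared with the code requirements.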

Afterwards, Han and Ha [67] proposed a four-stage algorithm for the selection and scaling of motions for two-dimensional analysis following the ASCE 7-10 criteria. First, an individual scaling factor is determined to match a target response spectrum. Then, guidelines are provided to select the required number of motions. Finally, a re-scaling factor is determined according to the distance between the target and mean response spectra at the fundamental period T1, to make sure that the mean response spectrum of the selected motions is larger than the target response spectrum. The selected motions were employed to conduct non-linear analysis of 12 SDOF systems and one MDOF system taken from [68]. To limit the error in the median response to a maximum of 20%, it was found that the number of motions in the library and the number of motions selected should be at least 130 and 10, respectively. Likewise, 74 and 7 motions (library and selected, respectively) should be considered to limit the error to 30%. Two kinds of Genetic Algorithms (GAs), real-permutation and binary-permutation, were adapted by Yaghmaei-Sabegh et al. [69] for scaling earthquake records. The dataset consisted of 374 records from shallow crustal events, and the target design spectrum was determined from the Iranian seismic design code. Both the real-permutation and the binary-permutation coded GAs exhibited mean square errors below 1% in the matching process with respect to the target design spectrum.
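The re-scaling step can be pictured with a small Python sketch. It assumes that the factor is simply the ratio of the target to the mean spectral ordinate at T1, never smaller than one, which is a plausible reading of the description above rather than the exact rule in [67]. The function name is hypothetical.

```python
import numpy as np


def rescale_factor(sa_scaled, sa_target, periods, t1):
    """Set-wide factor lifting the mean spectrum to the target at T1.

    sa_scaled : array (n_records, n_periods) of individually scaled spectra
    sa_target : array (n_periods,) with the target spectrum
    """
    periods = np.asarray(periods, dtype=float)
    mean_spec = np.asarray(sa_scaled, dtype=float).mean(axis=0)
    mean_t1 = np.interp(t1, periods, mean_spec)
    target_t1 = np.interp(t1, periods, np.asarray(sa_target, dtype=float))
    # Never scale the set down: the factor is at least 1.
    return float(max(1.0, target_t1 / mean_t1))
```

Applying the returned factor to every record in the set preserves the relative scaling between records while raising the mean spectrum at T1.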

A full version and detailed description of the program SelEQ were presented by Macedo and Castro [70]. The program has a graphical user interface subdivided into three modules: seismological module; preliminary selection (NGA-WEST2 database); and record selection. Preliminary versions of the program allowed only code-based selection with an additional constraint of ± 50% individual spectral mismatch to reduce the record-to-record variability [71]. This new version, however, also made it possible to obtain the CMS (which will be discussed in detail in the next section of this paper) for the European territory by making use of OpenQuake (OQ) [72] and the seismic hazard model of the SHARE project [73] to conduct PSHA. The constrained objective function was established for both code-based and CMS-based selection, to be solved using the adaptive HS metaheuristic. For code-based selection, the applicability of the tool was illustrated with the results of the study previously conducted in [8]. Two examples were developed for CMS-based selection: first, for two moment-resisting frame structures with vibration periods equal to 1.1 s and 1.63 s in Istanbul, considering several occurrence probabilities and suites of 40 scaled records; and then for six structures with vibration periods equal to 0.25 s and 0.35 s (2-storey structures), 0.37 s and 0.93 s (4-storey structures), and 0.42 s and 1.14 s (5-storey structures), located in Porto and Lagos (Portugal), for a probability of occurrence of 5% in 50 years. Figure 2 depicts the results for both CMS-based selection cases. It is worth noting that even when all the conditions could not be met, SelEQ returned the set that best fitted the objective function.

Fig. 2 CMS-based SelEQ application example (Source: Macedo and Castro [70])

Shakeri et al. [74] pointed out the limitations of the ASCE 7-10 selection and scaling procedure of earthquake motions for the analysis of tall buildings (e.g., contribution of the higher modes and inelastic response). To this end, the authors proposed a pushover-based ground motion scaling procedure to consider the inelastic response and the contribution of higher modes. In this method, the peak displacement of the equivalent inelastic SDOF system under the action of the scaled record should be consistent with the inelastic target displacement. A simplified version of the method was also proposed. Both versions were assessed by analysing 3 regular (8-storey, 14-storey, and 20-storey) buildings and 1 irregular 20-storey building. The maximum errors in the mean storey drift ratio profiles were 9.9%, 16.4%, 18.9% and 15.5% for the four structures under evaluation, against a maximum of 54.6% for the ASCE 7-10 procedure. Thus, both methodologies, original and simplified, demonstrated higher accuracy than the code methodology. Reyes et al. [75] proposed an adjustment to the ASCE/SEI 7-10 ground motion scaling procedure, considering the spectral shape and an additional scaling factor for the full record set for 3D response history analysis of multi-storey plan-asymmetric buildings. The numerical results demonstrated that the improved procedure led to conservative estimates (by around 15%), while the conventional ASCE/SEI 7-10 procedure underestimated the response by about 29%. It was also observed that the advantage of the improved methodology over the conventional code procedure grows as torsional irregularity increases. The work of Tian et al. [76] investigated the influence of the selection of motions on the seismic performance of electricity transmission towers. The records were selected to match the Institute of Electrical and Electronics Engineers (IEEE 693) spectra, which consider a 2% damping ratio. 
The multiple records selected by matching the same target spectrum led to different inter-segment displacement ratios. The authors concluded that the geomean criterion may generate overly conservative results in the seismic demand assessment of the prototype structure and that randomly selected records provided results comparable to those computed using historical records. Finally, the authors argued that artificial records may not always be appropriate for seismic analysis of transmission towers.

Subsequently, Moschen et al. [77] proposed the use of multi-objective optimisation through genetic algorithms to select ground motions for the seismic assessment of structures. The authors pointed out that, unlike single-objective optimisation, the proposed approach did not require weighting functions for the different fitness/cost functions. These authors presented a rigorous mathematical basis to incorporate an arbitrary number n of fitness functions in the optimisation process, as is the case for bidirectional analysis. Yet, it is worth noting that the presented multi-objective optimisation applied the same scale factor to all records within the set. Lombardi et al. [78] favoured the use of Linear Time-History Analysis (LTHA) as a feasible seismic design method based on EC8. The motion sets were selected following EC8 and FEMA P-1050 recommendations. The authors emphasised that the current version of EC8 does not distinguish between ground motion selection for linear and non-linear analysis. In that sense, a direct comparison between linear and non-linear analysis was presented regarding Demand/Capacity (D/C) ratios for various structural members at each level. More conservative results were computed using LTHA, and it was determined that the average values of the D/C ratios are strongly dependent on the strongest earthquakes within the suite. Ucar and Merter [79] examined the inelastic response of 6-storey, 8-storey, and 10-storey RC frames under recorded motions matching the design spectra of the Turkish Seismic Design Code 2007 [80], Uniform Building Code 1997 [81], and EC8 [3]. Minimal differences in base shear forces and maximum roof displacements were found among the three code-based matching procedures, noting that EC8 matching led to slightly higher results, consistent with its higher spectral ordinates. Kaveh et al. [82] conducted a comparative study of the performance of Multi-Objective Particle Swarm Optimisation (MOPSO), Non-dominated Sorting Genetic Algorithm II (NSGA-II), and Pareto Envelope-based Selection Algorithm II (PESA-II) for ground motion record selection based on the ASCE 7-16 standard. The results showed a better performance of the NSGA-II technique, with a better match to the target spectral acceleration and less dispersion in the period range of scaling.

Similarly, Mergos and Sextos [83] adopted GAs to select recorded earthquake motions. The selection was based on spectral matching and parameters coherent with the regional seismology, soil conditions, motion intensity, and duration, among others. After the algorithm provides the sets that best fit the design values, the user can select the most suitable solution based on inspection and/or engineering criteria. The methodology was applied for the EC8-based selection of records for non-linear analysis of a structure with a fundamental period of 0.75 s, for which a good level of agreement with the objective values was observed. Cavdar et al. [84] addressed the influence of the number of motions, period range, number of periods within the matching range, and distribution of weight factors at selected periods in code-compliant records for non-linear bidirectional analyses of short-period and long-period structures. The target spectrum was defined from the Turkish Seismic Design Code 2007 [80], with a probability of exceedance of 2% in 50 years. Because of the changes in the response spectrum of records in the short-period range, the authors observed that the amplitude of the scale factors, and therefore the seismic response of short-period structures, is more likely to change compared to long-period structures. Next, the authors examined differences when using arithmetic and logarithmic values in the least-squares method to match a target spectrum defined, in this case, from the NEHRP 1994 design response spectra [85]. 3-storey, 9-storey, and 20-storey steel moment frames were adopted as case-study buildings. Although both arithmetic and logarithmic values can be used to predict the mean or median of the structural response, the relative errors calculated through logarithmic values were larger than those for arithmetic values. 
It was also observed that logarithmic values reduced variability in the structural responses for the case of long-period structures with a considerable non-linear response (i.e., 20-storey structure).

Battaglia et al. [86] briefly addressed the selection process of 50 recorded accelerograms taken from the PEER Ground Motion Database [56] and scaled to match the seismic hazard of Seixal city in Portugal according to the Portuguese National Annex of EC8. The scaling process was limited to the period range between 0.1 and 1.0 s and to magnitudes between 6.5 and 7.0, to be compatible with the Life Safety (LS) limit state spectrum. The records were then linearly scaled to different levels of PGA for the definition of fragility curves. Georgioudakis and Fragiadakis [87] developed a technique for selecting ground motions from large databases. They proposed using an evolutionary optimisation algorithm to solve the problem of selection and scaling to fit a target spectrum from UHS or CMS. The problem was defined either as a single-objective problem, to match the mean spectrum over the period range of interest, or as a multi-objective problem, to match both mean and variance. More recently, Karimzadeh et al. [88] studied the effects of ASCE/SEI 7-16 [89] and EC8 selection criteria on the seismic assessment of an unreinforced masonry shear wall. The seismological parameters for selection were restricted to: Mw values in the range 5.5 to 9.0; strike-slip mechanism; RJB distances lower than 25 km; and soil types C (Vs30: 360–760 m/s) and D (Vs30: 180–360 m/s). In general, higher displacement demands were computed using the EC8 selection criteria. Rui et al. [90] proposed the use of normalised modal-mass participation factors as weight variables when matching target response spectra to consider the influence of higher modes. For the study, the Newmark spectrum was employed as a matching target. The accuracy of the method was tested by estimating the mean structural response of long-period structures. Moreover, the authors suggested the suitability of the proposed technique for base-isolated buildings. Ertürk et al. [91] discussed the effects of frequency-domain and time-domain scaling methods based on the 2018 Turkey Building Earthquake Code [92] in the seismic response assessment of a finite element model of the historical masonry clock tower built in 1894 in Çorum, Turkey. The study involved the scaling of horizontal as well as vertical components of records to the target design spectrum. A better matching to the target spectrum was observed by means of the frequency domain method. Furthermore, the maximum response metrics were computed using the frequency domain method.

Kayhan et al. [93] proposed a mathematically rigorous solution to obtain recorded motions for one-directional analysis of 2D structural models and motion pair sets for the bi-directional analysis of 3D models. It is worth noting that this method allows for selection based on a scenario-based spectrum, UHS, or CMS. As in previous research [70], the optimisation problem was solved using HS metaheuristic optimisation, and the additional constraints of the problem were formulated as penalty functions. Examples based on the TBEC [92] and ASCE 7-16 design spectra, a scenario-based spectrum, and the CMS for a vibration period of 1.00 s were developed to illustrate the effectiveness of the algorithm in meeting the predefined conditions. Zhang et al. [94] proposed a new weighted scaling method to match the Newmark–Hall spectrum at different seismic hazard levels. The error function was established as the SSEs in the acceleration-, velocity-, and displacement-sensitive regions. It was determined that 7 to 10 records are required to compute results with prediction errors lower than 20%, regardless of the weighting adopted in the Newmark–Hall spectrum. Manfredi et al. [95] introduced a user-friendly software, Select & Match (S&M), for the selection and matching of actual records with a target spectrum within a period range. The algorithm implemented in the software can be summarised in three steps. First, motions are filtered by their magnitude; second, mean and maximum normalised errors are estimated and compared against a fixed value; and third, earthquake motions are listed from best to worst according to the defined selection criteria. Two sets of 125 records, one for each site category (i.e., stiff sites A, B and soft sites C, D) were selected to match the elastic design spectra of the Italian seismic code (NTC18, 2018) for different return periods. A 4-storey RC building, representative of the Italian building typologies, was selected as the case study. 
Moreover, the selected records were implemented to derive site-independent fragility curves for various IMs (PGA, PGV, HI, and Sa at around 1 s) and the inter-storey drift ratios as EDP (see Fig. 3).

Fig. 3 Empirical Cumulative Frequency Distribution using inter-storey drift as EDP (Source: Manfredi et al. [95])

To address the selection of spectrum-matched records, Zhao et al. [96] implemented high-performance Siamese Convolutional Neural Networks (SCNNs). Design spectra for soil types A to D from EC8 were selected as target spectra. For each target design spectrum, 40 training samples were sufficient to achieve results with minor standard deviation. These authors emphasised that the method considers only similarities of the response spectra with the target spectrum. Therefore, other signal features can be ignored. Finally, Zhang et al. [97] introduced a weighted scaling method for the selection and scaling of earthquake motions, in which the weight factors are estimated using the modal-mass participation factors. The method made use of the least-squares errors and the modal-mass participation factors (λi) as weight variables. Hence, larger weight factors are assigned to the spectral values around the fundamental period. The target was set to match the 1994 NEHRP design spectra at different hazard levels (2%, 10%, and 50% probabilities of exceedance in 50 years), and three steel moment frames (taken from the American SAC Phase II Steel Project) were adopted as case studies. The same level of accuracy in estimating the mean structural response was observed for the three case studies with both the weighted and unweighted methodologies. Nevertheless, the weighted procedure was more effective at reducing the scatter in the structural response. Table 1 summarises the main novel contributions of most investigations in this subchapter, with a brief algorithm description as well as the type of matching. Furthermore, it is also reported whether or not the research provided validation on structures and software tools.
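The weighted least-squares scaling idea described above for Zhang et al. [97] can be illustrated with a minimal sketch. It assumes, for illustration only, that residuals are taken on arithmetic spectral ordinates at the modal periods and that the weights are the modal-mass participation factors; the function names are hypothetical and the exact formulation in [97] may differ.

```python
def weighted_scale_factor(record_sa, target_sa, weights):
    """Scale factor s minimising sum_i w_i * (s * Sa_rec(T_i) - Sa_tgt(T_i))**2.

    Setting the derivative with respect to s to zero gives the closed form below.
    Assumption: arithmetic (not logarithmic) residuals at the modal periods T_i.
    """
    num = sum(w * r * t for w, r, t in zip(weights, record_sa, target_sa))
    den = sum(w * r * r for w, r in zip(weights, record_sa))
    return num / den


def weighted_sse(record_sa, target_sa, weights, scale):
    """Weighted sum of squared errors of the scaled record against the target."""
    return sum(w * (scale * r - t) ** 2
               for w, r, t in zip(weights, record_sa, target_sa))


# Illustrative numbers only: ordinates at two modal periods, and
# hypothetical normalised modal-mass participation factors as weights.
record = [1.0, 2.0]
target = [2.0, 2.0]
s = weighted_scale_factor(record, target, [0.5, 0.5])
```

With all the weight on one period the factor reduces to the ratio of target to record ordinate at that period, which is the usual single-period amplitude scaling.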

Table 1 Novelty contributions in code-based selection and spectral matching

4 Probabilistic Assessment Based on IMs

Research regarding the probabilistic assessment of civil structures using the selection of motions based on scalar or vector-valued IMs has grown massively in past years. In that sense, performance-based seismic design is aimed at estimating the mean probability of exceedance of a certain value considering a specific DM or limit state, as follows:

$$\lambda_{\text{DM}}(x) = \sum_{i=1}^{n} G_{\text{DM}|M,R}\left(x \mid M_i, R_i\right)\,\lambda\left(M_i, R_i\right),$$
(1)

where GDM|M,R(x|Mi, Ri) represents the likelihood of exceeding the DM for a specific value of x, conditional to the n number of (M, R) pairs; and λ(Mi, Ri) denotes the mean annual frequency of occurrence of seismic events characterised by specific magnitude and distance values.

Suitably selected IMs can be both efficient and sufficient (i.e., the structural response conditioned on the IM is independent of terms such as earthquake magnitude and source-to-site distance). Thus, it is possible to re-write Eq. (1) in terms of IMs as:

$$\lambda_{\text{DM}}(x) = \sum_{y_i} P\left(\text{DM} > x \mid \text{IM} = y_i\right)\,\Delta\lambda_{\text{IM}}(y_i)$$
(2)

with P(DM > x|IM = yi) as the probability of exceeding a level x of DM given IM = yi, and ΔλIM(yi) representing the annual frequency of IM = yi.
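As a numerical illustration of Eq. (2), the sketch below sums the conditional exceedance probability over bins of a discretised hazard curve. It assumes that ΔλIM(yi) is approximated by the difference between the annual exceedance rates at the two edges of each IM bin and that the conditional probability is evaluated at the bin midpoint; the lognormal fragility and all names are hypothetical choices for illustration.

```python
import math


def lognormal_cdf(x, median, beta):
    """CDF of a lognormal variable with given median and log-standard deviation."""
    return 0.5 * (1.0 + math.erf(math.log(x / median) / (beta * math.sqrt(2.0))))


def demand_exceedance_rate(im_levels, hazard_rates, prob_exceed):
    """Eq. (2): sum over IM bins of P(DM > x | IM = y_i) * delta_lambda_IM(y_i).

    im_levels    : ascending IM values defining the bin edges
    hazard_rates : annual rate of exceeding each IM value (non-increasing)
    prob_exceed  : callable y -> P(DM > x | IM = y)
    Assumption: delta_lambda is the drop of the hazard curve across each bin.
    """
    total = 0.0
    for i in range(len(im_levels) - 1):
        y_mid = 0.5 * (im_levels[i] + im_levels[i + 1])
        d_lambda = hazard_rates[i] - hazard_rates[i + 1]
        total += prob_exceed(y_mid) * d_lambda
    return total


# Illustrative hazard curve and fragility (made-up numbers).
levels = [0.1, 0.2, 0.4]            # IM bin edges, e.g. Sa(T1) in g
rates = [1e-2, 1e-3, 1e-4]          # annual exceedance rates at those edges
fragility = lambda y: lognormal_cdf(y, 0.3, 0.4)
rate = demand_exceedance_rate(levels, rates, fragility)
```

A finer IM discretisation improves the approximation, at the cost of more structural analyses to define the conditional probability at each level.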

This drastically reduces the number of motions and analyses to be performed for a proper probabilistic assessment. Hence, a vast number of IMs have been implemented as criteria for record scaling and selection, and different structures have been studied on the basis of performance-based seismic assessment. Table 2 summarises the most relevant IMs, whose notation will be used next.

Table 2 Intensity measure, proposed equation, and original reference

In Table 2, ug is the ground displacement, and the dots on u represent its derivatives in the time domain, denoting velocity and acceleration, respectively; Sa, Sv, and Sd represent spectral acceleration, velocity, and displacement, respectively; similarly, PSa and PSv are the pseudo-spectral acceleration and velocity; ζ and T denote the damping ratio and period values within the spectra or pseudo-spectra; td is the total duration of a particular record; t5, t75, and t95 are the time values at which 5%, 75%, and 95% of Ia0 are achieved, respectively; and lastly, Ci and fi refer to the Fourier amplitudes of the entire accelerogram and the discrete Fourier transform frequencies in the range 0.25–20 Hz needed for the computation of Tm.
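For orientation, three of the cumulative and duration-based IMs mentioned above can be computed from an accelerogram as in the following sketch. It assumes the commonly used definitions, namely Arias intensity Ia = π/(2g) ∫ a(t)² dt, CAV = ∫ |a(t)| dt, and significant duration as the time between two fractions of the normalised cumulative squared acceleration; the expressions listed in Table 2 should take precedence if they differ, and the function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (acceleration input assumed in m/s^2)


def cumulative_trapz(values, dt):
    """Cumulative trapezoidal integral of equally spaced samples."""
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (a + b) * dt)
    return out


def arias_intensity(acc, dt):
    """Arias intensity over the full record, in m/s."""
    return math.pi / (2.0 * G) * cumulative_trapz([a * a for a in acc], dt)[-1]


def cumulative_absolute_velocity(acc, dt):
    """CAV over the full record, in m/s."""
    return cumulative_trapz([abs(a) for a in acc], dt)[-1]


def significant_duration(acc, dt, lo=0.05, hi=0.95):
    """Time between reaching the lo and hi fractions of the total energy."""
    husid = cumulative_trapz([a * a for a in acc], dt)
    total = husid[-1]
    i_lo = next(i for i, h in enumerate(husid) if h >= lo * total)
    i_hi = next(i for i, h in enumerate(husid) if h >= hi * total)
    return (i_hi - i_lo) * dt


# Toy signal: constant 1 m/s^2 for 10 s (not a real ground motion).
acc = [1.0] * 101
dt = 0.1
ia = arias_intensity(acc, dt)
cav = cumulative_absolute_velocity(acc, dt)
d5_95 = significant_duration(acc, dt)
d5_75 = significant_duration(acc, dt, 0.05, 0.75)
```

The time resolution of the duration estimate is one sample, so a small dt is preferable when the bounds fall in a steep part of the energy build-up.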

For a better understanding of the works summarised in this review, this section is divided into three sub-sections: (1) Recent theoretical contributions; (2) Studies with validation on conventional SDOF and MDOF systems; and (3) Studies with validation on other special types of structures.

4.1 Recent Theoretical Contributions

Based on the assumption that IMs have a multivariate lognormal distribution, Bradley [111] proposed to consider an arbitrary number of ground motion IMs as a Generalized Conditional IM (GCIM) for ground motion selection. Six conditional (and unconditional) distributions of IMs were illustrated for a rock site (Vs30 = 760 m/s) in Christchurch, New Zealand, including Sa(0.05 s), Sa(0.5 s), HI, ASI, Ia, and significant duration, to be used thereafter as the GCIM for the two selected sets of motions. All potential ground motions are scaled to the desired level of IMs within the selection process, and motions are selected to match all the univariate distributions but not the complete multivariate distribution. This research was extended in [112] by presenting a GCIM-based algorithm for selecting recorded and scaled motions as well as simulated or synthetic records. To clarify the effects of the considered IMs and the properties of selection at multiple exceedance probabilities, some examples of the selection were presented for a hypothetical site in Los Angeles.

The research presented by Baker [13] provided practical guidelines for using the Conditional Mean Spectrum (CMS) in ground motion selection. A site in Riverside, California, was used to illustrate the incompatibility of the UHS and the median spectrum from disaggregation. In this example, Baker showed that a predicted value of Sa = 0.89 g at a fundamental period T1 = 1 s in the UHS is caused by ground motions that are, on average, two standard deviations larger than the median predicted ground motions from the causal earthquake event. Furthermore, knowing the ε at a particular period of interest, the expected mean values of ε at other periods can be estimated to derive the CMS consistent with the mean values of the spectrum at all periods. For ground motion selection, it was recommended to scale records either to match the target Sa at a conditioning period T* or over the entire period range and then select the best match based on the SSE between the logarithms of the motion spectrum and the target spectrum (see Fig. 4). A probability-based approach for determining seismic design demands was proposed by Bradley [113], considering an amplification factor that is a function of the number of ground motion records considered and the uncertainty in the seismic responses. The process was summarised in three simple steps: (1) Perform seismic response analysis for a predetermined number of ground motions; (2) Calculate the arithmetic mean and lognormal standard deviation of the response parameters of interest; and (3) Compute the scale factor to obtain the final seismic demand. The efficiency of the procedure was demonstrated in the work of Araújo et al. [8] for local and global deformation demands.

Fig. 4 CMS for the Riverside site (T* = 1 s), and response spectra of selected ground motions (Source: Baker [13])
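The CMS-based selection described for Baker [13] can be sketched as follows, using the well-known relation ln Sa_CMS(Ti) = μ_lnSa(Ti) + ρ(Ti, T*) ε(T*) σ_lnSa(Ti). The logarithmic means and standard deviations (from a ground motion model) and the correlation coefficients ρ are treated here as given inputs, the candidate records are scaled to the target at T*, and they are ranked by the SSE of log spectral ordinates. All function names and numbers are hypothetical.

```python
import math


def conditional_mean_spectrum(mu_ln, sigma_ln, rho, eps_star):
    """CMS ordinates; rho[i] is the correlation of epsilon between T_i and T*."""
    return [math.exp(m + r * eps_star * s)
            for m, s, r in zip(mu_ln, sigma_ln, rho)]


def scale_to_conditioning_period(record_sa, target_sa, idx_star):
    """Amplitude-scale a record so that it matches the target at T*."""
    k = target_sa[idx_star] / record_sa[idx_star]
    return k, [k * v for v in record_sa]


def log_sse(record_sa, target_sa):
    """Sum of squared differences between log spectral ordinates."""
    return sum((math.log(r) - math.log(t)) ** 2
               for r, t in zip(record_sa, target_sa))


def rank_records(records, target_sa, idx_star):
    """Return (sse, name, scale_factor) tuples sorted from best to worst match."""
    scored = []
    for name, sa in records.items():
        k, scaled = scale_to_conditioning_period(sa, target_sa, idx_star)
        scored.append((log_sse(scaled, target_sa), name, k))
    return sorted(scored)


# Made-up inputs at three periods; index 1 plays the role of T*.
mu = [math.log(0.5), math.log(0.4), math.log(0.2)]
sig = [0.6, 0.6, 0.7]
rho = [0.5, 1.0, 0.6]
cms = conditional_mean_spectrum(mu, sig, rho, eps_star=2.0)
records = {"a": [0.5 * v for v in cms], "b": [1.0, 1.0, 1.0]}
ranked = rank_records(records, cms, idx_star=1)
```

Since ρ equals one at T*, the CMS coincides there with the ε-consistent target ordinate and falls below it elsewhere, which is the feature that distinguishes it from the UHS.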

In 2013, Lin et al. [114] expanded the concept of CMS introduced by Baker and Cornell [12, 28] by considering multiple causal earthquakes and Ground Motion Models (GMMs) to calculate the exact Conditional Spectrum (CS). Consequently, this new process requires the extension of conventional PSHA disaggregation, which considers only magnitude, distance, and ε, to also de-aggregate GMMs. This new process can also incorporate different causal earthquakes, M, R, θ combinations, multiple seismic source models, and logic-tree branches. It was noticed that the exact conditional standard deviation is higher than the approximate one, which is consistent with the contribution from the variance in mean logarithmic spectral accelerations due to variations in causal earthquakes and GMMs. Afterwards, Tarbali and Bradley [115] implemented the GCIM concept to select records considering scenario ruptures. Through an amplitude scale factor, potential motions are selected based on matching random realisations of the IM distributions for a particular rupture scenario, with special attention to the weighting of the considered IMs. A total of six selection vectors, combining different IMs such as n Sa ordinates, significant duration, CAV, Ia, ASI, HI, and Displacement Spectrum Intensity (DSI), were utilised for the selection of recorded acceleration records. The results demonstrated that, when only Sa is considered in the selection, other parameters, such as duration or cumulative effects, may be represented with bias. However, some IMs deliver redundant information, for which their joint consideration is inadequate for ground motion selection (i.e., Sa together with ASI, HI, DSI, or even PGV).

Baker and Lee [116] presented an improvement for the algorithm of Jayaram et al. [51] proposed for selecting ground motions matching CS. The main updates of the algorithm focused on making incremental changes to the initially selected set and optimising the fit to the statistically simulated target spectrum. Examples for the selection of recorded and simulated ground motions were provided and the advantages of the algorithm in terms of utility and speed were highlighted. The MATLAB code was also publicly supplied. Shi and Stafford [117] described two algorithms for the selection of records based on a conditional IM target. The first algorithm takes inspiration from the conventional GCIM selection; the procedure was summarised as a random walk algorithm in which records are added sequentially to approximate the joint probability density function through the distribution of the selected records. On the other hand, the second method is deterministic and computationally demanding as it aims to identify an optimal set which matches the target distribution defined from the sample. The IMs were chosen to address amplitude, frequency content, cumulative effects, and duration (Sa at 30 different periods, PGA, CAV, and significant duration D5–95 and D5–75). The second algorithm outperformed the first one as well as the conventional GCIM selection approach. However, when the record sets are small (approximately 20 records), GCIM showed better performance than the first algorithm.

Du et al. [118] introduced a selection algorithm based on conditional or unconditional ground motion scenarios where the theoretical background is based on the multivariate normal distribution of Sa ordinates at multiple periods. The authors referred to the unconditional case as the statistical distribution of response spectra from GMM and the conditional case as the distribution of spectral values conditioned on a target period. An example considering Sa values at 22 periods was illustrated for the unconditional and conditional selection. The empirical cumulative distribution of other IMs, including PGA, PGV, CAV, and D5-75 considering three site scenarios, was consistent with the distribution of the target median spectra. The research of Kozak et al. [119] made available an online repository of ground motions modified to match the seismological characteristics of southern Illinois for dynamic structural analysis. The CMS for the city of Cairo (Illinois), was derived to be used as target-matching spectrum for various periods of interest. Twenty suites of CMS-matched motions for five different CMS and conditional periods in the range 0.2–2.0 s were derived subsequently. The authors emphasise that only records with conditional periods similar to the fundamental period of the structure of interest should be selected from the repository to perform dynamic analyses.

Ji et al. [120] approached the matching of the GCIM multivariate distribution through Genetic Algorithms as a tool for earthquake selection. Instead of random realisations, the optimisation algorithm searches for a suite of motions that better fits the target distribution. Thus, the cost function was defined as the sum of the squared differences between the means and the variances of the target spectra for all IMs. The GCIM was supported by Sa(T), intensity-based, cumulative, and duration-based IMs. All these IMs matched well with the target distribution in all cases. Nevertheless, the authors pointed out that some parameters within the algorithm could be adjusted in practice according to the performance of the fitness function. As an alternative to using various CMS for the selection of 3-component motions for the assessment of three-dimensional structures, Kwong and Chopra [121] proposed and assembled a composite CMS-UHS spectrum. This composite spectrum is defined with respect to a period range Tmin–Tmax, where Tmin and Tmax are the shortest and longest periods that contribute the most to the structural response, see Eq. (3). To ensure hazard consistency, the authors explicitly highlighted the importance of selection using the vertical target spectrum, a wide period range (0.01–10 s), and constrained scale factors. Thus, given ε from Tmin to Tmax, the Composite Spectrum for the k component (k = H or V, for horizontal and vertical components, respectively) of GM is defined as:

$$A_{k,\text{Composite}}(T,\varepsilon) = \begin{cases} A_{k,\text{CMS}}\left(T,\varepsilon(T_{\min})\right), & T \le T_{\min}, \\ A_{k,\text{UHS}}\left(T,\varepsilon\right), & T_{\min} < T < T_{\max}, \\ A_{k,\text{CMS}}\left(T,\varepsilon(T_{\max})\right), & T \ge T_{\max}. \end{cases}$$
(3)
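The piecewise definition in Eq. (3) translates directly into code. The sketch below assumes that the two CMS (conditioned at Tmin and Tmax) and the UHS have already been evaluated for one component on a common period grid; the function name and the sample values are hypothetical.

```python
def composite_spectrum(periods, cms_at_tmin, uhs, cms_at_tmax, t_min, t_max):
    """Assemble the CMS-UHS composite spectrum of Eq. (3) for one component.

    periods      : common period grid (s)
    cms_at_tmin  : CMS conditioned at T_min, evaluated on the grid
    uhs          : UHS evaluated on the grid
    cms_at_tmax  : CMS conditioned at T_max, evaluated on the grid
    """
    composite = []
    for T, a_lo, a_uhs, a_hi in zip(periods, cms_at_tmin, uhs, cms_at_tmax):
        if T <= t_min:
            composite.append(a_lo)      # CMS conditioned at T_min
        elif T < t_max:
            composite.append(a_uhs)     # UHS between T_min and T_max
        else:
            composite.append(a_hi)      # CMS conditioned at T_max
    return composite


# Made-up ordinates, chosen only to make each branch recognisable.
periods = [0.1, 0.5, 1.0, 2.0, 4.0]
spec = composite_spectrum(periods, [10.0] * 5, [20.0] * 5, [30.0] * 5,
                          t_min=0.5, t_max=2.0)
```

Because a CMS equals the UHS ordinate at its own conditioning period, real inputs produce a continuous composite at Tmin and Tmax, unlike the deliberately distinct toy values used here.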

4.2 Studies with Validation on Conventional SDOF and MDOF Systems

Limitations regarding the feasibility of selecting ground motions considering a target ε value at the fundamental period of the structure were pointed out by Haselton et al. [122]. An alternative methodology in which a general motion set can be selected regardless of the ε value was proposed. The structural response, in terms of collapse capacity distribution, is corrected to consider the effects of spectral shape for a specific site and hazard level. For this purpose, a linear regression between the collapse capacity of each record and the ε(T1) of the record was proposed to adjust the collapse capacity distribution consistent with the target ε(T1) for a site and hazard level of interest. The method was applied to assess the collapse capacity of an RC 8-Storey building with a fundamental period T = 1.71 s. Figure 5 compares the results of predicted collapse capacity distributions when records are selected rigorously for ε(T1) and the alternative methodology, which led to very similar results (4% difference in the prediction of the mean collapse capacity).

Fig. 5 Comparison of collapse capacity distributions (Source: Haselton et al. [122])
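The ε-based correction described for Haselton et al. [122] can be sketched as a simple linear regression. The sketch assumes, for illustration, that the regression is performed on the natural logarithm of each record's collapse capacity against its ε(T1), and that the adjusted central value is read off the fitted line at the target ε(T1); names and numbers are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def adjusted_collapse_capacity(eps_records, sa_collapse, eps_target):
    """Collapse capacity predicted by the regression at the target epsilon.

    Assumption: ln(Sa_collapse) is regressed linearly on epsilon(T1).
    """
    b0, b1 = fit_line(eps_records, [math.log(s) for s in sa_collapse])
    return math.exp(b0 + b1 * eps_target)


# Made-up data following ln(Sa_col) = 0.3 * eps exactly.
eps = [0.0, 1.0, 2.0]
sa_col = [math.exp(0.0), math.exp(0.3), math.exp(0.6)]
capacity = adjusted_collapse_capacity(eps, sa_col, eps_target=1.5)
```

This lets a general record set be reused across sites, with only the target ε(T1) changing between applications.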

Buratti et al. [123] studied the distribution of drift response in a 6-storey RC frame building with vibration periods of 0.93, 0.32, 0.17, and 0.12 s (1st to 4th) considering the correlation of this parameter with the spectral acceleration at the initial fundamental period. The reference distribution was defined through a regression analysis considering 1666 records without scaling. The trend of the results revealed a strong correlation of spectral acceleration with the 4th-storey drift and the roof drift. Cimellaro et al. [124] studied the effect of amplitude scaling and spectral matching on the derivation of fragility curves for acceleration response and inter-storey drift ratio using Sa(T1) and PGA as IMs. The case study was defined as a 5-storey moment-resisting frame taken from the work of Shome et al. [125]. For amplitude scaling, the scale factor was obtained by minimising the SSE of the target spectral values and the median of the spectral ordinates. At the same time, for spectral matching with UHS and CMS, four techniques were implemented (SIMQKE [126], RASCAL [127], RSPMATCH [128], and SMSIM [129]). The authors concluded that fragility functions with Sa(T1) as IM are affected by the type of spectrum-matching tool; for instance, RSPMATCH led to lower estimation in median response against SIMQKE and amplitude-scaled motions of 50% and 80%, respectively, when compared with RASCAL. Further, it was determined that, to reduce the error in estimating fragility functions below 10%, at least 23 and 20 records are needed for PGA and Sa(T1), respectively. The study by Huang et al. [130] analysed the influence of four scaling techniques (geometric-mean scaling; spectral matching; Sa(T1) scaling; and scaling of motions per distribution of spectral demands) on the distribution of displacement response considering non-linear SDOF models. For the case of Sa(T1), an unbiased estimation of median responses was observed, but with a dispersion in the same range as, or even larger than, that obtained with geometric-mean scaling for ductility values larger than 3, since the 1st mode, in this case, is no longer representative of the structural response. The authors also discussed the importance of ε within the selection process by pointing out the exclusion of the potential bias in non-linear responses from amplitude scaling.

Lin et al. [131] presented the first part of a research on risk-based assessment focusing on the estimation of the annual rate of exceeding peak-storey drift ratios of conventional RC frame structures. The structural model of a 20-storey RC frame building located in Palo Alto, California, with a fundamental period T1 = 2.6 s, was selected as a case study for this research. Seismic hazard and disaggregation analyses were conducted for the site of interest to subsequently derive the target CS, for which a total of 40 records were scaled and selected to match the CS for Sa(T1) with 2% exceedance rate in 50 years. This part of the study paid special attention to hazard consistency, implying that the distribution of response spectra is consistent with the hazard curves of the site of interest at all relevant periods. Besides, structural analyses were conducted using a set of records matched from different conditional periods to measure the effect of this conditioning period. Strong similarities in the results were found, indicating that when performing risk-based assessment, it is possible to obtain accurate results by implementing any conditioning period in the exact CS-based selection. In contrast, the second part of the study was devoted to intensity-based assessment and evaluating an alternative target spectrum [132]. It was determined that using CMS instead of CS had no relevant impact on the estimation of the median response. Finally, the authors argued against the UHS matching selection, pointing out that selected ground motions are not consistent with the ground motion hazard for which they were selected. Ay and Akkar [133] validated a methodology for scaling and selection of motions developed in [134] for the probabilistic risk assessment of 3-story, 4-story, and 8-storey RC plane frame models. 
The methodology aims to scale and select n records from k candidate accelerograms, leading to the minimum dispersion with respect to a target spectral displacement value derived from the disaggregation of a site-specific PSHA. The methodology of Baker regarding the CMS [13] was used to compare the results computed through the proposed methodology. An acceptable level of agreement was observed within the probability distributions of scaled ground motion spectral acceleration ordinates and global structural demand parameters.

On the other hand, Kazantzi and Vamvatsikos [135] investigated the selection of proper scalar IMs for vulnerability assessment in buildings. The structural models were representative of (1) high-rise RC frame structures and (2) low-rise steel moment frames. The IDA curves were defined for peak inter-storey drift ratio and floor acceleration at each storey as EDPs, to be consistent with the assessment of structural and non-structural losses. For the case of selection based on conventional Sa(T1), an undesirable bias was found at high levels of scaling. In contrast, an IM combining five periods ranging from the mean second-mode period to twice the mean first-mode period performed well in terms of efficiency and sufficiency for both building classes. Dehghani and Tremblay [136] proposed an approach for ground motion selection in which, regardless of the dynamic characteristics of the structure to be analysed, the same set of scaled records (for a site of interest) can be used for seismic design assessment. The selection process relies on the results of PSHA to determine the dominant events from disaggregation analyses. The final selection is based on a ranking process of the inter-correlation of various amplitude, frequency, and duration IMs. The effectiveness of the process was tested using four SDOF systems (with vibration periods of 0.5 s, 1.0 s, 1.5 s, and 2.0 s) and six steel buildings with buckling-restrained braced frames, for which it was observed that average inelastic demands were not sensitive to the number of records, with a 16% difference between the lowest and highest average demand computed in the worst case.

An experimental investigation was led by O’Donnell et al. [137] to study the statistical dispersion of peak inter-storey and roof drift demand for specimens (denoted as NL2R2, NL2R4, NL4R2, NL4R4) with variations in the fundamental period (0.22 s and 0.27 s for NL2 and NL4, respectively) and lateral strength, 1/2 (R2) and 1/4 (R4), under the action of non-pulse and impulsive records covering a wide range of near-fault effects (distances of 0.2 to 19.9 km). To that end, 39 records were downloaded from the PEER database [56] and subsequently scaled using: (1) the ASCE 7-10 methodology; (2) Sa(T1); (3) Median Maximum Incremental Velocity (MIV); and (4) Modal Pushover-based Scaling (MPS). The dispersion in the demand parameters was observed to remain constant among the four structure types for the MIV scaling approach, while it was greater using the remaining scaling methods. The MIV method also exhibited better performance for the specimens with higher levels of non-linear behaviour. Bayati and Soltani [138] described a methodology for estimating seismic fragility curves at the NC limit state, using only a limited number of motion records, especially for the case of very regular structures. A 2D frame from a 6-storey RC structure with T1 = 1.2 s was selected to assess the methodology, in which the maximum inter-storey drift and Sa(T1) were taken as EDP and IM, respectively. Instead of performing IDA for a large set of ground motion records, the authors proposed selecting a smaller number of records (seven for this particular case) based on estimated fragility curves derived from the results of non-linear static analysis (pushover capacity curve to IDA curves). The authors observed good agreement when comparing fragility curves from the mean of the seven selected records and the mean of IDA curves from all records in the set for the 2D frame model. Chandramohan et al. [139] addressed the effect of duration in the collapse risk assessment of an RC frame building. Source-specific conditional distributions of duration and conditional spectra were implemented as GCIM targets for selecting hazard-consistent records. The analyses determined that considering duration in the selection is more important for sites with large contributions from large-magnitude interface earthquakes.
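Since duration enters the selection through metrics such as the significant duration, a minimal sketch of how D5–95 can be obtained from the build-up of Arias intensity is given next. It assumes a uniformly sampled accelerogram in m/s² and uses trapezoidal integration; the function names and the synthetic input are illustrative and are not taken from the cited works.

```python
import math


def arias_intensity_history(acc, dt, g=9.81):
    """Cumulative Arias intensity (m/s) for accelerations in m/s^2 sampled every dt seconds."""
    factor = math.pi / (2.0 * g)
    cumulative = [0.0]
    for a0, a1 in zip(acc[:-1], acc[1:]):
        # Trapezoidal rule applied to a(t)^2 over one time step.
        cumulative.append(cumulative[-1] + factor * 0.5 * (a0 ** 2 + a1 ** 2) * dt)
    return cumulative


def significant_duration(acc, dt, lower=0.05, upper=0.95):
    """Time elapsed between reaching the `lower` and `upper` fractions of the total Arias intensity."""
    ia = arias_intensity_history(acc, dt)
    total = ia[-1]
    i_lower = next(i for i, value in enumerate(ia) if value >= lower * total)
    i_upper = next(i for i, value in enumerate(ia) if value >= upper * total)
    return (i_upper - i_lower) * dt


# Illustrative input: 1 s of constant 1 m/s^2 acceleration (not a real record).
acc_example = [1.0] * 101
d5_95 = significant_duration(acc_example, dt=0.01)
d5_75 = significant_duration(acc_example, dt=0.01, upper=0.75)
```

For this constant signal the Arias intensity grows linearly, so D5–95 and D5–75 are close to 90% and 70% of the signal length, respectively.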

Chandramohan et al. [140] presented a work on the influence of motion duration on the collapse capacity of two different structures (a 5-storey steel moment frame and an RC bridge pier). Unlike the work presented in [139], the authors considered sets of long and short duration records (quantified by means of D5–95, D5–75, Ia, Iz, CAV, and 0.05 g bracket duration, Db0.05) with similar spectral shape and intensity defined by the 5% damped Sa(T1). For the building, the median collapse capacity estimated with the long duration set was 29% lower than with the short duration set, while a 17% reduction was observed for the bridge pier. The authors highlighted that their results were drastically different from past works that pointed out the negligible influence of duration on peak deformations. Furthermore, it was concluded that structures with high ductility and rapid rates of cyclic deterioration are most vulnerable to the duration of ground motions. The Conditional Spectral Dispersion (CSD) was introduced in [141] as a measure of spectral variability, see Eq. (4), and suites of motions were selected, each with the same median response spectrum but different levels of CSD. The records were implemented as input to analyse both SDOF systems with varying ductility levels and a 12-storey RC frame with T1 = 2.01 s, T2 = 0.68 s, and T3 = 0.39 s. The results exposed a consistent trend between bias and CSD, leading the researchers to assume that spectral matching did not induce bias between EDPs resulting from scaled and spectrum-matched motions.

$${\text{CSD}} = \sigma_{\ln {\text{S}}_{\text{a}}({\text{T}}_{\text{eff}})} = \sqrt{\frac{1}{{\text{n}} - 1}\sum_{{\text{i}} = 1}^{\text{n}}\left[\ln {\text{S}}_{\text{a}}({\text{T}}_{\text{eff}})_{\text{i}} - \ln {\text{S}}_{\text{a}}({\text{T}}_{\text{eff}})_{\text{geo}}\right]^{2}},$$
(4)

where Sa(Teff)geo is the geometric mean of spectral ordinates at effective period Teff, and i and n are the index and number of ground motions, respectively.
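As a hedged illustration of Eq. (4), the CSD can be evaluated as the sample standard deviation of the logarithmic spectral ordinates of a suite at Teff. The function name and the numerical values below are hypothetical.

```python
import math


def conditional_spectral_dispersion(sa_values):
    """Sample standard deviation of ln Sa(Teff) over a suite of n ground motions."""
    logs = [math.log(sa) for sa in sa_values]
    n = len(logs)
    if n < 2:
        raise ValueError("at least two ground motions are required")
    # The mean of the logarithms equals the logarithm of the geometric mean.
    ln_geo_mean = sum(logs) / n
    variance = sum((value - ln_geo_mean) ** 2 for value in logs) / (n - 1)
    return math.sqrt(variance)


# Illustrative spectral ordinates in g at Teff (hypothetical values).
suite_sa = [0.42, 0.55, 0.38, 0.61, 0.47]
csd = conditional_spectral_dispersion(suite_sa)
```

A suite of spectrum-matched motions with identical ordinates at Teff would return CSD = 0, which is the lower bound of the measure.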

Later on, Kohrangi et al. [142] briefly introduced a methodology for record selection using the CS approach with the IM defined as the Sa value averaged geometrically over a period range, to be aligned with the response of building classes. To incorporate hazard disaggregation at different sites, the approach developed by Lin et al. [114, 131, 132] was adopted in that study. Next, a 4-storey steel frame and three RC frames with 7, 12, and 20 storeys and fundamental periods of 1.82 s, 1.60 s, 2.10 s, and 2.85 s were taken as case studies. For the analyses, four alternative ground motion sets consisting of 44 accelerograms were chosen at each IM level for the CS(SaT1) and the CS(Saavg), with Saavg computed from the spectral ordinates at periods T = 0.4:0.2:4.0 s. The authors observed lower dispersion in IDA using Saavg as conditioning IM, which remained uniform at different IM levels, unlike SaT1, for which dispersion increased with the IM level. A more detailed discussion on the advantages of Saavg as an advanced IM was presented afterwards in [143]. The effect of GM selection at various hazard levels, considering UHS, CMS, and GCIM, on the seismic collapse fragility of a non-linear 10-storey RC frame building model was addressed in the research conducted by Koopaee et al. [144]. Collapse analysis was first conducted using the uncracked period of the structure. Fragility curves were derived for the structure considering the fundamental period of the structure T1 = 1.0 s and subsequently for the cracked period. It was observed that the collapse fragility function varies significantly depending on the selection approach. The estimation of the median collapse capacity was up to 40% higher for selection based on CMS and GCIM from PSHA compared with the prediction based on the UHS from NZS1170.
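The averaged IM described above can be sketched as the geometric mean of spectral ordinates over a period grid. The grid below reproduces T = 0.4:0.2:4.0 s; the spectrum passed to the function is a hypothetical placeholder, not data from [142].

```python
import math


def period_grid(start=0.4, step=0.2, stop=4.0):
    """Evenly spaced periods from start to stop (inclusive), e.g. 0.4:0.2:4.0 s."""
    count = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(count + 1)]


def sa_avg(spectrum, periods):
    """Geometric mean of Sa(T) over the given periods; `spectrum` maps T to Sa(T)."""
    logs = [math.log(spectrum(period)) for period in periods]
    return math.exp(sum(logs) / len(logs))


def example_spectrum(period):
    """Hypothetical spectral shape: 1.0 g plateau up to 0.5 s, decaying as 0.5/T beyond it."""
    return min(1.0, 0.5 / period)


periods = period_grid()
saavg_value = sa_avg(example_spectrum, periods)
```

Working with the mean of the logarithms avoids overflow or underflow of the product and makes the link to the lognormal treatment of spectral ordinates explicit.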

Samanta and Huang [145] studied the influence of four GM scaling methodologies, including geometric mean, spectral matching, Sa(T1), and maximum versus minimum orientations, in a 34-storey building model, referring to EDPs such as floor acceleration, storey drift, and average floor spectral acceleration. Comparable performance was observed for the geometric mean and Sa(T1) scaling in predicting mean acceleration and drift response values. Besides, the largest dispersion was observed for Sa(T1) regarding acceleration responses, indicating that this scalar IM is not suitable for the analysis of structures sensitive to excitations from higher modes. Similarly, Samanta and Pandey [146] analysed the effects of GM scaling in a 15-storey building considering geometric mean scaling, Sa(T1) scaling, RSPMATCH, and Seismosoft matching for short and long duration motions. Thirty pairs of ground motions from the PEER NGA Database [56] were obtained by matching the conditions established for the Lucknow region (capital of Uttar Pradesh, India), with Mw = 6.5 to 8.7, an epicentral distance of 300 km, and shear wave velocity in the range 73–385 m/s. The records were also classified according to the significant duration D5–75. The researchers determined that the Sa(T1) scaling method led to lower estimates of the peak floor acceleration parameters and more dispersion in peak floor and average floor acceleration compared with geometric mean scaling. Regarding duration, it could either decrease the mean peak and average floor acceleration at higher hazard levels or increase the median peak storey drift at lower hazard levels.

In order to test the independence of IMs and scale factors, Wen et al. [147] studied their effects on the maximum displacements of inelastic SDOF systems. A large variety of IMs were analysed in terms of acceleration (PGA, Ia, and Ic), velocity (PGV), displacement (RMSd and PGD) and spectra-related (Sa(T) and HI). According to their results, the best IMs (with an approximate bias of 20%) include HI, PGV, and Sa(T) for structural systems with short, medium, and long periods, respectively. In consequence, the researchers highlighted the performance of HI in the whole period range, even for scale factors around 10. Sa(T) and PGV were considered acceptable alternatives with bias in the 40% range for a scale factor between 0.2 and 5.0. Dávalos and Miranda [148] suggested a new IM named Filtered Incremental Velocity (FIV) to evaluate structural collapse. The proposed period-independent IM was defined as the sum of the n largest incremental velocities obtained from a low‐pass filtered ground acceleration, as represented in Eq. (5). To study the reliability of this new IM, a set of 269 records selected from 11 shallow crustal earthquakes with Mw between 6.9 and 7.6, Vs30 between 180 and 760 m/s (NEHRP site classes C and D) and RJB between 0 and 27 km, was collected. The results were compared against other widely known IMs considering a 4-storey moment-resisting frame building analysed previously in [149]. The results showed that by considering the three largest incremental velocities (FIV3), the dispersion of collapse intensities is reduced by 65.2% compared with traditional Sa(T) selection.

$$\text{FIV} = \underset{\forall {\text{t}} < {\text{t}}_{\text{end}} - \alpha {\text{T}}_{\text{n}}}{\text{max}}\left|\int_{\text{t}}^{{\text{t}} + \alpha {\text{T}}_{\text{n}}} {\ddot{\text{u}}}_{\text{gf}}\left(\tau\right){\text{d}}\tau\right|$$
(5)

with ügf as the low pass filtered acceleration, tend as the last instant of time of the acceleration time series, and α as a scalar value responsible for accumulating time directly proportional to Tn.
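A minimal sketch of Eq. (5) follows, under the assumption that the acceleration series has already been low-pass filtered and is uniformly sampled. The filter itself is not reproduced, the summation of the n largest incremental velocities is not included, and the values of α and Tn in the example are purely illustrative.

```python
def incremental_velocity_series(acc_filtered, dt, alpha, period):
    """Integral of the filtered acceleration over every full window of length alpha * period."""
    window = int(round(alpha * period / dt))  # number of time steps spanned by alpha * Tn
    if window < 1 or window >= len(acc_filtered):
        raise ValueError("window must span at least one step and fit inside the record")
    # Cumulative trapezoidal integral of the acceleration (velocity history).
    velocity = [0.0]
    for a0, a1 in zip(acc_filtered[:-1], acc_filtered[1:]):
        velocity.append(velocity[-1] + 0.5 * (a0 + a1) * dt)
    # Velocity increment over each window start that leaves a full window in the record.
    return [velocity[i + window] - velocity[i] for i in range(len(velocity) - window)]


def filtered_incremental_velocity(acc_filtered, dt, alpha, period):
    """Largest absolute incremental velocity, following the form of Eq. (5)."""
    series = incremental_velocity_series(acc_filtered, dt, alpha, period)
    return max(abs(value) for value in series)


# Illustrative input: constant 2 m/s^2 acceleration for 2 s (not a real record).
acc_example = [2.0] * 201
fiv_example = filtered_incremental_velocity(acc_example, dt=0.01, alpha=0.7, period=1.0)
```

For the constant signal above, every window integrates to 2.0 × 0.7 = 1.4 m/s, which offers a quick sanity check of the windowing.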

Du et al. [150] studied the influence of different scaling limits (2, 5, 10, and 15) on ground motion selection, focusing on the CS. The authors employed a reduced dataset from the NGA‐West2 [151] with 10,679 recorded accelerograms from 310 events having Mw in the range of 3.1 to 7.9 and Rrup within 0.1–499.54 km, and two non-linear 2D steel frames with fundamental periods of 0.20 s and 1.01 s, respectively. The CS was then computed based on four hazard scenarios in the western US. It was determined that a lower scale factor of less than 2 can barely approximate the distribution of the target spectra. Higher scaling factors, on the other hand, were observed to directly affect the cumulative and duration characteristics of the records (Ia and D5–75) and directly increase the mean value of predicted EDPs (about 10% in the maximum inter-storey drift ratio and four times larger than when using lower scaling factors). Hence, a limit in the scaling ranging from 3 to 5 was finally recommended.
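The role of a scaling limit in record selection can be illustrated with a short sketch. It assumes amplitude scaling to a target Sa(T1) and a symmetric admissible range [1/limit, limit]; the symmetry of the range, the record names, and the numerical values are assumptions made for illustration only.

```python
def scale_factor(target_sa, record_sa):
    """Amplitude scale factor that brings the record's Sa(T1) to the target value."""
    return target_sa / record_sa


def filter_by_scale_limit(record_sa_by_name, target_sa, max_scale):
    """Keep records whose scale factor lies within [1 / max_scale, max_scale]."""
    kept = {}
    for name, record_sa in record_sa_by_name.items():
        factor = scale_factor(target_sa, record_sa)
        if 1.0 / max_scale <= factor <= max_scale:
            kept[name] = factor
    return kept


# Hypothetical unscaled Sa(T1) values in g for three candidate records.
candidates = {"rec_a": 0.50, "rec_b": 0.05, "rec_c": 2.00}
admissible = filter_by_scale_limit(candidates, target_sa=0.50, max_scale=5.0)
```

In this example the weak record `rec_b` would need a factor of 10 and is discarded, which mirrors the trade-off discussed above: a tighter limit shrinks the candidate pool, while a looser one admits heavily scaled records.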

Subsequently, Ghotbi and Taciroglu [152] analysed the effects of conditioning criteria for single, two, and averaged IMs when collecting ground motions to analyse RC structures. For the study, 4-storey, 8-storey, and 12-storey buildings were modelled as 2D moment frames and selected as case studies. GM suites with 30 bi-directional records each were selected from a scenario defined for Los Angeles, California, by matching the multivariate conditional distribution of multiple IMs [Sa(T), Ia, CAV, D5–75, D5–95]. The study focused on assigning different conditioning criteria and importance weight factors to each IM in the selected scenario. Results emphasised the performance of the averaged-IM-based selection in predicting median responses of demand parameters for all case studies.

Du and Padgett [153] introduced the concept of a Multivariate Return Period (MRP) to consider the joint rate of exceedance of vector IMs. The selection of motions based on MRP was generalised into a set of steps to select a user-specified number of records based on the SSE between the target spectrum and the response spectrum of the scaled record from the database. Three case-study structures were defined: first, a perfectly elastic SDOF with fundamental period of vibration of 0.5 s and damping ratio ζ = 5%; second, an inelastic SDOF with similar characteristics and nondeteriorating bilinear hysteresis behaviour; and third, a 5-storey shear frame defined earlier in [154]. In addition, the seismic hazard was defined for a site in Memphis, Tennessee, US. The authors compared the results on the median prediction of EDPs and dispersion by considering CS-Sa and CS-Saavg selection, for which significantly different estimations were observed in line with the variation in the target spectra distribution. The work of Ebrahimian and Jalayer [155] addressed the computation of efficiency and the relative sufficiency of IMs through modified cloud analyses and IDA, not in terms of EDPs but damage measures directly. A 2D frame non-linear numerical model of a school building in Avellino (Southern Italy, in Campania region) was selected for analysis. A list of candidate scalar IMs was investigated: (1) PGA; (2) Sa(T1); (3) Saavg(0.2T1–2T1); (4) SN1 = Sa(T1)^0.5 × Sa(1.5T1)^0.5; (5) SN2 = Sa(T1)^0.75 × Sa(T2)^0.25; (6) CAV; and (7) Ia. The analysis of the results for the case study structure showed that Saavg(T) and SN1 measures were significantly more sufficient relative to Sa(T1).
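The SSE-based ranking step mentioned for the MRP selection can be sketched as follows. The sketch assumes that the SSE is computed on the logarithms of the spectral ordinates at a common set of periods, which is a common choice but is not confirmed by the source; record names and ordinates are hypothetical.

```python
import math


def sse_log(target_sa, scaled_sa):
    """Sum of squared differences between ln target and ln scaled-record ordinates."""
    return sum((math.log(t) - math.log(s)) ** 2 for t, s in zip(target_sa, scaled_sa))


def select_records(target_sa, scaled_spectra, n_select):
    """Names of the n_select records with the smallest SSE with respect to the target."""
    ranked = sorted(scaled_spectra, key=lambda name: sse_log(target_sa, scaled_spectra[name]))
    return ranked[:n_select]


# Hypothetical target and scaled spectra sampled at three common periods (g).
target = [1.0, 1.0, 1.0]
database = {
    "rec_a": [1.0, 1.0, 1.0],
    "rec_b": [2.0, 2.0, 2.0],
    "rec_c": [1.1, 1.0, 0.9],
}
selected = select_records(target, database, n_select=2)
```

With these values the selection returns `rec_a` and `rec_c`, the two spectra closest to the target in the least-squares sense.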

Thereafter, Vargas-Alzate and Hurtado [156] investigated the efficiency in predicting the mean response of buildings under the action of near- and far-fault records. By taking into account the probabilistic characterisation of building structures, 444 and 492 models were derived to study the effects of near-fault records and far-fault records, respectively. Accordingly, two ground motion subsets were collected: a subset of 444 records with an epicentral distance of less than 10 km and a subset of 492 records with epicentral distances ranging from 10 to 30 km. Spectral-based, energy-based, and IMs computed directly from the record were considered alongside EDPs, including maximum roof displacement, maximum global drift ratio, maximum inter-storey drift ratio, and base shear coefficient. It was claimed that energy/velocity-based IMs exhibited higher efficiency than acceleration-based IMs, showing correlation index values up to 0.97 associated with each IM-EDP set of points. Since the examined IMs are not structure-dependent, structures with different properties might be analysed using the same fragility function. It was also verified that near-fault motions induce larger demands in terms of EDPs than far-fault recorded accelerograms, even at the same intensity.
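Efficiency in cloud analysis is commonly judged from a log-log regression of the EDP on the IM, where a smaller residual dispersion or a larger correlation coefficient indicates a more efficient IM. The sketch below assumes the usual power-law form EDP = a·IM^b; the function name and data are illustrative and do not reproduce the procedure of [156].

```python
import math


def cloud_regression(im_values, edp_values):
    """Least-squares fit of ln(EDP) = ln(a) + b ln(IM); returns (a, b, beta, r)."""
    x = [math.log(value) for value in im_values]
    y = [math.log(value) for value in edp_values]
    n = len(x)
    if n < 3:
        raise ValueError("at least three IM-EDP pairs are required")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    ln_a = mean_y - b * mean_x
    residual_ss = sum((yi - (ln_a + b * xi)) ** 2 for xi, yi in zip(x, y))
    beta = math.sqrt(residual_ss / (n - 2))  # dispersion of ln(EDP) given IM
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient in log space
    return math.exp(ln_a), b, beta, r


# Hypothetical cloud of IM-EDP points (e.g. an IM in g versus a drift ratio).
im_cloud = [0.10, 0.20, 0.35, 0.50, 0.80]
edp_cloud = [0.002, 0.005, 0.008, 0.015, 0.022]
a_fit, b_fit, beta_fit, r_fit = cloud_regression(im_cloud, edp_cloud)
```

Repeating the fit for each candidate IM with the same EDP values allows a direct comparison of `beta_fit` and `r_fit` across IMs.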

To determine the potential source of bias induced by amplitude scaling, the study by Tsalouchidis and Adam [157] examined the maximum inter-storey drift ratio of four groups of structures, including 4-storey, 8-storey, 12-storey, and 20-storey non-linear planar moment frames, under the effects of scaled GMs with an equivalent level of intensity. For this purpose, 17,150 records were gathered, covering the distribution of GM intensities with a focus on Sa(T1), Saavg, Ia, CAV, and significant duration as well as seismological characteristics in terms of Mw, RJB, and Vs30. The results of cloud analysis from unscaled and scaled accelerograms led the researchers to conclude that, regardless of considering spectral shape compatibility, amplitude scaling of IMs did not induce bias on the median prediction of EDPs. Nonetheless, the authors suggested particular caution when using low amplitude records with large amplitude scaling to cover the lack of records at higher amplitudes. Sucuoğlu et al. [158] extended the concept of inter-storey drift-based scaling (IDS) introduced previously in [159] to select bi-directional motions for the analysis of 3D structures. This scale factor is the ratio of average maximum inter-storey drift distributions for a specific target and the bi-directional motion to be scaled. The authors considered a 20-storey RC regular building and a 3-storey RC dual (frames and walls) irregular system to illustrate the applicability of the proposed scaling technique. ASCE/SEI 7-16 and Sa(T1) scaling were also employed for comparison. IDS-IM proved to be more efficient than the other two scaling procedures. The potential of this new IM was also pointed out in the mean prediction of other demand measures, such as maximum plastic rotations. Finally, Xu et al. [160] developed a machine-learning approach for damage prediction considering a large set consisting of 48 IMs to represent ground motion characteristics. The methodology was tested in 12 building models with different structural and dynamic characteristics. Findings revealed that frequency-based IMs influence the mean prediction significantly more than time-domain IMs. Furthermore, it was demonstrated that to achieve good predictions in the mean response of EDPs (accuracy larger than 90%), the number of IMs as input increases with the complexity of structures. This means frame structures may require a single IM, but complex systems may require up to 13 IMs in the worst-case scenario.

4.3 Studies with Validation on Other Special Types of Structures

The work of Hariri-Ardebili and Saouma [161] aimed to expand the definition and application of IMs (used mainly in the context of frame-building structures) for probabilistic seismic demand assessment of concrete dams. Approximately 70 scalar IMs were examined carefully by the authors, and they determined that the most suitable IMs for the analysis of concrete dams include: PGA, PGV, PGD, IF, ASI, Sa(Tn), Sv(Tn), and Sd(Tn). For the derivation of fragility curves, the crest displacement was selected as EDP, for which PGA and Sa(T2) generate wider curves than Sa(T1) considering a limit value arbitrarily set at 20 mm, as depicted in Fig. 6.

Fig. 6 Comparison of fragility curves on exceeding a crest displacement of 20 mm (Source: Hariri-Ardebili and Saouma [161])

The concept of GIMD was adopted again in the record selection for the seismic slope displacement analysis [162]. Multiple suites of motions were selected considering different target vector-IMs (Sa ordinates at 22 periods, Ia, and D5–75) for an earthquake scenario with Mw = 7, strike‐slip fault mechanism, Rrup = 20 km, and Vs30 = 400 m/s. It was verified that the collected records matched well to the target distribution of Sa over a wide period range. Biased distributions in the mean displacement values of each slope were derived using motion suites with biased Ia distributions. It was also numerically demonstrated that the mean displacements are proportional to the increase of Ia amplitudes. In contrast, the biased distribution of D5–75 did not produce visible differences in the mean slope displacements. Morelli et al. [163] took from [164] the numerical model of an industrial structure to study the suitability of selection and scaling techniques for the performance-based assessment of irregular structural systems. Two sets of motions were selected for the analysis, matching the UHS and the CMS at an Italian high seismicity zone. When Sa(T1) in the x direction was adopted as IM, the EDP results exhibited less scatter for probabilities of exceedance of 10% and 50%, while sets scaled to Sa(T1) in the y direction showed no advantage in terms of dispersion within the EDP results but led to higher mean values.

Li et al. [165] focused their research on selecting the most unfavourable earthquakes for the dynamic analysis of nuclear power plant models. To this end, various IMs to characterise the EDPs and damage were examined by utilising correlation analyses. The authors proposed a total of 32 IMs, subclassified in groups derived directly from the record itself, the elastic spectrum, or the inelastic response spectrum. These IMs were defined to represent the ground motion damage potential quantified by the maximum top displacement of the containment structure and the average floor spectral acceleration of the secondary system. Sa(T1) was selected as the most suitable IM for the containment structure as it exhibited mean values of correlation with the top displacement around 0.998 with a standard deviation of 0.002. As for the secondary system, Sa, Sv(T1), RMSa and Ia were determined as acceptable IMs with good levels of correlation with the EDP. The effects of motion selection and scaling of earthquake records were examined using non-linear models developed for the bridges: (1) Jack Tone Road Overcrossing; (2) La Veta Avenue Overcrossing; and (3) Jack Tone Road Overhead, all located in California [166]. The earthquake scenario was selected to be consistent with the California area by considering an Mw = 7.0 earthquake on a strike-slip fault, a source-to-site distance of 10 km, and Vs30 based on the bridge soil profile. PGV was selected as the most suitable IM, and 99 records (taken from the PEER database [56], according to the previously described scenario) were scaled based on its distribution. The effects of conventional Sa(T1) selection and scaling versus the CS method were investigated. Results reveal that both approaches led to a significant underestimation of the probability of collapse compared with the unconditional selection approach in which the median + 1.5σ spectrum scenario from the attenuation model defines the target spectrum.

On the other hand, the research of Zuccolo et al. [167] introduced haselREC (HAzard-based SELection of RECords) as an open-source tool for GM selection and scaling. One important input prior to the selection is the PSHA and disaggregation results from OpenQuake [72]. Hence, the selection module provides the ID of selected records to be subsequently scaled in the next module. In addition to conventional PGA, Sa(T) selection, average Sa (Saavg) was also included as an alternative IM within the selection process to overcome the limitations of single-period selection and scaling. The applicability of the tool was illustrated by selecting two component accelerograms considering six different site conditions and multiple return periods for the seismic analysis of existing bridge models located in Italy, North Macedonia, and Israel. For selection based on Saavg, a good level of fit was observed in the period range by the selected records. Regarding the structural analysis, it was briefly mentioned that the application of haselREC led to less dispersion in the behaviour of the bridge inventory. It should be noted that haselREC allows for selection in the framework of CMS, but the authors expressed their interest in including the exact CS selection methodology for upcoming works. Table 3 summarises the works referenced in Sects. 2.3.2 and 2.3.3 regarding case-study structure, IMs and EDPs analysed.

Table 3 Summary of investigations on probabilistic assessment based on IMs with validation on SDOF and MDOF systems, and other special types of structures

5 Use of Simulated Signals as an Alternative to Ground Motion Selection and Scaling

This last section provides a discussion of a few investigations in which the selection of earthquake input to dynamic analyses has been tackled by adopting simulated artificial or synthetic signals. According to Rezaeian and Sun [7], there are two main categories of models for generating synthetic ground motions: source-based and site-based models. In source-based approaches, the fault rupture, wave propagation, and site conditions (i.e., physics of the earthquake) are explicitly considered to generate a time series at a specific site. In contrast, site-based methodologies account implicitly for these characteristics by fitting a stochastic process to previously recorded and well-characterised earthquake motions. Accordingly, the list of works discussed in this last section is sub-categorised into works implementing source-based and site-based approaches.

5.1 Works Implementing Source-Based Approaches

Koboevic et al. [168] used a large dataset of source-based simulated records [44] compatible with the 2005 National Building Code of Canada (NBCC 2005) [45] uniform hazard spectrum to study the median inelastic brace deformations of a four-story concentrically braced steel frame, compared to the effect of amplitude scaled historical records. Three sub-sets of records were derived, first, by considering the unscaled matching with the target spectra (showing better correspondence with the results of historical records); second, by taking two accelerograms for each M–R scenario; and third, by considering the Atkinson approach described in [44]. Consecutively, two methodologies were utilised to achieve compatibility with the NBCC 2005 design spectra for each sub-set and for ensembles of 120 simulated records. The former, by matching spectral values in the period range 0.2–2.0 s, and the latter, based on the PGV ratio plus an additional scale factor affecting the whole set. The historical records, on the other hand, were selected according to their seismological characteristics. In spite of different motion characteristics, the cumulative probability of normalised brace axial deformation (i.e., fragility analysis) was in good agreement with the results of historical and simulated signals (record-to-record variability generally higher than 0.4) supporting the adequacy of the sets of simulated records.

Michaud and Léger [169] analysed the effectiveness of seven scaling methods and two spectral matching approaches with the NBCC 2005 [45] UHS for Montreal to perform non-linear seismic analysis of multiple SDOF structures and a 4-storey steel frame, considering both historical and simulated records. The selection of motions was carried out considering the disaggregation of the UHS of Montreal for soil type class C. The simulated records were taken from the dataset by Atkinson [44]. Regarding the scaling methodologies, these included: PGA scaling; Sa(T1) scaling; acceleration spectrum intensity-based scaling such that the area under the spectral acceleration is equal to the area under the target spectrum for a given range; ASCE/SEI 7-10 [5] methodology; ATC scaling [170]; Atkinson methodology [44]; and mean square error-based scaling. As for the spectral matching methods, these included: frequency domain spectral matching by modifying Fourier coefficients for each frequency of interest, and time-domain spectral matching using RspMatch [128]. For SDOF structures, it was observed that the ASCE and ATC scaling methods, as well as frequency domain spectral matching, generate higher than expected displacement ductility demand, while the other methodologies presented a lower standard deviation. On the other hand, Sa(T1), acceleration spectrum intensity, and mean square error-based scaling led to the smallest standard deviation in the seismic response of the 4-storey steel frame under study.
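Acceleration spectrum intensity-based scaling, as described above, equates the area under the record spectrum to the area under the target spectrum over a period range. A minimal sketch under the assumption of spectra sampled at common periods and trapezoidal integration is given next; the period range and ordinates are illustrative and are not taken from [169].

```python
def spectrum_area(periods, sa_values, t_min, t_max):
    """Trapezoidal area under Sa(T) using the segments that lie fully inside [t_min, t_max]."""
    area = 0.0
    for i in range(len(periods) - 1):
        t0, t1 = periods[i], periods[i + 1]
        if t0 >= t_min and t1 <= t_max:
            area += 0.5 * (sa_values[i] + sa_values[i + 1]) * (t1 - t0)
    return area


def area_scale_factor(periods, record_sa, target_sa, t_min, t_max):
    """Scale factor that makes the record's spectral area equal to the target's area."""
    target_area = spectrum_area(periods, target_sa, t_min, t_max)
    record_area = spectrum_area(periods, record_sa, t_min, t_max)
    return target_area / record_area


# Hypothetical spectra (g) sampled at common periods (s).
periods = [0.2, 0.5, 1.0, 2.0]
record_sa = [0.60, 0.45, 0.20, 0.08]
target_sa = [0.90, 0.70, 0.35, 0.15]
factor = area_scale_factor(periods, record_sa, target_sa, t_min=0.2, t_max=2.0)
scaled_sa = [factor * value for value in record_sa]
```

Because linear scaling multiplies every ordinate by the same factor, the scaled spectrum matches the target area exactly while preserving the spectral shape of the record.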

The study by Karimzadeh et al. [171] showed the feasibility of implementing sets of simulated ground motions in the generation of fragility curves. The simulated catalogue was developed considering the local seismicity parameters of Erzincan (Turkey) as input to the stochastic finite fault technique. The study adopted structural models of representative masonry archetypes in the region. The simulated records were selected and linearly scaled to reach sequential damage stages in the fragility analysis, which were validated with the observed damage levels after the 1992 Erzincan earthquake of Mw = 6.6. More recently, Karimzadeh et al. [172] examined the differences between real and simulated records reproduced through the stochastic finite-fault method. The simulated records were reproduced for Erzincan and Duzce city centres in Turkey, considering scenario earthquakes for different Mw values as well as simulations reproduced after the 1992 Erzincan earthquake (Mw = 6.6) and the 1999 Duzce earthquake (Mw = 7.1). Independent sets of real and simulated records were selected according to the distribution of IMs (i.e., PGA, PGV, and ASI). In the first stage, the datasets were compared in terms of structure-independent IMs using Kolmogorov–Smirnov statistical tests to quantify the difference between empirical distribution functions of real and simulated sets, demonstrating that real ground motions are generally in close agreement with the simulated data. Furthermore, the correlation between IMs and maximum displacement demand was studied considering multiple non-linear SDOF systems subjected to the selected records of each set. A higher correlation was found for the simulated records, mostly attributed to the lower scatter in region-specific simulations compared with real motions.
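The two-sample Kolmogorov–Smirnov comparison mentioned above rests on the largest vertical distance between the empirical distribution functions of the two sets. The following sketch computes that statistic only (no p-value); the function name and the PGA values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical distribution functions."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    distance = 0.0
    # The empirical CDFs only change at observed values, so it suffices to check those.
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


# Hypothetical PGA values (g) for a real and a simulated record set.
pga_real = [0.12, 0.18, 0.25, 0.31, 0.40, 0.52]
pga_simulated = [0.10, 0.20, 0.24, 0.33, 0.38, 0.55]
d_statistic = ks_statistic(pga_real, pga_simulated)
```

A statistic close to zero indicates that the two sets are distributed similarly for the IM under consideration, while a value of one means that the samples do not overlap.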

5.2 Works Implementing Site-Based Approaches

The work of Lin et al. [173] examined diverse methodologies for obtaining spectrum-compatible input records for the analysis of three building structures ranging from low-medium to high-rise buildings (4-storey, 10-storey, and 16-storey, respectively). The first technique consisted of scaling each record such that the area of the acceleration spectrum is equal to the area under the design spectrum within the range 0.2–4.0 s. Modification of accelerograms in the frequency domain was also conducted by considering the ratios between spectral values from the design spectrum and the unmodified accelerogram in an iterative process. Artificial accelerograms were developed as a sum of sinusoids through SIMQKE [174]. The largest dispersion around the mean spectra for all four sets was observed in the real-scaled and simulated records. Further, the authors encouraged the usage of scaled real accelerograms for the analysis of RC buildings since it led to mean maximum values of inter-storey drifts and curvature ductilities.

Causse et al. [175] generated a set of synthetic records compatible with EC8 design spectra. A reference set consisting of 10 accelerograms was downloaded from the PEER Ground Motion Database [56] according to the conditions 5.8 ≤ Mw ≤ 6.2, 0 ≤ Rrup ≤ 20 km, and 400 ≤ Vs30 ≤ 600 m/s (EC8, site B category) to be subsequently modified by adding wavelets to match the response spectrum. A total of 5000 signals were simulated using the non-stationary stochastic method [176], from which a subset of 10 accelerograms compatible with the distribution of Ia and significant duration of real records at a specific level of PGA was subsequently selected. Such accelerograms were used to compare the drift response of non-linear SDOF systems, revealing that simulated records led to the lowest variability in the mean response.

Zhong et al. [177] implemented site-specific simulated accelerograms to conduct code-based design checks (ASCE 7-16) and performance-based assessment of two tall buildings, a 20-storey RC moment frame and a 42-storey shear wall building, considering three sites (San Francisco, Los Angeles, and San Bernardino) and five seismic hazard scenarios to conduct the Broadband Platform (BBP) simulations [178]. Regarding code-based design, the average response of both buildings in terms of maximum storey drift ratio, peak floor acceleration, and storey shear profiles showed good correspondence using either simulated or real records fitted to the maximum credible earthquake spectrum. However, when using the CMS as target, the authors recommended the use of BBP simulated records since they represent more realistic spectral amplitudes and shapes.

Finally, site-based simulation was also implemented by Fayaz et al. [179] to assess the hazard-targeted seismic demand of bridges. The study considered non-linear models of four RC bridges with different structural and dynamic characteristics. A site-specific catalogue of synthetic GMs was derived covering a time span of 100,000 years for seven sites in southern California using the seismic model proposed by Field et al. [180]. In addition, the group of records for the analysis was selected to represent accurately not only the distribution of Sa(T1) but also its correlation with other IMs, such as duration and frequency content. From an engineering point of view, simplified tables were provided by which practitioners can determine the number of analyses to conduct according to the bridge structure topology.

6 Conclusions

This paper presented a detailed literature review on approaches for the selection and scaling of earthquake motions for structural engineering applications. Reviews have been presented in the past, as in Iervolino and Manfredi [18] and Katsanos et al. [19], covering most of the research conducted before 2010. Thus, the present review devotes special attention to works performed after 2010. The discussion within this article was chronologically organised into four principal categories: (1) Preliminary works (research done before 2010); (2) Code-based selection and spectral matching; (3) Probabilistic assessment based on Intensity Measures (IMs); and (4) Use of simulated signals as an alternative to ground motion selection and scaling.

Regarding code-based selection and spectral matching, most research approached the selection and scaling of motions as a constrained optimisation problem that minimises the sum of squared errors between the spectral acceleration of the motions and the target spectrum. Harmony search [55, 61, 70, 93], colliding bodies [63], genetic algorithms, or even neural networks [96] were proposed to minimise this error. Under this criterion, it should be noted that some authors made available user-friendly software for matching recorded accelerograms with code-defined target spectra or UHS derived directly from PSHA, mainly REXEL [48], ISSARS [58], and SelEQ [70]. Nevertheless, early investigations argued that these spectra do not represent individual earthquakes [13]. Therefore, alternatives such as the Conditional Mean Spectrum or the exact CS [114, 131, 132] were proposed to overcome such limitations.
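
To make the optimisation criterion concrete, the minimal Python sketch below scales each candidate record by the factor that minimises the sum of squared errors (SSE) against a target spectrum and then ranks the candidates by the resulting error. It is not one of the metaheuristics cited above: for a single record, the least-squares factor has the closed form k = Σ(Sa_target · Sa_record) / Σ(Sa_record²), and clipping it to an admissible interval gives the constrained optimum because the error is a convex quadratic in k. The bounds 0.33 and 3.0 are borrowed from the NZS 1170.5:2004 limits on k1 purely as an example, the errors are taken in linear rather than logarithmic space, and all names and spectral values are hypothetical.

```python
def best_scale_factor(sa_record, sa_target, k_min=0.33, k_max=3.0):
    """Least-squares scale factor for one record, clipped to [k_min, k_max].
    Both spectra are assumed to share the same period grid."""
    num = sum(r * t for r, t in zip(sa_record, sa_target))
    den = sum(r * r for r in sa_record)
    if den == 0.0:
        raise ValueError("record spectrum is identically zero")
    return min(max(num / den, k_min), k_max)


def sse(sa_record, sa_target, k):
    """Sum of squared errors between the scaled record and the target."""
    return sum((t - k * r) ** 2 for r, t in zip(sa_record, sa_target))


def rank_records(records, sa_target):
    """Return (name, scale factor, SSE) tuples sorted by increasing SSE."""
    ranked = []
    for name, sa_record in records.items():
        k = best_scale_factor(sa_record, sa_target)
        ranked.append((name, k, sse(sa_record, sa_target, k)))
    return sorted(ranked, key=lambda item: item[2])


# Hypothetical target and record spectra (in g) at three periods.
sa_target = [1.0, 0.8, 0.4]
records = {
    "rec_a": [0.5, 0.4, 0.2],
    "rec_b": [0.9, 0.9, 0.3],
}
ranked = rank_records(records, sa_target)
```

Selecting a whole set whose mean spectrum matches the target is a combinatorial problem, which is where the metaheuristic searches mentioned above become useful; the sketch covers only the per-record scaling step.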

In recent years, several studies have worked on selecting motions based on IMs and subsequent probabilistic seismic assessment. Such research demonstrated that the effectiveness of IM-based selection depends not only on the ground motion characteristics but also on the structural and dynamic properties of the models under analysis. A large number of IMs have been examined in this context, considering different types of structures, from low-, medium-, and high-rise frame buildings [130, 133, 135, 138, 144], concrete dams [161], and highway bridges [166] to simplified lumped-mass models of nuclear power plants [165]. Evidently, most of these works focused on the evaluation of frame structures, and less attention has been paid to other types of structures, such as masonry buildings or base-isolated structures. In addition, it has been shown that the number of IMs to take into account for proper probabilistic assessment increases with the complexity of structures [160]. Although few, investigations on the implementation of simulated signals as an alternative to ground motion selection and scaling point to synthetic records as a promising option for code-based and performance-based seismic assessment. Simulated records represent realistic spectral amplitudes and shapes and reduce the need for large scaling to represent earthquake features, provided they are validated from both seismological and structural points of view.

Finally, it is important to stress that this paper's goal is to provide an overview and discussion of the most advanced and recently proposed methods/approaches regarding scaling and selection of earthquake motions for structural engineering applications. The discussion focuses on two principal topics: (a) code-based selection and spectral matching and (b) probabilistic assessment based on IMs. Therefore, this review provides a suitable reference for future investigations.