1 Introduction

The mechanical properties and morphological setting of the ground are key elements in locally modifying seismic ground motion in terms of amplitude, duration and frequency content, a phenomenon commonly known as site effects. As a consequence, recordings from seismic stations can be significantly affected by the variation of site conditions close to and below the station, which influences studies based on earthquake ground motions collected by regional or local seismological networks.

To facilitate practical engineering design, site conditions are often characterized by a small number of site attributes (or proxies) or their combinations, aimed at describing their effects on seismic ground motion (Trifunac 2016; Bergamo et al. 2021). This simplified approach is adopted in many research fields: evaluation of local amplification and ground response analysis (Derras et al. 2017; Priolo et al. 2019), calibration of strong-motion records for realistic ground shaking estimates (Cauzzi et al. 2014; Michelini et al. 2020 and references therein), assessment of site-specific hazard for critical infrastructures (Bazzurro and Cornell 2004a, b; Rathje et al. 2015; Pecker et al. 2017; Aristizabal et al. 2018), estimation of ground motion models (Bozorgnia et al. 2014; Douglas 2016; Bindi et al. 2019; Kotha et al. 2020), and soil classification following the building code prescriptions (NEHRP, BSSC 2015; Eurocode 8, EC8 2004; NTC18 Italian code, NTC 2018). Most existing strong motion databases include some information on the VS30 proxy, which was the first one to be proposed in the nineties as a continuous, quantitative alternative to the binary (or ternary) classification soil/rock or soft soil/stiff soil/rock (Borcherdt 1992, 1994; Boore et al. 1994).

In the last decade, the number of stations of permanent and temporary seismic networks worldwide has largely increased (Margheriti et al. 2011; Mazza et al. 2012; Michelini et al. 2016; Moretti et al. 2016; Hetényi et al. 2018; Cara et al. 2019; Chen et al. 2019), raising the amount of recorded data (McNamara and Buland 2004; Pintore et al. 2012; Lanzano et al. 2019), without much attention being paid to site characterization information.

Recently some efforts have been carried out at national level to perform extensive site characterization at seismic stations (e.g. Sandikkaya et al. 2010; Michel et al. 2014; Stewart et al. 2014; Albarello et al. 2017; Felicetta et al. 2017; Hollender et al. 2018). Currently only a few national seismological networks expose site condition characteristics with very detailed information and reports on site topography, morphology, geology and on seismic surveys used to derive VS profiles (Cornou and Bard 2019): examples of national databases are provided by Switzerland through the Site Characterization Database for Seismic Stations (Swiss Seismological Service (SED) at ETH Zürich 2015, http://stations.seismo.ethz.ch; Michel et al. 2014; Poggi et al. 2017), Italy with the Italian Accelerometric Archive (ITACA, http://itaca.mi.ingv.it; D'Amico et al. 2020), and Turkey with the Turkish Accelerometric Database (http://kyhdata.deprem.gov.tr).

At European level, some national seismological networks make use of European web portals to disseminate data and information on seismic stations, namely the Earthquake Strong Motion database (ESM, https://esm-db.eu/; Luzi et al. 2020; Lanzano et al. 2019), the ORFEUS station book (http://orfeus-eu.org/stationbook/), and the European Geotechnical Database (EGD, http://egd-epos.civil.auth.gr/). At present, the most complete European database is ESM, which exposes site condition information at 2071 strong motion sites, among which the EC8 ground type is available at 1455 stations (about 70% of the total number of stations). These web portals can display one to several indicators for site characterization and, in a few cases, it is possible to download reports of specific geophysical and geological surveys carried out at the strong motion site. However, most often direct measurements of the site attributes are either unavailable or have not yet been performed. In this case the level of information provided by such reports is poor, and proxies based on geology and/or topography are generally used to define the site classification of a strong-motion station. In ESM, only 469 stations (22% of the total number of stations) have an EC8 soil class derived from measured VS profiles; for the remaining stations it is inferred from geology or terrain slope (Lanzano et al. 2019).

The lack of complete site condition information at European strong motion sites prevents the full use of seismic records for site amplification at local or regional level (Cornou and Bard 2019; Kotha et al. 2020; Cauzzi et al. 2021 and references therein). More generally, setting-up standard practices for a comprehensive seismic characterization of a station site, together with a clear evaluation of their reliability, is becoming a growing concern to reach high-level site condition metadata, and to offer unique opportunities of studies based on the availability of the large amount of high-quality data.

To fill the gap between data providers and researcher users, the networking activity of the SERA EU project (“Seismology and Earthquake Engineering Research Infrastructure Alliance for Europe”, project no. 730900, Horizon2020 INFRAIA-01–2016-2017 Program) led to the definition of a European strategy for site characterization of seismic stations in Europe, and to the proposition of standards for the best practice and site characterization quality assessment (Task 7.2 of WP7-NA5 “Networking databases of site and station characterization” http://www.sera-eu.org/en/activities/networking/; Di Giulio et al. 2019). At international level, the USA based COSMOS Consortium (https://strongmotion.org/Projects/CharacterizationGuidelines/) shares similar goals on the strong-motion data dissemination and on the definition of standard procedure for site characterization and reporting.

Because of the combination of this concern and of various recent studies and papers discussing the limitations of VS30 taken as a single indicator and proposing other proxies (Trifunac 2016; Boudghene-Stambouli et al. 2017; Derras et al. 2017; Zhu et al. 2020; Bergamo et al. 2021; Felicetta et al. 2021), we thought it timely to question the scientific and engineering community about the optimal site proxies to be used in the future for improved ground motion predictions. The results of this inquiry are discussed in two companion papers. The present one details the rationale behind a list of seven indicators considered as the most relevant for the site characterization of seismic stations and, for each of them, the template for a summary report aimed at the quantitative assessment of the quality of site metadata. The second one (see the companion paper, Di Giulio et al. 2021) proposes a quality metric to evaluate the site characterization reliability to be included in station metadata.

In the current study we first describe the outcomes of an international online survey to identify the indicators with the largest consensus, and thus to be considered as necessary for a reliable site characterization. The same survey is then used to build two additional indices that characterize, respectively, the cost and the difficulty to obtain a reliable value for each of the considered site indicators. Next, we provide a scheme of summary reports containing in a compact format the information related to each of them, with the background information helpful to assess its reliability. The selected indicators and summary reports have been presented to a representative panel of European and worldwide experts in a dedicated workshop (Cultrera et al. 2019), during which they were discussed and validated through focus groups. The seven indicators and the associated summary reports represent a first attempt to reach high-level metadata for site characterization, being aware that they can be improved after a few years of experience, based on the feedback from seismic-network data users.

2 The international questionnaire

Despite some sparse efforts, there is no common site characterization procedure among network operators, and the site metadata of the permanent networks appear highly heterogeneous, if not completely absent. The identification of a set of indicators, able to capture the main site effects at a seismic station, represents the first step towards standardized site information to be included in seismic databases.

This goal has been pursued by involving a wide scientific community dealing with site effects, from both seismological and earthquake engineering viewpoints. We first collected the existing bibliography on site effect estimation and related methodologies of data analysis, through a preliminary survey among research experts in this field. We involved both partners of the SERA project (ISTERRE-CNRS, France; ETH, Switzerland; INGV, Italy; AUTH, Greece) and a few other expert groups dealing with site characterization (Caltech-USGS, USA; AFAD, Turkey; Virginia Tech, USA; GFZ, Germany; ITSAK, Greece; University of Potsdam, Germany; UoT-University of Texas, USA). Each of those experts was asked to produce his/her own list of the most relevant indicators for site effects assessment, together with an assessment of their importance, their feasibility and the preferred methods of analysis for retrieving them, following the scheme of Table 1. Details and collected bibliography can be found in the deliverable D7.2 of the SERA project (Di Giulio et al. 2019).

Table 1 Scheme of the preliminary survey to collect the indicators useful for site effects description, together with the required methods and data to retrieve them

The preliminary survey allowed us to define a comprehensive set of 24 candidate site indicators (Table 2) to be considered in a subsequent online questionnaire addressed to the broad scientific community working on site characterization. It also pointed out a number of remaining open issues that were useful to shape the next steps of the project: the missing standards on acquisition and analysis of data; the unclear definition of some indicators (e.g. non-unique interpretation among the experts in the definition of “seismic bedrock”); the lack of consensus on the quantitative evaluation of uncertainties or confidence, most often overlooked by end-users of waveform data, accompanying each indicator.

Table 2 Description of the site condition indicators of the online Questionnaire (from Di Giulio et al. 2019); in the first column, the names in parenthesis refer to the short-name of Fig. 3, if different

The questionnaire, defined after the preliminary survey, was online from August to November 2018 and allowed us to gather feedback about the best-practice procedures for the computation of the site indicators. We collected answers from a large number of experts in different fields (from geotechnical engineering to seismic risk) and from many countries within Europe and worldwide. Their analysis led us to rank the site indicators according to various criteria, and to propose a limited set of recommended ones for site characterization at seismic stations. Finally, the proposed indicators have been adjusted following the feedback from an international workshop where we shared the project’s results (Cultrera et al. 2019).

In summary, for each indicator of Table 2 we asked for: (1) the preferred method of estimation; (2) the difficulty level for obtaining it, considering both data acquisition and analysis (the so-called “Feasibility index”, which can be “easy”, “intermediate” or “difficult”); (3) the approximate cost range for deriving it, including again both data acquisition and processing; (4) free comments. Finally, we asked each participant to rank the indicators according to a three-level priority scale, i.e. whether she/he thinks it is a “mandatory”, “recommended” or “optional” indicator to be included in site characterization databases.

Figure 1 shows the screenshots of the online questionnaire for two indicators: the fundamental resonance frequency (f0) and the shear-wave velocity of the seismic bedrock (Vs_seis_bed). In the f0 case, the proposed data acquisition and processing options are noise (i.e. ambient vibrations), earthquake, modelling or unknown procedure, whereas for the Vs_seis_bed the choice was limited to non-invasive (e.g. surface, passive or active, seismic methods) or invasive (e.g. seismic down-hole) methods. In addition, we asked for a preferred definition of the seismic bedrock, because some comments received from the preliminary survey pointed out that the definition of this indicator is not unique (see definition in Table 2). The questionnaire pages for the remaining indicators of Table 2 follow mostly one of the two schemes of Fig. 1 and are displayed in Online resource 1.

Fig. 1 Examples of the online questionnaire for fundamental resonance frequency (f0) and shear-wave velocity (VS) of the seismic bedrock (on the left and right panel, respectively)

An invitation to complete the online questionnaire was sent to more than 280 scientists worldwide, preliminarily chosen to keep a balanced distribution of skills in Geophysics (14%), Seismology (12%), Engineering seismology (21%), Geotechnical Engineering and Geology (12%), Seismic hazard and risk (21%), and a mix of the previous fields (20%). However, only a fourth of them contributed to the survey (N = 71), mainly scientists with primary expertise in seismology, geophysics, geotechnical engineering and engineering seismology (Fig. 2a). If we also consider the secondary research field mentioned by scientists, the experts in microzonation studies, Ground Motion Prediction Equations (GMPE) and Probabilistic Seismic Hazard Assessment (PSHA) represent more than 35% of the total answers. This imbalance in the scientific fields may introduce a bias in the results, which mostly represent the viewpoint of the seismological and geophysical community.

Fig. 2 Histograms of the answers (in %) of the questionnaire: a scientific field of interest, including multiple choices (each researcher could indicate more than one field, in gray) or only the main field he/she feels to belong to (black); b country of the membership Institution

The geographical distribution of the 71 participants is shown in Fig. 2b: 69% are from Europe and 31% from other countries; the most represented countries are Italy, France, Switzerland, USA and Greece. The first three are the leading countries of the WP7-NA5 SERA Project, whereas USA and Greece have teams very interested in the topic addressed by the questionnaire.

3 Analysis of the questionnaire results

According to the online questionnaire, 87% of the respondents agreed on the completeness of the set of indicators listed in Table 2, and did not suggest adding any others. The remaining 13% suggested additional and more advanced indicators, such as the dependence of the site response on the earthquake location, the lateral variability of geological formations (2D-3D behavior), the soil-structure interaction (in case of a strong motion station installed in or near a building), the duration lengthening (frequency-dependent lengthening of seismic ground-motion duration) and the geometrical parameter (any parameter related to 2D or 3D structure, i.e. surface topography or underground lithological heterogeneity). These last two indicators were initially included in the Questionnaire but were not accounted for in the analysis described in this paper, because few answers were available.

We thus consider that the analysis of the answers can be performed with a good level of confidence on the results.

3.1 Most recommended indicators

First of all, we ranked the indicators according to the degree of importance for site characterization at seismic stations, as assigned by each respondent (Fig. 3). Almost all of them are considered useful (i.e., at least "optional") for a reliable site characterization, whereas only a few are given the highest priority ("mandatory") to be reported in site characterization databases by more than 50% of all respondents: f0 (89%), VS (72%), VS30 (63%), Surface geology (61%), Depth of seismic bedrock Hseis_bed (58%), Soil class (56%), Depth of engineering bedrock Heng_bed (55%). We thus decided to focus on these 7 consensual indicators (which we refer to as the “most recommended indicators” in the following), to be used for the metadata of seismic stations. Nevertheless, in order to better understand the key aspects that drive, or not, the survey results (for instance, physical relevance, practical availability, measurement reliability, etc.), we analysed the indications provided by respondents on the preferred methods to obtain them, their difficulty and their cost.
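
The selection rule described above (keep the indicators rated "mandatory" by more than 50% of respondents) can be sketched as follows. This is only an illustrative sketch: the percentages are those quoted in the text, the shares of the other indicators of Table 2 are not given here and are therefore omitted, and the function name `select_indicators` is a hypothetical choice.

```python
# Share of respondents (in %) rating each indicator "mandatory", as quoted in the text.
# Only the seven indicators whose percentages are reported are listed here.
mandatory_share = {
    "f0": 89,
    "VS": 72,
    "VS30": 63,
    "Surface geology": 61,
    "Hseis_bed": 58,
    "Soil class": 56,
    "Heng_bed": 55,
}


def select_indicators(shares, threshold=50.0):
    """Return the names whose 'mandatory' share exceeds the threshold, highest first."""
    kept = [(name, pct) for name, pct in shares.items() if pct > threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]


# Threshold used in the text: more than 50% of respondents.
most_recommended = select_indicators(mandatory_share)

# A stricter, purely illustrative threshold shows how sensitive the list is to this choice.
stricter = select_indicators(mandatory_share, threshold=60.0)
print(most_recommended)
print(stricter)
```

With the quoted percentages, raising the threshold from 50% to 60% would leave only four indicators, which suggests that the two bedrock depths and the soil class sit close to the cut-off.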

Fig. 3 Answers of participants to the questionnaire assessing the different level of importance of the indicators listed in Table 2

The main outcomes of the Questionnaire are summarized in Fig. 4 for the 7 most recommended indicators. The results for the resonance frequency f0 (Fig. 4a) show that the ambient noise measurements and earthquake recordings are the two main preferred experimental methods to obtain f0, with the largest consensus for the former. Numerical modeling was also proposed by some teams (less than 30%), although modeling assumes that site properties (e.g. velocity profile) are already known from literature or from specific experiments. The feasibility plot in Fig. 4a indicates that the data acquisition and processing are considered “easy” for noise data (70% of answers for the corresponding Feasibility index) and “Intermediate” for earthquakes (about 40% of answers). The cost to obtain the indicator value at a target site was estimated to be less than 1000 euros for noise and up to 20,000 euros in case of earthquake data (Cost plot in Fig. 4a). However, the cost evaluation has some uncertainty (note the number of “I don’t know” answers), and one must keep in mind that it corresponds to a "marginal cost" only, i.e. the amount required to perform and to interpret the measurements without including the equipment value.

Fig. 4 Percentage of answers to the Questionnaire for the 7 most recommended indicators: methods for obtaining them (left), together with the corresponding feasibility (center) and cost (right). Panels a to f refer to f0, VS, VS30, surface geology, depth of bedrock (both Hseis_bed and Heng_bed) and soil class, respectively

We should mention that, as f0 is closely connected to the site transfer function under ground motion shaking, its reliability increases when earthquake data are used (e.g. Cultrera et al. 2014; Régnier et al. 2018). In areas of low seismicity, however, the ground motion acquisition can be expensive and time-consuming, which is why it is often replaced by noise measurements. Then, to overcome the limits of the noise interpretation, which is not always straightforward (e.g. Mucciarelli et al. 2005; Bonnefoy-Claudet et al. 2009; Molnar et al. 2018; Kawase et al. 2015), several noise measurements in a relatively wide area around the site of interest are recommended to increase the robustness of the f0 estimation.

The histograms on the methods of analysis for the remaining most recommended indicators are shown in the other panels (b–f) in Fig. 4, together with their cost and feasibility.

Concerning the VS velocity profile with depth (VS, Fig. 4b), the non-invasive methods (i.e. active or passive seismic methods) are preferred to the invasive ones (i.e. measurements in borehole, such as cross-hole or down-hole). This is most probably because they are less expensive (Cost panel) and more feasible (although of intermediate difficulty, see Feasibility panel) than the invasive methods, especially in urban environments and for large depth investigation. However, capabilities of non-invasive methods are limited by the measurable wavelength range, which is strongly linked to the array layout of receivers and the ground structure properties (e.g. Wathelet et al. 2008; Foti et al. 2018 for surface-wave passive methods). Another issue concerns the determination of VS profile at stiff and rock sites, being always considered challenging due to the requirement of large wavelength measurements in, most often, mountainous regions (Poggi et al. 2017).

The results for VS30 are consistent with the soil class indicator (panels c and f in Fig. 4), because in current practice the latter is usually computed from the VS30 values. Amongst the methods of data analysis for both of them, direct measurements (geophysical and geotechnical methods) are more widely recommended than the other methods (e.g. based on Digital Elevation Model (DEM), geology and model), which are geomorphic terrain-based proxies (e.g. Allen and Wald 2009; Stewart et al. 2014; Pilz et al. 2010; Yong 2016; Bergamo et al. 2019). Note that “model” stands for the geological or velocity model extrapolated from other areas with similar geological characteristics, used for computing VS30 and soil class at a specific site. Alternative velocity and soil class definitions can be provided in terms of correlations based on parameters derived from “geotechnical methods”, such as SPT or CPT penetration tests and undrained shear strength (Wair et al. 2012, and references therein). The geophysical and geotechnical methods are thus more widely recommended than the ones from proxies, though they are more expensive.

Both the use of available cartography (geological, lithological, etc.) and specific geological field surveys have a large consensus for evaluating the surface geology (Fig. 4d), providing a preliminary model representative of the area. Field survey is considered as more accurate because it has higher resolution and accounts for other available information (i.e. boreholes, stratigraphy, geological sections), leading however to an increased cost in site characterization of a target station.

Finally, the results for the depth of seismic and engineering bedrock (Hseis_bed and Heng_bed) are similar to each other and they are presented in Fig. 4e. In this case, the non-invasive measurements are preferred even though complementary geophysical and geological studies could be required to constrain them: despite the higher accuracy of the invasive measurements, their method’s cost increases dramatically with the bedrock’s depth. The bedrock depth and the VS profile are considered as the most difficult to get amongst all site indicators (see the percentage of Difficult feasibility in Fig. 4b and e).

3.2 Feasibility and cost indices

Two additional indices are proposed for comparing the overall cost and feasibility to obtain a reliable value of the 24 considered site indicators. For each i-th indicator, the respondents had to select 1-to-m different methods to compute it: we indicate with pij the number of people recommending the method j and pitot the total number of recommendations for the indicator i \(\left( {pi_{tot} = \mathop \sum \nolimits_{j = 1}^{m} pi_{j} } \right)\). As an example, for the indicator f0 we considered m = 3 methods: Noise (j = 1), Earthquake (j = 2) and Modeling (j = 3). For each of them we got different recommendations (Tables 3 and 4): pi1 = 67, pi2 = 46 and pi3 = 22, respectively, for a total of pitot = 135 answers (the “Don’t know” answers were 4).

Table 3 Example of the overall difficulty index (DItot) for f0, considering that it can be computed with 3 different methods
Table 4 Example of the overall cost index (CItot) for f0, considering that it can be computed with 3 different methods

The overall feasibility and cost to evaluate a given indicator is then computed as a weighted average of the feasibility and cost for each of the considered methods as follows:

  (a)

    The Difficulty index (DIj) for a specific j-th method refers to the Feasibility and is estimated as a weighted average:

    $$DI_{j} = 5 \times \left\{ {\left[ {\left( {1 \times ns_{DI} + 2 \times ni_{DI} + 3 \times nh_{DI} } \right)/N_{DI} } \right] - 1} \right\}$$
    (1)

    where nsDI, niDI and nhDI are the number of questionnaire answers indicating “small” (weight of 1), “intermediate” (weight of 2) and “high” (weight of 3) difficulty, respectively, for the j-th method (in the questionnaire they are indicated as “easy”, “intermediate” and “difficult” feasibility); NDI = nsDI + niDI + nhDI is the total number of informative answers about the difficulty for the j-th method (NDI ≤ N = 71, N being the total number of people who answered the questionnaire).

    DIj ranges on a 0–10 scale, being DIj = 0 when method and/or processing is easy to apply (nsDI non-zero, niDI = nhDI = 0), 5 when an intermediate difficulty is suggested (niDI non-zero, nsDI = nhDI = 0), 10 for the most difficult (nhDI non-zero, nsDI = niDI = 0). In the case of the f0 results, DI varies from around 5 for modelling, being of intermediate feasibility, to 1.4 for noise, which is very easy to achieve (Table 3).

  (b)

    Similarly to DIj, the Cost index (CIj) for a specific method j is defined by:

    $$CI_{j} = \left( {0.5 \times ns_{CI} + 3 \times ni_{CI} + 12.5 \times nh_{CI} } \right)/N_{CI}$$
    (2)

    where nsCI, niCI and nhCI are the number of questionnaire answers indicating “small” (less than 1 keuro), “intermediate” (from 1 to 5 keuros) and “high” (from 5 to 20 keuros) cost, respectively, for the j-th method; NCI = nsCI + niCI + nhCI is the total number of informative answers about the cost for the j-th method (again, as not all answers inform about cost, NCI ≤ N = 71, N being the total number of questionnaire answers).

    CIj is indeed the average estimated cost in k€, as the various weighting coefficients are simply the median costs for each cost interval (0.5 is the median of the low cost interval [0–1 k€], 3 the median of [1–5 k€], and 12.5 the median of [5–20 k€]). It thus ranges from 0.5 to 12.5, but one may note that it only very rarely exceeds 10 k€. For the f0 example, CI varies from a median cost of 2 k€ for the noise method to about 7 k€ for modelling, considering that it is then necessary to know the geophysical and morphological underground properties to estimate f0 (Table 4).

  (c)

    Finally, for a given indicator, the overall DItot and CItot are computed as a weighted average of the DIj and CIj obtained for each j-th method, with normalized weights wj proportional to the number of people pij recommending that method (column pi in Tables 3 and 4):

    $$DI_{tot} = \mathop \sum \limits_{j = 1}^{m} \left( {DI_{j} *w_{j} } \right)$$
    (3)
    $$CI_{tot} = \mathop \sum \limits_{j = 1}^{m} \left( {CI_{j} *w_{j} } \right)$$
    (4)

    where \(w_{j} = pi_{j} / pi_{tot}\), pij being the number of people recommending the method j and pitot the total number of recommendations for the i-th indicator \(\left( {pi_{tot} = \mathop \sum \nolimits_{j = 1}^{m} pi_{j} } \right)\).
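
The three steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure actually used to build Tables 3 and 4: the function names are hypothetical, and the answer counts passed to them in any usage would have to come from the questionnaire tables, which are not reproduced here.

```python
from typing import Sequence

# Cost weights (k€) for the "small", "intermediate" and "high" cost classes of Eq. (2).
COST_WEIGHTS = (0.5, 3.0, 12.5)


def difficulty_index(n_easy: int, n_intermediate: int, n_difficult: int) -> float:
    """Eq. (1): difficulty index of one method on a 0-10 scale."""
    n_total = n_easy + n_intermediate + n_difficult
    if n_total == 0:
        raise ValueError("no informative answers for this method")
    mean_score = (1 * n_easy + 2 * n_intermediate + 3 * n_difficult) / n_total
    return 5.0 * (mean_score - 1.0)


def cost_index(n_small: int, n_intermediate: int, n_high: int) -> float:
    """Eq. (2): average estimated cost of one method, in k€."""
    n_total = n_small + n_intermediate + n_high
    if n_total == 0:
        raise ValueError("no informative answers for this method")
    w_small, w_intermediate, w_high = COST_WEIGHTS
    return (w_small * n_small + w_intermediate * n_intermediate + w_high * n_high) / n_total


def overall_index(method_indices: Sequence[float], recommendations: Sequence[int]) -> float:
    """Eqs. (3)-(4): per-method indices averaged with weights w_j = p_j / p_tot."""
    if len(method_indices) != len(recommendations):
        raise ValueError("one recommendation count is needed per method")
    p_tot = sum(recommendations)
    if p_tot == 0:
        raise ValueError("no recommendations for this indicator")
    return sum(index * p / p_tot for index, p in zip(method_indices, recommendations))
```

The same `overall_index` function serves for both DItot and CItot, since Eqs. (3) and (4) differ only in the per-method index being averaged. The limiting cases stated in the text follow directly: a method rated "easy" by everyone gives a difficulty index of 0, and one rated "difficult" by everyone gives 10.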

For the f0 example in Tables 3 and 4, the overall values considering the 3 different methods to compute f0 are summarized into the total difficulty index DItot = 3.00 (low value in the 0–10 scale), and the total cost index CItot = 4.22 (median cost of about 4 k€), for a total number pitot of answers = 135. One may note however that if the "modeling" approach is discarded (it was recommended by only a small proportion of respondents, much below 50%), the average cost decreases to 3.6 k€.
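
As a quick check of the weights behind these totals, the recommendation counts quoted for f0 (67 for noise, 46 for earthquake and 22 for modelling) give, rounded to two decimals:

$$w_{1} = \frac{67}{135} \approx 0.50,\quad w_{2} = \frac{46}{135} \approx 0.34,\quad w_{3} = \frac{22}{135} \approx 0.16$$

If the modelling method is left out, the weights would presumably be renormalized over the 113 remaining recommendations (the primed symbols are notation introduced here only for illustration):

$$w^{\prime}_{1} = \frac{67}{113} \approx 0.59,\quad w^{\prime}_{2} = \frac{46}{113} \approx 0.41$$

The noise method, which has the lowest cost index, then carries a larger weight, consistent with the lower average cost obtained when modelling is discarded.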

Figure 5 shows the resulting overall Difficulty and Cost indices (DItot and CItot, respectively) for all the indicators of Table 2. In general, the higher the difficulty to infer the indicator, the larger the cost for deriving it, considering the expenses for data acquisition and processing: that is, the difficulty can be overcome by a larger amount of funding. More interestingly, the indicators are graphically clustered in 3 groups: (1) the lowest DItot and CItot values (median cost less than 3 keuros) refer to the topography class and the surface geology; (2) the intermediate values (median cost between 4 and 6 keuros) refer to indicators related to the site transfer function and the seismological parameters in general (including f0, VS30 and soil class); (3) the highest values (median cost between 6.5 and 8.5 keuros) include parameters at depth (i.e. velocity profile above 30 m, depth of seismological or engineering bedrock) and advanced geotechnical properties.

Fig. 5 Overall Difficulty index (DItot) versus Cost index (CItot) for all parameters of Table 2. The colors distinguish the most recommended indicators (in orange) from the others of Table 2 (in blue). The size is proportional to the percentage of consensus on the “mandatory” class of Fig. 3. Vsz_less_30 and Vsz_above_30 indicate VSZ of Table 2 at depth less or greater than 30 m, respectively

The most recommended indicators (orange symbols in Fig. 5) turn out not to correspond to low cost and low difficulty only, which strongly indicates that the choice of the scientific community is also related to the confidence in their physical relevance for site amplification issues, and the reliability of their measurements. Indeed, within the seven indicators, the depth of seismological and engineering bedrocks and the VS profile have high cost and great difficulty, whereas the geology and the f0 are considered to be of low cost and low difficulty.

4 Summary report

In the previous section we defined the most recommended indicators of the site behavior at the seismic station. However, their values alone are not enough to assess the reliability and the associated uncertainty of the single indicator and, more generally, of the site characterization for the station site as a whole. The background information about the data acquisition and processing used to infer the site indicators is then necessary to evaluate an overall quality index for the station site, increasing the quality of seismological data (see the companion paper, Di Giulio et al. 2021).

Some seismic network databases expose site condition characteristics with very detailed information and reports. In general, the reports can be very detailed when dedicated measurements at the site are carried out. For instance, the site characterization metadata of the accelerometric stations in the Italian Accelerometric Archive (ITACA; http://itaca.mi.ingv.it) are stored in three main thematic levels (topographic features, geological features and geophysical measurements), including seismic signal analysis and seismic classification according to Italian and European seismic codes (Felicetta et al. 2017): out of 743 stations in the ITACA database, fewer than 30% also have an exhaustive report on the VS profile and 20% have a detailed geological survey. Another example is the Site Characterization Database for seismic stations of the Swiss Seismological Service (Swiss Seismological Service (SED) at ETH Zurich, 2015), which collects very detailed information and reports on seismic surveys and VS analysis, together with topographical map, geological map, housing and current instrumentation (Michel et al. 2014; Poggi et al. 2017). However, most of the time such detailed information is not public and, for some of the indicators involved in seismic site characterization, it is not possible to verify the adherence to prescriptions of standards or guidelines associated with the most popular techniques of measurement and analysis (e.g., Foti et al. 2018; Hunter and Crow 2015; SESAME 2004; Thompson et al. 2012).

In the following we propose, for each indicator, the scheme of a summary report containing in a compact format the details we think are necessary to proceed with a quantitative evaluation of the quality associated to the single-indicator value and to the overall site characterization. The summary report is not intended to replace a complete report, which may follow specific guidelines or standards; on the contrary, it could be used as a compact checklist for the basic requirements and a handy framework to homogenize information especially for seismic network metadata.

Each report contains a general description of the site location, references to specific studies at the site and the contact information of the compiler, together with the final value of each indicator and the associated Quality index computed according to the procedure explained in the companion paper (Di Giulio et al. 2021) (Fig. 6).

Fig. 6 General information of the summary reports

The core of the summary report scheme then follows for each indicator; it is intended to provide the information required for assessing the quality and reliability of the measurement and computation. For each of the principal methods that can be used to compute the target indicator, the summary report provides fields for the description of the data acquisition (date of experiment, location, equipment, instrumental setting) and of the analysis (methodology and general processing parameters, uncertainties and limits of resolution).

Figure 7 depicts the summary sheet for the resonance frequency f0. In this case, the two options provided as data source are ambient noise or earthquake recordings. For ambient noise, it is possible to choose between three techniques (Horizontal-to-Vertical spectral ratio “H/V”, “Ellipticity” of the Rayleigh waves or a different method to be specified in “Other”) and to specify the experimental conditions, including environment (weather, soil-sensor coupling, urbanization) and equipment description. Some details on the analysis (software, smoothing and window length) are also required to infer the resolution limits of the computation for the uncertainty estimation. The methods based on earthquake data are limited to Horizontal-to-Vertical spectral ratio (HVSR), standard spectral ratio to a reference station (SSR), Generalized Inversion (GIT) or any different method (Other). The experimental conditions concern the earthquake parameters: recording period, number of events, minimum and maximum epicentral distance and magnitude range. Finally, the report provides details on the analysis, such as the seismic phase and signal duration considered, in the case of HVSR and SSR, and the list of main parameters, in the case of GIT.

Fig. 7 Summary report scheme for the indicator f0
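As an illustration only, the structure of the f0 summary sheet described above could be mirrored in a small machine-readable record. The sketch below is not part of the SERA templates: the class name `F0Report`, its field names and the checks in `problems()` are assumptions introduced here, while the data sources and technique options are those listed in the text.

```python
from dataclasses import dataclass, field
from typing import Optional

# Technique options named in the text, keyed by data source.
F0_TECHNIQUES = {
    "ambient_noise": {"H/V", "Ellipticity", "Other"},
    "earthquake": {"HVSR", "SSR", "GIT", "Other"},
}


@dataclass
class F0Report:
    """Hypothetical container for one f0 summary sheet (field names are assumptions)."""
    data_source: str                 # "ambient_noise" or "earthquake"
    technique: str                   # one of F0_TECHNIQUES[data_source]
    f0_hz: Optional[float] = None    # final resonance frequency, in Hz
    acquisition: dict = field(default_factory=dict)  # date, location, equipment, ...
    analysis: dict = field(default_factory=dict)     # software, smoothing, window length, ...

    def problems(self) -> list:
        """Return a list of human-readable issues; empty means the basic checks pass."""
        issues = []
        allowed = F0_TECHNIQUES.get(self.data_source)
        if allowed is None:
            issues.append(f"unknown data source: {self.data_source!r}")
        elif self.technique not in allowed:
            issues.append(f"technique {self.technique!r} not valid for {self.data_source}")
        if self.f0_hz is not None and self.f0_hz <= 0:
            issues.append("f0 must be positive")
        if not self.acquisition:
            issues.append("acquisition details missing")
        if not self.analysis:
            issues.append("analysis details missing")
        return issues
```

With this sketch, a record declared as ambient noise with technique SSR would be flagged, since SSR is listed only among the earthquake-based methods; empty acquisition or analysis sections are reported in the same way.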

The templates of the summary reports for the recommended indicators have been formatted as editable PDF forms, which are easy to compile and allow the metadata entered in the form to be extracted by means of appropriate computer programs. They are available on the SERA web page (http://www.sera-eu.org/en/Dissemination/scientific-publications-00001/) and are also given in the supplementary electronic material of the present paper (Online Resource 2), which includes a page for general information and detailed pages for f0, VS profile, VS30, Hseis_bed, Heng_bed, surface geology, and soil class.
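The use of the summary report as a checklist can be sketched in a few lines. The example below is a hypothetical helper, not a tool distributed with the templates: the function name `missing_pages` and the case-insensitive matching are assumptions, while the page names are those listed for Online Resource 2.

```python
# Pages of the summary report as listed in the text (Online Resource 2).
REPORT_PAGES = (
    "general information", "f0", "VS profile", "VS30",
    "Hseis_bed", "Heng_bed", "surface geology", "soil class",
)


def missing_pages(compiled):
    """Return the report pages not yet compiled for a station, in the order of REPORT_PAGES."""
    # Normalize names so that, e.g., "vs30 " matches "VS30" (an assumption of this sketch).
    done = {name.strip().lower() for name in compiled}
    return [page for page in REPORT_PAGES if page.lower() not in done]
```

A network operator could apply such a check to every station to see at a glance which indicators still lack a compiled page.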

5 Discussion and conclusion

This paper illustrates part of the activities carried out within the SERA EU project (WP 7-NA5 “Networking databases of site and station characterization”; http://www.sera-eu.org/en/activities/networking), with the aim of proposing a common, reliable and efficient framework for site characterization at European seismic stations. To achieve this goal, we identified the indicators that are used in site characterization analysis (Table 2) and organized an online survey to obtain informed feedback about their usefulness, including the cost and difficulty of their measurement (Online Resource 1). We used the respondents' answers to propose a list of the most recommended indicators: seven of them (f0, VS, VS30, Hseis_bed, Heng_bed, surface geology and soil class) are considered "mandatory" by a majority (> 50%) of respondents, and are found to represent a good compromise between physical relevance, practical usefulness and availability for the metadata of seismic stations. For each indicator, we proposed to gather the background information for assessing the reliability of the obtained value in a summary report, which contains in a compact format the basic details we consider necessary to accompany the computed values (Online Resource 2). The summary report makes it possible to homogenize the information in seismic network metadata. Further, it can be used to evaluate quantitatively the reliability of the indicator’s value by computing an overall quality index for the station site (see the companion paper Di Giulio et al. 2021), or as a sort of checklist for new efforts to characterize networks that are not yet characterized. The summary reports provided as editable PDF forms, although they can certainly be improved, constitute a practical tool for network operators and data users to homogenize the content and presentation of site metadata, which is of increasing concern for most of them (e.g., Cauzzi et al. 2021; Lanzano et al. 2021; Strollo et al. 2021).

The entire study has been driven by the results of an online questionnaire that was sent to a large number of experts within the scientific community involved in site effect issues, either as data providers or as data users. Although the project was at the European level, respondents from other countries also contributed to the survey. We received substantial feedback, mostly from the seismological community (seismology, geophysics, geotechnical engineering and engineering seismology), with a comparatively poorer contribution from civil engineers. As a further step towards a larger consensus, both the proposed list of main indicators and their summary reports were presented and discussed during an international workshop organized in L’Aquila, Italy (https://sites.google.com/view/site-characterization-workshop/home; Cultrera et al. 2019) within the SERA Project activities. The participants were divided into focus groups that analysed either the choice of indicators and their summary reports, or the proposed quality metrics (see companion paper Di Giulio et al. 2021); their suggestions were summarized in a plenary discussion and taken into account for the final proposal described in this paper.

One issue emerging from the discussion was about the choice of the recommended indicators: are they the most important ones or simply the ones we are more familiar with?

First of all, the recommendation rate is not directly linked to the level of cost and/or difficulty, as shown in Fig. 5. This indicates that the choice of the scientific community also takes into account the confidence in the physical relevance of the indicators for site amplification issues, and the reliability of their measurement. This result may be indicative of the robustness of the questionnaire responses.

Secondly, the chosen proxies aim at representing the local seismic response in a concise way. According to Bergamo et al. (2019, 2021), most of the indicators, with the exception of surface geology and soil class, are related to stratigraphic amplification in a 1D environment through the VS profile or the Horizontal-to-Vertical spectral ratio on noise (HVN). These parameters (f0, VS30, VS, Heng_bed and Hseis_bed) are usually obtained from in-situ geophysical measurements and directly refer to a geo-mechanical soil behavior strictly related to soil amplification (Cadet et al. 2010; Derras et al. 2017). Surface geology and soil class, instead, can be considered indirect proxies, i.e. parameters of “cheap” availability that allow local information to be extrapolated to areal surfaces by using geological and/or topographical maps (Bergamo et al. 2019). They are loosely related to geo-mechanical properties, and can be used to estimate other proxies more closely related to the local site response, such as VS30 (see for example Wills et al. 2015; Yong et al. 2012; Yong 2016; Forte et al. 2019). Bergamo et al. (2019, 2021) systematically evaluated the sensitivity of local amplification to a collected set of indicators, finding that proxies derived from in-situ geophysical measurements (e.g. f0, VS30 and Heng_bed) perform in general better than parameters derived from local topography or geology, meaning that they are more “strongly” correlated with amplification. As also highlighted by Zhu et al. (2020), the period-dependency of site amplification implies that no single proxy performs best over the whole frequency band. To improve the estimation of site amplification, it is therefore more viable to use a combination of site proxies than a single predictor variable (Trifunac 2016; Boudghene-Stambouli et al. 2017; Derras et al. 2017; Zhu et al. 2020 and references therein; Felicetta et al. 2021).

We are aware that our choice of the most appropriate indicators was driven by present knowledge and that some important site effects are missing or poorly represented. As an example, we can mention the dependence of the site response on the earthquake location (back-azimuth, distance and depth), the soil-to-structure interaction and the lateral variability of geological formations, which can strongly affect the seismic behavior. It is also likely that the results of the survey are somewhat biased by the limited number of respondents and the uneven distribution of their individual expertise. Nevertheless, the present survey is the first of its kind, and presents two advantages: on the one hand, the small number of "I don't know" answers indicates that precise answers on the cost and/or difficulty of each method are likely to be informed answers, while on the other hand, the diversity of respondents is likely to provide a much wider viewpoint corresponding to the community as a whole. The selected list of indicators, together with the background information to be associated with the value of each indicator, represents a first proposal to classify the role of the indicators employed in site characterization analysis for European seismic networks.

Indeed, this proposal meets the needs of the scientific community for evaluating the site characterization of strong-motion station locations through a clear definition of the site indicators, and for recommendations on how to obtain them, including their uncertainties. It also makes it possible to homogenize information for high-quality metadata of seismic station recordings. It is a first attempt that can help not only to increase awareness of the need for higher-quality site characterization, but also to shape the evolution of the structure of site metadata in strong and weak ground motion databases. The list of basic indicators can be modified according to the outcomes of new studies that evaluate other proxies for site effects, their mutual correlation and their correlation with site amplification, together with a broader discussion within an enlarged scientific community involved in seismic networks and site characterization. We hope, however, that the outcomes of the present survey already constitute a significant step forward in merging the "state-of-knowledge" (what is available) with the desirable evolution of the "state-of-practice" (what is needed).