Indicators for site characterization at seismic station: recommendation from a dedicated survey

In recent years, the permanent seismic networks worldwide have largely increased, raising the amount of earthquake signals and the applications using seismic records. Although characterization of the soil properties at recording stations has a large impact on hazard estimates, it has not been implemented so far in a standardized way for reaching high-level metadata. To address this issue, we built an online questionnaire for the identification of the indicators useful for a reliable site characterization at a seismic station. We analysed the answers of a large number of experts in different fields, which allowed us to rank 24 different indicators and to identify the most relevant ones: fundamental frequency (f0), shear-wave velocity profile (VS), time-averaged Vs over 30 m (VS30), depth of seismological and engineering bedrock (Hseis_bed and Heng_bed), surface geology and soil class. Moreover, the questionnaire proposed two additional indices in terms of cost and difficulty to obtain a reliable value of each indicator, showing that the selection of the most relevant indicators results from a complex balance between physical relevancy, average cost and reliability. For each indicator we propose a summary report, provided as editable pdf, containing the background information of data acquisition and processing details, with the aim to homogenize site metadata information at European level and to define the quality of the site characterization (see companion paper Di Giulio et al. 2021). The selected indicators and the summary reports have been shared within European and worldwide scientific community and discussed in a dedicated international workshop. They represent a first attempt to reach a homogeneous set of high-level metadata for site characterization.


Introduction
Mechanical properties and morphological setting of the ground are key elements in locally modifying the seismic ground motion in terms of amplitude, duration and frequency content, a phenomenon commonly known as site effects. As a consequence, recordings from seismic stations can be significantly affected by the variation of site conditions close to and below the station, influencing the studies based on earthquake ground motions collected by regional or local seismological networks.
To facilitate practical engineering design, site conditions are often characterized by a small number of site attributes (or proxies) or their combinations, aimed at describing their effects on seismic ground motion (Trifunac 2016; Bergamo et al. 2021). This simplified approach is adopted in many research fields: evaluation of local amplification and ground response analysis (Priolo et al. 2019), calibration of strong-motion records for realistic ground shaking estimates (Michelini et al. 2020 and references therein), assessment of site-specific hazard for critical infrastructures (Bazzurro and Cornell 2004a, b; Rathje et al. 2015; Pecker et al. 2017; Aristizabal et al. 2018), estimation of ground motion models (Bozorgnia et al. 2014; Douglas 2016; Bindi et al. 2019; Kotha et al. 2020), soil classification following the building code prescriptions (NEHRP, BSSC 2015; Eurocode 8, EC8 2004; NTC18 Italian code, NTC 2018). Most existing strong motion databases include some information on the VS30 proxy, which was the first one to be proposed in the nineties as a continuous, quantitative alternative to the binary (or ternary) classification soil/rock or soft soil/stiff soil/rock (Borcherdt 1992, 1994; Boore et al. 1994).
In the last decade, the number of stations of permanent and temporary seismic networks worldwide has largely increased (Margheriti et al. 2011; Mazza et al. 2012; Michelini et al. 2016; Moretti et al. 2016; Hetényi et al. 2018; Cara et al. 2019; Chen et al. 2019), raising the amount of recorded data (McNamara and Buland 2004; Pintore et al. 2012; Lanzano et al. 2019), without much attention being paid to site characterization information.
Recently some efforts have been carried out at national level to perform extensive site characterization at seismic stations (e.g. Sandikkaya et al. 2010; Michel et al. 2014; Stewart et al. 2014; Albarello et al. 2017; Felicetta et al. 2017; Hollender et al. 2018). Currently only a few seismological national networks expose site condition characteristics with very detailed information and reports on site topography, morphology, geology and on seismic surveys used to derive VS profiles: examples of national databases are provided by Switzerland through the Site Characterization Database for Seismic Stations (Swiss Seismological Service (SED) at ETH Zürich 2015, http://stations.seismo.ethz.ch; Michel et al. 2014; Poggi et al. 2017), Italy with the Italian Accelerometric Archive (ITACA, http://itaca.mi.ingv.it; D'amico et al. 2020), Turkey with the Turkish Accelerometric Database (http://kyhdata.deprem.gov.tr).
At European level, some national seismological networks make use of European web portals to disseminate data and information on seismic stations, namely the Earthquake Strong Motion database (ESM, https://esm-db.eu/; Luzi et al. 2020; Lanzano et al. 2019), the ORFEUS station book (http://orfeus-eu.org/stationbook/), the European Geotechnical Database (EGD, http://egd-epos.civil.auth.gr/). Nowadays, the most complete European database is ESM, which exposes site condition information at 2071 strong motion sites, among which EC8 ground type is available at 1455 stations (about 70% of the number of stations). These web portals can display one to several indicators for site characterization and, in a few cases, it is possible to download reports of specific geophysical and geological surveys carried out at the strong motion site. However, most often direct measurements of the site attributes are either unavailable or have not yet been performed. In this case the level of information provided by such reports is poor, and proxies based on geology and/or topography are generally used to define the site classification of a strong-motion station. In ESM, only 469 stations (22% of the total number of stations) have an EC8 soil class derived from measured VS profiles; for the remaining stations it is inferred from geology or terrain slope (Lanzano et al. 2019).
The lack of complete site condition information at European strong motion sites prevents the full use of seismic records for site amplification at local or regional level (Cornou and Bard 2019; Kotha et al. 2020;Cauzzi et al. 2021 and references therein). More generally, setting-up standard practices for a comprehensive seismic characterization of a station site, together with a clear evaluation of their reliability, is becoming a growing concern to reach high-level site condition metadata, and to offer unique opportunities of studies based on the availability of the large amount of high-quality data.
To fill the gap between data providers and researcher users, the networking activity of the SERA EU project ("Seismology and Earthquake Engineering Research Infrastructure Alliance for Europe", project no. 730900, Horizon2020 INFRAIA-01-2016 Program) led to the definition of a European strategy for site characterization of seismic stations in Europe, and to the proposition of standards for the best practice and site characterization quality assessment (Task 7.2 of WP7-NA5 "Networking databases of site and station characterization", http://www.sera-eu.org/en/activities/networking/; Di ). At international level, the USA-based COSMOS Consortium (https://strongmotion.org/Projects/CharacterizationGuidelines/) shares similar goals on strong-motion data dissemination and on the definition of standard procedures for site characterization and reporting.
Because of the combination of this concern and of various recent studies discussing the limitations of VS30 taken as a single indicator and proposing other proxies (Trifunac 2016; Boudghene-Stambouli et al. 2017; Derras et al. 2017; Zhu et al. 2020; Bergamo et al. 2021; Felicetta et al. 2021), we thought it timely to question the scientific and engineering community about the optimal site proxies to be used in the future for improved ground motion predictions. The results of this inquiry are discussed in two companion papers. The present one details the rationale behind a list of seven indicators considered as the most relevant for the site characterization of seismic stations and, for each of them, the template for a summary report aimed at the quantitative assessment of the quality of site metadata. The second one (see the companion paper, Di Giulio et al. 2021) proposes a quality metric to evaluate the site characterization reliability, to be included in station metadata.
In the current study we first describe the outcomes of an international online survey to identify the indicators with the largest consensus, and thus to be considered as necessary for a reliable site characterization. The same survey is then used to build two additional indices that characterize, respectively, the cost and the difficulty to obtain a reliable value for each of the considered site indicators. Next, we provide a scheme of summary reports containing in a compact format the information related to each of them, with the background information helpful to assess its reliability. The selected indicators and summary reports have been presented to a representative panel of European and worldwide experts in a dedicated workshop, during which they were discussed and validated through focus groups. The seven indicators and the associated summary reports represent a first attempt to reach high-level metadata for site characterization, being aware that they can be improved after a few years of experience, based on the feedback from seismic-network data users.

The international questionnaire
Despite some sparse efforts, there is not a customized site characterization procedure among the networks operators, and the site metadata of the permanent networks appear highly heterogeneous, if not completely absent. The identification of a set of indicators, able to catch the main site effects at a seismic station, represents the first step towards standardized site information to be included in seismic databases.
This goal has been pursued by involving a wide scientific community dealing with site effects, from both seismological and earthquake engineering viewpoints. We first collected the existing bibliography on site effect estimation and related methodologies of data analysis, through a preliminary survey among research experts in this field. We involved both partners of the SERA project (ISTERRE-CNRS, France; ETH, Switzerland; INGV, Italy; AUTH, Greece) and a few other expert groups dealing with site characterization (Caltech-USGS, USA; AFAD, Turkey; Virginia Tech, USA; GFZ, Germany; ITSAK, Greece; University of Potsdam, Germany; UoT-University of Texas, USA). Each of those experts was asked to produce his/her own list of most relevant indicators for site effects assessment, together with an appreciation of their importance, their feasibility and the preferred methods of analysis for retrieving them, following the scheme of Table 1. Details and collected bibliography can be found in the deliverable D7.2 of the SERA project (Di ). The preliminary survey allowed us to define a comprehensive set of 24 candidate site indicators (Table 2) to be considered in a subsequent online questionnaire addressed to the broad scientific community working on site characterization. It also pointed out a number of remaining open issues that were useful to shape the next steps of the project: the missing standards on acquisition and analysis of data; the unclear definition of some indicators (e.g. non-unique interpretation among the experts in the definition of "seismic bedrock"); the lack of consensus on the quantitative evaluation of uncertainties or confidence accompanying each indicator, most often overlooked by end-users of waveform data.

Table 1 (recoverable rows only)
(row label missing): Methodology to infer the indicator's value and to evaluate the uncertainty of the estimation
Suggested code of analysis: Preferred code usually adopted to analyze the data and to infer the indicator's value
References & guidelines: References (papers, reports, presentations) and guidelines for the indicators and best-practice of measurement and analysis

Table 2 (recoverable rows only)
(indicator name missing): Amplitude of the spectral peaks (i.e. amplitude from HVSR and HVN or amplification from SSR)
Site Transfer Function [STF]: Curve in the frequency domain describing the amplification function at a site
Preferential direction of ground motion [Direction]: Predominant direction of ground motion; it could be computed by particle motions, rotated spectra, ellipticity vector (covariance matrix method), and/or time-frequency polarization analysis
kappa0 [k0]: High-frequency/near-surface attenuation factor
Frequency-dependent attenuation [FDA]: Model for near-surface attenuation (k), Quality factor (Q) or damping as a function of frequency
(indicator name missing): Subsoil velocity profile of compression wave (VP) as a function of the depth (z)
VS30: Travel-time average of shear-wave velocity VS over the first 30 m depth
Average VS(z) below or above 30 m [VSZ]: Travel-time averaged VS at a given z depth below or above 30 m (e.g. z = 5 m, 10 m, 20 m, etc.)
Vs seismic bedrock [Vs_seis_bed]: VS of the seismological bedrock, shear-wave velocity of the geological unit that controls the lowest (
(indicator name missing): Soil class according to a specific Seismic Building Code; it is also called "Ground Type" in EC8 (2004) or "Site Class" in some national building codes (BSSC 2015; NTC 2018)
Aggravation factor for basin and topography [AF]: Ratio between 2D (or 3D or recorded motion) and 1D estimates for a given intensity measure of ground motion (IM): scalar, if it applies to a scalar IM (e.g. for PGA or Arias intensity) or frequency dependent (e.g. for STF, or amplification factor on response spectra)
The questionnaire, defined after the preliminary survey, was online from August to November 2018 and allowed us to gather the feedback about the best-practice procedures for the computation of the site indicators. We collected answers from a large number of experts in different fields (from geotechnical engineering to seismic risk) and from many countries within Europe and worldwide. Their analysis allowed us to rank the site indicators according to various criteria, and to propose a limited set of recommended ones for site characterization at seismic stations. Finally, the proposed indicators have been adjusted following the feedback from an international workshop where we shared the project's results.
In summary, for each indicator of Table 2 we asked for: (1) the preferred method of estimation; (2) the difficulty level for obtaining it, considering both data acquisition and analysis (the so-called "Feasibility index", which can be "easy", "intermediate" or "difficult"); (3) the approximate cost range for deriving it, including again both data acquisition and processing; (4) free comments. Finally, we asked each participant to rank the indicators according to a 3-degree priority scale, i.e. whether she/he thinks it is a "mandatory", "recommended" or "optional" indicator to be included in site characterization databases. Figure 1 shows the screenshots of the online questionnaire for two indicators: the fundamental resonance frequency (f0) and the shear-wave velocity of the seismic bedrock (Vs_seis_bed). In the f0 case, the proposed data acquisition and processing options are noise (i.e. ambient vibrations), earthquake, modelling or unknown procedure, whereas for Vs_seis_bed the choice was limited to non-invasive (e.g. surface, passive or active, seismic methods) or invasive (e.g. seismic down-hole) methods. In addition, we asked for a preferred definition of the seismic bedrock, because some comments received from the preliminary survey pointed out that the definition of this indicator is not unique (see definition in Table 2). The questionnaire pages for the remaining indicators of Table 2 mostly follow one of the two schemes of Fig. 1 and are displayed in Online resource 1. An invitation to compile the online questionnaire was sent to more than 280 scientists worldwide, preliminarily chosen to keep a balanced distribution of skills in Geophysics (14%), Seismology (12%), Engineering seismology (21%), Geotechnical Engineering and Geology (12%), Seismic hazard and risk (21%), mix of previous fields (20%).
However, only about a quarter of them contributed to the survey (N = 71), mainly scientists with primary expertise in seismology, geophysics, geotechnical engineering and engineering seismology (Fig. 2a). If we also consider the secondary research field mentioned by scientists, the experts of microzonation studies, of Ground Motion Prediction Equations (GMPE) and of Probabilistic Seismic Hazard Assessment (PSHA) represent more than 35% of the total answers. This imbalance in the scientific fields may introduce a bias in the results, which mostly represent the seismological and geophysical community viewpoint.

Fig. 2 Histograms of the answers (in %) of the questionnaire: a scientific field of interest, including multiple choices (each researcher could indicate more than one field, in gray) or only the main field he/she feels to belong to (black); b country of the membership institution
The geographical distribution of the 71 participants is shown in Fig. 2b: 69% are from Europe and 31% from other countries; the most represented countries are Italy, France, Switzerland, USA and Greece. The first three are the leading countries of the WP7-NA5 SERA Project, whereas USA and Greece have teams very interested in the topic addressed by the questionnaire.

Analysis of the questionnaire results
According to the online questionnaire, 87% of the respondents agreed on the completeness of the set of indicators listed in Table 2, and did not suggest adding any other one. The remaining 13% suggested additional and more advanced indicators, such as the dependence of the site response on the earthquake location, the lateral variability of geological formations (2D-3D behavior), the soil-structure interaction (in case of a strong motion station installed in or near a building), the duration lengthening (frequency-dependent lengthening of seismic ground-motion duration) and the geometrical parameter (any parameter related to 2D or 3D structure, i.e. surface topography or underground lithological heterogeneity). These last two indicators were initially included in the questionnaire but were not accounted for in the analysis described in this paper, because few answers were available. We thus consider that the analysis of the answers can be performed with a good level of confidence in the results.

Most recommended indicators
First of all, we ranked the indicators according to the degree of importance for site characterization at seismic stations, as assigned by each respondent (Fig. 3). Almost all of them are considered useful (i.e., at least "optional") for a reliable site characterization, whereas only a few are given the highest priority ("mandatory") to be reported in site characterization databases by more than 50% of all respondents: f0 (89%), VS (72%), VS30 (63%), Surface geology (61%), Depth of seismic bedrock Hseis_bed (58%), Soil class (56%), Depth of engineering bedrock Heng_bed (55%). We thus decided to focus on these 7 consensual indicators (that we refer to as the "most recommended indicators" in the following), to be used for the metadata of seismic stations. Nevertheless, in order to better understand the key aspects that drive, or not, the survey results (for instance, physical relevance, practical availability, measurement reliability, etc.), we analysed the indications provided by respondents on the preferred methods to obtain them, their difficulty and their cost.
The main outcomes of the questionnaire are summarized in Fig. 4 for the 7 most recommended indicators. The results for the resonance frequency f0 (Fig. 4a) show that ambient noise measurements and earthquake recordings are the two main preferred experimental methods to obtain f0, with the largest consensus for the former. Numerical modeling was also proposed by some teams (less than 30%), although modeling assumes that site properties (e.g. velocity profile) are already known from literature or from specific experiments. The feasibility plot in Fig. 4a indicates that the data acquisition and processing are considered "easy" for noise data (70% of answers for the corresponding Feasibility index) and "intermediate" for earthquakes (about 40% of answers). The cost to obtain the indicator value at a target site was estimated to be less than 1000 euros for noise and up to 20,000 euros in case of earthquake data (Cost plot in Fig. 4a). However, the cost evaluation has some uncertainty (note the number of "I don't know" answers), and one must keep in mind that it corresponds to a "marginal cost" only, i.e. the amount required to perform and to interpret the measurements without including the equipment value.
We should mention that, as f0 is closely connected to the site transfer function under ground motion shaking, its reliability increases when earthquake data are used (e.g. Cultrera et al. 2014; Régnier et al. 2018). In areas of low seismicity, however, the ground motion acquisition can be expensive and time-consuming, which is why it is often replaced by noise measurements. Then, to overcome the limits of the noise interpretation, which is not always straightforward (e.g. Mucciarelli et al. 2005; Bonnefoy-Claudet et al. 2009; Molnar et al. 2018; Kawase et al. 2015), several noise measurements in a relatively wide area around the site of interest are recommended to increase the robustness of the f0 estimation.
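As an illustration of how several noise measurements around a site could be condensed into one f0 value with a spread, the following is a minimal sketch. The log-normal summary (geometric mean and standard deviation of ln f0) is an assumption made here for illustration and is not prescribed by the questionnaire; the function name and the input values are hypothetical.

```python
import math
import statistics


def summarize_f0(f0_values_hz):
    """Summarize f0 picks (Hz) from several measurement points.

    Assumption (not from the survey): picks are treated as log-normally
    distributed, so the geometric mean and the standard deviation of
    ln(f0) are returned.
    """
    if len(f0_values_hz) < 2:
        raise ValueError("at least two f0 picks are needed to estimate a spread")
    if any(f <= 0 for f in f0_values_hz):
        raise ValueError("f0 picks must be positive")
    logs = [math.log(f) for f in f0_values_hz]
    geometric_mean = math.exp(statistics.fmean(logs))
    sigma_ln = statistics.stdev(logs)
    return geometric_mean, sigma_ln


# Hypothetical picks from five noise measurements around a station
f0_mean, f0_sigma_ln = summarize_f0([1.8, 2.0, 2.1, 1.9, 2.2])
print(f"f0 = {f0_mean:.2f} Hz, sigma_ln = {f0_sigma_ln:.2f}")
```

A large sigma_ln would suggest lateral variability around the station and a less robust f0, which is the situation the repeated measurements are meant to reveal.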
The histograms on the methods of analysis for the remaining most recommended indicators are shown in the other panels (b-f) of Fig. 4, together with their cost and feasibility.
Concerning the VS velocity profile with depth (VS, Fig. 4b), the non-invasive methods (i.e. active or passive seismic methods) are preferred to the invasive ones (i.e. measurements in borehole, such as cross-hole or down-hole). This is most probably because they are less expensive (Cost panel) and more feasible (although of intermediate difficulty, see Feasibility panel) than the invasive methods, especially in urban environments and for large investigation depths. However, the capabilities of non-invasive methods are limited by the measurable wavelength range, which is strongly linked to the array layout of receivers and the ground structure properties (e.g. Wathelet et al. 2008; Foti et al. 2018 for surface-wave passive methods). Another issue concerns the determination of the VS profile at stiff and rock sites, which is always considered challenging because it requires large-wavelength measurements in, most often, mountainous regions (Poggi et al. 2017).
The results for VS30 are consistent with the soil class indicator (panels c and f in Fig. 4), because in current practice the latter is usually computed from the VS30 values. Amongst the methods of data analysis for both of them, the direct measurements (geophysical and geotechnical methods) are more widely recommended than the other methods (e.g. based on Digital Elevation Model (DEM), geology and model), which are geomorphic or terrain-based proxies (e.g. Allen and Wald 2009; Stewart et al. 2014; Pilz et al. 2010; Yong 2016; Bergamo et al. 2019). Note that "model" stands for the geological or velocity model extrapolated from other areas with similar geological characteristics, used for computing VS30 and soil class at a specific site. Alternative velocity and soil class definitions can be provided in terms of correlations based on parameters derived from "geotechnical methods", such as SPT or CPT penetration tests and undrained shear strength (Wair et al. 2012, and references therein). The geophysical and geotechnical methods are thus more widely recommended than the ones from proxies, though they are more expensive.
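To make the link between a measured VS profile, VS30 and soil class concrete, the sketch below computes the travel-time averaged velocity over the top 30 m of a layered profile and maps it to an EC8 ground type. It is a simplified illustration under stated assumptions: the function names are hypothetical, the deepest layer is extended as a half-space if the profile is shallower than 30 m, and only the VS30 thresholds of EC8 ground types A to D are used (types E, S1 and S2 need stratigraphic information and are not handled).

```python
def time_averaged_vs(thicknesses_m, vs_m_s, depth_m=30.0):
    """Travel-time averaged shear-wave velocity down to depth_m (VS30 by default).

    Assumption: if the profile is shallower than depth_m, the last layer
    is extended downwards as a half-space.
    """
    if not thicknesses_m or len(thicknesses_m) != len(vs_m_s):
        raise ValueError("need one velocity per layer and at least one layer")
    travel_time = 0.0
    remaining = depth_m
    for h, v in zip(thicknesses_m, vs_m_s):
        if h <= 0 or v <= 0:
            raise ValueError("thicknesses and velocities must be positive")
        used = min(h, remaining)
        travel_time += used / v
        remaining -= used
        if remaining <= 0:
            break
    if remaining > 0:
        travel_time += remaining / vs_m_s[-1]
    return depth_m / travel_time


def ec8_ground_type_from_vs30(vs30):
    """EC8 (2004) ground type A-D from VS30 only (E, S1, S2 not handled)."""
    if vs30 > 800.0:
        return "A"
    if vs30 >= 360.0:
        return "B"
    if vs30 >= 180.0:
        return "C"
    return "D"


# Hypothetical two-layer profile: 10 m at 200 m/s over 20 m at 400 m/s
vs30 = time_averaged_vs([10.0, 20.0], [200.0, 400.0])
print(round(vs30), ec8_ground_type_from_vs30(vs30))
```

The harmonic (travel-time) averaging gives more weight to slow layers than an arithmetic mean of the layer velocities would, which is why thin soft layers near the surface can change the resulting class.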
Both the use of available cartography (geological, lithological, etc.) and specific geological field surveys have a large consensus for evaluating the surface geology (Fig. 4d), providing a preliminary model representative of the area. Field survey is considered more accurate because it has higher resolution and accounts for other available information (i.e. boreholes, stratigraphy, geological sections), leading however to an increased cost in site characterization of a target station. Finally, the results for the depth of seismic and engineering bedrock (Hseis_bed and Heng_bed) are similar to each other and are presented in Fig. 4e. In this case, the non-invasive measurements are preferred even though complementary geophysical and geological studies could be required to constrain them: despite the higher accuracy of the invasive measurements, their cost increases dramatically with the bedrock depth. The bedrock depth and the VS profile are considered the most difficult to obtain amongst all site indicators (see the percentage of "difficult" feasibility in Fig. 4b and e).

Table 3 Example of the overall difficulty index (DI_tot) for f0, considering that it can be computed with 3 different methods. For each method j, ns_DI, ni_DI and nh_DI are the number of questionnaire answers indicating "small", "intermediate" and "high" difficulty (in the questionnaire they are indicated as "easy", "intermediate" and "difficult" feasibility); pi_j is the number of people recommending that method; DI_j is the difficulty index for each method and w_j is the corresponding weight to compute DI_tot

Table 4 For each method j, ns_CI, ni_CI and nh_CI are the number of questionnaire answers indicating "small" (less than 1 keuro), "intermediate" (from 1 to 5 keuros) and "high" (from 5 to 20 keuros) cost; pi_j is the number of people recommending that method; CI_j is the cost index for each method and w_j is the corresponding weight to compute CI_tot

Feasibility and cost indices
Two additional indices are proposed for comparing the overall cost and feasibility to obtain a reliable value of the 24 considered site indicators. For each i-th indicator, the respondents had to select 1-to-m different methods to compute it: we indicate with pi_j the number of people recommending the method j and with pi_tot the total number of recommendations for the indicator i, $pi_{tot} = \sum_{j=1}^{m} pi_j$. As an example, for the indicator f0 we considered m = 3 methods: Noise (j = 1), Earthquake (j = 2) and Modeling (j = 3). For each of them we got different recommendations (Tables 3 and 4): pi_1 = 67, pi_2 = 46 and pi_3 = 22, respectively, for a total of pi_tot = 135 answers (the "Don't know" answers were 4). The overall feasibility and cost to evaluate a given indicator is then computed as a weighted average of the feasibility and cost for each of the considered methods as follows:

(a) The Difficulty index (DI_j) for a specific method j refers to the Feasibility and is estimated as a weighted average:

$$DI_j = 5 \times \left( \frac{1 \times ns_{DI} + 2 \times ni_{DI} + 3 \times nh_{DI}}{N_{DI}} - 1 \right) \qquad (1)$$

where ns_DI, ni_DI and nh_DI are the number of questionnaire answers indicating "small" (weight of 1), "intermediate" (weight of 2) and "high" (weight of 3) difficulty, respectively, for the j-th method (in the questionnaire they are indicated as "easy", "intermediate" and "difficult" feasibility); N_DI = ns_DI + ni_DI + nh_DI is the total number of informative answers about the difficulty for the j-th method (N_DI ≤ N = 71, N being the total number of people who answered the questionnaire). DI_j ranges on a 0-10 scale: DI_j = 0 when the method and/or processing is easy to apply (ns_DI non-zero, ni_DI = nh_DI = 0), 5 when an intermediate difficulty is suggested (ni_DI non-zero, ns_DI = nh_DI = 0), 10 for the most difficult (nh_DI non-zero, ns_DI = ni_DI = 0). In the case of the f0 results, DI varies from around 5 for modelling, being of intermediate feasibility, to 1.4 for noise, which is very easy to achieve (Table 3).

(b) Similarly to DI_j, the Cost index (CI_j) for a specific method j is defined by:

$$CI_j = \frac{0.5 \times ns_{CI} + 3 \times ni_{CI} + 12.5 \times nh_{CI}}{N_{CI}} \qquad (2)$$

where ns_CI, ni_CI and nh_CI are the number of questionnaire answers indicating "small" (less than 1 keuro), "intermediate" (from 1 to 5 keuros) and "high" (from 5 to 20 keuros) cost, respectively, for the j-th method; N_CI = ns_CI + ni_CI + nh_CI is the total number of informative answers about the cost for the j-th method (again, as not all answers inform about cost, N_CI ≤ N = 71, the total number of questionnaire answers).

CI_j is indeed the average estimated cost in k€, as the various weighting coefficients are simply the median costs for each cost interval (0.5 is the median of the low cost interval [0-1 k€], 3 the median of [1-5 k€], and 12.5 the median of [5-20 k€]). It thus ranges from 0.5 to 12.5, but one may note that it only very rarely exceeds 10 k€. For the f0 example, CI varies from a median cost of 2 k€ for the noise method to about 7 k€ for modelling, considering that it is then necessary to know the geophysical and morphological underground properties to estimate f0 (Table 4).

(c) Finally, for a given indicator, the overall DI_tot and CI_tot are computed as a weighted average of the DI_j and CI_j obtained for each j-th method, with normalized weights w_j proportional to the number of people pi_j recommending that method (column pi in Tables 3 and 4):

$$DI_{tot} = \sum_{j=1}^{m} w_j \, DI_j, \qquad CI_{tot} = \sum_{j=1}^{m} w_j \, CI_j \qquad (3)$$

where w_j = pi_j / pi_tot, pi_j being the number of people recommending the method j and pi_tot the total number of recommendations for the i-th indicator, $pi_{tot} = \sum_{j=1}^{m} pi_j$.

Fig. 5 Overall Difficulty and Cost indices (DI_tot and CI_tot) for the indicators of Table 2. The colors distinguish the most recommended indicators (in orange) from the others of Table 2 (in blue). The size is proportional to the percentage of consensus on the "mandatory" class of Fig. 3. Vsz_less_30 and Vsz_above_30 indicate VSZ of Table 2 at depth less or greater than 30 m, respectively
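The index definitions above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the per-method answer counts in the example are invented placeholders (only the recommendation counts 67, 46 and 22 for f0 come from the text).

```python
def difficulty_index(ns, ni, nh):
    """Eq. (1): DI_j on a 0-10 scale from the counts of 'easy' (ns),
    'intermediate' (ni) and 'difficult' (nh) answers for one method."""
    n = ns + ni + nh
    if n == 0:
        raise ValueError("no informative answers for this method")
    return 5.0 * ((1 * ns + 2 * ni + 3 * nh) / n - 1.0)


def cost_index(ns, ni, nh):
    """Eq. (2): CI_j in keuro, weighting each cost class by 0.5, 3 and 12.5."""
    n = ns + ni + nh
    if n == 0:
        raise ValueError("no informative answers for this method")
    return (0.5 * ns + 3.0 * ni + 12.5 * nh) / n


def overall_index(per_method_values, recommendations):
    """Weighted average with w_j = pi_j / pi_tot (overall DI_tot or CI_tot)."""
    total = sum(recommendations)
    if total == 0:
        raise ValueError("no recommendations")
    return sum(v * p / total for v, p in zip(per_method_values, recommendations))


# pi_j for f0 (noise, earthquake, modeling) as reported in the text
recommendations = [67, 46, 22]
# HYPOTHETICAL (ns, ni, nh) difficulty counts per method, for illustration only
hypothetical_counts = [(50, 10, 5), (15, 20, 8), (6, 8, 6)]
di_per_method = [difficulty_index(*c) for c in hypothetical_counts]
di_tot = overall_index(di_per_method, recommendations)
print([round(d, 2) for d in di_per_method], round(di_tot, 2))
```

Because the weights follow the number of recommendations, a rarely recommended but expensive method (such as modeling for f0) has a limited effect on the overall index.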
For the f0 example in Tables 3 and 4, the overall values considering the 3 different methods to compute f0 are summarized into the total difficulty index DI_tot = 3.00 (low value in the 0-10 scale), and the total cost index CI_tot = 4.22 (median cost of about 4 k€), for a total number of answers pi_tot = 135. One may note however that if the "modeling" approach is discarded (it was recommended by only a small proportion, much below 50%, of respondents), the average cost decreases to 3.6 k€. Figure 5 shows the resulting overall Difficulty and Cost indices (DI_tot and CI_tot, respectively) for all the indicators of Table 2. In general, the higher the difficulty to infer the indicator, the larger the cost for deriving it, considering the expenses for data acquisition and processing: that is, the difficulty can be overcome by a larger amount of funding. More interestingly, the indicators are graphically clustered in 3 groups: (1) the lowest DI_tot and CI_tot values (median cost less than 3 keuros) refer to the topography class and the surface geology; (2) the intermediate values (median cost between 4 and 6 keuros) refer to indicators related to the site transfer function and the seismological parameters in general (including f0, VS30 and soil class); (3) the highest values (median cost between 6.5 and 8.5 keuros) include parameters at depth (i.e. velocity profile above 30 m, depth of seismological or engineering bedrock) and advanced geotechnical properties.
The most recommended indicators (orange symbols in Fig. 5) turn out not to correspond to low cost and low difficulty only, which strongly indicates that the choice of the scientific community is also related to the confidence in their physical relevance for site amplification issues, and the reliability of their measurements. Indeed, within the seven indicators, the depth of seismological and engineering bedrocks and the VS profile have high cost and great difficulty, whereas the geology and f0 are considered to be of low cost and low difficulty.

Summary report
In the previous section we defined the most recommended indicators of site behavior at a seismic station. However, their values alone are not enough to assess the reliability and the associated uncertainty of a single indicator and, more generally, of the site characterization for the station site as a whole. The background information about the data acquisition and processing used to infer the site indicators is therefore necessary to evaluate an overall quality index for the station site, increasing the quality of seismological data (see the companion paper, Di Giulio et al. 2021). Some seismic network databases expose site condition characteristics with very detailed information and reports. In general, the reports can be very detailed when dedicated measurements are carried out at the site. For instance, the site characterization metadata of the accelerometric stations in the Italian Accelerometric Archive (ITACA; http://itaca.mi.ingv.it) are stored in three main thematic levels (topographic features, geological features and geophysical measurements), including seismic signal analysis and seismic classification according to the Italian and European seismic codes (Felicetta et al. 2017): out of 743 stations in the ITACA database, fewer than 30% also have an exhaustive report on the VS profile and 20% have a detailed geological survey. Another example is the Site Characterization Database for seismic stations of the Swiss Seismological Service (Swiss Seismological Service (SED) at ETH Zurich, 2015), which collects very detailed information and reports on seismic surveys and VS analysis, together with the topographical map, geological map, housing and current instrumentation (Michel et al. 2014; Poggi et al. 2017).
However, most of the time such detailed information is not public and, for some of the indicators involved in seismic site characterization, it is not possible to verify adherence to the prescriptions of the standards or guidelines associated with the most popular techniques (Foti et al. 2018; Hunter and Crow 2015; SESAME 2004; Thompson et al. 2012).
In the following we propose, for each indicator, the scheme of a summary report containing in a compact format the details we think are necessary for a quantitative evaluation of the quality associated with the single-indicator value and with the overall site characterization. The summary report is not intended to replace a complete report, which may follow specific guidelines or standards; rather, it can be used as a compact checklist of the basic requirements and as a handy framework to homogenize information, especially for seismic network metadata.
Each report contains a general description of the site location and references to specific studies at the site, the contact information of the compiler, together with the final value of the indicators and the associated Quality index according to the procedure explained in the companion paper (Di Giulio et al. 2021) (Fig. 6).
Then, for each indicator, the core of the summary report scheme follows; it is intended to provide the information required for assessing the quality and reliability of the measurement and computation. For each of the principal methods that can be used to compute the target indicator, the summary report provides fields for describing the data acquisition (date of experiment, location, equipment, instrumental setting) and the analysis (methodology and general processing parameters, uncertainties and limits of resolution). Figure 7 depicts the summary sheet for the resonance frequency f0. In this case, the two options provided as data source are ambient noise or earthquake recordings. For the first one, it is possible to choose among 3 techniques (Horizontal-to-Vertical spectral ratio "H/V", "Ellipticity" of the Rayleigh waves, or a different method to be specified in "Other") and to specify the experimental conditions, including environment (weather, soil-sensor coupling, urbanization) and equipment description. Some details on the analysis (software, smoothing and window length) are also required to infer the resolution limits of the computation for the uncertainty estimation. The methods based on earthquake data are limited to Horizontal-to-Vertical spectral ratio (HVSR), standard spectral ratio to a reference station (SSR), Generalized Inversion (GIT) or any different method (Other). The experimental conditions concern the earthquake parameters, recording period, number of events, minimum and maximum epicentral distance and magnitude range. Finally, the report provides details on the analysis, such as the considered seismic phase and seismic duration in the case of HVSR and SSR, and the list of main parameters in the case of GIT inversion.
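To make the structure of the f0 sheet concrete, the fields described above can be sketched as a small data record. This is a minimal sketch under stated assumptions, not the format of the actual pdf template: the class name, the field names and the free-form dictionaries are hypothetical, and only the two data sources and their technique options are taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Technique options per data source, as listed in the text for the f0 sheet.
# The dictionary keys are hypothetical identifiers chosen for this sketch.
TECHNIQUES = {
    "ambient_noise": {"H/V", "Ellipticity", "Other"},
    "earthquake": {"HVSR", "SSR", "GIT", "Other"},
}


@dataclass
class F0Record:
    """Hypothetical in-memory counterpart of one f0 summary sheet entry."""
    data_source: str                     # "ambient_noise" or "earthquake"
    technique: str                       # one of TECHNIQUES[data_source]
    f0_hz: float                         # reported resonance frequency
    acquisition: Dict[str, object] = field(default_factory=dict)  # e.g. date, equipment
    analysis: Dict[str, object] = field(default_factory=dict)     # e.g. software, smoothing
    other_description: Optional[str] = None  # required when technique is "Other"

    def __post_init__(self) -> None:
        if self.data_source not in TECHNIQUES:
            raise ValueError(f"unknown data source: {self.data_source!r}")
        if self.technique not in TECHNIQUES[self.data_source]:
            raise ValueError(
                f"{self.technique!r} is not an option for {self.data_source!r}"
            )
        if self.technique == "Other" and not self.other_description:
            raise ValueError("technique 'Other' must be described")
        if self.f0_hz <= 0:
            raise ValueError("f0 must be positive")


# Illustrative values only; they do not describe a real station.
example = F0Record(
    data_source="ambient_noise",
    technique="H/V",
    f0_hz=1.2,
    acquisition={"weather": "dry", "soil_sensor_coupling": "good"},
    analysis={"window_length_s": 60},
)
```

Validating the technique against the data source mirrors the way the sheet offers a different set of options for ambient noise and for earthquake recordings.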
The template of the summary report for each recommended indicator has been formatted as an editable pdf; the templates are easy to fill in and allow the metadata entered in the form to be extracted by means of appropriate computer programs. They are available on the SERA web page (http://www.sera-eu.org/en/Dissemination/scientific-publications-00001/) and are also given in the supplementary electronic material of the present paper (Online Resource 2), which includes a page for general information and detailed pages for f0, VS profile, VS30, Hseis_bed, Heng_bed, surface geology, and soil class.

Discussion and conclusion
This paper illustrates part of the activities carried out within the SERA EU project (WP 7-NA5 "Networking databases of site and station characterization"; http://www.sera-eu.org/en/activities/networking), with the aim of proposing a common, reliable and efficient framework for site characterization at European seismic stations. To achieve this goal, we identified the indicators that are used in site characterization analysis (Table 2) and organized an online survey to obtain informed feedback about their usefulness, including the cost and difficulty of their measurement (Online Resource 1). We used the respondents' answers to propose a list of the most recommended indicators: seven of them (f0, VS, VS30, Hseis_bed, Heng_bed, surface geology and soil class) are considered "mandatory" by a majority (> 50%) of respondents, and are found to represent a good compromise between physical relevance, practical usefulness and availability for the metadata of seismic stations. For each indicator, we proposed to gather the background information for assessing the reliability of the obtained value in a summary report, which contains in a compact format the basic details we think necessary to accompany the computed values (Online Resource 2). The summary report makes it possible to homogenize the information in seismic network metadata. Further, it can be used to evaluate quantitatively the reliability of the indicator's value by computing an overall quality index for the station site (see the companion paper Di Giulio et al. 2021), or as a sort of checklist for new efforts to improve networks that have not yet been characterized. The summary reports provided as editable pdf, although they can certainly be improved, constitute a practical tool for network operators and data users to homogenize the content and presentation of site metadata, which is of increasing concern for most of them (e.g., Cauzzi et al. 2021; Lanzano et al. 2021; Strollo et al. 2021).
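The majority criterion used to select the "mandatory" indicators can be sketched as follows. This is only an illustration of the > 50% rule stated above: the function name, the input layout and the example counts are hypothetical and are not survey results.

```python
from typing import Dict, List, Tuple


def mandatory_indicators(
    votes: Dict[str, Tuple[int, int]], threshold: float = 0.5
) -> List[str]:
    """Return the indicators rated "mandatory" by a strict majority.

    `votes` maps an indicator name to (n_mandatory, n_answers). An indicator
    is kept only if its share of "mandatory" answers is strictly greater
    than `threshold`, matching the "> 50%" wording of the text.
    """
    selected = []
    for name, (n_mandatory, n_answers) in votes.items():
        if n_answers > 0 and n_mandatory / n_answers > threshold:
            selected.append(name)
    return selected


# Hypothetical counts, for illustration only.
example_votes = {
    "indicator_A": (80, 100),  # 80% -> selected
    "indicator_B": (50, 100),  # exactly 50% -> not a strict majority
    "indicator_C": (10, 40),   # 25% -> not selected
}
selected = mandatory_indicators(example_votes)
```

With the hypothetical counts above, only `indicator_A` passes the strict-majority test.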
The entire study has been driven by the results of an online questionnaire that was sent to a large number of experts within the scientific community involved in site effect issues, either as data providers or as data users. Although the project was at European level, other countries also contributed to the survey. We received extensive feedback, mostly from the seismological community (seismology, geophysics, geotechnical engineering and engineering seismology), with a comparatively poorer contribution from civil engineers. As a further step towards a larger consensus, both the proposed list of main indicators and their summary reports were presented and discussed during an international workshop organized in L'Aquila, Italy (https://sites.google.com/view/site-characterizationworkshop/home; Cultrera et al. 2019) within the SERA Project activities. The participants were divided into focus groups that analysed either the choice of indicators and their summary reports, or the proposed quality metrics (see companion paper Di Giulio et al. 2021); their suggestions were summarized in a plenary discussion and taken into account for the final proposal described in this paper.
One issue emerging from the discussion was about the choice of the recommended indicators: are they the most important ones or simply the ones we are more familiar with?
First of all, the recommendation rate is not directly linked to the level of cost and/or difficulty, as shown in Fig. 5, which indicates that the choice of the scientific community also takes into account the confidence in the physical relevance of the indicators for site amplification issues and the reliability of their measurements. This result may be indicative of the robustness of the questionnaire responses.
Secondly, the chosen proxies aim at representing the local seismic response in a concise way. According to Bergamo et al. (2019, 2021), most of the indicators, with the exception of surface geology and soil class, are related to stratigraphic amplification in a 1D environment through the VS profile or the Horizontal-to-Vertical spectral ratio on noise (HVN). These parameters (f0, VS30, VS, Heng_bed and Hseis_bed) are usually obtained from in-situ geophysical measurements and directly refer to a geo-mechanical soil behavior strictly related to soil amplification (Cadet et al. 2010; Derras et al. 2017). Surface geology and soil class, instead, can be considered as indirect proxies, i.e. parameters of "cheap" availability that allow local information to be extrapolated to areal surfaces by using geological and/or topographical maps (Bergamo et al. 2019). They are loosely related to geo-mechanical properties, and can be used to estimate other proxies more closely related to the local site response, such as VS30 (see for example Wills et al. 2015; Yong et al. 2012; Yong 2016; Forte et al. 2019). Bergamo et al. (2019, 2021) systematically evaluated the sensitivity of local amplification to a collected set of indicators, finding that proxies derived from in-situ geophysical measurements (e.g. f0, VS30 and Heng_bed) perform in general better than parameters derived from local topography or geology, meaning that they are more "strongly" correlated with amplification. As also highlighted by Zhu et al. (2020), the period-dependency of site amplification implies that no single proxy performs best over the whole frequency band. To improve the estimation of site amplification, it is therefore more viable to use a combination of site proxies than a single predictor variable (Trifunac 2016; Boudghene-Stambouli et al. 2017; Derras et al. 2017; Zhu et al. 2020 and references therein; Felicetta et al. 2021).
We are aware that our choice of the most appropriate indicators was driven by present knowledge and that some important site effects are missing or poorly represented. As examples, we can mention the dependence of site response on the earthquake location (back-azimuth, distance and depth), the soil-to-structure interaction and the lateral variability of geological formations, which can strongly affect the seismic behavior. It is also likely that the results of the survey are somewhat biased by the limited number of respondents and the uneven distribution of their individual expertise. Nevertheless, the present survey is the first of its kind and presents two advantages: on the one hand, the small number of "I don't know" answers indicates that the answers on the cost and/or difficulty of each method are likely to be informed ones, while on the other hand, the diversity of respondents is likely to provide a much wider viewpoint corresponding to the community as a whole. The selected list of indicators, together with the background information to be associated with the value of each indicator, represents a first proposal to classify the role of the indicators employed in a site characterization analysis for European seismic networks.
Indeed, this proposal meets the needs of the scientific community for the evaluation of site characterization at strong-motion station locations through a clear definition of the site indicators, and for recommendations on how to obtain them, including uncertainties. Finally, it makes it possible to homogenize information for high-quality metadata of seismic station recordings. It is a first attempt that can help not only to increase awareness of the need for higher-quality site characterization, but also to shape the evolution of the structure of site metadata in strong and weak ground motion databases. The list of basic indicators can be modified according to the outcomes of new studies evaluating other proxies for site effects, and/or their mutual correlation and their correlation with site amplification, together with a broader discussion within an enlarged scientific community involved in seismic networks and site characterization. We hope, however, that the outcomes of the present survey already constitute a significant step forward in merging both the "state-of-knowledge" (what is available) and the desirable evolution of the "state-of-practice" (what is needed).