1 Introduction

Despite the progress achieved in the last century by instrumental seismology, macroseismic intensity data continue to be fundamental for the assessment of seismic hazard and risk. The macroseismic intensity classifies, through the use of a given macroseismic scale (e.g., MCS – Sieberg 1930; MMI – Wood and Neumann 1931; MSK – Medvedev et al. 1964; EMS-98 – Grünthal 1998), the severity of the effects of the earthquake shaking produced in a limited area, usually called “locality”, on humans, and on the built and natural environment. For example, the European Macroseismic Scale EMS-98 (Grünthal 1998) is compiled with theoretical descriptive frameworks, arranged in a hierarchical order. This scale assigns macroseismic intensity by considering the typology and vulnerability of damaged buildings, the relative level of damage, and comparing that data to a predefined table. As a consequence, the macroseismic intensity is a tool in the seismological and engineering practice for classifying the degree of damage that an earthquake may cause and is a parameter that could be used to estimate expected ground shaking. Furthermore, the spatial distribution of Intensity Data Points (IDPs) is used for the characterization of the seismic source (i.e., estimates of epicentral location and magnitude) of pre-instrumental earthquakes (e.g., Bakun and Wentworth 1997; Gasperini et al. 1999, 2010; Provost and Scotti 2020), that constitute the bulk of seismic catalogues in countries where the historical record is much longer than that of the instrumental record. In Italy, much of the significant information on earthquakes has been compiled from historical-macroseismic studies. Recent years have seen an increase in historical research, which has been incorporated into macroseismic databases (Monachesi and Stucchi 1997; Boschi et al. 2000; Guidoboni et al. 2007; Stucchi et al. 2007; Locati et al. 2022), improving the understanding of seismic activity in the country.

Macroseismic intensity attenuation is the rate at which shaking amplitude decreases with distance from the epicentre (Musson and Cecic 2002). Kövesligethy (1906) proposed the first intensity attenuation model, which describes the decrease in seismic energy due to absorption of the geophysical media and geometrical spreading. The model is represented by the difference between epicentral intensity (Io) and site intensity (I) as a function of pseudo hypocentral distance (R), pseudo focal depth (h), and a free parameter. Blake (1937) simplified the model by eliminating the linear term while keeping the coefficient of the logarithm as a free parameter. However, empirical observations and macroseismic data have revealed issues with this model, such as the use of Io as a measure of earthquake size. The accuracy of earthquake intensity predictions has been improved over the years by refining Kövesligethy's model. Refinements have included additional simplifications and considering magnitude as an earthquake size term. Table 1 shows a summary of such models grouped into families of functional types.

Table 1 Examples of functional types of macroseismic intensity attenuation models

The models proposed in literature, mainly after 1990, are either logarithmic or non-logarithmic (linear, non-linear, polynomial, etc.). These models are used to estimate the correlation between macroseismic data in isoseismal maps and seismic energy density (Howell and Schultz 1975). Several linear, log and log-linear models have been used in different places, such as California, Northern Rhine area, Iberian Peninsula, France, Portugal, Caribbean, Andean region, Brazil and Japan among others (Table 2) with different spatial and source size ranges and resulting intensities.

Table 2 Examples of macroseismic intensity attenuation models calibrated in magnitude

In this study, new macroseismic intensity attenuation models have been derived for Italy by exploiting the most up-to-date and publicly available datasets, including the Italian Macroseismic Database (DBMI15, Locati et al. 2022) and the Parametric Catalogue of Italian Earthquakes (CPTI15, Rovida et al. 2022a). The models were developed in the framework of a research project, led by the Seismic Hazard Center (Centro di Pericolosità Sismica, CPS) of the Istituto Nazionale di Geofisica e Vulcanologia (INGV) and supported by the Italian Civil Protection Department (Dipartimento della Protezione Civile, DPC), aimed at producing the MPS19 seismic hazard model for Italy (Meletti et al. 2021); one task of the project focused on the prediction models for strong motion parameters and macroseismic intensity to be used in the hazard calculations.

The observed ground motion resulting from an earthquake is the convolution of mainly three contributing factors: source, path and site effects. Unlike other intensity measures that describe the ground shaking (such as PGA), which have a spatial representativeness restricted to the immediate vicinity of the recording station, the macroseismic intensity has a spatial representativeness that can be extended to the entire area of inhabited centres (even several square kilometres). For the PGA (and similarly PGV and SAs) we have a precise estimate of the shaking that may be strongly influenced by the soil response during a seismic event. For that reason, empirical models calibrated for instrumental parameters need the definition of reference sites unaffected by local conditions (e.g. variability of near-surface geology, topographic effects, seismic wave polarisation). For these types of ground motion models, the identification of reference sites is a critical issue that involves a variety of proxies related to the site response, which are a mixture of qualitative classifications of the subsoil and quantitative measures (e.g. outcropping geology, site topography, shear wave velocity Vs30, horizontal-to-vertical spectral ratios obtained from noise measurements or recordings). As pointed out before, each macroseismic intensity point represents the average earthquake effects observed over a large inhabited area, often located on a substratum that is spatially variable, and for this reason it is extremely difficult to evaluate how local site effects may have influenced the macroseismic intensity assignment. In fact, the evaluation of the earthquake effects in anthropic environments, by convention, mostly depends on the damage level of buildings, whereas soil conditions or geomorphological effects are not properly considered.

Calibrating attenuation models in terms of macroseismic effects while accounting for local site conditions is a greater challenge. Most of the observation points used in this work to calibrate the attenuation model fall in the EC8-B soil class (84.2%), while only a small percentage can be related to the EC8-C class (15.5%) (Mori et al. 2020). Correcting the intensity for site conditions requires a correct evaluation of macroseismic effects in the absence of seismic amplification, that is, for the EC8-A class, which is not represented in our calibration dataset. For this reason, in this work we have performed only a preliminary analysis of the possible correlation between model residuals and broad-scale seismic site conditions, postponing the correction of the macroseismic intensity for site amplification effects to future developments of this work.

Leveraging the most recent research and a robust macroseismic dataset, we present a classical Log-linear attenuation model calibrated in Mw, which is the most widely used in the literature, as well as an alternative model with higher sensitivity to physical parameters such as focal depth and magnitude following Howell and Schultz (1975). To check the validity of our models, we also apply a validation process using an independent dataset following Mak et al. (2015) and Bakun and Wentworth (1997).

2 State-of-the-art in Italy

Several intensity attenuation models have been developed in Italy to compile seismic hazard assessment in terms of macroseismic intensity or to derive earthquake parameters for pre-instrumental events from IDPs.

The published models (Table 3) have focused extensively on macroseismic intensity attenuation over the whole of Italy (Gasperini 2001; Albarello and D'Amico 2004; Gomez-Capera 2006; Pasolini et al. 2008b) or on more localised areas (e.g., Azzaro et al. 2006 for the Mt. Etna volcanic region). This multiplicity of studies is largely due to the continual improvement of Italian macroseismic databases and earthquake catalogues, as well as to a greater understanding of methodological factors that were previously overlooked. As an example, previous attenuation models proposed for Italy (Peruzza 1996, 2000), and used for seismic hazard analyses in terms of macroseismic intensity (Slejko et al. 1998; Albarello et al. 2000), did not consider aleatory uncertainty in the predicted intensity values.

Table 3 The most recent macroseismic intensity attenuation models referenced in literature for Italy

The most recent models available in the literature for the whole Italian territory were proposed by Gomez-Capera (2006) and Pasolini et al. (2008b). Both models were developed using the same datasets (DBMI04, Stucchi et al. 2007 and CPTI04, CPTI Working Group 2004) and were used to compute seismic hazard in Italy in terms of macroseismic intensity through different approaches (Gomez-Capera et al. 2010). The two models differ in methodological and computational choices, mainly concerning the functional form of intensity-distance decay and the source-size parameter (Table 3).

In particular, Gomez-Capera (2006) adopted epicentral intensity as the measure of earthquake size and assumed a model in which the intensity decay (i.e., the difference between epicentral and site intensity) is proportional to the cubic root of the epicentral distance (Berardi et al. 1993), independent of the earthquake's focal depth. Historical and post-1900 earthquakes were used, and criteria for excluding distant IDPs were applied. Calibration was conducted using 20,873 IDPs related to 212 earthquakes that occurred from 1279 to 2002; these data were grouped according to the predominant faulting mechanism in the relevant seismic zone (Meletti et al. 2008). This model was developed for the whole Italian territory, as well as for each of the predominant styles of faulting (normal, reverse and strike-slip) and the Etna volcanic zone (Table 3). A residual analysis was reported, along with its accompanying standard deviation, as an indication of the quality of the regressions. For the Whole Italy and normal-faulting models, the fit was good up to about 150 km; for the Strike-Slip and the Reverse models, the error increased for distances greater than 110 km and, for the Etna volcanic area, beyond 10 km. No validation on an independent dataset was performed, and the model was not calibrated against a physical parameter such as moment magnitude.

The model by Pasolini et al. (2008b) considered earthquakes after 1200 with a minimum of 10 IDPs and excluded offshore events and those that occurred in the volcanic areas of Mt. Etna and Ischia Island. To avoid any bias due to incompleteness of low macroseismic intensity values, they also disregarded all IDPs at a distance where an intensity of less than 4 MCS was expected, according to Gasperini (2001). Similarly, Gomez-Capera (2006) applied the same selection criteria, but further omitted deep events (h > 35 km), minor earthquakes (Io < 7 MCS), and those outside the seismogenic zones of the ZS9 source zone model (Meletti et al. 2008). The dependence of macroseismic intensity on source distance is modelled through a log-linear function. This choice, shared by many studies on intensity attenuation, is based on physical evidence that supports the proportionality of intensity with the logarithm of ground motion amplitude (Pasolini et al. 2008a), thereby justifying the adoption of a relationship similar to common Ground Motion Prediction Equations (GMPEs). Although the two studies adopted different functional forms to model intensity-distance decay and differed in their analysis strategies, their average attenuation curves are very similar, with the exception of the epicentre, where Gomez-Capera (2006) predicts higher values than the relevant epicentral intensity. This difference is attributed to the different source terms used by the two studies rather than to different attenuation patterns: unlike the IE parameter of Pasolini et al. (2008b), which is defined as the intensity expected at the epicentre, the "conventional" epicentral intensity Io is often not consistent with the intensity predicted by the attenuation relationship at R = 0 (Table 3, Pasolini et al. 2008a). However, when examining the effect of the two predictive equations on seismic hazard, some discrepancies were observed in certain parts of Italy. D'Amico et al. (2009) suggest that these differences should be attributed more to the different parameterization of the source term (Io for Gomez-Capera 2006; IE for Pasolini et al. 2008b, Table 3) and its consequent effect on seismicity rate calculations, rather than to the different values of standard deviation.

The models developed by Gomez-Capera et al. (2008a, b; 2009) were applied to evaluate earthquake parameters of historical earthquakes from IDPs. The first model (Gomez-Capera et al. 2008a) was proposed for offshore events of the CPTI11 catalogue (Rovida et al. 2011). This model simplified the functional form by eliminating the absorption coefficient while allowing the geometric coefficient to remain a free parameter (Gomez-Capera et al. 2008b, Table 3). Gomez-Capera et al. (2009) calibrated a more recent intensity attenuation model within the context of the EC NERIES Project and the European Earthquake Catalog SHEEC (Stucchi et al. 2013). To derive epicentral parameters from IDPs, this earthquake catalogue used the Boxer method (Gasperini et al. 1999) as well as the Bakun and Wentworth method (1997), both of which had been properly calibrated. While this Italian relationship was validated on an independent dataset, it was disregarded as its accuracy could be further increased with a larger dataset. The European EPICA catalogue (Rovida and Antonucci 2021; Rovida et al. 2022b) retains exactly the same principles and procedures adopted for the compilation of SHEEC, of which it is an update.

3 Data

The input dataset considered in this work was extracted from DBMI, the Italian Macroseismic Database (Locati et al. 2022 and references therein). The latest version of DBMI is DBMI15 v4.0 and it provides access to 123,981 IDPs, related to 3229 earthquakes that occurred between the year 1000 and 2020. When data of instrumental origin is missing, the intensity data of DBMI is used to compile the earthquake parameters of CPTI, the Parametric Catalogue of Italian Earthquakes (Rovida et al. 2016, 2019, 2020, 2021, 2022a). CPTI contains homogeneous macroseismic and instrumental data for earthquakes with a maximum macroseismic intensity Imax ≥ 5 MCS or a magnitude Mw ≥ 4.0 that occurred on Italian territory. The latest version of CPTI is CPTI15 v4.0 and it provides access to parameters related to 4894 earthquakes between the year 1000 and 2020. In order to ensure the accuracy of the new macroseismic intensity attenuation model, a carefully selected input calibration dataset has been compiled using data from DBMI and considering CPTI parameters. This dataset satisfies the following general criteria:

  • Earthquakes should cover a wide range of magnitudes.

  • Earthquakes and macroseismic intensity data should have a wide spatial distribution, intended both as national coverage and as a wide distance and azimuthal distribution for a single event.

  • Magnitude and epicentre should preferably be of instrumental origin, with a preference for Mw.

  • The set of IDPs associated with each considered earthquake should be as large as possible.

The compilation of the calibration dataset from CPTI15 and DBMI15 with well-defined criteria ensures the reliability of the new macroseismic intensity attenuation model in the seismic hazard framework.

To accurately calibrate a macroseismic intensity attenuation model, a careful selection of shallow tectonic earthquakes and IDPs was carried out according to various criteria (Gomez-Capera 2006; Gomez-Capera et al. 2010):

  • Earthquakes in the volcanic area of Mount Etna were excluded since the attenuation pattern in this zone is distinct from that of active crustal regions (Ciccotti et al. 2000).

  • IDPs with intensity I < 3 were discarded to avoid data incompleteness.

  • Events with accumulated effects due to damaging aftershocks were removed (Locati et al. 2011), as were earthquakes with focal depth > 35 km.

  • Earthquakes with an azimuthal gap in the macroseismic intensity distribution due to a lack of information from sparsely populated areas and off-shore epicentres were excluded, as were cross-national earthquakes due to the incomplete distribution of IDPs and ill-constrained epicentre location (Fig. 1).

  • Earthquakes characterised by low Io and/or small Mw were removed, since the study focuses on strong earthquakes.

  • Earthquakes with fewer than 12 macroseismic observations were removed because events with a low number of IDPs could bias the regression analysis.

  • Macroseismic data associated with special cases identified by DBMI15 (e.g. TE [large territory], IB [isolated building], SS [small settlement], MS [multiple settlement], DL [deserted locality], AL [absorbed locality], CQ [city quarter]) should be evaluated carefully, since the statistical nature of macroseismic intensity may not be fulfilled. For this reason, such macroseismic data were excluded.

  • Likewise, special care was taken when dealing with non-conventional macroseismic intensities (e.g. Felt, Damage, etc.).
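The threshold-type criteria in the list above can be read as a sequence of filters. The following is a minimal sketch under stated assumptions: the record fields (`event_id`, `intensity`, `depth_km`, `special_case`) are hypothetical and are not the actual DBMI15 or CPTI15 column names, and the minimum number of IDPs per event is assumed to be counted after the point-level filters. Criteria that require expert judgement (azimuthal gaps, cumulative aftershock effects, non-conventional intensities) are not covered.

```python
from collections import Counter

# Special-case codes quoted in the text; their use as a plain set is an assumption.
SPECIAL_CASES = {"TE", "IB", "SS", "MS", "DL", "AL", "CQ"}


def select_idps(idps, min_intensity=3.0, max_depth_km=35.0, min_idps=12):
    """Return the IDPs that pass the threshold-type selection criteria.

    Each IDP is a dict with the hypothetical keys 'event_id', 'intensity',
    'depth_km' (None when unknown) and 'special_case' (None or a code).
    """
    kept = [
        p for p in idps
        if p["intensity"] >= min_intensity
        and (p["depth_km"] is None or p["depth_km"] <= max_depth_km)
        and p["special_case"] not in SPECIAL_CASES
    ]
    # Assumption: events are dropped when fewer than `min_idps` points
    # survive the point-level filters above.
    counts = Counter(p["event_id"] for p in kept)
    return [p for p in kept if counts[p["event_id"]] >= min_idps]
```

Keeping the thresholds as keyword arguments makes it easy to test how sensitive the calibration dataset is to each criterion.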

Fig. 1 Examples of earthquakes excluded from the calibration process because of an azimuthal gap in the IDP distribution (off-shore and cross-national events): a 1988.04.13 Golfo di Taranto; b 1991.11.20 Grigioni, Vaz

The criterion of using I ≥ 3 is supported by the analysis of the intensity distribution, which reveals the "incompleteness" of data for intensity class I = 2 compared to I = 3 (Locati et al. 2022). The calibration dataset used in this study contains a significant number of records with I = 3 (1,793 records or 11%), while 516 records with I = 2 were not utilised. High magnitude earthquakes, such as the 1976.05.06 Friuli event (Mw = 6.4) and the 1962.08.21 Irpinia event (Mw = 6.2), lack IDPs for I = 2 and I = 2–3 (Archivio Macrosismico GNDT 1995; Gizzi 2012). These findings align with previous studies (Gomez-Capera 2006; Gomez-Capera et al. 2010; Table 3) and are consistent with the existing literature (Bakun and Wentworth 1997; Mezcua 2004; Bakun 2006; Bakun and Scotti 2006; Table 2). These criteria are applied to capture the main physical characteristics of the macroseismic data distribution for each earthquake: the magnitude of the earthquake and its corresponding macroseismic intensity attenuation trend with distance.

Following the application of the selection criteria, 16,260 IDPs related to 119 earthquakes (Fig. 2a, b) that occurred between 1908 and 2013 with reliable instrumentally recorded or scaled magnitude (Mw) are used to calibrate the coefficients of the macroseismic intensity attenuation model. The instrumental parameters of the earthquakes (CPTI15) used for the calibration process are provided in xlsx format in the electronic supplement (Table A1). Figure 3 provides descriptive statistics and graphical representations of the input dataset containing 119 earthquakes and their 16,260 IDPs selected from CPTI15 and DBMI15 respectively. As shown in Fig. 3a, the moment magnitude (Mw) of the 119 earthquakes ranges from 3.8 to 7.1. Seven events have Mw greater than 6.0. These were the 1962.08.21 Irpinia (Mw 6.2), 1976.05.06 Friuli (Mw 6.4), 1920.09.07 Garfagnana (Mw 6.5), 1930.07.23 Irpinia (Mw 6.6), 1980.11.23 Irpinia-Basilicata (Mw 6.8), 1915.01.13 Marsica (Mw 7.0), and 1908.12.28 Stretto di Messina (Mw 7.1) earthquakes; their distribution of IDPs and macroseismic intensity versus epicentral distance, obtained from instrumental epicentre, are shown in Fig. A1 of the electronic supplement. The maximum Mw of 7.1 was recorded in the 1908.12.28 Stretto di Messina earthquake, while the minimum (Mw 3.8) was recorded in the 1998.02.08 Aspromonte event, located in the Peninsula of Calabria in Southern Italy.

Fig. 2 a Location of the 119 post-1900 (1908–2013) earthquakes (3.8 ≤ Mw ≤ 7.1) used in the calibration process (events from CPTI15, Rovida et al. 2022a). b Distribution of the Intensity Data Points (16,260 IDPs) used for calibration of the macroseismic intensity attenuation model (macroseismic data from DBMI15, Locati et al. 2022)

Fig. 3 Histograms and distribution of the input data in the calibration dataset. For the 119 earthquakes from the CPTI15 catalogue: histograms of a magnitude Mw and b focal depth. For the 16,260 IDPs from DBMI15: histograms of c macroseismic intensity and d epicentral distance (in km), and e plot of intensity I (MCS) versus epicentral distance in log scale. Finally, plots of f intensity I (MCS) versus magnitude Mw and g epicentral distance (km) versus magnitude Mw

Of the 119 input earthquakes, 83 have focal depth data in CPTI15. The frequency distribution of these focal depths is shown in Fig. 3b; the median value is h = 10 km. As for the 16,260 IDPs, the frequency distributions of their macroseismic intensities and epicentral distances are given in Fig. 3c and d, respectively. The intensity ranges between 3 and 11 MCS, with a median value of 5 MCS. The distribution of the epicentral distances is asymmetric and the majority (98%) of IDPs are located within 252 km of the epicentre, reflecting the decay of the number of intensity observations at long distances. Figure 3c may indicate incompleteness for intensities I = 3 and 3–4. Specifically, the proportions of IDPs with intensity I = 3, 3–4, 4, and 4–5 are 11%, 8%, 18.4%, and 11.2%, respectively, corresponding to 1793, 1307, 2993, and 1821 IDPs. To investigate the extent of this deficit, the site intensities are shown versus epicentral distances (Fig. 3e, in log scale) and versus magnitudes (Fig. 3f); the epicentral distances are also shown versus magnitudes (Fig. 3g). Intensities I = 3 and 3–4 cover a broad range of magnitudes and distances, which suggests that earthquakes with varying magnitudes and epicentral distances have been perceived with similar intensity levels.

The following table summarises the range of the calibrated model:

| Time range | No. of events | No. of IDPs | Mw range | Repi range (km) | Focal depth h (km) | I range | Imax/Io range |
| --- | --- | --- | --- | --- | --- | --- | --- |
| [1908–2013] | 119 | 16,260 | [3.8–7.1] | [0.1–634] | h < 35 | [3–11] | [5–11] |
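When the calibrated models are applied, it can be useful to check that a prediction request falls inside the calibration range summarised above. The snippet below is a minimal sketch of such a check; the function and constant names are hypothetical, and only the Mw, Repi and focal depth limits are tested.

```python
# Limits copied from the summary of the calibration range given in the text.
MW_RANGE = (3.8, 7.1)
REPI_RANGE_KM = (0.1, 634.0)
MAX_DEPTH_KM = 35.0  # the calibration dataset only contains events with h < 35 km


def within_calibration_range(mw, repi_km, depth_km=None):
    """Return True if (mw, repi_km, depth_km) lies inside the calibration range.

    depth_km may be None when the focal depth is unknown; in that case the
    depth limit is not checked.
    """
    if not MW_RANGE[0] <= mw <= MW_RANGE[1]:
        return False
    if not REPI_RANGE_KM[0] <= repi_km <= REPI_RANGE_KM[1]:
        return False
    if depth_km is not None and depth_km >= MAX_DEPTH_KM:
        return False
    return True
```

Predictions outside these limits are extrapolations and should be treated with corresponding caution.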

4 Method

This study analyses two attenuation functional forms proposed by Howell and Schultz (1975). Their study assessed the pattern of attenuation in macroseismic intensity on the basis of an empirical relationship between macroseismic intensity and energy, and of the classical equation for energy decay:

$$E = \left( {\frac{{E_{0} }}{4\pi }} \right)R_{hyp}^{ - b} 10^{{ - cR_{hyp} }}$$
(1)

where E0 is the total energy, Rhyp is the hypocentral distance, b is a constant for geometric spreading, and c is a constant for the rate of absorption. Howell and Schultz (1975) proposed two empirical equations that relate seismic intensity, expressed as macroseismic intensity (I), to seismic energy density (E):

  i. The first equation is a logarithmic relationship, in which I is proportional to the logarithm of E:

    $$I = k_{1} + k_{2} LogE$$
    (2)

  ii. The second equation is an alternative power relationship, in which I is proportional to a power of E:

    $$I = k_{3} + k_{4} E^{{k_{4} }}$$
    (3)

These equations were developed under the assumption that the seismic energy generated during an earthquake is radiated from a point source into a space of simple geometry, such as a uniform hemisphere or a medium layered only as a function of focal depth. Although the macroseismic intensity is conceived in terms of observable qualitative effects, these equations suggest that the steps of the macroseismic intensity scale approximate an even progression of some kind in terms of energy.
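The link between the energy decay of Eq. (1) and a distance-dependent intensity can be made explicit by substituting Eq. (1) into Eq. (2). The following worked step is a sketch that assumes Log denotes the base-10 logarithm and introduces the shorthand A, which is not part of the original notation:

$$I = k_{1} + k_{2} Log\left( {\frac{{E_{0} }}{4\pi }} \right) - k_{2} b\,LogR_{hyp} - k_{2} c\,R_{hyp} = A - k_{2} b\,LogR_{hyp} - k_{2} c\,R_{hyp}$$

with \(A = k_{1} + k_{2} Log\left( {E_{0} /4\pi } \right)\). The intensity therefore decays with one term that is logarithmic in distance (geometric spreading) and one that is linear in distance (absorption), while the source size enters only through the constant A.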

The state of the art of macroseismic intensity modelling shows an empirical relationship between the intensity (I) and pseudo hypocentral distance (R), as an approximation of hypocentral distance. Most models are similar to the Logarithmic-Linear model (Log-Lin) and use the intensity at the epicentre (Io) as seen in early macroseismic models (Table 1). The following two empirical macroseismic intensity attenuation models (4) and (5) were derived from (2) and (3), respectively, using the earthquake size parameter Mw (Howell and Schultz 1975; Gomez-Capera and Salcedo Hurtado 2002):

  i. Macroseismic intensity attenuation model type 1

    $$I = a_{1} - b_{1} LogR - c_{1} R + d_{1} Mw$$
    (4)

    where an additive Normal error is implicit (“Appendix A1”).

  ii. Macroseismic intensity attenuation model type 2

    $$LogI = a_{2} - b_{2} LogR - c_{2} R + d_{2} LogMw$$
    (5)

    where an additive Normal error is implicit. Model (5) is equivalent to the multiplicative model

    $$I = a_{2}^{\prime} R^{{ - b_{2} }} 10^{{ - c_{2} R}} Mw^{{d_{2} }}$$
    (6)

    with \(a_{2}^{\prime} = 10^{{a_{2} }}\) and multiplicative LogNormal error (“Appendix A1”).

The classical functional form of macroseismic intensity attenuation, described by Eq. (4), is widely used in the literature, from Kövesligethy (1906) to the most recent models calibrated in Mw (Atkinson et al. 2014; Baumont et al. 2018; Gomez-Capera et al. 2020; Mezcua et al. 2020; Table 2). This form is similar to the one used by GMPEs, taking into account geometric spreading and anelastic attenuation.

Equation (5) is an alternative linear functional form for the attenuation of the logarithm of macroseismic intensity, which increases with the logarithm of magnitude Mw. Taking the logarithm of both sides of Eq. (6) yields Eq. (5). This functional form is less common in the literature, but it has been applied successfully at regional scale by Howell and Schultz (1975) for San Andreas, Cordillera, and eastern provinces in the United States and Canada; Kaila and Sarkar (1982) for the United States; Chavez and Castro (1988) for subduction zones in the south-central region of Mexico and the Trans-Mexican Volcanic Belt; Greenhalgh et al. (1989) for Australian states; and Gomez-Capera and Salcedo Hurtado (2002) for events with h < 60 km in the Colombian Andes. There is a wide range of alternatives to Eqs. (4) and (6), as discussed in Howell and Schultz (1975) and Cua et al. (2010), among others.

In Eqs. (4) and (6) for modelling macroseismic intensity, the coefficient a is calibrated to reflect boundary conditions of the source; b defines the rate of geometric spreading of energy; c is associated with anelastic attenuation; and d represents magnitude dependence. The pseudo hypocentral distance R is calculated as the square root of the sum of the squares of the epicentral distance Repi and the pseudo focal depth h (in km): \(R = \sqrt {R_{epi}^{2} + h^{2} }\).
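The two functional forms and the pseudo hypocentral distance can be written as small functions. This is a minimal sketch: the function names are hypothetical, Log is taken as the base-10 logarithm, and the coefficients are left as arguments because the calibrated values are not reproduced here.

```python
import math


def pseudo_hypocentral_distance(repi_km, h_km):
    """R = sqrt(Repi^2 + h^2), in km."""
    return math.hypot(repi_km, h_km)


def intensity_type1(repi_km, mw, a1, b1, c1, d1, h1):
    """Model type 1, Eq. (4): I = a1 - b1*Log(R) - c1*R + d1*Mw."""
    r = pseudo_hypocentral_distance(repi_km, h1)
    return a1 - b1 * math.log10(r) - c1 * r + d1 * mw


def intensity_type2(repi_km, mw, a2, b2, c2, d2, h2):
    """Model type 2, Eq. (5): Log(I) = a2 - b2*Log(R) - c2*R + d2*Log(Mw).

    Returns I itself, i.e. 10 raised to the right-hand side of Eq. (5),
    which is the multiplicative form of Eq. (6).
    """
    r = pseudo_hypocentral_distance(repi_km, h2)
    log_i = a2 - b2 * math.log10(r) - c2 * r + d2 * math.log10(mw)
    return 10.0 ** log_i
```

With purely illustrative coefficients, `intensity_type1(0.0, 6.0, a1=1.0, b1=2.0, c1=0.0, d1=1.0, h1=10.0)` evaluates the type 1 model at the epicentre, where R equals the pseudo focal depth.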

The resulting macroseismic attenuation models (4) and (6) are functions of both epicentral distance Repi and moment magnitude Mw, namely \(I = f\left( {R_{epi} , Mw} \right)\), where (\(a_{1} ,b_{1} , c_{1} , d_{1} , h_{1}\)) and (\(a_{2}^{\prime} ,b_{2} , c_{2} , d_{2} , h_{2}\)) are coefficients to be estimated. By assuming that all coefficients are positive except \(a_{1}\) which can be any number, the difference between Eqs. (4) and (6) lies in the curvature of the variation of macroseismic intensity with epicentral distance (Repi) and moment magnitude Mw. The slope of the macroseismic intensity I with respect to Repi is expressed by the first order derivative, given by Eq. (7) for model (4) and (8) for model (6):

$$dI/dR_{epi} = - \left[ {b_{1} /\left( {R ln10} \right) + c_{1} } \right] \cdot \left( {R_{epi} /R} \right)$$
(7)
$$dI/dR_{epi} = - I \cdot \left[ {b_{2} /R + c_{2} ln10} \right] \cdot \left( {R_{epi} /R} \right)$$
(8)

Slope (7) depends solely on the epicentral distance Repi, whereas slope (8) depends on moment magnitude Mw as well. This implies that, unlike model (4), the curvature of the intensity attenuation model (6) changes with Mw. Nevertheless, (7) and (8) share some properties. Both slopes are approximately zero very close to the epicentre, where the intensity \(I = f\left( {R_{epi} , Mw} \right)\) is almost constant. As analytical functions, they asymptotically approach a constant value, namely \(- c_{1}\) for model (4) and zero for model (6). This indicates that, for very large values of Repi, the analytical function (4) has a linear decreasing trend and assumes negative values; consequently, in our application to macroseismic data, function (4) must be truncated at the distance where the intensity reaches its minimum value (I = 3 in this study). Model (6) must be truncated analogously, because its analytical function tends asymptotically to zero.
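The slope expressions and the truncation distance can be checked numerically. The sketch below compares the analytical slopes of models (4) and (6), the latter written with that model's own coefficients b2 and c2, against a central finite difference, and finds by bisection the epicentral distance at which model (4) drops to I = 3. All coefficient values are illustrative placeholders, not the calibrated ones, and the function names are hypothetical.

```python
import math

LN10 = math.log(10.0)


def model4(repi, mw, a, b, c, d, h):
    """Eq. (4) with R = sqrt(Repi^2 + h^2)."""
    r = math.hypot(repi, h)
    return a - b * math.log10(r) - c * r + d * mw


def slope4(repi, b, c, h):
    """Analytical dI/dRepi for model (4)."""
    r = math.hypot(repi, h)
    return -(b / (r * LN10) + c) * (repi / r)


def model6(repi, mw, a, b, c, d, h):
    """Eq. (6), with a'_2 = 10**a."""
    r = math.hypot(repi, h)
    return 10.0 ** a * r ** (-b) * 10.0 ** (-c * r) * mw ** d


def slope6(repi, mw, a, b, c, d, h):
    """Analytical dI/dRepi for model (6)."""
    r = math.hypot(repi, h)
    return -model6(repi, mw, a, b, c, d, h) * (b / r + c * LN10) * (repi / r)


def central_difference(f, x, step=1e-4):
    """Numerical first derivative of f at x."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def truncation_distance(mw, params, i_min=3.0, upper_km=5000.0, tol=1e-6):
    """Epicentral distance (km) at which model (4) drops to i_min.

    Uses bisection and assumes the model is below i_min at upper_km.
    """
    lower, upper = 0.0, upper_km
    if model4(lower, mw, **params) <= i_min:
        return 0.0
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if model4(mid, mw, **params) > i_min:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)


# Illustrative placeholder coefficients (assumptions, not calibrated values).
P4 = dict(a=1.0, b=2.5, c=0.002, d=1.2, h=10.0)
P6 = dict(a=0.3, b=0.15, c=0.0005, d=0.9, h=9.0)
```

For example, `central_difference(lambda x: model4(x, 6.0, **P4), 50.0)` can be compared with `slope4(50.0, P4["b"], P4["c"], P4["h"])`, and `truncation_distance(6.0, P4)` gives the distance beyond which model (4) would predict I < 3 for these placeholder coefficients.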

As for the derivatives of models (4) and (6) with respect to Mw, they are respectively given by

$$dI/dMw = d_{1}$$
(9)
$$dI/dMw = I \cdot d_{2} /Mw$$
(10)

These slopes show that the intensity of model (4) grows linearly, at the constant rate \(d_{1}\), as Mw increases, whereas the intensity of model (6) grows nonlinearly, at a rate that increases with Mw if \(d_{2} > 1\) and decreases with Mw otherwise.
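The behaviour of the rate in Eq. (10) can be verified by differentiating once more with respect to Mw. As a short worked step (a sketch that keeps R fixed and assumes positive I and \(d_{2}\)):

$$\frac{{d^{2} I}}{{dMw^{2} }} = \frac{{d_{2} }}{Mw}\frac{dI}{{dMw}} - \frac{{I \cdot d_{2} }}{{Mw^{2} }} = \frac{{I \cdot d_{2} \left( {d_{2} - 1} \right)}}{{Mw^{2} }}$$

The sign of this expression is the sign of \(d_{2} - 1\): the slope (10) increases with Mw when \(d_{2} > 1\), is constant when \(d_{2} = 1\), and decreases when \(0 < d_{2} < 1\), although the intensity itself still grows with Mw in all three cases.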

5 Results

This section is devoted to determining which of the two models chosen in this study is the better predictor of macroseismic intensity attenuation as a function of moment magnitude and hypocentral distance (roughly defined by the epicentral distance Repi and the pseudo focal depth h). To this end, the coefficients (a, b, c, d, h) of models (4) and (5), along with their standard errors, are estimated from the input dataset by applying a nonlinear least-squares regression procedure based on the Levenberg–Marquardt algorithm (Synergy Software 2021; Wolfram 2022). The optimization procedure searches for the best-fitting values of the coefficients among all real numbers. The obtained values of the coefficients, as well as their standard errors, are summarised in the electronic supplement (Table A2) and the corresponding calibrated relationships in Table 4. The estimated standard deviation (σ) of the residuals provides a measure of the precision of the model, with a lower standard deviation indicating a more precise model.
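To illustrate how the five coefficients of model (4) can be estimated, the sketch below uses a deliberately simpler substitute for the Levenberg–Marquardt procedure and does not reproduce the software used in this study. It exploits the fact that, for a fixed pseudo focal depth h, Eq. (4) is linear in (a, b, c, d): ordinary least squares is solved for each h on a user-supplied grid, and the h with the smallest sum of squared residuals is retained. The function names and the grid are assumptions, and no standard errors are computed.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_fixed_h(data, h):
    """Least-squares (a, b, c, d) of Eq. (4) for a fixed pseudo focal depth h.

    data is a list of (repi_km, mw, intensity) tuples.
    Returns the coefficient list and the sum of squared residuals.
    """
    rows = []
    for repi, mw, inten in data:
        r = math.hypot(repi, h)
        # Regressors arranged so that the solution is directly (a, b, c, d).
        rows.append(([1.0, -math.log10(r), -r, mw], inten))
    n = 4
    ata = [[sum(x[i] * x[j] for x, _ in rows) for j in range(n)] for i in range(n)]
    aty = [sum(x[i] * y for x, y in rows) for i in range(n)]
    coeffs = solve_linear(ata, aty)
    sse = sum((y - sum(c * xi for c, xi in zip(coeffs, x))) ** 2 for x, y in rows)
    return coeffs, sse


def fit_model4(data, h_grid):
    """Profile the fit over h_grid; return [a, b, c, d, h] and the best SSE."""
    best = min(((h,) + fit_fixed_h(data, h) for h in h_grid), key=lambda t: t[2])
    h, coeffs, sse = best
    return coeffs + [h], sse
```

A grid search over h is coarse compared with a full nonlinear optimisation, but it makes the role of h transparent and can provide a starting point for an iterative solver.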

Table 4 Two types of macroseismic intensity attenuation models proposed in the present study (the two best models are in bold): \(I = f\left( {R,Mw} \right)\) and \(LogI = f\left( {R,Mw} \right)\) where \(R = \sqrt {R_{epi}^{2} + h^{2} }\)

5.1 The best regressions assuming h as free parameter

The following results have been obtained for the macroseismic intensity attenuation models (4) and (5) considering the pseudo focal depth h as a free parameter:

  • Log-Lin_10 model type 1.

    The regression analysis of model type 1 in Eq. (4) will be hereinafter denoted by Log-Lin_10, because the pseudo focal depth h is estimated to be approximately equal to 10 km (h = 9.87 km ± 5.7%). This is shown in row N = 2 of Tables 4 and A2, where the estimated coefficients of the model and their standard error are also reported. The input data and the model are visualised in Fig. 4a, where the predicted macroseismic intensity (\(I_{comp}\)) and the associated uncertainty (\(I_{comp} \pm \sigma\)) are represented against epicentral distance (Repi) for Mw classes, in logarithmic scale. Let \(I_{obs}\) and \(I_{comp}\) denote the observed and the predicted macroseismic intensity at sites, respectively; the histogram of the residuals (\(I_{obs} - I_{comp}\)) is illustrated in Fig. 4b, with estimated standard deviation \(\sigma =\) 0.748 derived from the root mean square of the residuals:

    $$RMS = \sqrt {\left[ {\mathop \sum \limits_{i = 1}^{P} \left( {I_{obs,i} - I_{comp,i} } \right)^{2} } \right]/P}$$
    (11)

    where P is the total number of IDPs. Some summary statistics of residuals are given in Fig. 4c.

    Fig. 4

    Type 1 model Log-Lin_10: a 2D cross section plots of \(I=f({R}_{epi},Mw)\) for Mw classes, at logarithmic distance scale; with estimated standard deviation σ = 0.748; grey dots are the 16,260 IDPs in the input dataset; b histogram of the residuals between observed and computed intensity (\({I}_{obs}-{I}_{comp}\)) and c related summary statistics

  • CRV9 model type 2.

    The regression analysis of model type 2 in Eq. (5) will be hereinafter denoted by CRV9, because the estimated pseudo focal depth is approximately equal to 9 km (h = 8.72 km ± 7.9%). This is shown in row N = 5 of Tables 4 and A2, where the estimated coefficients of the model and their standard errors are also reported. These estimates also apply to the equivalent model (6), bearing in mind that \(a_{2}^{\prime} = 10^{{a_{2} }}\). Figure 5a illustrates both the input data and the estimated model (6), where the predicted macroseismic intensity and the associated uncertainty (\(I_{comp} \pm \sigma\)) are plotted against epicentral distance (Repi) for Mw classes, in logarithmic scale. The histogram of the residuals (\(I_{obs} /I_{comp}\)) for model (6) is shown in Fig. 5d, with a standard deviation of 0.149 computed from the entire dataset and 0.15 obtained from the theoretical formula (A1.4) in “Appendix 1”. Summary statistics of these residuals are given in Fig. 5e. For easy comparison with Fig. 4b, the histogram of the residuals (\(I_{obs} - I_{comp}\)) is also shown in Fig. 5b, with a standard deviation of 0.731, together with the corresponding summary statistics in Fig. 5c.

    Fig. 5

    Type 2 model CRV9: a 2D cross section plots of \(I=f\left({R}_{epi} ,Mw\right)\) for Mw classes, at logarithmic distance scale; with estimated standard deviation σ = 0.731; grey dots are the 16,260 IDPs in the input dataset; b Histogram of the residuals (\({I}_{obs}-{I}_{comp}\)) and c related summary statistics; d Histogram of the residuals between observed and computed intensity (\({I}_{obs}/{I}_{comp}\)) and e related summary statistics
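
The two kinds of residuals used above can be computed in a few lines. The sketch below evaluates the RMS of Eq. (11) for additive residuals and forms the ratio residuals used for the multiplicative model (6). The function names and the toy intensity values are illustrative assumptions, not data from the study.

```python
import math

def rms_additive(i_obs, i_comp):
    """Eq. (11): root mean square of the residuals I_obs - I_comp."""
    p = len(i_obs)  # total number of IDPs
    return math.sqrt(sum((o - c) ** 2 for o, c in zip(i_obs, i_comp)) / p)

def ratio_residuals(i_obs, i_comp):
    """Residuals I_obs / I_comp of the multiplicative model (6)."""
    return [o / c for o, c in zip(i_obs, i_comp)]

def a2_prime(a2):
    """Constant term of model (6) from that of model (5): a2' = 10**a2."""
    return 10.0 ** a2

# Toy values, for illustration only.
i_obs = [5.0, 6.0, 7.0, 4.0]
i_comp = [5.5, 5.5, 7.0, 4.5]
sigma = rms_additive(i_obs, i_comp)
ratios = ratio_residuals(i_obs, i_comp)
```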

The values of a, b, c, d, and of the pseudo focal depth h differ between the two calibrated models (Log-Lin_10 and CRV9), which indicates that each equation relates the variables R and Mw to the macroseismic intensity (I) in a different way. Figure 4a illustrates the Log-Lin_10 model for different Mw classes; according to this model, given by Eq. (4), the macroseismic intensity I is directly proportional to Mw and decreases as R increases. Similarly, Fig. 5a shows the CRV9 model, given by Eq. (6), for different Mw classes; in this case, the macroseismic intensity I is proportional to a power of Mw and decreases as R increases.

Since the estimated coefficients are all positive (Table A2), the remarks on the curvatures given in Sect. 4 apply to both estimated models. Figure 6 compares the slopes (\(dI/dR_{epi}\)) of the two models as a function of epicentral distance (Repi). As already noted, the CRV9 model, unlike Log-Lin_10, has a different slope for each Mw class. As might be expected, the single slope curve of Log-Lin_10 behaves as an average of the CRV9 slopes over the Mw classes.

Fig. 6

Comparison of the slopes for the macroseismic intensity attenuation models Log-Lin_10 with h = 9.87 km (green solid line) and CRV9 with h = 8.72 km (dashed lines with different colours for Mw classes), according to Eqs. (7) and (8): Logarithmic scale of the epicentral distance

In both cases, near the epicentral area the slopes decrease rapidly, although only slightly, to a minimum (at about 10.1 km for Log-Lin_10 and 8.1 km for CRV9) and then increase just as quickly towards a constant value. A change in concavity is also observed for both models (at about 17.4 km for Log-Lin_10; difficult to calculate exactly for CRV9, but likely at much less than 20 km), from downward concavity near the epicentral area to upward concavity further away.
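
The position of the slope minimum can be explored numerically. The sketch below assumes a Log-Lin form I = a − b·log10(R) − c·R + d·Mw with R = sqrt(Repi² + h²), which is a guess at Eq. (4), and uses hypothetical values of b and c, so the numbers it produces are not those of Fig. 6. Under this assumed form, the geometric term alone places the minimum at Repi = h, and a positive anelastic coefficient c moves it slightly outward.

```python
import math

def slope(repi, b, c, h):
    """dI/dRepi for the ASSUMED form I = a - b*log10(R) - c*R + d*Mw."""
    r2 = repi * repi + h * h  # R squared
    return -(b / math.log(10)) * repi / r2 - c * repi / math.sqrt(r2)

def argmin_slope(b, c, h, step=0.01, r_max=100.0):
    """Epicentral distance (km) of the slope minimum, found on a regular grid."""
    grid = [k * step for k in range(int(r_max / step) + 1)]
    return min(grid, key=lambda x: slope(x, b, c, h))

# Hypothetical coefficients; only h = 9.87 km is taken from the text.
repi_min = argmin_slope(b=2.5, c=0.003, h=9.87)
repi_min_geometric_only = argmin_slope(b=2.5, c=0.0, h=9.87)
```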

5.2 Sensitivity analysis of the best regressions

5.2.1 Simplification of the functional and the Log model

Previous studies (Tables 1, 2, 3) have demonstrated that further simplifications can be made to the Log-Linear functional to give more weight to the geometric attenuation. We conducted a one-component logarithmic regression of the hypocentral distance for both type 1 and 2 models. In Fig. 7, the type 1 Log model with only geometric attenuation (N = 8 in Table 4) was seen to be comparable to the Log-Lin_10 model (N = 2 in Table 4) up to a distance of 300 km, after which the model diverged and reached an epicentral distance of 700 km for macroseismic intensity I = 3. In this type 1 Log model (N = 8), the pseudo focal depth was modelled as h = 16.6 km, with a higher standard deviation (0.751) when compared to the Log-Lin_10 model (0.748). The type 2 model with geometric coefficient only (N = 9 in Table 4) is similar to the CRV9 model with both geometric and anelastic attenuation (N = 5) up to an epicentral distance of 200 km (Fig. 7). Beyond this, the model diverges up to 1000 km for macroseismic intensity I = 3. For this type 2 model with geometric coefficient only, the pseudo focal depth is computed as h = 16.2 km and its standard deviation (0.735) is slightly higher than that of the CRV9 model (0.731). It is clear from Fig. 7 that there is an attenuation deficit in the macroseismic far field due to the lack of anelastic coefficient compared to the two best regressions, particularly for strong earthquakes. The importance of the linear attenuation component with distance (anelastic attenuation) is highlighted in the macroseismic far field (Repi \(\ge 200\) km), complementing the relevance of the geometric attenuation in the macroseismic near field. This suggests that the regressions with only a geometric attenuation are not viable in the present study. However, we take h = 16 km, an approximation of the estimated pseudo focal depth, as the reference threshold for the following sensitivity tests.

Fig. 7

Comparison between type 1 and type 2 Log models for a magnitude Mw 7.0 event, with just geometric attenuation (N = 8 in yellow dashed line, and N = 9 in blue dashed line; Table 4), and the geometric-anelastic attenuation models Log-Lin_10 for type 1 (N = 2 in green solid line; Table 4) and CRV9 for type 2 (N = 5 in red solid line; Table 4): Logarithmic scale of the epicentral distance

5.2.2 Macroseismic data completeness in the far field and the Log-Lincut dist I = 3 model

A trial regression was conducted to reduce the effect of incomplete data, following Gasperini (2001). In that paper, macroseismic data from earthquakes of both the historical and the post-1900 periods were used as input, and models that do not depend on the moment magnitude Mw were analysed (Table 3). Pasolini et al. (2008b) also applied distance-based criteria to avoid incomplete macroseismic data while using historical-period events and Mw in the regression. In this study, these criteria were also used to obtain a dataset as complete as possible (Gomez-Capera 2006): for each macroseismic intensity at the epicentre (Io), the distance for I = 3, called Dist_I3, was determined using the relationship by Pasolini et al. (2008b), a function of IE which in turn is a function of Mw or Io (Table 3); IDPs with D ≥ Dist_I3 were disregarded. This restriction reduced the input data from 16,260 to 12,587 IDPs. The resulting model, named Log-Lincut dist I = 3 (N = 7 in Table 4), is of Log-Linear type 1 and has a slightly higher standard deviation (0.771) than Log-Lin_10 (0.748), despite the reduced dataset. The pseudo focal depth was estimated as h = 11.3 km. The calibrated coefficients have values similar to those of the best Log-Linear model Log-Lin_10, but with greater uncertainties (Table A2).
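
The distance cut itself is a simple filter. In the sketch below, the mapping from Io to Dist_I3 is a hypothetical placeholder: in the study it comes from the relationship of Pasolini et al. (2008b), which is not reproduced here. The field names of the IDP records are also assumptions.

```python
def apply_distance_cut(idps, dist_i3_by_io):
    """Keep only IDPs closer than Dist_I3 for their event's epicentral intensity.

    IDPs with distance >= Dist_I3 are disregarded, as described in the text.
    """
    return [p for p in idps if p["repi_km"] < dist_i3_by_io[p["io"]]]

# HYPOTHETICAL thresholds (km) and toy IDPs, for illustration only.
dist_i3_by_io = {6: 80.0, 8: 200.0}
idps = [
    {"io": 8, "repi_km": 40.0},
    {"io": 8, "repi_km": 210.0},  # beyond Dist_I3 for Io = 8, dropped
    {"io": 6, "repi_km": 90.0},   # beyond Dist_I3 for Io = 6, dropped
    {"io": 6, "repi_km": 30.0},
]
kept = apply_distance_cut(idps, dist_i3_by_io)
```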

We recall that this study analyses post-1900 earthquakes and their macroseismic data, with regressions calibrated in Mw. At great epicentral distances, the macroseismic data derive essentially from large-magnitude events, as demonstrated by the IDPs between 400 and 634 km epicentral distance (Figs. 3d, A1), which come from the 1915.01.13 Marsica (Mw = 7.0), 1930.07.23 Irpinia (Mw = 6.6), 1976.05.06 Friuli (Mw = 6.4) and 1980.11.23 Irpinia (Mw = 6.8) earthquakes (Table A1; Rovida et al. 2022a; Locati et al. 2022). The macroseismic intensity attenuation model incorporating both geometric and anelastic coefficients offers the most accurate representation of macroseismic intensity attenuation for a wide range of macroseismic intensity distribution and Mw values, encompassing near and far macroseismic field, as well as moderate and strong earthquakes.

5.2.3 Varying the pseudo focal depth (h = 5 km and 16 km)

In this section we investigate the influence of hypocentral distance on the best regressions to ensure an acceptable level of uncertainty in the estimates of the intensity attenuation model. Additional regressions were performed to model intensity with pseudo focal depths of 5 km (Log-Lin_5) and 16 km (Log-Lin_16) as fixed input parameters and compared with the best regressions Log-Lin_10 and CRV9.

5.2.3.1 Log-Lin_5 and Log-Lin_16 models type 1

For the Mw range 4.0–7.0, Fig. 8 compares the macroseismic attenuation trends of the best-fitting model Log-Lin_10 with two type 1 Log-Lin models (Eq. 4), named Log-Lin_5 and Log-Lin_16, in which h is set equal to 5 km and 16 km respectively (N = 1 and N = 3 in Table 4). The macroseismic intensity attenuation trend is not greatly affected by the choice of pseudo focal depth: in all cases, maximum intensities for h = 5 km differ by less than half a degree of macroseismic intensity from those for h = 9.87 km and h = 16 km. For epicentral distances Repi > 10 km, the three attenuation models are similar. We can conclude that, in this study, model type 1 is not significantly affected by the pseudo focal depth for \(h \le\) 16 km.

Fig. 8

Sensitivity analysis of the Log-Linear Model (Type 1) to pseudo focal depth h = 5, 9.87, and 16 km, for different values of Mw: a Mw = 7.0; b Mw = 6.0; c Mw = 5.0; d Mw = 4.0

5.2.3.2 CRV5 and CRV16 models type 2

For the type 2 model (Eq. 6), the near-field macroseismic intensity attenuation trend is affected by both pseudo focal depth (h) and magnitude (Mw). Table 4 shows the results of the CRV5 and CRV16 models (N = 4, 6; fixed h = 5 and 16 km, respectively) and compares them with the best type 2 model CRV9 (N = 5; estimated h = 8.72 km). According to Appendix 1 in the electronic supplement, the standard deviation of the residuals for the additive Normal model (5) of log intensity (Log I) and that for the multiplicative LogNormal model (6) of intensity (I) are calculated respectively from the RMS given in Eq. (11) and from the theoretical formula (A1.4). The standard deviation of the residuals of model (5) is 0.0657, 0.0655, and 0.0657 respectively for CRV5, CRV9, and CRV16 (last column of Table A2, denoted by σ_LogI), whereas the standard deviation of the residuals of model (6) is 0.15 for each of CRV5, CRV9, and CRV16 (last column of Table A2, denoted by σ_Icomp/Iobs). The differences in these quantities among CRV5, CRV9, and CRV16 are therefore negligible. However, the calibrated coefficients of CRV5 and CRV16 vary considerably with respect to those of the CRV9 model. The largest uncertainties are observed when the pseudo focal depth is fixed at h = 5 km (CRV5). Figure 9 displays Eqs. (4) and (6) calibrated for the three scenarios characterised by different pseudo focal depths h and Mw classes: Log-Lin_5 and CRV5; Log-Lin_10 and CRV9; Log-Lin_16 and CRV16. For large events with Mw > 5.5 and near the epicentral area (Repi < 10 km), the maximum intensity values of the CRV models are higher than those of the Log-Lin models, decreasing with depth and increasing with magnitude. For epicentral distances Repi > 10 km, the CRV models perform similarly. In addition to its sensitivity to magnitude Mw, model type 2 is thus significantly sensitive to pseudo focal depth (h ≤ 16 km) near the epicentral area (Repi < 10 km).

Fig. 9

Sensitivity analysis of Log-Linear_10 (h = 9.87 km; Type 1) and CRV models (Type 2) to pseudo focal depth h = 5, 8.7, and 16 km, and Pasolini et al. (2008b) for different values of Mw: a Mw = 7.0; b Mw = 6.0; c Mw = 5.0; d Mw = 4.0

5.3 Discussion of the trend of coefficients and residuals

Varying the pseudo focal depth in Eqs. (4) and (5) leads to an increase in the absolute values of the geometric attenuation coefficients (b1, b2) and of the independent terms (a1, a2), and to a decrease in the absolute value of the anelastic attenuation coefficients (c1, c2) (Tables 4, A2). A similar trend was observed by Howell and Schultz (1975). The magnitude coefficient (d1, d2) was found to remain constant to a first approximation for both models (about 1.4) and had values close to those obtained for the models calibrated on post-1900 events by Mezcua et al. (2020), Gomez-Capera et al. (2008a), Bakun (2005) and Mezcua et al. (2004) (Tables 2, 3). The uncertainty of the Mw coefficient was lower than those estimated for the geometric and anelastic attenuation coefficients, as well as for the independent term, in all the calibrated models. The Mw coefficient was also stable in the sensitivity analysis. Simplifications of the Log-Lin model, such as modelling only the geometric component or applying a distance cut to the data in the far macroseismic field, did not provide any physical or statistical advantage for modelling attenuation (Table A2).

The results show that the Log-Lin_10 model type 1, with h as a free parameter, has the lowest standard errors of the coefficients and reasonably low standard deviation of residuals (σ) compared to the other models. Model type 1 also offers a more precise pseudo focal depth of h = 9.87 km ± 5.7%, compared to model type 2 with h = 8.72 km ± 7.9%. This suggests that model type 1 provides the most accurate representation of macroseismic intensity attenuation with epicentral distance and moment magnitude. Although the attenuation trend performance of the type 2 model (CRV9) is not significantly different from that of type 1 (Log-Lin_10) at distances greater than 10 km, CRV9 may still be better at predicting the maximum intensity of stronger earthquakes.

For a further comparison, the residuals and the means of residuals for all models are summarised in bins of predicted intensity (Fig. 10), epicentral distance (Fig. 11), and moment magnitude (Fig. 12). A comparison between models, based only on mean residual and standard deviation, is summarised in the electronic supplement (Figs. A2, A3, A4). In Fig. 10 a similar trend is observed for all models: the mean of residuals fluctuates around zero, with an overall tendency to underestimate by less than 0.5 intensity degrees and, for some models, by up to 1 degree for I ≥ 9. We verified that large residuals are obtained especially for large site intensities, which refer to small subsets of earthquakes and IDPs (Figs. 3c–g, 4a); for example, only 15 IDPs have intensity 11, of which 11 IDPs are from the 1908.12.28 Messina event, Mw 7.1 (Pino et al. 2000), and 4 IDPs from the 1915.01.13 Marsica event, Mw 7.0 (Margottini et al. 1993). The two seismic events have been well studied (Guidoboni et al. 2007; Molin et al. 1999) and the respective IDPs are spatially well distributed and sample both the highest degrees and those of least macroseismic impact (Fig. A1). The IDPs of intensity classes \(I \ge 9\) MCS correspond to 2.05% of the calibration dataset, which contributes to the bias in this interval.

Fig. 10

Distribution of residuals (coloured dots) versus predicted macroseismic intensities, its mean (square dots) and standard deviation (error bars), for different models

Fig. 11

Distribution of residuals (coloured dots) versus epicentral distance in log-spaced distance bins, its mean (square dots) and standard deviation (error bars), for different models

Fig. 12

Distribution of residuals (coloured dots) versus Mw considering 0.25 Mw bins, its mean (square dots) and standard deviation (error bars), for different models

According to Figs. 11 and 12, there is no evidence of any dependence of the residuals on either epicentral distance or magnitude. For the latter, the CRV9 and Log-Lin_10 models perform best in terms of absolute mean of residuals, which is 0.1 in both cases. In more detail, 89% of the IDPs (14,458 IDPs) are located at epicentral distances between 10 and 193 km; limited to these IDPs, Log-Lin_10 gives the lowest absolute mean of residuals (0.03), followed by CRV9 (0.08). Conversely, CRV9 has a lower absolute mean of residuals than Log-Lin_10 both for near-field IDPs (epicentral distance less than 10 km) and for far-field IDPs (epicentral distance between 193 and 634 km), with values of 0.10 and 0.15 respectively for CRV9 versus 0.13 and 0.30 for Log-Lin_10.
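
A comparison of this kind reduces to averaging residuals within distance ranges. The sketch below uses the three ranges quoted above (near field, 10–193 km, far field), with the last edge set to 635 km so that a half-open bin still includes 634 km. The function name and the toy residuals are assumptions for illustration.

```python
from statistics import mean

def mean_residual_by_range(repi_km, residuals, edges):
    """Mean residual in each half-open distance range [edges[k], edges[k+1])."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        vals = [r for d, r in zip(repi_km, residuals) if lo <= d < hi]
        out.append(mean(vals) if vals else None)  # None marks an empty bin
    return out

# Toy distances (km) and residuals I_obs - I_comp, for illustration only.
repi_km = [2, 8, 50, 120, 300, 600]
residuals = [0.2, -0.4, 0.1, -0.1, 0.5, 0.3]
means = mean_residual_by_range(repi_km, residuals, [0, 10, 193, 635])
abs_means = [abs(m) for m in means]  # "absolute mean of residuals" per range
```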

As shown in Fig. 13a, b for the preferred model Log-Lin_10, the between-event residuals of the calibration earthquakes mostly vary in [−0.5, 0.6], have about zero mean, and show no trend with respect to magnitude or to the spatial distribution of earthquakes. In CPTI15, several methods have been applied to determine the magnitudes (Mw) of smaller earthquakes; such methods are summarised in the electronic supplement (Fig. A5). The use of diverse methods for calculating Mw may contribute to the observed bias in Fig. 13a.

Fig. 13

Type 1 model Log-Lin_10: a Between-event residuals versus magnitude Mw of the calibration earthquakes; b Spatial distribution of the calibration earthquakes; c Within-event residuals versus epicentral distance

Similar results are obtained for the within-event residuals (Fig. 13c), where the mean of residuals is around zero for 10 km < Repi < 400 km, while intensity is slightly underestimated outside this range. Figures 11c and 13c show that the divergence in residuals becomes prominent for distances exceeding 400 km. It is therefore advisable to restrict the applicability of the models (Log-Lin_10 and CRV9) to epicentral distances of up to 400 km.
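
For readers unfamiliar with the terminology, one simple way to separate the two residual components is sketched below. It assumes that the between-event residual is the mean residual of each earthquake and that the within-event residual is what remains after subtracting it. The text does not state how the decomposition was actually carried out, and a mixed-effects regression would be a common alternative. Names and values are illustrative.

```python
from collections import defaultdict
from statistics import mean

def split_residuals(event_ids, residuals):
    """Split residuals into between-event means and within-event remainders."""
    by_event = defaultdict(list)
    for ev, r in zip(event_ids, residuals):
        by_event[ev].append(r)
    between = {ev: mean(rs) for ev, rs in by_event.items()}
    within = [r - between[ev] for ev, r in zip(event_ids, residuals)]
    return between, within

# Toy residuals I_obs - I_comp for two hypothetical events "A" and "B".
event_ids = ["A", "A", "B", "B", "B"]
residuals = [0.5, 0.1, -0.2, -0.4, 0.0]
between, within = split_residuals(event_ids, residuals)
```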

A preliminary analysis of the possible correlation between model residuals and broad-scale seismic site conditions was performed and is summarised in the electronic supplement (Fig. A6). The analysis takes as input the attenuation model based on the 16,260 IDPs and the raster Vs30 map of Mori et al. (2020), which has a spatial resolution of 50 × 50 m. Each IDP is associated with the corresponding Vs30 value and with the related model residual (Fig. A6a). Most of the input IDPs, approximately 84.2% (Alps and Apennines), are associated with soil type B, while 15.5% (Po Basin and Tuscany region) are linked to soil type C and 0.3% to soil type D. The residuals have about zero mean, and there is no trend between the residuals of the preferred model Log-Lin_10 and the Vs30 classes of the EC8 classification (CEN 2004, Fig. A6b). As a result, the preferred model can be taken to represent the prevailing mean condition, that of type B soil sites.

6 Validation process

The validation process was conducted using independent observations, that is, earthquakes and IDPs not considered in the calibration of the models. To further assess the reliability of the calibrated intensity attenuation models, we analysed the IDPs (DBMI15; Locati et al. 2022) of 15 Italian earthquakes with known instrumentally derived parameters (location and magnitude; CPTI15, Rovida et al. 2022a; Table 5, Fig. 14). We selected the events following the criteria outlined in Sect. 4. Among the 15 earthquakes, four events had partial IDP distributions, which we used to evaluate the performance of the calibrated attenuation models: two offshore (1978.04.15 Golfo di Patti, Mw = 6.0; 2002.09.06 Tirreno Meridionale, Mw = 5.9) and two onshore earthquakes (2009.04.06 Aquilano, Mw = 6.3; 2016.08.24 Monti della Laga, Mw = 6.2).

Table 5 Instrumental parameters (CPTI15) of the earthquakes used for validation process and mean absolute error (MAE) for each event and macroseismic intensity attenuation model
Fig. 14

a Location of the 15 post 1900 (1908–2013) earthquakes (4.2 ≤ Mw ≤ 6.3), used in this study for the validation process (data from CPTI15, Rovida et al. 2022a) b Macroseismic Data Points distribution (4,533 IDPs), used for the validation process (data from DBMI15, Locati et al. 2022)

The performance of the calibrated intensity attenuation models is validated via two strategies: (i) the mean absolute error (MAEi) as a measure of the goodness of fit between model predictions and observed macroseismic intensities of earthquake i (a standard metric in regression analysis; see, e.g., Willmott et al. (1985) and Mak et al. (2015) for applications in geophysics); and (ii) the estimation of moment magnitude Mw from IDPs, using the macroseismic intensity attenuation model calibrated in Mw and the Bakun and Wentworth (1997) method. The results demonstrate the reliability of the models.

6.1 Mean absolute error (MAE)

Mean absolute error (MAE), as its name implies, measures the average absolute difference between the predicted and observed values. MAE was used to assess how closely the predictions of the macroseismic intensity attenuation models match the observed macroseismic intensities for each validation earthquake. For the macroseismic field of an earthquake with P macroseismic data points, the weighted MAE (wMAE) is defined as

$$wMAE = \frac{\sum\limits_{j = 1}^{P} w_{j} \left| {res_{j} } \right|}{\sum\limits_{j = 1}^{P} w_{j} }$$
(12)

and

$$res_{j} = I_{obs,j} - I_{pre,j}$$
(13)

where \(I_{obs,j}\) are the observed intensity values, \(I_{pre,j}\) are the predicted intensity values, and \(w_{j}\) the weights of \(I_{obs,j}\) for each site record j of the considered earthquake. Mak et al. (2015) did not weight the dataset (\(w_{j} = 1\)) because in the traditional macroseismic intensity database, for example DBMI15, there is no indicator for the precision of each macroseismic intensity assignment. We calculated the average of all event-based Mean Absolute Error (MAE) scores as the performance indicator for each macroseismic intensity attenuation model calibrated in Moment Magnitude (Mw) in this study. This indicator was then used to compare with models published in the literature, such as Pasolini et al. (2008b).
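
The event-based score and its average over events can be sketched as follows. Eq. (12) is read here as the weighted mean of the absolute residuals of Eq. (13), with unit weights by default as in Mak et al. (2015). The function names and the toy intensities are assumptions and are not taken from DBMI15.

```python
from statistics import mean

def weighted_mae(i_obs, i_pre, weights=None):
    """Eqs. (12)-(13): weighted mean of |I_obs - I_pre| over one event's IDPs."""
    if weights is None:  # w_j = 1 for every IDP
        weights = [1.0] * len(i_obs)
    num = sum(w * abs(o - p) for w, o, p in zip(weights, i_obs, i_pre))
    return num / sum(weights)

def mean_event_mae(events):
    """Average of the event-based MAE scores; events = [(i_obs, i_pre), ...]."""
    return mean([weighted_mae(o, p) for o, p in events])

# Toy observed and predicted intensities for two hypothetical events.
event_a = ([6, 5, 4, 3], [5.5, 5.0, 4.5, 3.5])
event_b = ([7, 6], [6.0, 6.5])
mae_a = weighted_mae(*event_a)
mae_b = weighted_mae(*event_b)
score = mean_event_mae([event_a, event_b])
```

Averaging event-based scores, rather than pooling all IDPs, prevents earthquakes with many IDPs from dominating the indicator.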

Table 5 presents the 15 earthquakes used for the validation process, along with the obtained mean event-based MAE scores for Log-Lin_10, CRV5, CRV9 and CRV16, as well as Pasolini et al. (2008b). Figures 15 and 16 display the results from Table 5, arranged chronologically and by Mw respectively, for each macroseismic intensity attenuation model.

Fig. 15

Average value of event-based mean absolute error (MAEi). The events are ordered chronologically (Table 5); PAS: Pasolini et al. (2008b)

Fig. 16

Average value of event-based mean absolute error (MAEi). The events are ordered by Mw

For the 2009.04.06 Aquilano and 2016.08.24 Monti della Laga earthquakes, all attenuation models gave their worst predictions, owing to an incomplete distribution of far-field macroseismic data. Conversely, all attenuation models gave their best prediction for the offshore 2002.09.06 Tirreno Meridionale event. The 1919.06.20 Mugello earthquake, which had the highest Mw and maximum intensity (Mw 6.3 ± 0.2 and 10 MCS) and 565 IDPs, showed a stable performance for all calibrated models.

The four models calibrated in this study outperformed Pasolini et al. (2008b), which had a median event-based MAE score of 0.9. For comparison, the present study found a median value of all event-based MAE scores of 0.6 intensity units for the models Log-Lin_10, CRV5, CRV9, and CRV16. This result indicates that these models are better at capturing macroseismic intensity for moderate and large earthquakes than Pasolini et al. (2008b).

6.2 Moment magnitude from IDPs

6.2.1 Bakun and Wentworth (1997) method

The Bakun and Wentworth (1997) method (BW97) requires a macroseismic intensity attenuation model expressed as a function of both earthquake magnitude and distance. Here this role is played by Eqs. (4) and (6), with a constant pseudo focal depth h = h0 km. These equations are inverted to estimate the single-site magnitude \(MI_{i}\) (where i is an integer between 1 and P, the total number of IDPs available for the earthquake under examination) from the individual intensity values \(I_{i}\) observed at distances \(R_{i}\):

$$MI_{i} = f\left( {I_{i} , R_{i} } \right)$$
(14)

The location and magnitude of a given earthquake are estimated by computing the magnitude \(MI_{i}^{k}\) over a grid of trial source locations \(x_{k}\) (i.e., grid nodes). For the earthquake studied, the magnitude \(MI^{k}\) is defined as the average of the magnitudes \(MI_{i}^{k}\) estimated from individual macroseismic observations (IDPs) and assuming the epicentre located in the grid node \(x_{k}\), that is

$$MI^{k} = \left( {\mathop \sum \limits_{i = 1}^{P} MI_{i}^{k} } \right)/P$$
(15)

Then, considering a grid of trial source locations \(x_{k}\), the root mean square \(rms\left( {MI^{k} } \right)\) is computed as

$$rms\left[ {MI^{k} } \right] = rms\left[ {MI^{k} - M_{i} } \right] - rms_{0} \left[ {MI^{k} - M_{i} } \right]$$
(16)

where

$$rms\left[ {MI^{k} - M_{i} } \right] = \sqrt {\frac{{\sum\limits_{i = 1}^{P} \left[ {w_{i} \left( {MI^{k} - MI_{i}^{k} } \right)} \right]^{2} }}{{\sum\limits_{i = 1}^{P} w_{i}^{2} }}}$$
(17)

\(rms_{0} \left[ {MI^{k} - M_{i} } \right]\) is the minimum of all \(rms\left[ {MI^{k} - M_{i} } \right]\) over the grid of assumed trial source locations, and \(w_{i}\) the distance weighting function (Bakun and Wentworth 1997)

$$w_{i} = \left\{ {\begin{array}{*{20}l} {0.1 + cos\left( {\frac{{R_{epi} }}{150} \cdot \frac{\pi }{2}} \right)} \hfill & {{\text{for}} \;R_{epi} < 150\,{\text{km}}} \hfill \\ {0.1 } \hfill & {{\text{for}} \;R_{epi} \ge 150\,{\text{km}}} \hfill \\ \end{array} } \right.$$
(18)

The minimum of all the \(rms\) over the grid of assumed trial source locations is subtracted from each of the \(rms\) of the grid node \(x_{k}\). The grid node \(x_{k}\) corresponding to the minimum \(rms\left[ {MI^{k} } \right]\) is the intensity centre (IC):

$$rms\left[ {MI^{k} - M_{i} } \right] - rms_{0} \left[ {MI^{k} - M_{i} } \right] = 0$$
(19)

where the intensity magnitude \(MI\) is given by \(MI^{k}\) evaluated at the IC. The IC corresponds to the location on the fault plane with the highest energy release (i.e., the location of the maximum fault displacement, or centroid moment; Beauval et al. 2010). Hence, it does not always match the epicentre (Bakun 2006). In the present work, the IC is used as the macroseismic epicentre, as opposed to the classic definition of an epicentre as the point on the surface that is the vertical projection of the seismic focus where the rupture begins. Contours of the rms values provide confidence levels for the location, delimiting the areas within which the IC lies. Typically, the 95%, 90%, 80%, 67%, and 50% confidence contours are represented, and their shape depends on the number of IDPs (Bakun and Wentworth 1999). Because Eqs. (4) and (6) are calibrated in Mw, the magnitude MI at the IC is equivalent to the Mw of the earthquake. However, according to Bakun and Wentworth (1997), the accuracy of the calculated Mw depends on the number of IDPs used. Further details are provided in Bakun et al. (2011) and Gomez-Capera et al. (2022).
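
The grid search of Eqs. (14)–(19) can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: the attenuation relation is an assumed Log-Lin form I = a − b·log10(R) − c·R + d·Mw with hypothetical coefficients, site coordinates are planar and in km, and the synthetic intensities are not rounded to integer degrees. It is not the implementation used in this study.

```python
import math

# HYPOTHETICAL coefficients of the assumed Log-Lin form, with fixed h0 (km).
A, B, C, D, H0 = 2.0, 2.5, 0.003, 1.4, 10.0

def predict_intensity(mw, repi):
    r = math.hypot(repi, H0)
    return A - B * math.log10(r) - C * r + D * mw

def site_magnitude(intensity, repi):
    """Eq. (14): the assumed attenuation relation inverted for magnitude."""
    r = math.hypot(repi, H0)
    return (intensity - A + B * math.log10(r) + C * r) / D

def weight(repi):
    """Eq. (18): distance weighting function of Bakun and Wentworth (1997)."""
    if repi < 150.0:
        return 0.1 + math.cos(repi / 150.0 * math.pi / 2.0)
    return 0.1

def evaluate_node(node, idps):
    """Return (MI^k, rms) for a trial epicentre; idps = [(x_km, y_km, I), ...]."""
    dists = [math.hypot(x - node[0], y - node[1]) for x, y, _ in idps]
    mags = [site_magnitude(i, d) for (_, _, i), d in zip(idps, dists)]
    mi_k = sum(mags) / len(mags)                      # Eq. (15)
    ws = [weight(d) for d in dists]
    num = sum((w * (mi_k - m)) ** 2 for w, m in zip(ws, mags))
    return mi_k, math.sqrt(num / sum(w * w for w in ws))  # Eq. (17)

def intensity_centre(idps, nodes):
    """Return (node, MI, shifted rms) at the node where Eq. (16) is minimal."""
    results = [(node,) + evaluate_node(node, idps) for node in nodes]
    rms0 = min(r[2] for r in results)
    shifted = [(node, mi, rms - rms0) for node, mi, rms in results]  # Eq. (16)
    return min(shifted, key=lambda t: t[2])

# Synthetic event: source at (0, 0) km with Mw 6.0, observed at 20 sites.
sites = [(x, y) for x in (-60, -20, 15, 40, 90) for y in (-50, -10, 25, 70)]
idps = [(x, y, predict_intensity(6.0, math.hypot(x, y))) for x, y in sites]
nodes = [(gx, gy) for gx in range(-20, 21, 10) for gy in range(-20, 21, 10)]
ic_node, mi, shifted_rms = intensity_centre(idps, nodes)
```

With noise-free synthetic data the search returns the true source node and magnitude. With real IDPs the shifted rms surface is what would be contoured to obtain the confidence regions mentioned above.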

6.2.2 Mw validation results

To further validate the calibrated models by means of the BW97 method, we processed 15 Italian earthquakes with known location and moment magnitude (Table 6, Fig. 14a) and a macroseismic dataset of 4,533 IDPs (Fig. 14b). The macroseismic locations were in good agreement with the instrumental locations. Macroseismic intensity attenuation was not a critical factor for location. Reliable source location estimates depend on the quantity and quality of the IDPs, as well as on the geometry of the data field relative to the earthquake source; no technique can provide accurate source locations for events outside the network of IDPs. In the calibration–validation process, we focused on the prediction of moment magnitude (Mw).

Table 6 Instrumental parameters (CPTI15) of the earthquakes used for validation process and Mw estimations among the macroseismic intensity attenuation models and BW97 method for each event

Moment magnitude estimates depend heavily on the macroseismic intensity attenuation model, which is calibrated using events for which both IDPs and accurately determined instrumental parameters are available. The results are displayed in Fig. 17. In particular, Fig. 17a compares the instrumental Mw with the Mw calculated from macroseismic data for each event (Log-Lin_10 and CRV5, CRV9, CRV16 models). Figure 17b compares the instrumental Mw and the macroseismic Mw calculated with the Log-Linear model with the Mw calculated with the Pasolini et al. (2008b) model and with the macroseismic magnitude (bxn; Boxer method; Gasperini et al. 1999, 2013) given in CPTI15 (Rovida et al. 2022a).

Fig. 17

Earthquakes from CPTI15 used in validation process: a Comparison of instrumental Mw with calculated Mw from models Log-Lin_10, CRV5, CRV9, and CRV16; b Comparison of instrumental Mw with calculated Mw from the best model Log-Lin_10, macroseismic Mw based on Pasolini et al. (PAS; 2008b), and macroseismic Mw based on Boxer by Gasperini et al. (bxn; 1999)

The differences between the instrumental Mw and calculated Mw values in this study, as seen in Table 6, are all within 0.0–0.3 magnitude units for the Log-Lin_10 model and 0.0–0.4 for the CRV5,9,16 models, and 0.0–0.8 for Pasolini et al. (2008b). All four calibrated models exhibit similar performance, with a median absolute residual of 0.2 magnitude units (Table 6).

The Log-Lin_10 and CRV5, CRV9, CRV16 models yield macroseismic Mw values that align well with the instrumental ones, but less well with the magnitudes calculated with the Pasolini et al. (2008b) model and with the Boxer method in CPTI15. This outcome was anticipated (Table 3), as the Pasolini et al. (2008b) model and the Boxer method have not been calibrated on this dataset. The Log-Lin_10 and CRV5, CRV9, CRV16 models were thus validated, confirming that, in the present study, the BW97 method is an effective tool for evaluating the performance of macroseismic intensity models and for determining earthquake location and magnitude from macroseismic data.

7 Discussion and conclusions

In this study we propose new macroseismic intensity attenuation models calibrated in Mw for the Italian territory, using the most recent releases of the CPTI15 earthquake catalogue (Rovida et al. 2022a) and the associated DBMI15 macroseismic database (Locati et al. 2022). Two attenuation functional forms, originally proposed by Howell and Schultz (1975), are analysed. These models are given by Eqs. (2) and (3) which are combined with the classical equation for energy decay (1) and the earthquake size parameter (Mw) to propose two empirical macroseismic intensity attenuation models (4) and (5), respectively named Log-Lin type 1 and CRV type 2. These two model types have different ways of relating the variables R, Mw to the macroseismic intensity I; e.g., a direct proportionality is assumed between I and Mw in the Log-Lin model and between their logarithms in the CRV model. This also implies different expressions of their curvatures, which are related to the pseudo focal depth in the Log-Lin model and to both pseudo focal depth and magnitude of the earthquake in the CRV model.

A carefully selected set of Italian shallow tectonic earthquakes and IDPs (119 events from CPTI15 and 16,260 IDPs from DBMI15) was used to calibrate the macroseismic intensity attenuation models. This data set was selected based on criteria such as magnitude range, spatial distribution, quality of instrumental moment magnitude, epicentral locations, and intensity range. The two new attenuation models calibrated in Mw are analysed, along with a set of alternative models. We found that the best models are Log-Lin_10 and CRV9, the former having a slightly higher standard deviation but a lower error in the calibrated coefficients than the latter. The estimated pseudo focal depths are h = 9.87 km ± 5.7% for Log-Lin_10 and h = 8.7 km ± 7.9% for CRV9, showing that both calibrated values of the coefficient h are in good agreement with the mean depth observed in the input dataset, which is h = 10 km.

These findings lead us to conclude that the two best models, Log-Lin_10 and CRV9, are both valid for describing macroseismic intensity attenuation in Italy as a function of pseudo hypocentral distance and moment magnitude. Following a principle of parsimony, a preference can be expressed in favour of the mathematically and computationally simplest model, which in our case is the Log-Lin_10 regression model for macroseismic intensity I. The type 2 CRV9 model nevertheless remains a viable attenuation model calibrated in Mw: it may offer an alternative to the more frequently used type 1 model, particularly for strong (Mw > 6) and shallow (h < 10 km) events, as it can more accurately predict the maximum intensity in these cases.

The sensitivity analysis shows that the CRV model is more sensitive than the Log-Lin model to changes in h and Mw, especially in terms of maximum intensity for events of Mw > 6.0. It is however noteworthy that the standard deviation of both the type 1 and type 2 calibrated models (Tables 4, A2) remains relatively unchanged under variations in the pseudo focal depth h.

Further analysis was performed to test the models on a separate set of data, called the validation dataset, including 15 earthquakes from CPTI15 and 4,533 IDPs from DBMI15. The results of the validation process show that the four calibrated intensity attenuation models Log-Lin_10, CRV5, CRV9, and CRV16 performed better than the Pasolini et al. (2008b) model, with a median value of all event-based MAE scores equal to 0.6 intensity units. The Mw values calculated from IDPs with the Log-Lin_10 and CRV9 models are in good agreement with the instrumental magnitudes, with differences of \(\left| {ResMw} \right| \le 0.3\) and \(\left| {ResMw} \right| \le 0.4\), respectively. These results indicate that the intensity attenuation models obtained in this study are reliable and can be used to estimate macroseismic intensity at a site from Mw and epicentral distance, or to assess Mw from the IDPs of a given earthquake.
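The event-based score summarised above can be sketched as follows: a mean absolute error (MAE) is computed over the IDPs of each earthquake, and the median of these per-event values is then taken across the validation set. The snippet is a minimal sketch under assumptions: the function names and the small example dictionary are hypothetical and contain made-up numbers for illustration only, not values from DBMI15 or from Table 6.

```python
from statistics import median


def event_mae(observed, predicted):
    """Mean absolute error between observed and predicted intensities
    at the IDPs of a single earthquake."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def median_event_mae(events):
    """Median of the per-event MAE scores.

    `events` maps an event identifier to a pair
    (observed intensities, predicted intensities).
    """
    return median(event_mae(obs, pred) for obs, pred in events.values())


# Made-up example values, for illustration only (not from the study).
EXAMPLE_EVENTS = {
    "event_a": ([7.0, 6.0, 5.0], [6.5, 6.0, 5.5]),
    "event_b": ([8.0, 7.0], [7.0, 7.0]),
    "event_c": ([5.0, 4.0, 4.0, 3.0], [5.0, 4.5, 3.5, 3.0]),
}

if __name__ == "__main__":
    print(f"median event-based MAE: {median_event_mae(EXAMPLE_EVENTS):.2f}")
```

Taking the median of per-event scores, rather than pooling all IDPs, keeps earthquakes with very many intensity points from dominating the summary.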

A main practical outcome of the present work is the possibility of defining earthquake scenarios (e.g. probabilistic seismic hazard maps) in terms of macroseismic intensity and its standard deviation, thus avoiding the uncertainties introduced when various ground shaking parameters are converted into intensity values during the development of seismic hazard and risk maps. The macroseismic intensity relates specifically to damage in a way that parameters like PGA do not, and it can help to better define the priorities of Civil Protection interventions immediately after an event. Our results should be useful for comparing seismic hazard maps expressed in terms of intensity, such as the National Seismic Hazard Map (MPS04) (Gruppo di Lavoro MPS 2004) and future ones.

Aware that it would be of great interest to provide an intensity attenuation model accounting for site amplification, we extended the residual analysis to the possible influence of different EC8 soil classes on macroseismic intensity, and we verified that the model proposed in this study can be considered representative of the EC8-B class. This topic warrants additional investigation in future studies.

In conclusion, this study provides updated macroseismic intensity attenuation models that can accurately predict the macroseismic intensity and/or Mw, and can be applied to evaluate seismic hazard in terms of macroseismic intensity in Italy.