1 Introduction

Direct macroseismic observations retain their value in an era of specialized, highly technical science. There are abundant macroseismic observations for important earthquakes not recorded by modern instruments. Seismic-hazard analyses calculated in terms of (macro)seismic intensity (from here on intensity) have been widespread (e.g. Mayer-Rosa and Schenk 1989; McGuire 1993), and compatible updates may be necessary. Site effects can be investigated using macroseismic data (Bossu et al. 2000; Mucciarelli et al. 2000). Empirical information on the impact of earthquakes on society provides a basis for the communication of seismic hazards, for example in ShakeMaps as part of early warning implementation (Worden et al. 2010).

Attenuation parameters can be used in the general regression scheme to obtain the location, magnitude and depth of historical earthquakes (e.g., Levret et al. 1994; Bakun and Wentworth 1997, 1999). Intensity models for this purpose have been developed, for example, by Gasperi and Ferrari (1997) for Italy, Bakun and Wentworth (1997) for coastal California, Hinzen and Oemisch (2001) for the northern and Middle Rhine Area in central Europe, Bakun et al. (2003) for eastern North America, Fäh et al. (2003) for Switzerland, Bakun (2006) for southern California and Bakun and Scotti (2006) for regions of France.

The attenuation structure of the crust and upper mantle can be determined by constructing an intensity prediction equation (IPE). IPEs based on isoseismals have been presented, for example, by Brazee (1972) for the United States west of longitude 106°W, Gupta and Nuttli (1976) for the central U.S., Ambraseys (1985) for Northwest Europe, Lapajne (1987) for Slovenia, Levret et al. (1994) for France, Pantea (1994) for the Romanian territory with adjacent areas and Musson (2005) for the United Kingdom. These studies have made use of isoseismal maps for well-studied earthquakes.

Contouring of individual intensities is avoided when using intensity data points (IDPs). IPEs based on IDP data have been presented, among others, by Stromeyer and Grünthal (2009) for parts of Germany, France, the Netherlands, and the Czech Republic in central Europe, Sørensen et al. (2009) for the Sea of Marmara region in Turkey, Bindi et al. (2011) for central Asia and Allen et al. (2012) for shallow active tectonic crust worldwide. Many IPEs have been presented for Italy, or parts of it, including those by Peruzza (1996), Gasperini (2001), Albarello and D’Amico (2004), Gómez (2006), Faccioli and Cauzzi (2006), Pasolini et al. (2008a, b) and Sørensen et al. (2010).

The present paper examines the effect of synthetic and real data properties on the IPE. It explores how successfully the IPE coefficients can be resolved using input data that contain errors. The motivation arises from areas of low seismicity and/or poorly documented earthquake effects, where it is doubtful whether the available intensity data are sufficient for constructing a local IPE. The feasibility of obtaining the average attenuation trend using empirical intensities can be examined with the help of synthetic data. Synthetic data have previously been used to study macroseismic location based on sparse intensity datasets by Mäntyniemi et al. (2017). This paper begins with a formulation of the inverse problem to be solved (Sect. 2) and a scheme for creating synthetic intensity data (Sect. 3). IDP samples of different sizes and effects of magnitude and depth errors on a synthetic database are investigated (Sect. 4). An IPE is constructed on the basis of intensity data from the database of the United Kingdom provided by the British Geological Survey (Sect. 5). Finally, the results are discussed (Sect. 6).

2 Regression Technique for Solving the Equation Coefficients

Much of the previous literature listed above has been concerned with estimating the coefficients of the Kövesligethy–Sponheuer equation using regional intensity data. The content of the equation and the coupling of its coefficients have been perused in detail, for example by Ambraseys (1985), Levret et al. (1994) and Stromeyer and Grünthal (2009). Different regression techniques are typically used to resolve the coefficients. The equation includes the epicentral intensity, which may be a source of bias (e.g., Ambraseys 1985; Musson 2005; Pasolini et al. 2008a).

In the present analysis, the issue of epicentral intensity is avoided by using the equation by Kondorskaya and Shebalin (1982), which is as follows:

$$I_{i} = b \cdot M - \nu \cdot \lg \sqrt {R_{i}^{2} + H^{2} } + c.$$
(1)

Above, \(I_{i}\), i = 1,…,n, are the intensities at n localities \((x_{i}, y_{i})\) and \(R_{i}\) the corresponding epicentral distances (in km), M is the earthquake magnitude and H the focal depth (km). Equation (1) has been calibrated against the ML scale. The ML and MS magnitudes were equal in the magnitude interval used. Intensities were given on the Medvedev–Sponheuer–Kárník (MSK-64) scale. The coefficients are the attenuation coefficient, ν, as well as b and c. The logarithm is to the base 10. The square root term is the hypocentral distance \(R_{hyp}\). Rupture dimensions are not considered in this study, because a point source is assumed. Equation (1) has been used to study attenuation by, e.g., Shebalin et al. (1998). They proposed an attenuation coefficient of ν = 4.0 for central and south-eastern Europe (φ ≤ 47°N) and ν = 3.5 for its northern part (φ > 47°N).
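As a minimal sketch of how Eq. (1) is evaluated, the following Python function computes the intensity for a given magnitude, epicentral distance and focal depth. The function name and argument names are illustrative assumptions; the default coefficients are the values adopted for the synthetic data in Sect. 3.

```python
import math

def predict_intensity(magnitude, r_epi_km, depth_km, b=1.5, nu=3.5, c=3.0):
    """Evaluate Eq. (1): I = b*M - nu*lg(sqrt(R^2 + H^2)) + c."""
    r_hyp = math.hypot(r_epi_km, depth_km)  # hypocentral distance in km
    return b * magnitude - nu * math.log10(r_hyp) + c

# Intensity directly above an M 4.7 source at 10 km depth (R = 0)
print(round(predict_intensity(4.7, 0.0, 10.0), 2))  # 6.55, as quoted in Sect. 3
```

The function assumes a point source, so the depth must be positive for the logarithm to be defined at R = 0.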

Depth estimates are not always given in the existing IDP databases. A single value may be taken to represent a regional average depth (e.g., Bindi et al. 2011), or it may be determined as an additional regression parameter (e.g., Sørensen et al. 2010). Depth errors are investigated in the later sections.

Assuming Eq. (1) and two different magnitudes \(M_{1}\) and \(M_{2}\) with the corresponding hypocentral distances \(R_{hyp1}\) and \(R_{hyp2}\), respectively, we obtain

$$I_{1} = b \cdot M_{1} - \nu \cdot \lg (R_{hyp1} ) + c,$$
(2)

and

$$I_{2} = b \cdot M_{2} - \nu \cdot \lg (R_{hyp2} ) + c.$$
(3)

Subtracting Eq. (3) from Eq. (2) gives

$$I_{1} - I_{2} = b \cdot (M_{1} - M_{2} ) - \nu \cdot \lg \left( {\frac{{R_{hyp1} }}{{R_{hyp2} }}} \right).$$
(4)

Dividing both sides of Eq. (4) by b and rearranging gives

$$M_{2} - M_{1} = \left( {\frac{\nu }{b}} \right) \cdot \lg \left( {\frac{{R_{hyp2} }}{{R_{hyp1} }}} \right) + \frac{{I_{2} - I_{1} }}{b}.$$
(5)

Substituting \(y = M_{2} - M_{1}\) and \(x = \lg \left( {\frac{{R_{hyp2} }}{{R_{hyp1} }}} \right)\) leads to a linear equation

$$y = \left( {\frac{\nu }{b}} \right) \cdot x + \frac{{(I_{2} - I_{1} )}}{b}.$$
(6)

The inverse problem is solved step by step as follows. First, magnitude differences and the ratios of hypocentral distances are calculated according to Eq. (4) for the corresponding intensities. Each distance–intensity pair is assumed to carry information on the attenuation properties. Only positive intensity differences are used to avoid redundancy. Secondly, the procedure is repeated for all ΔI. A set of linear regressions (6) is obtained and can be used to solve for ν and b. Non-trivial solutions of (6) are found only for non-zero y. Coefficient c can then be determined using Eq. (1).
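The steps above can be illustrated with a short Python sketch for a single intensity difference ΔI. It is one possible reading of the procedure, not the exact implementation used in this study: the function name, the array layout and the use of `numpy.polyfit` for the straight-line fit are assumptions. The slope of the fitted line is taken as ν/b and the intercept as ΔI/b, following the linear form y = (ν/b)·x + (I2 − I1)/b with y = M2 − M1.

```python
import numpy as np

def fit_pairwise(mags, r_hyp, intens, delta_i=1):
    """Estimate nu, b and c from IDP pairs with I2 - I1 == delta_i."""
    mags = np.asarray(mags, dtype=float)
    r_hyp = np.asarray(r_hyp, dtype=float)
    intens = np.asarray(intens)
    # first[k] is IDP "1" and second[k] is IDP "2" of the k-th pair
    first, second = np.nonzero(intens[None, :] - intens[:, None] == delta_i)
    y = mags[second] - mags[first]               # M2 - M1
    x = np.log10(r_hyp[second] / r_hyp[first])   # lg(Rhyp2 / Rhyp1)
    keep = y != 0                                # only non-zero y is used
    slope, intercept = np.polyfit(x[keep], y[keep], 1)
    b = delta_i / intercept                      # intercept = (I2 - I1) / b
    nu = slope * b                               # slope = nu / b
    c = np.mean(intens - b * mags + nu * np.log10(r_hyp))  # from Eq. (1)
    return nu, b, c

# Noise-free illustration: hypocentral distances at which Eq. (1) gives
# exactly the integer intensities 3 to 6 for three magnitudes.
true_b, true_nu, true_c = 1.5, 3.5, 3.0
mags = np.repeat([4.5, 5.1, 5.7], 4)
intens = np.tile([3, 4, 5, 6], 3)
r_hyp = 10 ** ((true_b * mags + true_c - intens) / true_nu)
print(fit_pairwise(mags, r_hyp, intens))
```

With these idealized inputs the sketch should return values close to ν = 3.5, b = 1.5 and c = 3. With rounded or noisy intensities the fit is only approximate, and repeating it for several ΔI and averaging would be a natural extension.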

3 Generation of Synthetic Data for Stability Tests

The synthetic intensity, IS, at locality i was computed using Eq. (1) with the coefficients ν = 3.5, b = 1.5 and c = 3. Synthetic data were created for both single earthquakes and a database. Since intensity was taken to be independent of azimuth, the distances of the localities from the epicentre could be considered in one dimension.

An initial set of synthetic intensities was generated starting from I = 1.51 with an increment of 0.01 to a maximum intensity defined with Eq. (1) with an epicentral distance equal to zero. A value was selected randomly from this set, and the corresponding distance was calculated. The intensity value was then rounded to the nearest integer. For example, intensity I = 4 results from one hundred possible initial values between 3.50 and 4.49. A uniform probability over distance means equality between different intensities. However, the maximum possible intensity is defined in such a way that the number of possible initial values is the same for each integer intensity. For example, a magnitude of 4.7 and depth of 10 km yield an initial maximum intensity of 6.55, but there are only six possible initial values, from 6.50 to 6.55, which would be rounded to 7, so in this case the final maximum intensity is 6.
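The generation scheme can be sketched in Python as follows. This is an illustrative reading of the description above, not the code used in the study: the function names are hypothetical, intensities are held as integer hundredths to avoid floating-point rounding problems, and each initial value is assumed to be drawn with equal probability.

```python
import math
import random

B, NU, C = 1.5, 3.5, 3.0  # coefficients used for the synthetic data

def candidate_hundredths(magnitude, depth_km):
    """Initial intensities in hundredths, from 1.51 to the truncated maximum."""
    i_max = B * magnitude - NU * math.log10(depth_km) + C  # Eq. (1) at R = 0
    i_max_c = round(i_max * 100)
    # Keep only complete classes: the largest value ending in .49 <= i_max
    top = (i_max_c - 49) // 100 * 100 + 49
    return range(151, top + 1)

def draw_idp(magnitude, depth_km, rng):
    """Draw one synthetic IDP as (epicentral distance in km, integer intensity)."""
    v = rng.choice(candidate_hundredths(magnitude, depth_km))
    r_hyp = 10 ** ((B * magnitude + C - v / 100) / NU)      # invert Eq. (1)
    r_epi = math.sqrt(max(r_hyp ** 2 - depth_km ** 2, 0.0))
    return r_epi, (v + 50) // 100                           # round half up

rng = random.Random(1)
sample = [draw_idp(4.7, 10.0, rng) for _ in range(5)]
print(sample)
```

For a magnitude of 4.7 and a depth of 10 km, the candidate set ends at 6.49, so the largest integer intensity that can be drawn is 6, in line with the example in the text.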

Samples with 5–90 IDPs, with increments of 5, were created. Eleven magnitudes from 4.5 to 6.5 with increments of 0.2 units were used. One million samples were computed for each magnitude and number of IDPs, and 11 times 18 million samples were thus available.

As the next step, a database of synthetic intensities was created. It comprised 18 earthquakes and 1110 IDPs. It included four different magnitude values, 4.5, 4.7, 5.1 and 5.7, with 10, 5, 2 and 1 earthquake(s), respectively (Table 1). The number of IDPs was assumed to increase with magnitude: there were 15 IDPs for each M4.5 earthquake, 40 for each M4.7 event, 200 for each of the two M5.1 events and 360 IDPs related to the M5.7 earthquake. Since at this point the magnitudes were assumed to be error-free, there was no difference between ten M4.5 earthquakes with 15 IDPs and one M4.5 event with 150 IDPs. Two sets of focal depths were tested (columns depth 1 and depth 2 in Table 1). A minimum of three different integer intensities was needed for each sample to be able to solve the coefficients. They were solved for each pair I2 − I1. The mean values of 10,000 inversions were taken to be the final coefficient values.
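The composition of the synthetic database can be checked with a few lines of Python. The tuple layout is an assumption made for illustration; the numbers are those given in the paragraph above.

```python
# (magnitude, number of earthquakes, IDPs per earthquake)
composition = [(4.5, 10, 15), (4.7, 5, 40), (5.1, 2, 200), (5.7, 1, 360)]

n_events = sum(n for _, n, _ in composition)
n_idps = sum(n * k for _, n, k in composition)
share_m45 = 100 * 10 * 15 / n_idps  # percentage of IDPs from M4.5 events

print(n_events, n_idps, round(share_m45, 1))  # 18 1110 13.5
```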

Table 1 The synthetic database

4 Results

The synthetic IDPs represent highly idealized intensity data, but nevertheless provide insight into the effect of the input data properties on the estimated IPE coefficients.

4.1 Single Earthquakes

First, the initial set of synthetic intensities was used to test the selected inversion procedure. Coefficients ν (selected to be 3.5) and b·M + c (= 1.5·M + 3) were computed, because it is not possible to resolve the coupling of b and c using a single magnitude.
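The coupling can be made explicit by regrouping Eq. (1) for a single earthquake. With M fixed, the magnitude term and c merge into one constant, so only ν and the sum b·M + c can be estimated from the distance dependence of the intensities. The subscript notation for the hypocentral distance at locality i is introduced here for illustration only; the numerical value is for M = 6.1 and the coefficients of Sect. 3.

$$I_{i} = (b \cdot M + c) - \nu \cdot \lg R_{hyp,i} , \qquad b \cdot M + c = 1.5 \cdot 6.1 + 3 = 12.15.$$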

Coefficient values of ν and b·M + c within a standard deviation of ± 0.2 of the correct value were regarded as good results. The full range of values was wide in the case of small samples. There could be unusual distributions of small numbers of IDPs, leading to anomalous coefficients. Increasing the sample size resulted in a narrower range of estimated values. This was observed for all magnitudes, of which M4.9 and M6.1 are illustrated in Fig. 1. Simultaneously with increasing sample size, the distance ranges of the samples varied less. This follows from the random generation of the IDPs: as the sample size increases, it becomes increasingly likely that the IDPs are spread over a wider range of distances. The distributions of the estimated coefficients were typically skewed: the coefficients were overestimated rather than underestimated (Fig. 1).

Fig. 1 Estimation of coefficients of the intensity prediction equation on the basis of one million randomly generated samples comprising 5, 45 and 90 intensity data points: The obtained values of a the attenuation coefficient, ν, for magnitude M = 4.9 and b the term b·M + c, where b and c are equation coefficients, for M = 6.1. The thick horizontal lines indicate the correct coefficient values ν = 3.5 and b·M + c = 1.5·M + 3 = 12.15. It is not possible to resolve the coupling of b and c using a single magnitude. The numbers of solutions inside the contour lines are given as exponents of the base of the natural logarithm e ≈ 2.71828

The proportions of solutions within ± 0.2 of the correct coefficient value were lower for b·M + c than for ν. This is expected, because, as explained in Sect. 2, the inaccuracy of ν is carried into the solutions of the other two coefficients. For example, in the case of 60 IDPs and coefficient ν, the proportions were 35%, 42% and 63% for magnitudes 4.5, 5.5 and 6.5, respectively. The corresponding proportions for b·M + c were 19.5%, 17.2% and 28%. In the case of the attenuation coefficient, ν, there was an overall trend of a larger proportion of good results for larger magnitudes and sample sizes, but this was not strictly linear with magnitude. When solving for b·M + c, the pattern was more complex, although the largest proportions of acceptable coefficient values were obtained for the larger magnitudes.

Three sets of data were compared for each sample size using magnitudes 4.5, 5.5 and 6.5: all one million samples, a subset of samples that gave an attenuation coefficient in the range of 3.3 ≤ ν ≤ 3.7 and a second subset of samples leading to ν = 3.5 exactly. The size of the second subset was typically of the order of fractions of a per cent. The longest distance from the epicentre was shorter in the samples of the first subset than in all one million samples (Fig. 2a). The benefit of increasing the sample size is largest for magnitude 6.5: because the maximum distance range for magnitude 4.5 is not long, increasing the sample size brings no particular benefit, as the added IDPs are found to be close together. The samples leading to ν = 3.5 exactly stand out in that the minimum distance range is much longer than in the other two cases (Fig. 2b). In all three cases, the minimum range of all samples approaches the maximum value as the sample size increases.

Fig. 2 a The maximum shortest distance from the epicentre (Rmin) and b the minimum range of distances (Rmax − Rmin) of randomly generated samples as a function of sample size (number of intensity data points) for magnitudes of 4.5, 5.5 and 6.5. The solid lines correspond to one million samples and the dashed lines to a subset of samples that yielded an attenuation coefficient value in the range of 3.3 ≤ ν ≤ 3.7. In b, the dotted lines correspond to the second subset with the samples that yielded the correct value ν = 3.5 exactly

Briefly, the exercise suggests that the success of estimating the coefficient depends on the spread of intensities and the corresponding distances from the epicentre along the radius of perceptibility. A short radius of perceptibility is more rapidly saturated with new data points than a long one. These features also indicate that the selected inversion procedure performs correctly and can be used to resolve the IPE coefficients.

4.2 The Database

The estimation of coefficients was also investigated using the synthetic database (Table 1). As expected, the presence of a magnitude range helps to resolve all the unknown coefficients, although here it extends slightly over one magnitude unit and is composed of only four different magnitudes. A minimum of three different integer intensities was needed for each sample to be able to solve the three coefficients.

A test of 10,000 random selections of 5 IDPs per earthquake, totalling 90 IDPs, revealed that the coefficients could be successfully resolved. (Assuming error-free magnitudes, this is identical to one M4.5, M4.7, M5.1 and M5.7 earthquake with 50, 25, 10 and 5 IDPs, respectively). The mean of the attenuation coefficient ν obtained was 3.50 ± 0.04, and νmin = 3.32 and νmax = 3.66. The corresponding values of coefficient b were 1.50 ± 0.068, bmin = 1.28 and bmax = 1.78, and those of coefficient c were 3.00 ± 0.30, cmin = 1.68 and cmax = 3.99. Removal of the M5.7 earthquake from the synthetic database narrowed the magnitude range to only 0.6 units. The mean and standard deviation of coefficient ν remained as above, but the standard deviation of b grew to 0.11 and that of coefficient c to 0.48. In all these calculations, there was a strong negative correlation, approximately − 0.95, between b and c, which is understandable from Eq. (1).
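The origin of the negative correlation between b and c can be illustrated with a short Python sketch. Note that it swaps in a different technique from the pairwise inversion of Sect. 2: it uses the textbook covariance of ordinary least-squares estimates for Eq. (1), which depends only on the design matrix. The distance grid and the function name are assumptions, so the resulting value is not expected to match the figure of approximately −0.95 reported above.

```python
import numpy as np

def coefficient_correlation(mags, r_hyp):
    """Correlation matrix of least-squares estimates (b, nu, c) for Eq. (1)."""
    design = np.column_stack([mags, -np.log10(r_hyp), np.ones(len(mags))])
    cov = np.linalg.inv(design.T @ design)  # proportional to the covariance
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Four magnitudes of the synthetic database, each observed at the same
# hypothetical set of 20 hypocentral distances between 10 and 200 km.
mags = np.repeat([4.5, 4.7, 5.1, 5.7], 20)
r_hyp = np.tile(np.linspace(10.0, 200.0, 20), 4)
corr = coefficient_correlation(mags, r_hyp)
print(round(corr[0, 2], 2))  # correlation between b and c
```

Because all magnitudes lie far from zero within a narrow range, an increase in b can be offset almost exactly by a decrease in c, which is why the two estimates are strongly anticorrelated.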

Then the coefficients were estimated using all possible combinations of magnitude errors ± 0.1 and ± 0.2 up to ± 0.5 units (Fig. 3). The obtained coefficients b were in the interval 1–2.4 (Fig. 3a). The biggest effect was observed on the c coefficient: it became negative when the largest magnitude of 5.7 was replaced by 5.6 or 5.5 and simultaneously the smallest magnitude of 4.5 became either 4.6 or 4.7 (Fig. 3b). To compensate for this, the coefficient b increased above 2. Small errors up to ± 0.2 units had a minimal effect on the attenuation coefficient, which was resolved almost exclusively within the interval 3.4–3.6 (Fig. 3c). It can also be seen that the obtained coefficient values are not symmetrical around the correct values. When the magnitude errors increased, the range of attenuation coefficient values widened and the values were typically overestimated (Fig. 3d).

Fig. 3 Effect of magnitude errors on the intensity prediction equation coefficients (a) b, (b) c and (c) ν in the case of synthetic data and magnitudes of 4.5, 5.1 and 5.7. All combinations of magnitudes and their errors ± 0.1 and ± 0.2 magnitude units were tested. The thick horizontal lines indicate the correct coefficient values. Part d shows ν in the case of magnitudes 4.5 and 5.7 and their errors up to ± 0.5 units. The dashed line is the arithmetic mean of 10,000 inversions

As the next step, the effect of the focal depth was examined using the concept of average regional depth. A total of 10,000 trials were carried out for each case. When all initial depths are equal to 20 km (column depth 1 of Table 1), the IPEs with the correct coefficients give higher intensities than those based on the assumed regional depths of 5, 10 and 15 km, whereas the IPE based on the depth of 25 km catches up with the correct ones as the hypocentral distance approaches the epicentral one (Fig. 4a). The curve corresponding to an assumed regional depth of 5 km deviates the most from the correct ones, which is reasonable. The differences due to the depth are not large except for epicentral distances below 30 km, where they can be 1 intensity degree, and below 10 km even 2 degrees. If the assumed regional depth is shallower than the correct one, the mean attenuation coefficient is underestimated and if it is deeper, ν is overestimated.
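The reason why depth matters mainly at short epicentral distances can be seen from Eq. (1) alone. The sketch below holds the coefficients fixed and changes only the depth, which is a simplification: in the tests described above the coefficients were re-estimated for each assumed depth, so the numbers are not directly comparable. The function name is hypothetical.

```python
import math

def depth_shift(r_epi_km, depth_a_km, depth_b_km, nu=3.5):
    """Intensity at depth_b minus intensity at depth_a, same M, b, c and nu."""
    r_a = math.hypot(r_epi_km, depth_a_km)
    r_b = math.hypot(r_epi_km, depth_b_km)
    return nu * math.log10(r_a / r_b)

# Shift caused by placing a 20 km deep source at 5 km instead
for r in (10.0, 30.0, 100.0):
    print(r, round(depth_shift(r, 20.0, 5.0), 2))
```

At an epicentral distance of 10 km, the two hypocentral distances differ by a factor of two and the shift is about one intensity degree. At 100 km, the hypocentral and epicentral distances are nearly equal and the shift is negligible.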

Fig. 4 Effect of the assumed regional depths of 5, 10, 15 and 25 km on the intensity prediction equation (IPE) in the case of the synthetic database. The solid lines indicate the IPEs with the correct coefficients b, c and ν. Depth 5 km is blue, 10 km red, 15 km black and 25 km green. The other IPEs were plotted using the mean coefficients of 10,000 inversions and the corresponding depths as given in Table 1 in the columns a depth 1 and b depth 2. All IPEs were plotted for a magnitude of 5

It is often observed that earthquake depth tends to increase with magnitude, but any possible effect of this is taken to be insignificant in the small magnitude span of the database. However, earthquake occurrences at different depths were also modelled (column depth 2 of Table 1). The obtained coefficients ν and c increased with depth (Table 2). The mean of coefficient ν was 3.41 and 3.71 at depths 5 and 10 km, respectively, and above 4 at depths 15 and 25 km. All coefficients c were too high, including the minimum values. Coefficient b was quite stable from one depth to another, but the estimated values were close to 1, which is incorrect. Figure 4b shows that the depth of 5 km gave the equation closest to the correct one. All M4.5 earthquakes, corresponding to more than half of the events, but only to 13.5% of all IDPs, had correct depths. Typically the highest intensities follow from the depths of 15 and 25 km over the displayed distance range, and they are one intensity higher than the other curves. This runs counter to what is expected.

Table 2 Effect of depth errors on the estimation of coefficients b (= 1.5), ν (3.5) and c (3.0) using the synthetic database (Table 1)

In conclusion, a rather small number of low-magnitude earthquakes appears sufficient for the successful solution of the IPE coefficients: the synthetic database only has four different magnitudes in a narrow range. However, even small magnitude errors affect the values of the b and c coefficients, and the attenuation coefficient is also affected by large magnitude errors. Assuming a regional depth that deviates from the correct one can be misleading, and the mean coefficient values may deviate from the correct ones.

5 Coefficient Estimation Using UK Data

The coefficients of the IPE Eq. (1) were also estimated using real data. Since this analysis focuses on small-to-moderate magnitude earthquakes, the data were retrieved from the intensity database of historical earthquakes in the United Kingdom provided by the British Geological Survey (BGS) (http://www.quakes.bgs.ac.uk/historical/, last accessed November 2017). The initial selection criteria were post-1965 earthquakes with M ≥ 4. At the time of downloading, the database ended in 2001, and twelve earthquakes fulfilled the criteria. Five of them were omitted. Two earthquakes occurred in the same region within an hour and 57 min on 25 February 1974. The first earthquake had a magnitude ML 3.9, the second 4.1. For the second earthquake, 90 intensities out of 97 are less than 4, so there is a possible mix-up between the two events. The earthquake of 19 July 1984 was located close to the shore, and the spatial distribution of intensities appears contaminated. The distance ranges of the IDPs associated with the earthquakes of 9 August 1970 and 4 March 1999 largely overlap, making the assessment of attenuation complicated.

A total of seven earthquakes were left in the dataset (Table 3, Fig. 5). The quality of their locations and magnitudes was examined. For example, the epicentre given for the earthquake of 7 March 1972 (mb 4.0, at 06:52 UTC) is at least 15 km from the cluster of localities with the maximum intensity of 6. The intensity decreases as a function of distance when using the epicentre coordinates determined by Le Bureau Central Sismologique Français given in the Bulletin of the International Seismological Centre (ISC). The magnitude 4.7 given for the earthquake of 26 December 1979 leads to an anomalously large area of perceptibility. The anomaly disappeared when using the magnitude ML 5.2 determined by the Institute of Geological Sciences in the UK.

Table 3 The earthquakes selected from the intensity database of historical earthquakes in the United Kingdom provided by the British Geological Survey (http://www.quakes.bgs.ac.uk/historical/, last accessed November 2017)

Fig. 5 The set of intensity data points retrieved from the UK Historical Earthquake Database of the British Geological Survey (http://www.quakes.bgs.ac.uk/historical/, last accessed November 2017). Seven post-1965 earthquakes were used: a five with the given magnitude below 4.5 and b two with magnitude above 5. Magnitudes are on the ML scale (mb scale for 1966 and 1972). The uncertain intensities (17% of all) have been plotted between the integer intensity degrees

The BGS database does not include focal depths, so they were taken from the ISC Bulletin. Priority was given to the depths estimated by agencies in the UK. Magnitude is given on the mb instead of the ML scale for the two oldest earthquakes. We assume that the difference in the magnitude scales is negligible in comparison with the accuracy of the magnitude evaluation. For example, magnitude estimates given by different agencies for the earthquake of 26 December 1979 vary by up to 0.5 units. Musson (1996) reported that recent magnitudes were accurate to within ± 0.2 ML units, which is taken to refer to the early 1990s. The accuracy is poorer for earlier events recorded at few stations.

The number of IDPs is 896, ranging between 5 and 426 IDPs per earthquake. Intensities assigned to monumental buildings or large territories, descriptions of “felt” and values of 1 (not felt) were excluded from the data. The reported maximum intensities are 5, 5–6 or 6 on the European Macroseismic Scale (EMS-98). The EMS-98 intensity degrees are similar to those of the MSK-64 scale assumed by Eq. (1), except for the two highest intensities (Musson et al. 2010). An intensity I = 2 is infrequently reported (Fig. 5), so we have taken the minimum intensity to be 3. Many authors have pointed out the incompleteness of the IDP data in the far field, which may cause bias in statistical analysis (Albarello and D’Amico 2004; Allen et al. 2012, among others). A prevalent practice is to define 3 or 4 as the lower intensity limit in the final input data (e.g., Pasolini et al. 2008a; Stromeyer and Grünthal 2009; Sørensen et al. 2010).

Intensities may be uncertain, which is formally defined in the EMS-98 guidelines (Grünthal 1998). For example, an intensity given as 7–8 means that the intensity is either 7 or 8 at the locality in question. This implies epistemic uncertainty originating from inadequate or skewed documentation on earthquake effects, or poor survival of documents to the present time. The notation addresses a relevant issue, but does not provide practical instructions. It may easily be associated with “7 to 8”, and an intermediate degree appears appropriate, but it rather reads “7 or 8” (7/8).

So-called half-intensity values (7–8 is replaced by 7.5) are sometimes used when deriving IPEs (e.g., Stromeyer and Grünthal 2009; Sørensen et al. 2010; Bindi et al. 2011), although it is understood that this practice compromises the integer character of intensity. The new classes between integer values may be less dispersed but are less reliable (Peruzza 1996), and the practice suggests that the intensity scale has 23 degrees instead of twelve (Musson 1998). Statistical approaches have been used to account for the uncertainty (Magri et al. 1994; Peruzza 1996; Pasolini et al. 2008b), or the input data have been visually checked (Stromeyer and Grünthal 2009; Sørensen et al. 2010).

We propose to investigate the effect of uncertain values on the IPE. The present dataset includes 152 uncertain intensities, or about 17% of the IDPs. They were handled in three ways: they were omitted, rounded down (for example, 4–5 was replaced by 4) and rounded up (4–5 was replaced by 5). Rounding down and up to the closest integer give the limits of the effect of uncertain intensities. The obtained coefficients were b = 1.8, c = 1.9 and ν = 3.5 when the uncertain intensities were removed. When they were kept in the data and rounded down, the values were b = 1.4, c = 4.6 and ν = 3.9, whereas rounding up gave b = 1.4, c = 4.5 and ν = 3.9. Rounding up the uncertain values gives slightly lower values than rounding down, but the differences are not significant. Omitting the uncertain intensities has the largest impact on the coefficients in the present case.
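The three treatments of uncertain intensities can be sketched in Python as follows. The string representation of the input (for example "4-5" for an uncertain value) and the function name are assumptions made for illustration; they do not reflect the actual format of the BGS database.

```python
def resolve_uncertain(values, mode):
    """Resolve intensities such as "4" or "4-5"; mode is "omit", "down" or "up"."""
    if mode not in ("omit", "down", "up"):
        raise ValueError("mode must be 'omit', 'down' or 'up'")
    resolved = []
    for value in values:
        # Accept both a hyphen and an en dash as the range separator
        parts = [int(p) for p in value.replace("–", "-").split("-")]
        if len(parts) == 1:
            resolved.append(parts[0])      # certain intensity, kept as is
        elif mode == "down":
            resolved.append(min(parts))    # 4-5 becomes 4
        elif mode == "up":
            resolved.append(max(parts))    # 4-5 becomes 5
        # mode == "omit": the uncertain value is dropped
    return resolved

reported = ["4", "4-5", "5-6", "3"]
for mode in ("omit", "down", "up"):
    print(mode, resolve_uncertain(reported, mode))
```

In a full analysis the corresponding distances would have to be filtered in the same way when values are omitted, so that intensities and localities stay aligned.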

A stability test was performed on the coefficient values (Table 4). The events of the database were removed one at a time, and the coefficients were estimated on the basis of the remaining six earthquakes. It can be seen, for example, that removing the five IDPs related to the Kintail earthquake of 10 August 1974 had a larger effect on the ν coefficient than removing the Warwick earthquake of 23 September 2000 with its 157 IDPs. According to this test, the IPE coefficients can be resolved with an accuracy of 0.23 for b, 0.38 for ν and 1.0 for c. However, no earthquake parameter errors were considered.
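A leave-one-event-out test of this kind can be sketched in Python as follows. The sketch is generic: for brevity it uses a direct least-squares fit of Eq. (1) as a stand-in for the pairwise inversion of Sect. 2, and the event dictionaries with keys "M", "r_hyp" and "intensity" are a hypothetical data layout.

```python
import numpy as np

def fit_ols(events):
    """Direct least-squares fit of Eq. (1); returns the array (b, nu, c)."""
    rows, obs = [], []
    for ev in events:
        for r, i in zip(ev["r_hyp"], ev["intensity"]):
            rows.append([ev["M"], -np.log10(r), 1.0])
            obs.append(i)
    coeffs, *_ = np.linalg.lstsq(np.array(rows), np.array(obs), rcond=None)
    return coeffs

def leave_one_event_out(events, fit=fit_ols):
    """Refit with each event removed in turn; one row of (b, nu, c) per removal."""
    return np.array([fit(events[:k] + events[k + 1:]) for k in range(len(events))])

# Noise-free illustration with three hypothetical events
r = np.array([15.0, 40.0, 90.0, 160.0])
events = [{"M": m, "r_hyp": r, "intensity": 1.5 * m - 3.5 * np.log10(r) + 3.0}
          for m in (4.5, 5.1, 5.7)]
jackknife = leave_one_event_out(events)
print(jackknife.max(axis=0) - jackknife.min(axis=0))  # spread of b, nu, c
```

With error-free input the spread is essentially zero. With real data, the spread of each column gives an indication of how strongly individual earthquakes control the coefficients.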

Table 4 Coefficients b, ν and c when the earthquakes of the UK dataset were removed one at a time

Accounting for magnitude errors demonstrated that the outcome resembles that of the synthetic case (Fig. 3), but the ranges of the obtained coefficients were larger and the range of coefficient ν was quite fragmentary (Fig. 6). This is attributed to inconsistencies in the real data. The ML 4.0 earthquake displayed in Fig. 6 is that of 15 February 1994. If the mb 4.0 earthquake of 7 March 1972 is used instead, the range of obtained coefficient values increases. The arithmetic means and the corresponding standard deviations of 10,000 inversions are 2.1 ± 0.4 for b, 1.6 ± 1.8 for c and 4.1 ± 0.2 for ν. Here, there was also a strong negative correlation, about − 0.98, between b and c, which is understandable from Eq. (1).

Fig. 6 Effect of magnitude errors on the UK dataset. The coefficients of the intensity prediction equation (a) b, (b) c and (c) ν obtained for magnitudes ML 4.0, 4.4 and 5.2 ± 0.2 units are shown. The thick horizontal lines are the arithmetic means of 10,000 inversions, and the thin lines are the corresponding standard deviations. The x-axis is discontinuous to avoid overlap

The BGS intensities were compared to calculated ones in order to obtain an understanding of their variance. The coefficients of Eq. (1) were resolved for the three regional depths and the given depths (Table 3) in order to calculate the intensity at each available site. The uncertain intensities were downgraded, upgraded and omitted (Table 5). The variance was of the order of 0.5–0.6 in most cases, implying an uncertainty of 1 intensity degree.

Table 5 The resolved coefficients of the intensity prediction equation using the depths given in the bulletin of the International Seismological Centre (Table 3) and three assumed regional depths

Assuming regional depths of 5, 10 and 25 km for the UK dataset (Fig. 7) gave a different pattern from the synthetics (Fig. 4). The IPEs corresponding to the different depths were not parallel over the entire distance range, but intersected at intensity 4 at the distance of approximately 100 km from the epicentre. At the longer distances, the larger regional depths give smaller intensities.

Fig. 7 Effect of assumed regional depths of 5, 10 and 25 km on the intensity prediction equation in the case of the UK earthquakes of a 26 Dec 1979 (ML 5.2, H = 11 km) and b 2 Apr 1990 (ML 5.1, H = 14 km). The corresponding coefficients are given in Table 5 (column ‘down’). The open diamonds are the intensity data points available for the two earthquakes

In summary, the results obtained using real data resemble the pattern of magnitude errors of the synthetic results but imply larger existing inconsistencies. Assuming different regional depths affects the shape and level of the IPE. In all testing, coefficient c was the most and attenuation coefficient ν the least sensitive to errors. In the case of the synthetic and real databases, a strong negative correlation between coefficients b and c was observed. This is attributed to the form of Eq. (1), in which these coefficients try to accommodate to the data properties. In the inversion procedure (Sect. 2), coefficient c also accumulates some of the error of the other two coefficients. It can be inferred that a value of c close to the value of 3 indicates fewer inconsistencies in the input data than the higher absolute values.

6 Discussion and Conclusions

In the available literature, variation in intensity close to the epicentre is regarded as a potential source of bias in the analysis (e.g., Bakun and Scotti 2006). The shortest distances from the epicentre, some 100 km, are of principal interest in applications such as ShakeMaps. Davis et al. (2000) demonstrated that an anomalous intensity was caused by geological focusing of seismic waves at the distance of 21 km from the epicentre. Hinzen and Oemisch (2001) attributed the observed irregularities in near-focal areas to source radiation and varying ground amplification conditions. Using the Kövesligethy–Sponheuer equation, Levret et al. (1994) concluded that, particularly in the near field, intensity attenuation is much more dependent on the depth of focus than on absorption by the soil.

Given such challenges, alternative approaches to the modelling of intensity attenuation have been proposed. For example, Magri et al. (1994) used a logistic model to estimate the probability that the attenuation exceeds a threshold value at a given distance from the epicentre. When intensity attenuation, ΔI, is regarded as a random variable, the probability of the site intensity becomes the convolution of the probability distribution of epicentral intensity, I0, and that of the intensity attenuation (e.g., Tsapanos et al. 2002). Rotondi and Zonno (2004) took intensity attenuation to be a random variable that follows the binomial distribution with parameters (I0, p), where p depends on the distance from the epicentre and is a Beta random variable according to the Bayesian paradigm. The probabilistic approaches avoid the construction of a local IPE and are often targeted at seismic-hazard analyses.

In the large majority of the available literature, a deterministic function and local intensity data are employed to investigate attenuation. One motivation for constructing an IPE is the need for a regional correlation between magnitude and area of perceptibility in order to estimate the size of historical earthquakes. A prevalent approach is to take attenuation to be a function of I0 and the distance of the site from the epicentre and to use an exemplary earthquake of the target region. Alternatively, only the distance from the epicentre and all the sites in the database are considered.

This investigation belongs to the latter type, which may be attractive in regions with small intensity databases. The basic elements of input are intensities that are equal to, or larger or smaller than, the other available values, the configuration of localities and the distances between them (e.g., Mäntyniemi et al. 2014). In the present paper, independence of azimuth was assumed. In real target regions, population centres remain fixed for many years, and randomness of the distance to them is created by different earthquake locations. The synthetic samples suggest that intensities spread widely along the radius of perceptibility are a property of good data. In real cases, for example an offshore earthquake, part of the distance and intensity range is unavailable to the analysis. The synthetic single earthquakes clearly showed that an IDP not far from the epicentre implies a good chance of obtaining a solution (Fig. 2). The distribution of settlements can influence the data (e.g., Musson 2005). For example, the IDPs of the Carlisle earthquake of 26 December 1979 have a gap in the distance range because of mountainous territory (Figs. 5b, 7a). In extreme cases, such features can affect the median distance. The synthetic and real databases of the present study revealed a clear effect of errors in magnitudes and focal depths on the IPE coefficients (Figs. 3, 4, 6, 7). They indicate that bias does not necessarily follow only from intensity assessments. Instrumental parameter determination and the collection of macroseismic data are parallel activities, and their combined results can produce outliers. Magnitude and location errors can affect the results, and a wrong choice of regional depth may affect the IPE coefficients. This differs from the findings of Sørensen et al. (2010), who concluded that the uncertainties in earthquake source parameters are negligible in comparison with the spread in the intensity data. Their investigation also included historical earthquakes in the magnitude range from Mw 6.3 to 7.0.

Figure 8 compares the present analysis with the IPE provided by Musson (2005). The latter was based on 727 isoseismals from 326 British earthquakes. These data also included historical earthquakes and covered the magnitude range of ML 2.0–6.1. The present evaluation is based on a much narrower magnitude range of 4.0–5.2, with a gap at 4.5–5.0. The coefficients used are those obtained by rounding up the uncertain intensities of the UK data (b = 1.4, c = 4.5, ν = 3.9; Sect. 5), the means from the data shaking experiment (Table 4) with the uncertain values upgraded (b = 1.47, c = 4.03 and ν = 3.81) and the means from the inclusion of magnitude errors (Fig. 6; 2.1 for b, 1.6 for c and 4.1 for ν). In the case of a magnitude of 4, all curves give a similar intensity up to 35 km, after which the curve related to the magnitude errors attenuates the fastest. In the case of a magnitude of 5, the differences between the present tests are small in comparison with the Musson (2005) equation, which gives systematically larger values beyond 50 km. According to the Musson (2005) equation, the attenuation of small intensities is very slow. There are no data beyond 300 km in the current dataset (Fig. 5), which argues for faster attenuation.

Fig. 8 Comparison of the Musson (2005) intensity prediction equation with the present tests for magnitudes ML 4 and 5 and a depth of 10 km. The uncertain intensities of the UK data were rounded up, the UK earthquakes were removed from the data one at a time (Table 4) and magnitude errors of ±0.2 were included

Synthetic data help to analyse IPEs, although they are idealized. The synthetic tests suggest that even modest numbers of IDPs give correct coefficients: the synthetic database included only four different magnitudes in a narrow range. There is no reason in principle not to use small-to-moderate magnitude earthquakes to construct IPEs in the absence of large earthquakes, but the errors in real data complicate the sound evaluation of the coefficients.

To conclude, this investigation calls attention to basic data properties. Sophistication is not an attribute of intensity, so it is proposed to investigate the effect of uncertain intensities in each dataset instead of using decimals. The synthetic data suggest that small-to-moderate earthquakes can be used in constructing IPEs. The performance of synthetic data gives a model with which the real data can be compared. The attenuation coefficient is insensitive to small magnitude errors, so a large variation in it may indicate the presence of large magnitude errors. An erroneously assumed regional depth may lead to unusual patterns of intensities as a function of depth. Intensity data should not be downloaded from the available databases without critical revision.
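
The proposal concerning uncertain intensities can be sketched as follows. The code assumes that an uncertain assessment such as 4–5 is stored as a pair of integers and produces two integer-valued variants of a dataset, one rounded up and one rounded down, so that the IPE coefficients can be estimated from each and compared. The data representation, the function names and the example values are hypothetical.

```python
def resolve_uncertain(intensity, mode):
    """Return an integer intensity.

    intensity: an int, or a (low, high) tuple for an uncertain value
               such as 4-5 (assumed representation).
    mode:      "up" selects the higher value, "down" the lower one.
    """
    if mode not in ("up", "down"):
        raise ValueError("mode must be 'up' or 'down'")
    if isinstance(intensity, tuple):
        low, high = intensity
        return high if mode == "up" else low
    return intensity


def both_variants(intensities):
    """Return the rounded-up and rounded-down versions of a dataset."""
    up = [resolve_uncertain(i, "up") for i in intensities]
    down = [resolve_uncertain(i, "down") for i in intensities]
    return up, down


# Hypothetical intensity data points; (4, 5) stands for an uncertain "4-5".
idps = [4, (4, 5), 3, (2, 3)]
rounded_up, rounded_down = both_variants(idps)
print(rounded_up)
print(rounded_down)
```

Fitting the same equation to both variants, rather than to values such as 4.5, shows directly how much of the variation in the coefficients is due to the uncertain assessments.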