1 Introduction

Earthquake engineering has traditionally focused on protecting the built environment and its inhabitants from the severe dynamic loading imposed by moderate and large earthquakes, in response to the destruction caused by such events. While smaller earthquakes have always been of interest to seismology since they provide insight into seismicity patterns and geophysical characteristics, the focus of earthquake engineering was generally limited only to defining the lower limit of magnitude, Mmin, to be considered in seismic hazard assessments undertaken to define earthquake design loads. Considerable effort was invested in responding to this question, based on both structural analyses and extensive review of empirical field data (EPRI 1989), leading to the definition of Mmin values close to 5 for critical infrastructure. Values commonly used in seismic hazard mapping and site-specific studies are generally in the range 4.5–5, the implicit assumption being that smaller earthquakes do not generate motions that could threaten structures designed for seismic resistance (Bommer and Crowley 2017). This in turn influenced the derivation of ground-motion prediction equations (GMPEs), which were usually calibrated for application to the higher range of magnitudes.

In recent years, three factors have generated greater interest in smaller magnitude earthquakes. One factor is the occurrence of destructive events that are below the threshold of what would generally be considered a damaging earthquake, a striking example of which was the Ischia earthquake in Italy in August 2017, which led to loss of life despite having a moment magnitude M of only 3.9. Another factor has been the development of the practice of seismic risk assessment of existing building stock, often with limited consideration for seismic resistance in the original design and construction, for insurance, risk mitigation, and emergency planning purposes. While it is reasonable to discount the possibility of an earthquake of, say, M4.5 damaging a newly built structure that has been designed against lateral loads, clearly such an assumption may not hold for existing buildings, as highlighted by cases such as the Ischia earthquake. The third factor influencing the heightened interest in small-to-moderate magnitude earthquakes is the potential risk posed by induced seismicity (e.g., Taylor et al. 2018), particularly since this is viewed as an imposed rather than natural hazard. Moreover, since induced seismicity can occur in regions that were previously of very low, or even null, seismic hazard, it can affect buildings designed and constructed without consideration of earthquake loads, which may also have deteriorated through age and lack of maintenance.

Addressing the risk posed by smaller earthquakes, and in particular induced seismicity, requires many aspects of conventional earthquake engineering to be adapted. For example, extrapolation of conventional GMPEs to smaller magnitudes has been shown to over-estimate the ground-motion amplitudes (e.g., Bommer et al. 2007), prompting development of GMPEs specifically calibrated to the target magnitude range (e.g., Atkinson 2015). The derivation of fragility functions also needs to account for the shorter durations and lower energy content of the motions expected from smaller earthquakes (Bommer et al. 2015). In view of the potential differences with respect to more conventional seismic risk studies, it is useful in such cases to be able to assess risk estimates against an objective frame of reference. Indeed, the work presented in this paper was originally prompted by efforts related to risk assessment due to induced seismicity in the Groningen gas field in the Netherlands (van Elk et al. 2019). In the period immediately following the ML 3.6 Huizinge earthquake in 2012 (the largest induced event in Groningen to date), some scenario-based risk assessments were issued with alarming estimates of casualties due to moderate earthquakes. To avoid the distress and consternation that can obviously result from such prognoses, it was viewed as being important to provide an independent framework against which such estimates could be evaluated.

The work undertaken and summarised in this paper is the compilation of a global database of earthquakes, both natural and induced, with magnitudes in the range from 4.0 to 5.5 for which damage and/or casualties have been reported. The upper magnitude limit is selected since there is no real doubt that earthquakes of greater size can be destructive, as testified, for example, by the M5.7 San Salvador earthquake of 1986 (e.g., Bommer et al. 2001). The lower limit—fixed at the outset of the project and prior to the Ischia earthquake—was selected on the basis of it being highly unlikely that smaller events could cause appreciable damage. Although these limits have been maintained, we also discuss some earthquakes that fall below the lower limit of the range.

The primary purpose of the database is to facilitate studies that can provide insights into the potential for earthquakes in the defined magnitude range to cause material damage to the built environment. For any individual earthquake of this size that caused damage, the only reliable way to determine the factors that contributed to the impact would be to fully characterise both the earthquake and the exposed environment, including local ground-motion recordings, surface geotechnical conditions and the fragility of the damaged structures (Nievas et al. 2019b). An alternative approach, which can provide complementary insights, is to investigate the relative frequency with which such small-to-moderate earthquakes cause damage, and to explore general patterns in these cases. The database summarised in this paper is focused on this second class of study and has indeed been used in statistical analyses of the numbers of earthquakes in the defined magnitude range that have coincided with human settlements (Nievas et al. 2019a). From the outset, however, it is important to have realistic expectations of such a database, which in aiming for breadth of coverage—by including as many reportedly damaging events as possible—inevitably has to compromise in terms of depth for the simple reason that the available information regarding many of the events is rather limited. Equally, it is very unlikely that such a database can be complete, especially towards the lower limit of the defined magnitude range, and the level of completeness is very difficult to determine. Notwithstanding these challenges and limitations, the database is a potentially valuable resource for obtaining greater understanding of the risk contribution of moderate magnitude earthquakes.

Following this introduction, Section 2 describes the compilation of the database in terms of the primary sources of information and the overall numbers of cases, including their geographical, temporal, and magnitude distribution. Section 3 discusses the main database fields and the decisions regarding the selection of a single parameter value or code for factors in the face of uncertainty. Factors scrutinised include depth, magnitude, whether the earthquake formed part of a sequence, the exposed population, and measures of the impact, with an appendix dedicated to examining at some length the issue of fatal heart attacks attributed to earthquakes. Section 4 of the paper presents some general statistics of the database, highlighting factors that contribute to the smallest earthquakes for which severe consequences have been reported. The paper ends with a brief discussion of the database and its potential applications as well as acknowledging its limitations, but also looking to how this resource may be extended and improved in the future.

2 Overview and composition of the database

2.1 Contents

At the time of writing, the Database of Damaging Small-to-Medium Magnitude Earthquakes consists of 1958 earthquakes with magnitudes in the range M4.0–5.5 that occurred from the year 1900 through 2017 for which reports of damage and/or casualties have been found. Their distribution in time is not uniform, with around 49.3% of the events having occurred in the 5 years between 2013 and 2017. At least one of the following criteria had to be met in order to include an event in the database:

  • Reports of at least one death or injury of any kind (slight or serious).

  • Reports of at least one building with some kind of damage to its structural or non-structural (but architectural) components. Cases in which the reports mentioned only fallen or broken china, bottles, or other contents did not qualify.

  • Reports of damaged infrastructure.

  • Reports of damage insurance claims.

  • Reports of losses expressed in financial terms (measured or estimated).

However, the event could be later excluded if:

  • It was part of an earthquake series with any events above M5.5 and it was not unambiguously clear which shocks caused the reported damage.

  • The damage and/or casualties were not a direct or indirect result of the earthquake. For example, explosions and mine collapses are often reported as earthquakes, and the casualties and losses related to them are usually a consequence of the explosion or the collapse itself and not of the generated ground shaking; such cases were excluded. However, if the earthquake was the cause of the damage, even if one of the consequences was the collapse of a mine, then it was included. It is noted that in many cases there is not enough information to establish whether the earthquake originated in a source external to the mine, with slip potentially induced by mining-related stress changes, or whether it originated in a collapse or explosion in the mine itself and only caused damage in the mine. The former would be included in the database, while the latter would not. Cases in which the damage or casualties were due to phenomena triggered by the earthquakes (such as landslides, for example) were included.
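The inclusion and exclusion rules listed above can be read as a simple decision procedure. The following is a minimal sketch of that reading, not the procedure actually used to build the database; the `Report` class, its field names, and the `qualifies` function are hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical summary of what the sources say about one earthquake."""
    casualties: bool = False             # at least one death or injury
    building_damage: bool = False        # structural or architectural damage
    contents_only: bool = False          # fallen china, bottles, etc.
    infrastructure_damage: bool = False
    insurance_claims: bool = False
    financial_losses: bool = False
    series_exceeds_m55: bool = False     # series contains an event above M5.5
    attribution_clear: bool = True       # clear which shock caused the damage
    earthquake_related: bool = True      # False if caused by, e.g., the explosion itself


def qualifies(r: Report) -> bool:
    """Apply the inclusion criteria first, then the two exclusion rules."""
    included = any((r.casualties, r.building_damage, r.infrastructure_damage,
                    r.insurance_claims, r.financial_losses))
    if not included:
        return False  # contents-only damage never qualifies by itself
    if r.series_exceeds_m55 and not r.attribution_clear:
        return False  # damage cannot be attributed to a shock within range
    return r.earthquake_related


print(qualifies(Report(casualties=True)))
print(qualifies(Report(contents_only=True)))
```

In practice, many of these flags cannot be set with certainty from the available reports, which is why the text above notes that mining-related cases are often ambiguous.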

The kind of information that was sought for the characterisation of each earthquake in the database can be summarised as follows:

  • Earthquake source parameters: hypocentral coordinates, magnitude, date and time of occurrence (UTC and local).

  • Maximum intensity, when available.

  • Population exposed to the ground shaking.

  • Number of people affected, evacuated, left homeless, injured, and killed, with causes when known.

  • Number of damaged and destroyed buildings, as well as monetary losses.

  • Flags to indicate the nature of the event (induced or tectonic), whether infrastructure was affected or not, the occurrence or not of landslides, and the occurrence or not of liquefaction.

  • Whether the earthquake belongs to a cluster, is a main shock, foreshock, or aftershock, and whether the consequences are expected to correspond only to the listed earthquake or may include those of other earthquakes in the series.

2.2 Data sources

While information regarding the consequences of destructive large magnitude earthquakes is relatively abundant, this is generally not true for their small-to-medium magnitude counterparts. The reasons for this are many, starting with a natural tendency both within the scientific environment and the media to invest resources in the assessment of events that have a more extreme impact on society. Only in the more recent years has the earthquake engineering community started to recognise that expected annual losses due to seismic events may be influenced by frequent shaking that causes small damage (e.g., Bazzurro and Luco 2007). Of no less importance is the fact that when smaller earthquakes occur in areas of high seismicity, their impact tends to be perceived as minimal by the population and little or no effort is invested in documenting their consequences. All this leads to the sources consulted for the present work varying in terms of both quality and completeness, as described below.

Already existing databases of damaging earthquakes have naturally been a very relevant resource for the present work. Created by different organisations and with different purposes, the following have been of particular interest:

  • The International Events Database (also referred to as the Emergency Events Database) of the Université Catholique de Louvain, Belgium (EM-DAT hereafter).

  • The Significant Earthquake Database of the National Centers for Environmental Information of the National Oceanic and Atmospheric Administration (NOAA) of the United States (NOAA hereafter) (NGDC).

  • The EXPO-CAT catalogue of human population exposure (Allen et al. 2009b) and the PAGER-CAT losses database (Allen et al. 2009a).

  • The Earthquake Impact Database (EID), a relatively recent initiative that describes itself as a community who “collects information and provides statistics about damaging earthquakes in the whole world”, in direct link with the work of the Earthquake-Report website (https://earthquake-report.com/).

A further source that may contain valuable information in this context is the CATDAT damaging earthquakes database (Daniell et al. 2011). However, it has not been possible to integrate it into the present work as it is not publicly accessible at the present time.

The EID is available online from the year 2013 onward as spreadsheet files. It differs from the sources listed above in that information regarding damage and casualties due to earthquakes occurring worldwide is gathered and published online in near-real time via a network of collaborators, their sources being the media, the reports left through its online questionnaire by people who have felt an earthquake, and relevant seismological institutions. The compilation of information in real time allows for even the smallest of events to be taken into consideration as the data remains fresh and accessible. As time passes after each earthquake, not only are media reports of small events buried under the immense flow of information of the Internet, but it also gets more difficult to distinguish the effects from different earthquakes in a sequence. When earthquakes happen within some hours of each other, making this distinction is impossible, but if they occur some days or weeks apart, immediate reports might be able to make it to some degree, while final reports usually make reference to the sequence as a whole. As a consequence, many databases do not have an entry per event but rather one report summarising all the observations collectively, while the EID appears to generate separate entries in each case, when possible. A comparison of the number of earthquakes contained in the EID, EM-DAT and NOAA databases for the years 2013–2017 reveals that the EID has, on average, 13 and 7 times the number of earthquakes contained in EM-DAT and NOAA, respectively, adding up to a total of 1721 entries for the 5 years (Nievas et al. 2019b). Given that the number of earthquakes in the EID is so large, a computer code was written in Python to make the process as automated as possible.
The main challenges of this automation were (i) identifying the earthquake to which each entry of the EID refers, knowing only the country, sometimes a region within it, the date (but not the time), and a magnitude in an unspecified scale, and (ii) retrieving as much information as possible from a database format not conceived for automatic processing. The consequences of each earthquake from the EID, namely, the number of fatalities, injuries, homeless, damaged buildings, and destroyed buildings, were taken at face value. Details on the processing of data from the EID are given in Appendix 1.

Though not organised as databases of damage and casualties, the earthquake catalogue of the United States Geological Survey (USGS), the Bulletin of the International Seismological Centre (ISC), and an extract of the Preliminary Determination of Epicenters (PDE) Bulletin of the USGS for the period 1968–2011 (K. Jaiswal, pers. comm.) also served as important sources of information. Their entries usually consist of one or two sentences describing the consequences, providing numbers when available. Phrases such as “minor damage in [place]”, “some buildings damaged at [place]”, or “buildings damaged or destroyed in [place]” are common. Consequences reported in the ISC Bulletin usually correspond to contributions from the USGS itself, but many cases have been found in which a comment regarding damage occurrence is attributed to the USGS and available in the ISC Bulletin but not on the website of the USGS. Local agencies relevant to specific earthquakes were consulted as well, as many publish on their websites brief descriptions or reports on damaging events that occurred within their area of influence.

Scientific journal papers and reports are, undoubtedly, another source of relevant information, despite their natural focus on larger events or exceptionally damaging small ones. Apart from those focusing on a particular earthquake or earthquake sequence, extensive compilations of damage descriptions for complete earthquake catalogues and periodical summaries of observed seismicity were extremely useful as well (e.g., Part C of the EKDAG earthquake catalogue for Germany and adjacent areas, Schwarz et al. 2010; list of Peruvian earthquakes by the National Civil Defence Institute of Peru, INDECI; the newsletters of the Society for Earthquake and Civil Engineering Dynamics, SECED).

While not always rigorous from a scientific perspective, newspaper articles were fundamental for the compilation of this database. They are a good source of detail for the cases of earthquakes for which more succinct data can be found in larger impact databases, and may be the only source of information regarding damage and casualties for many smaller earthquakes. As the latter are not usually reported by the international media, local newspapers are of paramount relevance as well, language then becoming the main issue to be addressed. The amount and quality of information in the present work may, in this sense, be biased towards earthquakes that occurred in areas of the world where English, Spanish, Italian, French, Portuguese, or German are spoken. We were able to gain access to texts in Serbian, Russian, Greek, Hindi, Nepalese, and Chinese in a more sporadic fashion, thanks to collaborators to whom we are grateful, which extended the geographical coverage. It is for this reason as well that the EID and other online services were such relevant sources of information for this work, precisely for their capacity to collect information on the consequences of so many more earthquakes than any other earthquake consequence database, thanks to their network of international collaborators.

Within the broader digital domain, summaries of catastrophe relief actions from ReliefWeb, the digital service of the United Nations Office for the Coordination of Humanitarian Affairs (UN OCHA), were very valuable, together with the more recent work of Earthquake-Report, which aims at reporting in near-real time on the consequences of all earthquakes that occur worldwide, irrespective of their magnitude or the extent of the resulting damage. These websites are not organised as databases with fields of data to be searched and retrieved, but as a collection of reports on events. Belonging to the UN, ReliefWeb usually provides good descriptions of the extent of the consequences observed, since this provides guidance for requesting and coordinating international assistance to earthquake-hit areas.

Last but not least, personal blogs and social media (e.g., Twitter) often provided information on damage and casualties when better data was not available. The authors of these sources can sometimes be (governmental or not) emergency response and/or seismological agencies, as well as scientists.

As can be observed, the range of sources used for the compilation of this database is rather diverse. These sources differ in their levels of reliability and, in most cases, that reliability cannot be determined (see, for example, the discussion in Appendix 1 on taking consequences reported by the EID at face value). Limiting a database such as this one to only include data that can be fully verified would undoubtedly yield a much smaller and more incomplete list of events than is currently included within it, and would defeat the purpose of its compilation. It is for this reason that we strongly support initiatives like the EID and Earthquake-Report for gathering these data while they are still available and preserving them for the future, and encourage the earthquake engineering community to increase its efforts in this regard.

2.3 Geographic distribution

Figure 1 shows the epicentral locations of the 1958 earthquakes currently present in the database and reveals that the geographical distribution of events that make up the Database of Damaging Small-to-Medium Magnitude Earthquakes follows, in general, the patterns of global seismicity. However, two observations can be made. Firstly, that areas that feature high seismicity rates but very low population density, such as, for example, Alaska, Tierra del Fuego, and the Kurile Islands, are largely absent in the database, clearly due to their very low exposure. Secondly, that lower-seismicity areas acquire greater prominence in the database (see, for example, north-eastern USA and Brazil), most likely due to the higher social impact of smaller events in areas for which seismic shaking is perceived as unusual, as well as a likely higher vulnerability of the building stock (broadly speaking, notwithstanding socio-economic factors).

Fig. 1 The Database of Damaging Small-to-Medium Magnitude Earthquakes. Red rhombuses and blue circles indicate years 1900–2012 and 2013–2017, respectively

2.4 Distribution in time

The number of earthquakes in the database is not distributed uniformly across its whole duration, as illustrated by Fig. 2 (left). With the number of reported earthquakes per decade visibly increasing in time, it is clear that the database is not complete and is subject to the inherent limitations of data accessibility that are common both to earthquake catalogues in general and damage databases in particular. A first notable jump occurs around 1960 and is likely due to the establishment of the World-Wide Standardized Seismograph Network (WWSSN) around that time, as well as the creation of the International Seismological Centre (ISC) in 1964, both of which facilitated the systematic processing of large volumes of earthquake data (Adams 2010). Another distinct jump can be observed at the beginning of the 2000s, when the ability to communicate and access news and reports was significantly enhanced by the global embrace of online technologies. It is possible that this incompleteness of the database may be reflected as well in the geographical coverage discussed in the previous section and the magnitude distribution presented in the following section, as well as, naturally, in all analyses derived from this data.

Fig. 2 Distribution of dates of the earthquakes that make up the whole of the Database of Damaging Small-to-Medium Magnitude Earthquakes. The plot on the right shows the period 2010–2017 in detail, with numbers stemming from the EID and all other sources indicated over the bars

The plot on the left of Fig. 2 reveals as well that events incorporated from the Earthquake Impact Database (EID) by means of the procedure described in Appendix 1 (shortened as earthquakes “from the EID” hereafter) represent around half of all 1958 events: 868 earthquakes were incorporated in such a way, while the remaining 1090 events come from all the other sources considered. The plot on the right of Fig. 2 further illustrates the noticeable impact of the EID as a dataset and raises the question of whether the rate of around 190 damaging earthquakes per year observed for 2013–2017 could be assumed to be the same as for the period before 2013, simply masked by the difficulties in retrieving the corresponding information. If this were true, it would imply, for example, that less than 20% of the damaging earthquakes of the 2000s may have been captured in the database, with this percentage reducing further back in time. While the fact that many of the events incorporated from the EID are reported to have caused only non-structural or very limited structural damage gives the impression of high levels of completeness, it is not possible to guarantee that all damaging earthquakes have indeed been captured by the EID. Moreover, around 6% of the entries of the EID were not incorporated into the present database due to the impossibility of matching them with earthquakes reported in the ISC Bulletin, as explained in Appendix 1. As a consequence, it is not possible to quantify how close this average rate of 190 damaging earthquakes per year is to reality.
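As a consistency check on the figures quoted in the text, the short calculation below reproduces the EID share of the database and the approximate annual rate for 2013–2017 from the totals given above and the 49.3% share stated in Section 2.1. It is only an arithmetic sketch; the variable names are illustrative.

```python
TOTAL_EVENTS = 1958
FROM_EID = 868
FROM_OTHER_SOURCES = 1090
SHARE_2013_2017 = 0.493  # fraction of all events that occurred in 2013-2017
YEARS = 5

# Share of the database incorporated automatically from the EID
eid_share = FROM_EID / TOTAL_EVENTS

# Approximate number of events in 2013-2017 and the implied annual rate
events_2013_2017 = SHARE_2013_2017 * TOTAL_EVENTS
annual_rate = events_2013_2017 / YEARS

print(f"EID share of the database: {eid_share:.1%}")
print(f"Events in 2013-2017: about {events_2013_2017:.0f}")
print(f"Average annual rate: about {annual_rate:.0f} per year")
```

The result, a little over 190 events per year, agrees with the rounded rate quoted in the text, and the EID share comes out at roughly 44% of all events.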

2.5 Magnitude distribution

Since magnitudes are often reported to just one decimal place (e.g., in the ISC Bulletin), the range M4.0–5.5 was effectively translated into 3.95 ≤ M < 5.55, as 3.95 would be rounded up to 4.0 and values approaching but still below 5.55 would be rounded down to 5.5. Accordingly, in Fig. 3, which illustrates the distribution of moment magnitudes of the earthquakes that make up the whole database, events in the range 3.95 ≤ M < 4.00 were assigned to the first bin and events in the range 5.5 < M < 5.55 to the last bin. The rest of the bins in the plot include their lower boundary but not their upper one. The plot was generated considering the primary value of M used to define the inclusion or not of the earthquakes in the database, as will be explained in Section 3.2.
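The binning convention just described can be sketched as follows. This is a minimal illustration that assumes bins of width 0.25 starting at 4.00 (as suggested by the bins named in the text); the names `EDGES`, `LABELS`, and `magnitude_bin` are hypothetical and do not come from the database or its processing code.

```python
import bisect

# Bin edges: the first and last bins are widened to absorb 3.95 <= M < 4.00
# and 5.50 < M < 5.55, respectively (assumed bin width of 0.25).
EDGES = [3.95, 4.25, 4.50, 4.75, 5.00, 5.25, 5.55]
LABELS = ["4.00-4.25", "4.25-4.50", "4.50-4.75",
          "4.75-5.00", "5.00-5.25", "5.25-5.50"]


def magnitude_bin(m):
    """Return the label of the bin containing m, or None if out of range.

    Each bin includes its lower boundary but not its upper one.
    """
    if not (EDGES[0] <= m < EDGES[-1]):
        return None
    index = bisect.bisect_right(EDGES, m) - 1
    return LABELS[index]


for m in (3.95, 4.25, 5.0, 5.5, 5.54, 5.55):
    print(m, magnitude_bin(m))
```

Using `bisect_right` places a value that coincides with an edge in the bin that starts at that edge, which matches the "lower boundary included" rule.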

Fig. 3 Distribution of moment magnitudes of the whole of the Database of Damaging Small-to-Medium Magnitude Earthquakes. Direct = M directly calculated from inversion (retrieved from seismological agencies, scientific literature, etc.). Retrieved proxy = proxy M retrieved from existing catalogues. Calculated proxy = proxy M calculated from other magnitude scales within the context of this work. See explanation in Section 3.2

The number of damaging earthquakes in each magnitude bin increases steadily and reaches its maximum within the 5.25–5.50 bin of Fig. 3. This is not surprising, considering the larger damage potential of larger magnitude earthquakes (notwithstanding the influence of distance, site effects, vulnerability, and other factors). However, it is interesting to note that the relative increase from the 5.00–5.25 to the 5.25–5.50 bin is very small (around 5%) compared with the relative increases between all other consecutive bins up to that point. Moreover, when excluding the earthquakes in the range 5.5 < M < 5.55 that were included in the 5.25–5.50 bin (making this bin wider, as described above), the number of earthquakes actually reduces (slightly) from the 5.00–5.25 to the 5.25–5.50 bin. This suggests that the Gutenberg-Richter relationship (Gutenberg and Richter 1944) takes over for magnitudes above around 5.25, as there occur in the world, theoretically, around 1.8 times as many earthquakes in the range 5.00–5.25 as in the range 5.25–5.50 in a given period of time, assuming a b value of 1.0. In other words, Fig. 3 shows that the higher likelihood of larger magnitude earthquakes to cause damage or casualties is counterbalanced by the decreasing frequency with which these events occur.
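The factor of 1.8 follows directly from the Gutenberg-Richter relationship, log10 N(≥M) = a − bM: the expected number of earthquakes in a bin [M1, M2) is proportional to 10^(−b·M1) − 10^(−b·M2), so the ratio between two consecutive bins of equal width w is 10^(b·w), independently of a. The sketch below evaluates this for the two bins in question; the function name is illustrative.

```python
def gr_count_ratio(m_low, width, b=1.0):
    """Ratio of expected earthquake counts in [m_low, m_low + width) to those
    in [m_low + width, m_low + 2 * width), assuming log10 N(>=M) = a - b * M.
    The a value cancels out of the ratio.
    """
    def relative_count(lo, hi):
        return 10 ** (-b * lo) - 10 ** (-b * hi)

    return (relative_count(m_low, m_low + width)
            / relative_count(m_low + width, m_low + 2 * width))


ratio = gr_count_ratio(5.00, 0.25, b=1.0)
print(f"Ratio of counts, 5.00-5.25 vs 5.25-5.50: {ratio:.2f}")
```

For b = 1.0 and a bin width of 0.25 the ratio equals 10^0.25, that is, approximately 1.78.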

The two plots in Fig. 4 are analogous to Fig. 3 but focus on the events obtained from sources other than the EID (left) and on those processed automatically from the EID (right). It is interesting to note that, while the plot on the left of Fig. 4 follows a similar trend to that of Fig. 3, the plot on the right of Fig. 4 reaches a peak within the 4.75–5.00 bin instead and features a much more uniform distribution. Moreover, the number of earthquakes per bin decreases above magnitude 5.00 for this plot. This is most likely due not only to the capacity of the EID to pick up on smaller events, but also to the fact that larger events are more frequently reported elsewhere (i.e., they were already present in the plot on the left before the systematic incorporation of events from the EID according to the process described in Appendix 1).

Fig. 4 Distribution of magnitudes of the earthquakes that make up the Database of Damaging Small-to-Medium Magnitude Earthquakes and were not automatically processed from the EID (left), and those from the EID (right). Direct, Retrieved proxy, Calculated proxy as per description of Fig. 3

3 Details on particularly relevant database fields

The Database of Damaging Small-to-Medium Magnitude Earthquakes comprises around 50 fields that describe each earthquake with regard to mainly its source parameters and its consequences. While a brief summary of the kind of contents of these fields has been provided at the end of Section 2.1, specific parameters of interest are discussed in depth herein. For a complete enumeration and description of the database fields the reader is referred to the READ ME tab of the database itself and Appendix 1 of the report by Nievas et al. (2019b).

3.1 Earthquake location

Hypocentral coordinates (latitude, longitude, depth) as well as date and time of occurrence (UTC) were mostly retrieved from the USGS, the ISC Bulletin or local agencies of relevance. However, there are cases, especially for relatively old events, for which the coordinates reported have their origin in journal papers or any of the other sources considered herein, when the earthquakes could not be found listed by seismological agencies. While epicentral distance and hypocentral depth play a major role in the ground motion levels observed and the distances up to which the earthquake may generate consequences (both affecting the length of the travel path, and depth having an influence on the size of the stress drop), homogenisation of earthquake locations was not an objective of the present database, which focuses on consequences. With depth being the most difficult parameter to constrain within this context (a fact that is particularly true in the case of small-to-medium magnitude earthquakes; e.g., Letort et al. 2014), the database contains several cases of supposedly deep yet reportedly damaging earthquakes. A brief discussion of the potential causes of this and an analysis of a subset of 213 of these events can be found in Nievas et al. (2019a).

The fields “Est. Local Date” and “Est. Local Time” contain estimates of the local date and time of occurrence of the earthquakes obtained from the UTC dates and times by means of Google’s Time Zone API. As per the corresponding documentation, these are expected to be accurate and to account for daylight saving time from 1 January 1970 onward, and to be of more limited accuracy before this date. Coordinates corresponding to the closest onshore location were used to retrieve the local time of epicentres located within seas and oceans, as the Time Zone API cannot identify the time zone in those cases. These cases are indicated as “TZ API near” in the “Est. Local Case” field. A few cases in which the Time Zone API caused a small mismatch in the minutes (e.g., a 2-min difference) were adjusted manually and flagged as “TZ API adj” in the “Est. Local Case” field. It is noted that potential differences with real local times are not expected to exceed two hours.
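The conversion from UTC to an estimated local date and time can be sketched as below. This is not the code used for the database and it does not call the Time Zone API; it assumes that the standard and daylight-saving offsets, in seconds, have already been obtained for the epicentre and origin time (the Time Zone API returns such values in its `rawOffset` and `dstOffset` response fields). The function name and the example timestamp are illustrative.

```python
from datetime import datetime, timedelta, timezone


def estimate_local(utc_dt, raw_offset_s, dst_offset_s):
    """Shift an aware UTC datetime by the standard offset plus the
    daylight-saving offset, both given in seconds."""
    offset = timedelta(seconds=raw_offset_s + dst_offset_s)
    return utc_dt.astimezone(timezone(offset))


# Illustrative origin time in UTC, with offsets typical of central European
# summer time (UTC+1 standard, plus 1 h of daylight saving).
utc_time = datetime(2017, 8, 21, 18, 57, tzinfo=timezone.utc)
local_time = estimate_local(utc_time, raw_offset_s=3600, dst_offset_s=3600)

print("Est. Local Date:", local_time.date().isoformat())
print("Est. Local Time:", local_time.time().isoformat(timespec="minutes"))
```

Because the shift is applied to the full timestamp, the estimated local date changes correctly when the offset carries the time past midnight.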

3.2 Earthquake magnitude

The magnitude scales compiled in the database were moment magnitude (M), surface-wave magnitude (Ms), body-wave magnitude (mb), and local magnitude (ML). The value of moment magnitude reported in the first column of the database was used to decide whether or not to include the earthquake in the database, while values reported in the field “Mw (alt)” are alternative values found. As there are cases in which the scale of the magnitude reported is not clear (e.g., events from the EID), or in which the sources themselves specify that the scale is unknown (e.g., some events in the NOAA database), an additional field was created for unknown magnitude scales, Munk. The magnitude values reported by the EID were always included in this Munk field. Estimates of moment magnitude (as well as Ms, mb, and ML) were retrieved mostly from the ISC Bulletin, the USGS catalogue, the ISC-GEM catalogue (v4.0) (Storchak et al. 2013), and the world catalogue of Weatherill et al. (2016) (v3.0c), referred to as WPG16* hereafter. The Parametric Catalogue of Italian Earthquakes (CPTI15 v1.5; Rovida et al. 2016), the Italian web portal of macroseismic intensities of the INGV (Tosi et al. 2015), the catalogues of the Spanish National Geographic Institute (Instituto Geográfico Nacional, IGN), the Mexican National Seismological Service (Servicio Sismológico Nacional, SSN), the French seismic catalogue (FCAT-17; Manchuel et al. 2018), the Colombian Geological Service (Servicio Geológico Colombiano, SGC), Geoscience Australia (GA), the China Earthquake Networks Center (CENC; Mignan et al. 2013), the earthquake catalogue for Germany and adjacent areas (EKDAG; Schwarz et al. 2010), and the earthquake database of the British Geological Survey (BGS) were also consulted when necessary.

The field “Mw Case” of the database indicates the source of the reported moment magnitudes. Whenever acronyms of seismological agencies are indicated (a relatively complete list can be viewed on the website of the ISC), it is implied that the values contained in the database are those reported in the ISC Bulletin and attributed to those agencies, with the exception of the catalogues and agencies listed above. In the case of values of moment magnitude stemming from the ISC-GEM, the WPG16*, the CPTI15, or the FCAT-17 catalogues, the words “direct” and “proxy” indicate whether they are directly calculated values or estimated from other magnitude scales or observations of macroseismic intensities, respectively. Whenever it was possible to find neither a direct nor a proxy value of M, empirical conversion models were used to make an estimate of M by means of one of the values of magnitude available in other scales, with order of preference of Ms over mb over ML over Md as source value. For Ms and mb, linear piece-wise models that average those of Scordilis (2006), the Generalised Orthogonal Regression models of Di Giacomo et al. (2015), and those of Weatherill et al. (2016) were used, while a one-to-one equivalence was assumed for ML = Md = M. Details on these average models and assumptions are explained in Nievas et al. (2019a). These cases are indicated in the field “Mw Case” of the database by means of the text “Converted from [original scale]=[value in original scale]”.
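The order of preference for the source scale described above can be sketched as follows. This is a minimal illustration and not the code used to compile the database: the function name `convert_to_M` and the `models` argument are hypothetical, and the averaged piece-wise models for Ms and mb (detailed in Nievas et al. 2019a) are not reproduced here, so the caller has to supply them.

```python
from typing import Callable, Dict, Optional, Tuple

# Order of preference of the source scale, as stated in the text.
PREFERENCE = ("Ms", "mb", "ML", "Md")


def convert_to_M(
    magnitudes: Dict[str, float],
    models: Dict[str, Callable[[float], float]],
) -> Optional[Tuple[float, str]]:
    """Return (estimated M, "Mw Case" text), or None if no scale is available.

    `models` maps "Ms" and "mb" to conversion functions. These stand in for
    the averaged models of the paper, which are NOT given here (assumption).
    ML and Md are taken as equal to M (one-to-one equivalence).
    """
    for scale in PREFERENCE:
        value = magnitudes.get(scale)
        if value is None:
            continue
        if scale in ("ML", "Md"):
            estimate = value
        else:
            # Raises KeyError if the caller did not supply a model.
            estimate = models[scale](value)
        return estimate, f"Converted from {scale}={value}"
    return None
```

For instance, `convert_to_M({"ML": 4.3, "Md": 4.1}, {})` uses ML because it ranks above Md, and returns the value together with the text that would be written in the “Mw Case” field.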

Though seemingly simple, determining whether an earthquake lies in the range M4.0–5.5 is not always straightforward, for three main reasons. Firstly, seismic moments are not routinely calculated for smaller earthquakes. Secondly, different estimates of moment magnitude might be available for the same earthquake, some of which might lie within the range and others outside it. Thirdly, proxy moment magnitude values retrieved from the ISC-GEM, the WPG16*, the CPTI15, or the FCAT-17 catalogues or calculated specifically for this work have uncertainties associated with the variability of the models used to determine them and the quite frequent lack of data used to derive them in the lower magnitude range.

A series of criteria were defined in order to deal with these issues. Firstly, values of M directly calculated from inversion were always preferred over proxy values of M retrieved from the literature or existing catalogues, which were, in turn, preferred over the use of empirical conversion models. Secondly, hierarchies were established in order to deal with lack of agreement in the values of M. Due to its global coverage and its longevity as a dataset, the M estimates from the Global Centroid Moment Tensor catalogue (GCMT; Dziewonski et al. 1981; Ekström et al. 2012) were preferred, when available, over all other estimates, in line with the ISC-GEM catalogue (Storchak et al. 2013) and the work of Weatherill et al. (2016), who took it as the reference scale for harmonisation (which also means that the empirical conversion models used by the two aimed at converting Ms and mb into a GCMT-equivalent M, retrieved herein as proxy M when needed and available). Estimates of M by the USGS were taken in second order of preference, as they scale equivalently to those of the GCMT, albeit with some variance, as shown by Weatherill et al. (2016). Rigorously compiled local catalogues such as the Parametric Catalogue of Italian Earthquakes (CPTI15 v1.5; Rovida et al. 2016) took precedence over the USGS when available, though they were still considered after GCMT. In more general terms, large international organisations were prioritised over smaller local ones. In the case of proxy M values, the order of preference was ISC-GEM over WPG16* over local catalogues (CPTI15 for Italy and FCAT-17 for France). Similar criteria were applied to magnitude estimates in other scales used to estimate M by means of conversion models whenever direct or proxy M values were not readily available in existing catalogues/sources: values of Ms and mb calculated by the ISC or, if not available, by the USGS were preferred over those calculated by local agencies. Local agencies relevant to the area in which the earthquakes occurred were always prioritised over local agencies from other areas. Detailed studies published in the scientific literature were preferred over magnitude estimates reported by agencies, as the latter often result from automatic calculations while a study focused on the determination of magnitude is expected to be able to take into consideration information unavailable to automatic processing algorithms.
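A simplified sketch of how such a hierarchy can be applied to a list of candidate estimates is given below. It covers only part of the criteria (direct over proxy over converted values; GCMT, then rigorous local catalogues, then the USGS for direct values; ISC-GEM, then WPG16*, then local catalogues for proxy values) and leaves out the rules concerning literature studies and local agencies. The names `Estimate`, `sort_key`, `preferred`, and the label "local_catalogue" are hypothetical.

```python
from typing import List, NamedTuple, Tuple

KIND_RANK = {"direct": 0, "proxy": 1, "converted": 2}
DIRECT_RANK = {"GCMT": 0, "local_catalogue": 1, "USGS": 2}
PROXY_RANK = {"ISC-GEM": 0, "WPG16*": 1, "local_catalogue": 2}


class Estimate(NamedTuple):
    value: float
    kind: str    # "direct", "proxy" or "converted"
    source: str  # e.g. "GCMT", "USGS", "ISC-GEM"


def sort_key(estimate: Estimate) -> Tuple[int, int]:
    """Lower keys are preferred; unlisted sources rank last within a kind."""
    if estimate.kind == "direct":
        table = DIRECT_RANK
    elif estimate.kind == "proxy":
        table = PROXY_RANK
    else:
        table = {}
    return KIND_RANK[estimate.kind], table.get(estimate.source, len(table))


def preferred(estimates: List[Estimate]) -> Estimate:
    """Return the estimate that ranks first under the simplified hierarchy."""
    return min(estimates, key=sort_key)
```

Encoding the hierarchy as a sort key keeps the two levels of the decision (kind of estimate first, source second) explicit and easy to extend with further rules.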

It has been observed that there are cases for which the value of moment magnitude obtained from the seismic moment (Mo) reported by the GCMT might be, for example, M5.57 but is reported by the GCMT itself as M5.5, that is, truncated to the first decimal place instead of rounded. For the sake of consistency with the lower- and upper-bound limits defined for this work, the values reported in the Database of Damaging Small-to-Medium Magnitude Earthquakes are those obtained from the seismic moment (\( \mathbf{M}=\frac{2}{3}\left[\log_{10}\mathrm{M}_{\mathrm{o}}-9.05\right] \), with Mo in N·m; Hanks and Kanamori 1979).
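The difference between truncating and rounding can be illustrated with a short calculation based on the Hanks and Kanamori (1979) relation quoted above. The seismic moment used below is a hypothetical value chosen so that M is close to 5.57; it does not correspond to any particular earthquake in the database.

```python
import math


def moment_magnitude(seismic_moment_nm: float) -> float:
    """Moment magnitude from seismic moment in N·m (Hanks and Kanamori 1979)."""
    return (2.0 / 3.0) * (math.log10(seismic_moment_nm) - 9.05)


# Hypothetical seismic moment (N·m), for illustration only.
mo = 2.54e17
m = moment_magnitude(mo)

rounded_two_decimals = round(m, 2)
truncated_one_decimal = math.floor(m * 10) / 10

print(rounded_two_decimals, truncated_one_decimal)
```

With an upper-bound limit of M5.5, the truncated value would place this hypothetical event within the range while the value calculated from the seismic moment would not, which is why the latter is used consistently.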

3.3 Maximum macroseismic intensity

Ideally, one would want to know the peak ground and spectral acceleration levels at the affected sites, so as to paint a relatively complete picture of the destructive potential of the earthquake. Additional parameters such as significant duration would be of great interest as well. However, such information is rarely available for the kind of earthquakes contained in the present database. Moreover, even if it were, it would be relatively complex to convert into a series of simple database fields. Ground motions are a spatial property and, consequently, it would be fundamental to be able to convey not just acceleration values but also the positions at which the reported acceleration was measured with respect to the source and the affected area. It would, of course, not be impossible, though the lines between a database and a complete report would then become more blurred. A detailed assessment of a series of case-study earthquakes, many of which have detailed information regarding the ground-motion field, has been carried out in parallel to this work and can be found in a report by Nievas et al. (2019b).

The database itself contains a field in which the maximum macroseismic intensity is reported, when available. While this is clearly not a direct measure of ground motion itself and is rather inferred from observations that are influenced by exposure and vulnerability, it is often readily available and simple to report. The intensity listed is either that reported by the sources on the basis of observations or that taken from USGS ShakeMaps (Worden and Wald 2016).

3.4 Clustering of events and interdependence of consequences

One of the largest difficulties associated with studying the consequences of earthquakes is that of discriminating the amount of damage or number of casualties caused by individual events in the presence of other shocks that occur closely spaced in time. It is for this reason that determining whether an earthquake is part of a cluster or not helps to understand the situation and infer if pre-weakening or progressive damage phenomena may have made the consequences more severe than they otherwise would have been due to the single event alone. The database has two fields dedicated to this aim: “Clustering”, in which the earthquake is classified as being a main shock (MS), a foreshock (FS), an aftershock (AS), or part of a swarm (SWARM), and “Consequences”, in which direct comments are made regarding the extent to which the consequences listed may have been influenced by other events.

Whenever possible, the status of each earthquake regarding its potential participation in a cluster was determined manually, by observing a sufficiently long time window of seismicity in the area of interest, or from the existing literature. This information was complemented with the results obtained from declustering the WPG16 catalogue using the declustering algorithm of Gardner and Knopoff (1974) as implemented in the OpenQuake Hazard Modeller’s Toolkit (Pagani et al. 2014; Weatherill 2014). Square brackets in the fields “Clustering” (e.g., [AS]) indicate that the classification is that resulting from the declustering algorithm and no manual verification has been carried out.
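The window-based idea behind this kind of declustering can be sketched as follows. The coefficients below are a commonly quoted analytical approximation to the space and time windows of Gardner and Knopoff (1974); they are included here as an assumption, are not taken from the present work, and the sketch does not reproduce the OpenQuake implementation. The function names are hypothetical.

```python
from typing import Tuple


def gk_windows(magnitude: float) -> Tuple[float, float]:
    """Approximate distance (km) and time (days) windows around a shock.

    Coefficients are an assumed approximation to Gardner and Knopoff (1974).
    """
    distance_km = 10.0 ** (0.1238 * magnitude + 0.983)
    if magnitude >= 6.5:
        time_days = 10.0 ** (0.032 * magnitude + 2.7389)
    else:
        time_days = 10.0 ** (0.5409 * magnitude - 0.547)
    return distance_km, time_days


def within_window(main_magnitude: float, distance_km: float, days_after: float) -> bool:
    """True if a later event falls inside the windows of a larger earlier shock."""
    max_km, max_days = gk_windows(main_magnitude)
    return distance_km <= max_km and 0.0 <= days_after <= max_days
```

An event flagged in this way would be a candidate aftershock of the larger shock; as the windows are fixed functions of magnitude only, such a classification is what the square brackets in the “Clustering” field are meant to mark as unverified.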

The “Consequences” field may contain phrases such as “additional damage”, if the damage reported has occurred in an area previously damaged by a preceding event, “possibly of many”, when several earthquakes have occurred within a short period of time but the sources do not report consequences individually for each of them, “seem separate”, if the consequences seem to be due only to the earthquake being listed, etc. Descriptions of the impact of earthquakes within the USGS catalogue and/or the ISC Bulletin sometimes include expressions such as “additional damage” or specify that the consequences of a certain event are included in those of another one. This information was retrieved whenever possible.

The limitations of declustering algorithms (e.g., being agnostic to structural geology and assuming spatial and temporal windows) and inherent difficulties associated with determining the extent of the interdependence between the consequences of different earthquakes that have occurred closely spaced in time prevent both the compilation and interpretation of these two database fields from being a straightforward task. The user is, thus, advised to handle the contents of these fields with caution.

3.5 Nature of the event

The anthropogenic origin of earthquakes classified as induced or triggered is often very clear and without controversy, but in other cases, the distinction between natural and induced seismicity can be much more ambiguous and an issue of contention. Approaches for determining whether seismic activity was related to anthropogenic activities range from mainly qualitative assessments based on series of simple questions (e.g., Davis and Frohlich 1993; Verdon et al. 2019) to detailed physics-based calculations of probabilities (e.g., Dahm et al. 2015). Whenever possible, suspected anthropogenic origins have been indicated in the database by means of the “Induced Flag” field, relying mainly on existing evaluations rather than attempting to make such assessments as part of this work. Sources have been mostly scientific publications, together with comments included within the ISC Bulletin. The Human-induced Earthquakes Database (HiQuake, Foulger et al. 2018) has been of significant help in this regard as well. Comments regarding the potential anthropogenic origin of the earthquakes listed in the EID were used too. Cases for which the earthquakes were not flagged in the EID as having an anthropogenic origin but whose location was the same as other earthquakes in the EID that were flagged, as well as earthquakes that correspond to areas of known induced seismicity (e.g., Oklahoma, USA), were considered to be anthropogenic too.

3.6 Population exposure

The field “Exposed Population” contains, in most cases, the population estimated to have been exposed to Modified Mercalli Intensities equal to or larger than IV, according to EXPO-CAT (Allen et al. 2009b) or PAGER (Wald et al. 2008). These numbers are not observations but estimations based on intensity prediction equations and models for population distribution. In some cases, particularly for older events, the values are those reported by the sources as being the population of the most affected localities. This field gives an idea of the number of people that could have been potentially affected, the severity of the consequences of the earthquake being directly linked to the proportion of people injured/killed/affected to the total number of people expected to have been exposed.

3.7 Damage and monetary losses

Consequences to the physical environment are reported in terms of damaged buildings, destroyed buildings, and economic losses, as well as a series of flags to indicate the occurrence of landslides and liquefaction, and whether infrastructure was affected. In the context of the database, landslides can include landslides, rockslides, mudslides, and snow slides. Affected infrastructure can include damaged roads, bridges, and/or dams, damaged lifelines, as well as simple reports of interruption of services, even if the causes are unknown, as sources are often not explicit regarding the kind of the interruption observed.

Damage to buildings is reported in two different fields, labelled “Buildings Damaged” and “Buildings Destroyed”. These two categories might seem too few in contrast with detailed damage scales, such as the European Macroseismic Scale EMS-98 (Grünthal 1998), which make it possible to differentiate, for example, between some hairline cracks in a few walls and large extensive cracks in many walls, two cases that would both fall under the “damaged” category of the present database. The reason for using the simple damaged vs. destroyed distinction herein is the lack of more detailed information for most of the earthquakes that make up the database. Sources like the USGS and NOAA use “damaged” and “destroyed” as their two categories, though the former may sometimes include descriptors such as “minor”, “extensive”, or similar. Moreover, sources sometimes state “X buildings damaged or destroyed”; in these cases, the number X was written under “Buildings Damaged” while “Some of” was written under “Buildings Destroyed” to indicate that some of the buildings listed as damaged were actually destroyed. The proportion of earthquakes for which data is available in terms of standard damage scales or more detailed verbal descriptions is very small and renders the consistent use of standard damage scales impossible for the compilation of this database. As specific numbers of damaged or destroyed buildings were not always available, these fields may contain verbal descriptions such as “several”, “some, minor”, “limited”, and “extensive”. These descriptions were adopted as found in the sources. Whenever a simple phrase such as “Damage in [Place]” was found, the word “Some” was written under the damaged buildings field; this should not be interpreted as implying any particular number of affected buildings but only as the existence of damage being reported. Whenever different estimates of damaged or destroyed buildings were found in different sources, these were reported as ranges. Due to potential inconsistencies in what could be considered damaged and what could be considered destroyed in different sources, the ranges of damaged and destroyed buildings may overlap. It should thus not be assumed that the upper bounds of both damaged and destroyed buildings can be taken together to represent the impact of the earthquake, as this may result in the overall consequences being unrealistically inflated.

When contrasting the numbers of damaged and destroyed buildings reported in the NOAA database with those contained in the USGS catalogue or in the ISC Bulletin during the compilation process, several types of inconsistencies were noticed: (i) cases in which the USGS/ISC report X damaged or destroyed, and NOAA reports X damaged and X destroyed; (ii) cases in which the USGS/ISC report X damaged and Y destroyed, and NOAA reports Y damaged and X destroyed; and (iii) cases in which the USGS/ISC report damage with descriptions such as “minor damage to some buildings” or “X buildings slightly damaged” and NOAA reports 0 buildings damaged and 1 to 50 destroyed. As it was not possible to determine the source of these inconsistencies, it was decided that written descriptions from the USGS or the ISC Bulletin (as well as other potential sources) would take precedence over the numbers reported in the NOAA database. For the first case, repetition of a certain number X under both categories was interpreted as X being the total number of buildings that were either damaged or destroyed, and that the exact split is not known. In these cases, the number X was assigned to “Buildings Damaged”, while “Some of” was written under “Buildings Destroyed”.
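The interpretation adopted for the first type of inconsistency can be written as a small rule. The sketch below is only illustrative: the function name is hypothetical, and the condition that the repeated number be larger than zero is an assumption made here so that two zero counts are not read as “damaged or destroyed”.

```python
from typing import Dict, Optional, Union

FieldValue = Union[int, str, None]


def interpret_noaa_counts(
    damaged: Optional[int], destroyed: Optional[int]
) -> Dict[str, FieldValue]:
    """Map NOAA counts of damaged/destroyed buildings to the database fields.

    When the same positive number X appears under both categories, X is
    taken as the total damaged or destroyed, with an unknown split.
    """
    if damaged is not None and damaged > 0 and damaged == destroyed:
        return {"Buildings Damaged": damaged, "Buildings Destroyed": "Some of"}
    return {"Buildings Damaged": damaged, "Buildings Destroyed": destroyed}
```

Written descriptions from the USGS or the ISC Bulletin, when available, would still take precedence over whatever this rule returns.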

It is noted that numbers of damaged and destroyed buildings may sometimes refer to dwellings or homes, in which case this was noted in the Comments field of the database, when known. Moreover, it was often not clear if the word “homes” refers to “houses” as buildings inhabited by one family, or to dwellings, that is, housing units inhabited by one family many of which may exist within the same building. These difficulties were particularly prominent when dealing with machine-assisted translations into English of sources in other languages. It was noticed as well that damage associated with Chinese earthquakes may often be reported in terms of rooms (i.e., subdivisions of a dwelling).

Interpretation of the reported damage to buildings requires consideration of the socio-political and economic context within which each earthquake took place, not only in terms of the potential systematic differences that might exist in the fragility of the structures across different countries but also with respect to the perception of the severity of damage. It is not unreasonable to expect, for example, that societies with a strong culture of safety and/or maintenance will tend to report far more minor damage than others. A clear example of this is the M4.0 2007 Folkestone (UK) earthquake, for which most of the damage involved chimneys, roof tiles and plaster, with infrequent minor structural damage being only observed within the most affected area, but whose aftermath was handled with extreme caution by the British authorities (Sargeant et al. 2008).

Losses in monetary terms are reported (mostly) in US dollars at the time of the earthquake. The NOAA database specifies that its economic losses correspond to US dollars at the time of the earthquake when they are specific values, but 1990 US dollars when they are reported as ranges, which were assigned by NOAA from verbal descriptors when a specific amount could not be found in the literature, as per the conversion scheme presented in Table 1. Whenever values were found reported in local currencies, historic exchange rates for the particular currency were sought online (FXTOP 2018) and used to transform the values to US dollars at the time of the earthquake. Losses resulting from said conversions are expressed as round numbers, taking an approximate average rate within the year in which the earthquake occurred.

Table 1 Ranges of monetary losses of the NOAA database

It is noted that it is not always possible to be certain of the currency in which the sources themselves are reporting, especially when new papers and studies are published regarding older earthquakes. Moreover, there are cases in which previous conversions are likely already implicit within the sources. This is the case, for example, of the ML 5.4 Adelaide (Australia) earthquake of 28th February 1954, for which Denham (1992) reports an economic loss of 8.8 million Australian dollars at the time of the earthquake. However, Australian dollars were first introduced in 1966, so it is likely that this value is already a conversion from British pounds. In this case, the economic loss reported in the present database was obtained using the 1966 conversion rate between Australian and US dollars.

The variation of the value of money in time is not the only issue for the interpretation of economic losses. Ideally, the latter should be subdivided into material losses (i.e., repairing, rebuilding, etc.), downtime losses, and losses due to the “value of life” (i.e., the economic value assigned to fatalities). Most of the time, what exactly has been included in the value reported by the sources is not clear, nor is it clear whether the value was a first estimate in the immediate aftermath of the earthquake or a balance made after full recovery was achieved. In online news reports, for example, it is common to find citations of government officials stating what the losses are expected to be. In many cases, only references to insured losses can be found. In cases related to induced seismicity in the USA, it is common to find values associated with monetary claims made in court. For these reasons, no attempt was made to further classify the kind of economic loss being reported.

Due to the large uncertainties associated with the origin of these figures, reported monetary losses are not to be taken as definite values but only as an indicative guide of the extent of the damage caused by the earthquake.

3.8 Casualties

Casualties reported in the database were classified into total deaths, shaking deaths, injured, homeless, evacuated, and total affected. Whenever different estimates were found in different sources, these were reported as ranges. By definition, the number of shaking deaths needs to be equal to or smaller than the number of total deaths. The term “shaking deaths” is used herein to refer to deaths caused by the response of buildings or their contents to the ground shaking, as opposed to other earthquake-related deaths such as those that could be caused by stampedes, accidents that occurred while trying to flee from a building, earthquake-triggered landslides, etc. Whenever information regarding the causes of death (and/or injury) was found, it was added to the database in the “Causes of death/injury” field. The reader might find that, in some of these cases, this field contains expressions such as “deaths possibly due to structural failures”. These statements were included whenever the extent of the damage reports suggested that the most likely cause of death was damage to buildings and/or their contents. Moreover, the keyword “PAGERshaking” was used to indicate those cases for which the causes of death were not found but the deaths were listed as being shaking deaths in PAGER-CAT (Allen et al. 2009a). It is noted, however, that PAGER-CAT treats as shaking deaths not only those for which the causes of death have been explicitly identified but also those for which information to support their link to secondary causes (such as landslides and tsunamis) has not been found (Allen et al. 2009a). Both this assumption and that made herein regarding extensive damage being associated with shaking deaths clearly leave room for uncertainty.

Information on injuries is, in general, scarcer and more imprecise than that on deaths, due mostly to the difficulties of defining the degree or type of injury to be counted, the occurrence of minor injuries that are not reported due to the primary focus on people who need urgent medical treatment right after an earthquake (Alexander 1985), and the much more systematic recording of deaths. This is influenced as well by the overall aftermath of the event, with a proportionately larger number of light injuries likely to be counted when consequences are limited in extent than when more serious casualties have occurred. Moreover, it is noted that information on both deaths and injuries (as well as damage to the built environment) is not always free from censorship associated with different socio-political regimes that have existed in different areas of the world in the course of the last century.

The fields reporting on the number of homeless and evacuated people make reference to those whose residence became permanently affected by the earthquake and could not return, and those who had to leave their residences for a limited period of time but could finally return, respectively. The definition of total number of people affected is somewhat less precise, as many sources just report, for example, “1,000 people were affected by this earthquake” without specifying if this includes only injured and dead, or any person for whom the earthquake and/or its consequences interfered somehow in their everyday life, even if to a much lesser extent. The values reported herein correspond to those found in the sources under the same wording.

3.9 Other fields

The “ID” field contains an identifier used for compilation. Those IDs starting with the letter E and followed by a number and a hyphen (e.g., E15-xxx) correspond to earthquakes that were automatically processed and retrieved from the EID, as described in Section 2.2 and Appendix 1. These earthquakes are also flagged as “Y” (yes) in the “From EID” field. This flag does not indicate whether the earthquake can be found in the EID or not (this is indicated with the “EID” flag), but rather whether it was included in the present database by means of the algorithm described in Appendix 1.

As explained above, whenever different estimates of damaged buildings, destroyed buildings, casualties, etc., were found in different sources, these were reported as ranges (unless it could be established that the discrepancy had its origin in one source giving only a provisional number and another source reporting a final count). However, there exist also many cases in which the only available values are estimations made by NOAA on the basis of verbal expressions. In these cases, the ranges reported are the standard ones defined by NOAA and reported herein in Table 2, and it is indicated in the fields “NOAA deaths”, “NOAA injuries”, “NOAA damaged”, “NOAA destroyed”, and/or “NOAA economic” that the reported values are NOAA’s estimates and not observations. Reported numbers were always preferred if both an estimate and a specific number were found.
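The way in which estimates from several sources are consolidated into a single field can be sketched as below. The function name `consolidate` is hypothetical, provisional counts superseded by a final count are assumed to have been removed beforehand, and the NOAA range used in the usage note is a placeholder, since the ranges of Table 2 are not reproduced in this sketch.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]


def consolidate(
    reported: List[int], noaa_estimate: Optional[Range] = None
) -> Tuple[Optional[Range], bool]:
    """Return ((lower, upper), is_noaa_estimate) for one database field.

    `reported` holds the specific numbers found in the sources and
    `noaa_estimate` a NOAA range assigned from a verbal expression, if any.
    """
    if reported:
        # Specific numbers always take precedence over NOAA estimates.
        return (min(reported), max(reported)), False
    if noaa_estimate is not None:
        # Only an estimate is available: keep it and flag it as such.
        return noaa_estimate, True
    return None, False
```

For example, `consolidate([12, 15], (1, 50))` yields the range 12–15 with the flag unset, while `consolidate([], (1, 50))` keeps the placeholder range and sets the flag that corresponds to fields such as “NOAA deaths”.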

Table 2 Ranges of deaths, injuries, damaged buildings and destroyed buildings of the NOAA database

The fields “EM-DAT”, “NOAA”, “USGS”, “PDE”, “ISC”, and “EID” are flags that indicate whether information from each of these sources has been taken for any particular earthquake (Y = yes, N = no). This is limited to information on damage/casualties and does not make reference to earthquake source parameters.

Any relevant additional information is contained in the “Comments” field. This can be related to a clarification regarding the damage observed, additional data regarding the main shock of the sequence if the reported earthquake is an aftershock, the uncertainty associated with the reported proxy moment magnitude, the occurrence of other shocks closely spaced in time, the currency conversion rate used for the economic losses, etc.

4 Summary of earthquake consequences of the whole database

4.1 Damage to buildings and the environment

4.1.1 Building damage

As specific numbers of damaged and/or destroyed buildings were not always available, the corresponding fields of the database contain different kinds of data, including numbers, number ranges, and verbal descriptions. Figure 5 shows the proportion that each kind of data represents. It should be noted that lack of availability of data in any of these fields can be due either to no buildings having been damaged/destroyed or to a simple lack of information (for example, when a value of economic loss is given but no details can be found regarding the kind of damage observed). As shown in Fig. 5, 13.2% and 39.9% of earthquakes (259 and 781 events) lack details on damaged and destroyed buildings, respectively. Based on the experience obtained from the compilation of this database, the authors believe that whenever a number of damaged buildings or a damage description in verbal terms is available, lack of data regarding destroyed buildings is likely an indication of no destroyed buildings observed, though it may not always be the case and this may be subject to the interpretation of “damaged” and “destroyed”. Moreover, many cases of no data for either damaged or destroyed buildings do have a reported economic loss, which implies that damage did occur but was not accurately recorded.

Fig. 5 Kind of information available regarding damaged (left) and destroyed (right) buildings for the whole of the Database of Damaging Small-to-Medium Magnitude Earthquakes

In view of the different kinds of data described above, results regarding damaged and destroyed buildings are presented subdivided into two groups. The plots in Fig. 6 present those corresponding to data labelled as “Number”, “EID estimation”, “NOAA range”, “Other range”, and “Some of” in Fig. 5. Lower and upper bounds are the same for those cases in which a specific number is available and differ for the case of ranges being reported. In the case of “Some of”, the lower bound was taken as 1 while the upper bound was taken as the specific number of damaged buildings, when available, or as the upper bound of damaged buildings when these were given as a range. The increasing destructive power of increasingly larger magnitude events is visible in both plots. It is noted that of the 696 cases of zero destroyed buildings explicitly reported, 684 correspond to earthquakes retrieved directly from the EID, around two thirds of which result from the conversion of the EID damage scale into damage descriptions (see Appendix 1). The accuracy of such a conversion is expected to be limited.
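The way in which lower and upper bounds are assigned for plotting, as described above, can be sketched as follows. The function names are hypothetical, and field values are assumed here to be stored either as a single integer, as a `(lower, upper)` tuple, or as the text “Some of”.

```python
from typing import Tuple, Union

Count = Union[int, Tuple[int, int]]


def to_bounds(value: Count) -> Tuple[int, int]:
    """Equal bounds for a specific number, the range itself otherwise."""
    if isinstance(value, tuple):
        return value
    return value, value


def destroyed_bounds(destroyed: Union[Count, str], damaged: Count) -> Tuple[int, int]:
    """Lower and upper bounds of destroyed buildings for one earthquake."""
    if destroyed == "Some of":
        # At least one building destroyed, at most all of those damaged.
        return 1, to_bounds(damaged)[1]
    return to_bounds(destroyed)
```

For an earthquake with damaged buildings given as the range 100–1000 and “Some of” under destroyed buildings, this gives bounds of 1 and 1000, which illustrates how broad the resulting interval can be.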

Fig. 6 Number of damaged (left) and destroyed (right) buildings as a function of magnitude for the whole of the Database of Damaging Small-to-Medium Magnitude Earthquakes for which data was available as “Number”, “EID estimation”, “NOAA range”, “Other range”, and “Some of”, as per Fig. 5 (738 and 1062 earthquakes in left and right plots, respectively). Zero values not shown (26 and 696 cases in left and right plots, respectively)

The threshold of 10,000 damaged buildings appears to be crossed only by earthquakes with magnitudes above around 4.7, while 4.8 marks the start of numbers of destroyed buildings above 2000. However, not all figures can be considered equally reliable. Of the 46 earthquakes for which the lower- and/or upper-bound number of destroyed buildings is larger than 1000, all with M larger than 4.5 and almost half of which were located in China, 12 are cases of “Some of”. In one of these 12 cases, the number of damaged buildings is a NOAA range estimate of 100–1000. Seven of the remaining 34 cases correspond to NOAA estimates of the same range as well, which, apart from being estimates and not observations, are quite broad, with a ratio of 10 between the upper and lower bounds. It is clear that the (unknown) real number of destroyed buildings is likely to be smaller than depicted in Fig. 6 in all eight cases.

A lack of homogeneity in the reliability of the data is revealed when taking a closer look at, for example, the cases of reported destroyed buildings of 10,000 and above, or earthquakes with M smaller than 4.5 and more than 100 reported damaged buildings. While some cases appear to be related to relatively solid reporting of damage, others illustrate the uncertainties that come inevitably associated with these figures, and which apply not only to buildings reported as destroyed but also to those reported simply as damaged. The interested reader can find relevant examples in the report by Nievas et al. (2019b).

The database contains 26 cases for which zero damaged buildings are specifically reported, none of which contain details on economic losses, and 24 of which were directly retrieved from the EID. Of these 26, four have numbers of destroyed buildings reported. Of the remaining 22, three are reported to have affected infrastructure (one of which caused one injury as well), and the remaining 19 are only associated with injuries (17) and/or deaths (three). The deaths were reportedly caused by a landslide (one) and by heart attacks allegedly attributed to the earthquake (two), though the difficulties of unequivocally associating heart attacks with earthquakes render such attributions dubious (see Appendix 2).

Figures 7 and 8 illustrate the verbal descriptions found for 956 and 114 earthquakes regarding damaged and destroyed buildings, respectively. These represent almost all of the earthquakes for which generic text descriptions were available, including those 2013–2014 earthquakes retrieved from the EID for which damage was reported in terms of a scale, as explained in Appendix 1. The descriptions for a further five earthquakes (damaged buildings) and one earthquake (destroyed buildings) were far more specific and were thus not included in the plots of Figs. 7 and 8. The former consist of four cases in which a percentage of damaged buildings is given and one case in which the name of a village considered to have suffered damage to 100% of its buildings is provided. For all five cases, a deeper investigation of the total number of buildings to which these percentages refer would be needed. Similarly, the text related to destroyed buildings provides the name of a village considered to have been completely destroyed.

Fig. 7

Descriptions of damage extent in the “Buildings Damaged” field, as a function of magnitude for the whole of the Database of Damaging Small-to-Medium Magnitude Earthquakes for which only verbal descriptions were found (956 earthquakes)

Fig. 8

Descriptions of damage extent in the “Buildings Destroyed” field, as a function of magnitude for the whole of the Database of Damaging Small-to-Medium Magnitude Earthquakes for which only verbal descriptions were found (114 earthquakes)

In Fig. 7, the topmost nine descriptions (from “Widespread” until “Some”, inclusive) refer only to quantities of buildings, while the rest also make reference to the kind of damage. While it is clearly difficult to draw sound conclusions from verbal descriptions, it is noted that the most extreme wording (e.g., severe, serious, significant, heavy, extensive, considerable) broadly appears to be more commonly associated with larger magnitudes than with lower magnitudes, as would be expected. Of the 956 cases depicted in Fig. 7, 558 (58.4%) can be classified under a broader “minor/slight” category, of which 398 (71.3%) were directly retrieved from the EID (“non-structural”, “limited”, and “some, minor” correspond to the conversion of the EID damage scale into damage descriptions). Around 58% of the 115 earthquakes for which verbal descriptions of destroyed buildings were found correspond to cases retrieved directly from the EID as well.

4.1.2 Infrastructure affected

The database contains 184 earthquakes reported to have affected infrastructure. Around 75% of the cases correspond to events with magnitudes equal to or larger than 5.0, and a further 13% correspond to magnitudes in the range 4.75–5.00. It is noted, however, that sources such as existing databases of damaging earthquakes do not often report damage to infrastructure explicitly, and it is therefore possible that the number of earthquakes that have affected infrastructure is larger.

4.1.3 Landslides and liquefaction

Similar to the case of affected infrastructure, over 75% of the 159 earthquakes in the database reported to have caused landslides, rockslides, mudslides, and/or snow avalanches had magnitudes equal to or larger than 5.0. This is consistent with the global database of earthquake-induced landslides compiled by Rodriguez et al. (1999). These 159 earthquakes are associated with 339–393 deaths due to the landslides themselves, and 570–864 deaths overall (ranges correspond to taking lower or upper bounds). This implies that 45–59% of the deaths associated with these earthquakes are reported to have been caused by the landslides. However, it is interesting to note that only around 21% of the 159 earthquakes are associated with landslide-caused deaths.
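The 45–59% share quoted above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name `paired_share` is hypothetical, and it assumes that lower-bound landslide deaths are compared with lower-bound total deaths, and upper bounds with upper bounds, which is how the statement that "ranges correspond to taking lower or upper bounds" is read here.

```python
def paired_share(part_bounds, total_bounds):
    """Percentage share of `part` in `total`, pairing lower with lower
    and upper with upper bounds (an assumed reading of the text)."""
    lo_part, hi_part = part_bounds
    lo_total, hi_total = total_bounds
    shares = (100.0 * lo_part / lo_total, 100.0 * hi_part / hi_total)
    # The lower-bound scenario need not give the smaller share, so sort.
    return min(shares), max(shares)


landslide_deaths = (339, 393)  # deaths due to the landslides themselves
all_deaths = (570, 864)        # all deaths for the same 159 earthquakes

low, high = paired_share(landslide_deaths, all_deaths)
print(f"{low:.0f}-{high:.0f}%")
```

Note that the larger share (about 59%) comes from the lower-bound scenario, because the total number of deaths grows faster than the landslide deaths when upper bounds are used.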

Only 11 earthquakes in the database are reported to have caused liquefaction, which is consistent with the findings of Green and Bommer (2019) regarding the smallest earthquakes that give rise to such phenomena. All but one of these had magnitudes equal to or larger than 5.0. The 11 earthquakes are 1903 Warrnambool (Australia, ML 5.30), 1988 Lingwu (China, M5.25), 1989 Newcastle (Australia, M5.40), 1992 Milos (Greece, M5.19), 1992 Roermond (the Netherlands, M5.38), 1993 Pyrgos (Greece, M5.44), 1996 Épagny-Annecy (France, M4.80), 2002 Au Sable Forks (USA, M5.16), 2004 Garda Lake (Italy, M5.07), 2009 Olancha (USA, M5.26), and 2010 Vitanovac/Kraljevo (Serbia, M5.52). The one case with magnitude smaller than 5.0, the 1996 Épagny-Annecy earthquake, appears to have caused only small-scale liquefaction at one of the ends of the airport’s runway, in spite of the whole affected area being susceptible to liquefaction effects (Dominique et al. 2008). Moreover, there is a lack of agreement regarding the moment magnitude of this earthquake, with Dufumier (2002) estimating M4.80 and stating that the M5.30 value reported by the European-Mediterranean Seismological Centre (EMSC) and Bock (1997) results from fixing the hypocentral depth to be unrealistically shallow, which causes the focal mechanism to be incompatible with the local tectonics.

4.2 Casualties

Numbers of injuries and deaths are presented herein in terms of lower and upper bounds, just as for the case of damaged and destroyed buildings. For the case of the injuries, descriptive words such as “a few” and “some” (which were found for 25 earthquakes) were assigned (herein, not in the database itself) ranges according to the criteria used in the NOAA database (Table 2). “Slight” and “minor” were considered synonymous with “few”, and “several” was considered synonymous with “some”. This processing of verbal descriptions was not needed for the case of deaths, as there were no such cases.
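The conversion of verbal descriptions into lower and upper bounds can be sketched as a simple lookup. The synonym rules below are those stated in the text. The numerical ranges, by contrast, are assumptions: only the "many" range (100–1000) is quoted elsewhere in this paper, while the values for "few" and "some" are placeholders that should be replaced with those of Table 2. All names (`SYNONYMS`, `ASSUMED_RANGES`, `injury_bounds`) are hypothetical.

```python
# Synonyms stated in the text: "slight"/"minor" -> "few", "several" -> "some".
SYNONYMS = {"a few": "few", "slight": "few", "minor": "few", "several": "some"}

# ASSUMPTION: placeholder ranges, to be checked against Table 2.
# Only "many" = 100-1000 is quoted in the text of this paper.
ASSUMED_RANGES = {"few": (1, 50), "some": (51, 100), "many": (100, 1000)}


def injury_bounds(description, ranges=ASSUMED_RANGES):
    """Return (lower, upper) bounds for a verbal description of injuries,
    or None when the description has no range assigned."""
    key = description.strip().lower()
    key = SYNONYMS.get(key, key)  # collapse synonyms onto the base term
    return ranges.get(key)


for word in ("a few", "Several", "many", "unclear"):
    print(word, "->", injury_bounds(word))
```

Passing the ranges as a parameter keeps the synonym logic separate from the numerical criteria, so that a different set of ranges can be tested without touching the function.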

Deaths attributed to heart attacks were reported for 56 earthquakes (around 16% of those associated with deaths), amounting to 88–89 deaths (2.8–3.7% of the number of deaths). A special effort was therefore put into better understanding the relationship between earthquakes and fatal cardiac events. In approximately 78% of these 56 earthquakes, the only reported deaths are those attributed to heart attacks, while for six of these earthquakes, the heart attacks are the only reported consequence found (i.e., no reports on damage, injuries, or the occurrence of liquefaction or landslides). After a brief literature review on the matter, available to the reader as Appendix 2, it was decided to withdraw deaths attributed to heart attacks from the results presented hereafter.

As summarised in Table 3, the whole database comprises a total of 2306–3126 deaths and 33,761–47,205 injuries. However, zero deaths and/or zero injuries (per earthquake) are by far the most frequent outcome. As shown in Table 4, around 59% of the total number of earthquakes that make up the whole of the database are not reported to have caused human casualties at all, the proportion being larger for earthquakes taken directly from the EID (78%) than for those taken from elsewhere (43%). This may be expected, as earthquakes that have caused no casualties and only minor damage are less likely to appear in sources other than the EID. Along similar lines, it is noticeable that the proportion of earthquakes from the EID that have caused deaths (with or without injuries, last two rows of Table 4) is very small, 2.6%, against 14% of the total, or 23% of the non-EID events. Table 4 shows as well that the proportion of earthquakes that have caused no deaths but some injuries (27% of the total, 34% of the non-EID events) is larger than the proportion of earthquakes causing deaths (14% of the total, 23% of the non-EID events).

Table 3 Summary of casualties observed for the whole database, excluding heart attacks attributed to earthquakes
Table 4 Classification of earthquakes according to the existence or not of associated casualties

Analogous to Fig. 6, the plots in Fig. 9 depict the number of injured and dead against magnitude for the whole of the database. Very few earthquakes with magnitudes smaller than 4.5 are reported to have caused over 50 injuries, with most cases lying significantly below this value. The same magnitude marks a jump between earthquakes causing fewer or more than ten total deaths. The increase in number of casualties with magnitude is clear in both plots. The number of earthquakes reportedly causing non-lethal injuries increases progressively for each 0.25-wide magnitude bin as 24, 46, 73, 120, 231, and 244, while the number of earthquakes reportedly associated with deaths proceeds as 9, 8, 22, 28–29, 87–89, and 115–121.
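As a quick consistency check, the per-bin counts can be totalled and compared with the proportions of Table 4. The sketch below assumes that the 1958 events quoted in Section 5 are the appropriate denominator and that the six bins cover the whole magnitude range of the database; the variable names are illustrative only.

```python
TOTAL_EVENTS = 1958  # size of the database, as quoted in Section 5

# Earthquakes per magnitude bin, from smallest to largest magnitudes.
injury_counts = [24, 46, 73, 120, 231, 244]
# (lower, upper) bounds where the text gives a range, e.g. 28-29.
death_counts = [(9, 9), (8, 8), (22, 22), (28, 29), (87, 89), (115, 121)]

injury_total = sum(injury_counts)
death_total = (
    sum(lo for lo, _ in death_counts),
    sum(hi for _, hi in death_counts),
)

injury_share = 100.0 * injury_total / TOTAL_EVENTS
death_share = tuple(100.0 * n / TOTAL_EVENTS for n in death_total)

print(f"injuries: {injury_total} events ({injury_share:.1f}%)")
print(f"deaths: {death_total[0]}-{death_total[1]} events "
      f"({death_share[0]:.1f}-{death_share[1]:.1f}%)")
```

Under these assumptions, the earthquakes with deaths amount to about 14% of the total, in line with the figure given for Table 4, and those with non-lethal injuries to a little under 38%.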

Fig. 9

Number of injuries (left) and total deaths (right, excluding heart attacks) as a function of magnitude for the whole of the Database of Damaging Small-to-Medium Magnitude Earthquakes. Zero values not shown (1224 earthquakes with zero reported injuries, 1683–1690 earthquakes with zero reported deaths)

The earthquakes reported to have caused relatively large numbers of injuries despite their small magnitude in the plot on the left of Fig. 9 are (i) a M3.98 (proxy, ML = mb = 4.0) earthquake in Ecuador, reported to have caused an unspecified number of minor injuries due to broken glass (El Telégrafo 2011); (ii) a M4.08 (proxy, mb = 4.0) in Iran, associated with 56 to 75 injuries due to escaping in panic (22 reported hospitalised) (Iran Front Page 2017; News.am 2017); (iii) a M4.09 (proxy, mb = 3.8–4.1) earthquake in Turkey, reported to have resulted in “many” (translated into 100–1000) injuries associated with people fleeing their homes or jumping off balconies or out of windows, probably influenced by the fact that it occurred on the 14th anniversary of the Izmit earthquake (Earthquake-Report 2013); (iv) a M4.29 (± 0.30, Wilks et al. 2017) that occurred in Ethiopia, for which the EID reports 150 injuries but cited sources point only at 100, reportedly due to a stampede at a university campus (Earthquake-Report 2016); and (v) a M4.56 (proxy, mb = 4.5) in Guatemala, for which NOAA estimates injuries in the range 100–1000. It is thus noted that three out of these five cases correspond to the assignment of a range in the absence of a real observation of the number of injuries and that, from the descriptions of (i) and (iii), the true numbers could easily be smaller. It is noted as well that only in the last case, the earthquake in Guatemala, are the injuries likely to be related to damage to structures.

Something similar occurs with the total number of deaths depicted in the plot on the right of Fig. 9, in which outlying cases of higher fatalities due to causes other than structural failures are indicated. As can be observed, they correspond to either landslides or the consequences of damage to mines. Other than the three mine cases depicted in Fig. 9, there are ten further cases of earthquakes with magnitudes equal to or smaller than 4.4 with reported deaths (each with just one), of which five are attributed (or likely attributed) to structural failures, one is a death due to falling from a bridge, another is due to running out of the house, and two others are due to snow- or landslides, while the cause of death in the tenth case is unknown.

This same data on injuries and deaths is plotted in Fig. 10 against date of occurrence of the earthquake. It is notable that the largest numbers of reported deaths are associated with older events. There are three cases of upper-bound deaths equal to or larger than 200 in the database: (i) a M5.5 (converted from Munk = 5.50, retrieved from the NOAA database) that occurred in Iran in 1925, with 500 reported deaths; (ii) a M5.44 (proxy, Ms = 5.00) from 1943 in Peru, with 75–200 reported deaths; and (iii) a M5.44 (proxy, Ms = 5.00) that occurred in China in 1933, with 200 reported deaths. As shown in Fig. 9, most of the deaths associated with the latter are possibly due to landslides. The 500 deaths of the 1925 Iran earthquake are reported as shaking deaths in PAGER-CAT (Allen et al. 2009a), while the causes of death were not found for the 1943 Peru case. It is noted that, given the unknown scale and date of the Iran case, it is possible that this earthquake may have had a moment magnitude larger than 5.5.

Fig. 10

Number of injuries (left) and total deaths (right) as a function of date for the whole of the Database of Damaging Small-to-Medium Magnitude Earthquakes. Zero values not shown

As explained in Section 3.2, cases of moment magnitudes equal to or larger than 5.5 but reported as a truncated 5.5 by the GCMT were observed during the compilation of the database. A set of 30 earthquakes identified to fall into this category, with values of M derived from the GCMT seismic moment in the range 5.55–5.59, would add 41–47 deaths and 837–1763 injuries to the database. A great majority of these deaths are attributed (or likely attributed) to structural failures. This example illustrates the potential impact of the uncertainty in magnitude discussed in Section 3.2, particularly around the upper bound of the magnitude range.
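The effect of truncation can be illustrated with the standard relation between scalar seismic moment and moment magnitude, M = (2/3)(log10 M0 − 9.1) with M0 in N·m. This is a minimal sketch: it is an assumption that this form of the relation matches the conversion used in Section 3.2, the seismic moment below is a hypothetical value chosen to give M ≈ 5.57, and the function names are illustrative.

```python
import math


def moment_magnitude(m0_newton_metres):
    """Moment magnitude from scalar seismic moment in N·m
    (standard relation, assumed here to match Section 3.2)."""
    return (2.0 / 3.0) * (math.log10(m0_newton_metres) - 9.1)


def truncate_one_decimal(value):
    """Drop everything beyond the first decimal, without rounding.
    The small offset guards against floating-point representation error."""
    return math.floor(value * 10.0 + 1e-9) / 10.0


m0 = 10 ** 17.455  # hypothetical seismic moment in N·m
m = moment_magnitude(m0)

print(f"M from seismic moment: {m:.2f}")
print(f"truncated to one decimal: {truncate_one_decimal(m):.1f}")
print(f"rounded to one decimal: {round(m, 1):.1f}")
```

An event of this kind would appear as 5.5 when truncated but as 5.6 when rounded, which is why the choice of convention determines whether it falls inside the magnitude range of the database.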

Though less likely, it is still possible to identify earthquakes with magnitudes smaller than 4.0 associated with deaths (see Footnote 2). The most prominent in recent years is, undoubtedly, the 2017 M3.9 Ischia (Italy) earthquake, which caused extensive damage, 39–42 injuries, and the death of two people due to structural failures. These disproportionately large consequences were mostly attributed not only to unusually large ground motions and site effects but also to the extreme vulnerability of the building stock. Other earthquakes include the 2001 M3.2 Lorraine (France), the 2005 ML = 3.7 and 2014 ML = 2.3 South Africa, the 2014 ML = 3.1 Czech Republic, and the 2016 ML = 2.6 Silesia (Poland) events, reportedly associated with 1, 5, 9, 3, and 1 deaths, respectively. All of these deaths occurred in mines rather than buildings.

Having attempted to identify the causes of death whenever possible, these were collected under ten categories, as depicted in Fig. 11. While some are self-explanatory, others warrant some comment. As in Section 4.1.3, “landslides” include movements of masses of soil, rocks and/or snow. “Structural failures” comprise the behaviour under ground shaking of both structural and non-structural components (e.g., ceilings, but not building contents or cases identified only as “falling debris”). While deaths within mines are often related to the failure of the mine structure, they are treated separately from “Structural failures” in their own “Mines” category due to involving a very specific kind of structure. “Falling objects/debris” includes cases of people being hit by falling plaster or bricks as well as other elements falling from shelves. While falling bricks could also be included under “structural failures”, as the bricks may come either from a structural or non-structural component, they are treated separately for being often found described in this way, referring more to the case of a small fragment falling rather than the full collapse of a structure. “Escaping” includes jumping off balconies and windows, running out of buildings, and cases described in the sources as “panic reactions”. The category “Probably structural failures” refers to cases in which the extent of the damage reports suggested that the most likely cause of death was damage to buildings, as explained in Section 3.8. In many cases, it includes earthquakes for which both this inference could be made and PAGER-CAT (Allen et al. 2009a) reported the deaths as shaking deaths. The category “PAGER shaking” was thus left to include only cases for which no further details have been found and inferences from the extent of damage would be harder to make.

Fig. 11

Causes of death for all total deaths (2306, lower bounds, left; 3126, upper bounds, right) reported in the Database of Damaging Small-to-Medium Magnitude Earthquakes, excluding deaths attributed to heart attacks, in terms of proportions of the number of deaths

Figure 11 shows proportions of deaths attributed to each cause. As can be observed, structural failures account for 14.4 to 16.5% of the total number of deaths, with an additional 14.0–15.2% probably due to the same reason, and 29.1–37.8% identified as shaking deaths by PAGER-CAT (Allen et al. 2009a). If all of the latter were, in fact, deaths due to structural failures, these would thus account for 59.6–67.4% of the total number of deaths. Proportions change slightly when considering the number of instances of causes of death instead of the number of deaths, where an “instance” is a naming of a cause (i.e., one earthquake for which two causes of death are named constitutes two instances). Causes such as falling objects/debris, escaping, and others then acquire somewhat greater prominence. This suggests that these causes tend to be associated with fewer deaths per instance than landslides or damage to mines, which is expected.
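The combined 59.6–67.4% range is not obtained by adding the three smallest and the three largest percentages. The sketch below enumerates all pairings of the quoted values to show which ones reproduce the combined totals. It is only a check on the arithmetic: which pairing belongs to the lower-bound and which to the upper-bound death count is an inference, not something stated in the text.

```python
from itertools import product

# Ranges quoted in the text, in percent of the total number of deaths.
shares = {
    "structural failures": (14.4, 16.5),
    "probably structural failures": (14.0, 15.2),
    "PAGER shaking": (29.1, 37.8),
}
targets = (59.6, 67.4)  # combined totals quoted in the text

matches = {}
for combo in product(*shares.values()):
    total = round(sum(combo), 1)
    if total in targets:
        matches[total] = combo  # the pairing that gives this total

for total, combo in sorted(matches.items()):
    print(f"{total}% = " + " + ".join(f"{value}%" for value in combo))
```

Each total is matched by a single pairing: 16.5 + 14.0 + 29.1 for 59.6%, and 14.4 + 15.2 + 37.8 for 67.4%. This indicates that the three shares do not all move in the same direction between the two bounding scenarios.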

5 Discussion and conclusions

For the purpose of enabling further in-depth studies regarding damage and casualties caused by small-to-moderate earthquakes, a global database has been compiled comprising 1958 events of M4.0–5.5 that occurred from the year 1900 through 2017 for which reports of damage and/or human injury or loss of life have been identified. While a large effort has been invested in identifying the earthquakes to be included, the general scarcity of data on damage caused by earthquakes in this magnitude range is largely reflected in the composition of the database at the time of writing. It has been shown that the number of earthquakes per decade increases steadily and shows important jumps in correspondence with the introduction of significant technological changes such as the establishment of the WWSSN and the creation of the ISC in the 1960s, the global embrace of online technologies at the beginning of the 2000s, and the compilation in near-real time of the Earthquake Impact Database (EID) starting in 2013. Incorporation of earthquakes from the latter for the five years that make up the 2013–2017 period yields a total of 1029 earthquakes in the database for the eight years of the decade of the 2010s contained in the database; these represent over half of the total 1958 earthquakes. Reports on additional events that may have been missed are very much welcomed by the authors.

The database gathers information on the seismological characteristics of the source (though it is not intended to be an earthquake catalogue from the seismological point of view), macroseismic intensities, human casualties, damage to buildings and infrastructure, economic losses, and the occurrence of landslides and/or liquefaction. The fields to be considered were selected aiming at a balance between breadth of coverage and depth of detail, always in light of the scarcity of recorded information on so many of these earthquakes as well as on the number of events involved. The pursuit of deeper detail has been addressed in a parallel study concerning a subset of case-history earthquakes (Nievas et al. 2019b). As in any earthquake catalogue or database, choices about parameter values were made seeking transparency and rationality, though alternatives may exist.

The nature of the work required the consultation of a diverse range of sources, from existing databases of damaging earthquakes all the way to online newspapers and social media, passing through seismological agencies, scientific publications, and reports as well. As a consequence, the database inevitably contains information of undetermined reliability, and the statistics that stem from it and have been presented herein cannot be deemed to represent verified scientific measurements or observations. Moreover, as several of the earthquakes were assigned ranges of values for the reported quantities, lower- and upper-bound analyses were required, in which all the lower-bound or all the upper-bound values for all earthquakes were considered together. This is naturally quite extreme, since it is very unlikely that the upper estimates would apply in all cases.

The number of damaging earthquakes currently in the database increases steadily with magnitude and reaches its maximum within the M5.25–5.50 bin when including earthquakes in the range 5.50 < M < 5.55 within it (rounding to closest 1-digit decimal) and the M5.00–5.25 bin otherwise. This illustrates simultaneously the larger damage potential of larger magnitude events (notwithstanding the influence of distance, site effects, vulnerability, and other factors), and the counterbalancing effect of the Gutenberg-Richter relationship on the magnitude distribution of the database. It is noted that the focus of the database on earthquakes with moment magnitude equal to or larger than 4.0 does not imply that earthquakes with smaller magnitudes are incapable of causing damage and/or casualties.

Earthquakes with quantitative data on damaged and/or destroyed buildings amount to 38% and 54% of the total number of events, respectively, while 49% and 6% have verbal descriptions and 13% and 40% lack explicit data. The latter can mean either that no damaged/destroyed buildings were observed or that data is simply lacking (for example, when an economic loss value is reported but not numbers of affected buildings). For those earthquakes for which numerical data is available, the threshold of 10,000 damaged buildings appears to be crossed only by earthquakes with magnitudes above around M4.7, though very few values above 1000 are observed below this magnitude as well. More than 2000 destroyed buildings are observed only for magnitudes of almost M4.8 and above. The level of reliability of the reported numbers is variable. For example, of the 46 earthquakes for which the reported lower- and/or upper-bound number of destroyed buildings is larger than 1000, all of them with M larger than 4.5, 12 are cases in which no concrete number is specified in the source and it is only reported that “some of” the buildings that were classified as damaged should, in fact, be considered to have been destroyed. For a further seven cases, only an estimate by NOAA in the range of 100 to 1000, which is very broad, is available.

The database contains 184 earthquakes reported to have affected infrastructure, of which around 75% have magnitudes equal to or larger than M5.0. Around the same proportion of earthquakes above this magnitude can be observed among the 159 earthquakes reported to have caused landslides, rockslides, mudslides, and/or snow avalanches. These 159 earthquakes are associated with 339–393 deaths due to the landslides themselves, and 570–864 deaths overall. Only 11 of the 1958 earthquakes of the database are reported to have caused liquefaction, all of which had magnitudes equal to or larger than M5.0 except for one.

Excluding 88–89 heart attacks (which were removed from the statistics due to the uncertainty and ambiguities associated with the inference of a causal relationship of a cardiac-related death in close proximity in time with an earthquake), the entire database comprises a total of 2306–3126 deaths and 33,761–47,205 injuries, but around 59% of the earthquakes caused neither deaths nor injuries, while 27% caused no deaths but some injuries, 4% caused some deaths but are not reported to have caused further injuries, and around 10% reportedly caused both deaths and non-lethal injuries. These values vary between the subset of earthquakes retrieved from the EID and those incorporated from elsewhere, with the proportion of earthquakes causing no casualties at all rising to 78% for the former and dropping to 43% for the latter, for example. The number of earthquakes reportedly causing non-lethal injuries increases progressively for each 0.25-wide magnitude bin as 24 (3.95 ≤ M < 4.25), 46, 73, 120, 231, and 244 (5.25 ≤ M < 5.55), while the number of earthquakes reportedly associated with deaths proceeds as 9 (3.95 ≤ M < 4.25), 8, 22, 28–29, 87–89, and 115–121 (5.25 ≤ M < 5.55).

Very few earthquakes with magnitudes smaller than M4.5 are associated with more than 50 injuries, with most cases lying well below this number. The same magnitude marks a jump between earthquakes causing either fewer or more than ten total deaths, with many of the deaths and injuries associated with earthquakes with M < 4.5 being due to causes other than structural failures. Structural failures explicitly account for 14.4 (lower bound) to 16.5% (upper bound) of the total number of deaths, although these values could be as large as 59.6–67.4% if suspicions regarding the remaining cases were true. Other causes of death include landslides, damage to mines, falling objects/debris, accidents that occurred while escaping, and a tsunami associated with a M5.49 earthquake in Indonesia. Causes of death were not found for 13.9–24.7% of the deaths. Proportions change slightly when considering number of instances instead of number of deaths, with causes such as falling objects/debris, escaping, and others acquiring somewhat greater prominence.

Given its nature, the database is organic and can grow both as time goes on and as new information regarding past events emerges. Nonetheless, and despite its inherent limitations, we believe it is an important contribution to the understanding of the extent of the consequences that may arise from earthquakes in the magnitude range of study as well as of the relevance of increasing efforts set on registering damage data from smaller events. The latter would facilitate future improvements and extensions of the database. Moreover, it has been a key component of a statistical study on the frequency with which upper-crustal earthquakes with magnitudes in the range of interest cause damage and/or casualties, which is explored in a separate paper (Nievas et al. 2019a).

The database is publicly available and can be downloaded from https://nam-onderzoeksrapporten.data-app.nl/reports/download/groningen/en/e4fd80e4-2e86-495c-97a4-d00954abcdff.