Background

The LaMEVE (Large Magnitude Explosive Volcanic Eruptions) database is a product of VOGRIPA (Volcano Global Risk Identification and Analysis Project), a component of the Global Volcano Model (GVM). LaMEVE is an open access database available online at the British Geological Survey. The objectives behind the database are to facilitate understanding of how explosive volcanism is distributed in space and time, to identify locations at high risk and gaps in knowledge, and to enable assessments of societal and environmental impacts of volcanism. Global datasets enable scientists and disaster managers to analyse hazard and risk within a global context of systematic information.

The Smithsonian Institution’s Global Volcanism Program (GVP) provides a global dataset documenting Holocene eruptions of all magnitudes (Volcanoes of the World 4.0 (VOTW 4.0)). However, Deligne et al. (2010) showed that the Holocene is too short a time period to be statistically representative for M > 6.5 eruptions. A database of M ≥ 8 eruptions by Mason et al. (2004) permitted the study of the very largest eruptions recorded through the Ordovician to the present, with no eruptions of this size recorded by Mason et al. (2004) in the Holocene. The LaMEVE database incorporates and builds upon these and other data sources and analyses of global volcanic information, in particular to fill a knowledge gap for eruptions of M > 6.5. Relevant and complementary large datasets include the 2012 version of the Collapse Caldera Database (Geyer and Martí 2008), which has been in part incorporated within LaMEVE. The Large Quaternary Caldera list of Decker (1990), a Preliminary List of Large-Volume Pleistocene Eruptions (Siebert et al. 2010) and two mature databases focussing on volcanism in Japan – the One-Million Year Tephra Database of Hayakawa (2010) and the Active Volcanoes of Japan Database (2008) – are also important data sources for LaMEVE. In short, LaMEVE is a collaborative effort to integrate these datasets and move towards a comprehensive and systematic, open-access source of volcanic data on large magnitude explosive eruptions.

LaMEVE contains data on Quaternary (the last 2.58 Myr as defined in Gibbard et al. 2010) explosive eruptions of magnitude (M) 4 and above, where magnitude is a function of erupted mass (Pyle 2000). The database contains information on 3,130 volcanoes, of which 471 have had at least one M ≥ 4 eruption. We designate these 471 volcanoes as Quaternary Explosive Activity Recorded (QEAR) volcanoes. All volcano types are eligible for inclusion, as the sole physical criterion is the magnitude. There are 1,883 eruptions of M ≥ 4 in LaMEVE. M ≥ 4 eruptions are less frequent than smaller (M < 4) events (Siebert et al. 2010), but present the greatest risk at local to global scales with the potential for high death tolls, economic losses and societal disruption over large areas (Auker et al. 2013). The largest magnitude eruptions can cause global disturbances through the injection of ash and aerosols into the atmosphere, causing climate perturbations over several years (Robock 2000). However, even relatively small eruptions can have global economic ramifications, as illustrated by the M4 2010 Eyjafjallajökull eruption which created mass disruption across Europe by grounding air travel. In 1990, 455 million people were estimated to live within 100 km of a volcano known to have been active in the Holocene (Small and Naumann 2001). Auker et al. (2013) suggested that, with population growth, this has increased to at least 600 million people.

The largest known explosive eruption in the geological record is the M9.3 Guarapuava Tamarana-Sarusas rheoignimbrite eruption from the Paraná-Etendeka igneous province (Bryan et al. 2010), which occurred around 132 Ma, thus falling outside the temporal scope of the LaMEVE database. Volumetrically larger lava eruptions have been recorded in Large Igneous Provinces, but dominantly effusive products are not included in LaMEVE. The largest magnitude eruption currently in LaMEVE is the ~74 thousand years ago (ka) M8.8 eruption of the Younger Toba Tuff at Toba, Indonesia (Ninkovich et al. 1978; Chesner et al. 1991; Oppenheimer 2002; Petraglia et al. 2012; Storey et al. 2012). The Baegdusan-Tomakomai eruption at Changbaishan, China at 950 BP (Volcanoes of the World 4.0, 2013), and the Dakataua caldera-forming eruption in Papua New Guinea at 998 BP (Machida, 2002) are the two largest magnitude Holocene eruptions in LaMEVE (both M7.4). The Arequipa Ignimbrite of Nevado Chachani, Peru is the oldest event currently recorded in LaMEVE at 2.42 Ma.

Crosweller et al. (2012) described the LaMEVE database structure and content in detail, so only a summary is provided here. The name, synonyms, coordinates, region and volcano type are provided for each volcano in LaMEVE, derived from the Smithsonian Institution’s VOTW 4.0 to ensure compatibility between the two databases. Although 85% of the volcanoes in LaMEVE do not currently have associated eruptions listed, they are included to facilitate updates as new information becomes available. LaMEVE has been created to provide a sustainable, open-access catalogue of eruption data. As such, there is no single specified degree of uncertainty for dates, volumes or magnitudes throughout LaMEVE, as the data are derived dominantly from peer-reviewed studies published over the last 90 years. Where studies provided estimates of uncertainty, this has been included in LaMEVE and this information has informed evaluation of overall uncertainties in the dataset. Quality indices using simple criteria are given for ages and magnitudes as a first pass assessment of data reliability (Crosweller et al. 2012). References are given for all data derived from the literature; other entries are calculated from other known properties or ‘assumed’ on the basis of qualitative descriptions. The following key data are provided for each eruption, with additional data, such as uncertainty estimates, when available:

Key data:

  •  Unit name

  •  Age

  •  Dating method

  •  Bulk volume

  •  Bulk DRE (Dense Rock Equivalent)

  •  VEI (Volcanic Explosivity Index)

  •  Magnitude

Additional data:

  •  Tephra fall volume (bulk, DRE)

  •  Ignimbrite volume (bulk, DRE)

  •  Intra-caldera deposit volume (bulk, DRE)

  •  Column height

  •  Intensity

  •  Magma type

This paper provides an introduction to, and synoptic analysis of, the datasets compiled in the LaMEVE database to describe spatial and temporal trends in recorded volcanic activity, identify gaps in the existing knowledge, and discuss issues of under-recording. In this paper, the term ‘under-recording’ refers to the level of record completeness relative to the “true” but unknown record. Under-recording results from a variety of factors, including the failure of people to record events in the historical period (see Siebert et al. 2010), variable preservation of deposits and extent of scholarly study. The assessment of under-recording in this paper is based on the assumption that the global rate of explosive volcanic activity has been stationary throughout the Quaternary. Mason et al. (2004) identified clustering on time scales of the order of tens of millions of years, using this to support the hypothesis that rates of large eruptions are non-uniform and possibly controlled by regional or global tectonics. Although this paper attributes variations in the record of explosive volcanism largely to under-recording, we recognise at the outset that interpretations are confounded by the possibility of non-stationarity in rates of volcanism. We evaluate the assumption of stationarity a posteriori, having considered the evidence.

LaMEVE is a dynamic database and will be updated periodically to account for and incorporate new data. Our analyses use Version 2, released in July 2013. Some data contained within LaMEVE will become obsolete as further research is undertaken and better information emerges. This analysis uses the most up-to-date, published, peer-reviewed data available and therefore presents a review of the status quo. One purpose of the study is to understand the quality of existing data to identify knowledge gaps, issues of data quality and deficiencies in methodologies so that there can be improvement.

Global and regional variability in volcano and eruption distribution

Simkin and Siebert (2000) described how over 94% of historical activity occurred at volcanoes along major plate boundaries. The remaining volcanoes are principally located at hot spots and intra-continental rift zones. The strong link between volcanism and plate boundaries is illustrated by the distribution of Quaternary M ≥ 4 eruptions and M ≥ 7 eruptions over the last 100 kyr (Figure 1).

Figure 1

Location map of Quaternary eruptions in LaMEVE. The yellow stars and black circles indicate the location of M7-7.9 and M ≥ 8 eruptions respectively, and the black line marks the equator.

About two thirds (66%) of all the volcanoes in VOTW 4.0, and therefore LaMEVE, are in the northern hemisphere, yet 74% of QEAR volcanoes (i.e. those with a documented eruption of M ≥ 4) and 81% of the LaMEVE eruptions are located in the northern hemisphere (Figure 1; Table 1). This marked spatial difference in recording is attributed to the longer historical records and the greater availability of resources for geological investigations in the high income countries of the northern hemisphere. The northern hemisphere also contains 90% of the world’s population and most of the global landmass, improving the likelihood of sub-aerial deposits, which are more easily accessible and therefore more likely to be documented than off-shore tephras. LaMEVE is not restricted to sub-aerial volcanism, with large magnitude explosive submarine eruptions included where recorded in the literature. We note there is a substantial record of explosive volcanism in marine sediment cores and ice cores. The former is largely untapped, although LaMEVE does contain data from eruptions investigated in part with sediment cores. Ice cores provide a record of volcanic sulphate pollution events, which is challenging to decipher into a magnitude record of explosive volcanism due to the complexity of sulphate aerosol transport and deposition, the difficulty of correlation with volcanic sources and specific eruptive events, and the observation that SO2 emissions do not correlate in any simple way with magnitude (Pyle et al. 1996).

Table 1 Proportion of volcanoes, eruptions and Quaternary Explosive Activity Recorded (QEAR) volcanoes in LaMEVE by hemisphere

Comparing the number of QEAR volcanoes with the number of eruptions in each region reveals considerable variability between regions (Table 2). The Japan, Taiwan and Marianas (JTM) region hosts 23% of all QEAR volcanoes, the most of any region. South America, Mexico and Central America, and North America including Alaska (Alaska is listed as a separate region by the GVP) collectively account for 36% of QEAR volcanoes. The remaining regions each host fewer than 10% of QEAR volcanoes. It is unsurprising that the JTM region also has the highest percentage of eruptions recorded in LaMEVE given that it has the greatest percentage of QEAR volcanoes; indeed, the dominance of this region is striking, containing about 40% of all LaMEVE eruptions. Despite its geographic scope, almost all eruptions within the JTM region are located within Japan, with the exception of three eruptions at Alamagan, Agrigan and Pagan in the Marianas.

Table 2 Regional breakdown of QEAR volcanoes and eruptions in LaMEVE

It is clear from both the number of volcanoes and eruptions that the majority are part of the circum-Pacific ‘Ring of Fire’. Table 2 also lists the number of eruptions per QEAR volcano in each region. The New Zealand to Fiji region has the highest number of eruptions per QEAR volcano, although this region only has 16 QEAR volcanoes out of 74 recognised volcanoes. This may be a reflection that few of the region’s volcanoes have experienced explosive activity in the Quaternary, or that a small number have been studied in detail. JTM and the Atlantic Ocean are both second with a mean of 6.8 eruptions per QEAR volcano. To test whether these differences are meaningful we have applied the Pearson’s chi-squared test to the data in Table 2.

Pearson’s chi-squared is used to measure whether the difference between variables (in this case, regions or eruptions) is one that could have occurred on the basis of chance alone. Firstly, we tested the number of QEAR volcanoes per region compared to the total number of identified volcanoes. The test assumes that if a particular region has a certain share of the total number of volcanoes (e.g. 10% in the case of JTM) then, if there were no statistical difference between the numbers of QEAR volcanoes in each of the regions, it would be expected to have a roughly equal share of the QEAR volcanoes. However, the test shows that there is a significant (p ≪ 0.01) difference between these values. In particular, JTM, Mexico and Central America, Kuril Islands and Europe all have significantly greater proportions of QEAR volcanoes than expected given the overall distribution of all identified volcanoes; Kamchatka and Mainland Asia, Indonesia, and the Middle East and Indian Ocean all have much lower proportions in comparison. This does not change when the influence of the JTM region is removed from the analysis. Although some of the difference here could be due to some regions having more volcanoes prone to explosive eruptions than others, the result remains statistically significant.
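The logic of this first test can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual procedure: the three regions and their counts are hypothetical (they are not values from Table 2), and the function names are our own. Expected QEAR counts are obtained by giving each region the same share of QEAR volcanoes as it has of all identified volcanoes.

```python
def expected_counts(all_volcanoes, observed_total):
    """Expected count per region if each region's share of the observed
    total matched its share of all identified volcanoes."""
    total = sum(all_volcanoes)
    return [observed_total * n / total for n in all_volcanoes]


def pearson_chi_squared(observed, expected):
    """Pearson's chi-squared statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three made-up regions (NOT LaMEVE values).
all_volcanoes = [100, 200, 700]
qear_volcanoes = [30, 30, 40]

expected = expected_counts(all_volcanoes, sum(qear_volcanoes))
statistic = pearson_chi_squared(qear_volcanoes, expected)
dof = len(qear_volcanoes) - 1  # degrees of freedom for a goodness-of-fit test

print(expected, round(statistic, 2), dof)
```

For these made-up numbers the statistic is far above 9.21, the critical value for two degrees of freedom at p = 0.01, so the null hypothesis of equal shares would be rejected. The same pattern, with QEAR counts as the baseline and eruption counts as the observations, corresponds to the second test described in the next paragraph.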

When considering the expected number of eruptions given the proportions of QEAR volcanoes, chi-squared results also show eruptions in some regions are significantly better recorded than others. Excluding the JTM data from the analysis, those regions with the most complete record are New Zealand to Fiji, Kamchatka and Mainland Asia, and Iceland and the Arctic Ocean; regions with the most under-recording are Africa and the Red Sea, Melanesia and Australia, and Indonesia. Unlike the results discussed in the previous paragraph, the tendency for some regions to have more explosive volcanoes would not influence these results as this has already been accounted for by using QEAR values (each QEAR volcano has had at least one M ≥ 4 eruption). Thus these results indicate that recording levels differ significantly between various regions of the world.

With much of the LaMEVE data being from the JTM region, characteristics particular to that region are likely to skew analysis of global data. As such, some of the following analyses investigate the whole global dataset, just the JTM dataset, and the rest of world (ROW) dataset (global excluding JTM).

Dating of eruptions

A mandatory criterion for the inclusion of an eruption in the LaMEVE database is that it must be dated, either directly or by proxy (Crosweller et al. 2012). Approximately one-third of eruptions in LaMEVE currently have no dating technique listed and are entered with dating technique ‘Unknown’. This is mostly due to the absence of this information in the sources from which the dates were derived. For eruptions with a known dating method, 97% are dated by historical, radiometric or proxy techniques (Table 3). The prevalence of each method changes markedly as a function of time.

Table 3 Percentage of eruptions in LaMEVE dated using each technique group

The availability of historical records (i.e. written records of observed eruptions) is variable globally and is strongly influenced by factors such as population distribution, colonisation, literacy levels and emergence of modern science. Since 1450 AD, over 50% of eruptions are dated from written records (Figure 2a). There is a clear decline in historical dating with increasing time, with only 23 historically dated M ≥ 4 eruptions prior to 1 ka (950 AD) and only three prior to 1 AD. These three eruptions occurred at Etna (n = 2) and Vulcano (n = 1), Italy, which lie in the ‘Cradle of Volcanology’, aptly named due to the availability of written records dating back to 1500 BC (Siebert et al. 2010). In addition to a temporal decline in historical dating, there are eruption size trends: 22% of M4-4.9 eruptions are dated by historical means, whilst only 8% of M5-5.9, 3% of M6-6.9 and 0.8% of M7-7.9 eruptions are dated in this way. There are no historical dates for M ≥ 8.

Figure 2

The use of historical and radiocarbon dating of eruptions in LaMEVE over time. (a) Percentage of eruptions dated using historical methods between 0–2 ka, calculated in 100 year bins. (b) Percentage of eruptions dated using 14C from 2.5-70 ka, calculated in 2,500 year bins. The peak at 56 ka is the Congo (CGT) eruption from Coatepeque Caldera (El Salvador), constrained using high sensitivity 14C analysis; see Rose et al. (1999) for details. For (a) and (b) bin sizes were selected to be larger than typical uncertainties in the age determinations.

Prior to 2 ka radiocarbon dating dominates with between 20% and 60% of eruptions dated by 14C analysis from 2 to 50 ka (Figure 2b). Radiocarbon dating is only viable to about 50 ka (Fairbanks et al. 2005; red line, Figure 2b) and is only possible if organic material has been preserved. Uncorrected 14C dates from the literature were calibrated and entered in LaMEVE using IntCal 09 (northern hemisphere) and ShCal04 (southern hemisphere) in CALIB Radiocarbon Calibration Program 6.1.0 (Stuiver and Reimer 1993) for dates up to 26 ka. Fairbanks et al. (2005) was used for calibration of dates of 26 to 50 ka. It is not always clear in the literature whether reported dates are calibrated or not; these dates are entered without calibration as ‘Unknown’ dating technique. Calib 7.0, released in 2013, now calibrates radiocarbon ages to 50 ka and will be used for calibration of dates in Version 3 of the LaMEVE database.

Eruptions older than 50 ka are predominantly dated using 40Ar-39Ar and K-Ar radiometric techniques. Over 50% of eruptions between 50 and 500 ka are dated using radiometric techniques; this increases to >75% for eruptions older than 1.5 Ma.

Eruption size and volume

The size of all eruptions in LaMEVE is provided using both VEI (Volcanic Explosivity Index) and magnitude (M) scale, together with bulk and DRE volume data, which are used in the calculation of magnitude. VEI is derived through quantitative and qualitative descriptions of eruptions (Newhall and Self 1982). The VEI is the dominant eruption size parameter utilised in the Smithsonian Institution’s VOTW 4.0. Magnitude, whilst normally comparable to VEI (Crosweller et al. 2012), provides a more precisely defined quantitative measure of eruption size as a function of erupted mass, using Equation 1 (Pyle 2000):

\[ M = \log_{10}\left[\text{erupted mass (kg)}\right] - 7 \tag{1} \]

or the Dense Rock Equivalent (DRE) volume:

\[ M = \log_{10}\left[\text{DRE (m}^3\text{)} \times \text{magma density (kg/m}^3\text{)}\right] - 7 \tag{2} \]

DRE is calculated from the tephra volume:

\[ \text{DRE (km}^3\text{)} = \frac{\text{tephra vol (km}^3\text{)} \times \text{tephra density (kg/m}^3\text{)}}{\text{magma density (kg/m}^3\text{)}} \tag{3} \]

Magnitude is the eruption size measure used in all quantitative analysis in this paper. It is frequently sourced from published works where it is based on volume estimates for tephra fall deposits, pyroclastic flow deposits (ignimbrites) and intra-caldera deposits. Where unavailable in the literature, magnitudes were calculated specifically for LaMEVE from the published volume data or VEI values. There are numerous volume estimation methods, but unfortunately the method used is commonly not provided or clearly described in the source literature. The most widely used method for calculation of volume of tephra fall deposits is based on the methodologies of Pyle (1989; 2000). LaMEVE contains magnitudes derived in a variety of ways. Papers may report anywhere from one to six pertinent parameters, namely VEI, bulk tephra volume, DRE, magnitude, tephra density and magma composition (from which magma density can be calculated). 46% of the LaMEVE magnitude entries are derived from literature sources that provide one or more of VEI, bulk tephra volume and DRE. Another 46% are derived from directly reported magnitude values. Most literature does not directly report magnitudes; however, the two Japanese databases include this metric. The remaining 8% of magnitudes in LaMEVE are derived from sources describing the occurrence of large magnitude explosive eruptions without further information: these are judged to have been M ≥ 4. For calculated magnitudes the tephra density is taken as 1000 kg/m³ if there is no information on deposit density in the original source and no DRE cited. Magma density is assumed to be 2500 kg/m³ (andesite) if the magma type is not known. The bulk tephra and DRE volumes are closely related. There is some overlap in the DRE volume range for each magnitude band due to variations in magma density values (from 2300 to 2700 kg/m³).
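As an illustration of how Equations 1 to 3 combine, the following Python sketch converts a bulk tephra volume to DRE and then to magnitude. The default densities are those stated above; the function names are our own and the code is not taken from the LaMEVE tooling.

```python
import math

# Default densities quoted in the text for cases with no other information.
DEFAULT_TEPHRA_DENSITY = 1000.0  # kg/m^3
DEFAULT_MAGMA_DENSITY = 2500.0   # kg/m^3 (andesite)


def magnitude_from_mass(mass_kg):
    """Equation 1: M = log10(erupted mass in kg) - 7."""
    return math.log10(mass_kg) - 7.0


def dre_from_bulk(bulk_km3,
                  tephra_density=DEFAULT_TEPHRA_DENSITY,
                  magma_density=DEFAULT_MAGMA_DENSITY):
    """Equation 3: DRE volume (km^3) from bulk tephra volume (km^3)."""
    return bulk_km3 * tephra_density / magma_density


def magnitude_from_dre(dre_km3, magma_density=DEFAULT_MAGMA_DENSITY):
    """Equation 2, after converting km^3 to m^3 (1 km^3 = 1e9 m^3)."""
    return magnitude_from_mass(dre_km3 * 1e9 * magma_density)


def magnitude_from_bulk(bulk_km3,
                        tephra_density=DEFAULT_TEPHRA_DENSITY,
                        magma_density=DEFAULT_MAGMA_DENSITY):
    """Chain Equation 3 and Equation 2 for a bulk tephra volume."""
    dre = dre_from_bulk(bulk_km3, tephra_density, magma_density)
    return magnitude_from_dre(dre, magma_density)


print(round(magnitude_from_bulk(1.0), 2))  # 1 km^3 of bulk tephra
```

With the default densities, 1 km³ of bulk tephra corresponds to 0.4 km³ DRE and an erupted mass of 10¹² kg, i.e. M 5.0. Note that when a magnitude is computed from bulk volume the magma density cancels, so the choice of magma density only affects magnitudes computed directly from a reported DRE.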

Recent studies to assess uncertainties in volume estimates from tephra fall deposits (Connor and Connor 2006; Burden et al. 2013; Engwell et al. 2013) indicate that even the best-documented tephra fall deposits have volume uncertainties greater than 10%. Uncertainties in volumes are not commonly reported, but where they are, estimates are typically below half an order of magnitude. Problems also arise when attempting to distinguish between bulk and DRE volumes in literature sources, a matter also highlighted by Mason et al. (2004). Furthermore, substantially diverse volumes can be reported for a single eruption in different literature sources, with occasionally incompatible bulk and DRE volumes between sources and even within the same source. In 64 cases, mostly from Japan, reported magnitudes are either greater or less than the magnitude implied by the reported DRE volume. All data have been included in LaMEVE but labelled as either ‘preferred’ or ‘alternate’ values. Our analysis in this paper is based on the preferred magnitude values only. The identification of ‘preferred’ is a subjective judgement based on the assessed quality of the various sources of information. Given these quite large uncertainties, analysis of LaMEVE at higher resolution than order of magnitude bins is not justified. Thus here we analyse LaMEVE data collectively in bins of M4-4.9, M5-5.9, M6-6.9, M7-7.9 and M8-8.9.

The number of recorded events decreases with increasing magnitude (Table 4). Although these data can be fitted to a power law, this result is meaningless because of the effects of under-recording explained in the next section.

Table 4 The number of Quaternary eruptions in LaMEVE in each magnitude band

Temporal variations in recorded magnitude

The recorded number of eruptions in LaMEVE greatly increases towards the present day (Figure 3). Indeed, 41% of all eruptions in LaMEVE occurred in the Holocene. Here we assume that global volcanism is stationary and that the increase with time is caused overwhelmingly by under-recording. We provide strong justification for this assumption a posteriori after first looking at the data. Factors affecting the quality of the eruption record include volcano location, the size and distribution of deposits, the eruption impact on the population, available records and completed geological studies of tephra deposits. Within the magnitude-time plot (Figure 3) every eruption record is represented by one data point, colour-coded to show the derivation method of the magnitude (literature, calculated or assumed). Clusters of eruptions at M4.0, M5.0, M6.0 and M7.0 indicate that some values in the published literature have been rounded.

Figure 3

A scatter plot illustrating each recorded eruption by magnitude and time. Eruptions are colour-coded according to how magnitude is derived (see text). The pink line is the temporal half-way point of the database (1.25 Ma). The blue line at about 20 ka indicates the time when 50% of all eruptions (M ≥ 4) have been recorded in the database, and the grey line at about 108 ka indicates the time when 50% of M ≥ 6 have been recorded.

If global volcanic activity were in steady state (a stationary system), eruptions should be evenly distributed with time and frequency should be invariant. As such, for a database of 1,883 events covering 2.6 Myr, approximately half of these events (940) should be found in the first 1.3 Myr and the other half in the period from 1.3 to 2.6 Ma. However, 940 eruptions are recorded in just the most recent 20,000 years (<1% of the time span of the database). A visual inspection of Figure 3 indicates that the number of recorded events decreases both as a function of time and as a function of magnitude, a result found by Deligne et al. (2010) for the Holocene. To first order larger eruptions are both better recorded historically and better preserved in the geological record. For example, 50% of M ≥ 6 eruptions occur within the last 108 kyr (Table 5). Whilst this is still a long way from the halfway point in time of the database (1.3 Ma), it is a huge improvement over the completeness of the record when smaller magnitude (M < 6) eruptions are included.
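The reasoning above can be expressed as a simple ratio: under stationarity and complete recording, the median eruption age (T50) should fall at half the time span of the database. The Python sketch below computes that ratio for the two T50 values quoted in the text. The function names and the list of ages are hypothetical and serve only as illustration.

```python
import statistics

SPAN_KA = 2580.0  # Quaternary time span used in the text (2.58 Myr), in ka


def median_age(ages_ka):
    """T50: the age by which half of the recorded eruptions have occurred."""
    return statistics.median(ages_ka)


def completeness_ratio(t50_ka, span_ka=SPAN_KA):
    """Observed T50 divided by the T50 expected for a stationary, fully
    recorded process (half the time span). A value of 1 means no skew
    towards the present; values near 0 indicate strong under-recording."""
    return t50_ka / (span_ka / 2.0)


# T50 values quoted in the text: about 20 ka for M >= 4, 108 ka for M >= 6.
print(round(completeness_ratio(20.0), 3))
print(round(completeness_ratio(108.0), 3))

# T50 computed from a hypothetical list of eruption ages (ka).
hypothetical_ages_ka = [0.5, 2.0, 15.0, 40.0, 900.0]
print(median_age(hypothetical_ages_ka))
```

The two ratios come out at roughly 0.016 and 0.084, which restates numerically that the M ≥ 6 record is markedly less skewed towards the present than the full M ≥ 4 record, while both remain far from the value of 1 expected for a complete record.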

Table 5 The median age, T50, of eruptions in each magnitude band

Identification of temporal trends is facilitated by cumulative graphs of recorded eruptions (Figure 4). In Figure 4a the data are normalised to the total number of events in bins of M4-4.9, M5-5.9, M6-6.9, M7-7.9 and M8-8.9, and clearly show that larger magnitude events have more complete records. Figure 4b depicts the data from the last 50 kyr excluding 1 ka to the present day, which is dominated by historical data (see Figure 2a). This screening permits us to examine the geological record, noting that extensive analysis of the historical under-recording has already been published (Simkin 1993; Coles and Sparks 2006; Deligne et al. 2010; Furlan 2010). Here we focus specifically on under-recording in the geological record. Figure 4b shows an increased level of recording in the Holocene from about 10 ka and a much reduced recording of events before about 35 ka.

Figure 4

Cumulative numbers of eruptions over time by magnitude. (a) Cumulative number of recorded eruptions from 2 Ma to 2013 AD normalised from 0–1; (b) Cumulative global eruptions over the time span 50–1 ka. The data suggest an overall exponential growth rate in eruption recording.

The improvement in the completeness of the geological record of events from about 35 ka to the present day is evident in eruptions of all magnitudes except M ≥ 7, which is more consistent through time. Although M4-4.9 events are most common within the overall dataset (Table 4) these are poorly represented in the geological record (>1 ka) and only exceed the recording of M5-5.9 eruptions after approximately 10.5 ka (Figure 4b). From 1 ka to the present eruptions follow the expected order of decreasing eruption frequency with increasing magnitude. The data indicate that pre-Holocene geological preservation is very poor for M < 5 eruptions.

When separated from the rest of world (ROW) data the Japan, Taiwan and Marianas (JTM) dataset shows a more complete record of small magnitude events back into the Pleistocene (see Additional file 1: Figure S1). Here, eruptions of M4-4.9 become the most frequent eruption type at about 18 ka in contrast to around 6 ka in the ROW dataset. The JTM data also show a more steady level of recording of events of M6 and above, whilst only M7 and above eruptions show this feature in the ROW data. This difference demonstrates a more comprehensive eruption record in the JTM region, including the previously published and independent volume calculations of Hayakawa (2010), and that records of small magnitude events are more incomplete elsewhere. The difference may reflect a contrasting geological history as well as admirably thorough studies of Quaternary Japanese tephras.

If volcanic activity was uniform over time and all events had been recorded then cumulative plots of the number of events with time would result in a straight line. We have normalised the data for each magnitude interval by the total number of events in the interval so that the normalised parameter has a value of 1.0 at 1 ka (Figure 5). There is still substantial under-recording at 1 ka for smaller magnitude eruptions (M4 to M6), evident from comparisons of frequencies at 1 ka with historical data (see Coles and Sparks 2006; Deligne et al. 2010; Furlan 2010). Plots in Figure 5 only allow the relative decrease in event recording in the geological record to be evaluated compared to the recording of events at 1 ka. Figure 5 confirms the increasing degree of geological recording with increasing magnitude by the narrowing gap between the expected line and the cumulative data curve as magnitude increases (e.g. there is a much greater departure from the expected line for M4-4.9 eruptions than for M7-7.9 eruptions). The difference between the expected line and observed curve likely reflects the preservation potential of eruptions: M4 events produce smaller volume, laterally constrained deposits and thus have a smaller footprint than larger eruptions, and are more easily eroded or removed by subsequent activity. When analysed separately, the ROW data show greater departure from the uniform growth rate for each magnitude class compared with the JTM data (see Additional file 1: Figure S2).
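A minimal Python sketch of this comparison follows. It assumes the 50 to 1 ka window used in the text; the eruption ages are hypothetical and the function name is our own. For each recorded eruption it returns the normalised cumulative count (reaching 1.0 at the youngest event in the window) alongside the value expected on the straight line for uniform, fully recorded activity.

```python
def cumulative_vs_expected(ages_ka, t_old=50.0, t_young=1.0):
    """Return (age, observed, expected) tuples, oldest eruption first.

    observed: cumulative count normalised by the number of events in the
              window, so it reaches 1.0 at the youngest event.
    expected: fraction of the window elapsed at that age, i.e. the straight
              line for uniform activity with complete recording.
    """
    in_window = sorted((a for a in ages_ka if t_young <= a <= t_old),
                       reverse=True)
    n = len(in_window)
    rows = []
    for i, age in enumerate(in_window, start=1):
        observed = i / n
        expected = (t_old - age) / (t_old - t_young)
        rows.append((age, observed, expected))
    return rows


# Hypothetical eruption ages in ka (NOT LaMEVE data); 0.3 ka lies outside
# the window and is ignored.
for age, observed, expected in cumulative_vs_expected(
        [40.0, 12.0, 6.0, 3.0, 1.5, 0.3]):
    print(f"{age:5.1f} ka  observed={observed:.2f}  expected={expected:.2f}")
```

Where the observed value lags the expected value, as it does for most of the window in this made-up example, records are concentrated towards the present; the size of that gap is what Figure 5 compares across magnitude classes.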

Figure 5

A comparison between the number of expected and recorded eruptions. The expected (red line) cumulative number of eruption records if volcanic activity was both uniform and fully recorded (assuming a complete record at 1 ka), and the actual (blue line) record of eruptions of magnitude intervals 4 to 7. Results are shown up to 1 ka to focus solely on the geologic record.

The increasing under-recording of eruptions with decreasing magnitude is also shown through the calculation of the b-value from the data in Table 4. The Gutenberg-Richter law in seismology relates the frequency of earthquakes to their magnitude on a logarithmic scale, and normally yields a b-value of 1.0 (Gutenberg and Richter 1956). The b-value generated from the LaMEVE dataset, 0.46 (Additional file 1: Figure S3), is much lower, indicating a much higher recorded frequency of large eruptions in comparison with smaller events. The JTM and ROW datasets analysed separately produce a similar b-value.
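For readers unfamiliar with b-values, the sketch below shows one common way to estimate one: an ordinary least-squares fit of log10(N) against magnitude, where b is the negative of the slope. This is an assumption about the fitting procedure, which the text does not specify (including whether cumulative or per-band counts were used); the counts are hypothetical and the function name is our own.

```python
import math


def fit_b_value(magnitudes, counts):
    """Least-squares fit of log10(count) = a - b * magnitude; returns b."""
    xs = list(magnitudes)
    ys = [math.log10(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope


# Hypothetical counts (NOT Table 4): a tenfold drop per magnitude unit,
# which corresponds to b = 1.
print(round(fit_b_value([4, 5, 6, 7], [1000, 100, 10, 1]), 2))
```

A dataset in which counts fall by less than a factor of ten per magnitude unit gives b < 1, which is the situation described above for LaMEVE.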

We dismiss the possibility that global trends in the event rate are caused by a dramatic global increase in volcanic activity. Rates of volcanism are principally controlled by rates of plate motion and mantle convection related to hot spots, which change slowly over many millions of years. Thus, global changes related to plate and hot spot processes over the time scales being considered are not credible. However, these arguments are less compelling on a regional scale where, for example, local changes in plate motions and changes from upper plate extension to compression in arcs or vice versa might lead to changes in rates of explosive volcanism. There is evidence for an increase in global volcanism related to deglaciation (e.g. Huybers and Langmuir 2009), however this would not cause a decrease back in time but rather fluctuations related to glacial cycles. In a later section we look at the evidence for event rate variations related to glaciations. Any real fluctuations in rates of volcanism with time (non-stationarity) should affect all magnitudes approximately equally, were it a significant factor. However, the decrease back in time is a strong function of magnitude (Figure 5), a relationship easily explained by under-recording, but essentially inexplicable by a physical process. We do not exclude the possibility of non-stationarity in Quaternary global explosive volcanism but claim that the evidence shows that it is masked by under-recording.

Completeness of the eruption record as a function of time

The ‘expected’ (red) lines in Figure 5 are derived from the total cumulative number of eruptions from 50 to 1 ka. From 1 ka to the present day there is an increasing number of recorded M < 7 eruptions (not shown in Figure 5). The expected frequency of eruptions can be calculated from the historical period (1 ka to 2013) and from this a comparison can be made with the recorded number to examine recording levels as a function of time. Here we adopt a simplified analysis of the available data to provide a first order estimation of the percentage of eruptions recorded at particular times.

The eruption record dramatically improves from 1 ka toward the present and statistical studies conducted by Furlan (2010) suggest two recent points at which record completeness undergoes marked improvements: 1500 AD and 1900 AD. We therefore calculate the mean number of eruptions per century in the periods 450 to 1500 (centred around 1 ka), 1500 to 1900, and 1900 to 2013 to demonstrate the improvement in eruption records (Table 6). It is expected that the most recent record of 1900 to 2013 has the most complete dataset, though this and the period 1500 to 1900 are of insufficient bin size for capturing M ≥ 6.5 eruptions, which have longer recurrence intervals. During the period 450 to 1500 the recording level of M4-4.9 eruptions falls at just 14% of the most recent period. Assuming recent volcanism rates are uniform, this suggests that prior to 1 ka fewer than two out of ten M4-4.9 eruptions are recorded. Under-recording is even more prominent when smaller (M < 4) eruptions are also considered. Based on extrapolation of recent historical eruption rates, Siebert et al. (2010) inferred that <2% of all Holocene eruptions have been recorded.

Table 6 The mean number of eruptions per century during different historical time periods

We compare geological and historical levels of eruption recording using these time intervals to better quantify under-recording at particular times (Figure 6). The geological record of M4-4.9 eruptions falls far below the mean event rate of 56 eruptions per century observed from 1900 to 2013 with between 5 and 21% of the expected eruptions between 2 and 1 ka, decreasing to <10% before this (Figure 6a). Likewise, the eruption record from 1900 onwards suggests an average of 9 M5-5.9 eruptions per century, whereas the actual eruption record for 50 ka to 1 ka only contains between 10 and 50% of the expected eruptions and is frequently <10% prior to 50 ka (Figure 6b). However, records improve with increasing magnitude: 25% of the expected M6-6.9 eruptions, based on the 1900 to 2013 record, are recorded throughout much of the Holocene and earlier (Figure 6c).
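The comparison above amounts to dividing the recorded count in each 100 year bin by the rate expected from the 1900 to 2013 record. The following Python sketch illustrates the arithmetic; the reference rate of 56 M4-4.9 eruptions per century is taken from the text, whereas the bin counts and the function name `recorded_fraction` are hypothetical.

```python
def recorded_fraction(recorded_per_century, expected_per_century):
    """Fraction of the expected eruptions that appear in the record."""
    if expected_per_century <= 0:
        raise ValueError("expected rate must be positive")
    return recorded_per_century / expected_per_century

# Reference rate from the text: 56 M4-4.9 eruptions per century (1900 to 2013).
EXPECTED_M4 = 56.0

# Hypothetical counts for individual 100 year bins (illustrative only).
for count in (3, 6, 12):
    pct = 100.0 * recorded_fraction(count, EXPECTED_M4)
    print(f"{count} recorded -> {pct:.0f}% of expected")
```

This simple ratio assumes stationarity, i.e. that the recent rate is representative of the whole period being compared.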

Figure 6

A time series of eruptions per century from 50 ka to 1 ka, shown in comparison to the number of expected events. The recorded number of (a) M4-4.9, (b) M5-5.9, (c) M6-6.9 eruptions (blue line) in 100 year bins, and the expected eruptions if volcanic activity is stationary and fully recorded (black line), based on activity from 1900 to 2013. Levels indicative of 50% and 10% of expected eruptions are shown.

Eruption preservation potential

Following the magnitude-time plot of Figure 3, we quantify the relationship between magnitude and recording level by determining the median time, T50, for each magnitude bin (Table 7). At T50, 50% of recorded eruptions are older. We calculate T50 both for the geological dataset (1 ka to 2.5 Ma) and for the whole dataset (inclusive of historical data from 1 ka to 2013). The median is a superior measure to alternatives, such as exponential time constants, because no assumptions are made concerning the distribution of events of a given magnitude with time.
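A minimal Python sketch of the T50 calculation is given below. The eruption ages are hypothetical values in years before present, chosen only to show the mechanics; the function name `t50` and the dictionary layout are likewise assumptions rather than part of the LaMEVE tooling.

```python
import numpy as np

def t50(ages_yr):
    """Median age: half of the recorded eruptions are older than this."""
    return float(np.median(np.asarray(ages_yr, dtype=float)))

# Hypothetical eruption ages (years before present) for two magnitude bins.
ages_by_bin = {
    "M4-4.9": [1200, 1500, 2300, 4800, 9000, 15000, 42000],
    "M7-7.9": [26500, 74000, 340000, 640000, 1200000],
}

for label, ages in ages_by_bin.items():
    print(label, t50(ages))
```

Because the median depends only on the rank order of ages, it needs no assumed functional form for how recording decays with time.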

Table 7 The median age, T50, for data in each magnitude bin for time periods ending at 1 ka and 2013

T50 increases with increasing magnitude; however, even for M8-8.9 eruptions, T50 (935 kyr) is 359 kyr less than the central point, suggesting under-recording for even the largest events. All T50 values for lower magnitudes are considerably smaller than the 1.3 Ma central point of time for the database, which is simply explained by under-recording. The relationship between magnitude and T50 reflects the probability of preservation of the eruption deposits, which is primarily a function of volume and time. Large volume eruptions (with consequently larger magnitudes) have thicker deposits and cover larger areas than smaller volume eruptions, resulting in surface area to volume ratios commensurate with an increased length of time required for erosion to remove in situ physical evidence of the eruption. Factors such as the depositional environment (e.g. marine deposition) and climate also strongly influence preservation. Anthropogenic activities are less likely to remove the largest and thickest deposits, whilst smaller deposits in populated zones may well be quarried or otherwise removed without detailed study.

The log of T50 plotted against magnitude shows a linear increase in T50 with every increase in magnitude bin (Figure 7) with a slope of approximately 0.5. As magnitude is a function of the log of erupted material, this suggests a power law relationship between the under-reporting and volume erupted, both for the geological data and whole dataset. We propose that T50 serves as a proxy for geological preservation. As Figure 7 shows, the relationship between T50 and magnitude (M) follows the form:

$$\log_{10}\left[T_{50}(M),\ \text{in yrs}\right] \approx 0.5\,M + a \qquad (4)$$

where a is a constant. Using the definition for magnitude (Equation 1), this suggests that the relationship between T50 and erupted mass (m) follows the form:

$$T_{50}(m) \approx b\,m^{0.5} \qquad (5)$$

where b is a constant.
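The step from Equation 4 to Equation 5 can be written out explicitly. The sketch below assumes that Equation 1 has the standard form of Pyle (2000), M = log10(m) − 7 with m in kg, which is not restated in this section; under that assumption the constant b follows directly from a.

```latex
\begin{align*}
\log_{10} T_{50} &\approx 0.5\,M + a \\
                 &= 0.5\,(\log_{10} m - 7) + a \\
                 &= 0.5\,\log_{10} m + (a - 3.5), \\
T_{50}(m)        &\approx 10^{\,a - 3.5}\, m^{0.5} = b\, m^{0.5},
\qquad b = 10^{\,a - 3.5}.
\end{align*}
```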

Figure 7

The log of the median time for eruptions in each magnitude bin. The full dataset inclusive of historical data is shown in red, and the geological data beyond 1 ka only is shown in blue. The overlap in the data at M ≥ 7 is due to the long return periods for such eruptions, hence the dominance of the geological data. The purple dashed line indicates the mid-point of the Quaternary. T50 values lying at or close to this line would indicate perfect recording levels.

The same relationship is found for the data when divided into the JTM and ROW datasets, with increased divergence at lower magnitudes due to the superior record in the JTM region (Additional file 1: Figure S4).

Regional T50

Figure 8 shows the median ages for eruptions in each magnitude interval for each of the 19 regions in the LaMEVE database, demonstrating considerable recording variability around the globe. In general, lower magnitudes have the smallest T50 values, consistent with the interpretation that recording quality of smaller events is highly skewed towards the present.

Figure 8

Regional analysis of T50 for each magnitude interval. A lower value of T50 suggests more under-recording. Within a region, the tighter the grouping of the points, the more consistent the level of recording is between magnitudes.

There is regional variability (Figure 8; Additional file 1: Table S1), with the Kuril Islands, Indonesia, and the Philippines and South East Asia all indicating the greatest levels of under-recording with T50 ages of <250 years BP for M4-4.9 eruptions. The Middle East and Indian Ocean and Antarctica have the highest T50 ages (around 150 ka) for M4-4.9 eruptions, although we note only six eruptions of this size are recorded between them. The Mediterranean and West Asia (40 M4-4.9 eruptions), Africa and the Red Sea (17 M4-4.9 eruptions) and Japan, Taiwan and the Marianas (312 M4-4.9 eruptions) could therefore be interpreted to have the least under-recording in M4-4.9 eruptions, with T50 ages of ~15 ka to 38 ka. A similar degree of recording is suggested in Canada and the Western USA, Hawaii and Pacific Ocean, Mexico and Central America, South America and the West Indies by their comparable T50 ages of around 2 ka. These regions lie within a narrow longitudinal range and this, along with those regions showing greatest under-recording, indicates an anthropogenic recording bias based on longitudinal location.

Indonesia and the Philippines and SE Asia show relative under-recording of M5-6.9 events, and Iceland and Arctic Ocean display consistent under-recording of M4-6.9 eruptions. JTM shows thorough recording with the most populous dataset of eruptions and consistently high T50 ages. The variability between recorded events from present to 20 ka in a selection of regions can also be seen in Figure S5 (Additional file 1), which highlights the reduction in recording of eruptions in Indonesia, the southwest Pacific and Iceland before 1 ka.

Cumulative erupted volume

The cumulative dense rock equivalent (DRE) of all M ≥ 4 recorded Quaternary eruptions is about 26,000 km³, corresponding to a total bulk volume of about 55,000 km³. Almost half of this was erupted by M ≥ 8 events (Figure 9), which account for just 1% of eruptions (Table 4), whilst recorded M4-4.9 eruptions (44% of eruptions) contribute <1% of the total erupted volume.

Figure 9

The percentage of the DRE volume by magnitude interval produced globally throughout the Quaternary.

Despite increasing numbers of recorded eruptions towards the present day, the volume erupted over time follows an approximately steady production rate in the last 50 kyr (Figure 10a). This is likely an artefact of the far superior recording of very large eruptions over the Quaternary. Early contribution to cumulative erupted volume is dominated by large eruptions with relatively few low magnitude eruptions contributing to the total; over 50% of the volume erupted over the last 50 kyr was from M ≥ 7 eruptions. Closer to the present a large proportion of eruptions have low magnitudes, generating the observed relatively constant reported volume production. Of course, there are discrete large increases in cumulative volume on short time scales, related to individual large magnitude eruptions (Figure 10b). The largest discrete jump in cumulative volume in the last 50 kyr is caused by the Oruanui eruption at Taupo at 27.1 ka, which erupted about 530 km³ DRE.

Figure 10

The DRE volume over time as a cumulative sum and by magnitude band. Cumulative DRE volume for the periods (a) 50 ka to 1 ka and (b) 1 ka to 2013. Eruptions responsible for large volume jumps are labelled and adjacent pie charts illustrate the percentage of volumes attributable to eruptions of different magnitudes. Note that the Kuwae eruption is controversial, with the magnitude and volume described in the literature as shown in the diagram from Hayakawa (1996) and Geyer and Martí (2008) respectively, as studies by Németh et al. (2007) found no evidence of caldera-scale deposits.

The largest eruptions (M ≥ 8) recorded within the last 100 kyr occurred at Kikai, Sakura-jima (Aira caldera) and Aso in Japan, Taupo in New Zealand and Toba in Indonesia. A further 21 eruptions of this size are recorded earlier in the Quaternary. Eruptions of magnitude M ≥ 7 are more common, with 34 in the last 100 kyr (Figure 11). However, Figure 11 illustrates the strong effect of the poor record for the smaller magnitude eruptions (M < 6), which are notably absent from most latitudes north of 60°N and south of 60°S. We recognise that Ellis et al. (2012) determined that some ‘super-eruptions’ may in fact be multiple discrete events, each of a smaller magnitude (though many still M ≥ 8), which may be separated using improved comprehensive dating techniques. Thus, future geological investigations could affect the number of eruptions in all magnitude categories in the LaMEVE database, with the greatest statistical impact on the M ≥ 8 category.

Figure 11

Global eruptions from 2013 AD to 100 ka between 70°N and 70°S. Circle area represents the bulk volume. Note that the volume recorded for the Kikai eruption suggests a M7.2 eruption (as shown), whilst it is published in the literature as M8.1.

Glacial control on eruption incidence

Although LaMEVE is dominated by improved recording of eruptions towards the present day (Figure 3), there is evidence for super-imposed variability on this overall trend. Here we investigate whether the LaMEVE dataset supports suggestions of an increase in volcanic activity at the end of the last glacial period of the Quaternary (e.g. Huybers and Langmuir 2009). This increase has previously been attributed to enhanced mantle decompression and melt generation due to ice unloading in glaciated regions. Figure 12a provides a time series of recorded events over the last 50 kyr.

Figure 12

The number of eruptions over time for glaciated and non-glaciated regions. Number of eruptions over time for (a) the global dataset from 1 to 50 ka, showing a peak in activity at ~10 ka; (b) a detailed view of the period 5 to 15 ka encompassing this peak, showing eruptions in regions which were glaciated during the Last Glacial Maximum (LGM); and (c) the period 5 to 15 ka in non-glaciated regions. A moving average trendline is applied to all graphs to reduce noise (black line) and the number of eruptions is calculated in 100 year bins.
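The binning and smoothing described in the caption can be sketched in Python as follows. The 100 year bin width comes from the caption; the five-bin window, the example ages and the function names are assumptions made for illustration, since the window used for Figure 12 is not stated.

```python
import numpy as np

def eruptions_per_bin(ages_yr, start_yr, end_yr, bin_width_yr=100):
    """Count eruptions in fixed-width age bins spanning [start_yr, end_yr]."""
    edges = np.arange(start_yr, end_yr + bin_width_yr, bin_width_yr)
    counts, _ = np.histogram(ages_yr, bins=edges)
    return edges, counts

def moving_average(counts, window=5):
    """Centred moving average with the same length as the input.

    Values within window // 2 bins of either end are biased low,
    because np.convolve pads the series with zeros.
    """
    kernel = np.ones(window) / window
    return np.convolve(counts, kernel, mode="same")

# Hypothetical eruption ages (years before present) in the 5 to 15 ka window.
ages = [5050, 9950, 10020, 10080, 10150, 14990]

edges, counts = eruptions_per_bin(ages, 5000, 15000)
smoothed = moving_average(counts, window=5)
print(len(counts), counts[50], smoothed[50])
```

A wider window suppresses more noise but also flattens short-lived peaks, so the apparent height and width of a peak such as that near 10 ka depends on this choice.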

During the Last Glacial Maximum (LGM) ice sheets and high altitude glaciers reached their maximum extent in the northern hemisphere. LGM glaciation peaked between 33 ka and 26.5 ka and onset of ice retreat initiated at about 20–19 ka (Clark et al. 2009). Major ice retreat occurred at the beginning of the Holocene and retreat was largely complete by 7 ka (Peltier 1994).

Huybers and Langmuir (2009) investigated a link between volcanic activity and deglaciation through analysis of a selection of regions from the GVP and Bryson et al. (2006) databases, both of which are included and expanded upon in LaMEVE. Kutterolf et al. (2013) identified cyclicity in activity in marine tephras across 1.2 Ma, with increased volcanism following periods of deglaciation. They also detected an increase in eruptions in non-glaciated regions and suggested that stress field changes affected regions beyond the limits of glaciations, perhaps related to sea level rise. Kutterolf et al. (2013) did not investigate a global dataset but instead the activity in the Central American Volcanic Arc and locations around the Pacific rim, resulting in a relatively small dataset for each glacial cycle. Watt et al. (2013) proposed that the volcanic record for regions identifies localised cycles and that correlations with deglaciation may be coincidental rather than demonstrating causation. Watt et al. (2013) examined the published eruption record on a regional scale and found varying responses to deglaciation, with an estimated increase in eruption rate in post-glacial activity by a factor of about 2, much lower than the factor of 4 to 5 estimated by Huybers and Langmuir (2009).

Here we compare LaMEVE data in glaciated and non-glaciated regions around the world, with the ice extent derived from Huybers and Langmuir (2009) and Watt et al. (2013) (Additional file 1: Table S2). Areas which underwent significant deglaciation were the Southern Andes, Alaska, the Cascades, Iceland, Western Europe and Eastern Russia; the tropical Americas, Africa and Southeast Asia were not glaciated (Huybers and Langmuir 2009 and references therein). Figure 12b and c highlight the 15 to 5 ka period, distinguishing between high latitude (glaciated) and low latitude (non-glaciated) regions. The deglaciated regions show a peak in volcanic activity between about 11 and 9 ka (Figure 12b) with a much lower number of eruptions prior to this, reflecting temporal improvements in eruption records. The peak in glaciated regions falls within the 12 to 7 ka peak identified by Huybers and Langmuir (2009). Watt et al. (2013) also found a peak in volume output between about 11 and 8 ka. The increased peak in glaciated regions in the LaMEVE data (Figure 12b) is thus consistent with the hypothesis of increased volcanism through ice-unloading. Although no peak is evident in non-glaciated regions there is a noticeable increase at the beginning of the Holocene (Figure 12c). A difference between our study and others arises due to the inclusion of M < 4 events in the analysis of Huybers and Langmuir (2009), Kutterolf et al. (2013) and Watt et al. (2013). However, large explosive events dominate the record further back in time and so this difference may have a limited effect. A peak occurs earlier in New Zealand data (12 to 11 ka) and may reflect the earlier onset of deglaciation in the southern hemisphere (Sikes et al. 2013).

Huybers and Langmuir (2009) developed an eruption factor, Ef, comparing eruption frequency in the glaciated and non-glaciated regions. Applying it to a particular selection of glaciated and non-glaciated regions, they found a peak in volcanism in glaciated regions. However, Watt et al. (2013) demonstrated that a small change to the assumptions on glacial extent and to the chosen regions significantly changes the size and occurrence of this peak. They also suggested that analysis of Ef is only valid if the eruption rates in glaciated and non-glaciated regions are equal, which, given the considerable spatial variation in eruption rates, we know is not the case. Analysis of LaMEVE supports this, illustrating the regional variability in the quality of the eruption record, with considerably different numbers of events between time periods, regions and between glaciated and non-glaciated areas in the same time period (Additional file 1: Figure S5). Comparison of a selection of regions indicates a higher proportion of eruptions were recorded in glacial regions. Correction of the data as suggested by Watt et al. (2013) and calculation of the Ef (not shown) nonetheless still indicates a post-glacial peak in the LaMEVE data. Thus the finding of a peak during deglaciation seems robust.

Analysis of the LaMEVE data indicates increased numbers of eruptions in regions undergoing deglaciation at the end of the LGM. A difficulty arises from the analysis of dating techniques (Table 3, Figure 2), which indicates the predominance of radiocarbon dating from 50 to 2 ka, with approximately 50% of eruptions dated by radiocarbon analysis during the Holocene. Extensive ice sheets reduce both the likelihood of tephra preservation and the material required for carbon dating through the absence of soils, lakes and vegetation. This decreases the probability of the recording of eruptions in glacial stages. Thus, the apparent decline in eruption numbers prior to about 11 ka may be caused by a sampling bias. However, such a bias would not explain the large peak in Figure 12b with a decline in recording after about 9 ka. It is the comparison of this early post-glacial activity with that later in the Holocene that provides evidence for real variation in eruption frequency over time, with deglacial pulses of volcanism observed lasting a few thousand years (Watt et al. 2013). Our peak is dominantly controlled by a few regions: Kamchatka and Mainland Asia, South America, Alaska and the Aleutian Islands and New Zealand. A pulse of increased effusive volcanism in Iceland is discussed by Watt et al. (2013) and references therein, which is much less well constrained in the explosive record, with data from LaMEVE indicating 5 eruptions between about 12 and 9 ka with increased frequency later in the Holocene.

Column height and intensity

Intensity is a measure of the rate of magma discharge (Pyle 2000) and can be estimated from a theoretically expected correlation with column height that has been empirically confirmed (Sparks et al. 1997; Mastin et al. 2009). Following Pyle (2000), intensity, I, is calculated as follows:

$$I = \log_{10}\left[\text{mass eruption rate (kg/s)}\right] + 3 \qquad (6)$$

Or

$$I = \log_{10}\left[\left(\frac{\text{height (km)}}{167}\right)^{3.86} \times \text{magma density (kg/m}^3\text{)}\right] + 3 \qquad (7)$$

Maximum column height is provided for 11% of eruptions in the LaMEVE database, of which 65% are Holocene in age. The column height is derived from literature sources, where the column height is either determined from direct observations of historical eruptions, or estimated through application of the Carey and Sparks (1986) maximum clast size dispersal model. Uncertainties in plume heights from the maximum clast method have been quantified by Burden et al. (2011) and the uncertainty in inferred intensities can also be assessed empirically from the correlation of column height data and independently determined intensity estimates from historic eruptions (Sparks et al. 1997). Uncertainties in column heights derived by the maximum clast method are typically less than 5 km (equivalent approximately to a factor of 2.5 in intensity). Given a known column height, uncertainties in inferred intensities are typically less than a factor of 3. Thus the proxy intensity estimates have uncertainties of about half an order of magnitude or less.

Baines and Sparks (2005) demonstrated that M ≥ 6.5 eruptions are capable of producing horizontal spreads which greatly exceed the vertical height. New calculations of column height may therefore be required for M ≥ 6.5 eruptions, which account for about 10% of the eruptions with associated column heights in the LaMEVE database. The LaMEVE data show column height increases with magnitude, but there is considerable scatter, with, for example, column heights of 30 km recorded for eruptions ranging from M4-8 (Additional file 1: Figure S6). There are clusters at a column height of 30 km and M4.0 and 5.0, indicating that there is some estimation of size values with rounding of results.

A maximum intensity level is recorded for 11% of eruptions in the database. Most of these are calculated from either literature-sourced maximum column heights or eruption rates for specific eruptions. After Pyle (2000), LaMEVE data show a positive correlation between magnitude and eruption intensity (Figure 13). For intensity data calculated with equation (7), this is in effect auto-correlation, as magma density features in both equation (7) and equations (2) and (3) used to calculate magnitude. These data indicate a wide range of magnitudes at any one intensity level (Figure 13; see also Carey and Sigurdsson 1989) and, conversely, a wide range of intensities for eruptions of a given magnitude.

Figure 13

Magnitude versus maximum intensity for all eruptions with a recorded intensity in the database. Contour lines show the duration required to erupt a given mass at a given intensity. The Younger Toba Tuff eruption is excluded because of the disputed column height used to calculate the intensity. Uncertainties on magnitude and intensity are typically of order half an order of magnitude or less as discussed in the text.
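The duration contours follow from dividing erupted mass by mass eruption rate. The Python sketch below illustrates this, using Equation 6 for intensity and assuming the Pyle (2000) magnitude definition, M = log10(mass in kg) − 7, for Equation 1, which is not restated in this section. The example mass and rate are hypothetical, as are the function names, and a constant eruption rate is assumed.

```python
import math

def magnitude(mass_kg):
    """Magnitude, assuming M = log10(mass in kg) - 7 (Pyle 2000)."""
    return math.log10(mass_kg) - 7.0

def intensity(mass_eruption_rate_kg_s):
    """Intensity from Equation 6: I = log10(rate in kg/s) + 3."""
    return math.log10(mass_eruption_rate_kg_s) + 3.0

def duration_seconds(M, I):
    """Time to erupt the mass implied by M at the constant rate implied by I.

    mass / rate = 10**(M + 7) / 10**(I - 3) = 10**(M - I + 10)
    """
    return 10.0 ** (M - I + 10.0)

# Hypothetical eruption: 1e13 kg erupted at a steady 1e8 kg/s.
mass, rate = 1.0e13, 1.0e8
M, I = magnitude(mass), intensity(rate)
hours = duration_seconds(M, I) / 3600.0
print(f"M = {M:.1f}, I = {I:.1f}, duration = {hours:.1f} h")
```

Under these definitions each contour of constant duration is a straight line of unit slope in magnitude-intensity space, so the scatter about such lines reflects the range of eruption durations.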

Discussion

Here we have analysed the LaMEVE data to produce a synopsis of the record of global explosive volcanism (M ≥ 4). We have assessed understanding of the global eruption record, identified knowledge gaps and areas for improvement.

Our analysis of the LaMEVE database has highlighted that the record of global volcanism has both spatial and temporal biases and is incomplete. Under-recording is a well-established attribute of global volcanic datasets (Simkin 2003) and also strongly affects the historical records, which dominate the last few centuries (Simkin and Siebert 2000). Recent analyses of global Holocene datasets (Coles and Sparks 2006; Deligne et al. 2010; Furlan 2010) demonstrate a rapid decrease in historical recording back to 1500 AD, with possible change points at 1500 and 1900 (Furlan 2010). The analysis of Holocene data (M ≥ 4) using an under-recording model by Deligne et al. (2010) suggests that recording was steady from 10 to 2 ka. For example, Deligne et al. (2010) estimated that recording of M ≥ 6 eruptions reflects about 15-20% of the true record in this period. Deligne et al. (2010) concluded that the Holocene was too short a period to sample M ≥ 7 eruptions. The analysis of Mason et al. (2004) considers a longer time-scale, but this study is limited to M ≥ 8 eruptions. LaMEVE contains 26 Quaternary events with M ≥ 8; in contrast, Mason et al. (2004) considered only eight M ≥ 8 Quaternary events out of 36 spanning a 38 Myr period. LaMEVE considerably improves the compiled record of M ≥ 7 eruptions, enabling an assessment of geological recording over a much longer period of time than the Holocene. Although M ≥ 7 events are also affected by under-recording, this is much less severe than for M4-M6 events as indicated by the T50 being much closer to the median time of the Quaternary (Figure 7). The T50 value for M8-8.9 can be used to make a conservative estimate of under-recording: 9 missing events prior to 1.255 Ma would move T50 to the median time for the Quaternary; this would suggest 25% under-recording of M ≥ 8 eruptions in the first half of the Quaternary. However, given that the most recent half of the Quaternary is likely also to be missing some events, assuming uniformity, there is likely >25% under-reporting of M ≥ 8 eruptions.

In this paper we focus principally on the assessment of geological recording and have chosen to analyse the data for ages greater than 1 ka. The analysis shows that recording of events decreases largely monotonically back in time. The record is more complete further back in time as magnitude increases. We quantify this using T50, which divides the data into equal numbers of events before and after this time. A simple power law relationship between T50 and magnitude is found, which we speculate is controlled by the preservation potential of tephra. The power law exponent for the relationship between T50 and erupted mass is approximately 0.5 (Equation 5). Dimensional analysis provides a simple explanation of this relationship. Erupted mass is proportional to volume and thus preservation potential, for which T50 could be considered a proxy, is proportional to the length dimension (L) raised to the power of 3/2. Volume is related to two length scales, namely thickness and square root of area (L2). For tephra fall deposits, volume can be linked through the empirically well-established function of Pyle (1989):

$$V = 13.08\,T_o\,b_t^{2} \qquad (8)$$

where To is the maximum deposit thickness and bt is effectively a measure of deposit area or hazard footprint (see Pyle (1989) for further details). Pyroclastic flow volumes are typically calculated by determining deposit area and average thickness. Thus, a simple explanation for the relationship found in Figure 7 is that preservation potential is linearly proportional to deposit thickness and to the square root of the deposit area, resulting in the observed 3/2 exponent for volume and 1/2 for mass.
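Equation 8 can be evaluated directly, as in the Python sketch below. The deposit dimensions are hypothetical, and the function name `pyle_volume` is an assumption; both inputs must share a length unit (km here) so that the result is in km³.

```python
def pyle_volume(t0_km, bt_km):
    """Tephra fall volume from Equation 8: V = 13.08 * To * bt**2.

    t0_km: maximum deposit thickness (km)
    bt_km: thickness half-distance (km)
    Returns the volume in km^3.
    """
    if t0_km < 0 or bt_km < 0:
        raise ValueError("thickness and half-distance must be non-negative")
    return 13.08 * t0_km * bt_km ** 2

# Hypothetical deposit: To = 1 m (0.001 km), bt = 10 km.
v = pyle_volume(0.001, 10.0)
print(f"{v:.3f} km^3")

# Doubling both length scales multiplies the volume by 2 * 2**2 = 8.
ratio = pyle_volume(0.002, 20.0) / v
print(f"volume ratio after doubling To and bt: {ratio:.1f}")
```

The cubic dependence of volume on a common length scale is what links the 1/2 exponent on mass to a 3/2 exponent on length in the dimensional argument above.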

Cumulative curves of the number of eruptions normalised to the number of recorded events prior to 1 ka (Figure 5) enable quantification of the decline in under-recording by comparison with recording rate. These results show the strong dependence of geological under-recording on magnitude. This supports the underlying approximation of stationarity since all magnitudes would be affected to the same extent by any real fluctuations in rates of volcanism. However, there is a marked increase in events in glaciated regions in late glacial times (notably 9–11 ka). While this peak may be affected by sampling biases, it supports previous suggestions (Huybers and Langmuir 2009; Watt et al. 2013) that there was a global increase of volcanism related to deglaciation.

There are clear regional biases in LaMEVE, which highlights the very uneven distribution of knowledge about volcanism around the world. Japanese data are strongly represented in LaMEVE and so could bias global analysis of the data. Japan’s record for M4-M6 eruptions extends back much further in time than for other regions, reflecting commendable scope of geological investigations, and a favourable environment for preserving tephra deposits.

We can use LaMEVE to identify major knowledge gaps and therefore areas which could benefit from focussed research efforts. LaMEVE also provides a resource for producing regional magnitude-frequency relationships and for developing these relationships for individual morphologic types of volcano. However, breaking up the LaMEVE data into subsets based on region, country or volcano type will exacerbate the problem of uneven quality and coverage. This would therefore require development of statistical models that account for under-recording and develop the concept of exchangeability. Exchangeability is based on the premise that if one takes data from a population of objects that are thought to be similar (in this case volcanic sub-regions or individual volcanoes), one can characterise the statistical properties of a sub-set of objects that are relatively well characterised and assume that they can represent those that are poorly characterised. We suggest that such an approach could be applied using Japan as the well-characterised sub-set to develop a statistical model of stratovolcanoes, island arcs or other types exemplified in the Japanese region.

A final issue highlighted by the population and analysis of the LaMEVE database is the uneven quality of the data sources. Many studies do not state the method used to estimate volumes of tephra deposits in sufficient detail and there is no standardisation of methodologies. Uncertainties are commonly not assessed or stated. Looking towards future research, there needs to be significant improvements in literature descriptions of age and volume data to ensure that unambiguous data are available for consistently named eruptions with appropriate derivation methods cited. The introduction of guidelines and adoption of standardised internationally agreed methods is needed. For example, when providing unit dates through radiocarbon analysis it should be made clear whether the ages are calibrated and which algorithm was used in the calibration. The methods and assumptions behind volume estimates and conversions to DRE values should also be reported, and estimates of uncertainties in ages and volumes should be given. Standardisation of data reporting and methodologies would improve clarity and transferability throughout volcanology research.

Conclusions

Our analysis of the LaMEVE database, the most comprehensive inventory of known large (M ≥ 4) explosive eruptions for the Quaternary, identifies major biases in time and space in the scientific community’s record of large explosive eruptions. In particular, under-recording is a dominant feature of the data which worsens with increasing time from present. However, recording improves markedly with magnitude and the database likely includes 70% or more of M ≥ 7 eruptions. Regional biases include better recording in the northern hemisphere and, notably, eruptions from the Japan, Taiwan and Marianas region constitute over 40% of all LaMEVE data.

Since historical influences on eruption recording have been widely studied, our analysis of LaMEVE has focussed on geological recording, defined as the time prior to 1 ka. Consideration of volume data through time indicates that there are very strong temporal magnitude-related biases in the data. The record is shown to be incomplete and we suggest that this is partially controlled by and quantifiable through the eruption age and size, with the probability of recording an eruption in the geological record, as defined by a median preservation time scale, being proportional to the square root of the erupted mass (Equation 5). This suggests a simple explanation of preservation being proportional to deposit thickness and square root of the deposit area.

The major influence of under-recording prevents much interpretation of temporal variations in volcanism. However, we find that there is an increase in explosive volcanism at the end of the last ice age in glaciated areas; this trend is not observed in areas unaffected by glaciation. These observations support the hypothesis of glacial unloading triggering enhanced volcanism (Huybers and Langmuir 2009; Watt et al. 2013). The strength of this signal may be exaggerated by biases related to the (lack of) preservation of syn-glacial pyroclastic deposits and carbon within them. The eruptions of the largest magnitudes, M8-8.9, show only modest under-recording, with approximately 25% missing events. The M ≥ 8 data do not indicate any major temporal variability during the Quaternary.

LaMEVE can identify knowledge gaps and constrain return periods of explosive eruptions of different magnitudes on global, regional and local (individual volcano) scales. However, the fact that major spatial and temporal biases together with under-recording are strongly dependent on magnitude means that the database should not be used in its raw form for such estimates. Statistical analyses to correct for biases and under-recording are essential. We recommend that the principle of exchangeability or equivalent correction techniques be applied to enable proper usage of LaMEVE data for assessment of volume production rates, magnitude-frequency estimates and hazard applications.

Endnotes

aPublicly accessible at http://www.bgs.ac.uk/vogripa;

bPublicly accessible at http://www.volcano.si.edu;

cThroughout this article, a distinction in the notation between geohistorical dates in years before present (e.g. ka, Ma) and geohistorical durations in years (e.g. kyr, Myr) has been made, as per the article by Aubry et al. (2009).