Bulletin of Volcanology, 76:874

Determining change points in data completeness for the Holocene eruption record

Research Article

Abstract

Changes in data completeness for the Smithsonian Institution’s “Volcanoes of the World” (VOTW) eruption catalogue, by region and for selected countries, are determined and utilised to estimate average eruption recurrence intervals. In the VOTW database, the number of documented volcanic eruptions has increased markedly since the middle of the last millennium. This is largely attributed to population expansion, geological investigation and improvements in detection and recording technologies, rather than an increase in volcanic activity. Simple methods, such as break-in-slope or stationarity tests, can be used to determine changes in data completeness, but often require subjective choices, introducing additional uncertainty. A Markov chain Monte Carlo simulation method for assessing and determining changes in the completeness of natural hazard event catalogues is adapted to determine the completeness of the database. Data completeness is assumed to follow a step-change model, where the probability of documenting an eruption is Volcanic Explosivity Index-dependent before the change point date and 100 % after. A distribution of candidate change point dates is obtained for each region and country subset which allows uncertainty in the data completeness date to be quantified, and for uncertainty in eruption frequencies to be expressed and propagated through statistical models.

Keywords

Data completeness; Eruption frequency; Recurrence interval; Markov chain Monte Carlo; Change point model; Volcanic hazard

Introduction

Hazard frequency analysis, determination of recurrence intervals and loss estimation for natural hazard events often rely on catalogues or databases describing past occurrences (Kyselý et al. 2011; Wirtz et al. 2014). For these purposes, the number of recorded events needs to be as high as possible to reduce uncertainty in the derived frequencies. Under-reporting of past hazardous events may result from a variety of factors including a lack of documented observations, incomplete or insufficient geological evidence and investigation, or minimal detection technology. This leads to datasets that can be considered, at least partly, ‘incomplete’ as they do not contain all events that occurred over the time period in question. The effect of incomplete datasets can be large, especially when data are utilised to estimate the frequency or recurrence interval of events. Issues in data completeness are evident in several catalogues of natural hazard events, including hailstorms (Schuster et al. 2005), cyclones in the Atlantic (Landsea 2007), earthquakes (Woessner and Wiemer 2005), avalanches (Dussauge-Peisser et al. 1999), landslides (Kirschbaum et al. 2010) and volcanic eruptions (Hayakawa 1997; Siebert et al. 2010). In order to use these datasets for hazard frequency calculation, a cut-off date is typically chosen, after which point the dataset is assumed to be adequately complete. This date (referred to here as the change point) is critical, as it needs to be recent enough to eliminate errors due to missing data, while maximising the number of records in the dataset in order to adequately sample both large, lower frequency events and smaller, harder to detect events.

A common approach to deal with under-recording in hazard datasets is to assume events occur at a constant rate with respect to time, with any nonstationarity in the rate attributed to under-recording and an incomplete dataset. Tests for stationarity are usually based on classical statistics approaches, such as measuring recurrence interval convergence (e.g. Dussauge-Peisser et al. 1999; Schuster et al. 2005; Landsea 2007), divergence (e.g. Klein 1982; Mulargia et al. 1987) or regression analysis (e.g. Marzocchi and Zaccarelli 2006). However, in most instances, the choice of date or limits of stationarity are based on subjective inference. These classical methods also rely on the asymptotic properties of large datasets to provide confidence in the estimates (Rotondi and Garavaglia 2002), requiring monitoring of scale dependence and restricting the applicability of these methods to smaller datasets.

Increasing rates of volcanic activity, commonly attributed to under-recording of volcanic eruptions, are evident in catalogues of global volcanism such as the large magnitude explosive eruptions (LaMEVE) database (Crosweller et al. 2012), the Smithsonian Institution’s Volcanoes of the World (VOTW) database (Siebert and Simkin 2002) and the database of Hayakawa (1997) (Coles and Sparks 2006; Furlan 2010). Of these eruption catalogues, the VOTW database is the most comprehensive listing of volcanic eruptions of all sizes (Crosweller et al. 2012) and is frequently used in global (e.g. Chester et al. 2000; Small and Naumann 2001), regional (e.g. Jenkins et al. 2012; Auker et al. 2013) and volcano-specific assessments (e.g. Connor et al. 2001; Mendoza-Rosas and De la Cruz-Reyna 2008). Under-recording is particularly evident in the VOTW database (Siebert et al. 2010), which aims to document all volcanic activity during approximately the previous 10,000 years. Evidence for eruptions comes from historical accounts (directly observed and recorded) and deposits dated using techniques such as tephrochronology and radiocarbon dating (Siebert et al. 2010). Figure 1 displays the notable increase in the number of eruptions documented in the catalogue within approximately the last 500 years. This increase has been attributed to population spread, colonisation and better recording technologies rather than to an increase in the rate of volcanic eruptions (Newhall and Self 1982; Siebert et al. 2010). This artificial change in the recorded rate of volcanic eruptions greatly affects estimates of eruption frequency. For example, 291 eruptions are documented in the database from 1000 AD to 1500 AD; however, 7,417 eruptions are recorded in the period from 1500 AD to 2000 AD. Completeness of the VOTW database varies by region (Lamb 1970; Newhall and Self 1982; Siebert et al. 2010) and therefore necessitates change-point selection on a regional or country scale.
However, for some regions, there are not enough data to undertake stationarity tests with high certainty. This is particularly true when considering larger-size eruptions where, for some regions (e.g. Atlantic Ocean, and Philippines and South East Asia), fewer than 20 eruptions have been documented that have a Volcanic Explosivity Index (VEI, a semi-quantitative scale for eruption size used in the VOTW catalogue) greater than or equal to 4. As a leading source of information on volcanic eruptions, the VOTW catalogue is frequently relied on to describe eruptive histories and characteristics on global, regional and volcano-specific scales. Most analyses of this type require a choice of the point in time after which eruptions were consistently reported. To support similar studies in the future, we calculate change points in data completeness for the VOTW eruption catalogue to determine data completeness on both regional and country scales, where enough data are available.
Fig. 1

Cumulative number of eruptions recorded in the Volcanoes of the World 4.0 database (Siebert et al. 2010)

Most approaches to deal with under-recording of volcanic eruptions also rely on the concepts of stationarity and independence, generally assuming eruption events (e.g. Newhall and Self 1982; Guttorp and Thompson 1991; Coles and Sparks 2006; Deligne et al. 2010; Furlan 2010; Jenkins et al. 2012) or periods of repose (e.g. Klein 1982; Mulargia et al. 1987; Marzocchi and Zaccarelli 2006; Wang and Bebbington 2012) fit a Poisson distribution (i.e. the intervals between eruptions are independent and identically distributed exponential random variables). While this assumption is considered valid for the global eruption record, the stationarity assumption may break down when considering eruptions on regional and country scales, as attempted here. For example, Watt et al. (2013) suggest that regional volcanism can increase following the end of glacial cycles, which implies that eruptions on a regional scale are not independent. In a data completeness study, Guttorp and Thompson (1991) identified periodicity in the Icelandic eruption catalogue (containing 22 known volcanoes), indicating a lack of independence. However, these same authors found little deviation from the Poisson process in the Japanese eruption record (which contained 77 known volcanoes). Furthermore, De la Cruz-Reyna (1991) explains that aggregation on a global scale is not the sole reason for the Poissonian nature of global volcanism and that even an individual volcano may be represented by a Poisson process. Several studies and hazard assessments (e.g. Klein 1982; Mulargia et al. 1987; Connor et al. 2001; Wang and Bebbington 2012) have assumed independent and identically distributed eruptions from individual volcanoes. A visual examination of regional VOTW eruption records confirms the claim of Newhall and Self (1982) that under-recording has a much greater effect on the eruption catalogue than any non-stationarity in volcanism.
While we make the assumption here that the distribution of eruptions within a region or country is Poisson, changes in the frequency of volcanic activity can occur and should ideally be considered in regional-scale analysis.

A simple approach to determine completeness of an eruption catalogue, by region and eruption size, was demonstrated by Jenkins et al. (2012), using a ‘break-in-slope’ method (as described by Hakimhashemi and Grünthal (2012)). This method is well suited for determining change points in regions such as South-East Asia where easily identifiable changes in recording are evident and coincident with increased colonisation beginning in the sixteenth century. However, this method is more problematic in regions such as Japan, where data gradually become more complete over time, resulting in a more complicated relationship between documented events and time. Regions such as this require more subjective choices for curve-fitting, calculation of slope and determination of breakpoints. This need for subjective decisions introduces problems in reliably determining reproducible change points.

Simple approaches to determination of data completeness, as in the example described above or by inferring changes in completeness from recurrence-interval divergence (e.g. Klein 1982; Mulargia et al. 1987), usually come down to a subjective choice (Mulargia et al. 1987). In contrast, statistical models of recording bias involve developing a formula for the probability of an eruption being documented and are useful for reducing the need for time-consuming analysis as well as reliance on unreproducible subjective decisions (Coles and Sparks 2006; Furlan 2010). These statistical models usually follow a general form of
$$ {\lambda}_t\left(t,x\right)=\lambda \left(t,x\right)p\left(t,x\right) $$
(1)
In simple terms, the likelihood of an eruption of size x at time t is equal to the underlying eruption rate λ(t,x) multiplied by the probability p(t,x) of recording an eruption of that size at that time (e.g. Guttorp and Thompson 1991; Coles and Sparks 2006; Deligne et al. 2010; Furlan 2010). Guttorp and Thompson (1991) developed a recording probability function based on the number of records in each year of the VOTW catalogue. The function was smoothed using a locally linear fit to reduce noise and scaled to determine an observance probability by assuming the probability of recording an eruption to be 100 % in 1980. Guttorp and Thompson (1991) acknowledged that the constraints on probability function shape and assumption of maximum in 1980 were unreasonable, but necessary for analysis. Coles and Sparks (2006) presented a statistical model of the same general form based on a Poisson point process for eruptions of magnitude (M) ≥ 4, defined as
$$ M={\log}_{10}(m)-7 $$
(2)

where m is the estimated mass generated by the eruption, and using the 2,000-year dataset of Hayakawa (1997). Deligne et al. (2010) used the same approach on a much larger dataset of large magnitude eruptions. This model assumes the probability of an eruption being documented increases with magnitude and time closer to present, with parameters determined using a classical maximum likelihood approach. The fitted models generally suggested a rapid rise in recording during the most recent 300 years and eruption magnitude dependence on under-recording. However, as with the prescribed model of Guttorp and Thompson (1991), the applicability of this model is limited by the prescription of shape in the recording probability function and the assumption of an increasing recording probability with time, meaning only present day data can be considered 100 % complete.

Furlan (2010) utilised a similar model to Coles and Sparks (2006) but developed an alternative formulation to express recording probability through the use of a step function, bypassing the limitation of having to prescribe shape and recording probability in the Coles and Sparks (2006) approach. The statistical model parameters were fitted using Metropolis-Hastings Markov chain Monte Carlo (MCMC) simulation (see Gilks et al. (1998)). The outcome of MCMC simulation is a distribution of parameter values that accounts for the data and prior knowledge, termed the posterior distribution. This posterior distribution can be used to provide more information than parameter estimates obtained from maximum likelihood estimates (Coles 2001). In Furlan (2010), the step function was generated by averaging the chain of posterior values into a single function, with the approach able to estimate data completeness change points in an automated fashion without subjectivity. Change points were determined at two different threshold magnitudes in the eruption dataset of Hayakawa (1997). However, the size of the dataset was relatively small, containing only 221 and 67 events globally at magnitude thresholds of 4.0 and 5.1, respectively. Due to the small number of events, only a global scale change point was obtained whereas, in reality, different countries or regions will have differing completeness change points (e.g. Lamb 1970; Newhall and Self 1982; Jenkins et al. 2012).

Here, a similar MCMC method is utilised but applied to determine the change points in the Smithsonian VOTW 4.0 catalogue for all defined regions and for specific countries. This work builds on the method demonstrated by Furlan (2010) by using a more comprehensive database of volcanic eruptions, which allows for the application of more informative priors based on analysis of the dataset and estimations of data completeness change points on smaller scales. Estimation of the breakpoints at this finer scale provides more value for regional hazard assessments, where under-recording is most likely different to the global average. In addition, we use the posterior distribution of the change point to quantify uncertainty in data completeness by examining the modes and broadness of the posterior. This is an improvement on simpler models and allows for uncertainty to be propagated through statistical models when calculating eruption frequency.

Volcanoes of the World catalogue

Eruption data were downloaded from the online VOTW Catalogue 4.0 in May 2013 (Siebert and Simkin 2002). This database was chosen as it contains the most extensive list of volcanic eruptions extending over approximately the past 10,000 years. The major difference between this catalogue and that of Hayakawa (1997) (used in Furlan (2010)) is the quantification of eruption size: Hayakawa (1997) used the magnitude scale (Eq. 2), while the VOTW catalogue uses VEI assigned according to the method of Newhall and Self (1982). Eruption magnitude has previously been used to estimate data completeness (e.g. Coles and Sparks 2006; Deligne et al. 2010; Furlan 2010) and had the benefit in these applications of being a purely quantitative and continuous measure of eruption size (Crosweller et al. 2012). In contrast, the VEI scale is semi-quantitative, as it is sometimes assigned based on qualitative measures (Newhall and Self 1982); however, in the majority of cases, VEI is determined based on the volume of erupted material and has a value consistent with magnitude classification (Crosweller et al. 2012). As the VEI scale and VOTW catalogue are frequently used to describe eruptive histories, we chose to use this scale and catalogue to frame the data completeness issues in terms of the commonly used descriptions and database. The VOTW database also contains information on smaller (VEI or M ≤ 4) eruptions included in this study, which are generally not the focus of other available datasets (e.g. LaMEVE, Hayakawa (1997)). Using a VEI rather than magnitude scale meant this study required different and more informative priors than those used by Furlan (2010); these will be discussed in the following section.

For VEIs of 2 and higher, the number of eruptions in the VOTW catalogue follows a decreasing trend, shown in Fig. 2 (note, Plinian/Caldera forming eruptions are classified in the VOTW catalogue with a VEI of 4). There are fewer eruptions of VEI < 2 documented in the database; this is generally attributed to the default VEI for explosive eruptions being assigned as 2. While the number of VEI 2 eruptions is possibly over-estimated due to its use as a default assignment (Newhall and Self 1982), plume measurement data from historical eruptions support the relative abundance of VEI 2 eruptions in the entire catalogue (Newhall and Self 1982; Siebert et al. 2010). Newhall and Self (1982) and De la Cruz-Reyna (1991) also suggest that smaller eruptions (i.e. < VEI 2) are under-reported, bringing into question the completeness of VEI 1 records within the database. To alleviate these issues, we accumulate all eruptions with an assigned VEI of 0, 1 and 2 into a single category, designated as VEI ≤ 2. Note also that some early VEI 1 to 4 data, originating from Newhall and Self (1982), have had VEI values incremented by 1 VEI unit. With some exceptions, the date of 1700 AD was used as the cutoff for this increase. This increase does not have a visible effect on recording in the database, and the impact on the calculated change point date is unknown. To resolve ambiguity, we assumed that all recorded VEI entries in the database are correct and made no further corrections ourselves.
Fig. 2

Histogram of recorded Volcanic Explosivity Index (VEI) in the Volcanoes of the World database (Siebert et al. 2010)

Analyses of the VOTW catalogue were undertaken for every geographical region defined in Siebert and Simkin (2002) as well as for countries where more than 100 eruptions (of any size) have been recorded. The difference in recorded eruptions between regions (defined geographically) and countries (defined by political boundaries) can be small when the record is dominated by one country. For example, the Indonesian region (which includes Indonesia, the Andaman Islands and the entire island of Borneo) has only 12 more eruptions than the country of Indonesia. Similarly, the Japan, Taiwan and Marianas region is dominated by the record of Japan. In other cases, countries such as the USA are an aggregation of a number of regions, including Alaska, Western USA and Hawaii, with the record dominated by Alaskan and Hawaiian volcanoes. The selection of regional and country boundaries can affect the model, and the effect of this is addressed in the discussion.

The change point dates were calculated considering the entire eruption catalogue (VEIs 0–2 accumulated) and for a threshold VEI of 4. These two different analyses were conducted for a number of reasons: Firstly, large VEI (≥4 in this context) eruptions are better represented in the geological record and are more likely to have been observed and documented in written histories, although occurring less frequently than smaller eruptions (Siebert et al. 2010). By analysing separately the larger eruptions, more large VEI eruptions may be included in the portion of the catalogue considered complete, improving further hazard analysis by sampling more of the lower-frequency, large-size eruptions. Secondly, the default assignment of VEI 2 (Newhall and Self 1982) and possible under-recording of VEI 0, 1 and 3 eruptions (Newhall and Self 1982; De la Cruz-Reyna 1991) make the lower-VEI record more questionable, and Poissonian behaviour is not assured (although suspected by De la Cruz-Reyna (1991)). By doing two analyses per region or country, we obtain completeness dates for all eruptions, which could be less robust due to continuing recording uncertainties (albeit counterbalanced by a larger amount of data), as well as more reliable VEI ≥ 4 dates.

Statistical model

In order to assess the presence of under-recording, we use the modified Poisson point process model of Coles and Sparks (2006), given as:
$$ {\lambda}_M\left(t,x\right)=\lambda \left(t,x\right)p\left(t,x\right) $$
(3)

where λM(t,x) is the modified intensity function which accounts for under-recording, t is the eruption year, x the eruption magnitude, λ(t,x) the intensity function denoting eruption rate and p(t,x) a function describing the probability of an eruption being documented.

The intensity function is assumed to be a two-dimensional Poisson point process (see Pickands III 1971; Coles 2001; Coles and Sparks 2006) above a threshold magnitude u given by the generalised Pareto distribution (GPD):
$$ \lambda \left(t,x\right)=\frac{1}{\sigma }{\left[1+\xi \frac{x-\mu }{\sigma}\right]}^{-\frac{1}{\xi }-1} $$
(4)
where
$$ \sigma > 0\ \mathrm{and}\left[1+\xi \frac{x-\mu }{\sigma}\right] > 0. $$
(5)

In Eq. 4, the parameters μ and σ control the location and scale of the distribution, while ξ controls the shape and endpoint of the tail. In all cases presented here, ξ < 0, meaning the distribution is bounded by a maximum at μ − σ/ξ. The GPD is a power law type relationship commonly used in peaks over threshold (POT) methods (Deligne et al. 2010). In POT methods, the choice of threshold (u) is vital to ensure the validity of the intensity function (Davison and Smith 1990; Coles and Sparks 2006). As the GPD is modelling the tail of a process's distribution, threshold selection is a balance between minimising errors through inclusion of as much data as possible, while still sampling the correct region (tail) of the data. This typically results in choosing the lowest threshold value where the data are still adequately represented by the model (Lang et al. 1999; Coles 2001; Coles and Sparks 2006). Referring to Fig. 2, thresholds of VEI 1 or higher would result in a distribution that could adequately be represented by the GPD model, when VEI 0–2 events are accumulated into a single category. The GPD approximation was also tested using the mean residual life method (see Coles (2001)) on a subset of the data starting from 1800 AD to reduce the effect of under-recording. The mean residual life plot for the data is approximately linear after VEI 1; therefore, threshold VEIs (u) of 1 or higher are likely to be adequately approximated by a GPD, subject to the number of entries being large enough to provide a reasonable approximation. Note that the GPD implicitly assumes the size scale is unbounded (i.e. no maximum); however, the VEI scale used in this study is not open, with a maximum of 8 (Newhall and Self 1982). While this maximum does not appear to affect the change point date, it does limit the use of estimates possible through the GPD, such as the upper end point of the extremes (as shown in Furlan (2010)).
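As an illustration only, the intensity of Eq. 4, the support condition of Eq. 5 and the upper end point μ − σ/ξ can be evaluated with a few lines of Python. The function names are hypothetical, and the parameter values are those quoted for the demonstration in Fig. 3 rather than fitted values.

```python
import math


def gpd_intensity(x, mu, sigma, xi):
    """Intensity lambda(t, x) of Eq. 4; as written it does not depend on t.

    Returns 0 outside the support defined by Eq. 5. Assumes xi != 0.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive (Eq. 5)")
    if xi == 0:
        raise ValueError("this sketch does not cover the xi = 0 limit")
    base = 1.0 + xi * (x - mu) / sigma
    if base <= 0:
        return 0.0
    return (1.0 / sigma) * base ** (-1.0 / xi - 1.0)


def upper_endpoint(mu, sigma, xi):
    """Upper bound mu - sigma/xi of the distribution; finite only for xi < 0."""
    if xi >= 0:
        return math.inf
    return mu - sigma / xi


# Demonstration values quoted in the caption of Fig. 3.
mu, sigma, xi = 1.5, 1.5, -0.25
print(upper_endpoint(mu, sigma, xi))  # 7.5
for vei in (2, 4, 6):
    print(vei, gpd_intensity(vei, mu, sigma, xi))
```

With these demonstration values the end point falls at 7.5, which shows how a negative ξ caps the modelled eruption size below the VEI maximum of 8.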

Fig. 3

a Example of the presence function at VEIs of 2, 4 and 6. Before year k = 1600, eruptions are assumed to be recorded with a probability equal to \( \frac{1}{1+{e}^{-\alpha -\beta x}} \), where x is the VEI, α = 3.5 and β = 1; b shape of the presence function at different α and β values, and c histogram of a theoretical dataset generated using the standard intensity function λ(t,x) (in grey) and the modified intensity function λ(t,x) p(t,x) (in red). The effect of the presence function is demonstrated by the larger amount of under-recording at lower VEIs. Parameters used in this demonstration were μ = 1.5, σ = 1.5, ξ = −0.25, α = 3.5, β = 1 and k = 1567

The presence function expresses the probability of recording an eruption of VEI x at time t (Fig. 3a). In this implementation, we use the single change point step function proposed by Furlan (2010) as:
$$ p\left(t,x\right)=\begin{cases}\dfrac{1}{1+{e}^{-\alpha -\beta x}}, & t\le k\\ 1, & t>k\end{cases} $$
(6)

where k is the year of the change in recording, and α and β are parameters controlling the scale and shape of the recording probability function. This observation probability function has two states: In the first state, before year k, eruptions are incompletely observed, with the recording probability assumed to follow the upper function with a shape similar to a logistic function (Fig. 3b). The shape of this function represents the recording bias for large size eruptions, where probability increases with x, the VEI of the eruption. In the second state, after year k, eruptions of any VEI greater than the threshold are assumed to have a recording probability of 100 % (completely observed).

The recording probability function (Eq. 6) shows sensitivity to the choice of α and β values (Fig. 3b). Increases to α (black lines) shift the presence function curve towards higher VEI values, in effect reducing the recording probability of eruptions, particularly for VEI ≤ 4. The function displays a larger sensitivity to β values, returning recording probabilities that differ by up to 70 % for perturbations of 1.5. The optimal choice of these values is therefore vital to the applicability and versatility of the statistical model. The effect of the presence function on an artificially generated dataset is demonstrated in Fig. 3c. At parameter values of α = 3.5 and β = 1, VEI 2 eruptions are predicted to be under-recorded by more than 50 %; however, there is almost no impact to the probability of eruptions being recorded with assigned VEIs of between 5 and 7.
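A minimal sketch of Eq. 6 may help to make this sensitivity concrete. The function name is hypothetical. The sketch assumes the sign of α given with the MCMC starting values (α = −3.5), because with Eq. 6 exactly as written it is the negative value that yields a recording probability below 50 % for VEI 2 before the change point.

```python
import math


def presence(t, x, k, alpha, beta):
    """Recording probability p(t, x) of Eq. 6.

    Before or at the change point year k the probability is logistic in the
    VEI x; after k every eruption is assumed to be recorded.
    """
    if t > k:
        return 1.0
    return 1.0 / (1.0 + math.exp(-alpha - beta * x))


# Assumed values: alpha = -3.5 and beta = 1, with a change point at k = 1600.
k, alpha, beta = 1600, -3.5, 1.0
for vei in (2, 4, 6):
    before = presence(1500, vei, k, alpha, beta)
    after = presence(1700, vei, k, alpha, beta)
    print(f"VEI {vei}: p before k = {before:.3f}, p after k = {after:.1f}")
```

Under these assumed values the probability before k rises from roughly 0.18 at VEI 2 to roughly 0.92 at VEI 6, which is the size-dependent recording bias the step function is meant to capture.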

Metropolis-Hastings approach

The objective of the MCMC simulation is to find the year k after which eruption data are consistent with the unmodified intensity function in Eq. 4. Using the approach of Furlan (2010), the likelihood function for eruptions larger than VEI u can be given as:
$$ \begin{array}{l}L\left(\mu, \sigma, \xi, \alpha, \beta, k;\left({t}_1,{x}_1\right),\dots, \left({t}_n,{x}_n\right)\right)=\\ {} \exp \left\{-{\int}_{x=u}^{+\infty }{\int}_{t=0}^k\lambda \left(t,x\right)\frac{1}{1+{e}^{-\alpha -\beta x}}\,dt\,dx\right\}\times \exp \left\{-{\int}_{x=u}^{+\infty }{\int}_{t=k+1}^{2013}\lambda \left(t,x\right)\,dt\,dx\right\}\\ {}\times {\prod}_{i:0<{t}_i\le k}\lambda \left({t}_i,{x}_i\right)\frac{1}{1+{e}^{-\alpha -\beta {x}_i}}\times {\prod}_{i:k<{t}_i\le 2013}\lambda \left({t}_i,{x}_i\right)\end{array} $$
(7)

The parameters μ, σ, ξ, α, β and k were estimated using a Metropolis-Hastings MCMC method, which samples proposed parameter values from a broad initial distribution (prior) and retains the values that give a high likelihood in a chain of likely parameter values (posterior distribution); for more details on MCMC methods, see Coles (2001) and Gilks et al. (1998).
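The logarithm of a likelihood of this kind can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the expected number of recorded events enters with a negative sign (as is standard for a Poisson point process), time is treated as continuous between hypothetical bounds t0 and t1, ξ < 0 is assumed so that the size integral stops at the end point μ − σ/ξ, and the integral is approximated with a simple trapezoid rule.

```python
import math


def gpd_intensity(x, mu, sigma, xi):
    """Intensity of Eq. 4 (independent of t); zero outside the support."""
    base = 1.0 + xi * (x - mu) / sigma
    return base ** (-1.0 / xi - 1.0) / sigma if base > 0 else 0.0


def p_before(x, alpha, beta):
    """Recording probability before the change point (upper branch of Eq. 6)."""
    return 1.0 / (1.0 + math.exp(-alpha - beta * x))


def integrate(f, a, b, n=2000):
    """Composite trapezoid rule on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def log_likelihood(events, u, k, mu, sigma, xi, alpha, beta, t0, t1):
    """Log-likelihood for events [(year, vei), ...] with vei > u, t0 < year <= t1."""
    if xi >= 0:
        raise ValueError("this sketch assumes xi < 0")
    if sigma <= 0:
        return -math.inf
    x_max = mu - sigma / xi  # upper end point of the size distribution
    if x_max <= u:
        return -math.inf
    # Expected yearly counts above u: recorded (before k) and complete (after k).
    rate_rec = integrate(
        lambda x: gpd_intensity(x, mu, sigma, xi) * p_before(x, alpha, beta), u, x_max
    )
    rate_all = integrate(lambda x: gpd_intensity(x, mu, sigma, xi), u, x_max)
    ll = -(k - t0) * rate_rec - (t1 - k) * rate_all
    for year, vei in events:
        lam = gpd_intensity(vei, mu, sigma, xi)
        if lam <= 0:
            return -math.inf
        ll += math.log(lam)
        if year <= k:  # events before the change point are thinned by p
            ll += math.log(p_before(vei, alpha, beta))
    return ll
```

Because the intensity in Eq. 4 does not depend on t, the time integrals reduce to the length of each period multiplied by a one-dimensional integral over eruption size, which keeps each likelihood evaluation cheap inside an MCMC loop.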

The Metropolis-Hastings algorithm used here is functionally similar to the algorithm used by Furlan (2010), with parameters σ and α transformed to avoid poor mixing in the posterior. A new parameter, ζ, is added as a replacement for σ with the relationship:
$$ \sigma =\frac{\xi \left(u-\mu \right)}{n_e \log {\left(1+{e}^{\zeta}\right)}^{-\xi -1}} $$
(8)
and α is transformed to α* through the relationship:
$$ {\alpha}^{\ast }=\alpha +\beta \overline{x} $$
(9)

where \( \overline{x} \) is the mean VEI of the dataset.

The proposal values for μ, ζ, ξ, α* and β were chosen using a random walk with the formula vp = vc + N(0, ωv), where v denotes the properties of μ, ζ, etc. and ωv is a tuning parameter that controls the efficiency of the algorithm but has no effect on the model (Coles 2001). In our model, the tuning parameters were adjusted to ωμ,ζ,ξ = 0.1 and ωα,β = 0.25 through trial and error to ensure proposal values were well mixed. The formula for the proposed change point value, kp, is shown in Eq. 10. The next candidate for the change point is drawn from a uniform distribution centred on the current candidate, and with half-width (kw) equal to the average return period with limits at the temporal boundaries:
$$ {k}_p\sim U\left[ \max \left({k}_c-{k}_w,{Y}_0\right): \min \left({k}_c+{k}_w,{Y}_{\max}\right)\right] $$
(10)

where Y0 and Ymax are the earliest and latest year with documented eruptions, respectively, and kp and kc are proposed and current values of k (change point year), respectively. This random walk is an efficient means of exploring the support of k between Y0 and Ymax.
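The two proposal mechanisms described above can be sketched in Python as follows. The function names are hypothetical, ωv is treated as the standard deviation of the normal increment (the text does not say whether it is a standard deviation or a variance), and the proposed change point is left as a continuous value rather than rounded to a whole year.

```python
import random


def propose_random_walk(current, omega, rng):
    """Random-walk proposal v_p = v_c + N(0, omega).

    Assumption: omega is used as the standard deviation of the increment.
    """
    return current + rng.gauss(0.0, omega)


def propose_change_point(k_c, k_w, y0, y_max, rng):
    """Eq. 10: uniform draw within k_w of the current value, limited to [y0, y_max]."""
    lo = max(k_c - k_w, y0)
    hi = min(k_c + k_w, y_max)
    return rng.uniform(lo, hi)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
mu_p = propose_random_walk(1.5, 0.1, rng)      # tuning value quoted for mu
beta_p = propose_random_walk(1.0, 0.25, rng)   # tuning value quoted for beta
# Hypothetical half-width of 25 years and catalogue limits of -8000 and 2013.
k_p = propose_change_point(1600, 25, -8000, 2013, rng)
print(mu_p, beta_p, k_p)
```

Near Y0 or Ymax the window is truncated and so is no longer symmetric about the current value; a complete Metropolis-Hastings acceptance ratio would then include the proposal densities, and the text does not state how this was handled.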

The sensitivity of the GPD and presence function to the parameter values suggests that the prior distributions may not need to be as broad as those specified by Furlan (2010). In preliminary simulations, the use of the VEI scale appeared to require more restrictive priors. The GPD parameters of μ, ζ and ξ were given normal prior distributions with a mean of 0 and variances of 10. This gave a sufficiently broad prior that did not constrain the simulation, while still generally restricting the GPD to expected values. Parameters α* and β control the recording probability function (Fig. 3) with the transition from low to high probabilities expected to occur between VEI 2 and 5. This hypothesis appears to be supported by preliminary examination of the eruption database. As the data completeness change point is expected to change between countries and regions and, given the lack of information on location specific recording, the prior distribution of k was chosen to be uniform between Y0 and Ymax. This broad prior ensured that no preference was given to particular values of the change point year, making the posterior primarily determined by the data.

The starting values for the MCMC algorithm were chosen to be μ = 1.5, σ = 1.5, ξ = −0.25, α = −3.5 and β = 1.0 with k chosen as the first quartile (Q1) of the documented eruption data. Two separate analyses for each region and country were run, one with a threshold value (u) of 1 and the other with the threshold set to 3 (i.e. all VEIs and VEI ≥ 4, respectively). Geweke’s diagnostics (Geweke 1992) and the autocorrelation function were used to determine the properties of the MCMC chain; the algorithm was run for 200,000 iterations with the first 5,000 discarded (burn in) and one in ten observations retained (thinning) in order to improve convergence, reduce correlation with initial values and reduce the amount of autocorrelation within the MCMC chain.
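The burn-in and thinning step amounts to a simple slice of the stored chain. The sketch below uses a hypothetical function name and a placeholder chain, and assumes that the 5,000 burn-in iterations are counted within the 200,000 total.

```python
def burn_and_thin(chain, burn_in=5000, thin=10):
    """Drop the first `burn_in` samples, then keep every `thin`-th sample."""
    return chain[burn_in::thin]


# Placeholder chain: iteration indices stand in for sampled parameter values.
chain = list(range(200_000))
kept = burn_and_thin(chain)
print(len(kept))  # 19500
```

Under that assumption, 19,500 samples per parameter remain to form the posterior distribution.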

Comparisons of data completeness

As explained earlier, analyses of the VOTW catalogue were undertaken for every geographical region defined in Siebert and Simkin (2002) as well as for countries with more than 100 eruptions on record. Simulations for two threshold levels provide two different change point dates: one considering the entire eruption catalogue (VEIs 0–2 accumulated) and one considering VEI ≥ 4 eruptions only. As examples, Fig. 4 displays the frequency of the change point year k in the posterior distribution and the cumulative number of eruptions for VEI ≥ 4 eruptions in the United States of America and Indonesia (country). The width of the posterior distribution (grey bars in Fig. 4) can be used to infer certainty in the change point, with broader distributions suggesting greater uncertainty. The four peaks in the distribution between 1000 bc and 1800 ad for the United States of America (Fig. 4a) can indicate other likely change point dates within the distribution. The posterior distribution of k for the USA is much broader than for Indonesia; this is because a large proportion (77 %) of the large VEI eruptions in the USA have been documented through geological studies and dating rather than through historical accounts. This results in an eruption catalogue for the USA characterised by a relatively steady increase in the number of eruptions documented throughout time and demonstrates a greater level of uncertainty in the change point date. In contrast, only 13 % of Indonesian large VEI eruptions were obtained from non-historical records at 17 volcanoes (Siebert and Simkin 2002). The dominance of historical accounts post-colonisation results in a sudden increase in the rate of documented eruptions in the late 1500s, causing the posterior distribution of k to be relatively narrow. 
The variability in derived posterior distributions for different regions and countries suggests that the distributions could be used to estimate uncertainty in the change point date and consequently in the calculation of average eruption recurrence intervals. While Furlan (2010) estimated the step function change point as the average of the posterior, our approach quantifies the uncertainty by using the 5th, 50th and 95th percentiles of the change point posterior. This allows the lower, median and upper bounds of the derived average recurrence intervals to be estimated.
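As a rough illustration of this summary step, the percentiles can be read directly from the retained posterior draws of k. The function name and the sample values below are hypothetical, and the "inclusive" interpolation rule is an assumption, since the text does not state which percentile definition was used.

```python
import statistics


def change_point_percentiles(k_samples):
    """Return the 5th, 50th and 95th percentiles of posterior draws of k."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99,
    # so percentile p sits at index p - 1.
    cuts = statistics.quantiles(k_samples, n=100, method="inclusive")
    return cuts[4], cuts[49], cuts[94]


# Hypothetical posterior draws: one draw for each year from 1700 to 1800.
draws = list(range(1700, 1801))
p5, p50, p95 = change_point_percentiles(draws)
print(p5, p50, p95)
```

A wide gap between the 5th and 95th percentiles corresponds to a broad posterior and therefore to greater uncertainty in the change point date.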
Fig. 4

Posterior distributions of the change point year k from the MCMC simulation, and cumulative number of VEI ≥ 4 eruptions for a the United States of America and b Indonesia

The percentiles of the change point posterior, corresponding number of eruptions and average recurrence intervals were calculated for each catalogue region and for selected countries, and are shown in Tables 1 and 2 for all eruptions and VEI ≥ 4 eruptions, respectively. The corresponding recurrence intervals were calculated by dividing the number of years since the change point date by the number of eruptions since that date. For VEI ≥ 0, the difference between the 5th and 95th percentile recurrence interval for each region and country is negligible, with only small differences in the number of eruptions occurring since the calculated change point dates. The change point date occurs in the eighteenth or nineteenth century for most countries and regions, with some exceptions: for the countries of (and regions that contain) Japan and Guatemala, dates occur in the mid-sixteenth century and, for Italy, in the late seventeenth century to early eighteenth century. In general, these countries have longer written records and documented historical observation; however, the change point may also be affected by the proportion of populations living near volcanoes and social or political factors.
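The recurrence interval calculation described above amounts to a single division, sketched below. The function name is hypothetical, and `end_year` (the year at which the record is taken to end) is an assumption because the text does not state it. The example values are invented for illustration and are not taken from the tables.

```python
def average_recurrence_interval(change_point_year, n_eruptions, end_year):
    """Years elapsed since the change point divided by eruptions since then.

    Years bc are passed as negative numbers. `end_year` is an assumed
    parameter; the text does not give the closing year of the record.
    """
    if n_eruptions <= 0:
        raise ValueError("need at least one eruption after the change point")
    return (end_year - change_point_year) / n_eruptions


# Hypothetical 5th, 50th and 95th percentile change points and the number
# of eruptions recorded after each one.
change_points = (1750, 1760, 1775)
counts = (120, 118, 110)
for year, n in zip(change_points, counts):
    print(year, round(average_recurrence_interval(year, n, end_year=2000), 2))
```

Because a later change point shortens the period and lowers the eruption count together, the resulting intervals can stay close to one another even when the percentile dates differ.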
Table 1

Data completeness change point dates, number of eruptions after the change point and average recurrence interval for all eruptions using the Markov chain Monte Carlo method

| Region / country | Change point year^a (ad): 5 % | 50 % | 95 % | Number of eruptions (all) after change point year: 5 % | 50 % | 95 % | Total | Average recurrence interval (years): 5 % | 50 % | 95 % |
|---|---|---|---|---|---|---|---|---|---|---|
| Region | | | | | | | | | | |
| Africa and Red Sea | 1819 | 1820 | 1820 | 136 | 134 | 134 | 163 | 1.4 | 1.4 | 1.4 |
| Alaska | 1782 | 1784 | 1784 | 307 | 306 | 306 | 340 | 0.7 | 0.7 | 0.7 |
| Antarctica | 1891 | 1893 | 1899 | 41 | 40 | 39 | 60 | 2.9 | 2.9 | 2.8 |
| Atlantic Ocean | 1986 | 1988 | 1990 | 5 | 5 | 5 | 82 | 4.9 | 4.3 | 4.1 |
| Canada and Western USA | 1831 | 1841 | 1842 | 33 | 31 | 31 | 134 | 5.4 | 5.4 | 5.4 |
| Hawaii and Pacific Ocean | 1817 | 1820 | 1820 | 156 | 154 | 154 | 306 | 1.2 | 1.2 | 1.2 |
| Iceland and Arctic Ocean | 1693 | 1702 | 1706 | 126 | 124 | 122 | 299 | 2.5 | 2.5 | 2.5 |
| Indonesia | 1768 | 1770 | 1770 | 1142 | 1140 | 1140 | 1260 | 0.2 | 0.2 | 0.2 |
| Japan, Taiwan, Marianas | 1541 | 1542 | 1542 | 1023 | 1022 | 1022 | 1366 | 0.5 | 0.5 | 0.5 |
| Kamchatka and Mainland Asia | 1734 | 1737 | 1737 | 315 | 313 | 313 | 566 | 0.9 | 0.9 | 0.9 |
| Kuril Islands | 1757 | 1760 | 1765 | 139 | 138 | 135 | 148 | 1.8 | 1.8 | 1.8 |
| Mediterranean and W Asia | 1631 | 1682 | 1682 | 205 | 193 | 193 | 326 | 1.8 | 1.7 | 1.7 |
| Melanesia and Australia | 1850 | 1855 | 1856 | 378 | 377 | 376 | 419 | 0.4 | 0.4 | 0.4 |
| México and Central America | 1503 | 1517 | 1518 | 616 | 613 | 612 | 712 | 0.8 | 0.8 | 0.8 |
| Middle East and Indian Ocean | 1731 | 1750 | 1751 | 213 | 210 | 209 | 240 | 1.3 | 1.2 | 1.2 |
| New Zealand to Fiji | 1835 | 1836 | 1851 | 290 | 289 | 284 | 388 | 0.6 | 0.6 | 0.6 |
| Philippines and SE Asia | 1799 | 1808 | 1825 | 156 | 154 | 151 | 186 | 1.4 | 1.3 | 1.2 |
| South America | 1728 | 1737 | 1738 | 631 | 629 | 628 | 805 | 0.4 | 0.4 | 0.4 |
| West Indies | 1952 | 1965 | 1965 | 18 | 15 | 15 | 67 | 3.2 | 3.0 | 3.0 |
| Country | | | | | | | | | | |
| Chile | 1730 | 1737 | 1742 | 300 | 299 | 297 | 338 | 0.9 | 0.9 | 0.9 |
| Colombia | 1820 | 1822 | 1822 | 67 | 66 | 66 | 103 | 2.8 | 2.8 | 2.8 |
| Costa Rica | 1818 | 1821 | 1821 | 98 | 97 | 97 | 134 | 2.0 | 1.9 | 1.9 |
| Ecuador | 1723 | 1725 | 1738 | 181 | 180 | 177 | 266 | 1.6 | 1.6 | 1.5 |
| El Salvador | 1760 | 1766 | 1769 | 96 | 95 | 94 | 108 | 2.6 | 2.6 | 2.6 |
| France | 1731 | 1748 | 1756 | 186 | 183 | 181 | 231 | 1.5 | 1.4 | 1.4 |
| Guatemala | 1562 | 1565 | 1565 | 109 | 107 | 107 | 120 | 4.1 | 4.2 | 4.2 |
| Iceland | 1692 | 1702 | 1706 | 119 | 116 | 114 | 290 | 2.7 | 2.7 | 2.7 |
| Indonesia | 1770 | 1770 | 1770 | 1128 | 1128 | 1128 | 1248 | 0.2 | 0.2 | 0.2 |
| Italy | 1678 | 1682 | 1682 | 184 | 182 | 182 | 304 | 1.8 | 1.8 | 1.8 |
| Japan | 1541 | 1542 | 1543 | 959 | 958 | 957 | 1300 | 0.5 | 0.5 | 0.5 |
| Mexico | 1864 | 1869 | 1869 | 52 | 51 | 51 | 159 | 2.8 | 2.8 | 2.8 |
| New Zealand | 1834 | 1836 | 1855 | 221 | 220 | 216 | 277 | 0.8 | 0.8 | 0.7 |
| Nicaragua | 1847 | 1849 | 1849 | 151 | 150 | 150 | 186 | 1.1 | 1.1 | 1.1 |
| Papua New Guinea | 1871 | 1872 | 1872 | 187 | 186 | 186 | 218 | 0.7 | 0.7 | 0.7 |
| Philippines | 1800 | 1822 | 1825 | 152 | 149 | 148 | 183 | 1.4 | 1.3 | 1.3 |
| Russia | 1735 | 1737 | 1759 | 451 | 449 | 442 | 706 | 0.6 | 0.6 | 0.6 |
| United States | 1783 | 1784 | 1784 | 517 | 515 | 515 | 769 | 0.4 | 0.4 | 0.4 |
| Vanuatu | 1854 | 1856 | 1861 | 132 | 131 | 130 | 137 | 1.2 | 1.2 | 1.1 |

^a The change point dates are calculated from the 5th, 50th and 95th percentiles of the change point posterior distribution

Table 2

Data completeness change point dates, number of eruptions after the change point and average recurrence interval for large magnitude eruptions (Volcanic Explosivity Index ≥ 4) using the Markov chain Monte Carlo method

| Region / country | Change point year^a (ad, negative values are bc): 5 % | 50 % | 95 % | Number of eruptions (VEI ≥ 4) after change point year: 5 % | 50 % | 95 % | Total | Average recurrence interval (years): 5 % | 50 % | 95 % |
|---|---|---|---|---|---|---|---|---|---|---|
| Region | | | | | | | | | | |
| Africa and Red Sea | −8167 | −7655 | −6075 | 10 | 10 | 9 | 11 | 1017.7 | 966.5 | 898.3 |
| Alaska | 1461 | 1724 | 1762 | 14 | 12 | 11 | 36 | 39.2 | 23.8 | 22.5 |
| Atlantic Ocean | −1914 | −329 | −85 | 11 | 10 | 10 | 13 | 356.7 | 233.9 | 209.5 |
| Canada and Western USA | −2403 | −653 | −446 | 19 | 16 | 15 | 22 | 232.3 | 166.4 | 163.7 |
| Iceland and Arctic Ocean | 779 | 855 | 1073 | 38 | 38 | 35 | 51 | 32.4 | 30.4 | 26.8 |
| Indonesia | 1535 | 1575 | 1586 | 29 | 29 | 28 | 34 | 16.4 | 15.0 | 15.1 |
| Japan, Taiwan, Marianas | −2148 | −382 | 113 | 84 | 66 | 59 | 117 | 49.5 | 36.2 | 32.2 |
| Kamchatka and Mainland Asia | 1613 | 1815 | 1924 | 14 | 12 | 8 | 107 | 28.4 | 16.3 | 10.8 |
| Kuril Islands | 1561 | 1662 | 1688 | 12 | 12 | 12 | 15 | 37.4 | 29.0 | 26.8 |
| Mediterranean and W Asia | −3128 | −2638 | −2500 | 24 | 24 | 23 | 27 | 214.1 | 193.7 | 196.1 |
| Melanesia and Australia | 171 | 1866 | 1903 | 24 | 11 | 11 | 30 | 76.6 | 13.1 | 9.7 |
| México and Central America | −868 | −383 | 391 | 47 | 45 | 36 | 70 | 61.2 | 53.2 | 45.0 |
| New Zealand to Fiji | 692 | 1589 | 1708 | 8 | 6 | 4 | 41 | 164.7 | 70.2 | 75.5 |
| Philippines and SE Asia | 1159 | 1335 | 1380 | 11 | 11 | 10 | 16 | 77.4 | 61.4 | 63.0 |
| South America | −1488 | −1339 | −1064 | 64 | 64 | 60 | 89 | 54.7 | 52.3 | 51.2 |
| West Indies | 1900 | 1901 | 1902 | 2 | 2 | 2 | 28 | 54.9 | 54.3 | 54.0 |
| Country | | | | | | | | | | |
| Chile | 1490 | 1633 | 1888 | 8 | 8 | 6 | 24 | 65.0 | 47.1 | 20.3 |
| Colombia | −4319 | −2014 | −1053 | 16 | 14 | 12 | 18 | 395.6 | 287.4 | 255.3 |
| Costa Rica | 1398 | 1400 | 1400 | 1 | 1 | 1 | 22 | 612.2 | 610.4 | 610.0 |
| Ecuador | −1355 | −739 | −444 | 32 | 29 | 27 | 43 | 105.2 | 94.8 | 90.9 |
| France | −5025 | −694 | −103 | 20 | 14 | 11 | 24 | 351.8 | 193.1 | 192.1 |
| Guatemala | 1581 | 1659 | 1714 | 7 | 7 | 7 | 8 | 61.3 | 50.1 | 42.3 |
| Iceland | 773 | 856 | 1073 | 38 | 38 | 35 | 51 | 32.6 | 30.4 | 26.8 |
| Indonesia | 1533 | 1575 | 1586 | 29 | 29 | 28 | 34 | 16.4 | 15.0 | 15.1 |
| Italy | −3185 | −2640 | −2464 | 21 | 21 | 20 | 24 | 247.4 | 221.4 | 223.7 |
| Japan | −3064 | −997 | 128 | 90 | 70 | 56 | 114 | 56.4 | 43.0 | 33.6 |
| Mexico | 541 | 772 | 1575 | 15 | 15 | 10 | 25 | 97.9 | 82.5 | 43.5 |
| New Zealand | −4336 | 1453 | 1644 | 26 | 4 | 3 | 39 | 244.1 | 139.3 | 122.0 |
| Nicaragua | −3459 | −1471 | −1061 | 10 | 9 | 9 | 12 | 546.9 | 386.8 | 341.2 |
| Papua New Guinea | 113 | 308 | 1875 | 23 | 23 | 10 | 28 | 82.5 | 74.0 | 13.5 |
| Philippines | 1162 | 1340 | 1400 | 11 | 11 | 10 | 16 | 77.1 | 60.9 | 61.0 |
| Russia | 1620 | 1689 | 1810 | 26 | 25 | 20 | 118 | 15.0 | 12.8 | 10.0 |
| United States | −380 | 1350 | 1765 | 37 | 22 | 16 | 61 | 64.6 | 30.0 | 15.3 |

^a The change point dates are calculated from the 5th, 50th and 95th percentiles of the change point posterior distribution

The regional and country boundaries used in the VOTW catalogue can have an impact on the change point. For example, Table 1 shows the dominance of the Alaskan regional record on the change point for the United States (as a country). The change point date for all eruptions in the Canada and Western USA region occurs roughly 60 years after the Alaskan record is considered complete. However, when the regions of Alaska, Western USA (minus Canada) and Hawaii are grouped, the change point date is dominated by the Alaskan record. Additionally, changes in political boundaries through a country’s history (e.g. new countries forming post-colonisation) may cause changes in recording. This demonstrates that groupings (regional or otherwise) can impact calculated change point dates and suggests that this approach could be improved by grouping volcanoes into better-defined regions based on activity or recording history, keeping in mind the requirements of independence and stationarity.

Large eruptions (defined here as VEI ≥ 4) are more likely to be observed and documented, both in historical records and through dating techniques such as tephrochronology. As a result the catalogue will, in general, be considered complete earlier if analysing larger eruptions in isolation. This date is important to determine as the lower frequency of large eruptions usually means there is a lack of data if a shorter time period is considered. The increased probability of eruptions being documented results in broader posterior distributions and larger differences in recurrence intervals (Table 2) when compared with VEI ≥ 0. The change point dates for VEI ≥ 4 eruptions generally occur earlier than the change points for the complete catalogue, but the actual dates vary greatly between regions and countries, with large differences between the 5th and 95th percentile recurrence intervals. The country scale change point dates proposed by Jenkins et al. (2012) (who used a change-in-slope method) appear to agree with the results of the MCMC simulations in most cases, although the MCMC analysis tends to suggest earlier dates that reduce the estimated recurrence intervals. Two notable differences between the calculated change points of Jenkins et al. (2012) and this analysis are for the countries of Mexico and Nicaragua. In both instances, the countries’ entire catalogues were assumed to be complete by Jenkins et al. (2012); however, the MCMC simulations suggest that only a subset of the catalogue is complete, resulting in a difference of more than 250 years in the calculated recurrence intervals.

Completeness for all eruptions

As discussed previously, the posterior distributions for VEI ≥ 0 were relatively tight and resulted in little difference between the calculated 5th and 95th percentile recurrence intervals. In all instances, the change point dates appeared to be strongly correlated with an increase in the recording of eruptions with VEI less than or equal to 2. Examples of this effect can be seen for the regions of Canada and Western USA, Iceland and Arctic Ocean, and South America (Fig. 5). In the regions of Canada and Western USA (Fig. 5a) and Iceland and Arctic Ocean (Fig. 5b), there is a sudden increase in the number of recorded VEI ≤ 2 eruptions that appears to constrain the location of the change point to dates just before this time. In South America (Fig. 5c), the increase in documented VEI ≤ 2 eruptions is not as sudden but still results in a very narrow posterior distribution, giving a high level of certainty in the recurrence interval estimate for the region.
Fig. 5

Cumulative number of eruptions (filled points) and eruption magnitudes (hollow points) recorded for the regions of a Canada and Western USA, b Iceland and Arctic Ocean, and c South America, displaying the 50th percentile (solid line), and the 5th and 95th percentiles (dotted lines) of the change point posterior. The change point appears to be primarily controlled by the number of VEI < 4 eruptions recorded

While, in most cases, countries will have fewer eruptions on record than will their respective regions, the change point is generally easier to distinguish because recording of small eruptions is likely to be correlated with social, population and political changes in the country (Siebert et al. 2010). Sudden increases for countries such as the USA (Fig. 6c) and Indonesia (Fig. 6a) also make it easy to determine the change in data completeness by alternative methods (e.g. Jenkins et al. 2012). However, countries with longer historical records and more extensive geological studies, such as Japan (Fig. 6b), have a more consistent increase in the number of documented eruptions, and several viable change point dates are visible. Our single change point model indicates that the most likely change point date for Japan is between 1541 and 1543, with the dominance and consistent recording of VEI 2 eruptions appearing to be the primary driver of the date. This is also a possible weakness in the single change point model, and a multiple change point model such as that proposed by Furlan (2010) might better identify the multiple break points present in these cases.
Fig. 6

Cumulative number of eruptions (filled points) and eruption magnitudes (hollow points) recorded for a Indonesia, b Japan and c the United States of America, displaying the 50th percentile (solid line), and the 5th and 95th percentiles (dotted lines) of the change point posterior. The change point is primarily controlled by the number of VEI < 4 eruptions recorded

Completeness for large eruptions

The posterior distributions for large eruptions (VEI ≥ 4) were generally much broader and had more complicated shapes when compared with VEI ≥ 0. As an example, the percentiles of the change point for the Alaskan region are shown in Fig. 7a; however, the shape of the change point posterior is bimodal, with peaks in the sixteenth and eighteenth centuries. The first peak corresponds to two VEI 4 eruptions being documented through radiocarbon dating, and the second peak corresponds to the beginning of written historical records. Other regions, New Zealand to Fiji for example, and countries such as Mexico, also exhibit a strong correlation between written records and data completeness, indicating that geological studies included in the VOTW catalogue are incomplete in these areas even for large events.
Fig. 7

Cumulative number of VEI ≥ 4 eruptions (filled points) and eruption magnitudes (hollow points) recorded for the regions of a Alaska, b Japan, Taiwan, Marianas and c New Zealand to Fiji displaying the 50th percentile (solid line), and the 5th and 95th percentiles (dotted lines) of the change point posterior. Note, x axis values vary for each case

Historical records for the Japan, Taiwan and Marianas region (Fig. 7b) start from 680 ad and are supplemented by a large number of eruptions documented through other dating techniques. This results in a gradually increasing eruption record, with steps that could be related to either increases in activity or increases in recording and monitoring. As a result, the calculated posterior distribution spans more than 2,000 years, expressing the large uncertainty in assigning a single date for regions such as this, and also suggesting a multiple change point model could be more appropriate in these circumstances. Countries with eruption histories dominated by geological records, such as Ecuador and Nicaragua (figures not shown), also have very broad posteriors causing larger uncertainty in estimates of recurrence interval.

Discussion

Change point posteriors are strongly affected by the recording frequency of smaller eruptions. This was particularly noticeable when considering the record of all VEI ≥ 0 eruptions, for which the change point dates in all instances correlated with an increase in the recording of eruptions with VEI less than or equal to 2. The statistical model itself suggests this would be the case, with smaller eruptions having a very low probability of documentation, meaning that most VEI ≤ 2 eruptions in the catalogue should be listed after the change point (Fig. 4). Inclusion in the catalogue of these smaller eruptions is typically heavily related to the start of written historical records, with 78 % of small eruptions being classified by Siebert and Simkin (2002) as historical. These results suggest that the start of the historical eruption record appears to be a good indicator of data completeness, particularly when considering smaller eruptions in isolation. As Table 1 shows, the accumulation of VEI 0, 1 and 2 eruptions into a single composite category does not appear to have had an impact on the change point date. This is probably due to the dominance of VEI 2 records in the current catalogue, the large number of eruptions considered and the selection of a single change point model. In areas where VEI 0 or 1 eruptions are more frequent or if future VEI 0 and 1 eruptions are recorded more accurately, this accumulation may be less valid. A more complex presence function that better describes the decreased recording probability of VEI 0 and 1 eruptions or a multiple change point model may be more suitable in these areas, where both the increase in recording of VEI 2 eruptions (i.e. what is generally detected here) and the actual increase in VEI 0 and 1 eruptions need to be considered.
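The step-change behaviour discussed above can be sketched as a presence function that returns the probability of an eruption being documented. The logistic dependence on VEI before the change point is an assumption made for illustration and may differ from the functional form used in the analysis. The default `alpha` and `beta` simply reuse the MCMC starting values quoted earlier, and the function name is hypothetical.

```python
import math


def recording_probability(year, vei, k, alpha=-3.5, beta=1.0):
    """Step-change presence function: certain recording from year k onward.

    Assumption: before k the probability is logistic in VEI. The logistic
    form is illustrative only and is not confirmed by the text.
    """
    if year >= k:
        return 1.0
    return 1.0 / (1.0 + math.exp(-(alpha + beta * vei)))


# Hypothetical change point of 1800: small eruptions before that date are
# much less likely to appear in the catalogue than large ones.
for vei in (0, 2, 4, 6):
    print(vei, round(recording_probability(1500, vei, k=1800), 3))
```

Under a form like this, most catalogued VEI ≤ 2 eruptions would be expected to fall after the change point, which is consistent with the pattern described in the paragraph above.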

Due to their higher frequency, smaller eruptions affect the change point more profoundly than large eruptions. This is because the lower frequency of large eruptions and the more gradual improvement in the large VEI eruption record with time make it more difficult to detect a systematic change in reporting. The gradual improvement in recording is likely to be due to larger eruptions being easier to distinguish in the geological record. Larger VEI eruptions have been primarily dated through geological techniques, with only 32 % present in historical records, of which most have been documented in the past 300 years. The difficulty in detecting change points in the large-VEI eruption record generally results in a very broad posterior distribution, highlighting the uncertainty in estimating return intervals for large eruptions.

A key benefit of the method presented here is the ability to express uncertainty in the date of completeness, particularly for regions and countries with long geological records such as Japan, Ecuador, Nicaragua and Mexico. In these instances, the start of the historical record is not a reliable measure for determining completeness because the data are heavily supplemented by geological studies. This results in the posterior distribution being considerably broader and sometimes multimodal, which increases the uncertainty of recurrence interval estimates. While percentiles of the posterior distribution are used here to efficiently display the median and uncertainty for many regions with simple posterior distributions, this approach may not be suitable when dealing with multimodal distributions displaying varying peaks (Fig. 4a, for example). It may be better in such cases to investigate or sample from the posterior directly, applying additional data analysis and expert inference to determine the ideal date of data completeness within the broad range identified. Multiple change point models such as the one suggested by Furlan (2010) could possibly decrease the uncertainty in these situations, albeit with increased complexity and subjectivity in determining which change point to use.
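One way to work with the posterior directly, rather than with three percentiles, is to carry every posterior draw of the change point through to a recurrence interval. The sketch below assumes that eruption dates and posterior draws are available as plain lists of years. The function name, the `end_year` parameter and all data values are hypothetical.

```python
import bisect


def interval_samples(eruption_years, k_samples, end_year):
    """Recurrence interval implied by each posterior draw of the change point."""
    years = sorted(eruption_years)
    intervals = []
    for k in k_samples:
        # Count eruptions dated at or after the candidate change point.
        n_after = len(years) - bisect.bisect_left(years, k)
        if n_after > 0:  # skip draws that leave no eruptions to average over
            intervals.append((end_year - k) / n_after)
    return intervals


# Hypothetical eruption dates and a bimodal set of change point draws.
eruptions = [1600, 1700, 1800, 1900, 1950]
draws = [1650, 1850]
print(interval_samples(eruptions, draws, end_year=2000))
```

The resulting list is itself a sample from the distribution of recurrence intervals, so a multimodal change point posterior shows up as distinct clusters of interval values instead of being collapsed into a single range.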

The presence function utilised here is limited by the assumption that there is a single, dramatic shift in the recording of volcanic eruptions for the subset of data being analysed. While this assumption appears valid for regions such as Southeast Asia and South America (and countries within), where the increase in recording is strongly correlated with exploration and colonisation, it may be less valid for countries which have extensive historical records augmented by detailed geological studies and dating. For these areas, the eruption record is generally characterised by a steadier improvement in recording through time. For countries and regions which exhibit these characteristics, presence functions that account for an increase in recording probability with time and eruption size, such as the six-parameter model of Coles and Sparks (2006) or the multiple change point models demonstrated by Furlan (2010), may be more suitable. However, the usefulness of these approaches is limited by the choice of a single threshold date that is generally required to separate ‘well-recorded’ from ‘under-recorded’ data. It is for this reason that a single change point model was chosen here; with additional analysis, however, it could be possible to implement functions that take into account the steady increase in recording with time.

Variations in data completeness and, consequently, estimated recurrence intervals between regions, countries and different eruption sizes indicate that assessments of data completeness need to be undertaken at an appropriate scale, rather than globally. However, the accuracy and level of uncertainty in most statistical analyses (including Bayesian inference) are heavily dependent on the size and accuracy of the dataset. As a result, the change point dates may move as improvements are made to various catalogues, especially with respect to improved geological records for large eruptions (for example, Table 2 suggests the large VEI eruption record for Africa and Red Sea is complete since the early Holocene, which may be proved incorrect with further geological studies). The nature of the MCMC algorithm means that these change points can be recalculated easily and without subjectivity for updated or new datasets or regions.

Conclusion

Markov chain Monte Carlo simulation has been used to assess the completeness of the Smithsonian Institution’s VOTW catalogue for all eruptions within approximately the past 10,000 years and independently for large eruptions (VEI ≥ 4). These analyses were conducted on regional and, where enough data were available, country scales. Complete records of VEI ≥ 0 eruptions generally begin from the middle of the last millennium but vary considerably among regions and countries due to social, population and political changes. As suggested by Siebert et al. (2010), the presence of volcanic eruptions in media and social discourse could also increase sensitivity to volcanic eruptions, leading to better recording of these events (e.g. the eruption of Krakatau in 1883). Results from the MCMC simulations appear to support this hypothesis, with countries such as Japan and Italy, which are commonly associated with volcanism and have long written histories, having longer complete records. When considering all eruption sizes, the change point is strongly correlated with an increase in recording of eruptions with VEI less than or equal to 2. In most cases, a sudden increase in documentation results in relatively high certainty in the change point date, yielding more reliable estimates of the average recurrence interval.

The completeness of documentation for large eruptions (VEI ≥ 4) occurs earlier than for eruptions of all sizes, which is to be expected, given that large eruptions are easier to observe in geological records and more likely to be documented in historical accounts. The relatively low number of documented large events, due largely to long recurrence intervals, results in higher uncertainty in the calculated change point dates, which propagates into large differences in recurrence intervals between the 5th and 95th percentiles. In instances such as these, the nature and shape of the posterior distribution can be used to increase the certainty in eruption recurrence intervals. The multimodal nature of some distributions suggests there may be a selection of candidate dates from which the data can be assumed to be complete, as opposed to a continuous distribution. The ability to explore the posterior distribution in order to better understand the eruption record and its uncertainty highlights the value of future approaches utilising Bayesian inference. Using this MCMC approach, consistent, non-subjective change point dates and average recurrence intervals are identified, both with estimates of uncertainty.

Notes

Acknowledgments

The authors would like to thank Ed Venzke and the Smithsonian Institution Global Volcanism Program for early access to the updated online Volcanoes of the World catalogue. We thank Mark Bebbington and an anonymous reviewer for providing detailed suggestions that improved the manuscript. Stuart Mead is jointly supported by an Australian Postgraduate Award (APA) and a scholarship from the Commonwealth Scientific and Industrial Research Organisation (CSIRO) Digital Productivity and Services (DPAS) flagship.

References

  1. Auker M, Sparks R, Siebert L, Crosweller H, Ewert J (2013) A statistical analysis of the global historical volcanic fatalities record. J Appl Volcanol 2:1–24. doi:10.1186/2191-5040-2-2
  2. Chester DK, Degg M, Duncan AM, Guest J (2000) The increasing exposure of cities to the effects of volcanic eruptions: a global survey. Glob Environ Chang B: Environ Hazards 2:89–103
  3. Coles S (2001) An introduction to statistical modeling of extreme values. Springer-Verlag, London, UK
  4. Coles S, Sparks R (2006) Extreme value methods for modelling historical series of large volcanic magnitudes. In: Mader H, Coles S, Connor C, Connor L (eds) Statistics in volcanology, vol 1. Geological Society of London, London, pp 47–56
  5. Connor C, Hill B, Winfrey B, Franklin N, Femina P (2001) Estimation of volcanic hazards from tephra fallout. Nat Hazards Rev 2:33–42. doi:10.1061/(ASCE)1527-6988(2001)2:1(33)
  6. Crosweller HS et al (2012) Global database on large magnitude explosive volcanic eruptions (LaMEVE). J Appl Volcanol 1:4
  7. Davison AC, Smith RL (1990) Models for exceedances over high thresholds. J R Stat Soc Ser B Methodol 52:393–442. doi:10.2307/2345667
  8. De la Cruz-Reyna S (1991) Poisson-distributed patterns of explosive eruptive activity. Bull Volcanol 54:57–67. doi:10.1007/BF00278206
  9. Deligne NI, Coles SG, Sparks RSJ (2010) Recurrence rates of large explosive volcanic eruptions. J Geophys Res: Solid Earth 115, B06203. doi:10.1029/2009JB006554
  10. Dussauge-Peisser C, Helmstetter A, Grasso JR, Hantz D, Desvarreux P, Jeannin M, Giraud A (1999) Probabilistic approach to rock fall hazard assessment: potential of historical data analysis. Nat Hazards Earth Syst Sci 2:15–26. doi:10.5194/nhess-2-15-2002
  11. Furlan C (2010) Extreme value methods for modelling historical series of large volcanic magnitudes. Stat Model 10:113–132. doi:10.1177/1471082x0801000201
  12. Geweke J (1992) Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM (eds) Bayesian Statistics 4. Oxford University Press, Oxford, pp 169–193
  13. Gilks WR, Richardson S, Spiegelhalter DJ (1998) Markov chain Monte Carlo in practice. Chapman & Hall, Boca Raton
  14. Guttorp P, Thompson ML (1991) Estimating second-order parameters of volcanicity from historical data. J Am Stat Assoc 86:578–583. doi:10.1080/01621459.1991.10475082
  15. Hakimhashemi AH, Grünthal G (2012) A statistical method for estimating catalog completeness applicable to long‐term nonstationary seismicity data. Bull Seismol Soc Am 102:2530–2546
  16. Hayakawa Y (1997) Hayakawa's 2000-year eruption catalog. http://gunma.zamurai.jp/database/
  17. Jenkins S, Magill C, McAneney J, Blong R (2012) Regional ash fall hazard I: a probabilistic assessment methodology. Bull Volcanol 74:1699–1712. doi:10.1007/s00445-012-0627-8
  18. Kirschbaum D, Adler R, Hong Y, Hill S, Lerner-Lam A (2010) A global landslide catalog for hazard applications: method, results, and limitations. Nat Hazards 52:561–575
  19. Klein FW (1982) Patterns of historical eruptions at Hawaiian volcanoes. J Volcanol Geotherm Res 12:1–35. doi:10.1016/0377-0273(82)90002-6
  20. Kyselý J, Gaál L, Picek J (2011) Comparison of regional and at-site approaches to modelling probabilities of heavy precipitation. Int J Climatol 31:1457–1472. doi:10.1002/joc.2182
  21. Lamb HH (1970) Volcanic dust in the atmosphere; with a chronology and assessment of its meteorological significance. Phil Trans R Soc London A Math Phys Sci 266:425–533. doi:10.2307/73764
  22. Landsea C (2007) Counting Atlantic tropical cyclones back to 1900. Eos Trans Am Geophys Union 88:197–202. doi:10.1029/2007EO180001
  23. Lang M, Ouarda TBMJ, Bobée B (1999) Towards operational guidelines for over-threshold modeling. J Hydrol 225:103–117. doi:10.1016/S0022-1694(99)00167-5
  24. Marzocchi W, Zaccarelli L (2006) A quantitative model for the time-size distribution of eruptions. J Geophys Res: Solid Earth 111, B04204. doi:10.1029/2005JB003709
  25. Mendoza-Rosas AT, De la Cruz-Reyna S (2008) A statistical method linking geological and historical eruption time series for volcanic hazard estimations: applications to active polygenetic volcanoes. J Volcanol Geotherm Res 176:277–290
  26. Mulargia F, Gasperini P, Tinti S (1987) Identifying different regimes in eruptive activity: an application to Etna volcano. J Volcanol Geotherm Res 34:89–106. doi:10.1016/0377-0273(87)90095-3
  27. Newhall CG, Self S (1982) The volcanic explosivity index (VEI) an estimate of explosive magnitude for historical volcanism. J Geophys Res: Oceans 87:1231–1238. doi:10.1029/JC087iC02p01231
  28. Pickands J III (1971) The two-dimensional Poisson process and extremal processes. J Appl Probab 8:745–756
  29. Rotondi R, Garavaglia E (2002) Statistical analysis of the completeness of a seismic catalogue. Nat Hazards 25:245–258. doi:10.1023/A:1014855822358
  30. Schuster SS, Blong RJ, Speer MS (2005) A hail climatology of the greater Sydney area and New South Wales, Australia. Int J Climatol 25:1633–1650. doi:10.1002/joc.1199
  31. Siebert L, Simkin T (2002) Volcanoes of the World: an illustrated catalog of Holocene volcanoes and their eruptions. Smithsonian Institution, Global Volcanism Program digital information series, GVP-4. (http://www.volcano.si.edu)
  32. Siebert L, Simkin T, Kimberly P (2010) Volcanoes of the World, 3rd edn. University of California Press, Berkeley
  33. Small C, Naumann T (2001) The global distribution of human population and recent volcanism. Glob Environ Chang B: Environ Hazards 3:93–109
  34. Wang T, Bebbington M (2012) Estimating the likelihood of an eruption from a volcano with missing onsets in its record. J Volcanol Geotherm Res 243–244:14–23. doi:10.1016/j.jvolgeores.2012.06.032
  35. Watt SFL, Pyle DM, Mather TA (2013) The volcanic response to deglaciation: evidence from glaciated arcs and a reassessment of global eruption records. Earth Sci Rev 122:77–102. doi:10.1016/j.earscirev.2013.03.007
  36. Wirtz A, Kron W, Löw P, Steuer M (2014) The need for data: natural disasters and the challenges of database management. Nat Hazards 70:135–157. doi:10.1007/s11069-012-0312-4
  37. Woessner J, Wiemer S (2005) Assessing the quality of earthquake catalogues: estimating the magnitude of completeness and its uncertainty. Bull Seismol Soc Am 95:684–698. doi:10.1785/0120040007

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Risk Frontiers, Faculty of Science, Macquarie University, Sydney, Australia
  2. Commonwealth Scientific and Industrial Research Organisation Digital Productivity and Services Flagship (CSIRO, DP&S), Clayton, Australia
