Introduction and background

Not all explosive activity at volcanoes or hydrothermal areas is directly linked to the ascent and eruption of fresh magma. There are also events caused by the explosive expansion of water, steam and/or other gases; these are described by many terms, including phreatic eruption, hydrothermal explosion, steam-blast or non-juvenile eruption, and the definitions are not always consistent. In Barberi et al. (1992), phreatic explosions are defined as explosions of a confined pocket of steam and gas with no direct involvement of magma. Other authors include eruptions where minor amounts of juvenile material are ejected, as long as gas is the main driver of the eruption (e.g. Kilgour et al. 2010). Browne and Lawless (2001) distinguish between hydrothermal, phreatic and magmatic-hydrothermal eruptions depending on the main energy source, while Mastin (1995) uses the underlying process to define three types of non-juvenile eruptions: gas eruptions, mixing eruptions and boiling-point eruptions. The latter are the most common and are caused by the depressurisation of water near its boiling point. In this study, we use the term "phreatic eruption" for all explosive activity caused by the explosive expansion of water, steam and/or gas - independent of the fluid source.

These explosions occur at many different volcanoes: Soufrière de Guadeloupe (Le Guern et al. 1980), Meakandake (Ogiso and Yomogida 2012), Usu (Miura et al. 2012), Bandai (Glicken and Nakamura 1988), Karkar (Barberi et al. 1992), Nyos (Neri et al. 1999), Kilauea (Dvorak 1992), Mayon, Pinatubo, Mount St Helens (Newhall et al. 2001), Ruapehu (Kilgour et al. 2010) and Poás (Fischer et al. 2015) are just a few examples. However, as their magnitude is generally smaller than that of magmatic events, phreatic eruptions are comparatively understudied. Because the volumes of erupted material are minor, the deposits are typically not well preserved in the geological record, and phreatic eruptions are also likely to be overlooked in observational reports, especially when they are precursors to a magmatic eruption. But even on their own, phreatic eruptions pose significant hazards: ballistic blocks, toxic gases, base surges, lahars, directional blasts and debris avalanches have been reported (Barberi et al. 1992). The particular danger lies in the difficulty of recognising the commonly very short-lived and highly localised precursors; in some cases precursors are not identifiable (e.g. Barberi et al. 1992; Kato et al. 2015; Maeda et al. 2015). As a consequence, forecasting phreatic eruptions is extremely challenging.

Due to their comparatively small impact radius, direct hazards from phreatic explosions are concentrated close to the crater. However, recent events at Mayon in 2013 with 5 casualties (Venzke 2013) and Ontake in 2014 with at least 57 casualties (Kato et al. 2015) demonstrate that the impact of localised phreatic eruptions is not to be underestimated by the authorities and that the risk at frequently visited, apparently quiescent, volcanoes might be higher than anticipated.

There are very few detailed, quantitative studies of hazards related to phreatic activity. An exception is the generic investigation by Neri et al. (1999), which addresses the hazards posed by phreatic-related surges, blasts and toxic gases using simple numerical simulations. A second example is the case study presented by Fitzgerald et al. (2014), who combined detailed field data analysis from a phreatic eruption at the Upper Te Maari Crater in 2012 with the simulation of 3D ballistic trajectories to assess ballistic hazards at this volcano.

This study focuses on phreatic eruptions at Ruapehu, New Zealand (Fig. 1), an active andesitic stratovolcano with frequent phreatic and phreatomagmatic eruptions, as well as two larger historic magmatic episodes (Kilgour et al. 2014). The crater hosts a hot acidic crater lake covering an active hydrothermal system. Phreatic explosions are caused by the release of volatiles and heat from shallow-seated magma into this system, and/or by a pressure build-up in the hydrothermal reservoir through sealing mineralisation, followed by the rupture of the resulting cap (e.g. Christenson and Wood 1993; Christenson et al. 2010). The latter occurred, for example, in September 2007, producing a surge, lahars and ballistics and seriously injuring a hiker who was on the summit at the time (Kilgour et al. 2010). Seismic anomalies have been reported prior to previous phreatic eruptions at Ruapehu (e.g. Dibble 1969; Nairn et al. 1979; Jolly et al. 2010; Mordret et al. 2010; Carniel et al. 2013), but they have only been identified as precursors retrospectively and therefore could not be used for timely warnings. For the 2007 eruption, Jolly et al. (2010) found in a hindsight analysis that seismic precursors started 10 min before the eruption; even if a monitoring system had identified these signals, little time would have remained to evacuate visitors from the summit.

Fig. 1

Overview map of Ruapehu and an illustration of its vent system (modified from Kilgour et al. (2010)). Topographic map from topomap.co.nz

The very detailed catalogue of Ruapehu’s activity (Scott 2013) offers an exceptional opportunity to study the statistics of recurring phreatic events and thereby support the assessment of hazards that might affect people in the summit area, which is frequently visited by hikers. We present an analysis of the recurrence statistics of past events at Ruapehu, followed by a probabilistic hazard assessment. The methodology and results are presented individually, and then discussed together.

Recurrence statistics

Methods

Scott (2013) compiled the most detailed record of the activity at Ruapehu since 1830 based on scientific and public sources. The activity is classified into 6 eruption scales from 0 to 5 (Table 1) based on their impact range. Scott (2013) identified several periods of higher than normal activity. Two magmatic episodes, September 1944 to July 1946 and June 1995 to November 1997, make up about 65% of the observations and Scott (2013) suggests removing these for the assessment of the volcano’s background behaviour. The completeness of the data set was assessed by Scott (2013) through comparing the cumulative frequencies of eruptive scales, which suggests that eruptions of Scale 3 and smaller are underreported. Due to improved monitoring, the data set is considered more complete from 1940 onwards (Scott 2013), although events confined to the lake (i.e. Scale 0-1) can still be missed due to the lack of continuous visual observation.

Table 1 Scale of Ruapehu eruptions after Scott (2013)

For our purposes we extracted all reports of phreatic eruptions from the dataset from 1940 onwards, hence neglecting all magmatic events. However, due to the varying definitions and terms used in literature and in the dataset, it is often unclear whether an event was purely phreatic or phreatomagmatic. Table 2 lists all terms that we interpreted as phreatic eruptions, as well as some ambiguous terms that could have been phreatic but have been interpreted as magmatic and are therefore not included in our analysis.

Table 2 Our interpretation of terms used in the eruption catalogue. Unambiguous terms for magmatic events (such as “magmatic eruption”) are not included in this list

After identifying all phreatic events from the original catalogue, only Scale 1 to 4 remain in our database. The original database has a daily resolution; for our purposes we use a monthly resolution, i.e. a month is counted as a success if there is at least one day with a reported phreatic explosion. We use month-based bins because there is often a large uncertainty in the database concerning how many events occurred in periods lasting several days. Hence, all subsequently calculated probabilities should be understood within this timeframe, i.e. they represent the probability of at least one event occurring within the next month. A basic statistical analysis of our database, including tests for Poisson-distributed behaviour and for a scale-time relationship, was carried out with simple MATLAB® routines.
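The monthly binning described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB routine; the function names and the event dates are hypothetical and serve only to show how several event days within one month collapse into a single success.

```python
from datetime import date


def months_with_events(event_days):
    """Return the set of (year, month) bins that contain at least one event day."""
    return {(d.year, d.month) for d in event_days}


def month_count(start, end):
    """Number of calendar months from start to end, both months included."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


# Hypothetical event days for illustration only (not taken from the catalogue).
event_days = [date(2000, 1, 3), date(2000, 1, 17), date(2000, 4, 9)]

successes = months_with_events(event_days)        # two days in January count once
n_months = month_count(date(2000, 1, 1), date(2000, 12, 31))
relative_occurrence = len(successes) / n_months   # fraction of months with an event
```

With this binning, uncertainty about the exact number of explosions within a multi-day period does not affect the count, which is the motivation given in the text.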

Results part one: recurrence statistics for phreatic eruptions at Ruapehu

The cumulative number of phreatic eruptions with time (Fig. 2) shows periods of increased activity and periods of relative quiescence. Some periods of phreatic quiescence coincide with magmatic eruptions, but there are also real periods of quiescence with neither phreatic nor magmatic activity (e.g., August 1959 to May 1964). The known magmatic phases are preceded by an increase in phreatic activity. However, the opposite does not hold true, i.e. not all increases in phreatic activity are followed by a magmatic phase.

Fig. 2

Cumulative number of phreatic eruptions with time. Since we only considered the phreatic sub-dataset and magmatic eruptions could have occurred in gaps of phreatic activity, we cross-checked for magmatic eruptions in the original catalogue during longer periods of phreatic quiescence. Red vertical lines indicate periods of magmatic episodes during phreatic quiescence; red boxes indicate major magmatic episodes as identified by Scott (2013). Labelled grey boxes at the top indicate periods 1–4 with different explosion frequencies as discussed in the text

In total, 90 months with eruptions were counted. If we average this over the whole time span from 1940 until January 2016, the overall relative occurrence of months with eruptions is 10%. However, there are periods with a higher eruption frequency than others. Periods 2, 3 and 4 indicated in Fig. 2 precede magmatic phases and show a significantly higher relative occurrence of months with eruptions of 31, 32 and 17%, respectively. For comparison, Period 1 - between July 1946 and March 1966 - has an eruption frequency of only 6%. In the time since the last magmatic phase in 1997, the average eruption occurrence has been even lower: 3%.

The average of the intervals between two phreatic eruptions, i.e. the mean of intereruptive time (“IET”) \(\mu_{\textrm{IET}}\), is 9 months, with a standard deviation \(\sigma\) of 14 months. Therefore, the coefficient of variation (Cox and Lewis 1966) is \(\text{CV}=\frac{\sigma}{\mu_{\textrm{IET}}}=1.6\). This value is an indicator of the statistical nature of processes: A Poisson process must have a CV value close to one, while for regularly occurring events CV is \(\ll 1\) and clustering processes typically show a value \(>1\). Therefore, it seems that phreatic eruptions at Ruapehu cluster. A \(\chi^2\)-test confirms that the phreatic events do not follow a Poisson distribution: the Poisson-hypothesis is rejected at the 0.05 significance level (p-value of 0.0135).
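As a small worked example, the coefficient of variation can be computed from a list of inter-eruptive times as sketched below. The function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used. The last line simply repeats the arithmetic with the reported summary values.

```python
import statistics


def coefficient_of_variation(intervals):
    """CV = standard deviation / mean of the inter-eruptive times."""
    mean = statistics.mean(intervals)
    sd = statistics.stdev(intervals)  # sample standard deviation (assumption)
    return sd / mean


# Arithmetic with the summary values reported in the text (months).
cv_reported = 14 / 9  # about 1.56, which rounds to the quoted 1.6
```

A perfectly regular sequence gives CV = 0, whereas strongly clustered sequences give values above one, as for the Ruapehu record.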

Figure 3a shows the cumulative number of phreatic eruptions individually for the different eruption scales and illustrates the expected much higher frequency of smaller-scaled events. The intervals between two eruptions of Scale 3 or 4, respectively, can be much longer than those between smaller scaled eruptions (Fig. 3b). However, no statistically significant relationship between IET and eruption size can be demonstrated (see Appendix).

Fig. 3

a) Cumulative phreatic eruption number with time for the individual eruption scales b) Dotty plot of phreatic eruptions over time, red lines indicate magmatic episodes, red boxes indicate major magmatic episodes as identified by Scott (2013)

Probabilistic hazard assessment using PyBetVH

Methods

Following the statistical analysis, we performed a probabilistic assessment of the hazard due to ballistic ejecta in the summit area of Ruapehu using PyBetVH (Tonini et al. 2015). Probabilistic volcanic hazard assessment (PVHA) aims to include the intrinsic uncertainty in forecasting volcanic activity. PyBetVH is the software implementation, coded in the Python programming language, of the BET_VH model (Marzocchi et al. 2010; Tonini et al. 2016), a Bayesian event tree model for long-term PVHA. Prior knowledge, e.g. from model results or a priori beliefs, is combined with frequencies from the catalogue, and the procedure accounts for both the aleatoric and epistemic uncertainty. The final products are conditional and absolute probability maps for hazardous volcanic phenomena. In the following, we describe the creation of the input files for PyBetVH, following Tonini et al. (2015).

Nodes 1-3

Usually, these nodes represent the probability that a volcano enters unrest (node 1), that the unrest is of magmatic origin (node 2) and that the unrest leads to an eruption (node 3), respectively. Since phreatic eruptions often occur independently of any magmatic unrest (Rouwet et al. 2014), we combine these nodes such that node 1 simply represents the probability of a phreatic eruption and ignore nodes 2 and 3 by assigning them a probability of 1. The assigned prior probability θ for an eruption is 0.5, with λ=1. λ is the “equivalent number of data”, which expresses the confidence in the prior probability value: a low value indicates that only very few data would suffice to change our guess of the prior probability. Our chosen combination of θ and λ for nodes 1 - 3 expresses the maximum ignorance probability distribution, i.e. we do not know anything about the real probability distribution. This prior probability is combined with past data: our phreatic eruption catalogue spans a time period of 789 months (counting only those that did not begin already in a state of phreatic eruption), and we count a total of 90 months with phreatic explosions.

Node 4

This node represents the probability for an eruption originating from a certain vent position. PyBetVH has two categories for the geometry of possible vent locations: a “volcanic field” corresponds to a rectangular grid of possible vents, while a “volcanic cone” is characterised by a central vent position over the summit crater area and 4 lateral sectors. Ruapehu falls in the latter category, with the crater lake as the central vent. Since it seems highly unlikely that the eruption does not originate from the crater lake, we assign a prior probability of 0.99 to the central and 0.0025 to each of the four radial sector positions, respectively. These prior beliefs are combined with past data: all of the recorded eruptions originated from the crater lake.

Node 5

Node 5 represents the probability for an eruption to be of a certain scale. To obtain prior probabilities, we make use of the general assumption that the scale-frequency relation of eruptions follows a power law. From the global magmatic eruption catalogue, Marzocchi et al. (2004) determined the relation log(N)=5.811−0.7794∗VEI, with N being the number of eruptions of the respective VEI in 1000 years. As a best guess, we assume that the scale-frequency relation of phreatic eruptions at Ruapehu also follows this law, leading to a probability of 0.834 for Scale 1, 0.139 for Scale 2, 0.023 for Scale 3 and 0.004 for Scale 4. It should be noted that the relation from Marzocchi et al. (2004) is based on a database of eruptions with VEI ≥2 and therefore might not be valid for phreatic eruptions at Ruapehu. Consequently, we assigned this prior probability a very low confidence value, i.e. λ=1. In combination with the large number of recorded eruptions, this means that the posterior probability is mainly determined by the data from the catalogue. As input for past data, we extracted 42 Scale 1, 28 Scale 2, 13 Scale 3 and 7 Scale 4 eruptions from the catalogue - corresponding to relative frequencies of 0.467, 0.311, 0.144 and 0.078, respectively.
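The prior probabilities quoted above can be reproduced with a short calculation. The sketch below assumes that each step in eruption scale corresponds to one VEI unit, so that only the slope of the relation matters and the intercept cancels on normalisation; the function name is hypothetical.

```python
def power_law_prior(n_scales=4, slope=0.7794):
    """Normalised weights 10**(-slope * k) for k = 0 .. n_scales - 1."""
    weights = [10 ** (-slope * k) for k in range(n_scales)]
    total = sum(weights)
    return [w / total for w in weights]


prior = [round(p, 3) for p in power_law_prior()]
# prior == [0.834, 0.139, 0.023, 0.004] for Scales 1 to 4

# Relative frequencies of the catalogue counts used as past data.
counts = [42, 28, 13, 7]
frequencies = [round(c / sum(counts), 3) for c in counts]
# frequencies == [0.467, 0.311, 0.144, 0.078]
```

The comparison shows how strongly the power-law prior favours Scale 1 relative to the catalogue, which is why a low confidence value is assigned to it.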

Node 6

This node reflects the probability for the occurrence of specific hazards. Here, we consider the hazard of ballistic ejecta, as these pose a serious threat to people on the summit. By definition, all eruptions of Scale 2 and larger deposit material outside of the lake, while Scale 1 eruption products are confined to the lake. Hence we assign a probability of 1 as the prior probability for Scales 2 to 4, and include all recorded eruptions of these scales as past data for the occurrence of ballistics. The prior probability is set to 0 accordingly for Scale 1, and all past eruptions of Scale 1 are included as a recorded lack of ballistics. λ is set to 1.

PyBetVH calculates the probability that a hazard exceeds intensity thresholds, which are defined in this node. As the intensity measure we choose the density of ballistic impacts F (following Fitzgerald et al. (2014)), and set the thresholds to 0, 0.001, 0.005 and 0.01 clasts/m². We assume an average hazardous area of 10 m² per clast (following estimations in Fitzgerald et al. (2014)), which is multiplied by the impact density to calculate the average number of clasts that affect a person on the summit.

The definition of areas on the map where the hazard is to be assessed is also included in this node through the definition of grid points; we use a radial grid (see Fig. 4). First, we define radial areas (rings) \(A_n\) surrounding the crater lake by their minimum radial distance \(r_n\) from the lake centre (250, 400, 550, 700, 850, 1000, 1150, 1300, 1450, 1600, 1750, 1900 and 2050 m; the upper bound is 3950 m). For each of these rings, 24 equally spaced points are then defined on its central radius as the grid points for BET (see Fig. 4).
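A minimal sketch of such a radial grid is given below. It assumes that the "central radius" of a ring is the midpoint between its inner and outer radius and that coordinates are expressed in metres east and north of the lake centre; the function name and the output layout are hypothetical and do not represent the PyBetVH input format.

```python
import math

# Inner ring radii and outer bound in metres, as listed in the text.
INNER_RADII = [250, 400, 550, 700, 850, 1000, 1150,
               1300, 1450, 1600, 1750, 1900, 2050]
OUTER_BOUND = 3950


def ring_grid(inner_radii=INNER_RADII, outer_bound=OUTER_BOUND, n_points=24):
    """Return (ring index, east, north) for equally spaced points on each ring."""
    edges = list(inner_radii) + [outer_bound]
    points = []
    for n in range(len(inner_radii)):
        r_mid = 0.5 * (edges[n] + edges[n + 1])   # assumed central radius
        for k in range(n_points):
            azimuth = 2.0 * math.pi * k / n_points  # clockwise from north
            points.append((n, r_mid * math.sin(azimuth), r_mid * math.cos(azimuth)))
    return points


grid = ring_grid()
```

With 13 rings and 24 points per ring, this gives 312 grid points.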

Fig. 4

Illustration of radial areas around the crater lake as defined for the hazard assessment. Note that for illustrative purposes, only 4 grid points are shown here as opposed to the 24 used in the study

Nodes 7&8

In this combined node, the prior probability for a hazard to overcome an intensity threshold in a certain area of interest is defined. To obtain the probabilities for ballistic impacts on Ruapehu’s summit to overcome impact density thresholds, we ran 60,000 simulations of ballistic trajectories (see section “Assessment of probabilities of ballistic impacts on Ruapehu’s summit”) with random input parameters, providing a statistical distribution for distances covered by ejected ballistic clasts. In combination with the number of ejected clasts that make it out of the crater (m), we can calculate the expected impact density F in different areas on the summit (see section “Assessment of probabilities of ballistic impacts on Ruapehu’s summit”) for a given eruption scale. Since m is poorly constrained, we use 10,000 randomly sampled m to obtain an empirical probability density function for F and use this to define the probability that the impact density in an area of interest overcomes a certain threshold (see section “Assessment of probabilities of ballistic impacts on Ruapehu’s summit”). Again, λ is 1. This prior knowledge is combined with data from the 2007 phreatic eruption (Kilgour et al. 2010), where ballistics were mapped: they occurred up to 2 km away from the crater, with an impact density of F=0.014 clasts/m², and are constrained to the northern side of the lake within a pitch circle with an opening angle of about 40°.

Assessment of probabilities of ballistic impacts on Ruapehu’s summit

To assess the hazard due to ballistic ejecta in the summit area that can be produced by phreatic eruptions of Scale 2 to 4, we developed a MATLAB® procedure, which performs the following steps:

  1. Generate random input parameters for ballistic flightpaths (based on eruption scale)
  2. Calculate ballistic trajectories and infer the impact distances by comparing the trajectory with Ruapehu’s topography
  3. Derive the probability for one ejected clast to reach a certain area on the summit
  4. Derive a probability density function for expected impact densities, given an eruption of a certain scale, for the different areas.

The steps are explained in more detail below.

Step 1: Random input parameters

Ballistic trajectories are mainly determined by eruption parameters, i.e. the ejection velocity and angle, and physical properties of the projectiles such as their size, shape and density. Ballistics from a phreatic eruption at Ruapehu are described for the 2007 eruption: they were predominantly angular to subangular fragments of hyalocrystalline to hypocrystalline pyroxene andesite, with a lesser amount of andesite breccia cemented by a medium grey mud (Christenson et al. 2010; Kilgour et al. 2010). Projectiles of a variety of volcano-lacustrine lithologies made up a subordinate group. The clasts showed varying degrees of hydrothermal alteration and vesicularity; their densities ranged from 1700 kg/m³ to 2700 kg/m³ (Kilgour et al. 2010). In our simulations, we assume cube-shaped projectiles and fix clast density at the mean value determined for ballistics during the 2007 eruption (Kilgour et al. 2010). The relative uncertainty of clast density is small in comparison to that regarding other input parameters, and has a comparatively small influence on resulting ballistic trajectories.

We randomly varied clast diameter, initial velocity and ejection angle, depending on the eruption scale. We only consider ballistic projectiles larger than 20 cm in diameter, because smaller clasts are likely to be affected by the eruptive column and their dispersal can therefore not be accurately simulated by our ballistic ejection model (Alatorre-Ibargüengoitia et al. 2016). The largest ballistic clasts of the 2007 eruption were 2 m in diameter, and eruption fragmentation is assumed to follow a power law (Girault et al. 2014). Hence we invoke three clast diameters with relative proportions based on a power law distribution: 60% of all simulated clasts are 20 cm, 30% are 1 m and 10% of clasts are 2 m in diameter.
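The discrete size distribution described above can be sampled as in the following sketch. The function name and the seed are hypothetical; only the three diameters and their proportions come from the text.

```python
import random

DIAMETERS_M = [0.2, 1.0, 2.0]   # clast diameters in metres
PROPORTIONS = [0.6, 0.3, 0.1]   # relative proportions given in the text


def sample_diameters(n, rng):
    """Draw n clast diameters according to the fixed proportions."""
    return rng.choices(DIAMETERS_M, weights=PROPORTIONS, k=n)


rng = random.Random(42)          # fixed seed for a reproducible illustration
diameters = sample_diameters(100_000, rng)
share_small = diameters.count(0.2) / len(diameters)
```

For a large sample, the share of 20 cm clasts approaches the prescribed 60%.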

Eruptions at Ruapehu are scaled according to the distance covered by their deposits. For each scale of eruption (Table 1) we used this maximum distance reached by volcanic ejecta to infer a maximum eruption velocity of the ballistic clasts. The maximum velocity for ballistics produced by eruptions of Scale 4 is also constrained by Kilgour et al. (2010), who inferred a maximum velocity of 135 m/s for the 2007 phreatic eruption. The minimum velocity for Scale 2 eruptions is based on the minimum distance necessary for ejecta to make it out of the lake. The minimum ejection velocities for Scale 3 and 4 are set to the maximum velocity of Scale 2 and 3, respectively. Within these intervals (see Table 3), we sample a uniform distribution of velocities. In doing so, we assume that exit velocity and grain size are independent parameters, which of course is a simplification. More sophisticated fluid dynamic models of the conduit acceleration phase take into account the effect of the carrier flow field on the eruptive mixture more accurately (e.g. de’Michieli Vitturi et al. 2010), but this is beyond the scope of this study.

Table 3 Input parameters for Eject!

The ejection angle depends on the erupting vent: Ruapehu Crater Lake hosts two eruptive vents, a northern and a central one (Christenson 2000). While the central vent is upright, the northern vent is inclined to the north at 45–60° from the vertical (Fig. 1, Kilgour et al. (2010)). To obtain both an ejection and a direction angle, we sample the von Mises-Fisher distribution (e.g. Fisher et al. 1993), a probability distribution on a sphere, using a MATLAB® procedure following instructions in Wenzel (2015). The distribution has two parameters, the mean direction and the concentration parameter κ. For the central vent, the mean direction is upright with κ=5; for the northern vent we use a vector inclined 52.5° (from the vertical) to the north as the mean direction and κ=10, since explosions from this vent tend to be more directed. Figure 5 illustrates the distributions for the two vents. Sampling the von Mises-Fisher distribution delivers unit vectors on the whole sphere, from which we infer both the azimuth and ejection angle. This gives ejection angles between −90° and 90° from the horizontal. Only clasts that make it out of the crater lake are of interest to the hazard assessment for the summit. Furthermore, the total number of ejected clasts that make it onto the summit is needed to calculate the impact density and data for this number are available, while no data are available for clasts that remain in the lake. We therefore only simulate trajectories of clasts that make it out of the crater lake. This requires the ejection angle to be larger than a minimum value, which depends on the vent location (i.e. northern or central) and ejection direction, and lies between 17° and 36°. To truncate the distribution, we test whether each sampled ejection angle lies within the permitted interval; if it does not, the angle is discarded and the distribution is resampled until the angle falls within the desired interval.
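The sampling and truncation procedure can be sketched in Python as follows. This is not the MATLAB procedure used in the study: it relies on the standard inverse-CDF expression for the cosine of the angle to the mean direction, which is valid for the von Mises-Fisher distribution on the sphere in three dimensions. The (east, north, up) coordinate convention, the function names, and the single minimum angle of 17° (the lower end of the reported range, applied to all directions) are assumptions made for illustration.

```python
import math
import random


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalise(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def sample_vmf(mu, kappa, rng):
    """Draw one unit vector from a 3D von Mises-Fisher distribution.

    mu is the unit mean direction in (east, north, up) coordinates.
    """
    # Inverse-CDF sample of w = cos(angle to mu); density proportional to exp(kappa * w).
    u = rng.random()
    w = 1.0 + math.log(u + (1.0 - u) * math.exp(-2.0 * kappa)) / kappa
    w = max(-1.0, min(1.0, w))
    phi = 2.0 * math.pi * rng.random()  # uniform rotation around mu
    # Orthonormal basis (e1, e2) of the plane perpendicular to mu.
    helper = (0.0, 0.0, 1.0) if abs(mu[2]) < 0.9 else (1.0, 0.0, 0.0)
    e1 = _normalise(_cross(helper, mu))
    e2 = _cross(mu, e1)
    s = math.sqrt(1.0 - w * w)
    return tuple(s * math.cos(phi) * e1[i] + s * math.sin(phi) * e2[i] + w * mu[i]
                 for i in range(3))


def sample_ejection(mu, kappa, min_elevation_deg, rng):
    """Resample until the ejection angle (from the horizontal) is large enough."""
    while True:
        x, y, z = sample_vmf(mu, kappa, rng)
        elevation = math.degrees(math.asin(max(-1.0, min(1.0, z))))
        if elevation >= min_elevation_deg:
            azimuth = math.degrees(math.atan2(x, y)) % 360.0  # clockwise from north
            return elevation, azimuth


CENTRAL_MU = (0.0, 0.0, 1.0)                         # upright, used with kappa = 5
_tilt = math.radians(52.5)
NORTHERN_MU = (0.0, math.sin(_tilt), math.cos(_tilt))  # tilted north, kappa = 10

rng = random.Random(0)
samples = [sample_ejection(NORTHERN_MU, 10.0, 17.0, rng) for _ in range(2000)]
```

Rejection sampling keeps the shape of the distribution inside the permitted interval and simply discards the remainder, which matches the truncation described in the text.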

Fig. 5

Illustration of the von Mises-Fisher distribution with a) an upright mean direction (plotted in red) and κ=5, corresponding to a central explosion and b) a northward inclined mean direction (plotted in red) and κ=10, corresponding to an explosion from the northern vent

Step 2: Calculate ballistic trajectories and infer impact distances

Ballistic trajectories accounting for drag effects were calculated by a Runge-Kutta procedure with the program “Eject!” (Mastin 2001; 2011), based on the clast’s size, mass, ejection angle, initial velocity and other input parameters (Table 3). We translated the Visual Basic code to MATLAB® in order to call it automatically and easily combine it with our evaluation scripts. The calculated trajectory is then compared with an interpolated topography function, depending on the random ejection direction of the clast (North, East, South or West), to determine its landing point. Clast-clast interactions as presented in Tsunematsu et al. (2014) and the effects of tailwind are ignored in these simulations.
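To illustrate the type of calculation involved, the following sketch integrates a two-dimensional trajectory with quadratic air drag using a classical fourth-order Runge-Kutta scheme. It is a simplified stand-in and not a port of Eject!: the ground is flat at launch height instead of following Ruapehu's topography, the drag coefficient and air density are constant placeholder values, the cube is assumed to present one face of area d² to the flow, and the clast density of 2200 kg/m³ in the usage lines is a hypothetical value within the reported range.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def _deriv(state, k):
    """Time derivative of (x, z, vx, vz) with quadratic drag factor k."""
    _, _, vx, vz = state
    speed = math.hypot(vx, vz)
    return (vx, vz, -k * speed * vx, -G - k * speed * vz)


def _rk4_step(state, k, dt):
    def shifted(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = _deriv(state, k)
    k2 = _deriv(shifted(state, k1, dt / 2), k)
    k3 = _deriv(shifted(state, k2, dt / 2), k)
    k4 = _deriv(shifted(state, k3, dt), k)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def impact_distance(v0, angle_deg, diameter, clast_density,
                    drag_coeff=1.0, air_density=1.0, dt=0.01):
    """Horizontal distance (m) at which a cube-shaped clast returns to launch height."""
    if v0 <= 0 or angle_deg <= 0:
        raise ValueError("velocity and ejection angle must be positive")
    mass = clast_density * diameter ** 3
    area = diameter ** 2                      # assumed face-on cross-section
    k = 0.5 * air_density * drag_coeff * area / mass
    theta = math.radians(angle_deg)
    state = (0.0, 0.0, v0 * math.cos(theta), v0 * math.sin(theta))
    while True:
        new = _rk4_step(state, k, dt)
        if new[1] < 0.0:
            # Linear interpolation between the last two steps to find the landing point.
            frac = state[1] / (state[1] - new[1])
            return state[0] + frac * (new[0] - state[0])
        state = new


no_drag = impact_distance(100.0, 45.0, 0.2, 2200.0, drag_coeff=0.0)
with_drag = impact_distance(100.0, 45.0, 0.2, 2200.0)
```

Without drag the result should agree with the analytical range \(v_0^2 \sin(2\theta)/g\), which offers a simple check of the integrator; with drag the clast lands closer to the vent.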

Step 3: Derive the probability for one ejected clast to reach a certain area

The topography of the summit and the location and eruption behaviour of the northern vent have a strong direction-dependent influence on impact distances and therefore impact probabilities. To take this into account in the evaluation of results, we divide the areas \(A_n\) into 4 pitch rings \(A_{n-dir}\), with dir being the azimuth (North, East, South or West) (see Fig. 4). For each eruption scale and both vents, we ran 10,000 trajectory simulations that were used to infer the probability \(P_{n-dir}(c)\) of one ejected clast to reach the area \(A_{n-dir}\). We count a success for \(A_{n-dir}\) if a clast’s ejection direction is dir and its impact distance is between radius \(r_n\) and \(r_{n+1}\) (see Fig. 4). The number of successes is then divided by 10,000 to obtain probabilities (conditional to an eruption scale and vent location). Subsequently, we average the probabilities obtained for the northern and central vent, resulting in probabilities depending only on eruption scale and respective area. This implies that eruptions occur from the two vents with an equal probability, reflecting our complete ignorance on whether one of them is more active. Unfortunately, there are no informative data available.
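The counting of successes per area can be sketched as below. The function names are hypothetical, and the assignment of an azimuth to a cardinal direction by 90° sectors centred on North, East, South and West is an assumption; the text only states that four directions are distinguished.

```python
import bisect
from collections import Counter


def quadrant(azimuth_deg):
    """Map an azimuth (degrees clockwise from north) to 'N', 'E', 'S' or 'W'."""
    return "NESW"[int(((azimuth_deg + 45.0) % 360.0) // 90.0)]


def arrival_probabilities(impacts, inner_radii, outer_bound):
    """Fraction of simulated clasts landing in each (ring index, direction) area.

    impacts is a list of (azimuth in degrees, impact distance in metres).
    """
    counts = Counter()
    for azimuth, distance in impacts:
        if distance < inner_radii[0] or distance >= outer_bound:
            continue  # outside the gridded area
        n = bisect.bisect_right(inner_radii, distance) - 1
        counts[(n, quadrant(azimuth))] += 1
    return {area: c / len(impacts) for area, c in counts.items()}


def average_vents(p_north, p_central):
    """Equal-weight average of the probabilities obtained for the two vents."""
    areas = set(p_north) | set(p_central)
    return {a: 0.5 * (p_north.get(a, 0.0) + p_central.get(a, 0.0)) for a in areas}


# Hypothetical impacts for illustration: (azimuth, distance).
impacts = [(0.0, 300.0), (10.0, 300.0), (90.0, 450.0), (180.0, 100.0)]
probs = arrival_probabilities(impacts, [250, 400, 550], 700)
```

Dividing by the total number of simulated clasts, including those that land outside the grid, keeps the result a probability per ejected clast, as in the text.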

Step 4: Derive a probability density function for impact densities

The procedure described above provides the probability for an ejected clast to reach a certain area on the summit. We are, however, interested in the total number of ballistic impacts per area, and this depends on the number of clasts that are ejected onto the summit. The more clasts are ejected, the higher the expected impact densities. From the derived probabilities, one can calculate the ballistic impact density \(F_{n-dir}\) for area \(A_{n-dir}\) with

$$ F_{n-dir}=\frac{P_{n-dir}(c)\, m}{\mathrm{Area}(A_{n-dir})} = \frac{P_{n-dir}(c)\, m}{0.25\, \pi\, (r_{n+1}^{2}-r_{n}^{2})}, \tag{1} $$

with m being the number of erupted clasts that make it out of the crater. There are few data available for m in phreatic eruptions. The only comprehensive study, presented by Fitzgerald et al. (2014), estimates 15,000 ejected clasts (outside of the crater) during the Upper Te Maari eruptions, Tongariro, in 2012. We sample a uniform distribution between 10,000 and 20,000, to obtain 10,000 random values for m and derive a mean value and standard deviation for impact densities in the different areas. Counting how often the derived impact density overcomes a certain threshold in the simulations and dividing by 10,000 gives the prior probabilities used as input in nodes 7&8 in PyBetVH.
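Equation (1) and the sampling of m can be sketched as follows. The function names are hypothetical, the clast number is drawn as an integer (an assumption), and the probability of 0.1 in the usage line is an illustrative value rather than a simulation result.

```python
import math
import random


def impact_density(p_clast, m, r_in, r_out):
    """Eq. (1): impact density (clasts/m^2) in a quarter ring between r_in and r_out."""
    area = 0.25 * math.pi * (r_out ** 2 - r_in ** 2)
    return p_clast * m / area


def exceedance_probability(p_clast, r_in, r_out, threshold, rng,
                           n_samples=10_000, m_min=10_000, m_max=20_000):
    """Fraction of uniformly sampled m for which the impact density exceeds threshold."""
    hits = 0
    for _ in range(n_samples):
        m = rng.randint(m_min, m_max)  # number of clasts ejected out of the crater
        if impact_density(p_clast, m, r_in, r_out) > threshold:
            hits += 1
    return hits / n_samples


rng = random.Random(1)
f_example = impact_density(0.1, 15_000, 250, 400)   # hypothetical p_clast = 0.1
p_exceed = exceedance_probability(0.1, 250, 400, 0.01, rng)
```

Because F is linear in m, the sampled densities for a given area span a factor of two between the smallest and largest m, so the exceedance probability changes from 0 to 1 over a correspondingly narrow range of thresholds.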

Results part two: probabilistic hazard assessment for Ruapehu

Results from Eject! simulations

From the Eject! simulations, we obtained the probabilities \(P_{n-dir}(c)\) for a clast reaching an area \(A_{n-dir}\), given the eruption of a certain scale. These probabilities vary with distance from the crater as well as the cardinal direction, which is shown for distances up to 1 km away from the lake centre in Figs. 6, 7 and 8. The variation with ejection direction is due to the varying topography of the summit and the influence of the northern vent that produces directed blasts. From these probabilities, we calculated the impact density in the respective areas (Figs. 6, 7 and 8), which can be multiplied by 10 m² to obtain the average number of clasts affecting a person. The hazard is substantially greater on the northern side of the lake. For example, the expected impact density on the northern shore during an ongoing eruption of Scale 3 is about 0.03 clasts/m² when one is located less than 400 m from the lake centre, and 0.025 clasts/m² when one is located less than 550 m from the lake centre - compared to just about 0.01 and 0.001 clasts/m², respectively, on the shores in other directions. Comprehensive results from the simulations, including all calculated mean impact densities, their standard deviation and the probabilities for the ballistic impacts in areas of interest to overcome intensity thresholds, can be found in the Additional file 1 to this paper.

Fig. 6

Results of ballistic simulations, shown on the radial grid they were evaluated on (\(A_{n-dir}\)): Probabilities conditional to an eruption of Scale 2 of a clast to reach a certain area and derived impact density (mean value of calculations based on random numbers of ejected clasts between 10,000 and 20,000). The areas are defined by their minimum radial distance \(r_n\) from the vent

Fig. 7

Results of ballistic simulations, shown on the radial grid they were evaluated on (\(A_{n-dir}\)): Probabilities conditional to an eruption of Scale 3 of a clast to reach a certain area and derived impact density (mean value of calculations based on random numbers of ejected clasts between 10,000 and 20,000). The areas are defined by their minimum radial distance \(r_n\) from the vent

Fig. 8

Results of ballistic simulations, shown on the radial grid they were evaluated on (\(A_{n-dir}\)): Probabilities conditional to an eruption of Scale 4 of a clast to reach a certain area and derived impact density (mean value of calculations based on random numbers of ejected clasts between 10,000 and 20,000). The areas are defined by their minimum radial distance \(r_n\) from the vent

PyBetVH results

Using PyBetVH, we combined the information from the phreatic eruption dataset with prior knowledge and the Eject! simulation results. Since we have a high number of recorded eruptions, the probability calculated by PyBetVH for an eruption of a certain scale occurring within the next month is very similar to the frequency directly calculated from the database: the probability for an eruption occurring within the next month is 10.8%, and the conditional probabilities for the eruption being of Scale 1, 2, 3 or 4 are 48.0, 30.4, 14.0 and 7.6%, respectively.

The final products from PyBetVH are the monthly probability and hazard maps for ballistic impact densities. Figure 9 shows hazard maps for the ballistic impact density at a probability of 1% conditional to an eruption of Scale 2, 3 and 4, respectively. Most significant is the extended hazardous area on the northern shore for eruptions of Scale 4, where the impact density is 0.01 clasts/m² up to 2 km away from the vent. The maps also indicate infrastructure on the summit, which marks the likely whereabouts of people and can be damaged by ballistics. The location of the dome shelter is within the most hazardous area for Scale 3 and 4 eruptions; ballistics of a Scale 4 eruption can even reach ski lifts on Ruapehu’s northern flank. Figure 10 shows the absolute hazard map for ballistic impact density at a probability of 1%, which is up to 0.01 clasts/m² everywhere up to 400 m away from the lake centre, i.e. about 150 m from the lake shore.

Fig. 9

Hazard map with indicated infrastructure for the ballistic impact density at a probability of 1%, conditional on an eruption

Fig. 10

Absolute hazard map with indicated infrastructure for the ballistic impact density at a probability of 1%, a: average, b, c: 10th and 90th percentile, respectively

Figure 11 shows the probabilities for areas to experience a ballistic impact density F>0.01 clasts/m² conditional on an eruption of Scale 2, 3 and 4, respectively. This impact density corresponds to one clast per 100 m², or a person being hit by 0.1 clasts on average, so the chance of being hit is about 10%. For Scale 2, this probability is 1 on the northern and southern lake shore up to 400 m from the lake centre, and between 0.1 and 0.6 on the eastern and western shore, decreasing with distance. For Scale 3 eruptions, the probability is 1 up to 600 m away from the centre on the northern shore, while the other shores show probabilities between 0.125 and 0.75, again decreasing with distance from the vent. Probabilities directly at the lake shore for Scale 4 eruptions are generally slightly lower; however, the area with probabilities larger than 0.1 extends significantly farther to the north, up to 2 km away from the vent. Figure 12 shows the absolute probability for areas to be affected by a ballistic impact density F>0.001 clasts/m² within the next month, which is larger than 1% up to 900 m away from the lake centre on the northern shore, and up to 500 m away on the other shores. It increases to about 6% closer to the lake.
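The step from impact density to a chance of being hit can be sketched as follows. The hazardous area of 10 m² per person is an assumption: it is the value implied by the statement that 0.01 clasts/m² corresponds to 0.1 clasts per person, and it is not quoted directly in this section. Treating impacts as a Poisson process is a further assumption, made here only to show that the expected number of hits and the probability of at least one hit nearly coincide at low densities.

```python
import math

# ASSUMPTION: hazardous area per person (m^2), inferred from the text's
# "0.01 clasts/m^2 -> 0.1 clasts per person"; not a value stated in this section.
HAZARDOUS_AREA_M2 = 10.0


def expected_hits(impact_density, hazardous_area_m2=HAZARDOUS_AREA_M2):
    """Mean number of clasts landing within one person's hazardous area."""
    return impact_density * hazardous_area_m2


def hit_probability(impact_density, hazardous_area_m2=HAZARDOUS_AREA_M2):
    """Probability of at least one hit, assuming Poisson-distributed impacts."""
    return 1.0 - math.exp(-expected_hits(impact_density, hazardous_area_m2))


for density in (0.001, 0.01):
    print(
        f"F = {density} clasts/m^2: "
        f"expected hits = {expected_hits(density):.3f}, "
        f"P(at least one hit) = {hit_probability(density):.3f}"
    )
```

For small expected hit numbers the two quantities are almost identical, which is why the simple multiplication gives a reasonable estimate at the densities discussed here.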

Fig. 11

Probability for areas to experience a ballistic impact density of 0.01 clasts/m², conditional on an eruption

Fig. 12

Absolute probability for areas to be affected by a ballistic impact density F>0.001 clasts/m² within the next month, a: average, b, c: 10th and 90th percentile, respectively

Discussion

Limitations of the presented methodology

The presented results for Ruapehu are subject to several uncertainties. First, the dataset is very likely incomplete, especially for eruptions of Scale 1 or 2. Additional errors originate from the lack of robust discrimination between magmatic and phreatic explosions - not all events have been studied in sufficient detail with regard to the amount and origin of juvenile material. The possible misinterpretation of terms used in the catalogue can also lead to the false inclusion or exclusion of an event in the phreatic database.

The most substantial errors in the statistical assessment of ballistic hazards stem from our neglect of wind effects when calculating the ballistic trajectories. Strong winds can occur on Ruapehu and significantly influence ballistic flight paths, especially those of smaller clasts. We focused here on clasts larger than 20 cm to ensure that the effects of wind and the eruption column are of second order. Smaller clasts are very likely affected even by weak winds and by the eruption column, but they can still cause severe injury when they hit a person and should be included in future studies.

We furthermore invoke several simplifications, such as an arbitrary radius of reduced drag and an ejection velocity independent of the grain size. More sophisticated numerical models, e.g. as presented in de’Michieli Vitturi et al. (2010), take processes during the acceleration phase in the conduit into account and more accurately simulate the effect of the carrier flow field on ballistic trajectories. Future studies should make use of these more advanced models.

We also assume that the grain size distribution of ballistic projectiles can be represented by three distinct clast diameters weighted according to a power law, which is likely an oversimplification. However, to date, no comprehensive total grain size distribution of ballistic deposits has been determined. An important error source in our calculations of expected impact densities is the uncertainty related to the number of ejected clasts m. Only one comprehensive study on a hydrothermal eruption provides an estimate of this number (Fitzgerald et al. 2014). Additionally, since m inevitably depends on eruption dynamics and crater topography, using a different volcano as an analogue is problematic. However, no better data are available at the moment, and our results have to be viewed with this caveat in mind.

The hazard maps depend to some extent on the chosen areal grid resolution, since the evaluation of simulation results necessarily involves averaging the discontinuous ballistic impacts over areas of interest on the summit. The finer the grid, the more accurate the spatial resolution of the expected impact density on the summit. To test this influence, we reduced the areas A_ndir by adding another radius halfway between each pair of previously defined r_n and recalculated impact densities from the ballistic simulations. This affected results in the most proximal ring (from the lake shore up to 150 m radial distance), where the mean impact densities in the halved ring deviated significantly from those calculated in the original, larger ring. However, the resulting impact density still correctly reflects the averaging of ballistic impacts and adequately gives the number of expected clasts in the defined areas. For hazard assessment purposes, it is sufficient to know the average number of expected ballistics in the defined areas.
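The grid-resolution test described above can be sketched as follows. The ring radii, the number of directional sectors, and the synthetic landing positions are all hypothetical placeholders; they stand in for the grid and the Eject! output, which are not specified in this section.

```python
import bisect
import math
import random


def refine_radii(radii):
    """Insert one extra radius halfway between each pair of neighbouring radii."""
    refined = []
    for inner, outer in zip(radii, radii[1:]):
        refined += [inner, 0.5 * (inner + outer)]
    refined.append(radii[-1])
    return refined


def impact_density(impacts, radii, n_sectors=4):
    """Return {(ring, sector): clasts per m^2} for impacts given as (r, azimuth_rad)."""
    counts = {}
    sector_width = 2 * math.pi / n_sectors
    for r, azimuth in impacts:
        ring = bisect.bisect_right(radii, r) - 1
        if ring < 0 or ring >= len(radii) - 1:
            continue  # impact lies outside the evaluated grid
        sector = min(int((azimuth % (2 * math.pi)) / sector_width), n_sectors - 1)
        counts[(ring, sector)] = counts.get((ring, sector), 0) + 1
    density = {}
    for (ring, sector), n in counts.items():
        # Area of one annular sector between radii[ring] and radii[ring + 1].
        area = math.pi * (radii[ring + 1] ** 2 - radii[ring] ** 2) / n_sectors
        density[(ring, sector)] = n / area
    return density


# HYPOTHETICAL grid (metres from lake centre) and synthetic landing positions.
radii = [250.0, 400.0, 600.0, 900.0, 1400.0, 2000.0]
rng = random.Random(0)
impacts = [
    (250.0 + rng.expovariate(1 / 300.0), rng.uniform(0.0, 2 * math.pi))
    for _ in range(15000)
]

coarse = impact_density(impacts, radii)
fine = impact_density(impacts, refine_radii(radii))
print("proximal ring, sector 0 (coarse):", coarse.get((0, 0)))
print("inner half of that ring (fine):  ", fine.get((0, 0)))
print("outer half of that ring (fine):  ", fine.get((1, 0)))
```

Because the density is an average over each cell, halving a ring in which the impact count falls off quickly with distance yields two values that bracket the coarse-ring average.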

Comparison of simulation results with the 2007 phreatic eruption

To assess the validity of our ballistic simulations, we compare simulated impact densities for Scale 4 eruptions with the values observed for the 2007 phreatic eruption, as reported in Kilgour et al. (2010). This northward-directed blast produced a ballistic apron on the northern lake shore up to a distance of 2 km, with an almost homogeneous impact density of 0.014 clasts/m². The impact density calculated from simulations using random ejection angles corresponding to a northern vent eruption is negligible on the other shores, and lies between 0.009 and 0.02 clasts/m² on the northern shore up to a distance of 850 m from the lake, which agrees reasonably well with the observations. However, at larger distances, the simulations predict significantly lower impact densities (<0.005 clasts/m²) than observed. According to the isopleth maps in Kilgour et al. (2010), the distal ballistic field consists almost entirely of small clasts. While we have not modelled the trajectories of clasts <20 cm, the reported impact density takes the smaller clasts into account. Southerly winds were strong during the 2007 eruption and could therefore have transported the smaller clasts, which are most affected by wind, to the distal field. Hence, the neglect of wind and the omission of clasts smaller than 20 cm in our simulations are probably the main causes of the discrepancy in the distal impact densities. Different ejection velocities could be another contributing factor: if smaller ballistic clasts are ejected at higher velocities, they are transported to larger distances, which is not accounted for in our simulations. However, while they cannot reproduce the finer-grained, distal ballistic field of an eruption in windy conditions, our simulations likely produce valid results for the coarser-grained part of the ballistic field.

Hazard implications

Previous studies on ballistic hazards have deterministically calculated the maximum distance reached by ballistics in different eruption scenarios, in order to define safety zones around a volcano (e.g. Alatorre-Ibargüengoitia and Delgado-Granados 2006; Alatorre-Ibargüengoitia et al. 2016; Konstantinou 2015). Probabilistic approaches for ballistic hazard assessment have been proposed only recently (Fitzgerald et al. 2014; Biass et al. 2016). They provide a more accurate and detailed assessment of the hazard due to ballistic projectiles and are therefore of greater use for long-term risk-reduction strategies on volcanoes. When producing hazard maps, the chosen measure of hazard intensity is an important factor. While we followed Fitzgerald et al. (2014) in using impact density as a proxy for hazard intensity, Biass et al. (2016) focused on the kinetic energy on impact, which is a commonly used intensity measure for ballistics (e.g. Tsunematsu et al. 2016). Hail storms are a natural hazard that is somewhat comparable to volcanic ballistics, in the sense that the threat consists of discontinuous impacts over an area. While hail storms are classified based mostly on grain size and impact energy, it has long been acknowledged that other factors such as the number of hailstones, which translates into an impact density, also affect the severity of hail hazards (Webb et al. 2009). Impact energy is the more relevant hazard proxy for studies regarding the vulnerability of built infrastructure, because only projectiles of sufficient energy are capable of damaging buildings and penetrating roofs. However, the only building in the proximity of Ruapehu’s crater lake is the dome shelter, a mountain refuge, and only the most distal ballistics pose a threat to some ski lifts. Similar to the study of Fitzgerald et al. (2014), our main interest is the threat to unprotected people on the summit, which is frequented by hikers and skiers. Any encounter with a ballistic projectile is a health hazard and potentially lethal for a person. Therefore, impact density is the more relevant hazard proxy in this case, as it directly translates into the probability of being hit and injured by a projectile via multiplication with the hazardous area per clast, as outlined in Fitzgerald et al. (2014).

We can derive a few important implications for eruption forecasting and hazard assessment at Ruapehu volcano in particular, and possibly at similar systems prone to phreatic eruptions. Based on our catalogue, magmatic phases at Ruapehu are preceded by an increase in phreatic activity, with up to a 10-fold increase in explosion frequency. However, the duration of phreatic activity phases and the rate of phreatic events are not clearly correlated with the timing of magmatic events. No statistically significant relationship between phreatic eruption scale and duration of relative quiescence between two phreatic phases can be inferred from our data. Interestingly, Phillipson et al. (2013) also found no statistically significant relationship between the length of quiet periods and the duration of following unrest periods at volcanoes. However, the phreatic events tend to cluster, so one can expect more phreatic explosions to follow once the system is reactivated. Since shorter IET are generally more likely than longer periods of quiescence, this also suggests that a bigger phreatic event is more likely within a few years after a phreatic explosion than during a prolonged quiet phase of the volcano. It remains for future research to determine whether phreatic eruption catalogues at other systems - with and without crater lakes - show similar statistics and to determine the physical reason behind the similarities.

The overall probability of a phreatic eruption occurring at Ruapehu within the next month is about 10% - although in the current state of the volcano, the probability seems to be lower (about 3%). The absolute probability for areas within 1 km from the lake centre to be affected by a ballistic impact density larger than 0.001 clasts/m² within the next month is up to 6%. Conditional on a phreatic eruption, some areas on the summit have a probability of 100% of experiencing an impact density larger than 0.01 clasts/m² - the stronger the eruption, the larger these areas. Following estimations in Fitzgerald et al. (2014), this impact density implies that a person standing in these summit areas has roughly a 10% chance of being hit by a ballistic large enough to cause injury. To put these values into perspective, impact densities during the lethal 2014 Mount Ontake eruption exceeded 0.4 clasts/m² up to 750 m distance from the erupting vent, and were between 0.1 and 0.4 clasts/m² in the zone up to 850 m distance (Kaneko et al. 2016). These values translate to a 100% chance of being hit by a ballistic, explaining the large number of casualties during this event. Based on the results of our simulations and data from the 2007 eruption, the threat of injury by ballistic projectiles during a phreatic eruption on Ruapehu seems to be an order of magnitude lower than on Ontake. However, it is still a non-negligible danger for people on the summit.

Generally, the hazard is higher on the northern shore of the lake - which is important, because the dome shelter on the summit is built on that side of the lake and the presence of people is therefore more likely in that area. While ski lifts are only threatened by ballistics from a Scale 4 eruption, with an expected impact density of less than 0.006 clasts/m² during an eruption, the dome shelter lies within the most hazardous zone for Scale 3 and 4 eruptions, where the expected impact density is larger than 0.01 clasts/m² (Fig. 9). There is an absolute probability of more than 2% that the shelter is affected by ballistics within the next month (Fig. 12). However, our hazard and probability maps show that the hazard due to ballistic ejecta on the eastern and western lake shore is mostly confined to a ring of a few hundred metres around the crater lake. If a precursory signal for a phreatic explosion is identified, even a short warning time could therefore be sufficient to relocate to safer areas.

In this study, we provide a probabilistic hazard assessment for ballistic impacts. Extending our work to a risk assessment for Ruapehu’s summit would require an additional vulnerability and exposure analysis that needs to be combined with the hazard assessment. For example, the average number of people on the summit per month is an important parameter in determining the probability of casualties on Ruapehu. It is likely that there is a seasonality in this risk, because more hikers visit the summit during the summer months. The time spent near the lake, on the other hand, is the dominant factor in determining the risk for an individual to be affected by a ballistic clast. Our study focused on a hazard analysis with proxies relevant for human injuries. In order to study the risk for built infrastructure, such as the mountain refuge or the ski lifts, it would be necessary to (i) use a different hazard proxy, i.e. impact energy, and (ii) determine the energy thresholds necessary to damage this infrastructure, as has been done for example in Jenkins et al. (2014) and Pomonis et al. (1999). Our hazard and probability maps provide a first and important step towards comprehensive and quantitative risk analyses.

Possible reasons for clustering of phreatic explosions and potential use for eruption forecasting

Ruapehu Crater Lake shows thermal cycling, with temperatures varying irregularly between 10 and 60 °C, and increased phreatic as well as phreatomagmatic activity typically coincides with periods of higher lake temperatures (e.g. Christenson and Wood 1993). The elevated heat flow through the hydrothermal system indicates a more shallow-seated magma source that could enhance eruptive activity. Since 1960, when regular monitoring began, only three eruptions have been observed during periods of decreased heat flow. These have been interpreted to result from pressure build-up in a sealed hydrothermal system (e.g. Christenson et al. 2010).

There is no obvious correlation between the cumulative eruption number and the observed lake temperatures (Fig. 13). To further test whether there is a link between lake temperature and phreatic eruptions, we define temperature bins in 5 °C intervals and compare the relative frequency of pre-phreatic eruption temperatures to a baseline dataset (Fig. 14). There is a wide range of pre-eruptive temperatures, but the mean pre-eruption temperature is significantly higher than the mean recorded lake temperature, and more than half of eruptions follow lake temperatures higher than 40 °C. This is in line with findings of previous studies, e.g. Hurst (1981). While there might be some sampling bias, this supports the theory that phreatic explosions are more frequent during periods of a more shallow-seated magma, which is consistent with an increased frequency prior to (or indeed within) magmatic phases. The differing heat input from the magma column at depth is a possible reason for the clustering of phreatic explosions.
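The binning comparison described above can be sketched as follows. The temperature values are hypothetical placeholders rather than Ruapehu measurements, and the helper name is illustrative; only the 5 °C bin width is taken from the text.

```python
from collections import Counter


def relative_frequencies(temperatures, bin_width=5.0):
    """Map each bin's lower edge (deg C) to the fraction of values in that bin."""
    counts = Counter(bin_width * (t // bin_width) for t in temperatures)
    total = len(temperatures)
    return {edge: n / total for edge, n in sorted(counts.items())}


def mean(values):
    return sum(values) / len(values)


# HYPOTHETICAL lake temperatures in deg C (not the monitoring record).
baseline = [12, 18, 22, 25, 27, 31, 33, 36, 38, 41, 44, 52]
pre_eruption = [14, 29, 41, 43, 47, 55]

print("baseline bins:     ", relative_frequencies(baseline))
print("pre-eruption bins: ", relative_frequencies(pre_eruption))
print("mean baseline:     ", round(mean(baseline), 1))
print("mean pre-eruption: ", round(mean(pre_eruption), 1))
print(
    "fraction of eruptions following T > 40 deg C:",
    sum(t > 40 for t in pre_eruption) / len(pre_eruption),
)
```

Comparing the two sets of relative frequencies bin by bin shows whether eruptions are over-represented at particular lake temperatures relative to how often those temperatures occur.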

Fig. 13

Cumulative eruption number in comparison with recorded lake temperatures

Fig. 14

Relative frequencies of observed lake temperature in comparison to pre-eruptive lake temperatures. Temperatures were measured irregularly. The baseline dataset includes a temperature for each day where data are available. The pre-eruption dataset includes the final temperature measured before each phreatic eruption as long as the measurement was less than 2 weeks before the eruption

The possible correlation between lake temperature and eruption frequency is important information for volcano monitoring and eruption forecasting purposes. An increase in lake temperatures, which are now continuously monitored at Ruapehu, can be a warning sign of an impending eruption. To provide a more quantitative estimate, Fig. 15a shows the frequency of eruption occurrence within 2 months after a temperature measurement was performed (this time window was chosen because temperature measurements were only made on a roughly monthly basis in the past). Eruption frequencies (ef) of 0.3 for lake water temperatures below 15 °C are significantly higher than those for the temperature range between 15 and 30 °C (ef<0.2). However, increased frequencies of between 0.4 and 0.8 are also evident for lake temperatures higher than 40 °C. This indicates that the eruption frequencies follow a bimodal distribution that most likely stems from the two mechanisms leading to phreatic eruptions: (i) increased heat flow due to more shallow-seated magma, and (ii) pressure build-up in a sealed hydrothermal system. The distributions for different eruption scales are shown in Fig. 15b. Distributions for eruptions of Scales 1 and 3 only show the increased eruption frequency for hotter lake temperatures, while the distribution for Scale 2 eruptions indicates bimodality. Although its significance is questionable due to few data points, the distribution for Scale 4 eruptions suggests a higher eruption frequency for colder lakes. In summary, these graphs show that lake temperatures both significantly above and below the long-term background level can indicate an increased eruption probability.
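The eruption frequency per temperature bin can be computed along the following lines. The dates, temperatures and bin edges are hypothetical, the 2-month window is approximated as 61 days, and an eruption on the day of the measurement is counted as following it; these choices are assumptions of this sketch, not details given in the text.

```python
from datetime import date, timedelta


def eruption_frequency_by_bin(measurements, eruption_dates, bin_edges, window_days=61):
    """Fraction of temperature measurements in each bin that were followed
    by an eruption within `window_days`. Empty bins yield None."""
    n_bins = len(bin_edges) - 1
    followed = [0] * n_bins
    total = [0] * n_bins
    window = timedelta(days=window_days)
    for day, temperature in measurements:
        for i in range(n_bins):
            if bin_edges[i] <= temperature < bin_edges[i + 1]:
                total[i] += 1
                if any(timedelta(0) <= e - day <= window for e in eruption_dates):
                    followed[i] += 1
                break
    return [f / t if t else None for f, t in zip(followed, total)]


# HYPOTHETICAL measurements (date, deg C) and eruption dates.
measurements = [
    (date(2000, 1, 1), 12.0),
    (date(2000, 3, 15), 20.0),
    (date(2000, 6, 1), 45.0),
    (date(2000, 9, 1), 42.0),
]
eruptions = [date(2000, 1, 20), date(2000, 6, 30)]
bin_edges = [10, 15, 30, 40, 60]

print(eruption_frequency_by_bin(measurements, eruptions, bin_edges))
```

With few measurements per bin, as in the coldest bin discussed in the caption of Fig. 15, a single event can dominate the estimated frequency, so bin counts should be reported alongside the frequencies.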

Fig. 15

Frequency of eruption occurrence within 2 months after a temperature measurement, versus the observed lake temperature, based on data from 1940 to 2012. (a) For all eruption sizes and (b) depending on eruption size. Note that one data point has been excluded from the analysis: only one temperature measurement exists that showed a lake colder than 10 °C, and this was followed by an eruption of size 4 within 2 months. Including this one data point would skew the distribution towards an eruption frequency of 100% after a temperature measurement of less than 10 °C, which we discarded due to questionable significance

Conclusions and outlook

Our results shed light on the underlying statistics of phreatic eruptions at Ruapehu and possibly elsewhere. The most important conclusions for Ruapehu are:

1. Phreatic events do not follow a Poisson process but tend to cluster, which is possibly due to increased heat input and hence increased activity in periods of a more shallow-seated magma column.

2. Larger events are most likely to follow other events after a short IET, rather than to occur during a longer period of quiescence.

3. Magmatic episodes are likely preceded by an increase in the number of phreatic events.

4. The probability of a phreatic eruption occurring at Ruapehu Crater Lake within the next month is about 10% and, conditional on a phreatic eruption, large areas are likely to experience an impact density sufficient to threaten people present on the summit.

5. At any time, there is a non-negligible probability for the summit area to be affected by ballistics. The hazard is especially high on the northern shore, where infrastructure exists.

6. If the crater lake is significantly warmer or colder than the background temperature, the probability of an impending eruption is increased.

To improve our understanding of recurrence statistics, future research should test whether the data in the discussed phreatic eruption catalogue can be described with a Cox-Hawkes process, a point-process model that describes clustered events. Future work concerning the ballistic hazard on Ruapehu’s summit needs to consider the effect of wind on ballistic trajectories by incorporating information on the statistical distribution of wind velocities and directions. To further improve the hazard assessment, other hazards from phreatic eruptions should be included in the investigations, most importantly lahars and base surges. Finally, the accuracy of ballistic trajectory simulations and resulting hazard assessments will benefit significantly from a better understanding of processes during ballistic ejection, as well as from field-based ground truthing of poorly constrained parameters such as impact densities and grain size distributions.

This study highlights the necessity to better understand steam-driven explosions, which often occur with little or no detectable precursors or advance warning. Statistical approaches developed in this study should support local hazard assessment at Ruapehu and provide a possible framework for similar studies at other volcanoes.

Appendix

Relationship between IET and eruption size

Figure 16 shows IET versus the scale of the eruption preceding or following the quiet interval, respectively. In both cases, no clear relationship is visible, and linear regression analysis finds no significant linear fit to the data: the p-value of the F-test for a linear relationship between scale and IET is 0.76 (preceding scale) and 0.84 (following scale), respectively, so the null hypothesis of no relationship cannot be rejected at the 0.05 significance level - meaning that there is no statistically significant linear relationship. From visual inspection, it seems that only eruptions of small to medium scale precede or follow a longer IET, while all scales occur for shorter IET. However, this is very likely a pure sampling effect: generally, longer IET are less likely, as are larger eruption scales. In combination, this leads to few large eruptions corresponding to a long IET. Linear regression analysis shows that the hypothesis of no linear relationship between the maximum IET and corresponding eruption scale cannot be rejected at the 0.05 significance level (p-values of 0.42 and 0.22, respectively). We also tested for linear relationships between eruption scales and log(IET), but again, the linear models are not significant at the 0.05 significance level (p-values are >0.05).
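A test of this kind can be sketched without a statistics library. Note that the sketch below swaps the F-test used in the text for a permutation test on the Pearson correlation: in simple linear regression the F-test for the slope is equivalent to testing for zero correlation, and the permutation version avoids the need for the F distribution. The scale and IET values are hypothetical, not catalogue data.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-sided p-value for the null hypothesis of no linear relationship,
    estimated by shuffling y and recomputing |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# HYPOTHETICAL eruption scales and following inter-event times (days).
scales = [1, 1, 2, 1, 3, 2, 4, 1, 2, 3, 1, 2]
iet_days = [30, 410, 12, 95, 60, 700, 45, 8, 150, 20, 365, 33]

p = permutation_p_value(scales, iet_days)
print(f"r = {pearson_r(scales, iet_days):.3f}, permutation p-value = {p:.3f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

The same function can be applied to log-transformed IET values to mirror the second test mentioned above.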

Fig. 16

IET versus scale of the eruption preceding (a) and following (b) the corresponding quiescence. Also shown is a linear regression model for both cases, which, however, is not significant