# BET_VH: a probabilistic tool for long-term volcanic hazard assessment

- First Online:

- Received:
- Accepted:

## Abstract

In this paper, we illustrate a Bayesian Event Tree to estimate Volcanic Hazard (BET_VH). The procedure enables us to calculate the probability of any kind of long-term hazardous event for which we are interested, accounting for the intrinsic stochastic nature of volcanic eruptions and our limited knowledge regarding related processes. For the input, the code incorporates results from numerical models simulating the impact of hazardous volcanic phenomena on an area and data from the eruptive history. For the output, the code provides a wide and exhaustive set of spatiotemporal probabilities of different events; these probabilities are estimated by means of a Bayesian approach that allows all uncertainties to be properly accounted for. The code is able to deal with many eruptive settings simultaneously, weighting each with its own probability of occurrence. In a companion paper, we give a detailed example of application of this tool to the Campi Flegrei caldera, in order to estimate the hazard from tephra fall.

### Keywords

Volcanic hazard Probabilistic assessment Bayesian event tree## Introduction

Volcanic hazard studies have a prominent impact on society and volcanology itself, being an area where the “rubber hits the road”, that is, where science is applied to an important societal problem. Despite its importance, volcanic hazard assessment is still commonly presented in many different ways, ranging from maps of past deposits of the volcano to more quantitative probabilistic assessment (e.g., Scandone et al. 1993; Newhall and Hoblitt 2002; Marzocchi et al. 2004; Martin et al. 2004; Neri et al. 2008; Marti et al. 2008). The latter, being quantitative, has remarkable advantages: (1) it allows comparisons among different volcanoes and with other natural and nonnatural hazards, (2) its reliability can be tested through statistical procedures, and (3) it provides a basic component for rationale decision making (e.g., Marzocchi and Woo 2007, 2009; Woo 2008).

In order to unambiguously distinguish the quantitative approach from others which are more qualitative, it has been suggested to call it *probabilistic volcanic hazard assessment* (PVHA; see Marzocchi et al. 2007). The term “probabilistic” means that the extreme complexity, nonlinearities, limited knowledge, and the large number of degrees of freedom of a volcanic system result in difficult, if not impossible, deterministic prediction of the evolution of volcanic processes (see, e.g., Marzocchi 1996; Sparks 2003). In other words, volcanic systems are stochastic, and hazardous volcanic phenomena involve so many uncertainties that a probabilistic approach is needed.

That being said, we note that full PVHA is still quite rare (Magill et al. 2006; Ho et al. 2006; Neri et al. 2008; Marti et al. 2008 are among the few exceptions). Generally, hazard assessment represents, at best, a conditional probability of one specific hazard conditioned to the occurrence of one specific event (for instance, the most likely event; Cioni et al. 2003; Macedonio et al. 2008), or it is focused on one specific aspect of volcanic hazard, such as vent opening (Martin et al. 2004; Jaquet et al. 2008; Selva et al. 2010). In other cases, as mentioned above, hazard assessment merely consists of maps of volcanic deposits of past events. By contrasts, a full PVHA requires the assessment of the impact of hazardous phenomena associated with every possible “eruptive setting” (ES) and eventually the merging of all ESs, each of them weighted with its own probability of occurrence. Hereafter, with the terms “eruptive setting”, we mean the occurrence of an eruption of a specific size or type from a specific vent.

Another basic feature of PVHA is that it needs to account for all relevant sources of uncertainty. Indeed, the great importance of PVHA is due to its practical implications for society. In this perspective, it is fundamental that PVHA is “accurate” (i.e., without significant biases) because a biased estimation would be useless in practice. On the other hand, PVHA may have a low “precision” (i.e., a large uncertainty) that would reflect limited knowledge of some physical processes involved, from the preparation of an impending eruption to the derived impact on the surrounding area.

In the following sections, we describe the structure of the event tree for BET_VH, and the basic rules to estimate the probability distributions at each node. Finally, we show how a full PVHA is achieved, combining the probabilities of each node. A tutorial application of these concepts is outlined in the companion paper (Selva et al. 2010), where a PVHA for ashfall at Campi, Flegrei, Italy, is reported.

## The Bayesian Event Tree scheme for long-term volcanic hazard

Node 1–2–3: there is an eruption, or not, in the time interval (

*t*_{0},*t*_{0}+*τ*], where*t*_{0}is the present time and*τ*is the forecasting time window. This node condenses the probabilities of nodes 1 (unrest), 2 (presence of magma given an unrest), and 3 (eruption given a magmatic unrest) in BET_EF by Marzocchi et al. (2008) as regards the nonmonitoring part.Node 4: the eruptive vent will open in a specific location, provided there is an eruption.

Node 5: the eruption will be of a certain size or type, provided that there is an eruption in a given location.

Node 6: a particular hazardous phenomenon will be generated, or not, given that an eruption of a specific size or type occurs. Several hazardous phenomena can be uploaded in BET_VH at a time; volcanic hazard due to different phenomena can be juxtaposed and compared.

Node 7: a selected area around the volcano will be reached, provided the occurrence of a specific hazardous phenomenon produced by an eruption of given size or type and location.

Node 8: a specific intensity threshold will be overcome, or not, provided that the area selected at node 7 is reached by a specific hazardous phenomenon generated by an eruption with a given size or type and location.

Note that since the definition of event tree is mainly driven by its practical utility, the branches at each node point to the whole set of different possible events, regardless of their probabilistic features. In other words, the events at each node need not be mutually exclusive (see upper part of Fig. 1). This makes the combination of nodes a little bit more complicated, but it keeps a more logical and comprehensible structure (see Marzocchi et al. 2004, 2006, 2008 for more details).

In the following section, we describe how the probability distribution at each node is assigned. We will try to make use of the same symbols and terminology already published in Marzocchi et al. (2008) and in the Electronic supplementary material of that paper.

## Estimating the probability distribution at each node

BET_VH focuses on long-term PVHA only. Because of this choice, the information to be used is related to geological and/or physical models and past data. We do not account for monitoring measures. The latter, in fact, are related to short-term variations in the state of the volcano. This is one of the basic differences with BET_EF by Marzocchi et al. (2008), and it implies a simpler formalism for the probability computation at the nodes of the event tree in common with BET_EF. On the other hand, BET_VH has a more complex structure than BET_EF because it accounts also for the impact of hazardous phenomena on the territory; this leads to a more complex dependence on the selected path, i.e., on the events selected at the previous nodes. In order to keep the notation as simple as possible, hereinafter, we set specific indexes for specific selected outcomes at the different nodes: *i* indicates the vent location, *j* the eruption size or type, *p* the hazardous phenomenon, *k* the area, and *s* the threshold related to the phenomenon *p*. Despite the description of the following nodes should be self-consistent, we recommend to refer to Marzocchi et al. (2008) for more detailed description of the statistical distributions used (Beta and Dirichlet distributions).

### Node 1–2–3

*eruption*or

*no eruption*, in the time interval (

*t*

_{0},

*t*

_{0}+

*τ*]. A suitable time window for long-term hazard purposes is

*τ*= 1 year. The long-term probability of eruption is estimated similarly to Marzocchi et al. (2008) for the nonmonitoring probability. Here, BET_VH condenses the first three nodes of that tree in one. In particular, BET_VH assigns a prior Beta distribution for the two possible outcomes:

*α*

_{1}and

*β*

_{1}are determined from the average Θ

_{1}(i.e., the best guess for the long-term probability of eruption) and from the equivalent number of data, Λ

_{1}, by inverting the system of equations (see Marzocchi et al. 2008 for more details):

_{1}comes from models and/or a priori considerations, Λ

_{1}is set according to the degree of confidence that the user puts on the estimate of the best guess. The parameter Λ

_{i}(here the index

*i*is relative to the node considered) is a friendly measure of the confidence on the prior distribution, or, in other terms, of the epistemic uncertainty. In general, the higher the Λ

_{i}, the larger our confidence on the reliability of the model, so that the number of past data needed to modify significantly the prior must be larger. On the contrary, if we believe that the prior is poorly informative (i.e., our a priori information is very scarce), Λ

_{i}must be small, so that even a small number of past data can drastically modify the prior (for more information see Marzocchi et al. 2008).

*y*

_{1 − 2 − 3}represents the number of past eruptions having occurred and

*n*

_{1 − 2 − 3}represents the total number of not overlapping time windows of length

*τ*(for example, years) available in the catalog, starting without unrest or eruption (see Electronic supplementary material in Marzocchi et al. 2008).

### Node 4

At this node, BET_VH estimates the spatial probability of vent opening, given that an eruption occurs. The basic assumption here is that only one vent at a time will erupt. Because of this, this node has *I*_{4} possible and mutually exclusive outcomes, corresponding to the number of possible vent locations defined by the user.

*I*

_{4}vent locations:

*i*=1,...,

*I*

_{4}) is determined (as in Marzocchi et al. 2008) on the basis of \(\Theta_{4}^{(i)}\) (i.e., the expected value, or best guess, of the long-term probability of eruption in vent location

*i*) and of Λ

_{4}(i.e., the equivalent number of data), set similarly to node 1–2–3. The parameters \(\alpha_{4}^{(i)}\) are obtained by inverting the system of equations (see Marzocchi et al. 2008 for more details):

*i*.

### Node 5

Here, we examine the probabilities related to the size or type of the eruption. The magnitude can be represented either by the type of the eruption (explosive, effusive, phreatomagmatic, and so on), or by the size (e.g., VEI), or by groups of types or sizes that for the user’s purposes are considered homogeneous (e.g., VEI ≥4). In the following, we use the term *size class*, meaning anyone of these parametrizations. Thus, the number of possible outcomes at this node is *J*_{5}, corresponding to the number of possible size classes defined.

At this node, we estimate the probability of a specific size class of the eruption, given its occurrence and given that the vent opens in a specific location. Here, we propose a substantial improvement over Marzocchi et al. (2008) because in BET_VH (and in the version 2.0 of BET_EF that can be downloaded from the website: http://www.bo.ingv.it/bet), we allow the probability of each size class to vary as a function of the vent location. In this way, it is possible to account for potential differences among vent locations. For example, for a certain volcano, we might want to assign a higher probability of a hydromagmatic type of eruption if the vent is located under shallow water than on land. Similarly, with this improvement, we can take into account flank instability in case the vent opens in a radial sector of a central, steep flank volcano rather than in a central crater. Note that *J*_{5} does not depend on vent location since only the probabilities of the specific size classes do.

*J*

_{5}size classes, BET_VH assigns a prior Dirichlet distribution for the size classes at the

*i*-th vent location:

*j*=1,...

*J*

_{5}) is determined (as for node 4, Eq. 5) on the basis of \(\Theta_{5;i}^{(j)}\) (the best guess of the long-term probability of an eruption having the

*j*-th size class in the

*i*-th location) and Λ

_{5;i}(the equivalent number of data for the

*i*-th location). These parameters are set similarly to node 1–2–3. Note that the parameter Λ

_{5;i}can differ from vent locations since we may be more confident on the best guess related to a vent location than to another.

*i*-th vent location into the posterior through:

*j*occurred in vent location

*i*.

### Node 6

*p*, this node has two possible outcomes:

*generation*or

*no generation*of the phenomenon from an eruption of size class

*j*. The generation of a specific phenomenon depends only on the size class, and not on the vent location. The prior distribution is set, similarly to node 1–2–3, as

*j*eruption) and of \(\Lambda_{6;j}^{(p)}\) (i.e., the equivalent number of data). The parameters \(\Theta_{6;j}^{(p)}\) and \(\Lambda_{6;j}^{(p)}\) are set similarly to node 1–2–3. Note that the apex

*p*on

*θ*is necessary because we have different (and independent) distributions depending on the phenomenon considered.

*j*that generated the phenomenon

*p*and

*n*

_{6;j}represents the total number of past eruptions of size class

*j*available in the catalog.

### Nodes 7 and 8

These two nodes have a similar structure. First, the surrounding of the volcano is divided into a number (*K*_{7}) of areas (not necessarily equal and equally spaced). For each area *k*, at node 7, we have two possible outcomes: *area k is reached* or *area k is not reached* by the hazardous phenomenon *p* generated by an eruption of size class *j* and location *i*. At node 8, the two outcomes are: *the selected threshold s** is overcome* or *the selected threshold s** is not overcome*, considering an area *k* reached by the hazardous phenomenon *p*, generated from an eruption of size class *j* and location *i*. Both probabilities are assumed to be homogeneous all over the area *k*.

*α*

_{*}and

*β*

_{*}are determined (as for node 1–2–3, Eq. 2) by the correspondent Θ

_{*}(i.e., the best guess) and Λ

_{*}(i.e., the equivalent number of data) as for the previous nodes. The use of the Beta distribution for these nodes implies that each area is independent from the others; this assumption requires some additional comments. Let us consider node 7; obviously, the probability that a specific area will be hit by a hazardous phenomenon is somehow correlated to the probability on adjacent cells. This sort of dependency is accounted for, in the prior distribution, by the parameters Θ (the best guesses) for every area that are set by a model. In other words, a reliable model will produce probability estimations that are strongly spatially correlated (see the companion paper Selva et al. 2010), and therefore, also the Θ values for adjacent areas will be correlated. The characteristics of such a correlation are set by the physics of the model used. The use of a Beta distribution for each area implies some sort of independence also among adjacent areas. In particular, the average probability of hitting an area will be very similar to the average probability of hitting an adjacent area, but each single probability value is sampled from the Beta distribution independently from the sampled values in adjacent areas. The rationale of this choice is mostly technical because it greatly simplifies the computations of the code; nevertheless, this choice does not introduce any bias in the final assessment because the BET_VH code never combines the probabilities of different areas.

*y*

_{*}represents the number of successes (reaching area

*k*for node 7, overcoming threshold

*s*for node 8) and

*n*

_{*}represents the total number of past data (past eruptions of size class

*j*and vent location

*i*generating the hazardous phenomenon

*p*for node 7 and number of such eruptions in which it reached area

*k*for node 8).

## Estimating PVHA

In the previous chapters, we have described the general features of the probability distributions at each node. Their combination allows a full and complete PVHA to be determined. In order to accomplish that, we have to still explore in detail three issues: (1) how model output can be used to set prior distributions for the nodes described before, (2) how to combine the conditional probability at each node to get absolute probabilities, and (3) how to account for different ESs. The following three subsections are devoted to describe these three issues. The last subsection reports the kind of outputs provided by the code.

### Numerical models to define prior distribution

As we have seen so far, the setup of prior probability distributions is mainly based on models, through the best guess Θ_{*} and the number of equivalent data Λ_{*} that represents a sort of confidence on our best guess (see Marzocchi et al. 2008). For the prior distribution of nodes 1 to 5, we refer to the estimation of the nonmonitoring part in Marzocchi et al. (2008). For nodes 6, 7, and 8, we can use results from numerical models that are available for most of the hazardous phenomena related to volcanic eruptions (e.g., Favalli et al. 2005 for lava flows; Neri et al. 2007 for pyroclastic density currents; Pfeiffer et al. 2005 and Costa et al. 2006 for ash dispersion). For a more detailed discussion about this point, see (Selva et al. 2010). Here, we highlight the basic philosophy behind models’ usage in PVHA.

- 1.
Intrinsic stochasticity of the process (the so-called

*aleatory*uncertainty) - 2.
*Epistemic*uncertainty in the model parameters and in the boundary conditions at the time and during the eruption (e.g., for an ash fall model, the uncertainties in wind conditions) - 3.
Epistemic uncertainty in the (volcanological) input parameters (e.g., for an ash fall model, the uncertainties in the relevant eruption parameters given a specific eruption size class)

- 4.
Any model is always a simplification of the reality, leading to unavoidable uncertainties into the forecasting.

If different models can be used, BET_VH may use the output of each one of them recursively. In the companion paper (Selva et al. 2010), we propose a general scheme to introduce models’ results from a large number of runs and from more than one model, presenting also a practical application for tephra fall hazard estimation around the Campi Flegrei caldera.

### Combination of the probabilities to obtain PVHA

The probability distributions at each node are conditional on the selection of a well-defined path (see Section “The Bayesian Event Tree scheme for long-term volcanic hazard”). In general, PVHA, as well as the evaluation of probability of each event for which we may be interested, requires their combination. With BET_VH, it is possible to compute the probability associated to a single ES (an eruption of size class and location specified) or to a combination of possible ESs. In the latter case, at nodes 4 and 5, we can select more than one branch at the same time (i.e., a set \(\mathcal{J}\) of *J* possible eruption size classes and/or a set \(\mathcal{I}\) containing *I* possible vent locations; see also Fig. 1b for a snapshot of the main window of BET_VH code, where it is possible to see that more than one location and/or size class can be selected). This feature of the code is quite remarkable because these new combined probabilities are usually very important for practical purposes and for a full PVHA (Selva et al. 2010). In the following, we will carefully describe how to obtain meaningful probabilities for ES combinations.

#### Absolute probability

*Φ*of an outcome at node

*m*, we multiply the probabilities along all the selected path, from node 1–2–3 to the selected outcome at node

*m*(see Marzocchi et al. 2008). For example, the long-term absolute probability [

*Φ*

_{a}] of an area

*k*being impacted by a specific hazardous phenomenon

*p*overcoming the selected threshold

*s*, generated by an eruption of size class

*j*occurring in location

*i*(i.e., a single ES), in any time window of duration

*τ*in the future, is

*Φ*

_{b}] of an area

*k*being impacted by the same phenomenon

*p*overcoming the selected threshold

*s*, associated to a combination of possible ESs (see above) in any time window of duration

*τ*in the future, is

#### Conditional probability

Beyond the absolute probabilities, in many practical applications, some conditional probabilities are particularly important and useful, like the conditional probabilities that can be obtained by the combination of different ESs (see Selva et al. 2010). To this purpose, the code BET_VH gives the possibility to average, with proper weights, the conditional probabilities for different size classes, and/or locations.

*φ*]) starting from the conditional probability distributions at each node. In order to distinguish clearly these new conditional probabilities from the conditional probabilities of the nodes, we decide to indicate them with the symbol

*φ*instead of

*θ*. Let us consider all the possibilities offered by the code. For node 5 (size classes), the probability \([\theta_{5;i}^{(j)}]\) is by definition the conditional probability of an eruption of size class

*j*, given the occurrence of an eruption at a specific location

*i*. If a set \(\mathcal{I}\) of

*I*possible vent locations is considered, then the conditional probability [

*φ*

_{a}] of an eruption of size class

*j*occurring in a location belonging to the set \(\mathcal{I}\), given an eruption occurs, becomes

*φ*

_{ES}] (given an eruption occurs) as:

*φ*

_{ES}] represents the probability of occurrence of the selected single or combined ES, given an eruption occurs. This probability is particularly important because it gives the “weight ” to every possible ES selected; this is necessary either to compare or to merge them (see Selva et al. 2010, for more details).

- At node 6, the conditional probability [
*φ*_{b}] of a specific phenomenon*p*, given the occurrence of an eruption within the selected combination of ESs is$$ \label{eq:multscenarios_cond6} [\phi_b] =\frac {\sum_{i \in \mathcal{I}} \left( \left[\theta_{4}^{(i)}\right] \sum_{j \in \mathcal{J}} \left[\theta_{5,i}^{(j)}\right] \left[\theta_{6,j}^{(p)}\right] \right)} {\phi_{\rm ES}} $$(19) - At node 7, the conditional probability [
*φ*_{c}] of a specific area*k*being reached by the phenomenon*p*, given the occurrence of an eruption within the selected combination of ESs is$$ \label{eq:multscenarios_cond7} [\phi_c]=\frac {\sum_{i \in \mathcal{I}} \left( \left[\theta_{4}^{(i)}\right] \sum_{j \in \mathcal{J}} \left[\theta_{5,i}^{(j)}\right] \left[\theta_{6,j}^{(p)}\right] \left[\theta_{7,i,j,p}^{(k)}\right] \right)} {\phi_{\rm ES}} $$(20) - At node 8, the conditional probability [
*φ*_{d}] of overcoming the selected threshold*s*in a specific area*k*reached by the phenomenon*p*, given the occurrence of an eruption within the selected combination of ESs is$$ \label{eq:multscenarios_cond8} [\phi_d]=\frac {\sum_{i \in \mathcal{I}} \left( \left[\theta_{4}^{(i)}\right]\! \sum_{j \in \mathcal{J}} \! \left[\theta_{5,i}^{(j)}\right] \!\left[\theta_{6,j}^{(p)}\right]\! \left[\theta_{7,i,j,p}^{(k)}\right]\! \left[\theta_{8,i,j,p,k}^{(s)}\right] \right)} {\phi_{\rm ES}} $$(21)

*i*-th vent location,

*j*-th size class), \(\mathcal{I}=\{ i \}\) and \(\mathcal{J}=\{ j \}\).

### PVHA output maps

Spatial probability of vent opening (conditional on eruption occurrence), which is related to the same probability of the susceptibility map proposed by Felpeto et al. (2007); see Fig. 2 for a snapshot example from the code

Absolute spatial probability of a specific size class or a specific phenomenon (in the code, it is called

*absolute map*); this map shows, for each area, the absolute probability of a specific eruption size class, or the absolute occurrence probability of a specific phenomenon; see Fig. 3 for a snapshot example from the code.ES conditional probability (conditional on eruption occurrence): this map (called

*sizes map*in the code) displays, for each area (or group of areas), the probability of a specific eruption size class, weighted with the spatial probability of vent opening, conditional on eruption occurrence in that area (or group of areas). In practice, if one class is selected, the map displays [*φ*_{a}] (see Eq. 17); if more size classes are selected, the map shows the sum of [*φ*_{a}] for all size classes belonging to \({j \in \mathcal{J}}\). Figure 4 displays a snapshot example of the code.Absolute or conditional probability of the phenomenon impacting the territory, in terms of “reaching” proximal or distal areas (node 7; [

*φ*_{c}]), or in terms of overcoming a specific threshold (node 8; [*φ*_{d}]) in proximal or distal areas; this is called*outcome map*in the code; see Fig. 5 for a snapshot example from the code

The probabilities reported in each map are represented as distributions. The average value represents the “best guess” of such a probability; the dispersion around it gives the uncertainty about this guess.

## Discussion and final remarks

It provides PVHA in a user-friendly and transparent way (it is not a black box).

It estimates almost all probabilities useful for hazard and risk applications. In particular, it calculates and visualizes ES maps, as well as weighted combinations of all possible ESs.

It is based on the Bayesian approach; this enables us to take into account different sources of information, such as models output, field data, and relevant geological and historical information. Moreover, the Bayesian approach allows aleatory and epistemic uncertainties to be estimated and visualized.

The outputs are provided in different formats (maps in GoogleEarth, GIS, gif formats), in order to make easy the use of the results.

In the companion paper (Selva et al. 2010), an extensive application of BET_VH to ash hall hazard at Campi Flegrei, Italy, is reported.

## Acknowledgements

Laura Sandri wishes to thank Alex Garcia-Aristizabal for his help with the figures of this paper. We also thank Thea Hincks whose suggestions help us to improve the manuscript.

**Open Access**

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.