Introduction

Unlike some hazardous events that lack clear precursors, magmatic eruptions are usually preceded by multiple precursors over time scales of minutes to decades, which may facilitate forecasting the evolution of a volcanic system over a wide range of forecasting time windows. Such forecasting is an essential scientific tool to assist managers in setting up rational and defensible risk reduction actions, such as the evacuation of endangered areas before an eruption (e.g., Marzocchi and Woo 2007; Woo 2008; Wild et al. 2022). Eruption forecasts on time windows of decades or more are mostly based on the history of the volcano, for instance, the past frequency of eruptions (e.g., Mendoza-Rosas and de la Cruz-Reyna 2008; Bebbington 2014; Bevilacqua et al. 2016; Selva et al. 2022); these probabilities can be successfully rescaled to much shorter time windows if the volcano is in a quiet state. Conversely, during a phase of unrest, the monitoring of physical quantities related to the unrest processes may become more informative than the past frequency of eruptions (Sparks 2003; Marzocchi et al. 2008; Poland and Anderson 2020). In essence, the short-term forecasting of the evolution of a period of volcanic unrest is dominantly driven by the information provided by monitoring anomalies, i.e., by the occurrence of one or more concomitant monitoring signals outside the background range. If these anomalies indicate the imminent occurrence of an eruption, they are usually called “precursors.” The most common anomalies that provide information on the unrest are related to seismic activity and ground deformation, but important information can also be obtained from geochemical and other kinds of geophysical signals. To summarize, lacking plausible and exhaustive physical models that unequivocally and deterministically track the evolution of the system, volcanologists must rely on the detection of anomalies to track volcanic unrest.

The most obvious approach to translate anomalies into probabilities of a specific event of interest (E hereafter) is the frequentist approach, i.e., calculating from the past data of the volcano or from analog volcanoes the rate of occurrence \(\gamma\) of the target event E, the false alarm rate \(\delta\), and the fraction of time occupied by predictions, ω (Marzocchi and Bebbington 2012). For example, in the most common event tree formulations, the event E can be (i) the presence of a magmatic intrusion driving the unrest (i.e., the presence of magma given an unrest); (ii) the occurrence of a magmatic eruption in a given time window \(\tau\); (iii) the occurrence of a magmatic eruption in \(\tau\) conditional upon the presence of a magmatic unrest; (iv) the occurrence of a phreatic explosion in \(\tau\); or (v) the termination of the unrest in \(\tau\); other definitions of E are of course possible. Unfortunately, rich and complete unrest databases do not yet exist for most volcanoes, even though major efforts have been under development for decades (Costa et al. 2019). The lack of such specific databases and the constant improvement in high-quality monitoring procedures make the use of expert judgment, to a certain extent, unavoidable.

The most common structured experts’ judgment exercises are based on directly eliciting the probability of one or more specific events (e.g., Neri et al. 2008; Aspinall and Cooke 2013; Hincks et al. 2014), weighting the different experts according to their calibration against seed questions (Cooke 1991). This approach is not particularly suitable for tracking the evolution of a volcanic system that is rapidly evolving with time (see Cronin 2008 for the Ruaumoko exercise at Auckland Volcanic Field). In these cases, it may be convenient to set up procedures (during a quiescent period) to identify monitoring anomalies that anticipate one specific event of interest, i.e., to establish a direct link between anomalies and probability. Clearly, different experts’ judgment procedures should lead to similar results if properly implemented, because all of them aim at describing what a volcanological community thinks about pre-eruptive processes (Lindsay et al. 2010).

Volcanologists already strongly rely on subjective interpretation of monitoring anomalies, usually in terms of alert levels (see, for instance, the alert levels at Vesuvius: http://www.protezionecivile.gov.it/attivita-rischi/rischio-vulcanico/attivita/piano-emergenza-vesuvio#livelli_allerta, Rosi et al. 2022), but their definition and the practical attribution of alert levels during an unrest are vague and not clearly rooted in a scientific domain (Marzocchi et al. 2012, 2021a, b; Papale 2017). A more structured and transparent probabilistic procedure is essential to quantitatively forecast the evolution of a volcanic unrest using experts’ judgment. This approach has significant advantages: the use of probability facilitates the establishment of transparent and quantitative decision-making protocols, clarifies the roles and responsibilities of the different experts, and is a formidable educational and communication tool for both society and individual scientists.

To date, several efforts have been devoted to translating the observation of one or more precursors into a probabilistic assessment using formal expert judgment directly, such as in the setup of a Bayesian Belief Network (BBN; e.g., Aspinall et al. 2003; Aspinall and Woo 2014; Hincks et al. 2014; Christophersen et al. 2018). In most applications of the Bayesian Event Tree scheme (BET; Marzocchi et al. 2004, 2008), experts are elicited regarding which monitoring anomalies, and at what level, best characterize a specific pre-eruptive phase (e.g., a magmatic intrusion or an impending magmatic eruption; Lindsay et al. 2010; Selva et al. 2012; Scott et al. 2022). Then, the observation of one or more of these anomalies is automatically transformed into probabilities through a formalized subjective procedure (Selva et al. 2014), which is described by an exponential learning curve with an associated uncertainty that mimics the “confidence” in the probabilistic assessment (the meaning of the word “confidence” here is taken from the IPCC report; IPCC 2013).

In this paper, we aim to improve, modify, clarify, and generalize the approach described above in several ways. First, we root the procedure in an entropy-based framework, in which the observation of anomalies is information about E that can be translated into entropy and then into the probability of E. Second, we modify the functional form of the learning curve, giving a more formal interpretation of the weight associated with each anomaly. Third, we embed the procedure in a formal unified probabilistic framework (Marzocchi and Jordan 2014) that provides a clear and unambiguous link between the probability distribution and the different kinds of uncertainty that affect the assessment. Notably, the proposed model is very general: it can be implemented in a new revision of the BET method, used as a stand-alone model to forecast eruptive activity, or employed to set up the conditional probabilities in other models such as the BBN.

Monitoring observations are information

In a generic volcano monitoring system, volcanologists record a set of measurements \(\left\{{a}_{1},{a}_{2},\dots ,{a}_{L}\right\}\) at a generic time t, where L is the number of monitoring observations. The observations can be continuous or discrete variables (for example, CO2 flux and number of earthquakes, respectively), taken over different time windows, e.g., an instantaneous measure (the magnitude of an earthquake that has just occurred) or a cumulative observation over a time window (e.g., the number of earthquakes in the last month).

The information provided by these measurements about one specific event E is given by the fact that anomalous values are, or are not, observed. This can be quantitatively defined through a degree of anomaly. For example, if the observed measurement \({a}_{i}\) is anomalous with respect to a background state of the volcano, it can indicate the presence of a magmatic intrusion or the occurrence of an impending eruption. In mathematical terms, the set of observations \(\left\{{a}_{1},{a}_{2},\dots ,{a}_{L}\right\}\) is transformed into a set of N anomalies \(\left\{{z}_{1},{z}_{2},\dots ,{z}_{N}\right\}\), where \({z}_{i}\) is a continuous number between 0 and 1 that defines the degree of anomaly of each observation: 1 (0) if it is (not) anomalous, while values in between represent an intermediate degree of anomaly, reflecting the uncertainty in defining the actual anomaly. We stress that \(N\ge L\) (the number of anomalies is equal to or larger than the number of monitored parameters), as one anomaly can also be defined by the simultaneous observation of different parameters, or a single observation may be connected to different types of anomalies (e.g., number of seismic events and maximum magnitude per day); if some of the monitoring anomalies alone are not considered informative about E, they can be neglected by setting a weight equal to zero (see below).

Marzocchi et al. (2008) use a fuzzy logic scheme for mapping \({a}_{i}\) into \({z}_{i}\). For \(i=1,\dots ,L\), when an anomaly is characterized by high values of the monitoring parameter,

$$z_i=\left\{\begin{array}{cl}1 & \mathrm{if}\;\; a_i\ge a_{max}\\ \frac{1}{2}\left\{\sin\left[\pi\frac{a_i-a_{min}}{a_{max}-a_{min}}-\frac{\pi}{2}\right]+1\right\} & \mathrm{if}\;\; a_{min}<a_i<a_{max}\\ 0 & \mathrm{if}\;\; a_i\le a_{min}\end{array}\right.$$
(1a)

or when it is characterized by low values

$$z_i=\left\{\begin{array}{cl}0 & \mathrm{if}\;\; a_i\ge a_{max}\\ \frac{1}{2}\left\{\sin\left[\pi\frac{a_i-a_{min}}{a_{max}-a_{min}}+\frac{\pi}{2}\right]+1\right\} & \mathrm{if}\;\; a_{min}<a_i<a_{max}\\ 1 & \mathrm{if}\;\; a_i\le a_{min}\end{array}\right.$$
(1b)

where \({a}_{min}\) and \({a}_{max}\) are, respectively, the minimum and maximum values of the monitoring variable, which define when the parameter is certainly anomalous or certainly not anomalous (see Fig. 1).

Fig. 1

Variation of the degree of anomaly z as a function of the monitoring variable \(a\): upper panel from Eq. (1a); lower panel from Eq. (1b)
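A minimal sketch of the fuzzy mapping of Eqs. (1a) and (1b), written in Python with purely illustrative thresholds (the parameter names are hypothetical), is:

```python
import numpy as np

def degree_of_anomaly(a, a_min, a_max, high_is_anomalous=True):
    """Fuzzy mapping of a monitoring observation a into a degree of anomaly
    z in [0, 1], following Eq. (1a) (high values anomalous) or Eq. (1b)."""
    if high_is_anomalous:                          # Eq. (1a)
        if a >= a_max:
            return 1.0
        if a <= a_min:
            return 0.0
        return 0.5 * (np.sin(np.pi * (a - a_min) / (a_max - a_min) - np.pi / 2) + 1.0)
    else:                                          # Eq. (1b)
        if a >= a_max:
            return 0.0
        if a <= a_min:
            return 1.0
        return 0.5 * (np.sin(np.pi * (a - a_min) / (a_max - a_min) + np.pi / 2) + 1.0)

# Illustrative thresholds, e.g., for a daily number of earthquakes
print(degree_of_anomaly(5, a_min=0, a_max=10))   # 0.5: halfway between thresholds
print(degree_of_anomaly(12, a_min=0, a_max=10))  # 1.0: certainly anomalous
```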

In some cases, the information is given by the simultaneous observation of different parameters, i.e., an anomaly can also be defined as

$${z}_{i}=\prod_{k=1}^{{K}_{i}}{z}_{k}$$
(1c)

where \(L<i\le N\) and \(K_{i}\) is the size of the subset of single monitoring anomalies that are combined (\({K}_{i}\le L\)).

The monitoring information is then summarized by the cumulative degree of anomaly (hereafter, anomaly score) observed at time t, which is given by

$$Z\left(t\right)={\sum }_{i=1}^{N}{{\omega }_{i}z}_{i}$$
(2)

where \({\omega }_{i}\) is the un-normalized weight of the ith anomaly that will be described in subsection “Weighting the anomalies.”
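To fix ideas, a short sketch of the anomaly score of Eqs. (1c) and (2), with purely illustrative degrees of anomaly and weights, could be:

```python
import numpy as np

# Illustrative degrees of anomaly for two single parameters (e.g., seismicity
# and ground deformation) and one anomaly defined by their joint observation.
z_single = np.array([1.0, 0.4])
z_joint = np.prod(z_single)              # Eq. (1c): product of the single z's
z = np.append(z_single, z_joint)

omega = np.array([1.0, 0.5, 0.8])        # illustrative un-normalized weights

Z = np.sum(omega * z)                    # Eq. (2): anomaly score
print(Z)                                 # 1.0*1.0 + 0.5*0.4 + 0.8*0.4 = 1.52
```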

Information, entropy, and probability

In the previous section, we condensed all the information that the monitoring observations provide about E into a single value \(Z\left(t\right)\), the anomaly score observed at t. Information can be related to the probability of E through the physical concept of entropy.

Although entropy has been defined in different ways depending on the physical framework of interest, such as thermodynamics, information theory, and quantum mechanics, all of these definitions can be reduced more or less directly to the following equation:

$$H=-{\sum }_{i=1}^{M}{P}_{i}{\text{ln}}\left({P}_{i}\right)$$
(3)

where H is the entropy of a system composed of a finite set of equally probable microstates that can be aggregated into a small number M of macrostates, or outcomes of the process, each with probability \({P}_{i}\) (Shannon 1948). In our specific case, we usually have two outcomes (M = 2) that we name “event” (E) and “no event” (\(\overline{E}\)), characterized by probabilities \({P}_{1}={P}_{E}\) and \({P}_{2}={1-P}_{E}\), respectively.

In the information-theory view, entropy is ignorance, i.e., we are most unsure about the evolution of the system when the two outcomes are equally likely (in our case, \({P}_{1}={P}_{E}=0.5\), or maximum ignorance about the evolution of the system), which corresponds to the maximum of Eq. (3). The minimum of H (H = 0) is achieved when we are sure about the evolution of the system, either towards a “no event” (\({P}_{E}=0\)) or towards an “event” (\({P}_{E}=1\)). When the two outcomes have quite different societal impacts, e.g., when E is a volcanic eruption, it is convenient to consider the two outcomes of Eq. (3) separately. The term \(-{\text{ln}}({P}_{i})\) is called the entropy score (Daley and Vere-Jones 2004) and is particularly important because it is a measure of the unpredictability of the particular outcome i; i.e., \(-{\text{ln}}({P}_{E})\) is a measure of the unpredictability of E: when E is perfectly predictable, the entropy score \(-{\text{ln}}({P}_{E})\) is zero, whereas when E has a very low probability the entropy score goes towards infinity. It is worth noting that the expected value of this score (i.e., the mean value) is given by Eq. (3) (assuming that we are using the true probabilities).
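As a simple numerical check of Eq. (3) and of the entropy score for the two-outcome case, consider the following sketch:

```python
import numpy as np

def entropy(p_e):
    """Shannon entropy (Eq. 3) for the two outcomes E and not-E (0 < p_e < 1)."""
    p = np.array([p_e, 1.0 - p_e])
    return -np.sum(p * np.log(p))

for p_e in (0.5, 0.9, 0.99):
    print(p_e, round(entropy(p_e), 3), round(-np.log(p_e), 3))
# p_e = 0.5 gives the maximum entropy ln(2) ~ 0.693; as p_e approaches 1,
# the entropy score -ln(p_e) of E approaches 0, i.e., E becomes predictable.
```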

Following the definition of the entropy score as a measure of the unpredictability of the outcome, we may call the unpredictability of \(\overline{E}\), \(-{\text{ln}}\left(1-{P}_{E}\right)\), the predictability of E: the predictability goes towards infinity when \({P}_{E}\) approaches 1, i.e., when we become certain about E, and it goes towards zero when E becomes more and more unlikely. In our case, the predictability of E is a function of the observed anomalies in favor of E, whose information is condensed in Z. Under this interpretation, the simplest way in which we may use the monitoring information to linearly modify the predictability of E as a function of Z is

$$-{\text{ln}}\left(1-{P}_{E}\right)=k+Z$$
(4)

Note that Eq. (4) is slightly different from the two-parameter equation used in Marzocchi et al. (2008; see Eq. (29) in their Appendix B); we will discuss the differences in subsection “Weighting the anomalies.” The parameter k sets a lower bound on the predictability when the monitoring network shows no anomalies; that is, k accounts for potentially informative anomalies that the present monitoring system is not able to detect, and/or for the intrinsic unpredictability of the evolution of the system (even for a perfectly monitored system, an unrest can start during the time window \(\tau\)). Hence, the parameter k can be set, to first order, according to the quality of the monitoring network; the more developed the monitoring network, the smaller k is expected to be.

From Eq. (4), we can establish how the probability \({P}_{E}\) evolves as a function of the anomalies detected.

$${P}_{E}=1-{\text{exp}}\left[-\left(k+Z\right)\right]=1-a~ {\text{exp}}\left(-Z\right)$$
(5)

Equation (5), where \(a\equiv {\text{exp}}\left(-k\right)\), looks like a simple learning curve with some additional interesting features. First, the relationship is monotonically increasing, so that the larger \(Z\), the larger the probability \({P}_{E}\). Second, the relationship implies that the largest increase in probability occurs as soon as one of the monitoring variables shows some degree of anomaly; as more monitored variables become anomalous, the probability continues to rise, but more slowly, i.e., moving from zero to one anomaly is much more important than moving from three to four anomalies. Third, Eq. (5) states that the probability of E depends only on the anomaly score Z. Despite its simplicity, this approach is very flexible; for example, if E represents the occurrence of an eruption, the model can handle quite different conceptual physical models, envisioning either the existence of a single pre-eruptive pattern (a sort of silver-bullet approach) or different pre-eruptive scenarios, both of which can be tracked through \(Z\left(t\right)\).
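The diminishing effect of additional anomalies can be seen in the following sketch of Eq. (5), where k = 0.1 and unit weights are arbitrary choices made only for illustration:

```python
import numpy as np

def prob_event(Z, k=0.1):
    """Probability of the event E as a function of the anomaly score Z (Eq. 5)."""
    return 1.0 - np.exp(-(k + Z))

# With unit weights, Z simply counts fully developed anomalies.
for n_anomalies in range(5):
    print(n_anomalies, round(prob_event(float(n_anomalies)), 3))
# 0 -> 0.095, 1 -> 0.667, 2 -> 0.878, 3 -> 0.955, 4 -> 0.983:
# the first anomaly produces by far the largest jump in probability.
```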

Weighting the anomalies

The definition of the weights \({\omega }_{i}\) in Eq. (2) is of paramount importance to determine \({P}_{E}\). In this subsection, we discuss the implications of choosing different weighting schemes for the monitoring anomalies.

Conceptually, the weights of Eq. (2), \({\omega }_{i}\), have to be constrained considering their relative and absolute importance. Specifically, the ratio of two weights must reflect the relative importance of one anomaly with respect to the other; conversely, the absolute importance of one anomaly considers whether or not its weight depends on how many anomalies are considered (N in Eq. 2). To make this point clear, we consider two cases: in the first case (case A), \({\omega }_{i}\) is independent of how many other anomalies, N, are considered; in the second case (case B), the \({\omega }_{i}\) are normalized, i.e., \({\sum }_{i}{\omega }_{i}=1\), hence \({\omega }_{i}\) depends on N. In case A, we are assuming that each monitoring anomaly yields an absolute amount of information about E that does not depend on how many other anomalies are or are not observed. In case B, we are assuming that the absolute importance of one anomaly changes depending on how many other anomalies are considered.

For example, let us assume that one anomaly i brings twice as much information as another anomaly j. In this case, \(\frac{{\omega }_{i}}{{\omega }_{j}}=2\), but the absolute values of the weights are left unconstrained; they could be 10 and 5, 2 and 1, or 0.01 and 0.005. The absolute importance of each anomaly bounds them to specific values. For example, in case A, we may say that the full anomaly for parameter i (\({z}_{i}=1\)) implies that E is almost certain (say, \({P}_{E}=0.95\)); hence, from Eq. (5), we get \({\omega }_{i}=-\left[k+{\text{ln}}\left(0.05\right)\right]\approx 3-k\), and \({\omega }_{j}=\left(3-k\right)/2\). In case B, the weight \({\omega }_{i}\) has to account for the fact that we are measuring N monitoring parameters, and even the observation of no anomalies counts. Here, we use case A, i.e., each anomaly (composed of one parameter or a combination of parameters) has its own physical meaning, regardless of how many parameters are measured.
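A sketch of the case-A calibration just described, assuming an illustrative value k = 0.1, is:

```python
import numpy as np

def weight_for_target_prob(p_target, k):
    """Case A: weight omega_i such that a fully developed anomaly (z_i = 1,
    all other z equal to 0) yields P_E = p_target through Eq. (5)."""
    return -np.log(1.0 - p_target) - k

k = 0.1                                    # illustrative value of k
omega_i = weight_for_target_prob(0.95, k)  # -ln(0.05) - k, i.e., about 2.9
omega_j = omega_i / 2.0                    # anomaly j is half as informative
print(round(omega_i, 2), round(omega_j, 2))
```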

Notably, the approach defined here is different from that discussed by Marzocchi et al. (2008), where the translation between Z and probability was made considering two parameters, a and b. To relate the two approaches, Eq. (2) can be re-written using normalized weights \({\widetilde{\omega }}_{i}\):

$$Z={\sum }_{i=1}^{N}{{\omega }_{i}z}_{i}=b{\sum }_{i=1}^{N}{{\widetilde{\omega }}_{i}z}_{i}$$
(6)

where \(b={\sum }_{i}{\omega }_{i}\). This equation gives a clear interpretation of the parameter b used in Marzocchi et al. (2008): the chosen value of b represents the normalization factor (not necessarily equal to 1) that was attributed to the weights \({\omega }_{i}\).
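The equivalence expressed by Eq. (6) can be verified with a few lines of code (the weights and degrees of anomaly are, again, purely illustrative):

```python
import numpy as np

omega = np.array([1.0, 0.5, 0.8])        # un-normalized weights
z = np.array([1.0, 0.4, 0.4])            # degrees of anomaly

b = omega.sum()                          # b = sum of the un-normalized weights
omega_tilde = omega / b                  # normalized weights, summing to 1

Z_direct = np.sum(omega * z)             # left-hand side of Eq. (6)
Z_scaled = b * np.sum(omega_tilde * z)   # right-hand side of Eq. (6)
print(np.isclose(Z_direct, Z_scaled))    # True: the two forms coincide
```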

A complete description of uncertainties in eruption probability: a taxonomy of uncertainties

Equation (5) provides the probability of E given a specific anomaly score Z. However, this probability does not yet contain another important piece of information, i.e., how reliably the selected anomalies represent information about E and, hence, how reliable the probability calculated by Eq. (5) is. This can be measured by the degree of consensus among volcanologists on the identified anomalies. In the IPCC report (IPCC 2013), this additional piece of information is mimicked by the “confidence,” a qualitative quantity that describes the subjective reliability of the likelihood given by a model.

In an attempt to incorporate this additional level of uncertainty formally in eruption forecasting and volcanic hazard, Marzocchi et al. (2004, 2008) described the probability through a distribution instead of a single number. At that time, the proposed approach could be considered purely heuristic because it did not conform with the standard frequentist (e.g., Hacking 1965) or subjective (e.g., Lindley 2000) probabilistic frameworks, for which the probability is just one number. More recently, Marzocchi et al. (2021a, b) introduced a formal unified probabilistic framework for eruption forecasting based on Marzocchi and Jordan (2014). This framework allows probability to be described by a distribution. In this section, we do not describe this framework in detail (it is described in depth in the references reported above), but we show how it can be applied to describe the uncertainty on the probability given by Eq. (5).

The unified probabilistic framework is rooted in the definition of an experimental concept, which allows us to define an unambiguous hierarchy of uncertainties. Let us assume that we collect the sequence of observations made when the anomaly score is equal to \(Z\). The sequence can be collected at regular time windows or can be any ordinal sequence; in any case, it is represented as a sequence of bins in which the anomaly score is equal to Z. Then, we define the binary variable \({e}_{i}\): \({e}_{i}=0\) if E does not occur in the ith bin, and \({e}_{i}=1\) if E occurs. According to the de Finetti theorem (1974), if we can assume that the sequence \({e}_{i}\) is stochastically exchangeable (Draper et al. 1993), then the sequence has a well-defined, but unknown, frequency of occurrence of the event E, \(\widehat{\phi }\) (given an anomaly score \(Z\)). In practice, we may consider the probability of E as the (unknown) frequency of a well-defined experimental concept reflecting the natural (also named aleatory) variability. The uncertainty on the estimation of such an unknown frequency may be described by a distribution (the so-called epistemic uncertainty) through the subjectivist mathematical apparatus typical of Bayesian methods, which is particularly suitable for handling uncertainties. The use of the frequentist interpretation of probability together with the subjective mathematical apparatus is the main reason for the use of the term “unified” to characterize the probabilistic framework described in this paper.

From a more physical point of view, the application of this framework to the development described in the previous sections comes with the following assumptions: (i) all information regarding E at time t is summarized by the anomaly score \(Z\left(t\right)\), and (ii) future observations collected in forecasting time windows with equal anomaly score \(Z\) are exchangeable, i.e., if the observations are shuffled, we do not lose any information about the process.

In this framework, Eq. (5) can be rewritten as

$${\phi }_{b}=1-{\text{exp}}\left[-\left(k+Z\right)\right]=1-a ~ {\text{exp}}\left(-Z\right)$$
(7)

where \({\phi }_{b}\) is our best estimate of the probability \({P}_{E}\), i.e., of the unknown frequency of the experimental concept described above. The uncertainty over this unknown frequency (epistemic uncertainty) can be described by assuming that \(\phi \sim {\text{Beta}}\left(\alpha ,\beta \right)\), whose expectation value (the average) is \({\phi }_{b}\), i.e., \({\phi }_{b}\equiv <\phi >\). This distribution is named the Extended Experts Distribution (EED) by Marzocchi and Jordan (2014).

In some cases, such as in hazard analysis, this beta distribution can be obtained by fitting the estimates given by different models, \(\{{\phi }_{i},{\widetilde{\omega }}_{i}\}\), where \({\widetilde{\omega }}_{i}\) is the normalized weight of the ith forecast. In our case, we do not have a set of estimates, but we obtain the average \(<\phi >\equiv {\phi }_{b}\) of the distribution using Eq. (7); the variance of \(\phi\) can be conveniently expressed in terms of the equivalent number of data \(\Lambda\), as defined in Eq. (11) in Appendix A of Marzocchi et al. (2008). The maximum possible variance is obtained when \(\Lambda =1\), which means that the available information equals the information provided by one single datum. Of course, we can use \(\Lambda >1\), with increasing values of \(\Lambda\) reflecting the increasing confidence that volcanologists have in the definition of the anomalies. In practice, this value can be defined through an expert elicitation session. Having set the average and variance (standard deviation) of \(\phi\), we can get the parameters of the prior distribution \({\text{Beta}}\left(\alpha ,\beta \right)\) (Marzocchi et al. 2021a, b).
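The following sketch illustrates how such a Beta distribution can be built from \({\phi }_{b}\) and \(\Lambda\). The parameterization \(\alpha ={\phi }_{b}\Lambda\) and \(\beta =\left(1-{\phi }_{b}\right)\Lambda\), which yields mean \({\phi }_{b}\) and variance \({\phi }_{b}\left(1-{\phi }_{b}\right)/\left(\Lambda +1\right)\), is an assumption made here for illustration; the precise convention is the one defined in Eq. (11) of Appendix A of Marzocchi et al. (2008).

```python
import numpy as np
from scipy import stats

def eed_beta(phi_b, lam):
    """Beta distribution describing the epistemic uncertainty on phi,
    assuming alpha = phi_b * lam and beta = (1 - phi_b) * lam
    (mean phi_b, variance phi_b * (1 - phi_b) / (lam + 1))."""
    return stats.beta(phi_b * lam, (1.0 - phi_b) * lam)

# Anticipating the Campi Flegrei example below: phi_b = 0.76 and different
# equivalent numbers of data Lambda.
for lam in (5, 10, 50):
    dist = eed_beta(0.76, lam)
    p10, p50, p90 = dist.ppf([0.1, 0.5, 0.9])
    print(lam, round(p10, 2), round(p50, 2), round(p90, 2))
# Larger Lambda (more confidence) narrows the distribution around phi_b.
```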

It is worth noting that, in contrast to the subjectivist framework, for which all models are “wrong” and model validation is pointless (Lindley 2000), the unified framework allows model validation. Specifically, we can define an ontological null hypothesis, which states that the true aleatory representation of the future occurrence of natural events—the data generating process—mimics a sample from the EED that describes the model’s epistemic uncertainty. According to the ontological null hypothesis, the true unknown frequency \(\widehat{\phi }\) of the experimental concept defined by a well-defined anomaly score Z cannot be distinguished from a realization of the EED, i.e., \(\widehat{\phi } \sim p\left(\phi \right)\). If the data are inconsistent with the EED, the ontological null hypothesis can be rejected, which identifies the existence of an ontological error (Marzocchi and Jordan 2014). In other words, the “known unknowns” (epistemic uncertainty) do not necessarily characterize the uncertainties completely, presumably owing to effects not captured by the EED—“unknown unknowns”—associated with ontological errors. An interesting implicit consequence of the unified probabilistic framework is that it challenges the false syllogism adopted in many criticisms of the use of experts’ judgment (e.g., Stark 2022): science is objective, and volcanic hazard analysis relies on subjective experts’ judgment; hence, volcanic hazard analysis is not science. More details on this topic can be found in Marzocchi and Jordan (2014) and Marzocchi et al. (2021a, b).

Tracking the probability of magmatic unrest during the 1982–1984 phase at Campi Flegrei

The unrest episode of 1982–1984 was the most important event in the area since the last eruption, in 1538 at Monte Nuovo, with a maximum vertical displacement of 1.79 m and the recording of about 5500 felt seismic events (Orsi et al. 1999). A full probabilistic forecast for this unrest episode was reported by Selva et al. (2012) using the most up-to-date BET model for the volcano and the available monitoring observations; a forecasting time window of \(\tau =1\) month was used. Here, we show the practical application of the method described in this paper to one of the BET nodes, i.e., node 2, dedicated to the quantification of the time-dependent probability of a magmatic unrest during a phase of unrest (where the probability at node 1 of BET is equal to 1).

For the sake of example, here we use the monitoring anomalies identified in the fifth experts’ elicitation carried out for the Campi Flegrei, which has been thoroughly described in Selva et al. (2012). The monitoring parameters and thresholds for the detection of a magmatic unrest are reported in Table 1, reproduced from Selva et al. (2012). In this example, we calculate the probabilities through Eq. (7) using a = 0.9. This means that if we do not observe any anomaly in the monitored parameters, then \({\phi }_{b}=0.1\) (see Eq. 7).
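For this choice, a short sketch of how Eq. (7) maps the anomaly score into the monthly probability of magmatic unrest (the values of Z are illustrative) is:

```python
import numpy as np

a = 0.9                          # value used in this example (a = exp(-k))

def phi_b(Z):
    """Best-estimate probability of magmatic unrest as a function of Z (Eq. 7)."""
    return 1.0 - a * np.exp(-Z)

for Z in (0.0, 0.5, 1.0, 2.0):   # illustrative anomaly scores
    print(Z, round(phi_b(Z), 2))
# Z = 0 gives phi_b = 0.10, the background probability implied by a = 0.9;
# increasing Z drives phi_b towards 1.
```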

Table 1 List of parameters used to assess the probability of magmatic unrest (Selva et al. 2012). With the term “inertia,” we mean the time during which an anomaly is considered active after its detection. The parameters in bold are the ones considered in this tutorial application, because some monitoring observations were not available at that time. Owing to the lack of information about the magnitude of each earthquake, here we assume that all magnitudes were above the M = 0.8 threshold. The asterisk (*) indicates the number of earthquakes divided by the number of days from the start of the counting; this choice makes the inertia proportional to the number of earthquakes recorded (and to the total energy emitted), that is, the higher the number of earthquakes, the longer the inertia

Figure 2 shows the month-by-month variation of the Z parameter (Eq. 2) and the consequent variation of the probability of magmatic unrest, \({\phi }_{b}\) (Eq. 7), from the start of the unrest (August 1982) until its end (December 1984). We can see that the highest values of the probability of magmatic unrest are reached between the months of June and September 1983 due to the opening of a fracture near the Solfatara volcano and in particular to the occurrence of deep seismic events on September 10, 1983.

Fig. 2

Variation of Z (left panel) and variation of the probability of magmatic unrest \({\phi }_{{\text{b}}}\) during the examined period (right panel)

Figure 3 shows the Beta distributions of the probability calculated for September 1983, using \({\phi }_{b}=0.76\) as the mean value for that month and different equivalent numbers of data \(\Lambda\), to show the effect of the epistemic uncertainty on the shape of the distribution. The figure shows that in September 1983 the monthly probability of a magmatic unrest ranged from \(0.58\) (10th percentile) to \(0.91\) (90th percentile) for \(\Lambda =10\) (black curve). Figure 4 shows the time variation of the 10th, 50th, and 90th percentiles of each distribution obtained during the examined period for different values of \(\Lambda\). Figures 3 and 4 show that an increase in confidence in the probabilistic assessment (higher \(\Lambda\)) implies a reduction of the epistemic uncertainty, as expected.

Fig. 3

Probability density functions of magmatic unrest for the month of September 1983, calculated using three different values of \(\Lambda\). An increase of the equivalent number of data corresponds to a narrowing of the distribution, i.e., to a decrease of the epistemic uncertainty

Fig. 4

Variation of the median (solid lines) and of the 10th and 90th percentiles (dashed lines) of the probability distributions of magmatic unrest during the episode of unrest for three different values of Λ. As for Fig. 3, a larger Λ implies a narrower 80% confidence interval

Although this specific tutorial example concerns Campi Flegrei, the procedure may be applied to any volcano. Its application requires a specific elicitation to identify the anomalies for the particular volcano and an adjustment of the parameter a (or k) that characterizes the specific level of monitoring of the volcano. For example, for a volcano that is less well-monitored than Campi Flegrei, we should use a smaller value of a (a higher value of k), representing the possibility that some signals of magmatic intrusion in the forecasting time window \(\tau\) will not be detected by the monitoring system.

Conclusions

To date, a satisfactory and complete physical knowledge of pre-eruptive processes is not available. This lack does not mean that volcanologists know nothing about pre-eruptive phenomena; volcanological knowledge is prevalently heuristic or based on a few conceptual models that remain mostly untested, and forecasts are mainly based on the interpretation of precursory anomalies, where available. In this paper, we have formalized such knowledge in terms of information, which is summarized overall by the anomaly score Z. Then, we have developed an entropy-based strategy to move from information (detection of anomalies) to entropy and eventually to the probability of the event of interest.

This methodology

i. Is transparent and describes how we obtain probabilities from volcanological information, which is particularly important for the reproducibility of the results and to parametrize subjective expertise in a way that future generations can understand what we think we know.

ii. Provides forecasts in almost real time. This is particularly appealing during a rapidly escalating volcanic unrest, and it is a distinctive feature with respect to more classical experts’ judgment procedures that directly elicit the probabilities.

iii. Is the simplest way to translate information into eruption probabilities. The largest step in probability is associated with the first observed anomaly. The additional probability step associated with the ith anomaly tends to decrease as i increases; in other words, the largest step in probability occurs when the anomaly score Z moves from 0 to 1, and it decreases for increasing Z.

iv. Is based on an unambiguous taxonomy of uncertainties, where aleatory variability, epistemic uncertainty, and ontological error are clearly defined and formally described. This unambiguous taxonomy of uncertainties allows, at least in principle, the validation of the forecasts. Worthy of note, the quantification and separation of uncertainties of different kinds allows quantification of the confidence associated with the probabilistic assessment.

v. Is versatile and can be applied to any kind of volcano or event we are interested in, e.g., the presence or absence of magma during an unrest episode, the location of vent opening, or the occurrence of an eruption in a forecasting time window. The method can be used as a stand-alone model to provide eruption forecasts given a set of monitoring anomalies, or to set up the conditional probabilities that are necessary to implement models based on the Bayesian Event Tree or the Bayesian Belief Network.