Introduction

Recent volcanic eruptions have resulted in a significant number of fatalities, evacuations and economic losses, for example at La Palma, Spain in 2021 (Global Volcanism Program 2021), La Soufrière, St. Vincent and the Grenadines in 2021 (Global Volcanism Program 2021), White Island, New Zealand in 2019 (Park et al. 2020; Cao et al. 2020); Fuego, Guatemala in 2018 (Albino et al. 2020; Naismith et al. 2019); Hawaii, U.S.A. in 2018 (Feng et al. 2020; Neal et al. 2019); Anak Krakatoa, Indonesia in 2018 (Heidarzadeh et al. 2020; Grilli et al. 2019); Sinabung, Indonesia in 2013–2018 (Andreastuti et al. 2019; Gunawan et al. 2019); Ontake, Japan in 2014 (Maeno et al. 2016) and Eyjafjallajökull, Iceland in 2010 (Langmann et al. 2012). It is estimated that > 500 million people live in areas with volcanic risk (e.g., Martí 2017; Auker et al. 2013) and large volcanic eruptions affect our interconnected world in numerous ways. Volcanic eruptions create multiple hazardous phenomena (e.g., ballistics, pyroclastic density currents, tephra fallout) with impacts at different scales (e.g., local, regional, and global), placing a premium on hazard assessments at various map scales (e.g., Chester et al. 2002).

A useful volcanic risk assessment, from local to global scales, relies on probabilistic volcanic hazard assessment (PVHA). Any volcanic hazard assessment employs diverse techniques to generate different types of volcanic hazard maps. These maps commonly depict the areas that can be affected by one or more hazardous volcanic phenomena and are based upon the distribution of deposits from past eruptions, modern topography, and/or the results of computer modeling of a volcanic phenomenon (Ang et al. 2020; Martí 2017; Loughlin et al. 2015; Calder et al. 2015). From a practical standpoint, a deterministic hazard analysis is often scenario-based and assumes that an eruption of some type and magnitude will occur within some specific period of time (e.g., Tierz 2020; Calder et al. 2015; Connor et al. 2015). A probabilistic analysis considers the probability of eruptions occurring within some time interval and their likely characteristics, including range of eruption styles, eruption sizes, and eruption source parameters (e.g., Bertin et al. 2022; Poland and Anderson 2020; Tierz et al. 2016; Connor et al. 2015). PVHA can merge information derived from the record of past activity, theoretical models and numerical modeling of volcanic phenomena. The advantage of this approach is the ability to provide a quantitative way of understanding the hazard posed by a volcanic phenomenon and to account for the uncertainties stemming from the natural variability of the phenomena (aleatoric uncertainty) or our limited knowledge about the physical processes (epistemic uncertainty) (Tierz et al. 2020; Ang et al. 2020; Martí 2017; Tierz et al. 2016; Loughlin et al. 2015; Connor et al. 2015; Calder et al. 2015; Marzocchi and Bebbington 2012). The resulting probability maps are a type of volcanic hazard map that show the probability that a given location will be affected by a volcanic phenomenon. Importantly, these maps include the associated uncertainties, therefore differing substantially from the deterministic-approach maps. Probabilistic maps represent either a conditional probability (i.e., probability that a location of interest will be affected by a volcanic phenomenon given that an eruption occurs, perhaps sampling a range of eruption styles or sizes) or the absolute probability (unconditional in terms of eruption probability, size, style and vent location) (e.g., Poland and Anderson 2020; Rouwet et al. 2019; Tierz et al. 2016; Calder et al. 2015; Connor et al. 2015; Tonini et al. 2015; Marzocchi et al. 2010). A PVHA is therefore often prefered to a deterministic analysis, although deterministic scenarios are definitely useful in some circumstances, such as illustrating potential eruption outcomes.

PVHA e-tools have been developed to estimate and visualize the probabilities that an area will be affected by a volcanic event and the associated uncertainties (e.g., Bertin et al. 2019; Bartolini et al. 2019; Sobradelo et al. 2013; Marzocchi et al. 2010). PVHA serves to bridge the volcanological and societal aspects of a volcanic crisis and ideally can assist authorities in real-time decision-making (e.g., Bartolini et al. 2019; Papale 2017; Newhall and Pallister 2015; Marzocchi and Bebbington 2012). These tools are mostly based on Event Trees (ET) or Bayesian Belief Networks (BBN) structures. An ET is a directed graph (nodes and branches) (Fig. 1) of progressive events, from an initial state, through subsequent stages, to final outcomes (Connor et al. 2001; Newhall and Hoblitt 2002). The final outcomes are often referred to as contingencies, and the idea of the ET is to capture all possible contingencies, unless otherwise stated. This unidirectional structure allows probabilities for sequences of events in volcanic activity to be calculated in a Bayesian framework. By updating the ET with different sources of data, including eruptive history (past data) and theoretical or mathematical models (prior models), it becomes a Bayesian Event Tree (BET), and the probability of specific outcomes is estimated in a Bayesian way (e.g., Marzocchi et al. 2008). Similar to BETs, Bayesian Belief Networks (BBN) are graphical structures representing different events related to volcanic activity. Unlike BETs, BBNs describe the complexity of this activity as variable nodes interlinked by branches representing the causality between them (Christophersen et al. 2018; Tierz et al. 2017; Hincks et al. 2014; Aspinall and Woo 2014; Aspinall et al. 2003). The estimation of probabilities at each node in the BET or BBN schemes is done by implementing a computer algorithm based on Bayes’ rule, which allows update of the output as new information becomes available (Christophersen et al. 2018; Marzocchi et al. 2008; Aspinall et al. 2003). Both ETs and BBNs have been used to forecast or hindcast eruptions for several volcanoes: Aluto, Ethiopia (Tierz et al. 2020); White Island, New Zealand (Christophersen et al. 2018); La Soufrière, Guadeloupe (Hincks et al. 2014); Galeras, Colombia (Aspinall et al. 2003); Santorini, Greece (Aspinall and Woo 2014); Etna and Somma-Vesuvius, Italy (Tierz et al. 2017; Cannavò et al. 2017); St. Helens, U.S.A (Newhall 1982); Soufrière Hills, Montserrat (Aspinall and Cooke 1998), Pinatubo, Philippines (Punongbayan et al. 1996) and others.

Fig. 1
figure 1

Graphical representation of the Ceboruco event tree (ET). The example illustrated here is the event tree developed for Ceboruco based on prior models and past data. We identify three potential causes of unrest at Ceboruco: new magma entering the system, a large tectonic earthquake in the graben or an increased degassing episode associated with non-magmatic changes in the hydrothermal system. Each unrest episode has associated potentially hazardous phenomena. For example, we consider that rockslides and debris flows can occur at Ceboruco even if the volcano is not in unrest due to the instability of the volcano’s edifice and/or due to heavy rainfall. The highlighted branch and associated nodes (gray boxes) represent the sequence of events attributed to a potential magmatic unrest that we assess with PyBetVH. Note that some of these phenomena have relatively high probability in eruptive scenarios while others (e.g., debris avalanche) have relatively low probability. The ET attempts to account for all contingencies

Two of the most commonly used e-tools in long-term PVHA are HASSET (Hazard Assessment Event Tree) (Sobradelo et al. 2013) and BET_VH (Bayesian Event Tree for Volcanic Hazard) (Marzocchi et al. 2010). Both are software implementations of the volcano eruption ET structure (Marzocchi et al. 2004; Newhall and Hoblitt 2002). HASSET and BET_VH have been applied for hazard evaluation at different volcanoes such as San Miguel, El Salvador (Jiménez et al. 2018), Okataina, New Zealand (Thompson et al. 2015), Deception Island, Antarctica (Bartolini et al. 2014), El Hierro, Canary Islands (Becerril et al. 2014), El Misti, Peru (Sandri et al. 2014), Somma-Vesuvius and Campi Flegrei, Italy (Sandri et al. 2019; Tierz et al. 2018; Selva et al. 2014; 2010), Auckland Volcanic Field, New Zealand (Sandri et al. 2012) and Teide – Pico Viejo, Canary Islands (Martí et al. 2012; 2008).

Here, we use the PyBetVH software (e.g., Wild et al. 2019; Strehlow et al. 2017; Tonini et al. 2015), an updated version of BET_VH, to conduct a PVHA for Ceboruco volcano (Mexico). This PVHA is intended to update the currently available deterministic hazard maps for Ceboruco with probabilistic maps. We define a generic ET, that is, not calibrated by current monitoring or related observations, for Ceboruco. Like many volcanoes, these monitoring data are unavailable. The ET includes hazardous phenomena related to both magmatic and non-magmatic unrest (Fig. 1). We describe how PyBetVH is used to produce probability maps for the magmatic branch by merging information from the geological record (i.e., past data), numerical simulations and expert-based weighting of the data (i.e., prior models). The strength of the PVHA analysis and resulting maps is that the numerous sources of uncertainty are incorporated in the analysis therefore being fundamentally different to the deterministic maps produced by Sieron et al. (2019b).

An important reason to develop an ET for Ceboruco volcano is that the volcano has been quite active within the last ~ 1000 yrs. but has not erupted during the last century. Because of this recent hiatus in activity, Ceboruco is less monitored than other active volcanoes in Mexico. We show through constructing the ET that this comparative under-monitoring increases uncertainty in probability of eruptions and probable eruption impacts. The resulting probability maps reflect this uncertainty. Consequently, in addition to estimating eruption probabilities, the Ceboruco ET can be used to consider the cost-benefit of additional monitoring efforts, and perhaps also to help justify them. As this situation exists at many volcanoes globally, this uncertainty quantification can be useful for global efforts to prioritize volcano monitoring more systematically (Ewert et al. 2005). Finally, we show how a PVHA is sensitive to the assumptions about volcanic unrest at volcanoes with limited or no monitoring networks and how these assumptions affect the resulting probabilities, with effects on decision-making. To our knowledge, it is the first time this issue has been addressed.

Eruptive history and hazard assessment at Ceboruco

Eruptive history

Located at the western edge of the Trans-Mexican Volcanic Belt (TMVB), Ceboruco (2280 m a.s.l.) (Fig. 2) is the only historically active stratovolcano among the ~ 28 monogenetic edifices of the San Pedro – Ceboruco half-graben (Sieron et al. 2019a; Petrone 2010; Sieron and Siebe 2008). Its eruptive history includes both effusive and explosive activity (Sieron et al. 2019a; Sieron and Siebe 2008; Gardner and Tait 2000). We refer the reader to Sieron et al. (2019a), Sieron and Siebe (2008), Gardner and Tait (2000) and Nelson (1980) for detailed descriptions of Ceboruco’s geology and eruptive history. Here we summarize the activity of the last ~ 1000 years at Ceboruco as this time span represents the basis of our hazard assessment and mention that stratigraphic evidence indicates the volcano had a long period (~ 40 kyrs) of repose before that (e.g., Sieron et al. 2019a).

Fig. 2
figure 2

Location map of Ceboruco volcano (red volcano symbol) in the western part of the Trans-Mexican Volcanic Belt of Mexico. The main active volcanoes (orange) in the TMVB: 1 – Ceboruco; 2 – Colima; 3 – Parícutin; 4 – Nevado de Toluca; 5 – Popocatépetl; 6 – Pico de Orizaba; 7 – San Martín; 8 – El Chichón; 9 – Tacaná. The upper right inset: Digital Elevation Model (DEM) of the San – Pedro – Ceboruco half-graben with the location of Ceboruco (red) within the monogenetic cones (black volcano symbols). The main communities at risk from an eruption at Ceboruco are marked with blue stars: I – Ixtlán del Río (pop. ~ 33,000); A – Ahuacatlán (pop. ~ 9000); J – Jala (pop. ~ 16,000); M – Marquezado (pop. ~ 1000); U – Uzeta (pop. ~ 2000); C – Chapalilla (pop. ~ 1500) (map created using Generic Mapping Tools 6 (Wessel et al. 2019))

In the past ~ 1000 years Ceboruco experienced two main eruptive periods that comprised both effusive and explosive activity limited to the current edifice. We consider an eruptive period as characterized by more or less continuous eruptive activity, recognizable in the geological or historical records.

Stratigraphic evidence suggests that after a repose period of ~ 40 kyrs effusive activity preceded the caldera forming Jala eruption (VEI 6) (~ 990–1020 CE) that created the current summit caldera (~ 3.7 km diameter). The Jala eruption produced extensive tephra fallout and numerous pyroclastic density currents (PDC) with a total estimated eruption volume of 3–4 km3 dense rock equivalent (DRE) (Sieron et al. 2019a; Sieron and Siebe 2008; Gardner and Tait 2000; Nelson 1980). The PDC deposits formed from concentrated flows and dilute surges. Syn-and-post eruptive lahar deposits are identified as far as 35 km from the vent. Ballistic bombs are found within a 5 km radius from the vent. Effusive activity persisted for another ~ 150 years during which time several lava flows were emplaced (Böhnel et al. 2016). We consider this the first eruptive period. After a break of several hundred years, the second eruptive period started with small explosions in February 1870 and continued for 5 years with alternating effusive and explosive activity that generated lahars, tephra fallout and PDC. The total erupted volume of this eruption was estimated to be ~ 0.1 km3 for pyroclastic deposits and ~ 1.14 km3 for lava flows (Sieron et al. 2019a; Sieron and Siebe 2008). Today, small urban centers, national freeways, railroads and agricultural land are constructed on these eruption deposits (e.g., Marquezado, Ahuacatlán, Jala).

Deterministic Hazard assessment

Ceboruco is ranked the third most hazardous volcano in Mexico (Sieron et al. 2019a; Espinasa-Pereña 2018). Several urban centers (> 55,000 people), as well as important infrastructure lie within the area impacted by past activity (Sieron et al. 2019a, b). Sieron et al. (2019b) developed the first hazard maps for Ceboruco as part of a project funded by the Federal Electricity Commission that manages two hydropower plants in the area and the Tepic-Mazatlán sectorial power distribution station at the foothills of Ceboruco. Based on the activity of the last ~ 1000 years, three possible eruption scenarios were defined. We refer to Sieron et al. (2019a) for a full description of these scenarios. Summarizing:

Scenario 1 (S1) – this high likelihood scenario considers a small magnitude (i.e., VEI < 2), mainly effusive eruption that will generate andesitic lava flows. Small explosions will likely produce ballistics and low ash plumes from a central or flank vents (Sieron et al. 2019a, b).

Scenario 2 (S2) – is a medium likelihood event (i.e., VEI 2–3). This medium magnitude scenario considers both explosive and effusive activity. The explosive activity is expected to produce ballistics and to be of Vulcanian to sub-Plinian in style with small to moderate transient plumes, similar to the 1870-‘75 CE eruption. Dome emplacement and lava flows are expected due to the high viscosity of the dacitic lava. Tephra fallout is expected and might reach a relatively large thickness near the vent (Sieron et al. 2019a, b). PDC and lahars are also expected.

Scenario 3 (S3) – a low likelihood, large magnitude eruption similar to the Jala Plinian eruption. This eruption is expected to produce ballistics and widespread tephra fallout and voluminous PDC. Syn-and-post eruptive lahars are expected due to the availability of pyroclastic material and water during the rainy season (Sieron et al. 2019a, b).

The hazard maps created by Sieron et al. (2019b) used these three scenarios to simulate individual hazardous phenomena. These computer models use various source parameters derived from previous research, field data and from analogue volcanoes (e.g., ballistics), to generate maps with the area impacted by each eruptive phenomenon. The resulting maps of each of these hazards were integrated into a single hazard map for each of the three scenarios (i.e., scenario-based hazard maps) (Figs. 8, 9 and 10 in Sieron et al. (2019b). A generalized hazard map for Ceboruco was obtained by integrating the three scenario-based hazard maps into a single map showing the areas that could be affected by various volcanic phenomena associated with a future eruption (Fig. 11 in Sieron et al. 2019b). In the following, we use the ET approach to assign probabilities to these eruptive scenarios and produce probabilistic hazard maps.

The event tree for PVHA at Ceboruco

We conduct a PVHA at Ceboruco volcano by designing an ET (Fig. 1) that considers one magmatic and two non-magmatic triggers for unrest. Unrest is a diverse and complex phenomenon described as a deviation from the background activity of a volcano or by anomalies in monitoring data (e.g., Gottsmann et al. 2019; Phillipson et al. 2013). We consider magmatic unrest as triggered by ‘moving magma’ and non-magmatic unrest as other types of unrest (i.e., seismic unrest, increased degassing) not related to ‘magma-on-the-move’ (Rouwet et al. 2014):

  1. a)

    magmatic unrest – with a high rate of eruptive activity in the past ~ 1000 years compared to the previous ~ 40 kyrs, we consider that future unrest at Ceboruco may be triggered by magma migration to shallow depth, convective overturn in a long-lived magma chamber or by fresh magma entering the chamber from depth (e.g., Pritchard et al. 2019; Sparks and Cashman 2017; Woods and Cowan 2009);

  2. b)

    seismic unrest without magmatic unrest – triggered by a large regional tectonic earthquake or a local seismic swarm. i) ‘Sulphur-smelling’ waters after two tectonic earthquakes in 1566 CE (Sieron and Siebe 2008; de Ciudad 1976) and 1567 CE (Sieron and Siebe 2008; Tello 1968) indicate changes in porosity and permeability that might have facilitated migration of magmatic gases (Sieron and Siebe 2008). ii) Núñez-Cornú et al. (2020), Rodríguez Uribe et al. (2013) and Sánchez et al. (2009) concluded that the seismic swarms at Ceboruco are dominated by volcano-tectonic earthquakes associated with faults in and around the edifice;

  3. c)

    increased degassing without magmatic unrest – infiltration of abundant meteoric water from a passing hurricane may lead to an increased degassing due to the residual heat from cooling magma. From the analysis of two seismic datasets, Rodríguez Uribe et al. (2013) and Sánchez et al. (2009) confirmed that the low-frequency events recorded at Ceboruco are associated with movement of pressurized fluids. But there is no reason to conclude that seismicity always accompanies degassing, or vice-versa.

Several hazardous phenomena have been associated with non-magmatic unrest episodes (i.e., seismic and increased degassing) and are considered in our ET: rockslides, debris avalanches, debris flows, ballistics, ash fallout and increased fumarolic activity (Fig. 1). By including these events in the ET, we clarify that potentially hazardous phenomena may result from both explosive (e.g., phreatic explosions) or non-explosive events and perhaps may occur even if there is no evidence of unrest at all (e.g., Barberi et al. 1992). We emphasize that while all of these events represent contingencies in the ET, their probabilities vary.

If unrest of magmatic origin is detected at Ceboruco, we consider three potential outcomes: i) no eruption (i.e., activity subsides); ii) phreatic eruption or increased degassing activity (i.e., if large amounts of water infiltrate the volcanic system) without a magmatic eruption and, iii) magmatic eruption, which may include some phase of phreatic activity. In contrast to the latter case, the hazards associated with the first two outcomes (i.e., no eruption or phreatic eruption) resemble the contingencies in the non-magmatic unrest case, especially for locations proximal to the volcano. However, the occurrence of a magmatic eruption may result in a series of destructive phenomena that would impact the region around Ceboruco at different scales. We focus on the magmatic unrest branch and its subsequent events (the highlighted branch in Fig. 1). The past eruptive periods at Ceboruco included both explosive and effusive activity, therefore we model the most devastating phase of the past eruptions (e.g., Plinian explosive; sub-Plinian explosive; small magnitude eruption). We follow the three eruptive scenarios identified by Sieron et al. (2019b) and described in the previous section, acknowledging that more complex scenarios and cascading events are plausible.

Estimating the probabilities in the event tree: the calculation

The ET for Ceboruco becomes a Bayesian Event Tree (BET) because of the additional information we incorporate in the analysis. The ET and the BET have identical structure (Fig. 1), but the way in which probabilities are estimated is different. To estimate the probability at each node of the BET, PyBetVH uses prior models that are theoretical, statistical or numerical, and observations of past volcanic activity (past data) including the geological record (Constantinescu et al. 2016; Tonini et al. 2015; Thompson et al. 2015; Sandri et al. 2014; Marzocchi et al. 2010); although, parameterization without using past data is also possible (e.g., Marzocchi et al. 2004). Often, prior models rely on information such as the state of activity derived from monitoring (e.g., Wright et al. 2019) or data from analog volcanoes (e.g., Tierz et al. 2020; Ogburn et al. 2016).

The background calculations performed by PyBetVH rely on specified data and parameters for each branch of the ET. Given that the first three nodes of a hypothetical BET represent questions with binary answers, the probability estimates are managed through a binomial distribution (e.g., Connor 2021; Marzocchi et al. 2008; 2006), which arises because (1) the volcano will enter into a period of unrest in the time window (e.g., 𝛕 = 1 yr) or not, (2) the unrest will be associated with magma or not, (3) the magmatic unrest will result in an eruption, or not. Each of these nodes will be answered yes or no, with some probability assigned to each event, summing to 1. If the ET is not Bayesian, then the probability at each node only depends on the sample proportion (μu) (i.e., likelihood): the number of observations of an affirmative outcome in the past (or from analogue volcanoes), y, divided by the number of time windows observed, n:

$${\mu}_u=\frac{y}{n}$$
(1)

Consider a hypothetical volcano for which in 3 of 100 time windows unrest is observed to begin and one time window includes a magmatic eruption, then the sample proportion for the first node, corresponding to unrest, is 0.03. If two of these three periods of unrest were observed to be magmatic, then the sample proportion for the second node is 0.67. If one of these two periods of magmatic unrest resulted in eruption then the sample proportion of the third node, magmatic eruption, is 0.5. The product of these probabilities is the probability of eruption in a given time window, based only on likelihood, which in this example is 0.03 × 0.67 × 0.5 = 0.01. This result is easily checked, since one time window out of one hundred was observed to have an eruption.

In a BET, such as implemented in PyBetVH, the calculation starts by assuming a prior model (i.e., a prior probability based on theoretical models or expert knowledge of the phenomena) at each node. The sample proportion is then used to update the posterior probability estimate. When limited observations cannot provide a robust sample proportion, the probability estimate relies on prior functions. Conversely, if the number of past observations is large enough, prior models do not play a key role in the posterior estimate, which becomes dominated by the sample proportion (e.g., Marzocchi et al. 2004). In practice, prior models are cast as a Beta distribution, which depends on parameters α and β that take on values between 0 and 1. The mean of the prior distribution (μ) at any binomial node is given by:

$$\mu =\frac{\alpha }{\alpha +\beta }$$
(2)

and the expected value of the posterior probability at any binomial node with a Beta prior distribution is (Gelman et al. 1995; Marzocchi et al. 2006):

$$E=\frac{\alpha +y}{\alpha +\beta +n}$$
(3)

The posterior probability deviates from the sample proportion depending on the values of these two parameters, α and β. If the value of α is large compared to β, the posterior probability will increase, as in the case of monitoring data that indicates increased unrest. Conversely, if monitoring data indicates that there is no unrest, then β is large compared to α and the posterior probability will be less than the sample proportion. For the cases in which there are no monitoring data available to inform the values of α and β, then α = β = 1, μ = 0.5 and the Beta distribution is uniform random. This means that the posterior probability will tend toward 0.5.

Suppose that for our theoretical volcano we lack consistent monitoring data, then the Beta distribution is uniform random between 0 and 1 (α = β = 1). For the first node, recall that three time windows had unrest onset out of one hundred observations, so we update this prior with field observations and the resulting posterior probability of node 1 is 4/102 = 0.039, which is only slightly higher than the sample proportion (0.03) because n > > α, β. For node 2, there are only three observed intervals of onset of unrest and two of these are associated with magmatism, so the sample proportion is 2/3 = 0.67. The posterior probability is 3/5 = 0.6. The prior is relatively more important because n is small. Stated another way, once the volcano enters unrest, there are many fewer time windows, n, and fewer observations, y, so the prior probability gets more weight. For the third node, there is one eruption for two magmatic events, the sample proportion is equal to 0.5, which is the same as the prior mean (Eq. 2). For node three, the posterior probability is 2/4 = 0.5, the same as the sample proportion. Overall, the estimate of the absolute probability of magmatic eruption based on the BET in this theoretical example is 0.039 × 0.6 × 0.5 = 0.0117. Using the BET, the probability of the volcano erupting in a given time window is higher than the probability calculated using a likelihood approach (0.01), which only depends on the sample proportions.

Of course, this probability estimate depends on the certainty with which the data are known. If, for example, it is uncertain how many episodes of unrest occur in the series, then exploration is required to bound the uncertainty using alternative data. Consider the alternative case that there are zero observed eruptions associated with the two episodes of magmatic unrest. In this case, the likelihood approach (product of the probabilities based on sample proportion) is 0. The same data using Eq. (3) to calculate the posterior probability yield 0.006. That is, we are significantly less confident that no eruptions will occur in the time window. Conversely, if some sort of monitoring data were available for this theoretical volcano, then Eq. (3) would be modified by changing the values of α, β.

Subsequent nodes in the BET (Fig. 1) are not binomial but their probabilities are calculated in a similar way (Marzocchi et al. 2006, 2010). We refer the reader to Marzocchi et al. (2006, 2008 and 2010) and Tonini et al. (2015) for a detailed description of the software implementation of the calculations of the BET structure. An advantage of PyBetVH is that the posterior probability calculation includes the distribution for the probabilities, in addition to the expected values, providing a more complete view of the uncertainty (epistemic and aleatoric) in the probability estimate than is otherwise possible.

PyBetVH uses as input several text files in which the user specifies the values each parameter (e.g., y, n, α and β) is assigned in order to estimate the prior distributions and the sample proportions. The software and a step-by-step tutorial for building the input files are freely available at https://vhub.org/resources/betvh (Tonini et al. 2016b) or by request to the corresponding author. In the next section we describe the implementation of the BET for Ceboruco and the choice of model input data to estimate probabilities related to a magmatic eruption.

PyBetVH set-up: the implementation of the Ceboruco BET

Nodes 1–2-3: what is the probability of a magmatic eruption within the next time window?

Typically, volcanic hazard assessment and eruption forecasting are conducted for different time windows, from short-term (days, weeks after the initiation of unrest) to intermediate-term (months) and long-term (years to decades) (e.g., Marzocchi and Bebbington 2012; Newhall and Hoblitt 2002). The long-term hazard assessment is useful for the local authorities and stakeholders to devise plans and strategies for responding to future potential activity. A commonly used time window (𝛕) for hazard assessment is 1 year (Thompson et al. 2015; Sandri et al. 2014; Marzocchi et al. 2008; Newhall and Hoblitt 2002). However, we suggest that this time window can be extended (e.g., 10 years, 100 years) when the hazard assessment is considered for very long-term development of infrastructure projects (e.g., Gallant et al. 2018; Connor 2011; Volentik et al. 2009).

At Node 1 (N1), we want to estimate the probability of entering a new unrest phase within the next time window. Usually, unrest is defined by changes in the behavior of the volcano as observed by a monitoring network (e.g., Gottsmann et al. 2019; Phillipson et al. 2013). However, such a generalized definition of unrest is rather subjective and cannot be readily applied to active volcanoes that are poorly monitored or are not at all monitored. Ceboruco is under-monitored, therefore, we rely on geological information to determine unrest including reports of felt seismicity, recognizing that this lack of reliable monitoring data limits our analysis to long-term PVHA and is associated with high epistemic uncertainty.

Geological field data suggests that a long period of dormancy (~ 40 kyrs) at Ceboruco preceded the eruptive period that comprised the Plinian Jala eruption, its precursory effusive activity (i.e., the Destiladero lava flow; Nelson 1980), and the activity of the next ~ 150 years (Sieron et al. 2019a, b). From 1142 CE until 1870 CE no magmatic eruptions were observed. This quiescent period was interrupted by two powerful tectonic earthquakes (1566 CE and 1567 CE; Sieron and Siebe 2008; de Ciudad 1976; Tello 1968). The aftermath of these events included changes in activity observed at the surface of the volcano (Sieron and Siebe 2008; Tello 1968) and no other activity was reported for another ~ 200 yrs. A second eruptive period started with the reactivation of the system in 1783 CE and 1832 CE and included reported seismic activity (ground shaking), underground noise, and a water vapor plume at the summit. Decades later, this activity was followed by the 1870–1875 CE eruption. Post- 1875 CE, fumarole activity persisted until ~ 1894 CE (Sieron et al. 2019a; Ordóñez 1896) and since then decreased in intensity and the fumarole temperature declined with time.

Occasional surveys of the fumaroles show that their composition is meteoric water vapor generated by the residual heat from the last eruption and that there is no magmatic component in fumarole gases or their condensates (Sieron et al. 2019a; Ferrés et al. 2019; Centro Nacional de Prevención de Desastres (CENAPRED) 2016). Sánchez et al. (2009) and Rodríguez Uribe et al. (2013) analyzed several years of seismic data (2003–2008) from one station and concluded that low-frequency events indicate the presence of pressurized fluids, in agreement with the presence of water vapor fumaroles and an active hydrothermal system. Occurrence of VT events indicate intra-crustal stress accommodation, either from a magmatic body or the regional tectonic setting (i.e., the half-graben within which the edifice is located). Recently, Núñez-Cornú et al. (2020) used data from four stations around Ceboruco (deployed 2012–2014) and concluded that most of the recorded events follow shallow structural lineaments and have shallow hypocenters (< 10 km). Although insightful, these geophysical signals cannot serve as a basis for a wider definition of background activity and therefore unrest, since the network was small and deployed only temporarily. Sánchez et al. (2009) suggest that Ceboruco might be in an intra-eruptive phase while Núñez-Cornú et al. (2020) suggest that the seismic activity indicates local tectonic stresses. No other observations that may indicate changes in Ceboruco’s behavior have been recorded. We consider Ceboruco is currently not in a state of unrest.

In BET, the prior probability function is usually informed by monitoring data at volcanoes with a monitoring network. However, due to the lack of reliable and sustained geophysical data collection at Ceboruco, we consider there is no prior information about whether unrest will be initiated during the next time window and therefore set the mean prior probability at 0.5 (maximum ignorance), with an equivalent number of data parameter 𝛬 = 1 in PyBetVH (indicating the maximum uncertainty associated with this choice) (Constantinescu et al. 2016; Sandri et al. 2019; Marzocchi et al. 2008).

The past data describes our likelihood function (Marzocchi et al. 2008) and is estimated from geological information. The lack of a high resolution geological and historical catalogue prohibits us to accurately determine if the two main eruptive periods consisted or not of several unrest episodes. Therefore, we conservatively assume these two eruptive periods (i.e., the one including the Jala eruption and the one including the 1870-‘75 eruption) represent one unrest episode each and acknowledge the high uncertainty associated with this choice. The long dormancy (quiescence) interval between these two unrest episodes indicates an inter-eruptive period with no clear signs of magma intrusion but disturbed twice by large tectonic earthquakes that may have perturbed the magmatic system at Ceboruco. We count four episodes of unrest, magmatic or non-magmatic in 1108 time windows. Therefore, the sample proportion, which is the mean of the likelihood function, is 4/1108. Because the mean prior probability is greater than the sample proportion, the posterior probability is greater than the sample proportion. That is, there is a penalty for lack of monitoring data.

Node 2 (N2) - Given unrest, what is the probability that magma is involved? An unrest episode may be magmatic or non-magmatic (e.g., tectonic). Given its activity of the past ~ 1000 years and the relatively short time since its most recent eruption, it is reasonable that unrest could be associated with magma. Ceboruco, however, lies within a tectonically active area within which large-magnitude tectonic earthquakes are common, like the two earthquakes reported in the sixteenth century. Given the variability of both volcanic systems and tectonic regions we assume a prior probability distribution with μ = 0.5 and 𝛬 = 1, indicating maximum ignorance (i.e., equal chances of having magmatic or non-magmatic unrest). The likelihood function is described by the past data according to which two of the four unrest episodes were clearly magmatic in origin (i.e., the two eruptive periods). As the sample proportion, 2/4, is equal to the mean of the prior probability, 0.5, the posterior probability for this node is also 0.5, which is the probability that magma is involved, given unrest. Or, using the nomenclature in Eq. (3), α = β = 1, μ = 0.5, y = 2, and n = 4., or the expected value of the posterior function is 3/6 = 0.5.

Node 3if magmatic unrest is due to magma, what is the probability a magmatic eruption will occur during the time window? Previous studies involving the use of a BET tool used as prior models at this node a Beta distribution with α = β = 1 (i.e., maximum ignorance; equal probability of eruption or no-eruption; see Constantinescu et al. 2016; Sandri et al. 2014, 2009) or data from Phillipson et al. (2013) which includes an analysis of 228 episodes of unrest (e.g., Sandri et al. 2019; Constantinescu et al. 2016). According to the data set presented by Phillipson et al. (2013), 64% of unrest episodes at stratovolcanoes lead to eruption. For comparison, Newhall and Dzurisin (1988) found 38% of unrest episodes documented at silicic calderas led to eruption and 54% of unrest episodes at mafic calderas. Here, we choose maximum ignorance as prior probability (i.e., μ = 0.5; 𝛬 = 1). We update the posterior probability with past data informed by the two magmatic unrest episodes (i.e., two eruptive periods) that resulted in magmatic eruptions. The sample proportion is therefore 2/2. This value is modified by the prior mean, 0.5, to yield a probability of magmatic eruption given magmatic unrest of 0.75.

Based on the assumptions made at the first three nodes from available models and geological data, we estimate the probability of eruption at Ceboruco in the next year (time window) to be ~ 0.002. If, on the other hand, the volcano is in a state of unrest, the posterior probability (magmatic eruption given unrest) is 0.375. Since the sample proportions at nodes 2 and 3 are based on very few events, the values assumed for parameters in the prior distribution (α = β = 1) are quite significant. A summary of the input data at the first three nodes is presented in Table 1.

Table 1 Summary of the input data used in PyBetVH for the first three nodes

Nodes 4 and 5: vent location and eruption type/size

We focus here on the activity at Ceboruco’s cone and do not consider hazards related to the monogenetic volcanic field surrounding the volcano, because the past ~ 1000 years of activity at Ceboruco was concentrated within the summit caldera. Therefore, we assume the probability of monogenetic eruptions is much smaller than summit caldera eruptions. Several lava flows occurred closer to the present caldera rim or on the north flanks (e.g., Copales, El Norte, Coapan lava flows (Fig. 8 in Sieron et al. 2019a)), however, most activity was restricted to the main edifice of Ceboruco. We therefore consider five possible sectors for a new eruption to occur: - the summit caldera and four flank locations (North, e.g., the Coapan lava flow site; East, South, e.g., the Copales lava flow site; and West flanks). The probabilities for new vent opening were decided by project participants who weighted geologic evidence. We assume a 0.98 prior probability for the summit caldera location and 0.005 for each flank sectors. Based on the three eruption scenarios identified by Sieron et al. (2019a) and discussed in the previous section, we assume that future activity at Ceboruco will likely include a: S1 - small magnitude (effusive / small explosive); S2 – medium magnitude (Vulcanian/sub-Plinian) and S3 – large magnitude (Plinian). The probabilities of these scenarios were assigned by a power-law, with small eruptions having a higher likelihood of occurrence than large events: S1–0.6, S2–0.3, S3–0.1 (e.g., Sandri et al. 2014; Marzocchi et al. 2008; Newhall and Hoblitt 2002).

Nodes 6, 7 and 8: occurrence of hazardous phenomena and areas impacted

We consider: lava flows, ballistics, tephra fallout for Scenario 1; lava flows, ballistics, tephra fallout, PDC and lahars for Scenario 2 and ballistics, tephra fallout, PDC and lahars for Scenario 3. The prior best-guesses of occurrences for each phenomena conditional to the eruption size were set using data from Newhall and Hoblitt (2002) (i.e., frequency of phenomena associated with eruption type) except ballistics that has a probability of 1, acknowledging that some phenomena may be underreported. Further, we divide the area around Ceboruco in a grid of 500 m × 500 m cells and compute the probability of each cell to be invaded by a selected phenomenon. The modeling-based maps shown in Sieron et al. (2019b) serve as prior models in our PyBetVH analysis. We refer the reader to Sieron et al. (2019b) for a detailed description of the methodology, the simulation tools utilized to model each selected phenomenon, and the output maps. As past data we use the area covered by the deposits of past eruptions associated with each phenomenon and eruption size (Sieron et al. 2019a).

We parameterize past data and prior models by assigning a probability for each grid cell based on the frequency of inundation by a volcanic phenomenon during past eruptions (i.e., past data) or as indicated by the numerical modeling presented in the Sieron et al. (2019b) (e.g., if a grid cell was affected by one simulation from a total of three, the prior probability of the grid cell will be 1/3; if there is only one past eruption and a given grid cell was affected, then the prior probability is 1 (1/1).

In Table 2 we present a summary of the hazards considered for each scenario and the PyBetVH input data.

Table 2 Summary of the hazardous phenomena considered in our event tree for each of the three scenarios considered. For past data we have the number of times the selected phenomena occurred at past eruptions. Most past data are considered for Scenario 3 (Plinian eruption) due to the better preservation and mapping of the deposits of the Jala eruption. As prior models we used the maps created by various simulation tools used by Sieron et al. (2019b) for each considered phenomenon

Probability maps

The absolute probability maps (Figs. 3 and 4) produced by PyBetVH show the combined mean probabilities that an area is invaded by a volcanic phenomenon regardless of the eruption size and vent location (i.e., considering the occurrence of an eruption of any size from any vent) within a 1-year time window.

Fig. 3
figure 3

The absolute annual probability maps for: a ballistics; b lava flows and c pyroclastic density currents. Both PDC and lava flows show considerable probabilities for Ahuacatlán and Jala to be impacted should an eruption occur within the next time window (i.e., 1 year). The infrastructure built around Ceboruco will likely be affected by these phenomena. The ballistics map shows highest probabilities within the summit caldera

Fig. 4
figure 4

The absolute annual probability maps for a tephra fallout and b lahars. Ceboruco’s flanks and immediate surroundings have a higher probability of being inundated by a lahar, whereas the areas downstream the Ahuacatlán river show a lower probability. This is assumed to be due to the loss of particles as sedimentation occurs within a lahar and captured in the simulations. Tephra fallout maps rely heavily on the numerical simulations and prior models show that the expected areal distribution of tephra is controlled by the prevailing winds at tropopause levels for a large eruption and by the smaller variable wind fields at lower altitudes for smaller events

Ballistics. We use 540 simulations made with the Eject! Code (Mastin 2001) and field measurements of distances travelled by the ballistics of the Jala eruption (i.e., Scenario 3) (Sieron et al. 2019a) to estimate the posterior probabilities. The simulated parameters included: - ejection angles between 350 and 890; ejection velocities between 150 ms− 1 and 250 ms− 1 and bomb radii of 10 cm to 100 cm. A detailed description of how these simulations were conducted is presented in Sieron et al. (2019b). The map (Fig. 3a) shows the highest probabilities (~ 1–2 × 10− 3) around the summit caldera, decreasing with distance for approximately 5 km. The flanks of Ceboruco up to 5 km away from the vent are uninhabited and several crop fields lie at the base; however, the telecommunications towers located on the NE of the caldera rim will likely be affected by ballistics, with considerable impact on regional communications.

We considered lava flows in scenarios 1 and 2. Figure 3b shows the extent of the lava flows is comparable with the results from the numerical modeling in Sieron et al. (2019b). The areas of higher probability (~ 8 × 10− 4 – 1 × 10− 3) correspond to the areas inundated most often in simulations (i.e., ~ 5–10 km to the north, north-east and south-west) and the areas where past deposits were identified (i.e., north). Areas of very low probability (near zero) on the south-east and west flanks correspond to topographic barriers and have not been affected by simulations (Sieron et al., 2019b). The towns of Ahuacatlán (pop. ~ 9000) and Jala have relatively high probabilities (~ 4–6 × 10− 4) of being affected by lava flows. Smaller communities to the South and West as well as important infrastructure (e.g., major roads) to the South and North (some already built on lava flows) have significant probabilities (~ 6.5 × 10− 4) of being affected.

Pyroclastic density currents (Fig. 3c). We consider both end members of pyroclastic density currents (flows and surges) and use past data (from Jala eruption) and simulations (i.e., Titan2D; Patra et al. 2005) for flows and the Energy Cone module in LaharZ (Schilling 1998) for surges) to estimate posterior probabilities for these hazardous phenomena. Sieron et al. (2019b) simulated pyroclastic flow volumes of 0.025–0.125 km3 and 0.5–1 km3 for Scenario 2 and 3 respectively. The H/L values selected for simulating pyroclastic surges were 0.14–0.17 and 0.07–0.12 for Scenario 2 and 3. A detailed description of the methodology and a full list of the parameters used for these simulations is provided in Sieron et al. (2019b). The PDC map shows the highest probabilities of PDC inundation (> 5 × 10− 4) within 10 km of the vent. The valleys and some interfluves at > 10 km from the vent have a lower probability of inundation attributed to the simulated surges and field data from past large magmatic eruptions, in cases overriding geographic barriers such as the NE edge of the graben. The towns of Ahuacatlán and Jala, both at ~ 10 km from the vent, along with several smaller communities, have relatively elevated probabilities (~ 1–3 × 10− 4) of being affected by PDC deposits, along with existing infrastructure.

Tephra fallout hazard is evaluated for all three eruption sizes. Sieron et al. (2019b) describe in detail the methodology used for simulations of tephra fallout that serve as prior models in our analysis. Figure 4a shows the average annual probability map for the accumulation of at least 10-cm-thick tephra layer, acknowledging that tephra accumulations of < 10 cm will have a wider impact. The distribution of the probabilities is strongly controlled by the spatial extent of tephra in the numerical models since we lack a good set of field data (only one set for Jala eruption, Scenario 3). The widespread tephra deposit to the NE is associated with a large Plinian eruption and it is controlled by the prevailing winds at high altitudes (and illustrated by the numerical simulations). The higher probabilities closer to the volcano are associated with the thickening of tephra deposits near the vent and with sedimentation from lower transient plumes affected by the variable monthly wind field at lower altitudes. However, the monthly wind profiles may hide the daily variability of winds with unusual directions (e.g., Michaud-Dubuy et al. 2021). All communities to the East, South and West of Ceboruco will be impacted by tephra fallout and significant damage is expected to the agricultural lands and the infrastructure in the area.

Lahars (syn-and-post eruptive) are considered for both S2 and S3 eruption scenarios due to the availability of fresh pyroclastic material during such events and the availability of water during the rainy season at Ceboruco or a passing tropical storm. Lahar deposits have been identified in the field and associated with the large Jala eruption (Sieron et al. 2019a). We use the extent of past lahars and simulation with LaharZ as input data in PyBetVH. The average probability map of lahar inundation is shown in Fig. 4b. Unlike Sieron et al. (2019b) who considered lahar hazards as far north as the Grande de Santiago river, we focus our assessment on Ceboruco and its main drainage network and tributaries to the Ahuacatlán river. The highest probabilities are within the drainage network of Ceboruco (> 1.7 × 10− 4) and the towns of Jala and Ahuacatlán, as well as the nearby infrastructure. Marquezado, Uzeta, Tetitlan and the area of the drainage network of the Ahuacatlán river to the SW have relatively high probabilities of being affected by lahars (> 1.7 × 10− 4).

In the Additional Material 1 we present the conditional probability maps for each considered phenomena (i.e., occurrence of hazards is conditional to the occurrence of a specific eruption size from the central vent).

Discussion

The event tree and the probability of eruption

Volcanoes, active or extinct, are inherent sources of hazard conditioned to the occurrence of a specific triggering event (e.g., a magmatic eruption may trigger a PDC or a tectonic earthquake may trigger slope failures and debris avalanches at extinct or long dormant volcanoes). Given the complexity and intrinsic variability of volcanic systems, PVHA (Marzocchi et al. 2010; Marzocchi and Woo 2007) is preferred to a simple deterministic approach based on the extent of deposits of past eruptions or on several numerical simulations. Although we acknowledge that both approaches are complementary and mutually informative (e.g., Tierz 2020; Rouwet et al. 2019) the event tree approach provides a useful tool to visualize and estimate the probabilities of occurrence of various hazards at different time scales (Punongbayan et al. 1996; Marzocchi et al. 2008; Newhall and Hoblitt 2002). To fully benefit from the use of event trees, these have to comprise a wide range of volcanic events, magmatic and non-magmatic. Therefore, they can provide a clear and broad view of the possible hazardous phenomena related to the volcanic system and how these may affect the area during a long-term period. This is potentially of great interest to local authorities and stakeholders who often make decisions regarding long-term investments in the local community. Important infrastructure projects are developed to last for decades and the sites selected for their development must be considered carefully with respect to an acceptable level of hazard even if the probability of occurrence is very small (e.g., Connor 2011). The advantage of a probabilistic approach in hazard assessment is that it can account for all probable events; although, accounting for all possible hazard outcomes can be challenging (Rougier and Beven 2013). Furthermore, a probabilistic approach can account for associated uncertainties (aleatoric and epistemic) and provide an easy way to visualize them.

When direct evidence of occurrence of a specific eruptive phenomenon is not identified in the past deposits, information from analogue volcanoes is sometimes used (e.g., Ogburn et al. 2016). Most volcanoes have very long periods of inactivity (dormancy) and are often considered extinct or less dangerous and their fertile slopes attract cultivation. To complicate forecasting, most active volcanoes are poorly monitored, or not monitored at all (e.g., Loughlin et al. 2015). With an ever-increasing population around active volcanoes, we need to look at a broad range of volcanic processes in order to consider all the ways a volcano might impact us. A volcano poses multiple threats, not only related to a magmatic eruption. Hazards can occur long after the volcano goes extinct. However, our human perspective forces us to consider the most dramatic events in our hazard assessments and those are more often than not related to a magmatic eruption. Here we developed a generic event tree for Ceboruco volcano and try to account for all the possible outcomes regardless of the involvement of renewed magmatism. We attempt to show that estimating probabilities of each outcome can be challenging depending on the assumptions made about the volcano.

Bayesian methods and the introduction of prior distributions is important in hazard assessment, especially when we have a sparse record of events, such as the case for periods of unrest at Ceboruco volcano. When there are many data, the prior distribution has little impact on the probability, but for nodes represented by few observations, the parameters of the prior distribution are of critical importance. Consider two scenarios in which Ceboruco is currently in the state of: a) no unrest, or b) unrest. We explore the effects of the likelihood and prior functions on the posterior probability estimates for the first three nodes of the ET (Fig. 5a, b). When the volcano is not in a state of unrest, such as when we need a long-term forecast, based on our interpretation of the recent eruptive history at Ceboruco (i.e., four unrest episodes with two of magmatic origin) (Fig. 5a) the probability of entering unrest in the next time window is 0.0036 (black line - ‘triangle’ marker, Fig. 5a) and the overall probability of eruption based only on the mean likelihood function is 0.0018 (PLK, Fig. 5a inset). Although unlikely, the fact that the volcano is not monitored extensively and continuously may justify a conservative assumption that after ~ 40 kyrs of quiescence Ceboruco entered unrest before the Jala eruption and continues today (i.e., second scenario). We know this unrest is magmatic. Based on the likelihood function, the overall probability of eruption is 1 (PLK, Fig. 5b inset). Therefore, a hazard analysis based only on the limited geological record is highly biased and poorly informative for such a short time window.

Fig. 5
figure 5

a The probabilities of the first three nodes of the ET given Ceboruco is not currently in unrest. The black line with triangle markers shows the probabilities based on the mean likelihood functions. The blue line with cross markers, represent the probabilities using maximum ignorance at each node plus likelihood functions. The green line with circle marker corresponds to the case in which node 3 uses 64% of unrest leading to eruption (Phillipson et al. 2013) with high uncertainty whereas the red line with plus marker uses the same information but with higher confidence in the data (lower uncertainty). b The probabilities of the first three nodes of the ET given Ceboruco is currently in unrest (i.e., Punrest = 1). The colored lines correspond to the same assumptions at each node as described for the first scenario. The figure insets show the overall probability of eruption for the two scenarios: PLK – probability estimate based on the likelihood function alone; P50 – adding maximum ignorance at each node; Ph-u – adding Phillipson et al. (2013) data at node 3 with high uncertainty (high variance); Pl-u – adding the same data but with more confidence (low variance). NOTE: the probability scale in panel a) is log

In a Bayesian analysis however, if we assume the simplest prior model with maximum ignorance (i.e., μ = 0.5; 𝛬 = 1) at each of the first three nodes, the posterior probability at Node 3 decreases by > 20% (blue line - ‘x’ marker, Fig. 5a, b) but is associated with high uncertainty (σ2 = 0.03 to 0.05). The overall probability of eruption also decreases slightly to 0.0016 in the first scenario and significantly, to 0.66, in the case of the second scenario (P50, Fig. 5a, b insets). These estimates are better than those based only on the likelihood, but they are associated with large uncertainties.

The Phillipson et al. (2013) dataset indicates that approximately 2/3 of stratovolcanoes entering unrest will have an eruption and this information was used in other BET studies (e.g., Sandri et al. 2019; Constantinescu et al. 2016). We can use this information, or information from other data sets, to design a prior function for node 3 by adjusting the α and β parameters of the distribution. With no monitoring data, this information must be used cautiously as adjusting the values for α and β leads to a higher or lower confidence in our data. For example, for Beta (α = 64; β = 36), which indicates a higher confidence in our assumption, the probability at node 3 decreases in both scenarios (σ2 = 0.002) (red line - ‘+’ marker, Fig. 5a, b). Conversely, for Beta (α = 6.4; β = 3.6) the same probability increases again but this time with higher uncertainty (σ2 = 0.03) (green line - ‘o’ marker, Fig. 5a, b). This is also reflected in the overall probabilities of eruption (Pl-u and Ph-u in Fig. 5a, b insets). This shows that although the posterior probabilities estimated via a BET analysis differ substantially to those based on the likelihood function alone, they are still associated with uncertainties that can only be reduced with monitoring data.

Although a commonly accepted time window for volcanic hazard assessment is one-year (e.g., Thompson et al. 2015; Sandri et al. 2014; Newhall and Hoblitt 2002), the annualized probability is often extensible to longer periods. Decisions regarding the development of important infrastructure projects (e.g., bridges, dams, power plants) might benefit of an analysis that considers longer time windows, i.e., 10 years or more (e.g., Connor 2011). This requires a high-quality eruptive history catalogue and the incorporation in the analysis of events that occur at larger time scales even though they may be extremely unlikely (e.g., Marzocchi and Bebbington 2012; Connor 2011). For a 10-year time window the overall probabilities of eruption at Ceboruco will increase ten times if we assume the volcano is not currently in unrest. If Ceboruco is in unrest, the ET can also be parameterized from node 2 and used to estimate the subsequent conditional probabilities (e.g., Queiroz et al. 2008).

All things considered, a PVHA at Ceboruco, or any other under-monitored volcano, is highly sensitive to the definition of unrest and the prior probability estimates, and both depend on geophysical monitoring and/or on a rigorous eruptive catalogue. Without consistent data from a developed and continuous monitoring network to inform the hazard assessment, the final probability estimates are associated with high uncertainties. These uncertainties in both the prior and likelihood functions are propagated to the posterior probability estimates; the only solution to refine the PVHA and deal with the uncertainty is the development of a real-time and continuous monitoring network and to conduct more detailed field studies.

Limitations and advantages of using PyBetVH

The ability to combine multiple sources of information, as well as the possibility of exploring various combinations of eruptive scenarios leads to a better estimate of the uncertainty associated with different hazard maps. Probabilistic hazard maps are usually conducted for a single volcanic phenomenon by using different numerical simulation tools (e.g., Charbonnier et al. 2020; Gallant et al. 2018; Strehlow et al. 2017; Selva et al. 2014). PyBetVH can merge the outputs of different simulation tools with geological data and assess the uncertainty range associated with the average prior probability presented in each map. In Fig. 6, we show an example of the uncertainty range (i.e., 10th, 50th and 90th percentiles) in the conditional probability maps for PDC (conditional to the occurrence of a Plinian eruption (S3). These maps provide an easier way to evaluate our confidence in the hazard maps, a useful feature for decision makers (e.g., Tierz et al. 2018; Thompson et al. 2015; Sandri et al. 2014; Lindsay et al. 2010).

Fig. 6
figure 6

Example of uncertainty in the pyroclastic density currents hazard probability maps for a large Plinian eruption (Size 3). We show the a 10th percentile; b the average, and c the 90th percentile maps for the conditional probability (i.e., conditional to the occurrence of a Plinian eruption at the central vent)

PyBetVH has an easy-to-use graphical interface (Tonini et al. 2015) that allows the user to update the input files as soon as more information becomes available. The current version of PyBetVH was developed to analyze volcanic hazards associated to magmatic unrest only. However, recent efforts helped describe and recognize indicators for non-magmatic unrest and related hazards (Rouwet et al. 2014). The implementation of a non-magmatic branch in the ET is important for short-term forecasting applied to monitored volcanoes (Tonini et al. 2016a).

Although PyBetVH produces separate probability maps for each selected volcanic phenomenon, the user has the option to download the datafiles and plot them using other tools to create an aggregated hazard map that can provide a rapid view of all possible volcanic phenomena at Ceboruco.

Conclusions and final remarks

We conduct a PVHA for Ceboruco, an active but poorly monitored volcano in the western part of the TMVB. This work represents an effort to update the volcanic hazard assessment at Ceboruco by conducting a probabilistic analysis that incorporates different sources of uncertainty. Based on the eruptive scenarios proposed by Sieron et al. (2019b), we create a generic ET for Ceboruco to account for magmatic and non-magmatic activity. For the magmatic eruption branch of the ET, we choose three scenarios: i) small magnitude (effusive/explosive), ii) medium magnitude (Vulcanian/sub-Plinian) and iii) large magnitude (Plinian); with their related hazardous phenomena: ballistics, tephra fallout, pyroclastic density currents, lahars, and lava flows. We use PyBetVH, an e-tool based on the Bayesian event tree methodology, to create probabilistic hazard maps for each of the selected eruption scenarios. Using geological data with outputs from other numerical and theoretical models in PyBetVH, we estimate the probability of a magmatic eruption at Ceboruco within the next time window (i.e., 1 year) of ~ 0.002. The resulting maps show the absolute annual probability for the communities and infrastructure around Ceboruco to be impacted by lava flows, PDC, tephra fallout and lahars. The ballistics will likely impact only the cone area. While the deterministic maps presented by Sieron et al. (2019b) represent a first step towards quantifying volcanic hazard at Ceboruco, are easy to interpret by the general public and are an asset in case of a volcanic emergency, the maps presented here are fundamentally different. These maps do not show just a footprint of the hazard but rather provide probabilities of the areas around the volcano to be affected by potential hazardous phenomena. These maps assign a probability to a particular scenario given the current state of unrest at the volcano (i.e., no unrest). However, should any activity be detected, the current estimates will have to be updated by running the model with the newly available information incorporated in the event-tree. The probability maps can be a useful tool for authorities and stakeholders in the decision-making process. We believe the decisionmakers will find the probabilistic results useful, especially if they are able to compare probabilities with other potential events that may affect the community. For example, globally, in a community, children are the people least likely to die in a given year, with a probability of dying of around 1/10000 with higher rates among children in rural communities (Svenson et al. 1996). A probability of 1/500 or 1/1000 of PDC inundation of their community (Fig. 6) is a high probability for children in a community because it considerably increases their probability of dying in a given year considerably. Of course, it is up to the community to decide which level of probabilities are unacceptable, but such comparisons can provide context and these comparisons rely on valid probabilistic hazard maps.

Our analysis indicates that PVHA at poorly monitored volcanoes relies mostly on the likelihood function (informed by geological data) and/or maximum ignorance for prior distribution functions. These estimates are associated with high uncertainties which can be significantly reduced by refining the prior distribution functions with information from geophysical monitoring (e.g., Wright et al. 2019).

We recognize the numerous efforts by academic and government institutions, and we recommend the set-up of at least a minimal permanent monitoring network at Ceboruco with the capabilities to provide real-time geophysical data continuously. Another important consideration at Ceboruco is to extend the event tree presented here to include the surrounding monogenetic field.