Introduction

How readily would you take a chance on forecasting the outcome of an uncontrolled experiment using incomplete data on a process you cannot see? When volcanologists are asked to forecast an eruption, this is precisely the challenge they face: to anticipate from indirect signals of unrest the possibility of molten rock forcing its way to the surface from an unknown set of starting conditions in a location that is hidden at least several kilometres underground. In spite of pioneering work carried out in the first decades of the twentieth century (Omori 1920; Minakami 1960; Tokarev 1971), the odds of success 50 years ago were far from encouraging and forecasts remained little better than guesswork (Tazieff and Sabroux 1983; Tilling 1989). Today, forecasts are merely uncertain (Marzocchi and Bebbington 2012; Bell et al. 2015; Sobradelo and Martì 2015; Kilburn 2018). The change seems a minor step forward. In reality, it is a giant leap. Past guesswork was born from ignorance of how volcanoes behave beneath the surface; modern uncertainty is based on understanding why that behaviour can be unexpected.

The distinction reflects fundamental progress in our knowledge of the physical processes that operate within magmatic systems, an advance evident in changes in the topics covered in English-language textbooks and edited volumes (e.g. compare Rittmann (1962), Macdonald (1972) and Williams and McBirney (1979) with Breitkreuz and Rocchi (2018), Edmonds et al. (2019) and Papale (2020)). One practical consequence has been the emergence of new strategies for forecasting eruptions from preceding signs of unrest. In this context, we use the term “forecast” to mean the likelihood of an eruption within a specified time interval in the near future (commonly days, but possibly weeks) based on interpretations of ongoing unrest. Swanson et al. (1985a, b) proposed a similar definition for a “prediction”, reserving a “forecast” for longer term assessments that extrapolated past frequencies of eruption; today, however, “forecast” is often applied to both short- and long-term assessments (GVP 2021). Established forecasting methods are rooted in sophisticated statistical analyses. New understanding has since advanced to the point that, when combined with established methods, physics-based constraints have the potential to make forecasts more reliable. The potential is illustrated here for the case of volcanic unrest after long repose; however, analogous applications can be anticipated for forecasts at all types of volcano, regardless of how often they erupt.

When an emergency begins, the questions normally asked first are whether and when an eruption will occur. They are especially pressing at volcanoes that have been quiet for several generations, because little or no information will normally be available for comparing unrest with previous episodes. Evaluating the potential for eruption then relies on analogy with emergencies at apparently similar volcanoes elsewhere (Marzocchi and Bebbington 2012; Sobradelo and Martì 2015) or on generic models of unrest that have yet to be tested against more than a handful of examples (Voight 1988; Yokoyama 1988; McNutt 1996; De la Cruz-Reyna and Reyes-Davila 2001; Kilburn 2018). Long repose also favours a volcano remaining unmonitored (Tilling 1995; Sparks et al. 2012), so that forecasts must additionally be based on data from emergency monitoring networks installed after unrest has begun. Classic examples include the reawakening of Mount St Helens, USA, in 1980 (Lipman and Mullineaux 1981; Newhall and Punongbayan 1996a); of Mt Pinatubo, Philippines, in 1991 (Newhall and Punongbayan 1996a,b); and of Chaitén Volcano, Chile, in 2008 (Carn et al., 2009; Lara et al. 2013), after respective repose intervals of 123, more than 500 and possibly 350 years. A first requirement for advancing emergency forecasts, therefore, is that they do not depend on knowing the previous states of the volcano.

The rise of probabilistic reasoning

Local seismicity and ground movement have been the staple precursors to eruptions since the 1970s (Tazieff and Sabroux 1983; McGuire et al. 1995; Scarpa and Tilling 1996; McNutt 2005; Dzurisin 2007; Chouet and Matoza 2013; Gottsmann et al. 2019; Papale 2020). Early forecasts were qualitative and strongly influenced by the personal experience of the forecaster (Tazieff 1977, 1979). Subsequent compilations of emergency data have allowed patterns to be recognised in how individual signals vary with time (Swanson et al. 1983, 1985a, b; Cornelius and Scott 1993; Cornelius and Voight 1995) and how contemporaneous time series vary with each other (Sobradelo and Martì 2015). These have been integrated into probabilistic tools, such as event trees, for estimating the likelihood of an eruption (Newhall and Hoblitt 2002; Marzocchi et al. 2008; Marzocchi and Bebbington 2012; Sobradelo et al. 2014), with uncertainties quantified through methods such as expert elicitation to account for human bias (Aspinall 2010; Donovan et al. 2014; Bebbington et al. 2018); the approach is similar to that being developed for probabilistic volcanic hazard assessment (PVHA; Tierz 2020).

By analysing combinations of precursory trends, probabilistic reasoning and statistics have enhanced the quality of forecasts compared with methods based on individual signals. They nevertheless remain empirical, because specific correlations found at one volcano cannot automatically be applied to another (Bell et al. 2011, 2013; Boué et al. 2016; Vasseur et al. 2015). This applies especially to recognising thresholds in the value of a precursor before an eruption is believed to be imminent. For example, how much ground movement can be expected? Or how many local earthquakes? Does the critical value for a signal depend on the design of the monitoring network and the sensitivity of its instruments? Are the critical values for different precursors inter-related? The fact is that the absolute numbers vary from case to case, so that each set of critical thresholds can only be used at the volcano from which they were obtained. A second requirement for advancing emergency forecasts, therefore, is that they do not depend on the particular volcano that happens to be restless.

The role of deterministic reasoning

Volcanic behaviour is frequently presented as “complex” and “highly non-linear” (Sparks 2003; Sobradelo and Martì 2015; Rouwet et al. 2019; Palmer 2020). Such descriptions are most relevant when anticipating the style and scale of an eruption, or how that style may fluctuate with time (Sparks 2003). They may also be convenient for managing expectations about the reliability of eruption forecasts. Caution, though, is necessary, because the descriptions run the risk of cementing a mindset that volcanoes are too complex to generate pre-eruptive behaviour that is systematic, general and open to deterministic study. Indeed, even complex and non-linear processes are capable of yielding forecastable patterns of behaviour (Main 1996).

The success of statistical approaches has distracted attention from exploring physical limits on the behaviour of pre-eruptive phenomena and how we can create a new generation of high-quality forecasts by combining empirical data with new quantitative physical models. After long quiescence, for example, eruptions are commonly preceded by a year or less of elevated unrest (White and McCausland 2016; Stix 2018); the cumulative seismic moment of volcano-tectonic (VT) earthquakes increases with the volume of the pressurizing source (White and McCausland 2016), while the total amount of seismic energy released is on the order of 10^10–10^11 J (Yokoyama 1988); and rates of local VT seismicity and ground deformation appear to increase hyperbolically with time several days before an eruption (Voight 1988). Such repeatable behaviour suggests the action of a common set of controls on precursory signals, regardless of a volcano’s location. To move beyond statistical analyses, therefore, the set of controls needs to be quantified in terms of known physical processes.

The procedure is well illustrated at volcanoes reawakening after long repose, when magma usually has to break open a new pathway through the crust before it can erupt. This constraint allows ground deformation and VT seismicity to be used as quantitative measures of changing stress in the crust and, hence, to conditions when the stress exceeds the crust’s strength and allows a new fracture to open. The crust is stretched by an increase in pressure at depth, normally attributed to some combination of the pressurization of an existing magma chamber, the growth of a new magma intrusion or the disturbed flow of hot fluids in a hydrothermal system (Gudmundsson 2020; Acocella 2021). Deformation is dominated at first by elastic (and theoretically aseismic) stretching of unbroken rock but later by brittle fracture and slip along faults (Kilburn 2018). The change occurs when unbroken rock cannot deform any further without cracking, so marking the approach to conditions under which a major fracture can form.

Elastic deformation and fault slip both contribute to ground movement, whereas VT seismicity is determined by fault slip alone. The growing importance of brittle deformation can thus be followed by how the amount of VT seismicity increases with ground movement and time. A common strategy is to relate the potential for eruption to a critical rate of VT seismicity (Tilling 2008). The chosen threshold depends on the sensitivity of the seismic monitoring system and data processing techniques, the existing distribution of faults in the crust and the size and shape of the stress field that is created above the pressurizing source (which, in turn, depends on the size and shape of that source, as well as the shape of the volcanic edifice). As a result, it is unlikely that any two volcanoes will share the same set of absolute thresholds before an eruption. An alternative is to express thresholds in terms of relative change, for example by expecting rupture when the rate exceeds its initial value by more than a critical amount. However, the same critical amount can be applied only if measurements begin when the volcanoes are in the same starting condition, which is another unlikely coincidence, especially when systematic observations are first made at unspecified times after the start of unrest. As a result, particular values of critical threshold cannot simply be transferred from one volcano to another.

A more reliable measure emerges from the self-accelerating nature of fracture growth. The rate of fracturing depends on how much the rate at which stress is being supplied exceeds the rate at which it can be absorbed elastically. It does not depend on the amount of crust being deformed. As the crust approaches conditions for rupture, the rate of VT seismicity tends to increase hyperbolically with time (equivalent to a linear decrease in inverse rate with time (Voight 1988)). The onset of rupture is thus preceded not by a particular value of the rate, but by a particular style with which the rate increases (Voight 1988; Cornelius and Voight 1995). Forecasts can be based on extrapolating the shape of the increasing trend to effectively infinite values, at which stage fracture growth is uncontrolled and creates a new rupture. They no longer depend on knowing the rates before (so-called baseline data) or at the start of unrest and so can realistically be attempted even when data are first collected after an emergency has begun. Moreover, the same limiting rate applies regardless of the volcano in question, so that the method can be transferred from one case to the next.
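
The inverse-rate extrapolation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any cited study: the function name `forecast_rupture_time`, the ordinary least-squares fit and the synthetic, noise-free data are all hypothetical choices made here for clarity.

```python
import numpy as np

def forecast_rupture_time(times, rates):
    """Estimate rupture time by extrapolating the inverse VT event rate to zero.

    times: observation times (e.g. days since monitoring began)
    rates: VT event rates at those times (events per day), all positive
    """
    times = np.asarray(times, dtype=float)
    inverse_rates = 1.0 / np.asarray(rates, dtype=float)
    # Fit a straight line: inverse_rate = slope * t + intercept.
    slope, intercept = np.polyfit(times, inverse_rates, 1)
    if slope >= 0:
        raise ValueError("inverse rate is not decreasing; nothing to extrapolate")
    # Zero inverse rate (an effectively infinite rate) is reached at this time.
    return -intercept / slope

# Synthetic, hypothetical data: an ideal hyperbolic increase, rate = k / (t_f - t),
# with an assumed rupture time t_f of 10 days.
t_f_true, k = 10.0, 50.0
t = np.linspace(0.0, 8.0, 9)
rate = k / (t_f_true - t)

full_record = forecast_rupture_time(t, rate)
late_start = forecast_rupture_time(t[4:], rate[4:])  # monitoring begins on day 4
print(full_record, late_start)
```

In this idealised case both calls recover a rupture time close to 10 days, whether the record starts on day 0 or day 4. Only the shape of the trend is used, which is why the estimate does not require baseline data. Real VT records are noisy and the choice of fitting method matters, so the sketch should be read as a statement of the principle only.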

The VT trends before rupture are a natural result of elastic-brittle behaviour and can be expressed in terms of normalised stress and strength, namely, the ratio of stress lost by fracturing to stress stored in the crust, which describes the amount of fracturing, and the ratio of the crust’s tensile strength to modulus of elasticity, which describes how much deformation can occur before rupture (Kilburn 2018). The control of ratios, rather than the sizes of individual parameters, reflects the scale-independent properties of fracturing and, as anticipated by Voight (1989), the patterns of precursory behaviour in the field can be investigated quantitatively with rock-physics experiments in the laboratory. Experiments performed under known conditions can then be used not only to reproduce observed precursory trends, but also to reveal constraints on their underlying controls. For example, the forms of accelerating VT event rate commonly observed are found to emerge when the rate of stress supply is constant (Hao et al. 2016, 2017; Kilburn 2018). A constant supply rate is not obligatory, but is controlled by dynamical conditions in the pressure source. Natural dynamical systems establish steady rates to minimise energy loss and, beneath volcanoes, may indicate that pressurization in the crust is governed by steady rates of magma supply from depth. Deterministic reasoning thus bridges the gap between identifying general methods for interpreting precursory trends and explaining why volcanoes operate under restricted conditions in Nature.

An integrated framework

In addition to the advances in statistical analyses (Aspinall et al. 2003; Marzocchi and Bebbington 2012; Sobradelo et al. 2014; Sobradelo and Martì 2015; Sandri et al. 2019), progress in forecasting eruptions is being driven by a greater confidence that physics-based methods can enhance a purely probabilistic approach (De la Cruz-Reyna et al. 2008; Hao et al. 2016, 2017; NASEM 2017; Kilburn 2018; Rouwet et al. 2019). At long-quiescent volcanoes, the new confidence follows a shift in focus from unravelling the complex details of magmatic processes to the comparative simplicity of evaluating their combined influence on stress in the crust (Kilburn 2018; Roman and Cashman 2018).

Strictly, however, the new approach forecasts not the time of eruption, but the time when the crust breaks. Whether an eruption will follow depends on the ability of magma to utilise the new pathway and to continue its propagation to the surface. Potential outcomes range from an eruption almost immediately (Kilburn and Voight 1998; Kilburn 2003) or after several weeks (Syahbana et al. 2019) to no eruption at all, when magma stalls along the pathway (Hincks et al. 2014) or is unable to enter the pathway, because it has formed too far from the magma body (Kilburn 2018; Roman and Cashman 2018). Hence, although rupturing is necessary for an eruption after long repose, it is not a guarantee that an eruption will occur.

Distinguishing the outcomes may seem a challenge that can only be addressed with statistics based on the relative frequency with which each has occurred. Improving forecasts of rupture may have thus simply delayed the stage at which probabilistic analyses are deployed. The problem to be solved, however, has changed from estimating the probability of eruption directly from multivariate trends among precursory signals to estimating the probability of eruption from precursory trends given that rupture has occurred. The added condition simplifies analyses through Bayesian methods (Newhall and Hoblitt 2002; Marzocchi et al. 2008; Sobradelo et al. 2013). In other words, the deterministic reasoning applied to forecasting rupture also serves to constrain more tightly the subsequent probabilistic forecasts of eruption. Evaluating deterministic limits on different outcomes has thus the potential to constrain probabilistic forecasts still further.
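
The effect of conditioning on rupture can be illustrated with a short Python sketch. It assumes, as stated above for volcanoes reawakening after long repose, that rupture is a necessary precondition for eruption, so that the probability of eruption is the product of the probability of rupture and the probability of eruption given rupture. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
def eruption_probability(p_rupture, p_eruption_given_rupture):
    """Probability of eruption when rupture is a necessary precondition."""
    for p in (p_rupture, p_eruption_given_rupture):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_rupture * p_eruption_given_rupture

# Hypothetical values, for illustration only.
p_e_given_r = 0.6

# Before rupture is identified, both terms are uncertain.
before = eruption_probability(0.5, p_e_given_r)

# Once rupture has been identified deterministically, only the conditional
# term remains to be estimated probabilistically.
after = eruption_probability(1.0, p_e_given_r)
print(before, after)
```

Once rupture has been established, the first factor is fixed at one and the remaining uncertainty is confined to the conditional term, which is the sense in which the deterministic step narrows the probabilistic one.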

The outcomes are controlled by the structure and geometry of a volcanic feeding system (for both magma bodies and the crust) and whether and how the pressures driving magma ascent can overcome the strength of the crust and a magma’s resistance to motion. Even when the conditions are unknown in detail, physical arguments and dimensional analysis can define limits to the rates and timescales of controlling processes and connect these with the corresponding characteristics of observable precursors. Further constraints may emerge from analytical and numerical models of specific magmatic processes, such as mechanisms for creating and pressurizing magma bodies (Jellinek and DePaolo 2003; Cashman et al. 2017; Huber et al. 2019) and for controlling rates of magma ascent (Rivalta et al. 2015; La Spina et al. 2019).

The physically defined limits reveal conditions that apply to volcanoes in general, although behaviour within the limits may vary between volcanoes. Consider, for example, the simple case when a pressurizing magma body breaks the crust and is connected to the new rupture. An eruption follows when the time to reach the surface is shorter than the time for magma to solidify. This is favoured by a shallower pressure source and a faster rate of ascent, which, for a given magma viscosity and vesicularity, is promoted by faster rates of pressure increase in the magma body. Hence, slower rates of ground movement are expected to favour longer delays between rupture and eruption and, when they are extremely slow, even to indicate that an eruption might not occur at all. The result may be a new and general application for relating rates of ground movement to the timing of an eruption, but with the particular form of quantitative connection differing for each volcano and remaining in the province of probabilistic analysis. Nevertheless, the probabilistic assessment will be more precisely defined as the conditional probability of eruption given that two deterministic conditions have now been satisfied, namely, that rupture has occurred and that the ground is continuing to move at a particular rate. Further refinements may then be possible by relating the observed movements to outputs from numerical models of magma pressurization and ascent (Jellinek and DePaolo 2003; Rivalta et al. 2015; Huber et al. 2019; La Spina et al. 2019). This example is not exhaustive, but illustrates how physical reasoning and modelling can help to anticipate scenarios that may be observed in the field.
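
The timescale comparison in the simple case above can be written as a short Python sketch. It encodes only the stated criterion, that an eruption follows when the ascent time is shorter than the solidification time. The function name, the treatment of ascent speed as constant and every numerical value are assumptions made here for illustration; in practice both timescales would come from physical models or observations.

```python
def eruption_expected(source_depth_m, ascent_speed_m_per_s, solidification_time_s):
    """Return True if magma can reach the surface before it solidifies."""
    if ascent_speed_m_per_s <= 0:
        return False  # magma that is not rising cannot reach the surface
    ascent_time_s = source_depth_m / ascent_speed_m_per_s
    return ascent_time_s < solidification_time_s

# Hypothetical values: a source at 5 km depth and a solidification time of 1e6 s.
fast_ascent = eruption_expected(5000.0, 0.01, 1.0e6)   # ascent time 5e5 s
slow_ascent = eruption_expected(5000.0, 0.001, 1.0e6)  # ascent time 5e6 s
print(fast_ascent, slow_ascent)
```

With these illustrative numbers the faster ascent satisfies the criterion and the slower one does not, mirroring the argument that slower rates favour longer delays or no eruption at all.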

A new framework that integrates deterministic and probabilistic reasoning is an exciting prospect for improving the quality of eruption forecasts. The idea is similar to initiatives in computer science that have been successful for evaluating the reliability of deterministic numerical models (DAKOTA 2021). It has the scientific advantage of clarifying which elements of precursory behaviour can quantitatively be transferred among similar volcanoes. Thus, deterministic reasoning can take advantage of the scale-independent features of fracturing to investigate in the laboratory how precursory behaviour changes under a range of stress histories broader than those covered by available field data; to extrapolate trends to different timescales in the field—for instance, connecting patterns typically seen at intervals of a year or less to those evolving over years to decades (De la Cruz-Reyna et al. 2008; Parks et al. 2012; Robertson and Kilburn 2016; Kilburn et al. 2017; Stix 2018; Bell et al. 2021); and to use rates and durations of signals to seek distinctions between pre-eruptive and intrusive unrest (Sturkell et al. 2003; White and McCausland 2019), between magmatic and hydrothermal pressure sources (Pritchard et al. 2018; Sandri et al. 2019) and even between styles of eruption (Cassidy et al. 2018).

An integrated framework brings also a social advantage. Progress in forecasting is not always obvious to non-specialists, who may see uncertainty in forecasts as a sign of professional inadequacy. The credibility of specialists is called into question and rumours may emerge that a forecast is driven by political or economic motives and not by scientific study (Tazieff 1977; McBirney 2004; Donovan et al. 2014; Solana et al. 2018; Longo 2019). Even without conspiracy theories, the excuse that volcanoes are complex is hardly reassuring. A deterministic, physics-based component provides a rational structure for explaining why forecasts are not perfect. Explanations favour understanding; understanding favours engagement; and engagement favours a successful outcome during an emergency. The framework is thus valuable not only for improving the quality of forecasts, but also for reinforcing the trust of non-specialists in warnings of eruption.

Looking forward

Four combinations of expectation and outcome describe the success or failure of a forecast. Success follows when (1) an eruption is forecast and does occur or (2) an eruption is not expected and does not occur, whereas failure results when (3) an eruption is forecast but does not occur or (4) an eruption is not expected and does occur. An oft-cited dilemma for decision-makers is to choose between the consequences of incorrect forecasts, especially when evaluating the need to evacuate an area under threat (Marzocchi and Woo 2007). Less regard is paid to the optimistic view of a successful forecast. Although prudent operationally, a focus on being wrong is reinforced by uncertainty and a lack of confidence in forecasts. A key goal for the coming decades is to raise confidence that forecasts are becoming more reliable. Introducing physics-based criteria is a natural contribution towards meeting this goal.
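
The four combinations can be tallied with a short Python sketch. The labels "hit", "correct_negative", "false_alarm" and "miss" are borrowed from general forecast-verification usage and are not terms used in this article; the record of forecasts and outcomes is hypothetical.

```python
from collections import Counter

def classify_forecast(forecast, occurred):
    """Return which of the four expectation/outcome combinations applies."""
    if forecast and occurred:
        return "hit"               # (1) forecast and occurred: success
    if not forecast and not occurred:
        return "correct_negative"  # (2) not expected, did not occur: success
    if forecast and not occurred:
        return "false_alarm"       # (3) forecast but did not occur: failure
    return "miss"                  # (4) not expected but occurred: failure

# Hypothetical record of (eruption forecast?, eruption occurred?) pairs.
record = [(True, True), (True, False), (False, False), (False, True), (True, True)]
tally = Counter(classify_forecast(f, o) for f, o in record)
success_fraction = (tally["hit"] + tally["correct_negative"]) / len(record)
print(dict(tally), success_fraction)
```

Keeping the two kinds of success separate from the two kinds of failure makes it possible to report how often forecasts succeed, and not only how they fail.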

The approach requires acceptance that volcanoes really do show patterns of pre-eruptive behaviour that are systematic and repeatable and that measurable precursory signals can be linked quantitatively to the physical processes in operation (Voight 1988; McNutt 1996; Kilburn 2018; White and McCausland 2019). Once the patterns have been established, additional procedures can be developed for identifying and understanding deviations from the general trends: the essential point is to have confidence that the general trends exist.

We have demonstrated the approach by using the physics of fracturing to understand the mutual dependence of just two precursory signals, VT seismicity and ground movement, at volcanoes reawakening after long repose. With the growth of machine learning and artificial intelligence, new opportunities are emerging for revealing mutual dependencies among a broader range of precursors at both frequently and rarely erupting volcanoes (Anantrasirichai et al. 2019; Gaddes et al. 2019; Sun et al. 2020; Carniel and Guzmán 2021). Physics-based procedures are thus well placed to advance quantitative analyses of (1) additional physical precursors and their causes, such as long-period earthquakes, tremor and seismic b value (McNutt 1996, 2005; Chouet and Matoza 2013; Bean et al. 2013; Roberts et al. 2015; Chardot et al. 2015; Dempsey et al. 2020; Butcher et al. 2021); (2) relations between the composition and flux of volcanic gases and the physical state of subsurface magma (Chiodini et al. 2016; Liu et al. 2020; Moretti et al. 2020; Raponi et al. 2021); (3) connections between observed patterns of unrest and those expected from numerical models of magmatic processes (NASEM 2017); (4) forecasts of the likely size and style of an eruption (Tazieff 1979; White and McCausland 2019); and (5) forecasts of eruption and changes in eruptive behaviour at frequently erupting and open-system volcanoes (Brancato et al. 2011; Carrier et al. 2015; Bell et al. 2017; Neuberg et al. 2018; Gaunt et al. 2020). All these goals are within our grasp and a test of progress in the next 20 years will be how far we have transformed today’s uncontrolled volcanic experiments into tomorrow’s forecastable events.