Introduction

Wildfire can prime mountainous terrain for deadly and destructive debris flows (e.g., Kean et al. 2019). These fast-moving flows are triggered by intense rainfall that can entrain sediment by surface-water runoff (Cannon et al. 2001) or by shallow (centimeter- to meter-scale) landsliding (Gabet 2003). In steep catchments that have experienced moderate to high soil burn severity, where the hillslopes have been made bare due to the combustion of vegetation, and the surface and near-surface soil-hydrologic functioning has been altered (Shakesby and Doerr 2006), even relatively modest rainstorms can generate hazardous postfire debris flows compared to the triggering conditions for unburned settings (Cannon et al. 2008). The low (1- to 2-year) recurrence interval of the rainfall conditions known to initiate postfire debris flows in mountainous regions of the western USA (Staley et al. 2020) underscores the hazard potential along developed valley bottoms, floodplains, and fan surfaces downslope from areas that are experiencing increased fire activity (Westerling et al. 2006, 2016). Postfire hazard assessment is also emerging as an area of active research around the world (e.g., Carabella et al. 2019; Bisson et al. 2005; Esposito et al. 2019; Tiranti et al. 2021).

Monitoring-based approaches for early warning of postfire debris flows (e.g., near-real-time measurements of rainfall or flow stage) are largely intractable because the flows can occur within minutes of intense rainfall (Kean et al. 2011; Staley 2018). Consequently, emergency assessments of postfire debris flows in the western USA (USGS 2022) have been geared for compatibility with rainfall forecasting (NOAA-USGS Debris Flow Task Force 2005). Regional rainfall thresholds calibrated to historical events have dominated several approaches for postfire response planning, including debris-flow warning (e.g., Cannon et al. 2008; Staley et al. 2013). Although this type of empirically derived rainfall threshold has proven valuable, it cannot be used to make spatially explicit predictions about debris-flow activity across highly variable geomorphic and burn characteristics within a given fire. To address this limitation, hazard assessment models, which were originally designed to predict the statistical likelihood of postfire debris flows (e.g., Cannon et al. 2010), have been adopted to produce rainfall thresholds associated with a specific probability level for stream segments and drainage basins for the first year after a wildfire (Staley et al. 2017). These kinds of rainfall thresholds can help inform hazard potential for values at risk throughout a burn area (e.g., Cafferata et al. 2021) and may further benefit from analyses that consider the likely increased frequency of postfire debris flows fueled by climate change (Kean and Staley 2021; Oakley 2021).

The US Geological Survey M1 likelihood model (Staley et al. 2016, 2017) is currently used to produce spatially explicit rainfall thresholds for postfire debris flows in the western USA (Fig. 1A). The M1 model produces continuous estimates of debris-flow likelihood (i.e., values ranging from 0 to 1). A probability level equal to 0.5 is typically used for rapid emergency hazard assessment (Swanson et al. 2022) and has been widely applied to represent debris-flow triggering conditions for the first year after a fire in the western USA (USGS 2022). The 15-, 30-, or 60-min rainfall accumulation that corresponds with a likelihood of debris-flow initiation equal to 0.5, or the \({R}_{P50}\), is given by:

$$R_{P50}\;=\;\frac{-\beta}{C_1T+C_2F+C_3S}$$
(1)

where \(\beta\) and \(C_{1,2,3}\) are empirical coefficients (values reported in Staley et al. 2016), \(T\) is the proportion of upslope area with moderate to high soil burn severity and topographic gradients greater than or equal to 23 degrees, \(F\) is the mean upslope differenced normalized burn ratio divided by 1000, and \(S\) is the soil erodibility, or the Kf factor.
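As a minimal sketch of how Eq. 1 could be evaluated for a single stream segment or basin, the Python snippet below computes \({R}_{P50}\) and converts the accumulation to an intensity. The function names and all numeric values are hypothetical placeholders chosen for illustration; the actual coefficients are those reported in Staley et al. (2016) and are not reproduced here.

```python
def rainfall_threshold_p50(beta, c1, c2, c3, t, f, s):
    """Rainfall accumulation (mm) at which the modeled likelihood equals 0.5 (Eq. 1)."""
    denominator = c1 * t + c2 * f + c3 * s
    if denominator <= 0:
        raise ValueError("Eq. 1 needs a positive weighted sum of T, F, and S")
    return -beta / denominator


def accumulation_to_intensity(accumulation_mm, duration_min):
    """Convert an accumulation over a given duration to an intensity in mm per hour."""
    return accumulation_mm * 60.0 / duration_min


# PLACEHOLDER coefficients and inputs for illustration only (assumptions, not the
# published M1 values); substitute the duration-specific values from Staley et al. (2016).
BETA, C1, C2, C3 = -3.0, 0.4, 0.6, 0.7
r_p50 = rainfall_threshold_p50(BETA, C1, C2, C3, t=0.5, f=0.5, s=0.2)
i15 = accumulation_to_intensity(r_p50, duration_min=15)
print(f"R_P50 = {r_p50:.2f} mm in 15 min, or {i15:.1f} mm/h")
```

Because the denominator grows with \(T\), \(F\), and \(S\), steeper, more severely burned, and more erodible basins yield lower thresholds under this formulation.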

Fig. 1

A Map showing burn areas where the US Geological Survey postfire debris-flow hazard M1 likelihood model (Staley et al. 2016, 2017) was trained (Southern California, orange circles), tested (Intermountain West, orange triangles), and applied (western USA, red circles) between 2017 and 2021 (USGS 2022). The location of our study area is shown with a gray circle. B Proportion of emergency hazard assessments (USGS 2022) conducted for Southern California (light orange), the Intermountain West (intermediate orange), and Northern California, Western Oregon, and Western Washington (dark orange) between 2017 and 2021

The empirical model, which is trained with observations from Southern California, has been shown to perform as well as (and in some cases better than) regional rainfall thresholds throughout the Intermountain West (Staley et al. 2017). This level of performance indicates that the input variables that underlie the probabilistic debris-flow hazards model, including topographic steepness, soil burn severity, and soil erodibility, are of widespread significance to postfire debris-flow susceptibility in the western USA. Despite encouraging predictive performance across the Intermountain West, we have an incomplete understanding of how we can expect the M1 model to perform for burn areas spanning from Central California to Northern Washington, a region where demand for hazard assessment has been substantial during the last 5 years (Fig. 1B). Herein, our objective was to evaluate the predictive performance of rainfall thresholds produced by the M1 model for multiple fires that burned along the Central California coast in the summer of 2020 (Fig. 2). The scope of this work included (1) documenting postfire debris-flow activity for an area where postfire debris-flow hazard potential has long been recognized (e.g., Cleveland 1977) and yet no published inventories are available, (2) using this inventory to assess the performance of the M1 model with receiver-operator characteristic analysis, and (3) discussing how these results can guide strategies for future hazard model development.

Fig. 2

Map showing the locations of the 2020 CZU Lightning Complex (blue), River Fire (gray), Carmel Fire (orange), and Dolan Fire (red) along the Central California coast in the Santa Cruz Mountains (SCM) and Santa Lucia Mountains (SLM). The approximate landfall for the 26–29 January 2021 atmospheric river storm (i.e., the southeasterly movement followed by a retrograde to the northwest) is shown with a dashed gray arrow. The black arrows indicate the orientation of intense rainfall bands as they moved onshore and across the burn areas. The timing of the most intense periods of rainfall across the burn areas is reported in a local, 24-h format. An animated radar loop for this storm is available in NOAA (2021) and Kostelnik et al. (2021)

The Central California coast fires and the 26–29 January 2021 storm sequence

This study focused on postfire debris-flow activity in four burn areas on the Central California (USA) coast, including the CZU Lightning Complex (San Mateo–Santa Cruz Unit; hereafter referred to as “CZU”), and the River, Carmel, and Dolan Fires (Fig. 2). These wildfires started (CZU and River by lightning, Dolan by arson, and Carmel by cause unknown) during 16–18 August 2020 and collectively burned 1079 km² (or approximately one-quarter million acres) during a record-breaking year for wildfire in California (Safford et al. 2022). The steep, tectonically active terrain where these four fires occurred has highly variable annual rainfall, vegetation, and rock types (Table 1). The CZU is located in the Santa Cruz Mountains (Fig. 2), where 820 to 1525 mm of annual rainfall (PRISM Climate Group 2022) sustains evergreen broadleaf and needleleaf forests, including old-growth coast redwood (Sequoia sempervirens) forest (NASA 2022). The River, Carmel, and Dolan Fires are located in the Santa Lucia Mountains (Fig. 2). The more arid terrain in the River and Carmel burn areas (420 to 890 and 565 to 630 mm of annual rainfall, respectively) is typified by grasslands, shrublands, and woody savannas. Coastside vegetation at the Dolan Fire (annual rainfall ranging from 870 to 1695 mm) is dominated by evergreen broadleaf and needleleaf forests, including coast redwood. Vegetation in the drier inland zones of the Dolan Fire (annual rainfall ranging from 530 to 1370 mm) is more similar to the River and Carmel Fires. The CZU and Carmel burn areas are typified by plutonic and marine sedimentary rocks (California Geological Survey 2022). The River Fire and Dolan Fire include these rock types, and also a diverse assemblage of metavolcanic and metasedimentary rocks (California Geological Survey 2022).

Table 1 Rainfall, vegetation, and rock types across burn areas that we focused on for this study

Ninety percent of water-year (WY; October through September) rainfall in our study region falls between November and April, with the most intense rainstorms typically arriving between December and February (NOAA 2022). Rainfall accumulations for WY 2020–2021 in the Santa Cruz and Santa Lucia Mountains were below average prior to the arrival of an atmospheric river storm on 26–29 January 2021. Ninety-five percent of California was in drought at this time, including moderate to severe levels within our study region (National Drought Mitigation Center 2022). The lower rainfall intensities associated with minor rain events that preceded the January storm did not produce postfire flood or debris-flow activity, but in some cases, they mobilized ash or caused minor rilling on hillslopes. Rainfall associated with the January atmospheric river storm sequence (Fig. 2) began in the northern reaches of our study region (Fig. 1) on January 26. Atmospheric rivers are narrow, low-altitude storms characterized by high and concentrated integrated water vapor transport due to the advection of heat from tropical latitudes (Zhu and Newell 1998; Ralph and Neiman 2005; Waliser and Guan 2017). A particularly intense but narrowly focused zone of rainfall associated with a narrow cold frontal rainband (e.g., Houze et al. 1976) developed along the leading edge of the storm as it swept southward across the region (NOAA 2021). As producers of some of the highest precipitation intensities from mid-latitude cyclonic storms, narrow cold frontal rainbands have generated extensive and destructive debris flows in recent and recovered postfire regions of California (e.g., Oakley et al. 2018b; Collins et al. 2020). The high-intensity rainfall moved southward at the beginning of the storm on 26 January 2021, prompting flash flood warnings for the CZU, River, Carmel, and Dolan Fires in the early morning hours of January 27. The Dolan Fire had multiple periods of intense rainfall activity throughout the afternoon and evening, after which the main band of rainfall turned back to the north, producing a second pulse of less intense rainfall in the Santa Cruz Mountains that lasted through January 29. WY 2020–2021 concluded without another intense rainstorm or postfire hydrologic response of equal or greater size.

Methods

Field observations

We visited the 2020 CZU, River, Carmel, and Dolan Fires along the Central California coast to visually document the occurrence of debris flows resulting from the 26–29 January 2021 atmospheric river storm (NOAA 2021) and to construct a debris-flow inventory (Tables 2, 3; Thomas et al. 2023). As our objective was to test rainfall thresholds produced by the M1 debris-flow hazard model (i.e., Staley et al. 2016, 2017), we focused on postfire debris-flow activity that caused (or would have been capable of causing) damage to infrastructure or bodily harm. We classified each positive postfire hydrologic case across the four burn areas as a “minor” or “major” response. We defined a minor response as capable of impairing infrastructure function (e.g., deposition or erosion along a road that could be regraded by mechanized earth-moving equipment within a matter of hours) or causing minor bodily injury (e.g., abrasions, sprains, or broken bones). We defined a major response as capable of causing sustained infrastructure impairment (e.g., damage to roads requiring weeks or longer of emergency repair efforts or to residential structures made uninhabitable) or serious bodily injury (e.g., protracted disfigurement or death). We made all of our observations for the CZU, River, and Carmel Fires via ground reconnaissance, supplemented by aerial-based observations at the CZU. At the Dolan Fire, where vehicular access following the storm was more challenging, we used a combination of ground-based mapping and oblique aerial photographs to identify potential debris-flow drainages. All of the positive postfire debris-flow cases that we report here (including for the Dolan Fire) were verified by visiting the sites in person. We distinguished null cases (i.e., non-debris-flow events) as those cases lacking evidence of poorly sorted and matrix-supported levees or clasts embedded in standing vegetation. The null cases often exhibited evidence of fluvial-dominated processes, such as bedload transport that was capable of imbricating cobble-sized sediment.

Table 2 Summary of postfire debris-flow inventories used to train and test the M1 model
Table 3 Summary of the Central California coast inventory that we used to test the M1 model

Rainfall and spatially explicit rainfall thresholds

We paired the coordinate locations in our inventory with rain gage records. Due to the remote nature of some field locations and the substantial footprint of the storm, which radar showed to have bands of heavy precipitation on the order of 25 km in width (Kostelnik et al. 2021), we used the nearest rain gage within an 8-km search radius around each location. Compared to the 4-km search radius used by Staley et al. (2016, 2017), we found that this criterion struck the best balance between spatial variability in rainfall and the number of usable field observations due to the broad footprint of rainfall as confirmed by visual inspection of radar. The mean and standard deviation of the distance between the 131 field observations in our inventory and a rain gage within the search radius are 4 ± 2.5 km. We lack precise reports of flow timing to distinguish triggering rainfall intensities (e.g., McGuire et al. 2021). Therefore, we calculated the peak 15-, 30-, and 60-min rainfall intensities recorded by each rain gage for the storm, which we defined as a period of rainfall separated by at least 8 h of no rainfall (e.g., Staley et al. 2020). At each field observation, we extracted the 15-, 30-, and 60-min rainfall thresholds estimated by the M1 model.
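The storm definition and peak-intensity calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the processing code used for the study: it assumes a rain gage record already resampled to a regular time step of incremental depths, the function names are hypothetical, and the rainfall series is synthetic.

```python
def split_storms(depths_mm, step_min, gap_h=8.0):
    """Split a regular series of incremental rainfall depths into storms
    separated by at least gap_h hours with no rainfall."""
    gap_steps = int(round(gap_h * 60.0 / step_min))
    storms, current, dry_run = [], [], 0
    for depth in depths_mm:
        if depth > 0:
            if current:
                # Keep short dry spells that fall inside an ongoing storm.
                current.extend([0.0] * dry_run)
            current.append(depth)
            dry_run = 0
        else:
            dry_run += 1
            if current and dry_run >= gap_steps:
                storms.append(current)
                current = []
    if current:
        storms.append(current)
    return storms


def peak_intensity(depths_mm, step_min, duration_min):
    """Peak rainfall intensity (mm/h) over a moving window of duration_min."""
    n = int(round(duration_min / step_min))
    if n < 1:
        raise ValueError("duration must be at least one time step")
    best = window = 0.0
    for i, depth in enumerate(depths_mm):
        window += depth
        if i >= n:
            window -= depths_mm[i - n]  # drop the sample leaving the window
        best = max(best, window)
    return best * 60.0 / duration_min


# Synthetic 5-min record (assumption): two bursts separated by 8 h of no rain.
record = [0.0, 1.0, 2.0, 3.0, 1.0] + [0.0] * 96 + [4.0, 0.5]
storms = split_storms(record, step_min=5)
peaks = {d: peak_intensity(storms[0], 5, d) for d in (15, 30, 60)}
print(len(storms), peaks)
```

Windows longer than the storm itself simply return the storm total averaged over the window duration, which is why peak intensity declines as the duration lengthens in this example.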

Model testing

We utilized receiver-operator characteristic analysis (Swets 1988; Fawcett 2006) to evaluate the predictive performance of the rainfall thresholds estimated by the M1 model with respect to the observed peak rainfall intensities and postfire hydrologic response that we cataloged in our inventory. We tested our inventory under two scenarios: (1) minor and major responses were both considered positive cases and (2) only major responses were considered positive cases. For each scenario, we tallied the number of true positives (correctly predicted positive cases), false positives (incorrectly predicted null cases), true negatives (correctly predicted null cases), and false negatives (incorrectly predicted positive cases). To facilitate a consistent comparison with areas in the USA where the M1 model was originally trained (Southern California; Fig. 1A) and tested (Intermountain West; Fig. 1A), we used the threat score (\(TS\)) skill statistic, which is defined as:

$$TS= \frac{TP}{TP+FN+FP}$$
(2)

where \(TP\), \(FN\), and \(FP\) reflect the total number of true positives, false negatives, and false positives, respectively. The \(TS\) ranges from zero to one (with one being a perfect score) and tends to be more risk-averse than other skill statistics used in landslide science (e.g., Mirus et al. 2018; Postance et al. 2018; Thomas et al. 2019), in part because it does not weigh model performance with respect to the number of true negatives. Although true negatives are not considered in the threat score, we tally them to facilitate visual interpretation of measured versus modeled rainfall threshold values. We examined the predictive skill of the M1 model across the four Central California coast burn areas and then compared model performance for this entire region to Southern California as well as to seven regions within the Intermountain West that include Central New Mexico (CNM), the Colorado Front Range (FRCO), Northern Arizona (NAZ), Southern Arizona (SAZ), Southwestern Colorado (SWCO), Southwestern Montana (SWMT), and Western Colorado (WCO). However, we note that the Southern California and Intermountain West cases cannot be similarly classified into minor versus major responses, and thus, we have higher confidence in the comparisons we draw with our inventory when both minor and major responses are considered positive cases.
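The tally and Eq. 2 can be illustrated with a short Python sketch. It assumes that a debris flow is predicted whenever the measured peak intensity meets or exceeds the modeled threshold; the function names and the five observations are hypothetical and serve only to show how the two inventory scenarios change the counts.

```python
def tally(observations, positive_responses):
    """Count TP, FP, TN, and FN for (measured, threshold, response) tuples.

    A positive prediction is assumed when measured >= threshold.
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for measured, threshold, response in observations:
        predicted = measured >= threshold
        observed = response in positive_responses
        if predicted and observed:
            counts["TP"] += 1
        elif predicted and not observed:
            counts["FP"] += 1
        elif not predicted and observed:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def threat_score(counts):
    """Threat score (Eq. 2); true negatives are not part of the statistic."""
    denominator = counts["TP"] + counts["FN"] + counts["FP"]
    return counts["TP"] / denominator if denominator else float("nan")


# Hypothetical observations: (peak I15 in mm/h, modeled threshold in mm/h, response)
obs = [
    (30.0, 20.0, "major"),
    (25.0, 20.0, "minor"),
    (15.0, 20.0, "minor"),
    (22.0, 20.0, "none"),
    (10.0, 20.0, "none"),
]
all_counts = tally(obs, positive_responses={"minor", "major"})   # scenario 1
major_counts = tally(obs, positive_responses={"major"})          # scenario 2
print(all_counts, threat_score(all_counts))
print(major_counts, threat_score(major_counts))
```

In this toy example, the minor response that exceeded its threshold is a true positive in the first scenario and a false positive in the second, which lowers the threat score.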

Results

Field observations

Our postfire hydrologic response inventory for the Central California coast comprises 131 observations, of which 20 are minor responses, 27 are major responses, and the remainder are null cases (Thomas et al. 2023). This inventory facilitated an evaluation of the M1 model for a previously untested region (Fig. 1A) and increased the body of observations available to test the M1 model by 22% (Table 2). Most of our observations came from the Dolan Fire (70%), followed by the CZU, Carmel, and River Fires (18, 8, and 4%, respectively; Table 3). In the next paragraph, we present a brief overview of the response in each burn area. The inventory of responses is provided by Thomas et al. (2023).

At the CZU, the northernmost burn area that we document for this study, there were abundant signs of enhanced runoff. Rill networks formed within soils (Fig. 3A). Low-order, previously unchanneled hollows were scoured up to 0.5 m and channelized by water-dominated flows that exposed root networks (Fig. 3B) and partially remobilized small postfire ravel cones (Fig. 3C). However, we observed only one minor debris-flow deposit within the burn area (Fig. 4A), which flowed onto a road. Approximately 100 km to the south, postfire hydrologic response at the River Fire included more pronounced (2 to 3 m) scouring that left channels susceptible to bank failure (Fig. 3D). Sand to boulder-sized sediment was transported, primarily as fluvial bedload, through channels and across fan surfaces, inundating several residential properties (Fig. 4C). Adjacent to the River Fire at the Carmel Fire, distributed hillslope runoff and erosion processes produced small debris flows with volumes ranging from 10 to 1000 m³ (Smith et al. 2021). Although the data are not sufficiently precise to determine the 15-min triggering intensity, repeat photography from a game camera indicates that debris-flow activity occurred within a 15-min window that included the arrival of the peak rainfall intensities (Smith et al. 2021). Approximately 50 km to the south of the Carmel Fire at the Dolan Fire, the southernmost burn area that we document in this study, widespread debris-flow activity caused extensive damage to California State Route 1 (Fig. 4B). Minor debris flows blocked portions of the highway to emergency vehicle traffic during the first pulse of rainfall in the morning of January 27. Major debris flows transported large (≥ 2 m) boulders, scoured channel walls, and pummeled or removed tree stands, at times producing mud splashes more than 5 m high (Fig. 4D). In high-order stream segments, where flood waters dominated the flow, channels experienced multiple cycles of aggradation and incision (Fig. 3E). In forested terrain, large woody debris was commonly deposited along channel margins (Fig. 3E) and also formed jams within channel beds.

Fig. 3

Examples of A rilling, B channelization, C partial remobilization of postfire ravel, D channel scour and bank failure, and E cycles of channel bed aggradation, wood jams, and fluvial incision resulting from the 26–29 January 2021 storm sequence. Photographs A–C are low-order drainage settings at the CZU Lightning Complex. Photographs D–E are high-order drainage settings in the River and Dolan Fires, respectively. Photographs by Matthew A. Thomas, US Geological Survey

Fig. 4

Examples of A–B minor and C–D major postfire hydrologic response within the CZU Lightning Complex, River Fire, and Dolan Fire resulting from the 26–29 January 2021 storm sequence. White arrows indicate the direction of flow. Photographs A, C–D by Matthew A. Thomas, US Geological Survey, and photograph B by Donald N. Lindsay, California Geological Survey

Rainfall

The peak rainfall intensities for the CZU, River, and Carmel Fires correspond with the first pulse of rainfall, which occurred in the early morning hours of January 27 (Fig. 5). In contrast, sustained atmospheric river activity within the Dolan Fire resulted in higher rainfall totals (Fig. 5E), and the peak intensities did not occur until the second pulse of rainfall, which began in the afternoon hours of January 27. The rain gages associated with our observations in the CZU burn area recorded the widest range of peak 15-min rainfall intensities (14 to 57 mm h⁻¹; blue-shaded region in Fig. 6). The peak 15-min rainfall intensities in the River (11 to 18 mm h⁻¹), Carmel (34 mm h⁻¹), and Dolan (17 to 47 mm h⁻¹) burn areas fall in between what was observed at the CZU and generally increase from north to south (gray, orange, and red shaded regions in Fig. 6). Rainfall intensities also generally decreased as the storm system moved west to east (e.g., Carmel versus River Fire; Figs. 2, 5B–C), likely owing to orographic effects associated with the northwest/southeast orientation of the Santa Cruz and Santa Lucia Mountains (Fig. 2). With the exception of one case in the CZU, the rainfall recurrence interval of the peak 15-min rainfall intensities recorded by the rain gages that we used for this study (Fig. 5) was less than or equal to 1 year (NOAA 2022).

Fig. 5

Time series of 15-min rainfall intensity (I15; solid colored line is maximum; black line is mean) and mean accumulation (dashed colored line) for the 13 rain gages associated with our postfire debris-flow inventory for the Central California coast, including, from north to south, the A CZU Lightning Complex, B River Fire, C Carmel Fire, and D Dolan Fire. Rain gage locations are shown in Fig. 2. Note: there is no solid black (mean) line in C because the number (N) of rain gages is equal to one

Fig. 6

Measured peak 15-min rainfall intensity versus modeled 15-min rainfall thresholds for the CZU Lightning Complex (light blue), River Fire (dark gray), Carmel Fire (orange), and Dolan Fire (red). Colored shading indicates the range of measured and modeled values for each burn area. The “X,” small circle, and large circle indicate conditions where we observed little to no, minor, and major postfire hydrologic response, respectively. For perfect model performance, any point that plots on or above the one-to-one (dashed black) line would be a circle and any point that plots below the one-to-one line would be an “X” symbol

Model testing

We found that approximately 60% of the minor responses and 90% of the major responses were correctly predicted by the M1 model (Fig. 6). Our receiver-operator characteristic (ROC) analysis for the two inventory scenarios that we considered for this study (i.e., minor and major responses versus major responses only) produced mean threat scores equal to 0.35 and 0.25, respectively (Table 4, Fig. 7). Correct predictions (i.e., true positives and true negatives) accounted for about one-half of our ROC cases (Table 4). The primary driver of incorrect predictions was false positives, which accounted for about one-third of our ROC cases.

Table 4 Resultant receiver-operator characteristic metrics when we tested the M1 model with our postfire debris-flow inventory for the Central California coast. The two columns under each metric correspond to the two scenarios for which we test our inventory
Fig. 7

Threat scores for rainfall thresholds estimated by the M1 likelihood model at A 15-, B 30-, and C 60-min durations. Staley et al. (2016) trained the M1 model with data from Southern California (SCA; light orange) and tested the M1 model with data from regions throughout the Intermountain West (intermediate orange). We tested the M1 model with a postfire debris-flow inventory for the Central California coast (CCAC; dark orange) and evaluated performance when minor and major responses are considered positive cases (“All”) and when only major responses are considered positive cases (“Major”). The median threat score is shown with a horizontal gray line. We report regional abbreviations for the Intermountain West in Table 2

When we considered both minor and major responses to be positive cases (i.e., the first scenario under which we test our inventory), we found that the rainfall thresholds produced by the M1 model perform, on average, within approximately 85% of the threat score for Southern California and better than 60% of the previously tested regions of the Intermountain West (Fig. 7). When only major responses are considered positive cases (i.e., the second scenario under which we test our inventory), we found that the rainfall thresholds produced by the M1 model perform, on average, within 60% of the threat score for Southern California and poorer than many of the previously tested regions (Fig. 7). Under both inventory scenarios, we found that the threat score increases slightly for 60-min versus 15-min rainfall durations, owing primarily to a reduction in false positives for the Dolan Fire (Table 4), an area that experienced the most rainfall activity and multiple episodes of high-intensity rainfall (Figs. 2, 5D).

Discussion

Postfire debris-flow generation mechanisms

Throughout the CZU, River, Carmel, and Dolan Fires, we consistently observed evidence of runoff that eroded sediment from hillslopes and channels (e.g., rilling, channelization, and channel bed scour; Fig. 3). We did not see widespread infiltration-induced landsliding that could have fueled debris flows in the first rainy season following these fires, an observation corroborated by the dry antecedent conditions and the weak-to-moderate (i.e., category 1 to 2; Ralph et al. 2019) strength of the atmospheric river storm (Center for Western Weather and Water Extremes 2021). Despite the modest rainfall accumulations, the intensity of the rainfall was sufficient to initiate postfire debris flows, possibly owing to the development of a narrow cold frontal rainband over the burn areas (NOAA 2021), similar to that which has occurred in postfire landscapes elsewhere in California (e.g., Oakley et al. 2018a). We also observed that the runoff-generated debris flows in our study region occurred in channel reaches with upstream contributing drainage areas similar to those of reaches in Southern California and the Intermountain West. For example, the minimum, geometric mean, and maximum contributing areas for postfire debris-flow observations in the burn areas that were originally used for training and testing the M1 model were 0.026, 0.52, and 7.8 km² (Staley et al. 2016, 2017) compared to the 0.02, 0.16, and 4.6 km² that we inventoried for the Central California coast. We note that debris flows can occur in channel reaches with contributing areas larger than 8 km² (e.g., Schwartz 2021), but finding diagnostic evidence of debris flows in these areas is often complicated by the presence of flood-dominated flow processes (e.g., Fig. 3E).

Classification of postfire hydrologic response

We found that the predictive performance of the M1 model (Staley et al. 2016, 2017) for our Central California coast inventory is similar to its performance where the model was trained (Southern California; Fig. 1A) and previously tested (Intermountain West; Fig. 1A). Given the parallels in the postfire debris-flow generation mechanism for these areas, the congruence in model performance is expected. We also observed that the predictive performance of the M1 model drops off substantially (30%) when only major responses for our Central California coast inventory are considered positive debris-flow cases (Fig. 7). The threat score decreases here because the minor responses for the Carmel and Dolan Fires become false positives (Fig. 6; Table 4). This is an expected result given that the M1 model was not developed with these response types in mind and was, instead, geared to estimate the “tipping-point” (likelihood equal to 0.5) conditions for any debris flow, small or large. To meet this goal, all size ranges of debris flows were included in the original empirical database of postfire debris flows (Staley et al. 2016), including smaller events that likely occur more frequently.

Our definition of minor and major postfire hydrologic response can be useful in that it considers hazard potential from a magnitude-based perspective, but it is also qualitative and subjective. The human body and various forms of human infrastructure (e.g., homes, roads, and bridges) have different fragilities (e.g., Kean et al. 2019), and it can be unclear which impacts (or potential impacts) to prioritize when classifying postfire hydrologic response in the field. Another measure of magnitude is the debris-flow volume, which is estimated as part of the US Geological Survey emergency assessment of postfire debris-flow hazards (Gartner et al. 2014; USGS 2022). However, this measure is less appropriate for providing a warning (compared to a rainfall threshold) because the volume estimates scale with drainage area, and a small debris flow that escapes the channel from a small basin can have a major impact on a community.

One way to improve upon the minor versus major response types that we present in this study would be to develop inventories with information about the normalized peak discharge (\({Q}^{*}\)), given by Kean et al. (2016):

$${Q}^{*}=\frac{{Q}_{p}}{{A}_{c}I}$$
(3)

where \({Q}_{p}\) is the peak discharge [L³T⁻¹; L, length and T, time], \({A}_{c}\) is the contributing area [L²], and \(I\) is the characteristic rainfall intensity [LT⁻¹]. For the drainage areas that we consider here, \(I\) is taken as the 30-min rainfall intensity (e.g., Moody 2012; Kean et al. 2016). The resulting normalization is equivalent to the runoff coefficient (Chow et al. 1988). This approach has proven useful to distinguish postfire hydrologic response types, such as floods (\({Q}^{*}\) < 1) and debris flows (\({Q}^{*}\) > 1; Kean et al. 2016). The \({Q}^{*}\) of a clearwater flood must be less than one, a value which represents the response of an impermeable basin to constant rainfall. The \({Q}^{*}\) of a debris flow is greater than one owing to sediment bulking and surge dynamics.
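A brief Python sketch of Eq. 3 shows the unit handling that the normalization requires. The function names, the choice of input units (m³ s⁻¹, km², and mm h⁻¹), and the example values are assumptions made for illustration; the flow-type labels follow the \({Q}^{*}\) < 1 and \({Q}^{*}\) > 1 distinction stated above.

```python
def normalized_peak_discharge(qp_m3_s, area_km2, intensity_mm_h):
    """Dimensionless Q* = Qp / (Ac * I), with all terms converted to SI units."""
    area_m2 = area_km2 * 1.0e6
    intensity_m_s = intensity_mm_h / 1000.0 / 3600.0
    return qp_m3_s / (area_m2 * intensity_m_s)


def flow_type(q_star):
    """Label a response using the Q* = 1 divide between floods and debris flows."""
    if q_star > 1.0:
        return "debris flow"
    if q_star < 1.0:
        return "flood"
    return "indeterminate"


# Hypothetical basin: 1 km2 contributing area and a 30-min intensity of 36 mm/h,
# for which an impermeable basin under constant rainfall would yield 10 m3/s.
q_star_high = normalized_peak_discharge(qp_m3_s=25.0, area_km2=1.0, intensity_mm_h=36.0)
q_star_low = normalized_peak_discharge(qp_m3_s=4.0, area_km2=1.0, intensity_mm_h=36.0)
print(round(q_star_high, 2), flow_type(q_star_high))
print(round(q_star_low, 2), flow_type(q_star_low))
```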

Here, we explored the normalized peak discharge concept within the context of debris flooding (Church and Jakob 2020) in the River Fire and debris-flow activity in the Dolan Fire, both of which would fall under the major postfire hydrologic response type (i.e., capable of causing sustained infrastructure impairment or serious bodily injury). We estimated the peak discharge (\({Q}_{p}\)) for two cases, Limekiln Creek at the River Fire (Fig. 2) and Santa Lucia Creek at the Dolan Fire (Fig. 2), assuming critical flow conditions, which have been shown to be a good approximation for flow velocity in high-gradient channels (e.g., Jarrett 1987; Grant 1997; Moody 2016; Brogan et al. 2017):

$${Q}_{p}={\left(gR\right)}^{0.5}A$$
(4)

where \(g\) is the acceleration due to gravity [LT⁻²], \(R\) is the hydraulic radius [L], and \(A\) is the cross-sectional area of the peak flow [L²]. We calculated \({Q}^{*}\) values equal to 0.6 and 4.3 for Limekiln Creek (River Fire) and Santa Lucia Creek (Dolan Fire), respectively (Table 5). These estimates indicate that sediment and other debris did not play as large a role in amplifying the peak discharge at Limekiln Creek as at Santa Lucia Creek. Clearly, the postfire hydrologic response at Limekiln and Santa Lucia Creeks presents a hazard to human life and infrastructure that warrants consideration, but perhaps, a metric like the normalized peak discharge, which is linked to the likelihood of a flow escaping a channel, would be more helpful than the minor and major response types we tested in this study to screen and rank potential impacts. Future postfire debris-flow inventories aimed to inform hazard potential could prioritize collecting field evidence to approximate the peak discharge. In conjunction with other, more qualitative debris-flow observations (e.g., matrix-supported levees or clasts embedded in standing vegetation), these quantitative observations could facilitate the classification of flow type and possibly serve as training data for rainfall thresholds geared to predict the magnitude of debris-flow events. This approach could facilitate a better identification of those debris-flow events capable of escaping the channel and causing damage. The ability to predict flow type is also of interest to flood control professionals, as mitigation strategies can shift depending on flow composition.
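The critical-flow estimate in Eq. 4 can be sketched in a few lines of Python. The example assumes that the hydraulic radius is computed as the cross-sectional flow area divided by the wetted perimeter, and the surveyed dimensions are hypothetical; they are not the values reported in Table 5.

```python
import math

G = 9.81  # acceleration due to gravity (m/s^2)


def hydraulic_radius(area_m2, wetted_perimeter_m):
    """Hydraulic radius R (m): flow cross-sectional area over wetted perimeter."""
    return area_m2 / wetted_perimeter_m


def critical_flow_peak_discharge(area_m2, radius_m, g=G):
    """Peak discharge (m3/s) from Eq. 4, Qp = (g R)^0.5 A, assuming critical flow."""
    return math.sqrt(g * radius_m) * area_m2


# Hypothetical high-water cross section (assumption): 12 m2 area, 10 m wetted perimeter.
area = 12.0
radius = hydraulic_radius(area, wetted_perimeter_m=10.0)
qp = critical_flow_peak_discharge(area, radius)
print(f"R = {radius:.2f} m, Qp = {qp:.1f} m3/s")
```

Because the velocity term scales with the square root of \(R\), uncertainty in the surveyed cross-sectional area has a larger effect on the estimate than comparable uncertainty in the wetted perimeter.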

Table 5 Hydraulic characteristics that we used to estimate the normalized peak discharge

Interregional applicability of the M1 likelihood model

The development of postfire debris-flow inventories, especially for places where few to no observations are available (Fig. 1A), is an important component of refining our ability to improve the M1 likelihood model or any model intended to predict the occurrence of postfire debris flows. In this study, we collected a new postfire hydrologic response inventory and used it to further test the M1 model and its coefficients (i.e., \(\beta\) and \({C}_{1,2,3}\); Eq. 1), which were trained on observations from Southern California. The calibration targets for the M1 model were limited to Southern California because this is where the inventory data were considered the highest quality with regard to rain gage accuracy, proper identification of hydrologic response, and the location of the response (Staley et al. 2017). A natural next step would be to use these (and future) inventories to evaluate whether the empirical coefficients that comprise the M1 model should be regionalized. To facilitate discussion of this topic and as a first step toward this goal, we used the nonparametric Mann–Whitney U test (Mann and Whitney 1947) to evaluate the null hypothesis that the terrain (\(T\)), fire (\(F\)), and soil (\(S\)) metric distributions among positive debris-flow cases for Southern California, the Intermountain West, and the Central California coast are statistically similar. A p value less than 0.05 for the Mann–Whitney U test would indicate that the differences among the metric distributions are statistically significant. We found that the \(T\) metric (i.e., the proportion of upslope area with moderate to high soil burn severity and topographic gradients greater than or equal to 23°) distributions shift to slightly higher values from Central California to the Intermountain West to Southern California (mean \(T\) equal to 0.53, 0.549, and 0.578, respectively; Fig. 8A).
The p value among these distributions is greater than 0.05, which indicates that the differences are not statistically significant. Similarly, the \(S\) metric (i.e., the soil-erodibility factor) distributions increase slightly from Central California and the Intermountain West to Southern California (mean \(S\) equal to 0.207, 0.207, and 0.216, respectively; Fig. 8C), and their differences are also not statistically significant. These tests indicate that, while Southern California debris-flow cases may host slightly steeper slopes and more erodible source material than the Intermountain West and Central California, the \(T\) and \(S\) distributions are sufficiently similar to justify local calibration to Southern California and broader application to the western USA. However, we found that the \(F\) metric (i.e., the mean upslope differenced normalized burn ratio divided by 1000) distributions exhibit a distinct shift from lower values for Southern California to higher values for the Intermountain West and Central California coast (i.e., mean \(F\) equal to 0.306 versus 0.492 and 0.53, respectively; Fig. 8B), possibly owing to differences in vegetation type (e.g., chaparral-dominated landscapes in Southern California versus more heavily forested areas in our study area) or satellite sensor type (e.g., 10-m Sentinel-based data versus 30-m Landsat-based data). The \(F\) metric distributions for the Intermountain West and Central California coast are statistically similar, and both are significantly (i.e., p value less than 0.05) different from the distribution for Southern California. To contextualize the relative effect of differences in \(T\), \(F\), and \(S\) metrics on debris-flow likelihood, consider theoretical 15-min rainfall thresholds (\({R}_{P50}\); Eq. 1) calculated with the M1 likelihood coefficients and the mean of the \(T\), \(F\), and \(S\) metric values we report for Southern California, the Intermountain West, and the Central California coast.
A 15-min rainfall threshold estimated with the mean \(F\) metric for the Intermountain West and Central California coast could be up to 20% lower than that estimated for Southern California. Owing to the distinct differences in the \(F\) metric distributions, it seems likely that rainfall thresholds estimated by the M1 model for the Intermountain West and the Central California coast would be systematically lower than those for Southern California. Here, the justification for calibration to Southern California and application of the M1 model to places like our study region is less robust. Our work indicates that the \(F\) metric may benefit from regional calibration or replacement by a variable that is less susceptible to variation in vegetation or satellite sensor type.
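The threshold comparison described above can be sketched in a few lines of Python. This sketch assumes that Eq. 1 has the logistic form \(\ln\left[P/(1-P)\right]=\beta +\left({C}_{1}T+{C}_{2}F+{C}_{3}S\right)R\), so that the threshold at likelihood \(P\) is \(R=\left(\ln\left[P/(1-P)\right]-\beta\right)/\left({C}_{1}T+{C}_{2}F+{C}_{3}S\right)\). The coefficient values below are illustrative assumptions for the 15-min duration and should be checked against Eq. 1 and Staley et al. (2017) before any use. The regional means are the values reported in the text, and the function and variable names are hypothetical.

```python
import math


def rainfall_threshold(T, F, S, beta, c1, c2, c3, p=0.5):
    """Rainfall accumulation R at which the modeled likelihood equals p.

    Assumed logistic form: ln(p / (1 - p)) = beta + (c1*T + c2*F + c3*S) * R.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    slope = c1 * T + c2 * F + c3 * S
    if slope <= 0.0:
        raise ValueError("combined metric term must be positive")
    return (math.log(p / (1.0 - p)) - beta) / slope


# ASSUMPTION: illustrative 15-min coefficients; verify against Eq. 1 before use.
COEFFS = dict(beta=-3.63, c1=0.41, c2=0.67, c3=0.70)

# Mean T, F, and S values for positive debris-flow cases, as reported in the text.
REGION_MEANS = {
    "SCA": dict(T=0.578, F=0.306, S=0.216),   # Southern California
    "IMT": dict(T=0.549, F=0.492, S=0.207),   # Intermountain West
    "CCAC": dict(T=0.53, F=0.53, S=0.207),    # Central California coast
}

thresholds = {
    name: rainfall_threshold(**means, **COEFFS)
    for name, means in REGION_MEANS.items()
}

for name, r in thresholds.items():
    change = 100.0 * (r / thresholds["SCA"] - 1.0)
    print(f"{name}: R_P50 = {r:.2f} (change relative to SCA: {change:+.1f}%)")
```

Because \(F\) appears only in the denominator, a higher mean \(F\) lowers the threshold even when \(T\) and \(S\) are nearly unchanged, which is the sensitivity the paragraph above describes.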

Fig. 8

Kernel density estimation plots for the A terrain, \(T\), B fire, \(F\), and C soil, \(S\) metrics associated with postfire debris-flow cases in Southern California (SCA; light orange), the Intermountain West (IMT; intermediate orange), and the Central California coast (CCAC; dark orange). The terrain (\(T\)) metric is the proportion of upslope area with moderate to high soil burn severity and topographic gradients greater than or equal to 23°, the fire (\(F\)) metric is the mean upslope differenced normalized burn ratio divided by 1000, and the soil (\(S\)) metric is the soil-erodibility factor (Eq. 1; Staley et al. 2016, 2017)
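The pairwise comparisons behind the distributions in Fig. 8 rely on the Mann–Whitney U test. As a rough, dependency-free sketch of that test, the Python snippet below computes the U statistic and a two-sided p value using the normal approximation with a tie correction and no continuity correction. The sample values and all names are hypothetical and are not the inventory data. In practice, a library routine such as `scipy.stats.mannwhitneyu` would usually be preferred, particularly for small samples where the normal approximation is coarse.

```python
import math
from collections import Counter
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0.0:
        return u, 1.0  # all pooled values identical
    z = (u - n1 * n2 / 2) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical F-metric samples for three regions (illustrative values only).
samples = {
    "SCA": [0.21, 0.28, 0.30, 0.33, 0.35, 0.39, 0.27, 0.31],
    "IMT": [0.41, 0.47, 0.52, 0.49, 0.55, 0.44, 0.50, 0.53],
    "CCAC": [0.45, 0.51, 0.58, 0.54, 0.49, 0.56, 0.60, 0.52],
}

for a, b in combinations(samples, 2):
    u_stat, p_value = mann_whitney_u(samples[a], samples[b])
    verdict = "significant" if p_value < 0.05 else "not significant"
    print(f"{a} vs {b}: U = {u_stat:.1f}, p = {p_value:.4f} ({verdict})")
```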

Summary and conclusion

We aimed to improve our understanding of postfire debris-flow hazard along the Central California coast by collecting field evidence of postfire hydrologic response across multiple burn areas following an atmospheric river storm. The rainfall associated with this storm triggered a broad spectrum of postfire hydrologic responses. Enhanced surface-water runoff mobilized sediment from hillslopes and channels to produce small (but morphologically distinct) debris flows, nuisance to destructive debris floods, and violent, potentially life-threatening, debris flows. We used these observations to compile the first postfire hydrologic response inventory for this region and to test the predictive performance of the US Geological Survey M1 likelihood model, a tool that presently underlies the emergency assessment of postfire debris-flow hazards in the western USA.

We found that the M1 model produces rainfall thresholds that perform as well as or better than they do in most other regions where the model has been tested. However, when we leveraged the wide range of postfire hydrologic responses that we observed to distinguish between minor and major response types, we found that the predictive performance of the M1 model decreases substantially. The M1 model was designed to identify the minimum threshold (i.e., err on the side of safety) and therefore has a high false positive rate. Ideally, rainfall thresholds could distinguish between minor and major events because they require different levels of response. Our results underscore that the problem of false positives is a challenge for developing accurate rainfall thresholds for the occurrence of postfire debris flows (e.g., Staley et al. 2013, 2017). We also identified several instances of false negatives, wherein debris flows occurred for rainfall below the assumed rainfall threshold. We conclude that the collection of field-based surface hydraulic metrics across a wide variety of burn areas could be used as training data to facilitate the delineation of postfire hydrologic response types from a more quantitative perspective. A model that can predict the likelihood of low versus high discharge in debris-flow source and transport areas would also complement debris-flow volume models geared to inform the inundation hazard in depositional zones.

As wildfire activity increases throughout mountainous areas of the USA, so too will the demands for the assessment of postfire debris-flow hazards, including in previously untested areas such as our study area on the Central California coast. We conclude that it will be critical to collect additional field-verified inventories of postfire hydrologic response (e.g., De Graff et al. 2022; Swanson et al. 2022) and subject these observations to statistical (or other objective) tests for prioritizing which model variables may be suitable candidates for regional calibration or replacement.