Global debris flow susceptibility based on a comparative analysis of a single global model versus a continent-by-continent approach

Debris flows, and landslides in general, are worldwide catastrophic phenomena. As world population and urbanization grow in magnitude and geographic coverage, focus, research, and modeling need to extend to continental and global scales. Although debris flow behavior and parameters are local phenomena, sound generalizations can be applied to debris flow susceptibility analyses at larger geographic extents based on these criteria. The focus of this research is to develop a global debris flow susceptibility map by modeling each continent separately and the world as a single global model, and to determine whether the global model adequately represents each continent. Probability Density, Conditional Probability, Certainty Factor, Frequency Ratio, and Maximum Entropy statistical models were developed, using fourteen environmental factors generally accepted as the most appropriate debris flow predisposing factors, and evaluated to identify the best-performing model. Global models and models for each continent were then developed and evaluated against verification data. The comparative analysis demonstrates that a single global model performs comparably to or better than individual continental models for a majority of the continents, resulting in a debris flow susceptibility map of the world useful for in-depth research, international planning, future debris flow susceptibility modeling, and the determination of societal impacts.


Introduction
Debris flows, and landslides in general, are worldwide catastrophic phenomena (Campbell 1974; Brabb et al. 1999; Brighenti et al. 2013; Dowling and Santi 2013; Froude and Petley 2018). Because of limitations in funds, time, and data, as well as focused interest in specific locales, landslide research is often necessarily directed at risk analyses, hazard assessments, and mitigation efforts associated with known landslide sites at a local level (Reichenbach et al. 2018). Much research on debris flows has been conducted at the local level, based on direct field surveys and diverse statistical methodologies, and has resulted in considerable knowledge about predisposing factors, triggers, and return rates associated with debris flows worldwide (Wilford et al. 2004; Leoni et al. 2009; Rossi et al. 2010; Park 2014; Dou et al. 2015; Lombardo et al. 2016; Van Westen 2016; Zezere et al. 2017; Soma 2018). Although debris flow behavior and parameters are local phenomena, sound generalizations can be applied to debris flow susceptibility analyses at larger geographic extents based on these criteria (Kirschbaum et al. 2015a, b; Jacobs).
As world population and urbanization grow in number and geographic coverage, focus, research, and modeling need to extend to continental and global scales, both for current conditions and for future projections. Localized field surveys are not practicable for continental and global hazard susceptibility modeling and projections. Debris flow susceptibility analyses at broader scales require data-driven statistical methodologies, which draw on inventories of historical events and on continental and global remotely sensed coverages of environmental factors that may influence susceptibility.
How far can a susceptibility analysis extend? Can the world be analyzed or modeled as one body for purposes of landslide susceptibility, or is this an oversimplification that fails to account for the geologic, geomorphic, and tectonic histories and dissimilarities of the continents and disregards the latitudinal influences of climatic conditions? This study develops an optimal global debris flow susceptibility map and illuminates the results obtained when performing a 'single' global debris flow susceptibility analysis versus analyses performed on a continent-by-continent basis.
In this research, susceptibility models were developed for the world and for each of seven continents (Africa, Asia, Australia, Europe, North America, Oceania, and South America). An analysis was not performed for Antarctica as there are no Antarctic debris flows in the historic event inventories used.

Study area
Debris flow susceptibility analyses are performed for the entire world and for each continent. Although all continents have extensive mountain ranges, vast plains and plateaus, and complex river systems, each continent is unique in the geomorphic expression of these features and in its climate, soil, and vegetation (Bridges 2012), as depicted in Tables 1, 2, and 3.

Data and methodology
Two datasets were used for model training and testing. A landslide inventory sourced from NASA (Kirschbaum et al. 2015a, b) served as model training data, herein referred to as TRAIN. This inventory contains 11,033 landslides of various types, of which 194 events are classified as debris flows and 2100 as mudslides; these are herein collectively referred to as "debris flows". Mudslide events were included in this study because "mudslide" is a common misnomer for debris flow.
There may be sampling bias in this inventory, because landslide data are commonly collected where they are easier to gather: near cities, towns, and villages; where there is known potential danger to facilities and/or people; and in areas more easily accessible for observation or field survey. This bias, if it does exist, may be common among most, if not all, landslide inventories (Reichenbach et al. 2018). An additional component of the sampling bias results from the unequal availability of data from every country. A more equitable geographic representation of data acquisition across the world would result in more finely tuned continental and global models. The test dataset (TEST1) was curated from many agencies across the world, resulting in 5695 debris flow events. These events are not as evenly distributed across the world as would be desired, due to either an inability to locate data sources or an inability to obtain inventories from the sources identified. The numeric breakdown by continent and the global distribution of the events are shown in Table 4 and Fig. 1, respectively.
Additional factors, such as slope aspect, are relevant but difficult to process and summarize at a global level. Climate, average monthly precipitation, and aridity, in addition to acting as predisposing factors, provide a view of the antecedent weather and precipitation conditions associated with debris flows and of areas with potential triggering factors.
The first objective was to choose the best statistical approach at a global level. That approach was then used to model each of the seven continents and compare these results with the global model. The overall process is shown in Fig. 2.
Determining the most relevant debris flow conditioning factors, and the factor classes associated with a historical event inventory, is essential to developing susceptibility models. Five statistical models were developed to determine the most significant factors and factor classes, which would then become input to susceptibility models. Probability Density (PD), Frequency Ratio (FR), Conditional Probability (CP), and Certainty Factor (CF) algorithms were developed in Microsoft Excel (Excel), and a Maximum Entropy model (MAXENT) was developed with MaxEnt software v. 3.4.4 (Phillips et al. 2021).
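The bivariate indices named above can be illustrated with a minimal sketch. It uses commonly cited textbook definitions of Frequency Ratio, Conditional Probability, and Certainty Factor, which are an assumption here: the study's Excel formulas are not given in the text and may differ. The function name, the class labels, and the counts in the example are hypothetical.

```python
def bivariate_indices(class_events, class_area):
    """Return FR, CP and CF for each class of one conditioning factor.

    class_events: dict mapping class name -> number of inventory events
    class_area:   dict mapping class name -> class area (e.g. pixel count)
    Assumes textbook definitions, not necessarily those used in the study.
    """
    total_events = sum(class_events.values())
    total_area = sum(class_area.values())
    if total_events <= 0 or total_area <= 0:
        raise ValueError("need at least one event and a positive total area")

    prior = total_events / total_area  # event density over the whole study area
    result = {}
    for name, area in class_area.items():
        events = class_events.get(name, 0)
        cp = events / area  # conditional probability: event density in the class
        fr = (events / total_events) / (area / total_area)  # share of events / share of area
        if cp >= prior:
            cf = (cp - prior) / (cp * (1 - prior))
        else:
            cf = (cp - prior) / (prior * (1 - cp))
        result[name] = {"FR": fr, "CP": cp, "CF": cf}
    return result


# Hypothetical slope classes with made-up counts, for illustration only.
example = bivariate_indices(
    {"gentle": 5, "moderate": 40, "steep": 55},
    {"gentle": 5000, "moderate": 3000, "steep": 2000},
)
for name, values in example.items():
    print(name, values)
```

Under these definitions FR equals CP divided by the study-wide event density, and CF increases monotonically with CP, so all three indices rank the classes of a factor in the same order.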
Maximum Entropy, a "presence-only" machine learning algorithm, was used because of the ambiguity of "absence" in this context and because the landslide inventories were not collected through manual field surveys and thus lack verified locations. Absence does not necessarily mean that there are or were no debris flows in an area. It means that we do not know, and/or that we lack substantiating data sources or the ability to conduct field surveys, particularly at a continental or global scale. Maximum Entropy is a widely used technique in biological species distribution modeling, with recent and growing interest in its use for landslide susceptibility modeling due to its predictive success compared with other methodologies in "presence-only" scenarios (Convertino et al. 2013; Park 2014; Lombardo et al. 2016; Kornejady et al. 2017; Yuan et al. 2017; Gál et al. 2018).
The MaxEnt algorithm also reports which predisposing factors contribute most to the susceptibility output. The same environmental variables (factors) were used for PD, FR, CP, and CF. A few of these variables were not employed in the MAXENT model, and a few of the variables used in the MAXENT model were not utilized in the PD, FR, CP, and CF models. Table 5 lists the environmental factors employed in each model. Some factors (e.g., slope) were excluded from PD, FR, CP, and CF because these models require an area calculation for each underlying factor class, which is not practicable for those factors at continental and global scales; the MaxEnt software has no such requirement. Conversely, distance-based factors (e.g., distance to faults) were excluded from the MAXENT model because of the difficulty of representing them as the continuous or categorical spatial layers required by the MaxEnt software. This slight difference in factors is not believed to affect the resulting model comparisons and choice, as the majority of the factors employed in all models are those in common use for debris flow susceptibility analyses, as cited earlier. Restricting all models to the common subset of factors was rejected because each model is expected to perform better with the maximum number of relevant factors.
The verification dataset (TEST1) was used to determine which of the FR, CP, CF, PD, and MAXENT global models yielded the best results. The best statistical model was then used to develop susceptibility maps for each continent. Each susceptibility map (continental and global) was further processed in ArcGIS Pro 2.7 (Esri 2020). Global and continental comparisons were performed by two methods. For the first method, a "continental cut" from the global model was extracted for each continent and compared with the corresponding continental model.
For the second method, all continental models were mosaicked into a single "continental model composite" and compared against the global model. The resulting maps were classified using five equal interval breaks, ranked as Very Low, Low, Medium, High, and Very High. For each continent, and for both the continental and global models, the number of TEST1 events and the number of pixels (area) within each classification rank were calculated.
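The classification and counting step can be sketched as follows. This is an illustration only, not the ArcGIS Pro workflow used in the study: it assumes each verification event has already been assigned the model score of the pixel it falls in, and that scores span a known range (0 to 1 is assumed by default). All function names are hypothetical.

```python
LABELS = ["Very Low", "Low", "Medium", "High", "Very High"]


def equal_interval_class(value, vmin=0.0, vmax=1.0, n_classes=5):
    """Return the 0-based index of the equal-width bin that contains value."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    index = int((value - vmin) * n_classes / (vmax - vmin))
    # Clamp so that value == vmax falls in the top class, not a sixth one.
    return min(max(index, 0), n_classes - 1)


def events_per_class(event_scores, vmin=0.0, vmax=1.0):
    """Count verification events in each of the five susceptibility classes."""
    counts = {label: 0 for label in LABELS}
    for score in event_scores:
        counts[LABELS[equal_interval_class(score, vmin, vmax)]] += 1
    return counts


def share_medium_or_higher(counts):
    """Percentage of events that fall in the Medium, High or Very High classes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hits = counts["Medium"] + counts["High"] + counts["Very High"]
    return 100.0 * hits / total


# Made-up scores at hypothetical event locations.
counts = events_per_class([0.05, 0.35, 0.45, 0.7, 0.95, 1.0])
print(counts, share_medium_or_higher(counts))
```

The same counting can be applied to every pixel of a map rather than to event locations to obtain the area in each class.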

Results
Frequency Ratio, Conditional Probability, and Certainty Factor yielded the same results for all factors. Subsequent discussion therefore refers to these three models collectively as "FR_CP_CF". Of the global susceptibility models, FR_CP_CF yielded the poorest test results, with 23.3% of the verification (TEST1) events occurring in the Medium to Very High susceptibility classes. MAXENT was the best performing model, with 82.9% of the verification events occurring in the Medium to Very High susceptibility classes (Table 6). The global MAXENT model was then used in comparative analyses with MAXENT continental models to determine the best approach to "global" modeling. Each MAXENT continental model was compared with the global MAXENT model individually and as a mosaic, utilizing the verification data (Table 7). A count was calculated for the Very High susceptibility class and a cumulative count for the Medium, High, and Very High classes for both the global and continental models.
As an example of the continental comparative analysis, Figs. 6, 7, and 8 are the MAXENT global ("continental cut"), continental, and difference models for Europe, respectively. Figure 9 presents a susceptibility difference between the MAXENT global model and a mosaic of all MAXENT continental models.

Discussion
Several statistical methods (Frequency Ratio, Probability Density, Conditional Probability, Certainty Factor, MaxEnt) were employed to develop a global debris flow susceptibility map. The results were tested and compared to determine the best approach to a debris flow susceptibility map of the world. The MAXENT model showed superior performance (Table 6) in identifying highly susceptible areas, as verified with the TEST1 data, and was used for the subsequent continental analyses to determine whether a global or continental model provides the better fit. Using only the Very High susceptibility class, four of the seven continents (Europe, North America, Oceania, South America) demonstrated better performance (greater verified susceptibility results) in the global model than in the individual continental models (Table 7). Africa, Asia, and Australia demonstrated better results in the continental model for the Very High susceptibility class. When looking at cumulative results for the Medium, High, and Very High susceptibility classes, all continents except Oceania and Australia demonstrated better results in the global model. Australia exhibited comparable results in both models.
For all continents except Oceania, the global and continental models exhibit greater than 70% agreement in their susceptibility classifications (Table 8). Where the models differ, the majority of the areas differ by only one classification. Figure 8 is a susceptibility difference map for Europe, as a continental example. Figure 9 represents the susceptibility difference between the single global model and a mosaic of all continental models. These difference models show areas where the global susceptibility classifications are higher or lower than the continental modeled values.
In Europe, 76.71% of the continent has the same susceptibility classification in both the global and continental models (Table 9), and 88.78% of the global model has susceptibility classifications equal to those in a mosaic of all the continents (Table 10). More than 95% of the global model is within ± one susceptibility classification of the continental models, and less than 10% of the global model exhibits a greater susceptibility than the continental models.
The subsequent use of debris flow susceptibility maps for the world will dictate whether the error of commission (a single global model) or omission (continental models) is better tolerated.
As is true in all modeling, subjectivity and biases may be introduced in multiple ways, which may then influence the results. It is acknowledged in this study that there may be a geographic bias in the TRAIN and TEST1 global event inventories based on data collection methodologies and data availability. For two continents in the Northern Hemisphere with similar overall compositions of climate and other predisposing factors (North America and Asia), the ratio of North America land area to Asia land area is 0.54, while the TRAIN and TEST1 event density ratios are 5.8 and 7.9, respectively. How this may affect the results presented is unknown from this research, but it is an important topic for further study.
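Agreement figures of this kind can be derived from a cell-by-cell class difference, sketched below. This is a simplified illustration rather than the ArcGIS Pro procedure used in the study: the two rasters are represented as equally shaped nested lists of class ranks (1 = Very Low to 5 = Very High), `None` marks cells without data, and the function name is hypothetical.

```python
from collections import Counter


def class_difference_summary(global_classes, continental_classes):
    """Summarise global-minus-continental class differences as percentages.

    Both arguments are 2D lists of class ranks with identical shape.
    Cells that are None in either grid are ignored.
    """
    if len(global_classes) != len(continental_classes):
        raise ValueError("grids must have the same number of rows")

    diffs = Counter()
    for g_row, c_row in zip(global_classes, continental_classes):
        if len(g_row) != len(c_row):
            raise ValueError("grids must have the same number of columns")
        for g, c in zip(g_row, c_row):
            if g is None or c is None:
                continue
            diffs[g - c] += 1  # positive: global class is higher

    n = sum(diffs.values())
    if n == 0:
        raise ValueError("no overlapping cells with data")

    def pct(count):
        return 100.0 * count / n

    return {
        "same": pct(diffs[0]),
        "within_one": pct(sum(v for d, v in diffs.items() if abs(d) <= 1)),
        "global_higher": pct(sum(v for d, v in diffs.items() if d > 0)),
        "continental_higher": pct(sum(v for d, v in diffs.items() if d < 0)),
    }


# Tiny made-up grids, for illustration only.
summary = class_difference_summary(
    [[1, 2, 3], [4, 5, None]],
    [[1, 3, 3], [2, 5, 1]],
)
print(summary)
```

In this sign convention, positive differences mark cells where the global model assigns the higher class, matching the convention described for the difference maps in Figs. 8 and 9.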
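The relationship between these ratios can be made explicit with a short sketch. It assumes that "event density" means the number of inventory events per unit land area, which the text does not state; under that assumption, the ratio of event counts equals the density ratio multiplied by the area ratio. The function name is hypothetical.

```python
def implied_count_ratio(density_ratio, area_ratio):
    """Ratio of event counts (A to B) implied by density and area ratios.

    Assumes density = events / land area, so that
    (N_a / N_b) = (density_a / density_b) * (area_a / area_b).
    """
    if density_ratio < 0 or area_ratio <= 0:
        raise ValueError("ratios must be non-negative, with a positive area ratio")
    return density_ratio * area_ratio


# North America relative to Asia, using the ratios quoted in the text.
train_ratio = implied_count_ratio(5.8, 0.54)   # about 3.13
test1_ratio = implied_count_ratio(7.9, 0.54)   # about 4.27
print(round(train_ratio, 2), round(test1_ratio, 2))
```

If the assumption holds, North America would contribute roughly three to four times as many events as Asia to each inventory despite having about half the land area, which is one way to express the geographic imbalance noted above.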
A future refinement of this model will include spatial-temporal modeling by evaluating the seasonal periodicity of high monthly mean precipitation values compared with the dates of the historical events, along with anecdotal information for those debris flows identified as triggered by anomalous precipitation events. Additionally, with relevant changes, the methodology followed herein will be applied to other landslide types such as co-seismic and rock avalanches. A suite of global models specific to each landslide type is planned.

Conclusion
Global debris flow susceptibility analyses are possible and meaningful. Conventional wisdom might lead one to believe that a single debris flow susceptibility model for the entire planet may not represent the susceptibility of the individual continents, just as a single model representing susceptibility of any/all landslide types does not optimally represent the susceptibility of individual and distinct landslide types and associated hazards. An analysis of the performance of Frequency Ratio, Probability Density, Conditional Probability, Certainty Factor, and Maximum Entropy models at the global level showed that the Maximum Entropy model provided the best performance; it was then used to develop and compare the single global model with individual continental models.
While the predominant lithology type (sedimentary) and land use cover (forest and woodland) are common among all continents, there are differences from continent to continent with regard to other environmental factors which are known to be associated with debris flows (Tables 1, 2, and 3). Mean elevation varies from 915 m a.s.l. in Asia to 300 m a.s.l. in Europe. The dominant landcover varies from forest and woodland in Africa, Asia, Oceania, Europe, and North and South America to grassland and shrubland in Australia. Mountain belts dominate the geomorphic structures of Asia and North America; crystalline shields dominate Africa, Australia, and South America; and erosional plains dominate Europe. There is sufficient variation to suggest that a single global model may not adequately represent the debris flow susceptibility on all continents. Yet, in this study we find that the global model performs exceptionally well in comparison with the continental models when evaluated on the cumulative results of the Medium, High, and Very High susceptibility classes.
Although debris flow behavior and parameters are local phenomena, sound generalizations can be applied to debris flow susceptibility analyses at larger geographic extents, based on these criteria. The debris flow environmental predisposing factors are similar and well-defined in many different types of regions across the world, and although there are dissimilarities in geologic, geomorphologic, hydrologic, tectonic, land use/landcover, climate, and other factors among the continents, there are pockets or regions within each continent with the environmental factors conducive to debris flows. For example, 70% of Australia is composed of arid or semi-arid land, yet the Eastern seaboard exhibits debris flow susceptibility and associated historical debris flow events.
The authors believe that with a larger debris flow event inventory, and an unbiased geographic distribution, these global and individual continental susceptibility models can be further improved. While local and regional hazard studies will always be essential, we must reframe our thinking and research to include studies at much broader scales. All hazard problems are global problems. We cannot nibble away at global problems, but rather must understand where problems exist, what problem areas may begin to coalesce, and where and when these problems may begin to encroach on human populations or vice versa. The more we know about our problems at the global scale, the more information we will have to better understand and address them locally and regionally.
Fig. 8 Europe debris flow susceptibility difference map. Positive numbers represent areas where the global model susceptibility is higher than the continental model, negative numbers represent areas where the continental susceptibility is higher. Base map is from ArcGIS®, the intellectual property of Esri, used herein under license. Copyright © Esri
This global debris flow susceptibility map and model are an important foundation for further refinements and extensions, offering an international perspective on the potential impact of debris flows on people and economies as population grows, urbanization expands, and climate changes.
Fig. 9 Global MAXENT susceptibility model minus mosaic of all continental MAXENT models. Positive numbers represent areas where the global model susceptibility is higher than the continental model, negative numbers represent areas where the continental susceptibility is higher. Base map is from ArcGIS®, the intellectual property of Esri, used herein under license. Copyright © Esri