Landslides across the USA: occurrence, susceptibility, and data limitations

Detailed information about landslide occurrence is the foundation for advancing process understanding, susceptibility mapping, and risk reduction. Despite the recent revolution in digital elevation data and remote sensing technologies, landslide mapping remains resource intensive. Consequently, a modern, comprehensive map of landslide occurrence across the United States (USA) has not been compiled. As a first step toward this goal, we present a national-scale compilation of existing, publicly available landslide inventories. This geodatabase can be downloaded in its entirety or viewed through an online, searchable map, with parsimonious attributes and direct links to the contributing sources with additional details. The mapped spatial pattern and concentration of landslides are consistent with prior characterization of susceptibility within the conterminous USA, with some notable exceptions on the West Coast. Although the database is evolving and known to be incomplete in many regions, it confirms that landslides do occur across the country, thus highlighting the importance of our national-scale assessment. The map illustrates regions where high-quality mapping has occurred and, in contrast, where additional resources could improve confidence in landslide characterization. For example, borders between states and other jurisdictions are quite apparent, indicating the variation in approaches to data collection by different agencies and disparity between the resources dedicated to landslide characterization. Further investigations are needed to better assess susceptibility and to determine whether regions with high relief and steep topography, but without mapped landslides, require further landslide inventory mapping. Overall, this map provides a new resource for accessing information about known landslides across the USA.

Landslide occurrence, impacts, and assessments in the USA
In the USA, landslides are a geologic hazard known to occur in every state. Some estimates suggest that they cause an average of 25-50 fatalities each year and contribute to billions of US dollars in economic losses annually (National Research Council 1985; Schuster 1996). Landslide fatalities vary considerably from year to year, and more recent estimates report that 93 landslide-related fatalities occurred within the USA between 2004 and 2016 (Froude and Petley 2018). Two notable events include a large, deep-seated landslide near Oso, WA, in March 2014, which resulted in 43 fatalities (Iverson et al. 2015; Collins and Reid 2019), and widespread debris flows in Montecito, CA, in January 2018, which resulted in 23 fatalities. In contrast to fatalities, the estimates of economic losses related to landslides involve considerably more uncertainty. Initial calculations were based in part on landslide-related losses to private dwellings in southern California, which were subsequently extrapolated across the country (Krohn and Slosson 1976), resulting in projected private losses of approximately $400M in 1971 US dollars, or $2.5B in 2019 US dollars (based on www.usinflationcalculator.com). This extrapolation seems more than justified considering that recent estimates of losses in just the city of Portland, Oregon (a mid-sized city with a population of ~650,000 in a landslide-prone area), indicate that landslides result in direct economic losses of $1.5-3M during typical winters and upwards of $64-84M in more extreme weather years. Similarly, estimates of the direct costs to repair roads and private residences damaged by landslides for the state of Kentucky are approximately $10-20M annually (Crawford 2014). However, the indirect losses due to reduced economic productivity and other landslide-related costs are exceedingly difficult to estimate and have not been reported.
Updated estimates of both direct and indirect losses are sorely needed for the range of typical and severe landslide weather conditions across the USA, particularly since the impacts of landslides are expected to grow with ongoing climate change, increasing disturbances such as wildfire, and populations expanding into landslide-prone terrain (Leshchinsky et al. 2017; Mirus et al. 2017).
Recently, several landslide-related tragedies and disasters in the USA (Coe et al. 2014; Iverson et al. 2015; Warrick et al. 2019; Bessette-Kirton et al. 2019; Collins and Reid 2019; Kean et al. 2019) have further increased public attention and focused additional resources toward landslide research and mapping. These changes in priorities and recent technical advances have contributed to concerted efforts to map landslides within certain administrative areas, often by state or county (e.g., Slaughter et al. 2017). Landslide inventories have long provided the foundation for research and various types of hazard assessments designed to reduce losses. For example, inventories that include the timing of slope failures are critical for optimizing empirical and deterministic criteria for landslide early warning systems across various scales (e.g., Caine 1980; Keefer et al. 1987; Guzzetti et al. 2008, 2019; Mirus et al. 2018). Similarly, spatial distributions of landslide occurrence are used to develop susceptibility maps, which typically define areas with different classes of potential landslide occurrence (see review by Reichenbach et al. 2018 and references therein). Both the precise timing and exact locations of landslides are needed to test distributed models of landslide initiation (e.g., Brien and Reid 2008), and the spatial extent of landslide deposits is needed to test simulations of runout behavior (e.g., Iverson et al. 2015; Reid et al. 2016). Multitemporal landslide inventories are critical for evaluating processes such as landslide recurrence (Samia et al. 2017, 2019; Temme et al. 2020). Furthermore, it has long been recognized that detailed landslide inventories can improve hazard assessments used to inform development planning and emergency management (e.g., Nilsen et al. 1979; Fell et al. 2008), as well as encourage public engagement on critical issues surrounding exposure to landslide risk.
Thus, compiling landslide inventories over broad regions or even entire continents, such as the European inventory initiative (Herrera et al. 2018), can provide great utility for landslide risk reduction at national or multi-national scales.
Previous attempts at a national-scale landslide map
The United States Geological Survey (USGS) has a long history of coordinating efforts for landslide hazard assessment and risk reduction (see USGS 1982; Wieczorek and Leahy 2008). One of the earliest assessments of landslide hazards across the contiguous USA was the USGS landslide overview map (Radbruch-Hall et al. 1976, 1982), which shows landslide incidence and susceptibility. These classifications were based on the authors' interpretation of a 1:2,500,000 scale geologic map (King and Beikman 1974), though the final map was reduced to 1:7,500,000 scale, which was eventually digitized for publication (Godt and Radbruch-Hall 1997). Geologic formations or groups of formations were assigned a high, medium, or low landslide susceptibility and/or a high, medium, or low landslide incidence. Their incidence assignments were based on the percent area of the given formation that was mapped as landslides, whereas their landslide susceptibility assignment was based on unspecified subjective criteria (due to insufficient data on mapped landslide areas). Ultimately, the published map (Fig. 1a) shows six distinct classifications of landslide potential (incidence and susceptibility), which were not explicitly ranked by the authors. However, based on the colors they selected for each category and our own understanding of landsliding across the USA, we interpret these from highest to lowest potential for landslides as high incidence (HIGH), high susceptibility with moderate incidence (HIGH-MOD), moderate incidence (MOD), high susceptibility with low incidence (HIGH-LOW), moderate susceptibility with low incidence (MOD-LOW), and low incidence (LOW).
Their qualitative and somewhat subjective classification system, as well as the overlap between incidence levels of these six classes, reflects both an incomplete knowledge of landsliding across the country, and the relatively coarse scale topographic and geologic maps available at the time of publication. Around the same time, Wiggins et al. (1978) developed an alternative landslides map by combining the analysis of Krohn and Slosson (1976) with the preliminary efforts of Radbruch-Hall et al. (1976), though the details of how these two maps were combined are not specified. This hybridized map included four simpler and more intuitive classifications of (1) high, (2) medium, (3) apparently low (based on limited data), and (4) low. Regrettably, the Wiggins et al. (1978) map is only available in its original printed format with very coarse resolution (see two-page composite figure in USGS 1982).
After digitization of the original USGS landslide overview map (Godt and Radbruch-Hall 1997), it was noted that debris flow hazards in the arid Southwest were not considered in the susceptibility and incidence classifications, which prompted the compilation of a limited inventory of debris flows in combination with a national map of slope angles greater than 25 degrees (Brabb et al. 1999).
The next published national-scale assessment of landslide hazards was completed over a decade later (Godt et al. 2012). This more recent assessment developed a simple susceptibility model informed by topographic slope and relief gleaned from the National Aeronautics and Space Administration's (NASA) 30-arcsecond Shuttle Radar Topography Mission (SRTM). The model was calibrated using landslide inventories from New Jersey, New Mexico, North Carolina, Oregon, and the San Francisco Bay region, then applied to map susceptibility across the conterminous USA. The map (Fig. 1b) indicates two susceptibility classes: negligible hazard from landslides (NONE) and some hazard from landslides (SOME). This two-class model was largely conceived to distinguish, in the most general sense, which areas are expected to pose essentially no landslide risk versus others that could potentially have some risk. The authors suggested that such a model could be used to inform an initial category of landslide insurance policies offered by US postal code, with considerable room for future improvement. Given that other applications such as infrastructure and development planning benefit from the additional information provided by multiple different levels of landslide susceptibility, the earlier landslide overview map (Radbruch-Hall et al. 1982) has been more widely used. However, to date no formal assessment of its validity has been published, due in large part to a lack of suitable data.
The most recent susceptibility model with coverage over the entire USA was developed by NASA as part of their global Landslide Hazard Assessment for Situational Awareness (LHASA) (Kirschbaum and Stanley 2018). It relies on a series of "fuzzy logic" operators based on topographic slope, geologic formation ranking, proximity to roads and faults, and recent forest loss. It distinguishes five levels of susceptibility to landsliding including very low (VL), low (L), moderate (M), high (H), and very high (VH), where the performance of the highest susceptibility (VH) was evaluated using receiver operating characteristics for a selection of eight localized landslide inventories in Afghanistan, El Salvador, Guatemala, Italy, the Himalaya (Nepal-India-China), Nicaragua, Oregon (USA), and Utah (USA). The implication of this evaluation is that the other four lower susceptibility classifications (VL, L, M, H) are considered locations where landslides are not expected. However, since the NASA model considers both geology and topographic slope (albeit not relief), its expression for the conterminous USA (Fig. 2a) compares favorably to a combination of the prior USGS landslide overview map developed by Radbruch-Hall et al. (1982) and susceptibility model developed by Godt et al. (2012) (Fig. 2b). The NASA model has been applied uniformly across most of the globe (56° South to 72° North latitudes) to help inform disaster planning, situational awareness, and decision support (Kirschbaum and Stanley 2018).
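To make the idea of combining factors with fuzzy logic operators more concrete, the following is a minimal sketch of one common operator, the fuzzy gamma overlay, followed by binning into five classes. The text above does not state which operators the NASA model uses, so the choice of the gamma operator, the function names, the membership values, and the class breaks are all assumptions for illustration only.

```python
from bisect import bisect_right
from math import prod

CLASS_NAMES = ["VL", "L", "M", "H", "VH"]
# Hypothetical equal-interval breaks; not taken from the NASA model.
HYPOTHETICAL_BREAKS = [0.2, 0.4, 0.6, 0.8]


def fuzzy_gamma(memberships, gamma=0.5):
    """Combine fuzzy memberships (each in [0, 1]) with the gamma operator.

    gamma = 0 reduces to the fuzzy algebraic product and gamma = 1 to the
    fuzzy algebraic sum; intermediate values blend the two.
    """
    values = list(memberships)
    if not values:
        raise ValueError("need at least one membership value")
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("memberships must lie in [0, 1]")
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    fuzzy_product = prod(values)
    fuzzy_sum = 1.0 - prod(1.0 - v for v in values)
    return (fuzzy_sum ** gamma) * (fuzzy_product ** (1.0 - gamma))


def susceptibility_class(score):
    """Map a combined score in [0, 1] to one of five class labels."""
    return CLASS_NAMES[bisect_right(HYPOTHETICAL_BREAKS, score)]


# Illustrative memberships for slope, geology, and road proximity.
score = fuzzy_gamma([0.9, 0.6, 0.4], gamma=0.5)
label = susceptibility_class(score)
```

In practice each membership would come from a raster layer rescaled to the unit interval, and the class breaks would be chosen by the model developers rather than fixed at equal intervals.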
Better tools to increase awareness and evaluate current understanding
The emergence of satellite remote sensing, machine learning, and other computational technologies has introduced new tools for landslide mapping efforts (Guzzetti et al. 2012). High resolution aerial imagery and topographic data, such as lidar, have accelerated the revolution in landslide mapping techniques (Schulz 2004, 2007; Van Den Eeckhaut et al. 2007; Ardizzone et al. 2007; Petschko et al. 2016), and machine learning processes have facilitated automated or semi-automated approaches to detect and classify landslide features (Bunn et al. 2019). However, a national-scale understanding of landslide hazards in the USA still predates the digital data revolution (i.e., Radbruch-Hall et al. 1982; Brabb et al. 1999). Given the progress in landslide mapping facilitated by various technological advances and the lack of a rigorously tested susceptibility assessment, the USGS identified the need for an updated national-scale database of landslide occurrence with the following objectives:
- Provide a centralized portal to explore and access existing landslide data across the USA.
- Facilitate landslide research within broader geologic or geographic contexts that transcend jurisdictional boundaries.
- Enable general hazard assessments and disaster management plans at the national scale.
- Identify areas where additional landslide mapping may be needed.
- Promote awareness of landslide occurrence across the country.

Fig. 1 a USGS landslide overview map of the conterminous USA, showing areas of high, moderate, and low landslide susceptibility and/or incidence listed from highest to lowest, such that HIGH, high incidence; HIGH-MOD, high susceptibility, moderate incidence; MOD, moderate incidence; HIGH-LOW, high susceptibility, low incidence; MOD-LOW, moderate susceptibility, low incidence; and LOW, low incidence (after Radbruch-Hall et al. 1982; Godt and Radbruch-Hall 1997). b USGS topographic susceptibility model distinguishing between areas with negligible landslide hazards (NONE) and potential landslide hazards (SOME) (after Godt et al. 2012)

Fig. 2 a NASA landslide susceptibility model showing five hazard classes from very low through very high (after Stanley and Kirschbaum 2017) and b overlay of two previous USGS products: the landslide overview map (Radbruch-Hall et al. 1982; Godt and Radbruch-Hall 1997; Fig. 1a) and the topographic susceptibility model (Godt et al. 2012; Fig. 1b)

Here, we present the results of our initial efforts to compile available geodatabases of landslide occurrence across the USA, and then compare these integrated data with the three previously digitized products of landslide potential with national-scale coverage (Godt and Radbruch-Hall 1997; Godt et al. 2012; Stanley and Kirschbaum 2017).
Compilation of local-scale data into a national-scale product
Landslide mapping and classification are typically addressed at local scales or during post-event response efforts, often with very different objectives and resources allocated. In the USA, several state geological surveys or agencies have established clearly defined protocols for landslide mapping (Burns and Madin 2009; Slaughter et al. 2017; Wills et al. 2017), which has paved the way toward comprehensive catalogs of landslide occurrence within their various jurisdictional boundaries. However, given the limited guidance for standardized data acquisition and management, the formats of landslide data can vary considerably between inventories, which poses a challenge for developing a uniform national-scale product. Our initial progress toward establishing an inventory of known landslide occurrence within the USA compiles existing, publicly available geodatabases, but reduces these data to a uniform subset of attributes that we deemed essential to developing a broad understanding of landslides and their impacts across the country. Furthermore, we identified the need to develop consistent criteria to characterize the variability in confidence between different sources and types of landslide information. We note here that the landslide inventories include both prehistorical landslides identifiable via mapping and field studies, as well as recent or historical landslides that have been directly observed and/or mapped following a landslide event. While there is considerable variability in data quality and confidence, any characterization of landslides is potentially useful for future hazard assessments.

Existing products and data sources
The large spatial extent of the USA (~9.1 million km² of land area) combined with the geographic and topographic diversity (subaerial elevations ranging from −86 m to 6194 m) and variety of landslide-prone terrain (including nearly all forms of landslides: rockfalls, rock avalanches, earth flows, debris flows, among others) have previously presented considerable obstacles to a comprehensive national landslide inventory. Additionally, the range of resources allocated to landslide assessments and research varies considerably from state to state. There are several prominent global-scale landslide information products, albeit with a somewhat narrower scope. The USGS hosts an open repository for seismically triggered ground-failure inventories (Schmitt et al. 2017), which includes combined incidences of both liquefaction and landsliding linked to specific earthquakes. Those inventories are contributed by authors of technical reports and scientific journal articles, but access is maintained by the USGS in a centralized location. Academic researchers in England have compiled a global database of fatal landslides from media and other reports dating back to 2004 (Froude and Petley 2018). Additionally, NASA maintains the Global Landslide Catalog (GLC) of selected rainfall-triggered landslides across the world (Kirschbaum et al. 2015). The GLC includes only those that occurred since 2007 that are gleaned largely from NASA's periodic analysis of selected media outlets and citizen scientist reports. Despite these limitations, the landslides compiled in the GLC are perhaps the most comprehensive for rainfall-triggered slope failures globally. However, within the USA, agencies at the state and local level often maintain more precise and comprehensive maps of landslide occurrence, including historical landslides that predate 2007 or do not necessarily include specific information on the date of occurrence.
These inventories are derived by qualified geoprofessionals using a variety of robust investigative techniques ranging from lidar-based identification and subsurface investigations, to regional geologic mapping of extensive Quaternary landslide deposits. A subset of landslides from these state and local records with known dates were compiled by NASA and combined with the GLC into a database of dated rainfall-triggered landslides (Kelkar et al. 2017), which served as our motivation and starting point for the more comprehensive database compiled herein.
Although statewide landslide inventories are not available for all 50 states, many states with frequent landslides support an advanced landslide mapping program, often with online maps and databases available to the public (e.g., Arizona, California, Kentucky, North Carolina, Oregon, Vermont, Washington, West Virginia, and Wyoming). Some federal agencies also support localized landslide mapping efforts within defined jurisdictional boundaries, such as for specific National Forests or National Parks (e.g., Stock et al. 2013). Additionally, the USGS is regularly tasked with mapping landsliding events that are of national significance, including the most widespread, damaging, and deadly instances (e.g., Baum et al. 2000; Coe and Godt 2001; Coe et al. 2014; Collins et al. 2018; Collins and Reid 2019; Kean et al. 2019). These generally overlap with the relatively few recent landslide fatalities within the USA (see Froude and Petley 2018), but include more specific information and detailed documentation. Landslide impacts to territories of the USA are also common (e.g., Harp et al. 2004; Bessette-Kirton et al. 2019), but landslide information is generally less available in these regions.
For our national-scale compilation, we attempted to include all publicly available geodatabases collected by researchers and local, state, and federal agencies, with the understanding that more data may exist or ultimately become available and can be added to this database periodically. Digital geodatabases of landslide occurrence range from scanned and georeferenced images of geologic maps that include pre-historical landslides, to information-rich GIS data or maps of historical events that include a host of different attributes and organizational schemas. These all require presentation in a uniform format and database structure with some context to distinguish between the various data types.
Disparate data, simplified attributes, and confidence metric
Precise characterization of the location, extent, and nature of landsliding benefits both planning efforts and research advances. At the same time, the quality of data and supporting information to characterize landslides varies widely between different local-scale inventories. For example, sometimes landslides are mapped as point locations and sometimes as polygons; sometimes landslide features such as head-scarps or runout deposits are delineated explicitly and other times not. Landslides are often mapped at different scales with different attributes depending on their intended use and the resources that were devoted to mapping them.
The disparate existing datasets present a substantial data integration and management challenge for developing a national-scale product with a common set of attributes and simplified database structure. The vastly different mapping techniques and scales of these inventories lead to considerable variability in the confidence in landslide position. Thus, it could be misleading to compile a uniform database without distinguishing between which composite data have high confidence in the nature and extent of landsliding versus those that may represent only the approximate location of a possible landslide. To address these dual concerns, we identified a limited selection of attributes, in addition to the geolocation, that are critical to a national-scale picture of landslide occurrence, which we include in the database: (i) an object identification (ID) number assigned for the USGS data compilation, (ii) date of landsliding (if known), (iii) number of fatalities (if any), (iv) confidence classification in landslide attributes and location, (v) source inventory name (and its associated identifying label used in the original source database), (vi) links to the source information and to the source inventory (often the same), and (vii) notes to include any additional relevant information or qualifiers. With the exception of (iv) and (vii), we selected these attributes because they are generally common across most inventories, are simple to interpret, or are potentially critical for national-scale assessments. For example, landslide timing is very important for developing landslide warning systems (Guzzetti et al. 2019) even though landslide age is often unknown or only known very roughly (e.g., post-glacial landslide, Quaternary landslide deposit). Thus, except for recent landslides and historical events with detailed documentation, the landslide date attribute (ii) is often null but included anyway.
Similarly, comparatively few of the world's fatal landslide events occur within the USA (Froude and Petley 2018), so the fatalities attribute (iii) is typically null, but we include this information in our database due to the major significance of landslide-related deaths. The notes attribute (vii) allows inclusion of other potentially important information that is not readily classified into the same attributes across all inventories (e.g., landslide material, movement type, field notes, mapping technique, damages or other impacts), which might help users decide whether to seek more data from the original source (v) and (vi). Different methods to characterize data quality have been used by various state agencies (e.g., Wills et al. 2017), but here we develop a standardized confidence attribute (iv) that allows a uniform classification of the relative accuracy of the information available for each landslide. This metric illustrates, in a general sense, that not all landslide information can be used with equal confidence for hazard assessment, and that even in areas where landslides have been characterized, more resources could lead to substantial improvements in understanding the location, nature, age, or extent of landsliding.
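The seven attributes (i) through (vii), together with the geolocation, can be pictured as a single flat record. The sketch below is only illustrative: the class name, field names, and example values are hypothetical and do not reproduce the actual schema of the USGS compilation. The only constraints encoded are ones stated in the text, namely that date and fatalities are often null and that confidence is ranked from 1 to 8 (Table 1).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LandslideRecord:
    """One inventory entry; field names are hypothetical, not the USGS schema."""

    object_id: int                        # (i) ID assigned in the compilation
    longitude: float                      # geolocation (decimal degrees assumed)
    latitude: float
    confidence: int                       # (iv) ranked 1 (low) to 8 (high)
    source_inventory: str                 # (v) contributing inventory name
    source_label: Optional[str] = None    # (v) identifier in the original source
    event_date: Optional[date] = None     # (ii) often unknown, so nullable
    fatalities: Optional[int] = None      # (iii) typically null
    info_link: Optional[str] = None       # (vi) link to source information
    inventory_link: Optional[str] = None  # (vi) link to source inventory
    notes: str = ""                       # (vii) unstructured free text

    def __post_init__(self):
        # Reject values outside the 1-8 confidence scale.
        if not 1 <= self.confidence <= 8:
            raise ValueError("confidence must be between 1 and 8")
        if self.fatalities is not None and self.fatalities < 0:
            raise ValueError("fatalities cannot be negative")


# Made-up values for illustration only.
record = LandslideRecord(
    object_id=1,
    longitude=-120.0,
    latitude=45.0,
    confidence=3,
    source_inventory="Example state inventory",
    notes="Illustrative entry; movement type unknown",
)
```

Keeping the optional fields nullable rather than dropping them mirrors the choice described above to retain the date and fatalities attributes even though they are usually empty.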
We rank confidence with a semi-quantitative classification from "1" (low) through "8" (high) to reflect the relative value of different data for landslide research and hazard assessments (Table 1). Using decision-tree scripts, we automatically assigned a confidence level to each individual landslide, based on logical rules related to how the data were collected (see metadata for our inventory compilation in Jones et al. 2019). For example, inclusion in Oregon's SLIDO inventory requires a relatively good degree of confidence in the occurrence and location of a given landslide; thus, the default is set to "3." However, for those landslides that are based on lidar analysis or detailed field investigations, a higher value of between "5" and "8" would be assigned. In contrast, the GLC includes numerous media reports, which may include imprecise point locations and descriptions of landsliding by non-geoprofessionals, thus leading to a lower range of confidences between "1" and "5" depending on the source of information and self-reported location accuracy. Regardless, a point indicating the accurate location of a known landslide is still representative of a larger landslide body that could ultimately be identified. Conversely, in Colorado, vast areas of Quaternary landslide deposits are mapped without distinguishing between individual failures or source areas, thus leading to a lower confidence rating of "1" or "2" to reflect this uncertainty.

Table 1 Semi-quantitative metric and associated description used to characterize relative confidence in landslide occurrence and position
8 - High confidence that the nature and/or spatial extent of the landslide is well characterized. This highest confidence level is typically based on detailed field observations and/or expert analysis of high-resolution topographic data or aerial imagery to characterize the landslide.
5 - Confident that a consequential landslide took place at the specified location. This level of characterization still involves high confidence that a landslide took place at the specified location as evidenced by fatalities and/or damage to infrastructure, but detailed observations of landslide features are not described in the geodatabase.
3 - Landslide likely at or near the specified location. This middle confidence level reflects a known landslide occurrence with lower certainty on the exact position or nature of the slope failure. These typically include verified landslides on lower resolution topographic maps or aerial imagery and landslide data that predate digital topography and precise global positioning systems.
2 - Probable landslide in the area. Although the exact location and extent of the landslide is not documented, a landslide probably did occur within close proximity to the specified location. This includes geologic mapping of landslide deposits that may correspond to multiple landslides as well as individual landslides mapped with low-resolution topographic data.
1 - Possible landslide occurred in the area. The lowest confidence level reflects the uncertain nature of some media reports and the lack of expert classification and characterization of the location and nature of landsliding. Typically, these represent unverified media reports without precise location attribution.
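A rule-based assignment of this kind can be sketched as a small decision function. This is not the authors' script (their rules are documented in the metadata of Jones et al. 2019); the function name, argument names, source categories, and the specific value chosen within each range are hypothetical. In particular, the text gives a range of 5 to 8 for lidar or field-based mapping, which this sketch collapses to 8 for brevity.

```python
def assign_confidence(source, method=None, precise_location=False,
                      consequential=False):
    """Return a confidence level on the 1-8 scale of Table 1.

    Hypothetical rules loosely following the examples in the text:
    - lidar or detailed field mapping: high confidence (8 here; the
      text allows anything from 5 to 8)
    - undifferentiated geologic-map deposits: 2
    - media reports: 1 to 5 depending on location precision and on
      whether fatalities or damage confirm the event
    - state inventory default: 3
    """
    if method in {"lidar", "field"}:
        return 8
    if source == "geologic_map":
        return 2
    if source == "media":
        if precise_location and consequential:
            return 5
        if precise_location:
            return 3
        return 1
    if source == "state_inventory":
        return 3
    raise ValueError(f"unknown source type: {source!r}")


# Example: a state-inventory landslide mapped from lidar.
level = assign_confidence("state_inventory", method="lidar")
```

Encoding the rules as ordered checks keeps the most informative evidence (mapping method) ahead of the source default, so a well-mapped landslide is not downgraded simply because of the inventory it came from.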
Some landslide geodatabases, including several of the composite inventories in our compilation, are rich in information and complex in structure (e.g., Wooten et al. 2007; Crawford 2014; Wills et al. 2017; Napolitano et al. 2018; Piacentini et al. 2018), whereas the physical structure of our database is quite simple: it is stored in ArcGIS Online with only the seven parsimonious attributes listed above, including an unstructured notes section. The complete database and description of our confidence classification can be accessed via the USGS ScienceBase data release (Jones et al. 2019), or can be viewed online through an interactive map: https://www.usgs.gov/maps/national-landslides-map-anddata. The individual composite databases with links to their original sources are listed in Table 2.
Evaluating previous understanding of landsliding with current data
We compare our integrated landslide inventory database to three previously digitized landslide products with continuous coverage over the conterminous USA: (1) the USGS landslide overview map with six classes of low, moderate, and high susceptibility and/or incidence (Radbruch-Hall et al. 1976, 1982; Godt and Radbruch-Hall 1997), (2) the USGS topographic susceptibility model (Godt et al. 2012) that distinguishes between areas that are prone to potential landsliding from those that are not, and (3) the NASA fuzzy logic susceptibility model that distinguishes five classes from very low to very high. Although we identified two other maps of landslide susceptibility across the conterminous USA (Krohn and Slosson 1976; Wiggins et al. 1978), we could not locate adequate copies to digitize for the sake of comparison with our inventory. Whereas the Radbruch-Hall et al. (1982) map reflects interpretation of landslide occurrence by geologic formation and terrain (Fig. 1a), the Godt et al. (2012) model reflects topographic characteristics of steep slopes and high relief (Fig. 1b), and the Stanley and Kirschbaum (2017) model considers topographic slope as well as geologic classifiers, proximity to roads and faults, and recent forest loss (Fig. 2a). These three previous products are shown overlaid with the landslide database in Fig. 3. Although the new USGS database does include some landslides in Hawaii, Alaska, and Puerto Rico, those states and territory are not represented in two of these three maps; thus, for simplicity, we only consider the conterminous USA for the present study.
Visual comparison to susceptibility maps
Initial visual comparisons across the country reveal that the mapped landslides in the database generally fall within areas modeled as potentially susceptible to landslides, or SOME, by Godt et al. (2012) due to their steeper slopes and higher relief (Fig. 3b). Additionally, the areas with substantial concentration of mapped landslides with higher confidence ratings (3-8) typically coincide with landslide-prone geologic terrains that are classified as HIGH or HIGH-MOD by Radbruch-Hall et al. (1982) (Fig. 3a) or among the VH or H susceptibility classes of Stanley and Kirschbaum (2017) (Fig. 3c). Conversely, our new compilation includes very few landslides across the vast Midwest and central regions of the country that were modeled as SOME (Fig. 3b, c). Furthermore, considerable areas that were also classified as either HIGH or HIGH-MOD in the landslide overview map or as VH or H in the NASA model do not include any mapped landslides (Fig. 3a, c). On the other hand, the regions where the greatest number of landslides have been mapped vary considerably, which is apparent where jurisdictional boundaries such as state borders or topographic quadrangles are clearly visible features in the data. Obviously, these are not linear boundaries between different landslide processes, but rather highlight differences in methodology, such as the way landslides are mapped (corresponding to the confidence rating), whether landslides are mapped as points or polygons, and in some cases whether landslides are even mapped at all.
Quantitative evaluation of susceptibility classes
The visual comparisons above are supported by a straightforward quantitative analysis, in which we calculate the percentage of the 294,454 individual landslides in the conterminous USA (Table 2) that fall within each of the susceptibility classes for the three national-scale products we considered (Fig. 4). Given the different nature and number of classes in each of the three products and the incompleteness of the national database, a direct comparison of their accuracy is not possible. However, these quantitative metrics of landslide occurrence by susceptibility classes do facilitate some interesting observations and reveal potential issues with each of these three products, as well as with the compiled landslides database.
The landslide overview map includes 59% of landslides within the three highest classes HIGH, HIGH-MOD, and MOD, but 37% are within the LOW class, which is the greatest number of landslides in any class. The NASA fuzzy logic model includes 51% of landslides in the top two VH and H classes and only 1% in the lowest VL class, but 42% fall within the M class, which is the greatest number of landslides, and 7% of landslides are in the L class. Thus, the landslide overview map and NASA fuzzy logic model do correctly identify many high susceptibility areas where the majority of the landslides are mapped, but we also conclude that both substantially underestimate the potential for landsliding in the more moderate and lower susceptibility classes.
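The percentage calculation described above can be illustrated with a minimal sketch. It assumes that each landslide has already been assigned the susceptibility class of the map unit or raster cell it falls in (that spatial overlay step is not shown), and the function name `percent_by_class` and the toy class labels are hypothetical choices made only for illustration, not part of the database or its tooling.

```python
from collections import Counter


def percent_by_class(class_labels):
    """Return the percentage of landslides that fall in each class.

    class_labels: iterable with one susceptibility class label per landslide
    (assumed to come from a prior spatial overlay, which is not shown here).
    """
    counts = Counter(class_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no landslides were provided")
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy labels for eight hypothetical landslides (not real inventory data).
toy_labels = ["HIGH", "MOD", "LOW", "LOW", "MOD", "HIGH-MOD", "LOW", "MOD"]
shares = percent_by_class(toy_labels)
```

Because the shares are computed relative to the total number of mapped landslides, they describe where the known landslides fall and not how likely a landslide is within a given class.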
The USGS topographic susceptibility model achieves its objective of broadly distinguishing between areas that do and do not include mapped landslides, since 98% are correctly classified as SOME and only 2% of landslides fall within the NONE class. However, the NASA fuzzy logic model is even more effective at this objective and includes only 1% of landslides in the lowest VL class. Whereas both these models that consider slope and involve calibration against landslide inventories can correctly identify areas of low susceptibility, the landslide overview map greatly underestimates landslide potential with 37% of our landslides falling within the LOW class. This highlights the substantial and potentially catastrophic errors that can result not only from ignoring topography but also from misinterpreting the hazards posed by certain landslide-prone geologic units. For example, numerous rockfalls have been documented in the intrusive igneous rocks of the Sierra Nevada in eastern California (e.g., Stock et al. 2013) and the fatal landslide near Oso occurred in particularly landslide-prone glacial outwash deposits that are common throughout western Washington (Iverson et al. 2015; Collins and Reid 2019), yet both these geologic terrains were classified as LOW by Radbruch-Hall et al. (1982). It is difficult to fully explain why numerous documented landslides in the conterminous USA occur within the moderate susceptibility classes of the USGS landslide overview map and the NASA fuzzy logic model (i.e., MOD has 36%, M has 42%). In the case of the landslide overview map, this observation could be related to the large area of the western states classified as MOD, which coincides with the very thorough and systematic mapping of landslides that has been established in Washington, Oregon, and California. Indeed, 67% of the mapped landslides in our inventory are found in these three West Coast states (Table 2).
Fig. 3 Maps of known landslide occurrence across the conterminous USA with color indicating confidence metric (see Table 1), overlain with (a) the USGS landslide overview map (Fig. 1a), (b) the USGS topographic susceptibility model (Fig. 1b), and (c) the NASA fuzzy logic susceptibility model (Fig. 2a)
For the NASA fuzzy logic model, it could simply be that the M susceptibility class covers very large areas of the country, including much of the Pacific Northwest, Rocky Mountains, and Appalachian Mountains, whereas a much smaller area of the country falls within the higher H and VH classes. However, neither susceptibility map accounts for the temporal component of landslide occurrence, and our database includes both pre-historical and recent landslides, without consideration for landslide frequency. Thus, in both the landslide overview map and the NASA fuzzy logic model, the large number of landslides in the moderate categories could be due to a reporting bias. Population centers, roads, and infrastructure tend to be less concentrated in the areas that are the most susceptible to landsliding (or more concentrated in areas that are less susceptible); at the same time, landslides tend to be reported and recorded more frequently when human activities are impacted. Therefore, reports of landslide occurrence tend to be more common in lower to moderate susceptibility zones.
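One way to examine the area effect suggested above is to normalize each class's share of landslides by its share of mapped area, a quantity often called a frequency ratio. This normalization is not part of the analysis reported here; the sketch below is only an illustration, the function name `frequency_ratio` is hypothetical, and the counts and areas are toy values rather than measurements from any of the three products.

```python
def frequency_ratio(landslide_counts, class_areas):
    """Return (share of landslides) / (share of area) for each class.

    landslide_counts: dict mapping class label -> number of landslides
    class_areas: dict mapping class label -> area (any consistent unit)
    A ratio above 1 means a class holds more landslides than its area
    alone would suggest; a ratio below 1 means fewer.
    """
    total_count = sum(landslide_counts.values())
    total_area = sum(class_areas.values())
    if total_count == 0 or total_area == 0:
        raise ValueError("counts and areas must not sum to zero")
    ratios = {}
    for label, area in class_areas.items():
        if area <= 0:
            raise ValueError(f"class {label!r} has non-positive area")
        landslide_share = landslide_counts.get(label, 0) / total_count
        area_share = area / total_area
        ratios[label] = landslide_share / area_share
    return ratios


# Toy values only: a large moderate class and a small high class.
toy_counts = {"H": 40, "M": 40, "L": 20}
toy_areas = {"H": 10.0, "M": 50.0, "L": 40.0}
ratios = frequency_ratio(toy_counts, toy_areas)
```

In this toy case the H and M classes contain the same number of landslides, yet the ratio for H is five times larger because H covers one-fifth of the area of M. Such a normalization would not remove the reporting bias discussed above.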
In addition to the different number and type of susceptibility classifications used in these three products, the disparate input data and variability within the landslide database complicate any objective or quantitative comparison of their performance at the national scale. The USGS landslide overview map is based solely on geologic formations at the 1:2,500,000 scale (and then reduced to 1:7,500,000 for publication); the simplified USGS topographic susceptibility model is based on topographic slope and relief at roughly 30 m resolution; and the NASA model uses the same topographic data considering only the slope angle, but also includes geology, roads, faults, and forest loss in the fuzzy logic calibration. Of course, topography and geology are not completely independent, particularly when viewed at such coarse resolutions. However, despite the disparate data inputs, our interpretation of Figs. 3 and 4 indicates that more work is needed both to improve all these existing susceptibility models and to compile a complete and more comprehensive landslide inventory database.

Regions of interest and areas for improvement
The general qualitative and quantitative inferences of variability and incompleteness that we observe at the national scale (Figs. 3 and 4) are also apparent within the three broad regions that display the highest concentration of mapped landslides (Fig. 5): (a) the Pacific Northwest, (b) the southern Rocky Mountains, and (c) the Appalachian Mountains. To differing degrees, these three regions also tend to coincide with areas on the landslide overview map that include the higher susceptibility and incidence classes. The combination of high landslide concentration with higher confidence data is found in areas classified as high susceptibility and incidence on the landslide overview map, but these may be directly adjacent to other areas that had been similarly classified yet exhibit no mapped landslides. This apparent contradiction further reinforces the reality that the current inventory database is far from complete. For example, in northwestern California (Fig. 5a), a high concentration of landslides abruptly stops at topographic quadrangle boundaries. Similar abrupt changes occur at the borders between states such as Kentucky and Ohio or North Carolina and Georgia (Fig. 5c). These situations clearly indicate that further mapping is needed to perform consistent analyses. Thus, the map of our landslide database can be used to identify areas with dense data coverage and high-confidence mapping, which would be suitable for development of various types of landslide hazard assessments, including quantitative susceptibility modeling, as well as subjective landscape-driven methods to derive the important factors that influence landslide occurrence.
The areas where high-confidence data coincide with previous assessments of high susceptibility indicate that other areas that were designated as higher susceptibility or even modeled as potentially susceptible should be examined more closely. These other areas with steep topography and high relief designated as potentially susceptible to landslides by the calibrated USGS (Fig. 3b) or the NASA (Fig. 3c) susceptibility models likely do incur landslides, but those may not have been identified yet due to incomplete mapping or features that have been obscured by vegetation growth or other changes over time. Such areas that are potentially hazardous may also include landslides that have been mapped, but information is not readily accessible in online or public databases. In contrast, landslides were identified in areas not recognized by the landslide overview map, such as California's Sierra Nevada or the area surrounding Oso, Washington (Fig. 3a), but these areas do reflect the importance of slope and relief (Fig. 3b). Sparse landslides identified throughout the Midwest and Central States are also in areas previously classified as low susceptibility and incidence, or even modeled as unlikely to be prone to landslides. While landsliding is certainly more prominent in areas with steeper topography and higher relief that are already recognized as potentially hazardous (i.e., Pacific Northwest, Rocky Mountains, and Appalachia), the previous low-susceptibility classifications across much of the country do not necessarily indicate that landsliding is improbable (see also Fig. 4). Indeed, the landslides across the central USA are all integrated from NASA's GLC (Kirschbaum et al. 2015), which means they are recent (since 2007). In contrast, many regions with higher concentration of mapped landslides include low-confidence geologic mapping of Quaternary landslide deposits, such as large portions of western Colorado.
Differences in data availability and quality across the country reflect the contrasting approaches to landslide mapping, which are a product of the regulatory environment, the limited resources available, and whether development has expanded into landslide-prone terrain. In some cases, such as New Mexico, numerous landslides were mapped as points, albeit with lower confidence methods, whereas in neighboring Arizona, selective mapping of fewer landslides as polygons with greater confidence is more prevalent (Fig. 5b). On the West Coast, lidar and high-resolution aerial imagery are being used to systematically map landslides within counties in Washington State and by topographic quadrangles in California, but in between them, Oregon stands out for an even greater coverage of high confidence and likely landslides (Fig. 5a). In the eastern USA, Kentucky, North Carolina, and Vermont stand out from neighboring states, even though steep topography, high relief, and landslide-prone geologic units are consistent across state boundaries throughout the sub-ranges of the Appalachian Mountains (Fig. 3b and Fig. 5c). These are just a few very broad examples that illustrate where further landslide mapping is likely needed.
Fig. 4 Pie charts showing the percentage of landslides from the national inventory found within each of the susceptibility classes for the (a) USGS landslide overview map (Radbruch-Hall et al. 1982; Godt and Radbruch-Hall 1997; Fig. 3a), (b) USGS topographic susceptibility model (Godt et al. 2012; Fig. 3b), and (c) NASA fuzzy logic susceptibility model (Fig. 3c)
Potential utility and future opportunities
Our current map of landslides within the USA and its associated database are the result of a broad community effort, which highlights the importance of working together toward the set of common and overlapping objectives and outcomes described in the section "Better tools to increase awareness and evaluate current understanding". While certainly not comprehensive, these products represent a successful collaboration between numerous state and federal agencies to characterize landslide occurrence at the national scale. Additionally, the centralized public access has already encouraged further data sharing, new research, and awareness about landslide occurrence.
Abbreviations: AL Alabama, AZ Arizona, CA California, CO Colorado, GA Georgia, ID Idaho, IN Indiana, KY Kentucky, MT Montana, NV Nevada, NM New Mexico, NC North Carolina, OH Ohio, OR Oregon, SC South Carolina, TN Tennessee, UT Utah, VA Virginia, WA Washington, WV West Virginia
The parsimonious database structure is inclusive of even the most basic landslide inventories, but at the same time our confidence metric allows users to isolate the highest quality data for novel research applications, such as training landslide detection and mapping algorithms. The database still includes critical information on whether fatalities were incurred, whether the date of occurrence is known, and unstructured notes on the failure mode, damages, impacts, or whatever other information is available. Thus, the database can be used not only to map the geographic location of landslides but also to identify those events that have resulted in extraordinary losses, which could help refine models for quantifying landslide risk. Similarly, researchers can easily select the events with precise timing information needed to develop and evaluate thresholds for landslide warning systems. The interactive, searchable map of landslide occurrence has prompted general inquiries from both the media and public about landslide studies and the inconsistency of landslide mapping across the USA. Overall, the database is successfully meeting our objectives of providing open access to landslide data, facilitating a variety of new research activities, and promoting awareness about landslide occurrence across the country.
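The kinds of selection described above (highest confidence records, events with known dates, events with fatalities) can be sketched as a simple filter. The record layout below is an assumption made for illustration: the field names `record_id`, `confidence`, `date`, and `fatalities` are hypothetical and may not match the attribute names in the published geodatabase, and the confidence rating is assumed to be an integer in which larger values indicate higher confidence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LandslideRecord:
    # All field names are hypothetical stand-ins for database attributes.
    record_id: str
    confidence: int            # assumed: larger value = higher confidence
    date: Optional[str]        # None when the date of occurrence is unknown
    fatalities: Optional[int]  # None when fatalities were not reported


def select_records(
    records: List[LandslideRecord],
    min_confidence: int = 1,
    require_date: bool = False,
    require_fatalities: bool = False,
) -> List[LandslideRecord]:
    """Keep only the records that satisfy every requested criterion."""
    selected = []
    for rec in records:
        if rec.confidence < min_confidence:
            continue
        if require_date and rec.date is None:
            continue
        # Treat both None and 0 as "no reported fatalities".
        if require_fatalities and not rec.fatalities:
            continue
        selected.append(rec)
    return selected


# Synthetic records for illustration only (not entries from the database).
toy_records = [
    LandslideRecord("ls-001", 8, None, None),
    LandslideRecord("ls-002", 5, "2001-01-01", 0),
    LandslideRecord("ls-003", 2, "2002-02-02", 1),
    LandslideRecord("ls-004", 8, "2003-03-03", 2),
]

high_confidence = select_records(toy_records, min_confidence=8)
dated_events = select_records(toy_records, require_date=True)
fatal_events = select_records(toy_records, require_fatalities=True)
```

Keeping the criteria independent lets a user combine them, for example requesting only high-confidence records that also have a known date.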
Even landslide inventories developed with high-quality lidar data and rigorous analyses are rarely complete; the lack of landslide points or polygons at any given location does not guarantee the lack of landslides, but rather it points to the lack of a publicly available geospatial database that can confirm either the absence or occurrence of landslides. Although individual states are leading the way in developing comprehensive and high-confidence landslide catalogs within their boundaries (Wooten et al. 2007, 2017; Burns and Madin 2009; Crawford 2014; Slaughter et al. 2017; Wills et al. 2017), it is important to provide these data in the context of national-scale understanding to identify regions that have likely received less attention or resources to assess landslide hazards and associated losses. Our semi-quantitative confidence metric and comparisons to previous national-scale susceptibility maps (Figs. 3 and 5) point to areas where landslide mapping may be lacking or where data are not accessible, which could inform future work and funding decisions. Such comparisons can not only guide further mapping, but also help us to develop improved susceptibility models and disaster management plans that account for the broader geologic and geographic contexts across state borders or other jurisdictional boundaries.
In summary, the database enabled the first objective evaluation, presented herein, of previous national-scale landslide susceptibility products. The compilation can ultimately inform other research and more general hazard assessments for disaster management plans, transportation routes, and potentially insurance or other private industries. Finally, it is our intention that the openly accessible format will continue to motivate ongoing contributions to further improve landslide characterization and awareness across the country.

Acknowledgments
This work was supported in part by the United States Geological Survey's Landslide Hazards Program and the Community for Data Integration. All data can be found in Jones et al. (2019) and online at: https://doi.org/10.5066/P9E2A37P. We are grateful to Brian Collins and two anonymous reviewers for providing constructive comments. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US Government. Please contact us (GS-HAZ_landslides_inventory@usgs.gov) with inquiries or further contributions of geospatial data for inclusion in the future updates of the US national landslide database.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.