1 Introduction

Inland flood exposure continues to rise in the USA, driven by changes in precipitation and development in floodplains. Heightened exposure has translated into economic impacts, as evidenced by increases in both average annual losses (ASFPM 2020) and billion-dollar events (NCEI 2020). The leading measures of flood impact tend to focus on direct damage to physical assets, painting a picture of what is exposed and to what degree. Much less is understood about who is exposed to floods, and what is known is largely based on local post-disaster studies. National-level understanding of population exposure is limited, in part constrained by the lack of spatially contiguous floodplain data.

Flood exposure is higher for socially vulnerable populations (Lee and Jung 2014; Rolfe et al. 2020), especially for inland floods (Qiang 2019). Social vulnerability results when social, political, and economic processes combine to produce heightened susceptibility to hazards for some populations (Cutter et al. 2003; Emrich and Cutter 2011). Vulnerable groups often inhabit flood-prone areas due to societal barriers related to social stratification, and their exposure has been examined in the USA (Adeola and Picou 2012; Lee and Jung 2014) and around the world (Kaźmierczak and Cavan 2011; Rolfe et al. 2020). Spatial indicators are regularly applied to measure and model dimensions of social vulnerability (e.g., age, race, poverty) and can deepen the understanding of the social dimensions of flooding.

Quantitative assessments of social vulnerability to floods have generally employed two methodological approaches. The first is integrated analysis, in which geospatial layers of flood hazard and social vulnerability indicators are superimposed and compared to identify where high levels of each coincide (Finch et al. 2010; Emrich and Cutter 2011). In the second approach, indices are constructed and mapped in flood-prone places to identify which dimensions of social vulnerability dominate (Zhang and You 2014; Mavhura et al. 2017). Each approach has tended to employ generic indicator sets at local and regional scales, despite conceptual consensus that social vulnerability is hazard-specific (Rufat et al. 2015). There remains limited empirical understanding of the spatial distribution and determinants of social vulnerability to floods at the national level. Such understanding is needed to support changes in flood mitigation policy toward social equity.

This study explores the geography of social vulnerability to inland flood exposure in the conterminous United States (CONUS). Our objectives are to identify the most vulnerable places and develop a set of indicators of social vulnerability to flood exposure. In doing so, we apply flood hazard maps with a unique national-level combination of continuous spatial coverage, representation of fluvial and pluvial hazards, consistent spatial resolution, and high accuracy (Wing et al. 2017) compared to previous analyses of social vulnerability. Through combining the floodplain data, land cover, social indicator modeling, and bivariate spatial cluster analysis, we address two research questions: where do high flood exposure and social vulnerability coincide in the CONUS, and which socially vulnerable populations distinguish these places?

2 Background

2.1 Flood exposure

Flood exposure refers to valued societal elements (e.g., people, buildings) located in floodplains (de Moel et al. 2011; Koks et al. 2015). Analyzing flood exposure generally entails two steps: delineating the extent of flood hazard and aggregating the intersecting population and/or built environment assets. The primary source for delineating US flood hazard is maps approved by the Federal Emergency Management Agency (FEMA). FEMA maps depict the spatial extent of the 1% annual chance flood (100-year flood) for communities enrolled in the National Flood Insurance Program. While widely available, FEMA flood maps have varying levels of quality and spatial coverage, with notable gaps in small catchment areas and low population communities (Qiang et al. 2017; Wing et al. 2017). According to a recent estimate, FEMA maps cover only one-third of the nation’s stream miles (ASFPM 2020). Nationwide, there are also significant differences among FEMA maps in the underlying input data, analytical methods, and recency (Wing et al. 2018; Pralle 2019).

Continental-scale mapping initiatives have begun to address quality and coverage gaps inherent in local flood hazard maps (Sampson et al. 2015; Dottori et al. 2016). Inland flood maps are now available for the USA that are contiguous, high-resolution, span multiple flood return periods, and exhibit accuracy approaching that of local studies (Wing et al. 2017). These new maps also incorporate pluvial hazard, which is not reflected by FEMA maps yet is a significant source of flood exposure to people and the built environment (Houston et al. 2011; Grahn and Nyberg 2017; NASEM 2019). Pluvial flooding occurs when heavy rainfall causes localized flooding independent of a river or overflowing waterbody. By contrast, fluvial flooding occurs when high flow in rivers spills into the floodplain and is the focus of most inland flood mapping efforts. Newer initiatives are combining fluvial and pluvial maps with coastal flood maps, with the aim of deriving continental-scale risk maps across flood hazard types (First Street Foundation 2020).

Aggregating exposed populations or built environment assets requires pinpointing their locations within the floodplain. Common inventory sources include point-level building and facility datasets, polygonal census and tax assessor data, and gridded population data. Polygonal and gridded datasets are the most widely available, but their boundaries may contain land types like open water, barren rock, and perennial snow where people and assets are largely absent. Including these land classes in exposure assessments can lead to misalignment of people and exposed areas. Dasymetric mapping addresses this problem, employing ancillary data to spatially distribute coarse gridded or polygonal data to realistic locations (Mennis 2003; Maantay et al. 2007; Prasad 2016). Land use/land cover data are the most oft-applied type of ancillary data (Zandbergen and Ignizio 2010), and dasymetric population mapping is now often employed in flood exposure and risk analysis. A recent study employed population data, dasymetric mapping, and flood maps for the CONUS to estimate that 41 million people are exposed to the 100-year floodplain (Wing et al. 2018), compared to lower estimates of 13.0–21.8 million based on FEMA maps (Wing et al. 2018; Qiang 2019).

2.2 Social vulnerability to floods

Population estimates are valuable for defining the general severity of flood exposure, but such aggregated measures inform only how many people are exposed, not who. Disaggregating exposed populations is important because socially vulnerable populations disproportionately inhabit flood-prone areas (Platt 1998; Lee and Jung 2014). Root causes include social stratification and associated paucity of political and economic power and resources that limit the locational choices of vulnerable groups in avoiding hazardous areas (Blaikie et al. 2014). For example, a study in central Texas reported comparatively lower house values, lower incomes, and more mobile homes inside the floodplain (Lee and Jung 2014), while research in the UK found that non-white populations were more flood exposed (Fielding 2018). Higher exposure can lead to greater susceptibility to impacts for socially vulnerable groups during flood disasters (Cutter et al. 2014). Numerous empirical studies have evaluated this notion using post-disaster damage and related socio-demographic information (Laska and Morrow 2006; Adeola and Picou 2012; Kamel 2012; Collins et al. 2013; Hamel et al. 2017; Emrich et al. 2020). Collectively, this research found multiple characteristics of vulnerable populations (e.g., race, poverty, unemployment, lower income) to be associated with more adverse outcomes.

At the national level, FEMA found inhabitants of the 100-year floodplain to have lower household incomes (FEMA 2018), while Qiang (2019) reported similar disparities for additional socioeconomic variables. However, social vulnerability is well understood to extend beyond socioeconomic characteristics and to encompass multiple dimensions associated with heightened hazard susceptibility. Statistical and spatial relationships between exposure and social vulnerability at the national level currently have limited empirical support. In short, who lives in the floodplain and where?

Case study research on flood disasters has significantly contributed to understanding of the relationship between floods and social vulnerability, but translation of findings to the development of flood-specific social indicators is not well established (Rufat et al. 2015). Spatial indicators of social vulnerability are useful tools for flood risk management, including prioritizing areas for flood mitigation measures (Lee 2014), distributing evacuation resources and personnel (Koks et al. 2015), and identifying communities most in need of recovery funding (SCDRO 2017). However, most studies apply generic models of social vulnerability across varying hazard types and disaster phases. Such generic measures continue to be used to spatially model social vulnerability in flood-prone areas (Kotzee and Reyers 2016; Roder et al. 2017), despite important distinctions between modeling social vulnerability in flood-prone places and modeling social vulnerability to floods. The former entails constraining the geographic scope of vulnerability indicators to flood-prone places (Garbutt et al. 2015; Aroca-Jimenez et al. 2017), while the latter analyzes spatial, statistical, and/or process interactions between flood hazards and social vulnerability (Burton and Cutter 2008; Sayers et al. 2018). Related research activities include customizing and testing indicators for flood exposure (Chakraborty et al. 2019b; Collins et al. 2019), flood preparedness (Działek et al. 2019), flood impacts (Fekete 2009; Rufat et al. 2019), and flood recovery (Emrich et al. 2020).

2.3 Flood exposure and social vulnerability

The spatial relationship between flood exposure and social vulnerability has often been studied using integrated analysis. Integrated studies develop and combine geospatial layers of multiple dimensions (e.g., physical, social, ecological) of natural hazards (Burton and Cutter 2008; Tate et al. 2010; Emrich and Cutter 2011; Rahman et al. 2016) to identify places with high multivariate vulnerability and their leading drivers. For example, Finch et al. (2010) combined map layers of tract-level social vulnerability and remotely sensed flood depths from Hurricane Katrina. They used the resulting bivariate map to assess correlation with the rate of recovery. In a similar vein, Koks et al. (2015) combined GIS layers of 50-meter flood hazard, parcel-scale exposure, and social vulnerability, and applied spatial statistics to examine the degree of clustering. Integrated studies are well suited for analyzing and visualizing the spatially varying linkages among physical and social dimensions of vulnerability to hazards. Conceptually, integrated approaches are germane to hazards analysis because the relative effect of exposure on places depends upon relative levels of susceptibility (Luers 2005).

Maps of bivariate spatial distributions are an increasingly common output of integrated geospatial analysis. But if the goal is to locate coincident extremes that are also spatially nonrandom, cluster mapping using Local Indicators of Spatial Association (LISA) can be applied. LISA measures spatial autocorrelation, the degree to which values at one place are similar to those in surrounding areas. LISA statistics can be computed in both univariate and bivariate modes to map statistically significant clusters of phenomena of interest. Previous research has applied univariate LISA statistics to study spatial autocorrelation in social vulnerability (Cutter and Finch 2008; Koks et al. 2015; Armas and Gavris 2016; Frigerio et al. 2018). More rare is the use of bivariate LISA for multidimensional exploration of social vulnerability and natural hazard (e.g., Gaither et al. 2015).

3 Data and methods

Figure 1 outlines the three methodological components of the analysis: construct a social vulnerability index, estimate population flood exposure, and conduct exploratory spatial analysis. We began by constructing a social vulnerability index for CONUS using demographic variables at the census tract scale. Then, using dasymetric mapping techniques, we calculated a tract-level measure of population flood exposure: percent habitable flooded area in each census tract. Finally, using the social vulnerability index and exposure measures, we applied bivariate LISA to map spatial clusters of extremes of flood exposure and surrounding social vulnerability.

Fig. 1 Spatial analysis flowchart

3.1 Social vulnerability index

We derived variables related to social, economic, and demographic parameters from the 2012–2016 release of the American Community Survey (ACS), for each census tract in the CONUS (n = 71,901). The ACS data were used to build an indicator set of 29 variables (Table 1) drawn from the latest incarnation of the Social Vulnerability Index (SoVI) (HVRI 2015), using the Vulnerability Mapping and Analysis Platform (UCF 2020). Most of the variables have a positive relationship with social vulnerability, i.e., they increase in value with increasing vulnerability. The few variables that have an inverse conceptual or empirical relationship with social vulnerability are bolded in Table 1.

Table 1 Input social vulnerability indicators

We proceeded to create a social vulnerability index based on the SoVI algorithm of Cutter, Boruff, and Shirley (2003). Research examining the accuracy and stability of different social vulnerability indices has had conflicting findings (Schmidtlein et al. 2008; Bakkensen et al. 2017; Rufat et al. 2019; Spielman et al. 2020). We selected the SoVI algorithm and variable set due to its widespread adoption and the exploratory nature of this study. The SoVI algorithm employs principal components analysis (PCA) to reduce a large number of input indicators to a smaller and decipherable number of latent factors. Prior to the PCA, we reversed the sign of each bolded variable in Table 1 to account for the variable’s influence (positive/negative) on social vulnerability. We employed the statistical package SPSS (version 25) to implement the PCA, applied a Varimax rotation to the resulting components, and used the Kaiser criterion to extract components with an eigenvalue of at least 1.0. This resulted in seven components that collectively explained 69% of the variance in the original indicator set. We then additively aggregated the factors into a social vulnerability index (Footnote 1). The spatial distribution of the resulting index is displayed in Fig. 2.
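The index construction described above can be illustrated with a minimal NumPy sketch. It is not the SPSS workflow used in the study: it omits the Varimax rotation, and it orients each component by the sign of its summed loadings, which is an assumption standing in for a substantive review of component direction. The function name `sovi_sketch`, its arguments, and the synthetic data are hypothetical.

```python
import numpy as np

def sovi_sketch(X, inverse_cols):
    """Build an additive PCA-based index from a tracts-by-indicators array.

    X and inverse_cols are hypothetical inputs, not the study's data.
    Returns (index, number of retained components, share of variance explained).
    """
    X = np.asarray(X, dtype=float).copy()
    # Reverse the sign of indicators that decrease vulnerability as they rise.
    X[:, inverse_cols] *= -1.0
    # Standardize each indicator to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # PCA via eigendecomposition of the indicator correlation matrix.
    corr = (Z.T @ Z) / Z.shape[0]
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Kaiser criterion: retain components with an eigenvalue of at least 1.0.
    keep = eigvals >= 1.0
    vecs = eigvecs[:, keep]
    # Assumption: eigenvector signs are arbitrary, so orient each component
    # so that its loadings sum to a non-negative value.
    vecs = vecs * np.where(vecs.sum(axis=0) < 0, -1.0, 1.0)
    # Component scores, rescaled to unit variance so components weigh equally.
    scores = (Z @ vecs) / np.sqrt(eigvals[keep])
    index = scores.sum(axis=1)
    explained = eigvals[keep].sum() / eigvals.sum()
    return index, int(keep.sum()), float(explained)


# Synthetic demonstration: 200 tracts, 6 indicators sharing one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X_demo = latent + 0.5 * rng.normal(size=(200, 6))
X_demo[:, [4, 5]] *= -1.0  # pretend the last two indicators are inverse
index, n_kept, explained = sovi_sketch(X_demo, inverse_cols=[4, 5])
print(n_kept, round(explained, 2))
```

Because the scores are rescaled to unit variance before summation, each retained component contributes equally to the index, which is one simple reading of additive aggregation.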

Fig. 2 Social vulnerability index for the CONUS at the census tract scale

Regions of high social vulnerability are most evident in rural areas of the US Southwest and US South, driven by high Native American populations in the Southwest and rural racialized poverty long associated with the Black Belt region of the South (Wimberley and Morris 2002). The spatial distribution of high social vulnerability is more contiguous compared to previous national analysis at the county scale (Cutter and Finch 2008). By contrast, places with low social vulnerability are more spatially dispersed, and regionally occur to the highest degree in the Midwest and along the Northeastern Seaboard.

3.2 Population flood exposure

To develop a tract-level measure of places exposed to inland flooding, we began with geospatial datasets of flood hazard and land cover. Flood hazard for the 100- and 500-year return periods was represented using flood depth grids obtained from Fathom, which cover the CONUS at a horizontal resolution of 1 arc-second (~ 30 m). The flood grids reflect both fluvial (riverine) and pluvial (surface water) hazard, and represent all CONUS locations because there is no minimum catchment area associated with the underlying methodology (Footnote 2). Following Wing et al. (2017), we removed all cells from the pluvial grid with a depth of less than 15 cm, as this represents a typical ground-floor threshold depth beyond which damage may be expected. We then mosaiced the fluvial and pluvial grids, and in the resulting dataset classified all cells with a depth greater than zero as wet. Validation testing of Fathom maps based on this process has shown correspondence of 86%–92%, compared to high-quality and local-scale maps from FEMA and the US Geological Survey (Wing et al. 2017).
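The grid processing described above can be sketched on small arrays. This is a minimal illustration, assuming depths in meters, NaN for no-data cells, and a mosaic rule that keeps the deeper of the two depths at each cell; the mosaic rule and the function name `wet_mask` are assumptions, not details given in the text.

```python
import numpy as np

PLUVIAL_MIN_DEPTH_M = 0.15  # the 15 cm threshold stated in the text

def wet_mask(fluvial_depth, pluvial_depth):
    """Return a boolean grid that is True where a cell is classified as wet."""
    # Treat no-data (NaN) cells as dry.
    fluvial = np.nan_to_num(np.asarray(fluvial_depth, dtype=float), nan=0.0)
    pluvial = np.nan_to_num(np.asarray(pluvial_depth, dtype=float), nan=0.0)
    # Remove pluvial cells shallower than 15 cm.
    pluvial = np.where(pluvial < PLUVIAL_MIN_DEPTH_M, 0.0, pluvial)
    # Assumed mosaic rule: keep the deeper of the two depths per cell.
    combined = np.maximum(fluvial, pluvial)
    # Any remaining depth greater than zero counts as wet.
    return combined > 0.0


fluvial = np.array([[0.0, 0.05, 1.2],
                    [0.0, 0.0, 0.0]])
pluvial = np.array([[0.10, 0.10, 0.0],
                    [0.15, 0.30, np.nan]])
print(wet_mask(fluvial, pluvial))
```

Note that the threshold applies only to the pluvial grid, so a shallow fluvial cell (0.05 m in the example) is still classified as wet.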

We obtained land cover data from the 2016 National Land Cover Database (NLCD) of the US Geological Survey (NLCD 2020a). The NLCD is a 30-m resolution raster grid, with each cell representing one of twenty land cover types based on a modification of the Anderson Level II classification system (Footnote 3). To restrict the selection of NLCD pixels to those that could support human population, we used the classification scheme from the EnviroAtlas dasymetric population map (Pickard et al. 2015; USEPA 2015).

The EnviroAtlas dasymetric population dataset spatially classifies levels of human habitability, based on land slope and NLCD land cover (Table 2). We retained only NLCD grid cells with land cover classes corresponding to those of EnviroAtlas Habitability classes 1 through 5. This excluded locations with land cover types including water, perennial snow, and emergent wetlands. It should be noted that other notions of habitability have been applied for dasymetric mapping in flood exposure analysis. For example, the HAZUS flood loss estimation model restricts the distribution of built environment assets to developed and cultivated areas (NLCD classes 21–24, 81–82), while Qiang (2019) used only developed areas to distribute population.

Table 2 Habitability of land cover classes (USEPA 2015; NLCD 2020b)

For each census tract, we identified grid cells that are both habitable and in the floodplain. We then computed the ratio of flooded habitable cells to all habitable cells. The result is an exposure metric for populated places: percent flood-exposed habitable area per tract. Figure 3 shows the spatial distributions of the floodplain (Fig. 3a) and population exposure metric (Fig. 3b). The mean exposure value for the CONUS is 12%.
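The tract-level ratio can be sketched as a zonal count over aligned rasters. The example assumes three co-registered grids (integer tract identifiers, a habitable mask, and a wet mask) and uses hypothetical names throughout; tracts with no habitable cells are returned as NaN, which is a choice of this sketch rather than something stated in the text.

```python
import numpy as np

def percent_flooded_habitable(tract_id, habitable, wet, n_tracts):
    """Percent of habitable cells that are wet, per tract (NaN if none habitable)."""
    tract_id = np.asarray(tract_id).ravel()
    habitable = np.asarray(habitable, dtype=bool).ravel()
    wet = np.asarray(wet, dtype=bool).ravel()
    # Count habitable cells, and habitable cells that are also wet, per tract.
    hab_count = np.bincount(tract_id[habitable], minlength=n_tracts)
    flooded_count = np.bincount(tract_id[habitable & wet], minlength=n_tracts)
    pct = np.full(n_tracts, np.nan)
    has_hab = hab_count > 0
    pct[has_hab] = 100.0 * flooded_count[has_hab] / hab_count[has_hab]
    return pct


tract_id = np.array([[0, 0, 1],
                     [0, 1, 1],
                     [2, 2, 2]])
habitable = np.array([[1, 1, 1],
                      [1, 0, 1],
                      [0, 0, 0]])
wet = np.array([[1, 0, 1],
                [0, 1, 0],
                [1, 1, 1]])
print(percent_flooded_habitable(tract_id, habitable, wet, n_tracts=3))
```

In the example, tract 2 is entirely wet but contains no habitable cells, so it receives no exposure value instead of a misleading 100%.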

Fig. 3 Flood hazard and population exposure for the 100-year return period. a Fluvial and pluvial flood hazard extent; b % Habitable flooded area per census tract

3.3 Exploratory spatial analysis

We used the values of the flood exposure metric and social vulnerability index at each tract in exploratory spatial analysis using bivariate LISA statistics. Bivariate LISA maps depict the spatiality of how the value of one variable is surrounded by values of a second variable (Anselin 1995). More specifically, they identify two forms of spatial dependence: positive spatial autocorrelation (clusters) and negative spatial autocorrelation (heterogeneity). Positive autocorrelation occurs where high values of variable 1 are surrounded by high values of variable 2 (High–High hotspots) or low values surrounded by low (Low–Low cold spots). In places with negative spatial autocorrelation, high values are surrounded by low values (High–Low clusters), or vice versa.

The bivariate LISA measures in this study inform how values of flood exposure are surrounded by values of social vulnerability. To measure local and global spatial autocorrelation and map resulting clusters and their statistical significance, we used the GeoDa spatial data analysis software (version 1.14). Although LISA statistics cannot explain causal mechanisms underlying the resulting spatial clusters, they are well suited to identify places and develop hypotheses for further exploratory and explanatory analyses. Bivariate LISA analysis determines the statistical significance for each cluster and also generates a global spatial autocorrelation statistic for the entire study area that averages the local values (Anselin 2005).
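To make the statistic concrete, the sketch below computes a bivariate local Moran value for each unit, the global value as the mean of the local values, and the quadrant label (HH, HL, LH, LL). It assumes a row-standardized spatial weights matrix in which every unit has at least one neighbor, and it omits the permutation-based significance testing that a package such as GeoDa performs. The function name and the six-unit example are hypothetical.

```python
import numpy as np

def bivariate_local_moran(x, y, W):
    """Local and global bivariate Moran's I of x against the spatial lag of y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    # Row-standardize so each unit's neighbor weights sum to one.
    W = W / W.sum(axis=1, keepdims=True)
    # Standardize both variables to zero mean and unit variance.
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    # Spatial lag: weighted average of y in neighboring units.
    lag_y = W @ zy
    local_i = zx * lag_y
    global_i = local_i.mean()
    # First letter: x at the unit; second letter: y in its neighbors.
    labels = [("H" if a > 0 else "L") + ("H" if b > 0 else "L")
              for a, b in zip(zx, lag_y)]
    return local_i, global_i, labels


# Six hypothetical tracts in a row; each borders the tract on either side.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
exposure = [30.0, 25.0, 20.0, 5.0, 4.0, 3.0]
vulnerability = [2.0, 1.5, 1.0, -1.0, -1.5, -2.0]
local_i, global_i, labels = bivariate_local_moran(exposure, vulnerability, W)
print(labels, round(global_i, 3))
```

The labels alone do not identify clusters; in practice only units whose local statistic passes a significance test would be mapped as HH, HL, LH, or LL.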

4 Results

4.1 The geography of social vulnerability to floods

The global bivariate Moran’s I statistic is 0.1, indicating low positive spatial autocorrelation. This means that on average in the CONUS, the spatial association between social vulnerability and flood exposure has only a small degree of spatial clustering. However, the global statistic can mask substantial local variation in spatial autocorrelation. Hence, we also computed bivariate LISA to map spatial autocorrelation for each census tract. The results demonstrate that there are distinct geographic patterns of spatial clustering. Figure 4 provides maps of bivariate LISA clusters (4a) and their statistical significance (4b). Based on the significance map, the cluster locations remain relatively unchanged even as the significance threshold increases from p < 0.05 to p < 0.01.

Fig. 4 Bivariate LISA of 100-year flood exposure and surrounding social vulnerability. a Cluster map and b Cluster significance

The high–high (HH) hotspots (dark red) are regions where census tracts with higher than average flood exposure are surrounded by tracts with higher than average social vulnerability. The majority occur in the US South, while smaller regional hotspots occur across the CONUS. These spatial patterns are discernible in the CONUS maps for social vulnerability (Fig. 2) and population exposure (Fig. 3b), but are even more pronounced in the bivariate results. The states with the highest percentage of tracts in HH hotspots are Mississippi (44%), Louisiana (35%), Florida (26%), South Carolina (21%), and Arkansas (20%). Urban hotspots are difficult to visualize on the national map due to the small size of census tracts in these places, but they occur in major metropolitan areas, including St. Louis, Chicago, and Houston (Fig. 5). On balance, the census tracts in the hotspots are decidedly rural, with a population density of 73 people per square mile in habitable areas, compared to the national average of 110 for the CONUS (Table 3). Collectively, the HH clusters are home to approximately 19.4 million people.

Fig. 5 Bivariate LISA for selected metropolitan areas

Table 3 LISA cluster characteristics

The high–low (HL) clusters (dark blue) are regions where tracts with high flood exposure have neighboring tracts with low social vulnerability. These clusters are most prominent in the agricultural Midwest and suburban areas of major cities. Tracts in HL clusters have a population density of 221 people per square mile in habitable areas, indicating that these places are much more urban than the national statistic. In total, the HL clusters are populated by approximately 16 million people.

The low–high and low–low clusters also demarcate places of bivariate extremes. The low–high clusters represent areas with low flood exposure with neighboring high social vulnerability and are home to approximately 28 million people. Extreme floods or changes in flood probability in these places could have an outsized impact due to high population susceptibility. Tracts in low–low clusters are more numerous and populated. These places have the lowest levels of flood exposure and social vulnerability and warrant little attention from the perspective of flood vulnerability.

4.2 Spatial indicators of social vulnerability to floods

Due to the exploratory nature of LISA, the resulting spatial clusters lack statistical explanatory power. However, they do indicate places to focus further attention (Anselin et al. 2007). We do so to address research question 2: which indicators distinguish HH hotspots from other places? Table 4 presents the results of the hotspot interrogation, identifying indicators that substantially differ in mean value between tracts in HH clusters and all other tracts. The italicized rows highlight indicators that decrease in mean value when moving into a HH cluster. The results pertain to the 100-year flood with LISA cluster significance of 0.05. We conducted additional analyses for a significance value of 0.01 and for the 500-year return period. Each of these parameter changes produced only a minor shift in locations of clusters and the rank order of indicators.

Table 4 Distinguishing characteristics of high–high clusters

Demographically, the HH hotspots are distinct from other places. Of the 29 indicators, 15 have an average difference of at least 30% when moving into a HH cluster. The largest disparities are for indicators of housing and race, with the hotspots located in places with substantially higher percentages of mobile homes (250% change), African Americans (131%), and Native Americans (111%). Other indicators with large differentials include dependence on extractive industries, female-headed households, lack of health insurance, and limited English proficiency. Socioeconomic indicators of wealth, income, and educational attainment are all much lower in hotspots of flood exposure and social vulnerability. The $156,000 difference in median housing value is particularly striking. We applied a t-test to evaluate if the differences in indicator values between HH hotspots and elsewhere are statistically significant. The p-value for all indicators was statistically significant (< 0.01), except for nursing home residents (0.046) and renters (0.266).
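The comparison between hotspot tracts and all other tracts can be sketched for a single indicator. The percent change is taken here relative to the mean of the non-hotspot tracts, and the test statistic is Welch's unequal-variance t; both are assumptions, since the text does not specify the baseline or the t-test variant. The values are illustrative only and are not drawn from Table 4.

```python
import numpy as np

def compare_groups(values, in_hotspot):
    """Percent change in mean and Welch t statistic, hotspot vs. other tracts."""
    values = np.asarray(values, dtype=float)
    in_hotspot = np.asarray(in_hotspot, dtype=bool)
    inside, outside = values[in_hotspot], values[~in_hotspot]
    # Percent change in the mean when moving into the hotspot (assumed baseline:
    # the mean of tracts outside the hotspot).
    pct_change = 100.0 * (inside.mean() - outside.mean()) / outside.mean()
    # Welch's t statistic, using sample variances (ddof=1).
    se = np.sqrt(inside.var(ddof=1) / inside.size
                 + outside.var(ddof=1) / outside.size)
    t_stat = (inside.mean() - outside.mean()) / se
    return pct_change, t_stat


# Illustrative indicator values for seven tracts, three of them in a hotspot.
values = [10.0, 14.0, 12.0, 4.0, 6.0, 5.0, 5.0]
in_hotspot = [True, True, True, False, False, False, False]
pct_change, t_stat = compare_groups(values, in_hotspot)
print(round(pct_change, 1), round(t_stat, 3))
```

A p-value would follow from comparing the statistic against a t distribution with the Welch-Satterthwaite degrees of freedom, which is left out of this sketch.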

Table 5 identifies indicators that substantially differ in average value between tracts in HL clusters and all other tracts. The italicized rows in Table 5 highlight indicators that decrease in mean value when moving into a HL cluster. The values of all indicators were statistically different between HL clusters and elsewhere (p < 0.01). The HL clusters are census tracts where high flood exposure is surrounded by low social vulnerability. Deconstructing the HL clusters directs attention to the low extreme of the social vulnerability continuum. Instead of asking which social characteristics compound physical flooding in places with high exposure, the HL clusters draw attention to the population characteristics of highly exposed places that are relatively free from added strain from high social vulnerability.

Table 5 Distinguishing characteristics of high–low clusters

The top distinguishing indicators for HL clusters are race and income, with these places home to more Asian residents (157%) and high-income households (114%). Other dominant characteristics include lower proportions of Black and Native American residents, vacant homes, and mobile homes, and higher socioeconomic status (housing wealth, educational attainment, income) compared to the HH hotspots. Vacant homes are an indicator of neighborhood distress (Molloy 2016). As with the HH hotspots, indicators of age, renters, and gender are weak discriminators. Many of the dominant HL indicators overlap with those of the HH hotspots, but with a change in direction. In other words, indicators that increase in value when entering a HH hotspot decrease when entering a HL cluster, and vice versa. Although the HL and HH indicators are mirror images in direction, they are not so in percent change. For example, indicators of vacant housing and Hispanic ethnicity are weak discriminators of HH hotspots but are distinguishing characteristics of HL clusters.

5 Discussion

5.1 Population characteristics of flood vulnerability hotspots

Although not explanatory, the LISA findings help identify places and social vulnerability dimensions that merit deeper attention. What are the societal processes underlying such high population disparities in floodplains, and how can flood mitigation planning be improved to address them? The HH and HL clusters can contribute to this understanding, as they characterize extremes of social vulnerability in highly flood-exposed places. The HH hotspots draw attention to places and populations potentially facing stark inequity in flood exposure, while the HL clusters are places where the spatial linkage between high exposure and high social vulnerability is weak.

For each indicator in Tables 4 and 5, we averaged the values in the percent change column. Table 6 includes all indicators with an average change of at least 40%, producing a set of 12 indicators of social vulnerability to high flood exposure in the USA. The threshold of 40% is arbitrary, but sufficiently high to establish a set of indicators to prioritize for deeper investigation and hypothesis formulation. The table column 'Relationship with Social Vulnerability' denotes the direction that each indicator changes (increase/decrease) when entering a HH cluster. Reversing the signs gives the direction when entering a HL cluster. The leading indicators and their rank order are robust to a change in the flood hazard from the 100-year to the 500-year return period.
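The selection rule can be sketched as follows. Because the HH and HL changes tend to run in opposite directions, the sketch averages their absolute values, which is an assumption about how the averaging was done. The indicator names and percentages are placeholders, not values from Tables 4 to 6.

```python
def priority_indicators(hh_change, hl_change, threshold=40.0):
    """Rank indicators whose mean absolute percent change meets the threshold."""
    # Assumption: average the magnitudes, since HH and HL changes differ in sign.
    averaged = {
        name: (abs(hh_change[name]) + abs(hl_change[name])) / 2.0
        for name in hh_change
        if name in hl_change
    }
    selected = {name: avg for name, avg in averaged.items() if avg >= threshold}
    # Largest average change first.
    return sorted(selected.items(), key=lambda item: item[1], reverse=True)


# Placeholder percent changes for three hypothetical indicators.
hh_change = {"indicator_a": 120.0, "indicator_b": -50.0, "indicator_c": 10.0}
hl_change = {"indicator_a": -60.0, "indicator_b": 40.0, "indicator_c": -30.0}
print(priority_indicators(hh_change, hl_change))
```

Raising or lowering `threshold` shows how sensitive the size of the priority set is to the cutoff, which the text acknowledges is arbitrary.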

Table 6 Priority indicators of social vulnerability to flood exposure in the CONUS

In what ways are the exposure-social vulnerability clusters unique? Principally, the hotspots are places characterized by housing and racial disparities. Previous studies have investigated the flood exposure of mobile home residents and found them to be vulnerable due to widespread siting of mobile home parks in floodplains, structural fragility, and poverty (Shen 2005; Baker et al. 2014; Rumbach et al. 2020). Percentages of Black and Native Americans are also higher in hotspots. Numerous studies have documented the heightened flood exposure of Black residents (Ueland and Warf 2006; Bullard and Wright 2009; Chakraborty et al. 2019a). By contrast, the body of scholarship exploring the flood exposure of Native Americans is thin (Vickery and Hunter 2016). Both populations share historical and contemporary subjection to discrimination, marginalization, and exclusion, processes that are well understood through environmental justice research to produce disproportionate exposure to hazards. Collectively, the prominence of housing and race indicators highlights the multidimensionality of social vulnerability, in that its leading causes extend beyond the common focus on income.

The HH hotspots are also distinguished by the relative absence of some populations, particularly Asian residents and high-income households. Overall, the high ranking of socioeconomic indicators of income, wealth, education, and occupation aligns with existing conceptual and empirical understanding of social vulnerability (Fothergill and Peek 2004; Winsemius et al. 2018; Qiang 2019). However, the direction of the relationship with the Asian (%) variable runs counter to its conceptualization in US indicator sets. It may be that race and ethnicity indicators like Asian (%) and Hispanic (%) are too broadly defined to serve as valid proxies for socially vulnerable populations (Montgomery and Chakraborty 2015).

Based on population density, the hotspots are more rural than the national average. Much of what is understood about social vulnerability to floods stems from studies of disasters in urban areas, particularly those focused on Hurricane Katrina. Our findings suggest devoting greater attention to the driving processes and distinguishing characteristics of social vulnerability in rural flood-prone settings (Cross 2001; Horney et al. 2017; Jamshed et al. 2020). Deeper investigation of rural places may benefit methodologically from experimentation with alternative land use/land cover information, such as the US building footprint dataset (ArcGIS Online 2019). Research has shown estimates of exposed total population to be sensitive to the resolution of flood hazard and population data (Huang and Wang 2020), particularly for rural areas (Smith et al. 2019).

Notably absent from Table 6 are indicators of age and renters that are ubiquitous in social vulnerability models and empirical studies of flood disaster impacts. Their absence does not necessarily mean they are irrelevant to social vulnerability to floods. It could be that the indicators of age and land tenure we employed are weak proxies for the underlying social vulnerability processes. An alternative explanation is that age and renter variables are salient for identifying disproportionality in disaster impacts and recovery outcomes (Jonkman et al. 2009; NASEM 2019), but not for discriminating places with high flood exposure. These variables may also be influential in a manner that is more intersectional than primary (Rufat et al. 2015; Rumbach et al. 2020), as with low-income renters and elderly residents of mobile homes.

Although this study has focused on individual indicators that distinguish hotspots, the findings are also relevant to index construction and indicator validation. The variables in Tables 4 and 6 are directly applicable to indicator selection, while the % Change column could serve as an empirical basis for assigning differential weights in an index. Regarding validation, recent efforts have used disaster outcomes to assess the explanatory power of social vulnerability indicators (Bakkensen et al. 2017; Rufat et al. 2019). The results of this study provide priority indicators for similar evaluation based on flood exposure, and a methodology for doing so with other hazard outcomes.
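As a minimal sketch of how the % Change column might inform differential weighting, the Python example below normalizes absolute percent differences into weights and combines standardized indicator values into a single score. The indicator names, the percent values, and the weighting rule itself are illustrative assumptions and are not drawn from Tables 4 or 6; other weighting schemes would be equally defensible.

```python
def weights_from_pct_change(pct_change):
    """Turn percent differences into weights that sum to 1.

    Assumption: an indicator's weight is proportional to the absolute
    size of its percent difference between hotspots and other places.
    """
    total = sum(abs(v) for v in pct_change.values())
    if total == 0:
        raise ValueError("at least one indicator needs a nonzero % change")
    return {name: abs(v) / total for name, v in pct_change.items()}


def weighted_index(z_scores, pct_change):
    """Combine standardized indicator values (z-scores) for one tract.

    Indicators that are lower in hotspots (negative % change) are
    sign-flipped so that a higher score always means higher vulnerability.
    """
    weights = weights_from_pct_change(pct_change)
    score = 0.0
    for name, weight in weights.items():
        sign = 1.0 if pct_change[name] >= 0 else -1.0
        score += weight * sign * z_scores[name]
    return score


# Hypothetical percent differences (hotspots vs. elsewhere), for illustration only.
example_pct_change = {
    "mobile_homes_pct": 120.0,
    "black_pct": 60.0,
    "median_house_value": -20.0,
}

# Hypothetical standardized values for a single census tract.
example_tract = {
    "mobile_homes_pct": 1.0,
    "black_pct": 0.5,
    "median_house_value": -1.0,
}

example_weights = weights_from_pct_change(example_pct_change)
example_score = weighted_index(example_tract, example_pct_change)
```

In this sketch the sign flip for negative percent differences is a design choice: it lets indicators such as house value, which are lower in hotspots, raise the score when a tract falls below the mean.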

5.2 Implications for flood mitigation

While the clusters identify locations of heightened vulnerability, the distinguishing social indicators within them can inform changes to flood mitigation programs, policies, and interventions aimed at making them more socially equitable (Cutter et al. 2013). Although interventions to reduce flood impacts span disaster phases (i.e., mitigation, preparedness, response, recovery), social vulnerability processes and the most flood-susceptible populations differ from phase to phase (Rufat et al. 2015). The results of this study enable interrogation of the relationship between vulnerable populations and flood exposure, a critical undertaking given connections among exposure, built environment vulnerability, and social vulnerability in explaining flood damage (Highfield et al. 2014). Because the clusters are based on probabilistic flood hazard, our findings are most applicable to flood mitigation and planning.

The cluster locations can be used to tailor mitigation priorities. For example, floodplain managers in HH hotspots could place greater emphasis on implementing solutions that both benefit households with elevated exposure and are customized to be accessible to socially vulnerable populations. Places with similar cluster typology might also be logical partners to share knowledge and best practices (Chang et al. 2018). The HL clusters of the Central Plains and Northeast are flood prone yet have low regional social vulnerability. These are potential places to prioritize exposure reduction. Options include land-altering structural approaches such as levees, flood walls, detention basins, and green infrastructure, as well as nonstructural measures that remove people from risky areas, such as land use planning, buyouts, building elevation, and early warning systems.

However, focusing flood mitigation solely on reducing physical exposure may fail to protect the most socially vulnerable. Given the high flood exposure of HL clusters, these places are certainly sensible targets for mitigation investment. But the HL clusters are also characterized by higher levels of wealth, racial homogeneity, and home ownership, meaning the mitigation benefits are likely to accrue to those with higher-than-average coping capacity. Indeed, previous studies have highlighted income inequities in flood insurance (FEMA 2018), inequities in race, ethnicity, age, wealth, and housing tenure in buyouts (Muñoz and Tate 2016; Siders 2018; Elliott et al. 2020), and racial inequities in storm water infrastructure (Hendricks 2017) and home reconstruction (Bates and Green 2009). Hence, it is important to design mitigation programs that reduce disparities in protection, so that flood-exposed populations equitably reap mitigation opportunities and benefits.

Consider the role of the economic benefit-cost ratio, which is a leading selection criterion for national funding of structural mitigation projects. Projects with high ratios receive priority for US government support, but high ratios are also often associated with high asset values. Median house value is an indicator of asset value, and a major distinguishing characteristic of both HL and HH clusters. Median house value in HL clusters far exceeds that of other places (Table 5), while home values in HH hotspots are much lower than elsewhere (Table 4). As such, selecting structural mitigation projects based on benefit-cost ratio may preferentially allocate resources to places with low social vulnerability, while at the same time leading to funding denials or delays in socially vulnerable neighborhoods.
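The arithmetic behind this concern can be illustrated with a deliberately simplified sketch in Python. It assumes benefits equal avoided damage to homes, computed as homes protected times median house value times a fraction of value saved; actual benefit-cost analyses use discounted streams of expected annual damages and many other benefit categories. All dollar figures, the funding threshold of 1.0, and the function name are hypothetical.

```python
def benefit_cost_ratio(homes_protected, median_house_value,
                       damage_fraction_avoided, project_cost):
    """Simplified benefit-cost ratio: avoided home damage / project cost.

    Assumption: benefits are limited to avoided structural damage, with
    no discounting and no non-monetary or social benefits.
    """
    if project_cost <= 0:
        raise ValueError("project_cost must be positive")
    benefits = homes_protected * median_house_value * damage_fraction_avoided
    return benefits / project_cost


# Two hypothetical projects that are identical except for house values.
HOMES = 100                # homes protected by each project
DAMAGE_AVOIDED = 0.2       # share of home value saved by the project
COST = 4_000_000           # project cost in dollars
FUNDING_THRESHOLD = 1.0    # hypothetical minimum ratio for funding

bcr_hl = benefit_cost_ratio(HOMES, 400_000, DAMAGE_AVOIDED, COST)  # higher-value homes
bcr_hh = benefit_cost_ratio(HOMES, 100_000, DAMAGE_AVOIDED, COST)  # lower-value homes

hl_funded = bcr_hl >= FUNDING_THRESHOLD
hh_funded = bcr_hh >= FUNDING_THRESHOLD
```

Under these assumed inputs the two projects protect the same number of households at the same cost, yet only the project protecting higher-value homes clears the threshold, because the ratio scales directly with asset value.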

If the mitigation goal is to both reduce physical risk and provide remedy for socially vulnerable populations, HH hotspots can be prioritized. Interventions in these places are more likely to address multiple dimensions of flooding: high physical exposure and high social inequity. A social equity emphasis in mitigation means prioritizing resource investments based on satisfying human needs (Sayers et al. 2018; Emrich et al. 2020). Such an approach aligns with the rising prominence of social resilience in flood mitigation (Flatt et al. 2019), which emphasizes strengthening social equity and resources alongside the traditional focus on reducing built environment and sectoral impacts. Given the raft of social disparities in HH hotspots, prioritizing these locations for flood mitigation could be a boon for improving community resilience to floods. Deeper interrogation of HH hotspots should yield not only a better understanding of processes generating social vulnerability, but also of the capacities of people in these places that can be leveraged for vulnerability reduction. The indicators in Table 6 provide a tool to spatially target mitigation investments and evaluate social equity in implemented approaches.

6 Conclusion

Using a unique combination of spatially continuous flood grids and bivariate cluster analysis at the CONUS scale, this study has mapped multidimensional hotspots of flood exposure and social vulnerability and identified dominant demographic characteristics within them. Mobile homes and racial minorities were the most overrepresented in hotspots and warrant greater attention in both flood mitigation and development of social vulnerability indicators. Most current measures of social vulnerability are generalized, meaning similar sets of indicators tend to be applied across a wide range of hazards, geographies, and disaster phases. Few research efforts have connected indicator selection with specific hazard contexts. A primary contribution of this study is the generation of a set of social vulnerability indicators for the context of flood exposure in the USA.

Although the analysis focused on 100-year fluvial and pluvial flooding at the national level and the census tract scale, similar analyses can be implemented for regions, states, or disaster impact areas and can employ different hazard surfaces. The LISA methodology is generalizable to different geographic scopes, analysis scales, flood hazards, and index construction methods. Adjusting these analysis parameters may very well yield different hotspot geographies and dominant indicators to explore (Qiang 2019). It also allows customizing mitigation decisions to the hazard and geographic scope of interest.
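For readers wishing to adapt the approach, the following is a minimal sketch of a bivariate local Moran statistic in Python with NumPy, not a reproduction of this study's workflow. It assumes flood exposure is measured at each tract and social vulnerability is spatially lagged over neighboring tracts, uses a row-standardized binary contiguity matrix, and omits the permutation-based significance testing that a full LISA analysis requires. The function name and the four-tract example data are hypothetical.

```python
import numpy as np


def bivariate_local_moran(exposure, vulnerability, w):
    """Bivariate local Moran's I with quadrant labels (no significance test).

    exposure, vulnerability: 1-D sequences with one value per areal unit.
    w: square spatial weights matrix (nonzero entries mark neighbors).
    Assumption: the first letter of each label refers to exposure at the
    unit, the second to the average vulnerability of its neighbors.
    """
    x = np.asarray(exposure, dtype=float)
    y = np.asarray(vulnerability, dtype=float)
    w = np.asarray(w, dtype=float)

    # Row-standardize so each unit's lag is the mean of its neighbors;
    # units without neighbors keep a lag of zero.
    row_sums = w.sum(axis=1, keepdims=True)
    w_std = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    lag_zy = w_std @ zy

    local_i = zx * lag_zy
    labels = [
        ("H" if own > 0 else "L") + ("H" if lag > 0 else "L")
        for own, lag in zip(zx, lag_zy)
    ]
    return local_i, labels


# Hypothetical data: four tracts arranged in a line (0-1-2-3).
example_exposure = [10, 9, 1, 2]
example_vulnerability = [8, 9, 1, 2]
example_w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

example_i, example_labels = bivariate_local_moran(
    example_exposure, example_vulnerability, example_w
)
```

Swapping in a different hazard surface, weights matrix, or vulnerability index changes only the inputs, which is the sense in which the method generalizes; in practice, only statistically significant clusters would be mapped.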

The exploratory nature of the spatial analysis leads to new questions regarding mitigation decisions for project prioritization, resource allocation, and program design. Do the characteristics of hotspots substantially differ between urban and rural settings? Urban areas are often the focus of flood mapping and vulnerability analysis (NASEM 2019), but far less is understood about the interacting physical and social vulnerabilities of rural places (Rolfe et al. 2020). How do mitigation expenditures compare between HH and HL clusters? Numerous studies have investigated risk reduction approaches such as structural mitigation, buyouts, and flood insurance, but relatively few have explored the associated degree of social vulnerability. What is the relationship among social vulnerability indicators of flood exposure, flood impacts, and flood recovery? Addressing these and related questions can provide evidence to support the development of more socially equitable strategies for flood risk reduction.