1 Introduction

Extreme events have inspired decades of research on risk and vulnerability, with the goal of identifying communities with the least capacity to prepare for, respond to, and rebound from disasters. Unfortunately, studies suggest that beyond the impact of physical hazard, socially disadvantaged communities face disproportionate risk when exposed to natural disasters (Laska and Morrow 2006). Similar observations underpinned the development of early social vulnerability indices, which utilize variability in sociodemographic data to quantify population vulnerability to loss in the face of an event such as a hurricane (e.g., Clark et al. 1998; Cutter et al. 2000; Tapsell et al. 2002).

Although the theoretical links between sociodemographic factors and hazard vulnerability are well established, there remains considerable debate around best practices for measuring social vulnerability (Beccari 2016). Several social vulnerability indices have been proposed to quantify this latent construct, but most are ecological in nature (i.e., measuring vulnerability at various levels of census geography or other units), and few studies have attempted to validate them or otherwise assess their relative utility (Fekete 2019). Decision makers have begun to utilize measures of social vulnerability for allocating disaster recovery resources (New York Times 2020), and nearly 40% of states are planning on using vulnerability measures to prioritize COVID-19 vaccine distribution (NPR 2020; Schmidt et al. 2020). Such use of this theoretical construct and its empirical manifestations, although seemingly ethical in nature, may have unknown or unintended consequences. Although these measures may have face validity and are already being applied, we still do not know how “good” or “useful” they are from a variety of perspectives. Here, utility may take several forms, including ease of construction, internal consistency, robustness, transferability, or explanatory power. Arguably, the best model would perform well on each of these criteria, yet most social vulnerability models have not been rigorously tested.

Validation studies aimed at understanding the utility of social vulnerability surrogates, variable aggregates, and models/indices remain difficult to undertake for a variety of reasons. These include the lack of available point-level data on vulnerability and outcome measures; the ecological fallacy created when aggregating outcomes and social vulnerabilities to various levels of geography; and the challenges of using a multivariate characteristic such as social vulnerability to describe univariate outcomes such as total losses, fatalities, or slow recovery. Recently, Rufat et al. (2019) tested the empirical validity of four social vulnerability indices against Federal Emergency Management Agency (FEMA) claims data from Hurricane Sandy (as a proxy for losses) aggregated to census tracts, with mixed results. The four indices they tested varied widely in their use of data-driven, pillar-based, and expert-opinion approaches.

One of these indices, the social vulnerability index (SoVI), originally developed by Cutter and colleagues (2003), remains among the most heavily referenced and often utilized for understanding capacity in the face of disaster. SoVI’s original formulation (a 42-variable empirical measure of sociodemographic and built environment characteristics of community capacity to prepare for, respond to, and rebound from disaster events) has a long history of utility in both academic settings, with more than 4700 citations, and applied planning tools (NOAA 2020; Surging Seas 2014). Derived as an empirical method to understand, measure, and monitor differences between places, SoVI is the product of well-known individual/community social inequality measures and place inequality measures, or those characteristics contributing to the social vulnerability of places. SoVI, the first large-scale (county-level) composite measure capturing the broad suite of social vulnerability characteristics identified through a literature review, has been largely accepted and applied across academic and practical fields of use. SoVI’s more recent incarnations (27–29 variables focused only on sociodemographics) have long filled the need to understand lack of capacity to prepare for, respond to, and rebound from environmental threats. However, an important distinction here is that SoVI’s original conceptualization centered on building a model based on a retrospective of disaster impacts and outcomes rather than building the most robust model from a psychometric perspective. Nevertheless, the SoVI model has become one of the more notable targets for critical review. Most recently, Spielman and colleagues (2020) observed several problems with two measures used to determine a model’s utility: construct validity and internal consistency. They further argue that SoVI’s construction is not consistently robust with respect to traditional psychometric theory.
Although the recent criticism is specific to SoVI, it is not clear if any existing social vulnerability measures are generalizable beyond context-specific applications. Rather, these limitations may lie in the approach to measuring and assessing the latent constructs of interest.

Although the advent of social vulnerability measurement has yielded some important insights in the disaster risk literature, several limitations remain in the ways in which researchers have attempted to quantify social vulnerability. On the one hand, the concept is theoretically robust; it has long been appreciated that social factors influence one’s experience of hazards and stressors (Turner et al. 2003). On the other hand, data-driven and pillar-based (i.e., construct-based) approaches, such as those tested by Rufat and colleagues (2019), appear to be limited in their generalizability. This hinders progress toward the ultimate goals of developing portable metrics for understanding who is affected by disasters and mitigating related losses.

The development of more generalizable social vulnerability indices may benefit from statistical approaches that combine theoretical or conceptual knowledge of vulnerability with empirical confirmation. Data-driven approaches, such as principal component analysis (PCA) and exploratory factor analysis (EFA), are known for their susceptibility to sample-specific characteristics of the data (Widaman 2018). Models informed by theory, such as the pillar-based approach by Rufat and colleagues (2019), can help reduce sample specificity and overfitting by imposing constraints consistent with theory. At the same time, purely pillar-based or expert-based models account for theoretical knowledge but minimize nuances in the data pertinent to a given disaster or a given community. Moving forward, this area of research can rely on current confirmatory latent variable approaches that incorporate the best of both worlds.

Latent variable models can enhance the generalizability of social vulnerability indices. In particular, confirmatory factor analysis (CFA) is a framework that uses theory-guided a priori knowledge to specify a latent construct or constructs hypothesized to underlie the covariation among a set of indicators (Llabre and Fitzpatrick 2012). Applying CFA to social vulnerability indexing would involve specifying indicators, or observed variables, that are presumed to reflect the construct or constructs of interest. CFA can test single-factor or multifactor models, and for each factor or latent variable, the indicators reflecting the construct are specified. For example, if theory suggests that a socioeconomic factor is important for social vulnerability indices, we can specify census measures (such as per capita income, percent living below the poverty line, and median house value) as indicators of the latent socioeconomic construct. CFA models partition the variability in an indicator into that which is derived from the latent variable and the residual, which contains specific variance and error. The model yields estimates of the loadings associated with each indicator. When standardized, these loadings represent the correlation between the latent variable and the indicator and provide a measure of the adequacy of the indicator in reflecting the construct (Kline 2016; Llabre and Fitzpatrick 2012). Further, squared standardized factor loadings provide the proportion of explained variance of the observed variable. Tests of fit between the model specified and the data provide important information about the adequacy of the theory in a given situation, as well as the adequacy of the indicators. Applying a confirmatory measurement modeling approach to social vulnerability indices may provide a deeper understanding of the latent social vulnerability construct. Additionally, this approach may identify additional social characteristics that can best inform policies to decrease disaster vulnerability.
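The variance partition described above can be written out explicitly. The following is a sketch for a single-factor model, assuming the latent variable is scaled to unit variance and residuals are uncorrelated with the factor and with each other. The symbols (x_i for an indicator, ξ for the latent variable, λ_i for its loading, ε_i for the residual with variance θ_i) are notation introduced here for illustration.

```latex
\begin{align}
x_i &= \lambda_i \xi + \varepsilon_i,
  \qquad \operatorname{Var}(\xi) = 1,\quad \operatorname{Cov}(\xi, \varepsilon_i) = 0, \\
\operatorname{Var}(x_i) &= \lambda_i^2 + \theta_i,
  \qquad \theta_i = \operatorname{Var}(\varepsilon_i), \\
R_i^2 &= \frac{\lambda_i^2}{\lambda_i^2 + \theta_i} = \lambda_i^2
  \quad \text{when } \operatorname{Var}(x_i) = 1, \\
\operatorname{Corr}(x_i, x_j) &= \lambda_i \lambda_j
  \quad (i \neq j,\ \text{standardized indicators}).
\end{align}
```

For instance, a standardized loading of .50 implies that the latent variable explains 25% of that indicator’s variance, leaving 75% as residual variance.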

In the present study, our overarching objective was to apply confirmatory factor analysis (CFA) to evaluate SoVI model construct validity and generalizability following separate natural disasters (Rufat et al. 2019). This study represents a departure from the purely data-driven approaches typically used to quantify social vulnerability by applying a theoretically based pillars approach to SoVI’s raw underlying data. Our overall intent is to leverage the best aspects of prior social vulnerability indices and incrementally create a revised index with stronger empirical validity for our study area, Florida. Our first aim was to fit confirmatory measurement models of each of the five pillars identified by Rufat and colleagues (2019), including socioeconomic status, population structure, race and ethnicity, access and needs, and housing structure (Fig. 1). Assuming we could attain acceptable fit for each measurement model, our second aim was to integrate the individual measurement models into a five-factor model representing the overall SoVI construct and ultimately test its association with FEMA claims data from Hurricane Irma, the fifth costliest hurricane to hit the mainland United States (National Hurricane Center 2018). However, we did not get that far: our analysis encountered several obstacles that highlight some of the underlying challenges that have afflicted past attempts to validate social vulnerability indices. This paper focuses on those methodological issues and discusses the implications for generalizable social vulnerability measures, with recommendations for future attempts to quantify and validate this construct.

Fig. 1 Theoretical correlated factor structure of SoVI

2 Method

2.1 Data and data screening

This study analyzed the 28 sociodemographic census variables commonly used as candidates in the computation of SoVI (Table 1), gathered from the American Community Survey, five-year product (2014–2017) and aggregated to 4,246 Florida census tracts. We could have used alternate spatial units, such as counties or metropolitan statistical areas, but census tracts are the smallest geographic unit for which demographic variables are considered to be statistically reliable (Spielman et al. 2014). Of the initial sample, 84 tracts were excluded due to zero residents, resulting in an effective sample of 4,162 tracts.

Table 1 Descriptive statistics for each of the SoVI census variables at the census tract level

In preparation for testing confirmatory factor models, we examined the variability, skewness, kurtosis, and scaling of each SoVI variable before proceeding with further analyses. Low variability indicates a variable that is relatively stable across census tracts. In the extreme case, when there is no variability for a given measure (i.e., the variable is effectively constant), it is impossible to assess the variable’s quality as an indicator of a latent construct or its relationships to other variables.

In addition to variability, we also addressed variable scaling, as estimation difficulties can arise when variables have drastically different ranges. We chose to rescale percentage variables by multiplying them by 100, and we divided absolute measures such as per capita income, median house value, and median rent by 1000. Measures of skewness and kurtosis provide information about the extent to which variables meet the normality assumptions required by maximum likelihood estimation. The recommended cut points are ± 3 for skewness and 8 for kurtosis (Kline 2016).
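The screening steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the moment-based estimators are one common choice among several, and kurtosis is assumed to be reported as excess kurtosis (zero for a normal distribution). The cut points are the ones stated in the text.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

def rescale(values, kind):
    """Bring variables onto comparable ranges (hypothetical helper)."""
    if kind == "proportion":      # 0-1 proportion -> 0-100 percent
        return [v * 100 for v in values]
    if kind == "dollars":         # dollars -> thousands of dollars
        return [v / 1000 for v in values]
    raise ValueError(f"unknown kind: {kind}")

def screen(values, skew_cut=3.0, kurt_cut=8.0):
    """Flag constants and variables beyond the skewness/kurtosis cut points."""
    if central_moment(values, 2) == 0:
        return {"constant": True, "skewed": False, "heavy_tailed": False}
    return {
        "constant": False,
        "skewed": abs(skewness(values)) > skew_cut,
        "heavy_tailed": excess_kurtosis(values) > kurt_cut,
    }

# Made-up example: a proportion that is zero in 99 tracts and 1.0 in one tract.
example = rescale([0.0] * 99 + [1.0], "proportion")
print(screen(example))
```

A variable flagged as constant would be dropped before model fitting, while a flagged skewed or heavy-tailed variable would be a candidate for transformation.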

2.2 Statistical analyses

2.2.1 Confirmatory factor analysis

We conducted a series of confirmatory factor analyses (CFA) with full-information maximum likelihood (FIML) estimation to test the structure of theoretically driven models of social vulnerability. All statistical analyses were conducted in R version 3.6.3, and CFA was conducted via the lavaan package (Rosseel 2012). For each model, we explicitly specified which indicators (i.e., variables) served as observed measures of a given latent construct. Measures of model fit allowed us to evaluate how well the theoretically specified model was supported by the data. Each CFA was evaluated for overall model fit and reliability, with model fit evaluated according to the following standard criteria: nonsignificant chi-square test (χ²), root mean square error of approximation (RMSEA) less than .08, standardized root mean square residual (SRMR) less than .06, and comparative fit index (CFI) greater than .95 (Kline 2016). A standardized factor loading (λ) less than .30 suggests the indicator is a weak reflection of the latent construct. We also examined Cronbach’s alpha (α), a commonly used metric of internal consistency, for the five measurement models (i.e., for each pillar); α greater than .70 generally indicates acceptable reliability. Indicators that loaded negatively onto a latent variable were reversed (i.e., multiplied by −1) before calculating reliability. Fit indices for CFA models are reported in Table 2. Standardized and unstandardized factor loadings as well as associated significance tests are reported in Table 3.
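The reliability and fit criteria above can be illustrated with a short sketch. This is not the lavaan workflow used in the study: it is a plain-Python illustration with hypothetical function names and made-up data. The chi-square test is assumed to be judged at the conventional .05 level, which the text does not state.

```python
from statistics import variance

def cronbach_alpha(items, reverse=()):
    """Cronbach's alpha for k indicators.

    items   : list of k lists, one per indicator, each with n observations
    reverse : indices of indicators to multiply by -1 before scoring
    """
    k = len(items)
    scored = [[-v for v in col] if i in reverse else list(col)
              for i, col in enumerate(items)]
    totals = [sum(row) for row in zip(*scored)]          # per-observation sum score
    item_variance = sum(variance(col) for col in scored)
    return k / (k - 1) * (1 - item_variance / variance(totals))

def fit_is_acceptable(chi2_p, rmsea, srmr, cfi, alpha_level=0.05):
    """Apply the cutoffs listed in the text (alpha_level is an assumption)."""
    return (chi2_p > alpha_level and rmsea < 0.08
            and srmr < 0.06 and cfi > 0.95)

# Made-up data: the third indicator runs in the opposite direction,
# so it is reversed before alpha is computed.
indicators = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [5, 4, 3, 2, 1],
]
print(round(cronbach_alpha(indicators, reverse={2}), 2))
print(fit_is_acceptable(chi2_p=0.20, rmsea=0.05, srmr=0.04, cfi=0.97))
```

Reversing negatively loading indicators matters because alpha depends on the variance of the sum score: leaving an inversely related indicator unreversed shrinks that variance and deflates the estimate.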

Table 2 Fit indices for single-factor CFA models with subsequent modified models including residual covariances between indicators
Table 3 Unstandardized and standardized factor loadings, with associated test statistics, for the final single-factor models

2.2.2 Initial single-factor measurement model specification

To test our first aim, we specified a set of confirmatory single-factor measurement models based on the pillars suggested by Rufat and colleagues (2019) (Table 1). Socioeconomic status was indicated by per capita income, percent living under the poverty line, median house value, percent earning over $200,000 per year, percent working in the service sector, percent working in the extractive sector, percent with less than 12 years of education, and median rent. Population structure was indicated by median age, percent living with a child younger than 5 years or adult older than 65 years, people per housing unit, percent of female-headed households, percent of females employed, percent of population which is female, and percent of married families. Race and ethnicity was indicated by four variables: percent of population which is African American, Hispanic/Latino, Native American, and Asian American. Access and needs was indicated by percent living in a nursing home, percent receiving social security benefits, percentage of population in which English is a second language, percent of population who do not have an automobile, and percent uninsured. Housing structure was indicated by percent of units occupied by renters, percent of vacant houses, and percent living in a mobile home.

2.2.3 Single-factor model respecification

If specified single-factor models did not fit the data well, we explored modifications to improve fit. While model respecification is an empirically guided step, we sought to make modifications consistent with theory, in line with our overarching aim of increasing the generalizability of the SoVI model. To make modifications, we examined problematic indicators (e.g., low factor loadings) and modification indices suggested by lavaan. Typically, model respecification was done by removing indicators or correlating residual variances between indicators, with the latter modification suggesting that two variables share some relationship above and beyond the covariance explained by the latent variable. In the case of correlating residual variances, it is generally the responsibility of the analyst to interpret the covariance and theorize what external factor might contribute to these associations.

2.2.4 Correlated factor model

Although we observed that the single-factor models were not a good fit, we tested our second aim by fitting a correlated five-factor model that tested the overall fit of the theoretical SoVI structure. All five of the single-factor models described above were included in a model simultaneously, and latent variables were allowed to correlate. We tested the correlated factor model first using the original single-factor models (i.e., with no modifications). We then tested the model again including the respecifications from the single-factor models to see if this would improve fit.

2.2.5 Alternative models

To examine alternative factor structures of SoVI variables, we tested several additional models. These included a single-factor model with a broad social vulnerability latent variable indicated by all previously mentioned indicators and a bifactor model which tested both a general and domain-specific latent variable. A single-factor model implies all census variables represent a common, social vulnerability construct rather than representing distinct but related subfactors, as discussed above—this solution would reject the theory-based pillars of past literature. A bifactor model is a combination of the single-factor model and the specific factor models discussed above and implies all census variables represent two constructs: a general social vulnerability latent variable and a domain-specific latent variable (e.g., socioeconomic status).

3 Results

3.1 Data screening

Many SoVI candidate variables yielded minimal variability across census tracts in Florida; several percentage variables had standard deviations as low as 0.01 (Table 1). Normality was a concern for many of the variables included in SoVI (Table 1 and Fig. 2). Skewness ranged from −2.08 (percent of females employed) to 9.95 (percent of population that is Native American), with a median skewness of 1.68 (SD = 2.27). Similarly, kurtosis ranged from −0.06 (percent of married families) to 170.57 (percent Native American), with a median kurtosis of 4.04 (SD = 33.63). Consequently, all variables were transformed logarithmically to resolve complications in model estimation that can arise due to non-normality. Due to a lack of variability and concerningly high skew, three census variables were removed from subsequent analyses: percent working in the extractive sector, percent of the population which is Native American, and percent living in a nursing home.
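As a rough illustration of why a logarithmic transformation helps with right-skewed indicators, the sketch below compares skewness before and after transforming a made-up sample. The text does not say how zero values were handled, so the offset of 1 added before taking the log is an assumption, and the function names are hypothetical.

```python
import math

def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_transform(values, offset=1.0):
    """Natural log after adding an offset so that zeros stay defined."""
    return [math.log(v + offset) for v in values]

# Made-up right-skewed percentages: most tracts near zero, a few very high.
raw = [0, 1, 1, 2, 3, 5, 8, 13, 40, 100]
transformed = log_transform(raw)

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {skewness(transformed):.2f}")
```

The transformation compresses the long right tail, but it cannot rescue a variable that is nearly constant, which is consistent with the decision to drop three variables outright.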

Fig. 2 Frequency distributions for all 28 census tract-level SoVI indicators before conducting logarithmic transformations

3.2 Single-factor confirmatory factor models

3.2.1 Socioeconomic status

The initial CFA demonstrated poor overall fit, although the SRMR was acceptable (Table 2). Several respecifications based on modification indices were conducted to reach acceptable model fit, and improvements between each respecified model were marginal. Residual covariances were added one at a time between the following pairs of census variables, with model fit reassessed after each addition: per capita income and median rent; median house value and percent earning over $200,000; median house value and median rent; per capita income and percent earning over $200,000; per capita income and median house value; and percent unemployed and percent working in the service sector. The final respecification step improved model fit to acceptable cutoffs and produced a reliable latent variable (α = .88). Note that all of these respecifications are data driven and require additional validation.

3.2.2 Population structure

The initial single-factor model demonstrated very poor fit to the data (Table 2). A residual covariance between percent of females employed and percent of the population which is female improved fit substantially, although fit was still poor according to standard criteria (Kline 2016). While the addition of a residual covariance between percent of female-headed households and percent of married families improved fit, as did the inclusion of a covariance between percent of female-headed households and people per housing unit, attempts to include additional parameters produced negative variances. Consequently, the final model did not achieve acceptable fit indices and demonstrated only marginal reliability (α = .67).

3.2.3 Race and ethnicity

The indicator representing percent of the population that is Native American was removed due to minimal variability, such that the race and ethnicity factor had only 3 indicators. While factor loadings for a single-factor model with three indicators can be estimated, fit indices cannot be produced because a minimum of four indicators is required in order to test model fit. Consequently, we focused on the factor loadings, which revealed several concerns. The standardized factor loadings of the percent of the population identifying as Asian American (λ = .24) and African American (λ = .18) were low, suggesting that the indicators do not share significant common variance. Additionally, these indicators demonstrated highly skewed and leptokurtic distributions, which persisted after logarithmic transformations. Unsurprisingly, the reliability of the race and ethnicity factor was extremely low (α = .17), which may mirror the transition from historic settlement patterns shaped by racial capitalism to newer, evolving patterns of residential segregation (Ellis et al. 2018).
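The reason a three-indicator single-factor model cannot be tested follows from counting degrees of freedom. The sketch below assumes a standard identification (either the factor variance or one loading is fixed) and counts only the covariance structure; the function name is hypothetical.

```python
def single_factor_df(n_indicators, n_residual_covariances=0):
    """Degrees of freedom for a single-factor CFA.

    Observed information : p(p + 1) / 2 unique variances and covariances
    Free parameters      : p loadings + p residual variances (with the factor
                           variance fixed to 1), plus any residual covariances
    """
    p = n_indicators
    observed_moments = p * (p + 1) // 2
    free_parameters = 2 * p + n_residual_covariances
    return observed_moments - free_parameters

for p in (3, 4, 8):
    print(p, "indicators ->", single_factor_df(p), "df")
```

With three indicators the model is just-identified (zero degrees of freedom), so it reproduces the observed covariances exactly and fit indices carry no information. Each added residual covariance also spends one degree of freedom, which is part of why heavily respecified models are easier to fit.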

3.2.4 Access and needs

The initial model demonstrated poor fit to the data (Table 2). Of note, the standardized factor loading of percent receiving social security benefits (λ = −.34) was negative. The inclusion of a covariance between percent uninsured and percent without an automobile improved fit; however, this addition resulted in a low standardized factor loading for percent without an automobile (λ = .25). Removal of percent without an automobile resulted in a model with three indicators, so fit could not be examined. The loading of percent receiving social security benefits remained negative (λ = −.43), and the final model demonstrated low reliability (α = .60).

3.2.5 Housing structure

As with the race and ethnicity model, the housing structure latent variable only included three indicators, such that model fit could not be tested. While the standardized factor loadings of the housing structure latent variable were moderate, ranging from .40 to .49, the reliability of this latent variable was extremely poor (α = .35).

3.3 Combined models

Given that all but one of the single-factor models demonstrated poor fit to the data and/or low reliability, it was unlikely that a correlated factor model would be tenable. However, in keeping with our second aim, we tested a correlated five-factor model, wherein each of the hypothesized latent variables was estimated simultaneously. As expected, the model demonstrated very poor fit to the data (Table 2), indicated errors in estimation, and produced correlations between factors exceeding 1. Attempting to estimate the correlated factor model using a combination of the modified single-factor models produced similar results. Given these empirical limitations, a correlated factor structure of SoVI was not supported.

Next, we tested a general single-factor model, wherein every indicator loaded onto one social vulnerability factor. Again, this model demonstrated poor fit (Table 2), although there were no difficulties in model estimation. Despite removing several indicators with low loadings and the addition of 23 residual covariances, model fit remained poor.

Finally, we explored a bifactor model, wherein each indicator belonged not only to its previously identified specific factor, but also to a general social vulnerability factor. Factors in bifactor models are required to remain uncorrelated. The bifactor model demonstrated poor fit to the data (Table 2) and resulted in an estimated factor loading greater than 1, suggesting an improper solution.
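The improper solutions reported in this section (negative variances, factor correlations exceeding 1, and a loading greater than 1) can be screened for mechanically once estimates are in hand. The sketch below is a plain-Python illustration with a hypothetical function name and made-up estimates; it is not output from the study's models.

```python
def improper_solution_flags(std_loadings, residual_variances, factor_correlations):
    """Return descriptions of out-of-range estimates (Heywood-type cases).

    Each argument is a dict mapping a parameter label to its estimate.
    """
    flags = []
    for name, value in std_loadings.items():
        if abs(value) > 1:
            flags.append(f"standardized loading beyond 1: {name} = {value}")
    for name, value in residual_variances.items():
        if value < 0:
            flags.append(f"negative residual variance: {name} = {value}")
    for name, value in factor_correlations.items():
        if abs(value) > 1:
            flags.append(f"factor correlation beyond 1: {name} = {value}")
    return flags

# Made-up estimates for illustration only.
problems = improper_solution_flags(
    std_loadings={"income": 1.04, "poverty": -0.62},
    residual_variances={"income": -0.08, "poverty": 0.61},
    factor_correlations={"ses~housing": 1.12},
)
for line in problems:
    print(line)
```

Any flag of this kind indicates that the estimates fall outside the admissible parameter space, so the accompanying fit indices should not be interpreted.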

4 Discussion

This study applied confirmatory factor analysis to test the generalizability and construct validity of SoVI, a commonly used measure of social vulnerability, and of its component input variables. We demonstrated that the previously theorized pillars of social vulnerability, established with indicators drawn from the US Census data, were not well supported in a latent variable framework. Only one of the five pillar models (Socioeconomic Status) achieved acceptable fit, while two others had marginal fit even after significant model respecification that may not generalize to new samples. Our results underscore challenges with the construct validity of SoVI (Spielman et al. 2020), which remain an obstacle to achieving the goal of specifying a widely applicable vulnerability index. At the same time, we also face the reality that social vulnerability to hazards and disasters may be more context-specific than many other public health constructs, and that a generalizable indicator may be unnecessary for locally identifying vulnerable communities and facilitating effective disaster preparedness, response, and recovery efforts. Simply put, space matters, and social vulnerability manifests itself differently across study/impact areas. As many scholars have previously found, a one-size-fits-all approach to social vulnerability may not be achievable (Cutter and Emrich 2006; Cutter and Finch 2008; Tapsell et al. 2010; Emrich and Cutter 2011; Tate 2012).

At both the pillar level and the overall SoVI construct level, confirmatory models did not fit the data and showed low reliability, indicating that the SoVI model applied to the New York region following Hurricane Sandy was not portable to a statewide vulnerability analysis in Florida. The establishment of well-fitting confirmatory factor models is the minimum requirement for establishing unidimensional, reliable indices of social vulnerability, which can then be examined for generalizability. Results from our study indicated that this minimum requirement is not met for the pillar approach to SoVI put forth by Rufat and colleagues (2019), at least not with the proposed set of indicators. The consistently poor fit of our attempted models suggests that the theorized SoVI pillars may not adequately capture latent commonalities between groups of census variables that are thought to measure common vulnerabilities. While the socioeconomic status latent variable did achieve acceptable fit and reliability after considerable modification, this is likely a consequence of overfitting the model. When a model is excessively modified, it will fit the observed data very well; however, the likelihood that such a data-driven model will generalize to new samples is substantially lower than if the theorized model had fit the data well without modification (Babyak 2004). This general finding aligns with criticism of the internal consistency and construct validity of SoVI (Spielman et al. 2020), further suggesting that not all indicators of social vulnerability may be applicable to every community. On the other hand, achieving an acceptable model does provide some degree of optimism that at least the socioeconomic dimension of SoVI may be generalizable across contexts.

While a number of factors likely explain why these models failed to fit the data, one reason stems from properties of the indicator variables, pointing to important considerations for future attempts to fit SoVI models. A critical step in these attempts is an examination of the distribution of each indicator. In our data, several variables were constants or near constants for Florida (e.g., percent working in the extractive sector; percent Native American). Variables with no or low variability do not contribute to covariation and detract from the fit of the model. Moreover, a variable with low to no variance cannot serve as an informative predictor of some outcome, such as hardships during disaster recovery. Finding differences in the variation of indicators across datasets, as is the case when our data are compared to the data on Hurricane Sandy presented by Rufat and colleagues (2019), suggests that indicators may be region and disaster specific, limiting the generalizability of social vulnerability models in general and SoVI in this case.

Given that generalizability is essential for prediction, it seems important to identify a minimum set of indicators that are likely useful across situations and therefore generalizable. To the extent that it is possible to identify a minimum number of generalizable indicators, these indicators could serve as anchor items (i.e., items used across all regions and contexts in the assessment of vulnerability to disasters). For each use case, the generalizable anchor items could be combined with specific indicators that are more variable across certain disasters and regions. Multiple-group latent variable models represent a promising methodology for pursuing such an approach, as they allow a combination of invariant and region/disaster-specific indicators. In order to test the notion of a SoVI that includes a set of fixed generalizable indicators and a subset of specific ones that may vary across disasters or regions, researchers would need data across multiple disasters and regions, all contributing a comparable set of measures.

Many indicators depend on context for their valence in social vulnerability and may not be theorized to consistently increase or decrease social vulnerability. For example, a greater percent Hispanic may result in greater vulnerability in a less diverse region such as northern Florida, but may buffer households from vulnerability in more diverse places like South Florida, particularly in Florida’s most populous county, Miami-Dade, where the majority of the population is Hispanic. Consequently, while Rufat and colleagues’ (2019) SoVI pillars made theoretical sense for the New York metropolitan area, Florida’s sociodemographic context may present a challenge to constructing them using this set of 27 indicators.

Other researchers have also raised the possibility that the indicators that are typically combined during social vulnerability analyses might be better used as independent measures, rather than combined in an index (Fekete 2019). Such a formative modeling approach would test common indicators as a proxy for social vulnerability, rather than attempt to directly approximate this latent construct. Either way, this approach requires a “best” set of indicators, which remains elusive throughout the vulnerability case study literature, and certainly no set of indicators has been validated across a region as large and diverse as the US. Given the limitations described earlier, it is possible that the social vulnerability of large, diverse regions cannot be reduced to a small set of generalizable indicators with any meaningful degree of precision that would shape public policy in the planning and response to disasters. It is more likely that a big data approach leveraging a very large set of indicators would improve precision but at greater cost. For local and regional planning purposes, perhaps the local scale is what matters most.

Variable selection may also be limited by the classic modifiable areal unit problem, i.e., the sensitivity of results to the way data are grouped into regions at different geographic scales (Openshaw 1984). When conceptualizing social vulnerability, the intent is to identify individuals who are at the greatest risk of hardship following a natural disaster. To operationalize this, a predictive model would require individual-level data, which are generally unavailable due to important ethical restrictions on the use of personally identifiable information. In homogeneous regions, census tract-level data may approximate individual-level data. But in heterogeneous areas such as Florida, particularly urban areas with severe residential segregation over short distances, the census tract level of analysis may not accurately reflect individual social vulnerability. Moreover, such levels of analysis may put decision makers at risk of ecological fallacies, in which conclusions based on group-level data are erroneously applied to individuals within that group. Consequently, indices relying on census tract data may provide valuable risk prediction in regions with homogeneous and stable populations but may not be specific enough in more dynamic areas.

A related issue that may account for differences in indicator variability across contexts is that indicators measured on absolute scales carry different meanings in regions with different costs of living. A household income that suggests vulnerability in one area, such as New York City or Miami, may present little risk in rural or suburban areas such as Central Florida. One approach is to normalize indicators that are in absolute scales in order to standardize their meaning across regions. While past studies have used Z-scores in an attempt to address this issue (Emrich et al. 2020), Z-scores computed relative to the overall population still reflect regional heterogeneity. Future studies may consider the use of multilevel models, which would account for effects relative to the group mean (e.g., household income relative to others in Central Florida rather than the country overall). Hierarchical models can improve the accuracy of comparisons by controlling for regional heterogeneity and could be further refined to adjust for spatial interaction effects between regions (e.g., Tate 2012; Dong et al. 2020). Hierarchical models also enable the simultaneous modeling of indicators and their interactions across multiple scales, an approach likely to bring new insights to social vulnerability analysis.
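To illustrate the difference between a Z-score computed against the overall population and one computed relative to the group mean, the following minimal Python sketch may be helpful. It is not the procedure used in this study or in the cited work: the function names, region labels, and income values are hypothetical, and within-region standardization is only a simple stand-in for the group-mean centering that a full multilevel model would perform.

```python
from collections import defaultdict
from statistics import mean, pstdev


def global_z(values):
    """Standardize each value against the mean and SD of the whole sample."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def within_region_z(values, regions):
    """Standardize each value against the mean and SD of its own region.

    Assumes every region has at least two distinct values, so that the
    regional standard deviation is nonzero.
    """
    by_region = defaultdict(list)
    for v, r in zip(values, regions):
        by_region[r].append(v)
    stats = {r: (mean(vs), pstdev(vs)) for r, vs in by_region.items()}
    return [(v - stats[r][0]) / stats[r][1] for v, r in zip(values, regions)]


# Hypothetical tract-level median household incomes (USD); illustrative only.
incomes = [40000, 60000, 80000, 25000, 35000, 45000]
regions = ["metro", "metro", "metro", "rural", "rural", "rural"]

gz = global_z(incomes)
wz = within_region_z(incomes, regions)

for income, region, g, w in zip(incomes, regions, gz, wz):
    print(f"{region:5s} {income:6d}  global z = {g:+.2f}  within-region z = {w:+.2f}")
```

In this toy example, the highest-income rural tract falls below the overall mean but above its regional mean, so the two standardizations disagree about its relative standing. This is the pattern that motivates modeling effects relative to the group mean.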

These issues of scale and level of analysis have led others to call for improvements to social vulnerability analysis that differentiate urban and rural places and that conceptually separate social vulnerability at national, community, and individual scales (Fekete 2019). We must acknowledge that Florida is, both empirically and experientially, an unusual place that does not uniformly conform to the theoretical manifestation of social vulnerability. For reasons already noted, Florida’s most populous county, Miami-Dade, is so different from the rest of the state that it may bias our attempts to reconstruct the SoVI pillars used by Rufat and colleagues (2019). That said, the New York metropolitan area used by Rufat and colleagues is itself a diverse urban area, and if the pillars and indicators were robust enough, we should have experienced fewer modeling challenges.

Natural hazards require a coordinated response from federal to local scales, and scale matching in disaster response has long been a challenge for environmental policy (Baker and Refsgaard 2007). Ultimately, disaster response is implemented locally, often by federal agencies but under direction from state and local officials who know their constituencies best. By using a context-driven set of indicators, vulnerability models could be tailored to pinpoint local risk and protective factors. Social vulnerability measures can tell us about differential recovery. SoVI and its input variables are highly utilized in emergency management (Tierney and Oliver-Smith 2012) and have proven useful in linking outcomes to underlying sociodemographics in several disaster-specific case studies (Cutter and Finch 2008; Domingue and Emrich 2019; Emrich et al. 2020), demonstrating how baseline conditions are influential across the emergency management cycle. From this perspective, we can identify interventions and pathways toward hazard mitigation by setting aside the theoretically wobbly ideal of a single vulnerability model (one that can be broadly applied to different types of disasters at different geographic scales) and instead focusing on the links between variables and outcomes.

This study faced several important limitations. We tested the portability of a prior analysis of the New York metropolitan area to the state of Florida using a defined set of indicators and pillar design. When our modeling efforts failed, we did not explore other indicators or pillars in the absence of new theory. It is possible that a data-driven reconfiguration of the pillars might have improved our modeling approach, but such an approach would abandon longstanding theory about the drivers of social vulnerability. Further, Florida’s peninsular shape, which creates problems for geospatial analysis, may also lead to challenges in underlying data representations, estimations, and variation. Finally, because SoVI and disasters are both place-specific in nature, a tract-level index computed at the state level may make little empirical sense given the local scales at which processes such as residential segregation operate. SoVI’s utility as a measure of outcomes may be much more appreciable at smaller scales of analysis, such as disaster impact areas, or when informing city-level programs and regional interventions.

Future research on the development and validation of vulnerability indices should consider the kind of evidence needed to validate an index across multiple contexts and how a balance of universal and context-specific indicators may be integrated. A social vulnerability index might be reconceptualized using different mixes of theoretical and data-driven approaches to deriving its underlying pillars, with potentially different indicators and pillars for different scales of analysis and for different types of disasters. Future social vulnerability measures will likely evolve along with changes in data availability and more robust conceptual and practical ties between vulnerability and outcomes. Ultimately, new approaches must strike a balance between theoretical rigor and the reality that, despite the challenges outlined in this paper and others, applications of social vulnerability measures, including SoVI, continue to provide decision makers with empirical, pragmatic measures of a complex concept.

5 Conclusion

This study sought to confirm the construct validity of theoretically driven organizations of social vulnerability indices, with the goal of empirically testing the generalizability of SoVI in diverse contexts. The lack of alignment between previous formulations of SoVI and census tract-level data for Florida, as demonstrated in this study, does not support the generalizability of SoVI to other contexts, at least in SoVI’s current form. We believe several limitations in the data contributed to the poor psychometric evidence produced from this study. The poor support for the theorized pillars of SoVI and alternative model structures suggests the need to reconsider the input variables and scale of analysis, if the goal is to measure a generalizable construct (or subconstructs) of social vulnerability. This study highlights the challenges inherent in measuring social vulnerability across different geographic settings and types of natural disasters and the need for more robust assessment of confirmatory models.