Introduction

In the United States, natural disasters are frequent and substantially impact the health and economies of communities. In recent years, the U.S. has ranked as one of the leading countries in terms of disaster frequency and economic loss [1, 2]. Climate and weather-related disasters in the U.S.—which included Hurricanes Maria, Irma, Nate, and Harvey in 2017 and Hurricanes Florence and Michael and California’s wildfires in 2018—were particularly devastating [3]. Almost 8% of the U.S. population was impacted by the 2017 events alone, and associated costs topped $300 billion [4, 5]. From 2016 to 2018, the U.S. experienced the highest annual number of $1 billion natural disaster events on record [3]. Due to their increasing frequency and cost [3, 6, 7] and their large direct and indirect impact on morbidity and mortality [8], natural disasters remain a critical public health challenge for many local health departments.

It is well-established that disasters can affect the mental health of individuals [9]. Disaster exposure has been linked to a variety of behavioral health conditions including depression, posttraumatic stress disorder (PTSD), anxiety, general psychological distress, and increased substance use [9]. Somatic symptoms and unexplained medical illnesses are also elevated after disasters and may be manifestations of psychological distress [10]. Research has highlighted both short- and long-term effects of disasters on behavioral health [11, 12]. Longitudinal studies of disaster survivors have shown that adverse psychological sequelae can follow different trajectories; for some, symptoms arise soon after the disaster but remit gradually over time, while for others symptoms are delayed [13, 14] or persist for months or even years after the disaster [15, 16].

In the aftermath of a disaster, either natural or man-made, it is important for public health officials to understand the burden of behavioral health conditions among survivors and the needs of the affected community. Estimating the mental health impacts of acute collective stressors is not limited to natural hazard events; it is equally critical during and after other events such as the current COVID-19 pandemic [17], which has caused substantial morbidity, mortality, and economic and social disruption worldwide. Affected individuals may experience a variety of difficulties, leading to the development or exacerbation of behavioral health issues. Moreover, the prevalence of post-disaster mental disorders varies from one disaster to the next, making it difficult to predict behavioral health service needs from previous disasters [18], and varies within populations, disproportionately affecting specific subgroups (e.g., lower-income communities, healthcare workers during the current pandemic) [17, 19]. Data from a post-disaster rapid behavioral health assessment can help decision-makers at state and local health departments determine what types of interventions and services are needed and can aid in appropriate resource allocation and service distribution by identifying disproportionately affected areas and populations. Properly assessing and addressing the affected populations’ behavioral health needs in a timely manner is critical, as community resilience is contingent upon psychological functioning [20]. Additionally, rapid assessment can be useful in determining behavioral health needs when the regular local health and medical system has been overwhelmed or disrupted by the disaster and cannot provide this information [21].

Conducting a rapid behavioral health assessment in disaster settings can pose several challenges to local public health practitioners, including identifying and reaching the target population, gathering a field team, and obtaining funding and ethical review board approval [9, 18]. Another key challenge that can hinder rapid deployment of an assessment team is the lack of an existing assessment tool to evaluate the affected population [21]. Although it is recommended that public health staff assemble questions used in previous work prior to a disaster [18], this may not always be possible. In this case, practitioners must quickly determine what constructs should be measured, identify validated behavioral health instruments and scales, and select questions for inclusion. Evaluating behavioral health may be part of a larger post-disaster assessment or a stand-alone endeavor; thus, the amount of time available for evaluation will also be a key consideration, and practitioners will need to make decisions about what questions to include. These challenges can delay community field surveillance, which can in turn impede timely delivery of needed resources and services. Although it does not address all of these challenges, having a standard, flexible module of key assessment questions on hand may help practitioners to begin evaluation sooner.

Federal agencies have previously undertaken efforts to assess the behavioral health of affected populations after a disaster [22, 23]. However, these efforts are often based on methodology and questions from existing national surveys that may or may not serve the purpose of post-disaster assessment among selected populations, and these assessments (e.g., the Community Assessment for Public Health Emergency Response, CASPER) generally do not include behavioral health modules. In addition, these questions may not fully represent the different aspects of behavioral health. A comprehensive, validated compendium of questions focused solely on post-disaster behavioral health is currently lacking. The National Institutes of Health and the National Library of Medicine have created a resource repository, Disaster Research Response (DR2), that is fully accessible online [24], but it offers limited guidance or validation information for practitioners.

We aimed to develop a standardized yet flexible, validated behavioral health module, built on a clear conceptual framework and an extensive literature review, that local public health practitioners can use to rapidly assess behavioral health needs in post-disaster settings. If adopted widely, this module could also increase the comparability of findings across populations and over time. This paper describes the development and validation of a brief module that can be adapted to diverse disaster settings.

Methods

Development of the behavioral health module comprised the following steps: (a) defining the domains of behavioral health to build a conceptual framework, (b) reviewing the peer-reviewed literature and previous rapid assessments to identify appropriate measures of behavioral health domains, (c) selecting measures to include in an assessment for testing, (d) testing and validating these measures in two disaster-affected cohorts, and (e) identifying questions for inclusion in the final, validated, core module.

Our conceptual framework of behavioral health (Fig. 1) highlights areas of concern (domains) in the post-disaster setting, informed by the Substance Abuse and Mental Health Services Administration (SAMHSA) definition of behavioral health [25] and by the conceptualization, described in the literature, of the link between stressors, disruption or threat to individual resources, and psychological distress [20, 26]. Domains included behavioral health outcomes, such as common mental disorders (depression, anxiety, PTSD) and substance use disorders; physical health; risk factors such as stress, disaster-related experiences, resilience, and prior behavioral health issues; and service-related domains, including perceived need for services, previous and current mental health and substance use treatment, and barriers to behavioral health treatment. The goal was to focus on concerns that can be specifically addressed through public health intervention.

Fig. 1 Conceptual framework of post-disaster behavioral health. MH = Mental Health, PSS = Perceived Stress Scale, AUDIT = Alcohol Use Disorders Identification Test, DAST = Drug Abuse Screening Test, PHQ = Patient Health Questionnaire, PC-PTSD = Primary Care PTSD Screen

We reviewed previous rapid assessments and disaster studies of post-disaster behavioral health, including CASPER questionnaires and reports [22, 27, 28], and the peer-reviewed literature to identify and systematically evaluate validated scales and stand-alone items that capture each behavioral health domain in the conceptual framework for inclusion in field testing. Selected scales met several criteria, including brevity (10 items or fewer), acceptable validity and reliability in previous studies, use in previous studies (in particular, state-based or national surveillance studies, from which baseline data may be extracted for comparison), accessibility (e.g., no fees or permissions required), and availability in different languages (Online Appendix 1). For each scale, we also evaluated face and content validity, based on previous studies and the project team’s knowledge of the construct captured by the scale. Most of the scales and items identified were ultimately tested in the field (17 of 25). Key reasons for exclusion were scale length; limited evidence of scale validation; lack of use in national, state-based, or community surveys; lack of free access; and lack of applicability to the U.S. context. We also included key demographic questions from the Behavioral Risk Factor Surveillance System (BRFSS) [29] in field testing to describe the participants.

The assessment was then field tested in a sample composed of adults 18 years or older from two previous study cohorts, one affected by Hurricane Katrina in 2005 (the Gulf Coast Child & Family Health Study, G-CAFH [30]) and the other by Hurricane Sandy in 2012 (the Sandy Child & Family Health Study, S-CAFH [31]). The G-CAFH and S-CAFH studies are population-based longitudinal studies of approximately 1000 adults each. The G-CAFH cohort comprises individuals from Louisiana or Mississippi displaced for more than six months or greatly affected by Hurricane Katrina, followed over five study waves since 2005. The S-CAFH cohort includes New Jersey residents exposed to Hurricane Sandy, followed over two study waves since 2014. G-CAFH study participants who lived in the New Orleans area and S-CAFH study participants who lived in Ocean, Monmouth, and Middlesex counties, which are geographically close to the current project’s interview sites, were recruited through email and mail invitations. Members of the original cohorts could nominate one individual to also participate in the current field validation testing.

Interviews were conducted in New Orleans and New Jersey in April and May 2018, respectively. Participants completed an in-person assessment administered by a field team member and, to allow examination of mode differences, were also randomly assigned to complete the assessment in a second mode: either by telephone with a field team member shortly after the in-person assessment, or self-administered via a web-based assessment shortly before or after it. Median interview duration was 36.6 minutes for in-person, 34.6 minutes for telephone, and 24.6 minutes for web-based assessments. Interviewers obtained verbal or written informed consent prior to beginning each interview, depending on the mode of administration. Participants were offered $50 for each assessment, and original cohort members received $10 if their referral also completed the assessment. The field validation testing received Institutional Review Board approval.

Descriptive statistics were calculated for all field validation testing variables, including frequencies and percentages for categorical variables and means, standard deviations (SD), medians, and minimum and maximum values for continuous variables. The prevalence of behavioral health outcomes (e.g., depression, alcohol use disorder) was estimated using validated cutoff scores from the literature, and internal consistency reliability was examined using standardized Cronbach’s alpha. Convergent validity was assessed by computing Pearson’s correlation coefficients (r) between the total scores of each pair of scales. Test–retest reliability across administration modes, comparing in-person to telephone and in-person to web-based administration, was assessed using intraclass correlation coefficients (ICC) for scales and kappa and percent agreement for categorical items, among participants who completed interviews in two modes. Scales and stand-alone items were selected for the core module based on results from principal components analysis (PCA), in addition to feedback from participants on the clarity of specific questions and the other considerations described above (e.g., brevity, use in other studies). All analyses used Stata version 15 (StataCorp, College Station, TX).
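The analyses themselves were performed in Stata. Purely as an illustrative sketch, the Python code below shows how two of the named quantities, standardized Cronbach’s alpha (computed from the mean inter-item correlation) and a convergent-validity Pearson correlation between two scale totals, can be calculated; the item names and simulated data are hypothetical and are not drawn from the study.

```python
import numpy as np
import pandas as pd

def standardized_alpha(items: pd.DataFrame) -> float:
    """Standardized Cronbach's alpha based on the mean inter-item correlation."""
    corr = items.corr().to_numpy()
    n_items = items.shape[1]
    # mean of the off-diagonal (inter-item) correlations
    mean_r = corr[np.triu_indices(n_items, k=1)].mean()
    return (n_items * mean_r) / (1 + (n_items - 1) * mean_r)

# Illustrative data: simulated responses to a hypothetical 4-item scale
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
items = pd.DataFrame(
    {f"item{i}": latent + rng.normal(scale=0.8, size=200) for i in range(1, 5)}
)
print(f"standardized alpha = {standardized_alpha(items):.2f}")

# Convergent validity: Pearson correlation between two scale total scores
scale_a = items.sum(axis=1)
scale_b = scale_a * 0.9 + rng.normal(scale=1.0, size=200)  # hypothetical second scale
print(f"Pearson r = {scale_a.corr(scale_b):.2f}")
```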

Results

A total of 101 individuals participated in the field validation testing: 44 (43.6%) from the G-CAFH cohort and 57 (56.4%) from the S-CAFH cohort. Thirty-two (31.7%) participants completed in-person and telephone assessments, 34 (33.7%) completed in-person and web-based assessments, 8 (7.9%) completed a telephone assessment only, 26 (25.7%) completed a web-based assessment only, and 1 (1.0%) completed all three modes. Of the total sample, 71 (70.3%) individuals were original G-CAFH or S-CAFH study participants, while 30 (29.7%) were recruited by original cohort members (not shown in tables). In addition to significant demographic differences between participants exposed to Hurricane Katrina and those exposed to Hurricane Sandy, individuals in the Katrina cohort had a higher prevalence of prior mental health diagnoses and greater exposure to the direct and indirect effects of the disaster than those in the Sandy cohort. In the overall sample, approximately 40% had a prior mental health diagnosis, and almost one-quarter received the diagnosis after the hurricane. Exposure to disaster-related events and stressors was reported by most participants (Table 1).

Table 1 Sociodemographic, behavioral and disaster-related characteristics of the study sample, New Orleans and New Jersey, April and May 2018

In the field validation testing, all scales showed very good to excellent internal consistency reliability (standardized Cronbach’s alpha = 0.79 to 0.92). Current mental health and substance use symptoms were common: almost one-quarter (23.8% each) met criteria for probable depression based on the Patient Health Questionnaire (PHQ-8), generalized anxiety disorder (GAD) based on the GAD-7, and PTSD based on the Primary Care PTSD Screen (PC-PTSD). Over 15% met criteria for serious psychological distress on the Kessler-6 (K6). Over one-third reported problem drinking based on the three-item Alcohol Use Disorders Identification Test (AUDIT-C), and 5.0% and 7.9% met criteria for alcohol and drug use disorders, respectively. One-quarter (25.7%) reported increased use of any substance following the hurricane (not shown in tables).
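Prevalence estimates of this kind are obtained by dichotomizing each scale’s total score at a screening cutoff. The sketch below is illustrative only: it uses commonly cited cutoffs from the screening literature (PHQ-8 ≥ 10, GAD-7 ≥ 10, K6 ≥ 13, four-item PC-PTSD ≥ 3) and simulated scores; these thresholds are assumptions for the example, not a statement of the exact cutoffs applied in the study.

```python
import numpy as np
import pandas as pd

# Commonly cited screening cutoffs (assumptions for this sketch only)
CUTOFFS = {"phq8": 10, "gad7": 10, "k6": 13, "pc_ptsd4": 3}

def prevalence(scores: pd.Series, cutoff: int) -> float:
    """Percent of respondents scoring at or above the screening cutoff."""
    return 100 * (scores >= cutoff).mean()

# Illustrative data: hypothetical total scores for 101 respondents
rng = np.random.default_rng(1)
data = pd.DataFrame({
    "phq8": rng.integers(0, 25, 101),      # PHQ-8 totals range 0-24
    "gad7": rng.integers(0, 22, 101),      # GAD-7 totals range 0-21
    "k6": rng.integers(0, 25, 101),        # K6 totals range 0-24
    "pc_ptsd4": rng.integers(0, 5, 101),   # four-item PC-PTSD totals range 0-4
})

for scale, cutoff in CUTOFFS.items():
    print(f"{scale}: {prevalence(data[scale], cutoff):.1f}% at or above {cutoff}")
```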

In terms of convergent validity (Table 2), scales that measured depression, anxiety, PTSD, non-specific psychological distress, functional impairment, and perceived stress were strongly and positively correlated with each other (r = 0.61 to 0.98). The mental health scales were also strongly and negatively associated with resilience on the Brief Resilience Scale (BRS; r = −0.64 to −0.71).

Table 2 Correlation coefficients between behavioral health measures, New Orleans and New Jersey, April and May 2018

Table 3 reports test–retest reliability results for each scale. All scales demonstrated very strong correlation between modes (in-person vs. telephone, ICC = 0.75 to 1.00; in-person vs. web-based, ICC = 0.73 to 0.97), with the exception of functional impairment due to physical health when comparing in-person to telephone (ICC = 0.54; in-person vs. web-based ICC = 0.93). Overall, ICCs were only slightly higher for in-person vs. telephone than for in-person vs. web-based comparisons, with the exception of the Perceived Stress Scale (PSS) and the Sheehan Disability Scale (SDS) for physical health. Stand-alone items showed large kappas and high percent agreement across the two mode comparisons (not shown in tables): general health status (kappa = 0.70 and 0.81), increased use of substances (kappa = 0.93 and 0.87), mental health diagnosis prior to the event (kappa = 0.74 and 0.94), mental health treatment prior to the event (100% agreement and kappa = 0.79), and disruption of mental health treatment due to the event (both 100% agreement).
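The specific ICC form used is not detailed here; the sketch below, again purely illustrative and based on simulated data, computes ICC(2,1) (two-way random effects, single measures, absolute agreement, following Shrout and Fleiss) for a hypothetical scale total recorded in two modes, along with Cohen’s kappa and percent agreement for a hypothetical binary item.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

def icc_2_1(x: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, single measures, absolute agreement.
    x is an (n subjects) x (k modes) matrix of scores."""
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between-subject SS
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between-mode SS
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual SS
    ms_r = ss_rows / (n - 1)
    ms_c = ss_cols / (k - 1)
    ms_e = ss_err / ((n - 1) * (k - 1))
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)

# Illustrative data: hypothetical in-person and telephone scale totals for 32 people
rng = np.random.default_rng(2)
in_person = rng.integers(0, 25, 32).astype(float)
telephone = np.clip(in_person + rng.normal(scale=1.5, size=32).round(), 0, 24)
print(f"ICC(2,1) = {icc_2_1(np.column_stack([in_person, telephone])):.2f}")

# Categorical item: Cohen's kappa and percent agreement between the two modes
item_mode1 = rng.integers(0, 2, 32)
item_mode2 = item_mode1.copy()
item_mode2[:3] = 1 - item_mode2[:3]          # a few hypothetical discrepancies
print(f"kappa = {cohen_kappa_score(item_mode1, item_mode2):.2f}")
print(f"% agreement = {100 * (item_mode1 == item_mode2).mean():.1f}")
```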

Table 3 Intraclass correlation coefficients (ICC) comparing behavioral health measure total score between administration modes, New Orleans and New Jersey, April and May 2018

PCA of the behavioral health scales yielded two components (not shown in tables): one comprising the mental health symptom scales (e.g., PHQ-4, K6, PC-PTSD) and the other comprising the substance use scales (AUDIT and Drug Abuse Screening Test, DAST-10). Thus, both domains were represented in the core module, which ultimately comprised 26 items, including validated scales and stand-alone items, and took 5–10 minutes to complete (Online Appendix 2). Two mental health symptom scales (the PHQ-4 and the four-item PC-PTSD) and two substance use scales (the three-item AUDIT-C for alcohol and the DAST-10 for other drugs) were included. Shorter versions of scales that functioned similarly to the longer versions were ultimately chosen. For example, the PHQ-4 was selected in place of the K6, PHQ-8, and GAD-7 because it assesses the two sentinel symptoms of both depression and anxiety [32], was very highly correlated with the two longer scales, and was strongly correlated with the other mental health measures. The four-item PC-PTSD was selected over the PC-PTSD-5, despite being based on the Fourth Edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) rather than the DSM-5, because the additional PC-PTSD-5 question about guilt and self-blame for the event was confusing to participants, and because the four-item version was almost perfectly correlated with the five-item version and equally correlated with the other mental health measures. The three-item AUDIT-C, which screens for problem drinking, was highly correlated with the 10-item AUDIT and was therefore selected in place of the longer AUDIT. Stand-alone items measuring overall general health status, increased substance use following the event, previous mental health diagnosis, receipt of mental health treatment prior to the event, and disruption of mental health treatment following the event (not shown in tables) were reliable across modes and were therefore included to assess physical health, capture emerging behavioral health issues, and evaluate continuity of mental health treatment.
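As a rough illustration of this two-component structure, the sketch below runs a PCA on standardized, simulated scale totals; the variable names and data-generating assumptions are hypothetical and do not reproduce the study analysis.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Illustrative data: hypothetical scale totals driven by two latent factors,
# one for mental health symptoms and one for substance use
rng = np.random.default_rng(3)
mh = rng.normal(size=101)
su = rng.normal(size=101)
scores = pd.DataFrame({
    "phq4": mh + rng.normal(scale=0.5, size=101),
    "k6": mh + rng.normal(scale=0.5, size=101),
    "pc_ptsd4": mh + rng.normal(scale=0.5, size=101),
    "audit": su + rng.normal(scale=0.5, size=101),
    "dast10": su + rng.normal(scale=0.5, size=101),
})

# Standardize, extract two components, and inspect the loadings
pca = PCA(n_components=2)
pca.fit(StandardScaler().fit_transform(scores))
loadings = pd.DataFrame(pca.components_.T, index=scores.columns, columns=["PC1", "PC2"])
print(loadings.round(2))
print("explained variance ratio:", pca.explained_variance_ratio_.round(2))
```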

Discussion

This project aimed to develop a short, structured, validated module for assessing behavioral health following catastrophic events that is flexible in terms of length and mode of administration. There were many options for measuring the behavioral health domains described in our conceptual framework, including several validated scales for each construct. Although most scales and items performed well, those included in the final core module were selected based on evidence of validity, reliability, question comprehension, efficiency (fewer items in a scale), and use in large population-based surveys. This yielded a brief behavioral health assessment that can be administered in 5–10 minutes in person, over the telephone, or over the web, either as a stand-alone assessment or as part of a larger set of questions.

A specific set of questions was ultimately selected for the core module (Online Appendix 2). In some cases, local public health staff have limited time or may only be able to include a few questions as part of a broader assessment. In these instances, we suggest using an abbreviated version that includes only the PHQ-4 and a question about increased substance use following the event; this version can be completed in under two minutes and provides an estimate of the point prevalence of psychological distress as well as an indication of emerging substance use issues. However, the module was also designed to be flexible in order to meet the post-disaster needs of public health practitioners. We encourage practitioners to assess both psychological symptoms and substance use in post-disaster settings to obtain a comprehensive picture of community behavioral health [25]. If there is more time for assessment, full versions of behavioral health scales that capture additional symptoms and constructs can be substituted for the abbreviated versions in the core module to create an expanded module. For example, the full PHQ-8 or the 10-item AUDIT may be included if there is additional time and interest in somatic symptoms of depression or in screening for alcohol use disorder as opposed to problem drinking. Additionally, although the PHQ-4 was ultimately selected for inclusion in the core module, the K6 is often used in disaster research and performed well in testing. If the K6 is preferred over the PHQ-4 (for example, because the K6 is included in local surveillance efforts such as the New York City Community Health Survey and is therefore available for baseline comparison), it can be included in the core module in place of the PHQ-4.
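To make the abbreviated option concrete, the sketch below scores a hypothetical set of responses to the PHQ-4 (two anxiety items and two depression items, each scored 0–3) together with a single item on increased substance use. The subscale threshold of 3 or more is a commonly cited screening cutoff and, like the item names and data, is an assumption of this illustration rather than a prescription from the module.

```python
import pandas as pd

# Hypothetical PHQ-4 responses (each item scored 0-3) plus a yes/no item on
# increased substance use after the event, for a handful of respondents
responses = pd.DataFrame({
    "phq4_anx1": [0, 2, 3], "phq4_anx2": [1, 2, 3],   # anxiety (GAD-2) items
    "phq4_dep1": [0, 1, 3], "phq4_dep2": [0, 2, 2],   # depression (PHQ-2) items
    "increased_substance_use": [0, 0, 1],
})

anxiety = responses[["phq4_anx1", "phq4_anx2"]].sum(axis=1)
depression = responses[["phq4_dep1", "phq4_dep2"]].sum(axis=1)

# A subscale score of 3 or more is used here as an assumed screening threshold
summary = pd.DataFrame({
    "screens positive: anxiety": anxiety >= 3,
    "screens positive: depression": depression >= 3,
    "increased substance use": responses["increased_substance_use"] == 1,
})
print(summary.mean().mul(100).round(1))  # point prevalence (%) for each indicator
```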

We did not find that any one mode of administration (in-person, telephone, web-based) was preferable to the others in terms of the measures selected for inclusion in the core module. However, when asked, participants expressed a preference for the in-person assessment, which may reflect access to and comfort with technology or the connection one can make with an interviewer in person. There are strengths and limitations to each method. Telephone assessments are reliable and may be more cost-effective and appropriate than in-person assessments in situations of substantial population displacement, but they may result in lower response rates [33] and may be limited if power lines, cellular towers, or internet access are impacted by the disaster. Use of a web-based module may also facilitate more rapid assessment after a disaster and reduce social desirability bias compared to in-person surveys, but obtaining a representative sample of the affected population may be challenging. Protocols will also be needed to address the potential for acute distress among respondents that cannot be assessed by an interviewer [34], as well as to maintain confidentiality and provide basic tips for self-management of mild symptoms and referral to mental health-promoting resources. Choice of mode will ultimately depend on these issues, as well as the specific context of the disaster and characteristics of the affected population (e.g., computer literacy). Therefore, public health officials, in coordination with first responders and other local stakeholders, should select the best method for conducting these types of post-disaster assessments.

These results should be considered in light of several limitations. First, for most participants who completed two modes, the two assessments were administered with little time between them, which could have affected estimates of test–retest reliability. Second, although the respondent sample was socio-demographically and geographically diverse, our results may not be generalizable to all populations. Additionally, the sample was relatively small, which could influence some psychometric indicators (e.g., kappa), and was drawn from cohorts of individuals affected by non-recent disasters. However, disasters such as Hurricanes Katrina and Sandy are indelible events with long-lasting effects for those affected; therefore, the questions we tested very likely remain relevant. Finally, the assessments were conducted in English only, which may limit generalizability. However, the module includes validated behavioral health measures that have been translated into several other languages, allowing use of the module in diverse populations.

Public Health Implications

Disasters are frequent and threaten the health of U.S. populations. Rapid assessment of behavioral health following catastrophic events is critical for reducing the negative psychological impact of these events. This project yielded a brief, validated, structured and flexible instrument that can be used to assess behavioral health concerns in post-disaster settings. Widespread use of this type of structured module can facilitate more rapid assessment and improve our ability to compare outcomes across disaster types and disaster-affected populations. With this information, practitioners can quickly evaluate behavioral health needs, provide referrals to mental health services when indicated, effectively allocate resources, and appropriately target interventions to help promote the recovery and overall well-being of affected communities.