Background

The term moral injury (MI) has increasingly appeared in the research literature since it was first coined by psychiatrist Jonathan Shay in the early 1990s [1]. To date, many definitions of MI have been proposed [2]. More recently, Shay suggested a definition made up of three components: “(1) betrayal of ‘what’s right’ (2) by someone who holds legitimate authority (3) in a high stakes situation” [3]. MI has been found in a wide range of populations experiencing severe trauma, including military personnel, war veterans, first responders, rape victims, and others [4, 5]. At least one qualitative study has reported that the term moral injury is useful for exploring medical students’ experience in emergency medicine settings [6]. A study of refugees in Switzerland found that MI accounted for 16% of the variance in post-traumatic stress disorder (PTSD) symptoms [7]. Papazoglou found that MI was frequently experienced by police officers after suffering repeated trauma [8].

Until 2013, there were no measures to assess MI as currently understood. Since then, several have emerged to assess the presence of MI among military populations, including two types of assessment tools: (1) those that measure both morally injurious events and MI symptoms, and (2) those that measure MI symptoms only. Measures in the first category include the 9-item Moral Injury Events Scale (MIES) developed by Nash and colleagues in diverse military samples [9, 10]. Several years later, the 20-item Moral Injury Questionnaire was developed by Currier and colleagues, again assessing both morally injurious events and symptoms [11]. The first measure to assess MI symptoms only was the 45-item Moral Injury Symptoms Scale-Military Version-Long Form (MISS-M-LF) [12], followed soon by the publication of the 17-item Expressions of Moral Injury Scale-Military Version (EMIS-M) by Currier and colleagues [13]. The MISS-M-LF was then shortened by Koenig and colleagues to a 10-item version (MISS-M-SF) [14], and this was later followed by a 4-item short version of the EMIS-M [15]. Those measures were all developed in samples of active duty military or war veterans.

These scales have largely followed the definitions by Shay [3] and Bret Litz et al. [16] that focused on MI symptoms acquired during combat, such as feelings of shame, grief, meaninglessness, and remorse from having violated core moral beliefs [17]. Symptoms relate to what one has done (killed combatants or innocents, dismembered bodies, maltreated others, or deserted comrades during battle), what one has failed to do (protected innocents or prevented the death of fellow soldiers), and what one has observed others do or fail to do [18]. MI symptoms may also involve intense feelings of betrayal by those in authority, either in or outside of the military, and include religious or spiritual struggles or a complete loss of religious faith resulting from experiences during wartime [17].

Recently, MI among physicians and other health professionals has attracted attention in the mainstream literature, particularly when discussing issues related to burnout [19]. Clinicians may experience MI when they feel their ability to deliver care is compromised by the systems (e.g., insurance, reimbursement, electronic health record) being implemented in hospitals, clinics, and medical practices [20]. During the COVID-19 pandemic, physicians in China have faced difficult ethical/moral decisions given the enormous influx of patients with life-threatening infections and limitations in available ventilators, personal protective equipment, and lifesaving medications. These physicians (and nurses) have had to play God in making decisions on who gets treatment and who does not, as well as having to deal with exposure to the coronavirus themselves and the risk this poses to their families and patients [21, 22]. As a result, health professionals have been stigmatized as vectors of contagion, resulting in their assault, abuse, and isolation during the COVID-19 pandemic, just as they had been during the SARS pandemic [23]. This situation has caused many health professionals to feel a sense of helplessness, shame, and guilt, as hundreds of patients die every day [24]. Unfortunately, until now there have been no psychometrically reliable and valid scales to measure MI symptoms in healthcare professionals.

The purpose of this study is to examine the psychometric properties of the 10-item Moral Injury Symptoms Scale-Health Professional (MISS-HP) developed by Koenig and colleagues [25], which is a version of the MISS-M-SF developed in military personnel [13], modified to make it applicable to healthcare professionals. This measure assesses 10 dimensions of moral injury: betrayal, guilt, shame, moral concerns, loss of trust, loss of meaning, difficulty forgiving, self-condemnation, religious struggle, and loss of religious/spiritual faith.

Methods

Participants and procedure

A convenience sample of physicians and nurses from across mainland China was recruited using a snowball sampling method [26] between March 27 and April 26, 2020. The inclusion criteria were: (1) being a physician or nurse; and (2) length of practice of at least 2 years. The exclusion criteria were: (1) an extended break from practice of 6 months or more for any reason during the past 2 years; (2) inability to use the internet or mobile devices due to vision or other disability preventing completion of an online questionnaire; and (3) not being formally licensed to practice medicine or nursing.

Potential participants were provided a link to an online questionnaire through a popular social media platform (WeChat). Those who responded to the invitation were encouraged to forward the invitation letter to colleagues and post it on social media sites. The invitation letter was initially sent to 19,583 potential participants through the WeChat network, of whom 4003 responded to the invitation; 28 participants refused after reading the informed consent form, resulting in 3975 completed questionnaires (Fig. 1). Of those, 969 records were excluded during the data cleaning process, leaving a final sample of 3006 (583 nurses and 2423 physicians) for the analysis.

Fig. 1 The flowchart of participant enrollment. (MISS-HP: moral injury symptoms scale; SFI: secure flourishing index)

Two-week test-retest reliability was determined by asking 100 physicians from three hospitals to complete the full survey on two occasions, of whom 73 completed the survey at both times.

Measures

Sociodemographic characteristics

Information was collected on age, gender, marital status, educational attainment, ethnicity (Chinese Han vs. minority ethnicity), area of specialty, work area (general medical ward, ICU, emergency room), and length in practice.

Moral injury

The Moral Injury Symptom Scale-Health Professional (MISS-HP) is a measure of moral injury symptoms that assesses betrayal, guilt, shame, moral concerns, loss of trust, loss of meaning, difficulty forgiving, self-condemnation, religious struggle, and loss of religious/spiritual faith [25]. Response options for each of the 10 items range from 1 to 10 to signify agreement or disagreement with each statement, giving a total score ranging from 10 to 100. Higher scores indicate a greater number and severity of MI symptoms [14].

To assess convergent validity, the 4-item Expressions of Moral Injury Scale-Short Form (EMIS-SF) was administered. Developed by Currier and colleagues, this measure has been used widely to assess MI in military personnel [10]. Items were rated on a Likert scale from 1 (strongly disagree) to 5 (strongly agree). Higher total scores indicate a greater number and severity of MI symptoms, reflecting maladaptive behaviors and internal experiences associated with the moral challenges of delivering clinical care.

Mental health

The 9-item Patient Health Questionnaire (PHQ-9) [27] and 7-item Generalized Anxiety Disorder (GAD-7) [28] were used to measure depressive symptoms and anxiety symptoms, respectively. These two instruments are short screening measures frequently used in medical and community settings. Each item on these measures is rated on a 4-point Likert scale (from 0 to 3) indicating how often each symptom has occurred within the past 2 weeks. Total scores range from 0 to 27 for the PHQ-9 and from 0 to 21 for the GAD-7, with higher scores indicating more severe symptoms. The Chinese versions of the PHQ-9 and GAD-7 have strong internal and test-retest reliability as well as strong construct validity and factor structure validity in both medical patients and the general population [29, 30].

Well-being

The 12-item Secure Flourishing Index (SFI) was used to measure six domains of well-being: happiness and life satisfaction, physical and mental health, meaning and purpose, character and virtue, close social relationships, and financial and material stability [31]. Each item was measured on an 11-point visual analogue scale (from 0 to 10), where higher scores indicate higher levels of well-being in each of these areas. Two items assess each of the six domains, and these are averaged to obtain domain-specific scores. The total SFI score is calculated as the average of all six domains with equal weighting. The Chinese version of the SFI has been shown to have acceptable validity and reliability in a Chinese sample [32].
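The scoring rule just described can be sketched in a few lines of Python. This is a minimal illustration only: the function name score_sfi, the domain labels, and the assumption that the two items of each domain are stored consecutively are ours, not part of the published instrument.

```python
from statistics import mean

# Domain order follows the text. Storing each domain's two items next to
# each other (items 1-2, 3-4, ...) is an assumption made for illustration.
SFI_DOMAINS = [
    "happiness_life_satisfaction",
    "physical_mental_health",
    "meaning_purpose",
    "character_virtue",
    "close_social_relationships",
    "financial_material_stability",
]


def score_sfi(items):
    """Return (domain_scores, total) for 12 responses on the 0-10 scale."""
    if len(items) != 12:
        raise ValueError("the SFI has 12 items")
    if any(not 0 <= x <= 10 for x in items):
        raise ValueError("responses must lie between 0 and 10")
    # Each domain score is the average of its two items.
    domain_scores = {
        name: mean(items[2 * i: 2 * i + 2])
        for i, name in enumerate(SFI_DOMAINS)
    }
    # The total is the equally weighted average of the six domain scores.
    total = mean(domain_scores.values())
    return domain_scores, total
```

Because every domain has exactly two items, the equally weighted average of the six domain scores equals the plain average of the 12 items.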

Burnout

A modified version of the Maslach Burnout Inventory-Human Services Survey for Medical Personnel (MBI-HSMP) was used to measure the three dimensions of burnout: emotional exhaustion, depersonalization, and reduced personal accomplishment [33]. Each item on the 22-item scale is scored on a 7-point Likert scale from 0 (never) to 6 (daily). Higher scores on each subscale and the overall scale indicate higher levels of burnout. The MBI-HS has been translated into Chinese following a standard procedure, which was shown to have acceptable reliability and validity in a sample composed of participants from a range of occupations [34].

Workplace violence

Workplace violence was assessed by asking, “Have you ever been attacked by your patients or their close relatives, either physically or verbally?” Response categories were yes or no.

Translation of instruments

A 4-step procedure recommended by WHO was used to guide the translation of instruments used in this study into Chinese [35, 36]. First, the original English MISS-HP was translated into Chinese by two health professionals from outside the research team, both of whom were bilingual and fluent in Chinese and English. The two translations were then compared and discrepancies reconciled to arrive at a draft Chinese version. Second, a bilingual expert panel consisting of three health professionals (including the original translators) and two social science researchers reviewed the draft Chinese translation separately, making cultural adaptations as necessary. Third, the draft Chinese version was back-translated into English by two bilingual health professionals (different translators from those in the first step). The back-translated English version was then compared to the original English version and reviewed by the original author to ensure that the questions were translated correctly, with discrepancies resolved at this stage. Fourth, the draft version of the scale was administered to 11 physicians from two hospitals for pre-testing. These physicians were asked to send comments about ease of administration, clarity of wording, and time burden. Necessary changes in language were then made based on consensus to arrive at the final Chinese version of the MISS-HP (Supplementary Table 1).

Data analysis

Missing values

When computing scale scores, the mean substitution method was used to replace missing values [37]. If two items or fewer on a scale were missing, we substituted the average score of items answered on the scale for the missing item. If more than two items were missing, the scale score was considered missing and no substitutions were made.
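The substitution rule above can be illustrated with a short sketch. The function names impute_scale and scale_total are hypothetical, and missing responses are assumed to be coded as None; the two-item limit comes from the text.

```python
from statistics import mean

MAX_MISSING = 2  # rule from the text: substitute only if two or fewer items are missing


def impute_scale(responses, max_missing=MAX_MISSING):
    """Replace missing items (None) with the mean of the answered items.

    Returns None when more than max_missing items are missing, in which
    case the scale score is treated as missing.
    """
    answered = [r for r in responses if r is not None]
    n_missing = len(responses) - len(answered)
    if n_missing > max_missing or not answered:
        return None
    fill = mean(answered)
    return [fill if r is None else r for r in responses]


def scale_total(responses):
    """Total scale score after mean substitution, or None if not computable."""
    imputed = impute_scale(responses)
    return None if imputed is None else sum(imputed)
```

For example, a respondent who skipped two of ten items and averaged 5 on the remaining eight receives a total of 50, whereas a respondent who skipped three items receives no scale score.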

Statistical analyses

Descriptive analyses were performed on all subjects depending on whether responses were categorical or continuous. Differences in socio-demographic characteristics between nurses and physicians were tested using the Student’s t-test for continuous variables and the chi-square test for categorical variables. Differences in MISS-HP total scores between different demographic groups were examined using one-way analysis of variance (ANOVA). General linear regression was used to control for covariates.

Convergent/divergent validity was determined by examining correlations between the MISS-HP score and other measures. A correlation matrix was constructed using Pearson correlation coefficients. Cronbach’s alpha was used to assess the internal consistency of the MISS-HP, where alphas equal to or greater than 0.70 are considered acceptable [38]. The intra-class correlation coefficient (ICC) was used to determine 2-week test-retest reliability, where ICCs between 0.41 and 0.60 indicate moderate reliability, those between 0.61 and 0.80 represent good reliability, and those higher than 0.80 indicate excellent reliability [39]. Internal reliability tests were performed separately for the total sample, nurses, and physicians.
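As a rough illustration of these reliability criteria, the sketch below computes Cronbach’s alpha from complete item-level data using the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), and maps an ICC onto the bands quoted above. The function names and the label used for ICCs below 0.41 are our own; the analyses reported here were run in SPSS, not with this code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes complete data (no missing items) and at least two respondents.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_is_acceptable(alpha, threshold=0.70):
    """Apply the 0.70 cutoff cited in the text."""
    return alpha >= threshold


def describe_icc(icc):
    """Map an ICC onto the bands cited in the text."""
    if icc > 0.80:
        return "excellent"
    if icc >= 0.61:
        return "good"
    if icc >= 0.41:
        return "moderate"
    return "below moderate"  # label not taken from the source
```

Under these bands, a total-score ICC of 0.77 would be described as good.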

Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were conducted to identify and verify the factor structure of the scale. Both physicians and nurses were split randomly into two separate groups. In Group 1 (n = 1198 for physicians, n = 292 for nurses), EFA was performed using principal components analysis with Promax rotation (an oblique rotation method allowing factors to correlate with each other). The Kaiser-Meyer-Olkin (KMO) index was used to measure sampling adequacy, where KMO values of 0.6 or higher indicate adequacy. Bartlett’s test of sphericity was used to assess whether the correlations between variables were suitable for factor analysis.
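For readers unfamiliar with these two diagnostics, the following sketch shows how they are usually computed from an item correlation matrix, using the textbook formulas rather than SPSS itself. The helper invert_with_det and the other function names are our own. For p items, Bartlett’s statistic has p(p − 1)/2 degrees of freedom, which gives 45 for a 10-item scale.

```python
import math


def invert_with_det(matrix):
    """Gauss-Jordan inverse and determinant of a square matrix (list of lists)."""
    n = len(matrix)
    # Augment the matrix with the identity on the right.
    a = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        pv = a[col][col]
        det *= pv
        a[col] = [x / pv for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a], det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square, df) for Bartlett's test of sphericity."""
    p = len(corr)
    _, det = invert_with_det(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    p = len(corr)
    inv, _ = invert_with_det(corr)
    r2 = q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                # Partial correlation of items i and j given all other items.
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                q2 += partial ** 2
    return r2 / (r2 + q2)
```

A KMO near 1 means the partial correlations are small relative to the raw correlations, so the items share enough common variance for factoring; a large Bartlett chi-square indicates that the correlation matrix differs from an identity matrix.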

In Group 2 (n = 1225 for physicians, n = 291 for nurses), CFA using the maximum likelihood method was performed to assess the stability of the factor structure. Overall model fit was examined using the chi-square test with degrees of freedom (df), where a p-value greater than 0.05 indicates adequate fit; because this test is sensitive to large sample sizes, additional fit indices were also examined. These included the comparative fit index (CFI), normed fit index (NFI), incremental fit index (IFI), and root mean square error of approximation (RMSEA). The Akaike information criterion (AIC) and expected cross-validation index (ECVI) were also calculated. Values of CFI > 0.90, NFI > 0.90, IFI > 0.90, and RMSEA < 0.08 indicate that model fit is acceptable [40]. All statistical analyses were performed using IBM SPSS version 23.0 (SPSS Inc., Chicago, IL, USA).

Results

Demographic characteristics

Demographic characteristics of the final sample of nurses and physicians are displayed in Table 1. The average age of the overall sample was 35.4 years (SD 8.1; range 20–70), and the average length in practice was 11.6 years (SD 8.5; range 2–50). Approximately one-third of participants were male, and more than half (62.5%) provided inpatient care. Nearly two-thirds (64.2%) of participants reported experiencing workplace violence at some time during their professional practice. Compared with physicians, nurses were more likely to be female, to be younger, and to work in the ICU or emergency room; they also had lower educational attainment and were less likely to experience burnout. Specialty area among physicians was 34% internal medicine, 14% surgery, 12% obstetrics-gynecology or pediatrics, 8% psychiatry, and 31% other specialties.

Table 1 Socio-demographic characteristics of participants

Reliability

As shown in Table 2, the Cronbach’s alpha for the MISS-HP scale when each item was deleted ranged from 0.64 to 0.76 in the overall sample (0.65–0.71 in nurses, 0.63–0.69 in physicians). The Cronbach’s alpha for the MISS-HP in the overall sample was 0.70 (0.71 in nurses, 0.70 in physicians). Test-retest reliability after 2 weeks indicated ICCs for individual MISS-HP scale items ranging from 0.41 to 0.74; for the total score, the ICC was 0.77. Pearson correlations between the two times of administration were similar to the ICCs (results not shown).

Table 2 Cronbach’s alpha for the MISS-HP with items removed and total score

Validity

As evidence for convergent validity, a significant positive correlation was found between the MISS-HP and EMIS-SF in both physicians (r = 0.45) and nurses (r = 0.43) (Table 3). Divergent or discriminant validity was demonstrated by moderate correlations between MISS-HP score and mental health, well-being, and burnout scales. These included PHQ-9 depressive symptoms (r = 0.45 for physicians, r = 0.37 for nurses), GAD-7 anxiety symptoms (r = 0.41 for physicians, r = 0.37 for nurses), and similar correlations for the three burnout subscales and well-being measure.

Table 3 Correlation matrix for moral injury, mental health, burnout, and well-being

Known groups validity was supported by comparing MISS-HP scores between those who reported workplace violence and those who did not. As indicated in Table 4, health professionals who experienced workplace violence scored higher on both the MISS-HP and the EMIS-SF than those who did not (p < 0.01). After controlling for demographic variables, workplace violence was significantly associated with MI symptoms (B = 4.16, 95% CI = 3.21–5.10, p < 0.001).

Table 4 Moral injury score and workplace violence exposure

Construct validity was examined by exploratory factor analysis (EFA) followed by confirmatory factor analysis (CFA). EFA in the nurses’ sample (Group 1) revealed a KMO index of 0.72, and Bartlett’s test of sphericity indicated that the correlation matrix was factorable (χ2 = 649, df = 45, p < 0.001). As illustrated in Supplementary Figure 1, the three extracted factors explained 59.2% of the total variance. The EFA in the physicians’ sample (Group 1) revealed a KMO index of 0.73, and Bartlett’s test of sphericity demonstrated factorability (χ2 ≈ 5270, df = 45, p < 0.001). As in nurses, three factors were extracted that explained 58.9% of the total variance. As indicated in Table 5, factor 1 (“shame and guilt”) included items MI2, MI3, and MI4, factor 2 (“mistrust”) included items MI5, MI6, and MI10, and factor 3 (“forgiveness”) was made up of four items: MI1, MI7, MI8, and MI9.

Table 5 The factor structure model of the MISS-HP

CFA confirmed the three-factor model for the MISS-HP scale in nurses (χ2 = 74.19; df = 32; p < 0.001, CFI = 0.93, NFI = 0.88, IFI = 0.93, RMSEA = 0.067, AIC = 120.19, and ECVI = 0.414). Likewise, CFA confirmed the three-factor model in physicians (χ2 = 232.03; df = 32; p < 0.001, CFI = 0.93, NFI = 0.92, IFI = 0.93, RMSEA = 0.071, AIC = 278.03, and ECVI = 0.23) (see Fig. 2).

Fig. 2 The confirmatory factor analysis models
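The reported RMSEA, AIC, and ECVI values can be cross-checked from the chi-square statistics, degrees of freedom, and Group 2 sample sizes using common textbook formulas. The sketch below is an illustration under stated assumptions: software packages differ slightly in these formulas (for example, N versus N − 1 in the RMSEA denominator), and the figure of 23 free parameters is our inference from 10 indicators (55 sample moments) minus 32 degrees of freedom, not a number stated in the text.

```python
import math

FREE_PARAMETERS = 23  # inferred: 10 * 11 / 2 = 55 sample moments minus df = 32


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def aic(chi2, n_params=FREE_PARAMETERS):
    """Akaike information criterion in its chi-square-based form."""
    return chi2 + 2 * n_params


def ecvi(aic_value, n):
    """Expected cross-validation index derived from the AIC."""
    return aic_value / (n - 1)


def acceptable_fit(cfi, nfi, ifi, rmsea_value):
    """Apply the cutoffs given in the Methods to each index separately."""
    return {
        "CFI": cfi > 0.90,
        "NFI": nfi > 0.90,
        "IFI": ifi > 0.90,
        "RMSEA": rmsea_value < 0.08,
    }


# Values reported above: nurses (n = 291) and physicians (n = 1225).
nurses = {"chi2": 74.19, "df": 32, "n": 291}
physicians = {"chi2": 232.03, "df": 32, "n": 1225}

for label, g in (("nurses", nurses), ("physicians", physicians)):
    a = aic(g["chi2"])
    print(label,
          round(rmsea(g["chi2"], g["df"], g["n"]), 3),
          round(a, 2),
          round(ecvi(a, g["n"]), 3))
```

Under these assumptions the recomputed values agree with those reported (RMSEA 0.067 and 0.071; AIC 120.19 and 278.03; ECVI 0.414 and 0.227). Applying acceptable_fit to the reported indices shows that every index meets its cutoff except the NFI of 0.88 in nurses, which falls slightly below 0.90.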

Discussion

To our knowledge, this is the first study to examine the psychometric properties of the MISS-HP, a short but comprehensive measure of moral injury symptoms, in a large sample of health professionals. Unlike other measures of MI, the MISS-HP is unique in that it assesses both psychological and religious/spiritual dimensions of MI. The results indicated that the MISS-HP is a reliable and valid measure of MI in both nurses and physicians. The findings provide primary evidence supporting the use of this tool for assessing symptoms of MI as part of health promotion programs for health professionals in China. The MISS-HP also fills an important gap in research that examines the prevalence, correlates, and health consequences of MI in nurses and physicians.

The internal consistency of the MISS-HP (alpha = 0.70 for physicians and 0.71 for nurses) is acceptable, as is the test-retest reliability (ICC = 0.77 in physicians). With regard to validity, the MISS-HP has acceptable convergent validity with another established measure of MI, the EMIS-SF (r = 0.45 for physicians and r = 0.43 for nurses). Correlations with common mental conditions (depression and anxiety), well-being, and burnout measures are as robust with the MISS-HP as with the EMIS-SF.

Known groups validity supports using the MISS-HP to identify MI among those suffering from potentially morally injurious events such as being assaulted by patients or relatives. This finding is partly supported by a study of military veteran family members, which found that such violence inflicts damage to moral belief systems and causes a loss of trust [41]. Many physicians have been killed or injured during the past decade in China [42]. Moral injury can be the consequence of unexpected violence from patients or their relatives, giving rise to feelings of betrayal in nurses and physicians by the very population they are risking their lives to help (especially during the COVID-19 pandemic) [16].

Construct validity of the MISS-HP was established using exploratory factor analysis (EFA), which was then verified by CFA. Factor analysis indicated a three-dimensional structure for the MISS-HP, explaining 59% of the total variance. This finding is consistent with the work of Griffin and colleagues [2], who suggested at least two interrelated MI symptom dimensions: self-directed outcomes (e.g., thoughts/feelings of responsibility for the occurrence of moral violations, such as shame or viewing oneself as unlovable or unforgivable) and other-directed outcomes (e.g., thoughts/feelings associated with being a victim of others’ morally transgressive acts). To these, the MISS-HP adds the religious/spiritual dimension of MI involving struggle and loss of faith.

Limitations

Several aspects of the present study limit the generalizability of these findings, thereby influencing both research and clinical implications. First, we assessed the MISS-HP in a single cross-sectional study involving a nonrandom sample of Chinese health professionals, which did not include those in practice for less than 2 years (who may be at even greater risk of MI given their lack of experience). The findings here therefore require cautious generalization to health professionals in other areas of China and to those outside of China. Second, although a standard translation procedure was used to create a Chinese version of the MISS-HP, cultural differences between China and Western societies (where the scale was initially developed and designed) may have affected the final result (both the translation and the meaning of items). Third, despite the consistent findings shown in nurses and physicians, test-retest reliability was assessed only in physicians, which leaves some uncertainty about the scale’s use in nurses. Fourth, the internal reliability of the MISS-HP was borderline but acceptable in both nurses and physicians (alpha = 0.70 or higher). Fifth, other morally injurious events besides workplace violence need to be assessed in future studies. Finally, as with all self-report measures, the accuracy of responses cannot be guaranteed, since external factors may influence the report of symptoms (even though the survey was anonymous).

Conclusions

The 10-item MISS-HP is a brief, comprehensive, reliable, and valid measure for assessing symptoms of moral injury in physicians and nurses providing healthcare to patients in mainland China during the COVID-19 pandemic. Scores on the scale of 50 or higher have been found to indicate significant difficulty with social and occupational functioning in this population [41]. From both a clinical and research perspective, the MISS-HP can be used to screen for MI symptoms and to follow response to treatment among healthcare professionals in China.