Background

Health-related quality of life (HRQL) is acknowledged as an important outcome measure of health care, particularly in patients with chronic diseases such as cardiovascular diseases [1,2,3]. According to the World Health Organization, cardiovascular diseases were the leading cause of death globally in 2016, accounting for 31.4% of all deaths with 52.8% of cardiovascular diseases due to ischemic heart disease (IHD) [4]. Latest available domestic data showed slightly decreasing numbers of deaths caused by IHD for German-speaking countries including Austria (16.8% in 2017) [5], Germany (13.9% in 2015) [6], and Switzerland (10.9% in 2015) [7]. Furthermore, IHD was the second leading cause of disability-adjusted life years in Western Europe in 2010 [8], with major reductions in HRQL and estimated health care costs (in-patient stays) of, for example, 359 million Euros per year in Austria [9]. Patient-reported outcomes, including HRQL, are predictive of mortality, cardiovascular events, rehospitalization, and care expenses in patients with cardiovascular diseases [10, 11]. As a result, patient-reported outcomes have been recommended for routine use in clinical practice [1, 2].

Generic and specific HRQL questionnaires have been developed and validated to measure short- and medium-term changes in patient-reported HRQL. Generic instruments allow comparisons of HRQL between different healthy populations, between patients with a disease and healthy populations, and between populations with different diseases. The Short Form-36 Health Survey (SF-36) [12, 13] is one of the most widely used generic HRQL measures in the general population and in patients with cardiovascular disease [14,15,16]. In contrast, disease-specific HRQL questionnaires are designed to provide more tailored information concerning a given disease or a specific diagnosis [17] and therefore offer greater clinical relevance than generic questionnaires. However, this approach may preclude between-diagnosis HRQL outcome comparisons within a given disease. Core disease-specific HRQL instruments, like the HeartQoL questionnaire [18, 19], provide a potential solution to this limitation. The HeartQoL was developed and validated in the HeartQoL Project between 2002 and 2011 with 6384 patients with IHD who lived in 22 countries across five global regions where 15 languages are spoken [18, 19]. In a first step, with permission and in a cross-sectional survey [18], 14 items from the Seattle Angina Questionnaire (SAQ) [20], the MacNew Heart Disease Health-related Quality of Life Questionnaire (MacNew) [21], and the Minnesota Living with Heart Failure Questionnaire (MLHFQ) [22] were identified; these items comprise the HeartQoL. In the second step, the psychometric properties of the 14-item HeartQoL were tested [19].

When HRQL questionnaires are used as outcome measures, they need to demonstrate reliability (the degree to which an instrument is free from random error), validity (the degree to which the instrument really measures what it purports to measure), and responsiveness (the ability to detect change over time) [23]. The psychometric properties of the SF-36 [24] and of the three questionnaires on which the HeartQoL is based (SAQ, MacNew, MLHFQ) have all been individually assessed in the German language [25, 26]. Reliability, validity, and, in part, responsiveness of the English HeartQoL have already been demonstrated in patients with angina, myocardial infarction (MI), and ischemic heart failure in the HeartQoL Project [19] and in patients with angina or MI living in the USA [27]; reliability and validity have also been documented in stable coronary patients in EuroAspire IV [28]. Studies of other language versions, e.g., in Danish-speaking patients with atrial fibrillation [29], an implantable cardioverter defibrillator [30], or valve surgery [31], and in Chinese patients with angina, MI, and ischemic heart failure [submitted], demonstrated adequate psychometric properties as well.

Methods

The aim of this study was to determine the psychometric properties of the German HeartQoL. Cross-sectional (N = 305) and longitudinal (N = 184) data were collected from five centers in Austria and Switzerland to validate the German version of the HeartQoL in patients with angina, MI, and ischemic heart failure. The German HeartQoL, the SF-36 [24], and the Hospital Anxiety and Depression Scale (HADS) [32] were administered as paper-and-pencil questionnaires.

Patients

Patients diagnosed with angina, MI, or ischemic heart failure were recruited at five sites with ethics approval. Eligibility criteria were the same as in the original HeartQoL project and required that patients (a) were currently being treated for angina (disease severity: Canadian Cardiovascular Society (CCS) class II, III, or IV) [33] with an objective measure of IHD; (b) had experienced a documented MI between one and six months previously; or (c) were currently being treated for ischemic heart failure (disease severity: New York Heart Association (NYHA) class II, III, or IV) [34] with evidence of left ventricular dysfunction and an objective measure of IHD. Additional eligibility criteria were age ≥ 18 years, the ability to complete the self-administered battery of HRQL measures in German, no hospitalization in the last 6 weeks, and no serious psychiatric disorder or current substance abuse as identified by the referring physician. Participation in the study was discussed with all patients meeting the eligibility criteria, and written informed consent was obtained from those agreeing to participate.

Patient-reported questionnaires

Sociodemographic and clinical variables

Age, sex, body mass index, comorbidities (cancer, chronic pain, dialysis, gastro-intestinal diseases, orthopedic diseases, neurological diseases, respiratory diseases, and urogenital diseases), risk factors (diabetes, hypercholesterolemia, hypertension, physical inactivity, and smoking), shortness of breath, and chest pain were self-reported by patients. Disease severity was evaluated by physicians for patients with angina (CCS class II, III, or IV) and ischemic heart failure (NYHA class II, III, or IV).

HeartQoL

The German version of the disease-specific HeartQoL was used to measure HRQL in this study. The questionnaire consists of a physical (10 items) and an emotional (4 items) subscale making up the 14-item global scale with higher values representing better HRQL [18]. The HeartQoL items are based on the items in the SAQ, MacNew, and MLHFQ [25, 26] which were translated into the German language. All items on the physical (e.g., “In the last 4 weeks, have you been bothered by having to lift or move heavy objects?”) and the emotional subscale (e.g., “In the last 4 weeks, have you been bothered by being worried?”) are answered on a 4-point scale ranging from “bothered a lot” (= 0) to “not bothered” (= 3).
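The scoring described above can be illustrated with a minimal Python sketch. It assumes that each scale score is the mean of its item responses, which is consistent with the scale means on the 0 to 3 range reported in this paper but is not spelled out in the text. The function name `heartqol_scores` is hypothetical, and the caller is assumed to supply the physical and emotional item responses separately, since the item-to-subscale assignment is not listed here.

```python
from statistics import mean


def heartqol_scores(physical_items, emotional_items):
    """Return (physical, emotional, global) scores on the 0-3 response scale.

    Assumption: every scale score is the mean of its item responses.
    """
    if len(physical_items) != 10 or len(emotional_items) != 4:
        raise ValueError("expected 10 physical and 4 emotional item responses")
    all_items = list(physical_items) + list(emotional_items)
    if any(r not in (0, 1, 2, 3) for r in all_items):
        raise ValueError("responses must be coded 0, 1, 2, or 3")
    # The global scale uses all 14 items, so it is not the mean of the two subscale scores.
    return mean(physical_items), mean(emotional_items), mean(all_items)


# Hypothetical patient: moderately bothered physically, not bothered emotionally.
physical, emotional, global_score = heartqol_scores([2] * 10, [3] * 4)
```

Because the physical subscale contributes 10 of the 14 items, the global score in this sketch is weighted toward physical HRQL.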

Hospital Anxiety and Depression Scale (HADS)

The German version of the HADS [32] is a 14-item self-assessment questionnaire used to screen for anxiety and depressive symptoms with an anxiety (e.g., “I get a sort of frightened feeling as if something awful is about to happen”) and a depression subscale (e.g., “I look forward with enjoyment to things”). The items are answered on a scale ranging from 0 to 3 with higher scores representing higher levels of anxiety or depression. In particular, patients with a cut-off score ≥ 8 are considered to be potentially either anxious or depressed in terms of a clinical diagnosis [35]. Moreover, as some studies argue for a general distress factor summing up all anxiety and depression symptoms, a respective common score including all items was generated with cut-off criteria of ≥ 12 showing the best sensitivity/specificity for general distress in patients with coronary heart disease [36]. Cronbach’s alpha in the total cohort was .81 for both scales and .87 for the whole instrument.
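The screening rules described above can be sketched as follows. The sketch assumes the standard HADS layout of seven anxiety and seven depression items and assumes that reverse-keyed items have already been recoded so that higher values indicate more symptoms. The function name `hads_screen` and the dictionary keys are hypothetical.

```python
def hads_screen(anxiety_items, depression_items):
    """Sum the HADS subscales and apply the cut-offs used in this study.

    Assumptions: 7 items per subscale, each already coded 0-3 with higher
    values indicating more symptoms.
    """
    if len(anxiety_items) != 7 or len(depression_items) != 7:
        raise ValueError("expected 7 anxiety and 7 depression item responses")
    anxiety = sum(anxiety_items)
    depression = sum(depression_items)
    total = anxiety + depression
    return {
        "anxiety": anxiety,
        "depression": depression,
        "total": total,
        "anxious": anxiety >= 8,          # subscale cut-off [35]
        "depressed": depression >= 8,     # subscale cut-off [35]
        "distressed": total >= 12,        # general distress cut-off [36]
    }


# Hypothetical patient: elevated anxiety, low depression, distressed overall.
result = hads_screen([2, 1, 1, 2, 1, 1, 1], [0, 1, 0, 1, 0, 1, 0])
```

In this example the patient exceeds the anxiety cut-off and the general distress cut-off without exceeding the depression cut-off, which shows that the three flags are not redundant.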

SF-36 Health Survey (SF-36)

The German version of the SF-36 [24] consists of 36 items, each scored in one of eight scales which form two distinct summary measures, namely the Physical Component Summary (PCS) and the Mental Component Summary (MCS) measure. Data from the PCS and MCS are presented as T-scores with a mean (M) of 50 ± 10 standard deviation (SD), with higher scores indicating better HRQL. The first item “In general, would you say your health is…excellent/very good/good/fair/poor?” and the second item “Compared to 1 year ago, how would you rate your health in general now? Much better/somewhat better/about the same/somewhat worse/much worse… than 1 year ago” were used to check the discriminative validity of the HeartQoL. Cronbach’s alpha in the total cohort was .88 for PCS and .91 for MCS.

Statistical analyses

Only data sets with full information on the cardiac diagnoses were included in the cross-sectional (N = 305) and longitudinal (N = 184) statistics. No outliers were detected (z-transformed means exceeding ± 3.29) and no data imputations were carried out. Mean ± SD and proportions were calculated for the total cohort and for each of the three IHD diagnoses (angina, MI, and ischemic heart failure). Categorical sociodemographic and clinical variables were analyzed with Pearson’s Chi-square test, while continuous variables and scale means were examined with analyses of covariance (ANCOVA) adjusted for age, sex, risk factors, and disease severity within diagnosis, with post hoc Bonferroni correction. A two-sided p-value < .05 was considered significant. Data were analyzed using IBM SPSS Statistics 24 [37] and STATA 14 [38].
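The outlier criterion mentioned above can be illustrated with a short Python sketch. It assumes that z-scores are computed from the sample mean and the sample standard deviation; whether the sample or the population SD was used in the original analysis is not stated. The function name `flag_outliers` and the example values are hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(values, cutoff=3.29):
    """Return the indices of values whose absolute z-score exceeds the cutoff."""
    m = mean(values)
    sd = stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [i for i, v in enumerate(values) if abs((v - m) / sd) > cutoff]


# Hypothetical data: 30 identical scores and one extreme value.
scores = [10.0] * 30 + [100.0]
outliers = flag_outliers(scores)
```

A cutoff of 3.29 corresponds to a two-sided tail probability of about .001 under a normal distribution.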

The evaluation of the psychometric properties of the German HeartQoL followed criteria recommended by the Scientific Advisory Committee [23]. Floor and ceiling effects of the HeartQoL were considered present when more than 15% of patients in the total group or in a diagnosis reported the lowest score (= 0; floor) or the highest score (= 3; ceiling) [39]. Mokken scale analysis was used to determine the scale structure. Loevinger’s Hi coefficients for each item as well as H coefficients for the global scale and each subscale were calculated, with values of ≥ .50 considered a “strong,” .40–.49 a “moderate,” and .30–.39 a “weak” Mokken scale [40]. Internal consistency reliability was measured using Cronbach’s alpha, with values of ≥ .70 acceptable for group comparisons and ≥ .90 for individual comparisons [41]. Convergent and divergent validity were tested with Pearson inter-correlations, which can be interpreted as follows: r < .10 = no correlation, r = .10–.29 = low correlation, r = .30–.49 = moderate correlation, r ≥ .50 = high correlation [41]. These correlations were then compared using Steiger’s test [42]. The “known-groups” approach [43] was used to test discriminative validity in groups known, based on previous research and clinical knowledge, to differ on the variables of interest. Discriminative validity was tested with the following groups: significantly higher HeartQoL scores were hypothesized to be reported by patients with either MI or angina than by patients with ischemic heart failure and, regardless of diagnosis, by younger and by male patients than by older and female patients, by patients reporting “excellent/very good” compared to “good” or “fair/poor” SF-36 general health status, by patients reporting “improved” compared to either “no change” or “deteriorated” health on the SF-36 health transition item, by patients not exceeding the cut-off score of ≥ 8 (anxiety or depression) or ≥ 12 (general distress) compared to those exceeding these scores, and by patients with angina or ischemic heart failure with less disease severity than those with greater severity.
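The thresholds listed above can be written down as a minimal Python sketch. The function names are hypothetical. Two choices are assumptions rather than statements from the text: correlations are classified by their absolute value, and H coefficients below .30 receive the label "below weak" because no label is given for that range.

```python
def floor_ceiling(scores, lowest=0.0, highest=3.0, threshold=0.15):
    """Proportion of patients at the lowest and highest possible scale score."""
    n = len(scores)
    floor = sum(1 for s in scores if s == lowest) / n
    ceiling = sum(1 for s in scores if s == highest) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        "floor_effect": floor > threshold,      # more than 15% at the minimum
        "ceiling_effect": ceiling > threshold,  # more than 15% at the maximum
    }


def interpret_r(r):
    """Label a Pearson correlation using the bands cited above [41]."""
    size = abs(r)  # assumption: the sign is ignored when labelling strength
    if size < 0.10:
        return "none"
    if size < 0.30:
        return "low"
    if size < 0.50:
        return "moderate"
    return "high"


def interpret_h(h):
    """Label a Loevinger H or Hi coefficient using the bands cited above [40]."""
    if h >= 0.50:
        return "strong"
    if h >= 0.40:
        return "moderate"
    if h >= 0.30:
        return "weak"
    return "below weak"  # assumption: no label is given for this range


# Hypothetical emotional subscale scores for ten patients.
effects = floor_ceiling([3, 3, 1, 2, 0, 1.5, 2.5, 3, 1, 2])
```

In the hypothetical example, three of ten patients score at the maximum, so a ceiling effect is flagged while the single patient at the minimum does not trigger a floor effect.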

Responsiveness was tested in two different groups. Patients undergoing percutaneous coronary intervention (PCI) were assessed before and four weeks after the intervention, and patients participating in a four-week in-patient cardiac rehabilitation (CR) program completed the questionnaires at the start and end of the program. Results are reported as effect sizes (small: ≥ .20 to < .50; moderate: ≥ .50 to < .80; and large: ≥ .80) using the standardized response mean method (effect size = (mean score at time 2 − mean score at time 1) / SD of the change score) [44].
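The standardized response mean can be illustrated with a minimal Python sketch. Because the scores are paired, the difference of the two means equals the mean of the individual change scores, which is what the code computes. The function names and the example data are hypothetical, and the label "negligible" for effect sizes below .20 is an assumption, as no label is given for that range.

```python
from statistics import mean, stdev


def standardized_response_mean(time1, time2):
    """Mean change divided by the SD of the individual change scores."""
    if len(time1) != len(time2):
        raise ValueError("paired scores are required")
    changes = [after - before for before, after in zip(time1, time2)]
    return mean(changes) / stdev(changes)


def label_effect_size(es):
    """Label an effect size with the bands used in this study."""
    size = abs(es)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "negligible"  # assumption: no label is given below .20


# Hypothetical pre/post global scores for four patients.
pre = [1.0, 1.5, 2.0, 1.0]
post = [1.5, 2.5, 2.0, 2.0]
srm = standardized_response_mean(pre, post)
```

Dividing by the SD of the change scores, rather than by the baseline SD, means that a small but very consistent improvement can still yield a large standardized response mean.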

Results

Patient characteristics and questionnaire means

In this analysis, 101 patients (33.1%) had documented angina, 123 (40.3%) had a documented MI, and 81 (26.6%) had documented ischemic heart failure. None of the sociodemographic or patient data differed significantly between patients from the two countries. The mean age of the total cohort was 63.5 ± 11.1 years and 77.7% were male, which is representative of a German-speaking cardiac population with IHD [45,46,47]. Most of the patients were married (71.1%) and had not completed high school (62.6%), and about half of them were employees (white collar; 52.1%). Physical inactivity was the most prevalent risk factor (60.7%), followed by hypercholesterolemia (55.4%) and hypertension (50.8%). Among patients with angina, 56.1% were classified as CCS class II, and 46.9% of patients with ischemic heart failure were classified as NYHA class II; all other patients were classified in either class III or IV. Only 41 patients (13.4%) suffered from comorbidities such as cancer or chronic pain. Anxiety scores ≥ 8 were reported by 30.2% and depression scores ≥ 8 by 20.7% of the total cohort, with the greatest number of either anxious or depressed patients observed among patients with angina. General distress scores ≥ 12 were reported by 38.0% of all cardiac patients, with almost half of the patients with angina feeling distressed. All sociodemographic and clinical characteristics, as well as mean ± SD for each questionnaire, are detailed in Table 1.

Table 1 Description of the sample

For the total cohort, the mean HeartQoL physical score was 1.81 ± .72, the mean emotional score was 2.20 ± .74, and the global HRQL score was 1.92 ± .64, with ANCOVA results demonstrating significant physical and global score differences between patients with angina and MI. The highest physical and emotional HeartQoL scores were reported by patients with MI as hypothesized; however, the lowest scores were found in patients with angina (physical subscale and global scale p < .05) and not, as hypothesized, in patients with ischemic heart failure (Tables 1, 2). Means on the HADS and the SF-36 scales did not differ significantly across diagnoses (Table 1). HeartQoL floor effects on each scale were always ≤ 3.3% in the total group and in each diagnosis (Table 2). HeartQoL physical and global score ceiling effects were ≤ 3.0% in the total group and ≤ 4.9% in each diagnosis, while ceiling effects on the emotional subscale were observed in 23.3% of the total group, ranging from 20.4 to 25.2% by diagnosis (Table 2).

Table 2 Measurement values of the German HeartQoL

Psychometric properties of the HeartQoL

Factor structure

Mokken analysis revealed that the HeartQoL H coefficients were “strong” (.59 for the physical subscale, .77 for the emotional subscale, and .51 for the global scale), confirming the original HeartQoL two-factor structure. The Hi coefficients were mostly “strong,” ranging from .52 to .68 on the physical subscale and from .74 to .81 on the emotional subscale; the only exception was the eighth item (“…feeling tired, fatigued, low on energy?”), whose Hi coefficient on the physical subscale was .48 (Table 3).
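To make the H and Hi coefficients more concrete, the following Python sketch computes Loevinger's coefficients as the ratio of observed inter-item covariances to the maximum covariances attainable given each item's marginal distribution. The maximum is obtained here by pairing the sorted responses of two items. This is a simplified illustration under those assumptions, not the software routine used in this study; the function names and the example data are hypothetical, and items with zero variance are not handled.

```python
from statistics import mean


def covariance(x, y):
    """Population covariance of two equally long response vectors."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def max_covariance(x, y):
    """Largest covariance attainable with the observed marginals.

    Pairing the sorted values of both items maximises the sum of products.
    """
    return covariance(sorted(x), sorted(y))


def loevinger_h(items):
    """Return (scale H, list of item Hi) for a list of item response columns."""
    k = len(items)
    cov = [[covariance(items[i], items[j]) for j in range(k)] for i in range(k)]
    cmax = [[max_covariance(items[i], items[j]) for j in range(k)] for i in range(k)]
    item_h = []
    for i in range(k):
        others = [j for j in range(k) if j != i]
        item_h.append(sum(cov[i][j] for j in others) / sum(cmax[i][j] for j in others))
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    scale_h = sum(cov[i][j] for i, j in pairs) / sum(cmax[i][j] for i, j in pairs)
    return scale_h, item_h


# Hypothetical responses of five patients to three items (one column per item).
# The patients are ordered identically on every item, so the scale is perfect.
responses = [
    [0, 1, 2, 3, 3],
    [0, 0, 1, 2, 3],
    [0, 1, 1, 2, 3],
]
scale_h, item_h = loevinger_h(responses)
```

Values close to 1 indicate that the items order patients almost identically; an item that orders patients differently from the others lowers both its own Hi and the scale H.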

Table 3 Mokken Scale analysis of the German HeartQoL

Reliability

Internal consistency reliability was confirmed with Cronbach’s alpha ranging from .89 to .92 in the total cohort, from .89 to .91 in patients with angina, from .90 to .93 in patients with MI, and from .89 to .91 in patients with ischemic heart failure (Table 2).
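For readers less familiar with the coefficient, Cronbach's alpha can be computed from the item variances and the variance of the total score, as in the following minimal Python sketch of the standard formula. The function name and the example data are hypothetical, and complete data (no missing responses) are assumed.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item response columns.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(row) for row in zip(*items)]  # total score per patient
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses of four patients to three items (one column per item).
alpha = cronbach_alpha([
    [0, 1, 2, 3],
    [0, 1, 2, 3],
    [1, 1, 2, 3],
])
```

With the thresholds used in this study, a value of at least .70 would be considered acceptable for group comparisons and a value of at least .90 for individual comparisons.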

Validity

(a) Convergent validity was confirmed in the total cohort and each diagnosis with correlations between the HeartQoL and SF-36 physical scales ranging from .62 to .78 and from .71 to .76 between the HeartQoL and SF-36 emotional scales. Correlations between dissimilar scales of both instruments were significantly lower according to Steiger’s test for comparing Pearson correlations (Table 4).

Table 4 Convergent validity of the German HeartQoL with the SF-36

(b) Discriminative validity was partially confirmed for age, sex, and disease severity, largely confirmed for the SF-36 health status and transition, and totally confirmed for anxiety, depression, and general distress (Table 5).

Table 5 Discriminative validity of the German HeartQoL
    • Age: HRQL score differences were significant on the physical subscale in patients with angina with better HRQL in young patients and on the emotional subscale in the total cohort with higher HRQL in elderly patients and in patients with MI with better HRQL in middle-aged patients.

    • Sex: HRQL score differences were only significant in the total cohort on the emotional subscale with higher HRQL in males.

    • SF-36 health status: Global HRQL score differences were always significant with higher HRQL in the total cohort and each diagnosis when patients reported excellent/very good health or good health on the SF-36 health status item when compared to patients reporting fair/poor health. Other HRQL score differences on the physical and emotional subscales are detailed in Table 5.

    • SF-36 health transition: The HRQL score differences were not as consistent with the SF-36 health transition item as with the health status item. However, patients in the total cohort and each diagnosis reporting either improved health or no change in health always had higher physical and global HRQL than patients reporting deteriorated health. Other HRQL score differences on the physical and emotional subscales are detailed in Table 5.

    • Anxiety, depression, and general distress scores: HRQL score differences were significant on each scale in the total cohort as well as in each diagnosis with higher HRQL in patients who did not report anxiety or depression scores exceeding the cut-off ≥ 8 or general distress scores ≥ 12.

    • Disease severity: HRQL score differences were significant on each scale in patients with angina with better HRQL in patients assigned to CCS grade II whereas no HRQL score differences were found in patients with ischemic heart failure.

Responsiveness

The HeartQoL physical subscale and the global scale means and the SF-36 PCS means improved significantly after both PCI and CR (Table 6). Significant improvement on the HeartQoL emotional subscale and the SF-36 MCS was only achieved with CR (p < .001). Effect sizes ranged from .31 (HeartQoL physical and global score with PCI) to .72 (HeartQoL global score with CR). The three HeartQoL effect sizes were greater with CR than those with PCI.

Table 6 Responsiveness of the German HeartQoL and the SF-36 component measures

Discussion

The German HeartQoL is a valid, reliable, and responsive HRQL instrument, and these data support its potential use in clinical practice and research to assess and compare HRQL in German-speaking IHD patients. Moreover, the brevity of the HeartQoL may prove to be helpful in clinical practice. The psychometric properties were evaluated in a sample of 305 patients with angina, MI, or ischemic heart failure from Austria and Switzerland and are consistent with the original validation study [19], the English HeartQoL version based on patients in the USA [27], the EuroAspire IV study [28], and validation studies in patients with atrial fibrillation [29], with an implantable cardioverter defibrillator [30], or following valve surgery [31]. According to Mokken analysis, the German HeartQoL factor structure is consistent with the original two-factor structure [18]. The “moderate” Hi coefficient of .48 for the eighth item in the German HeartQoL (“feeling tired, fatigued, low on energy”) may reflect ambiguous wording, a possibility that needs substantiation in a future study, as the corresponding coefficients in the original study [19] and the more recent English [27], Chinese [submitted], mixed European countries [28], and Danish [29, 30] studies were all > .50. Despite the weakness of the eighth item, the current Mokken analysis results suggest that the physical and emotional subscales are more robust than the global scale, as some items had Hi coefficients below the “strong” threshold (< .50) on the global scale.

The German HeartQoL demonstrated adequate convergent and divergent validity as well as internal consistency reliability in the total cohort and in each diagnosis. These results are similar to those of other studies using the HeartQoL, e.g., the English validation study including patients with angina or MI [27], which reported strong correlations between the matching physical and emotional scales of the HeartQoL and SF-36 (all coefficients r > .60). In the original HeartQoL study [19], Cronbach’s alpha for the physical, emotional, and global scale was .90, .81, and .91, respectively, which was confirmed for the German HeartQoL version in this study. Discriminative validity was largely confirmed for the SF-36 health status and transition items as well as for the HADS anxiety/depression scales and the general distress factor; for age and sex, however, the hypothesized lower scores in older and in female patients were not consistently observed. Although there were no HeartQoL floor effects, high ceiling effects were observed on the emotional subscale in all groups, with minimal effects on the physical and global scales. These observations, consistent with the original [19] and other validation studies [27,28,29,30,31], suggest that it may be more difficult to assess improvement in emotional HRQL than in either physical or global HRQL with the HeartQoL questionnaire. Responsiveness was confirmed with significant pre-post HeartQoL physical, emotional, and global scale score changes with CR and with significant physical and global scale score changes after PCI. These results support the assumption that an invasive functional intervention such as PCI is more likely to reduce physical limitations than emotional burden. On the other hand, comprehensive rehabilitation interventions such as CR are more likely to positively influence IHD patients on a number of different levels, e.g., physical activity, heart-healthy nutrition, psychological care, relaxation, and social support, leading to a broader effect.

Regarding the discriminative validity of the German HeartQoL and contrary to our expectation, patients with angina had significantly lower physical HRQL levels than patients with ischemic heart failure. This might be explained by the sample composition or by the fact that the patients with angina were significantly older than patients with MI or ischemic heart failure. The lower HRQL scores therefore seem to be influenced less by diagnosis than by age. Another explanation of this unexpected finding could be the timing of the assessment, as patients with angina could have been recruited during an acute phase whereas patients with ischemic heart failure were chronic and already on an optimized medical treatment regimen.

Limitations and future research

A general criticism of any “short-form” questionnaire with a relatively small number of items per scale is that the breadth of the assessment of physical or emotional HRQL may be limited. The eighth item in the HeartQoL seems to be linguistically ambiguous in the Chinese, Danish, English, and German versions, as the data indicate that it may belong to either the physical or the emotional subscale. It would be worthwhile to address this question empirically by comparing the HeartQoL head to head with other core heart disease HRQL instruments such as the MacNew. Further limitations of this study are the lack of test–retest reliability data and the small sample size of some sub-group analyses (e.g., only 10 female patients with ischemic heart failure). However, the psychometric analyses reveal that the German HeartQoL has adequate reliability and validity and is a responsive IHD-specific core HRQL instrument, demonstrating its potential in research projects where economy of instruments is at a premium. Future studies will need to confirm these results using a confirmatory approach in a different sample.

Conclusion

Psychometric characteristics of the 14-item, two-factor German version of the IHD-specific core HeartQoL questionnaire with a physical and an emotional subscale in German-speaking patients with IHD and its three major diagnoses (angina, MI, and ischemic heart failure) were examined. The German HeartQoL demonstrated excellent internal consistency reliability, adequate convergent, divergent, and discriminative validity as well as good responsiveness. Overall, the German HeartQoL can be strongly recommended for clinicians and researchers to assess and compare the impact of IHD on patients’ HRQL in German-speaking countries. The shortness of the tool may prove to be helpful in clinical practice too.