Introduction

Cardiovascular disease (CVD) affects one-third of all adults or nearly 81 million individuals in the United States [1]. Coronary heart disease (CHD) is a substantial contributor to both morbidity and mortality from CVD. CHD leading to acute myocardial infarction (MI) remains one of the most common causes of hospitalization, disability, and death in the United States [1].

CHD or an MI has physical, emotional, and social consequences. As improvements in survival of ischemic events continue, researchers and clinicians acknowledge that subjective assessment of health-related quality of life (HRQoL) is necessary as a complementary criterion for assessing prospective benefits of medical interventions [2–4]. Comparison of the impact of CHD with that of other conditions on the population level is clearly valuable for making public policy decisions incorporating cost-effectiveness [5, 6].

Population studies typically use generic HRQoL indexes [7]. It is not well known whether different generic indexes of HRQoL give consistent estimates of the impact of CHD. Some generic indexes such as the EuroQol EQ-5D (EQ-5D) and the Medical Outcomes Study Short Form-36 (SF-36v2™) have been found to be valid measures in patients with CHD [2–4, 8–10]. The EQ-5D, SF-6D, Health Utilities Index Mark 2 (HUI2) and Health Utilities Index Mark 3 (HUI3) have all been shown to be responsive to other chronic diseases in populations, such as rheumatoid arthritis [11, 12], type 2 diabetes [13], stroke [14], and intermittent claudication [15].

On the other hand, several instruments have been designed to specifically capture HRQoL with CHD or other cardiovascular conditions, and tend to be used in clinical populations [16–18] and in clinical practice [8, 10]. Comparing the performance of generic indexes to a disease-specific instrument is of interest to physician researchers who may wish to incorporate the use of generic instruments to monitor HRQoL. There is some overlap in item content of CVD-specific instruments and generic indexes, allowing investigators to potentially extract subsets of disease-specific questions to use as proxy disease-specific HRQoL indicators.

The objective of this study was to assess six widely used generic HRQoL indexes (the QWB-SA, SF-6D, EQ-5D, HUI2, HUI3, and HALex) as well as the physical (PCS) and mental health (MCS) subscales of the SF-36v2™ in a population-based sample in terms of the estimated differences in HRQoL between individuals with and without CHD and with varying CHD severity. We compare effect sizes to those of a proxy heart disease-specific index constructed from only CHD-relevant questions within the QWB-SA. A parallel sample of patients from three heart failure clinics allowed us to derive an equation to combine these questions to predict the CHD-relevant content of the Minnesota Living with Heart Failure Questionnaire® (MLHFQ) [16]. Comparison with a proxy score simulating a CVD-specific instrument provides a benchmark with which to compare the abilities of generic indexes. This comparison is valuable as clinicians will increasingly be graded on performance as judged by generic instruments [2].

Methods

Data collection

The National Health Measurement Study

The NHMS was a random-digit-dialed telephone interview of a sample of non-institutionalized U.S. adults, ages 35–89 years, living in the contiguous United States in 2005–2006 [19]. Five generic HRQoL instruments were administered in random order during the telephone interview: SF-36v2™ [20], the Health Utilities Index (HUI) [21, 22], EQ-5D [23], the Self-Administered Quality of Well-Being Scale (QWB-SA) [24], and the Health and Activities Limitations Index (HALex) [25].

Sampling was in three stages: sampling telephone numbers within telephone exchange strata, sampling an age-stratum within households, and sampling a single respondent from a selected age-stratum. Interviews were conducted in English by trained interviewers at the University of Wisconsin Survey Center using commercial computer-assisted telephone interview (CATI) software. All subjects provided verbal informed consent. The survey was approved by the Institutional Review Board at the University of Wisconsin (protocol #H-2004-0083).

A total of 3,844 participants completed the interview, representing an estimated response rate of 46%. For each participant, a sampling weight was computed based on the sampling design. Post-stratification was used to further adjust the weights for differential response rates by age, race, and sex. Fryback et al. [19] provide further details about the sampling techniques and weighting used for the NHMS.

Clinical Outcomes and Measurement of Health Study

A parallel study to the NHMS, the Clinical Outcomes and Measurement of Health Study (COMHS), was conducted at clinics for heart failure (HF) at the University of Wisconsin, University of California, San Diego and University of California, Los Angeles. Chronic heart failure cases newly referred to the clinics were eligible if the left ventricular ejection fraction was less than 50% for at least 3 months, as measured by echocardiography, radiographic ventriculography, or radionuclide ventriculography. Furthermore, to be enrolled in the study, patients had to be at least 35 years old, able to provide competent informed consent, able to hear and understand verbal instructions in English, and have sufficient vision and ability in reading and writing English to complete the questionnaires. Data collected included the generic HRQoL instruments administered in the NHMS sample as well as the disease-specific MLHFQ. The instruments were distributed to participants in paper form in a packet assembled with the generic HRQoL questionnaires in randomized order, followed by the MLHFQ. Analyses include baseline data from 154 participants who completed the packet of questionnaires at the first clinic visit. The study was approved by the Institutional Review Boards at the University of Wisconsin (protocol #M-2005-1171) and the University of California.

Generic HRQoL measures

Scoring according to the guidelines specific to each instrument yielded the preference-scored indexes SF-6D (from SF-36v2™ [26]), HUI2 and HUI3 (from the HUI), EQ-5D, QWB-SA, and HALex [21–25]. In addition, the physical and mental component scores (SF-36v2™ PCS and SF-36v2™ MCS, respectively) were computed from the SF-36v2™ [20]. For the preference-based indexes, HRQoL is measured by a single score anchored at dead (0.0) and full health (1.0) [27]. The EQ-5D, HUI2, and HUI3 allow for scores “worse than dead” with possible scores ranging from −0.11 to 1.0 for EQ-5D, −0.03 to 1.0 for HUI2, and −0.36 to 1.0 for HUI3 [23, 28]. The QWB-SA scores, excluding dead (0.0), can range from 0.09 to 1.0 [24], and SF-6D from 0.30 to 1.0 [26]. The HALex score can range from 0.10 to 1.0 [25]. PCS and MCS scores from the SF-36v2™ have a range of 0–100, with a mean score standardized at 50 and a standard deviation of 10 [20]. Fryback et al. [19] provided detailed descriptions of all instruments and established population norms for these generic indexes.

Definition of CHD subgroups

The NHMS telephone interview collected respondent-level information frequently associated with HRQoL including some details about eleven health conditions common in U.S. adults. CHD was self-reported via the question “Have you ever been told by a doctor or other health professional that you had coronary heart disease or a heart attack, also known as a myocardial infarction or MI?”

Three CHD severity subgroups were defined in the NHMS population as follows: (1) no self-reported CHD (n = 3,350), (2) self-reported CHD without current use of chest pain medication (n = 265), and (3) self-reported CHD with current use of chest pain medication (n = 218). Current chest pain medication use was self-reported via the question “Do you currently take medicine for chest pain?” Analyses exclude 11 who did not provide an answer to the CHD question.
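The subgroup rule above can be sketched in a few lines of Python. This is only an illustration: the function name and argument names are hypothetical, and treating an unanswered medication question as "not taking medication" is an assumption that the text does not address.

```python
from typing import Optional


def chd_subgroup(has_chd: Optional[bool],
                 takes_chest_pain_med: Optional[bool]) -> Optional[int]:
    """Map the two self-report answers to severity subgroup 1, 2, or 3.

    Returns None when the CHD question was not answered, mirroring the
    exclusion of those respondents from the analyses.
    """
    if has_chd is None:
        return None  # no answer to the CHD question: excluded
    if not has_chd:
        return 1  # no self-reported CHD
    # Assumption: a missing medication answer is treated as "no".
    return 3 if takes_chest_pain_med else 2


# The reported subgroup sizes plus the exclusions account for all interviews.
assert 3350 + 265 + 218 + 11 == 3844
```

Under this rule a respondent without self-reported CHD stays in subgroup 1 regardless of the medication answer, which matches the definition of subgroup 1 as "no self-reported CHD".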

Development of proxy score

CHD is a common cause of HF [29] and the conditions share symptoms. The item content of the MLHFQ emphasizes activity, mobility limitations, and worry and is similar to that of the Seattle Angina Questionnaire [17], but the latter contains several items related to chest pain. Conversely, two MLHFQ items were considered not to apply to CHD (items 1 and 14, see Table 1). Several generic indexes also contain items that resemble those in the MLHFQ (as displayed in Table 1). Table 1 shows the polychoric correlation in the COMHS sample between each ordinal MLHFQ item and its generic matches. For this and subsequent purposes, QWB-SA items, which asked whether a person had symptoms during the past 3 days, were dichotomized into whether a person had the symptom at all (1) or not (0).

Table 1 Individual items selected for analysis from generic HRQoL instruments

The QWB-SA had the largest number (11) of items matching the MLHFQ, all of which had polychoric correlation of >0.40 with the corresponding MLHFQ item. The MLHFQ total score was recomputed in the COMHS sample, without the two items deemed applicable only to HF, as the sum of the remaining items rescaled to the range of the original MLHFQ. The CVD-specific proxy instrument was developed by linear regression of this modified MLHFQ total score on the matched generic items from the QWB-SA. The resulting regression coefficients were used to create a scoring algorithm for the proxy score, shown in the following equation, where the predictors are individual item values from the QWB-SA items listed in Table 1. The equation lists the QWB-SA items in the same order as they appear in Table 1, where the complete wording of each item can be found.
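The rescaling of the modified MLHFQ total can be illustrated as follows. This sketch assumes the standard MLHFQ layout of 21 items each scored 0 to 5 (total range 0 to 105), which is not stated in the text, and the function and parameter names are hypothetical.

```python
from typing import Dict, Iterable


def modified_mlhfq_total(item_scores: Dict[int, int],
                         dropped: Iterable[int] = (1, 14),
                         n_items: int = 21,
                         item_max: int = 5) -> float:
    """Sum the retained MLHFQ items and rescale to the original total range.

    item_scores maps item number (1..n_items) to its score (0..item_max).
    Items 1 and 14 are dropped, as in the text; n_items and item_max are
    assumptions about the instrument, not values taken from the paper.
    """
    dropped = set(dropped)
    kept = [score for item, score in item_scores.items() if item not in dropped]
    if len(kept) != n_items - len(dropped):
        raise ValueError("expected a score for every retained item")
    original_max = n_items * item_max                 # 105 under the assumptions
    kept_max = (n_items - len(dropped)) * item_max    # 95 under the assumptions
    return sum(kept) * original_max / kept_max
```

With these assumptions the rescaling factor is 105/95, so a respondent at the maximum on every retained item still scores 105, and the modified total stays on the scale of the original questionnaire.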

$$ \begin{aligned} {\text{Proxy\;score}} & = 25.7 + 9.1 \times {\text{bed}} + 9 \times {\text{walking}} + 3.9 \times {\text{work}} + 7 \times {\text{sleep}} + 1.7 \times {\text{social}} + 3.1 \times {\text{sex}} \\ & + 2.5 \times {\text{diet}} + 7.5 \times {\text{breathing}} + 17.7 \times {\text{nocontrol}} + 0.54 \times {\text{worry}} - 0.6 \times {\text{confuse}} \\ \end{aligned} $$
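As a worked illustration, the scoring equation can be written as a small Python function. The coefficients are copied from the equation; the dictionary keys reuse its short item labels, and the assumption that every item is coded 0 or 1 follows the dichotomization described for the QWB-SA symptom items. The function name itself is hypothetical.

```python
from typing import Mapping

PROXY_INTERCEPT = 25.7

# Regression coefficients, in the order used in the equation.
PROXY_COEFFICIENTS = {
    "bed": 9.1,
    "walking": 9.0,
    "work": 3.9,
    "sleep": 7.0,
    "social": 1.7,
    "sex": 3.1,
    "diet": 2.5,
    "breathing": 7.5,
    "nocontrol": 17.7,
    "worry": 0.54,
    "confuse": -0.6,
}


def proxy_score(items: Mapping[str, float]) -> float:
    """Compute the proxy score from item values (assumed coded 0 or 1)."""
    missing = set(PROXY_COEFFICIENTS) - set(items)
    if missing:
        raise ValueError(f"missing items: {sorted(missing)}")
    return PROXY_INTERCEPT + sum(
        coef * items[name] for name, coef in PROXY_COEFFICIENTS.items()
    )
```

A respondent reporting none of the eleven problems receives the intercept, 25.7, and each reported problem other than confusion moves the score upward, toward worse HRQoL.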

The negative sign of the statistically non-significant coefficient of the QWB item measuring confusion is due to it having a negative polychoric correlation (−0.18) with reporting side effects from treatments (item 16) on the MLHFQ. The proxy score correlated with the modified MLHFQ at r = 0.82.

Statistical analyses

All analyses were performed using SAS version 9.0 software (The SAS Institute, Cary, NC). To produce nationally representative estimates of index means and differences, further analyses incorporated trimmed post-stratification sampling weights and accounted for telephone exchange strata.

Weighted means and standard deviations of the generic instruments and proxy score within CHD subgroups were computed. Higher scores indicate better HRQoL on the generic measures. Because the CVD-specific proxy score was developed to resemble the MLHFQ, scoring is reversed for this index: higher scores represent more problematic symptoms and thus worse HRQoL. Both unadjusted and adjusted differences in mean scores were estimated, and the statistical significance of group differences was assessed by F-tests implemented in SAS PROC SURVEYREG. Differences were first adjusted in a joint model across groups for age (as a continuous predictor), race (white, black, and other categories), and sex, and then additionally for arthritis, respiratory disease, and diabetes (comorbidities that share symptoms with CHD).
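For readers unfamiliar with weighted summaries, the following Python sketch shows one simple way to compute a sampling-weighted mean and standard deviation for a subgroup. It is not the SAS survey procedure used in the study: it ignores the telephone exchange strata and uses a plain weighted-population formula, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def weighted_mean_sd(values: Sequence[float],
                     weights: Sequence[float]) -> Tuple[float, float]:
    """Return the weighted mean and weighted (population-style) SD.

    Simplification: no stratification and no design-based variance
    estimation, unlike a survey procedure.
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    mean = sum(w * v for v, w in zip(values, weights)) / total_weight
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total_weight
    return mean, math.sqrt(variance)
```

A respondent with a larger sampling weight stands in for more people in the target population, so that respondent's score pulls the estimate proportionally harder.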

Standardized group differences were estimated from the means adjusted for age, race, and sex, and the residual standard deviation of the adjustment model. An effect size of 1 corresponds to a one standard deviation difference in magnitude. Guidelines for interpreting standardized differences are well established, with 0.2–0.5 representing a small effect size, 0.5–0.8 medium, and >0.8 large [30]. Weighted Pearson partial correlations, adjusted for age, race, and sex between the proxy score and the scores for all generic instruments were also obtained.
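The standardized difference and its interpretation bands can be sketched as below. The function names are hypothetical, the label "negligible" for values under 0.2 is an assumption (the cited guideline as summarized here names only small, medium, and large), and the boundary values 0.5 and 0.8 are assigned to the medium band as a tie-breaking choice.

```python
def standardized_difference(adjusted_mean_a: float,
                            adjusted_mean_b: float,
                            residual_sd: float) -> float:
    """Adjusted mean difference divided by the model's residual SD."""
    if residual_sd <= 0:
        raise ValueError("residual_sd must be positive")
    return (adjusted_mean_a - adjusted_mean_b) / residual_sd


def effect_size_label(effect_size: float) -> str:
    """Classify the magnitude of an effect size using the bands in the text."""
    magnitude = abs(effect_size)
    if magnitude > 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "negligible"  # assumption: label for values below 0.2
```

For example, with purely hypothetical adjusted means of 0.80 and 0.70 and a residual standard deviation of 0.20, the standardized difference is about 0.5, at the lower edge of the medium band.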

Results

The three CHD severity subgroups were described by unweighted statistics (Table 2). Mean scores for the proxy index and each of the generic indexes weighted to the U.S. population are also reported (Table 3). Those without CHD have the lowest proxy score, followed by those with only CHD and those with CHD plus chest pain medication use. This suggests that higher proxy scores reflect worse CVD-related health. All generic score means for respondents with CHD are lower than for those without CHD. Differences in unadjusted and adjusted mean scores between the three CHD subgroups were calculated (Table 4) and were significant for all indexes (P = 0.018 for SF-36v2™ MCS, all others P < 0.0001). The minimally important difference is considered to be 0.03 for the EQ-5D, QWB-SA, HUI2, and HUI3, 0.033 for the SF-6D, and 5 for the SF-36v2™ MCS and SF-36v2™ PCS [31, 32]. Unadjusted and adjusted mean score differences for these generic indexes between all CHD severity subgroups exceeded these minimally important differences, with the exception of the SF-36v2™ MCS. Differences between CHD subgroups 1 and 3 tended to be 2–3 times greater than the differences between CHD subgroups 1 and 2. Adjusted differences controlling for diabetes, arthritis, and chronic respiratory disease were smaller than unadjusted differences, but remained statistically and clinically significant.
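The comparison against minimally important differences amounts to a threshold check per index, sketched here in Python. The thresholds are those quoted in the paragraph above; the dictionary keys and the function name are hypothetical, and no threshold is listed for the HALex because none is given in the text.

```python
# Minimally important differences quoted in the text [31, 32].
MINIMALLY_IMPORTANT_DIFFERENCE = {
    "EQ-5D": 0.03,
    "QWB-SA": 0.03,
    "HUI2": 0.03,
    "HUI3": 0.03,
    "SF-6D": 0.033,
    "SF-36v2 PCS": 5.0,
    "SF-36v2 MCS": 5.0,
}


def exceeds_mid(index_name: str, mean_difference: float) -> bool:
    """True when the absolute mean difference exceeds the index's threshold.

    Raises KeyError for indexes without a quoted threshold (e.g. HALex).
    """
    threshold = MINIMALLY_IMPORTANT_DIFFERENCE[index_name]
    return abs(mean_difference) > threshold
```

The absolute value is used because the sign of a difference depends only on which subgroup is subtracted from which; the actual differences are those reported in Table 4.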

Table 2 Descriptive statistics for NHMS sample (unweighted)
Table 3 Mean HRQoL scores and standard deviations weighted to US population
Table 4 Unadjusted and adjusted differences in mean scores between CHD groups

Three effect sizes were calculated for each instrument: those with CHD without chest pain medications compared to those without CHD (subgroup 2 vs. 1), those with CHD using chest pain medications compared to those with CHD without chest pain medications (subgroup 3 vs. 2), and those with CHD using chest pain medications compared to those without CHD (subgroup 3 vs. 1). The effect sizes are shown in Table 5. The results show the HALex, the SF-36v2™ PCS and the proxy score to have the largest effect sizes in all comparisons, and the SF-36v2™ MCS to have the lowest. However, while the HUI2 and HUI3 differentiate next best between the CHD groups taking and not taking chest pain medication, the QWB-SA has a larger effect size between those with CHD without chest pain medication and those without CHD. All measures except the SF-36v2™ MCS have strong effect sizes between those with CHD taking chest pain medication and those without CHD.

Table 5 Effect sizes between CHD severity groups

Partial correlations demonstrated that all of the generic indexes correlated highly with the CVD-specific proxy score, in both the NHMS sample as a whole and in a subgroup of only those with self-reported CHD (severity subgroups 2 and 3 combined) (Table 6).

Table 6 Correlations between proxy score and generic indexes, partial on age, sex, and race

Discussion

This study is the first to examine the abilities of six simultaneously administered generic instruments to detect HRQoL differences related to CHD in a cross-sectional, nationally representative sample of U.S. adults. The total scores for all indexes demonstrated ability to differentiate between individuals with and without CHD, and between CHD severity subgroups defined by self-reports of taking or not taking medication for chest pain. The generic indexes correlated highly with a proxy CVD-specific index. While the QWB-SA and SF-36v2™ appeared to have the greatest overlap of questions with heart-specific instruments, it is worth noting that these generic indexes did not display larger effect sizes than the other indexes. Notably, the HUI2, HUI3, and HALex have large effect sizes, and also correlate highly with the proxy index. It is likely that much of the equivalence between measures is caused not only by items that are explicitly similar, but also by the fact that heart disease may cause many general health problems. Based on these findings, it appears that administering CHD-specific instruments to general population samples will be of limited value. These findings may also be of interest to clinicians, as there is increasing interest in the administration of generic HRQoL indexes to monitor patients in the ambulatory setting [2]. Items within generic measures may offer much of the information captured by disease-targeted approaches. Generic measures might be adapted to offer both general and disease-specific assessment.

There is relatively little difference between the generic indexes in their sensitivity to CHD-related HRQoL. Effect sizes were of similar magnitude to that of the proxy score for the MLHFQ, even between severity subgroups 2 and 1. Much CHD in this lower severity group could be asymptomatic, and part of the effect on HRQoL may be through the diagnostic label itself. Part of the HALex total score is based on a self-reported health scale, while other indexes ask respondents to report functioning, not feelings. This difference may be important for conditions that are serious but not associated with many symptoms. HUI3 and HUI2 have higher effect sizes and absolute differences with the CHD group taking chest pain medication, while QWB-SA has a greater effect size with the CHD group not taking chest pain medication. This finding is consistent with the HUI3 having large score decrements with health states at the lowest range of health, while the QWB-SA contains more items sensitive at the higher end of health.

The analyses presented in this study have limitations. One limitation is that the proxy CVD-specific index is not a validated, disease-specific instrument such as the Seattle Angina Questionnaire [17]. Although there is overlap in item content, questions in our proxy score are not as specific with respect to physical functioning with CHD as those in the Seattle Angina Questionnaire. Our score also does not contain questions specific to chest pain, which may have led to lower sensitivity to CHD.

Another limitation is that both CHD and current chest pain medication use were self-reported in the NHMS population, and the study design did not include verification of self-report with clinical records. The accuracy of self-report for MI was investigated by Heckbert et al. [33] in the Women’s Health Initiative Study, and good agreement was reported between self-report and physician review of medical records (kappa = 0.64). Specificity was very high at 99%, while sensitivity was lower at 64%. Based on this report, HRQoL differences in our study may be somewhat attenuated, as some individuals may have been diagnosed with CHD but did not report it and some patients with symptomatic CHD have not been diagnosed. Furthermore, some individuals may have reported chest pain medication use if they have a prescription for nitroglycerin, regardless of how often or infrequently they need to use it. Such circumstances would all lead to our effect sizes being underestimated, lending further support to the ability of the generic indexes to differentiate these CHD subgroups.

As with any data obtained via survey, differential participation and response rates between groups are a limitation. Telephone surveys are particularly limited, as calls are often screened and an increasing number of households rely only on cellular phones, which are not included in random-digit-dialed household sampling. However, it has been reported that at the time the NHMS was conducted, this seems to have had little effect on population health estimates [34]. Furthermore, as several different HRQoL indexes were administered, the length of the interview and the time required to complete it may have led to the selection of participants with higher education and/or better health. This would likely have resulted in underestimation of the differences in indexes between CHD subgroups.

Despite these limitations, our results contribute an important finding to the field of cardiovascular research. Generic indexes can capture differences in HRQoL between populations with and without CHD. These differences are similar to those detected by questions specifically targeted at cardiovascular disease, and appear to also be valid as an indication of disease severity within a CHD population.