Introduction

Health-related quality of life (HRQL), as an integral aspect of subjective patient-reported health status, has become an increasingly important health care outcome measure, especially in patients with chronic diseases such as cardiovascular disease [1–3]. In 2012, cardiovascular diseases were the number one cause of death globally, with about 17.5 million people dying of cardiovascular diseases, or 31 % of all global deaths; of these deaths, approximately 7.4 million (42 %) were due to ischemic heart disease (IHD) [4]. Patient-reported health status, including HRQL, is predictive of mortality, cardiovascular events, hospitalization and costs of care in patients with cardiovascular disease; despite this, instruments to assess patient-reported health status are underused in clinical practice [1, 2].

Attributes and criteria for HRQL instruments are important as quality indicators. Key attributes of HRQL instruments include the conceptual and measurement model, reliability, validity, language adaptations and interpretability [5]. The key to the interpretation of HRQL is having reference data. Without such data, it is difficult for the user to assess the meaning of the scores because benchmark values are missing. For example, reference data allow a determination of whether group or individual HRQL scores and standard deviations are below, similar to, or above those of a reference group, thus placing them in context; furthermore, comparing percentiles and minimum/maximum values in a study sample can provide useful information on the distribution of HRQL scores [6]. A number of instruments have been developed to quantify HRQL, and some HRQL manuals offer population norms and distributions relating to gender, age or disease [7, 8], while various studies have provided within-country [9–11] and between-country [12, 13] HRQL comparative data. The Short Form-36 health survey (SF-36) [8, 12, 13] is arguably one of the most widely used generic health-related quality of life measures in the general population and also in patients with IHD [12–19].

The International Quality of Life Assessment (IQOLA) Project [12, 13] was a comprehensive project to translate, adapt and validate the SF-36 internationally in patients with chronic disease. Patients with congestive heart failure reported the second lowest SF-36 physical health, while the “effect of ischemic heart disease on a number of physical health scales was noteworthy.” The EuroAspire III study [20] used the SF-12 in patients with IHD, where lower HRQL estimates were found in women, older patients and patients from Eastern European countries. Soto et al. [15] also reported lower physical health scores for females and older patients and, moreover, for patients with myocardial infarction (MI) compared to patients with angina. A similar distribution was found in a study by Alphin et al. [16], where patients with heart failure reported the lowest physical health, followed by patients with MI and then angina. The impact of a chronic condition on mental health was always lower than on physical health. Meta-analyses or systematic reviews showed that the SF-36 also correlates with disease-specific questionnaires, such as those for heart failure [17], and is a predictor of health status [18], confirming its broad area of application [19].

Despite the widespread use of the SF-36 in different populations, to our knowledge, no international IHD reference data in patients with angina, MI or ischemic heart failure are available. As a result, international comparisons (especially including Eastern European countries) with data acquired on the basis of either one defined study protocol or studies by independent researchers are unavailable. Therefore, the aim of this report is to present international reference data for the SF-36 (including sub-analyses on the effect of, e.g., diagnosis, gender and age) based on a sample of 5508 adult patients with IHD and a diagnosis of angina, MI or ischemic heart failure living in one of 22 countries and speaking one of 15 languages.

Methods

Sample

The data analyzed in this study were generated in the HeartQoL Project, in which a new HRQL disease-specific questionnaire for patients with IHD (the HeartQoL questionnaire) was developed and validated [21, 22]. The HeartQoL Project is an international HRQL survey conducted between 2002 and 2011, including 6384 patients with documented angina, MI or ischemic heart failure living in five regions (Eastern, Northern, Southern and Western European regions and an English-speaking region) with a total of 22 countries where 15 languages are spoken: Danish, Dutch, English (Australia, Canada, Ireland, UK, USA), French, Flemish, German (Austria, Germany, Switzerland), Hungarian, Italian, Norwegian, Polish, Portuguese, Russian, Spanish (Cuba, Spain), Swedish and Ukrainian. Each of the sites (N = 67) received local Ethics Committee or Institutional Review Board approval. Following specific protocol directions (e.g., guidelines were specified in each language for identifying crucial disease symptoms, universal inclusion criteria, workshops on standardized surveying), participating physician investigators at a total of 67 sites (hospital cardiology clinics and cardiac rehabilitation programs) identified eligible patients when seen at their clinic visit; the nature and purpose of the study were then explained, and, with written informed consent, eligible patients were enrolled in the study.

Eligibility criteria

All HeartQoL Project patients had to be at least 18 years old, not currently be substance abusers, not have a serious psychiatric disorder, be considered by the referring physician to be able to complete a self-administered battery of HRQL instruments in the particular language and not have been hospitalized during the past 6 weeks [21, 22]. The patients' physicians reported the primary diagnosis of angina, MI or ischemic heart failure according to the following criteria:

  1. Currently treated for angina (typical chest pain, Canadian Cardiovascular Society (CCS) class II, III or IV) with an objective measure of IHD (exercise testing, echocardiogram, nuclear imaging or angiography); or

  2. Experienced a documented MI between 1 and 6 months previously, including chest discomfort, electrocardiogram changes indicative of MI and positive creatine kinase or troponin rise; or

  3. Currently treated for ischemic heart failure (New York Heart Association (NYHA) class II, III, or IV) with evidence of left ventricular dysfunction (ejection fraction <40 % by invasive or noninvasive testing) and IHD (previous MI, exercise testing, echocardiogram, nuclear imaging or angiography). Other underlying heart failure diagnoses were excluded.

SF-36 health survey (version 1)

The SF-36 [8] consists of 36 items, each scored in one of eight scales (Physical Functioning, Role-Physical, Bodily Pain, General Health, Vitality, Social Functioning, Role-Emotional, Mental Health), which then form two distinct higher-ordered clusters: the physical component summary (PCS) and the mental component summary (MCS) measures. Data from the eight scales are presented as raw values (0–100). PCS/MCS data are presented as T-scores with a mean (M) of 50 and a standard deviation (SD) of 10, with higher scores indicating better HRQL. The instrument meets required psychometric standards [5] and is one of the most widely used generic HRQL measures in patients with IHD [12–19]. Reference data on the eight scales and the higher-ordered PCS/MCS measures are available [7, 8]. The first SF-36 question “In general, would you say your health is…excellent/very good/good/fair/poor?” was used to characterize the current health status of the sample (Table 1).
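The T-score metric described above can be illustrated with a minimal sketch. It shows only the final linear rescaling from a standardized z-score to a scale with mean 50 and SD 10; it does not reproduce the SF-36 scoring algorithm itself, whose scale weights are not given here. The function names `t_score` and `sd_below_norm` are hypothetical and chosen for illustration.

```python
def t_score(z: float, mean: float = 50.0, sd: float = 10.0) -> float:
    """Rescale a standardized z-score to a T-score (norm mean 50, SD 10)."""
    return mean + sd * z


def sd_below_norm(t: float, mean: float = 50.0, sd: float = 10.0) -> float:
    """Number of norm SDs by which a T-score lies below the norm mean.

    Positive values mean the score is below the norm mean,
    negative values mean it is above.
    """
    return (mean - t) / sd


if __name__ == "__main__":
    print(t_score(-1.0))        # a z-score one SD below the norm mean
    print(sd_below_norm(35.0))  # distance of a T-score of 35 from the norm mean
```

Under this convention, a score is "more than one SD below the standardized mean" when `sd_below_norm` returns a value greater than 1.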

Statistical analyses

Data are presented as M ± SD, lowest and highest SF-36 scores, medians, and 25th and 75th percentiles, together with Cronbach’s Alpha (α), for the following groupings: (a) socio-demographic characteristics (age, gender, education, family status); (b) clinical characteristics (angina, MI or ischemic heart failure including the SF-36 question on self-reported health, disease severity, risk factors); and (c) geographic regions. Sites were located in Eastern Europe (EE): Hungary, Poland, Russia, Ukraine (N = 4 sites); English-speaking countries (ES): Australia, Canada, Ireland/UK, USA (N = 19 sites); Scandinavia (Sc): Denmark, Norway, Sweden (N = 11 sites); Southern Europe (SE): Italy, Spain, Portugal (N = 10 sites); Cuba (N = 1 site); and Western Europe (WE): Austria/Germany/Switzerland, Belgium, France, Netherlands (N = 22 sites). We combined German language data from Austria, Germany and Switzerland and European English language data from Ireland and UK. Cuban and Spanish patients were originally pooled in the parent HeartQoL Project into a “Spanish-speaking group” to maximize variance but, within this analysis, patients from Cuba were examined separately due to economic and cultural differences. Means are provided for the PCS and MCS measures and raw values for the eight SF-36 scales. SPSS 21 was used for all statistical analyses (descriptive statistics; crosstabs; independent-samples t tests; uni- and multivariate analyses of variance (ANOVA; MANOVA) with Bonferroni correction (BC) investigating group differences; z tests for population proportions), and effect sizes (partial eta-squared (ηp²) or Cohen’s d) are reported. Incomplete datasets (e.g., missing values on age, gender or diagnosis, incomplete SF-36 data) and SF-36 outliers with standardized z-scores in excess of 3.29 [23] were excluded from the total dataset (described in the Limitations section), leading to a cohort of 5508 patients.
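The outlier rule above (standardized z-scores in excess of 3.29) can be sketched as follows. This is an illustration only, not the authors' SPSS procedure: it assumes the z-score is computed against the sample mean and sample SD of a single score variable, and the function name `exclude_outliers` and the example values are hypothetical.

```python
from statistics import mean, stdev

Z_CUTOFF = 3.29  # threshold stated in the text


def exclude_outliers(scores, cutoff=Z_CUTOFF):
    """Return the scores whose absolute z-score does not exceed the cutoff.

    Assumption: z is computed from the sample mean and sample SD of the
    scores passed in; the paper does not spell out this detail.
    """
    m = mean(scores)
    s = stdev(scores)
    if s == 0:
        # All values identical: nothing can be an outlier.
        return list(scores)
    return [x for x in scores if abs((x - m) / s) <= cutoff]


if __name__ == "__main__":
    # Hypothetical data: 19 identical scores and one extreme value.
    example = [50.0] * 19 + [0.0]
    kept = exclude_outliers(example)
    print(len(example), "->", len(kept))
```

Note that with a sample SD, a single extreme value can only reach |z| > 3.29 in reasonably large samples, which is why the hypothetical example uses 20 observations.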

Results

Socio-demographic and clinical characteristics (Table 1)

In this analysis, there were 1836 patients (33.3 %) with documented angina, 2086 (37.9 %) with a documented MI and 1586 (28.8 %) with documented ischemic heart failure. Socio-demographic and clinical characteristics are detailed in Table 1. Based on the first SF-36 question “In general, would you say your health is…?”, only 1.8 % of the patients rated their health as “excellent”, 47.3 % rated their health as being either “very good” or “good”, 41.5 % rated their health as “fair” and 9.4 % rated their health as “poor”. There were 1160 patients from EE (21.1 %), 1231 from ES (22.3 %), 856 from Sc (15.5 %), 793 from SE (14.4 %), 1309 from WE (23.8 %) and 159 from Cuba (2.9 %).
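The sample composition reported above can be re-derived with simple arithmetic. The sketch below uses only counts stated in the text (5508 analyzed patients out of 6384 enrolled); the helper name `percent_shares` is hypothetical.

```python
def percent_shares(counts):
    """Return each group's share of the total in percent, rounded to 1 decimal."""
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


# Counts as reported in the text.
diagnosis = {"angina": 1836, "MI": 2086, "ischemic heart failure": 1586}
region = {"EE": 1160, "ES": 1231, "Sc": 856, "SE": 793, "WE": 1309, "Cuba": 159}

if __name__ == "__main__":
    print(sum(diagnosis.values()), percent_shares(diagnosis))
    print(sum(region.values()), percent_shares(region))
    print("excluded:", 6384 - sum(diagnosis.values()))
```

Both groupings sum to the analyzed cohort of 5508, and the difference from the 6384 enrolled patients gives the number of excluded participants.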

Table 1 Socio-demographic characteristics including diagnosis, age, gender, family status, education, self-reported health, region, risk factors and disease severity (data missing if sample sizes do not equal N or 100 % for each group)

SF-36 component summary measures: diagnosis, gender and age (Table 2)

All M ± SD, minimum, maximum, 25th, 50th and 75th percentile PCS/MCS scores and Cronbach’s Alpha for diagnosis, gender and age are given in Table 2.

Table 2 Physical component summary (PCS) and mental component summary (MCS) mean scores, standard deviations and Cronbach’s Alpha by diagnosis, gender and age; grouped by a combination of gender-diagnosis, age-diagnosis and age-gender-diagnosis (mean PCS or MCS scores >1 standard deviation below the standardized mean of 50 are bold)

Mean PCS and MCS scores

The mean PCS score in the cohort was 39.8 ± 9.9, more than one SD below the standardized M of 50, and the mean MCS score of 47.7 ± 10.6 was within the normal range of 40–60.

Diagnosis

A main effect of diagnosis on mean PCS scores was found (ANOVA; F(2, 5505) = 274, p < 0.001, ηp² = 0.091). Post hoc analyses using BC indicated that mean PCS scores of patients with MI were significantly higher than in patients with either angina (p < 0.001, d = 0.53) or ischemic heart failure (p < 0.001, d = 0.74); patients with ischemic heart failure also had a significantly lower mean PCS score than patients with angina (p < 0.001, d = 0.21). All MCS scores were within the normal range and did not differ statistically (ANOVA; F(2, 5505) = 2.5, p = 0.083, ηp² = 0.001).
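For readers who want to relate the reported F statistics to the partial eta-squared effect sizes, the standard identity for a single effect, partial eta-squared = (F × df_effect) / (F × df_effect + df_error), can be sketched as below. This is a reader-side consistency check under that textbook formula, not a reproduction of the SPSS output; the function name is hypothetical.

```python
def partial_eta_squared(f: float, df_effect: int, df_error: int) -> float:
    """Partial eta-squared recovered from an F statistic and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)


if __name__ == "__main__":
    # Diagnosis effect on PCS as reported: F(2, 5505) = 274
    print(round(partial_eta_squared(274, 2, 5505), 3))  # 0.091
    # Diagnosis effect on MCS as reported: F(2, 5505) = 2.5
    print(round(partial_eta_squared(2.5, 2, 5505), 3))  # 0.001
```

Because the published F values are rounded, small discrepancies in the third decimal place are to be expected for other effects.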

Gender

An independent-samples t test indicated that mean PCS scores were significantly lower for women (37.0 ± 9.9) than for men (40.6 ± 9.7; t(5506) = 11.7, p < 0.001, d = 0.37). Although mean MCS scores were in the normal range for both genders, female patients reported lower scores (45.9 ± 10.9) than male patients (48.2 ± 10.4; t(2101) = 6.9, p < 0.001, d = 0.22).
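Cohen's d for the gender comparison can be approximated from the reported means and SDs. The sketch below assumes an equal-weight pooled SD, because the group sizes are not stated in this section; the authors may have used a sample-size-weighted pooled SD, so this is an approximation rather than their exact computation. The function name `cohens_d` is hypothetical.

```python
from math import sqrt


def cohens_d(m1: float, sd1: float, m2: float, sd2: float) -> float:
    """Cohen's d using an equal-weight pooled SD (assumption: group sizes unknown)."""
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd


if __name__ == "__main__":
    # PCS: men 40.6 ± 9.7 vs. women 37.0 ± 9.9
    print(round(cohens_d(40.6, 9.7, 37.0, 9.9), 2))    # 0.37
    # MCS: men 48.2 ± 10.4 vs. women 45.9 ± 10.9
    print(round(cohens_d(48.2, 10.4, 45.9, 10.9), 2))  # 0.22
```

With these inputs the approximation agrees with the effect sizes reported in the text to two decimal places.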

Age

A main effect of age on mean PCS scores was found (ANOVA; F(3, 5504) = 38.4, p < 0.001, ηp² = 0.020). According to post hoc analyses, patients <51 years reported significantly higher mean PCS scores compared to all other age groups (BC, all p < 0.001; vs. 51–60 years d = 0.22; vs. 61–70 years d = 0.27; vs. >70 years d = 0.46). Patients >70 years had the lowest mean PCS scores (BC, all p < 0.001; vs. 51–60 years d = 0.24; vs. 61–70 years d = 0.19). A main effect of age was also found on mean MCS scores (ANOVA; F(3, 5504) = 31.6, p < 0.001, ηp² = 0.017), with patients <51 and 51–60 years reporting significantly lower MCS scores than both older age groups (BC, all p < 0.001; <51 vs. 61–70 years d = 0.27 and vs. >70 years d = 0.32; 51–60 vs. 61–70 years d = 0.22 and vs. >70 years d = 0.27).

Interaction effects

Sub-analyses investigating interaction effects of diagnosis × gender, diagnosis × age, gender × age and diagnosis × gender × age showed no significant results for either the PCS or the MCS measure, except for gender × age on PCS (MANOVA; F(3, 5500) = 4.19, p = 0.006, ηp² = 0.002) and diagnosis × age on MCS (MANOVA; F(6, 5496) = 2.71, p = 0.012, ηp² = 0.003). However, the effect sizes for both are negligible.

SF-36 component summary measures: region and country within region (Table 3)

All M ± SD, minimum, maximum, 25th, 50th and 75th percentile PCS/MCS scores and Cronbach’s Alpha for each region and country are given in Table 3. All means were tested for possible influences of the different data collecting sites, and all differences were less than the minimal important difference.

Table 3 Physical component summary (PCS) and mental component summary (MCS) mean scores, standard deviations and Cronbach’s Alpha grouped by region and countries (mean PCS or MCS scores >1 standard deviation below the standardized mean of 50 are bold)

PCS scores

By region, the mean PCS score was more than one SD below the standardized M = 50 in Eastern Europe and the English-speaking region. The mean PCS score in EE was 35.9 ± 8.5, with each country in this region scoring more than one SD below M = 50. In the ES region, the mean PCS score was 39.7 ± 10.9, with only Canada within the normal range. In Sc, the mean PCS score was 40.5 ± 10.1; only Norway scored more than one SD below M = 50. In SE and WE, the mean PCS scores (40.9 ± 9.2; 42.4 ± 9.3) in all countries were within the normal range. In Cuba, the mean PCS score was 38.4 ± 8.6, significantly different from Spain (40.6 ± 10.0; t(372) = 2.41, p = 0.016, d = 0.37).

A main effect of the region (ANOVA; F(4, 5503) = 73.6, p < 0.001, ηp² = 0.051) and the country (ANOVA; F(17, 5490) = 23.8, p < 0.001, ηp² = 0.069) was found on mean PCS scores. Post hoc analyses indicated that patients in EE had significantly lower mean PCS scores than in each other region (BC, all p < 0.001; vs. ES, d = 0.39; vs. Sc, d = 0.49; vs. SE, d = 0.56; vs. WE, d = 0.73) and patients in WE had the highest mean PCS score when compared to each other region (BC, all p < 0.01; vs. ES, d = 0.27; vs. Sc, d = 0.20; vs. SE, d = 0.16). Russia had the lowest mean PCS score, which was significantly lower than in all other countries except Hungary and Ukraine (p < 0.001); the PCS scores in Austria/Germany/Switzerland, the Netherlands and Belgium were significantly higher than those reported in Russia, Hungary, Ukraine, Australia, Poland, Ireland/UK, Norway and the USA (all p < 0.001).

MCS scores

By region, each mean MCS score was within ±1 SD of the standardized M = 50, ranging from 44.7 ± 10.0 in EE to 49.8 ± 10.1 in the ES region. A main effect of region (ANOVA; F(4, 5503) = 47.5, p < 0.001, ηp² = 0.033) and country (ANOVA; F(17, 5490) = 26.2, p < 0.001, ηp² = 0.075) was found on mean MCS scores. Post hoc analyses indicated that patients in EE had significantly lower mean MCS scores than in each other region, although they were all within the normal range (BC, all p < 0.001; vs. ES d = 0.50; vs. Sc d = 0.49; vs. SE d = 0.31; vs. WE d = 0.19). Patients in the ES region had higher mean MCS scores than SE and WE patients (BC, all p < 0.01; vs. SE d = 0.18; vs. WE d = 0.29), and so did patients from Sc (BC, all p < 0.02; vs. SE d = 0.17; vs. WE d = 0.28). France had the lowest mean MCS score, which was significantly lower than in all other countries (p < 0.001) except Russia and Ukraine. Patients in Spain, Canada and Ireland/UK had the highest mean MCS scores, which were significantly higher than those of patients from France, Russia, Ukraine, Portugal, Hungary, Austria/Germany/Switzerland and Italy (all p < 0.001).

SF-36 scales: age, gender, diagnosis, region and country (Table 4)

The raw values of the eight scales give more detailed information than PCS and MCS values and are presented in detail in Table 4. As further country-specific reference data would be helpful, supplementary material presenting reference values for each country by diagnosis, age and gender separately is available [online resource 1].

Table 4 SF-36 scales (raw values 0–100) grouped by diagnosis, gender, age and region including countries

Discussion

The key to the interpretation of HRQL as an outcome is having reference values, as these are useful for users of an instrument who wish to place their results in an appropriate context by comparing their scores to a reference group [6]. Without reference values, it is difficult for users, whether clinicians, researchers or policy makers, to assess the meaning of the comparative or relative scores. This is the first dedicated study presenting international generic HRQL (SF-36) reference values for IHD, allowing comparisons across angina, MI and ischemic heart failure as well as across 22 countries in one publication. The pattern of all HRQL differences observed in this report is similar to findings of other studies using the SF-36 [15, 16] or SF-12 [20]. The SF-36 scores reported here by 5508 patients substantiate findings from other studies (e.g., EuroAspire) [20], therefore contributing to the establishment of relationships between HRQL and variables such as diagnosis, age, gender or region.

The mean PCS score in the analyzed cohort with IHD was below the normal range of 40–60, whereas the mean MCS score was within this range. Diagnosis exerted an influence on physical health, with patients with MI always reporting the highest mean PCS scores. However, patients with angina and ischemic heart failure were more than one SD below the standardized PCS mean score. On the other hand, cardiac diagnosis had no effect on the mean MCS scores, which were all within the normal range. These results indicate that angina and especially ischemic heart failure have a greater impact on physical health than MI. This might be due to their different subsequent physical limitations and chronicity. Furthermore, cardiac diseases apparently have a stronger influence on physical than on mental health. Females reported worse physical and mental health than males. This may indicate a greater subjectively perceived physical and mental burden in women. Worse physical health was also reported by older patients (especially those >70 years), whereas younger patients (especially those <51 years) had lower mean MCS scores in comparison. These results suggest that younger patients may be fitter and therefore able to handle the physical strain of a cardiac disease better than older patients. In contrast, older patients may perceive less mental stress because more coping strategies are available to them or because of a feeling of “normality” when they are confronted with a cardiac disease at a greater age. Finally, patients from EE had significantly lower physical and mental health scores than patients from all other regions. These PCS and MCS results, with patients from EE reporting worse HRQL than patients from the other regions, are consistent with the well-recognized East–West health divide [13, 24].

The well-known East–West Europe health divide between Eastern Europe, i.e., the formerly communist countries of Central and Eastern Europe and the former Soviet Union [24, 25], and Western Europe reflects major differences in health policy, access to and quality of health care, health care funding, and certain health risks and their impact on health outcomes such as life expectancy, morbidity and mortality. The East–West Europe health divide was confirmed for HRQL as a patient-reported outcome measure in this report. The combined effects of economic growth, improved health care and successful health policies (e.g., tobacco and alcohol control, food policy, road traffic safety) across Western Europe have resulted in a higher life expectancy, lower mortality and morbidity and healthier populations [20, 24]. In contrast, economic and political problems in many Eastern European countries have frequently led to a failure to implement effective health policies with concomitant lower life expectancies, higher mortality and morbidity and less healthy populations [26]. However, inequities in health, health care and health care policies also exist within and between neighboring countries in Western Europe [24, 27, 28] and remarkable HRQL differences between neighboring countries have been noted in this analysis indicating further challenges to health, well-being and health care in Western Europe [20].

The generic SF-36 is useful when comparing various populations with a healthy cohort. The SF-36 PCS and MCS measures, with a standardized mean of 50 ± 10, are derived from the general “healthy” US population [8]. There are SF-36 norms by IHD diagnosis, i.e., angina, MI and heart failure, in the German [7] and US [8] populations, substantiating the findings in this study of highest scores in patients with MI, followed by patients with angina, and the lowest scores in patients with heart failure. Furthermore, in the German [7] and US [8] general populations, females reported worse PCS and MCS scores than males, and younger patients reported better PCS and worse MCS scores than older patients. This analysis revealed similar results within an IHD population. Moreover, there are also PCS and MCS norms broken down into their components for the eight SF-36 scales by diagnosis (heart disease in total, MI and heart failure), age and gender in the US population [8]. The reference values generated in the present study demonstrated that all PCS scores, but none of the MCS scores, were more than one SD lower than the general “healthy” US population norms in patients with angina and heart failure. However, the SF-36 does not quantify symptom burden or functional limitations specific to IHD and is less sensitive to clinical change, either over time or after a therapeutic intervention, and its clinical interpretation is more difficult than with a disease-specific instrument [6] such as the MacNew [29] or the HeartQoL [21, 22]. Furthermore, when the SF-36 scales are summated and transformed to the higher-order PCS and MCS measures, the formula always includes the means, the SDs and the regression coefficients from the general American population (“US weights”). This needs to be considered when interpreting these or other SF-36 data in countries other than the USA: when SF-36 data are compared across different groups (e.g., gender, ethnicity, language), problems may occur because people belonging to different groups may have a different probability of giving a certain response on a questionnaire [30]. By generating new IHD-specific reference values in the future, where a standardized mean of 50 ± 10 would represent the “average IHD patient,” these specific SF-36 reference data could be used for comparing scores within an IHD population more precisely.

Therefore, analyses of measurement invariance or differential item functioning (DIF) can be conducted to indicate unexpected behavior of items on a test and to identify which items may need revision (e.g., exclusion, rephrasing, new translation). Based on the extensive literature, it appears reasonable to assume that the SF-36 at least measures the same construct across different cultures (IQOLA Project) [31–33]. Moreover, the Bjorner et al. DIF study [26] showed that the use of homogeneous reduced scales instead of the full SF-36 scales (containing items with DIF) did not change conclusions about the existence of a cross-national difference, at least not between the general populations in the USA and Denmark. These data suggest that the “higher-ordered” SF-36 results (scales, PCS/MCS) may also be robust to items with DIF in other populations. However, because the study by Bjorner et al. [26] is unique and its results suggest that cross-language DIF may be a frequent problem in questionnaire translations, some consequences follow: translated instruments need to be validated every time a questionnaire is presented in a new language, and cross-national comparison data need to be interpreted with caution, since the interpretation of questions and the use of response categories may vary between countries.

Limitations

The data are based on convenience samples. This needs to be considered when referring to these results. For example, only patients with ischemic heart failure (left ventricular ejection fraction <40 % and IHD) were included, whereas other heart failure diagnoses were excluded (e.g., preserved ejection fraction or diastolic heart failure, right heart failure). Furthermore, disease severity, which also influences physical and mental health, varied across regions in this sample. The highest proportion of patients with severe angina (CCS III + IV) was found in the ES region (36 %) and the lowest in Sc (20 %). The highest proportion of patients with severe ischemic heart failure (NYHA III + IV) was found in EE (47 %) and the lowest in Sc (28 %).

The 876 excluded participants were more likely to be female, older, less likely to be married and higher educated. They reported worse health status (single SF-36 question “In general, would you say your health is…excellent/very good/good/fair/poor?”), came more often from ES countries, Sc and SE and less often from EE and WE, and were less likely to have hypercholesterolemia but more likely to be diabetic and physically inactive. Excluding these patients because of incomplete data or because they were outliers could have influenced the presented reference values.

Conclusions

In 5508 patients with angina, MI and ischemic heart failure, the diagnosis exerted a significant influence on the perception of physical health, with the highest mean SF-36 PCS scores reported by patients with MI and the poorest scores reported by patients with heart failure. Worse physical health was also reported by females, older patients (especially those >70 years) and patients from EE. The cardiac diagnosis had no effect on the mean MCS scores, which were all within the normal range; however, females, younger patients (especially those <51 years) and EE patients reported the lowest mean MCS scores. Clinicians, researchers, health professionals and health-policy makers can use these SF-36 reference values for patients with IHD as an indication of how an individual patient, or a group of patients, compares to patients of the same sex, age and diagnosis as well as to patients with IHD in a specific country. Important health challenges that need to be addressed remain in both Eastern and Western European countries concerning unresolved issues in health-policy-making and rising health inequalities between and within countries.