Background

The term 'comorbidity' was first used in 1970 by Feinstein (as cited by Kessler et al, 2001 [1], and by van den Akker et al [2, 3]) to refer to situations where an individual has two or more physical and/or mental health conditions. More recently, the term 'multimorbidity' was introduced [2-4]. Although comorbidity and multimorbidity both describe the presence of two or more health conditions, a distinction is made between the two terms: comorbidity is used when an index condition of interest is being discussed, and multimorbidity is used when no reference condition is considered [4]. Although this distinction is often not clearly applied, and the terms are used interchangeably in the literature, we use these definitions in this paper. Health conditions can sometimes be comorbid purely by chance; however, certain comorbidity clusters occur at higher than chance levels [1].

International and Australian research demonstrates that the prevalence of comorbidity or multimorbidity increases significantly with age [3-6], indicating that patients with multimorbidity in general practice are the rule rather than the exception [5, 7, 8]. For example, an Australian study exploring data obtained through 305 general practitioners in 2005 reported that the prevalence of multimorbidity increased with age, with 83% of surveyed patients aged 75 years or older having multimorbidity [6].

The study of patterns of multimorbidity is a new field. While there is a growing body of evidence regarding the prevalence of comorbidity and multimorbidity [3-5, 9], most studies use either a count of the number of comorbidities, such as the Charlson Index [10], or the Cumulative Illness Rating Scale (CIRS), which groups conditions by the body systems affected [6, 11-13]. These methods do not use statistical approaches to identify nonrandom clustering of individual health conditions into groups of multimorbid conditions, perhaps because of the limitations of statistical methods to date. Most statistical packages that can perform exploratory factor analysis (EFA) require continuous data, but health conditions are usually represented dichotomously; that is, the person either has the condition or does not.

The objective of this study was to use software and statistical analysis methods that allow for the dichotomous nature of disease data to identify nonrandomly occurring clusters of multimorbid health conditions. Identifying clusters of multimorbidity is important due to rising health care costs associated with servicing an increasingly aging population with complex health care needs. Health service providers need to better understand the complexity of the health status of consumers to ensure more strategic and tailored health care is provided.

Methods

Data

The Australian Work Outcomes Research Cost-benefit (WORC) project (http://www.qcmhr.uq.edu.au/worc/) provides a large cross-sectional data set of 78,430 working Australians to explore clusters of nonrandomly occurring multimorbid health conditions.

Study sample: Employees of 58 large Australian-based companies were invited to participate in the WORC study. The survey was undertaken between October 2004 and December 2005.

Study measures: The Health and Productivity Questionnaire (HPQ) from the World Health Organization [14] was used to collect self-reported health status on 22 health conditions. The Kessler 6 (K6) [15], a validated measure of psychological distress that is included within the HPQ, was used to collect psychological distress data. In total, 23 conditions were explored for multimorbidity patterns in this study. The following health conditions were included in the analyses, as these were available in the HPQ: arthritis, asthma, back/neck pain, cancers (excluding skin cancer), skin cancers, chronic obstructive pulmonary disease (COPD) (including chronic bronchitis and emphysema), cardiovascular disease (CVD), psychological distress (defined as a K6 score of 13 and above [16]), drug and alcohol problems, diabetes, fatigue (including sleep problems), high blood pressure, high cholesterol, injury (workplace injury requiring medical treatment), migraine (and severe headache), obesity (using self-reported height and weight to calculate body mass index), bladder problems, heartburn, irritable bowel disorder, ulcers, osteoporosis, and other chronic pain. Self-reported health status was coded for this study as "yes" if respondents reported having the condition and were currently receiving or had previously received professional treatment for that condition, and "no" if they reported never having the condition. Respondents were excluded if they reported having a condition but had never received treatment, as these respondents may have incorrectly self-diagnosed the health problem. An average of 0.05% of respondents were excluded for each condition.
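The coding rule described above can be sketched in a few lines of Python. This is an illustration only, not the code used in the WORC project; the function names `code_condition` and `code_distress` and their arguments are hypothetical. Only the K6 cutoff of 13 and the yes/no/exclude logic come from the text.

```python
from typing import Optional

# Cutoff stated in the text: psychological distress is a K6 score of 13 and above.
K6_DISTRESS_CUTOFF = 13


def code_condition(reported: bool, ever_treated: bool) -> Optional[bool]:
    """Code one self-reported condition (hypothetical helper).

    Returns True for "yes", False for "no", and None when the respondent
    is excluded from the analysis of this condition.
    """
    if not reported:
        return False   # never had the condition
    if ever_treated:
        return True    # has the condition, currently or previously treated
    return None        # reported but never treated: possible self-diagnosis


def code_distress(k6_score: int) -> bool:
    """Dichotomize the K6 score at the cutoff given in the text."""
    return k6_score >= K6_DISTRESS_CUTOFF
```

Returning `None` for excluded respondents keeps the exclusion specific to one condition, which matches the per-condition exclusion rate reported above.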

Statistical analysis

Exploratory factor analysis was performed in the software package Mplus [17], which accommodates dichotomous variables by calculating tetrachoric correlations among the variables. Tetrachoric correlations make no assumptions about the shapes of the observed frequency distributions, so skewed distributions are not a concern. Solutions with one through eight factors were explored. The optimal number of factors was determined after applying a number of rules and indices: the scree test (in a plot of eigenvalues against factor number, a kink in the plot gives the optimal number of factors [18]); the eigenvalues-greater-than-one rule [18]; the standardized root mean square residual (SRMR), which should be less than 0.05 [19]; the comparative fit index (CFI) and Tucker Lewis index (TLI), both of which should be greater than 0.95 [19]; and a rule that more than two items should contribute to the definition of a factor [17]. An orthogonal quartimin rotation was applied to facilitate interpretation of factor loadings.
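To make the idea of a tetrachoric correlation concrete, the following is a minimal sketch for a single pair of dichotomous conditions. It is not the Mplus implementation, whose estimation details are not described here. It assumes each observed yes/no variable arises from thresholding a latent standard normal variable, and it solves for the latent correlation that reproduces the observed proportion of respondents with both conditions. The function name `tetrachoric` and its arguments (the four cell counts of the 2x2 table) are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _bvn_density(h, k, r):
    """Standard bivariate normal density at (h, k) with correlation r."""
    one_minus_r2 = 1.0 - r * r
    expo = -(h * h - 2.0 * r * h * k + k * k) / (2.0 * one_minus_r2)
    return math.exp(expo) / (2.0 * math.pi * math.sqrt(one_minus_r2))


def _bvn_cdf(h, k, rho, steps=200):
    """P(Z1 <= h, Z2 <= k) for latent correlation rho.

    Uses Phi2(h, k; rho) = Phi(h) * Phi(k) + integral from 0 to rho of the
    bivariate density, evaluated with composite Simpson's rule.
    """
    width = rho / steps  # negative when rho < 0; the rule still applies
    total = _bvn_density(h, k, 0.0) + _bvn_density(h, k, rho)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * _bvn_density(h, k, i * width)
    return _STD_NORMAL.cdf(h) * _STD_NORMAL.cdf(k) + total * width / 3.0


def tetrachoric(n11, n10, n01, n00, tol=1e-6):
    """Estimate the tetrachoric correlation from 2x2 cell counts.

    n11: both conditions present; n10: first only; n01: second only;
    n00: neither. Assumes both conditions have some "yes" and some "no" cases.
    """
    n = n11 + n10 + n01 + n00
    p_x = (n11 + n10) / n
    p_y = (n11 + n01) / n
    p_both = n11 / n
    if not (0.0 < p_x < 1.0 and 0.0 < p_y < 1.0):
        raise ValueError("each condition needs both 'yes' and 'no' cases")
    h = _STD_NORMAL.inv_cdf(p_x)
    k = _STD_NORMAL.inv_cdf(p_y)
    # The joint probability increases with rho, so bisection finds the root.
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if _bvn_cdf(h, k, mid) < p_both:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For example, `tetrachoric(200, 100, 100, 200)` should return a value close to 0.5: with both prevalences at 50%, the joint probability is 1/4 + arcsin(rho)/(2*pi), which equals 1/3 when rho is 0.5. In a full analysis this calculation would be repeated for every pair of conditions to build the correlation matrix that is then factored.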

Results

The sample demographic characteristics are listed in Table 1. The sample included part-time, full-time, and casual workers. In the sample, 65% were female and 35% male. The two largest age groups were those aged 30-44 years and those aged 45-59 years, comprising 80% of the sample. Those aged less than 18 years and over 70 years were excluded from the study, as these age groups are not usually in the Australian workforce (0.2% deleted). A total of 71% was married or cohabiting, 69% had no children, 48% had completed a tertiary qualification, and 53% earned $50,000 or more per year.

Table 1 Sample Demographic Characteristics

We obtained all solutions from the one-factor solution to the eight-factor solution. The scree test (Figure 1) suggests that the optimal number of factors is two or three. However, all of the other indices suggest a larger number of factors. The CFI and TLI goodness-of-fit statistics (Table 2) suggest a five-factor solution, whereas the SRMR suggests a six-factor solution. The eigenvalues-greater-than-one rule suggests a six- or perhaps a seven-factor solution. However, the seven-factor solution does not meet the requirement of having a minimum of three items in a factor, and so is not considered ideal. Therefore, the six-factor solution was selected. Table 3 provides the loadings for the six-factor solution (loadings exceeding the cut-off of ± 0.40 appear in bold).
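The retention rules applied above can be expressed as simple checks. The sketch below is illustrative only: the function names are hypothetical, and the eigenvalues and loadings in the example are made-up toy values, not the values in Figure 1, Table 2, or Table 3. The cutoffs (eigenvalue above 1, SRMR below 0.05, CFI and TLI above 0.95, loadings exceeding 0.40 in absolute value, at least three items per factor) are those stated in the text.

```python
def count_eigenvalues_above_one(eigenvalues):
    """Eigenvalues-greater-than-one rule: number of factors to retain."""
    return sum(1 for value in eigenvalues if value > 1.0)


def fit_is_acceptable(srmr, cfi, tli):
    """Fit cutoffs from the text: SRMR < 0.05, CFI and TLI > 0.95."""
    return srmr < 0.05 and cfi > 0.95 and tli > 0.95


def factors_with_too_few_items(loadings, cutoff=0.40, min_items=3):
    """Return indices of factors defined by fewer than min_items items.

    loadings: one row per item, one loading per factor in each row.
    An item counts toward a factor when |loading| exceeds the cutoff.
    """
    n_factors = len(loadings[0])
    weak = []
    for factor in range(n_factors):
        salient = sum(1 for row in loadings if abs(row[factor]) > cutoff)
        if salient < min_items:
            weak.append(factor)
    return weak


# Toy values for illustration only (not the study's results).
toy_eigenvalues = [4.1, 2.3, 1.6, 1.2, 0.9, 0.7]
toy_loadings = [
    [0.62, 0.05],
    [0.55, 0.10],
    [0.48, -0.02],
    [0.08, 0.71],
    [0.12, 0.45],
]
retained = count_eigenvalues_above_one(toy_eigenvalues)
weak_factors = factors_with_too_few_items(toy_loadings)
```

In the toy example the second factor has only two items above the cutoff, so it would be flagged in the same way the seven-factor solution was set aside above. The scree test is omitted because it relies on visual inspection of the plot.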

Figure 1 Scree Test with Eigenvalues for Range of Solutions

Table 2 Exploratory Factor Analysis Goodness-of-fit Statistics for the One Factor Solution through to the Eight Factor Solution
Table 3 Loadings for the Six-factor Solution following an Exploratory Factor Analysis Based on a Polychoric Correlation Matrix

The following factors were identified:

  • Factor 1: arthritis, osteoporosis, other chronic pain, bladder problems, and irritable bowel

  • Factor 2: asthma, COPD, and allergies

  • Factor 3: back/neck pain, migraine, other chronic pain, and arthritis

  • Factor 4: high blood pressure, high cholesterol, obesity, diabetes, and fatigue

  • Factor 5: CVD, diabetes, fatigue, high blood pressure, high cholesterol, and arthritis

  • Factor 6: irritable bowel, ulcer, heartburn, and other chronic pain

Discussion

Some conditions appear in more than one factor. (One reason exploratory factor analysis was used for this study is that it allows a condition to load on more than one factor.) Previous studies that used statistical methods to explore relationships among multimorbid conditions or clusters of organ systems have also found that some conditions appear in more than one factor [6, 20, 21]. Of the 23 conditions available for analysis in our study, we found chronic pain to be in three of the six clusters; diabetes, high blood pressure, and high cholesterol to be in the same two of the six clusters; and arthritis and irritable bowel to be in two different clusters.

We found that health conditions do not cluster neatly into organ or body systems, as has been assumed in the methods underpinning the CIRS [22]. A study by Britt et al [20] also demonstrates this: they explored patterns of multimorbidity and found that groups of individuals fit into between two and eight combinations of CIRS domains [20].

Only one other study was found that explored patterns of multimorbidity among individual health conditions [21]. The study by Cornell et al included more than 1.3 million primary care patients cared for by the Veterans Health Care System with two or more comorbidities and categorized 45 health conditions. Similarities exist between our fifth group of health conditions and Cornell's "metabolic cluster," the cluster that had the highest degree of association in their study. They reported that 83% of their sample fell into this cluster; three of the conditions in this cluster were also represented in our fifth factor [21]. Differences between the study by Cornell et al and this study include the statistical method (Cornell's method of cluster analysis relies on prevalence, so conditions with low prevalence will be underrepresented), sample size and composition (Cornell's sample was much larger, and all study participants had two or more health conditions; our sample included people well enough and young enough to attend work), and the number of health conditions (which was greater in the Cornell study). These differences may account for discrepancies in cluster composition between the two studies.

Other existing measures either calculate a comorbidity score based on the number of coexisting conditions, with some weights applied to adjust for severity of condition, such as the Charlson Index [10, 23-25], or calculate the impact on functional status, such as the Functional Comorbidity Index [26]. Studies that explore multimorbidity tend to use one of these instruments to determine comorbidity and/or multimorbidity. Because the Charlson Index requires hospital admission data and accurate International Classification of Diseases 10th Revision (ICD-10) records, many of these studies do not reflect the population as a whole. Our study uses those still in the workforce, which may skew the sample toward those in better health in the community. Further research is required to determine the prevalence and structure of multimorbid clusters of health complaints occurring in Australia.

This study adds to the only other available study [21] that uses statistical methods on a group of individual health conditions to explore nonrandom clustering of multimorbidity. With an increasingly aging population and evidence that comorbidity and multimorbidity increase with age [3-5], combined with rising health care costs associated with new procedures and treatments, a better understanding of how health conditions cluster together will enable better care management of individuals with chronic and complex diseases.

There are some limitations to our study that need to be considered. This is an opportunistic sample of willing employees from 58 large organizations. The response rate was low (22%). A comparison of respondents and nonrespondents was not possible, so the implications of the poor response rate are not known. For example, only those at work during the data collection period responded; people on extended sick leave or out of the workforce are not represented. The sample also overrepresents females. The self-reported nature of the health conditions, and the number and type of health conditions available, also need to be considered. For example, some high-cost conditions, such as kidney disease, are absent. Therefore, extrapolation of these findings to the general population should be done with caution. The findings are relevant, however, to those sectors and groups where the demographic profile is similar. Fatigue, which may be either chronic or acute, was included in the model. Fatigue is mostly acute, so one might question whether it should be included. However, the results demonstrate that fatigue is included in two of the multimorbidity groupings, highlighting its importance for inclusion in multimorbidity analyses.

Conclusions

This study identified clinically meaningful clusters of multimorbid health conditions that do not fall neatly into organ or body systems. Some conditions appear in more than one cluster. Few studies are available that use statistical methods to explore patterns of multimorbidity in a group of individual health conditions. A large population-based sample with reliable diagnosis data at an individual level is required.