Introduction

Low-back pain (LBP) remains the leading cause of years lived with disability worldwide [1]. Thus, people suffering from LBP are amongst those most commonly seeking health care. The availability of reliable and valid patient-reported outcome measures (PROMs) enables practitioners to assess and evaluate pain, function and quality of life (QOL) in a standardised manner [2]. When used on a larger scale, PROMs may also serve as a tool to evaluate the effectiveness of LBP interventions [2].

In 1996, a multinational group of back pain researchers called for the use of standardised and easily administered outcome measures in patients with LBP [3]. They recommended assessment of the most important domains including pain, function, generic health status/well-being, disability and patient satisfaction. Items covering these domains were extracted from widely used PROMs, resulting in a new core set of six items [3]. The initial core set showed satisfactory reliability and validity [4, 5], and was further developed with the addition of a seventh item to cover general quality of life, resulting in the Core Outcome Measures Index (COMI), a self-administered and multidimensional PROM [6].

The items finally included in the COMI cover back pain and leg pain intensity, function in everyday life, symptom-specific well-being, general quality of life, and social and work disability. The COMI has been shown to be a valuable instrument and has been recommended for further use as a standardised outcome measure in clinical practice, clinical trials, multi-centre studies and surgical registries [7]. It is short, easy to administer, and to date has been translated and cross-culturally adapted into several languages: Brazilian-Portuguese [8], Chinese [9], French [10, 11], Hungarian [12], Italian [13], Japanese [14], Korean [15], Norwegian [16] and Polish [17]. The COMI has also been used as an outcome measure in several clinical studies [18,19,20] and in the International EUROSPINE Spine Tango registry [21].

The COMI has not yet been adapted for the Swedish language. Therefore, this study aimed to translate and cross-culturally adapt the COMI from English to Swedish, and to assess its reproducibility and validity in patients with LBP.

Methods

The study was conducted in two stages. The first stage involved the translation and cross-cultural adaption of the COMI from English to Swedish in accordance with stated guidelines [22]. The second stage involved evaluation of the instrument's face and construct validity, along with the test–retest reproducibility of its results, in Swedish primary and secondary health care settings. The regional Ethical Committee in Stockholm approved the study (Dnr: 2015/1866-32).

The Core Outcome Measures Index

The COMI covers five different domains, with seven individual items; pain intensity (two separate items measuring back pain and leg/buttock pain), back function in everyday life (one item), symptom-specific well-being (one item), general quality of life (one item) and disability (two separate items measuring social disability and work disability).

The composite COMI score (range 0–10, best–worst) is the mean of the five domain scores; higher scores indicate worse status. Pain intensity is recorded on 0–10 graphic rating scales, and the higher of the two values for back pain and leg/buttock pain is used to represent the "pain" domain. Five-point scales (1–5) are used for the remaining items, and these scores are rescaled to a 0–10 range (score (1–5) minus 1, multiplied by 2.5). The values for the two disability items are averaged to represent the "disability" domain [5].
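The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not an official scoring routine: the function and argument names are hypothetical, and it assumes that all seven items have been answered (handling of missing items is not covered here).

```python
def rescale_item(score):
    """Rescale a 1-5 item response to the 0-10 range: (score - 1) * 2.5."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("five-point items must be scored 1-5")
    return (score - 1) * 2.5


def comi_score(back_pain, leg_pain, function, wellbeing,
               quality_of_life, social_disability, work_disability):
    """Return the composite COMI score (0 = best, 10 = worst)."""
    for pain in (back_pain, leg_pain):
        if not 0 <= pain <= 10:
            raise ValueError("pain items must be scored 0-10")
    # Pain domain: the higher of the back pain and leg/buttock pain ratings.
    pain_domain = max(back_pain, leg_pain)
    # Disability domain: mean of the two rescaled disability items.
    disability_domain = (rescale_item(social_disability)
                         + rescale_item(work_disability)) / 2
    domains = [
        pain_domain,
        rescale_item(function),
        rescale_item(wellbeing),
        rescale_item(quality_of_life),
        disability_domain,
    ]
    # Composite score: mean of the five domain scores.
    return sum(domains) / len(domains)


# Hypothetical respondent: back pain 6, leg pain 3, function 3,
# well-being 4, quality of life 2, social disability 1, work disability 3.
example = comi_score(6, 3, 3, 4, 2, 1, 3)
```

For this hypothetical respondent the five domain scores are 6, 5.0, 7.5, 2.5 and 2.5, giving a composite score of 4.7.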

Translation and cross-cultural adaption

The translation and cross-cultural adaption were carried out according to recommended guidelines for health status measures [22]. Two native Swedish speakers (one expert, T1, and one naïve, T2) who were fluent in English completed a forward translation from English to Swedish. All translators and the expert group, consisting of three physiotherapists (two researchers/clinicians and one clinician), synthesised the results of the translation into a common translation (T-12). In the process, it was considered important that issues were resolved by consensus and not by one person’s opinion. The T-12 version was then back-translated from Swedish to English by two translators (BT1 and BT2) fluent in Swedish but with English as their mother tongue. Back translation highlights inconsistencies or conceptual errors in the translation. The same expert group as before, with the addition of the developer of the PROM, consolidated all versions (the original, T1, T2, T-12, BT1 and BT2) and developed the pre-final version of the Swedish COMI for field-testing.

Test of pre-final version (face validity)

Ten participants were asked to verbalise their thoughts about the questions whilst completing the COMI. The participants were asked whether they fully understood the questions and their implications and whether the wording seemed clear. If any item seemed unclear, the participant was asked by the observer (AL) to rephrase it in order to improve readability and understanding.

Study population

Patients with a primary complaint of LBP, with or without leg pain > 3 months, and with a good understanding of the Swedish language were consecutively recruited at a physiotherapy primary health care clinic with direct access to physiotherapists, and at an orthopaedic surgery clinic where patients were being considered for spinal surgery (decompression for spinal stenosis, surgery for disc hernia or spinal fusion). The exclusion criteria were pregnancy and “red flags” such as fracture, cancer and inflammatory disease. Written informed consent was obtained from all participants. One hundred and two participants were included in the validity study and 49 of those were included in the test–retest analysis (Table 1). The study sample size was determined according to published recommendations [23].

Table 1 Demographic characteristics and clinical status of included participants (n = 102)

At their first visit to either of the clinics, all participants completed the COMI, the reference scales and questions concerning socio-demographic variables. The chosen reference scales were Oswestry Disability Index (ODI) [24] and the EuroQol-5 Dimensions Index (EQ5D) [25]. The ODI is a spine specific PROM assessing LBP-related disability using 10 questions rated from 1–5 and transformed to a score of 0 (no disability) to maximum 100 for worst disability [24]. The EQ5D is a generic instrument measuring Health Related Quality of Life and comprises 5 items concerning mobility, self-care, usual activities, pain/discomfort and anxiety/depression [25]. We used the 3-level version of the EQ5D in which each item is rated on a 3-point adjectival scale. To calculate the EQ5D Index, the time trade-off valuation technique (TTO) was used (https://euroqol.org/docs/EQ-5D-3L-User-Guide.pdf).

Statistical analysis

Floor and ceiling effects

Floor (worst status) and ceiling (best status) effects were evaluated and considered to be present if more than 15% of the patients reported the highest or the lowest possible score, respectively [26].
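As an illustration of this criterion, the following Python sketch computes the share of respondents at each end of a scale and flags an effect when that share exceeds 15%. The function name and the example responses are hypothetical. Because "floor" here means worst status and "ceiling" means best status, the caller states explicitly which score value is worst and which is best.

```python
def end_effects(scores, worst, best, threshold=0.15):
    """Return the floor/ceiling shares and whether each exceeds the threshold.

    Floor = share of respondents at the worst possible score.
    Ceiling = share of respondents at the best possible score.
    """
    if not scores:
        raise ValueError("at least one score is required")
    n = len(scores)
    floor_share = sum(1 for s in scores if s == worst) / n
    ceiling_share = sum(1 for s in scores if s == best) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical responses to a five-point item where 1 is best and 5 is worst.
responses = [1, 1, 1, 1, 2, 2, 3, 3, 4, 5]
result = end_effects(responses, worst=5, best=1)
```

In this made-up sample, 4 of 10 respondents report the best score (a ceiling effect) and 1 of 10 reports the worst score (no floor effect).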

Construct validity

Analysis of the construct validity was carried out in relation to predefined hypotheses, based on the findings of previous studies in which the COMI was cross-culturally adapted. Specifically, the hypotheses were that each of the specific COMI domains would correlate at least moderately (r = 0.30–0.60) with the composite scores of the two reference scales (Table 2). We hypothesized that the COMI score would correlate highly (r > 0.60) with the composite score of the reference scales.

Table 2 Construct validity with prior formulated hypotheses and floor/ceiling effects (n = 102)

Spearman’s Rank correlation coefficients (Rho) were used in all correlation analyses due to the variety in scale types. The coefficients were described as low (< 0.3), moderate (0.3–0.6) and high (> 0.6) [16].
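For readers who want to see the computation, the sketch below shows a plain-Python version of Spearman's rank correlation (Pearson correlation of average ranks, so ties are handled) together with the verbal labels given above. It is an illustration only and not the code used in the study; the function names are hypothetical, and applying the labels to the absolute value of the coefficient, so that negative correlations with the EQ5D are classified by strength, is an assumption.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


def describe_correlation(rho):
    """Label a coefficient as low (< 0.3), moderate (0.3-0.6) or high (> 0.6)."""
    size = abs(rho)  # assumption: strength is judged regardless of sign
    if size < 0.3:
        return "low"
    if size <= 0.6:
        return "moderate"
    return "high"
```

With these cut-offs, a coefficient of 0.33 is labelled moderate and a coefficient of −0.73 is labelled high.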

Reproducibility of results

A test–retest analysis was completed for the specific COMI items and the COMI score, with 7 days between the tests [23]. The first assessment was carried out at the clinic. The participants were given the second questionnaire in a pre-paid envelope, to fill in and to send to or leave at the clinic after 7 days. The participants were prompted by a text message to fill in the COMI a second time. No treatment was scheduled between the two tests. To study differences in values between test 1 and 2, Wilcoxon’s matched pairs test was used.

The reproducibility of the specific COMI domains and the COMI score was assessed with the Intraclass Correlation Coefficient (ICC2,1) using a two-way random effects model. ICC can range from 0 to 1, and values were considered good if ICC was 0.60–0.80 and excellent if > 0.80 [27]. The domains back function in everyday life, symptom-specific well-being, general quality of life, social and work disability were also analysed by means of Kappa values with quadratic weighting. The response options for these items are categorical and the categories are ordinal. Reliability estimates were interpreted as less than 0.40 = poor, 0.40–0.59 = moderate, 0.61–0.80 = substantial, and more than 0.81 = excellent [28]. Agreement was given by the standard error of measurement (SEM) and minimal detectable change (MDC) where MDC is the minimal amount of change in a patient's score that with 95% certainty is not likely to be due to error [23].
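The two quantities above can be illustrated with a short Python sketch. The ICC function follows the standard mean-square formula for a two-way random effects, absolute agreement, single-measurement coefficient, ICC(2,1). The MDC function assumes the commonly used relationship MDC95 = 1.96 × √2 × SEM, which the text does not state explicitly. Function names are hypothetical, and the sketch is not the analysis code used in the study.

```python
import math


def icc_2_1(scores):
    """ICC(2,1) from a table with one row per participant and one
    column per occasion (e.g. [test, retest])."""
    n = len(scores)        # participants
    k = len(scores[0])     # occasions
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 complete rows and 2 columns")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def mdc95(sem):
    """Minimal detectable change at the 95% level (assumed formula)."""
    return 1.96 * math.sqrt(2) * sem


# With the SEM of 1.0 reported for the COMI score:
mdc_for_comi = round(mdc95(1.0), 1)  # 2.8
```

Under this assumed formula, the reported SEM of 1.0 point reproduces the reported MDC of 2.8 points.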

R package version 1.3.1 was used for the statistical analyses [29].

Results

Cross-cultural adaption of the COMI

The Swedish version of the COMI is presented in Appendix 1 in ESM. The COMI was successfully forward- and back-translated with neither semantic nor language ambiguities, and all items were approved by the expert group. Pre-testing of the preliminary Swedish COMI revealed no ambiguities.

Floor and ceiling effects

Floor effects were present for the item symptom-specific well-being (48%) and for social disability (20%) (Table 2). Notable ceiling effects were present for the items leg pain (16%), social (20%) and work disability (54%). When analysing the higher of the two pain scores from the pain domain (as used when calculating the composite COMI score), no end effects were shown (0% at both ends). When analysing social and work disability together, the ceiling effect was 18%. For the composite COMI score, no floor or ceiling effects were seen. We had few missing values (n = 2).

Construct validity

All predefined hypotheses regarding the relationship between the scores for the specific COMI domains and the scores for the reference instruments were confirmed, with moderate to high correlations (Table 2). Back pain alone showed a weaker correlation with the ODI (r = 0.33), but when it was combined with leg pain (higher of the two scores) the correlation was stronger (r = 0.60). Symptom-specific well-being showed lower correlations with the ODI and the EQ5D (r = 0.41 and r = − 0.46, respectively). The composite COMI score showed a high correlation with the EQ5D (r = − 0.73) and the ODI full score (r = 0.72).

Reproducibility

The test–retest results are shown in Table 3. No significant differences between test and retest scores were detected for either the specific COMI domains or the composite COMI score. Reliability point estimates suggested moderate agreement for the domains back function in everyday life (κw = 0.55), general quality of life (κw = 0.47), and work disability (κw = 0.41). For symptom-specific well-being and social disability, substantial reliability was shown (κw = 0.64/0.65). The ICCs for the specific COMI domains were 0.41 to 0.57, except for the item leg pain (0.78). The ICC for the COMI score indicated adequate reproducibility for the instrument (ICC2,1 0.63, 95% CI 0.42–0.77). The SEM was 1.0 point and the MDC was 2.8 points for the COMI score.

Table 3 Test–retest reliability results for each domain and the COMI score (n = 49)

Discussion

We cross-culturally adapted the COMI for use in Swedish-speaking patients suffering from LBP. In addition, we explored its test–retest reliability. The cross-cultural adaption procedure resulted in a Swedish version of the COMI that was considered equivalent to the original English version, and our results demonstrated acceptable psychometric properties.

We collected data at a primary and a secondary care clinic and included patients with LBP and with/without leg pain treated either conservatively by physiotherapists or referred to be considered for spinal surgery. The Norwegian study included patients in primary care and suffering from back pain but not leg pain [16]. Other studies have included a variety of hospital samples [8, 10, 12, 13]. The diversity in the previous studies, with respect to both the included patients and the reference scales used, makes it somewhat difficult to compare our findings, as results from the analyses may vary due to populations studied.

Floor and ceiling effects

Our results on floor and ceiling effects differed somewhat from those of previous validation studies. There was a notable ceiling effect (best status) for the items leg pain (16%) and work disability (54%), and a floor effect (worst status) for the item symptom-specific well-being (48%). For the composite score no floor or ceiling effects were seen, which concurs with the findings of previous studies. A floor effect for symptom-specific well-being has previously been reported [8,9,10, 12, 13]. The Hungarian study showed a floor effect of 75.2% for this item, which might be considered above the critical value of 70% but might be explained by their evaluation of a surgical population [12]. Even though it consistently shows a high floor effect, this specific question is considered to add valuable information and should therefore continue to be part of the COMI [4, 5]. The ceiling effect of 54% for work disability is in keeping with the fact that our cohort was only moderately disabled (ODI 30, SD 17); more than 50% were still at work and only 12% were sick listed.

Construct validity

In line with previous validation studies, the composite COMI score as well as the scores for each of the separate COMI items correlated to at least some extent with the full scores of the reference scales, the ODI [24] and the EQ5D [25]. The COMI composite score showed a strong correlation with the full scores for the reference scales (ODI, r = 0.72; EQ5D, r = − 0.73). The correlation with the ODI concurs with the findings of some previous studies, including those evaluating the Brazilian-Portuguese (0.64) [8], Chinese (0.69) [9], Hungarian (0.83) [12], and Korean (0.83) [15] versions of the COMI. The correlation with the EQ5D also concurs with some previous studies [11, 13,14,15].

Our hypotheses of a moderate correlation between each of the COMI items and the composite reference scales were confirmed, with correlation coefficients ranging from 0.41 to 0.73, with the exception of that for back pain, which had a lower correlation with the ODI (r = 0.33). This might be considered odd, as the ODI is considered a back pain related disability instrument. However, this was not evident when the higher of the two pain scores was used; then the relationship was stronger (r = 0.60). Further, the item symptom-specific well-being showed lower correlations with the ODI full scale (r = 0.41) and with the EQ5D (r = − 0.46) in the present study. This concurs with previous studies also showing lower correlations for this specific item, measured against either the ODI (Polish 0.43 [17], Hungarian 0.44 [12], Chinese 0.45 [9]) or the EQ5D (French 0.36 [11], Norwegian 0.43 [16]).

Reproducibility

For analyzing test–retest reproducibility, 49 participants from the primary care cohort filled in the COMI twice with 7 days in-between. We considered this a reasonable period, as 7–10 days has previously been recommended [23]. We reached a fair to good reliability for the specific COMI items (ICC 0.41–0.78, Kappa W 0.41–0.77). The COMI composite score showed good reliability (ICC 0.63), but this was somewhat lower than that reported in previous studies (French 0.85 [11], Italian 0.92 [13], Norwegian 0.89 [16], Brazilian-Portuguese 0.91 [8] and Chinese 0.91 [9]). The ICC is dependent on the variance between and within the subjects. Our data, compared with previous studies [8, 15, 17], showed a lower COMI score (mean) at both T1 and T2 and a slightly narrower variance, which might have affected the ICC value (Table 3).

For a test–retest design, the item measured over time should be stable to avoid bias [23]. We did not use a transition question between the tests, as the time interval was only 7 days and no treatment was scheduled during this week. Even so, we cannot rule out the possibility that patients' symptoms were not sufficiently stable between the tests, and that there was indeed real change between them, which would have influenced the reproducibility statistics. However, looking into the differences between the tests in response to each domain on the COMI, only back function fell short of the recommended 90% of responses within ± 1 category (6 out of 49, 12%) [30].
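The ± 1 category check mentioned above amounts to simple counting, sketched here in Python. The function name and the response data are hypothetical. The sketch also assumes that the "6 out of 49" refers to responses that differed by more than one category between test and retest, which the text does not spell out.

```python
def share_within_one_category(test, retest):
    """Share of participants whose two responses differ by at most 1 category."""
    if len(test) != len(retest) or not test:
        raise ValueError("test and retest must be non-empty and equally long")
    within = sum(1 for a, b in zip(test, retest) if abs(a - b) <= 1)
    return within / len(test)


# Made-up data reproducing the assumed split: 43 of 49 responses within
# one category of the first answer, 6 of 49 more than one category away.
test_scores = [3] * 49
retest_scores = [3] * 30 + [4] * 13 + [5] * 6
share = share_within_one_category(test_scores, retest_scores)
meets_recommendation = share >= 0.90
```

Under this reading, 43 of 49 responses (about 88%) lie within ± 1 category, just below the 90% recommendation.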

Our participants filled in the COMI and the reference scales at the clinic on the first occasion and at home on the second, prompted by a text message. This methodology has been used successfully before [31]. The different environments in which the COMI was completed might, however, have influenced the stability of the data. Encouragingly, there were no significant differences in the mean scores on the two test occasions for back pain, leg pain, the other specific COMI domains and the COMI score. Our SEM (1.0), and hence MDC (2.8), was somewhat higher than reported in some of the previous COMI studies [5, 9, 17]. Our results are still in line with the Norwegian (MDC 2.2) [16] and not far from the French (MDC 2.0) [11] studies. Our MDC for the COMI score indicates that a change of 2.8 points or more gives a 95% likelihood that it is a result of "real change" in the patient's condition rather than measurement error. Based on previous COMI studies, the estimated minimal clinically important difference for the COMI summary score is between 2 and 3 [32].

Strengths and limitations

A strength of our study is that we followed strict guidelines in the cross-cultural adaptation and validation processes. Another strength is that we included participants from both primary and secondary care to increase the validity of the COMI in a Swedish context. Some limitations, however, need to be considered. As our patients were not highly disabled compared with those included in some of the previous validation studies, our results can only be generalized to a population like ours. Even so, our study included a diverse sample of patients with and without leg pain (51% without leg pain) and from both primary (60%) and secondary care.

The COMI covers several domains in one short, easy to complete instrument. Short instruments are warranted and useful for clinical and research purposes. A recent study by Osthols et al. [33], who surveyed physiotherapists in Sweden, concluded that few PROMs are currently used in the everyday work of primary care physiotherapists, and at a low frequency, and that one reason given for this is lack of time. We therefore consider it important to make the COMI available in a Swedish context.

The Swedish COMI shows acceptable psychometric properties and may thus be suitable to use as a short instrument, measuring important domains in patients suffering from low-back pain with and without leg pain. Future studies should further evaluate its sensitivity to measure change in response to treatment.