Background

More and more elderly people are affected by multiple health problems and face a significant burden in managing these conditions [1, 2]. Patients with multimorbidity are often seen by multiple primary and secondary healthcare professionals, each advising different treatment, including long-term medications and recommendations for diet and exercise. There is a growing body of literature recognizing the critical role of treatment burden in chronic care [3]. This concept describes the workload of health care-related tasks and its impact on patients’ daily lives and their well-being [4]. There is a positive correlation between treatment burden and the number of conditions [5, 6], especially in discordant conditions with different pathophysiologic profiles and/or treatment strategies. Tran et al. [7] found that about 40% of people with long-term conditions do not feel able to sustain the present efforts for their care in the future. Although patients with a poor overall health status spend more than 3.5 h per week on health-related self-care, clinicians tend to overlook their workload [8].

Boyd et al. [9] demonstrated in their classic example of a hypothetical 79-year-old woman with diabetes, chronic obstructive pulmonary disease (COPD), osteoporosis, hypertension and osteoarthritis how following single clinical practice guidelines can lead to a complex and contradictory treatment regimen with 19 daily doses of medication and 14 non-pharmacological recommendations. The authors concluded that for patients with complex care needs, individual treatment burden, potential risks and benefits should be taken into account in a shared decision-making process. Shippee et al. [10] used the Cumulative Complexity Model to illustrate that overburdening and non-adherence occur when patients’ treatment-related workload is greater than their capacity to cope with these demands. A recent systematic review outlines that treatment burden is determined not only by health care-related workload and individual abilities and resources, but also by context factors such as health care structures [11]. For example, in countries with universal health coverage (such as Germany or the UK), individual financial resources are less likely to have a significant impact on treatment burden [12]. Eton et al. [13] further distinguish three components of perceived treatment burden: care-related tasks, strategies for facilitating treatment burden such as seeking support from family or friends, and factors that increase burden, such as problems with medication, financial challenges, or a lack of information. In their Guideline for Clinical Assessment and Management of Multimorbidity, the National Institute for Health and Care Excellence recommends monitoring treatment burden and taking action to reduce it where indicated, with the ultimate goal of improving quality of life [14]. In clinical practice, this can be an opportunity to initiate an open dialogue about patients’ preferences and challenges in managing their conditions [15], as well as a first step towards shared decision-making. Similarly, it could help to identify patients who are at risk of being overburdened and of becoming non-adherent as a result. However, no such instrument is yet available in German.

Several patient-reported instruments for the assessment of treatment burden in patients with multimorbidity have recently been developed for the English-speaking population [16,17,18,19]. When assessing these instruments, we found the Multimorbidity Treatment Burden Questionnaire (MTBQ) to be most appropriate for the target population of older adults with multiple chronic conditions and diverse educational backgrounds. Whereas other instruments were longer and more difficult to understand or focused only on specific aspects of treatment burden, the MTBQ stood out for its brevity, intelligibility, and participatory and theoretically informed development [19]. The original MTBQ was tested in a large sample (n = 1,524) of older adults with multimorbidity (defined as the presence of three or more long-term conditions) recruited from general practices in England and Scotland as part of the 3D Study [20]. The MTBQ demonstrated good construct validity, internal consistency and responsiveness. The questionnaire has previously been translated into Danish [5, 21] and Chinese [22]. Its items address aspects of treatment burden that are also relevant to patients navigating the German health care system: Scheduling medical appointments, obtaining prescriptions, challenges related to medication intake, self-management and necessary lifestyle changes, financial aspects and the impact of treatments on personal relationships.

In this study, our objective was to (a) translate and cross-culturally adapt a German version of the MTBQ, (b) validate the adapted version in a sample of older adults with multimorbidity and (c) analyse the relationship between treatment burden scores and sociodemographic characteristics as well as other patient-reported health measures.

Materials and methods

Following the Guidelines for Translating and Adapting Tests by the International Test Commission [23], we used a multi-step translation process and conducted cognitive interviews as well as a pilot test. Psychometric properties of the questionnaire were determined in a sub-study of the MULTIqual project [24], which included a sample of 346 older adults (65 years and older) with three or more long-term conditions.

Description of the multimorbidity treatment burden questionnaire

The MTBQ is an easy-to-understand questionnaire on the perceived difficulty of health care tasks and their impact on everyday life. The original questionnaire consists of ten core items plus three optional questions (items 3, 9, and 10) that were not applicable in the UK study sample but may be relevant in other populations. The MTBQ was developed based on a literature review and discussions with a patient and public involvement group. In addition, cognitive interviews were conducted to examine content validity. Responses are given on a 5-point Likert scale, with values ranging from 0 (not difficult or does not apply) to 4 (extremely difficult). The global score is computed as the mean item score multiplied by 25, resulting in a global score between 0 and 100. The global score can be computed if a person has answered at least 50% of the questions. Scores in the UK sample were categorized into four treatment burden groups: no treatment burden (score 0) and the tertiles low (< 10), medium (10–22) and high treatment burden (≥ 22).
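The scoring rule described above can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of the MTBQ materials. Two points are assumptions on our part: unanswered items are left out of the mean, and a score of exactly 22 is assigned to the high group, because the published thresholds (10–22 and ≥ 22) overlap at that value.

```python
from typing import Optional, Sequence


def mtbq_global_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the global score (0-100), or None if under 50% of items were answered.

    Each response is an integer from 0 (not difficult / does not apply)
    to 4 (extremely difficult); None marks an unanswered item.
    """
    answered = [r for r in responses if r is not None]
    if not responses or 2 * len(answered) < len(responses):
        return None
    if any(r not in (0, 1, 2, 3, 4) for r in answered):
        raise ValueError("item responses must be integers from 0 to 4")
    # Assumption: the mean is taken over answered items only.
    return 25 * sum(answered) / len(answered)


def mtbq_category(score: float) -> str:
    """Map a global score to the four groups used in the UK sample."""
    if score == 0:
        return "none"
    if score < 10:
        return "low"
    if score < 22:
        return "medium"
    return "high"  # assumption: a score of exactly 22 counts as high


if __name__ == "__main__":
    example = [1, 0, 0, 2, None, 0, 1, 0, 0, 0]  # hypothetical respondent
    score = mtbq_global_score(example)
    print(score, mtbq_category(score))
```

Returning None rather than a partial score keeps respondents with too few answers visibly separate from respondents with no burden.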

A previous psychometric study of the original MTBQ found positively skewed scores with floor effects for all items. Exploratory factor analysis yielded a one-factor solution with a Cronbach’s alpha of 0.83, suggesting high internal consistency. Analysis of construct validity revealed a strong positive association with self-reported disease burden (rs = 0.43, p < 0.001), and moderate negative associations with quality of life (rs = − 0.36, p < 0.001) and self-rated health (rs = − 0.36, p < 0.001). Moreover, higher treatment burden scores were significantly associated with a higher number of comorbidities (rs = 0.31, p < 0.001). Regression analysis showed significant associations of changes in MTBQ scores with changes in measures of health-related quality of life and patient’s assessment of chronic care at nine-month follow-up, suggesting good responsiveness [19].

Translation

Two translators with in-depth knowledge of the target language, health care system and culture (a researcher with experience in test development and a cultural scientist) carried out the forward translation of the full 13-item version of the MTBQ independently. The two versions were then reconciled into one by both translators. This was followed by two independent backward translations by a psychologist and a professional interpreter. All translators speak German as their mother tongue. The translations were reviewed using the checklist by Hambleton and Zenisky [25] in order to reach consensus on a final version. To ensure cross-cultural validity, ambiguities concerning the different healthcare systems were resolved by consulting the translator group and the author of the original questionnaire.

Cognitive interviews and pilot test

Applying verbal probing and think-aloud technique, we conducted semi-structured cognitive interviews. The aim was to assess the underlying cognitive mechanisms in item processing according to Tourangeau et al. [26]: comprehension, information retrieval, judgement and response behaviour. At first, patients were asked to verbalize their thoughts when answering the questions. Following this, a series of probe questions was administered to elicit further information on the cognitive process [27]. Interviews were digitally recorded and transcribed verbatim. Analysis of the cognitive protocol was performed question by question to identify difficulties that could potentially lead to responses that did not reflect the intended meaning of the item. The coding corresponded to the categories outlined above. Cognitive interviews were conducted among six patients aged 65 and over with three or more long-term conditions recruited from two GP practices. In addition, we carried out a pilot test in order to determine feasibility and to identify potential problems with test administration. Five researchers piloted the questionnaire in a sample of seven persons living with (multiple) long-term conditions.

Study setting and data collection

We recruited patients (aged 65 and over) with multimorbidity from GP practices as part of the cross-sectional MULTIqual study on quality of care for older patients with multimorbidity [24, 28]. GP practices in North and South Germany (Hamburg and Heidelberg and surroundings) were randomly selected, stratified by region, and invited to take part in the study. Participating GPs screened their regular practice clientele for the presence of at least three long-term conditions out of a pre-defined list of diagnoses that were found to be associated with high disease burden and lower subjective health status. We excluded patients without sufficient German language skills or ability to give informed consent, patients living in nursing homes and patients in palliative care. Out of 1,243 eligible patients from 35 GP practices, 346 patients agreed to participate (response rate: 27.9%). Following written informed consent, standardized interviews were carried out in their homes or in the GP practice. We collected data on sociodemographic characteristics, health care utilization, course of treatment and medication. Other patient-reported health outcomes were assessed with validated instruments: Medication adherence via MARS-D [29], patient activation via PAM 13-D [30], health-related quality of life and self-rated health via EQ-5D-5L [31], and perceived social support via F-SozU K-14 [32]. Recruitment and data collection took place from April 2019 to March 2020.

Statistical analysis

We used descriptive statistics in SPSS 25.0 to assess item properties. Assuming a congeneric model with varying means and variances of true values, we calculated McDonald’s omega as the reliability coefficient using the MBESS package for R [33], with scores above 0.70 considered sufficient given the length and purpose of the instrument [34]. We conducted an exploratory factor analysis to assess the dimensionality of the questionnaire. The number of underlying factors was determined according to the results of the scree plot, parallel analysis of eigenvalues and Velicer’s minimum average partial test, using the R package EFA.dimensions [35]. Confirmatory factor analysis was computed using the R package lavaan [36]. Goodness of fit was examined using the following indices: robust root mean square error of approximation (RMSEA), with values below 0.05 indicating good model fit and values between 0.05 and 0.08 indicating acceptable model fit; standardized root mean square residual (SRMR), with values below 0.05 indicating good model fit and values between 0.05 and 0.10 indicating acceptable model fit; and comparative fit index (CFI), with values below 0.90 indicating poor model fit [37].
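For readers less familiar with omega, the following Python sketch shows the textbook formula for a one-factor congeneric model: the squared sum of the loadings divided by that quantity plus the summed residual variances. It is an illustration only and not the MBESS routine used in this study. The function name and the example loadings are hypothetical, and the sketch assumes standardized loadings when no residual variances are supplied.

```python
from typing import Optional, Sequence


def mcdonald_omega(
    loadings: Sequence[float],
    residual_variances: Optional[Sequence[float]] = None,
) -> float:
    """Omega for a one-factor congeneric model with unit factor variance.

    omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances)
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    if residual_variances is None:
        # Assumption: standardized loadings, so each residual variance is 1 - loading^2.
        residual_variances = [1 - l * l for l in loadings]
    if len(residual_variances) != len(loadings):
        raise ValueError("loadings and residual variances must have equal length")
    common = sum(loadings) ** 2
    return common / (common + sum(residual_variances))


if __name__ == "__main__":
    hypothetical_loadings = [0.55, 0.40, 0.62, 0.35, 0.48]  # not study data
    print(round(mcdonald_omega(hypothetical_loadings), 3))
```

Unlike Cronbach’s alpha, this coefficient does not require equal loadings across items, which is why it suits the congeneric assumption stated above.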

In examining construct validity, we used bivariate correlation analyses to assess the relationship between MTBQ scores and related measures. We hypothesized a positive correlation with number of long-term conditions and negative correlations with medication adherence, patient activation, health-related quality of life and self-rated health, as previously established for several self-report treatment burden questionnaires [6, 18, 19, 38]. Furthermore, we expected lower levels of social support to be associated with greater treatment burden. Unfortunately, there was no comparator scale available to assess concurrent validity. Mann–Whitney U tests were used to analyse the associations between participant characteristics and treatment burden scores. We expected higher treatment burden scores for female respondents and respondents with mental health conditions, as similar results were previously reported for the UK version [19].
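The bivariate analyses rely on Spearman’s rank correlation (reported as rs). As a minimal sketch of that technique, the Python code below ranks both variables, assigning average ranks to ties, and then computes the Pearson correlation of the ranks. The function names are hypothetical and the example values are made up for illustration. The study analyses themselves were run in statistical software, not with this code.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    burden = [0.0, 4.5, 9.1, 22.7, 13.6]   # hypothetical global scores
    conditions = [4, 7, 9, 15, 9]          # hypothetical condition counts
    print(round(spearman_rho(burden, conditions), 3))
```

Average ranks matter here because questionnaire scores with floor effects contain many tied values.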

Results

Translation, cognitive interviews and pilot test

Discrepancies between forward and backward translation as well as the final review highlighted problems in the adaptation of optional item 10: “Getting help from community services”. Due to the different organization of the health care systems, there is no analogous service structure in Germany. Although an item was created to determine treatment burden pertaining to services such as physiotherapy and mobile nursing service, cognitive interviews revealed that respondents already included these services under item 6: “Arranging appointments with health professionals”. Accordingly, the adaptation led to ambiguities and queries and therefore, we decided to exclude item 10. All other items were found to be applicable to the German health care system.

Furthermore, respondents reported that it was difficult to distinguish between the response categories “does not apply” and “not difficult”, as the German translations are commonly used for similar purposes. For this reason, we added an instruction to further distinguish between both options: “It is also possible that some of the aspects do not play a role in your care and do not apply to your situation”. Cognitive interviews showed that this addition was sufficient to enable a clearer distinction between the two responses. Interviewed patients listed a wide range of health care providers for answering questions 6 to 8, indicating a comprehensive understanding of these items. Likewise, a number of monitoring behaviours were specified for item 5. Respondents did not describe any further aspects of treatment burden not covered by the questionnaire, so that we assume no limitations to the face validity of the German adaptation. The results of the cognitive interviews are presented in more detail in Additional file 1. In the pilot test, it took about 4 min to answer the questionnaire, indicating little burden for respondents. There were no suggestions for any further changes.

Statistical analysis

Out of the initial sample of 346 patients who participated in the main study, 344 patients responded to at least half of the MTBQ questions, which allowed inclusion in the statistical analysis. Table 1 illustrates the characteristics of the participants. The mean age was 77.5 years with a range from 65 to 97. The sample included slightly more female (55.2%) than male participants. The majority (56.4%) had less than ten years of school education which equates to CASMIN (Comparative Analysis of Social Mobility in Industrial Nations classification of education) grade 1 [39]. On average, participants reported 9.8 long-term conditions.

Table 1 Characteristics of the study sample

Item properties

Proportion of missing data was less than 1% for each item (see Table 2). We observed floor effects for all items, with positively skewed distributions (Skewness = 2.02, Kurtosis = 4.72). Following the instruction of the original questionnaire, all items with more than 40% “does not apply” responses were excluded from the statistical analysis, which led to the removal of item 9. However, in contrast to the original questionnaire, we retained item 3, as this applied to most respondents. The mean score was multiplied by 25 to obtain the global MTBQ score. Descriptive statistics are provided in Table 3. For the purpose of comparison, we retained the original thresholds for categorizing the global score into four levels of treatment burden as applied to the UK, Chinese, and Danish versions. As a result, 25.6% of the study population showed no treatment burden, 39.0% low treatment burden, 28.2% medium treatment burden, and 7.3% high treatment burden (see Table 1). Mann–Whitney U tests showed that global MTBQ scores were significantly higher in female (Mdn1 = 6.82) than in male participants (Mdn2 = 4.55; U = 11,729, p = 0.001) and higher for persons with anxiety or depression (Mdn1 = 9.10, Mdn2 = 4.55; U = 3172, p = 0.024).
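The item exclusion rule (more than 40% “does not apply” responses) can be sketched in a few lines of Python. The function name, the response coding and the example data are hypothetical. The sketch also assumes that the proportion is calculated among respondents who answered the item, which the questionnaire instructions may define differently.

```python
from typing import Dict, List, Optional, Sequence, Union

DOES_NOT_APPLY = "does not apply"
Response = Optional[Union[int, str]]  # 0-4, DOES_NOT_APPLY, or None if missing


def items_to_exclude(
    responses_by_item: Dict[str, Sequence[Response]],
    threshold: float = 0.40,
) -> List[str]:
    """Return the items whose share of 'does not apply' answers exceeds the threshold."""
    excluded = []
    for item, responses in responses_by_item.items():
        # Assumption: missing answers are left out of the denominator.
        answered = [r for r in responses if r is not None]
        if not answered:
            continue
        share = sum(1 for r in answered if r == DOES_NOT_APPLY) / len(answered)
        if share > threshold:
            excluded.append(item)
    return excluded


if __name__ == "__main__":
    hypothetical = {
        "item_a": [DOES_NOT_APPLY] * 5 + [0] * 5,   # 50% does not apply
        "item_b": [DOES_NOT_APPLY] * 4 + [1] * 6,   # exactly 40%
        "item_c": [DOES_NOT_APPLY, 2, 3, None],     # about 33% of answered
    }
    print(items_to_exclude(hypothetical))
```

The comparison is strict, so an item at exactly 40% is retained. Applying the rule requires “does not apply” to be recorded separately from “not difficult”, even though both contribute a value of 0 to the score.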

Table 2 Distribution of responses (N = 344)
Table 3 Descriptive statistics of health measures and correlation to global MTBQ scores

Dimensionality and reliability

The Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy (0.755) indicated that the data were suitable for exploratory factor analysis. Although the Kaiser–Guttman criterion determined a four-factor solution, the scree plot (see Additional file 2) and both parallel analysis of eigenvalues and Velicer’s Minimum Average Partial test (MAP) suggested one common factor, which explained 21.20% of total variance. Model indices for the confirmatory factor analysis showed acceptable to slightly less than good model fit: χ2 = 86.917 (44), p < 0.001, Robust CFI = 0.845, RMSEA = 0.073, SRMR = 0.072. Internal reliability was satisfactory with ωt = 0.71.
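To make the reading of these indices explicit, the Python sketch below applies the cut-offs named in the Statistical analysis section to the reported values. The function names are hypothetical. Labelling RMSEA above 0.08 or SRMR above 0.10 as "poor", and treating the interval bounds as inclusive, are our assumptions, since the stated criteria only define the good and acceptable ranges. For the CFI only the "poor" range is defined, so other values are returned as "not poor".

```python
def classify_rmsea(value: float) -> str:
    """Below 0.05 good; 0.05 to 0.08 acceptable; above that 'poor' (assumed label)."""
    if value < 0.05:
        return "good"
    if value <= 0.08:
        return "acceptable"
    return "poor"


def classify_srmr(value: float) -> str:
    """Below 0.05 good; 0.05 to 0.10 acceptable; above that 'poor' (assumed label)."""
    if value < 0.05:
        return "good"
    if value <= 0.10:
        return "acceptable"
    return "poor"


def classify_cfi(value: float) -> str:
    """Below 0.90 poor; the stated criteria define no further categories."""
    return "poor" if value < 0.90 else "not poor"


if __name__ == "__main__":
    # Values as reported for the confirmatory factor analysis.
    print("RMSEA:", classify_rmsea(0.073))
    print("SRMR:", classify_srmr(0.072))
    print("CFI:", classify_cfi(0.845))
```

Under these cut-offs, RMSEA and SRMR fall in the acceptable range, while the CFI of 0.845 lies below 0.90.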

Construct validity

We found a positive association between the global MTBQ score and patient-reported number of comorbidities (rs = 0.409, p < 0.001). There was an association between greater burden and lower health-related quality of life (rs = − 0.351, p < 0.001) and self-rated health (rs = − 0.335, p < 0.001). We found negative associations for social support (rs = − 0.308, p < 0.001), patient activation (rs = − 0.457, p < 0.001) and medication adherence (rs = − 0.222, p < 0.001) as well. Our results (see Table 3) support all our hypotheses on construct validity, with coefficients comparable to the results shown for the UK version of the MTBQ.

Discussion

In this study, we developed a German tool to measure treatment burden in patients with multimorbidity. We used a thorough and well-established methodology to translate and adapt a German version of the MTBQ. The questionnaire contains 11 core items and one optional item that was not applicable to our study population but may be relevant for other settings. Cognitive interviewing and piloting led to minor adaptations and demonstrated overall good content validity. Statistical analysis was performed within a cross-sectional design. Validity analysis yielded significant associations with related constructs as hypothesized, similar to the results reported for other instruments assessing treatment burden [19, 40]. While factor analysis indicated a single-factor solution, the proportion of explained variance and model fit were only satisfactory. The UK version was found to be unidimensional, whereas other psychometric studies suggested alternative three-factor solutions for the Chinese and Danish versions [21, 22]. Due to these differences, we assume that the underlying construct is a formative model rather than a reflective model and therefore dimensionality is less relevant for this instrument [41]. The difference between formative and reflective models in psychometric validation mainly relates to the internal structure of the patient-reported outcome measure. In a reflective model, one latent variable manifests itself in all indicators, i.e., in the items of a (sub-) scale, so that the items are expected to be positively correlated. In a formative model, on the other hand, the measured items constitute the latent variable, with no presumption of inter-item correlation [42]. Regarding the construct of treatment burden, our empirical results support theoretical assumptions: The manifestation of treatment burden is influenced by relatively stable individual factors such as diseases, resources and coping strategies, while it varies as a function of the workload resulting from treatment regimens and also the organisational requirements of health care systems. While this finding suggests limited utility of factor analysis, it also underscores the importance of cross-cultural adaptation and qualitative pretesting. The distinct floor effects of the scale are similarly reported for other instruments on treatment burden [16, 43], in some cases even to a greater extent [18]. Nevertheless, in patients with low burden, the responsiveness to improvement may be limited.

In our sample, patients scored lower on average for treatment burden than in the UK sample of Duncan et al. [19]. These differences could be explained by the dense medical care structure in the regions of our study centres and the older age of our participants. Previous studies found that treatment burden tends to be greater in younger patients [19, 40, 44]. One possible explanation may be that younger people not only experience stronger life demands and requirements regarding their social roles but also have different expectations of their general functioning. The longer patients live with long-term conditions, the more likely they are to adjust to everyday treatment requirements, such as taking medication regularly or monitoring symptoms.

We were surprised to see that item 3 on financial burden was relevant to most of our participants, in contrast to the original UK sample. In outpatient care in Germany, out-of-pocket costs of statutory health insurance only apply to co-payments for medication, supportive therapies and medical aids. For patients with long-term conditions, there is a burden limit of 1% of their income. Even so, not all patients use this exemption or are even aware of this possibility. On average, the highest scores were obtained for item 12 (“Making recommended lifestyle changes”) and item 13 (“Having to rely on help from family and friends”). These findings corroborate the relevance of intervention studies to target these aspects of care.

One major advantage of our study is that the sample represents the most prominent target group of multimorbidity research. On the other hand, the sample restrictions might limit the generalizability of these findings to younger patients and patients with fewer comorbidities. Moreover, it should be noted that our data collection may be subject to self-selection bias: those who already perceive a greater burden and feel overwhelmed might be less likely to respond to a study invitation. Due to the cross-sectional design, we were not able to establish the predictive value of the German MTBQ or to explore the possible effect of overburden on non-adherence. One psychometric property we did not examine was test–retest reliability, which should be addressed in the future. Despite our promising results, additional research is required to determine the responsiveness of the German MTBQ in longitudinal designs and the clinical utility of this tool in primary care settings. Future studies will be needed to confirm and advance the threshold values for categorizing MTBQ scores into four levels of treatment burden, ideally based on a clinical anchor [21]. Further work is required to shed light on the impact of life demands, social roles, coping behaviour and resources on treatment burden to allow a better understanding of our findings.

Conclusions

The German MTBQ is a brief patient-reported outcome measure to examine perceived treatment burden in patients with multimorbidity. It demonstrated good psychometric properties and is the first valid instrument to assess treatment burden in patients with multiple long-term conditions in Germany. Since cross-culturally adapted and validated versions of the MTBQ are available in several languages, it could also be used in international research studies. With its short administration time, the MTBQ may be suitable to detect people experiencing high burden in clinical practice, although its use as a clinical tool has not yet been validated. Implementing the MTBQ as a clinical tool has the potential to inform shared decision-making and elicit areas where patients need additional support.