Figure 1 (CONSORT diagram) illustrates that 715 patients were screened, and 107 eligible patients were randomised between April 2013 and November 2013 to one of the two trial treatment arms (intervention group (RDSI) n = 53 and control group n = 54). Of the ineligible patients (n = 608), 29 % reported either not experiencing two or more symptoms or not being bothered by at least two symptoms. The average age of the patients was 67.7 years (SD 9.6 years); 54 (53 %) were female and most (60 %) reported all three symptoms at baseline. The two groups differed at baseline in COPD prevalence (control group 50 %, RDSI group 28 %) and in Lung Cancer Symptom Scale scores, with higher mean scores in the control group (421) than in the RDSI group (352) (Table 1).
Of the 107 randomised patients, four were found to not meet the inclusion criteria (RDSI = 2 and control = 2) and were removed from data analysis. A further two did not provide any data at baseline; therefore, they were removed from the analysis since they had not strictly followed protocol. The total sample analysed was 101 patients (RDSI n = 50 and control n = 51).
By week 4, 29/101 (29 %) patients had withdrawn, and by week 12, a further two patients in the control group were lost to follow-up. No patients in the RDSI group were lost between week 4 and week 12. Total attrition at 12 weeks was 30/101 (29.7 %). Since there was very little attrition after week 4, we conducted sample size analyses based on observations at week 12.
At week 12, there were more drop-outs in the treatment group (38 %) than the control group (24 %); however, this was not statistically significant (χ² = 2.5, df = 1, p = 0.1). Logistic regression for drop-out status using age, gender, treatment type and WHO score revealed no significant items. Individually, only the WHO score was borderline significantly associated with drop-out (χ² = 4.0, df = 1, p = 0.044). GEE models with an unstructured correlation parameterisation were used, because of the differently spaced time intervals for assessments.
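The comparison of drop-out rates between arms can be illustrated with a Pearson chi-square test on a 2 × 2 table. The following is a minimal sketch, not the authors' analysis code. The cell counts (19 of 50 in the RDSI group, 12 of 51 in the control group) are hypothetical reconstructions from the reported percentages and arm sizes and may not match the trial's actual counts. The function name `chi_square_2x2` is likewise illustrative, and no continuity correction is assumed.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With df = 1, the upper-tail probability equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts reconstructed from the reported percentages (assumption):
# RDSI: 19 dropped out, 31 retained; control: 12 dropped out, 39 retained.
stat, p_value = chi_square_2x2(19, 31, 12, 39)
print(f"chi-square = {stat:.2f}, df = 1, p = {p_value:.2f}")
```

Under these assumed counts, the statistic is close to the reported value of 2.5 and the p-value is above 0.05, in line with the non-significant result described above.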
The six NRS breathlessness scales combined scored the lowest for ease of use (6.5/10) and relevance (6.8/10), and the EQ-5D-3L was the best for ease of use (7.8/10) and relevance (7.8/10) (Table 2).
Respiratory symptom cluster outcomes
Results for the respiratory symptom cluster outcomes (breathlessness, cough and fatigue measures) are presented as change in mean scores from baseline to 4 and 12 weeks (Table 3). Figure 2 shows the variability of the scores for each symptom outcome measure at baseline, week 4 and week 12, with a trend towards some improvement in the RDSI group. Dyspnoea-12 was chosen as the primary outcome for breathlessness for the follow-on trial because it was significant in the GEE model at week 12 for the RDSI group (see below), and patients rated it higher (scale of 0–10, higher is better) than the set of NRS breathlessness scales for ease of completion (6.7 versus 6.5) and relevance (7.1 versus 6.8) (Table 2).
Non-respiratory symptom cluster outcomes
The mean change from baseline to week 4 and baseline to week 12 for the Lung Cancer Symptom Scale, HADS and EQ-5D-3L is shown in Table 3, again with a positive pattern in terms of improvement in the RDSI group.
The GEE models included the WHO score as a covariate due to its potential relationship with drop-outs, although it was usually non-significant. In a number of models (consistent with Fig. 2), time was a significant term for some endpoints (NRS ‘worst breathlessness’ and NRS ‘distress from breathlessness’, Dyspnoea-12 and MCLC), and group was significant for others (Dyspnoea-12, EQ-5D-3L, Lung Cancer Symptom Scale) due to imbalances at baseline. However, when the differential effect of group was considered (i.e. the group × time interaction term), only EQ-5D-3L was significant (p = 0.042), owing to a change at week 12 (marginal mean difference at week 12 of −0.17 (SD 0.06, 95 % CI [−0.30, −0.04]), p = 0.009). The group differences at week 12 were also estimated, and only Dyspnoea-12 had a significant result (marginal mean 5.19, SD 2.33, 95 % CI [0.62, 9.75], Wald χ² = 1.27, df = 2, p = 0.026), where the reduction (improvement) was greatest in the RDSI group (Table 4).
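As an illustration of how a Wald-type interval and p-value follow from a point estimate and its standard error, the sketch below uses the Dyspnoea-12 group difference at week 12 (5.19). It assumes that the accompanying value of 2.33 acts as a standard error and that a normal approximation applies. Both are assumptions, and this is not the authors' GEE software output. The function name `wald_summary` is hypothetical.

```python
import math


def wald_summary(estimate, se, z_crit=1.959964):
    """Return the 95 % Wald interval, Wald chi-square (df = 1) and two-sided p-value."""
    lower = estimate - z_crit * se
    upper = estimate + z_crit * se
    z = estimate / se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return lower, upper, z * z, p_value


# Assumption: 2.33 is treated as the standard error of the 5.19 estimate.
lower, upper, wald_chi2, p_value = wald_summary(5.19, 2.33)
print(f"95 % CI [{lower:.2f}, {upper:.2f}], p = {p_value:.3f}")
```

Under these assumptions, the interval and p-value come out close to the reported 95 % CI and p = 0.026.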
Sample size calculations for follow-on trial
Attrition in the feasibility trial was 29.7 % and was highest in the palliative care treatment group. Of the other breathlessness questionnaires used in the feasibility study, the NRS worst breathlessness scale, which has an MCID of 1 unit [20,33,35], correlated most strongly with Dyspnoea-12 (r = 0.416). Using an anchor-based regression analysis, a 1 U change in worst breathlessness was equivalent to a 1.22 U change in Dyspnoea-12. Using the same anchor, an ROC analysis with the cut-point chosen to maximise Youden’s J gave a Dyspnoea-12 MCID of 2.5 U. A distribution-based analysis using a baseline effect size of 0.5 produced a Dyspnoea-12 MCID of 4.7. Based on these analyses, we took 3 U as a conservative estimate of the Dyspnoea-12 MCID for the sample size calculation. Since three different variables are used as co-primary endpoints, each part of the sample size calculation uses an alpha of 5 %/3 to maintain an overall significance level of 5 %. To detect a difference of 3 U in Dyspnoea-12 with 80 % power at 5 %/3 significance, a sample size of 97 patients per arm is required (one-sided t test at week 12, adjusted for the large correlation with baseline values). Allowing for 20 % attrition at week 12, 122 patients per arm should be recruited.
Using data from the sub-group of patients with ‘bothersome’ cough (66 % of the total sample), the MCLC had a mean difference between the two treatment groups of 3.0 U. A distribution-based analysis using an effect size of 0.5 produced a MCLC MCID of 4.4. Based on these analyses, a MCLC MCID estimate of 3 U was used in the sample size calculation. To detect a difference of 3 U in MCLC, a sample size of 81 patients per arm at week 12 is required; allowing for attrition at week 12, 102 per arm should be recruited.
For fatigue, the FACIT-F was used, which has a published MCID of 3 to 4 U [36,38]. The FACIT-F had a mean difference between the two treatment groups of 4.8 U. Therefore, a FACIT-F MCID of 4 U was used in the sample size calculation. To detect a difference of 4 U in FACIT-F, a sample size of 80 patients per arm at week 12 is required; allowing for attrition at week 12, 100 per arm should be recruited. Therefore, recruiting to the highest number required (i.e. 122 patients per arm) will provide 80 % power to detect meaningful differences in all three symptoms in the cluster, with an overall type I error rate of 5 %.
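The recruitment targets above follow from two simple steps: splitting the 5 % significance level across the three co-primary endpoints, and inflating each evaluable sample size for expected attrition. The sketch below reproduces that arithmetic only, not the underlying power calculation. It assumes the 20 % attrition stated for Dyspnoea-12 also applies to MCLC and FACIT-F, and the function name `inflate_for_attrition` is illustrative.

```python
def inflate_for_attrition(n_evaluable, attrition_pct):
    """Smallest number to recruit so n_evaluable remain after attrition_pct % are lost."""
    retained_pct = 100 - attrition_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-n_evaluable * 100 // retained_pct)


# Alpha split equally across three co-primary endpoints (5 %/3).
alpha_per_endpoint = 0.05 / 3

# Evaluable patients per arm at week 12, as reported for each endpoint.
evaluable_per_arm = {"Dyspnoea-12": 97, "MCLC": 81, "FACIT-F": 80}

# Assumption: 20 % attrition applies to every endpoint.
recruit_per_arm = {
    name: inflate_for_attrition(n, 20) for name, n in evaluable_per_arm.items()
}
target_per_arm = max(recruit_per_arm.values())

print(recruit_per_arm, target_per_arm, round(alpha_per_endpoint, 4))
```

Taking the maximum across endpoints gives the per-arm target that covers all three symptoms in the cluster.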
Adherence to and perceptions of usefulness of the intervention components
Between 19 and 32 patients completed the daily and weekly diaries at any time point. Participants reported high adherence to the intervention. Daily practice of breathing exercises during the first 4 weeks ranged from 87 to 100 %, and weekly practice during the remaining 8 weeks ranged from 96 to 100 %. Acupressure practice during the first 4 weeks ranged from 84 to 100 %, and weekly practice during the remaining 8 weeks from 91 to 96 %. Daily practice of cough easing techniques during the first 4 weeks ranged from 32 to 63 %, and weekly practice during the remaining 8 weeks from 36 to 54 %. The lower use of cough easing techniques is likely related to the smaller number of participants in the intervention group who were bothered by cough (35/50).
The majority of participants reported that the breathing techniques were at least ‘a little bit’ useful over the 12 weeks; there were only seven ‘not at all’ responses (five on day 1 and one each during weeks 8 and 12). For cough easing techniques, there were six ‘not at all’ responses, all during the first 4 weeks post-intervention. Eighteen ‘not at all’ responses were reported for the usefulness of acupressure; 16 were within the first 4 weeks.