Study design
The study design has already been described in detail in our study protocol [37], and the following description is based on it. The StudyKuS began in 2016 as a multicentre, randomised, controlled study. It was conducted in three different regions across Germany: a) Erlangen/Nuremberg/Fuerth (metropolitan region), b) Weyarn (rural region), and c) Berlin (capital region). Within one region, the three interventions offered in the study (BPT, CBT or EP) took place during the same time period and were conducted in consecutive waves, with four waves each in the Erlangen/Nuremberg/Fuerth region and the Weyarn region and two waves in the Berlin region. Participants within one region and one wave were randomly allocated to one of three groups (BPT, CBT, or EP). Subsequent to the therapy period, participants who had been assigned to the EP were offered the opportunity to participate in an additional ten-week bouldering group, which followed the same treatment plan as the BPT group and took place directly after the therapy period (see Fig. 1). Data were collected via computer-assisted telephone interviews (CATIs) before the start of the therapy (t0), at the end of the therapy (t1), and 3 months (t2), 6 months (t3), and 12 months (t4) after the therapy period (Follow-Up). For details of the data collection, please see our study protocol [37]. Interviewers conducting the CATIs were blinded with respect to participants’ allocations. Participation was voluntary, and participants were free to leave the study at any time. All procedures were approved by the Ethics Committee of the Friedrich-Alexander-Universität Erlangen-Nürnberg (Ref. 360_16 B).
Interventions
Bouldering psychotherapy (BPT)
Our newly developed bouldering psychotherapy is a combination of bouldering and psychotherapy. The programme consists of ten consecutive sessions of 2 hours. In the current study, it took place in groups of about ten participants in a bouldering gym once a week in the late afternoon. In each study centre, therapeutic teams consisted of two therapists, but the personnel composition varied across the different waves of therapy because the therapists sometimes had other commitments (a total of nine climbing therapists). For the qualifications of the therapists, see the study protocol [37]. Each of the ten BPT sessions focussed on a specific psychological topic that was considered to be relevant in the development and maintenance of depression. Table 1 shows an overview of the specific subjects covered in the ten therapeutic sessions [see also 37]. Each session was based on the BPT manual and followed a standardised procedure (introduction, action phase, closure), comprising mindfulness exercises, psychoeducational elements, topic-related bouldering exercises under therapeutic supervision, exchange of individual experiences between participants and transfer to daily life, body-related relaxation exercises, and free bouldering. The purpose of the bouldering exercises was to evoke underlying emotions (e.g. anxiety), unveil patients’ characteristic patterns (e.g. avoidance), and enable patients to engage in new experiences (e.g. exposure: bouldering blindfolded). For a detailed description of the treatment, see the study protocol [37].
Table 1 Overview of sessions of the BPT [37]
Exercise programme (EP)
The home-based exercise programme was designed to target the same muscles that are used in bouldering. It consisted of a 20-min physical training programme that participants conducted on their own at home, using a training DVD and/or a training manual (with instructions and explanations for all of the exercises). Additionally, participants received training material (e.g. a multifunctional latex band and training rings to enhance finger and arm power) as well as psychoeducational material explaining the positive effect of physical exercise on mood. Participants were instructed to perform the exercises about three times a week for ten weeks (resulting in 60 min per week, comparable to the ‘active time’ in the BPT group). At regular intervals, they received reminders and motivational material to keep on exercising. Depending on their personal preferences, participants were contacted via e-mail, telephone, or postal mail. As opposed to other Internet-based offers of therapy (e.g. by health insurance companies), people without Internet access or an e-mail address were also able to participate in the EP group. In addition, participants were provided with an exercise diary that encouraged them to record their training sessions and subsequently rate their mood. After the intervention period, the total rate of exercise was assessed via self-report in a personal interview with an external rater.
Cognitive behavioural therapy (CBT)
The third intervention that was applied was based on a classical cognitive behavioural group therapy but will not be described in detail because it is not relevant to the hypotheses we tested in this article. For a detailed description of the treatment, see the study protocol [37] and other publications by the work group.
Recruitment and randomisation
Between January 2017 and March 2018, participants were recruited through the distribution of informational materials (e.g. flyers, posters) in relevant institutions (e.g. psychiatric hospitals, psychotherapists’ offices, primary care physicians’ offices, pharmacies, support groups). Additionally, presentations were held at local events, and press releases were issued to different local newspapers and radio stations prior to the start of each intervention wave. A homepage (www.studieKuS.de) and a Facebook account were created and regularly updated. The study personnel held informational sessions in which all people interested in the study were informed about the conditions of participation (e.g. randomisation). Those willing to participate were asked to fill out a short screening questionnaire and to provide written informed consent. All individuals fulfilling the inclusion criteria were informed that they had been included in the study and were subsequently randomised blockwise within one region and one wave to one of the three groups (BPT, CBT, EP). Randomisation was stratified by sex and severity of depression (PHQ-9 scores 8–14 mild, 15–19 moderate, 20–27 severe depression). Randomisation was performed by the Institute of Medical Informatics, Biometrics, and Epidemiology (IMBE) at the Friedrich-Alexander-Universität Erlangen-Nürnberg with a computer-based system and was based on participants’ codes without the statistician’s knowledge of assignment to the intervention arms.
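The stratified blockwise allocation described above can be illustrated with a minimal Python sketch. It is not the system used by the IMBE: the block size of three, the fixed seed, and the names `phq9_stratum` and `randomise` are assumptions made for illustration only. Only the strata (sex and the three PHQ-9 severity bands) and the three arms are taken from the text.

```python
import random
from collections import defaultdict

ARMS = ("BPT", "CBT", "EP")

def phq9_stratum(score):
    """Map a PHQ-9 score to the severity band used for stratification."""
    if 8 <= score <= 14:
        return "mild"
    if 15 <= score <= 19:
        return "moderate"
    if 20 <= score <= 27:
        return "severe"
    raise ValueError("PHQ-9 score outside the stratification range 8-27")

def randomise(participants, seed=0):
    """Allocate participants of one region and one wave to the three arms.

    participants: list of (code, sex, phq9) tuples.
    Returns a dict mapping participant code to arm.
    Assumption: permuted blocks of size 3 within each sex-by-severity stratum.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for code, sex, phq9 in participants:
        strata[(sex, phq9_stratum(phq9))].append(code)

    allocation = {}
    for key in sorted(strata):
        codes = strata[key]
        for start in range(0, len(codes), len(ARMS)):
            block = list(ARMS)
            rng.shuffle(block)  # random order of the arms within this block
            for code, arm in zip(codes[start:start + len(ARMS)], block):
                allocation[code] = arm
    return allocation
```

With this scheme, every complete block within a stratum contributes exactly one participant to each arm, which keeps the arms balanced with respect to sex and baseline severity.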
Inclusion and exclusion criteria
Eligibility was determined through the screening questionnaire handed out at the information sessions or upon request. Only a few inclusion and exclusion criteria were applied in order to increase the external validity of the study. Potential participants were personally interviewed by the study personnel if the fulfilment of the inclusion or exclusion criteria was unclear. Inclusion criteria were acute depressive symptoms, informed consent to participate in the study (especially approval for randomised allocation and data collection), and the ability to get to the therapy locations. The presence (or absence) of depression was operationalised by a PHQ-9 score of at least eight points, ensuring a high level of sensitivity for all depressive disorders [38]. Exclusion criteria were an age under 18 years, a body mass index (BMI) under 17.5 or over 40, concurrent participation in another psychotherapeutic group therapy, initiation of psychiatric medication or psychotherapy within the last 8 weeks, a planned inpatient stay during the intervention period, physical contraindications for bouldering (physical disorders or pregnancy), specific psychiatric disorders (psychosis within the last 5 years, a manic episode within the last 5 years, substance addiction with substance abuse within the last year, borderline personality disorder diagnosis with self-harming behaviour during the last year), and acute suicidality [see also 37]. All participants were obliged to sign an anti-suicide contract for the duration of the study. After randomisation, participants were informed about their allocation and provided with all the necessary information about group participation.
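As an illustration, the criteria above can be expressed as a simple screening check. This is a sketch under stated assumptions, not the screening procedure used in the study: the `Screening` record and its field names are hypothetical, and all clinical exclusion criteria other than age and BMI are collapsed into a single flag (`other_exclusion`) that would be set by the study personnel.

```python
from dataclasses import dataclass

@dataclass
class Screening:
    """Hypothetical screening record; field names are illustrative."""
    age: int
    bmi: float
    phq9: int
    consent: bool
    can_reach_site: bool
    # Any other exclusion criterion from the text (e.g. concurrent group
    # therapy, recent start of medication, acute suicidality).
    other_exclusion: bool = False

def is_eligible(s: Screening) -> bool:
    """Apply the inclusion and exclusion thresholds named in the text."""
    included = s.phq9 >= 8 and s.consent and s.can_reach_site
    excluded = s.age < 18 or s.bmi < 17.5 or s.bmi > 40 or s.other_exclusion
    return included and not excluded
```

Note that the boundary values themselves (an age of 18, a BMI of exactly 17.5 or 40, a PHQ-9 score of 8) are treated as eligible, following the wording "under", "over", and "at least".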
Instruments
Primary outcome measure
Montgomery–Åsberg Depression Rating Scale (MADRS) [39]. The MADRS is one of the most commonly used rating scales for assessing core symptoms of depression [40]. It consists of ten items, which are rated by a clinician in a semi-structured interview on a seven-point scale with higher scores indicating greater severity of depression (ten or below: remission, greater than 31: severe depression) [41]. In our study, the structured interview guide for the Montgomery–Åsberg Depression Rating Scale (SIGMA) [40] was used, which offers a selection of different questions for each item.
Secondary outcome measures
Subscale interpersonal sensitivity of the Symptom-Checklist (SCL-90) [42, 43]. The SCL-90 is a self-report inventory which measures among other variables the intensity of distress experienced during the past 7 days caused by interpersonal sensitivity. Ratings on the five-point Likert-type scale were summed, and standardised t-values were computed, with higher scores indicating an increasing severity of symptoms.
Generalised Anxiety Disorder 7 (GAD-7) [44, 45]. The GAD-7 is a brief self-report questionnaire that asks patients how often they have felt bothered during the last two weeks by each of the seven core symptoms of generalised anxiety disorder. Items are rated on a four-point scale, with higher sum scores indicating higher anxiety (≥ 5 mild, ≥ 10 moderate, and ≥ 15 severe anxiety).
Subscale vital body dynamics of the Body Image Questionnaire (Fragebogen zum Körperbild, FKB-20) [46]. The FKB-20 assesses body image disturbances and subjective aspects of body experience. The vital body dynamics subscale consists of ten items rated on a five-point scale with higher sum scores indicating a more positive body image.
Subscale coping of the Questionnaire on Resources and Self-Management Skills (Fragebogen zur Erfassung von Ressourcen und Selbstmanagementfähigkeiten, FERUS) [47]. The FERUS assesses an individual’s health-related resources and manageability, among others on the subscale coping [48]. Twelve items are rated on a five-point Likert scale. Ratings were summed with higher test scores indicating better resources and manageability skills.
Rosenberg Self-Esteem Scale (R-SES) [49]. The R-SES is a self-report instrument for evaluating global self-worth. It consists of ten items answered on a four-point scale with higher values indicating higher self-esteem.
As an additional measure for depression the 9-Item Patient Health Questionnaire (PHQ-9) [44, 50] was applied. The PHQ-9 is a short self-assessment tool which is often used for the screening of depression in primary care settings [51]. Its nine items cover the nine DSM-IV criteria and are rated on a four-point scale. The total sum score suggests varying levels of depression (0–4 minimal depression, 5–9 mild depression, 10–14 moderate depression, 15–19 moderately severe depression, 20–27 severe depression) [38, 50].
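The PHQ-9 scoring described above can be sketched as follows. The severity bands are those given in the text; the item coding of 0 to 3 follows the standard PHQ-9 scoring, and the function names are chosen for illustration only.

```python
def phq9_total(items):
    """Sum the nine PHQ-9 items, each coded 0-3."""
    if len(items) != 9 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("expected nine items coded 0-3")
    return sum(items)

def phq9_severity(total):
    """Map a PHQ-9 sum score (0-27) to the severity bands listed in the text."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 sum score must lie between 0 and 27")
    if total <= 4:
        return "minimal"
    if total <= 9:
        return "mild"
    if total <= 14:
        return "moderate"
    if total <= 19:
        return "moderately severe"
    return "severe"
```

These bands differ from the three strata used for randomisation (8–14, 15–19, 20–27), which were defined separately for the purpose of stratification.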
For all of the psychometric measures, change scores were computed as the difference between t1 and t0.
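A minimal sketch of this computation is given below; the function name and the list-based data layout are assumptions for illustration. For scales on which higher scores indicate greater severity (e.g. the MADRS), a negative change score indicates improvement.

```python
def change_scores(t0, t1):
    """Return t1 - t0 for each participant (lists in the same participant order)."""
    if len(t0) != len(t1):
        raise ValueError("t0 and t1 must contain the same participants")
    return [post - pre for pre, post in zip(t0, t1)]
```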
Other outcome measures
Other variables that were assessed either through the screening questionnaires or via the CATIs included, among others, sociodemographic data (e.g. age, gender, and level of education), body mass index (BMI), current therapeutic treatment (antidepressant medication, additional psychotherapy), psychiatric comorbidities, medical history of depression, critical life events, physical limitations, and attitude towards physical activity (positive or negative). The interview guide regarding those variables is presented in Additional file 1.
For a comprehensive depiction of all the measures we assessed, see the study protocol [37].
Statistical analysis
To increase the power of the statistical procedures, all participants in each group (BPT and EP) were combined across the three study centres and two/four waves for the main analyses. Descriptive statistics (frequencies, means, and standard deviations) were computed to illustrate sample and baseline characteristics. To assess the quality of the randomisation, differences between the EP and BPT groups in the variables of interest were evaluated via two-sample t-tests, U-tests, chi-square (χ2) tests, and univariate analyses of variance (ANOVAs). The underlying assumptions of the parametric tests were checked beforehand using the Kolmogorov-Smirnov test and Levene’s test. Baseline variables that were significantly different between the two groups were included as confounders in the multiple regression model. All data were checked for plausibility. A missing data evaluation was carried out, and missing values were imputed using the expectation maximisation (EM) algorithm. The primary data analysis strategy was ‘per protocol’ (PP). Participants who dropped out during the intervention period were subsequently interviewed and included in the ‘intention to treat’ (ITT) analyses. Dropout analyses were computed to check for differences between participants who dropped out and those who completed the study, using χ2 tests, U-tests and two-sample t-tests.
To check for pre-post changes within the groups in the main outcome criterion (i.e. depressive symptoms assessed with the MADRS), paired t-tests were computed to compare the changes between t0 and t1 in both the BPT and EP groups. To compare the improvements between the two groups, t-tests for independent groups were computed to compare the change scores (t1-t0) between the BPT and EP groups. In addition, a multiple regression analysis was calculated to predict the post-intervention (t1) MADRS score from group allocation (BPT vs. EP), controlling for demographic variables (age, sex), participants’ BMI, attitude towards sports, other therapeutic treatments (antidepressant medication, additional psychotherapy), and severity of depression at baseline (MADRS t0 score).
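The two kinds of t-test named above can be sketched with the Python standard library. The analyses in this study were run in SPSS; the code below only illustrates the test statistics and degrees of freedom, omits p-values, and assumes the equal-variance (Student) form for the independent-groups test. The function names are chosen for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired t statistic and degrees of freedom for within-group change."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same participants")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

def independent_t(x, y):
    """Student t statistic (pooled variance) and df for two independent groups."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, nx + ny - 2
```

In this layout, `paired_t` would be applied to the t0 and t1 scores within one group, and `independent_t` to the change scores of the two groups.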
As a sensitivity analysis, additional analyses with ITT data were computed and compared with the results of the PP analyses. Furthermore, an additional regression analysis controlling for the study centre was calculated to rule out potential centre effects. Collinearity statistics were examined in advance to ensure there were no issues with multicollinearity.
Secondary outcomes as well as depression measured by the PHQ-9 were tested in an exploratory fashion. To check for improvements within the groups, paired t-tests were computed to assess changes between t0 and t1 for the EP and BPT. After checking for homogeneity of variance, change scores were compared with t-tests for independent groups and U-tests (as sensitivity analyses) between the two groups. As a measure of effect size, Cohen’s d was calculated.
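The text does not state which variant of Cohen’s d was used. The sketch below assumes the common form for two independent groups, in which the difference in means is divided by the pooled standard deviation; the function name is chosen for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(a, b):
    """Cohen's d for two independent groups using the pooled standard deviation.

    Assumption: pooled-SD variant; other variants (e.g. for paired data)
    would use a different denominator.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)
```

Applied to the change scores of the two groups, the sign of d depends on the order of the arguments and on the direction of the scale.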
To determine consistency among the raters, for 10 (4%) of the pre-intervention (t0) and 11 (5%) of the post-intervention (t1) CATIs, a second person also rated the interviewee’s answers on the SIGMA and intraclass correlations (ICCs) were computed across all groups to assess interrater reliability. For all analyses, a Type 1 error rate (alpha) of less than 5% was considered to indicate statistical significance. Statistical analyses were performed with the aid of the IBM SPSS Statistics 21 software.