Background

Musculoskeletal injuries (e.g., fractures and dislocations; also known as traumatic injuries) represent the leading cause of adult hospital admissions [1], with many patients developing chronic pain and disability [2] despite adequate recovery of their bones and soft tissues. These patients may continue to have multiple surgeries and medical appointments, resulting in increased health care costs and a significant public health burden [3, 4].

Catastrophic thinking about pain, pain anxiety, and depression are established risk factors for disability and pain in patients with traumatic musculoskeletal injuries [5,6,7,8,9,10,11], regardless of the severity of the injury [12, 13]. Recognizing these psychosocial factors early in the recovery process creates a window of opportunity to identify and intervene with patients who are at risk for chronic pain and disability in the acute phase when psychosocial treatments are most effective [14, 15]. A recent systematic review conducted by our team showed that there are no evidence-based interventions that address psychosocial factors in patients recovering from acute orthopedic injuries [16].

Current usual care for patients with acute traumatic musculoskeletal injuries consists primarily of surgical interventions and pain medications. However, medical care is undergoing a shift in priorities recognizing the multifactorial influences on successful recovery after injury and the pivotal role of psychosocial factors [17]. Although surgeons are now aware of the importance of psychosocial factors in recovery after musculoskeletal injuries [18], they are often uncomfortable referring patients for outpatient care [18,19,20]. Referrals are often done when the pain has already become chronic, patients have undergone multiple medical treatments, and psychological treatments are generally less efficacious [14, 15]. Timely psychological care is also challenging due to long wait times and lack of trained providers. Traditional mental health treatments like cognitive behavioral therapy are met with resistance due to the stigma associated with mental health concerns [10]. Patients with orthopedic injuries are also unable to travel to clinics for medical appointments and have to rely on family and friends for transportation.

To bypass these barriers, using feedback from patients and surgeons, we iteratively developed a four-session live video mind-body program, the “Toolkit for Optimal Recovery,” which we aim to test in a phase III multi-center hybrid efficacy-effectiveness trial [21, 22]. In the current study, we followed recommendations for rigorous feasibility testing [23, 24] and conducted preliminary work needed to inform the study design and to identify and rectify potential shortcomings in study procedures and measurement. We sought to determine the best ways to recruit patients with orthopedic traumatic musculoskeletal injuries, estimate loss to follow-up, determine acceptability of randomization and assessment procedures, assess adherence to home practice, and determine satisfaction with the program.

Methods

Design

The study was designed with the goal of preparing for a future large multi-center hybrid efficacy-effectiveness randomized controlled trial (RCT) of the Toolkit for Optimal Recovery (TOR) versus usual care (UC). In this pilot feasibility study, we conducted the RCT at a level I trauma center at a major urban medical teaching hospital. Randomization was performed using a random number generator to maintain balance between the groups. Participants completed questionnaires electronically with the secure web-based data collection platform REDCap [25]. As comparisons were made with UC, participants were not blinded to intervention versus control. The trial was designed to address specific objectives relating to study design and methodology for the subsequent hybrid efficacy-effectiveness trial. Consistent with recommendations for rigorous clinical trials, it was not designed to determine efficacy [23, 24]. We collected data at baseline, 4–5 weeks after baseline, and 4 months post-baseline (3 months after post-test). There were no important changes to methods (e.g., eligibility criteria) after the trial commenced.

Intervention

The TOR is a four-session, live video, manualized mind-body program informed by the fear avoidance model [26, 27]. It combines relaxation response skills (e.g., breath awareness, body scan) with cognitive behavioral skills (e.g., adaptive thinking, activity pacing) and acceptance and commitment therapy skills (e.g., acceptance and value-based goals). Table 1 provides a description of program components. Each session starts with a description of the skills previously learned. Patients are asked to practice skills at home and email a practice log to their clinician prior to the next session. Given known barriers to psychosocial care in this population, great attention is placed on building rapport, normalizing pain after injury, and ensuring that participants’ experiences are validated while instilling hope. Patients learn about the mind-body connection and the importance of practicing mind-body skills to optimize recovery.

Table 1 Session content for the “Toolkit for Optimal Recovery” (TOR)

The TOR was developed based on prior research [5, 6, 10], the fear avoidance theoretical model [26, 27], feedback from an orthopedic surgeon (DR), and the senior author’s (AMV) extensive clinical and research experience with this population. Briefly, we used elements of the NIH stage model [28] and the ORBIT model [29] as conceptual frameworks. The first iteration of the program was called the Relaxation Response Cognitive Behavioral Program (RRCB) and was designed to have a flexible format of 4–6 sessions [30]. The program was tested in person in an open pilot. Due to challenges in recruitment, we next modified the delivery to a combination of in-person and telephone visits and conducted a pilot feasibility RCT [30]. We performed qualitative exit interviews over the phone with ten patients from the RRCB group using a semi-structured guide. Data were analyzed using grounded theory and were shared with two of the surgeons who provided referrals. Information from these exit interviews and the two surgeons informed modifications in treatment modality (now video), duration (now 45 min/session), number of sessions (now four), and manual language (to foster patient engagement through language that normalizes challenges associated with recovery and avoids medical jargon, which some patients did not like). The Toolkit for Optimal Recovery (TOR) manual was next developed and tested, with results reported in this manuscript.

Usual care (UC) control

Usual care involves meetings with surgeons as needed, referrals to occupational or physical therapy, and pain medications. A strict opioid prescription policy is followed at our institution. It entails a multi-modal approach to pain medication prescribing that uses non-opioid drugs such as acetaminophen, NSAIDs, and gabapentin; rational opioid prescribing that emphasizes low starting doses and early encouragement of a weaning regimen; and monitoring of the state prescription monitoring database. No more than 1 week’s worth of opioids is provided at discharge or during follow-up. All patients received usual care regardless of randomization.

Recruitment, consent, and screening

In order to determine whether we could recruit the desired number of participants and ascertain which methods of recruitment are most successful at our level I trauma center, we report the number of participants approached and consented. We aimed to enroll 50–60 participants to achieve 50 study completers. Given approximately 200 participants that meet the study criteria available each year, we had at least an 80% probability of demonstrating that recruitment is feasible if the expected proportion of eligible patients who agree to participate is at least 72%. This sample size is sufficient for answering additional feasibility and acceptability questions and is typically used in randomized controlled feasibility trials [31,32,33]. The research assistant screened the orthopedic trauma clinics of the two surgeons who agreed to participate in the study to identify patients who had sustained an orthopedic injury 1–2 months prior to the clinic visit. The research assistant next approached patients for participation in the patient rooms, as they were waiting for their appointment with the surgeon. Interested participants completed informed consent and were subsequently screened for participation. Inclusion criteria were: (1) an orthopedic injury in the prior 1–2 months, (2) 18 years or older, (3) English fluency and literacy, and (4) a score over the median split on the Pain Catastrophizing Scale (PCS) or Pain Anxiety Symptom Scale (PASS-20). Exclusion criteria were: (1) major medical comorbidity expected to worsen in the next 6 months; (2) comorbid chronic pain conditions; (3) change in antidepressant medication in the past 3 months; (4) evidence of potential secondary gain, such as litigation or workers’ compensation procedures that may interfere with patients’ motivation for treatment; (5) psychoses, bipolar disorder, active untreated substance dependence, or other factors that could interfere with informed consent processes and treatment (by self-report); (6) inability or unwillingness to complete questionnaires electronically or participate in a mind-body program delivered via secure synchronous live video; and (7) pregnancy, per our Institutional Review Board (IRB). Recruitment methods sampled included: (1) approaching participants with an orthopedic injury in the prior 1–2 months in the waiting room, with no help from the medical team; (2) introduction of the research assistant by the orthopedic surgeon; and (3) introduction of the study by the orthopedic surgeon to individual patients. Recruitment occurred between January 2016 and May 2018. The study protocols (summary and detailed) are stored with the IRB.

Assessments

Eligible participants completed demographics and a self-report questionnaire electronically. We explored two methods of assessment at baseline: in the office on an iPad or at home through an email link. One of the surgeons allowed completion of screening and assessments before the medical visit, while the other preferred to interrupt the assessment to conduct the medical visits.

Demographic questionnaires assessed age, gender, race/ethnicity, employment status, marital status, educational level, prior orthopedic injuries, current psychotropic or pain medications, and history of psychological diagnoses.

The Short Musculoskeletal Function Assessment Questionnaire (SMFA) [34] is a validated 46-item questionnaire that measures physical functioning/musculoskeletal disability [11, 35]. It was developed from the 101-item parent questionnaire, which has been extensively validated and tested for reliability and responsiveness [36]. The score is calculated by summing the individual items, which cover assessment of function (34 questions) and perception of how bothersome symptoms are (12 questions). All questions are answered on a 4-point Likert scale, with high scores depicting higher disability. Raw scores are summed and transformed so that the final score ranges from 0 to 100. Internal reliability was excellent (Cronbach’s α = 0.91). SMFA will be our primary outcome in the definitive efficacy-effectiveness trial.
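The summing and transformation step can be sketched as follows. This is a minimal illustration only, assuming a linear min-max rescaling of the raw sum; the function name `rescale_sum_score` and the item range passed to it are hypothetical, and the published SMFA scoring instructions remain the authoritative reference.

```python
def rescale_sum_score(items, item_min, item_max):
    """Sum item responses and rescale the raw sum linearly to 0-100.

    Assumption (not taken from the source): a min-max transformation in
    which the lowest possible raw sum maps to 0 and the highest to 100.
    """
    if not items:
        raise ValueError("at least one item response is required")
    if item_max <= item_min:
        raise ValueError("item_max must be greater than item_min")
    if any(not (item_min <= x <= item_max) for x in items):
        raise ValueError("item response outside the allowed range")
    raw = sum(items)
    lowest = item_min * len(items)    # raw sum if every item is at its minimum
    highest = item_max * len(items)   # raw sum if every item is at its maximum
    return 100.0 * (raw - lowest) / (highest - lowest)
```

Passing the item range as arguments keeps the sketch independent of any particular response format.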

The Numerical Rating Scale (NRS) was used to assess pain at rest and with activity. This is a commonly used, validated, and reliable measure of pain intensity [37,38,39,40] on an 11-point scale from “no pain” (0) to “worst pain imaginable” (10).

The Pain Catastrophizing Scale (PCS) [41, 42] is a reliable and valid measure of negative pain-related cognitions or catastrophic thinking (thinking of the worst). It has 13 items answered on a 4-point Likert scale from “not at all” (0) to “all of the time” (3). Items are grouped into three subscales: rumination (tendency to spend a lot of time dwelling on the pain), helplessness (feeling hopeless and helpless when in pain), and magnification (thinking the worst when in pain). A total PCS score is computed by adding all items, with high scores depicting worse coping. In the current sample, the PCS had good internal reliability (Cronbach’s α = 0.89).

The Pain Anxiety Symptom Scale (PASS-20) [43] is a reliable and valid measure of anxiety about pain. It has 20 items answered on a 6-point Likert scale from “never” (0) to “always” (5). Items are grouped into four subscales: avoidance (avoiding activities that cause pain), fearful thinking (fearful thoughts related to pain), cognitive anxiety (difficulty thinking when in pain), and physiological response (somatic anxiety symptoms in response to pain). A total pain anxiety score is computed by adding all items, and high scores depict worse coping. In the current sample, the PASS-20 had good internal reliability (Cronbach’s α = 0.88).

The PTSD Checklist-Civilian Version [44] is a reliable and valid 17-item measure of symptoms of post-traumatic stress disorder (PTSD). The measure provides a total severity score [45] as well as diagnostic scores. In the current sample, this questionnaire had excellent internal reliability (Cronbach’s α = 0.95). High scores depict worse symptoms.

The Center for Epidemiologic Studies Depression Scale (CES-D) [46] is a widely used, 20-item measure of self-reported depression symptoms, each answered on a 4-point Likert scale from “rarely or none of the time (less than 1 day)” (0) to “most or all of the time (5–7 days)” (3), with high scores depicting higher depression. In the current sample, the CES-D had acceptable internal reliability (Cronbach’s α = 0.77).

Participants completed the same battery of questionnaires, with the exception of demographics, at post-test and 3-month follow-up. They were emailed a link to the questionnaires via REDCap [25], a secure electronic program used widely in research at our institution. Each participant was emailed three times and then received a call from the principal investigator before being considered lost to follow-up.

Patient satisfaction. At post-test, participants in TOR completed three questions regarding their satisfaction with the program. We used measures of patient satisfaction that evaluated the overall current satisfaction with one’s physical situation, evaluation of care delivered, and satisfaction with the personal manner of the clinician.

There were no changes to assessment or measurement after the trial commenced.

Feasibility assessments

Feasibility of recruitment was assessed by determining the number of patients approached who agreed to participate.

Acceptability of randomization and procedures was determined by measuring loss to follow-up (post-test and 3-month follow-up) in both TOR and UC.

Acceptability of treatment was determined based on the number of sessions attended by participants in TOR, as well as by self-reported adverse events. At the end of the last session, the first 10 participants provided semi-structured feedback on the intervention (10–15 min interview). Patients were asked the following questions: (1) What did you think of the program? (2) If you could go back, would you choose to participate again? (3) What were the most helpful parts of the program (open ended)? and (4) Did you experience any technical difficulties? Participants also answered questions about the usefulness of each of the skills taught.

Adherence to homework was determined by the number of logs turned in by participants over the course of the study.

Therapist adherence to sessions was determined by completion of an adherence checklist by each study therapist.

Feasibility of quantitative measures was deemed acceptable if no questionnaires were missing in full in more than 25% of the participants and if reliability was higher than 0.70.
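The two criteria above can be sketched in code. This is an illustration only: it assumes the standard formula for Cronbach's alpha computed from complete item-level responses, and the function names and the use of `None` to mark a questionnaire missing in full are hypothetical conventions rather than the study's actual data handling.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete item-level responses.

    rows: one list of k item scores per respondent (no missing items).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def measure_is_feasible(responses, max_missing=0.25, min_alpha=0.70):
    """Apply the two stated criteria to one questionnaire.

    responses: one entry per participant; a list of item scores, or None
    when the questionnaire is missing in full (hypothetical encoding).
    """
    missing_fraction = sum(r is None for r in responses) / len(responses)
    complete = [r for r in responses if r is not None]
    return (missing_fraction <= max_missing
            and cronbach_alpha(complete) > min_alpha)
```

The thresholds default to the 25% and 0.70 values stated in the text and can be overridden for other studies.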

Randomization and allocation concealment

Participants were randomized 1:1 to TOR or UC using a computerized random number generator program (randomnumber.org) overseen by the statistician. Aside from the statistician, only the research assistant involved in recruitment had access to the randomization sequence, but the research assistant was not aware of the randomization status at the time of patient recruitment. Surgeons who referred participants, as well as the PI, were blinded to group allocation. Because we compared TOR with UC, neither the patient nor the therapist was blinded. Review of data and derivation of outcome scores were performed blind to treatment assignment. We stopped recruitment when we achieved 50 participants with completed post-tests.
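One common way to generate a 1:1 allocation sequence that stays balanced is permuted-block randomization. The sketch below is an illustration only: the study reports using a web-based random number generator, and the block size, seed, and function name here are assumptions rather than the procedure actually followed.

```python
import random


def permuted_block_sequence(n_participants, block_size=4, seed=None):
    """Return a 1:1 TOR/UC allocation list built from shuffled blocks.

    Assumptions (not from the source): fixed block size and a seeded
    pseudo-random generator so that the sequence can be reproduced.
    """
    if block_size <= 0 or block_size % 2 != 0:
        raise ValueError("block_size must be a positive even number")
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        # Each block holds equal numbers of both arms in random order.
        block = ["TOR", "UC"] * (block_size // 2)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]
```

Within every completed block the two arms are exactly balanced, so the overall imbalance can never exceed half the block size.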

Identification of study limitations to inform the future trial

We monitored issues related to recruitment, retention, assessments, and intervention delivery throughout the study in order to identify and rectify any emerging shortcomings that were not previously considered and to test new strategies to improve our overall methodology. The study PI maintained a constant dialog with the research assistants who collected data, the study therapists, orthopedic surgeons, and patients, and the study was discussed and feedback was generated in weekly team meetings. We also monitored adverse events in the form of self-reported increases in pain in the intervention group; none were observed.

Data analyses

Feasibility studies are not designed to detect a treatment effect [21, 22]. It is recommended that feasibility trials report primarily descriptive statistics. The trial was designed to inform a multi-site hybrid efficacy-effectiveness trial. We present information on feasibility of recruitment, acceptability of screening and randomization, and feasibility of quantitative assessments. For TOR, we report on satisfaction with the program, adherence to homework, therapist adherence to treatment, and acceptability of treatment. We also present means and standard deviations (M, SD) for the quantitative outcomes at each time point, as well as within-subjects Cohen’s d effect sizes for improvement in participants randomized to TOR. There were no interim analyses or stopping rules for this feasibility study. Qualitative data from the open-ended questions were coded and summarized, but we did not use qualitative analysis software given the homogeneity of the data.
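The text does not state which within-subjects formulation of Cohen's d was used. As an illustration only, the sketch below computes two common variants from paired scores: d_z (mean change divided by the standard deviation of the change scores) and d_av (mean change divided by the average of the two standard deviations). The function names are hypothetical.

```python
from statistics import mean, stdev


def cohens_dz(pre, post):
    """Mean paired change divided by the SD of the change scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [a - b for a, b in zip(pre, post)]
    return mean(diffs) / stdev(diffs)


def cohens_dav(pre, post):
    """Mean change divided by the average of the pre and post SDs."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two paired observations")
    return (mean(pre) - mean(post)) / ((stdev(pre) + stdev(post)) / 2)
```

With change defined as pre minus post, a positive d indicates a decrease in scores, which corresponds to improvement on measures where higher scores are worse. The two variants can differ substantially when pre and post scores are highly correlated, so the formulation should be reported alongside the value.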

Results

Sample

We present patient characteristics at baseline in Table 2, separately for TOR and UC. The majority of patients were white, educated, and women. Participants were comparable in terms of age, racial and ethnic distribution, marital and work status, and other demographic variables. However, the control group was primarily represented by women, while the gender distribution in the intervention group was balanced. There were no differences in any of the baseline characteristics by follow-up status (trial completers versus lost to follow-up) (p > .05).

Feasibility of recruitment

A total of 243 participants were screened for eligibility; 37 declined to participate in the study (no time, no interest, did not believe they would benefit); 94 were excluded because they screened out based on PCS and PASS scores; and 58 did not meet other inclusion criteria (35 did not have a webcam or computer, 5 were on a stretcher and needed to be moved, 5 did not speak English, 2 had dementia, 1 was an inmate, and 10 agreed but changed their minds during screening). Figure 1 shows the CONSORT diagram for the study. Recruitment was most effective when surgeons introduced the study to potential participants, while cold approaching patients in the waiting room was least effective. Two of the four surgeons in the practice refused to participate in the study. One of them expressed lack of belief in psychosocial care for orthopedic trauma patients while physical recovery is in progress, while the other expressed concern about interruption in the patient flow.
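The screening counts quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and the number remaining after exclusions is implied by subtraction rather than stated in this passage, so the CONSORT diagram remains the authoritative record.

```python
# Counts quoted in the recruitment paragraph.
SCREENED = 243
DECLINED = 37
EXCLUDED_ON_PCS_PASS = 94
OTHER_EXCLUSIONS = {
    "no webcam or computer": 35,
    "on a stretcher": 5,
    "did not speak English": 5,
    "dementia": 2,
    "inmate": 1,
    "changed mind during screening": 10,
}


def tally_recruitment():
    """Return (total of other exclusions, number remaining after screening)."""
    other_total = sum(OTHER_EXCLUSIONS.values())
    remaining = SCREENED - DECLINED - EXCLUDED_ON_PCS_PASS - other_total
    return other_total, remaining


if __name__ == "__main__":
    other_total, remaining = tally_recruitment()
    print(f"Other exclusions: {other_total}")
    print(f"Remaining after screening: {remaining}")
```

The itemized reasons sum to the 58 reported for other exclusions, which confirms that the breakdown is internally consistent.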

Fig. 1 CONSORT flow diagram

Acceptability of randomization and procedures

The CONSORT diagram shows the flow of participants in the study. Acceptability of randomization and procedures was high, and no participants dropped out after they learned their randomization status.

Acceptability of treatment

Acceptability of treatment was high. Only one patient randomized to TOR dropped out, but this participant completed the post-test. All four sessions were scheduled on the same day of the week and at the same time for the duration of the 4-week program. When scheduling conflicts arose, rescheduling occurred within the same week. The first ten participants in TOR who provided qualitative feedback on the program unanimously reported that they enjoyed the program and were glad they decided to participate. They noted that the program was very helpful in aiding their return to activities of daily living and putting them at ease with their recovery path. They shared that one of the toughest parts of recovery is the interruption of routine and the unknowns of the recovery process. Of the skills taught, the most useful were breath awareness, acceptance, working through myths about pain, and activity pacing. Participants loved the flexibility of the program and the live video delivery method. Neither the patients nor the therapist reported any challenges with the live video delivery of the program (e.g., internet connection), regardless of whether participants used a computer, tablet, or phone.

Adherence to homework

Participants were asked to return a home practice form weekly to the study clinician. Out of the 25 participants in TOR, 15 returned logs and practiced at least one skill. Six participants reported that they practiced skills but did not record their practice.

Satisfaction

Satisfaction with care, clinician, and the recovery process was high (see Table 2).

Table 2 Baseline patient characteristics

Therapist adherence

Therapist adherence to sessions was determined by completion of an adherence checklist by each study therapist. Adherence was high, and therapists reported that it was easy to cover each session’s materials within the allotted 45 min.

Feasibility of quantitative measures

Feasibility of quantitative measures was high. No questionnaires were missing in full. Reliability was good to excellent for all questionnaires except the CES-D, for which reliability was acceptable (Cronbach’s α = 0.77). Participants also noted that the questionnaires were relevant and easy to understand.

Means, standard deviations, and ranges

Means, standard deviations, and ranges for all study outcomes at all time points are presented in Table 3.

Within-subject effect size for improvement from baseline to post-test in TOR was large for all variables (d = 0.83 to 2.7). The largest effect sizes were observed for SMFA (d = 2.7), PCS (d = 2.1), and PASS (d = 2.0), followed by pain with activity (d = 1.6) and pain at rest (d = 1.2), depression (d = 1.2), and PTSD (d = 0.8). Effect sizes for improvement in TOR between post-test and 3 months were small for SMFA (d = 0.23), PCS (d = 0.1), PASS (d = 0.07), pain at rest (d = 0.3), and depression (d = 0.2), and medium for pain with activity (d = 0.5) and PTSD (d = 0.7) (Table 3).

Table 3 Unadjusted means, standard deviations, and ranges for the outcome variables

Discussion

We conducted a randomized controlled feasibility trial aimed at informing the design and conduct of a future multi-site phase III clinical trial with the goal of preventing chronic pain and disability in at-risk patients with orthopedic traumatic musculoskeletal injuries. While recruitment was lengthy and generally challenging, we were able to recruit the desired number of patients. Recruitment was most successful when the surgeons mentioned the study to participants. The randomization methods used in the current study successfully distributed patients equally into the TOR and UC groups; however, there was a chance difference between the groups in terms of the ratio of female to male patients. As such, stratified randomization may be employed in the phase III trial to ensure comparable gender distributions between treatment and control groups.
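Stratified randomization of the kind proposed above is often implemented by running a separate permuted-block sequence within each stratum. The sketch below is a minimal illustration under that assumption; the class name, block size, seed, and stratum labels are hypothetical and do not describe the planned phase III procedure.

```python
import random
from collections import defaultdict


class StratifiedBlockRandomizer:
    """Assign arms using independent shuffled blocks within each stratum."""

    def __init__(self, arms=("TOR", "UC"), block_size=4, seed=None):
        if block_size <= 0 or block_size % len(arms) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self._arms = tuple(arms)
        self._block_size = block_size
        self._rng = random.Random(seed)
        # Unused assignments from the current block, kept per stratum.
        self._pending = defaultdict(list)

    def assign(self, stratum):
        """Return the next allocation for a participant in this stratum."""
        queue = self._pending[stratum]
        if not queue:
            # Start a new balanced block for this stratum only.
            block = list(self._arms) * (self._block_size // len(self._arms))
            self._rng.shuffle(block)
            queue.extend(block)
        return queue.pop()
```

For example, calling `assign("female")` and `assign("male")` as participants enroll keeps the arms balanced within each gender after every completed block, which addresses the chance imbalance described above.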

Once recruited and randomized, patients attended all study sessions, and the majority completed post-test and follow-up questionnaires. Attrition was better than in other clinical research protocols in patients with pain [47]. However, the majority of participants were white and non-Hispanic, although this is consistent with the typical racial and ethnic representation of the patient population at our clinic. Recruitment of a more heterogeneous sample from more ethnically diverse regions is warranted in future research. The research methodology and questionnaires were acceptable to study participants. Satisfaction with the TOR was high. There were no technical issues associated with the delivery of TOR via secure live video. Recording of homework was identified as an area for potential improvement, as 6 out of 25 patients in TOR reported taking part in homework activities but failed to record those activities in their written logs, and 4 made no report of home practice. For the phase III trial, we will implement electronic capture methods combined with text-based reminders. Due to funding restrictions, we were understaffed for the current trial, and staff involved in recruitment also performed randomization. In the future multi-site feasibility trial, we will ensure that all staff except the clinician who provides the TOR intervention remain blind to allocation for the duration of the study. We did not find any differences in baseline characteristics between completers and non-completers within our sample. In the definitive hybrid efficacy-effectiveness trial, we plan to use shared-baseline models for analysis to account for chance differences at baseline and to incorporate all randomized participants regardless of completion of follow-up, which would provide some protection against informative missingness.

Exploration of within-TOR effect sizes showed robust improvement in all study variables between baseline and post-test, with slight continued improvement after the end of the program and until the 3-month follow-up assessment. The largest effect sizes were for our treatment targets and for SMFA physical function, which is the proposed primary outcome in our future multi-site hybrid efficacy-effectiveness phase III trial. Although the small sample size prevents us from drawing any conclusions about the efficacy of the TOR, results provide encouraging evidence that the TOR has the potential to help improve functioning, coping, pain, and mood in patients with acute orthopedic injuries.

This feasibility trial allowed us to identify the impact of research on the normal operations of a busy orthopedic level I trauma center. We learned about the importance of integrating the research assistants within the practice and working with each individual surgeon to fit our recruitment methodology around their preferences. Two surgeons refused to allow us to recruit their patients, and one suggested that psychological factors are not important in recovery and shared his own skepticism about this work. Two surgeons were highly enthusiastic, but each required that the research assistant follow a different recruitment strategy (e.g., one surgeon wanted the research assistant to cold approach the participants in the waiting room prior to the medical visit, leave the room immediately when she was free to see the participant, and then continue recruitment afterward, while the other surgeon introduced the study to participants and allowed the research assistant to finish recruitment before meeting with the patient). This information is important for the future study, but also for implementation purposes. The success of the multi-site hybrid efficacy-effectiveness trial will depend on securing buy-in from surgeons, including their willingness to provide referrals to the study. It is very likely that other trauma centers smaller than ours have a much higher patient burden, which may require flexibility with study procedures. It will be important to further understand, via focus groups, barriers to participating in the recruitment process and to develop educational materials that facilitate buy-in on the importance of psychosocial care for at-risk orthopedic injury patients, as well as referrals. It will be pivotal for the success of the trial to work with the surgical team and determine the best workflow for participants, one that leads to the highest feasibility ratings without disrupting patient flow in busy orthopedic practices. Surgeons and medical professionals in the orthopedic setting will need to be active participants in the trial design.

In conclusion, this feasibility randomized controlled trial provided rich information for the design of future research on the TOR. Lessons learned will be used to conduct a multi-site feasibility trial that includes two additional geographically and ethnically diverse sites of smaller size from across the USA. Investment in a multi-site feasibility trial is mandatory in order to ensure high feasibility of the intervention and study procedures at all sites before a definitive multi-site phase III clinical trial. In doing so, researchers and clinicians can avoid wasting time and resources on running definitive trials before feasibility markers for both the research and the intervention are established at all sites.

Conclusions

We conducted a feasibility trial of the first skills-based intervention for patients with orthopedic injuries, the Toolkit for Optimal Recovery (TOR) versus usual care. We found promising evidence for feasibility and acceptability of recruitment and study procedures. Information from this feasibility trial will be used to conduct a multi-site feasibility trial at three geographically diverse trauma clinics from the USA, to set the stage for a successful phase III multi-site hybrid efficacy-effectiveness clinical trial.