Background

Musculoskeletal (MSK) pain from common conditions such as back pain and osteoarthritis is a costly global health challenge, particularly for primary care, where the majority of patients are managed. For example, in the UK, common MSK problems such as back, shoulder, knee and multi-site pain account for 14% of General Practitioner (GP) consultations [1], and estimates from the most recent global burden of disease studies suggest they are the leading cause of disability adjusted life years (DALYs) [2, 3]. Given the ageing population and the increasingly complex and multi-morbid clinical presentations of patients, clinical decision-making is becoming more challenging [4,5,6]. In addition, consultation rates for MSK pain are increasing; for example, in the UK, GP consultations for MSK pain have increased by 19% (from 310 to 370 million per year) over a five-year period [7, 8].

Randomised controlled trials (RCTs) show that non-pharmacological interventions such as physiotherapist-led supervised exercise and cognitive behavioural approaches are more effective than minimal usual care [9,10,11,12], yet most guidelines [13,14,15] lack clarity about which patients should be offered these additional interventions [16,17,18]. At present, primary care decision-making for MSK pain is mostly based on ruling out serious pathology and using clinical reasoning without formal stratification tools to decide on treatment. Assessing the severity, impact and prognosis of individual patients can be difficult in short primary care consultations, and patient access to other treatments is often variable [19,20,21,22]. Offering further treatments to everyone consulting in primary care with MSK pain is both unnecessary and impractical [16, 17]. Therefore, finding ways to better identify which patients to de-medicalise, by limiting care primarily to reassurance and self-management, whilst conversely identifying which patients should be offered more intensive and expensive healthcare treatments is an international priority [14, 17, 23].

We have previously demonstrated the clinical- and cost-effectiveness of a stratified primary care approach to support clinical decision-making for patients with low back pain in the UK [24,25,26]. This approach combines prognostic stratification (using the STarT Back tool that classifies individuals into either a low, medium or high risk subgroup for persistent low back pain-related disability) with recommended matched treatments for each subgroup [27,28,29]. This approach to stratified care for low back pain has since been recommended in several international clinical guidelines [30,31,32]. Whilst low back pain is the most common MSK pain presentation in primary care, it accounts for only 26% of the MSK caseload [1], and it is unknown whether a similar prognostic approach to stratified care would benefit the large volume of patients with MSK pain in other body sites/locations (e.g. knee or shoulder pain).

Given the results of several systematic reviews showing consistent prognostic factors across MSK pain conditions [33,34,35,36,37], we developed and validated a single prognostic stratification tool, the Keele STarT MSK tool, for use among patients with the five most common MSK pain presentations in primary care (back, neck, shoulder, knee, and multi-site pain) [1]. The Keele STarT MSK Tool has shown good predictive and discriminative ability in development and validation samples [38], identifying patients at low, medium or high risk of persistent MSK pain over 6-months. Using systematic review and consensus methods, we also agreed evidence-based recommended matched treatment options for each of the risk subgroups [39, 40].

The STarT MSK stratified primary care intervention has two components: use of the tool to identify risk subgroups, followed by matched treatment options. A definitive trial is needed to test whether this approach is better for patients’ outcomes and the healthcare system, compared to usual non-stratified care. Prior to conducting the main randomised controlled trial (RCT), we examined the feasibility of a) a future definitive cluster RCT, and b) GPs using stratified care at the point-of-consultation. Specific objectives were to:

  1) Estimate participant recruitment and follow-up rates in a pilot cluster RCT

  2) Examine evidence of selection bias between trial arms and participants and non-participants

  3) Assess GP fidelity to the stratified care intervention (use of the stratification tool and matched treatments) at the point-of-consultation.

  4) Conduct secondary descriptive analyses of GP decision-making and patient self-reported outcomes.

Methods

Trial design

The study design was a pragmatic, feasibility and pilot, two-parallel arm (1:1 ratio), cluster RCT in 8 general practices, with a nested qualitative study reported separately [41]. A cluster RCT was chosen over an individual patient randomisation design as stratified MSK care involves GPs using a slightly different consultation approach following specific training, as well as the use of a bespoke electronic medical record (EMR) template, which was only possible to implement at a practice level without causing a high probability of intervention contamination across arms [42]. The units of randomisation were the general practices and units of observation were adults consulting with MSK pain. The International Standard Randomised Controlled Trials Number is ISRCTN15366334.

Participant eligibility criteria and identification

Patients were eligible if, during their visit to a participating GP practice, the trial’s purpose-built participant identification screen, embedded within the EMR, was completed at the point-of-consultation, including GP confirmation of patient eligibility. Inclusion criteria were: aged over 18 years, registered at that general practice, consulting for MSK pain in the back, neck, shoulder, knee or multi-site pain. The trial identification template activated automatically for all new or returning episode cases when GPs (intervention and control) entered one of over 200 pre-identified MSK Read-codes (i.e. symptom/diagnostic codes) into the patient’s EMR. Exclusions were: clinical indicators of (suspected) serious ‘red flag’ pathology requiring urgent medical intervention or a known systemic inflammatory condition, those unable to communicate in English (both in reading and speaking), vulnerable patients including those on the ‘severe and enduring mental health register’, a diagnosis of dementia or terminal illness, and recent trauma or bereavement. To reduce patient/clinician burden, the participant identification screen only activated once per patient (providing it was completed or an exclusion was entered). A further eligibility criterion, administered by the research centre, specified that initial questionnaire responses were completed within 4 weeks of the invitation mailing date (using the self-reported date-of-completion on the questionnaire).

Recruitment

General practices

The UK West Midlands National Institute for Health Research (NIHR) Clinical Research Network (CRN) facilitated recruitment of eight general practices who used the EMIS Web EMR system and collectively served a target population of > 40,000 adults. GP practice eligibility criteria included willingness to be randomised to either stratified care or usual care, to engage in intervention training (if allocated to stratified care) and to facilitate an anonymised EMR audit after 6-months in the trial. Practices were also required to remove any existing MSK stratification tools (e.g. STarT Back) if they were randomised as a control practice. Consent to these criteria was sought through a written agreement with a representative from each participating practice, prior to randomisation. We aimed for practices that varied in size, location (urban, semi-urban and rural) and population socio-demographics.

Patients

Patient identification, invitation and recruitment were facilitated by CRN staff, or practice staff (if preferred), through a weekly download into a secure mailing database of eligible patients identified from the trial’s IT identification template. Eligible patients were sent a study invitation letter and information leaflet, an initial questionnaire and a consent form with a stamped addressed envelope to return. A study administrator (blind to GP practice allocation) was available for telephone support if required. Signed consent to provide questionnaire outcome data was obtained from all participants and NHS ethical approval was gained (Reference: 16/EM/0257). Participant recruitment lasted 8 months (October 2016 to May 2017).

Randomisation and blinding

Randomisation used stratified block randomisation based on GP practice list size to allocate the 8 practices in a ratio of 1:1 (4 intervention, 4 control). Keele Clinical Trials Unit (CTU) computer-generated the random sequence and ensured concealment by providing each practice with an anonymised code. Allocation (at cluster and individual level) was shared with the study team (except for the trial statistician and outcome data collectors, who were blinded until the analysis was finalised). Blinding of participating GPs was not possible; however, patients were unaware of the RCT and the differences between consultations in intervention and control practices, and instead were informed about, and consented to, providing questionnaire data for a study investigating the Treatment of Aches and Pains (TAPs). These processes follow recommendations for cluster RCTs [42].
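To illustrate the general idea of stratified block randomisation at cluster level, a minimal sketch follows. The CTU procedure is not described in detail here, so the median split on list size, the block size of 2, the practice codes and the list sizes below are all assumptions made purely for illustration.

```python
import random


def stratified_block_randomise(list_sizes, seed=None):
    """Allocate practices 1:1 within list-size strata using permuted blocks of 2.

    list_sizes: dict mapping an anonymised practice code to its list size.
    Assumptions (not from the trial report): two strata formed by a median
    split on list size, and an even number of practices in each stratum.
    """
    rng = random.Random(seed)
    ordered = sorted(list_sizes, key=list_sizes.get)
    half = len(ordered) // 2
    strata = {"smaller": ordered[:half], "larger": ordered[half:]}

    allocation = {}
    for practices in strata.values():
        rng.shuffle(practices)  # random order of entry within the stratum
        for i in range(0, len(practices), 2):
            block = ["intervention", "control"]
            rng.shuffle(block)  # permuted block of size 2
            for practice, arm in zip(practices[i:i + 2], block):
                allocation[practice] = arm
    return allocation


# Hypothetical list sizes for 8 anonymised practices
example_sizes = {"P1": 4000, "P2": 5000, "P3": 5500, "P4": 6000,
                 "P5": 7000, "P6": 8000, "P7": 9000, "P8": 13000}
allocation = stratified_block_randomise(example_sizes, seed=2016)
```

Because every block contains one allocation to each arm, each stratum (and therefore the whole sample) ends up balanced 1:1, which is the property the design relies on.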

Interventions

Usual care

Patients consulting at the four usual care general practices received clinical care as usual for MSK pain. Usual primary care is known to be variable [43,44,45]; for example, some patients may receive advice, prescriptions for medications and nothing more, some may be asked to return to the GP for follow-up assessment or treatment, whereas others may be referred to other services, including for tests and investigations, or treatment services such as physiotherapy, orthopaedics or pain clinics. As part of the trial’s participant identification screen, GPs in control (and intervention) practices recorded each patient’s average MSK pain intensity (see outcomes section) and primary MSK pain site at the point-of-consultation on the study EMR identification template.

Stratified care intervention

The intervention development was based on the Medical Research Council’s (MRC) framework for the design and evaluation of complex interventions [46]. To support GPs in intervention practices to deliver stratified care, we extended the trial point-of-consultation identification EMR template to also contain the prognostic stratification tool (a development version of the Keele STarT MSK tool; see Fig. 1) and the recommended matched treatment options. The tool was developed and validated in UK General Practice to predict persistent pain and disability and allocate individuals into low, medium or high risk subgroups, and is published elsewhere [38]. The recommended matched treatment options for each subgroup are provided in Fig. 2 and were developed through a systematic review and expert consensus process, described in detail elsewhere [39, 40]. In brief, for patients at low risk the treatment options were restricted to supporting self-management and over-the-counter medication, discouraging unnecessary investigations or referral. For those at medium risk, they included referral to conservative non-pharmacological treatments (e.g. those offered by physiotherapists) and workplace assessment and advice, and for those at high risk, they included referral for corticosteroid injections, specialist clinical services (including rheumatology, orthopaedics and pain clinics), and opioids.

Fig. 1 Development version of the Keele STarT MSK Tool © Keele University

Fig. 2 STarT MSK pilot trial recommended matched treatment options

GP training (3–4 h) within intervention practices was facilitated by an experienced GP trainer (VC) and the lead author (JH) and included: the rationale for stratified care, how it differs from usual care, familiarisation with the EMR template and its fit within the consultation, as well as addressing any questions or concerns. GPs also received a training-update half-way through their recruitment period at which feedback data were shared about individual GP intervention fidelity, with peer-to-peer comparisons and discussion.

Outcome measures and analyses

The pre-specified measures and success criteria to address each pilot trial objective were as follows, with no changes once the pilot commenced:

Objective 1

To examine the recruitment and retention rates of general practices we examined the numbers of expressions of interest, face-to-face introductory meetings and signed agreements to participate. To examine the recruitment and retention rates for individual participants we examined the numbers of: participant identification screen activations in the EMR (these were potentially eligible patients screened by the GP at the point-of-consultation) and completions (confirmed eligibility and therefore invited by post to participate), as well as the initial questionnaires returned with written consent to participate in data collection, and monthly and 6-month questionnaires returned. Questionnaire items were examined to identify missing items and any floor-or-ceiling effects. Means and/or medians, with standard deviations, were reported for all the participant self-reported measures.

The pre-specified success criterion for this objective was that the trial participant identification screen would be activated in approximately 2000 consultations, leading to a minimum of 500 participants participating in data collection within an expected 3-month recruitment period, and a follow-up rate of > 75% with less than 5% missing items in participant questionnaires.

Objective 2

To examine evidence of recruitment selection bias we descriptively analysed (means and standard deviations (SD)) the characteristics of intervention and control arm participants, and characteristics of trial participants and non-participants, using information from the EMR participant identification screen at the point-of-consultation (i.e. MSK pain location, pain intensity, age, sex and deprivation score) and within the participant self-reported initial questionnaire (demographic and clinical characteristics, as listed in Additional file 1). The pre-specified success criterion for this objective was to find little evidence of recruitment selection bias, either between intervention and control participants or between study participants and non-participants.

Objective 3

To assess GP fidelity to the stratified care intervention at the point-of-consultation we examined the proportion of eligible cases in which GPs used the stratification tool and chose at least one of the recommended matched treatments. Per protocol matched treatments for each subgroup were defined as follows:

  • Low risk: must only have low risk treatment options reported in the EMR

  • Medium risk: must have at least one medium risk treatment option and none of the high risk options reported in the EMR

  • High risk: patients must have reported within the EMR, at least one high risk treatment option, or a referral to an MSK service providing a medium risk treatment option (e.g. physiotherapy or psychological intervention) with tool subgroup information within their referral so that services were aware that an onward referral to a high risk treatment option might be required.
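The per protocol rules above can be sketched as a small classification function. This is an illustrative reading of the rules only: the option labels are hypothetical placeholders (the actual options are those in Fig. 2), and treating an empty selection as not per protocol for low risk patients is an assumption.

```python
# Hypothetical option labels, for illustration only; see Fig. 2 for the real options.
LOW_OPTIONS = {"self_management_advice", "otc_medication"}
MEDIUM_OPTIONS = {"physiotherapy_referral", "psychological_intervention_referral",
                  "workplace_advice"}
HIGH_OPTIONS = {"corticosteroid_injection", "specialist_referral", "opioids"}
# Medium risk options that are referrals to an MSK service (assumed subset).
MEDIUM_REFERRALS = {"physiotherapy_referral", "psychological_intervention_referral"}


def is_per_protocol(risk, chosen, referral_includes_subgroup=False):
    """Return True if the treatments recorded in the EMR match the risk subgroup.

    risk: "low", "medium" or "high".
    chosen: iterable of option labels recorded for the patient.
    referral_includes_subgroup: True if the referral carried the tool subgroup
    information (only relevant for high risk patients).
    """
    chosen = set(chosen)
    if risk == "low":
        # Only low risk options (assumption: at least one must be recorded).
        return bool(chosen) and chosen <= LOW_OPTIONS
    if risk == "medium":
        # At least one medium risk option and none of the high risk options.
        return bool(chosen & MEDIUM_OPTIONS) and not (chosen & HIGH_OPTIONS)
    if risk == "high":
        # At least one high risk option, or a medium risk referral that
        # carries the subgroup information.
        flagged_referral = bool(chosen & MEDIUM_REFERRALS) and referral_includes_subgroup
        return bool(chosen & HIGH_OPTIONS) or flagged_referral
    raise ValueError(f"unknown risk subgroup: {risk!r}")
```

For example, a medium risk patient given over-the-counter medication plus a physiotherapy referral would count as matched, whereas the same patient also prescribed opioids would not.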

The pre-specified success criteria for this objective were that within relevant MSK pain consultations intervention GPs would:

  1. Complete the prognostic stratification tool in:

  • > 50% of cases: “Complete success” (proceed to main trial without amendments)

  • 40–50% of cases: “Partial success” (proceed to main trial with amendments)

  • < 40% of cases: “Unsuccessful” (consider whether or not to proceed to main trial)

  2. Adhere to per protocol matched treatment options in:

  • > 65% of cases: “Complete success” (proceed to main trial without amendments)

  • 50–65% of cases: “Partial success” (proceed to main trial with amendments)

  • < 50% of cases: “Unsuccessful” (consider whether or not to proceed to main trial)
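The fidelity thresholds above amount to a simple three-band grading rule, sketched below. The function names are hypothetical, and treating the band edges (e.g. exactly 50% tool completion) as “Partial success” is an assumption based on how the bands are written.

```python
def grade(percent, partial_from, complete_above):
    """Grade a fidelity percentage against a pair of pre-specified thresholds."""
    if percent > complete_above:
        return "Complete success"
    if percent >= partial_from:
        return "Partial success"
    return "Unsuccessful"


def grade_tool_completion(percent):
    # > 50% complete; 40-50% partial; < 40% unsuccessful
    return grade(percent, partial_from=40, complete_above=50)


def grade_matched_treatment(percent):
    # > 65% complete; 50-65% partial; < 50% unsuccessful
    return grade(percent, partial_from=50, complete_above=65)
```

Keeping the two criteria as thin wrappers around one function makes it explicit that they differ only in their cut-points.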

Objective 4

To examine differences in GP decision-making and patient self-reported outcomes between intervention and control arms we conducted secondary descriptive statistical analyses using the anonymised 6-month EMR audit and follow-up questionnaire data. As this was a feasibility and pilot trial, the objective was not hypothesis testing of process/health outcomes; there were therefore no pre-specified success criteria, and only complete cases were analysed.

There were four sources of data:

  1. The GP EMR participant identification screen collected identical point-of-consultation data in all 8 GP practices, including the primary MSK pain site/location and average pain intensity (intended primary outcome for the main trial) by asking:

  • How intense was your pain, on average, over the last 2 weeks? [Responses on a 0–10 scale, where 0 is “no pain” and 10 is “worst pain ever”].

Pain intensity was chosen as the potential primary outcome for the future main trial as it had the strongest face validity with patients during a pre-pilot Patient and Public Involvement and Engagement (PPIE) workshop and is also a recommended outcome for trials testing treatments for MSK pain [47, 48]. In the intervention practices the EMR participant identification screen was extended to embed the stratified care intervention and collect additional data relating to stratification tool item responses and the matched treatment options chosen at the point-of-consultation. All template responses were date stamped and linked to an individual GP and patient. It was also possible from the EMR screen to collect automated data on the MSK consulter’s age, sex and English Index of Multiple Deprivation (IMD) 2015 [49], with non-participants data anonymised first.

  2. Baseline and 6-month postal questionnaires included self-reported measures for average pain intensity over the last 2 weeks (identical wording and responses to the trial identification template), physical function measures for each of the MSK pain sites (filtered according to GP designation) including the back specific Roland-Morris Disability Questionnaire (RMDQ) [50], the Neck Disability Index (NDI) [51, 52], the Shoulder Pain And Disability Index (SPADI) [53], the Knee Injury and Osteoarthritis Outcome Score Physical Function Short-form (KOOS-PS) [54] and, for multi-site pain, the Short Form 12 (v2) Physical Component Scale [55]. Other outcomes were MSK risk status using the development version of the Keele STarT MSK tool [38], overall MSK health status using the Musculoskeletal Health Questionnaire [56], fear avoidance beliefs using the 11-item Tampa Scale of Kinesiophobia [57], patient perceived reassurance (from their GP) using the Effective Consultation and Reassurance Questionnaire (ECRQ) [58] (which has four subscales: information gathering, relationship building, generic reassurance and cognitive reassurance), health-related quality of life using the EuroQol five-dimension, five-level version (EQ-5D-5L) [59], single items each capturing satisfaction with care received, whether participants had received written education material from their GP about their MSK problem (yes/no), and overall rating of global change (−5 to +5 numerical response scale) since their index GP visit (the one in which the trial EMR screen was activated and they were invited to participate in the study data collection) [60], whether they were in paid employment and had taken any work absence due to their MSK pain, and an item asking how their productivity at work is affected (0–10 NRS). Patient population descriptors (captured at baseline alone) included the Single Item Health Literacy Screener (SILS) [61] and pain episode duration, assessed by asking “how long is it since you had a whole month without [insert pain site e.g. back] pain”. Additional file 1 provides a summary of the self-reported measures collected.

  3. Monthly follow-up

Three items were collected using monthly follow-up via Short Message System (SMS) text or one-page postal questionnaire (depending on participant preference): average pain intensity (same wording as GP EMR screen), distress due to pain, and pain self-efficacy using:

  • How much distress have you been experiencing because of your pain, on average, over the last 2 weeks? [Responses from 0 = no distress to 10 = extreme distress]

  • How confident have you felt about managing your pain by yourself e.g. medication, changing lifestyle? [Responses from 0 = not at all confident to 10 = extremely confident]

  4. Anonymised GP medical record audit

An anonymised audit of medical record data was conducted in all 8 GP practices for patients in whom the trial EMR participant identification screen had been completed, including:

  i) prescriptions (categorised into simple analgesics, non-steroidal anti-inflammatories (NSAIDs), neuromodulators, muscle relaxants, corticosteroid injections and opioids)

  ii) referrals (categorised into physiotherapy/MSK interface services, secondary care specialist services including orthopaedics, pain clinics, and rheumatology)

  iii) imaging (categorised into x-rays/MRI scans, MSK ultrasound scans and bone density scans)

  iv) sick certifications or ‘fit-notes’ (categorised into number per patient and mean length in days)

  v) repeat MSK general practice consultations.

Sample size

Whilst sample size calculations for pilot cluster trials are known to be difficult [62], the initial plan was to carry out an internal pilot trial with a 3-month recruitment phase, that mirrored the methods of the main cluster trial but was limited to assessing feasibility within 8 GP practices (4 intervention and 4 control) prior to involvement of a further 22 GP practices (30 in total). If the internal pilot had achieved its success criteria, we had planned that these 8 randomised practices would continue to recruit patients for a full 6-month period, and their data included in the main trial. Hence, we anticipated recruiting 500 patients from the 8 practices over the first 3-months in the internal pilot trial, with a further 500 participants to be recruited from those practices (and in addition 2750 from a further 22 practices for the main trial phase).

Results

Objective 1: general practice and participant recruitment and retention rates

There were 32 general practices who expressed an initial interest in participating in the pilot trial from the West Midlands region of England, of which 16 agreed to a face-to-face introductory meeting with the research team, and 8 were recruited (with written agreements) and randomised (4 intervention, 4 control). The reasons given for declining participation included the practice lacking capacity in terms of resource at that particular time (n = 2), unwillingness to participate in the training session (n = 2), unwilling to use the EMR participant identification screen (n = 2), being already involved with another MSK pain research study (n = 1), and a perception that the practice’s patient population would struggle to respond to the self-report questionnaires (n = 1). The 8 participating practices had a total adult practice population size of 58,307 (25,697 intervention, 32,610 control). The smallest practice had 3 GPs and a registered adult population of 3992; the largest had 9 GPs and 13,359 adult patients. In total 59 GPs identified patients for the trial (39 in control practices and 20 in intervention practices).

Patient recruitment and follow-up through the trial are described in Fig. 3. Recruitment started on 11/10/2016 and the last practice template was deactivated on 24/05/2017, with the last invite reminder sent on 21/06/2017 and the last patient providing consent to data collection on 21/07/2017. There were 3063 potentially eligible patients screened by GPs at the point-of-consultation. The EMR participant identification screen was completed, with confirmed eligibility, in 1281, of whom 1237 were invited by postal letter to participate in data collection; 567 initial questionnaires were returned with written consent to participate in data collection, and 524 responses were received within the 4-week eligibility time-period (231 intervention and 293 controls). Recruiting 500 patients took 28 weeks, more than twice as long as the original estimate (12 weeks). Recruitment varied substantially between the 8 practices (range n = 11–127), suggesting the need to account for this variation within the main trial sample size calculation. Once 500 participants were recruited, the EMR participant identification screen in practices was switched off; however, we recruited a further 24 participants (n = 524 in total) over the following month (33 weeks in total) due to the time lag in sending invitations and receiving patient consent to data collection (via the post).
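The stage-to-stage conversion through this recruitment funnel can be recomputed from the counts reported above. The sketch below uses only those counts; the stage labels and the function name are illustrative.

```python
# Counts as reported in the text; labels are shorthand for each stage.
funnel = [
    ("Screen activated (potentially eligible)", 3063),
    ("Screen completed, eligibility confirmed", 1281),
    ("Invited by post", 1237),
    ("Initial questionnaire returned with consent", 567),
    ("Returned within 4-week window", 524),
]


def step_rates(stages):
    """Return (label, percent of previous stage) for each stage after the first."""
    rates = []
    for (_, previous), (label, count) in zip(stages, stages[1:]):
        rates.append((label, round(100 * count / previous, 1)))
    return rates


rates = step_rates(funnel)
# Overall yield: final participants as a percentage of screen activations.
overall = round(100 * funnel[-1][1] / funnel[0][1], 1)

for label, percent in rates:
    print(f"{label}: {percent}% of previous stage")
print(f"Overall yield: {overall}%")
```

Expressing each stage as a percentage of the previous one shows where most attrition occurs, which is the information needed when planning per-practice recruitment for a larger trial.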

Fig. 3 Participant flowchart

The overall participant 6-month follow-up rate for the intended future RCT primary outcome of pain intensity was 477/524 (91.0%); usual care 209/231 (90.4%), intervention 268/293 (91.4%). The response rate for monthly pain intensity scores at 5 or more time-points (the maximum possible was 6) was 82.6%, with data for 3 time-points available in 91.8%. Fifteen patients withdrew over the 6-month follow-up period: 5 from intervention practices (2 due to illness/surgery/poor health, 1 due to moving house, and 2 did not want further contact about the study), and 10 from control practices (5 due to illness/surgery/poor health, 1 had died (unrelated), 2 withdrew because they felt recovered, and 2 did not want further contact). There were no related, unexpected serious adverse events or harms reported. At 6-month follow-up patients reported 11 hospital admissions (5 intervention, 6 control) related to their MSK pain (e.g. knee replacement or shoulder surgery). Missing data items in the questionnaires remained less than 5%. Anonymised medical record data were available for 1281 patients (529 from intervention practices and 752 from control practices).

This objective was judged only “partially successful” against its success criterion (the template activated in approximately 2000 consultations leading to a minimum of 500 participants providing consent within an expected 3-month recruitment period and a follow-up rate of > 75% with less than 5% missing items in patient questionnaires): although patient recruitment and retention were “successful”, recruiting 500 patients took 28 rather than 12 weeks.

The learning/change needed ahead of the main trial included reducing the main trial sample size (following discussion with the independent Trial Steering Committee and funder) by removing the pre-specified sub-group analysis (at the risk-subgroup level) and instead powering the trial for the overall comparison between intervention and control arms. In addition, the main trial sample size was re-calculated based on the following: Firstly, the pilot recruitment rate showed that the template was completed in just under 40% of cases, and from the subsequent letter of invitation 40% returned their initial questionnaire and provided consent to participation in the data collection (on average, 60 patients per practice). A conservative estimate (50 patients per practice) was therefore used for the main trial. Secondly, the proportions expected within each of the three risk subgroups, as determined from the self-complete questionnaires, were revised based on the pilot trial findings, to: 32% low risk, 55% medium risk, 13% high risk. This was important as the trial was powered to detect superiority of stratified care in the medium and high risk subgroups, with an expected effect size of 0.20.

Thirdly, for GP cluster parameterisation, we made the following estimates, based primarily on previous guidelines, as pilot trial figures need to be viewed cautiously given the possible lack of precision [62]. For the main trial primary outcome (pain intensity) we conservatively allowed for an intracluster correlation coefficient (ICC) of 0.01, based on a guideline from previous primary care trials [63] and on the pilot trial ICC being considerably lower (0.004). Our main trial estimated coefficient of variation in recruitment per practice is also based on a guideline estimate of 0.65 [64], with the pilot figure being similar at 0.66. Our expected loss to follow-up across all time-points is conservatively estimated at 25%; in the pilot it was around 5%. Lastly, our repeated measures correlation is estimated using a guideline figure of 0.7 [65], which is conservative based on our pilot trial figure of 0.65. These factors combine to give a sample size inflation factor of × 2.3 (based on an average cluster size of about 50 participants per practice in 6 months). Correlation of data within 6 repeated measurements and correlation of follow-up scores with baseline score are typically 0.7 and 0.5, respectively, which combine to give a sample size deflation factor of × 0.5. The product of the inflation and deflation effects results in a magnification of 1.15 compared to a conventional, individual-patient, single follow-up comparison, for which the sample size requirement would be 525 per treatment arm (or 1050 in total). The adjusted sample size target for the main trial was therefore 600 patients per arm (1200 in total) from approx. 24 general practices (approx. 12 per arm).
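The arithmetic behind these factors can be checked with a short sketch. The formulas used in the original calculation are not stated, so the two below are assumptions: a design effect for unequal cluster sizes of the form 1 + ((cv² + 1) × m − 1) × ICC, divided by the retained proportion after loss to follow-up, and a repeated-measures analysis-of-covariance variance factor of (1 + (r − 1) × ρ) / r − ρ_b². With the parameter values quoted above, these assumed formulas give figures close to the reported × 2.3 and × 0.5.

```python
def inflation_factor(mean_cluster_size, icc, cv, loss_to_follow_up):
    """Cluster design effect with variable cluster size, inflated for attrition.

    Assumed formula (not stated in the text):
    (1 + ((cv^2 + 1) * m - 1) * icc) / (1 - loss)
    """
    design_effect = 1 + ((cv ** 2 + 1) * mean_cluster_size - 1) * icc
    return design_effect / (1 - loss_to_follow_up)


def deflation_factor(n_follow_ups, rho_follow_up, rho_baseline):
    """Variance factor for the mean of r follow-ups adjusted for one baseline.

    Assumed formula (not stated in the text):
    (1 + (r - 1) * rho) / r - rho_baseline^2
    """
    return (1 + (n_follow_ups - 1) * rho_follow_up) / n_follow_ups - rho_baseline ** 2


# Parameter values quoted in the text
inflation = inflation_factor(mean_cluster_size=50, icc=0.01, cv=0.65,
                             loss_to_follow_up=0.25)
deflation = deflation_factor(n_follow_ups=6, rho_follow_up=0.7, rho_baseline=0.5)

# Round the inflation factor to one decimal place, as quoted, before combining.
net_factor = round(inflation, 1) * deflation
per_arm = 525 * net_factor  # conventional requirement of 525 per arm, adjusted
```

Under these assumptions the adjusted requirement comes out at a little over 600 per arm, consistent with the rounded target of 600 per arm (1200 in total).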

Objective 2: to examine evidence of selection bias

Table 1 shows a descriptive evaluation of individual participant demographics and characteristics, split by trial arm and by participants and non-participants. Whilst most characteristics were similar (e.g. sex) between intervention and control arms, suggesting minimal selection bias, there were a few differences between participants and non-participants (e.g. overall, participants were slightly older and from more deprived areas). Mean pain intensity (0–10 Numerical Response Scale (NRS)) at the point-of-consultation was similar between participants (6.33, SD 2.05) and non-participants (6.35, SD 2.10), but pain scores were 0.5 points higher in participants in the intervention arm than in the control arm, although this difference had disappeared by the time of the initial patient questionnaire (typically 1–3 weeks later).

Table 1 Patient baseline characteristics

Overall there were few differences across other characteristics, and the pre-specified success criterion for this objective (finding little evidence of selection bias) was judged “successful”. There were, therefore, no changes required to recruitment procedures for the main trial.

Objective 3: assessing GP fidelity to the stratified care intervention

GPs from intervention practices used the stratification tool within the EMR in 513/1591 (32%) of eligible patients, which was “unsuccessful” according to our pre-specified success criteria. GP fidelity to choosing recommended matched treatment options (shown in Table 2) achieved “complete success” with 81% of patients at low risk, 89% for medium risk and 87% for patients at high risk being correctly matched to a recommended treatment.

Table 2 GP fidelity to the recommended matched treatment options

Through the nested qualitative research (reported separately, [41]) and feedback discussions with the participating GPs about the reasons for the low rate of completion of the tool, we gathered a number of insights to inform the main trial. Firstly, GPs perceived that using the whole EMR template increased their consultation workload and asked for the treatment options to be simplified. They also reported that the stratified care intervention was only appropriate for consultations where MSK pain was the primary reason for the consultation, where they could focus on the MSK pain problem. GPs also admitted that patients had frequently left the consultation room before they used the EMR and that they did not use the tool when their clinics were very busy. We therefore agreed in the future main trial to lower the expected proportion of MSK related consultations in which the tool would be used at the point-of-consultation from 50 to 25%. We also identified that some GPs rarely coded MSK pain consultations and that others tended to use ‘Synonym’ codes, which are a set of diagnostic codes that needed to be removed from the list of codes used to activate the EMR participant identification screen, as they caused it to activate in error for a range of non-MSK pain problems (e.g. chest pain). It was agreed that for the main trial the GP training needed to include ways to mitigate these issues. GPs also recommended reducing the 4 h of intervention training to 2 h and providing a dedicated NHS physiotherapy pathway for patients in the main trial to overcome GPs’ concerns about over-loading physiotherapy services with patients with MSK pain. Finally, GPs reported feeling uncomfortable with the self-report style wording of the development version of the Keele STarT MSK tool. For example, they felt certain items could be modified to be less ‘clunky and awkward’ to ask (e.g. item 4: “Do you have any other important health problems?”, which confused/unsettled patients when asked by their own family doctor, who they expected to know their health problems well). We therefore developed a clinician-completed version of the Keele STarT MSK tool for use in the main trial, to overcome these wording problems, but keeping the item constructs as similar as possible. A license to obtain both the original self-report and clinician completed versions of the tool is available on request at www.keele.ac.uk/startmsk.

Objective 4: describing GP decision-making and patient outcomes in both arms

The results from the EMR audit of GP decision-making in MSK consultations are shown in Table 3 (split by intervention and control). GPs in intervention practices prescribed fewer opioids and more over-the-counter medication and anti-inflammatories than GPs in control practices. In addition, they gave more written self-management information to patients, used less MSK-related imaging and referred patients to physiotherapy earlier than GPs in control practices. Numbers of corticosteroid injections, sick certifications and repeat MSK pain-related general practice consultations over 6 months were similar in intervention and control practices.

Table 3 Comparison of GP decision-making between intervention and control practices

Descriptive data on patients’ clinical outcomes over 6 months of follow-up are presented in Table 4. Mean (SD) 6-month pain intensity was 3.93 (2.98) in participants in intervention practices and 4.18 (2.88) in participants in control practices. Most other 6-month outcomes were similar, although there was less MSK-related time-off-work in participants from intervention practices (17.4%) than control practices (25.4%). We did not statistically compare these outcomes in this pilot trial.

Table 4 Clinical outcome measures at 6-month follow-up by intervention arm

Discussion

This feasibility and pilot trial examined whether a future definitive cluster RCT would be feasible with respect to recruitment and retention rates, potential selection bias and GP intervention fidelity to stratified care at the point-of-consultation for adults with MSK pain.

We originally planned this study as an internal feasibility and pilot trial. Our findings showed that participant retention rates were high, that GPs matched patients to recommended treatment options well (> 80% of cases) and that there was little evidence of selection bias; the cluster trial design was therefore deemed suitable for the future main trial. However, recruiting participants took over twice as long as expected (28 rather than 12 weeks), and GPs completed the Keele STarT MSK Tool in fewer patient cases than we had hoped for (they used it in 32% of patient cases when the target was > 50%). The nested qualitative study findings [41] and feedback discussions with participating GPs explored the reasons why only two of the four pre-specified pilot trial success criteria were met. These identified, in particular, the challenge of using the EMR template and stratified care intervention when MSK pain was not the primary reason for the consultation.

GPs also suggested a number of positive changes to make prior to the future definitive RCT, and thus this study became an external pilot trial. These changes included simplifying the recommended treatment options and developing a clinician-completed version of the Keele STarT MSK Tool. Furthermore, we agreed to lower the expected proportion of MSK consultations in which the tool would be used from 50% to 25%, as we were unable to stop the EMR template from firing in consultations where MSK pain was one of several co-existing problems and not the main focus of the consultation. We also agreed to give GPs training specifically about the issue with ‘Synonym’ codes that failed to activate the EMR participant identification screen and reduced the intervention GP training from 4 h to 2 h. Lastly, we organised for NHS physiotherapy services receiving patients from participating intervention practices to provide a dedicated pathway for patients in the main trial. This pathway was put in place to overcome GPs’ concerns about their referrals over-loading NHS physiotherapy services with patients with MSK pain, and we specified that it was strictly not allowed to increase the speed of access to physiotherapy treatment for intervention participants.

The main STarT MSK trial is currently ongoing (ISRCTN15366334).

Conclusions

This feasibility and pilot trial has successfully demonstrated the feasibility of the cluster RCT design, with high retention rates over 6 months (> 90%) and little evidence of selection bias, although changes to the main trial sample size were required due to a slower than expected recruitment rate. GP point-of-consultation fidelity to the stratified care intervention was mixed, with GPs using the tool less often than expected (only when they coded consultations, when they had time and when MSK pain was the primary reason for the visit). However, there was high fidelity to choosing recommended matched treatment options (> 80% of cases). The learning from this feasibility and pilot RCT has led to a number of important changes prior to the main STarT MSK trial, which is testing the clinical and cost-effectiveness of stratified primary care for patients with MSK pain.