Background

Back pain is a widespread and costly health problem in many countries including the UK [1]. In the UK the number of days of Invalidity Benefit attributable to spinal disorders rose threefold over the 1980s [2]. There is little evidence that the rise in reported disability reflects changes in pathology, prevalence or even morbidity. Since the early 1990s there have been many systematic reviews of the effectiveness of physical interventions for low back pain. There appears to be a consensus, reflected in national guidelines in the UK [3, 4], USA [5], the Netherlands [6] and elsewhere [7], that patients with acute back pain should be encouraged to return to normal activity as soon as possible and have early access to physical therapy. It is less clear what form such physical therapy should take. Two commonly suggested physical treatments for low back pain are manipulation and physiotherapist-led exercise programmes. The evidence to support their use is far from conclusive.

Evidence for manipulation

There have been many systematic reviews of the effectiveness of manipulation. Koes et al. [8] reviewed 38 trials and concluded that, although some results were encouraging, further trials were needed to establish the effectiveness of manipulation. In contrast, Shekelle et al. [9] did a meta-analysis combining data from nine trials and concluded that manipulation could increase the rate of recovery from acute uncomplicated low back pain, but that there were insufficient data to provide evidence for the effectiveness of manipulation in patients with chronic pain. The US Agency for Health Care Policy and Research (AHCPR) [5] reviewed four meta-analyses and 12 additional randomised trials and also concluded that manipulation could speed the recovery of patients with acute back pain and that the evidence to support the use of manipulation for radiculopathies or longer standing back pain was inconclusive. The systematic review of reviews by Assendelft et al. [10] was highly critical of the general standard of reviews. Nevertheless, nine of their best ten reviews, as judged by methodological criteria, reported positive effects of manipulation.

Subsequently Assendelft et al., in the most recent systematic review of which we are aware [11], concluded that spinal manipulative therapy has no statistically or clinically significant advantage over general practice care, analgesics, physical therapy, exercise or back school for acute or chronic back pain. However, they did find some advantages for spinal manipulation when compared to sham manipulation or therapies judged to be ineffective or harmful. This systematic review, which was published after we had collected our data, went on to recommend that any future trials of manipulation should concentrate on cost-effectiveness rather than effectiveness. This trial includes a cost-effectiveness analysis.

Evidence for exercise

Fewer systematic reviews have examined the effectiveness of exercise for back pain. Koes et al. [12] examined 16 randomised trials and found most to be of poor quality with inconclusive results. The AHCPR [5] extended this to 20 randomised trials, of which six related to acute back pain. With the exception of one well-designed study [13], however, the studies included other interventions that made it difficult to evaluate the effects of exercise. Faas [14] extended the data set to 28 randomised trials but concluded that there was insufficient evidence that specific back exercises produce clinically significant improvement in acute low back pain. While endorsing these conclusions, Waddell et al. [3] cited evidence that general exercise programmes can improve pain and functional levels in those with chronic low back pain. More recently, a Cochrane review identified 39 studies and concluded that the data did not indicate that specific exercises are effective for the treatment of acute low back pain. However, (general) exercises may be helpful for chronic low back pain patients to increase return to normal daily activities and work [15].

Waddell et al. [16] carried out a systematic review of advice about staying active with back pain and concluded that advice to continue normal activities leads to less chronic disability and time off work than the traditional advice to rest and 'let pain be your guide'. Subsequent Cochrane reviews of treatments for acute low back pain and sciatica concluded that 'advice to stay active alone has little beneficial effect for patients' [17] and that, compared to bed rest, advice to stay active alone will have limited beneficial effects [18].

On the basis of the limited evidence available the UK Clinical Standards Advisory Group [2] and the authors of the national acute low back pain guidelines, produced by the Royal College of General Practitioners (RCGP) [3, 4], advised widespread access to physical treatment for back pain. However, because the evidence underpinning this approach is weak, and the health and social impact of back pain is large, back pain became an NHS research priority. Consequently, the UK MRC formed an international working party, including back pain clinicians and researchers from different disciplines, to design a national trial to evaluate the effectiveness of different physical treatments for back pain in primary care: the UK Back Pain Exercise and Manipulation Trial (UK BEAM).

Methods

Trial design

The trial had a two-dimensional 3 × 2 factorial design [19]. Each participant was randomised:

  1. between the 'Back to Fitness' progressive exercise package and general practice management; and

  2. between the spinal manipulation package and general practice management.

Those randomised to the manipulation package were further randomised, with equal probability, to be treated in NHS or private premises, to allow the effect of treatment location on outcome to be measured. Each participant had an equal chance of receiving general practice management only, the manipulation package only, the exercise package only or both manipulation and exercise (Figure 1).

Figure 1 Trial design
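The allocation probabilities implied by this design can be enumerated directly. The following is a minimal sketch, assuming two independent equal-odds randomisations and an equal NHS/private split within manipulation, as the text describes; the variable names and labels are illustrative and do not come from the trial's randomisation service.

```python
from fractions import Fraction
from itertools import product

# Two independent randomisations, each with equal odds (illustrative labels).
exercise_levels = ["no exercise", "exercise"]
manipulation_levels = ["no manipulation", "manipulation"]
half = Fraction(1, 2)

cells = {}
for exercise, manipulation in product(exercise_levels, manipulation_levels):
    probability = half * half
    if manipulation == "manipulation":
        # Manipulation arms are split equally between NHS and private premises.
        for premises in ("NHS premises", "private premises"):
            cells[(exercise, manipulation, premises)] = probability * half
    else:
        cells[(exercise, manipulation, None)] = probability

for cell, probability in cells.items():
    print(cell, probability)
```

Under these assumptions there are six cells: two with probability 1/4 (no manipulation) and four with probability 1/8 (manipulation by premises, with or without exercise).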

Hypotheses tested

Essentially explanatory hypotheses

The trial tested a total of four 'null' hypotheses. The three primary null hypotheses were that, in people consulting their general practitioners with back pain, there is no difference in clinical outcome between:

  a. Those receiving an additional defined package of spinal manipulation or general practice management only;

  b. Those receiving this package of spinal manipulation in NHS premises or those receiving it in the private premises of chiropractors, osteopaths or physiotherapists; and

  c. Those receiving a defined package of exercises (Back to Fitness) or general practice management only.

The secondary null hypothesis was that:

  d. In those people receiving both spinal manipulation and the Back to Fitness exercise package, the improvement in clinical outcome is merely the cumulative effect of those treatments, i.e. that there is no interaction.

For hypothesis b, the feasibility study confirmed that we could recruit therapists to perform spinal manipulation from those working in their own premises who also worked, or were willing to work, in NHS premises. For hypothesis d, the feasibility study confirmed that those performing spinal manipulation and those providing Back to Fitness exercise classes could deliver their own package to participants who also received the other package.

Essentially pragmatic estimates

The trial will estimate (to within a confidence interval) the cost per unit of health utility gained by adding to general practice management:

  a. The spinal manipulation package in NHS premises;

  b. The spinal manipulation package in private premises;

  c. The exercise package; and

  d. The spinal manipulation package (NHS or private) plus the exercise package.

Interventions

General practice management (the 'control' treatment)

All participants in the trial continued to be managed by their general practitioners even if randomised to spinal manipulation or the exercise package. Members of the MRC Working Party and the trial team developed an active management package for use by general practices within the trial [20]. This included training in active management for all clinical and support staff from all participating practices.

The AHCPR [5], the Clinical Standards Advisory Group (CSAG) [2], and the RCGP [3, 4] have all advocated active management. The recommended advice is also available through the evidence-based patient resource 'The Back Book' [21]. An early evaluation showed that this booklet can produce positive changes in beliefs about back pain [22]. Open-ended questions showed that almost all respondents found it easy to understand, interesting, and helpful, and assimilated its main messages. A more recent trial showed that it improved beliefs about back pain and had a positive effect on outcomes [23].

Thus we based the active management package used in the trial on the RCGP guidelines [3, 4] and The Back Book [21]. With the support of general practitioners and their staff (practice nurses and non-clinical support staff) it aimed to achieve early activity and allay fears about future consequences of back trouble. The message concentrated on minimising bed rest, promoting ordinary physical activity, remaining or returning to work, and encouraging positive attitudes to pain.

We actively sought the support of all relevant staff for this package since changes in clinical management may be successful when the entire clinical team delivers the innovation in a consistent manner [24]. To this end we issued all general practitioners and nursing staff in trial practices with copies of the RCGP evidence base [4] and clinical guidelines. To reinforce the theory, principles, and practice of active management we invited all members of participating practice teams to a training session. We also trained them in distributing and emphasising the messages in The Back Book to patients with back pain. The Back Book complements the clinical guidelines and thus reinforces the information and advice given by practitioners and other primary care staff.

We encouraged practitioners to limit prescriptions to a small range of drugs including analgesics (paracetamol with or without codeine) and non-steroidal anti-inflammatory agents (diclofenac or ibuprofen or naproxen). However, adherence to this guidance is not crucial since there is little evidence that analgesics differ in their effectiveness in back pain. Adherence was assessed by collecting data on other treatments used by participants from participant questionnaires and examination of the medical record.

We recruited general practitioners to the trial only if they agreed not to perform spinal manipulation or manual therapy on trial participants. We discouraged them from referring participants to other physical therapists or other specialists especially during the initial three months of treatment although not during the remainder of the follow up period.

We originally planned to compare the effect of training practice teams in the active management of low back pain with conventional care in general practice as an additional layer of our factorial design. Thus in the feasibility study we randomised 26 general practices in two centres (West Yorkshire and Stockport) between an active management training programme and conventional general practice management. For the active management comparison we randomised by practice rather than individual because of evidence from the guideline literature that the chances of successful implementation are greater when an entire clinical team (or practice) delivers an innovation coherently and consistently [24]. Practice-based randomisation also circumvents the possibility of 'contamination' between the active intervention and usual management.

However, the active management practices recruited more than twice as many participants (165) as usual management practices (66). Furthermore these additional participants reported a milder form of back pain. Thus continuation with this comparison would have made the other trial comparisons difficult to interpret and could have adversely affected recruitment to the main trial. With the agreement of the MRC Trial Steering Committee, we modified the study design of the main trial. For the main study we provided the active management training package to staff from all practices.

In light of the existing evidence, the wide distribution of the acute back pain treatment guidelines, and the increasing popularity of the active management philosophy, we judged that active management represented 'best care in general practice'. Additionally this would improve recruitment and standardise the 'control' treatment. We therefore chose this approach to the management of low back pain as our 'control' treatment.

Manipulative package

Different practitioners have practised spinal manipulation for many years. In the UK, Acts of Parliament in the mid-1990s formally recognised chiropractic and osteopathy. At the same time, the physiotherapy profession has been increasingly utilising manual therapy. In the UK, similarities between chiropractors, osteopaths and some physiotherapists seem to be much greater than their differences. The MRC Working Party therefore agreed that to compare the effectiveness of the three disciplines was unnecessary, and probably impossible, in view of the large sample size needed.

There is evidence that people with back pain may derive modest benefit from spinal manipulation. In the UK, at the time the trial was designed, the only substantial trial of manipulation was in secondary care, suggesting there may be some longer term benefit [25, 26]. However, the design of the trial was criticised because it compared NHS outpatient care with treatment in private chiropractic premises [27] raising the possibility that the location and style of treatment might explain the differences that were found.

Members of the MRC Working Party developed a package in collaboration with the three physical therapy professions involved (chiropractic, osteopathy and physiotherapy) who agreed to use it in this trial. It defines a common core of manipulative practice while permitting enough flexibility in both assessment and treatment to be representative of all three professions. The package, described in detail elsewhere [28], comprised the commonly used range of manual treatment techniques together with non-manual elements usually used by physical therapists (for example exercises and hot/cold packs); oral advice was to be consistent with The Back Book [21].

For individual participants the package comprised a series of scheduled sessions with the same therapist. The maximum of eight sessions could be spread across the intervention period of 12 weeks (six weeks for participants randomised to both manipulation and exercise) at the discretion of the practitioner. We provided a detailed manual to all therapists delivering manipulation treatments as part of the trial. They also attended familiarisation sessions and were asked to commit to the 'active' philosophy of the trial.

Exercise package

Physiotherapy is widely and increasingly available in UK primary care. The evidence to support the use of any individual physiotherapy technique in those with back pain is scant. There is some weak evidence to support the use of intensive extensor strengthening exercises [29, 30] and for group education or 'back schools' [31, 32]. There is better evidence for the effectiveness of a progressive exercise programme based on cognitive behavioural principles [14, 33–35] and for the benefits of exercise programmes aimed at endurance, muscular strength and normal use of the back [5, 33, 29, 34–37]. Building on previous small UK studies [34, 35], members of the MRC working party developed the Back to Fitness exercise programme. The aims of the programme were to:

  • encourage normal movement;

  • increase participants' confidence in their spines; and

  • help participants take control of their problem.

This trial will help to confirm whether:

  1. This type of exercise programme is effective in the UK NHS, as much of the evidence is from studies in Scandinavia and the US;

  2. An exercise programme is effective among patients recruited from primary care, as most studies have either been in hospital or in the work place; and

  3. The 'Back to Fitness' programme is effective when implemented nationally.

The Back to Fitness programme is based on the principles of circuit training and incorporates a cognitive-behavioural approach. The programme is described in detail elsewhere [38]. The exercise classes took place in community settings accessible to patients from participating general practices, usually in the early evening. Each class had up to ten participants. If necessary to maintain group size, additional non-trial participants were recruited to the classes. The classes lasted one hour and took place twice a week over a four-week period. Those who found this schedule difficult could spread their classes over a period of, at most, 12 weeks (six weeks for participants randomised to both manipulation and exercise). Before joining, each participant in the exercise class had an individual assessment by a physiotherapist. We asked the physiotherapists to provide advice consistent with messages in The Back Book [21]. Participating physiotherapists attended an intensive study day led by a multidisciplinary team including a behavioural scientist specialising in this field and therapists with experience of similar programmes.

Combination of therapies

If the trial showed that the manipulative package [28] and the Back to Fitness programme [38] were both effective, those who received a combination of these two might experience an average health gain greater or less than the sum of the average health gains attributable to each individual therapy. For example, theoretically the combination of manipulation and exercise could be especially beneficial if the exercise maintained any new range of movement gained through manipulation [39]. Thus it is important to test whether there are any positive or negative interactions between the two treatments. Those participants randomised to receive both spinal manipulation and exercise attended the manipulator exclusively for the first six weeks and the exercise programme for up to another six weeks. Thus all treatment was complete within the specified 12-week period. This arrangement meant that participants were not under the clinical management of different therapists at the same time. However, both treatments occupied a shorter period in the combined arm than in the other two intervention groups. Analysis will test whether participants receiving the combined package experience an average health gain significantly less or significantly greater than the sum of the average health gains attributable to each individual therapy.
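The interaction test described above amounts to asking whether the combined arm departs from simple additivity. The sketch below illustrates that contrast with a hypothetical helper, `interaction_contrast`, and invented mean health gains; it is not the trial's analysis code and says nothing about statistical significance.

```python
def interaction_contrast(gain_control, gain_manipulation, gain_exercise, gain_both):
    """Return how far the combined arm departs from the sum of the two single effects.

    Each argument is a mean health gain for one arm. A result of zero means
    the effects are purely additive (no interaction); a positive result means
    the combination gains more than the sum, a negative result less.
    """
    effect_manipulation = gain_manipulation - gain_control
    effect_exercise = gain_exercise - gain_control
    effect_both = gain_both - gain_control
    return effect_both - (effect_manipulation + effect_exercise)


# Invented mean gains, chosen only to illustrate the arithmetic.
additive = interaction_contrast(1.0, 2.5, 2.0, 3.5)
sub_additive = interaction_contrast(1.0, 2.5, 2.0, 3.0)
print(additive, sub_additive)
```

In the first invented case the combined gain equals the sum of the separate effects, so the contrast is zero; in the second the combination falls short of the sum, giving a negative contrast.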

Inclusion and exclusion criteria

The target population was people aged between 18 and 65 years presenting in general practice with non-specific back pain with or without referred leg pain. We defined back pain as pain of musculoskeletal origin in the area bounded by the lowest palpable ribs, the gluteal folds, and the posterior axillary lines. We included those with pain referred into the legs provided it was predominantly above the knee. To avoid carry-over effects, participants should not have had physical therapy in the previous three months. To facilitate participation in the exercise class and assessment they had to be fluent in English and able to read and write.

To avoid including participants who recovered rapidly from an acute episode without specific treatment, we included those whose current episode of back pain had lasted at least four weeks from their initial consultation. They did not have to have had pain every day of this episode. They were eligible if they had had back pain either for 28 consecutive days, or, for at least 21 of these days and at least 21 of the 28 days before that. No upper limit to the duration of pain was set. Exclusion criteria are summarised in Figure 2.

Figure 2 Exclusion criteria
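The pain-duration rule above can be expressed as a small check. This is a sketch under stated assumptions: the function name `meets_duration_criterion` and the day-by-day boolean record are hypothetical, "these days" is read as the most recent 28 days, and a record shorter than 56 days is treated as unable to satisfy the second route. The other inclusion and exclusion criteria are not modelled.

```python
def meets_duration_criterion(pain_days):
    """Check the episode-duration rule on a daily record of back pain.

    pain_days: list of booleans, oldest first, ending on the assessment day;
    True means back pain was present on that day (hypothetical data format).
    """
    if len(pain_days) < 28:
        return False
    last_28 = pain_days[-28:]
    # Route 1: pain on each of 28 consecutive days.
    if all(last_28):
        return True
    # Route 2: at least 21 of the last 28 days AND at least 21 of the 28 days before.
    if len(pain_days) < 56:
        return False
    previous_28 = pain_days[-56:-28]
    return sum(last_28) >= 21 and sum(previous_28) >= 21


intermittent = [True] * 21 + [False] * 7 + [False] * 7 + [True] * 21
print(meets_duration_criterion([True] * 28))   # continuous pain
print(meets_duration_criterion(intermittent))  # 21 of 28 days in each period
```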

Serious adverse events

We defined a serious adverse event as one leading to hospital admission or death within one week of treatment within the trial. To ensure that we recognised any relevant events the manuals describing spinal manipulation and exercise classes each asked participating health professionals to report and follow up any potential events. Additionally, the follow-up questionnaire after three months asked participants whether they had been admitted to hospital, and practice research nurses searched participants' records for hospital discharge summaries.

Participant recruitment and allocation to treatments

General practitioners and other practice team members notified the practice research nurse of patients consulting for back pain. In practices that computerised their consultation data the nurses also did regular computer searches for back pain consultations. Potential participants were sent an invitation letter, an information sheet, and a brief questionnaire covering the main inclusion and exclusion criteria.

Those interested in the trial made an initial appointment with the nurse. At this appointment the nurse confirmed eligibility by collecting further data. She asked about employment status and how the current back pain was affecting them and explained the treatment packages available through the trial. Data on those who did not join the study will allow us to generalise our findings.

Those who appeared eligible and interested in participating in the trial saw the nurse again at least one week later. This qualifying week gave them time to reflect on the implications of taking part in the trial [40]. It also allowed the general practitioner to confirm that they appeared suitable for the trial. Randomisation took place at least four weeks after the initial consultation with the practice, to exclude anyone who made a rapid spontaneous recovery. Those who no longer met the severity entry criterion, a Roland Morris Disability Questionnaire (RDQ) [41] score of four or more, could have become eligible if their condition subsequently deteriorated, without needing to go through the waiting period again.

Following consent, participants completed the main baseline questionnaire. The research nurse then telephoned the York randomisation service to obtain the participant's random treatment allocation. Randomisation was stratified by practice.

Outcome measures

The main outcome measures were self-report questionnaires asking about participants' health, beliefs about back pain, and psychological profile. After recruitment, participants received similar follow-up questionnaires by post after one month, three months (when all treatment within the trial had finished) and one year. Non-responders received written reminders after two and four weeks, the second by recorded delivery. Practice research nurses also prompted non-responders by phone at the time of the second reminder.

Health outcomes

The internationally agreed minimum data set for back pain research informed our selection of outcome measures [42]. We used two generic health outcome measures: the Short Form 36-item Health Survey (SF-36) [43], a valid and reliable measure that is responsive to changes in the health of people presenting to primary care with low back pain [40, 44]; and the EuroQol (EQ5D), which produces a single index based on health state valuations [45]. The primary outcome measure was the Roland Morris Disability Questionnaire (RDQ) [41], an established outcome measure for community based back pain studies [46]. We shall use it in its original form for our main analyses. However, we added four alternative items that, Patrick et al. [47] suggested, improved the performance of the RDQ for assessing the health status of patients with sciatica.

The two generic and one specific measure were complemented by:

  1. a general measure of pain and disability, derived from the chronic pain grade proposed by Von Korff et al. [48]. This has been shown to be acceptable, valid, and reliable when modified to measure back problems over a one month period in a postal questionnaire in the United Kingdom [49]; and

  2. a general question about the perceived global effect of the condition [50] (by analogy with the 'generic health transition' question of the SF-36 health survey [43], we describe it as 'specific health transition').

Back pain beliefs

We used the Fear Avoidance Beliefs Questionnaire (FABQ) [51] and the Back Beliefs Questionnaire (BBQ) [52] to measure participants' beliefs about back pain at recruitment and follow-up. The FABQ comprises 16 items that measure beliefs about the effect of physical activity and work on low back pain. The BBQ comprises 14 items that measure beliefs about the consequences of low back pain. There is evidence for the validity and reliability of both instruments [51, 52]. The BBQ is also responsive to changes in beliefs about back pain [53]. However, we excluded items from both measures. Low response rates in the feasibility study led us to drop the work component of the FABQ and base our analyses on the FABQ-physical scale only [51]. To reduce respondent burden we used only the nine items forming the 'inevitability' sub-scale of the BBQ [52].

Psychological instruments

Psychological instruments may predict the outcome of episodes of back pain in primary care [54, 55]. Psychological distress at recruitment was assessed by the Distress and Risk Assessment Method (DRAM) [56]. This consists of 55 items that categorise people into four types – normal, at risk, distressed depressive, and distressed somatic. However, only 36 of these items contribute to the scale scores for its components, the Modified Somatic Perception Questionnaire [57] and Modified Zung [58]. To reduce respondent burden we included only those items that are scored. The DRAM is valid for use by those with low back pain.

Table 1 summarises the resulting portfolio of measures.

Table 1 Measures of health outcome, back pain beliefs and psychological profile

Health economic outcomes

The economic design for this trial is that of an incremental cost-utility analysis from the perspective of society as a whole. The main focus of the economic evaluation is the EuroQol – a generic measure of health utility [59]. However we shall also analyse the other outcomes in Table 1 to check whether choice of outcome measure affects the economic comparison of treatment packages.
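An incremental cost-utility analysis compares the extra cost of a package with the extra health utility it yields relative to the comparator. The sketch below shows the basic ratio only; the function name and the figures are invented for illustration, and the trial's own costing and confidence-interval methods are not represented.

```python
def incremental_cost_utility_ratio(cost_package, cost_control,
                                   utility_package, utility_control):
    """Return the extra cost per extra unit of health utility gained.

    Costs are mean costs per participant; utilities are mean health utility
    gains per participant (for example, derived from the EuroQol).
    """
    delta_cost = cost_package - cost_control
    delta_utility = utility_package - utility_control
    if delta_utility == 0:
        raise ValueError("No difference in utility: the ratio is undefined.")
    return delta_cost / delta_utility


# Invented figures: the package costs 150 more and gains 0.25 more utility.
ratio = incremental_cost_utility_ratio(350.0, 200.0, 0.75, 0.5)
print(ratio)
```

A ratio on its own can mislead when the utility difference is negative or very small, which is one reason the text speaks of estimating it to within a confidence interval.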

The other main economic benefit is the extent to which each of the (combinations of) treatment packages reduces subsequent use of health care over 12 months. Given the difficulty of disentangling back-specific from other costs, we shall estimate generic health care costs including hospital admissions and referrals, and consultations in primary care. Practice research nurses abstracted data from their records at the end of data collection.

We estimated the costs of receiving different packages in the form of time off work and normal activities and the costs of travel in both time and money. The feasibility study tested the most reliable means of collecting these data through an embedded randomised trial comparing simple prospective diaries and retrospective questions within the main outcome questionnaires. As diaries achieved lower response rates, we used retrospective questions in the main trial. These data from participants serve to validate, and elaborate upon, the estimates of health care use derived from practice records.

We collected data on the cost of each package including all fixed and variable costs incurred in delivering them. Each therapist and practitioner recorded direct costs on a simple form with an estimate of the time associated with each consultation including non-contact time. We valued these costs in terms of both opportunity costs to individual practices, and national averages, for greater generalisability. We shall estimate the cost of training for the Back to Fitness exercise package and include it in our evaluation of the widespread implementation of the packages in the trial. We shall also estimate the marginal cost of extending each package to individual practice populations.

Trial centres

The main trial recruited participants from practices that were part of the Medical Research Council General Practice Research Framework (GPRF; http://www.mrc-gprf.ac.uk/) in 12 centres across the UK – Belfast, Edinburgh & Tayside, Exeter, Harrow, Northampton, Norwich, Nottingham, Plymouth, Reading, Sheffield, Teesside and Wrexham & Chester. The aim was for each area to provide an average population of 83,000 patients registered with ten to 15 general practices. The practices and patients needed to be within easy travelling distance of the set of (at least) three premises where physical therapy was available – one in the community for the Back to Fitness exercise package and at least two for spinal manipulation, one within and one outside the NHS. In each centre a part-time administrative 'local co-ordinator', supported by a senior 'clinical adviser', had responsibility for organising the delivery of physical therapies.

Sample size

The primary outcome measure for this trial is the RDQ [41] used by many community-based back pain researchers in the UK and elsewhere. It is therefore the appropriate criterion by which to estimate the target sample size for the trial.

The MRC Working Party, originally responsible for the trial design, considered a difference of 2.5 points in the RDQ to be clinically important when comparing the manipulation or exercise packages with general practice care. A previous study [35], and baseline data from our feasibility study, found the standard deviation for the RDQ for participants recruited from primary care to be around 4.0.

However, the crucial comparison in estimating sample size is between manipulation in NHS and private premises (hypothesis 'b'). There would be much less value in testing hypothesis 'b' if a prior test had already established that there was no clinically important difference between spinal manipulation in general and general practice care. Thus hypothesis 'b' is most important when the test of hypothesis 'a' has established that spinal manipulation participants have average RDQ scores that are at least 2.5 points lower than participants just receiving general practice care. Furthermore the test of hypothesis 'b' is most in need of statistical power when this difference is exactly 2.5 points. It follows that the alternative hypothesis to hypothesis 'b' is one in which spinal manipulation participants in one location have average RDQ scores that are 3.33 points lower than general practice participants, while those in the other location have average RDQ scores that are 1.67 points lower than general practice participants. Thus for hypothesis 'b' we defined a clinically important difference as 1.67 points on the RDQ or 0.417 of the standard deviation of four points. To yield a power of 80 per cent of detecting such a difference using a one per cent significance level would need one year follow up data on two simple random sub-samples of 130. Since half the participants did not receive the manipulation package, a sample size of 520 was needed.
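The arithmetic behind these figures can be checked with a short sketch. It assumes the standard normal-approximation formula for comparing two means, which is not stated in the text; on that assumption the formula gives roughly 134 per group, close to but not identical with the 130 quoted, so the authors may have used a different method or rounding. The function name `n_per_group` is hypothetical.

```python
from statistics import NormalDist


def n_per_group(effect_size, alpha, power):
    """Normal-approximation sample size per group for a two-sided test of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2


# Figures taken from the text.
difference = 1.67          # clinically important difference for hypothesis 'b' (RDQ points)
standard_deviation = 4.0   # RDQ standard deviation in primary care
effect_size = difference / standard_deviation  # about 0.417

# Assumed method: normal approximation (not confirmed by the source).
approximate_n = n_per_group(effect_size, alpha=0.01, power=0.80)

# Scaling up the quoted 130 per sub-sample: two locations, and only half
# of all participants are allocated to manipulation.
quoted_n_per_location = 130
total_needed = quoted_n_per_location * 2 * 2

print(round(effect_size, 3), round(approximate_n), total_needed)
```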

Embedded within this population there are three types of clusters – clusters of participants from the same general practice, clusters of participants attending the same exercise class, and clusters of participants receiving spinal manipulation from the same therapist. It follows that the power of each of the three dimensions of this trial would depend on the relevant cluster sizes and intra-cluster correlation coefficients [60]. Using existing published estimates [35, 61–63] and data from our feasibility study we estimated the sample size inflation required to account for clustering effects at practice, exercise class, and manipulator levels. The largest inflation factor is required to account for intra-manipulator effects. In our feasibility study the intra-manipulator correlation coefficient was 0.04. In each centre in the main study there were two manipulators, each seeing patients in NHS and private premises, and thus each treating 25% of the sample in that centre. If the average cluster size were 19 (that is one quarter of the mean sample size of 75 per centre), we needed to inflate the sample size by a factor of 1.72 {viz. 1 + (19 - 1) × 0.04} [60].
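The inflation factor quoted above follows the usual design-effect formula, 1 + (m - 1) × ICC, where m is the average cluster size. A minimal sketch using the figures in the text (the function name is illustrative):

```python
def design_effect(cluster_size, intra_cluster_correlation):
    """Sample size inflation factor for clustered data: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * intra_cluster_correlation


mean_sample_per_centre = 75
share_per_manipulator = 0.25
# 75 * 0.25 = 18.75, which the text rounds up to a cluster size of 19.
unrounded_cluster_size = mean_sample_per_centre * share_per_manipulator

inflation = design_effect(cluster_size=19, intra_cluster_correlation=0.04)
print(unrounded_cluster_size, round(inflation, 2))
```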

We planned to have data from 900 participants after one year, slightly more than 520 × 1.72. These sample sizes yield a power of at least 99% of detecting a difference of 2.5 RDQ points between either exercise only or manipulation only and active management using a one per cent significance level. This would allow us to estimate the main effects of exercise or manipulation, even if we were to find significant interactions, or if the active management (best care) training package were to reduce the baseline scores of participants and give less scope for therapists to improve RDQ scores through manipulation or exercise (Table 2).

Table 2 Power of trial to detect differences across main comparisons

Allowing for one third loss to follow up at one year, we sought to recruit 1,350 participants. In our feasibility study, active management practices recruited 1.62 participants per 1,000 registered patients. Assuming a recruitment rate of 1.5/1,000 in the main study, in which all practices received the active management training package, we required practices with 900,000 registered patients for the main study.
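The chain of figures from the unadjusted sample size to the number of registered patients can be retraced as below. This is only an arithmetic sketch using the numbers given in the text; the variable names are illustrative, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

unadjusted_n = 520                      # sample size before allowing for clustering
inflation = Fraction("1.72")            # intra-manipulator inflation factor
clustered_n = unadjusted_n * inflation  # 894.4, rounded up to a planned 900

planned_followed_up = 900               # participants with one-year data
loss_to_follow_up = Fraction(1, 3)
to_recruit = planned_followed_up / (1 - loss_to_follow_up)

recruits_per_patient = Fraction(15, 10) / 1000   # 1.5 per 1,000 registered patients
registered_needed = to_recruit / recruits_per_patient

print(float(clustered_n))        # 894.4
print(int(to_recruit))           # 1350
print(int(registered_needed))    # 900000
```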

Statistical analysis

The analysis of the main trial will take account of the 3 × 2 factorial design. The primary analysis will be by intention to treat. It will include all participants properly randomised, even if they did not subsequently receive the package allocated, or if they received elements of other packages. Proposed secondary analyses will focus on participants who received the essentials of the package to which they were allocated. All significance tests will be two-sided. The analysis will take into account the existence of three natural hierarchies – participants within practices, or therapists (for spinal manipulation if allocated), or community settings (for progressive exercise if allocated) within centres.

The primary outcome measure is the RDQ score at three and 12 months. We shall calculate the change from baseline for each participant and use this as the dependent variable. The baseline RDQ score is a potential covariate in this analysis. The secondary outcome measures are: the RDQ at one month; and, at one, three and 12 months, the SF-36; EuroQol; FABQ; BBQ; Modified Von Korff; Deyo's 'Bothersomeness' scores (anglicised to 'Troublesomeness'); and a specific health transition question (Table 1).

If interactions are not statistically significant, the analysis will present the main effects of treatments, namely NHS manipulation, private manipulation and exercise. As significant interactions are possible, however, we may need to estimate the effects of the six distinct combinations of treatments, for example, NHS manipulation and exercise. Thus we plan to build a model by first estimating main effects and then testing for interactions. This differs from a traditional factorial analysis, which either assumes that interactions are absent or tests for them immediately.

First, we shall estimate the main effect of exercise by comparing the exercise only group with the AM group. Second, we shall estimate the main effect of manipulation by comparing the manipulation only group with the AM group. Third, we shall estimate the differences between NHS and private manipulation within the manipulation only group. Then we shall investigate interactions between exercise and manipulation by examining the group allocated to this combination. Multi-level modelling will take account of variation between practices. It can also estimate the effects of centres, manipulators, and exercise physiotherapists, albeit with wide confidence intervals.
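The order of comparisons described above can be illustrated with simple unadjusted contrasts of mean change from baseline. The sketch below is not the multi-level model itself: it ignores clustering and covariates, pools NHS and private manipulation in the combined group for brevity, and runs on a handful of invented records whose only purpose is to make the code runnable. All group labels and numbers are hypothetical.

```python
from statistics import mean

# Hypothetical toy records: (allocated group, baseline RDQ, follow-up RDQ).
# These values are invented for illustration; they are not trial data.
records = [
    ("AM", 10, 8), ("AM", 8, 7),
    ("exercise", 10, 7), ("exercise", 9, 6),
    ("manip_nhs", 11, 7), ("manip_nhs", 9, 6),
    ("manip_private", 10, 6), ("manip_private", 8, 5),
    ("manip_plus_exercise", 10, 5), ("manip_plus_exercise", 9, 5),
]


def mean_change(*groups: str) -> float:
    """Mean change from baseline (follow-up minus baseline) in the given groups."""
    return mean(after - before for g, before, after in records if g in groups)


am = mean_change("AM")
# Step 1: main effect of exercise (exercise only versus AM).
exercise_effect = mean_change("exercise") - am
# Step 2: main effect of manipulation (manipulation only versus AM).
manipulation_effect = mean_change("manip_nhs", "manip_private") - am
# Step 3: NHS versus private premises within the manipulation only group.
nhs_vs_private = mean_change("manip_nhs") - mean_change("manip_private")
# Step 4: interaction, i.e. how far the combined group departs from additivity.
interaction = (mean_change("manip_plus_exercise") - am) - exercise_effect - manipulation_effect

print(exercise_effect, manipulation_effect, nhs_vs_private, interaction)
```

A negative contrast here means a larger fall in RDQ score than in the comparison group. An interaction contrast near zero would suggest the two treatments act roughly additively.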

Missing data will be treated in two ways. Missing items within individual outcome measures will be treated according to the instructions for that measure. Then we shall calculate response rates for each treatment group. If these differ we shall compare responses to three distinct mailings (namely first questionnaires, and first and second reminders) and adjust accordingly.

If some or all treatments are effective, secondary analysis will examine whether variations in the process of care can explain variations in outcome. For example, we shall test whether variation in the numbers of exercise and manipulation sessions, and in the manipulation techniques used, affect primary and secondary outcomes. Further secondary analyses will investigate potential prognostic variables collected at the first and second appointments with the practice research nurse.

Conclusion

The UK BEAM trial is a major trial of physical treatments for low back pain. Securing the agreement of members of the three physical therapy professions in the UK (chiropractic, osteopathy and physiotherapy) to work to an agreed treatment was an important achievement. Whatever the outcome of the trial, the results will inform the future management of low back pain both within the UK and internationally. Participant recruitment and follow up are now complete. We recruited 1334 participants from 168 practices. With agreement from the Trial Steering Committee we included data from participants recruited by the 13 practices within the 'active management' arm of the feasibility study, thus adding two centres and making 14 in all. We obtained 12 month follow up data on 995 (75%) of these participants. This provides ample data to test all our main hypotheses.

Contributors

Full details of all contributors to UK BEAM are available at: http://www.york.ac.uk/healthsciences/centres/trials/ukbeam/contrib.htm.