Background

Pain and osteoarthritis (OA) of the knee are accompanied by varying degrees of functional limitation and reduced quality of life[1]. In adults aged 45 years and over, the most common site of peripheral joint pain is the knee, and the highest prevalence of knee pain is amongst women aged 75 and over[2]. Established risk factors for knee OA include obesity, age, female gender, misalignment, and knee injury [3–5]. Osteoarthritis of the knee is a serious and chronic condition. In an increasingly ageing and obese population, the prevalence and incidence of OA of the knee are predicted to rise[6].

Patients may present with symptoms of pain and stiffness, joint instability, crepitus, and decreased function and mobility. Diagnosis is commonly made clinically in primary care, and may be confirmed by radiological tests. The mainstay of conventional treatment includes analgesia, physiotherapy and joint replacement if indicated. Whilst osteoarthritis has often been considered progressive and incurable,[7] the disease has also been viewed as a metabolically dynamic, essentially reparative process that is potentially amenable to treatment[8].

Three recent high quality meta-analyses of pooled data from randomised controlled trials of acupuncture treatment for osteoarthritis of the knee provide consistent evidence that acupuncture is more effective than sham acupuncture when patients are blinded to the intervention they received [9–11]. For example, in one review Manheimer et al. found that effect sizes ranged between 0.35 (95% CI: 0.15 to 0.55) for short term outcomes and 0.13 (95% CI: 0.01 to 0.24) for long term outcomes[10]. Only one of the three reviews pooled data comparing acupuncture to usual care alone, in which patients were unblinded. For this comparison, statistically significant differences were again found but with larger effect sizes of 0.62 (95% CI: 0.49 to 0.75) for short term outcomes (up to 3 months) and 0.52 (95% CI: 0.39 to 0.66) for longer term outcomes (at 6 months)[10]. It may seem counter-intuitive that effect sizes were larger when acupuncture was compared to usual care (an active comparator); however, it is consistent with findings across the field that sham acupuncture tends to be highly therapeutic[12]. Despite these emerging data, the UK National Institute for Health and Clinical Excellence (NICE) guidelines published subsequently state that, "There is not enough consistent evidence of clinical or cost-effectiveness to allow a firm recommendation for the use of acupuncture for the treatment of osteoarthritis."[1]

There are a number of limitations to the existing evidence. Firstly, none of the studies answered the pragmatic question about the effectiveness of acupuncture as an adjunctive treatment to usual primary care as delivered in the UK. A second problem is that most trials had short-term follow-ups, typically no more than 26 weeks and more commonly as little as a month. While significant longer-term effects of acupuncture when compared to usual care have been shown for headache over 12 months,[13] and for low back pain over 24 months,[14] there is inconclusive evidence of longer term benefits of acupuncture for osteoarthritis of the knee. Thirdly, the evidence on cost-effectiveness is limited, with trial data from Germany,[15] but no trial data relevant to the UK context[1]. The combination of these limitations makes it difficult to estimate the health service benefits and costs that need to be considered as part of an NHS decision process.

Given the challenges of treating osteoarthritis of the knee, the NICE guidelines have clearly identified the need for further studies to evaluate non-pharmacological therapies, with a focus on evaluating longer-term and sustained clinical effects as well as cost-effectiveness[1]. The NICE guidelines have also identified the need to address concurrently the quality of life issues and comorbid conditions that compound the effect of osteoarthritis[1]. It is in this context that we designed a pilot study to identify the key design features necessary for a full scale randomised controlled trial. Our key objectives were to establish the potential recruitment rate, appropriate validated outcome measures, attendance levels for acupuncture treatment, loss to follow up and the sample size for a full scale trial.

Methods

Design

We conducted a pragmatic parallel two-armed randomised controlled trial (RCT) comparing 'acupuncture and usual care' to 'usual care' alone. This was a Phase II study based on the methodology set out in the Medical Research Council's guidelines for the evaluation of complex interventions[16]. The pragmatic design builds on current evidence that there is a significant difference between acupuncture and sham acupuncture for this condition [9–11]. It addresses practical research questions raised by NICE regarding clinical and cost effectiveness in an NHS context[1]. In practical terms, this pragmatic trial design offers best value by providing clinical results that are immediately applicable to patients and providers. The cost-effectiveness data, based on a real-world comparison, will aid policy and decision-makers.

Participant Recruitment and Randomization

Participants were recruited from a York-based GP practice with a list size of 15,927 patients. A search of the practice database identified patients over 50 years old who had consulted their GP in the last 3 years with knee pain, using the READ codes of 'knee pain', 'knee joint pain', 'osteoarthritis of the knee', 'anterior knee pain', 'other knee injury', 'painful right knee' and 'arthralgia'. These classifications were used to capture the wider population of patients with clinical symptoms of OA of the knee but without a radiographically confirmed diagnosis[17]. These patients were sent an information leaflet, consent form and a screening questionnaire. Patients were recruited if they reported experiencing ongoing pain and stiffness in their knee and met none of the following exclusion criteria: being under cancer care review, currently receiving acupuncture, having had a knee or hip replacement, being involved in any insurance claim or litigation related to their knee pain, or suffering from rheumatoid arthritis or haemophilia. Recruited patients were randomised by the York Trials Unit using a computer-generated (STATA) random allocation method. As participants were enrolled, their pre-randomisation number was emailed to the Unit, and their post-randomisation number and group allocation were emailed back by return. This ensured that the person recruiting participants was different from the person implementing the allocation sequence, in accordance with CONSORT guidelines[18]. The study gained ethical approval from the York Local Research Ethics Committee on 6th June 2006 (ref: 06/Q1108/30).

Outcome measures

Our primary outcomes were the patient recruitment rate, appropriate validated outcome measures, attendance levels for acupuncture treatment, loss to follow up and the sample size for a full-scale trial. Our clinical outcomes included the Western Ontario and McMaster's University Osteoarthritis Index (WOMAC)[19], Oxford Knee Score (OKS)[20], SF-36 version 2[21] and EQ-5D[22]. The WOMAC is used to measure dimensions of pain, stiffness and function in knee and hip osteoarthritis; the higher the score, the more severe the impairment. The OKS is designed to measure function and pain and yields a total score; a lower score indicates a better state of health. The SF-36 and EQ-5D were used to measure health status and quality of life, with higher scores indicating a better level of functioning. All data were collected by post at baseline, and at 1, 3 and 12 months post randomisation, with reminders sent out as necessary for the first two of these follow-ups.

Intervention

Patients allocated to the acupuncture treatment group were referred to one of 5 acupuncturists who were members of the British Acupuncture Council (BAcC), with a minimum training of three years full-time and at least three years post qualification experience. Acupuncture practitioners were advised they could see these patients for up to 10 treatments as necessary. The participating acupuncturists agreed to follow an adapted protocol developed for the treatment of depression[23], with a focus on the use of clinical judgment drawing on a specified range of theoretical frameworks. This flexible approach meant that the number of needles inserted, depth of needle insertion, needle responses elicited, needle stimulation used, needle retention time and needle type varied. Treatments were usually weekly and key treatment details were recorded in a treatment log, including theoretical frameworks, acu-points used, and any additional components of treatment provided. Adverse events reported by patients to practitioners were recorded after each treatment.

Both groups received 'usual care', which included any appointments, medications (prescribed or over the counter) and interventions sought by participants from any health practitioner. Data on all usual care treatments received by both groups were collected using follow-up postal questionnaires at 3 and 12 months.

Statistical analysis

Using SPSS version 15, data were analysed on an intention to treat basis. A regression model, with baseline outcome measure as the covariate, was used to assess changes over time between the two treatment groups in pain, function and other health related outcomes. The standard deviation of the outcome variable was used to calculate the sample size needed for a full scale trial using PS: Power and Sample Size Calculation (http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize).
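The baseline-adjusted comparison described above can be illustrated with a minimal sketch. It is not the SPSS analysis used in the study: it simply fits follow-up score = b0 + b1 × baseline + b2 × group by ordinary least squares and reports b2 as the adjusted between-group difference. The function names and the data are hypothetical, and the data are synthetic and noise-free so that the expected answer is known by construction.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_group_difference(baseline, follow_up, group):
    """Least-squares fit of follow_up on intercept, baseline and group.

    Returns the group coefficient, i.e. the difference between groups
    after adjusting for the baseline score.
    """
    rows = [[1.0, float(x), float(g)] for x, g in zip(baseline, group)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, follow_up)) for i in range(3)]
    return solve_linear_system(xtx, xty)[2]


# Synthetic illustration only (NOT trial data): group 1 = acupuncture,
# group 0 = usual care, with a built-in adjusted difference of -2.0.
baseline = [8, 10, 12, 9, 11, 7, 10, 13]
group = [1, 1, 1, 1, 0, 0, 0, 0]
follow_up = [1 + 0.5 * x - 2.0 * g for x, g in zip(baseline, group)]

estimate = adjusted_group_difference(baseline, follow_up, group)
print(f"Adjusted difference (acupuncture - usual care): {estimate:.2f}")
```

With real data the fit would also yield a residual mean square, which is the quantity carried forward into a sample size calculation.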

Results

Recruitment rate and baseline data

From the database of one GP practice, 335 patients were identified as potentially eligible to enter the trial. On mailing these patients, 78 returned screening questionnaires and consent forms, and were found to be eligible (see Figure 1). The 40 people who responded most promptly were sent the baseline questionnaire, on the basis that they were most likely to complete the trial and return questionnaires. Of the 40 patients who were sent the questionnaire, 36 returned their forms. The first 30 patients to reply were selected and randomised into two groups: 'acupuncture and usual care' and 'usual care'. The baseline data collected prior to randomisation are presented in Table 1.

Table 1 Baseline Characteristics of Patients
Figure 1 Flow chart displaying details of eligible patients.

Attendance levels and intervention data

Of the fifteen participants randomised to acupuncture, fourteen attended a total of 136 appointments (91% of the 150 maximum). Treatments were provided by four acupuncturists, usually on a weekly basis. Twelve patients attended all 10 available sessions. One participant discontinued treatment after 7 sessions as she felt her knee was very sensitive and did not feel there was any improvement. A second participant discontinued treatment after 9 sessions because she felt that the acupuncture treatment had been sufficiently successful. The acupuncturists used theoretical frameworks that included Channel Theory (14/14), Qi & Blood and Body Fluids (12/14), Zang-fu Syndromes (8/14) and Pathogenic Factors (7/14). The number of acu-points used in each treatment session ranged from 4 to 24, with a mean of 12. The most frequently used points were SP 6, 9, and 10; ST 36; LIV 3 and 8; LI 4; GB 34 and 41; KID 6; SJ 5; and the extra point Xiyan. Stainless steel needles were used with a diameter varying from 0.2 mm to 0.28 mm, length from 25 mm to 50 mm, and depth of insertion from 3 mm to 30 mm. De qi was usually elicited, and a variety of stimulation methods were used, including tonification and reduction. Retention time for the needles varied from 10 to 30 minutes. Auxiliary treatments of moxibustion (3/14) and acupressure massage (3/14) were provided. Lifestyle advice was offered to 11 out of 14 patients, most commonly in relation to relaxation (8/14), diet (6/14) and exercise (6/14).

Data on usual care are presented in Table 2, which shows that GP and physiotherapy consultations were the most common type of additional healthcare, and medication and walking aid use remained at an equivalent level between groups and over time.

Table 2 Data on usual care at 3 and 12 months

Outcomes measured at 3 and 12 months

The outcomes for the WOMAC, OKS, SF-36 and EQ-5D at 3 and 12 months, with mean scores and SD, are shown in Table 3. The results (including 1 month data) are displayed in Figures 2 and 3 for the mean WOMAC pain and global scores. Adjusted differences in outcome using a regression model are shown in Table 4. Although the trial was not powered to detect significant changes in outcome, the WOMAC pain index showed a significant reduction [-2.62 (95% CI: -4.47 to -0.77)] at 3 months in the acupuncture group compared to usual care. This was not sustained at 12 months. No significant differences were observed on the OKS scale or the domains of the SF-36.

Table 3 Outcomes at three and twelve months
Table 4 Adjusted differences in WOMAC scores comparing acupuncture to usual care at 3 and 12 months, using a regression model with baseline WOMAC score as a covariate
Figure 2 Graph displaying mean WOMAC pain scores (range 0 to 20).
Figure 3 Graph displaying mean WOMAC global scores (range 0 to 96).

Loss to follow up

The loss to follow up in the 'acupuncture plus usual care' group was 0% at 3 months (n = 0) and 13.3% at 12 months (n = 2). It was higher in the 'usual care' group, at 6.7% at 3 months (n = 1) and 46.7% at 12 months (n = 7). No reminders were sent out at the twelve-month follow-up.

Adverse Events

No major adverse events were reported. Seven minor adverse events were reported (Table 5). The seven events out of a total of 136 treatments equate to adverse events occurring in 5.1% of treatments. No patients discontinued treatment as a consequence of adverse events.

Table 5 Minor adverse events reported by patients in the acupuncture group

Trial sample size and primary care list size for a fully powered trial

The WOMAC measure was chosen as the primary outcome for the sample size calculation for the full scale trial because it showed more sensitivity to change in this patient group than the OKS. We know from a previous study that an effect size of 0.39 on the WOMAC pain scale or 0.37 on the WOMAC function scale is needed to detect a minimally important change[24]. We have selected the WOMAC pain sub-scale as our primary measure because of potential discordance in pain and function and because the overlap on the pain and function sub-scales may play a causal role in limiting the ability of the WOMAC function subscale to detect change[25]. The mean square of the residual variance of the WOMAC pain score was found from our regression model to be 5.8, which can be converted to a standard deviation of 2.4. Given a minimum clinically important effect size of 0.39, the sample size required to detect this difference at 5% significance and 90% power is 139 patients in each arm of a two-arm trial. To allow for loss to follow up of 20%, a full-scale trial will require 350 patients in total.
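The arithmetic behind these figures can be checked with a short sketch. It uses the standard normal-approximation formula for comparing two means, n per arm = 2(z_{1-α/2} + z_{power})² / d², and treats 0.39 as a standardised effect size d; both are assumptions made for illustration. The PS software may use a slightly different (for example t-based) method, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def per_arm_sample_size(effect_size, alpha=0.05, power=0.90):
    """Patients per arm for a two-sided, two-group comparison of means.

    Normal-approximation formula; effect_size is the standardised
    difference (difference in means divided by the standard deviation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Residual mean square from the regression model, converted to an SD.
residual_sd = math.sqrt(5.8)

n_per_arm = per_arm_sample_size(0.39)
total_completers = 2 * n_per_arm
# Inflate so that, after 20% loss to follow up, enough patients remain.
total_recruited = math.ceil(total_completers / (1 - 0.20))

print(f"Residual SD: {residual_sd:.1f}")
print(f"Per arm: {n_per_arm}, both arms: {total_completers}")
print(f"Allowing for 20% attrition: {total_recruited}")
```

Under these assumptions the calculation gives 139 per arm and 348 after inflation for attrition, which is in line with the rounded target of 350.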

Our evidence suggests that we should introduce a minimum WOMAC score of ≥ 4 and at least 2 activities with at least moderate pain[26]. This will affect our predicted recruitment rates, as only 63% of patients in this pilot (49 out of 78) met this cut-off score. Our GP practice had a registered list size of 15,927; we therefore estimate that to recruit 350 patients we will need a total primary care list size of approximately 115,000 patients.
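One way to reconstruct the list size estimate is to scale the pilot yield up proportionally. The sketch below assumes that the 49 patients meeting the cut-off represent the yield from the practice list of 15,927 reported in the Methods, and that yield per registered patient would be the same in other practices. These are illustrative assumptions, and the function name is hypothetical.

```python
import math


def required_list_size(target_sample, pilot_yield, pilot_list_size):
    """Scale the pilot practice's yield to the list size needed overall.

    Assumes recruitment per registered patient is constant across practices.
    """
    yield_per_registered_patient = pilot_yield / pilot_list_size
    return math.ceil(target_sample / yield_per_registered_patient)


estimated_list_size = required_list_size(
    target_sample=350,       # full-scale trial target
    pilot_yield=49,          # eligible respondents meeting the WOMAC cut-off
    pilot_list_size=15927,   # registered list of the pilot practice
)
print(f"Estimated primary care list size: {estimated_list_size:,}")
```

This gives a figure just under 114,000, consistent with the approximate estimate of 115,000.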

Discussion

Key Findings

This pilot study has met its key objectives by identifying information to assist the design of a full scale acupuncture trial for osteoarthritis of the knee. With regard to recruitment rate, we found that 23% of patients identified on a York-based GP database as potentially eligible agreed to be recruited to the trial. With regard to outcome measures, we found the WOMAC scale to be more sensitive to change in this patient group than the OKS scale. Attendance at acupuncture sessions was high, with patients taking up on average 9 out of the 10 sessions available. Loss to follow-up at 12 months was high (46.7% in the usual care group), which is an area that will need better management in a fully powered trial. We calculated that a sample size of 350 would be required to detect clinically relevant differences in this population.

Although the trial was not powered to detect significant changes in outcome, the WOMAC pain index showed a significant difference in favour of the acupuncture group at 3 months but this was not sustained at 12 months. Though no significant differences were observed in WOMAC function, stiffness or global scores, nor in the OKS, the trend was towards favouring acupuncture. The size of any difference may have been more accurately assessed if we had introduced a minimum baseline score for knee pain when screening for eligibility. A single positive result within this pilot must be interpreted with caution, given the number of statistical tests conducted, though it does reinforce the potential value of conducting a full-scale study.

Comparisons with other studies

Our evidence is consistent with other studies that compared acupuncture to usual care for OA of the knee, in which short term benefits at the end of treatment appeared to 'wear off' over time, including for example data from a meta-analysis[10] and from a large scale trial with a similar design from Germany[27]. This pattern for OA of the knee is not consistent with recent primary care trials in the UK of acupuncture for headache over 12 months [13] and for low back pain over 24 months [14], which both showed increasing statistical and clinical differences between groups over the longer term. The reason for this may be that osteoarthritis of the knee is a progressive disease, whereas headache and back pain could be more likely to fully remit between episodes. However, there remains a need for more primary data on the trajectory of changes in outcome over the longer term, a point emphasised by the recent NICE guidance and its associated recommendations[1].

Limitations and strengths of study

The attrition rate was higher at 12 months than would be acceptable in a full scale trial, the reason being that no follow-up requests were made to those who did not respond to the initial request. For this pilot therefore, we have less confidence in the results at 12 months. For a full scale trial, it is recommended that a more systematic approach to collecting patient outcome data be put in place, including follow-up reminders and possibly financial incentives in the form of a voucher or payment to patients. We also note that by selecting the first patients to respond to our invitations, we recruited the more eager volunteers. This group may have higher adherence and follow-up rates than may be present in the primary care population.

Patient preference was not taken into account, and because it is likely that patients entered the trial with a preference, those allocated to usual care may have been more negative about their experience than those allocated to acupuncture. This may have biased the outcome in favour of the acupuncture group. However, as the group sizes were very small, the interpretation of these data is limited.

The pragmatic trial design is by necessity an open one. As a result, this trial does not answer the question of the extent to which the overall outcome was due to the 'specific components' of the intervention, or to the 'non-specific components', such as expectations of acupuncture. For this question, an explanatory trial design would be appropriate. However, the cost would be the reduced applicability of the results to the general population of patients within primary care. A secondary gain of a pragmatic design for a full scale trial is that it can address the area of cost-effectiveness and provide relevant data for decision-making, whether by patients, service providers or policy makers. This pilot was based in one GP practice and one acupuncture clinic, and therefore the results should be interpreted with caution.

Future Research

There is a clear need for a full scale randomised controlled trial to confirm and extend the applicability of the tentative findings from this pilot at three months. There have been calls to establish whether any putative benefits at three months are sustained over the longer term. Monitoring patients in a trial over twelve months would help answer this question, as well as provide more robust data on cost-effectiveness, which would help reduce the dearth of UK related data as reported by NICE[1]. To this end we recommend that resource use data associated with usual care be collected every three months, to reduce vulnerability to under-reporting. It would be worth considering other sources of data beyond self-reports, such as GP records for medication data. As part of "usual care", it would also be useful to capture the extent to which "core" treatments, as proposed by NICE, such as advice and education, weight loss and exercise, are taken up. Our tentative findings seem to show that the effects of acupuncture might not carry over beyond the end of treatment. If so, then it might be worth considering a trial design that includes the option of further top-up sessions after three months for patients who have benefited. A further dimension to this area of research is the need to explore any impact of acupuncture on quality of life. A new scale to measure quality of life in this population has recently been developed[20], and we recommend its use alongside the WOMAC index in a full-scale trial. Another area that we suggest be monitored in a trial, identified by the NICE guidance as highly relevant to this patient group, is the impact of co-morbidities, such as anxiety and depression, that may compound the experience of osteoarthritis[1]. There is a need to explore more fully the acceptability of acupuncture in this over 50s age group, for which in-depth interviewing is probably the method of choice. Qualitative methods would also be useful in exploring the trajectory of recovery for patients with this condition.

Conclusion

This study has shown that it is feasible to recruit patients to a primary care trial to receive acupuncture for osteoarthritis of the knee, and that the tentative findings support conducting a full-scale trial. The pilot data have led to an estimate of the sample required for a full scale trial as well as the expected recruitment rates. Recommendations have been made to assist in the design, with the emphasis on the need for data on longer term follow up, cost-effectiveness, quality of life, as well as the collection of qualitative data on acceptability and the trajectory of recovery in this patient group.