Introduction

Knee osteoarthritis (KOA) is a leading and increasingly prevalent cause of disability [1]. The development of this heterogeneous disorder involves inflammatory, mechanical, and metabolic factors, and altered pain processing at the joint and brain levels [2]. Physical treatments (including therapeutic exercise, education, and hyaluronic acid or corticosteroid joint injections) have a substantial effect on improving pain, functional status, and inflammatory markers [1, 3,4,5]. However, general practitioners commonly prescribe medications and refer patients to orthopedic surgeons, even for new osteoarthritis-related problems [3]. Exercise is recommended as a cornerstone treatment of KOA [1], yet there is a gap in the implementation of this recommendation [3, 6].

The reasons for the lack of uptake are complex but include misconceptions that exercise causes joint harm [7]. However, the available evidence to refute this belief has important limitations. Magnetic resonance imaging (MRI) studies allow greater insight into structural adaptations that may occur [2]. A recent systematic review of MRI randomized controlled trials (RCTs) investigating the impact of knee joint loading exercise on articular cartilage concluded that exercise seems to not be harmful for articular cartilage [8]. A separate systematic review concluded that exercise therapy did not change cartilage morphology or synovitis/effusion but may slightly increase the likelihood for bone marrow lesion (BML) worsening [9]. The limitations of studies to date include small sample sizes, variable adherence to exercise (4–70%), inadequate reporting of participant withdrawal, and the use of exercises that may place too little (e.g., aquatic) or excessive (e.g., high-impact) load on a knee to induce beneficial joint structure changes [8, 9].

Outdoor walking is cost efficient, practical, and easily reproduced. Previous research has shown that walking can be used to reduce KOA-related pain and functional restrictions, but evidence for its potential effects on joint structure is contradictory [8,9,10,11,12]. Some studies show a detrimental effect [10,11,12], while others show either no effect or a beneficial effect [8, 9]. The structural effects of walking have not yet been investigated by an adequately powered MRI RCT. This pilot RCT aimed to determine the feasibility of an RCT examining outdoor community walking on KOA symptoms and MRI knee structural change to inform the design of future large-scale studies.

Materials and methods

WALK was a pilot single-center two-arm RCT conducted in Tasmania, Australia. The trial was registered on the Australian New Zealand Clinical Trials Registry prior to recruitment (12618001097235) and is reported according to the CONSORT 2010 extension to randomized pilot and feasibility trials [13]. Ethics approval was obtained from the Tasmanian Health and Medical Human Research Ethics Committee (H0017108), and all participants provided written informed consent.

Recruitment and screening

Recruitment and screening took place from October 2018 to June 2019. Participants were recruited from the community through local and social media advertisements. All potential participants were pre-screened via telephone followed by face-to-face screening, during which a detailed explanation of the project was given and written informed consent was obtained. Participants were eligible if they were aged 45 years or over, met the American College of Rheumatology (ACR) criteria for KOA [14] by clinical diagnosis, had symptomatic KOA for at least 24 weeks, had a pain visual analogue scale (VAS) score of at least 40 mm/100 mm over the last 7 days, and had a BML present on MRI. Additional eligibility criteria included no difficulty in walking a city block (75–100 m) and willingness to participate in a walking program for 24 weeks with the ability to attend the scheduled walking classes. Individuals were excluded if they had another form of arthritis, had significant trauma to the study knee in the previous 12 months, had received intra-articular therapy during the previous 24 weeks, were experiencing severe knee pain (> 80 mm VAS) while standing, were using a gait aid, or planned to commence new forms of exercise or undergo knee or hip surgery in the next 24 weeks. Analgesic medications were recorded, although no restrictions were made. Individuals with any conditions that precluded safe participation in exercise (e.g., a heart condition), as assessed by the adult pre-exercise screening tool (stage 1) [15], were required to receive medical clearance from their general practitioner before enrolling. Participants had to have at least one eligible knee, determined by verbal screening, clinical examination, and an MRI scan. When a participant had two eligible knees, the knee with the worst pain was selected as the study knee.

Randomization

Eligible participants were randomized with a 1:1 ratio to either outdoor community walking plus usual care (walking group) or a usual care control group for 24 weeks. Allocation of participants to either the walking group or usual care was based on simple randomization using computer-generated random numbers via a central randomization website hosted by the University of Tasmania. This was conducted by a blinded staff member with no direct involvement in the study. The assessors and MRI readers were blinded to treatment allocation, but participants were not.
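Because simple randomization allocates each participant independently, the two arms are not guaranteed to be of equal size. As a rough illustration only (not part of the trial procedures), the sketch below computes the probability that the arms differ by at least a given number of participants under 1:1 simple randomization. The function name and the example values are hypothetical.

```python
from math import comb


def prob_imbalance(n: int, min_diff: int) -> float:
    """Probability that two arms differ in size by at least `min_diff`
    when n participants are each allocated 1:1 independently."""
    favourable = 0
    for k in range(n + 1):  # k = number allocated to the first arm
        if abs(k - (n - k)) >= min_diff:
            favourable += comb(n, k)
    return favourable / 2 ** n


if __name__ == "__main__":
    # Illustrative values: 40 participants, arms differing by 8 or more
    # (for example a 24 versus 16 split).
    print(round(prob_imbalance(40, 8), 3))
```

The calculation simply sums binomial probabilities over all splits that are at least as unequal as the chosen threshold.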

Outdoor community walking plus usual care group (walking group)

The participants who were randomized to the walking group were asked to train 3 days per week for 24 weeks. Each week consisted of two sessions in a supervised group and one unsupervised session at a location of their choice. The walking program was based on a protocol published by Ettinger et al. [16]. Each session (supervised or unsupervised) lasted 1 h and consisted of a 10-min warm-up (Appendix 1), a 40-min walk aimed at 50–70% of heart rate reserve (gauged using the validated Borg rating of perceived exertion (RPE) scale [17]), and a 10-min cool-down consisting of slow walking and 3 flexibility exercises (Appendix 1). Each supervised class was led by one trainer who was either a physiotherapist or exercise physiologist. The participants reported adherence to their unsupervised sessions and provided feedback via an online questionnaire which was sent by email each week. To stimulate adherence, participants had flexibility to choose day/time/location/trainer, were allowed to bring family members on walks, and received a Fitbit on study completion. In addition, trainers would facilitate social support, positive reinforcement, goal setting, rewards for attendance, frequent contact, and recognition in study update newsletters.
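For readers unfamiliar with heart rate reserve, the 50–70% target can be translated into beats per minute with the standard heart rate reserve (Karvonen) formula. The sketch below is illustrative only: intensity in this trial was guided by the RPE scale rather than by measured heart rate, and the resting and maximal heart rates used here are hypothetical.

```python
def target_heart_rate_range(resting_hr: float, max_hr: float,
                            low: float = 0.50, high: float = 0.70):
    """Return the (lower, upper) target heart rate in beats per minute
    for a given fraction of heart rate reserve.

    Heart rate reserve = maximal heart rate - resting heart rate.
    Target heart rate = resting heart rate + fraction * reserve.
    """
    reserve = max_hr - resting_hr
    if reserve <= 0:
        raise ValueError("max_hr must exceed resting_hr")
    return resting_hr + low * reserve, resting_hr + high * reserve


if __name__ == "__main__":
    # Hypothetical participant: resting 70 bpm, maximal 160 bpm.
    lower, upper = target_heart_rate_range(70, 160)
    print(round(lower), round(upper))
```

For the hypothetical participant above, the target zone is roughly 115 to 133 beats per minute.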

All study staff (trainers and research assistants) participated in a half-day workshop for protocol training, which included instructions about administering the program, monitoring adherence, and education about behavioral change [18]. Additionally, the trainers attended an on-site introduction to the three outdoor group walk locations, where options for walking back, opting out early, and adding loops were shown. By using these options, the intensity and/or distance of the walk could be tailored to each participant. Adequate exercise intensity was guided by trainers who used the RPE. The walking group also received the same care that was provided to the usual care group, as outlined below.

Usual care group

Participants in the usual care group received generic information about KOA and community services and resources (Arthritis & Osteoporosis Tasmania resources, Arthritis Australia flyer, and information about the website “MyJointPain” and a local physiotherapy program for KOA patients). Participants in the usual care group were discouraged from initiating any new exercise program for the 24-week study duration. Uptake of new activity was assessed by a questionnaire and objectively with accelerometers at screening, 12 and 24 weeks. Participants in the usual care group were given a Fitbit upon study completion to encourage retention.

Safety

All adverse events (AEs) were recorded. A research assistant determined, in communication with the participant, whether an adverse event was probably not related, possibly related, or probably related to the study. Furthermore, the NRS-11 [19] was used to measure pain before and after each group walking session, as a way to monitor acute pain exacerbations. Pain was accepted during walking but monitored. An increase in pain from pre-walk levels of 0 to 2 was considered safe, from 3 to 5 acceptable, and increases above 5 were flagged as high risk [20].
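The pain-monitoring rule above can be expressed as a small classifier. This is a minimal sketch, assuming integer NRS-11 scores (0 to 10) and assuming that a decrease in pain is treated the same as no change; the function name is hypothetical.

```python
def classify_pain_change(pre: int, post: int) -> str:
    """Classify the change in NRS-11 pain from before to after a walk.

    Increase of 0-2 points: "safe"; 3-5 points: "acceptable";
    more than 5 points: "high risk".
    Assumption: a decrease in pain is also classified as "safe".
    """
    for score in (pre, post):
        if not 0 <= score <= 10:
            raise ValueError("NRS-11 scores must be between 0 and 10")
    increase = post - pre
    if increase <= 2:
        return "safe"
    if increase <= 5:
        return "acceptable"
    return "high risk"


if __name__ == "__main__":
    # Hypothetical pre-walk and post-walk scores.
    for pre, post in [(3, 5), (2, 6), (1, 8)]:
        print(pre, post, classify_pain_change(pre, post))
```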

Retention

If participants withdrew from the study before 24 weeks, the reason and date were recorded. Missed walks from these participants were considered in the adherence counts until the official withdrawal date.

Primary outcome

The primary outcome was the feasibility of the study protocol, assessed by design (any required changes to the protocol during the pilot), recruitment and screening (duration and number of people screened to enroll 48 participants; this target was chosen to enable the estimation of effect sizes that are small to large [21]), randomization (balance of characteristics in each group), adherence (number and percentage of supervised sessions, unsupervised sessions and total sessions that were completed), safety (number and description of AEs by group), and retention (number of participants that withdrew by group).

Exploratory outcomes

Exploratory outcome measurements were included to ensure that feasibility was assessed with participants undertaking a study protocol as similar to a large-scale trial as possible. Symptoms were measured using visual analogue scale (VAS) knee pain (0–100 mm), Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) knee pain (0–500 mm), WOMAC knee function (0–1700 mm), WOMAC knee stiffness (0–200 mm) (lower is better), and Osteoarthritis Research Society International-Outcome Measures in Rheumatology Clinical Trials (OARSI-OMERACT) [22] response to treatment. Isometric leg strength (predominantly quadriceps and hip extensors) was assessed simultaneously for both legs in kilograms (kg) by dynamometry (TTM Muscular Meter Tokyo, Japan). Physical performance was measured as indicated by OARSI [23], including the 30-s chair stand test, 40-m fast-paced walk test (m/s) and 6-min walk test (higher scores are better), and the timed up and go test and stair climb test (lower scores are better). Physical activity was assessed by waist-worn ActiGraph® wGT3X-BT (Firmware 1.9.2) activity monitors (ActiGraph LLC, Fort Walton Beach, FL, USA) if participants had at least 4 days of 10 h wear time, using settings as recommended by Migueles et al. [24]. Health-related quality of life and utility were assessed using the assessment of quality of life (AQoL-8D) [25] (0–1) and the EuroQol 5-dimension 5-level (EQ-5D-5L) [26] (0–1) questionnaires (1 = full health). Depression was assessed using the patient health questionnaire (PHQ-9) [27] (0–27) (lower is better).
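The accelerometer wear-time criterion can be checked with a few lines of code. This sketch assumes that "4 days of 10 h wear time" means at least 4 days with 10 h or more of wear each, and that daily wear time has already been derived from the device data; the function name and example values are hypothetical.

```python
def has_valid_wear(daily_wear_hours, min_hours: float = 10.0,
                   min_days: int = 4) -> bool:
    """Return True if enough days meet the minimum wear time.

    daily_wear_hours: iterable with one wear-time value (hours) per day.
    Assumption: a day counts as valid when wear time is >= min_hours.
    """
    valid_days = sum(1 for hours in daily_wear_hours if hours >= min_hours)
    return valid_days >= min_days


if __name__ == "__main__":
    # Hypothetical week of wear-time data (hours per day).
    week = [12, 11, 9.5, 10, 13, 4, 0]
    print(has_valid_wear(week))
```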

MRI knee pathology

An MRI scan of the “study” knee was acquired with a 1.5 T whole-body magnetic resonance unit (GE Optima 450 W, Milwaukee, USA) using a dedicated 8-channel knee coil at weeks 0 and 24. Image sequences included (1) a T1-weighted fat-saturated 3D gradient-recalled acquisition and (2) proton density fat-saturated 2D fast spin echo sequence. The parameters are described in Appendix 2.

BMLs were assessed on the proton density-weighted sequences and defined as areas of increased signal adjacent to the subcortical bone at the medial tibial (anterior and posterior), medial femoral (anterior and posterior), lateral tibial (anterior and posterior), lateral femoral (anterior and posterior), superior patella, and inferior patella sites, by measuring the maximum area of the lesion (mm²) as previously described [28]. Intra-observer repeatability was assessed in 20 participants with at least a 1-week interval between the two readings, with an intra-class correlation coefficient (ICC) ranging from 0.86 to 0.98.

Effusion-synovitis was defined as the presence of intra-articular fluid-equivalent signal on the proton density-weighted images. The volume of effusion-synovitis [29] was measured using semi-automated segmentation, and the final 3D volume rendering was generated using a free open-source imaging software (3D Slicer, version 4.10, National Alliance of Medical Image Computing, NA-MIC). The ICC for intra- and inter-observer repeatability for this method has been previously reported as 0.99 and 0.84, respectively [29].

Cartilage defects were assessed using a modified Outerbridge system [28] at the medial tibial, medial femoral, lateral tibial, lateral femoral, and patella sites. Grading ranged from grades 0 (normal cartilage) to 4 (full-thickness chondral wear) as previously described [28]. ICCs for intra- and inter-observer repeatability were 0.94 and 0.93, respectively, for the total score of cartilage defects, as previously described [28].

Meniscal extrusions were scored separately at the anterior, middle, and posterior horns (medially/laterally). The intra- and inter-reader ICCs ranged from 0.85 to 0.92 for meniscal extrusions, as previously described [30].

Additional measures

The following were also assessed: (1) demographics: sex and date of birth; (2) anthropometrics: height and weight to calculate BMI and waist and hip circumference; (3) standing anteroposterior semi-flexed X-ray of the study knee: from this image, static knee alignment was measured from the anatomic axis based upon the methods of Moreland et al. [31], and radiographic KOA severity was measured by consensus of two readers (GJ, SD) utilizing the OARSI atlas to grade osteophytes and joint space narrowing (JSN) [32]. Intra-observer reproducibility was assessed for 20 participants; the ICC (2-way mixed-effects model) of the measurements was 0.95 for knee alignment, 0.84 for osteophytes and 0.93 for JSN; (4) participant satisfaction with the walking intervention: this was self-reported by participants on a scale of 1 (extremely unsatisfied) to 7 (extremely satisfied) at week 24.

Statistical analyses

Analyses were performed using Stata (version 16; StataCorp, College Station, TX, USA) on an intention-to-treat basis, using all available data from all randomized participants according to their randomized group allocation. Descriptive statistics were used to summarize the program adherence, the adverse events, and the randomization of characteristics in groups at baseline. Changes in exploratory outcomes were analyzed to provide an indication of the direction and magnitude of changes in symptoms and physical performance/activity and whether there may be some indication of MRI structural change. Continuous outcomes at baseline and 24 weeks are shown as means and standard errors derived from unadjusted linear mixed-effects models. The change between these time points by group is presented as means and 95% CI. To investigate whether there was a difference in slope between the two groups, the difference of differences was analyzed by linear mixed-effects models for treatment, time point, and a treatment/time point interaction, adjusted for sex and age as covariates. The models were adjusted for the baseline value of the corresponding outcome, which is considered best practice in RCT analysis [33]. For the analyses of categorical variables, log-binomial regression was used.
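One way to write a difference-of-differences mixed model of this kind is sketched below. It is an assumption about the general form, not the authors' exact specification: the symbols are introduced here for illustration, a participant-level random intercept is assumed, and the baseline-value adjustment is left out because the text does not state how it entered the model.

```latex
% y_{it}: outcome for participant i at time point t (0 = baseline, 1 = week 24)
% u_i: participant-level random intercept; \varepsilon_{it}: residual error
y_{it} = \beta_0
       + \beta_1\,\mathrm{group}_i
       + \beta_2\,\mathrm{time}_t
       + \beta_3\,(\mathrm{group}_i \times \mathrm{time}_t)
       + \beta_4\,\mathrm{age}_i
       + \beta_5\,\mathrm{sex}_i
       + u_i + \varepsilon_{it},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)
```

In this form, the interaction coefficient \beta_3 is the difference between groups in the change from baseline to week 24, which is the difference of differences.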

Results

Primary results

Over 9 months, 49 potential participants were screened for eligibility, of whom 40 (81.6%) were enrolled. They had a mean age of 66 years (SD 1.36), mean BMI of 32.9 kg/m² (SD 5.3), and 60% were female. Twenty-four participants were randomized to walking and 16 to the usual care group (Fig. 1). The characteristics of participants by group allocation are presented in Table 1.

Fig. 1 Flow diagram of participant recruitment and completion

Table 1 Baseline characteristics of participants

The target sample size of 48 participants could not be reached due to budgetary constraints, and the process of simple randomization resulted in a difference in the number of participants randomized to each group. During the study, some changes to the protocol were required. Following the prolonged absence of a participant, availability was added as an exclusion criterion, excluding those who were unable to attend two class times per week or those who had planned absences (e.g., trips away) of > 2 weeks during the 24-week timeframe. Furthermore, at the walking sessions, participants would walk at different paces, making a group size of 10 challenging to supervise. Therefore, trainers requested that group sizes be reduced to 6–8. Over the 24-week study period (Table 2), adherence was 70.0% to the total program of 72 scheduled walk sessions, 60.8% to the 48 scheduled group sessions, and 88.4% to the 24 unsupervised sessions. The total program adherence to 36 sessions over 12 weeks was better during weeks 0–12 than weeks 12–24 (85.1% for the first 12 weeks and 54.9% for the second 12 weeks). Meanwhile, a similar trend was observed in the attendance of weekly sessions by retained participants (90.0% for the first 12 weeks and 73.3% for the second 12 weeks). The retention rate at week 24 was 70.8% in the walking group and 100% in the usual care group.
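The total adherence figure is the session-weighted average of the supervised and unsupervised rates. The short check below reproduces it from the reported percentages and session counts; the function name is hypothetical.

```python
def total_adherence(rate_supervised: float, n_supervised: int,
                    rate_unsupervised: float, n_unsupervised: int) -> float:
    """Session-weighted overall adherence from two component rates."""
    completed = (rate_supervised * n_supervised
                 + rate_unsupervised * n_unsupervised)
    return completed / (n_supervised + n_unsupervised)


if __name__ == "__main__":
    # Reported values: 60.8% of 48 group sessions, 88.4% of 24 unsupervised.
    overall = total_adherence(0.608, 48, 0.884, 24)
    print(f"{overall:.1%}")
```

The same overall figure follows from averaging the two 12-week periods, each with 36 scheduled sessions: (85.1% + 54.9%) / 2 = 70.0%.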

Table 2 Adherence measures over the first 12 weeks (0–12) and the second 12 weeks (12–24) for those participants in the walking group (n = 24)

Safety was analyzed using adverse event rates which are shown in Table 3. At least 1 adverse event was reported by 12 (50%) participants in the walking group and 6 (38%) in the usual care group. Within the walking group, most adverse events were musculoskeletal (11 vs 6), while in the usual care group, the majority were non-musculoskeletal (4 vs 2). Adverse events that were deemed related to the walking intervention (n = 6) were classified as mild and included foot pain (n = 2), chest tightness (n = 1), dizziness (n = 1), shortness of breath (n = 1), and one fall incident (n = 1). Three serious adverse events occurred that were deemed unrelated to the study (walking group: pneumonia (n = 1); complications due to anemia (n = 1); usual care group: angioplasty (n = 1)). During screening, an error occurred while interpreting the accelerometer data, and despite the initial criteria to exclude individuals who met physical activity guidelines, 5 (31.3%) participants in the usual care group and 7 (29.2%) in the walking group who met these guidelines had not been excluded.

Table 3 Reported adverse events during the study

Exploratory results

Exploratory results are presented in Table 4 and Fig. 2. Over 24 weeks, participants in the walking group had improved VAS knee pain, while those in the usual care group did not (change of − 38.7 mm [95% CI − 47.1 to − 30.3] and 4.3 mm [− 4.9 to 13.4], respectively). The walking group experienced greater mean improvements in WOMAC pain, function and stiffness, each OARSI performance measure, leg strength, and weekly time spent in MVPA and were 4 times more likely to meet the OMERACT-OARSI responder criteria compared to the usual care group. There were very small changes in MRI-based structural measures over 24 weeks in each group, with the magnitude and direction of effects shown in Table 4.

Table 4 Change in secondary outcomes over 24-week follow-up between the walking and usual care group
Fig. 2 VAS, WOMAC subscales, 30-s chair stand test, and changes in time spent in MVPA values in the walking and usual care group during 24 weeks of intervention (mean with 95% confidence interval). The data are estimates from linear mixed-effects models, adjusted for age, sex, and corresponding baseline values. Abbreviations: MVPA, moderate to vigorous physical activity; VAS, visual analogue scale; WOMAC, Western Ontario and McMaster Universities Osteoarthritis Index; W, walking; UC, usual care

Discussion

This pilot demonstrates feasibility and lays the groundwork for a full-scale RCT that investigates the joint structural implications of a walking exercise program for KOA, though the recruitment approach could be improved. A subsequent full-scale RCT has the potential to build evidence about the clinical benefit of exercise therapy and help to debunk common misconceptions that exercise causes joint harm [7].

The study demonstrates feasibility in terms of randomization, safety, retention, and adherence and presents improvements to the study design. Supervised outdoor walking allowed for individualized support and encouragement for participants, contributing to high patient satisfaction and symptomatic improvements. Group sizes were reduced to provide better individualized support and to ensure protocol adherence and safety. In addition, adding planned absences as an exclusion criterion helped to promote adherence to the program.

Overall adherence to the walking program was 70.0%, which is higher than most comparable exercise studies [8]. The total program adherence was better during the first 12 weeks (85.1%) compared to the second 12 weeks (54.9%), and increased adherence promotion strategies during the second half of the study could further increase the quality of a more definitive study. In participants who remained in the study for the full 24 weeks, attendance remained high for the whole study duration (90.0% for the first 12 weeks and 73.0% for the second 12 weeks). The overall retention rate within the walking group (70.8%) was equal to [34, 35] or lower than [9] that of comparable studies. Dominant reasons for exclusion and withdrawal included being busy with work and going away. Prioritizing the screening of participants on time availability and making verbal agreements to complete follow-up measurements at the least could potentially benefit enrollment, adherence, and retention. This study did not investigate whether adherence and retention were related to the amount of benefit participants were experiencing; this is an area for future research that could potentially be better understood using a qualitative component. The intervention was considered safe. Six adverse events were deemed possibly related to the intervention; these were mild in nature and could be managed by the supervising physiotherapist and/or exercise physiologist.

Recruitment was slower than anticipated, and the recruitment target could not be reached due to budgetary constraints. The slow recruitment could have been due to our institute recruiting for two KOA exercise trials simultaneously, and this will be considered when planning the implementation of a future large-scale trial. Recruitment in future studies could potentially benefit from an enhanced recruitment strategy, such as snowball sampling, engaging primary care practitioners, promoting the study through local organizations (e.g., Arthritis & Osteoporosis Tasmania), and including more walk locations or using a multi-center design. The use of simple randomization resulted in an imbalance between the number of participants in each group, which could be prevented in a future trial by using block randomization [36].
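To illustrate how block randomization keeps arm sizes balanced, the sketch below generates a permuted-block allocation sequence. It is a minimal example and not the procedure of any planned trial; the block size, seed, and function name are assumptions chosen for illustration.

```python
import random


def block_randomize(n_participants: int, block_size: int = 4,
                    arms=("walking", "usual care"), seed=None):
    """Return an allocation list built from randomly permuted blocks.

    Each block contains every arm equally often, so arm sizes can never
    differ by more than block_size / len(arms) at any point.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded only to make the example reproducible
    allocations = []
    while len(allocations) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_participants]


if __name__ == "__main__":
    sequence = block_randomize(40, block_size=4, seed=2018)
    print(sequence.count("walking"), sequence.count("usual care"))
```

With 40 participants and blocks of 4, every block is complete, so the sequence contains exactly 20 allocations to each arm. In practice, block sizes are often varied or concealed so that upcoming allocations cannot be predicted.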

This pilot study investigated the direction and magnitude of changes in several exploratory outcomes. As this was a pilot study, no a priori power estimations were performed. Walking improved knee pain, function, and stiffness over 24 weeks, which supports the findings of previous RCTs [37]. The improvement in VAS pain observed in the walking group (− 38.7 mm; 95% CI − 47.1 to − 30.3) is clinically important [38]. There was also a mean clinically important improvement seen in WOMAC function (− 565.6 mm (95% CI − 685.3 to − 446.0)) [38], and all walking group participants met the OMERACT-OARSI responder criteria [22]. Some factors could have contributed to the relatively large improvements in symptoms. One is that all participants who completed the walking intervention reported being extremely satisfied. Patient satisfaction is a contextual effect rather than a direct treatment effect, which can be responsible for 5% (− 10 to 33%) of the measured effect [39]. In addition, symptoms were self-reported by participants, who were not blinded, and this could contribute to an overestimation of benefits. Self-reported quality of life improved more in the walking group as well, and the average improvement in the walking group approached clinical importance for the AQoL scale (the minimum important difference in AQoL scores for the Australian population is 0.06 [40]).

Physical performance and activity were assessor-blinded measures conducted to objectively examine changes in the ability to perform daily activities. Both the walking and control group showed improvements in OARSI performance measures over 24 weeks, which may have been due to a test learning effect [41]. However, improvements in the walking group were larger for every test. At baseline, most participants did not meet physical activity guidelines of 150 min spent in MVPA per week. During the intervention, most participants in the walking group met the MVPA guidelines. This is important, because adults above 60 years who are physically active have a reduced risk of cardiovascular and all-cause mortality, cancer, fractures, recurrent falls, functional limitations, and cognitive diseases, and have a better quality of life [42].

There were very small changes in MRI-based structural measures over 24 weeks in each group. Apart from a slight difference in meniscal extrusion score increases between the groups, there was no indication of detrimental effects related to the walking intervention. Meniscal extrusions increased by 23.5% in the walking group and 7.7% in the control group. This finding was based on a small number of participants (n = 5) and needs to be verified in a larger study.

This study has several strengths. First, its design as a pilot RCT enabled a thorough investigation of the study design to improve the quality for a subsequent larger RCT. Second, the intervention was intensively supervised, ensuring standardization of exercise and reducing self-report bias of exercise duration and frequency [43]. Third, the strategies to enhance participant satisfaction were a strength, as this has been shown to benefit retention and adherence [44]. Limitations of the study included a screening error, which meant that twelve participants were included who met MVPA guidelines at baseline. Our original intention was to enroll KOA patients who were not very active, and this error could have diluted the benefit of the program or may pose an additional challenge to recruitment in a subsequent RCT. In addition, the different withdrawal rate between groups could be a source of potential attrition bias [45].

Conclusion

A full-scale RCT is considered feasible given acceptable adherence, retention, randomization, and safety, though its quality can be enhanced using the findings of this pilot. The large improvements in symptoms in the walking group support the potential clinical usefulness of a subsequent trial. Small changes were observed in MRI knee joint structure, and the estimates can be used to inform a more definitive study.