Background

The stepped-wedge cluster randomised trial (SW-CRT) is a novel and appealing trial design which can be used for the evaluation of interventions during routine implementation [1, 2]. The design involves randomisation of clusters to sequences which dictate the order in which the clusters cross from the control to the intervention condition [3]. In general, all clusters start the trial in the control condition and are then introduced to the intervention in a staggered fashion, so that all clusters are receiving the intervention by the end of the trial [3]. The implementation of interventions under evaluation can often proceed in much the same way as it would have had the evaluation not been taking place, with the exception of the order of implementation [3]. This means the design has the potential to provide real-world evidence of effectiveness that can be generalised, and to be implemented with minimal disruption. With the increasing availability of routinely collected data, the trial design has the potential to become the gold standard for the evaluation of implementation and quality improvement interventions.

The use of the SW-CRT is seeing an unprecedented and exponential increase [4]. However, some of the complexities of the trial design can put studies at risk of not delivering on their objectives [5, 6]. In particular, the staggered implementation of the intervention means the scheduled timings of the trial can be disrupted. Disruptions to the planned implementation schedule and organisation of the trial may have repercussions that ultimately result in the trial being unsuccessful. In addition, all clusters generally need to be recruited before any randomisation takes place. If recruitment of clusters is slower than expected, this can severely delay the start of the trial or could result in fewer clusters being recruited and an underpowered trial. Getting the timings of the trial right, particularly the timing of the clusters starting the intervention, can be a challenge. Without first testing the implementation of the intervention, it may be difficult to determine how long the intervention will take to embed in a cluster, and therefore how long the periods need to be between clusters starting the intervention. Maintaining consistency in participant recruitment over the duration of the trial may be difficult, especially when the cluster changes from the control to the intervention condition. When recruiting continuously throughout the trial, variations in the number or type of participants can occur. These variations can result from changes in the level of engagement of those recruiting participants, which may wane as the trial progresses, or from staff turnover. These challenges surrounding the design and conduct of SW-CRTs mean feasibility studies could be particularly useful.

Feasibility and pilot studies are small-scale studies conducted prior to a definitive trial. They aim to guide the planning or design of the trial, or to determine whether the main trial is feasible and, if not, which issues can be resolved to make it feasible [7, 8, 9, 10]. Feasibility studies can be designed to investigate a vast range of issues [11]. Some may be focussed on testing the feasibility of the intervention: for example, whether the intervention is acceptable to its intended recipients; whether it is suitable for the environment where it will be introduced; and whether there are any challenges that might arise during the implementation [11, 12]. Other feasibility studies may be more concerned with assessing the feasibility of the trial processes: for example, testing the methods of data collection; the acceptability of the randomisation or recruitment procedures; or whether there are sufficient resources available to conduct the trial [12]. Depending on its objectives, a feasibility study may or may not have the same design as the main trial. Pilot studies can be defined as a subset of feasibility studies with a particular design feature [10, 13]: a pilot study, which may or may not be randomised, is conducted as a small-scale version of all or part of the future definitive RCT [10, 13]. Throughout, the term feasibility study will be used to encompass both feasibility and pilot studies.

Testing and refining the trial processes for SW-CRTs will be pivotal to their success. For example, it may transpire that, due to resource availability or the complexity of the intervention, only a limited number of clusters can simultaneously cross from the control to the intervention. The resource levels needed to start and maintain a cluster in the intervention condition can be investigated in a feasibility study. The length of time required between clusters crossing to the intervention might also be important, and again can be investigated in a feasibility study. This time should be long enough to allow the intervention to become embedded in a cluster before a measure of the outcome is obtained, whilst being short enough to allow the trial to complete within a set funding period. It might also be important to determine whether participants can be recruited in such a way that recruitment is not influenced by any knowledge of the intervention condition (which can induce biases) [14].
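The trade-off between embedding time and the funding period can be made concrete with some simple arithmetic. The sketch below assumes a standard layout in which a trial with S sequences has one all-control baseline period followed by one period per sequence, all of equal length. The function names and the numbers in the example are hypothetical and are not taken from any of the reviewed studies.

```python
def trial_duration_months(n_sequences: int, period_months: float) -> float:
    """Total duration of a standard stepped-wedge layout.

    Assumes one all-control baseline period followed by one period per
    sequence, all of equal length (an illustrative assumption).
    """
    if n_sequences < 1 or period_months <= 0:
        raise ValueError("need at least one sequence and a positive period length")
    return (n_sequences + 1) * period_months


def fits_funding(n_sequences: int, period_months: float, funding_months: float) -> bool:
    """True if the trial would complete within the funding window."""
    return trial_duration_months(n_sequences, period_months) <= funding_months


# Hypothetical example: 5 sequences, 4-month periods, 24 months of funding.
duration = trial_duration_months(5, 4)
print(duration, fits_funding(5, 4, 24))
```

Under these assumptions, lengthening the period to give the intervention more time to embed has to be paid for with fewer sequences or a longer funding window, which is the kind of constraint a feasibility study could help to quantify.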

Up until now, it has not been known whether feasibility studies are being used to inform the design of SW-CRTs and if they are, which issues are being investigated. A recent systematic review of SW-CRTs [4] identified three pilot studies for SW-CRTs, which were themselves of a stepped-wedge design. However, since not all published feasibility studies for SW-CRTs will have a stepped-wedge design, not all will have been identified by previous reviews. This review aims to gain an insight into how feasibility studies are being used to inform the design of SW-CRTs. Specifically, our objectives were to:

  • Systematically identify published feasibility studies designed to inform SW-CRTs;

  • Ascertain the design characteristics of and rationale for these feasibility studies;

  • Establish how the feasibility studies informed the main trials.

Methods

Search strategy

Feasibility studies for SW-CRTs, published in English, were identified via electronic searches conducted on 6th February 2017 of the online published databases Ovid MEDLINE (from 1946), Scopus (from 1966), Embase (from 1947) and PsycINFO (from 1967). An example of the search strategy used is outlined in Table 1 [15] and was based on previously published search strategies [1, 7, 12, 16, 17].

Table 1 Example search strategy for Ovid MEDLINE

Inclusion criteria

Eligible studies were full reports or protocols of feasibility studies conducted to inform a future SW-CRT. For the purpose of this review, a feasibility study for a SW-CRT was defined as any study which aimed to ascertain the feasibility of a planned SW-CRT, through the assessment of issues other than solely the refinement of the intervention. We consider pilot studies to be a subset of feasibility studies, so pilot studies were also eligible for inclusion in this review.

No restrictions were placed on the design of the feasibility study, so it was not necessary for the feasibility study to be of a stepped-wedge design, or even randomised. The feasibility study should, however, have focussed objectives to ascertain the feasibility of a planned SW-CRT and make it clear how the findings of the study will inform the main trial, which must be intended to be of an SW-CRT design. An SW-CRT is defined as any trial that randomises clusters to two or more steps (time-points at which clusters have a unidirectional change of treatment condition). Studies for which the intended definitive trial is individually randomised, has a bidirectional cross-over design or is non-randomised have been excluded.
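The definition of an SW-CRT used here can be illustrated with a small sketch. It assumes a standard layout with one cluster group per sequence and one step per sequence, builds the matrix of treatment conditions, checks that every sequence changes condition in one direction only, and randomly allocates clusters to sequences. All names (stepped_wedge_matrix, is_unidirectional, randomise_clusters) and the cluster labels are hypothetical.

```python
import random


def stepped_wedge_matrix(n_sequences: int) -> list[list[int]]:
    """Return a sequences x periods matrix of 0 (control) / 1 (intervention).

    Sequence i (0-based) crosses to the intervention at period i + 1, so
    period 0 is all-control and the last period is all-intervention.
    """
    if n_sequences < 2:
        raise ValueError("the definition used here requires two or more steps")
    n_periods = n_sequences + 1
    return [[1 if period > seq else 0 for period in range(n_periods)]
            for seq in range(n_sequences)]


def is_unidirectional(row: list[int]) -> bool:
    """True if a sequence never switches back from intervention to control."""
    return all(a <= b for a, b in zip(row, row[1:]))


def randomise_clusters(clusters: list[str], n_sequences: int, seed=None) -> dict[str, int]:
    """Randomly allocate clusters to sequences, as evenly as possible."""
    rng = random.Random(seed)
    shuffled = clusters[:]
    rng.shuffle(shuffled)
    return {cluster: i % n_sequences for i, cluster in enumerate(shuffled)}


design = stepped_wedge_matrix(3)
for row in design:
    print(row)

allocation = randomise_clusters(["ward A", "ward B", "ward C"], 3, seed=1)
```

A bidirectional cross-over design, which is excluded above, would fail the is_unidirectional check because at least one sequence would return from the intervention to the control condition.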

Screening

Two reviewers (CK & Maria Yao) independently and in a random order screened the titles and abstracts of the identified studies for eligibility. For those studies not excluded at the initial screening, full-text articles were obtained and the same duplicate method of assessment used. Ineligible studies were excluded and the reason for exclusion noted. If any additional information was required the authors were contacted and attempts were made to access any protocols for the identified feasibility studies. For each eligible study, the reference lists were also checked for any potentially eligible studies.

Data extraction

Data extraction for eligible studies was done independently and in duplicate in a random order by two reviewers (CK & (KH or LG)), using a data extraction form that had been tested, revised and finalised using a small number of the studies. Extracted data were managed in Microsoft Excel V.2013.

Information about the design of each feasibility study was extracted. This included how the authors defined their study (as a pilot, feasibility study or something else); the size of the study and how the sample size was justified; and whether the study has been registered with a recognised clinical trial registry, such as ClinicalTrials.gov. In addition, information was extracted on blinding, randomisation and overall design of the feasibility study (parallel, stepped-wedge etc.). The rationales for conducting each feasibility study prior to a main trial were obtained by extracting the specific aims of each study. These were categorised into process (feasibility of the processes that take place during the trial), resource (time, people and budget issues), management (feasibility of collaborations and coordination of teams) and scientific type (assess scientific processes and estimation of parameters) motivations (more detail is included in the published protocol [15]).

Information on the types of analysis conducted and the emphasis put on any results were extracted. Information was also extracted on any hard or soft stopping rules that were in place and the criteria used to determine whether the main trial would be feasible or not. Whether the decision was made to go ahead with the main trial and how the feasibility study informed or resulted in changes being made to the main trial were recorded, along with any information on whether any of the participants from the feasibility study would also be taking part in the main trial.

Analysis of results

We present a narrative synthesis of our findings, as well as a descriptive analysis of the study characteristics of each eligible feasibility study included in the review. We also present a critical appraisal of a single case study which highlights specific issues regarding feasibility studies and SW-CRTs [Additional File 1]. Where appropriate, this report adheres to the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement [18].

Results

A total of 1861 records were identified from the search of the databases, of which 11 studies were found to be eligible and are included in this analysis [Fig. 1; Additional File 2].

Fig. 1 Flow diagram: the flow of information through the different stages of the systematic review

Of the 11 included studies, fewer than half reported that the study had been registered (Table 2) [Additional File 3]. The majority of the identified studies were reports of results, but three protocols were also identified. Just over half of the studies described themselves as a pilot study, with the others using terms such as “feasibility study”, “acceptability and feasibility pilot”, “consultation exercise” or “formative research”.

Table 2 Characteristics of identified feasibility studies

Three of the identified studies were of a stepped-wedge design, with two of these having a qualitative component. Over half of the identified studies used mixed methods, and most (73%) were external to the intended main trial. The duration of the studies ranged from 16 weeks to two years, with a median duration of one year (interquartile range (IQR) (months): 4.7, 15). Three of the studies were randomised (all at the cluster level and all having a stepped-wedge design), with no restrictions on the randomisation mentioned. Blinding was present in one of these [19].

The majority of the included studies were conducted in health care settings, where the clusters were mostly defined as hospitals, clinics, or wards. The median number of clusters in each study was 3.5 (IQR: 1.8, 6). The study participants were mostly patients, healthcare professionals or both; studies included on average 109 participants (IQR: 35, 2180). The majority of studies used a convenience sample, and for three studies the rationale for the sample size was unclear.

Rationales for conducting the feasibility study

The studies reported a range of rationales, including process and resource rationales, though none reported any management rationales [Table 3, Additional File 4].

Table 3 Rationales given for conducting the identified feasibility studies

The most common process-type motivations were investigating acceptability of the intervention (64%), identifying issues or barriers to implementation (55%) and assessing intervention adherence (36%). Six studies investigated process-type motivations related to outcome measures: the choice of outcome measure, testing of data collection methods, or assessing the amount of missing data. Only one study assessed the feasibility of the processes specifically relating to the use of the stepped-wedge design. A description of this study is given in [Additional File 1].

Only three of the identified studies stated any resource-type motivations for conducting the study. Each of the following was assessed by one of the studies: the resources used in the intervention; the post-intervention impacts on service use and staff time; the time taken to complete study procedures; waiting and consultation times; and patient volumes and staffing levels. None of the identified feasibility studies listed any management-type issues that they investigated.

Some scientific type motivations were investigated, the most common being an estimation of the potential effectiveness of the intervention (45%). In addition, some studies aimed to assess the cost-effectiveness/theoretical cost savings of introducing the intervention or intended to use information gained from the feasibility study to inform the sample size calculation.
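The percentages in this section are taken out of the 11 included studies. As a reading aid, the sketch below lists the whole-number percentage for every possible count out of 11; the counts corresponding to the reported percentages are inferred from this arithmetic, not stated in the text.

```python
N_STUDIES = 11  # number of included studies reported in the Results


def pct(count: int, total: int = N_STUDIES) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# Print every possible count so a reported percentage can be looked up.
for count in range(1, N_STUDIES + 1):
    print(f"{count}/{N_STUDIES} = {pct(count)}%")
```

On this reading, 64%, 55%, 45% and 36% would correspond to seven, six, five and four studies respectively.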

Progression to a main trial

Only two (18%) studies gave any criteria for determining the success of the feasibility study and deciding whether to proceed to a main trial. One study provided specific criteria, stating the threshold enrolment rate required and the lowest proportion of visits that needed to be completed and sessions that needed to be attended [20]. The criteria for the other study were not as specific: the completion of the economic modelling of the pilot data would “underpin the decision to progress to a main trial” [21].
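Specific progression criteria of the kind described for the first study can be written down as explicit thresholds before the feasibility study starts. The sketch below is illustrative only: the class and function names, the dictionary keys and all threshold and observed values are hypothetical and do not reproduce the criteria used in [20].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgressionCriteria:
    """Pre-specified thresholds; all values used below are hypothetical."""
    min_enrolment_rate: float      # proportion of eligible participants enrolled
    min_visits_completed: float    # proportion of scheduled visits completed
    min_sessions_attended: float   # proportion of sessions attended


def assess_progression(observed: dict, criteria: ProgressionCriteria) -> dict:
    """Compare each observed proportion with its threshold."""
    checks = {
        "enrolment": observed["enrolment_rate"] >= criteria.min_enrolment_rate,
        "visits": observed["visits_completed"] >= criteria.min_visits_completed,
        "sessions": observed["sessions_attended"] >= criteria.min_sessions_attended,
    }
    # Proceed only if every individual criterion is met.
    checks["proceed"] = all(checks.values())
    return checks


criteria = ProgressionCriteria(0.50, 0.80, 0.70)
observed = {"enrolment_rate": 0.62, "visits_completed": 0.85, "sessions_attended": 0.66}
result = assess_progression(observed, criteria)
print(result)
```

Recording the result for each criterion separately, as here, shows which aspect of the study fell short and therefore what might need to be modified before a main trial.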

One study put in place a stopping rule. The decision as to whether to continue the study was based on the change seen in the outcome from a period prior to the implementation of the intervention [21]. This study is described in more detail in [Additional File 1]. None of the feasibility studies that had been conducted at the time of this review were stopped prior to completion.

Analysis method

The majority of the studies (55%) used a mixed-methods approach to analysing their data. The quantitative methods used included descriptive statistics, simple statistical tests and generalised linear mixed models. The qualitative methods used included constant comparative analysis, framework analysis, thematic analysis and content analysis. Hypothesis testing alone was used by two (18%) of the studies to gain estimates of the effectiveness of the intervention. Content analysis was the method of choice for the one study which only used qualitative methods. Two studies (18%) did not specify the method of analysis that was used.

Remaining feasibility concerns and modifications required

Some feasibility concerns remained for the completed studies. One study observed differential recruitment success due to possible response bias, yet still deemed the main trial to be feasible without changes [22]. Two other studies also concluded that the main trial was feasible without changes, despite some remaining concerns.

Even for those studies that listed no remaining concerns, changes were still intended to be made. These included changes to the intervention [23], study procedures [23, 24] and data collection methods [24]. One study, which identified several barriers, found issues predominantly relating to time and resource availability [25]. The main trial was found to be feasible with several modifications, including the introduction of an adherence and retention package. One of the studies did not specify whether the main trial would be going ahead as a result of the feasibility study [26].

The majority of the studies did not specify whether the participants (55%) or clusters (64%) from the feasibility study would be taking part in the future trial. For four of the studies (36%) the commitment to using a SW-CRT design for the future definitive trial was not as strong as for the other studies. For these studies, either the stepped-wedge design was considered to be the most appropriate design for a future trial or the feasibility study itself was an SW-CRT. The only study to specifically assess the feasibility of using the stepped-wedge trial design was one of the feasibility studies that was itself of a stepped-wedge design [21]. This feasibility study is described in [Additional File 1] along with a critical appraisal of its objectives.

Discussion

Through a systematic review of the published literature we have identified 11 feasibility studies (eight reports, three protocols) conducted to inform SW-CRTs. Given the increasing frequency with which SW-CRTs are being used, a greater number of published feasibility studies for these trials would be expected. This suggests either that few feasibility studies are being conducted in advance of running a definitive SW-CRT or that feasibility studies are being conducted but not published. Furthermore, of the few SW-CRTs that have published feasibility studies, few have assessed the feasibility issues surrounding the use of the SW-CRT design itself. Given the complexities of the trial design, especially around the timings of the roll-out of the intervention under evaluation, evaluations using this trial design will be at risk of not delivering on their objectives.

The SW-CRT is an emerging, innovative and potentially very useful yet complex study design [3, 5, 6]. It has been shown to be particularly useful for the evaluation of interventions that would have been rolled out regardless of the trial taking place, as the implementation can often proceed in much the same way whilst providing randomised evidence of effectiveness [3]. In this way SW-CRTs are assisting in the move towards more pragmatic trials to answer routine pragmatic healthcare questions. There have been many reviews of SW-CRTs [1, 4, 16, 17, 27, 28, 29, 30, 31, 32]. However, most reviews have focussed on statistical methodology (particularly sample size) and quality of reporting and none have looked at the use of feasibility studies for these trials. When designing a SW-CRT, there will often be aspects of the design that cannot be informed by previous trials, systematic reviews, routine data etc. and this information might only be gained through the use of a feasibility study. Obtaining this additional information can improve the feasibility of the designed trial.

We ascertained the design characteristics and rationale of the identified studies, in order to see how feasibility studies are currently being used to inform the design of SW-CRTs. In addition, we ascertained the processes employed by these studies for determining progression to a main trial, in order to see how their findings feed into the decision to run the main trial.

The studies varied considerably in both their size and duration, with some studies being completed in 16 weeks whilst others took two years and some studies requiring 16 participants whilst others included observations from more than 26,000 individuals. Many of the sample sizes lacked clear justification. Three studies did not provide any rationale for the size of the study and so it is not possible to determine whether the studies were large enough to accomplish their objectives or whether they might be excessively large. Under the CONSORT 2010 statement extension for randomised pilot and feasibility trials [10] the rationale for the numbers included in the study should be provided, especially for those studies where estimation of parameters such as recruitment rates is an objective. For many of the studies one of the main aims was to estimate the potential effectiveness of the intervention, which should not feature as an objective of a feasibility study as it will not be sufficiently powered for this [12]. Therefore, the decision as to whether to continue with the study or to progress to the definitive trial should not be based on any estimate of potential effectiveness from the feasibility study.

Only one of the feasibility studies aimed to assess the feasibility of using the stepped-wedge design, despite three of the studies being of a stepped-wedge design themselves. One study investigated the time taken to complete the study procedures, whereas the rest of the studies were mostly assessing the feasibility and acceptability of the intervention itself. Given the complexity of the design of a SW-CRT, it is surprising to see so few of the studies investigating issues that are specific to the SW-CRT design. However, given the current dearth of papers describing the practical challenges of conducting a trial of this design, this is perhaps less of a surprise. A small number of published stepped-wedge trials have reported the challenges faced [33]. Known challenges include delays in the start of the trial, poor recruitment and limited quantity and quality of data [27, 33, 34], many of which could be investigated using a feasibility study. Fewer than half of the identified studies were registered. The importance of registering feasibility studies has been highlighted by the CONSORT 2010 statement extension for randomised pilot and feasibility trials [10] and we reiterate this point. Registering feasibility studies makes them easier to identify. Once identified, these studies can be used to help inform the design of future SW-CRTs, by highlighting feasibility issues associated with this design.

Strengths and limitations

Our review used a pre-specified search strategy, inclusion and exclusion criteria and duplicate data extraction in order to minimise potential sources of bias. The search strategy included many terms used for SW-CRTs, based on those included in other reviews [1, 16, 17], in an attempt to capture those studies using some of the less common terms. Yet, despite our best efforts, there is still the potential for some selection bias, as our search will not have captured studies using other terms to describe the stepped-wedge design or non-English language studies. In addition, the results of our search will be limited to feasibility studies that specify in the title or abstract that the main trial will be a SW-CRT. Another added complexity and potential limitation is that feasibility studies often go unpublished [7, 12]. A recent review of feasibility studies funded by the National Institute for Health Research’s (NIHR) Research for Patient Benefit (RfPB) programme found almost half of the studies that they looked at had not published results [35]. We included studies self-defining as either pilot or feasibility studies, but other studies may have used different terminology to describe themselves, although this is likely to improve with the publication of the CONSORT 2010 statement extension for randomised pilot and feasibility trials [10].

Further work is required to highlight to researchers all of the potential feasibility issues associated with the SW-CRT, how some issues become more serious when using the stepped-wedge design and to promote the use of feasibility studies to inform these trials. The work presented here is part of a larger programme funded by the National Institute for Health Research (NIHR), which intends to identify the feasibility issues encountered by SW-CRTs and ultimately lead to the development of guidance on how feasibility studies can be conducted for SW-CRTs.

Conclusions

Published feasibility studies to inform SW-CRTs are scarce and those that are being published do not aim to investigate many of the issues specific to this design of trial. SW-CRTs are complex and, compared to other designs, they are relatively inflexible to change once the trial has commenced. Feasibility studies have the potential to be highly informative when designing SW-CRTs, improving their design and giving them a greater chance of being completed successfully, on time and with the required sample size. We highlight the importance of conducting a feasibility study prior to any SW-CRT and encourage the publication of the findings in order to help other researchers planning to conduct a SW-CRT. We also encourage the published reports of completed SW-CRTs to highlight the challenges faced during the trial, in order to help future trials avoid the same issues and give them the opportunity to investigate solutions to these issues during their own feasibility studies.