Background

Because systematic reviews are a methodology designed to inform policy and practice decisions, it is particularly important to ensure that they are shaped by those who will make use of them, a process known as stakeholder engagement. Stakeholders might include service users such as patients, practitioners such as teachers, community leaders, those working to set or implement national or local policy, and many others. There is a range of approaches for engagement of stakeholders in research, from advisory groups to co-production. This range includes the opportunity for stakeholders to shape the scope of the review, the types of outcomes considered, and the dissemination of the research findings, amongst other things [1]. Where stakeholders get involved as co-producers, they may also learn and apply specific review skills such as searching, coding, and critical appraisal. The choice as to which approach to stakeholder engagement is adopted is shaped in many ways by whether the review is ‘supply-led’ (i.e. driven by the researchers/research community) or ‘demand-led’ (i.e. driven by the users of the review). The former is likely to already have scope and methodology in place, with stakeholder engagement used as a mechanism to improve particular aspects of the scope, validate the question, or advise on dissemination. In the latter, the review is being produced in direct response to stakeholders’ demands, and their inputs are therefore much more likely to influence the scope and design of the review. As such, not all stakeholder engagement leads to demand-led reviews, but all demand-led reviews are steered by stakeholder engagement.

Any engagement by stakeholders in systematic reviews can be particularly challenging due to the complexity of the methodology. A key hindrance here is that, in systematic reviews, the link between the user of the research and the data collected and analysis generated is thinner than in primary research. For example, systematic reviews do not interview research participants or collect household level data, a process with which review users might be more familiar than, say, extracting effect sizes or conducting thematic synthesis of data reported in primary research. There is a body of literature that aims to understand and advise how best to elicit contributions from stakeholders, including consideration of who initiates engagement [1]. There is however less guidance on what to do with the contributions stakeholders make, particularly if they contradict what methodologists recommend [2].

Discussions about how best to engage stakeholders, and meet their evidence needs, have given rise to a debate around how best to balance the sometimes-competing interests of the different contributors [3, 4]. For some achieving rigour is a scientific and technical process to maximise the generalisability of the findings; it is seen as a process that obliges adherence to requirements laid down by one of the specialist systematic review collaborations (including the Collaboration for Environmental Evidence (CEE)) with an emphasis on the methodological aspects of the review. For others the legitimacy of methods is paramount [5, 6, 7]. Parkhurst defines this as ensuring that the review is perceived to have been produced in such a way that is respectful of stakeholders’ divergent values, and fair in its treatment of views and interests [3, 7]. There are also issues of relevance of the review, which can include its focus, format, and timeliness [8].

Stakeholder engagement in systematic reviews therefore presents a major challenge to review teams that goes beyond the usual discussion of whom to involve and how. Responding to stakeholders’ priorities can often drive review teams towards a more relevant, actionable, and timely (rapid) process. Engagement in itself therefore creates a tension between the production of globally relevant systematic reviews—in which methodological steps to enable generalisability are prioritised—and locally-specific, often rapidly-produced evidence syntheses for policy needs. Stakeholder engagement therefore presents a dilemma for review teams about what is a ‘gold standard’ review.

This commentary aims to address head on this ‘elephant in the room’ with regard to stakeholder involvement in systematic reviews: that responding to stakeholders when producing a demand-led review can mean reconsidering what makes a review rigorous.

After 20 years of producing evidence synthesis in partnership with stakeholders, our team at the Africa Centre for Evidence at the University of Johannesburg has adopted an approach for producing evidence syntheses that prioritises both methodologically generalisable ‘public goods’ published in recognised systematic review libraries and responsive evidence products that meet the needs of decision-makers, which can require a broader understanding of rigour. This paper presents this approach for discussion.

Commentary

What has led us to develop this approach?

We understand the need for rigour. We have conducted reviews for 3ie, CEE, Cochrane, Campbell, and the EPPI-Centre and so bring a wealth of methodological expertise to the challenge of balancing stakeholder engagement with the need for rigour. Our work has at times been supply-led and at other times demand-led, and this has influenced the range of people involved and types of engagement we have undertaken as well as how we have viewed the concept of rigour. We have worked with a wide range of stakeholders using approaches all along the spectrum of involvement [9]; from one-off requests for advice from stakeholders to formal advisory groups, working groups, and even full co-production. We have also supported decision-makers in producing their own evidence syntheses. In employing this spectrum of engagement approaches, we have produced a wide range of synthesis products: from full reviews through to responsive evidence assessments, reviews of reviews, and evidence maps.

The range of stakeholder engagement we have undertaken, with different drivers and different products, has led us to reflect that the definition of rigour commonly used by research producers does not always fit within particular stakeholder contexts, and therefore we have been reconsidering the question of what is a ‘gold standard’ systematic review.

An overview of the approach we now use

As a team committed to producing evidence syntheses which are demand-led, useful, and used, we have to take seriously these issues about how and why to include stakeholders and how to address the tensions with regard to rigour that arise as a result. As methodologists, we have a good understanding of the ‘compromises’ made when different priorities are balanced with respect to what makes a review rigorous.

Our approach includes the following eight steps:

  i. Stakeholder mapping to ensure all relevant groups are considered;

  ii. Engaging a wide range of stakeholders including methodologists, subject experts, and decision-makers;

  iii. Producing a protocol that can be peer reviewed to ensure we garner feedback from methodological experts;

  iv. Producing an evidence map as comprehensively and rigorously as possible within the time and funds available (see Footnote 1);

  v. Sharing that map through an interactive visualisation with a wide range of stakeholders, including both decision-makers and methodologists. This visualisation takes the form of a spreadsheet that can be viewed online. The two axes most commonly represent (a) interventions and (b) outcomes, and within each cell is a representation (for example as numbers, dots, or colours) to indicate the size and nature of the available evidence that corresponds with that specific intervention and outcome combination. Users can apply a number of filters to focus the evidence that is included in the display (for example selecting studies based in a particular country or applying a particular study design), and can click through into each cell to find reference information for included evidence;

  vi. Selecting areas for synthesis based on stakeholder input. This might include one or more specific cells, or particular intervention or outcome areas. It may also involve applying one or more filters, for example selecting randomised controlled trials conducted within Africa;

  vii. Conducting syntheses that are explicit about the elements that constitute rigour and how they are balanced; and

  viii. Producing more than one output to meet the needs of both the immediate stakeholders with whom we have engaged (the tailored evidence syntheses), and the needs of potential future users (the global good systematic review).
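
To make the structure of the evidence map in step v more concrete, the sketch below shows one way such a map could be represented: studies are grouped into intervention-by-outcome cells, optional filters narrow the evidence shown, and each cell holds the references a user would reach by clicking through. This is a minimal illustration only, not the tool our team uses; the study records, field names, and the `build_map` and `cell_counts` helpers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical study records; every field and value here is illustrative only.
studies = [
    {"ref": "Study A", "intervention": "Training", "outcome": "Income",
     "country": "Kenya", "design": "RCT"},
    {"ref": "Study B", "intervention": "Training", "outcome": "Income",
     "country": "Ghana", "design": "Cohort"},
    {"ref": "Study C", "intervention": "Microfinance", "outcome": "Income",
     "country": "Kenya", "design": "RCT"},
    {"ref": "Study D", "intervention": "Microfinance", "outcome": "Food security",
     "country": "India", "design": "RCT"},
]


def build_map(records, **filters):
    """Group references by (intervention, outcome) cell.

    Only records matching every filter (e.g. country="Kenya") are kept.
    """
    cells = defaultdict(list)
    for rec in records:
        if all(rec.get(field) == value for field, value in filters.items()):
            cells[(rec["intervention"], rec["outcome"])].append(rec["ref"])
    return dict(cells)


def cell_counts(evidence_map):
    """Return the number of studies in each cell (the 'size' of the evidence)."""
    return {cell: len(refs) for cell, refs in evidence_map.items()}


# The unfiltered map, and a filtered view of the kind described in step vi.
full_map = build_map(studies)
kenya_rcts = build_map(studies, country="Kenya", design="RCT")
```

In this sketch, selecting an area for synthesis (step vi) corresponds to choosing one or more keys of the filtered map and retrieving the references stored in those cells.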

What this means for the rigour of our evidence syntheses

In theory, having different outputs from the same project should mean that we are able to meet the requirements for rigour as laid out by systematic review collaborations. Having said that, we have found that we do not ‘fit’ in the usual publishing requirements of the systematic review collaborations. For example in 2012/2013 we produced a three-stage review on smallholder farming that included a systematic review of reviews, an evidence map, and a full synthesis. The Campbell Collaboration’s processes were not flexible enough to consider all three steps and only accepted the full synthesis stage, which had to be written up as a standard systematic review, almost as though the first two stages had not taken place [10, 11]. The very fact that our approach has not fit within the usual formats hints that the requirements for rigour within these formats may not be fit for real world decision-makers’ evidence demands.

We are unlikely to be able to employ all the ‘best practices’ promoted by systematic review collaborations as the elements of rigour within our reviews are likely to be broad and responsive to stakeholders’ priorities. Stakeholders’ priorities sometimes take us outside what is considered ‘best practice’. When trying to be responsive to decision-makers’ needs we often have to be quick, which can mean that some steps required for technical/scientific rigour are adapted to the demands of the specific context. For example, we may have a percentage of papers double screened or double coded rather than all of them, or conduct a shorter, less comprehensive critical appraisal stage. The stakeholders with whom we are working may also have priorities for synthesis that do not match those of other stakeholders. This may mean that the review is of considerable value to some people but not others. This might be for a number of reasons, including that the subject they choose is relevant only to specific environments, or that their outcome of preference does not apply to others’ contexts.

Full publication of all our reviews’ outputs is less likely to take place when we adopt stakeholders’ priorities. Outputs are fed into decision-making cycles immediately without waiting for formal publication processes, which will not necessarily take place. If confidential documents have been included, as has been the case in some of the syntheses we have conducted for government colleagues, it may limit the scope for full publication of data. Quality assurance processes of reviews, such as peer-review, can also look different. Rather than having formal methodological peer-review, decision-makers’ quality assurance processes (and thus definitions of rigour) have to be followed. These can often be different (for example validation meetings of the usefulness of evidence mapping methodologies by a range of government departments) but are not necessarily less stringent: for instance when evidence syntheses are tabled at Cabinet level, the level of scrutiny of the synthesis can be much higher than in traditional academic review as the stakes are higher.

What this means for the relevance and usefulness of our evidence syntheses

We propose that this approach to stakeholder engagement for demand-led reviews is much more likely to be relevant to the needs of those specific stakeholders involved. In our experience of working with the government in South Africa to produce an evidence map on Human Settlements, we developed a conceptual framework that fit closely with the country’s National Development Plan and Mid-Term Strategic Framework. This enabled the evidence map to feed directly into policy debates in government.

Of course, such close consultation does not necessarily mean that the synthesis will meet the priorities of other groups of potential users, but we believe that this approach creates more legitimacy as the syntheses are easily recognised as having responded to the priorities and values of the users [7]. Timeliness is an important factor for decision-makers, so by working with them and to their timelines, the review is much more likely to be used: if you miss the policy-window, then the review simply will not be read.

Demand-led reviews move the review design and conduct much closer to the user of the review. This approach changes the balance of power between the researcher and the review user, which can elicit worries about the independence of the review process and findings. Review stakeholders might for example influence the review in such a way as to arrive at the preferred findings and recommendations. In our experience, there are three points to consider in this regard. First, while being more flexible and tailored to decision-making needs, demand-led reviews cannot compromise on the underlying systematic review principles of transparency and following a structured, systematic review approach. Any demand-led review has to comply with these principles just as traditional, supply-led reviews do. Second, where vested interests become a challenge to a demand-led review, the review project should be discontinued. However, it is not clear why an independent but unused review is any less a waste of research than a review that cannot be completed due to undue attempts at stakeholder influence. The risk of vested interest due to stakeholder engagement therefore does not seem to present an inherent reason not to conduct demand-led reviews. One could also argue that to challenge and change vested interests and beliefs, if possible at all, engaging with such actors and groups in the review process has a higher likelihood of success than assuming that review findings will reach such groups by themselves. Third, we are not arguing that linking the concept of rigour closely to the review methodology followed is per se not valid. Rather, we are aiming to extend the concept of rigour to not only include methodological soundness, but also questions of the review’s relevance to decision-making contexts, and the perceived legitimacy of the review by the user audience. In this extended definition of rigour then, different aspects can be balanced against each other. However, a review of high-relevance and legitimacy which has achieved these attributes through allowing stakeholders to influence and undermine the review research process certainly would not be considered a rigorous systematic review.

How this relates to approaches taken by others

Our attempts to tailor evidence synthesis methodology to better meet users’ demands and needs have not been developed in isolation. Oliver and Dickinson, for example, highlight the challenges of producing policy-relevant reviews with issues of context and questions about transferability raised [8]. As highlighted in their paper there are issues in relation to translating the global reviews to specific contexts and needs, suggesting that even when these public goods are produced there is considerable translation required to achieve policy-relevance in specific contexts. Some efforts start with this challenge, taking ‘public good’ reviews and aiming to make them more accessible and more likely to be used by decision-makers [12]. This supply-driven approach is different from, but not necessarily contradictory to, our approach.

Others aim to provide evidence response services that are limited in their generalisability and future value, but maximise the potential for evidence-use by decision-makers by meeting their urgent needs [13]. Whilst this meets requirements for rigour in terms of relevance, timeliness, and legitimacy it does not conform to the methodological requirements of full ‘public good’ reviews. The formal systematic review collaborations are shifting slightly in this regard: the Collaboration for Environmental Evidence is discussing where different review products fit, whilst recognising that its primary goal is to produce ‘public good’ systematic reviews; there are indications that the Campbell Collaboration will accept evidence maps in the future.

The greater the number of funders commissioning reviews and the more people from different disciplines apply the method, the more these issues come to the fore and need to be discussed. A good example of this is the recent introduction of evidence synthesis in the humanitarian sector, which motivated a range of interesting debates on rigour and policy-relevance too [14]. We anticipate that there will therefore be more people asking what it means to produce demand-led reviews that respond to stakeholders’ needs, and seeking new approaches such as the one we propose.

Conclusions

We have identified the following strengths to our approach with regard to increasing the usefulness and use of reviews through stakeholder engagement in demand-led reviews. First, our syntheses meet a decision-making need (or needs) and are therefore much more likely to be used. Second, the general appeal of evidence synthesis as an input in decision-making processes increases as its value is demonstrated to stakeholders (e.g. awareness of reviews and more positive perceptions of them). Third, depending on the approach for stakeholder engagement that is taken, stakeholders’ skills to produce, use, and commission syntheses also increase.

On the other hand, the generalisable ‘public good’ aspect of synthesis decreases the more you engage stakeholders’ priorities. Furthermore, those working on the syntheses with stakeholders need to be very flexible in terms of labour: gathering, and then being responsive to, stakeholders’ needs is very time consuming. There are also high opportunity costs for both academics and decision-makers in the production of demand-led syntheses. For example, researchers might derive few or no publications out of the synthesis and decision-makers might have few professional incentives and rewards for engaging in evidence synthesis. There is also a need for a range of expertise within the synthesis team—including technical methods expertise, public policy-making, and engagement skills—and careful project management.

What this means for definitions of rigour and what is a ‘gold standard’ review

We set out to discuss a tension that is inherent within the promotion of stakeholder involvement in systematic reviews but is rarely recognised—that to be responsive to stakeholders in producing demand-led reviews requires a re-thinking of what constitutes rigour. This issue is often presented as a tension between rapid evidence assessments and full reviews, but we believe it is a bigger question about what makes a ‘gold standard’ review. We propose that a shift in language is required. We prefer ‘responsive reviews’ to ‘rapid reviews’. We also believe that responsive reviews are not ‘quick and dirty’ but rather ‘quick and good enough’ [15].

We are proposing that a shift in our whole approach is needed, whilst also recognising that this is not always feasible. We believe that responsive reviews remain an important way to increase the use of systematically reviewed evidence in decision-making. At the same time, the inherent value of ‘public good’ reviews for future decision-making remains. We also acknowledge that funding sometimes requires that responsive reviews are done without time for a linked ‘public good’ full review. Perhaps most importantly, ‘gold standard’ reviews are not only those that are technically and methodologically ‘rigorous’, but are also those that are responsive to decision-makers’ needs and are recognised as being so.

Concluding statement

This commentary aims to address head on the often undiscussed key challenge with regard to stakeholder involvement in systematic reviews: that responding to stakeholders can mean reconsidering what makes a review rigorous. It proposes a new model to address these tensions that combines the production of ‘public good’ reviews with stakeholder-driven syntheses. During 2017, we will be putting this model to the test on a synthesis project exploring ecosystem services interventions for poverty alleviation in Africa and are looking forward to reporting back on our experience.