Background

Policy makers in global health are increasingly adopting ‘evidence-based’ decision-making practices [1, 2]. Using the available evidence to inform decisions is believed to improve the resulting policy choices, in terms of their appropriateness to a particular context or their likelihood of achieving their envisioned aims. However, at present there are no commonly accepted guidelines within global public health for how to evaluate evidence. Evaluations of existing evidence frameworks have identified details of program context and project implementation as needed, yet missing, components [3, 4]. In this pilot study, we evaluated how context and implementation were reported in recent studies of global health interventions. Our aim was to identify areas that are reported sufficiently well and areas where action is needed to improve the design, conduct, and reporting of global health interventions, so that decision-makers are better able to make sound, evidence-based decisions.

Methods

In this study, we identified candidate criteria for reporting context and implementation, selected a representative sample of published global health intervention studies, and then applied the candidate criteria to the published studies. To assist us in the first two tasks, we assembled an international and multidisciplinary Technical Expert Panel (TEP) of 17 global health intervention developers, evaluators, practitioners, sponsors and policy makers (see acknowledgements for the list of technical experts). Experts were identified based on publication records covering global health interventions or program evaluation, as well as suggestions from policymaking organizations and sponsors. This was the second phase of a study assessing the utility of existing global health evidence frameworks; the first phase has already been published. That first phase found that existing evidence frameworks vary in their criteria and judgments, and do not sufficiently take into account the many needs of decision-makers in global health contexts [3].

Identifying candidate criteria

As part of the discussion of the utility of existing global health evidence frameworks, the TEP identified information about context and implementation as gaps in current evidence frameworks. More specifically, they identified the need for more information about who delivered the intervention, what the intervention consisted of, how it was implemented, and what it cost, and for contextual details about the population receiving the intervention. We used this input, plus our own knowledge of implementation science, to guide our selection of candidate criteria for reporting of context and implementation. We chose to use existing criteria, selected according to their applicability to the needs identified by the TEP. We used three existing sets of implementation criteria (IC) identified by experts in implementation science: the Consolidated Framework for Implementation Research (CFIR [5]); the proposed criteria for reporting the development and evaluation of complex interventions in healthcare (CReDECI [6]); and the criteria required by the editors of the journal Implementation Science, themselves based on the WIDER criteria [7]. From these IC sets, we identified specific criteria based on the input described above. We tried to include at least one criterion from each of the five CFIR domains: intervention characteristics, outer setting, inner setting, characteristics of the individuals involved, and the process of implementation. We ultimately selected a parsimonious set of 10 implementation criteria that were potentially relevant to reporting the implementation of global health interventions and therefore worth testing in a pilot study. One criterion had eight components that were each rated separately.

Identifying a representative sample of published global health intervention studies

We identified a diverse set of global health interventions as potential candidates to which to apply these existing implementation criteria by considering the major causes of morbidity and mortality in developing countries or the major diseases of focus among international global health financing bodies. This set of global health interventions was chosen for applying both the selected implementation criteria and the existing evidence frameworks assessed in the first phase of this study (described elsewhere) [3].

We developed a draft set of key dimensions for classifying global health interventions (e.g., population affected, geographic location, whether the intervention addresses a communicable or non-communicable disease, and whether the target is a one-time behavior change or a recurring event). We then mapped the potential exemplar interventions onto these dimensions so that we could select a diverse set. TEP members provided input on the dimensions and on their preferred exemplars. From this exercise, we selected three interventions: household water chlorination, prevention of mother-to-child transmission of HIV (PMTCT), and lay or community health workers to reduce childhood morbidity and mortality. The selection process is described in more detail elsewhere [8].

For each of the three chosen representative global health interventions, we located published systematic reviews of effectiveness through a Medline search and selected one review for use in this pilot study. For each intervention, we chose the most recent review that we judged best assessed that intervention. For each of these reviews, we retrieved the original research studies included in the review and used these original studies as sources of evidence when applying the pilot implementation criteria.

Applying the criteria

For each of the 10 implementation criteria and each of the original research articles included in the systematic reviews of the representative global health interventions, one reviewer recorded the exact text judged related to the criterion and assigned an initial score of ‘good,’ ‘fair,’ or ‘poor/none,’ following a rating scheme used by many other quality and reporting checklists. Then, at a group meeting, each criterion for each article was reviewed in detail, and a group decision was reached on the final rating. That decision was based on the degree to which we judged the text met the needs of stakeholders regarding that aspect of implementation, as determined by the input received from our technical expert panel.

Results and discussion

Results

Table 1 lists the 10 implementation criteria adapted for this project from the 3 IC sets. The provided text examples accompanying these criteria are ones we identified from the global health research articles. The rationale or clarifying statements are taken directly from the original source, with adaptation to the global health context.

Table 1 Global framework – implementation criteria for pilot test

For the pilot testing of these implementation criteria, we applied the 10 implementation criteria in Table 1 to the 15 original research articles that form the evidence base for our three global health interventions. For the exemplar intervention of household water chlorination, we used the three original household chlorination research studies that were included in the Clasen et al. [9] meta-analysis for the outcome of rate-ratios for all-age diarrhea (see Analysis 1.1 stratified by intervention type). For one of these studies, we used the published journal version instead of the original dissertation, both for practical convenience and because this is likely to be what would be available to a policymaker undertaking a similar exercise (we substituted the published version by Sobsey et al. [10] for the original dissertation by Handzel [11]). For the representative intervention of PMTCT, we used five original research studies included in a 2012 systematic review on community strategies to improve PMTCT programs in the developing world [12]. The primary outcomes were prevention of vertical transmission of HIV as well as program retention. For the intervention of community or lay health workers to reduce child mortality, we used all seven trials contributing to the meta-analysis in the sole 2010 Cochrane review on this subject [13]. Thus, we applied 10 criteria to each of 15 articles (three, five, and seven articles for the three interventions, respectively).

Tables 2 and 3 present summary findings for household water chlorination, prevention of mother-to-child transmission of HIV, and lay health workers by criterion and by article, respectively. Additional file 1 summarizes findings by article and by criterion together. More detailed tables showing what text was found and how we judged it against the criteria can be found in Additional file 2.

Table 2 Ratings of implementation criteria in published studies of three representative global health interventions
Table 3 Ratings of implementation criteria by article

We found we could not operationalize criterion 6, about the outer setting, and therefore dropped it. The remaining 9 criteria generated 16 ratings, since criterion 4 includes 8 subcriteria (8 criteria with a single rating each, plus the 8 subcriteria of criterion 4). The proportion of criteria for which reporting was poor or none ranged from 11% to 54%, with an average of 30%. The two criteria most commonly rated poor or none were criteria 5 and 10. Criterion 5 dealt with cost (of either the intervention or its implementation), and criterion 10 with describing or assessing the implementation process itself, in terms of the function or aims of each of the program components. The three criteria most commonly rated good were criteria 1, 4c, and 4d, which described the source of the intervention (in almost all cases, these were investigator-initiated research projects), the setting, and the mode of delivery (such as face-to-face), respectively.
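The count of 16 ratings and the per-article share of ‘poor/none’ ratings can be reproduced with a short tally. The following is a minimal sketch in Python, not the procedure used in the study: the function names rating_slots and share_poor_or_none are hypothetical, and the example ratings are invented for illustration rather than taken from Tables 2 and 3.

```python
from collections import Counter

# Criterion numbering follows the text: criteria 1-10, with criterion 6
# dropped and criterion 4 split into subcriteria 4a-4h.
DROPPED = {6}
SUBCRITERIA = {4: "abcdefgh"}


def rating_slots():
    """Return the label of every separately rated item."""
    slots = []
    for criterion in range(1, 11):
        if criterion in DROPPED:
            continue
        letters = SUBCRITERIA.get(criterion)
        if letters:
            # One rating per subcriterion, e.g. '4a' ... '4h'
            slots.extend(f"{criterion}{letter}" for letter in letters)
        else:
            slots.append(str(criterion))
    return slots


def share_poor_or_none(ratings):
    """Fraction of one article's rated items that were scored 'poor/none'."""
    counts = Counter(ratings.values())
    return counts["poor/none"] / len(ratings)


slots = rating_slots()
print(len(slots))  # 16

# Hypothetical ratings for a single article (not data from the study)
example = {slot: "good" for slot in slots}
example["5"] = "poor/none"
example["10"] = "poor/none"
print(share_poor_or_none(example))  # 0.125
```

Keeping the subcriteria as separate slots mirrors the way criterion 4 was rated component by component.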

Discussion

The three most important findings of this pilot study are: overall reporting is poor or none for about one third of a sample of criteria; the quality of reporting varies across articles and interventions; and good reporting is possible, with examples of good reporting for each criterion (except costs).

The reporting of implementation information is highly variable both within and across articles. Some articles report a great deal of information about some criteria and almost nothing about others; likewise, some articles report almost nothing about most criteria while others report a great deal about most criteria. For example, the articles by Chandisarewa [19], Kumar [23], and Marsh [28] show good reporting on most criteria. In total, eight articles had ‘good’ or ‘fair’ documentation for more than 75% of criteria, while five articles had ‘poor or none’ documentation for 50% or more of the criteria. This degree of variability within and across studies suggests that the decisions by each global health study team about which aspects of context and implementation to measure and report are idiosyncratic and not guided by any commonly accepted norms. This contrasts with the more standardized reporting of other aspects of study design and execution, such as whether or not a study uses random assignment to intervention groups and what the attrition rate is for study participants. Reporting tools such as the CONSORT statement have helped improve the reporting of features such as these [29, 30]. Reporting of global health interventions could perhaps be similarly improved by the development and widespread adoption of reporting criteria.
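The two groupings of articles above rest on simple thresholds, which can be made explicit. The sketch below is illustrative only and assumes each article carries a flat list of final ratings; the function name classify_article, its parameters, and the example rating lists are hypothetical and do not come from the study data.

```python
def classify_article(ratings, high=0.75, low=0.50):
    """Group an article by the thresholds described in the text.

    ratings: list of final ratings, each 'good', 'fair', or 'poor/none'.
    high: an article is 'mostly good or fair' if MORE than this share of
          its ratings are 'good' or 'fair'.
    low:  an article is 'mostly poor or none' if this share OR MORE of
          its ratings are 'poor/none'.
    """
    if not ratings:
        raise ValueError("an article needs at least one rating")
    n = len(ratings)
    good_or_fair = sum(r in ("good", "fair") for r in ratings) / n
    poor_or_none = sum(r == "poor/none" for r in ratings) / n
    if good_or_fair > high:
        return "mostly good or fair"
    if poor_or_none >= low:
        return "mostly poor or none"
    return "mixed"


# Hypothetical articles with 16 ratings each (not data from the study)
print(classify_article(["good"] * 13 + ["poor/none"] * 3))  # mostly good or fair
print(classify_article(["fair"] * 12 + ["poor/none"] * 4))  # mixed
print(classify_article(["good"] * 8 + ["poor/none"] * 8))   # mostly poor or none
```

Because the two thresholds cannot both be met by the same article, the groups are disjoint, and any article meeting neither falls into the ‘mixed’ group.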

In this pilot study, we made some observations that may prove useful to future assessments of context and implementation in global health interventions. First, as already noted, we found we could not satisfactorily operationalize the CFIR criterion about the outer setting, since we judged that in most or all of the settings the national and local health authorities would judge the aims of the interventions (e.g., reduce water-borne diarrheal illness, reduce mother-to-child transmission of HIV, reduce infant mortality) to be compatible with their health policies. Scoring this criterion as ‘poor or none’ whenever an article lacked an explicit statement about the alignment of the intervention with health goals seemed too harsh, and hence we dropped this criterion. Second, because we used criteria from different sources, some overlapped and could be consolidated; for example, criterion 4h, ‘A detailed description of the intervention/program content provided to each study group,’ and criterion 9, ‘Description of all materials or tools used for the implementation.’ Third, we followed the common practice of giving the ‘benefit of the doubt’ when making judgments between categories, meaning our ratings are probably a ‘best case scenario’ for current reporting. More work on sharpening the operational definitions that separate the categories of reporting will be needed to avoid this upward bias. Last, we found some criteria more difficult to rate than others; specifically, criteria 2, 3, 7 and 8 were particularly difficult to judge, and we believe further work is needed to assess and improve inter-rater reliability. Our judgment is that items 4a to 4h (from the WIDER criteria) are potentially the most immediately useful and applicable, with the addition of criterion 5 about costs, as we found these easiest to rate.

In summary, this pilot study found that reporting of context and implementation information in studies of global health interventions is at best mostly fair or poor, and highly variable. The lack of context and implementation information is a major gap in the evidence needed by global health policy makers to reach decisions. The idiosyncratic variability in reporting indicates global health investigators need more guidance about what aspects of context and implementation to measure and how to report them. This pilot study could be useful to an effort to develop that guidance. Without better reporting, policy makers will be left in the dark about context and implementation details that are key to designing and introducing an effective and sustainable intervention.

Authors’ information

JL (PhD) is an Economist at the RAND Corporation. PS (MD, PhD) is an Internist who serves as the Chief of General Internal Medicine at the West Los Angeles Veterans Affairs Medical Center in Los Angeles. He is also the Director of the Southern California Evidence-based Practice Center based at the RAND Corporation in Santa Monica. MM (MPP) is the Associate Director of the Southern California Evidence-based Practice Center based at the RAND Corporation in Santa Monica. BJ (BS) was a Research Assistant and TP (BHM) was a Project Assistant. Both worked for the Southern California Evidence-Based Practice Center based at the RAND Corporation in Santa Monica.