Background

Social accountability (SA) interventions, or the mechanisms and processes by which citizens and civil society groups hold the health system and its actors accountable for their commitments, are being used more frequently in health programming in developing countries. Such interventions seek to raise awareness among community members of their rights around health and of gaps in services, and to empower communities to engage with actors in the health system (e.g., providers) to improve health programming and health outcomes [1, 2]. SA interventions are complex, using a range of approaches and engaging diverse stakeholders in a process to understand problems (e.g., gaps in services) and to identify and take actions to solve them. Their design, implementation, and impact are also context specific, grounded in the social, economic, and political realities of the settings where they are implemented. This complexity, along with the extended pathways and time horizons for realizing community empowerment and health outcomes, creates evaluation challenges. Randomized controlled trials and experimental designs are not always feasible, and some outcomes are not directly measurable. Evaluations thus use a range of study designs, including mixed methods approaches and participatory research tools, to explore both health and governance-related outcomes. There is, however, little consensus on how best to evaluate SA interventions and how to estimate and measure change in outcomes.

In 2017, the World Health Organization organized a Community of Practice on Measuring Social Accountability and Health Outcomes (COP) to build consensus on outcome measures and evaluation designs. Participants, including practitioners and researchers, meet annually to share experiences, methodologies, and outcomes from their research and evaluation work, and to discuss how to translate research into action. One of the first products of the COP was a synthesis of evaluation designs for SA interventions in health, summarizing common designs, research questions, and how well the designs were implemented. Based on that synthesis and discussion during the COP meeting in 2018, participants identified limited detail and inconsistent reporting across SA studies as a key gap that hinders researchers in the field from summarizing and understanding the strength of the collective evidence on SA, from identifying best practices for replication in other contexts, and from identifying key contextual factors and mechanisms relevant to implementation [3]. As a first step toward improving the level of detail and consistency in reporting across studies, the COP charged a Reporting and Guidance Working Group (including authors of this paper) with developing a reporting checklist to be used by researchers and evaluators to improve the documentation of intervention processes, context, study designs, and outcomes in the peer-reviewed literature, in order to facilitate cross-study comparisons and shared learning around effective SA interventions and how they can be adapted and scaled. This paper outlines the steps we took to develop the Social Accountability Reporting for Research (SAR4Research) checklist for health programming.

Methods

We used a multi-step process to develop and refine the SAR4Research checklist (see Table 1 for the timeline). Below we describe how we identified gaps in reporting, adapted existing reporting guidelines to develop the checklist, and carried out worked examples to test and revise the proposed checklist.

Table 1 Timeline of checklist development

Developing the checklist

To develop the checklist, three authors (VB, LP, JK) carried out an umbrella review of eighteen systematic and narrative reviews of the SA literature to extract reporting limitations [4]. Our umbrella review sought to identify gaps in reporting on SA interventions in the peer-reviewed literature. To that end, we included systematic, landscaping, critical, narrative, or other reviews that included descriptions and/or results from SA interventions implemented in low- and middle-income countries and were published or disseminated between 2010 and 2020. Reviews could include SA interventions from a range of countries, covering a range of health topics and populations (e.g., rural, urban). To identify the reviews, we applied search terms related to SA (e.g., social accountability, scorecards, participatory interventions) and to evaluations (e.g., program evaluation, follow-up studies, outcome evaluation) in peer-reviewed (PubMed) and grey literature (Google Scholar) search engines. We also requested reviews from participants in the 2018 COP meeting and received two: one captured in our literature search and one in progress (i.e., published after the meeting) [2, 3]. Two authors (VB, JK) reviewed the abstracts, applied the selection criteria, and summarized the reviews, with a focus on reporting gaps. Next, we reviewed existing reporting guidelines, including recommendations for reporting on clinical and behavioral interventions evaluated with randomized controlled trials, quasi-experimental designs, or realist evaluations, on qualitative research, and on economic evaluations of health interventions [5,6,7,8,9]. We noted the items included (e.g., research design) and the information required for each item. We then compared the reporting gaps in SA against the reporting guidelines to assess whether any existing guideline could be adopted “as-is” for our purposes. Because none met our needs, we adapted one guideline that had been through a formal guideline development process [5, 10].
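For illustration only, a PubMed-style Boolean query combining the two groups of terms might take the following form; this is a sketch built from the example terms above, not a reproduction of our actual search strategy, and the specific phrasing and truncation are assumptions:

(“social accountability” OR scorecard* OR “participatory intervention*”) AND (“program evaluation” OR “follow-up studies” OR “outcome evaluation”)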

We presented the first draft of the checklist at the COP meeting in 2019. Based on feedback, we revised the checklist and drafted a narrative describing key issues for items in the checklist (e.g., explanation of mechanisms of effect). We shared the checklist and narrative with COP members via e-mail in May-June 2020, and incorporated their feedback into the checklist, which we then tested using worked examples.

Testing the checklist

To test the checklist, we carried out three worked examples. We requested examples from COP members and purposively selected examples that: (1) evaluated SA interventions using a randomized, quasi-experimental, or realist evaluation design, with the intent of including a mix of study designs; (2) were carried out in the last 5-7 years; (3) collected data from community members and stakeholders; and (4) reported on at least one health outcome, preferably in a peer-reviewed journal. Based on COP member recommendations, we identified one example in Uganda and two in Indonesia. For each, we engaged with the principal investigators to describe the checklist development, secure their agreement to participate in testing the checklist, and invite them to join us as co-authors (authors CT, AB, and AS).

The purpose of the worked examples was to assess whether items in the checklist were included in reports, and to better understand study investigators’ decisions about what information to include in one or a set of papers reporting on a study. Specifically, we considered: (1) whether information called for in the checklist was included in published or grey literature manuscripts; (2) whether the checklist omitted any domains or content areas that projects reported; and (3) when information called for in the checklist was not included in published or grey literature manuscripts, whether it was included in unpublished documentation.

For each worked example, we held initial conversations with at least one study investigator to describe our process, identify published and unpublished manuscripts and reports, and set the stage for further discussions about the checklist (e.g., what the checklist was, the worked examples, the need to revise and streamline). Then, one author (SE) conducted the data extraction and analysis, reading published and grey literature reports to identify whether items in the checklist were present and the degree to which they were covered. When checklist items were not present in papers, we discussed the internal documentation we reviewed (e.g., process documentation, draft reports not yet publicly available) with the study investigators (who joined us in authoring this paper). In these discussions, the investigators were able to shed light on whether the gaps could be filled (e.g., data collected but not reported) and how they decided whether or not to report specific information. Finally, we assessed how the checklist performed within and across the worked examples (i.e., whether information for each checklist element was included in at least one paper/report or in project files) in order to revise the checklist one last time, reducing overlap and making suggestions for depth of reporting.

Results

Gaps in reporting on social accountability found in the umbrella review

The literature review identified reporting gaps pertaining to: conceptual underpinnings; site description; study information; intervention; context; study design; outcomes; and analyses (see Table 2) [1,2,3, 11,12,13,14,15, 18,19,20,21]. For example, few studies described how interventions were expected to work or the pathways through which the intervention would produce outcomes. Site descriptions rarely provided characteristics of the organizations involved, existing social capital, or relationships between communities and leaders [1, 2, 13,14,15, 18, 20, 21]. In addition, few studies reported on the genesis of the intervention (e.g., grassroots, externally funded), details of the actors involved, the scale and process of implementation, the recourse mechanisms, or linkages with other efforts [2, 11,12,13,14,15, 19, 21]. Study designs, analyses, and outcomes were not always described in sufficient detail. One explanation may be the complexity of SA interventions and evaluations, for which reporting guidelines are needed. In addition, information on how funding and the relationship between implementation and evaluation teams may have influenced the evaluation was sometimes missing [12, 14, 15]. Reasons for the gaps were not always addressed in the reviews. See Marston et al. (2020) for details of what was reported [3].

Table 2 Reporting gaps identified in evidence reviews

Existing guidelines and the initial “Social Accountability Reporting for Research (SAR4Research)” checklist

None of the reporting guidelines we reviewed addressed all the reporting gaps flagged in our literature review [5,6,7, 9, 10, 23,24,25,26,27,28,29,30]. For example, although most called for a description of implementing partners and intervention sites, none called for details about the power or other relationships between implementers and participants, or considered the range of contextual factors that influence the implementation and outcomes of SA interventions. Further, only the RAMESES guidelines for realist evaluations captured study designs that include both quantitative and qualitative components, a characteristic of many SA evaluations [9]. Because it had recently gone through a rigorous development process and because CONSORT guidelines are routinely used in public health, we selected the CONSORT-SPI guidelines as the basis for our checklist [5, 10].

We augmented the CONSORT-SPI guidelines to capture the unique components of SA interventions and evaluations, such as diverse contextual conditions and actors, issues around equity and representation, complex and non-linear SA processes, and pathways from intermediate- to longer-term community empowerment and health outcomes. To do so, we drew from other relevant guidelines such as RAMESES and CICI [29, 31]; for example, we drew on the CICI recommendations for items related to reporting on context [31]. We also added content to elicit more information on key reporting gaps such as context, mechanisms of effect, and longer-term outcomes.

The first draft of the SAR4Research checklist contained six sections, corresponding to the typical sections of peer-reviewed articles: Title and abstract (1a-b); Introduction (2a-d); Methods (3a; 4a-b; 5a-d; 6a-c; 7a-b); Results (8; 9; 10a-b; 11; 12a-b; 13; 14a-b; 15); Discussion (16-18); and Important information. The checklist targeted researchers reporting on the implementation and/or evaluation of SA interventions. It was designed to be applicable to the various methodologies used to study SA, notably qualitative, quantitative, and mixed methods approaches, as well as a range of study designs (e.g., randomized controlled trials, quasi-experimental designs, qualitative case studies). The original draft of the checklist is available on request.

SAR4Research checklist review and testing

Feedback on the first draft of the checklist (November 2018) from COP members emphasized the need to clarify the purpose of the checklist, to streamline it by reducing the number of items and the redundancy across sections, and to test it on available case studies to determine whether all items are practical (i.e., whether study teams have data to report). In addition, because the checklist is intended to be responsive to different study designs and methodologies, COP members encouraged us to enhance the description of each item to ensure that users could easily identify the items relevant to their study. We clarified the items but did not reduce their number.

The revised draft of the checklist was then applied to three worked examples: the Transparency 4 Development scorecard application in Indonesia; the ACT Health citizen report card application in Uganda; and the World Vision application of citizen voice and action in Indonesia [32,33,34]. Summaries of the interventions implemented, research methods, and key findings are provided in Appendix 1.

We then compared the checklist items reported in each of the worked examples (see Appendix 2). Overall, no worked example covered every item in the checklist in a single paper. Looking across papers from a study and internal project documentation (based on discussion with study investigators), information for most, but not all, elements was reported or available. However, none of the worked examples provided keywords in the abstract (item 1c) or intervention components such as costs (item 5d), and all had no or limited discussion of harms (item 15) and of generalizability/external validity (items 16-17). All three worked examples contained information about the SA intervention description, as well as some, if not all, details about the local context shaping the intervention. In our discussions, study investigators indicated that they did have additional information that could fill some gaps, but they either did not have space to include all of it in one paper or were still working on papers to fill in the gaps.

Checklist finalization

Based on the worked examples and our discussions, we removed repetition within and between sections to streamline the checklist. We also separated the reporting of methods and results for quantitative and qualitative methods, to clarify what should be reported for each type of study. For the few items for which none of the three examples had collected information, we considered whether to retain the item. In all instances, we decided to retain the items because they had been identified as gaps in the umbrella review. For example, we retained items on the content of the intervention because of their importance for interpreting SA design, implementation, and evaluation.

The final SAR4Research checklist (Brief version)

The final checklist contains six sections, each with several items that aim to ensure that reporting is robust, comprehensive, and comparable across studies, and that it contributes to the body of knowledge around SA (Table 3). To make the checklist feasible to use, research teams planning multiple papers should consider what information to provide in each paper. For example, detailed information describing the evaluation and the intervention protocol can be cited in outcome papers. Thus, authors should consider, in advance, the sequencing of papers and grey literature reports and the depth of reporting on particular checklist items in each paper/report, and provide cross-citations among study papers and reports. Another option is to include clear and concise explanations of some checklist elements in one or more annexes to published papers, particularly as more journals allow supplementary materials. These options will enable readers to develop a better understanding of the approach being evaluated, whether the evaluation design met the research objectives, and whether the results can be generalized to their own setting. Appendix 3 provides an explanation and elaboration of the final checklist.

Table 3 Final SAR4Research reporting checklist (expanded)

Discussion

We developed and tested a reporting checklist to ensure that the design, implementation, and evaluation of SA interventions are reported more comprehensively and consistently by researchers in peer-reviewed articles. The motivation to develop the checklist stems from COP discussions of the problems associated with reporting gaps, including our inability to identify patterns across studies in what works and in which contextual factors are most important to consider in implementation. Although our review of reviews was not systematic, the reviews were consistent in the gaps they reported. The reviews included in our analysis, and our own experience in SA, suggest that the causes of these gaps are many, including cases where a robust evaluation was not planned, journals’ word limits, the volume of documentation and evaluation materials produced by study teams, and an underappreciation of process details in favor of major results. The SAR4Research checklist may not address all these gaps, but it aims to highlight the multiple factors that need to be better understood to build an evidence base for the effectiveness of SA interventions and to provide more guidance on their design and implementation.

To the best of our knowledge, this checklist is the first attempt to address a gap in reporting for SA. It is in line with other efforts to improve the reporting, synthesis, and use of findings from experimental studies, quasi-experimental studies, and implementation research, with the aim of improving and applying the evidence base around health programming [35,36,37]. For example, the WHO Programme Reporting Standards for Sexual, Reproductive, Maternal, Newborn, Child and Adolescent Health call for information on the context and stakeholders, recognizing the importance of both and the lack of attention to these elements in reporting guidelines for research studies [38]. In addition, assessments of implementation research to improve health programs identify the importance of adaptation and the need to understand when and how adaptations are made, thus suggesting the importance of documenting results of adaptive designs [37].

The final checklist aims to be flexible and versatile, irrespective of the SA intervention implemented and the evaluation design. We explored whether it would be feasible to report on all components in one article. In practice, however, each of our worked examples had several associated papers documenting the intervention design, implementation, and evaluation, with SAR4Research items spread across several papers and reports. Furthermore, research on SA is at its core interdisciplinary and is therefore published across diverse peer-reviewed journals and grey literature reports. These journals’ word limits for research and review articles vary significantly, with limits in health and biomedical journals being much tighter than in the social sciences. Given this insight, which is supported by our worked examples, the purpose of the reporting checklist has shifted from a checklist for a single paper to a checklist of the information to be reported across the compendium of documents that summarize a single study. Where possible, we recommend that authors cite other study papers when there is insufficient space to provide detail on each item in the checklist. This allows readers to understand the broader picture of the intervention and its effects.

Better reporting on SA is timely and relevant to support meaningful community engagement and to strengthen accountability in health systems as part of the broader Universal Health Coverage movement and the achievement of the Sustainable Development Goals [39]. Better reporting would also enhance the interpretation of findings and the comparison of results across settings – all of which are necessary to justify the long-term efforts needed to sustain and institutionalize accountability mechanisms.

Limitations

Although we strove for comprehensive recommendations for reporting, we recognize several limitations in our methods. First, the checklist is intended for reporting in peer-reviewed articles, and thus may not meet the needs of implementers preparing monitoring or learning reports, or of emergent SA interventions, which often have less quantitative data to report. Furthermore, public health and clinical journals have much shorter word limits than social science journals, an important barrier to reporting, particularly detail on intervention context and components. Thus, full reporting of the complexity of SA will require multiple papers/reports, often in different outlets. We did not assess the feasibility of using the checklist from the authors’ perspective, nor were we able to use it to determine which items to report in different kinds of papers. Because a growing number of SA interventions are evaluated with mixed methods studies, modifying reporting recommendations designed for RCTs to meet the needs of other evaluations may lead to underreporting of important information about some study designs. Last, the worked examples used in our test of the checklist are not representative of the larger body of SA interventions; smaller studies implemented locally, without sufficient resources, could face different reporting challenges.

Conclusions

Results of SA evaluations will be more useful to researchers and practitioners when study designs, context, and interventions are described fully in manuscripts. The checklist aims to improve the reporting, synthesis, and use of findings from a range of study designs, contributing to an evidence base around SA that can help inform future programming and more accountable health systems. The checklist will help authors identify and prioritize the relevant information to provide, and sufficient information will help researchers identify emerging findings and gaps in the literature that they might address in their own work. As with any reporting checklist, refinements are to be expected. The authors welcome feedback on the checklist as part of the wider effort to improve reporting on, and understanding of, SA.