Accountability is increasingly seen as central to improving equitable access to health services [1, 2]. Despite the fact that social accountability mechanisms are “multiplying in the broader global context of the booming transparency and accountability field” [3, p. 346], whether and how these interventions work to improve health is often not adequately described. Measuring effects of social accountability interventions on health is difficult and there is no consensus on how social accountability should best be defined, developed, implemented, and measured.

The term accountability encompasses the processes by which government actors are responsible and answerable for the provision of high-quality and non-discriminatory goods and services (including regulation of private providers) and the enforcement of sanctions and remedies for failures to meet these obligations [4]. The Global Strategy for Women’s, Children’s and Adolescents’ Health, 2016–2030 defines accountability as one of nine key action areas to, “end preventable mortality and enable women, children and adolescents to enjoy good health while playing a full role in contributing to transformative change and sustainable development” [2 p. 39]. The Global Strategy’s enhanced Accountability Framework further aims to “establish a clear structure and system to strengthen accountability at the country, regional, and global levels and between different sectors” [2].

Social accountability, as a subset of accountability more broadly, comprises “…citizens’ efforts at ongoing meaningful collective engagement with public institutions for accountability in the provision of public goods” [5 p. 161]. It has transformative potential for development and democracy [1, 6,7,8,9]. Successful efforts depend on effective citizen engagement, and the responsiveness of states and other duty bearers [3, 10]. Social accountability and collective action processes may contribute to better health and healthcare services by supporting, for example, better delivery of services (e.g., via citizen report cards, community monitoring of services, social audits, public expenditure tracking surveys, and community-based freedom of information strategies); better budget utilization (e.g., via public expenditure tracking surveys, complaint mechanisms, participatory budgeting, budget monitoring, budget advocacy, and aid transparency initiatives); improved governance outcomes (e.g., via community scorecards, freedom of information, World Bank Inspection Panels, and Extractive Industries Transparency Initiatives); and more effective community involvement and empowerment (e.g., via right to information campaigns/initiatives, and aid accountability mechanisms that emphasize accountability to beneficiaries) [10,11,12].

An early attempt to evaluate a social accountability intervention using an experimental study design was a 2009 paper presenting the evaluation of community-based monitoring of public primary health care providers in Uganda by Bjorkman and Svensson [13]. The authors conclude that, “…experimentation and evaluation of new tools to enhance accountability should be an integral part of the research agenda on improving the outcomes of social services” [13 p. 26]. Since then, various study designs have been used to assess social accountability initiatives. These include randomized trials, quantitative surveys, qualitative studies, participatory approaches, indices and rankings, and outcome mapping [10].

In common with other fields, social accountability interventions are increasingly popular in the area of reproductive, maternal, neonatal, child, and adolescent health (RMNCAH). Also in common with the broader area of social accountability, measuring effects of these interventions on RMNCAH is challenging.

In this paper, we review and critically analyze methods used to evaluate the health outcomes of social accountability interventions in the area of RMNCAH, to inform evaluation designs for these types of interventions.


Eligibility criteria

We searched for original, empirical studies published in peer-reviewed journals between 1 January 2009 and 26 March 2019 in any language. We included papers which described an evaluation of the health effects of interventions aiming to increase social accountability of the healthcare system or specific parts of the healthcare system, within a clearly defined population. We included papers that reported one or more RMNCAH outcomes. Because many papers did not include direct health outcome measures or commentary, we also included studies that reported on health service outcomes such as improvements in quality, on the grounds that this was likely to have some effect on health. Because we were interested in methods for measuring effects of social accountability interventions on health, we excluded papers that did not report at least one health (RMNCAH) outcome. For instance, we excluded papers which only discussed how the intervention had been set up or how it was received and did not mention any health-related consequences of the interventions.

We excluded papers that described only top-down community health promotion type initiatives (e.g., improving community response to obesity); interventions aiming to improve accountability of communities themselves (e.g., community responsibilities toward women during childbirth); clinician training interventions (e.g., to reduce abuse of women during childbirth); quality improvement interventions for clinical care (e.g., patient participation in service quality improvement relating to their own care and treatment and not addressing collective accountability); intervention development (e.g., testing out report cards as there was no evaluation of the effects of using these); natural settings where people held others to account (i.e., there was no specific intervention designed to catalyze this); or papers that exclusively discussed litigation and legal redress.

Information sources

We searched the following databases via Ovid: MEDLINE, EMBASE, and Social Policy & Practice. Both SCOPUS and The Cochrane Library were searched using their native search engines. All database searches were carried out on 28 August 2018 and updated on 26 March 2019. We reviewed reference lists and consulted subject experts to identify additional relevant papers.


We developed search terms based, in part, on specific methods for achieving social accountability as defined in Gaventa and McGee 2013 [10]. The search combined three domains relating to accountability, RMNCAH, and health. The complete search strategy used for all five databases is included in Table 1.

Table 1 Search terms
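
As a rough illustration of how a three-domain search of this kind is typically assembled, the sketch below joins synonyms within each domain with OR and then joins the domains with AND. The terms shown are placeholders borrowed from wording elsewhere in this paper; they are an assumption for illustration and are not the strategy reported in Table 1. The function names are likewise hypothetical.

```python
# Illustrative placeholders only: NOT the search terms reported in Table 1.
DOMAINS = {
    "accountability": ["social accountability", "community scorecard", "social audit"],
    "rmncah": ["reproductive", "maternal", "neonatal", "child", "adolescent"],
    "health": ["health", "health services"],
}


def or_block(terms):
    """Join the synonyms of one domain with OR, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(domains):
    """AND the per-domain OR blocks, so a record must match every domain."""
    return " AND ".join(or_block(terms) for terms in domains.values())


query = build_query(DOMAINS)
print(query)
```

Real database syntax (field tags, truncation, MeSH headings) differs between Ovid, SCOPUS, and The Cochrane Library, so a string like this would need adapting for each interface.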

Study selection

Papers were screened on title and abstract by CM and CRM and lack of agreement was resolved by VB. Full text papers were screened by CM and VB.

Data collection and data items

Data were extracted by CM and CRM. Data items included intervention, study aims, population, study design, data collection methods, outcome measures, social accountability evidence reported/claimed, cost, relationship between evaluator and intervention/funder, which theoretical framework (if any) was used to inform the evaluation, and if so, whether or not the evaluation reported against the framework.

Social interventions are complex and can have unexpected consequences. Because these may not always be positive, we were interested to explore how this issue had been addressed in the included studies. We extracted from the studies any discussion of how such negative effects were measured, whether they were measured, and whether any such effects were reported on. We defined harms and negative effects very broadly and included any consideration at all of negative impacts or harms, even if they were mentioned only in passing.

Because we were examining accounts of interventions that increase accountability in various ways, we were interested in the extent to which the authors included information that would promote their own accountability to their readers. We examined whether the studies contained information about the funding source for the intervention and for the evaluation, or any other information about possible conflicts of interest.

Risk of bias

For this review, we wished to describe the study designs used to evaluate social accountability interventions to improve RMNCAH. Papers reporting on interventions that aimed to affect comprehensive health services where the studies did not explicitly reference RMNCAH components (or which have not been indexed in MEDLINE using related keywords and/or MeSH terms) were not included. Interventions in general areas of health are likely to employ similar methods to evaluate social accountability interventions as those in RMNCAH-specific areas. However, if not, these additional methods would not have appeared in our search and will be omitted from the discussion below.

Synthesis of results

We present a critical, configurative review (i.e., the synthesis involves organizing data from included studies) [14] of the methodologies used in the included evaluations. We extracted data describing the social accountability intervention and the evaluation of it (i.e., evaluation aims, population, theoretical framework/theory of change, data collection methods, outcome measures, harms reported, social accountability evidence reported, cost/sustainability, and relationship between the funder of the intervention and the evaluation team). We presented the findings from this review at the WHO Community of Practice on Social Accountability meeting in November 2018, and updated the search afterwards to include more recent studies.


The review protocol is registered in the PROSPERO prospective register of systematic reviews (registration # CRD42018108252). This review is reported against PRISMA guidelines [15].


The search yielded 5266 papers and we found an additional six papers through other sources. One hundred and seventy-six full text papers were assessed for eligibility and of these, 22 met the inclusion criteria (Fig. 1).
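
The counts above imply a few intermediate figures that can be checked with simple arithmetic. The short sketch below uses only the numbers stated in the text; because no separate figure for duplicates is given here, it assumes that duplicates and title/abstract exclusions are combined in a single count. Variable names are illustrative.

```python
# Counts as stated in the text. No duplicate-removal figure is given there,
# so duplicates and title/abstract exclusions are lumped together below.
records_from_databases = 5266
records_from_other_sources = 6
full_texts_assessed = 176
studies_included = 22

records_identified = records_from_databases + records_from_other_sources
not_assessed_in_full = records_identified - full_texts_assessed
full_texts_excluded = full_texts_assessed - studies_included

print(f"Records identified: {records_identified}")
print(f"Removed before full-text assessment: {not_assessed_in_full}")
print(f"Excluded at full-text stage: {full_texts_excluded}")
print(f"Included: {studies_included}")
```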

Fig. 1 PRISMA 2009 flow diagram

Interventions measured

We took an inclusive approach to what we considered to be relevant interventions, as reflected in our search terms. Our final included papers referred to a range of social accountability interventions for improving RMNCAH. Eight types of interventions were examined in the included papers (Table 2).

Table 2 Intervention types

Study aims

To be included in this review, all studies had to report on health effects of the interventions and be explicitly orientated around improving social accountability. The different studies had somewhat different aims, with some more exploratory and implementation-focused, and some more effectiveness-orientated. Exploratory studies were conducted for maternal death reviews [16], social accountability interventions for family planning and reproductive health [17], civil society action around maternal mortality [18], community mobilization of sex workers [19], community participation for improved health service accountability in resource-poor settings [20], and exploring a community voice and action intervention within the health sector [21]. These aimed to describe contextual factors affecting the intervention, often focusing more on implementation than outcomes. Others explicitly aimed to examine how the interventions could affect specific outcomes. This was the case for studies of an HIV/AIDS programme for military families [22]; the effects of community-based monitoring on service delivery [13]; the effectiveness of engaging various stakeholders to improve maternal and newborn health services [23]; the acceptability and effectiveness of a telephone hotline to monitor demands for informal payments [24]; the effectiveness of CARE’s community score cards in improving reproductive health outcomes [25]; the effects of a quality management intervention on the uptake of services [26]; structural change in the Connect 2 Protect partnership [27]; efforts to improve “intercultural maternal health care” [28]; and whether and how scale-up of HIV services influenced accountability and hence service quality [29]. Some studies did not state their original aims clearly in the write-up but appeared to try to document both implementation and effectiveness, for example the papers reporting on scorecards used in Evidence4Action (E4A) [30, 31].

Study designs used

Study designs varied from quantitative surveys to ethnographic approaches and included either quantitative or qualitative data collection and analysis or a mix of both (see Table 3). Direct evidence that the intervention had affected social accountability was almost always qualitative, with quantitative data from the intervention itself used to show changes, e.g., in health facility scores. The possibility that those conducting the intervention may have had an interest in showing an improvement which might have biased the scoring was not discussed.

Table 3 Evaluation study design and data collection methods of the included studies

Qualitative data were essential to provide information about accountability mechanisms, and to support causal claims that were sometimes only weakly supported by the quantitative data alone. For example, this was the case in the many studies where the quantitative data were before-and-after type data that could have been biased by secular trends, i.e., where it would be difficult to make credible causal claims based only on those data. Qualitative data were primarily generated via interviews, focus group discussions, and ethnographic methods including observations.

Additionally, some papers contained broader structural analysis contextualizing interventions in relation to relevant, longstanding processes of marginalization. For instance, Dasgupta 2011 notes that “in addition to the health system issues discussed [earlier in the paper], the duty bearers appear to hold a world view that precludes seeing Dalit and other disadvantaged women as human beings of equivalent worth: you can in fact die even after reaching a well-resourced institution if you are likely to be turned away or harassed for money and denied care” [18, p. 9].

There were very few outcome measures reported in the studies which directly related to social accountability. Instead, they usually related to the intervention (e.g., number of meetings, number of action points recorded). Outcome measures included quantitative process measures such as total participants attending meetings (e.g., [16]), how many calls were made to a hotline (e.g., [24]), numbers of services provided, and outcome measures such as measures of satisfaction (e.g., [25, 32, 33]). Qualitative studies examined how changes had been achieved (for instance by exploring involvement of civil society organisations in promotion and advocacy), or perceptions of programme improvement (e.g., [20, 22, 34]). Many of the health outcomes were reported using proxy measures (e.g., home visits from a community health worker, care-seeking) [32, 35].

There were various attempts to capture the impact of the intervention on decision-making and policy change. For example, “process tracing” was used, “to assess whether and how scorecard process contributed to changes in policies or changes in attitudes or practices among key stakeholders” [23 p. 374], and “outcome mapping” (defined as, “emphasis on capturing changes in the behavior, relationships, activities, or actions of the people, groups, and organizations with whom an entity such as a coalition works”) [27, p. 6] was used to assess effects of the intervention on systems and staff.

Theoretical frameworks

In 10 out of 22 cases, we found an explicit theoretical framework that guided the evaluation of the intervention. In some additional cases, there appeared to be an implicit theoretical approach or there was reference to a “theory of change,” but these were not spelled out clearly.

Harms or negative effects reported

Studies which emphasised quantitative data either alone or as a part of a mixed methods data collection strategy did not report harms or intent to measure any. The only studies reporting negative aspects of the intervention—either its implementation or its effects—emphasised qualitative data in their reporting. Not all qualitative studies reported negative aspects of the intervention, but it was notable that the more detailed qualitative work considered a wider range of possible outcomes including unintended or undesirable outcomes.

Studies reporting any type of negative effect varied in the harms or other negative aspects of interventions they described, although complex relationships with donors were mentioned more than once. For instance, Aveling et al. note:

…relations of dependence encourage accountability toward donors, rather than to the communities which interventions aim to serve […] far more time is spent clarifying reporting procedures and discussing strategies to meet high quantitative targets than is spent discussing how to develop peer facilitators’ skills or strategies to facilitate participatory peer education. [22, p. 1594–5]

Some authors did not report on negative effects as such, but did acknowledge the limitations of the interventions they examined, for instance that encouraging communities to speak out about problems will not necessarily be enough to promote improvement [16]. Similarly, Dasgupta reported how, “[t]he unrelenting media coverage of corruption in hospitals, maternal and infant deaths and the dysfunctional aspects of the health system over the last six years, occasionally spurred the health department to take some action, though usually against the lowest cadre of staff” [18 p. 7] and “[w]hen civil society organizations, speaking on behalf of the poor initially mediated the rights-claiming to address powerful policy actors such as the Chief Minister, it did not stimulate any accountability mechanism within the state to address the issue” [18 p. 7]. In their 2015 study, Dasgupta et al. address the potential harms that could have been caused by the intervention (a hotline for individuals to report demands for informal payments) and explain how the intervention was designed to avoid these [24].

Costs and sustainability

Only four studies contained even passing reference to the cost or sustainability of the interventions. One study indicated that reproductive health services had been secured for soldiers and their wives [22]; one mentioned that although direct assistance had ceased, activities continued with technical support provided on a volunteer basis [28]; one (a protocol) set out how costs would be calculated in the final study [26]; and one mentioned in passing that a district had not allocated funds to cover costs associated with additional stakeholders [20].

Challenges to sustainability were noted in several studies [16, 20,21,22,23,24,25, 32, 33].

Accountability of the authors to the reader

Very few studies specified the relationship between the evaluation team and the implementation team and in many cases, they appear to be the same team, or have team members in common. In most cases, there was no clear statement explaining any relationships that might be considered to constitute a conflict of interest, or how these were handled.

Information about evaluation funding was more often provided, although again it was not clear whether the funder had also funded the intervention, or if they had, to what extent the evaluation was conducted independently from the funders.


Most studies reported a mix of qualitative and quantitative data, with most analyses based on the qualitative data. Two studies used a trial design to test the intervention—one examined the effects of implementing CARE community score cards [32] and the other tested the effects of a community mobilization intervention [36]. This relative lack of trials is notable given the number of trials related to social accountability in other sectors [3, 9]. The more exploratory studies which attempted to capture aspects of the interventions—such as how they were taken up—used predominantly qualitative data collection methods.

The studies we identified show the clear benefits of including qualitative data collection to assess social accountability processes and outcomes, with indicative quantitative data to assess specific health or service improvement outcomes. High-quality collection and analysis of qualitative data should be considered as at least a part of subsequent studies in this complex area. The “pure” qualitative studies were the only ones where any less-positive findings about the interventions were reported, perhaps because of the emphasis on reflexivity in analysis of qualitative data, which might encourage transparency. We were curious about whether there was any relationship between harms being reported and independence of studies from the funded intervention, but we found no particular evidence from our included studies to indicate any association. One study mentioned that lack of in-country participation in the design process led to lack of interest in using the findings to help plan country strategy [31].

It was notable that studies often did not specify their evaluation methods clearly. In these cases, methods sections of the papers were devoted to discussing methods for the intervention rather than its evaluation.

When trying to measure interventions intended to influence complex systems (as social accountability interventions attempt to do), it is important to understand what the intervention intends to change and why in order to assess whether its effects are as expected, and understand how any effects have been achieved. There was a notable lack of any such specification in many of the included studies. For example, there were few theoretical frameworks cited to support choices made about evaluation methods and, related to this, there were few references to relevant literature that might have informed both the interventions and the evaluation methodologies. The literature on public and patient involvement, for instance, was not mentioned despite this literature containing relevant experiences of trying to evaluate these types of complex, participatory processes in health. It is possible that some of the studies were guided by hypotheses and theoretical frameworks that were not described in the papers we retrieved.

Sustainability of the interventions and their effects after the funded period of the intervention was rarely discussed or examined. A small, enduring change for the better that also creates positive ripple effects over time may be preferable to larger, temporary effects that end with the end of the intervention funding. It would also be useful to discuss with funders and communities in advance what type of outcome would indicate success and over what period of time, to ensure that measures take into account what is considered important to the people who will use them. Sustainability and effectiveness are known to diminish after the funded period of the intervention [37]. Longer-term follow-up may be hindered because of the way funding is generally allocated over short periods. It would be interesting to see a greater number of longer-term follow-up studies examining what happened “after” the intervention had finished in order to inform policymakers about what the truly “cost-effective” programmes are likely to be. For example, some studies have traced unfolding outcomes after the intervention has finished; these may be important to take into account in any effectiveness considerations.

There was little transparency about funding and any conflicts of interest—which seemed surprising in studies of social accountability interventions. We strongly recommend that these details be provided in future work and be required by journals before publication.

A limitation of this study was that our searches yielded studies where accountability of health workers to communities or to donors appeared to be the main area of interest. A broader understanding of accountability might yield further useful insights. For instance, it seems likely that an intersectional perspective might put different forms of social accountability in the spotlight (e.g., retribution or justice connected with sexual violence or war crime, examining the differentiated effects on sexual and reproductive health, rather than solely accountability in a more bounded sense) [38]. By limiting our view of what “accountability” interventions can address within health, we may unintentionally imply broader questions of accountability are not relevant—e.g., effects of accountability in policing practices on health, effects of accountability in education policy on health, and so on.

With only a few notable exceptions, we lack broader sociohistorical accounts of the ways in which these interventions are influenced by the political, historical, and geographical context in which they appear, and how dynamic social change and “tipping point” events might interrelate with the official “intervention” activities—pushing the intervention on, or holding it back, co-opting it for political ends, or losing control of it completely during civil unrest. While the studies we identified did use more qualitative approaches to assessing what had happened during interventions, the scope of the studies was often far narrower than this—for instance lacking information on broader political issues that affected the intervention at different points in time. In future, studies examining health effects of social accountability interventions should consider taking a more theoretical approach—setting out in more detail what social processes are happening in what historical/geographical/social context so that studies develop a deeper understanding, including using and further developing theories of social change to improve the transferability of the findings. For instance, lessons on conducting and evaluating patient involvement interventions in the UK may well have a bearing on improving social accountability and its measurement in India and vice versa. Related to this, we note that although there is clear guidance from the evaluation literature that it is important to take a systems approach to understanding complex interventions, none of our included studies explicitly took a systems approach—applying these types of approaches more systematically to social accountability interventions is a fertile area for future investigation. 
Without such studies, we risk implying that frontline workers are the only site of “accountability” and, by omission, fail to examine the role of more powerful actors and social structures which may act to limit the options of frontline workers, as well as failing to explore and address the ways in which existing structural inequalities might hamper equitable provision and uptake of health services.

Terminology may be hampering transfer of theoretically relevant material into and out of the “social accountability” field. The term “social accountability” may imply an adversarial relationship where certain individuals are acting in bad faith. One of the studies in our review used different terminology, “collaborative synergy,” referring to the work of coalitions in the Connect2Protect intervention [27]. We speculate that lack of agreed, common terminology may hinder learning from other areas of research. The phrase “social accountability” is not commonly used in the patient and public involvement (PPI) literature, possibly because of the greater emphasis in high-income settings on co-production and sustainability compared with more of a “policing” emphasis in the literature reporting on low- and middle-income country (LMIC) settings. Yet one of the purposes of PPI interventions is to improve services and this may well include healthcare providers being held accountable for the services they provide. Litigation was outside the scope of this article, but legally enshrined rights to better healthcare are crucial and litigation is a key route to ensuring these rights are achieved in practice. A more nuanced account of these types of interventions in context would be valuable in understanding “what works where and why,” to inform future policy and programmes.

Dasgupta et al. comment on how hard it is to attribute change to any particular aspect of a social accountability intervention because successful efforts are led by individuals in many different roles whose relationships with one another are constantly changing and adapting. Attributing success is difficult because these changing relationships shape how and whether any individual can have an impact through their actions.

Evaluation tools, particularly those used within and for a specific time frame, have a limited capacity to capture the iterative nature of social accountability campaigns, as well as to measure important impacts like empowerment, changes in the structures that give rise to rights violations, and changes in relationships between the government and citizens. [24, p. 140]


Designing adequate evaluation strategies for social accountability interventions is challenging. It can be difficult to define the boundaries of the intervention (e.g., to what extent does it make conceptual sense to report on the intervention without detailing the very specific social context?), or the boundaries of what should be evaluated (e.g., political change or only changes in specific health outcomes). What is clear is that quantitative measures are generally too limited on their own to provide useful data on attribution, and the majority of evaluations appear to acknowledge this by including qualitative data as part of the evidence. The goals and processes of the interventions are inherently social. By examining social dimensions in detail, studies can start to provide useful information about what could work elsewhere, or provide ideas that others can adapt to their settings. More lessons should be drawn from existing evaluation and accountability work in high-income settings. The apparent lack of cross-learning or collaborative working between high-income country (HIC) and LMIC settings is a wasted opportunity, particularly when so much good practice exists in both. There are ample opportunities to learn from one another that are often not taken up, as is clear from the literature, which tends to be siloed along country-income lines. Finally, more transparency about funding and histories of these interventions is essential.