Process Tracing the Policy Impact of ‘Indicators’

In recent years, a range of new indices, benchmarking and scorecard tools—also known as ‘indicators’—have been developed to influence public policy and to promote accountability. While subjected to important technical and political critiques, the policy impact of ‘indicators’ is often assumed yet rarely demonstrated. Suitable evaluative methods are in their infancy. This article adopts an innovative process tracing analysis to assess the policy impact of the Hunger And Nutrition Commitment Index (HANCI) in Bangladesh, Malawi, Nepal, Zambia and globally. We present a rare and empirically rich application of this systematic qualitative evaluative method. We further contribute to the theorisation of ‘indicators’ by positing a central role for equitable producer–user relations in mediating policy impact, and demonstrate that such relations can overcome significant political critiques on ‘indicators’.


Introduction
In recent years, reducing hunger and malnutrition has come to be viewed as being as much an outcome of a political process as of technical interventions. Political commitment is now seen as essential for bringing food and nutrition security higher up on public policy agendas (FAO et al. 2014; Foresight 2011; Gillespie et al. 2013). As a consequence, a range of new metrics, analytics and scorecard tools have proliferated to assess political commitment and to promote accountability for reducing hunger and malnutrition. These include the Global Hunger Index (WHH et al. 2012), the Access to Nutrition Index (GAIN 2013), The Economist's Global Food Security Index (EIU 2012), World Health Organization (WHO) Nutrition Landscape Analyses (Engesveen et al. 2009) and the Hunger And Nutrition Commitment Index (te Lintelo et al. 2013), amongst others.
These instruments mark part of a global trend, with Davis et al. (2012, pp. 73-74) defining an indicator as "a named collection of rank-ordered data that purports to represent the past or projected performance of different units" […] "used to compare particular units of analysis (such as countries, institutions, or corporations), synchronically or over time, and to evaluate their performance by reference to one or more standards." Typically, 'indicators' aim to support accountability; to monitor, evaluate, influence and reform public policy; and to affect the attitudes, behaviours and actions of governments and their bureaucratic apparatus.
Both 'design' and 'political' traditions typically treat 'indicators' as a dependent variable (that requires explanation), yet their policy impact is often assumed but rarely demonstrated. Hence, there is now a distinct empirical and theoretical need to instead consider 'indicators' "as explanatory variables and look for their impact on specific policy innovations" (Kelley and Simmons 2015, p. 68). Such a call also aligns with growing demands from policymakers, including international aid donors, for practitioners and researchers involved in producing 'indicators' to demonstrate impact, for instance in the shape of policy influence. This, in turn, raises methodological questions about the ways in which we can ascertain causality in case studies of 'indicators' and their external validity (Stern et al. 2013) while taking account of interactions between 'indicators' as intervention and their dynamic contexts (Byrne 2013).
Accordingly, this article makes two main contributions to debates about the policy impact of 'indicators'. The first contribution is methodological. Although 'indicators' are frequently used in policy advocacy, evaluating their policy impact is often complicated. Quantitative impact evaluation methods are ill-suited to the task, as these require, for instance, counterfactuals and control over implementation. While newly emergent qualitative systematic evaluation tools have promise, they have not yet been tested widely (Naeve et al. 2017), and their application to 'indicators' is extremely rare. Accordingly, we draw on case study evaluation literatures and offer a process tracing (PT) analysis to interrogate the policy impact of 'indicators'. We present a detailed case study of the Hunger And Nutrition Commitment Index (HANCI). First issued in 2013, this index systematically compares and ranks 45 high-burden countries along a set of 22 policy, legal and financial variables that express government political commitment to address hunger and undernutrition (www.hancindex.org). In particular, we present evidence on the use of HANCI in Bangladesh, Nepal, Malawi, Zambia and in the global sphere.
Secondly, this paper adds to the theorisation of pathways through which 'indicator' impact can be achieved. In particular, we challenge the dominant theoretical model and empirical practice that poses a dichotomy between 'indicator' producers and users, and emphasises the technical rigour and communicative appropriateness of 'indicators'. Rather, we assert that equitable producer-user partnerships can not only be catalytic in achieving impact, but also successfully confront important critiques on 'indicators'.
Following this introduction, in the next section we review an emerging global literature on 'indicators'. We then present the study methodology, results and a discussion, followed by a brief conclusion.

Theory
'Indicators' involve the selection, compilation, simplification, aggregation, filtering and naming of the resulting numeric product and are used to evaluate the performance of states, private-sector actors or international bodies (Davis et al. 2012). While differing in aims, composition, sectoral and country coverage, 'indicators' typically comment on policies (e.g. governments having nutrition policies), social practices (e.g. the rates of adoption of best practices for infant and young child feeding) and private-sector or government qualities (e.g. political commitment) (Kelley and Simmons 2015).
The theorisation of 'indicators' is broadly located within a Foucauldian approach to analysing power. Focussing on the 'conduct of conduct', such analyses interrogate the governmental and social technologies and forms of knowledge that monitor and steer people's behaviour, thinking and moral practice, often from a distance (Dean 2010). Within the realm of food governance, such analyses have been fruitfully applied to a very diverse set of issues: from the disciplining effects of diets (Ristovski-Slijepcevic et al. 2010), to the management of refugee pig farms (Wing Chan and Miller 2015), and the transformation of rural life through the moral economic force of agricultural grades and standards (Busch 2000).
There is now broad agreement that 'indicators' are valuable to policymakers, private-sector actors, researchers and civil society groups (Davis et al. 2012;Høyland et al. 2012;Merry 2011). They are often used to draw attention to social problems, to analyse causes or consequences of policy interventions, to promote social justice and policy reform strategies (Davis et al. 2012;Larner and Le Heron 2004;Parks et al. 2015;Ravallion 2011) and to hold political leaders accountable to international standards (Kelley and Simmons 2015;Rosga and Satterthwaite 2009). Civil society groups increasingly use 'indicators', as funders demand them to demonstrate quantifiable evidence of their activities' impact (Merry 2011). The United States Agency for International Development (USAID) and the World Bank use the Bank's World Governance 'indicators' and the Ease of Doing Business 'indicators' to decide on foreign aid allocations (Davis et al. 2012). And in the private sector, 'indicators' are used to advise investors on political risks, for corporate social responsibility reviews and for selection of locations for foreign direct investment (Davis 2014).
Evidence is growing that 'indicators' can influence government policy, as shown in studies of e.g. the Ease of Doing Business Index (Davis 2014;Schueth 2011), the Corruption Perceptions Index (Galtung 2006) and the US State Department's Trafficking in Persons Report (Davis et al. 2012;Kelley and Simmons 2015). Surveys with decision-makers in developing countries also affirm indicator influence (Parks et al. 2015). Otherwise, evidence of their influence can be found in their contestation (Hansen and Muehlen-Schulte 2012, p. 458). Ratings are cited, discussed and sometimes excoriated, indicating their power to draw attention and set the terms of policy debates (Kelley and Simmons 2015, p. 59). Policy actors are more sensitive to rankings and numbers than to texts and words (Hansen and Muehlen-Schulte 2012, p. 457), not least because popular and political debates tend to erroneously interpret rankings as highly accurate (Høyland et al. 2012).
The influence of 'indicators' is deemed greatest in framing and agenda-setting processes (Parks et al. 2015). In the field of nutrition, civil society organisations (CSOs) are documented to have played a significant role in policy advocacy (Mejia Acosta and Haddad 2014;Pelletier et al. 2013). Their deep insight into the nuts and bolts, and the politics of local systems of government can help to connect marginalised citizens to policy debates, and facilitate political entrepreneurship. Policy advocates engage decision-making processes by offering targeted framings of a problem and its solutions (Shiffman 2007), but the credibility of advocacy claims requires solid evidence too (Gillespie et al. 2013). 'Indicators' could provide such data.
Usually, global 'indicators' harbour an implicit theory of change that proposes that technically sound and effectively communicated 'indicators' somehow will affect policy stakeholders. Kelley and Simmons (2015) have made an important start opening this causal black box (Fig. 1). They argue that states, intergovernmental organisations (IGOs) and private actors employ 'indicators' to inform domestic politics, encourage elite peer shaming and generate international pressure, to produce political, reputational or material consequences that drive state behaviour to reprioritise and change policy and reform law (Kelley and Simmons 2015).
Generally, policymakers' responses depend on their subjective regard for the rating body. They seek to learn how to improve scores by consulting policy advice issued in reports. While decision-makers generally dislike low ratings, they also respond to and publicly take credit for improving country performance, with positive rankings stimulating efforts to maintain these (ibid). Moreover, policymakers like 'indicators' because decision-making processes that rely on these can be presented as efficient, consistent, legitimate, transparent, scientific and impartial (Davis 2014;Davis et al. 2012, p. 84).
As the global popularity of 'indicators' has soared, a growing body of critique has emerged, using technical and political lenses. Technical appraisals consider the validity and reliability of 'indicator' design, question whether they help us to understand the phenomena they seek to measure and ask if outcomes such as rankings are interpreted correctly (Davis 2014;Høyland et al. 2012). Others challenge the lack of transparency in 'indicator' construction and calculation (Decancq and Lugo 2010;Merry 2011;Ravallion 2011). This is important because the concepts that 'indicators' seek to capture are often difficult to measure, whether it concerns rule of law, ease of doing business or political commitment.
A second set of critiques underlines that 'indicators' constitute a political act, with knowledge and governance effects (Merry 2011). 'Indicators' exercise power through their ability "to name, to define and to describe certain people and places as being different from others and in a way that excludes other definitions" (Larner and Le Heron 2004, p. 219). They generate a 'politics by numbers' by facilitating comparisons among units and over time and by establishing 'standards' against which comparisons are made (Davis et al. 2012, p. 77). Most 'indicators' routinely observe and check the progress or quality of a policy, practice or condition over a period of time, to encourage self-monitoring and self-regulation, underpinned by peer-shaming mechanisms that pressurise those who are revealed to 'underperform' (Kelley and Simmons 2015; Merry 2011, p. S85). 'Indicators' hence are not simply a neutral tool of measurement providing sources of knowledge about actors, societies and states, but also a means of governing them. They create new fields of competition and bring their own spaces and subjects into existence (Larner and Le Heron 2004, pp. 215, 219). Producers are often motivated by the ability of 'indicators' to attract attention to their causes (Büthe and Mattli 2011). Yet, 'indicators' "typically conceal their political and theoretical origins and underlying theories of social change and activism" (Merry 2011, p. S84). One core plank of the political critique of 'indicators' hence concerns their ability to depoliticise. Intended to be easy to understand and ready to be consumed by policymakers, 'indicators' trade usability against the over-simplification of complex context-specific phenomena (Davis et al. 2012; Larner and Le Heron 2004, p. 214; Merry 2011).
Furthermore, as relatively few people have the technical expertise and resources to understand how 'indicator' scores are determined, they concentrate power among technocrats, 'expert' producers, users and sponsoring organisations (Merry 2011;Davis et al. 2012). Finally, as 'indicators' are typically designed and labelled in the global north (Merry 2011), they offer limited information about local conditions (Davis 2014).

Materials and Methods
Up until about two decades ago, political scientists and development economists largely relied on econometric models to establish causal claims (Voors 2018, p. 80). Much methodological innovation regarding the application of case studies in impact evaluation has occurred since (Stern et al. 2013). The world is now commonly understood as composed of complex systems that mediate causal effects of interventions in non-linear ways. Exploring the impact of 'indicators' thus requires understanding interactions between the intervention (e.g. HANCI), the context and the people involved in these (Byrne 2013;Yin 2013). Importantly, there are potentially multiple causal paths to the same outcome, where "each path is a specific conjunction of factors" (Mahoney and Goertz 2006, p. 237). By detecting configurations of causal factors in complex systems, case studies can support inductive social science theory building (Byrne 2013).
The process tracing (PT) method is designed to be applied in complex contexts where competing causal explanations may be found for observed outcomes (Beach and Pedersen 2013;Collier 2011). We employ a theory building PT approach within a single-case research design. Firstly, we postulate a causal mechanism for how HANCI may have achieved policy impact, and empirically test for its validity. We analyse whether the theorised mechanism is present, and whether it functions as expected. Evidence presented covers the period from the inception of HANCI in April 2012 until November 2015.
We then build on this analysis to critically reflect on current critiques on 'indicators', and synthesise a new generic causal mechanism that seeks to explain how 'indicators' can achieve policy impact, proposing its general applicability across a range of policy contexts and 'indicators'. Impact is understood in terms of policy framing and agenda-setting, as these are areas in which 'indicators' can have most influence (Parks et al. 2015). Shifts in policy framing can be understood as changes in the way that policymakers understand and talk about a social problem or the possible responses to it (Chong and Druckman 2007). The HANCI project frames hunger and undernutrition as issues of political commitment, rather than as a matter of inadequate food production, overpopulation, poverty or other frames that could be adopted. It posits that credible evidence on political commitment could persuade non-policy elite stakeholders, such as civil society groups, and policy elites, such as senior government officials, ministers and parliamentarians, to adjust their own framings of hunger and undernutrition in such terms (te Lintelo et al. 2014).
We investigate impact in four countries with high burdens of undernutrition in which HANCI producers collaborated closely with civil society groups: Bangladesh, Nepal, Malawi and Zambia. We also consider any impact at the wider international level. The selection of case countries was based on a programmatic research design, reflecting funder prioritisation as well as presence of interested local partners. The small number of quite different countries in our HANCI study precluded the use of conventional experimental designs which, as Yin (2013, p. 323) argues, require the availability of a sufficiently large number of cases that can be divided into two (or more) comparison groups. Case study evaluations must thus rely on other techniques such as process tracing.
PT methodology encourages us to understand any policy impact (Y) as the outcome of a causal mechanism made up of individually necessary 'parts', each consisting of 'entities' (objects/actors/institutions) that engage in 'activities' to jointly transmit causal forces. In order to make plausible claims about the validity of the causal mechanism, we need to observe for each part (a) whether the mechanism is present or absent in the case and (b) whether the mechanism and its parts functioned as expected. However, even if (a) and (b) are confirmed, we cannot yet make logical claims about whether the mechanism is sufficient or necessary to explain Y (Beach and Pedersen 2013, pp. 15-18). We therefore need to scrutinise the inferential strength of the evidence for each posited part of the causal mechanism, as well as for alternative explanations (rival hypotheses). Accordingly, we first identify and collect the kinds of evidence that we expect to see if the part is valid as well as those kinds that would refute it. Collected evidence is then assessed along two dimensions: uniqueness (sufficiency) and certainty (necessity). As 'indicators' typically involve multiple causality, separating out 'uniqueness' and 'certainty' is particularly important (Stern et al. 2013). Uniqueness entails empirical predictions that cannot be explained by other theories or causal mechanisms. Hence, such evidence has confirmatory power. Uniqueness corresponds to a low likelihood ratio. Certainty, on the other hand, expresses what kind of evidence must be present for the postulated parts of the causal mechanism to be correct. Barnet and Munslow (2014) summarise the work of Beach and Pedersen (2013) in slightly different terms of theory (≈causal mechanisms) and hypotheses (≈component parts), to argue that theory testing requires seeking evidence that would (be minimally needed to) confirm the hypothesis (providing certainty) and evidence that would refute it, and then identifying tests for uniqueness.
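The link between uniqueness and the likelihood ratio can be made concrete with a small Bayesian sketch. It is an illustration only: it assumes the likelihood ratio is defined as p(e|¬h)/p(e|h), which is one common convention in the process tracing literature, and the function names and probabilities are hypothetical rather than estimates from this study.

```python
def likelihood_ratio(p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Ratio p(e|~h) / p(e|h); lower values mean the evidence is more unique to h."""
    if p_e_given_h <= 0:
        raise ValueError("p(e|h) must be positive")
    return p_e_given_not_h / p_e_given_h


def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' theorem: confidence in hypothesis h after observing evidence e."""
    numerator = prior * p_e_given_h
    denominator = numerator + (1 - prior) * p_e_given_not_h
    return numerator / denominator


# Hypothetical numbers, chosen only to show the direction of updating.
# Evidence that rival explanations rarely predict (low likelihood ratio):
unique = posterior(prior=0.5, p_e_given_h=0.6, p_e_given_not_h=0.06)
# Evidence that rival explanations predict almost as often (ratio near 1):
common = posterior(prior=0.5, p_e_given_h=0.9, p_e_given_not_h=0.81)
```

Under these assumed values, finding the highly unique evidence raises confidence in the hypothesis far more than finding the evidence that rival explanations also predict, even though the latter is more likely to be observed.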
Consequently, one can identify four broad test types (Fig. 2): straw-in-the-wind, hoop, smoking-gun and doubly decisive tests. The tests are classified according to whether passing the test is necessary and/or sufficient for accepting the inference (Collier 2011, p. 825). Straw-in-the-wind tests are not of much value for our purposes, while smoking-gun and doubly decisive tests have low levels of likelihood but great affirmative value for the proposed hypotheses, if passed. Hoop tests do not confirm a hypothesis, but they can eliminate it. When passed, they help to enhance certainty in the relevance of the posited part. But when they fail, they declare the part invalid. We use hoop and smoking-gun tests in the analysis.
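The two-by-two classification described above can be sketched in a few lines. This is a minimal illustration of the standard necessary/sufficient typology; the names `PTTest`, `classify_test` and `interpret` are hypothetical, and the interpretation strings are a simplification of the inferential consequences of passing or failing each test.

```python
from enum import Enum


class PTTest(Enum):
    STRAW_IN_THE_WIND = "straw-in-the-wind"
    HOOP = "hoop"
    SMOKING_GUN = "smoking-gun"
    DOUBLY_DECISIVE = "doubly decisive"


def classify_test(necessary: bool, sufficient: bool) -> PTTest:
    """Classify a test by whether passing it is necessary and/or sufficient
    for accepting the causal inference."""
    if necessary and sufficient:
        return PTTest.DOUBLY_DECISIVE
    if necessary:
        return PTTest.HOOP
    if sufficient:
        return PTTest.SMOKING_GUN
    return PTTest.STRAW_IN_THE_WIND


def interpret(test: PTTest, passed: bool) -> str:
    """Simplified reading of what a pass or fail implies for the hypothesis."""
    if test is PTTest.DOUBLY_DECISIVE:
        return "confirmed" if passed else "eliminated"
    if test is PTTest.HOOP:
        return "remains viable" if passed else "eliminated"
    if test is PTTest.SMOKING_GUN:
        return "confirmed" if passed else "not eliminated"
    return "slightly strengthened" if passed else "slightly weakened"
```

For example, `interpret(classify_test(necessary=True, sufficient=False), passed=False)` returns `"eliminated"`, reflecting that a failed hoop test invalidates the posited part.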
We draw on a range of evidence types. Documentary sources include annual or multi-year work plans, strategic plans, advocacy and media strategies, press releases, campaign materials or presentations at high-level policy forums. We further use testimonial evidence obtained through a limited number of key informant interviews with government officials, civil society and political leaders.
Finally, we note that, while evaluation methodologies often posit that evaluators should be objective and distanced from the subject being evaluated, the authors of this study instead fulfilled a 'developmental evaluator' role (Patton 2011). Developmental evaluation allows for capturing emerging features within complex systems. Having taken this role, we take particular care not to make definitive judgments about success or failure (Coffman 2009, p. 11). Yet, PT advances an unbiased assessment of the causal mechanism by specifying anticipated and actual evidence, explicating how we interpret the strength of presented evidence and discussing where evidence is absent. PT's inferential logic enables us to make stronger causality claims than ordinary qualitative case studies (Beach and Pedersen 2013; Collier 2011). Nevertheless, full verification of HANCI's impact through an independent evaluation at a later stage may draw out further learning.

Fig. 2 Four types of process-tracing tests for causal inference. Source: Bennett (2010, p. 210) and Van Evera (1997, pp. 31-32), adapted in Barnet and Munslow (2014, p. 20)

Results
The proposed causal mechanism for the policy impact of HANCI contains three parts (Fig. 3). We run through these from left to right.
First, we establish the presence of independent variable X: the HANCI 'indicator'. This is evidenced by a list of published research and communications products for 2012-2015 that include: evidence reports and a learning partnership report, an updated website (www.hancindex.org), one animated film, infographics/maps, slideshows and journal articles (Food Policy, World Development); and for each new issue of HANCI: scorecards with rankings and data for 45 countries and international/country-specific press releases, tweets and blogs. While 'indicators' often "rely on practices of measurement and counting that are [themselves] opaque" (Merry 2011, p. S84 ed.), HANCI reports transparently outlined methodological choices and their effects on rankings by conducting statistical sensitivity analyses. Country scorecards showed data, sources and reference years, while the website enabled visitors to see how personalised weighting choices affected country rankings. 'Indicators' in many areas conceal underlying theories of change (Merry 2011). A recent review of 22 'indicators' in the field of nutrition, however, underlined that HANCI is rare in explicitly setting out a theory of change from the start (Results for Development 2019). We next look step by step at the three parts of the proposed causal mechanism.
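To illustrate why transparency about weighting matters, the following sketch shows how a change in weights can reorder a ranking. It is not HANCI's actual aggregation method: the weighted-average rule, the function name `rank_units`, the unit labels and all scores are invented for illustration.

```python
def rank_units(scores: dict[str, dict[str, float]], weights: dict[str, float]) -> list[str]:
    """Rank units by the weighted average of their theme scores (higher is better)."""
    total_weight = sum(weights.values())
    composite = {
        unit: sum(weights[theme] * value for theme, value in themes.items()) / total_weight
        for unit, themes in scores.items()
    }
    return sorted(composite, key=composite.get, reverse=True)


# Invented scores for three hypothetical units on two themes (0-1 scale).
scores = {
    "Unit A": {"hunger": 0.9, "nutrition": 0.3},
    "Unit B": {"hunger": 0.5, "nutrition": 0.6},
    "Unit C": {"hunger": 0.2, "nutrition": 0.8},
}

equal = rank_units(scores, {"hunger": 1.0, "nutrition": 1.0})
nutrition_heavy = rank_units(scores, {"hunger": 1.0, "nutrition": 3.0})
```

With equal weights the hypothetical order is A, B, C; tripling the weight on the nutrition theme reverses it. A sensitivity analysis of this kind lets users judge how far a ranking depends on the producer's weighting choices.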
Fig. 3 (diagram not reproduced; recoverable labels follow). Theoretical level. X: HANCI is produced. Part 1 of causal mechanism, Activity 1: promote access to and use of HANCI through communication strategies and products, and targeted partnership activities. Part 2 of causal mechanism, Activity 2: HANCI evidence is used to inform policy framing of hunger and nutrition as an issue of political commitment and/or to guide programmatic and funding decisions. Part 3 of causal mechanism, Entity 3 (non-elite policy stakeholders), Activity 3: shaming/praising, mobilising, advocacy, information provision and media tactics to influence policy framings of hunger and nutrition as an issue of political commitment. Y: policy elites express new understandings and framings in political and policy debates and/or set new policy agendas. Step 1: infer existence of causal mechanism.

Part 1 of the Causal Mechanism
Part 1 of the mechanism references the efforts that are required to enable access to 'indicators' by non-elite policy stakeholders (notably policy advocacy CSOs). Davis et al. (2012) note that 'indicator' use is facilitated by communications strategies and products that are relatively simple, free of charge, presented in user-friendly formats, and claim originality and innovation, and for which complementary products such as online analytical tools are readily available. In addition, and unlike most global 'indicators', the HANCI project strongly posited that equitable partnerships between producers and users would enhance uptake. Accordingly, three hoop tests are proposed:
• Observing a communications strategy
• Finding communications activities and user-friendly products
• Witnessing targeted partnership activities
Absence of evidence for any of these would fatally undermine the validity of Part 1.

Evidence for Part 1
From the first issue of the index in 2013, HANCI findings were disseminated through online and in-country launch events, based on a communications strategy that identified activities, products and target stakeholders. Press releases were issued at strategic moments, for instance, before the British government hosted the G8 and organised a Hunger Summit in June 2013. Embargoed press releases encouraged international campaigners to circulate findings in their networks. These included the Scaling Up Nutrition (SUN) movement, Save the Children's Everyone campaign, Oxfam's GROW campaign, the IF campaign of a collective of 240 British international non-governmental organisations (INGOs) and Generation Nutrition led by Action Against Hunger. Findings were further presented at the Second International Conference on Nutrition in Rome (2014) and the SUN Global Gathering in Milan (2015). Research, advocacy capacity building and media-oriented activities were undertaken with local civil society organisations engaged in nutrition policy advocacy. In each country (Bangladesh, Nepal, Malawi and Zambia) the following activities were conducted: (a) commissioning in-country research on political commitment, (b) capacity-building workshops assessing HANCI evidence for advocacy and (c) joint development of priority advocacy messages drawing on HANCI and outreach to policy elites. These workshops debated suitable data and sources, explicit and implicit assumptions built into the index (for instance, regarding weighting schemes) and methodological limitations (e.g. the logical inability to substitute performance on one indicator with another). Except for Malawi, further activities involved (d) engagement with local media, including building capacity to report on nutrition. Table 1 summarises the evidence, showing that all three hoop tests were passed, to give confidence that the promotion of HANCI results fostered access and use by non-elite stakeholders.
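The logic of the three hoop tests for Part 1 can be summarised in a short sketch. It is illustrative only: the class and function names are hypothetical, and the boolean values simply encode the findings reported above rather than any additional data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HoopTest:
    description: str
    evidence_observed: bool


def part_survives(tests: list[HoopTest]) -> bool:
    """A part is eliminated if any hoop test fails.
    Passing every test keeps the part viable but does not by itself confirm it."""
    return all(test.evidence_observed for test in tests)


# Findings for Part 1 as reported in the text.
part_1 = [
    HoopTest("Communications strategy observed", True),
    HoopTest("Communications activities and user-friendly products found", True),
    HoopTest("Targeted partnership activities witnessed", True),
]

part_1_viable = part_survives(part_1)
```

Because hoop tests are necessary but not sufficient, the conjunction (`all`) captures elimination only; confirmatory weight would have to come from evidence with higher uniqueness, such as a smoking-gun test.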

Part 2 of the Causal Mechanism
Building on Part 1, Part 2 proposes that non-elite policy stakeholders adopt 'indicators' to underpin and/or adjust their framings of policy problems and solutions, and/or to guide programmatic and funding decisions. If Part 2 is valid, we anticipate observing that: HANCI registers on the radar of international bodies, agencies and thought leaders; INGOs' campaigns take notice of its evidence; and partner CSOs in Zambia, Nepal, Malawi and Bangladesh over time incorporate HANCI findings in advocacy materials such as reports, videos and blogs. Absence of any such evidence would undermine confidence in this part of the mechanism, as would explicit statements that partners are not interested in evidence on political commitment, or that partners' interest in political commitment evidence remains the same before and after exposure to HANCI. On the other hand, evidence that relates changes in policy framings, programmatic or funding decisions to HANCI could affirm the functioning of Part 2. Possibly, CSO partners only use HANCI when contracted to do so; hence, observing their use beyond funded activities would present the passing of a tighter hoop test, as would observations that third parties adopt HANCI without being contracted. Furthermore, uniqueness tests could include whether few or no strategic or programme documents of CSO partners consider hunger and nutrition in terms of political commitment (prior to HANCI engagement), or whether no actors, other than those involved in HANCI, talk about the need to understand hunger and undernutrition in terms of political will.

Evidence for Part 2
Over the course of the period reviewed, representatives of Save the Children, ONE, Concern, Oxfam GB, Oxfam India, Oxfam Intermon, the Bill and Melinda Gates Foundation, ActionAid, Trocaire and the global Scaling Up Nutrition movement all expressed an interest in using HANCI products, affirming that HANCI evidence registered on the radar of leading international agencies. Key findings from the first HANCI report were included in a bulletin emailed to all Save the Children staff worldwide. Oxfam GB funded the development of a new India-focussed HANCI-like instrument. The Inter-American Development Bank offered financial support to conduct HANCI primary research in Guatemala, which topped the global rankings, while new funding from the Children's Investment Fund Foundation allowed the  (Gillespie et al. 2013). HANCI also featured in the World Economic and Social Survey on Millennium Development Goals lessons for post-2015 (UN DESA 2014). Moreover, HANCI researchers were invited to review a methodology for a political commitment tool under development by FAO, to join the editorial review group for its flagship State of Food Insecurity Report 2014, to participate in the Data Access Group and to support the writing of the Global Nutrition Report 2014 and 2015. Finally, HANCI findings featured in live-televised (Al Jazeera) and radio interviews (BBC, Radio Moscow and Radio Netherlands), in global and national print and web-based newspaper articles (e.g. in The Guardian, AllAfrica and Reuters) and in posts by prominent development bloggers. To conclude, HANCI evidence abundantly registered on the radar of leading international development agencies, media and donors. Evidence for target countries complements the picture. In Zambia, HANCI evidence supported existing advocacy asks of the Zambia CSO-SUN Alliance to build political will to tackle undernutrition and to frame undernutrition in terms of political commitment (Chilufya and Smit-Mwanamwenge 2014; Zambia CSO-SUN Alliance 2014).
"We used HANCI evidence to justify specific calls that the CSO-SUN has made for greater political commitment" (pers. comm., W. Chilufya, national coordinator 2015). In Malawi, testimonies affirm that non-partner organisations used HANCI evidence for policy advocacy (pers. comms., T. Zimpita, coordinator, CSONA, 2015; J. Nyirende, Head of Programmes, Save the Children Malawi, 2013).
In Bangladesh, the local partner identified food rights as a strategic priority and campaigned for a Right to Food (RTF) Bill. It envisaged that "the HANSI [sic] database …will be used for our campaign work" (ActionAid Bangladesh 2013, p. 11). An RTF campaign brochure elaborated: "Bangladesh ranks 8th in terms of nutrition commitment, yet only 27th in terms of hunger commitment" and included images of a Bangla-language HANCI scorecard (ActionAid Bangladesh 2014, no page number). However, several organisational documents did not frame hunger and nutrition in terms of political commitment, notably its Operational Plan, a Position Paper on Food Rights and Sustainable Livelihoods and a document called the Design of the RTF campaign. Hence, we conclude that, in Bangladesh, HANCI evidence was adopted but did not consistently underpin the framing of the RTF campaign in terms of political commitment.
'Indicator' producers further partnered with Save the Children Nepal and the Civil Society Alliance for Nutrition in Nepal (CSANN) to conduct a workshop just two weeks after the latter's foundation. Immediately afterwards, seven participants co-drafted CSANN's Advocacy and Communications Strategy, which made 16 mentions of 'political commitment/will'. It also contained an activity chart showing that, in this short time period, no other advocacy-related activities took place that could have promoted the framing of hunger and nutrition in terms of political commitment (CSANN 2014, p. 11). Furthermore, CSANN member organisations adopted and used HANCI evidence in their subsequent policy advocacy (pers. comm., U. Koirala, CSANN President 2015).
Accordingly, both internationally and in partner countries, many hoop tests are passed (Table 2) and we find substantial evidence underwriting the validity of Part 2 of the causal mechanism.
Yet, in some instances, these framings pre-dated HANCI engagement. In particular, CSOs in Malawi, Nepal and Zambia affiliated to the Scaling Up Nutrition movement referenced its global strategy for 2012-2015, which targeted "a major increase in political commitment to ending under-nutrition" (SUN 2012, pp. 6, 13). Therefore, we conclude that HANCI contributed to, but was not singularly responsible for, non-elite policy stakeholders' adoption of political commitment framings.

Part 3 of the Causal Mechanism
Finally, the third part of the causal mechanism considers "non-elite policy stakeholders employ shaming and praising, mobilising, advocacy, information provision and media tactics to influence policy elites' framing of policy problems and solutions". Evidence for its validity would include observing that partner as well as non-partner organisations employ HANCI evidence for such purposes.

Evidence for Part 3
In target countries, partner organisations hosted advocacy events with senior government officials. Here, 'indicator' producers discussed evidence on political commitment, while CSO partners presented particular policy asks. Carefully timed, jointly authored press releases advocated for the same, receiving substantial media attention. In Malawi, a leading newspaper published an article entitled "Government welcomes new HANCI findings…" (Face of Malawi 2014). In Bangladesh, six newspapers reported on, and two national TV channels broadcast interviews based on, the advocacy event. In Nepal, the CSANN president was interviewed by a national TV station. CSANN further organised a follow-up meeting with ten MPs, lobbying for stronger engagement on nutrition. Professor Koirala (pers. comm., 2015) recalled: "Using evidence collated by [producers] meant that we were better able to convince policymakers… During our work with [producers] we found it much easier to connect evidence to our advocacy asks, which was really important for us in terms of establishing credibility as an alliance. It really, really mattered for us."
In Zambia, the CSO-SUN Alliance used HANCI as "a yardstick for advocacy" (pers. comm., W. Chilufya, national coordinator, 2015) in workshops with government officials and separately with a group of MPs in July-August 2014. The Alliance also conducted additional non-contracted activities with MPs to found an All Party Parliamentary Caucus on Nutrition in October 2014. Zambia's low HANCI rankings (17th in 2012, 30th in 2013) kick-started discussions with members Health and on Community and Social Development. Rankings were found "a useful tool to provoke government, and to create an appetite to talk about issues of hunger and nutrition in the country" (ibid.), including on specific policy recommendations.
"Advocacy requires evidence: policymakers ask you who is telling you this? So you need to be equipped… It is particularly important also to use, where possible, data and evidence published by the government itself. So for us one of the documents has been HANCI, to support our recommendations" (pers. comm., W. Chilufya, 2015).
Finally, CSO-SUN platforms in Uganda and Kenya spontaneously employed HANCI evidence in advocacy efforts (pers. comms., M. Mumma, KANCO and C. Muyama, UCCO-SUN, 2015). KANCO presented HANCI scorecards to the government's Head of Nutrition, the Ministry of Health. Next, KANCO participated in a delegation visiting State House, presenting HANCI scorecards and Global Nutrition Report data. As a result, the First Lady of Kenya agreed to become patron of the national Scaling Up Nutrition campaign. The data were also used to successfully persuade Yvonne Chaka Chaka, a famous South African singer, to become a nutrition champion for the organisation (pers. comm., M. Mumma, KANCO, 2015).
Accordingly, media reports, textual and testimonial evidence show that HANCI evidence was widely used to underpin CSO advocacy claims towards political elites, whether or not they were contracted to do so by the 'indicator' producer (Table 3). A distinct set of (though not all) hoop tests were thus passed at the international level and within the target countries, giving sufficient confidence that Part 3 of the causal mechanism functions as proposed.

Did HANCI Achieve Policy Impact (Y)?
As evidence shows that all parts of the proposed causal mechanism are present and function, we anticipate observing the outcome Y: "Policy elites express new understandings and framings in political and policy debates and/or set new policy agendas"; that is, they portray hunger and nutrition policy problems and solutions in terms of political commitment. Such evidence would, at a minimum, need to show that monitored governments publicly respond to, contest or seek acclaim based on HANCI evidence. Better still, senior political leaders and/or bureaucrats could report that such evidence inspired them to bring hunger and nutrition higher up on political agendas or to frame these in terms of political commitment, and/or use the evidence to define new policy agendas.

Evidence for Y
HANCI impact was uneven across target countries. In Bangladesh, the government did not issue a public response to the press release. However, it extended an invitation to the producer and local partner to present findings to the director general (DG) of its Food Monitoring and Planning Unit. In this meeting, the DG carefully studied and then dismissed the HANCI scorecard and the partners' claim to a Right to Food law. Noting that Bangladesh did not get the highest scores in terms of ensuring a justiciable Right to Food, he argued (pers. comm., N. Farid, 2014): "I don't care what score Bangladesh is getting…We [our programmes, ed.] are very real. I am not interested in some hypothetical issue… I am not aligning our indicators to HANCI indicators-no, I am using my own indicators." In contrast, in Malawi, the Principal Secretary to the Government for Nutrition, HIV and AIDS argued "We believe that the findings by a reputable institution like [producer organisation] on the hunger and nutrition [commitment index] will help provide insights on how we can improve on our commitments to address the challenges of hunger and malnutrition" (Face of Malawi 2014). And in Nepal, a member of the National Planning Commission noted that HANCI findings are "eye opening for government" (pers. comm., M. Shrestha, 2014). Parliamentarians, in consultations with CSANN, concurred that "Political commitment is of prime importance", with ten signing a pledge to address undernutrition in election manifestos (pers. comm., U. Koirala, October 2015).
In Zambia, the Minister of Agriculture was enraged by low HANCI rankings. He called a committee investigation, but the committee affirmed the veracity of data used (pers. comm., anonymised committee member, 2015). The CSO-SUN Alliance meanwhile had discussions with government officials, who argued that their efforts to combat undernutrition were inadequately reflected in the index and demanded full transparency of data sources used. In response, 'indicator' producers commenced sharing Zambia data with the Alliance board, who in turn allowed government officials to have a sneak preview. This allowed CSO-Government relations to remain constructive. Rankings also generated much traction in advocacy meetings with MPs. The official statute of the Zambian All Party Parliamentary Caucus on Nutrition identified generating strong political will as one of six objectives. One of its members, the honourable Hamududu MP, noted (pers. comm., August 2014) "HANCI is a very good tool to help us qualify how well we are doing- [we] can worry about the specific data that is included, but this provides us with a framework to think through how we can be improving our commitment."
Outside the four target countries, too, policy elites and bilateral and multilateral donors were found to liberally use HANCI evidence. Guatemala's top ranking was proudly announced by the Secretariat for Food and Nutrition Security (Government of Guatemala 2013), and noted in various reports (Government of Guatemala 2015, p. 106) and in a televised news programme. President O.P. Molina, in a speech at the Third Summit of the Community of Latin American and Caribbean States, underlined that government initiatives "contributed to Guatemala receiving the highest rating in the Hunger And Nutrition Commitment Index" (Hoy Venezuela 2015). Vice-President Roxana Baldetti presented its HANCI rankings at the Nutrition for Growth Summit in London (Dickson 2013) and in the presidential palace in Guatemala, drawing on HANCI infographics (Fig. 4).
This highly unique and certain (doubly decisive) evidence shows that, in Guatemala, policy elites explicitly express the need to address hunger and undernutrition in terms of political commitment using the HANCI.
Further evidence of the use of HANCI by policy elites was a keynote speech by Irish President Michael Higgins to ministers, donors and Malawian academy at the University of Lilongwe, congratulating the Government of Malawi on its third-ranking performance in the HANCI (Higgins 2014). Furthermore, by June 2015, the New Partnership for African Development (NEPAD) of the African Union approached the index producers to jointly develop a HANCI for Africa. NEPAD envisaged this to support its monitoring of member states' performance on commitments made to address hunger and nutrition, set out in the 2014 Malabo Declaration. Finally, in November 2014, the World Health Organization of the UN devised a global monitoring framework (GMF) to assess the implementation of policies and programmes promoting the achievement of targets for maternal, infant and young child nutrition. The framework identifies a core set of variables that all UN member countries must report on. It also identifies an extended set of 15 variables, from which countries can draw to design national nutrition surveillance systems. One of these variables concerns 'strength of nutrition governance', with HANCI identified as one of two metrics to be used (WHO 2014, p. 35). Not least because the GMF was devised through a consultative process with member states and UN agencies, the specific selection of HANCI suggests broad support amongst policy elites to frame hunger and nutrition in terms of political commitment. In other words, it constitutes another smoking-gun test for Y. We do not consider it to be a doubly decisive test as we also know that, besides HANCI, significant global advocacy efforts were made by the SUN Movement using this frame. Table 4 sums up the results of our assessment.
Fig. 4 Guatemalan Vice-President takes ownership of HANCI findings
Internationally as well as across Malawi, Nepal and Zambia, but not in Bangladesh, policy elites have publicly responded to HANCI evidence on political commitment by challenging, denying or embracing its value. Evidence was strongest for Zambia. While we have not witnessed a specific change in the substance of national policies, laws or budgets, passed hoop tests suggest that its future occurrence is conceivable. Furthermore, at the international level, a doubly decisive test was passed for Guatemala. Both the Irish President's citing of HANCI and the selection of HANCI in the WHO Global Monitoring Framework constitute smoking-gun tests. We thus conclude that there is strong evidence for outcome Y.

Discussion
In this penultimate section of the paper we reflect on the methodological, empirical and theoretical implications of our findings. We first debate the process tracing methodology and then propose a number of suggestions advancing the theorisation of the relationship between 'indicators' and policy impact.

Reflections on Process Tracing
Most 'indicator' producers aim to effect social change or policy reform. Yet, conclusive proof of such impact is often hard to establish (Davis 2014, pp. 46-47). This paper provides evidence supporting the theory that 'indicators' lead to policy impact. The impacts of HANCI were noted across countries and at the international level. However, policy processes are often dynamic and complex systems (see e.g. Baumgartner and Jones 2009) within which interventions are unlikely to have linear effects. 'Indicators' constitute one potential factor, amongst a range of other dynamic factors affecting policymakers, such as a salient policy shift unrelated to the intervention, the existence of a similar intervention or some other unidentifiable influence in the wider context (Yin 2013). Observing causal connections between 'indicators' and their effects on policy stakeholders' beliefs, attitudes and decisions is hence difficult, and counterfactuals cannot be established. For instance, when policymakers refer to an 'indicator', this does not mean that subsequent behaviour is affected by it. They may present data to justify a decision after the fact, to display a symbolic commitment to evidence-based decision-making or may simply resist pressure to change behaviour (Davis et al. 2012). They may also shift behaviours in ways designed to improve their score, but in ways not desired by the 'indicator' producer (Merry 2011). Policymakers use diverse types and sources of data in decision-making processes, so even when they are influenced by 'indicators', they are unlikely to rely entirely on them (Davis et al. 2012). Moreover, even within our limited set, civil society partners have heterogeneous existing capacities, expertise and focus, funding, leadership, motivation and drive, all affecting their ability to engage in producer-user relationships that seek to influence policy.
Simultaneously, political contexts differ tremendously between countries: platforms through which governments engage with policy advocates and political space for critique are quite uneven, as is governments' historical interest in food security and nutrition issues. Accordingly, rather than singling out determinant factors driving policy impact, or parsing out the relative contribution of civil society partners, producers or standalone HANCI 'indicators', a comparison of within- and across-country cases allows for the investigation of the configuration of factors that contribute to producing particular policy impacts within context (cf. Byrne 2013; Stern et al. 2013).
This is the approach that this paper has followed. Process tracing allowed for a careful investigation and triangulation of a catalogue of evidence types using multiple data sources and involving multiple (producer and user) analysts. PT offered an explicit procedure for assessing the causal inferential strength of evidence, facilitating a balanced assessment of the plausibility of hypotheses and rival explanations. Yet, conducting qualitative impact evaluations with acceptable and rigorous procedures is still in its formative stages and explicit procedures are needed to deal with how and whether the acceptance or rejection of rival explanations meets standards as being 'acceptable', 'weak' or 'strong' (Yin 2013). By ensuring that evaluators declare assumptions and their weighing of the evidence, PT enables verification and triangulation by other researchers and helps producers to guard against hubristic claims about 'indicator' impact. Moreover, process tracing encourages making new analytical generalisations. This has been considered the "preferred manner of generalizing from case studies and case study evaluations" (Yin 2013, p. 327). The next section hence summarises how our study sought to refine 'indicator' theory.
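The logic of the evidence tests referred to throughout the analysis (hoop, smoking-gun and doubly decisive) can be made explicit in a short sketch. The Python below is illustrative only and is not part of the study's procedure: it assumes the common two-dimensional typology in which a test is classified by whether the evidence is certain (the hypothesis predicts it must be found) and unique (rival explanations do not predict it). The names Evidence, classify_test and implication are hypothetical, and the wording of the implications is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One piece of evidence bearing on a hypothesis (illustrative only)."""
    label: str
    certain: bool  # the hypothesis predicts this evidence must be found
    unique: bool   # rival explanations do not predict this evidence


def classify_test(evidence: Evidence) -> str:
    """Map certainty and uniqueness onto the four commonly used test types."""
    if evidence.certain and evidence.unique:
        return "doubly decisive"
    if evidence.unique:
        return "smoking gun"
    if evidence.certain:
        return "hoop"
    return "straw in the wind"


def implication(test_type: str, passed: bool) -> str:
    """Return a simplified reading of what passing or failing a test implies."""
    outcomes = {
        ("hoop", True): "hypothesis stays viable but is not confirmed",
        ("hoop", False): "hypothesis is seriously weakened or eliminated",
        ("smoking gun", True): "hypothesis is strongly supported",
        ("smoking gun", False): "hypothesis is only slightly weakened",
        ("doubly decisive", True): "hypothesis is confirmed and rivals are ruled out",
        ("doubly decisive", False): "hypothesis is eliminated",
        ("straw in the wind", True): "hypothesis gains slight support",
        ("straw in the wind", False): "hypothesis is slightly weakened",
    }
    return outcomes[(test_type, passed)]


if __name__ == "__main__":
    # The Guatemala evidence is described in the text as unique and certain.
    guatemala = Evidence(
        label="Guatemalan policy elites cite HANCI rankings",
        certain=True,
        unique=True,
    )
    kind = classify_test(guatemala)
    print(f"{guatemala.label}: {kind} -> {implication(kind, passed=True)}")
```

The asymmetry between the test types is the point of the sketch: passing a hoop test merely keeps a hypothesis in play, which is why several passed hoop tests are read here as sufficient confidence rather than proof, whereas a passed smoking-gun or doubly decisive test carries far more inferential weight.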

Implications for 'Indicator' Theory
In this section we consider in what ways study findings relate to both 'political' and 'design' approaches to 'indicators'. While there is no singular scientific way of producing an 'indicator', design and interpretation can be done in valid and invalid ways (Masset 2011; Ravallion 2011). We propose that a robust design is a necessary but insufficient condition for 'indicators' to have potential for enduringly influencing policymakers. Policymakers critically review methods underlying 'indicators', particularly when these show them in a poor light. Producers can underwrite 'indicator' credibility by ensuring construct validity, an explicit theory of change, and transparency regarding data employed, methodological choices made and their effects on the performance of the 'indicator'.
Current theorisation highlights elite peer shaming, domestic politics and international pressure as main causal factors in shaping policy impact (Kelley and Simmons 2015), with relations between producers and users portrayed as dichotomous and hierarchical. Our findings, however, suggest that a reconceptualised relationship between producers and users can not only catalyse policy impact but also redress important concerns about the ways in which 'indicators' explicitly or implicitly configure such relations.
Equitable partnerships involve open, respectful, equal and trustful relations between 'indicator' producers and users, to the extent that boundaries between their roles may blur. Such partnerships allow for the involvement of people across scientific disciplines and policy sectors to overcome silo mentalities that often inhibit coordinated action on nutrition. Equitable partnerships can foster transparency and democratise deep understanding of the technical structure and functioning of 'indicators', demystify research evidence and break down barriers between (typically Northern) academic producers and (Southern) practitioner users, to address critiques of 'indicators' being technocratic and elite-driven. Moreover, users' in-depth understanding of the substance and functioning of 'indicators' can foster their effective deployment. For CSO users, particularly as the political space for policy advocacy is narrowing globally (Carothers and Brechenmacher 2014), having a sound understanding of strengths and limitations of 'indicator' evidence is critical for effectively engaging potentially hostile governments. Moreover, equitable partnerships allow producers to obtain regular user feedback to improve 'indicator' design and an enhanced understanding of their tactical deployment by policy advocates.
While the case study evaluation literature assigns causal power to context (Stern et al. 2013), most 'indicators' are tone-deaf about local conditions (Davis 2014). They homogenise, simplify and depoliticise complex context-specific phenomena (Davis et al. 2012; Larner and Le Heron 2004; Merry 2011). Although this is a reasonable observation, such recognition is neither beyond CSO users nor beyond targeted policymakers. Thus, while the value of 'indicators' is often portrayed in terms of holding leaders accountable to international standards (Kelley and Simmons 2015; Rosga and Satterthwaite 2009), we find that users' credibility depends on connecting 'indicators' to domestic standards, as for instance set out in a national nutrition policy, and on drawing on domestically produced data, especially government statistics. We further find that elite shaming is used sparingly as it is politically costly. Where used, it is done in combination with praising tactics to avoid burning down carefully constructed relationships of engagement that enable effective policy advocacy.
Within equitable partnerships, policy advocates employ 'indicators' in a tactical manner to instrumentally open up topics for discussion, although on their own terms and at a time of their choosing. In such circumstances, 'indicators' do not depoliticise but rather repoliticise. The best policy advocates use 'indicators' at opportune moments, finely attuned to priorities of political leaders and other local political economy dynamics. This matters because "senior-level decision-makers are selective and strategic… paying more attention to assessments that align with pre-existing government interests, policies and programs" (Parks et al. 2015, p. 9). Consequently, country-specific diagnostics, intelligently applied, can exert greater influence on policy reforms than cross-country benchmarking. In sum, equitable producer-user partnerships can better interpret and act on local context to more effectively mobilise domestic politics and elite peer shaming (and praising), and to capitalise on international pressure.
Based on these reflections on 'indicator' design and 'indicator' politics, Fig. 5 shows a synthesis causal mechanism for the impact of 'indicators' on policy framing. Although most impact evaluation models fail to consider the causality imparting arrows in their diagrams (Yin 2013, p. 324), arrows in Fig. 5 (and Fig. 3) represent equitable producer-user partnership activities, with forward/backward directions illustrating feedback loops. Such activities can take many forms, for instance joint workshops to examine and unpack the 'indicator', cooperatively crafting advocacy messages and jointly presenting evidence-informed advocacy claims to policy elites such as MPs. Needless to say, equitable producer-user partnerships can be a contributory but not determining factor within dynamics driving policy change (cf. Vellema et al. 2013).
We end this section with some reflections on what enables or constrains equitable partnerships. Without being exhaustive, enabling factors include producers and users having prior trusting relationships rooted in a participatory and egalitarian ethos, a mutual recognition of complementary strengths (e.g. superior understanding of local context for effective advocacy versus research rigour and academic credibility) and a commitment to develop technically robust 'indicators' that are made comprehensible by effective communications and used with integrity. Producers must be willing to let users drive decisions on how and when to use 'indicator' evidence, consult them on the interpretation of emerging evidence and be willing to adjust 'indicator' design based on these consultations. Constraining factors include, for both users and producers, a lack of adequate time, financial and other resources to effectively build trusting relations. For producers, mind-sets may constrain too: it is typically easier to produce figures from behind a desk than to work with users, even though this advances the chances that 'indicators' are appropriately used to influence policy.

Conclusions
This article observed the rapid proliferation of global food and nutrition security 'indicators', and noted that their impact on policy is routinely assumed, yet rarely demonstrated. Innovatively using a process tracing approach, we established the impact of one 'indicator', the Hunger And Nutrition Commitment Index (HANCI), on national and international policy framing and agenda-setting processes. The analysis proposes a causal role for equitable producer-user partnerships in achieving policy impact, enriching current theorisation of 'indicators' which emphasises the role of technically sound design and effective communication. Such partnerships allow for better understanding of, attuning to and navigation of local context, which 'indicators' typically overlook despite its causal force. As such, 'indicator' producers and users have much to gain from such partnerships: credibility, access to policymakers, and insight into and greater ability to navigate complex political economies, to enable a more effective and politically sensitive employment of 'indicators'.

Compliance with Ethical Standards
Conflict of interest On behalf of all authors, the corresponding author states that there is no conflict of interest.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.