
1 Introduction

The most important entry point for understanding how the SDGs produce global public policy is an examination of their indicators: the ways they are deliberated, chosen, refined and measured at the global level, which is the key venue for producing quantified global governing knowledge. In contrast to the way that the MDGs’ 60 indicators were chosen, the United Nations Statistical Commission (UNSC) was tasked with creating a system for choosing and refining indicators for inclusion in the SDGs that would be led by member states rather than International Organisations. This innovation in the governing of the monitoring agenda was seen as allowing for greater refinement, since both methodologies and data sources were expected to expand and change over the period 2015–2030. The UNSC’s designated working group for this deliberative work, the Inter-Agency and Expert Group on Sustainable Development Goal Indicators (IAEG-SDGs), became the key agency for establishing protocols for evaluating methodologies for producing data for each indicator.

This chapter will analyse the process by which indicators are deliberated, chosen and refined for inclusion in the global SDG monitoring framework. This includes a careful analysis of the evaluative framework for indicators, which is an ongoing process, since the SDG framework is treated as a living document that allows for further refinement as the world moves towards 2030. This evaluative framework is central to the production of the SDGs as a whole: it encapsulates the tensions within this ‘neutral’ space of statistical decision-making and shows us how entwined technical and political accountability are within the global agenda of the SDGs. As such, the evaluation of measurement can be seen as a governing practice. In the next section, we show where this argument sits within the literature on the production of knowledge for governance and the role of knowledge evaluation in governing paradigms. The following section provides a history of the development of the SDGs’ tier system, shows how it is a performative—in many senses of the term—space and outlines how it has served as both an engine and an obstacle for certain policy issues, as inclusion within the SDG framework has become one of the key modes of policy advocacy.

2 Knowledge for Governance, Practices of Evaluating Governing Knowledge, and Technical-Political Accountability

Social scientists have shown how the production of quantified knowledge for policy has become a hallmark of contemporary governance. Policy studies scholarship has investigated the nature of evidence-based policymaking and the so-called evidentiary turn in public policy (Duffy, 2017; Miller, 2001; Taylor, 2005). ‘The promise of evidence-based policy-making’—as anthropologists Rottenburg and Merry (2015, p. 1) argued—‘is that it is not only more objective and less prone to misuse, but also more transparent, more democratic, and more open to public debate than decisions taken by politicians and business leaders with reference to qualitative ways of knowing’. As ‘globally circulating knowledge technolog[ies] that can be used to quantify, compare and rank virtually any complex field of human affairs’ (Rottenburg & Merry, 2015, p. 5), indicators are key to supporting this promise.

The object of study for this chapter is the evaluation process for indicators’ inclusion, classification and refinement in the SDG framework. Our fundamental argument here is that substantive decisions made in the space of the IAEG-SDGs—which explicitly labels itself as ‘neutral’ and ‘apolitical’—are vital to the global governing paradigm of the SDGs. In this way, the UNSC and the IAEG-SDGs have created a taxonomy of taxonomies through the tier system, as they classify and reclassify the methodologies for measuring economic, social and environmental phenomena on the global level. Informed by Science and Technology Studies (STS) scholarship, we are interested in opening up the practices of evaluation of quantified knowledge that are at the centre of this work. As we will show, this evaluation of indicators involves more than merely the evaluation of methodologies for measuring phenomena: it entangles considerations about finance, power and the alignment of multitudes of policy agendas at national and global levels. For example, in the first meeting of the IAEG-SDGs in June 2015, the representative of the Cameroonian national statistical office (NSO)—standing in for ‘the African group of countries’ represented at the meeting—argued, first, the ‘need to establish the costing structure’; second, emphasised country ownership; and, finally, suggested the need for a careful elaboration of the process for choosing proxies and for making sure indicators aligned with national development plans (IAEG-SDGs, 2015, p. 6).

These evaluatory practices of quantified knowledge production are by definition both technical and political. Building on the MDGs’ ‘success’ in entangling global policy agendas with quantified knowledge, the SDGs entangle technical and political accountabilities (Bandola-Gill, 2021; Fontdevila & Grek, 2021). As outlined in the 2011 Busan Action Plan, ‘reliable and accessible statistics provide the evidence needed to improve decision making, document results, and heighten public accountability’ (PARIS21, 2011, p. 2). Political accountability implies a relationship of responsibility: a governing power—whether a nation-state or a supra-national organisation—must account for its actions, which impact its citizenry or beneficiary population. As many scholars of quantification have shown, however, in recent decades, accountability—as a form of responsible governance—has become closely tied to quantification. Espeland and Vannebo (2007, p. 22) discuss how, understood ‘as creating responsible people and accessible, responsive institutions, accountability is obviously a desirable goal’. However, with the new ‘technologies of audit and accountability’, discussed in the introduction, came ‘new forms of governance and power’ (Shore & Wright, 2004, p. 57). In the context of the SDGs, some have expressed anxiety that these practices of counting and evaluating quantitative knowledge, through the introduction and monitoring of indicators on policy issues like gender equality, have themselves effectively become proxies for ‘substantive contestation on key policy issues and meaningful accountability mechanisms’ (Razavi, 2019, p. 149).

3 Processes and Institutions: Producing Indicators for the SDGs

3.1 Evaluating Statistical Knowledge: Developing Protocols for Choosing Indicators

Central to the production of knowledge for governance in the context of the SDGs is an intricate evaluative model for deliberating, choosing and refining indicators for monitoring progress on the larger framework’s targets and goals. This evaluation of quantified knowledge for the SDGs depends on a tripartite system: first, a classification system for legitimising global indicators for inclusion within the SDG framework (the tier system); second, the protocols guiding the promotion—or rejection—of different indicators through that classification system (evidence production and methodology refinement); and third, a network of actors with delineated authority over the deliberation process (the IAEG-SDGs). In addition, as we will see below, many indicators follow certain ‘path dependencies’, carrying with them long-standing epistemic communities who have spent years, or even decades, debating and producing bodies of evidence to support particular ways of measuring phenomena in public policy.

Further, as part of the larger emphasis that the SDGs be much more participatory, national statistics offices and institutes pushed for their inclusion in the deliberative process for deciding which global indicators would be chosen to monitor the 17 SDGs. This push resulted in NSOs being designated the voting members of the IAEG-SDGs, while UN agencies, civil society and donor organisations were given the role of observers. NSOs’ role is key in this process: once indicators are set, it is the NSOs’ responsibility to produce the data and statistics to populate these indicators, in order to ‘report’ on them. In Chart 1, we can see how UN Water has visualised these responsibilities and the flow of data.

Chart 1: Roles and responsibilities for monitoring and reporting on SDG6 (on water and sanitation), made by UN Water and adapted from the IAEG-SDGs. https://www.sdg6monitoring.org/activities/roles-and-responsibilities/

We will discuss a very particular part of this data flow—harmonisation—in the next chapter, but here it is important to point out the large number of actors involved in the validation and production of the SDG data. The push for more inclusive deliberation was complicated by the lack of data availability in many countries, which became a development problem in its own right: indeed, the global monitoring of the MDGs had made visible the problem of insufficient data infrastructures for tracking many social, economic and environmental phenomena ‘with frequency, timeliness, comparability’ (UNSD and FOC, 2014, p. 8). We will discuss the issue of statistical capacity development further in the next chapter, but the anxiety about the sheer number of indicators to be measured at the global level—the Open Working Group’s (OWG) original proposal in 2014 included 304 unique indicators, in comparison to the MDGs’ 60—was also acknowledged as a reason to establish a clear process by which proposed indicators would be chosen, including the requirement that they have clear methodologies and that most countries already produce data to monitor them.

The production of indicators in the SDG framework is both a continuation of and an explicit divergence from earlier protocols, networks and institutions used and leveraged to measure economic, social and environmental phenomena on the global level. As discussed in the introduction, a major criticism of the MDGs was that the process by which goals were included in the global agenda was decided from the top, without proper consultation with member countries and many bilateral or multilateral development partners. As one member of the global statistics community put it: ‘The MDGs were pretty much cooked up by the international agency community’ (UN Statistician 4, 2020). This criticism extended to the way that targets and indicators were chosen for inclusion within the MDG framework. In a report on the lessons learned from monitoring the MDGs, the Inter-Agency and Expert Group on the Millennium Development Goal Indicators (IAEG-MDGs) highlighted this as the first weakness (‘from statistical but also policy perspective’) of the MDGs: ‘Targets and indicators were perceived by national statistical systems and other development partners primarily as an international agency driven ‘top-down’ initiative’ (IAEG-MDGs, 2013, p. 3). This was part of the push for more participatory forms of deliberating the goals and targets of the SDG framework—which we will discuss further in Chap. 4—as well as the indicators for monitoring progress towards them. Further, there was an underlying argument that the indicators for monitoring the MDGs had been reductionist, and that some of the indicators for measuring societal progress—gross domestic product (GDP) in particular—needed to be complemented with ‘broader measures of progress’ within a framework of sustainable development (UNSD, 2013, p. 3). To address these issues, the UN Statistical Commission was called upon to promote ‘the science-policy interface through inclusive, evidence-based and transparent scientific assessments, as well as access to reliable, relevant and timely data in areas related to the three dimensions of sustainable development, building on existing mechanisms, as appropriate’ (UNGA, 2012, p. 15). The UNSC was established in 1947 to produce and maintain international statistical standards, mirroring and influencing larger trends in international development over the more than 70 years of its existence (Ward, 2004). Voting members of the UNSC include representatives of national statistical offices and the statistical offices of International Organisations. Indicators chosen for inclusion in the new, expansive SDG framework had both to incorporate broader definitions of societal progress—including developing indicators whose methodologies and data production were not yet refined—and to make use of existing mechanisms ‘as appropriate’.

In the same 2013 report on ‘lessons learned’ mentioned above, the IAEG-MDGs indicated that there were fundamental inconsistencies in these 60 indicators that had been chosen in this ‘top-down’ manner: ‘Some goals, targets and indicators are not well-aligned, and some goals are not adequately addressed by existing indicators’ (IAEG-MDGs, 2013, p. 3). Going further, Fukuda-Parr, Yamin and Greenstein (2014, p. 112) argue that it was the indicators and the availability of data that drove which targets were included within the MDG framework, and that ‘the decision that only targets with agreed-upon indicators and ‘robust’ data would be included in the goals, with very few exceptions’ had a direct impact on derailing the MDGs from the much more expansive Millennium Declaration. However, this emphasis on ‘robust’ data did not map neatly onto which indicators were chosen, either; the authors found that ‘some of the indicators and targets chosen were weakly conceptualized and driven by political considerations as much as measurement ones’ (Fukuda-Parr et al., 2014, p. 112). Whether SDG indicators will follow a similar path is an open question—not least because the IAEG-SDGs have been asked to continue to deliberate on many of the indicators and to consider new ones that meet the necessary criteria to monitor SDG targets. However, avoiding these inconsistencies was a key impetus for establishing a protocol for reviewing, evaluating and refining indicators.

In March 2013, the UN Statistical Commission established the Friends of the Chair (FOC) group on broader measures of progress, as a result of demands made at the 2012 United Nations Conference on Sustainable Development in Rio de Janeiro, Brazil (Rio+20), ‘to launch a programme of work on broader measures of progress to complement gross domestic product (GDP) in order to better inform policy decisions’ (UNSD, 2015a, p. 2). The FOC was explicitly meant to support the intergovernmental process on the post-2015 development agenda—to provide statistical guidance to the Open Working Group on Sustainable Development Goals as it discussed, refined and chose the goals and targets to be included within the SDG framework. In this way, the goal of expanding beyond the reductionist view of development promoted by GDP was central to the work of the post-2015 agenda as a whole, and to the statistical work of this agenda—as taken on by the UNSC and its associated working groups—in particular. One of the key modes by which the FOC—and by extension the global statistical community—provided assistance to the drafting of targets and goals was a set of 29 statistical notes, delivered in March 2014 to aid the OWG’s deliberations, in which the UN Statistical Division (UNSD) and the FOC collated and outlined ‘main policy issues, potential goals and targets’, ‘conceptual and methodological tools’, ‘existing and new indicators’ and ‘data requirements, challenges and limitations’ for 29 varied policy issues (UNSD and FOC, 2014). These notes were meant to ‘provide information on the measurement aspects’ of the issues discussed by the OWG in its first eight sessions (UNSD and FOC, 2014, p. 1).

After extensive deliberation on the targets over the course of 2014, the OWG, in communication with the FOC, proposed a list of 304 provisional indicators for discussion at the 46th UN Statistical Commission in March 2015. In its guidance to the OWG on these provisional indicators in 2015, the FOC applied a provisional form of evaluation to each initially proposed indicator, assigning a grade between ‘AAA’ and ‘CCC’, where the first letter rates the feasibility of producing data for the indicator, the second rates the suitability of the indicator, and the third rates the degree to which the indicator is actually relevant to the target it is meant to measure. The grades were produced by the FOC in consultation with representatives of national statistics offices from member states: the UNSD, as secretariat of the UNSC, contacted statistical representatives from 70 countries to grade each of the 304 indicators in this way.
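To make the structure of this grading scheme concrete, the following minimal Python sketch models the three-letter grade. The class, the field names and the majority-vote aggregation are our illustrative assumptions, not the FOC’s actual procedure.

```python
from collections import Counter
from dataclasses import dataclass

VALID_LETTERS = {"A", "B", "C"}

@dataclass(frozen=True)
class IndicatorGrade:
    """A three-letter FOC-style grade such as 'ABC'."""
    feasibility: str  # 1st letter: how feasible is producing data for the indicator?
    suitability: str  # 2nd letter: how suitable is the indicator itself?
    relevance: str    # 3rd letter: how relevant is it to the target it measures?

    @classmethod
    def parse(cls, grade: str) -> "IndicatorGrade":
        if len(grade) != 3 or any(ch not in VALID_LETTERS for ch in grade):
            raise ValueError(f"expected a grade between 'AAA' and 'CCC', got {grade!r}")
        return cls(*grade)

def modal_grade(responses: list[str]) -> IndicatorGrade:
    """Combine grades from many respondents by taking the most common letter
    per position (this aggregation rule is our illustrative assumption)."""
    letters = (Counter(r[i] for r in responses).most_common(1)[0][0] for i in range(3))
    return IndicatorGrade(*letters)

print(IndicatorGrade.parse("ABC"))
print(modal_grade(["AAB", "ABB", "AAC"]))  # feasibility='A', suitability='A', relevance='B'
```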

At the 46th UNSC in March 2015, the Commission also officially endorsed the establishment of the IAEG-SDGs and tasked the working group to ‘fully [develop] a proposal for the indicator framework for the monitoring of the goals and targets of the post-2015 development agenda at the global level, under the leadership of the national statistical offices, in an open and transparent manner’ (UNSD, 2015b, p. 1). The group was to consist of 27 representatives of national statistical offices, and other actors—including international agencies—were to participate only as observers. As with many UN groups, representation was very important, and the IAEG-SDGs were required to ensure ‘equitable regional representation and technical expertise and including members of the least developed countries, landlocked developing countries and small island developing States’ (p. 1). As might be expected, with the member state NSOs in control of the deliberation of the indicators for inclusion, the relationship between them and the UN agencies and Bretton Woods organisations—as custodian agencies of the indicators—was not clearly defined. As one representative of a member state NSO put it, this relationship was ‘a mystery to me actually, even though I was part of it’ (National Statistician, 1). The official roles constituted in the IAEG-SDG protocols did not map neatly onto the actual process for developing and verifying indicators, as this community member remembers it:

the first time I went to the IAEG-SDG meeting in New York, I couldn’t believe how many observers in the form of UN organisations there were. We [the NSOs] were sitting like a small number of people in the midst and everywhere you looked it was like the sea of UN organisations. But if they had not been there, it would not have worked, because the stats system is designed for some things, but this system is a lot larger than that. (National Statistician, 1)

Emphasising the fact that national statistics systems sit within a larger architecture for producing data about global phenomena, she argued that the NSOs could not have produced a global policy monitoring system without the UN agencies. The tension between International Organisations (IOs) and member states—which can be found in many different parts of the monitoring process and is a key defining feature of the 2030 Agenda as a whole—is indeed part of what keeps the machinery of global measurement going, as this member makes clear. It also raises the question of how many of the classification and evaluatory decisions made by the IAEG-SDGs are in fact shaped by path dependencies arising from many IOs’ long histories of producing data about global phenomena.

3.2 The Tier System

The key protocol for evaluating and classifying indicators is the tier system—a tool first introduced by the Inter-agency and Expert Group on Gender Statistics (IAEG-GS) in 2012 as a means to evaluate indicators, alongside ‘the primary criterion that indicators should address key policy concerns’ (UNSD, 2013, p. 3) (Chart 2). The group broke down the tiers in the following way, which maps neatly onto the tier system used by the IAEG-SDGs:

Tier 1: Indicators conceptually clear, with an agreed international definition, and regularly produced by countries

Tier 2: Indicators conceptually clear, with an agreed international definition, but not yet regularly produced by countries

Tier 3: Indicators for which international standards still need to be developed and which are not regularly produced by countries
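Read as a decision rule, these definitions turn on two questions about an indicator: is it conceptually clear, with an agreed international definition, and do countries regularly produce the data? A minimal Python sketch of that logic, with names of our own invention:

```python
from dataclasses import dataclass

@dataclass
class Indicator:
    name: str
    conceptually_clear: bool   # agreed international definition and methodology?
    regularly_produced: bool   # do countries regularly produce the data?

def classify_tier(indicator: Indicator) -> int:
    """Apply the tier definitions above as a simple decision rule."""
    if not indicator.conceptually_clear:
        return 3  # international standards still need to be developed
    return 1 if indicator.regularly_produced else 2

# An indicator with an agreed methodology but patchy country data is tier 2,
# as was the case for indicator 10.7.2 (migration policies) after reclassification.
print(classify_tier(Indicator("10.7.2 migration policies", True, False)))  # 2
```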

Chart 2: Timeline of important moments in the creation and use of the tier system.

The tier classification process was proposed by the IAEG-SDGs in 2015 and formally agreed upon at the 47th UN Statistical Commission in March 2016 (UNSD, 2016). This classification system was designed as a means to evaluate and refine global indicators for international comparability. At the beginning of the indicator refinement and reclassification process in 2015, ‘the largest proportion of indicators was in the so-called tier III category’ (UN Statistician, 6). After the IAEG-SDGs’ 2020 comprehensive review of the framework—which was approved at the 51st UNSC in March 2020—there were no tier III indicators among the 231 unique indicators included in the framework. All tier III indicators had either been eliminated or their methodology had been refined and tested, leading to their reclassification as tier II or tier I indicators. As of December 2020, the ‘tier classification contains 130 tier I indicators, 97 tier II indicators and 4 indicators that have multiple tiers (different components of the indicator are classified into different tiers)’ (IAEG-SDGs, 2021, p. 2).

Very early on in the IAEG-SDGs’ process, however, it became clear that it would be difficult to convert the expansive global agenda of sustainable development into indicators for the SDG monitoring framework. A crucial sticking point for monitoring many targets was the requirement that an indicator be ‘conceptually clear’ in order to be classified as a tier I or tier II indicator. One community member described the difficulty of defining ‘sustainable forestry’ and ‘sustainable agriculture’, and of ‘figuring out’ a number, in the face of great expectations that the SDGs would be actually transformative, while in conversation with other public agencies in her country:

I think there has been frustrations everywhere because things are complicated and because the people who order the system […] expected it all to be in place very soon. They had no idea how long […] it takes to develop a new statistical thingy. And all across the system, it’s like: ‘Yeah, we have some data on forestry, but you have asked us for sustainable forestry, so now we have to figure out what the criteria would be for that, and then we have to figure out is someone measuring that,’ and that’s everywhere. I’ve had discussions with [ministerial] people that were very upset because we had taken the wrong numbers to look at sustainable agriculture. […] I could think of 10 different ways that you could make sustainable agricultural statistics if you wanted to, [but the ministerial representative] didn’t want any of these. [Later,] he wrote to me, and said, ‘No, I can’t put it into numbers what it is that I would like to do, and maybe,’ he said, ‘we shouldn’t have a number for this, then.’ [Well,] that’s not for us to decide, is it? And being a stats person of course I think it’s better to have something and then argue about it than having nothing, and everyone thinks that they are talking about the same thing, but they’re not. (National Statistician, 1, our emphasis)

As we will discuss further in Chap. 5, a placeholder number does important work for the global statistical community—allowing for a common language, even if it is not quite the right language yet. The inclusion of ‘sustainability’ into many of the SDG targets—as a global agenda to support sustainable development—has proven very difficult for the IAEG-SDGs and custodian agencies to define statistically, because the definition of sustainability itself is quite open to interpretation.

Further, even if an indicator is conceptually agreed upon, the distinction between tier II and tier III indicators also turns on establishing internationally agreed-upon standard measurement methodologies and data sources, as discussed further in the next section. In the case of global SDG indicators—as opposed to localised indicators—the onus is on UN custodian agencies and Bretton Woods organisations to do this work of ‘reclassifying’ the indicators for which they are responsible. This reclassification of indicators requires material and epistemic investment from UN custodian agencies, particularly in pilot testing methodologies, which some UN agencies experienced as a ‘burden’:

The big challenges were to really comply with all the criteria of the reclassification process. And probably the most arduous criterion was having to pilot test the methodology in a regionally representative sample of countries. So, you can immediately imagine that that puts a very big burden on custodian agencies. And compounding that was a situation where some countries were not willing to collaborate for different reasons. Some have their own resource constraints. So even participating in the pilot testing would have entailed some additional resources from their side which they were not able to commit. So, for different reasons there was also, let’s say, a reluctance from some countries to participate in the pilot tests, which delayed some of these pilots. (UN Statistician, 6)

Here, we can see how ‘reclassifying’ the indicators for which IOs are responsible is a key mode of asserting their definition of a policy problem on the global policy landscape—one which some countries resisted, for both material and ‘different reasons’. As we will see below, reclassification is described as a technical process—a matter of leveraging funds to ensure a pilot study can be run in at least one country in each UN region (Africa, Europe and North America, Latin America and the Caribbean, Asia and the Pacific, and Western Asia)—in order to provide sufficient evidence to the IAEG-SDGs that an indicator can feasibly be populated with data across the world. However, the reclassification protocol is also one of the key processes by which proposed policy problems can become global public policy problems.
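To illustrate the pilot-coverage criterion just described, here is a minimal sketch of a regional-representativeness check. It assumes a simple country-to-region lookup; the region names are those listed above, while the country mapping is a toy example rather than the full UN classification.

```python
# The five UN regions named in the text.
UN_REGIONS = {
    "Africa",
    "Europe and North America",
    "Latin America and the Caribbean",
    "Asia and the Pacific",
    "Western Asia",
}

# Toy lookup for illustration; a real check would use the full UN country-region mapping.
COUNTRY_TO_REGION = {
    "Cameroon": "Africa",
    "Norway": "Europe and North America",
    "Brazil": "Latin America and the Caribbean",
    "Viet Nam": "Asia and the Pacific",
    "Jordan": "Western Asia",
}

def regionally_representative(pilot_countries: list[str]) -> bool:
    """True if the pilot sample includes at least one country in every UN region."""
    covered = {COUNTRY_TO_REGION[c] for c in pilot_countries if c in COUNTRY_TO_REGION}
    return covered >= UN_REGIONS  # set superset test: every region represented

print(regionally_representative(["Cameroon", "Norway", "Brazil", "Viet Nam", "Jordan"]))  # True
print(regionally_representative(["Cameroon", "Norway"]))  # False: three regions uncovered
```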

Despite the attempts of the UNSC and its larger community—including the IAEG-SDGs—to promote ‘the science-policy interface through inclusive, evidence-based and transparent scientific assessments’ (UNGA, 2012, p. 15), there are those in the global statistical community who argue that this promotion of the science-policy interface has been hobbled since the beginning, when statisticians were not invited as official members into the OWG’s drafting of the SDG targets. In the original work plan, the drafting of the goals and targets was delegated to the politicians, while the indicators were delegated to the statisticians:

Well, that proved to be maybe a short-sighted, let’s say, approach because the result was that the statisticians were not involved in the formulation of the targets, and we have targets that are very wordy, very multidimensional, sometimes requiring many indicators, and the main problem is that they don’t specify quantitative thresholds. They use vague terms like increase or something like that. (UN Statistician, 6)

From the perspective of this community, this has complicated the process of monitoring the SDGs, which also fundamentally impairs technical-political accountability: ‘The core concern of statisticians with respect to the post-2015 development agenda is the measurability of goals and targets at national and global levels, as a prerequisite for accountability’ (UNSD and FOC, 2014, p. 10). At the same time, there has been political pressure on the part of international agencies, member states and civil society to include indicators for their issues in the monitoring framework. As one member of this community described it, ‘the SDG indicators were like a big bus [that] some […] people were desperate to get on’, as it was clear that ‘once [an indicator is] in the framework that will be a powerful measure for countries, for everybody to focus attention of it, use those numbers’ (UN Statistician, 3). This push and pull has existed since the beginning of the indicator framework evaluation process: making sure there are enough indicators to encapsulate the expansive 2030 Agenda, but not so many that producing monitoring data becomes out of reach for the many countries with limited statistical capacity.

3.3 The Tier System in Motion: Reclassifying Indicators and Establishing Authoritative Global Public Policy Knowledge

At the 51st United Nations Statistical Commission in March 2020, many representatives of NSOs and statistical offices of International Organisations reiterated and slightly reframed formal congratulations to the IAEG-SDGs for the work entailed in the 2020 Comprehensive Review of the SDG framework, which included eliminating all tier III indicators. In an illustrative example, the representative of India’s statistical office stated:

we would like to place on record our deep appreciation for the UNSD and [the IAEG-SDGs] towards their efforts for improvement of global indicator framework including tier classification updates, the 2020 comprehensive review of the global indicator framework for the SDGs. Their work on proposed replacement indicators, revisions to existing indicators and proposal for additional indicators are commendable. (Srivastava, 2020, p. 2)

The representative from the United States pointed out that the review process was highly inclusive, and that although the US might have its own ideas about which statistics and data should be used for policymaking, they were committed to using and supporting the global standards decided upon by the working group, calling upon other colleagues in the room to do the same. Coming early in the official statements, this statement from the American representative gave the impression of trying to head off potential contention before it had a chance to be aired. Eliminating all tier III indicators meant that every indicator in the SDG framework had a conceptually clear definition and methodology, as both tier II and tier I indicators must, but not that all countries were yet producing data for every indicator, as is required only of tier I indicators. The IAEG-SDGs had spent all of 2019 and the months before the 2020 UNSC reviewing requests for indicator reclassification and refinement, and communicated to custodian agencies that they were required to submit their supporting materials for reclassification in time for the working group to review them by January or February of 2020, or else their tier III indicator would be eliminated from the framework—or, to use the language of the UN statistician above, ‘kicked off the bus’.

The process of refinement and reclassification of indicators began at the IAEG-SDGs’ 3rd meeting in 2016, when the working group first began allowing indicators to be reclassified and to move up or down the ladder of the tier system. Between the 3rd and 4th meetings, in April and November of 2016, the first ten indicators were proposed for possible refinement. At the November 2016 meeting, the IAEG-SDGs proposed that the group would ‘conduct a review of a set of indicators for re-classification at the Fall physical meeting, once per year’; that agencies would be required to produce their updated information ‘at least 1 month before the physical meeting for review by members’ and that a ‘revised tier classification will only be published once a year following the IAEG-SDG meeting’ (IAEG-SDGs, 2016, p. 9). In practice, the working group would hear cases for reclassification at both physical meetings, and sometimes at the virtual meetings that happened between them. In the first five years of its existence, the working group met physically twice a year—in addition to six virtual meetings—with the primary objectives of reclassifying tier III and tier II indicators, pushing custodian agencies to test their proposed methodologies and promoting support for producing data to populate the indicators across the world. After the comprehensive review, the group now meets only once a year, as it is understood that the SDG indicator framework is now—for the most part—complete, and that further refinements, adjustments or reclassifications from tier II to tier I will require less work than the scramble that occupied the group’s first five years of existence.

3.4 The Case of Migration: Reclassifying Measures and Policies

Building a case for reclassification is ultimately the responsibility of a custodian agency but must also involve diverse stakeholders. When the IAEG-SDGs announced that their goal was to eliminate the tier III indicators in 2019, the group put out a call to the custodian agencies of those indicators: they were to provide the proper methodologies and data sources by the end of 2019, or ‘their’ indicators would be ‘pulled off the table’ (National Statistician, 1).

The case of reclassifying indicator 10.7.2 on migration policies, by the UN Department of Economic and Social Affairs (UNDESA) Population Division and the International Organization for Migration (IOM), provides a good example of what this process of reclassification looks like in practice (UNDESA and IOM, 2019). As co-custodians of this indicator, UNDESA and IOM were ‘required to document, among others, the involvement of governments and national statistical systems in the development of the indicator methodology, and the regional representativeness of the results of pilot studies’ (UNDESA and IOM, 2019, p. 11). Therefore, in their case for reclassifying indicator 10.7.2, UNDESA and IOM documented 11 open consultations—taking place from February 2016 to June 2018—which engaged a wide range of stakeholders (almost 300 participants representing governments, International Organisations, NGOs and academics) in the definition of the indicator and the data sources to populate it. They call this process the ‘validation of the methodology’: in these consultations, the co-custodians shared their proposed methodologies for feedback.

The methodology validated by this large network of stakeholders involved answering questions on an annual survey—the United Nations Inquiry among Governments on Population and Development (the ‘Inquiry’)—related to the 30 subsections of the IOM’s Migration Governance Framework (MiGOF): the questions asked countries to grade themselves, on a scale of 100, on how well they met the aspirational framework set out by MiGOF. To produce the data required for the reclassification process, UNDESA and IOM launched pilot surveys in 30 countries in order ‘to validate and provide additional consistency checks for country responses’, and 10 countries responded to the survey with feedback (UNDESA and IOM, 2019, p. 11). In creating an indicator for measuring a country’s ability to protect the rights of migrants and promote their wellbeing, UNDESA and IOM—in conversation with stakeholders—chose the existing data collection mechanism of the ‘Inquiry’ and limited the scope to the opinions of government entities.
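As a purely hypothetical illustration of this kind of self-assessment data, the sketch below averages a handful of invented subsection grades. The simple mean is our assumption for illustration only; the actual aggregation rules for indicator 10.7.2 are set out in the co-custodians’ metadata and are not reproduced here.

```python
from statistics import mean

# Invented self-grades (0-100) for a few of MiGOF's 30 subsections.
# Keys are hypothetical subsection identifiers; values are one country's responses.
self_grades = {"1.1": 80, "1.2": 65, "2.1": 90, "2.2": 70}

def summary_score(grades: dict[str, int]) -> float:
    """Toy aggregation: a simple mean of self-grades. This is our assumption
    for illustration only, not the indicator's official methodology."""
    return mean(grades.values())

print(round(summary_score(self_grades), 1))  # 76.2
```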

Armed with an explanation of how they chose the MiGOF (the international concept) and the Inquiry (the data source), the documentation of stakeholder involvement and the responses to the pilot survey from the ten geographically representative countries, UNDESA and IOM put their case forward at the 8th IAEG-SDGs meeting in November 2018. After a brief presentation and discussion at the meeting, the working group decided to grant the request to reclassify the indicator as tier II. UNDESA and IOM were told that, in order to be reclassified as a tier I indicator, the data streams for indicator 10.7.2 must be ‘established for at least 50 per cent of countries and at least 50 per cent of the population in every SDG region where the indicator is relevant’, at which point the co-custodians will submit a new request to the IAEG-SDGs (UNDESA and IOM, 2019, p. 13); we sketch this coverage rule at the end of this subsection. The IAEG-SDGs then provided an updated tier classification database to the global statistical community before the annual UNSC in March, seeking official approval from the UNSC. This approval process is largely performative, however, as one member of this community put it:

the work of the Statistical Commission is really happening […] in those working groups. Normally [the working groups] are tightly scripted—the Statistical Commission gives them a terms of reference and a timeframe and usually also determines the participation and then these groups do their work and then they bring their pieces of work, all the technical work, back to the Statistical Commission, which […] is a four-day meeting, it’s a parliament. And it’s Chief Statisticians, it’s not experts on census or civil registration, national accounts or tourist statistics or whatever the topic may be, and then the idea is usually to just have a high level discussion and wave things through. (UN Statistician, 3, our emphasis)

Therefore, as the ‘statistical parliament’, the UNSC is crafted as a space for carefully worded congratulations and very few points of substantive discussion about the material of quantification or the policy problems that sit behind them.
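Returning to the promotion criterion quoted above (data established for at least 50 per cent of countries and at least 50 per cent of the population in every relevant SDG region), here is a minimal sketch of how such a check could be expressed; the region names and coverage figures below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class RegionCoverage:
    countries_reporting: int
    countries_total: int
    population_reporting: float  # e.g. in millions
    population_total: float

def meets_tier1_criterion(coverage_by_region: dict[str, RegionCoverage]) -> bool:
    """True if data are established for >= 50% of countries AND >= 50% of the
    population in every SDG region where the indicator is relevant."""
    return all(
        region.countries_reporting / region.countries_total >= 0.5
        and region.population_reporting / region.population_total >= 0.5
        for region in coverage_by_region.values()
    )

# Invented figures: the first region passes both thresholds; the second fails both.
example = {
    "Africa": RegionCoverage(30, 54, 900.0, 1300.0),
    "Western Asia": RegionCoverage(5, 12, 60.0, 280.0),
}
print(meets_tier1_criterion(example))  # False: Western Asia falls short
```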

3.5 Producing and Eliminating Indicators: The Case of the Tier III Sustainable Tourism Indicator

Of course, when it came down to whether or not ‘their’ indicators were ‘kicked off the bus’ of the SDGs, International Organisations and member state representatives vented disagreements about methodologies and data sources. This was the case at the 2020 UNSC, where the IAEG-SDGs presented their comprehensive review for official approval. With the goal of eliminating all tier III indicators, the IAEG-SDGs required that all tier III indicators (of which there were 88 at their peak in 2016) be either reclassified or dropped. In early 2019, the IAEG-SDGs received 251 proposals for changes to the SDG indicator framework. Of these, the working group identified 53 proposals to put towards open consultation, ‘including replacements, revisions, additions, deletions and, in a few selected cases, requests for proposals for a group of tier III indicators whose methodological progress has stalled’, receiving input from over 600 ‘individuals/countries/organizations’. For those proposals that included reclassification, this input was in addition to the stakeholder involvement detailed above.

One such tier III indicator whose methodological progress had stalled was indicator 8.9.2, worded as ‘Proportion of jobs in sustainable tourism industries out of total tourism jobs’, whose custodian agency was the World Tourism Organization (UNWTO). The IAEG-SDGs had put forward a proposal for open consultation to replace this wording with ‘Number of employees in tourism industries’, to be combined with an additional indicator, ‘Energy use by tourism industries’ (also a tier III indicator), to address the ‘sustainable’ component. This was in addition to an existing indicator—designated as tier II and not under threat of elimination—that measures the contribution of tourism to a country’s GDP. As the Assistant Director of Statistics Canada argued, without the additional indicator on energy use, ‘neither indicator for the target will have anything to do with sustainable tourism’ (IAEG-SDGs, 2020).

When reviewing the entire SDG framework, the IAEG-SDGs argued that this new indicator on energy consumption was, indeed, itself a tier III indicator, and rejected the proposal. Left with one indicator measuring the proportion of GDP produced by tourism and another proposed to measure the number of employees in the tourism industry, the IAEG-SDGs argued that the proportion of GDP and employment actually ‘mirror each other, they’re almost the same’, and thus rejected the proposal to include the ‘employment numbers’ as a stand-in for ‘sustainable tourism’ (National Statistician, 1). This elimination of indicator 8.9.2, according to this member of the IAEG-SDGs, was meant to serve as a catalyst in the meeting, to impress upon the UNWTO the importance of better documenting and testing the methodological components of measuring ‘sustainable tourism’. However, the temporal frames of the UNWTO—which had been working for 25 years on an indicator to measure sustainable tourism, attempting to grapple with the complexities of defining what sustainability means in the context of tourism—and of the IAEG-SDGs—facing the deadline of the 2020 Comprehensive Review—clashed, and the indicator was eliminated. At the 2020 UNSC, in the midst of the formulaic congratulations to the IAEG-SDGs, some member states and the UNWTO expressed concern and disappointment that their indicator and policy problem had been ‘pulled off the table’. For one member of this group, this was about an ‘identity crisis’ on the part of International Organisations that might have only a few indicators in the SDG framework:

I think for some of […] the UN organisations that have a few indicators might also have just wanted to be able to say that they have a good influence in the process. And so, for some of them there is also, even if they don’t have a number, they want to have a place there, and then it becomes more of an identity crisis if we pull things off the table. (National Statistician, 1)

In the end, as the days of the 2020 UNSC went by, representatives of the Caribbean Community (CARICOM) became increasingly vocal in their opposition to the indicator’s removal and threatened to withhold their approval of the new SDG framework as a result. For another member of this community, this eruption of disappointment into the normally tranquil space of the UN Statistical Commission illustrated what happens, for both International Organisations and member states, when a policy item that is important to them is ‘kicked off the bus’. However, it was also an example of how the machinery of the UNSC and its working groups holds together truces—at various levels of fragility—that allow the infrastructure to continue to function:

it is rare that the [Statistical] Commission has actually on the floor overruled—I can’t now think of an example—because the thing is these things that come to the Commission are such carefully calibrated compromises, so if somebody says, ‘I will not agree to this package unless you put my tourism indicator in there’, then seven others will come up and say, ‘well, OK, if you don’t agree to this then I don’t agree, too—I compromised here and there and there, and then I pull that back’, and then the whole thing unravels and then you get to a point—and this is what happened this time again—where people then say, ‘OK, we all lose if we let this fall apart’, so grudgingly they agree to that, but it was clear that […] there are a few countries that feel very strongly about—forget about the whole rest of the SDGs—half of their income is coming from tourism, so then they feel very, very strongly about these things. (UN Statistician, 3)

In making ‘official’ decisions on how to measure the world—with what international concepts and with what data sources—the UNSC and its working groups make decisions about what can be counted as global public policy knowledge through these protocols for refinement, addition and reclassification, which are the ultimate responsibility of indicators’ custodian agencies. In the next chapter, we will discuss the architecture for producing the data and statistics to populate this global framework.

4 Conclusion

This chapter took a deep dive into the evaluatory and evidentiary practices at the heart of producing the material building blocks of the SDG framework—the IAEG-SDGs’ taxonomy of taxonomies. Through these practices, and with the particular protocols established by the IAEG-SDGs and the larger UN Statistical Commission, member states and International Organisations produce the indicators that work as the essential material underpinnings of the SDG measurement infrastructure. Determining which indicators will materialise is a key mode of determining which policy issues get attention, as the SDGs are a bus ‘people were desperate to get on’ in order to bring attention to their championed issues. The global statistical community has carefully tried to ‘gatekeep’ the global indicator framework, largely using the argument that if there are too many indicators, then monitoring the entire agenda becomes unreachable for many countries. Yet, this chapter has shown that the SDG governing architecture, with its intricate processes of classifying indicators, has not created an inflexible structure: instead, indicators, seen here as the material representations of social phenomena, are malleable entities, ready to be moulded, classified and reclassified, upgraded or even eliminated; ironically, elimination here does not mean exclusion. As the case of sustainable tourism showed, countries’ pressures and threats can move ‘eliminated’ indicators back to the negotiating table. Therefore, the production of indicators and the data systems to report on them are not mere ‘numbers’: they become the material, and absolutely essential, means via which the epistemic infrastructure is built and rebuilt over time and space.

In the next chapter, we will shift to the work that custodian agencies do in producing global public policy knowledge through data harmonisation, the process of making data comparable. With the crowding of the field of actors producing development data, there has been increased fragmentation of the streams of data produced about economic, political and social phenomena at the national level. Indeed, the problem of statistical capacity in many lower-income countries has arguably been ‘baked into’ the reasoning for the tier system. This problem of statistical capacity, and its significance for the production of a global public policy field, is the focus of the next chapter.