Introduction

The world appears not to be on track to meet what is arguably its most important development target: ending poverty by 2030. Official data indicate that, with or without the impact of COVID, it is highly unlikely that poverty will end any time soon.Footnote 1 How do we know? How confident are we that the identification and the official data provide a good gauge of what happens in our world in terms of poverty from a consistent global perspective? These are questions that the late Sir Tony Atkinson strove to address, and with which he strongly recommended that the appropriate global institutions engage. Recommendation 5 in the Atkinson Report of the World Bank Commission on Global Poverty (Atkinson, 2016, 2019) reflects this.Footnote 2 Although the World Bank accepted this recommendation, it decided not to engage immediately, as, understandably, the institution was not geared at the time to pick up the challenge of producing pure “total error” estimates (Atkinson, 2016; The World Bank, 2016).

“Most people who study development issues do so because they feel that present levels of poverty, misery, and injustice are simply unacceptable. Their aim is to arrive at concrete recommendations for action” (Szirmai, 2015). On a global scale, however, this is hard to do without relatively consistent and accurate methods. Highly inaccurate methods are unlikely to help guide proper policy on a global scale both in terms of prioritization and the size of required intervention.

The current method used to measure global poverty defines a one-size-fits-all poverty threshold, also known as the international poverty line (iPL), at $2.15/day in 2017 Purchasing Power Parity (PPP) exchange rates (Jolliffe et al., 2022). These PPP exchange rates are used to convert currencies to a common denomination, and correct for the tradable sector bias that market exchange rates have.Footnote 3 By using the 2017 PPP exchange rates, instead of the 2011 PPP rates that were used in the previous definition of the $1.90/day iPL by Ferreira et al. (2015), Jolliffe et al. (2022) depart from the recommendation of the World Bank’s Commission on Global Poverty, which stipulated that the iPL value, let alone the method, should not be updated at least until the 2030 deadline of the Sustainable Development Goals.Footnote 4 According to the Commission’s report, the stability of the iPL method is key in avoiding confusion about the various methodological choices that need to be made when switching between PPP rounds.

In the sections below we examine in detail the new iPL derivation method and gather evidence that helps answer the following two important questions:

  a. Is the method used by the World Bank (WB) fit for the purpose of measuring extreme poverty, and is that method defensibly valid both internally and externally?

  b. Do the resulting statistics perform acceptably well in terms of accuracy and precision?

Consequently, we organize our investigation around the four dimensions mentioned above: internal validity, external validity, accuracy, and precision.

We further investigate whether the arguments, implicit or explicit, used to support the introduction of the latest iPL are well-founded in terms of the evidence and in terms of their method. Among those arguments are that the differences in global poverty between the 2011 and 2017 rounds are small; that the iPL value is robust and reliable; that the groupings of countriesFootnote 5 are meaningful for poverty measurement; that the iPL measures extreme poverty; that the iPL measures absolute poverty; and that the method’s accuracy and precision are high enough to be presented in publications without any explicit warning. Overall, the evidence we gather undermines confidence in the above assertions.

Section “Literature and context” provides the context of global poverty measurement, an overview of the relevant literature, as well as a brief discussion of the main issues with the use of PPP exchange rates in global poverty measurement and alternative approaches for measuring global poverty. Section “Investigation” presents the methodological steps that we take to investigate the extent to which the iPL fulfills its task of defensibly measuring global poverty across the four dimensions of internal and external validity, accuracy, and precision. Section “Discussion, policy implications, and recommendations” discusses further some of the findings and presents some policy implications and recommendations, while section “Conclusions” concludes.

Literature and context

The latest iPL

Jolliffe et al. (2022) estimate the new iPL recast using the 2017 PPP exchange rates. To define the new iPL they take the median value of the national poverty lines (NPL) among the low-income countries (LIC). Their approach deviates methodologically from the previous two iPL definitions, by Ravallion et al. (2009) for the 2005 PPP rates and Ferreira et al. (2015) for the 2011 PPPs. Both previous approaches took the mean value of a reference group that was econometrically identified, and used a relatively small dataset of NPLs. Compared to the roughly 70 NPLs available in the previous two attempts at estimating an iPL, Jolliffe et al. (2022) have some 1438 poverty lines to work with. Those poverty lines are not direct NPLs; they are estimated indirectly using the “Harmonized Poverty Line” approach by Jolliffe and Prydz (2016). This method takes as a starting point the poverty rates reported by national authorities to the World Bank, and then uses the income/consumption distributions available in the World Bank database to convert each poverty rate to its corresponding poverty line. This of course adds complexity to the process, although it addresses some shortcomings of using the NPLs directly (differences in units, per capita or per adult equivalent; use of different or inappropriate inflation rates; etc.). Jolliffe et al. (2022) also provide separate iPLs per income group of countries as shown in Fig. 1.Footnote 6

Fig. 1: National Poverty Lines around the World.

Authors’ own replication of figure 3 from Jolliffe et al. (2022) based on the data the authors shared with us. Differences from the values reported by the World Bank arise because the World Bank rounds to the nearest 5-cent value.
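
The conversion from a reported poverty rate to a poverty line, followed by taking the median across a reference group, can be illustrated with a minimal sketch. It is not the implementation of Jolliffe and Prydz (2016) or Jolliffe et al. (2022): the function name, the country labels, and all numbers are hypothetical, the samples are unweighted, and the welfare values are assumed to be already expressed in PPP dollars per person per day.

```python
import statistics


def harmonized_poverty_line(welfare, poverty_rate):
    """Welfare level below which a share `poverty_rate` of the sample falls."""
    if not 0 < poverty_rate < 1:
        raise ValueError("poverty_rate must lie strictly between 0 and 1")
    ordered = sorted(welfare)
    # Number of sampled people counted as poor at the reported rate.
    n_poor = round(poverty_rate * len(ordered))
    n_poor = min(max(n_poor, 1), len(ordered) - 1)
    # Put the line midway between the richest poor and the poorest non-poor person.
    return (ordered[n_poor - 1] + ordered[n_poor]) / 2


# Hypothetical data: daily welfare per person (PPP dollars) and reported poverty rates.
surveys = {
    "country_a": ([1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0], 0.4),
    "country_b": ([0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0, 7.0, 9.0, 12.0], 0.5),
    "country_c": ([0.8, 1.2, 1.6, 2.0, 2.4, 2.8, 3.2, 3.6, 4.0, 4.4], 0.3),
}

harmonized = {name: harmonized_poverty_line(w, rate) for name, (w, rate) in surveys.items()}
# The line for the (hypothetical) reference group is the median of the harmonized lines.
group_line = statistics.median(harmonized.values())
```

The sketch makes visible that the resulting line depends on both the reported rate and the shape of the survey distribution, which is one way in which the indirect approach adds complexity.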

Jolliffe et al. (2022) introduce the new definition of the iPL to be used by the custodian institution for the measurement of global poverty on behalf of the United Nations for monitoring the first sustainable development goal (eliminating global poverty). To support this they bring a number of arguments to the fore. To some extent, these arguments also defend the choice of the World Bank not to honor the aforementioned recommendation of the Commission on Global Poverty, which stipulated keeping the line in its, at the time, 2011 PPP terms.

Moreover, they argue that the iPL value that they estimate is both robust and reliable. To support this point they show that, when taking the median of NPLs across a wide range of countries ranked by GDP per capita, the iPL does not effectively change by more than a cent or two for any grouping between the 11 and the 40 least affluent countries.

They further argue that the differences with the 2011 round are small, and therefore updating the iPL does not bring any real disturbance in monitoring global poverty. However, this does not entirely address the argument put forth by the Commission in support of not recommending further iPL updates until 2030:

“[Re-estimating the iPL with data from a new PPP round] would combine two changes: a (better) adjustment for price changes and a recasting of the view about relative living standards in different countries. This takes us to the heart of the problem: underlying the application of the PPPs is the fact that these are influenced by factors other than the evolution of the national CPIs. The objective of the PPP exercise is to set the purchasing power in an international context, and the PPP applied to a particular country reflects what has happened to prices in other countries. It is, for instance, quite possible that there has been no change in any domestic price, but the PPP adjustment leads to a change in the local currency poverty line.” (Atkinson, 2016, p.73)

Finally, it helps to state explicitly here a number of implicit assumptions that are present in the new iPL approach. The first is that the particular groupings of countries are in some sense optimal for poverty measurement. The second is that the iPL actually measures extreme poverty, and not another form of poverty, although the authors do not explicitly use a definition of what extreme poverty is. The third is that the iPL measures absolute poverty, instead of relative poverty. The fourth is that the method’s accuracy and precision are high enough to be presented in publications without any explicit warning on this point.

Brief overview of PPP-related issues

A substantial body of literature has raised a number of arguments questioning the use of PPP exchange rates for measuring global poverty, and these are rather widely recognized as methodologically worrisome issues (Allen, 2017; Deaton, 2001; Fischer, 2010; Moatsos, 2016, 2020; Moatsos and Lazopoulos, 2021; Reddy and Pogge, 2010; Subramanian, 2015; Sumner, 2010).

First, PPPs are constructed to compare economies at large or country-wide household consumption. The World Bank warns on this point that PPP estimates “may not reflect the expenditure patterns of the poor” (The World Bank, 2007). On this point Ravallion et al. (1991) state that “[i]deally one would like to construct new PPP rates for the prices most relevant to the absolute poor, in which the prices of food-staples would clearly carry a high weight”. Moatsos (2020) points out that before them Ahluwalia et al. (1979, p. 305) acknowledged early on that using PPPs instead of market exchange rates resolves some problems, “but other problems arise that replace the problems addressed”.

Second, countries around the world demonstrate substantial differences in their economic structure and their consumption patterns. However, PPP values can be estimated with higher reliability when the participating countries have small differences along these lines, and “[t]he more the countries differ in these perspectives the larger the resulting concern for the PPP estimates” (Moatsos, 2020). In similar terms, there are some important components that are “comparison–resistant”, including government-provided services and house rental, and the statistical treatment of these components can substantially affect a country’s PPP rate (Deaton and Heston, 2010). To this one can add the urban bias of PPP rates, which they inherit from the urban bias in the price data that the National Statistical Offices use.

Third, changes in the consumption of luxury goods, and in general of goods not related to the consumption of the poor, in third countries (say high-income countries for which the iPL is almost irrelevant) will influence the PPPs of countries where the iPL has a substantial grip. Therefore, when one uses PPP exchange rates in the measurement of global poverty, the estimates will change if such changes occur in third countries even if nothing changed in the latter countries (Reddy and Pogge, 2010); a point that anticipated the aforementioned quote from the Commission by several years. Relatedly, Pogge (2013) points out that “[t]he fact that an income suffices to meet basic human needs [in one country] is no assurance, then, that a PPP equivalent income in another country is similarly sufficient. In poor countries, prices of necessities are often higher, and prices of services lower, than what the PPP to the US dollar would suggest”.

Fourth, the substantial uncertainty in the estimation of PPP exchange rates propagates into the global poverty estimates. Several scholars have attempted to estimate the uncertainty of the PPP estimates, and all conclude that the error terms involved are of substantial magnitude. For example, Deaton (2012) estimates that the 2011 PPP for China implies a relative error term of about 15%, and most countries are found to have relative error terms of around 15–17%. Other estimates by Prasada Rao also provide error terms of considerable magnitude for both the 2011 and 2017 PPP rounds. More generally, in terms of PPP volatility, Deaton (2001, p.129) argues that, due to the “notorious volatility” of primary world commodity prices, the PPP rates of countries where these commodities account for an important share of GDP may also vary considerably.

Finally, each new round of PPPs brings about larger or smaller changes in the (global) poverty rate statistics of the World Bank. For example, the 2005 PPP round was notorious for bringing about very large differences, which compelled the World Bank’s classic global poverty line duo to title their QJE article “The Developing World is Poorer than We Thought, But No Less Successful in the Fight Against Poverty” (Chen and Ravallion, 2010). Milder, although not negligible, changes were introduced in the following two rounds. For example, the estimate of extreme poverty for Sub-Saharan Africa according to the latest 2017 PPP round is 36.66%, while the estimate with the 2011 PPP is 40%; a substantial difference, which implies larger (and smaller) differences across countries in that region.Footnote 7

Investigation

In this section we set out the steps that we take to investigate the extent to which the iPL fulfills its demanding task, namely measuring global poverty in a reliable manner. We do this by proceeding with our investigation across four dimensions: internal and external validity, accuracy, and precision.

A clarification in terms of terminology is in order at this point. The terms absolute and extreme poverty are occasionally confusing and used interchangeably. The use of terms like absolute and relative poverty lines does not help either, in terms of conceptual clarity. Absolute poverty lines are poverty lines that are kept constant in real value terms across the years. Relative poverty lines are poverty lines that typically change their real value from year to year because they follow a specific point relative to the distribution (typically 40%, 50%, or 60% of the median value of the entire distribution), and not a specific real value. Absolute poverty, on the other hand, typically means a condition of absolute deprivation that coincides with extreme poverty in terms of welfare levels achieved. However, using absolute poverty lines does not necessarily mean that one traces extreme or absolute poverty. Such an absolute threshold, if set appropriately high, may be used to separate those tentatively deprived from the middle-class, or the middle-class from the upper- or ruling-class, at least in some monetary sense. An absolute poverty line can easily lie above a relative poverty line. Likewise, relative poverty lines can also be used to measure absolute/extreme poverty if the bar is set low enough, e.g., at 20% or 30% of the median. Therefore, a more precise characterization of what an iPL tries to measure would be that it traces extreme poverty in absolute terms (since considerable effort is put into keeping its value constant in real terms).

Internal validity

An ideal setting

As a first step, to gauge the internal validity of the iPL approach, we ask the following central question: What would be the ideal way of measuring extreme poverty using the national definitions, along the lines of the iPL approach? Consequently, we trace how the iPL derivation method deviates from such an ideal setting. We argue below that the following are the four necessary steps to define the ideal iPL setting:

  1. Countries implement the already agreed-upon common definition of extreme poverty in terms of welfare level achieved, using say the Copenhagen Declaration or an equivalent frameworkFootnote 8, although they may well deviate in the exact specification (e.g., in terms of the exact number of square meters of housing per person). This is a necessary step in the ideal setting so that the data underlying the iPL represent the same poverty concept.

  2. Countries report those priced poverty lines to the WB every year using the commonly agreed-upon framework. For increased transparency and verifiability, and for better data flow and handling, it may be preferable that the World Bank prices these poverty lines directly using the price data reported by the states. The pricing of the NPLs is an unavoidable step in any monetary poverty measurement attempt.

  3. Working on the level of the priced poverty lines, the typical step involves the conversion of NPL values to a reference currency (usually the US Dollar). This is done using available PPP exchange rates, ideally using a PPP variation that is estimated to represent the consumption patterns of the poor or at least of the less affluent (e.g., the bottom 20% of the distribution).

  4. The WB (or another academic/research institution with the appropriate capacity) devises a methodology to estimate the representative poverty line among countries that are at the lowest ranks of welfare levels, or separately for each country group. This is a necessary step in order to obtain a unique iPL tracing extreme poverty in absolute terms per country group.

There are some important ways in which the current iPL methodology deviates from such an ideal setting.

Currently, the NPL data underlying the derivation of the iPL represent a host of methodological approaches and political negotiations on the national level, and target various welfare levels. Most of the NPLs are of the cost-of-basic-needs tradition, while others are relative poverty lines (the typical choice for high-income countries). In any case, the intended underlying welfare level is far from clear and comparable across the NPL values. The argument, in the current derivation method of the iPL, that circumvents this conundrum is to simply accept the variety and aim to identify the typical way that extreme poverty is defined across nations. In past iPL derivations, the goal was to separate the absolute from the relative component across the NPLs (Ravallion et al., 1991, 2009). Probably because taking the average of some NPLs is a methodologically thin way to achieve this separation, the current approach steers clear of attempting it. At the same time, it is not demonstrated that the resulting iPL is representative of how poverty is defined across the various groups of nations, let alone globally (see sub-section “Minimization criteria” for a detailed investigation of this).

As mentioned above, however, in terms of poverty definitions, a framework has already been agreed upon, but unfortunately it is not used by the international community (see the Copenhagen Declaration for the detailsFootnote 9). If such a common framework were to be followed by individual countries, then one would expect to observe a transition taking place in the national poverty line values, as a result of socio-economic development, along the following lines:

  (a) a change in the exact items that the concept of (extreme) poverty is identified with,

  (b) a change in the relative weight of the various poverty line cost components due to cross-country relative price differences (e.g., food vs. housing components), and

  (c) a change in the final value of the various poverty lines due to cross-country absolute price differences.Footnote 10

The above list makes it rather evident that, even when following the same methodological framework, one should not expect to find identical poverty lines around the world. So we need to accept that the underlying NPLs will be of different dollar values even when expressed in a common PPP accounting denomination, and even when they follow the same methodological framework. Consider, for example, how rents become much higher in high-income countries relative to low-income countries, as demonstrated by Allen (2017).Footnote 11

With respect to step 3 from the ideal iPL setting, there are a number of ways that the currently applied methodology deviates from an ideal setting. First, the PPP exchange rates applied represent the consumption patterns of the average household of each participating country. However, as argued by several scholars (e.g., Deaton and Dupriez, 2011), we ought to be using PPP exchange rates specifically constructed to represent the consumption patterns of the poor, or—to expand this in less demanding terms—of the less affluent, for instance using the consumption pattern of households occupying the bottom 50% of the distribution. Regional differences in such approaches can be non-negligible. Footnote 12

At the same time, given that PPP exchange rates are better estimated when participating countries demonstrate small differences in terms of economic structure and consumption patterns, and since the World Bank specifies different iPLs for the four aforementioned country groups, it follows that different PPPs for the poor ought to be applied for each of the four groups.Footnote 13 Using separate PPP estimations for each group of countries would quite likely alleviate some of the importance that any specific grouping has in specifying the iPL(s); a problem that we turn to next. Footnote 14

Regardless of the exact country groupings used, choosing the exact method that distills one iPL from the various NPLs available in each group hinges upon the issue of what form of representativeness one has in mind for the resulting iPL. Using a central tendency statistic, such as the median that is now applied directly to the NPLs, is only one of many options. Such an approach attempts to translate each national threshold used to separate poor from non-poor to a global scale. Such a translation, however, may lead us astray in terms of other criteria that reflect poverty perhaps more clearly than the poverty line per se. Currently, the iPL defined as the median across NPLs minimizes the sum of absolute differences from the NPL values in the LIC country group. Is this the ultimate criterion for defining international or global poverty thresholds? We argue that other minimization criteria, such as minimizing the number of individuals that are allocated differently between the NPL and the iPL, would provide a more defensible approach. On the same line of thought, minimizing the difference between NPLs and the iPL on any other key poverty indicator, such as the poverty rate, poverty gap, poverty severity, or the Watts index, would also be a preferable alternative. Arguably, minimizing across all these alternative dimensions would bring us closer to the ideal case. Such a method would by definition represent how poverty is defined across countries from a global standpoint with the minimum possible deviation. Whether the resulting deviations are acceptable is a topic of another important debate.
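
The contrast between the two criteria can be illustrated with a minimal sketch under stated assumptions. All lines and welfare samples below are hypothetical, samples are unweighted (a real application would weight by population), a person is counted as poor when welfare is strictly below the line, and the candidate lines are restricted to a coarse grid. It is not the procedure used in our later investigation.

```python
def total_absolute_gap(candidate, npls):
    """Sum of absolute differences between a candidate line and the NPLs."""
    return sum(abs(candidate - line) for line in npls)


def misallocated(candidate, countries):
    """People classified differently by the candidate line and by their own NPL."""
    count = 0
    for npl, welfare in countries:
        lo, hi = sorted((candidate, npl))
        # Anyone with lo <= welfare < hi is poor under one line but not the other.
        count += sum(1 for w in welfare if lo <= w < hi)
    return count


# Hypothetical (NPL, welfare sample) pairs, all in PPP dollars per day.
countries = [
    (1.5, [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]),
    (2.0, [0.5, 1.0, 3.0, 4.0, 5.0, 6.0]),
    (4.0, [1.0, 2.0, 3.5, 4.5, 5.0, 9.0]),
]
npls = [npl for npl, _ in countries]
candidates = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]

best_by_gap = min(candidates, key=lambda c: total_absolute_gap(c, npls))
best_by_people = min(candidates, key=lambda c: misallocated(c, countries))
```

In this toy example the line closest to the NPLs in absolute terms is their median (2.0), whereas the line that reclassifies the fewest people is 1.5, because many people in the first country live just above its national line. The two criteria need not agree.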

The above observations take as given that it is the correct estimation of an iPL that one should be after in consistently measuring global poverty. This is not something necessitated by poverty theory per se; in some sense, quite the contrary. There is no poverty theory that requires us to take an average or the median across, say, households within a country to derive the national poverty line. On the contrary, different poverty lines are defined for various types of households within nations, for example. Notwithstanding the differences between household types and countries at different levels of socio-economic development, the above four necessary steps in defining an iPL may well be curtailed to the first two steps only, which would only require the direct use on a country level of each priced NPL derived with the same methodological framework. Such an approach would steer clear of all methodological traps in dealing with PPPs and country groups.

Finally, it is worth noting that the whole process of taking the median or mean of NPLs as the core step in deriving the iPL rests on the (implicit) assumption that between the point of the NPL and the iPL the gradient of each country’s income or consumption distribution is the same. This would mean that it would be theoretically possible to get to an iPL from which any deviation would change the poverty statistics across countries at the same rate. This, however, is not the case, as different household survey distributions have variable gradients at and around the iPL. Therefore, there is a differential in the rate with which each country is characterized by higher or lower poverty as the iPL changes. This makes any uncertainty in the iPL imply higher uncertainty in the geographical distribution of those living in conditions of poverty around the world.
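
A minimal sketch can make the point about variable gradients concrete. The two welfare samples below are hypothetical and unweighted, and the size of the shift in the line is chosen only for illustration; a person is counted as poor when welfare is strictly below the line.

```python
def headcount(welfare, line):
    """Share of the sample with welfare strictly below the line."""
    return sum(1 for w in welfare if w < line) / len(welfare)


def headcount_change(welfare, line, shift):
    """Change in the headcount ratio when the line moves by `shift`."""
    return headcount(welfare, line + shift) - headcount(welfare, line)


# Hypothetical samples (PPP dollars per day): one dense around the line, one sparse.
dense_near_line = [2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 3.0, 4.0, 5.0, 6.0]
sparse_near_line = [0.5, 1.0, 1.5, 2.0, 3.5, 4.0, 5.0, 6.0, 7.0, 8.0]

change_dense = headcount_change(dense_near_line, line=2.15, shift=0.30)
change_sparse = headcount_change(sparse_near_line, line=2.15, shift=0.30)
```

The same shift in the line raises the headcount ratio by 30 percentage points in the first sample and leaves it unchanged in the second, so an error in the iPL redistributes measured poverty across countries rather than scaling it uniformly.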

Updating the iPL definition with 2017 PPPs

As stipulated by Recommendation 10, the global poverty estimates until 2030 should be the result of the iPL expressed in local currency units and updated using the most appropriate CPI for the poor, and the iPL must not be updated in light of newly available PPP exchange rates even on the grounds of better methodological/data quality. This recommendation directly addresses the mandate of the World Bank’s Commission on Global Poverty, which included investigating “[w]hat should be the interpretation going forward of the definition of extreme poverty, set in 2015 at 1.90 PPP-adjusted dollars a day per person, in real terms?”. The response of the Commission was to leave the definition as it was in 2015, and only move it forward until 2030 in local currencies, adjusting the nominal value for the inflation facing the poor.

The World Bank understandably wants to update its entire arsenal of statistics to the most recent vintage of the PPP exchange rates, most importantly to reflect the most recent relative price structures and most advanced methodologies. Clearly, the World Bank’s Commission on Global Poverty insisted that this is not reason enough to update the definition of poverty, as this would affect the core meaning of “real terms” (for the reasons quite vividly expressed in the aforementioned quote from the Commission’s Report).

Then why does the World Bank prioritize the updating of all statistics to the latest PPPs over appropriate global poverty measurement in constant real terms? And this despite the fact that the historically key person at the World Bank behind the iPL approach, the late Martin Ravallion, concluded that the “key point is that whatever line is used it is intended to be held constant in terms of its purchasing power” (Ravallion, 2016). Part of the answer can be found in the cover note issued by the World Bank at the same time as the Commission’s Report:

“[W]e plan to follow this recommendation, but leave open the possibility that future PPP rounds might be used again to inform the construction of the IPL, even before 2030, if and only if we are satisfied that the ICP methods have substantially stabilized over at least two ICP rounds. This would require that our own researchers, and the broader scholarly community, are largely of the view that changes in PPPs are then driven by real changes in cost-of-living parities, rather than in data collection or index-number methodology.”

Jolliffe et al. (2022), in line with this response by the World Bank, provide “empirical evidence that is consistent with stability between the 2011 and 2017 rounds when methodological stability is understood at the global level”. They thereby attempt to provide the evidence needed to show that the World Bank is consistent with its above cover note when it updates the iPL, even though doing so goes against the very core of Recommendation 10. However, one can legitimately wonder why the stability of the iPL estimates is (or appears to be, or is treated as if it were) the ultimate criterion with which to opt for updating the iPL. Why does the difference between the current iPL and the previous iPL serve as a scientific criterion for supporting the transition to the new iPL? What is the methodological substance behind this argument, even if we are able to find support for such stability?Footnote 15

The above questions become more pressing especially when the empirical evidence presented in support of the new iPL addresses the problem of stability at the global level, while the Commission clearly sets the context for measuring poverty in constant real terms at the national level using locally denominated poverty lines.

However, the new iPL derivation method violates Recommendation 10 in one additional important way, as it does not only use the new PPP rates but also changes the method itself (using a new reference group and taking the median instead of the mean).

To investigate the impact of the change in the PPP rates, it is not enough to estimate the point differences in global poverty statistics between the 2011 PPP estimates using the old iPL derivation method and the 2017 PPP estimates using the new iPL derivation method, as done in Jolliffe et al. (2022), because such a comparison changes both the PPP rates and the derivation method of the iPL at the same time. To identify the impact of the change in PPP rates we need to compare:

  (a) the 2017 iPL global poverty statistics, obtained with the iPL produced with the 2017 method (taking the median of the NPLs), but using the 2011 PPP rates and data, and

  (b) the 2011 iPL global poverty statistics, obtained with the iPL produced with the 2011 method (taking the mean of the NPLs), but using the 2017 PPP rates and data.

In absolute numbers, the difference in option (a) above is about 144 million individuals for 2011 or 111 million for 2017, with the 2017 iPL derivation method (taking the median) producing an iPL of 1.77 in 2011 PPP terms. Implementing scenario (b), we find a difference of some 190 million individuals for poverty measured in the year 2011 or 150 million in the year 2017, and an iPL of 2.33 (using the mean of the NPLs as stipulated by the 2011 iPL derivation method). Because these differences fall on opposite sides of the official statistics, the gap between the two options is even more substantial, and stands at about 200 to 400 million, or 2.7 to 5.7 percentage points, depending on the year of comparison.Footnote 16
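
The logic of separating the PPP effect from the method effect can be sketched as a two-by-two design in which only one element changes at a time. This is a toy illustration, not a replication of the figures above: the country labels, local-currency lines, and PPP conversion factors are hypothetical, the same reference group is used for both methods (the actual 2011 and 2017 methods use different reference groups), and inflation adjustment between benchmark years is ignored.

```python
import statistics


def derive_ipl(npls_lcu, ppp, method):
    """Convert local-currency lines to PPP dollars and summarize them."""
    # PPP factors are local currency units per international dollar.
    in_ppp_dollars = [npls_lcu[c] / ppp[c] for c in npls_lcu]
    if method == "median":
        return statistics.median(in_ppp_dollars)
    if method == "mean":
        return statistics.mean(in_ppp_dollars)
    raise ValueError("method must be 'median' or 'mean'")


# Hypothetical national lines (local currency per day) and two PPP rounds.
npls_lcu = {"A": 100.0, "B": 30.0, "C": 900.0, "D": 12.0}
ppp_rounds = {
    "old": {"A": 50.0, "B": 20.0, "C": 300.0, "D": 5.0},
    "new": {"A": 40.0, "B": 20.0, "C": 250.0, "D": 6.0},
}

grid = {
    (round_name, method): derive_ipl(npls_lcu, ppp, method)
    for round_name, ppp in ppp_rounds.items()
    for method in ("mean", "median")
}

# Holding the method fixed isolates the PPP effect; holding the PPPs fixed isolates the method effect.
ppp_effect_median = grid[("new", "median")] - grid[("old", "median")]
method_effect_new = grid[("new", "mean")] - grid[("new", "median")]
```

Comparing only the ("old", "mean") cell with the ("new", "median") cell would mix both effects, which is the concern raised above.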

Stability of the new iPL value

In demonstrating the improved nature of the new iPL derivation method, Jolliffe et al. (2022) argue that the iPL value is “robust to varying the set of countries”. They demonstrate this by showing that when ranking countries according to their GDP per capita (in 2017 PPP terms), any group of countries from the first 11 up to the first 40 has a median within the narrow band between $2.14 and $2.17/day (see column “CMedian” in Table 1 in the Supplementary Material). We investigate this claim across two dimensions.

First, we check if this stability is primarily related to selecting any fixed point on the distribution of NPLs, and we find that this is not the case (Table 2 in the Supplementary Material). Testing alternative percentiles from the 40th to the 60th in one-percentile steps, we find that the observed stability holds only for the 50th to 52nd percentiles, and not for any other percentile that we tested.Footnote 17
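
A check of this kind can be sketched as follows. The sketch is illustrative only: the list of lines is hypothetical, the linear-interpolation percentile is one of several common conventions and may differ from the one used for the Supplementary Material, and the width of the band is reported without applying any particular stability threshold.

```python
def percentile(values, p):
    """p-th percentile (0 to 100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def band_width(npls_ranked, p, k_min, k_max):
    """Spread of the p-th percentile over the first k lines, for k = k_min..k_max."""
    stats = [percentile(npls_ranked[:k], p) for k in range(k_min, k_max + 1)]
    return max(stats) - min(stats)


# Hypothetical lines (PPP dollars per day), ordered by GDP per capita, poorest first.
npls_ranked = [1.9, 2.3, 2.1, 2.2, 2.0, 2.15, 2.6, 1.8, 2.15, 3.0, 2.15, 3.4]

# Width of the band for each percentile from the 40th to the 60th.
widths = {p: band_width(npls_ranked, p, k_min=5, k_max=12) for p in range(40, 61)}
```

A narrow band at the 50th percentile combined with wide bands at neighbouring percentiles would indicate that the stability is specific to the median rather than a general property of the data.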

Second, using the PPP uncertainty (discussed thoroughly in section “Precision” below) we find that such wide-range stability occurs in only 1 out of every 10,000 PPP draws,Footnote 18 while 49 out of 10,000 draws demonstrate this stability when grouping the first 11 up to the first 20 countries instead of 40.Footnote 19
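
The structure of such a simulation can be sketched as below. This is not the procedure of section “Precision”: the independent lognormal shocks, the 15% relative error, the 3-cent tolerance, the fixed country ranking, and the base lines are all assumptions made for illustration, and the function name is hypothetical.

```python
import random
import statistics


def share_of_stable_draws(npls_ppp, rel_error, k_min, k_max, tol, n_draws, seed=0):
    """Share of simulated PPP draws in which the cumulative median stays within `tol`."""
    rng = random.Random(seed)
    stable = 0
    for _ in range(n_draws):
        # Perturb each PPP-converted line by an independent multiplicative shock.
        perturbed = [line * rng.lognormvariate(0.0, rel_error) for line in npls_ppp]
        medians = [statistics.median(perturbed[:k]) for k in range(k_min, k_max + 1)]
        if max(medians) - min(medians) <= tol:
            stable += 1
    return stable / n_draws


# Hypothetical lines for 40 countries ordered by GDP per capita, poorest first.
base_lines = [2.15 + 0.02 * i for i in range(40)]
share = share_of_stable_draws(
    base_lines, rel_error=0.15, k_min=11, k_max=40, tol=0.03, n_draws=2000
)
```

A small share would suggest that the stability observed at the published PPP values is unlikely to survive plausible PPP measurement error.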

The above findings suggest that, in all likelihood, the observed stability of the median across a wide spectrum of sets of countries is rather coincidental to the exact PPP mean values and the choice of the 50th percentile as the reference point.

A relative methodology?

At its core, the iPL derivation method shares some important properties with relative poverty lines. In effect, the iPL derivation method takes the standard approach in relative poverty measurement, but uses the actual median value of NPLs instead of a fraction of their median. The standard approach in relative poverty definition is to draw the line at the 40th% to 60th%-ile of the median (of the income distribution across individuals or households). Therefore, the iPL can be thought of as being derived with a concept similar to the one proposed by Fuchs (1967) on how to anchor the definition of relative poverty lines, so it makes good sense to compare it with what the standard definition of relative poverty lines would produce. Here, there are two basic options:

  1. To consider that the global relative poverty line would be equal to the 40th% to 60th%-ileFootnote 20 of the median NPL across all countries in the grouping sample (ranked according to their NPL not GDP per capita). And then the iPL would be anything between 40th% ($2.07/day) to 60th%-ile ($3.10/day) of the median NPL. Such a comparison would show that the running definition of the iPL approximates the most frugal relative poverty approach in terms of the NPL population around the world.

  2. To consider the world as one distribution and identify the iPL as anything between 40th% to 60th%-ile of the median income/consumption of individuals across the world. The 40th% to 60th%-ile of the median, which is at $7/day, are $2.80/day and $4.2/day for 2017, and correspond to poverty rates of 17.4% and 32.33%. The current iPL corresponds to the 27th%-ile of the median in 2017. This comparison shows that the running iPL definition is substantially below possible relative poverty lines estimated for the world as a whole.

Essentially, we argue that the $2.15/day iPL estimated for monitoring extreme poverty around the world can, due to the relative nature of its derivation, be thought of as effectively estimating a frugal form of relative poverty across low-income countries for the benchmark PPP year and, due to its very low value, as a proxy for extreme poverty everywhere else. Its exact nature becomes vaguer as one moves away from the benchmark year, since its value is kept constant in (local) real terms. This characteristic goes against the goal of the iPL to measure extreme absolute poverty around the world.Footnote 21

Rankings and groupings

With respect to step 4 from the ideal iPL setting, the key prerequisite is how one would group the various countries. Such a task has more than one answer, as there are several options beyond the administrative groupings that the World Bank is using. For example, one can use the United Nations Human Development Index (HDI), in its simple or its inequality-sensitive versions. Or one could use the mean or median income or consumption values from the original household surveys. Or one could group the countries in terms of equally distributed equivalent income, using for instance the Atkinson approach (Atkinson, 1970). Which of the above or other indicators should be used is up for some serious debate, though at the outset the HDI seems to us the most defensible approach. In terms of deciding the actual groups there are at least two competing angles: one would be to use a well-known recipe like taking the bottom 40 percent, as is the case for defining equitable growth; another would be to take those with less than 40%, 50%, or 60% of the median values, as in relative poverty. What remains clear in any case is that any particular grouping is to some extent an arbitrary choice, and perhaps the most defensible approach of all the above would be to take the union of the most deprived countries resulting from all these methods. The administrative method that the World Bank applies to resolve this issue is a practical one, but its implications and the deviations it implies vis-a-vis defensible alternatives remain unclear. For illustration purposes, we use the HDI indicator as published by the UN and take as the iPL reference group the bottom 40% of countries ranked according to the HDI. The median value of the 71 countries in that group is $2.75/day, which corresponds to a global poverty rate of 16.7%.
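The illustrative grouping can be written down compactly. The sketch below uses ten made-up countries; the function name `reference_group_median`, the HDI and NPL values, and the choice to round the group size down are assumptions for illustration only.

```python
import numpy as np

def reference_group_median(hdi, npl, share=0.4):
    """Median NPL of the bottom `share` of countries ranked by HDI (lowest first)."""
    hdi = np.asarray(hdi, dtype=float)
    npl = np.asarray(npl, dtype=float)
    n_group = int(np.floor(share * len(hdi)))  # group size, rounded down (assumption)
    group = np.argsort(hdi)[:n_group]          # indices of the lowest-HDI countries
    return np.median(npl[group]), n_group

# Hypothetical HDI scores and NPLs ($/day) for ten countries.
hdi = [0.40, 0.45, 0.52, 0.58, 0.63, 0.70, 0.76, 0.81, 0.88, 0.93]
npl = [1.8, 2.1, 2.6, 3.0, 3.9, 5.5, 7.0, 12.0, 20.0, 25.0]

line, n_group = reference_group_median(hdi, npl)
print(n_group, line)
```

Swapping the ranking variable (for example, median survey consumption instead of the HDI) only changes the first argument, which makes it easy to see how sensitive the resulting line is to the grouping choice.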

Accuracy

Minimization criteria

A critical question raised above is concerned with the methodological foundation for working in the domain of NPLs instead of poverty characteristics themselves as the epicenter of what poverty means across countries. Jolliffe et al. (2022) argue that the new iPL “reliably reflects the typical standard by which the poorest countries of the world judge their citizens to be impoverished”. Technically this is an internally consistent statement. But how does the new iPL behave when it is put in use? Do the resulting statistics also represent well those considered impoverished across the poorest countries according to the definitions used in those countries?

By using the iPL instead of the NPL, we are moving away from the national statistic of poverty towards an estimate coming from a global standpoint. How do we know that this estimate is doing its job in reflecting the “countries’ own judgments of what it means to be poor” (Jolliffe et al., 2022, p.11) upon which the new iPL approach relies in defining poverty? A criterion is necessary here, though none is explicitly provided by the Bank.Footnote 22 To best reflect “countries’ own judgments of what it means to be poor”, the poverty statistic produced from a global standpoint needs to minimize the difference between the poverty statistics produced at the national level and the international one across the relevant country groups. The larger (smaller) this difference, the worse (better) the iPL is as a model for reflecting those judgments at the country level. Consequently, the result of this minimization may differ according to the poverty statistic that is used.

We proceed to answer this key question regarding the representativeness of the new iPL by devising five well-founded minimization criteria: the number of individuals that are misclassified using the iPL, the poverty rate, the poverty gap, the poverty intensity, and the Watts index, which are all standard indicators reported by the World Bank on its Poverty and Inequality Platform (PIP). For the low-income countries (LIC), the iPL value that minimizes the Euclidean distance across all criteria is $1.9/day, which corresponds to the 35th%-ile of the distribution of NPLs of this country group; for the lower middle-income country group (LMIC) it is $2.52/day at the 24th%-ile; for the upper middle-income country group (UMIC) it is $5.6/day at the 23rd%-ile; and for the high-income country group (HIC) it is $13.25/day at the 18th%-ile. The iPL performs 8% worse than the optimal solution in minimization terms for the LIC, 35% worse for the LMIC, 11.5% worse for the UMIC, and 72% worse for the HIC. We arrive at these estimates by comparing the mean deviation of the optimal iPL in Euclidean terms with that of the new iPL for each country group. These findings indicate that the iPL method, seen from this angle, is biased upwards, and appears to be a bad fit across all key poverty statistics employed in the literature and reported by the World Bank’s Poverty and Inequality Platform.
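The minimization can be illustrated with a small sketch on synthetic data. It is not the procedure used to obtain the figures above: the survey samples and NPLs are randomly generated, misclassification is measured as a share of each sample rather than as a head count, and scaling each criterion by its maximum before taking the Euclidean norm is an assumption made so that no single criterion dominates.

```python
import numpy as np

def poverty_stats(y, z):
    """Headcount rate, poverty gap, squared gap (intensity) and Watts index at line z."""
    poor = y < z
    gap = np.where(poor, (z - y) / z, 0.0)
    watts = np.where(poor, np.log(z / y), 0.0)
    return np.array([poor.mean(), gap.mean(), (gap ** 2).mean(), watts.mean()])

def criteria(y, npl, ipl):
    """Five absolute deviations between iPL-based and NPL-based results for one country."""
    misclassified = np.mean((y < npl) != (y < ipl))  # share classified differently
    stats_diff = np.abs(poverty_stats(y, ipl) - poverty_stats(y, npl))
    return np.concatenate(([misclassified], stats_diff))

# Synthetic country group: consumption samples ($/day) and NPLs, all hypothetical.
rng = np.random.default_rng(1)
countries = []
for _ in range(25):
    y = rng.lognormal(mean=np.log(3.0), sigma=0.6, size=2000)
    npl = rng.uniform(1.5, 3.5)
    countries.append((y, npl))

candidates = np.round(np.arange(1.0, 4.01, 0.05), 2)

# Mean absolute deviation per criterion (rows: candidate lines, columns: criteria).
mad = np.array([
    np.mean([criteria(y, npl, z) for y, npl in countries], axis=0)
    for z in candidates
])

scaled = mad / mad.max(axis=0)             # put the five criteria on a common scale
distance = np.linalg.norm(scaled, axis=1)  # Euclidean distance across criteria
best = candidates[np.argmin(distance)]
print(best)
```

The relative performance of any given line can then be read off as the ratio of its distance to the minimum distance, which is the kind of comparison behind the "x% worse than the optimal solution" statements.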

The diagrams in Fig. 2 show that, using an iPL, we are after several targets at the same time, and it is impossible to get them all correct. However, using a method that only worked with steps 1 and 2 of the four concrete steps in the ideal iPL, we would be achieving all targets at once by construction. At the same time, should we use these minimization criteria to select an iPL, we would end up with a value that is far too low, that certainly does not keep the real value of what the iPL translates to in local country terms, and that thus violates a key requirement of the WB and its Commission on Global Poverty.

Fig. 2: Minimization Criteria.

From top left to right, the mean absolute difference between the statistics produced at various values of the iPL against the statistics produced using the national poverty lines. The red dot indicates the value at the current iPL.

Countries or individuals?

Considering the framework of the new iPL, which strives to best represent the standards with which countries define poverty, one can legitimately ask whether the various calculations ought to take place at the country or at the individual level. Put differently, should the iPL be a country-weighted or a population-weighted poverty line? A country-weighted iPL is one where all NPLs get a weight of 1, while a population-weighted iPL is one where each NPL gets its weight from the population of that country. The choice hinges on the perspective taken: whether we are interested in the measurement of poverty at the country level, or in defining poverty at the level of the international individual.Footnote 23 Table 1 in the Supplementary Material provides the results for the iPL taking both standpoints. The iPL is a bit lower, at $2.04/day, in the population-weighted case, while the mean of the poorest 40 is $2.21/day; this corresponds to a difference of approximately 1.1 percentage points compared with the official global poverty rate, or a substantial relative reduction of about 11.5% (both reported here for 2017). The choice of how to weigh the NPLs also influences the stability of the median as more countries are added (see Table S1 in the Supplementary Material). Although this stability is no longer present in the population-weighted approach, the $1.9/day iPL for 2017 PPPs is by far the most frequent result among the population-weighted median values, as it is obtained in 28 out of the 40 least affluent countries. Rather coincidentally, this result matches the one we obtained in the previous sub-section by using the minimization criteria across the LICs.
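The difference between the two weighting schemes can be made concrete with a weighted median. The sketch below is illustrative only: `weighted_median` is a hypothetical helper, and the five NPLs and populations are invented.

```python
import numpy as np

def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half of the total weight."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cum = np.cumsum(weights)
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return values[idx]

# Hypothetical NPLs ($/day) and populations (millions) for five countries.
npls = [1.5, 1.8, 2.2, 2.6, 3.0]
pops = [20, 120, 15, 30, 10]

country_weighted = weighted_median(npls, np.ones(len(npls)))  # every NPL has weight 1
population_weighted = weighted_median(npls, pops)             # NPLs weighted by population
print(country_weighted, population_weighted)
```

In this toy example one populous country with a low NPL pulls the population-weighted line below the country-weighted one. Note that with an even number of equally weighted values this helper returns the lower of the two middle values, whereas `np.median` averages them.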

External validity

Definition used

An official and agreed-upon definition of extreme/absolute poverty is:

"Absolute poverty is a condition characterized by severe deprivation of basic human needs, including food, safe drinking water, sanitation facilities, health, shelter, education and information. It depends not only on income but also on access to social services”. (Copenhagen Declaration on Social Development, 1995)Footnote 24

In turn, does the iPL derivation method use a particular definition of extreme poverty? The answer to this very basic question is actually yes and no. In fact, the iPL does not represent any specific definition of extreme poverty. But it strives to remain close to what the “countries’ own judgments [are regarding] what it means to be poor” (Jolliffe et al., 2022, p.11). Seen from this angle, it resembles a mashup index, containing an indirect definition of extreme poverty (across the relevant countries). Below in this sub-section, we contrast the results of this definition-loose approach with alternative approaches that have a more specific definition at their core, though to be sure this does not, a priori, make these alternatives true measures of extreme global poverty. They are simply operationalized here as external references to place the iPL approach in a broader context.

OPHI/MPI

The multidimensional poverty index (MPI), developed by Alkire and Santos (2010) and now maintained under the auspices of the Oxford Poverty and Human Development Initiative (OPHI), has been recognized by scholars and the United Nations as a favorable approach for consistently measuring poverty around the world and for pinpointing the living standard of those living in conditions of poverty. Due to its construction, the MPI lends itself to use in policy-making; the UN incorporates the MPI results in its Human Development Reports, and it is also used by Jolliffe et al. (2022) as an external point of comparison. This should not be taken to mean that the MPI is free of criticism from the scholarly community; see for example Catalán and Gordon (2020), Dotter and Klasen (2017), Duclos and Tiberti (2016), and Santos and Villatoro (2020) for some details.

Nonetheless, it is a good and widely used instrument against which to benchmark the iPL approach. Among the 112 pairs of countries spread across the years 2010–2020, with 78% in the 2015–2020 period, the mean absolute difference is about 12.2% (CI95%: 0.06%–52%), with a median of 5.2%.Footnote 25 For most countries, the MPI gives a substantially higher poverty rate than the iPL.
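A comparison of this kind boils down to summarising the absolute differences between two sets of poverty rates. The sketch below shows one way to do so on invented numbers; `compare_rates` is a hypothetical helper, and reading the reported interval as the 2.5th to 97.5th percentile range of the absolute differences is an assumption, since the exact construction is not spelled out here.

```python
import numpy as np

def compare_rates(rate_a, rate_b):
    """Summarise absolute differences (percentage points) between two sets of poverty rates."""
    diff = np.abs(np.asarray(rate_a, dtype=float) - np.asarray(rate_b, dtype=float))
    low, high = np.percentile(diff, [2.5, 97.5])  # assumed reading of the interval
    return {"mean": diff.mean(), "median": np.median(diff), "p2.5": low, "p97.5": high}

# Hypothetical poverty rates (%) for the same six country-years under two methods.
ipl_rates = [1.0, 5.0, 12.0, 30.0, 45.0, 60.0]
alt_rates = [2.0, 9.0, 10.0, 55.0, 50.0, 61.0]

summary = compare_rates(ipl_rates, alt_rates)
print(summary)
```

A mean that sits well above the median, as in this toy example, signals a skewed distribution in which a few country-years differ by a very large margin.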

Global cost of basic needs poverty lines

The first such contribution to global poverty measurement estimated a frugal consumption basket, reflecting the items barely enough to fulfill some basic needs, generally following the definition of the subsistence basket from the historical real wages literature (Moatsos, 2016). The consumption baskets were estimated for all developing countries for the period 1985–2014, identifying a much lower but more persistent level of global poverty throughout the period compared to the World Bank statistics at the time. With richer, yet still frugal, baskets, global poverty levels increase enormously. Allen (2017) estimated cost of basic needs-based poverty lines for 2011 using that year’s PPP exchange rates and the prices collected by the World Bank’s International Comparison Program (ICP), which has the responsibility for the PPP estimation. The purpose of this approach was to provide a cost of basic needs basis for the estimation of an international poverty line. Indeed, the mean of the poverty lines that Allen estimated for the developing countries in his data is $1.88/day, almost exactly the iPL value.

We use the non-subsistence poverty results from Moatsos (2016), which cover developing countries, to compare with the results from the iPL method. There are 101 countries common to both the Moatsos (2016) and the iPL datasets. The average absolute difference is somewhat higher than the one with the MPI, and stands at about 20.4% (CI95%: 0.2%–54.7%), with a median of 16.8%.Footnote 26 In the comparison with the 20 countries in the Allen (2017) dataset, the mean absolute difference is ca. 9.4% (CI95%: 0%–39.9%), with a median of ca. 3.5%. The difference is substantial, even for these CBN NPLs that were constructed to provide the CBN framework for supporting the 2011 PPP iPL. According to the Allen (2017) poverty lines, half of the countries have a higher CBN-based poverty rate than under the 2017 PPP iPL, and half a lower one.

Food only

In 2019, a team of researchers at Tufts University led by Prof. Will Masters estimated the costs and the affordability of healthy diets as defined by the EAT-Lancet initiative (Hirvonen et al., 2020; Willett et al., 2019).Footnote 27 Assuming that the cost of the healthy diet is indicative of an extreme poverty line for countries where food expenses constitute the bulk of the budget that the least well-off households have to spend, we use the cost of the EAT-Lancet healthy diet as a poverty line (Hirvonen et al., 2020; Willett et al., 2019). The EAT-Lancet healthy reference diet is the product of a large consortium of specialists in health, nutrition, economics, and other fields, who set out to create a diet that both provides the nutrients humans need and requires food quantities that can be sustainably produced on a global scale within our planet (assuming a global population of 10 bn post-2050). In comparison with such a food-only approach, the mean absolute differences with the iPL are in between those of the CBN and the MPI methods. There are 147 countries in both the EAT-Lancet and the iPL datasets, and their mean absolute difference stands at about 10% (CI95%: 0%–38.7%), with a median of 6%, again indicating a substantial number of extreme differences.Footnote 28 For all but a few countries, the food-only EAT-Lancet-based poverty lines give a substantially higher poverty rate than the iPL.

Precision

Following recommendation 5 of the Commission on Global Poverty stipulating to explicitly include in the total error approach the error attributed to the derivation method, we use the Monte-Carlo microsimulation technique to estimate exactly this error component.Footnote 29 As the core source of uncertainty, we use the errors in the estimation of the PPP exchange rates for 2017 as published by Rao et al. (2022).

In principle, we take 1000 random draws of PPP exchange rates, based on their mean and standard error values, which result in 1000 sets of possible NPLs. For each of these draws we take the median of that group of NPL points. Each of these medians is estimated with some uncertainty, due to the inherent statistical uncertainty in calculating the median value, and we capture that via bootstrapping techniques to avoid making any parametric or distributional assumptions. We take 100 bootstrap samples for each of the 1000 PPP draws and estimate a median value for each of those bootstrap samples. This results in a distribution of 100,000 possible median values that reflect both the uncertainty in defining the median and the uncertainty in the PPP estimates. Based on this Monte-Carlo microsimulation we are able to calculate the mean value and the 95% confidence interval of the median iPL value, as well as the mean value and confidence interval of the corresponding global poverty rates.
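The two nested sources of uncertainty can be sketched as follows. This is a toy version under stated assumptions, not the microsimulation itself: the reference group of 30 countries, their NPLs, the PPP means, the 10% standard errors, and the use of normally distributed PPP errors are all hypothetical choices, and the step that maps each line to a global poverty rate is omitted.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical reference group (all values invented for illustration).
n_countries = 30
npl_usd = rng.lognormal(np.log(2.2), 0.35, n_countries)  # NPLs at the mean PPPs ($/day)
ppp_mean = rng.uniform(50, 500, n_countries)             # local currency per PPP dollar
ppp_se = 0.1 * ppp_mean                                  # assumed 10% standard error
npl_local = npl_usd * ppp_mean                           # NPLs in local currency

n_draws, n_boot = 1000, 100
medians = np.empty((n_draws, n_boot))
for d in range(n_draws):
    # One PPP draw: normal errors around the mean values (distribution is an assumption).
    ppp = rng.normal(ppp_mean, ppp_se)
    npl_draw = npl_local / ppp
    # Bootstrap: resample the countries with replacement, one median per resample.
    idx = rng.integers(0, n_countries, size=(n_boot, n_countries))
    medians[d] = np.median(npl_draw[idx], axis=1)

ipl_mean = medians.mean()
ci_low, ci_high = np.percentile(medians, [2.5, 97.5])
print(medians.size, round(ipl_mean, 2), round(ci_low, 2), round(ci_high, 2))
```

Pooling all 1000 x 100 medians into one distribution is what lets a single interval reflect both the PPP uncertainty and the sampling uncertainty of the median.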

Figure 3 contains the results of this exercise. The 95% confidence interval of the iPL is $1.8/day to $2.8/day and the mean value is $2.19/day, which is very close to the iPL value. The mean global poverty rate is 10.3% for 2017, with a 95% confidence interval of 6.2% to 17%, corresponding to an estimate of 488 to 1318 million people living in conditions of extreme poverty. The asymmetric uncertainty bands are the result of the differential in the steepness of the income distributions below and above the iPL. These results can be vividly contrasted with the claim regarding the new iPL that its “value is robust [...], and hence reliably reflects the typical standard [... of being] impoverished” (Jolliffe et al., 2022, p. 2).

Fig. 3: Distribution of iPLs and Global Poverty rates.

This figure shows the distributions of iPLs and Global Poverty rates when uncertainty because of PPP errors is taken into account using a Monte-Carlo microsimulation.

Compared with the estimates from a similar exercise for the previous iPL based on the 2011 PPP rates, the uncertainty is considerably lower (though the micro-simulations are not exactly one-to-one comparable), but it is still at very high levels, and remains remarkably higher than the uncertainty of the cost of basic needs approach by Moatsos and Lazopoulos (2021). Specifically, Moatsos and Lazopoulos show that, using the 2011 iPL derivation, the 95% confidence interval for the iPL is $0.91/day to $3.19/day with an average of $1.84/day, and the resulting confidence interval of the global poverty rate estimate is 2% to 28% with a mean of 11%.Footnote 30 Compared to these results, uncertainty appears less extreme for the new iPL. The underlying NPLs of the 2017 reference group (LIC group) have a mean of $2.41/day and a standard deviation of $0.95/day, or 0.39 in relative terms, vs. a mean of $1.88/day and a standard deviation of $0.68/day for the 2011 reference group,Footnote 31 or in relative terms a rather similar value of 0.36; if anything, the dispersion becomes noisier in 2017. Moreover, the PPP error term estimates for the 2017 PPP round are higher than for 2011, resulting from various methodological improvements in the estimation of PPP errors (see Rao et al. (2022) for details). However, despite the increased uncertainty at the PPP level and in terms of the underlying NPL mean values, we do observe some improvement with respect to the breadth of the confidence interval of the global poverty rate with the 2017 methodology. Therefore, with respect to the World Bank method, this can be seen as a methodological improvement, tentatively stemming from taking the median instead of the mean of the reference NPLs. We think that most of this improvement stems from the relatively higher value of the NPL reference group in 2017, which translates to the disproportionate increase of the floor value of the median confidence interval to $1.8/day instead of the $0.9/day in 2011, vis-a-vis their mean values.

In any case, it is very unlikely that the errors due to the scattering of the underlying NPLs and the PPPs will become decisively smaller in the future, so we cannot expect the substantial uncertainties endemic to the derivation method of the iPL to be mitigated. For reference, it takes a reduction of the relative dispersion of NPLs in the reference group by 90% and a reduction of the PPP uncertainty by 80% for the 95% confidence interval of the global poverty estimates to become about 2 percentage points wide.Footnote 32 This is quite an unlikely event. As discussed in the section “An ideal setting” (points (a)–(c)), the variance in NPLs, made evident when expressed in PPP terms, would also be expected in the ideal case where countries would use the same definition of poverty.

Discussion, policy implications, and recommendations

As discussed above, the levels of Euclidean distance of the various iPLs, across the minimization applied above, intrinsically relate to the administratively decided number of country groups. Obviously, the minimum error (actually zero error in terms of minimization) would occur when that number is equal to the number of participating countries. The obvious question to ask is why do we want to move away from that optimum? It is already the case, that the World Bank is gradually moving in that direction from estimating one iPL, then moving to three iPLs, and now promoting the use of 4 iPLs corresponding to countries performing differently across socio-economic development indicators. It is our view that any one-size-fits-(almost)-all approaches should be abandoned in monitoring global poverty, and a proper definition of absolute poverty should reside at the core of a new approach.

There is little doubt that the COVID pandemic increased the number of individuals living in conditions of poverty across the world, yet the levels of uncertainty of the iPL methodology do not allow us to pinpoint the level of that increase (let alone its geographic distribution) with any reasonable degree of certainty on a global scale. This point is demonstrated in Fig. 4, which shows, among other things, the official global poverty statistics along with 6 alternative trajectories (among infinitely many similar trajectories) derived from alternative PPP draws that reflect the uncertainty in the estimation of the PPP terms and the iPL as presented above. Clearly, along any trajectory, the intrinsic volatility is higher than the ca. 1% increase in global poverty related to COVID.Footnote 33 This is hardly an acceptable result. Currently, it is only at the national level, due to better-defined national poverty lines, that one can get a good grasp of how intense the impact was with a reasonable degree of certainty. Global poverty statistics properly produced should be able to help in that regard, and in the global coordination of the fight against poverty.

Fig. 4: Randomized trajectories in context.

This figure shows the official global poverty estimates contrasted along with 6 alternative trajectories based on PPP uncertainty. All paths are equally probable. For context, the global poverty estimations based on keeping the iPL methodology fixed across benchmark years shape boundaries that contain most of the alternative trajectories. Also, note that the 2017 iPL/PPPs show a slightly higher rate of global poverty reduction compared to the 2011 vintage.

In addition, Fig. 4 highlights the large impact on global poverty measurement that the change to the new PPPs has. It demonstrates this by showing, in comparison, the official global poverty statistics in 2011 and 2017 PPP terms along with two counterfactual lines indicating the poverty rate with the 2011 (2017) PPP values but with the methodology used for the derivation of the iPL in the 2017 (2011) PPP round. The breadth of the path between the two lines isolates the impact of the switch to a different PPP vintage, while keeping the methodology for deriving the iPL fixed. Almost all random walks indicated by the alternative trajectories (shown with dotted purple lines) are contained within that spacious path. These random walks are equiprobable trajectories based on the PPP and iPL method uncertainties discussed above, thereby indicating that the trend of the poverty rates is also sensitive to the same uncertainties. This indicates that the impact of the change in the PPP vintage used in global poverty measurement (against the recommendations of the World Bank Commission on Global Poverty) is comparable in size to the impact of the uncertainty in the PPP exchange rates. This does not seem to be a coincidence, but rather an expected outcome. At the same time, the change in the PPP vintage and the methodological change cancel each other out almost perfectly. This is the third indication of coincidence identified in this investigation of the iPL.

The lack of coordination across nation-states to define poverty along the lines of a commonly agreed framework should not be a blocking point in pursuing a more consistent framework of poverty measurement around the world. Given the World Bank’s mandate, it appears to be the proper institution to provide its aegis for such an initiative, which appears to be a much more modest endeavor than the World Bank’s International Comparison Program, now responsible for producing the PPP exchange rates. Just as the World Bank assists the various states in defining poverty to express their national views, within the same initiative the Bank could also support the creation of globally comparable poverty lines along a commonly agreed-upon framework. As mentioned above, this framework already exists in the form of the Copenhagen Declaration. The World Bank, as an institution mandated in the fight against global poverty and entrusted with the monitoring of global poverty, does not need to force upon any country a particular definition of poverty, as indeed that “would be paternalistic and disrespectful” (The World Bank, 2016, p. 7). Countries have already agreed on a definition, and it is therefore a matter of actual implementation; this does not mean, of course, that the task is by any means an easy one.

Conclusions

Circling back to our initial questions, we asked if the method used by the Bank is fit for the purpose of measuring extreme poverty and if the method is defensibly valid both internally and externally. Above we presented findings that raise considerable doubts in that regard, initially in terms of the in-practice distance of the iPL method from its ideal version. Moreover, in terms of internal validity, the central arguments in support of the updated iPL, both in terms of its stability under the change of PPP benchmark year and in terms of the stability of its value as more countries are taken into account for its determination, are contested based on our investigation. The external validity of the method is also contested in comparison to other, more methodologically involved measurement methods for global poverty, relative to which the iPL method provides substantially lower poverty rates. Finally, its rather unintentional mixed absolute and relative nature, obtained as a by-product of the estimation method of the iPL, and the lack of a concrete extreme poverty definition at its core do not add to its external (or internal) validity.

Do the resulting statistics perform acceptably well in terms of accuracy and precision? The estimated confidence intervals of the method indicate that the method is characteristically imprecise. The minimization criteria demonstrate that it misses the optimal mark by a considerable margin, while the alternative iPL value identified via those criteria is also hard to defend, as it would (a) move further away from external benchmarks, and (b) create an even greater gap with the iPL that obtains only by changing the PPPs to the 2017 round while keeping fixed the methodology used in the previous two rounds. All in all, this rather indicates a methodological conundrum.

Consequently, it seems that the iPL approach by trying to achieve many things with one line, and by giving too much emphasis on the questionable ability of PPP rates to compare individuals living in poverty, does little to guarantee an appropriate measurement of global poverty. Therefore, the iPL framework can probably be best described as a shortcut to global poverty measurement.

The question that naturally arises is: why use a proxy, when international agreements about the meaning of absolute poverty are already signed by almost all countries around the world? It takes political will to re-invigorate the processes described in the Copenhagen Declaration, instead of trying to reinvent the wheel. This process focuses on steps “1” and “2” of the aforementioned four necessary steps for the ideal iPL, while the current iPL method tries to do steps “3” and “4” without completing “1” and “2”; it thereby adds empirical uncertainty and a lack of conceptual clarity in methodology.

Alternative approaches in the realm of cost of basic needs, capabilities, and multidimensional poverty measurement must be prioritized, not only to be able to produce statistics that spring from more appropriate methodological underpinnings, but also if we would like to link global poverty statistics with policy-making and coordination on a global scale. The commonly agreed Copenhagen Declaration should again become an integral part of the agenda of international agencies, not only as a reference framework but also in indirectly measuring and monitoring the fight against poverty on a global scale.