Urban poverty: Measurement theory and evidence from American cities

We characterize axiomatically a new index of urban poverty that i) captures aspects of the incidence and distribution of poverty across neighborhoods of a city, ii) is related to the Gini index and iii) is consistent with empirical evidence that living in a high poverty neighborhood is detrimental for many dimensions of residents’ well-being. Widely adopted measures of urban poverty, such as the concentrated poverty index, may violate some of the desirable properties we outline. Furthermore, we show that changes of urban poverty within the same city are additively decomposable into the contribution of demographic, convergence, re-ranking and spatial effects. We collect new evidence of heterogeneous patterns and trends of urban poverty across American metro areas over the last 35 years.


Introduction
Much of the literature studying economic inequality has focused on the distribution of income at the national or regional level. Inequality at the urban level is also important (Glaeser et al. 2009). In America, for instance, cities are among the most unequal places in the country (Moretti 2013; Baum-Snow and Pavan 2013). Over the last three decades, urban income inequality has increased substantially in most American metro areas (Watson 2009), albeit heterogeneously across cities. Inequality within and across neighborhoods is substantial (Wheeler and La Jeunesse 2008) and increasingly related to the trends of citywide income inequality, with low and high income households living in close spatial proximity (Andreoli and Peluso 2018).
Poverty is a key driver of income inequality in American cities. The urban poor population, i.e. the individuals living in households with aggregate income below the federal income poverty line and who reside in cities, has increased from 25.4 million in 1980 to 31.1 million in 2000 and up to 43.7 million in the 2012-2016 period (estimates based on Census and American Community Survey data). These figures correspond to about 11% of the population before 2000, rapidly increasing to 14.9% after the Great Recession. The geography of poverty has also evolved over the same period. The number of census tracts displaying extreme poverty (where at least 40% of the population is poor) has almost doubled since 2000 (rising from 2,510 to 4,412 in 2013), far outpacing demographic growth of about 11% during the same period (Jargowsky 1997, 2015).
In this paper, we study the spatial distribution of poverty across neighborhoods of a city. The goal is to establish a measurement apparatus for assessing the extent of urban poverty displayed by a city, a concept encompassing concerns for incidence of poverty in the city, as well as for the unequal distribution of poverty across neighborhoods. We outline an axiomatic approach which characterizes a new parametric index, mapping the distribution of poor and non-poor population across neighborhoods of a city into a number, which is the level of urban poverty displayed by that city. Alternative measures of urban poverty widely adopted in the literature and in policy analysis alike may violate some of the axioms we outline.
One such measure is the concentrated poverty index. The index, introduced by Wilson (1987), is the share of a metro area's poor population (identified by an exogenous income poverty line) that lives in neighborhoods where poverty is extremely concentrated. According to the American Census Bureau, these neighborhoods qualify as places where more than 20% (or 40% in places where poverty is extremely concentrated) of the resident population is poor. A number of contributions (Jargowsky and Bane 1991; Massey et al. 1991; Jargowsky 1997; Kneebone 2016; Iceland and Hernandez 2017) have documented the patterns and drivers of concentrated poverty across American metro areas. After a decline of concentrated poverty in the 1990s, the 2000s and 2010s have witnessed a re-concentration of poverty, rising from 11% to 14.1% in the largest 100 American metro areas. Patterns are heterogeneous across metro areas and depend on differences in size, geographic location and income inequality, alongside the degree of income and ethnic segregation in the city.
In an urban context, an uneven distribution of poverty across neighborhoods is a relevant dimension of collective well-being. In fact, in cities where poverty is highly concentrated in few neighborhoods, poor residents living therein are overexposed to poverty, thus likely suffering the consequences of a double burden of poverty related to its geographic distribution. Evidence of negative effects of poverty concentration at the neighborhood level has been found on a variety of relevant individual outcomes, such as health (Ludwig et al. 2011, 2013), labor market attachment (Conley and Topa 2002), individual well-being (Ludwig et al. 2012) and the economic opportunities of future generations (Chetty et al. 2016; Chetty and Hendren 2018). A way to formalize the spillover effect of concentrated poverty is to assume that individual well-being depends on household characteristics (such as poverty status) alongside the proportion of poor in the neighborhood (as in Bayer and Timmins 2005), and that well-being is decreasing in this proportion. The larger the share of population exposed to high-poverty neighborhoods, the stronger the impact of urban poverty on collective well-being (all else equal). An urban poverty measure which is consistent with this view should not register less urban poverty when the share of poor population living in extreme poverty neighborhoods rises, even if this increment originates from a reduction of the incidence of poverty in neighborhoods where poverty is less extreme. Using simple counterexamples, we show that the concentrated poverty index may violate this intuitive principle (see also Massey and Eggers 1990; Jargowsky 1996).
This paper addresses this measurement concern and introduces a new measure of urban poverty that is inspired by inequality analysis and is consistent with the intuitive requirement outlined above. Our measure weights three components of urban poverty that positively contribute to it: first, the incidence of poor residents in high poverty neighborhoods; second, the inequality in the distribution of poverty within the cluster of high poverty neighborhoods; third, the extent of inequality in the distribution of poor residents across high poverty and low poverty neighborhoods. The index we characterize shares features with prominent members of the class of rank-dependent poverty measures (Bosmans 2014; Ebert 2010; Sen 1976) and addresses robustness concerns about the way high and low poverty neighborhoods are defined (a related point is raised in Shorrocks 1995; Thon 1979). We organize our results in Section 3, whereas Section 2 provides the setting.
When the focus is on the distribution of poverty across the whole city, the urban poverty index is shown to converge to a Gini-type index. In this case, we demonstrate that the longitudinal variation in urban poverty is additively and non-parametrically decomposable into the contribution of demographic growth, of poverty convergence across neighborhoods and of spatial association in poverty changes (Section 4). The decomposition is relevant for assessing whether urban poverty is mostly driven by neighborhoods that are spatially clustered, unveiling local poverty traps that can potentially reinforce the double burden effects of poverty concentration, or whether urban poverty is instead idiosyncratic to neighborhood characteristics.
In Section 5, we employ our measurement apparatus to assess the dynamics of poverty across all American metro areas over the last 35 years, exploiting rich data from the Census and the American Community Survey (ACS). Our main findings are that: i) American metro areas display strong heterogeneity in urban poverty patterns; ii) urban poverty has not evolved significantly over the 35 years and has hardly been affected by the Great Recession, contrary to the rising trends of concentrated poverty; iii) both re-ranking and convergence components of urban poverty changes are substantial across metro areas, indicating the role of changes in neighborhood poverty composition; iv) the spatial component of urban poverty is negligible for the large majority of cities, but very significant in the largest metro areas, where clustering of high-poverty neighborhoods seems to be an issue.
Section 6 concludes with a discussion.

Setting
For any given city, we consider a partition of the urban space into $n$ neighborhoods. In empirical analysis, neighborhoods can coincide with an administrative division of the territory, such as the partition of American cities into census tracts. We take the partition into neighborhoods as given, and we study the distribution of poor and non-poor people therein. Let $i \in \{1, \dots, n\}$, with $n$ a positive natural number, denote a neighborhood of a city and $N_i \in \mathbb{R}_+$ be the number of individuals living in that neighborhood, with $N = \sum_{i=1}^{n} N_i$. An individual is poor when living in a household whose total disposable income is smaller than an exogenous poverty threshold (such as the federal income poverty line provided by the American Census Bureau), calculated in a given year for that specific type of household (for instance, depending on the size and the age structure). The analysis of urban poverty is hence conditional on the definition of poverty status, which we take as given. In our application, for instance, residents are poor if they live in households with an equivalent income smaller than 100% of the federal poverty line. Furthermore, let $P_i$ denote the number of poor individuals living in neighborhood $i$, while $P = \sum_{i=1}^{n} P_i$ is the number of poor individuals in the city. An urban poverty configuration is a collection of counts of poor and non-poor individuals distributed across neighborhoods and is denoted by $A = \{(P_i, N_i)\}_{i=1}^{n}$. In what follows, a configuration always represents a city in a given year, and we use superscripts to indicate a specific urban poverty configuration only when disambiguation is needed. The ratio $P_i/N_i$ measures the incidence of poverty in neighborhood $i$. The ratio $P/N$ measures instead the incidence of poverty in the city, and is equivalent to the average of poverty incidences across neighborhoods, weighted by the respective population proportions, i.e.

$$\frac{P}{N} = \sum_{i=1}^{n} \frac{N_i}{N} \, \frac{P_i}{N_i}.$$
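As a minimal sketch of this notation, the snippet below stores a hypothetical configuration as two lists of counts and checks numerically that the citywide incidence equals the population-weighted average of the neighborhood incidences. The counts and the function names are illustrative assumptions, not data or code from the paper.

```python
# Hypothetical three-neighborhood configuration (illustrative counts only).
N = [20, 50, 30]  # residents N_i in each neighborhood
P = [10, 10, 3]   # poor residents P_i in each neighborhood


def neighborhood_incidence(P, N):
    """Poverty incidence P_i / N_i of each neighborhood."""
    return [p / n for p, n in zip(P, N)]


def city_incidence(P, N):
    """Citywide poverty incidence P / N."""
    return sum(P) / sum(N)


def weighted_average_incidence(P, N):
    """Average of the P_i / N_i, weighted by population shares N_i / N."""
    total = sum(N)
    return sum((n / total) * (p / n) for p, n in zip(P, N))


print(neighborhood_incidence(P, N))
print(city_incidence(P, N), weighted_average_incidence(P, N))
```

Both printed citywide figures should agree up to floating-point rounding, which is the identity stated above.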
We use $\zeta \in [0, 1)$ to define an urban poverty threshold, a cutoff point that identifies the neighborhoods where poverty is over-concentrated. The urban poverty threshold incorporates an exogenous normative judgment about the level of poverty concentration that can be tolerated in a given neighborhood: when $P_i/N_i \geq \zeta$, poverty in neighborhood $i$ reaches or exceeds the tolerance level and contributes to urban poverty. In this case, neighborhood $i$ is referred to as a highly concentrated poverty neighborhood. When $\zeta = 0$, tolerance is set to a minimum, indicating that every neighborhood of the city contributes to generate urban poverty.
For a given urban poverty threshold $\zeta$, neighborhoods can be ranked by poverty incidence in non-increasing order:

$$\frac{P_1}{N_1} \geq \dots \geq \frac{P_z}{N_z} \geq \zeta > \frac{P_{z+1}}{N_{z+1}} \geq \dots \geq \frac{P_n}{N_n}.$$

For simplicity, labels $1, 2, \dots, n$ are assumed to coincide with the ranks of neighborhoods, ordered by non-increasing poverty incidence. Among all neighborhoods in the city, $z$ identifies the last neighborhood in this ranking whose poverty incidence is at least as large as the urban poverty threshold, that is, the high poverty neighborhood whose incidence is closest to it. The neighborhood $z$ serves as a benchmark. In fact, poverty is over-represented in neighborhoods $i \in \{1, \dots, z\}$.
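The ranking step can be sketched in a few lines. The helper names below are hypothetical, and the input is an illustrative three-neighborhood city; the snippet reorders neighborhoods by non-increasing incidence and counts how many of them reach the threshold, which gives $z$.

```python
def rank_by_incidence(P, N):
    """Reorder (P, N) by non-increasing poverty incidence P_i / N_i."""
    order = sorted(range(len(N)), key=lambda i: P[i] / N[i], reverse=True)
    return [P[i] for i in order], [N[i] for i in order]


def high_poverty_count(P, N, zeta):
    """Number z of neighborhoods with P_i / N_i >= zeta.

    The comparison is written as P_i >= zeta * N_i to avoid a division.
    """
    return sum(1 for p, n in zip(P, N) if p >= zeta * n)


# Illustrative, unsorted input: incidences 0.3, 0.7 and 0.5.
P_ranked, N_ranked = rank_by_incidence([3, 7, 5], [10, 10, 10])
z = high_poverty_count(P_ranked, N_ranked, zeta=0.4)
print(P_ranked, N_ranked, z)
```

With a threshold of 0.4, the two neighborhoods with incidences 0.7 and 0.5 are counted, so $z = 2$ in this example.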
In this paper, the urban poverty threshold is exogenously given and represents a normative stance about the maximum level of poverty which can be tolerated in a neighborhood without triggering poverty concentration. The Census Bureau, for instance, makes use of the 20% and 40% thresholds to identify places where poverty is highly concentrated and ghettoes, respectively. As a result, if a city displays higher poverty incidence on average than another, then that city should also display larger deviations from the urban poverty threshold, hence larger urban poverty: even if poor people are evenly spread across neighborhoods, residents in the first city have larger chances to be exposed to poverty in their neighborhoods compared to residents in the second city (for a discussion, see Ravallion and Chen 2011).

Concentrated poverty and its critical aspects
A convenient way to represent the distribution of the poor population in the city is to plot the cumulative proportion of the poor against the proportion of the overall population living in the neighborhoods displaying higher incidence of poverty, i.e. ranked by decreasing $P_i/N_i$. The cumulative proportion of poor people up to neighborhood $j$ is given by $\sum_{i=1}^{j} P_i/P$ and the cumulative proportion of residents therein is $\sum_{i=1}^{j} N_i/N$. These pairs can be plotted for $j = 1, \dots, n$ on a graph. The curve starting from the origin and interpolating these points is a concentration curve, denoted the urban poverty curve. The urban poverty curve of a hypothetical configuration $A$ is reported in panel (a) of Fig. 1. Its graph is concave and always lies above the unit square diagonal, implying that in configuration $A$ there are neighborhoods with poverty incidence smaller than $P/N$ and other neighborhoods with poverty incidence greater than $P/N$.$^{4}$

$^{4}$ This curve can be interpreted as the Lorenz curve of the distribution of poor population proportions $P_i/N_i$ across the city neighborhoods, each weighted by $N_i/N$. The curve of a configuration in which poor people are evenly spread across neighborhoods of the city, that is $P_i/N_i = P/N$ for every neighborhood $i$, coincides with the unit square diagonal. For simplicity, we assume that the city has many neighborhoods that differ in terms of poverty shares, so that the urban poverty curve appears smooth.
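The construction of the urban poverty curve can be sketched as follows. The function name and the input counts are illustrative assumptions; the snippet returns the points of the curve, starting from the origin, for neighborhoods ranked by non-increasing incidence.

```python
def urban_poverty_curve(P, N):
    """Points (cumulative population share, cumulative share of the poor),
    with neighborhoods ranked by non-increasing incidence P_i / N_i."""
    order = sorted(range(len(N)), key=lambda i: P[i] / N[i], reverse=True)
    total_n, total_p = sum(N), sum(P)
    points = [(0.0, 0.0)]
    cum_n, cum_p = 0, 0
    for i in order:
        cum_n += N[i]
        cum_p += P[i]
        points.append((cum_n / total_n, cum_p / total_p))
    return points


# Illustrative city with incidences 0.7, 0.5 and 0.3.
curve = urban_poverty_curve([7, 5, 3], [10, 10, 10])
for x, y in curve:
    print(round(x, 3), round(y, 3))
```

Every point of the returned curve lies on or above the diagonal, which mirrors the concavity property described in the text.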
The lack of intersections of urban poverty curves is a natural criterion to rank distributions by the degree of urban poverty they display. If the urban poverty curve of configuration B lies nowhere below and somewhere above that of A, then any proportion of the population living in high-poverty neighborhoods in B is systematically exposed to a larger fraction of poverty than the corresponding population proportion in A.
In our graphical analysis, we always assume that the distributions under comparison display the same poverty incidence, i.e. $P^A/N^A = P^B/N^B$. In this case, the urban poverty threshold can be expressed in relative terms as $\zeta = \alpha P/N$, where $\alpha \geq 0$ is a parameter expressing a normative view about the sensitivity of urban poverty to the incidence of poverty in the city. Larger values of $\alpha$ imply that urban poverty evaluations should focus on neighborhoods where poverty is highly concentrated. The coefficient $\alpha$ straightforwardly relates to the urban poverty curve. For instance, in a city with $P^A/N^A = 0.2$, one can set $\alpha = 2$ to have $\zeta = 0.4$. On the graph, the coefficient $\alpha$ gives the slope of a line tangent to the urban poverty curve, as in Fig. 1, panel (a). The tangent point identifies the neighborhood $z$ displaying poverty incidence of about $\zeta = 2 P^A/N^A$, the urban poverty threshold.$^{5}$ Urban poverty curves are also related to the measurement of concentrated poverty, which is identified by the index $CP(A) := \sum_{i=1}^{z} P_i/P$. The index coincides with the level of the curve at abscissa $\sum_{i=1}^{z} N_i/N$. Graphically, it is identified by the length of the vertical line segment on the same figure. The index $CP$ measures the proportion of poor people who live in high-poverty neighborhoods, defined according to the threshold $\zeta$. According to the American census, concentrated poverty corresponds to the proportion of poor residents that live in census tracts where at least 20% or 40% of inhabitants are poor (i.e., $\zeta = 0.2$ or $\zeta = 0.4$ respectively).
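A minimal sketch of the concentrated poverty index with a relative threshold follows. The city below is hypothetical and is chosen so that $P/N = 0.2$, matching the example in the text with $\alpha = 2$ and $\zeta = 0.4$; the function name is an assumption of this sketch.

```python
def concentrated_poverty(P, N, zeta):
    """CP: share of the city's poor living in neighborhoods
    whose incidence P_i / N_i is at least zeta."""
    poor_in_high_poverty = sum(p for p, n in zip(P, N) if p >= zeta * n)
    return poor_in_high_poverty / sum(P)


# Hypothetical city with citywide incidence 20 / 100 = 0.2.
N = [10, 20, 30, 40]
P = [6, 9, 3, 2]

alpha = 2
zeta = alpha * sum(P) / sum(N)  # relative threshold zeta = alpha * P / N
cp = concentrated_poverty(P, N, zeta)
print(zeta, cp)
```

Here the first two neighborhoods (incidences 0.6 and 0.45) exceed the threshold, so 15 of the 20 poor residents are counted and $CP = 0.75$.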
The concentrated poverty index misses some important aspects of the distribution of poverty across the city neighborhoods and, as a consequence, it may rank cities inconsistently with non-intersecting urban poverty curves. Panel (b) of Fig. 1 draws an example. In the figure we consider two configurations $A$ and $B$ where $P^B/N^B = P^A/N^A$. The distribution of poverty across the neighborhoods of city $B$ is more uneven than that in city $A$, in the sense that in $B$ a larger fraction of the poor population is concentrated in high poverty neighborhoods, compared to $A$. As a consequence, the urban poverty curve of the former lies always above that of the latter. Nonetheless, $CP(B) < CP(A)$ for $\alpha = 2$.
In this paper, we introduce a new urban poverty index which is inspired by social welfare and inequality analysis and is consistent with the ranking of configurations predicted by non-intersecting urban poverty curves. The index we study compounds, with proper normative weights, two aspects of the spatial distribution of poverty: on the one hand, the extent of poverty incidence in places where poverty is highly concentrated; on the other hand, aspects of the distribution of poverty among high poverty neighborhoods as well as across high and low poverty neighborhoods. Both components positively contribute to urban poverty. While the former component has to do with the incidence of poverty among neighborhoods $1, \dots, z$, the latter component captures inequality in the distribution of poverty proportions $P_1/N_1, \dots, P_z/N_z$ and is related to the Gini coefficient $G(\cdot; \zeta)$.$^{6}$

Our analysis relates to other contributions highlighting weaknesses of the concentrated poverty index. For instance, Massey and Eggers (1990) suggest valuing the intensity and the distribution of poverty in the city as relevant aspects generating the double burden of poverty. The approach they propose, which considers mixtures of dissimilarity and interaction indices, is interesting and related to the urban poverty curve ordering, but it is not based on normative grounds. In the next section, we provide a parsimonious axiomatic approach incorporating the idea that poverty concentration gives rise to a double burden of poverty.

$^{5}$ To see this, denote with $\delta x$ and $\delta y$ variations in the coordinates of the urban poverty curves on the horizontal and vertical axis. Moving along the curve from the tangency point implies $\delta x = N_z/N$ and $\delta y = P_z/P$, which gives the slope of the curve $\delta y/\delta x = \frac{P_z}{N_z} \frac{N}{P} \approx \alpha$.

Axioms
Denote the set of admissible urban poverty configurations by $\Omega = \bigcup_{n \in \mathbb{N}} \Omega(n)$, with $\Omega(n)$ the set of configurations with $n$ neighborhoods. An urban poverty index is a function $UP(A; \zeta): \Omega \times [0, 1) \to \mathbb{R}_+$ assigning a non-negative real number to a configuration $A \in \Omega(n)$, interpreted as the level of urban poverty in that configuration. We write $UP(\cdot; \zeta)$ to explicitly recall that the measurement of urban poverty is conditional on the exogenous urban poverty threshold $\zeta$. We develop an axiomatic approach for the measurement of urban poverty.
A convenient way to incorporate concerns for the effects of a transfer of poverty operation on the measured level of urban poverty is to focus on urban poverty indices that explicitly depend on the urban poverty shortfall $P_i/N_i - \zeta$. The shortfall is non-negative in every neighborhood $i$ where poverty is highly concentrated (that is, $i \leq z$) and increases as the proportion of poor residents $P_i/N_i$ grows.$^{7}$ The first axiom introduces structure. It assumes that for any admissible configuration $A$ and urban poverty threshold $\zeta \in [0, 1)$, $UP(A; \zeta)$ is a normalized (weighted) average of urban poverty shortfalls of each highly concentrated poverty neighborhood with $P_i/N_i \geq \zeta$, where each of these neighborhoods is weighted according to its position in the ordered distribution of poor neighborhoods and to the population shares $N_1/N, \dots, N_n/N$. The whole measure is scaled according to a normalization factor that depends on the aggregate statistics. The aggregate normalization and the neighborhood weighting functions are continuous in their arguments. Let $\Delta^n$ denote the unit simplex in the $n$-dimensional space whose elements are all positive, that is $\Delta^n := \{(d_1, d_2, \dots, d_n) : d_i > 0 \text{ for } i = 1, 2, \dots, n, \ \sum_{i=1}^{n} d_i = 1\}$. To ease notation, we also denote $N_z = \sum_{i=1}^{z} N_i$ and $P_z = \sum_{i=1}^{z} P_i$, while $\mathbb{N}_+$ is the set of positive natural numbers.

$^{6}$ The index $G(\cdot; \zeta)$ is related to the area comprised between the urban poverty curve and the unit square diagonal, up to a proportion $\sum_{i=1}^{z} N_i/N$ of the overall population. A related index adopted in income inequality analysis is discussed in Zoli (1999) and Andreoli (2018).

$^{7}$ Notice that the urban poverty shortfall could never exceed zero if $\zeta = 1$. To avoid such a situation, we maintain that $\zeta < 1$.

Axiom (AGG)regation. $UP(\cdot; \zeta)$ satisfies AGG if for any
admissible configuration $A$ and $\zeta \in [0, 1)$ where $z \geq 1$, there exist a continuous function $\mathcal{A}: [0, 1]^2 \to \mathbb{R}_+$ and a sequence of continuous functions $w_i: \Delta^n \to \mathbb{R}$ for each $i = 1, 2, \dots, n$ and each $n \in \mathbb{N}_+$ such that

$$UP(A; \zeta) := \mathcal{A}\left(\frac{P}{N}, \frac{P_z}{N_z}\right) \sum_{i=1}^{z} \frac{N_i}{N} \left(\frac{P_i}{N_i} - \zeta\right) w_i\left(\frac{N_1}{N}, \dots, \frac{N_n}{N}\right). \qquad (1)$$

Note that the AGG property holds if there exists at least one neighborhood with $P_i/N_i \geq \zeta$. The case where $\zeta > P_1/N_1$ will be considered when normalizing the index. According to AGG, the function $\mathcal{A}(P/N, P_z/N_z)$ is the aggregate normalization factor and the functions $w_i(\cdot)$ denote the normative weights attached to the neighborhoods. The AGG axiom imposes considerable structure, although it represents an encompassing model for a variety of indicators consistent with the ranking of urban poverty curves, stemming from choices of normalization and weighting parameters. Given the linearity of the components considered in AGG, the concerns about the poverty distribution across neighborhoods are formalized by the choice of the weighting functions $w_i(\cdot)$.
Let us consider evaluations that are normalized by the incidence of poverty in the city, that is $\mathcal{A}(P/N, P_z/N_z) = \frac{1}{P/N}$, and assume there are no concerns about the unequal distribution of poverty across neighborhoods, that is $w_i(N_1/N, \dots, N_n/N) = 1$ for every neighborhood $i$. This parametric choice retains exclusively concerns for the incidence of concentrated poverty and can be related to the concentrated poverty index as follows:

$$CP^*(A; \zeta) := \frac{N}{P} \sum_{i=1}^{z} \frac{N_i}{N} \left(\frac{P_i}{N_i} - \zeta\right) = CP(A) - \zeta \frac{N}{P} \sum_{i=1}^{z} \frac{N_i}{N}. \qquad (2)$$

The result (2) shows that the index $CP$ is consistent with AGG only up to an additive correction factor $\zeta \frac{N}{P} \sum_{i=1}^{z} \frac{N_i}{N}$, which gives the adjusted concentrated poverty index $CP^*$. Similarly to the concentrated poverty measure, the index $CP^*$ is related to the urban poverty curves. Differently from $CP$, the index $CP^*$ always ranks configurations consistently with the ordering produced by non-intersecting urban poverty curves. This is illustrated in Fig. 2, where we consider the special case in which $\zeta = \alpha P/N$, which gives $CP^*(A; \zeta) = CP(A) - \alpha N_z/N$.
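The relation between $CP$ and $CP^*$ can be checked numerically. The sketch below assumes that $CP^*$ is the population-weighted sum of shortfalls over high poverty neighborhoods, divided by $P/N$, and that the correction factor is $\zeta (N/P) \sum_{i \leq z} N_i/N$ as stated in the text. The city and the function names are hypothetical.

```python
def concentrated_poverty(P, N, zeta):
    """CP: share of the poor living in neighborhoods with P_i / N_i >= zeta."""
    return sum(p for p, n in zip(P, N) if p >= zeta * n) / sum(P)


def adjusted_concentrated_poverty(P, N, zeta):
    """CP*: weighted shortfalls (N_i / N) * (P_i / N_i - zeta) summed over
    high poverty neighborhoods and normalized by the city incidence P / N."""
    total_n, total_p = sum(N), sum(P)
    weighted_shortfalls = sum(
        (n / total_n) * (p / n - zeta)
        for p, n in zip(P, N)
        if p >= zeta * n
    )
    return weighted_shortfalls / (total_p / total_n)


def correction_factor(P, N, zeta):
    """zeta * (N / P) * population share of high poverty neighborhoods."""
    total_n, total_p = sum(N), sum(P)
    high_share = sum(n for p, n in zip(P, N) if p >= zeta * n) / total_n
    return zeta * (total_n / total_p) * high_share


# Hypothetical city with P / N = 0.2 and threshold zeta = 0.4.
N = [10, 20, 30, 40]
P = [6, 9, 3, 2]
zeta = 0.4

cp = concentrated_poverty(P, N, zeta)
cp_star = adjusted_concentrated_poverty(P, N, zeta)
print(cp, cp_star, cp - correction_factor(P, N, zeta))
```

In this example $CP = 0.75$ and the correction factor is $0.4 \times 5 \times 0.3 = 0.6$, so both routes give $CP^* = 0.15$ up to rounding.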
In panel (a) of Fig. 2 we show the same urban poverty curves as in Fig. 1, and we denote with bold solid lines the adjusted concentrated poverty indices $CP^*(A; \zeta)$ (segment AB) and $CP^*(B; \zeta)$ (segment CD).$^{8}$ The adjusted concentrated poverty index ranks $CP^*(B; \zeta) > CP^*(A; \zeta)$, consistently with the ordering of configurations induced by the urban poverty curves. Since every urban poverty curve is concave and lies above the diagonal, the index $CP^*$ is always positive and bounded above by $CP$.
While the index $CP^*$ can be regarded as a natural extension of the $CP$ index, it is far from being an ideal measure of urban poverty for a generic configuration, for at least two reasons. First, the index measures the degree of concentration of poverty by focusing [...] Fig. 2 reports one such case.$^{9}$

The second critical aspect of $CP^*$ is that the index does not address heterogeneity in the concentration of poor individuals across the city's neighborhoods. There are two potential sources of heterogeneity. The first source is due to heterogeneity in the $P_i/N_i$ ratios for neighborhoods $i \leq z$. When these ratios are homogeneous across neighborhoods where poverty is concentrated, i.e., $P_1/N_1 = \dots = P_z/N_z \geq \zeta$, the index $CP^*$ is a sufficient statistic for urban poverty. If they are not, the index $CP^*$ may rank as indifferent configurations that can be unambiguously ranked according to the urban poverty curve (see Fig. 3).$^{10}$ The second source of heterogeneity is due to the distribution of demographic sizes of the neighborhoods, $N_i/N$. The index $CP^*$ is insensitive to marginal changes in the poverty threshold that are due to changes in the demographic size of the neighborhoods. Panel (b) of Fig. 3 reports an example of a city with many small neighborhoods, with an aggregate population share of $N_1/N$, and one large neighborhood of size $N_2/N$ with a proportion of poor people equal to that in the population as a whole (i.e., $P/N$). The adjusted concentrated poverty measure is unaffected by small changes in the poverty threshold from $\zeta$ to $\zeta'$. While this property of $CP^*$ is appealing in some cases, it also implies that concentrated poverty evaluations neglect the size effects of the population that is actually exposed to poverty in the neighborhood of residence. In the figure, a large proportion of the population, $(N_1 + N_2)/N$, is concerned with concentrated poverty when the poverty threshold is $\zeta = \alpha P/N$, whereas only a minor share of the population seems to be exposed to high poverty when the poverty threshold marginally changes to $\zeta' = \alpha' P/N$.

Fig. 3 Adjusted concentrated poverty and neighborhood structure heterogeneity

$^{8}$ To see this, note that the length of the line segments starting from points A and C and intersecting the horizontal axis is $\alpha$ [...]

$^{9}$ The curve of configuration $B$ lies above that of $A$ almost everywhere. For $\alpha = 1$, $CP^*(B; P/N) > CP^*(A; P/N)$. For $\zeta = \alpha P/N$ and $\alpha$ small enough, however, $CP^*(B; \zeta) = CP^*(A; \zeta)$ and the two configurations become indistinguishable, even though a larger fraction of the poor population of $B$ is concentrated in poor neighborhoods compared to $A$.

$^{10}$ The graph in panel (a), Fig. 3, provides an example where urban poverty is unambiguously larger in configuration $B$ than in configuration $A$ for $\alpha = 1$, but $CP^*(B; P/N) = CP^*(A; P/N)$.

More structure is needed in order to address distributional concerns. We consider additional axioms, characterizing the behaviour of any urban poverty measure vis-à-vis the effects of meaningful transformations of the data that affect heterogeneity. When paired with AGG, these axioms characterize the weighting scheme.
The next axiom introduces a form of invariance of urban poverty measures with respect to the demographic structure of neighborhoods. To do so, we introduce a new operation, called neighborhood splitting, which reshapes the demographic size and geographic boundaries of any neighborhood $i$ by splitting $i$ into two new neighborhoods $i'$ and $i''$ of smaller geographic and demographic size. We postulate invariance of the urban poverty index to any split operation or sequence thereof. This postulate owes its normative appeal to replication invariance properties formulated in inequality (Atkinson 1970; Cowell 2000) and segregation analysis (Andreoli and Zoli 2014).
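A small numerical illustration of a split operation follows. It assumes, as a simplification not stated in the text above, that the two new neighborhoods inherit the poverty incidence of the neighborhood being split, and it uses the adjusted concentrated poverty index $CP^*$ as one example of an index that is unaffected by such a split. The city, the helper names and the 60/40 split are hypothetical.

```python
def adjusted_concentrated_poverty(P, N, zeta):
    """CP*: weighted shortfalls over high poverty neighborhoods,
    normalized by the city incidence P / N."""
    total_n, total_p = sum(N), sum(P)
    weighted_shortfalls = sum(
        (n / total_n) * (p / n - zeta)
        for p, n in zip(P, N)
        if p >= zeta * n
    )
    return weighted_shortfalls / (total_p / total_n)


def split_neighborhood(P, N, i, share):
    """Split neighborhood i into two parts holding `share` and `1 - share`
    of its residents. Both parts keep the incidence P_i / N_i of the
    original neighborhood (an assumption of this sketch)."""
    p, n = P[i], N[i]
    new_P = P[:i] + [p * share, p * (1 - share)] + P[i + 1:]
    new_N = N[:i] + [n * share, n * (1 - share)] + N[i + 1:]
    return new_P, new_N


# Hypothetical city; the second neighborhood (incidence 0.5) is split 60/40.
N = [10, 20, 30, 40]
P = [6, 10, 3, 1]
zeta = 0.4

P_split, N_split = split_neighborhood(P, N, i=1, share=0.6)
before = adjusted_concentrated_poverty(P, N, zeta)
after = adjusted_concentrated_poverty(P_split, N_split, zeta)
print(before, after)
```

The two values coincide up to rounding because the shortfall of the split neighborhood is simply spread over two smaller population shares that add up to the original one.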

Axiom INV-S: INVariance to neighborhood Splitting. UP (.; ζ ) satisfies INV-S if for any
Inequality aversion with respect to poverty incidence across the neighborhoods is formalized by imposing the next axiom. The axiom invokes a principle of transfers, stating that urban poverty in configuration $A$ should be smaller than in configuration $A'$ whenever $A'$ is obtained from $A$ by a (regressive) transfer of poor people from a neighborhood with a lower poverty incidence to a neighborhood with a higher poverty incidence, paired with a transfer in the opposite direction of the same number of non-poor people. The population size and the ranking of each neighborhood are unaffected by the transfer.

Axiom Principle of (TRAN)sfers. UP (.; ζ ) satisfies TRAN if for
Note that in the definition of TRAN it is assumed that both $i$ and $j$ are highly concentrated poverty neighborhoods. However, it could be the case that, because of the transfer, the poverty incidence in neighborhood $j$ falls below $\zeta$ (which is set exogenously), implying that $z' = z - 1$ (since $z$ is endogenous). According to AGG, a similar transfer, if implemented among neighborhoods that do not display highly concentrated poverty and that remain as such, does not modify $UP(\cdot; \zeta)$.
An urban poverty index satisfies TRAN when it ranks distributions consistently with non-intersecting urban poverty curves, since an operation underlying TRAN always implies an upward shift of the curve. The concentrated poverty index may violate TRAN, insofar as a movement of poor people from a lower poverty neighborhood towards a higher poverty neighborhood can reduce concentrated poverty. A numeric example clarifies this point. Consider a city with $n = 3$ neighborhoods, where the urban poverty configuration $A$ is $(N_1, N_2, N_3) = (10, 10, 10)$ and $(P_1, P_2, P_3) = (7, 5, 3)$, implying $P/N = 15/30 = 0.5$. For an urban poverty threshold equal to $\zeta = 0.4$, we have that $z = 2$ (neighborhoods 1 and 2 have poverty incidences equal to $7/10 = 0.7$ and $5/10 = 0.5$, respectively) and $CP(A) = 12/15$. Suppose now that two poor residents move from neighborhood 2 to neighborhood 1 and one poor resident moves from neighborhood 3 to neighborhood 1, and opposite transfers of non-poor people take place such that the population size of each neighborhood is not affected. The new urban poverty configuration $A'$ is such that $(P'_1, P'_2, P'_3) = (10, 3, 2)$ and $(N'_1, N'_2, N'_3) = (10, 10, 10)$. The overall poverty incidence for $A'$ is still $P/N = 15/30 = 0.5$ but $z' = 1$, given that only neighborhood 1 has a poverty incidence greater than 0.4. We have that $CP(A') = 10/15 < CP(A) = 12/15$, i.e., concentrated poverty has decreased, even though the urban poverty curve of $A'$ lies above that of $A$.
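The numeric example can be reproduced with exact fractions. The sketch below uses the counts given in the text; the function names are hypothetical, and $CP^*$ is computed as the normalized sum of weighted shortfalls, which is an assumption about its exact form. It also compares the heights of the two urban poverty curves, which is legitimate here because all neighborhoods have the same size, so the curves share the same abscissae.

```python
from fractions import Fraction


def concentrated_poverty(P, N, zeta):
    """CP: share of the poor living in neighborhoods with P_i / N_i >= zeta."""
    high = sum(p for p, n in zip(P, N) if Fraction(p, n) >= zeta)
    return Fraction(high, sum(P))


def adjusted_concentrated_poverty(P, N, zeta):
    """CP*: weighted shortfalls over high poverty neighborhoods times N / P."""
    total_n, total_p = sum(N), sum(P)
    shortfalls = sum(
        Fraction(n, total_n) * (Fraction(p, n) - zeta)
        for p, n in zip(P, N)
        if Fraction(p, n) >= zeta
    )
    return shortfalls * Fraction(total_n, total_p)


def curve_heights(P, N):
    """Cumulative shares of the poor, neighborhoods ranked by incidence."""
    order = sorted(range(len(N)), key=lambda i: Fraction(P[i], N[i]), reverse=True)
    heights, cum_p = [], 0
    for i in order:
        cum_p += P[i]
        heights.append(Fraction(cum_p, sum(P)))
    return heights


N = [10, 10, 10]
P_before = [7, 5, 3]   # configuration A
P_after = [10, 3, 2]   # configuration A' after the regressive transfers
zeta = Fraction(2, 5)  # threshold 0.4

print(concentrated_poverty(P_before, N, zeta), concentrated_poverty(P_after, N, zeta))
print(adjusted_concentrated_poverty(P_before, N, zeta),
      adjusted_concentrated_poverty(P_after, N, zeta))
print(curve_heights(P_before, N), curve_heights(P_after, N))
```

In this example $CP$ falls from $4/5$ to $2/3$, while the adjusted index rises from $4/15$ to $2/5$, in line with the upward shift of the urban poverty curve.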
Next, we analyze the consequences of a combined transfer, generated by combining a regressive and a progressive transfer (i.e. a transfer of opposite sign obtained by setting ε < 0) of similar proportions of poor and non-poor individuals occurring in high poverty neighborhoods (that is, only across neighborhoods 1, . . . , z). In our setting, any combined transfer does not affect poverty incidence in high poverty neighborhoods, but only its distribution. For ease of exposition, we assume that combined transfers always occur on high poverty neighborhoods that occupy adjacent positions.
Different views may prevail when analyzing the effects of combined transfers on urban poverty. When a regressive transfer (involving neighborhoods $i$ and $i + 1$) takes place earlier than a progressive transfer (involving neighborhoods $j$ and $j + 1$) in the ranking of neighborhoods (that is, $i < j$), then the burden of concentrated poverty shifts more heavily on extreme poverty neighborhoods, while the population in some high poverty neighborhoods is relieved from it. Since poverty gets even more concentrated in extreme poverty neighborhoods, urban poverty is not bound to decrease. Conversely, a similar argument leads to the conclusion that when a progressive transfer is followed by a regressive one, urban poverty cannot increase after the transfer. The following axiom takes a neutral stance with respect to the effects of a combined transfer on the distribution of poverty across high poverty neighborhoods, and hence on urban poverty.

Axiom INV-T: (INV)ariance w.r.t. combined (T)ransfers. Let
Axiom INV-T postulates that the combined effect of transfers of poor people across adjacent highly concentrated poverty neighborhoods with the same population size does not affect urban poverty. This is the case irrespective of whether the progressive or regressive transfers of poor people take place between neighborhoods with higher or lower poverty incidence. If $i = j$, INV-T is satisfied by definition because the two transfers cancel out. In all the other cases, the implications of a transfer taking place between two adjacent neighborhoods in the ranking based on poverty incidence are not affected by the position occupied by the neighborhoods in that ranking.
Next, we introduce additional properties that define the cardinal features of the urban poverty indices. The first invariance axiom considers situations where the poverty threshold ζ is modified. It considers different effects of combined changes in P i and in ζ . In order to simplify the exposition, we assume that this invariance condition holds for n = 2. Axioms can be readily generalized.

Axiom INV-PL: (INV)ariance to (P)overty (L)ine modifications. UP (.; ζ ) satisfies INV-
≥ ζ the following conditions hold:
The two conditions in INV-PL require, respectively, that (i) if the number of poor individuals in each neighborhood and the threshold ζ are scaled by the same factor λ > 0, then UP is not affected; and (ii) if one neighborhood exhibits a high concentration of poverty while the other does not, and both the poverty incidence in the first neighborhood and the poverty threshold change by the same amount while the overall poverty incidence in the population is not affected, then UP does not change.
The two invariance conditions imply respectively that what matters are the ratios between P i /N i and ζ and that the differences between P i /N i and ζ are informative only if nonnegative for a given level P /N of average poverty in the population.
The next property requires that if poverty increases proportionally in each neighborhood then UP should not decrease. Along with TRAN, these two axioms incorporate the features of transformations of the data that raise poverty in high poverty neighborhoods while unambiguously raising the double burden due to poverty concentration. As a consequence, urban poverty is bound to rise.

Axiom (MON)otonicity. UP (.; ζ ) satisfies MON if for configurations
To conclude, we set an axiom that quantifies the lower bound of the UP index.

Axiom (NOR)malization. for any
The NOR condition specifies the value of the index for configurations where the poverty incidence in each neighborhood is below the threshold ζ . In this case the value of the index is constant for each configuration and coincides with the infimum of all values of UP (.; ζ ) that could be obtained in any alternative configuration in which the poverty incidence is not below ζ for at least one neighborhood.
When assuming AGG and considering configurations in which poverty incidence is evenly distributed across high-poverty neighborhoods (i.e. P i /N i = P z /N z for all i ≤ z), axioms NOR and TRAN jointly imply that urban poverty evaluations should be normalized by the average poverty incidence. This leads to the main result. MON, TRAN and NOR if and only if there exist β, γ ≥ 0 such that:

Main result and discussion
The proof of the theorem is in Appendix A.1, where we also demonstrate the independence of the axioms. Theorem 1 shows that the urban poverty axioms characterize exactly one parametric urban poverty index. For any given poverty threshold ζ , this index depends on three components and two parameters (besides the urban poverty threshold), capturing respectively the incidence (β) and the aversion to dispersion (γ ) of poverty. The three components of the index contribute positively to the measured level of urban poverty. Their relative importance depends on the weight they receive. The first component, weighted by β, captures the average excess of poverty (with respect to the tolerance level ζ ) in high poverty neighborhoods and it is related to the adjusted concentrated poverty index. In fact, when evaluations do not express distributional concerns, the index CP * becomes the relevant measure of urban poverty.

Corollary 2 Let
The second component, weighted by γ , captures aspects of inequality in the distribution of poverty across high poverty neighborhoods through the Gini index G(A; ζ ). The third component captures inequality in the distribution of the poor population between high poverty (with weight N z /N ) and low poverty (with weight 1 − N z /N ) neighborhoods. 12 The inequality component of urban poverty measures the relative extent of dissimilarity between the actual distribution of poor residents across neighborhoods of the city and the distribution of the population of poor and non-poor residents across the same neighborhoods (see Andreoli and Zoli 2014). Each component of urban poverty captures a specific feature of changes in the distribution of urban poverty. For instance, a regressive transfer of poor population occurring between neighborhoods where poverty is highly concentrated affects only the distribution of poverty in those neighborhoods. This effect is reflected in the component G(.; ζ ). Consider instead a situation where poverty is evenly distributed in high poverty neighborhoods (i.e. P i /N i = P z /N z for each i ≤ z, implying G(.; ζ ) = 0) and assume that some poor individuals in z are evenly redistributed across neighborhoods 1, . . . , z − 1, so that poverty rises uniformly across these neighborhoods. This change raises both the first component (poverty incidence rises) and the third component (poverty is more unevenly distributed across high and low poverty neighborhoods) of the index when such a movement shifts z towards z' = z − 1. Extending the weighting scheme over {1, . . . , n}, as postulated in axiom AGG, guarantees that urban poverty evaluations are robust with respect to marginal changes in the distribution of poverty around the (exogenous) urban poverty threshold ζ which may nonetheless affect the (endogenous) threshold neighborhood z.
Finally, consider again a situation in which poverty is evenly distributed and new non-poor individuals flow into low poverty neighborhoods. While this change does not affect P z , N z , P and z, it increases N and therefore increases the segregation of poverty in high poverty neighborhoods by raising the number N − N z of residents that are least exposed to high poverty concentration. Arguably, this change raises inequalities between neighborhoods and raises measured urban poverty through the third component of the index. Evaluations of urban poverty are conditional on the poverty threshold. When the tolerance threshold approaches zero, concerns for even small levels of poverty concentration rise. This makes it possible to take into account the fact that increments of poverty in those neighborhoods where poverty is more concentrated prevent other people, living in neighborhoods where the poor are under-represented, from being exposed to the double burden of poverty. By setting the urban poverty threshold at ζ = 0, concerns about the distribution of poverty are extended to all neighborhoods of the city. Notice that if ζ = 0, then z = n, N z = N and P z = P , it follows that: When the urban poverty threshold is inclusive of all neighborhoods of the city, the first component of urban poverty, measuring incidence, reaches its maximum level, whereas the urban poverty index becomes ordinally equivalent to the Gini inequality index G(A; 0).
We explore the decomposition properties of the Gini index to address some key measurement issues in urban poverty. A first issue is that urban poverty may be insensitive to the depth of poverty, insofar as P i is identified on the basis of an exogenous poverty threshold. In the American case, for instance, urban poverty patterns of extremely poor families (with equivalent income below 75% of the federal income poverty line) may differ from urban poverty of the average family in need (comprising all families with income below 200% of the federal income poverty line). The factor income decomposition in Shorrocks (1982) can be used to linearly decompose the urban poverty index into the contribution of different subgroups identified by varying the poverty threshold, such as families in extreme (below 75%), severe (between 75% and 100%) or mild (between 100% and 200%) poverty. Such a decomposition may be useful for empirical robustness checks.
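A linear decomposition by poverty depth can be sketched as follows. This is a hedged illustration, not the exact rule used in the paper: it applies the concentration-coefficient ("natural") decomposition of the Gini index, which is one of the rules discussed in the factor decomposition literature, to a population-weighted Gini of tract poverty rates. The function names, the three depth bands and all counts are hypothetical.

```python
import numpy as np

def midrank(values, w):
    """Weighted mid-rank position of each unit when sorted by `values` (no ties assumed)."""
    order = np.argsort(values, kind="stable")
    cum_before = np.cumsum(w[order]) - w[order]
    F = np.empty_like(w)
    F[order] = cum_before + w[order] / 2.0
    return F

def weighted_gini(rates, w):
    """Gini index from weighted absolute differences, used here as a cross-check."""
    mu = np.sum(w * rates)
    diffs = np.abs(rates[:, None] - rates[None, :])
    return float((w[:, None] * w[None, :] * diffs).sum() / (2.0 * mu))

def band_contributions(poor_by_band, pop):
    """Additive contribution of each poverty-depth band to the Gini of total poverty rates."""
    poor_by_band = np.asarray(poor_by_band, dtype=float)   # shape: (bands, neighborhoods)
    pop = np.asarray(pop, dtype=float)
    w = pop / pop.sum()
    band_rates = poor_by_band / pop
    total_rates = band_rates.sum(axis=0)
    mu = np.sum(w * total_rates)
    F = midrank(total_rates, w)                             # ranking by total incidence
    return (band_rates * w * (2.0 * F - 1.0)).sum(axis=1) / mu, total_rates, w

# Hypothetical counts for four neighborhoods and three bands:
# below 75%, 75-100% and 100-200% of the poverty line.
pop = [1000, 2000, 1500, 500]
poor_by_band = [[50, 300, 100, 10],
                [40, 200, 120, 15],
                [100, 400, 300, 60]]

contrib, total_rates, w = band_contributions(poor_by_band, pop)
gini_total = weighted_gini(total_rates, w)
print(contrib, contrib.sum(), gini_total)
```

The band contributions add up to the overall index by construction, so each term can be read as the share of measured urban poverty attributable to one depth band under this particular rule.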
A second issue concerns the possibility of factorizing longitudinal variations in urban poverty within the same city between two periods t and t' into the contribution of demographic growth, poverty growth and poverty relocation across neighborhoods. Based on Corollary 3, the focus should be on the quantity ΔUP = UP (A') − UP (A) = G(A'; 0) − G(A; 0) for the arbitrary selection γ = 1 and β = 0. The following section provides two relevant decomposition results.
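The change in urban poverty for γ = 1 and β = 0 can be computed directly from tract-level counts in the two periods. The sketch below assumes that G(A; 0) is the population-weighted Gini index of neighborhood poverty rates; the function name `gini_zero_threshold` and the counts are hypothetical.

```python
import numpy as np

def gini_zero_threshold(poor, pop):
    """Population-weighted Gini of neighborhood poverty rates (assumed form of G(A; 0))."""
    poor = np.asarray(poor, dtype=float)
    pop = np.asarray(pop, dtype=float)
    rates = poor / pop
    w = pop / pop.sum()
    mu = poor.sum() / pop.sum()
    diffs = np.abs(rates[:, None] - rates[None, :])
    return float((w[:, None] * w[None, :] * diffs).sum() / (2.0 * mu))

# Hypothetical city observed in periods t and t' on the same four neighborhoods.
pop_t, poor_t = [1000, 1200, 800, 1500], [100, 300, 320, 150]
pop_t1, poor_t1 = [1100, 1200, 700, 1800], [165, 300, 245, 270]

g_t = gini_zero_threshold(poor_t, pop_t)
g_t1 = gini_zero_threshold(poor_t1, pop_t1)
delta_up = g_t1 - g_t
print(g_t, g_t1, delta_up)

# A proportional rise of poverty in every neighborhood leaves this index unchanged,
# which is why a citywide change in incidence has to be tracked by a separate term.
g_scaled = gini_zero_threshold(1.5 * np.asarray(poor_t, dtype=float), pop_t)
```

The last line is a quick check of scale invariance: multiplying all poor counts by the same factor does not move the Gini index on its own.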

Decomposing changes in urban poverty
This section builds on Corollary 3 and provides two additional corollaries to the main result, showing that changes in urban poverty can be decomposed linearly into convergence, reranking and poverty growth components, as well as into the contribution of changes taking place within or across spatial clusters of neighborhoods. Both decompositions are relevant for describing the dynamics of urban poverty. 13

Convergence, re-ranking and growth components of urban poverty
Consider a city, where each neighborhood i is observed in both periods t and t'. The overall population and the number of poor residents in i are denoted by N_i^A and P_i^A in period t and by N_i^{A'} and P_i^{A'} in period t', respectively. Let c stand for the change in poverty incidence in the whole city, so that: The next corollary shows that changes in urban poverty can be linearly decomposed into the contribution of demographic (W ), re-ranking (R) and convergence (C · E) components.

Corollary 4 Let
Proof See Appendix A.2.
The first component captures the effect of changes in the demographic weights of neighborhoods, and is denoted by W . In empirical applications, it is generally the case that The changes in demographic weights may have nontrivial effects on the accounting of urban poverty changes. The demographic component contributes positively to changes in urban poverty (W > 0) if the demographic weights of the neighborhoods that are more dissimilar in terms of poverty incidence increase, whereas the weights of neighborhoods that are less dissimilar decrease. The component W isolates the effect of population changes from those of changes in the distribution of poverty across neighborhoods.
Component C captures the effect of the change in poverty incidence in the city. It measures the implication of a citywide expansion (or reduction) of poverty incidence on urban poverty, thus making it possible to separate the contribution of a change in poverty incidence in the same proportion c for all neighborhoods 14 from the contribution of neighborhood-specific changes in poverty incidence (occurring when poverty incidence changes across neighborhoods in a disproportionate way).
Components R and E measure different distributional effects due to disproportionate changes in neighborhood poverty rates. The component R measures the effect of re-ranking of neighborhoods, based on poverty incidence, from t to t'. The component E measures the effect of convergence (or divergence) in poverty incidence among neighborhoods. Neighborhoods diverge when the poverty rates of neighborhoods with high (low) poverty incidence in t increase (decrease) faster than the poverty rates in low (high) poverty neighborhoods. In this case, E > 0 and the urban poverty level increases by C · E. Otherwise, E < 0.
The implications of convergence in poverty incidence on changes in urban poverty can be more complex. For instance, a strong convergence of poverty rates across neighborhoods may induce a re-ranking of neighborhoods in terms of poverty incidence, implying that the reduction in urban poverty due to convergence (i.e., C · E) can be partially offset by the re-ranking effect measured by R, which is always non-negative. Component R is null when there is no re-ranking, and is positive when at least two neighborhoods exchange their positions in the ranking of neighborhoods by poverty rate from t to t'. 15 The result in Corollary 4 is useful for decomposing additively the contribution of poverty incidence and demographic changes at neighborhood and city level to the dynamics of urban poverty. The decomposition displays advantages over other methods. First, the decomposition makes it possible to factor out the effect of demographic changes (W ) on urban poverty, thus disentangling the effect of changes in poverty from the effect of demographic shifts and growth across neighborhoods. Second, components R and C · E pick up specific aspects of changes in poverty concentration that cannot be inferred just by looking at ΔUP . For instance, consider two cities A and B displaying no changes in urban poverty (ΔUP_A = ΔUP_B = 0), with R_A = C_A · E_A = 0 for the first city, while R_B = −C_B · E_B > 0 for the second city. While the poor population is immobile in city A, poverty deconcentrates in some neighborhoods and reconcentrates in others in city B, although the change does not imply a neat form of convergence in the degree of poverty concentration, but rather a shift of poverty across the neighborhoods of the city (large R_B).
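The interplay between re-ranking and convergence can be sketched in a simplified setting. The code below is an illustration under strong assumptions and is not the formula of Corollary 4: demographic weights are held fixed across periods (so the demographic component is zero by construction), the re-ranking term is taken to be the gap between the period-t' Gini index and the concentration coefficient of period-t' rates ordered by the period-t ranking, and the remainder plays the role of the convergence term. All names and numbers are hypothetical.

```python
import numpy as np

def concentration(rates, w, rank_by):
    """Concentration coefficient of `rates` when units are ordered by `rank_by` (no ties assumed)."""
    order = np.argsort(rank_by, kind="stable")
    cum_before = np.cumsum(w[order]) - w[order]
    F = np.empty_like(w)
    F[order] = cum_before + w[order] / 2.0      # weighted mid-ranks
    mu = np.sum(w * rates)
    return float(np.sum(w * rates * (2.0 * F - 1.0)) / mu)

def gini(rates, w):
    """Gini index: concentration coefficient under the units' own ranking."""
    return concentration(rates, w, rank_by=rates)

# Hypothetical poverty rates in four equally sized neighborhoods.
w = np.full(4, 0.25)
rates_t = np.array([0.10, 0.20, 0.30, 0.40])
rates_t1 = np.array([0.25, 0.20, 0.28, 0.30])   # neighborhoods 1 and 2 swap ranks

g_t, g_t1 = gini(rates_t, w), gini(rates_t1, w)
c_fixed_rank = concentration(rates_t1, w, rank_by=rates_t)

reranking = g_t1 - c_fixed_rank        # non-negative by construction
convergence = c_fixed_rank - g_t       # negative when disparities shrink
print(reranking, convergence, g_t1 - g_t)
```

In this toy example disparities shrink, so the convergence term is negative, while the swap in ranks between the first two neighborhoods produces a small positive re-ranking term that partially offsets it.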

Spatial components of urban poverty
The index UP , ΔUP and their components are, by construction, invariant to changes in the spatial configuration of poverty within the city, and hence unaffected by the implicit degree of spatial association of poverty incidence across neighborhoods. Building on the Rey and Smith (2013) spatial decomposition of the Gini index, we obtain a two-term additive decomposition of the urban poverty index, in which a "neighborhood component" measures distributional changes originating from neighborhoods that are spatially close and a "non-neighborhood component" measures changes concerning neighborhoods that are not in spatial proximity. The two components reveal the contribution of spatial association to measured urban poverty. The spatial decomposition is conditional on the knowledge of a proximity matrix N, whose generic binary element n ij ∈ {0, 1} indicates whether neighborhoods i and j are neighbors according to a given criterion. The matrix N can be constructed from the data and is assumed fixed throughout the comparisons (in our setting, the spatial structure of a city does not change across time), but is specific to the city.
14 For each neighborhood i,
15 We borrow the terminology from the analysis of panel income growth (Jenkins and Van Kerm 2016). Component E is computed by comparing the relative disparities between the neighborhood poverty rates in t and those in t', under the assumption that the ranking of neighborhoods remains constant over time at that observed in t. The effect of E can be either magnified or mitigated by C, since the latter component reflects the change in citywide poverty incidence. For instance, the potential effect of a convergence in poverty incidence among neighborhoods (E < 0) is reduced when changes in neighborhood poverty rates lead to increasing the citywide poverty incidence (C < 1).

Corollary 5 Let
Corollary 5 delivers two important results. First, it shows that UP can be exactly decomposed into neighborhood N and non-neighborhood nN components. When G N is large relative to G nN , most of inequality in urban poverty occurs in neighborhoods that are located in spatial proximity. In this case, high and low poverty neighborhoods tend to belong to the same spatial cluster. The converse holds when G N is small compared to G nN , in which case there is a positive spatial autocorrelation in the distribution of poverty among neighborhoods.
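The additive split into neighborhood and non-neighborhood terms can be sketched as follows. This is a minimal illustration in the spirit of the spatial Gini decomposition, assuming that each pairwise absolute difference in poverty rates is assigned to one term or the other according to the binary proximity matrix; the use of population weights, the function name and the data are assumptions made here for illustration.

```python
import numpy as np

def spatial_gini_terms(rates, pop, proximity):
    """Split a population-weighted Gini of poverty rates into neighbor and non-neighbor pair terms."""
    rates = np.asarray(rates, dtype=float)
    pop = np.asarray(pop, dtype=float)
    near = np.asarray(proximity, dtype=bool)     # n_ij = 1 if i and j are neighbors
    w = pop / pop.sum()
    mu = np.sum(w * rates)
    pair = w[:, None] * w[None, :] * np.abs(rates[:, None] - rates[None, :]) / (2.0 * mu)
    g_n = float(pair[near].sum())                # pairs in spatial proximity
    g_nn = float(pair[~near].sum())              # all remaining pairs
    return g_n, g_nn

# Hypothetical city: four tracts on a line, each adjacent to the next one.
proximity = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
rates = [0.40, 0.38, 0.10, 0.12]    # similar rates among adjacent tracts
pop = [1000, 1000, 1000, 1000]

g_n, g_nn = spatial_gini_terms(rates, pop, proximity)
print(g_n, g_nn, g_n + g_nn)
```

Here adjacent tracts have similar poverty rates, so the neighbor term is small relative to the non-neighbor term, the pattern associated above with positive spatial autocorrelation.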
The clustering dimension of urban poverty is relevant for policy analysis for at least two reasons. First, spatial clustering of high poverty neighborhoods may decrease the likelihood of access to transportation, to the job market, to a high-quality supply of public goods and ultimately to economic and social opportunities for the residents, thus amplifying the double burden from poverty already experienced by neighborhood residents. Second, when clusters of high poverty neighborhoods overlap with administrative divisions of the territory, such as counties or school districts, more economically vulnerable residents might face poverty traps that extend their effects both to the long-term poverty status of the residents and to the inter-generational mobility prospects of the children living therein.
The second important result of Corollary 5 is that changes over time in urban poverty can be also linearly decomposed along the spatial dimension. In this way, we can disentangle the contribution of changes in poverty within clusters from that of changes across clusters, which are more relevant for understanding spatial drivers of urban poverty.

Data
We use data from the U.S. Census Bureau to study patterns and trends of urban poverty in American cities. Data for 1980, 1990 [...] These years roughly correspond to the onset, the striking and the early aftermath of the Great Recession period (Jenkins et al. 2013; Thompson and Smeeding 2013).
Poverty incidence at the census tract level is measured by the number of individuals in families with total income below the poverty threshold, which varies by family size, number of children, and age of the family members. 16 Poverty status is determined for all families (and, by implication, all family members). 17 The census reports poverty counts at census tract level for various poverty thresholds. In this paper, we consider as poor all individuals living in households with income below 100% of the federal income poverty line. In the supplemental Appendix, we provide robustness checks for poverty status determined by equivalent family income below 75%, 100% (baseline) and 200% of the federal poverty line.
Following Andreoli and Peluso (2018), we consider the 2016 Census Bureau definition of American Metropolitan Statistical Areas (MSA) to group census tracts into cities. The number and geographic size of the census tracts vary substantially across time within the same MSA. Some census tracts experience demographic growth and are split into smaller tracts. Some other census tracts are consolidated to account for demographic shifts. While raw data allow us to estimate urban poverty at the city level, they cannot be used to perform the decomposition exercise, insofar as the definition of neighborhood is not constant over time. We resort to the Longitudinal Tract Data Base (LTDB), which provides crosswalk files to estimate population count statistics within 2010 tract boundaries for any tract-level data that are available for prior years as well as in ACS for the period after 2010 (Logan et al. 2014). 18 We calculate poverty incidence in each census tract/year and then construct measures of urban poverty and concentrated poverty in high (i.e. where poverty incidence is above 20% of the resident population) and extreme (i.e. where poverty incidence is above 40% of the resident population) poverty neighborhoods.
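The tract-level calculations described above can be sketched in a few lines. The example assumes that concentrated poverty is measured as the share of the city's poor who live in tracts whose poverty incidence reaches the threshold; the function name, the inclusive comparison at the threshold and the counts are illustrative assumptions rather than the paper's code.

```python
import numpy as np

def concentrated_poverty(poor, pop, threshold):
    """Share of the city's poor living in tracts with poverty incidence at or above `threshold`."""
    poor = np.asarray(poor, dtype=float)
    pop = np.asarray(pop, dtype=float)
    incidence = poor / pop                  # tract-level poverty incidence
    flagged = incidence >= threshold        # high (0.20) or extreme (0.40) poverty tracts
    return float(poor[flagged].sum() / poor.sum())

# Hypothetical MSA with four census tracts.
poor = [50, 300, 450, 90]
pop = [1000, 1200, 1000, 900]

cp_high = concentrated_poverty(poor, pop, threshold=0.20)
cp_extreme = concentrated_poverty(poor, pop, threshold=0.40)
print(round(cp_high, 4), round(cp_extreme, 4))
```

With these counts, two tracts pass the 20% threshold and one passes the 40% threshold, so the measure falls as the threshold becomes more demanding.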
The balanced panel enables us to further decompose changes in urban poverty into its underlying components for 395 American MSAs. 19 Census tracts are geo-localized, implying that measures of proximity of these tracts can be further produced and used to disentangle the neighborhood and non-neighborhood components of urban poverty across all years and all MSAs.
16 Both Census 1990 and 2000 and ACS determine a family poverty threshold by multiplying the base-year poverty thresholds (1982) by the average of the monthly inflation factors for the 12 months preceding the data collection. The poverty thresholds in 1982, by size of family and number of related children under 18 years, can be found on the Census Bureau website: https://www.census.gov/data/tables/time-series/demo/income-poverty/historical-poverty-thresholds.html. For a four-person household with two underage children, the 1982 threshold is $9,783. Using the inflation factor of 2.35795 gives a poverty threshold for this family in 2013 of $23,067. If the disposable household income is below this threshold, then all four members of the household are recorded as poor in the census tract of residence, and included in the 2014 wave of ACS.
17 Poverty status is also determined for persons not in families, except for inmates of institutions, members of the Armed Forces living in barracks, college students living in dormitories, and unrelated individuals under 15 years old.
18 These files make use of re-weighting methods to assign each census and ACS year population to the exact census tract boundary defined in the 2010 census. We obtain a balanced longitudinal dataset of census tracts for 395 American Metropolitan Areas (those with at least 10 census tracts according to the 2010 census) for years 1980, 1990, 2000, 2008, 2012 and 2014.
19 Figure 6 in Appendix B displays urban poverty calculated on balanced longitudinal data against urban poverty calculated on raw data. Estimates of urban poverty based on the two methods largely coincide. Urban poverty estimates are also unrelated to the incidence of poverty in the city, as shown in Fig. 7.

Results
Panel (a) of Fig. 4 describes the levels and trends of urban poverty and concentrated poverty in American MSAs over 1980-2014. In line with the literature, we find that concentrated poverty is high in American cities, ranging from 26% to 51% on average over the period. Concentrated poverty has increased since the onset of the Great Recession, and it has remained stable in the aftermath. Conversely, the distribution of urban poverty among MSAs reveals a more stable pattern over the 35 years we consider. Small changes in urban poverty may however be the outcome of the offsetting contributions of re-ranking and changes in disparities between census tract poverty rates. The decomposition in Corollary 4 is useful to separate these effects. Panels (b) and (c) of Fig. 4 display the extent of heterogeneity in the distribution of concentrated poverty and urban poverty over the whole period considered. Data suggest that concentrated poverty and urban poverty indices capture uncorrelated aspects of the urban distribution of the poor. Larger metro areas, denoted by circles of larger size on the graph, display proportionally more concentrated poverty than urban poverty. Urban poverty is persistent over the period, with most MSAs grouped along the bisector of the figure. Panel (d) of Fig. 4 breaks down the heterogeneity of year-to-year variation in urban poverty into its components, computed separately for each MSA. 20 The little variability in urban poverty can be explained by the trends in its components R and D := C · E. The component D is negative for a majority of MSAs in each sub-period, indicating that relative disparities in poverty incidence across census tracts have decreased over time. Such a pattern suggests convergence in neighborhood poverty and decreasing urban poverty.
This equalizing effect is partially offset by the re-ranking component R, which indicates that initially low poverty [...] Overall, the analysis of urban poverty suggests a major trend of convergence in poverty across American MSA neighborhoods. Poverty has grown everywhere in American MSAs after the Great Recession, but less so in high poverty neighborhoods, while concentrating into historically middle-class, low poverty neighborhoods.
We examine the decomposition of urban poverty into neighborhood and non-neighborhood components. A proximity matrix describing the spatial relations between census tracts is obtained for each city by resorting to the notion of critical cut-off neighborhood, according to which two census tracts are neighbors if their distance is equal to or less than a given cut-off distance. 21 We use the Moran-I index to test for spatial dependence in urban poverty rates (with spatial independence as the null hypothesis) and record for each MSA the p-value of the test, computed separately in 1980 and 2014. In Table 1 we report the proportions of cases of weak (at 10% significance level) and strong (at 1% significance level) rejections of the null hypothesis, alongside the proportion of acceptances (with p-value larger than 0.1). The Moran-I statistic can be highly influenced by the population size of the city and the number of neighborhoods. We hence report rejection and acceptance rates by quartiles (Q1,...,Q4) of MSAs ranked by population size.
Results support the hypothesis of spatial independence for the Q1 and Q2 cities (with average population size smaller than 0.15 million). The pattern is less clear for the Q3 cities, where rates of rejection and acceptance of spatial independence in the occurrence of poverty are mixed, with weak rejection rates ranging from 43% in 1980 to 56% in 2014. For large MSAs (with about 2 million residents on average) included in Q4, the data weakly reject the null hypothesis of spatial independence in about 80% of the cases (in both years alike) in favor of positive spatial autocorrelation. In these cities, neighboring census tracts tend to have similar poverty rates, thus raising the risk of spatial poverty traps.
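The testing procedure can be sketched as follows. This is a minimal illustration and not the paper's implementation: it builds a binary proximity matrix from tract centroids with a distance cut-off, computes the Moran-I statistic, and obtains a one-sided pseudo p-value by random permutation, which is an assumption made here because the inference method is not detailed in this section. Coordinates, the cut-off and the poverty rates are hypothetical.

```python
import numpy as np

def cutoff_matrix(coords, cutoff):
    """Binary proximity matrix: tracts are neighbors if their centroid distance is <= cutoff."""
    coords = np.asarray(coords, dtype=float)
    dist = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2))
    prox = (dist <= cutoff).astype(float)
    np.fill_diagonal(prox, 0.0)             # a tract is not its own neighbor
    return prox

def morans_i(x, prox):
    """Moran-I statistic for values x and proximity matrix prox."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return float(len(x) / prox.sum() * (z @ prox @ z) / (z @ z))

def permutation_pvalue(x, prox, n_perm=999, seed=0):
    """One-sided pseudo p-value against spatial independence (positive autocorrelation)."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, prox)
    sims = np.array([morans_i(rng.permutation(x), prox) for _ in range(n_perm)])
    return observed, float((np.sum(sims >= observed) + 1) / (n_perm + 1))

# Hypothetical 6 x 6 grid of tract centroids with poverty rising from north to south.
coords = [(row, col) for row in range(6) for col in range(6)]
rates = np.array([0.10 + 0.05 * row for row, _ in coords])

prox = cutoff_matrix(coords, cutoff=1.0)
moran, p_value = permutation_pvalue(rates, prox)
print(moran, p_value)
```

With a smooth spatial gradient in poverty rates the statistic is strongly positive and the null of spatial independence is rejected, mirroring the pattern reported for the largest MSAs.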
In Fig. 5, we separately analyze the patterns of urban poverty in the five largest American MSAs and further decompose the changes of urban poverty into neighborhood and non-neighborhood components. Overall, we find that urban poverty has increased from 1980 to 1990, with the largest MSAs at the top of the urban poverty distribution. Urban poverty in largest [...] Fig. 5), which is generally high and explains most of urban poverty in these cities. The minor role of the neighborhood component in large MSAs confirms that census tract poverty rates are more similar among neighboring tracts than among non-neighboring tracts.

Concluding remarks and extensions
This paper introduces a parsimonious axiomatic approach to characterize a new parametric urban poverty measure. This measure weights the contributions of poverty incidence and inequality in poverty distribution across neighborhoods of a city. The latter component depends on the way poverty is unequally distributed across high poverty neighborhoods and on the way the poor population is split across clusters of high and low poverty. The approach builds on the idea that concentration of poverty across neighborhoods can produce welfare losses for those exposed to it. The concentrated poverty index, the official measure adopted by the Census Bureau to assess urban poverty, may fail to satisfy this basic requirement.
We use our urban poverty measure to highlight patterns, trends and components of urban poverty using census and ACS data for the largest 395 American MSAs over the last 35 years. While there is evidence that concentrated poverty has increased after the onset of the Great Recession, we find no systematic trends in the evolution of urban poverty. This apparent steadiness masks the implications of ongoing changes in the geography of poverty within MSAs, with poverty rising and falling heterogeneously across census tracts. The data we use do not allow us to distinguish whether trends in urban poverty are driven by relocation of chronically poor individuals across census tracts, or rather by the fact that the likelihood of occurrence of poverty spells is unevenly distributed across census tracts, possibly affected by unobservable factors that are also relevant for the way rich and poor households sort in space. Distinguishing the two effects would require knowledge of individual-level incidence of poverty spells alongside residential decisions.
The urban poverty index we characterize focuses on the incidence and distributional inequality of poverty among neighborhoods where poverty is highly concentrated, as identified by an exogenously given urban poverty threshold (for instance above 20% or 40% of resident population). When the threshold is set to zero, urban poverty evaluations are based on all neighborhoods of the city. In this case, urban poverty evaluations depend only on the distribution of poverty across all neighborhoods of the city and the index converges to the Gini coefficient. Building on this result, we investigate a spatial decomposition of the implied urban poverty index, which is additive in the contribution to urban poverty of high-poverty clusters and the contribution of distant neighborhoods. This decomposition is relevant for analyzing the spatial dynamics of urban poverty, which may vary across similar cities on the basis of the quality of housing stock, the distribution of public goods and the extent of affordability of neighborhoods. We find that in the largest MSAs, the non-neighborhood component of urban poverty dominates. Trends are less clear-cut for the rest of the MSAs.
The results presented in this paper can be extended in a variety of directions which are useful to study the incidence of covariates on urban poverty.
First, notice that the urban poverty index takes the poverty status identification as given, while it evaluates the distribution of poverty across neighborhoods. Different criteria can be used to identify the poor, for instance using different poverty thresholds. As a robustness check, we assess urban poverty looking at different populations with income below 75%, 100% (baseline) and 200% of the poverty line. Results reported in the Appendix (Fig. 8) show that urban poverty grows as the severity of poverty rises (panel a)), albeit uniformly across MSAs (panels b), c), d)). The ranking of MSAs based on different measures of poverty depth is robust (the indices obtained with different poverty groups display a rank correlation larger than 89% in 2014) and uncorrelated with concentrated poverty. While heterogeneity in the distribution of poverty across neighborhoods is substantial, it is unlikely to be driven by the severity of poverty status.
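The robustness of MSA rankings across poverty thresholds can be checked with a rank correlation. The sketch below assumes a Spearman-type coefficient, since the specific rank correlation used is not stated in this section; the index values for the five MSAs are made up purely for illustration and are not results from the paper.

```python
import numpy as np

def ranks(values):
    """Ranks from 1 to n (no ties assumed)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    r = np.empty(len(values))
    r[order] = np.arange(1, len(values) + 1)
    return r

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])

# Hypothetical urban poverty indices for five MSAs under two poverty thresholds.
up_75 = [0.42, 0.35, 0.51, 0.29, 0.38]     # poor: income below 75% of the poverty line
up_100 = [0.40, 0.36, 0.47, 0.27, 0.33]    # poor: income below 100% (baseline)

rho = spearman(up_75, up_100)
print(rho)
```

In this toy case only two MSAs swap adjacent positions, so the coefficient stays close to one, which is the kind of evidence that supports a robust ranking.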
Second, the urban poverty index obtained when setting the urban poverty threshold to zero, i.e. G, can be decomposed along the lines of the factor income decomposition by Shorrocks (1982) to analyze the contribution of various groups identified by varying degrees of poverty depth, as well as to analyze the contribution to urban poverty of different social groups into which the poor and non-poor populations can be further partitioned, for instance along the lines of race, human capital or income. Furthermore, the urban poverty index G can be weakly (additively) decomposed as in Ebert (2010) into two components. One component captures the incidence of urban poverty among neighborhoods identified by some common trait, such as quality of housing stock, affordability or supply of local public goods. The other component measures instead the average contribution to urban poverty of differences between poverty incidence in any given neighborhood and poverty incidence in neighborhoods displaying different characteristics.
Finally, we acknowledge that poverty may be multidimensional, insofar as individuals can be deprived in dimensions other than income and these dimensions are relevant to explain the urban distribution (Decancq et al. 2019). Building on the relation with the Gini index, the urban poverty index can be generalized to the multidimensional setting following the approach in Koshevoy and Mosler (1997), based on a multivariate extension of the Gini index. As motivated in Andreoli and Zoli (2020), such an approach consists in measuring the extent of dissimilarity between the distribution of the population across the city neighborhoods and the distributions of people that are poor in any given dimension, such as income (as we do here), housing or education. These extensions are left for future investigations.

A.1 Proof of Theorem 1
We will prove the theorem making use of a sequence of lemmas that will highlight the role of the different axioms in the derivation of the final result.
Proof The proof combines the effect of AGG with INV-S by deriving a functional restriction on the class of weighting functions w i . . . , N i N , . . . , N n N that appear in the definition of AGG. We leave to the reader to verify that the index in Eq. 4 satisfies AGG and INV-S, here we focus on the proof of the (only if) part of the statement in the lemma. First recall that, given AGG, we can write , . . . , N i N , . . . , N n N (5) where A : [0, 1] 2 → R + and w i : n → R satisfy the conditions specified in AGG. Let z ≥ 1, we apply INV-S. Note that because of the definition of INV-S, the scaling component A P N ,P z N z of Eq. 5 is not affected by splitting operations. Thus INV-S only . . . , N i N , . . . , N n N . We construct the proof in two steps. We first derive the restrictions on the function w 1 (.) and then in a recursive manner we derive also the restrictions on all the other functions w i (.) for i = 2, 3, . . . , n.
We first note that the function w i (.) does not depend on ζ , and then we set ζ such that for a given A ∈ we have that n = z. Note that for any A ∈ there exist values of ζ such that n = z, for instance this is the case if we let ζ = 0.
We can then obtain in general that N ī  1, 2, . . . , n whereN 0 N := 0 and h(0) := 0. As pointed out the function w i (.) does not depend on ζ , therefore even if it is derived under the assumption that ζ is such that z = n, the specification also holds for any ζ ∈ [0, 1), and therefore for any z ≤ n, provided that z ≥ 1 as required in the definition of AGG.

Lemma 2 Let
Proof We take the result from Lemma 1 and investigate the implications for the specification of UP(.; ζ) of further imposing INV-T. We leave it to the reader to check that the obtained specification of UP(.; ζ) satisfies all axioms; here we focus on the "only if" part of the lemma. For z = 1, INV-T does not apply. Note that when z = 1 the specification of UP(.; ζ) in the lemma is consistent with the one derived in Lemma 1, where h(1) = β_0 if z = n = 1. The specification of h(.) in the lemma, which is also valid when z = 1 < n, is obtained in the general part of the proof that follows.
We set z ≥ 2 and consider the transfers involved in the definition of INV-T. Note that with z = 2, the axiom is satisfied by construction given that it involves two transfers of population taking place in opposite directions and therefore their effects cancel out leading to the initial configuration A.
Without loss of generality we assume that there are z ≥ 2 neighborhoods with highly concentrated poverty with P i N i ≥ ζ and such that their population size is equal, that is N i = N 0 for i = 1, 2, . . . , z. It follows that their relative population size within this set of neighborhoods is N ī Moreover, we consider first the case where ζ ∈ [0, 1) is such that for a given A ∈ we have z = n ≥ 2.
Consider the effect of the combined transfers of population in INV-T, and apply them to the specification derived in Lemma 1. Note that these transfers take place among neighborhoods in {1, 2, . . . , z} and do not affect the components A P N ,P z N z and h Recall that i n by construction could be any rational number in (0, 1], with h(0) = 0 already set in Lemma 1. Given that the set of rational numbers is dense in (0, 1] and that h(.) is continuous in that interval the result could be extended to all real numbers in [0, 1], with h(0) = 0. Recalling thatN i N = i n we can then write more generally Consider the weighting function h from Lemma 1, it can then be specified as: By substituting into the specification of UP (A; ζ ) in Lemma 1, one obtains the results presented in this lemma. Recall that we have derived the result under the assumption that for ζ ∈ [0, 1) we have z = n ≥ 2 and note that the function h(.) does not depend on ζ . In order to extend the result to all cases where n ≥ z ≥ 2 it needs to be checked that the obtained functional form for h(.) allows to satisfy INV-T also when z < n. Note that, as in Eq. 7, the application of the transfers in INV-T when n > z ≥ 2 requires that the following condition has to be satisfied for all i, j ∈ {1, 2, . . . , z − 1}, for n > z ≥ 2, for ε > 0. That is, after substituting for the derived specification of h This is similarly the case if we consider the neighborhood with index j . As a result INV-T holds also for n > z ≥ 2.
To complete the exposition we consider the case where n > z = 1. In this case INV-T cannot be applied; however, the required specification of the function h(.) has already been derived in the previous steps of the proof.

Lemma 3 Let
Proof We consider the result from Lemma 2 and investigate the implications on the specification of UP (.; ζ ) generated by further imposing INV-PL, MON, TRAN and NOR. We leave to the reader to check that the obtained specification of UP (.; ζ ) satisfies all axioms, here we focus on the "only if" part of the lemma. Recall that, if z ≥ 1, then according to Lemma 2 it is possible to write We first consider INV-PL(i We take into account two cases, first when ζ = 0 and then when ζ ∈ (0, 1). By applying the result to the specification in Lemma 2, and letting β := β 0 K and γ := γ 0 K/2 one obtains We now investigate the effects of MON and TRAN. According to MON considering that Rearranging the condition, it implies that This condition depends on the value of z ≥ 1, and in particular, because of the construction of the weighting function w i (.) that satisfies INV-S, the condition depends only onN z N . In fact, in this case, because of INV-S, without loss of generality, one can consider distributions with two neighborhoods and z = 1. In this case N 1 =N 1 =N z and recall thatN 0 = 0. After substituting, one obtains the condition for allN z N ∈ (0, 1]. LettingN z N = 1, that is if z = n, it follows that a necessary condition for MON to hold is β ≥ 0. Moreover, lettingN z N → 0, the additional derived necessary condition is β + γ ≥ 0, because otherwise, if γ < −β for sufficiently small values ofN z N is possible to violate the condition in Eq. 10. Both necessary conditions β ≥ 0 and β + γ ≥ 0 turn out to be sufficient for Eq. 10 to hold for allN z N ∈ (0, 1]. We consider now the restrictions required by axiom TRAN. First we consider the case where, because of the transfer, the poverty incidence in neighborhood j does not fall below ζ , that is P A j /N A j ≥ ζ . Recall moreover, that according to TRAN the considered transfer does not affect the ranking of the neighborhoods. Consider Eq. 
9 and note that, according to TRAN, only P i and P j are modified by the transfer. It should then be verified that for j > i, with j ≤ z, and ε > 0. As a result, γ ≥ 0 should hold. Note that the condition should hold for all i < j ≤ z with z ≤ 2 and therefore, letting z = n, it should hold for all i < j ≤ n.
We consider now the case where P A j /N A j < ζ, with j = z, by applying TRAN, it follows that The condition can then be simplified as Recalling that (N−N i for j > i, that ε ≤ ε, and that β ≥ 0, then γ ≥ 0 is sufficient to verify that UP (A ; ζ ) ≥ UP (A; ζ ).
Thus, γ ≥ 0 is necessary and sufficient for TRAN to hold. By combining with the parametric restrictions derived by applying MON one obtains β ≥ 0, and γ ≥ 0.
All derivations illustrated so far consider the case where z ≥ 1. Note that forA ∈ and given ζ ∈ (0, 1) it is possible to take into account also configurations where P i N i < ζ for all neighborhoods i. In this case the value of the index is derived by considering axiom NOR. For all these configurations the value of the index coincides with the infimum of the index taken over all the other possible configurations in A where z ≥ 1. Consider the obtained derivation of UP for z ≥ 1, where with β, γ ≥ 0, note that the first term in the summation P i −ζ N i P ≥ 0 is non-increasing in i, and that the term (N−N i )−N i−1 N is also non-increasing in i. It follows that, given that β, γ ≥ 0 also β + γ (N−N i )−N i−1 N is non-increasing in i. The summation in Eq. 11 is then minimized, for each z ≥ 1 if the terms P i −ζ N i P are equalized. Given that P i −ζ N i P ≥ 0 then the minimum for each z ≥ 1 is obtained for P i − ζ N i = 0 for all i ≤ z. It follows that in this case UP = 0. Thus, by NOR the value of the index is 0 when P i N i < ζ for all i. To complete the proof we rearrange the specification of UP (.; ζ ) in Eq. 11. We can rewrite: . Thus, we obtain: In order to complete the proof of the Theorem one has to link the result in Lemma 3 with the Gini index formula G(A; ζ ). Next lemma provides this link.
Lemma 4 Let A ∈ , ζ ∈ [0, 1), and z ≥ 1, then Proof The Gini index G(.; ζ ) can be written as follows: We now develop the first term appearing in squared brackets in Eq. 12, denoted max in short-hand notation, to show that it can written as a function of the rank weights. First, let develop the double summations term as follows: After As a resultP z G( , after dividing both sides by P we obtain the result in the lemma. By substituting from Lemma 4 into the specification of Lemma 3 in Eq. 8 for z ≥ 1, we obtain the specification of UP (.; ζ ) in the Theorem for z ≥ 1 : To complete the proof we show that all axioms are independent, meaning that it is possible to derive alternative functional forms for UP (.; ζ ) by dropping one of the axioms and considering all the others. Drop for z ≥ 1, and set UP (.; ζ ) = 0 in all other cases. Drop AGG: consider for z ≥ 1, and set UP (.; ζ ) = 0 in all other cases. QED.

A.2 Proof of Corollary 4
Proof Let p i = P i N i and s i = N i N denote the poverty incidence and population share of neighborhood i, respectively.
Let p = (p 1 , . . . , p n ) T be the n × 1 vector of neighborhood poverty rates sorted in decreasing order and s = (s 1 , . . . , s n ) T be the n × 1 vector of the corresponding population shares. A urban poverty configuration is fully identified by the pair (s, p), and is used interchangeably. Let 1 n being the n × 1 vector with each element equal to 1, P is the n × n skew-symmetric matrix: The difference enclosed within square brackets on the right-hand side of Eq. 23 can be additively split into two components: one component measuring the re-ranking of neighborhoods, a second component measuring the change in disparities between neighborhood poverty rates. Let p t |t be the n × 1 vector of t neighborhood poverty rates sorted in decreasing order of the respective t neighborhood poverty rates, and B be the n × n permutation matrix re-arranging the elements of p t to obtain p t |t , that is p t |t = Bp t . Matrix P t |t = 1/p t |t 1 n p T t |t − p t |t 1 T n contains the n 2 relative pairwise differences between the neighborhood poverty rates as arranged in p t |t . The concentration index of the t neighborhood poverty rates sorted by the t neighborhood poverty rates, calculated by using the t population shares, is defined as follows: By using permutation matrix B, the concentration index C s t , p t |t can be re-written as a function of P t instead of P t |t . Since P t |t = BλP t B T , the concentration index C s t , p t |t expressed as a function of P t becomes By adding C s t , p t |t as expressed in Eq. 24 and subtracting it as expressed in Eq. 25 to the difference enclosed within square brackets on the right-hand side of Eq. 23, we obtain where R =G t|t − B TG t B and D = P t |t − P t . Component R measures the effect of reranking of neighborhoods from t to t and its contribution to the change in urban poverty is always non-negative. The nonzero elements of R indicate the pairs of neighborhoods which have re-ranked from t to t .
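The role of the permutation matrix B can be illustrated with a small sketch. It builds the matrix that rearranges the end-period poverty rates, sorted in decreasing order of their own values, into the order given by the initial-period ranking of the same neighborhoods. The function names are hypothetical, and ties in poverty rates are assumed to be broken by the order in which neighborhoods are listed.

```python
def ranking(rates):
    """Neighborhood indices ordered by decreasing poverty rate.

    Python's sort is stable, so tied neighborhoods keep their input order.
    """
    return sorted(range(len(rates)), key=lambda k: -rates[k])


def reranking_matrix(rates_initial, rates_final):
    """Permutation matrix B mapping the final-period rates sorted by their
    own values into the final-period rates sorted by the initial ranking."""
    order_initial = ranking(rates_initial)
    order_final = ranking(rates_final)
    # Row r picks the neighborhood ranked r-th in the initial period;
    # column c is the position of that neighborhood in the final ranking.
    return [[1 if a == b else 0 for b in order_final] for a in order_initial]


def apply_matrix(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]
```

When no neighborhood changes rank between the two periods, `reranking_matrix` returns the identity matrix, so the re-ranking component has no nonzero entries in that case.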
Component D measures the effect of disproportionate changes in neighborhood poverty rates. The generic (i, j)-th element of D compares the relative difference between the t poverty rates of the neighborhoods in positions j and i in $p_t$ with the relative difference between the t′ poverty rates of the same two neighborhoods in $p_{t'|t}$. A positive (negative) value of D indicates that relative disparities in neighborhood poverty rates have increased (decreased) from t to t′, increasing (reducing) urban poverty. If all neighborhood poverty rates have changed in the same proportion from t to t′, then D = 0.
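The last property can be checked numerically. The sketch below builds the matrix of relative pairwise differences, with (i, j)-th element equal to the difference between the poverty rates in positions j and i divided by the mean poverty rate, and takes the difference between two periods with neighborhoods kept in the same order. It assumes that the mean is weighted by population shares and that the same shares are used in both periods; the function names are hypothetical.

```python
def relative_differences(rates, shares):
    """Skew-symmetric matrix whose (i, j) entry is (p_j - p_i) / mean rate.

    Assumption: the mean poverty rate is weighted by population shares.
    """
    mean_rate = sum(s * p for s, p in zip(shares, rates))
    n = len(rates)
    return [[(rates[j] - rates[i]) / mean_rate for j in range(n)]
            for i in range(n)]


def disparity_change(rates_initial, rates_final, shares):
    """Element-wise change in relative pairwise differences between periods.

    Both rate vectors must list the neighborhoods in the same order
    (the ranking of the initial period).
    """
    before = relative_differences(rates_initial, shares)
    after = relative_differences(rates_final, shares)
    return [[a - b for a, b in zip(row_after, row_before)]
            for row_after, row_before in zip(after, before)]
```

Scaling every poverty rate by a common factor also scales the mean rate by that factor, so every relative difference is unchanged and the resulting matrix is zero up to floating-point error.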
Since matrix D in Eq. 27 is obtained by subtracting P t from P t |t , D can be re-written as By replacing D in Eq. 27 with its expression in Eq. 31, the decomposition of the change in urban poverty becomes

A.3 Proof of Corollary 5
Proof Building on the Rey and Smith (2013) spatial decomposition of the Gini index and the spatial decomposition of the change in inequality in Mussini (2020), $\Delta UP$, W, R and E can be broken down into spatial components. Let $N_t$ be the n × n spatial weights matrix having its (i, j)-th entry equal to 1 if and only if the (i, j)-th element of $P_t$ is the relative difference between the poverty rates of two neighborhoods that are spatially close; otherwise the (i, j)-th element of $N_t$ is 0. Using the Hadamard product (denoted $\odot$), the relative pairwise differences between the poverty rates of neighborhoods that are spatially close can be selected from $P_t$: $P_{N,t} = N_t \odot P_t$.
For each pair of neighborhoods, the relative difference between their t′ poverty rates in $P^e_{t'|t}$ has the same position as the relative difference between their t poverty rates in $P_t$. Thus, $N_t$ also selects the relative pairwise differences between neighbors from $P^e_{t'|t}$: $P^e_{N,t'|t} = N_t \odot P^e_{t'|t}$.
Since $E = P^e_{t'|t} - P_t$, the Hadamard product between $N_t$ and E is a matrix whose nonzero elements are the elements of E pertaining to neighborhoods that are spatially close: $E_N = P^e_{N,t'|t} - P_{N,t} = N_t \odot (P^e_{t'|t} - P_t) = N_t \odot E$.
The Hadamard product of $N_t$ and $P_t$ is the matrix $P_{N,t} = N_t \odot P_t$, whose nonzero elements are the relative pairwise differences between the t poverty rates of neighborhoods that are in spatial proximity. The decomposition of the change in the neighborhood component of urban poverty is obtained by replacing $P_t$ and E in Eq. 32 with $P_{N,t}$ and $E_N$, respectively. Let $J_n$ be the n × n matrix with diagonal elements equal to 0 and off-diagonal elements equal to 1. The matrix with nonzero elements equal to the relative pairwise differences between the t poverty rates of neighborhoods that are not in spatial proximity is $P_{nN,t} = (J_n - N_t) \odot P_t$.
The matrix selecting the elements of E pertaining to the pairs of neighborhoods that are not spatially close is $E_{nN} = (J_n - N_t) \odot E$.
The decomposition of the change in the non-neighborhood component of urban poverty is obtained by replacing $P_t$ and E in Eq. 32 with $P_{nN,t}$ and $E_{nN}$, respectively. Given Eqs. 40 and 37, the spatial decomposition of the change in urban poverty is $\Delta UP = W_N + W_{nN} + R_N + R_{nN} + C \cdot (E_N + E_{nN})$.
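The selection step behind this spatial split can be sketched in a few lines. The example applies a binary proximity matrix and its off-diagonal complement to a matrix of pairwise differences through element-wise (Hadamard) products. It assumes that the proximity matrix has a zero diagonal, and the function names are hypothetical.

```python
def hadamard(left, right):
    """Element-wise (Hadamard) product of two equally sized matrices."""
    return [[a * b for a, b in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]


def split_by_proximity(pairwise, proximity):
    """Split a pairwise-difference matrix into neighbor and non-neighbor parts.

    `proximity` is a 0/1 matrix with zero diagonal (assumption) whose (i, j)
    entry is 1 when neighborhoods i and j are spatially close.
    """
    n = len(pairwise)
    # Complement of the proximity matrix among off-diagonal entries.
    complement = [[(0 if i == j else 1) - proximity[i][j] for j in range(n)]
                  for i in range(n)]
    return hadamard(proximity, pairwise), hadamard(complement, pairwise)
```

Because the diagonal of a pairwise-difference matrix is zero, the two parts add up to the original matrix entry by entry. This is why the neighbor and non-neighbor components sum to the corresponding aggregate term.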

Appendix B: Additional results
Funding Open access funding provided by Università degli Studi di Verona within the CRUI-CARE Agreement.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.