Appendix A: Proofs of convexity and quasiconvexity
The following proofs establish the convexity or quasiconvexity of a variety of inequality indices. We denote by \(y_{i}\) the income of an individual or a household and, to ease the proofs, abstain from weighting factors, equivalence scales, and transfers. The proofs are nevertheless without loss of generality, as one could replace \(y_{i}\) with \(\frac {y_{i}+t_{i}}{ES_{i}}\) and scale the sums with weights \(w_{i}\).
A.1 The variance is convex
The variance is defined as,
$$ V\left( \left\{y_{i}\right\}_{i = 1}^{N}\right)=\frac{1}{N}\sum\limits_{i = 1}^{N} \left( y_{i}-\frac{1}{N}\sum\limits_{i = 1}^{N} y_{i} \right)^{2}. $$
(16)
The functions \(y_{i}-\frac {1}{N}{\sum }_{i = 1}^{N} y_{i}\) are affine for all i. The square of an affine function is convex, and summing convex functions preserves convexity. Therefore, the variance is convex.
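As a quick numerical illustration (not part of the proof), the following Python sketch checks the defining convexity inequality \(V(\theta x+(1-\theta)y)\leq \theta V(x)+(1-\theta)V(y)\) on randomly drawn income vectors; the helper name and the sample data are our own choices.

```python
import numpy as np

def variance(y):
    """Population variance of an income vector, Eq. (16)."""
    return np.mean((y - np.mean(y)) ** 2)

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.uniform(0.0, 100.0, (2, 10))    # two random income vectors
    theta = rng.uniform()
    lhs = variance(theta * x + (1 - theta) * y)
    rhs = theta * variance(x) + (1 - theta) * variance(y)
    assert lhs <= rhs + 1e-9                   # convexity inequality holds
```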
A.2 The absolute mean deviation is convex
The absolute mean deviation is defined as,
$$ \text{AMD}\left( \left\{y_{i}\right\}_{i = 1}^{N}\right)=\frac{1}{N}\sum\limits_{i = 1}^{N} \left|y_{i}-\frac{1}{N}\sum\limits_{i = 1}^{N} y_{i} \right|. $$
(17)
The affine functions \(y_{i}-\frac {1}{N}{\sum }_{i = 1}^{N} y_{i}\) are composed with the absolute value, which is a norm. Norms are convex, and composing a convex function with an affine map preserves convexity, so each summand is convex. As before with the variance, summing then preserves convexity, which implies that the absolute mean deviation is convex.
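The same numerical check as for the variance applies here (again only an illustration with made-up data):

```python
import numpy as np

def amd(y):
    """Absolute mean deviation, Eq. (17)."""
    return np.mean(np.abs(y - np.mean(y)))

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.uniform(0.0, 100.0, (2, 10))
    theta = rng.uniform()
    # Convexity: the AMD of the mixture never exceeds the mixture of the AMDs.
    assert amd(theta * x + (1 - theta) * y) <= theta * amd(x) + (1 - theta) * amd(y) + 1e-9
```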
A.3 The Gini Index is quasiconvex
The Gini index can be written as,
$$ G\left( \left\{y_{i}\right\}_{i = 1}^{N}\right)= \frac{1}{2N{\sum}_{i = 1}^{N} y_{i}}\sum\limits_{j = 1}^{N}\sum\limits_{i = 1}^{N}\left|y_{i}-y_{j}\right| . $$
(18)
To establish the quasiconvexity of G(⋅) in the \(y_{i}\), we show that its sublevel sets are convex. We therefore introduce the Gini's sublevel sets, \(L_{k}^{-}(G)\),
$$\begin{array}{@{}rcl@{}} L_{k}^{-}(G)=\left\{(y_{1},{\cdots} ,y_{n})\,\mid \,G(y_{1},{\cdots} ,y_{n})\leq k\right\}, \end{array} $$
(19)
with k denoting the upper bound of the sublevel set. If \(L_{k}^{-}(G)\) is a convex set for every k, then G(⋅) is quasiconvex. The sublevel set for an arbitrary k may be written as,
$$\begin{array}{@{}rcl@{}} &&\frac{1}{2N{\sum}_{i = 1}^{N} y_{i}}\sum\limits_{j = 1}^{N}\sum\limits_{i = 1}^{N} \left|y_{i}-y_{j}\right| \leq k \end{array} $$
(20)
$$\begin{array}{@{}rcl@{}} &&\sum\limits_{j = 1}^{N}\sum\limits_{i = 1}^{N}\left|y_{i}-y_{j}\right| - 2kN\sum\limits_{i = 1}^{N} y_{i}\leq 0. \end{array} $$
(21)
Multiplying through by the denominator preserves the direction of the inequality because the mean income is non-negative. The resulting condition describes a convex set in the \(y_{i}\) if the left-hand side is a convex function of the \(y_{i}\) for every k. This is sufficient, since any sublevel set of a convex function is a convex set, and here we study the sublevel set of that function at level zero.
Next, we rewrite the left-hand side as a function with known convexity properties. The double sum of absolute differences can be expressed as the point-wise maximum of finitely many linear expressions in the \(y_{i}\) (one for each sign configuration of the differences), from which another linear function in the \(y_{i}\) is subtracted. For example, if N = 2:
$$ \max \left\{2\left( y_{1}-y_{2}\right),2\left( y_{2}-y_{1}\right)\right\}- 4k\sum\limits_{i = 1}^{2}y_{i} $$
(22)
The point-wise maximum of linear expressions is convex, so the maximum term is convex. The second term is linear in the \(y_{i}\) and, thus, also convex. Hence, the whole left-hand side is convex for any k, and the Gini index is quasiconvex in the \(y_{i}\).
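For quasiconvexity, the relevant inequality is \(G(\theta x+(1-\theta)y)\leq \max \{G(x),G(y)\}\). A small numerical sanity check (our own illustration, not part of the proof):

```python
import numpy as np

def gini(y):
    """Gini index, Eq. (18): sum of absolute differences over 2N times total income."""
    return np.abs(y[:, None] - y[None, :]).sum() / (2 * len(y) * y.sum())

rng = np.random.default_rng(2)
for _ in range(1000):
    x, y = rng.uniform(1.0, 100.0, (2, 10))    # strictly positive incomes
    theta = rng.uniform()
    # Quasiconvexity: the Gini of the mixture never exceeds the larger of the two Ginis.
    assert gini(theta * x + (1 - theta) * y) <= max(gini(x), gini(y)) + 1e-9
```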
A.4 The relative mean deviation is quasiconvex
The following proof builds on the convexity of the absolute mean deviation (see the proof above). Since \(RMD=\frac {AMD\left (\left \{y_{i}\right \}_{i = 1}^{N}\right )}{\frac {1}{N}{\sum }_{i = 1}^{N} y_{i}}\), its sublevel sets can be written as,
$$ AMD\left( \left\{y_{i}\right\}_{i = 1}^{N}\right) - k \frac{1}{N}\sum\limits_{i = 1}^{N} y_{i} \leq 0. $$
(23)
The left-hand side is the sum of a convex function and an affine function and is therefore convex for any k. Hence, the sublevel sets of the RMD are convex sets, and the RMD is quasiconvex.
A.5 The Atkinson Index is quasiconvex
The Atkinson index is defined as,
$$ A_{\epsilon }\left( \left\{y_{i}\right\}_{i = 1}^{N}\right)= 1-\frac{1}{\frac{1}{N}{\sum}_{i = 1}^{N} y_{i}}\left( \frac{1}{N}\sum\limits_{i = 1}^{N}\left( y_{i}\right)^{1-\epsilon }\right)^{\frac{1}{1-\epsilon }}. $$
(24)
First, consider only the second term of A𝜖 and substitute p = 1 − 𝜖. Then,
$$ s\left( \left\{y_{i}\right\}_{i = 1}^{N}\right)=\left( \frac{1}{N}\sum\limits_{i = 1}^{N}\left( y_{i}\right){}^{p}\right)^{\frac{1}{p}}. $$
(25)
This function is concave for p ≤ 1 or, equivalently, 𝜖 ≥ 0. To show quasiconvexity, we need to establish that the sublevel sets of A𝜖 are convex. It suffices to verify that the negative term of A𝜖 has convex sublevel sets, since adding the constant 1 merely shifts the level k.
The sublevel sets are given by,
$$\begin{array}{@{}rcl@{}} &&-\frac{1}{\frac{1}{N}{\sum}_{i = 1}^{N} y_{i}}\left( \frac{1}{N}\sum\limits_{i = 1}^{N}\left( y_{i}\right)^{1-\epsilon }\right)^{\frac{1}{1-\epsilon }} \leq k \end{array} $$
(26)
$$\begin{array}{@{}rcl@{}} &&-\left( \frac{1}{N}\sum\limits_{i = 1}^{N}\left( y_{i}\right)^{1-\epsilon }\right)^{\frac{1}{1-\epsilon }} -k\frac{1}{N}\sum\limits_{i = 1}^{N} y_{i} \leq 0. \end{array} $$
(27)
Next, we assess whether the functions on the left-hand side are convex. If they generate convex sets for any k, quasiconvexity is implied. This is the case: the first function is convex, because the negative of the concave function \(s\left (\left \{y_{i}\right \}_{i = 1}^{N}\right )\) is convex, and the second function is affine.
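A numerical check of the quasiconvexity inequality for the Atkinson index (our own illustration; the parameter value 𝜖 = 0.5 and the sample data are arbitrary choices):

```python
import numpy as np

def atkinson(y, eps=0.5):
    """Atkinson index, Eq. (24), for eps != 1."""
    ede = np.mean(y ** (1 - eps)) ** (1 / (1 - eps))   # equally distributed equivalent income
    return 1 - ede / np.mean(y)

rng = np.random.default_rng(3)
for _ in range(1000):
    x, y = rng.uniform(1.0, 100.0, (2, 10))
    theta = rng.uniform()
    # Quasiconvexity: the index of the mixture never exceeds the larger of the two indices.
    assert atkinson(theta * x + (1 - theta) * y) <= max(atkinson(x), atkinson(y)) + 1e-9
```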
A.6 The Theil Index is quasiconvex
The definition of the Theil Index is,
$$ T=\frac{1}{N}{\sum\limits_{i}^{N}}\frac{N y_{i}}{{{\sum}_{i}^{N}}y_{i}}\text{Log}\left[\frac{N y_{i}}{{{\sum}_{i}^{N}}y_{i}}\right]. $$
(28)
For quasiconvexity the sublevel sets of the Theil Index need to be convex. Accordingly,
$$\begin{array}{@{}rcl@{}} && \frac{1}{N}{\sum\limits_{i}^{N}}\frac{N y_{i}}{{{\sum}_{i}^{N}}y_{i}}\text{Log}\left[\frac{N y_{i}}{{{\sum}_{i}^{N}}y_{i}}\right] \leq k \end{array} $$
(29)
$$\begin{array}{@{}rcl@{}} &&{\sum\limits_{i}^{N}} y_{i}\text{Log}\left[\frac{N y_{i}}{{{\sum}_{i}^{N}}y_{i}}\right] - k {\sum\limits_{i}^{N}} y_{i}\leq 0. \end{array} $$
(30)
The functions on the left-hand side induce convex sets if they are convex. The second term is affine and, thus, convex. The first term is convex if its Hessian is positive semi-definite. The second partial derivatives of \(f(\{ y_{i}\}^{N}_{i = 1})= {{\sum }_{i}^{N}} y_{i}\text {Log}\left [\frac {N y_{i}}{{{\sum }_{i}^{N}}y_{i}}\right ]\) are,
$$ f_{y_{i},y_{i}}= \frac{1}{y_{i}}-\frac{1}{{{\sum}_{i}^{N}}y_{i}}~, \quad f_{y_{i},y_{j}}= -\frac{1}{{{\sum}_{i}^{N}}y_{i}}. $$
(31)
Then the Hessian of \(f(\{ y_{i}\}^{N}_{i = 1})\) is,
$$ H_{f} = \left( \begin{array}{ccc} \frac{1}{y_{1}}-\frac{1}{{{\sum}_{i}^{N}}y_{i}} & -\frac{1}{{{\sum}_{i}^{N}}y_{i}} & {\cdots} \\ -\frac{1}{{{\sum}_{i}^{N}}y_{i}} & \frac{1}{y_{2}}-\frac{1}{{{\sum}_{i}^{N}}y_{i}} & ~ \\ {\vdots} & ~ & \ddots \end{array} \right). $$
(32)
We can decompose the matrix before testing for positive semi-definiteness as
$$ H_{f} = \left( \begin{array}{ccc} \frac{1}{y_{1}} & 0 & {\cdots} \\ 0 & \frac{1}{y_{2}} & ~ \\ {\vdots} & ~ & \ddots \end{array} \right) - \left( \begin{array}{ccc} \frac{1}{{{\sum}_{i}^{N}}y_{i}} & \frac{1}{{{\sum}_{i}^{N}}y_{i}} & {\cdots} \\ \frac{1}{{{\sum}_{i}^{N}}y_{i}} & \frac{1}{{{\sum}_{i}^{N}}y_{i}} & ~ \\ {\vdots} & ~ & \ddots \end{array} \right) . $$
(33)
The Hessian is positive semi-definite if and only if, for any vector υ,
$$ \boldsymbol{\upsilon}^{\prime}\left( \begin{array}{ccc} \frac{1}{y_{1}} & 0 & {\cdots} \\ 0 & \frac{1}{y_{2}} & ~ \\ {\vdots} & ~ & \ddots \end{array} \right)\boldsymbol{\upsilon} - \boldsymbol{\upsilon}^{\prime}\left( \begin{array}{ccc} \frac{1}{{{\sum}_{i}^{N}}y_{i}} & \frac{1}{{{\sum}_{i}^{N}}y_{i}} & {\cdots} \\ \frac{1}{{{\sum}_{i}^{N}}y_{i}} & \frac{1}{{{\sum}_{i}^{N}}y_{i}} & ~ \\ {\vdots} & ~ & \ddots \end{array} \right)\boldsymbol{\upsilon} \geq 0 . $$
(34)
To show that this is the case, we rely on the Cauchy-Schwarz inequality. It states that for any two vectors a and b,
$$ (\mathbf{a}^{\prime}\mathbf{a})(\mathbf{b}^{\prime}\mathbf{b})\geq (\mathbf{a}^{\prime}\mathbf{b})^{2}. $$
(35)
Writing the quadratic form of the Hessian with υ as summations, the condition becomes,
$$ \frac{1}{{{\sum}_{i}^{N}}y_{i}}\left( \left( {\sum\limits_{i}^{N}}y_{i}\right)\left( {\sum\limits_{i}^{N}} \frac{{\upsilon^{2}_{i}}}{y_{i}}\right) - \left( {\sum\limits_{i}^{N}} \upsilon_{i}\right)^{2} \right) \geq 0. $$
(36)
To complete the proof, pick \(a^{\prime }=(\sqrt {y_{1}},\sqrt {y_{2}},\ldots )\) and \(b^{\prime }=(\frac {\upsilon _{1}}{\sqrt {y_{1}}},\frac {\upsilon _{2}}{\sqrt {y_{2}}},\ldots )\); the Cauchy-Schwarz inequality then establishes that the expression above is greater than or equal to zero.
Since both functions determining the sublevel sets are convex for any k, the Theil index is quasiconvex.
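The positive semi-definiteness of the Hessian in (33) can also be verified numerically for random positive income vectors (an illustration only, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(4)
for _ in range(1000):
    y = rng.uniform(1.0, 100.0, 8)                              # strictly positive incomes
    n = len(y)
    # Hessian from Eq. (33): diagonal of 1/y_i minus the constant matrix with entries 1/sum(y).
    H = np.diag(1.0 / y) - np.full((n, n), 1.0 / y.sum())
    assert np.linalg.eigvalsh(H).min() >= -1e-10                # numerically positive semi-definite
```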
Appendix B: Implementation for the Gini index
The optimization problem with the classic formulation of the Gini index is,
$$\begin{array}{@{}rcl@{}} &&\underset{\mathbf{t}}{\text{minimize}} \quad \frac{1}{2 W \sum\limits^{N}_{i = 1} w_{i} \frac{y_{i}+t_{i}}{ES_{i}}} \sum\limits_{i = 1}^{N} w_{i} \sum\limits_{j = 1}^{N} w_{j} \left|\frac{y_{i}+t_{i}}{ES_{i}}-\frac{y_{j}+t_{j}}{ES_{j}}\right|\\ && \text{subject to}~ 0 \leq t_{i},~ i = 1,\ldots,N \\ && \sum\limits_{i = 1}^{N} t_{i} \leq B \end{array} $$
(37)
Because of the absolute value function, the classic, discrete formulation of the Gini index is not differentiable wherever an income difference equals zero. To derive a differentiable reformulation, we introduce the variables Δij, which replace the absolute differences in the objective function, and impose the linear constraints \(-{\Delta }_{ij} + \left (\frac {y_{i}+t_{i}}{ES_{i}}-\frac {y_{j}+t_{j}}{ES_{j}}\right )\leq 0\) and \(-{\Delta }_{ij} - \left (\frac {y_{i}+t_{i}}{ES_{i}}-\frac {y_{j}+t_{j}}{ES_{j}}\right )\leq 0\), which require each Δij to be at least as large as the corresponding income difference and its negative. To see that the Δij are then non-negative, pick the income difference between any i and j and consider the following two scenarios:
1. Let \(\left (\frac {y_{i}+t_{i}}{ES_{i}}-\frac {y_{j}+t_{j}}{ES_{j}}\right )\) be non-negative. Then Δij has to be greater than or equal to \(\left (\frac {y_{i}+t_{i}}{ES_{i}}-\frac {y_{j}+t_{j}}{ES_{j}}\right )\) and, thus, will always be greater than or equal to \(-\left (\frac {y_{i}+t_{i}}{ES_{i}}-\frac {y_{j}+t_{j}}{ES_{j}}\right )\). Accordingly, Δij is non-negative.
2. Let \(\left (\frac {y_{i}+t_{i}}{ES_{i}}-\frac {y_{j}+t_{j}}{ES_{j}}\right )\) be negative. Then Δij has to be greater than or equal to \(-\left (\frac {y_{i}+t_{i}}{ES_{i}}-\frac {y_{j}+t_{j}}{ES_{j}}\right )\) and, thus, will always be greater than \(\left (\frac {y_{i}+t_{i}}{ES_{i}}-\frac {y_{j}+t_{j}}{ES_{j}}\right )\). So Δij is again non-negative.
Since each Δij is bounded below by both the income difference and its negative, and the objective is minimized over the Δij, every Δij equals the corresponding absolute difference at the optimum. Hence, we can convert (37) into an equivalent differentiable optimization problem,
$$\begin{array}{@{}rcl@{}} &&\underset{\mathbf{t, {\Delta}_{ij} }}{\text{minimize}} \quad \frac{1}{2 W \sum\limits^{N}_{i = 1} w_{i} \frac{y_{i}+t_{i}}{ES_{i}}} \sum\limits_{i = 1}^{N} w_{i} {\sum}_{j = 1}^{N} w_{j} {\Delta}_{ij}\\ && \text{subject to}~ 0 \leq t_{i},~ i = 1,\ldots,N \\ && \sum\limits_{i = 1}^{N} t_{i} \leq B \\ && -{\Delta}_{ij} + \left( \frac{y_{i}+t_{i}}{ES_{i}}-\frac{y_{j}+t_{j}}{ES_{j}}\right)\leq 0,~ \forall i,j = 1,\ldots,N \\ && -{\Delta}_{ij} - \left( \frac{y_{i}+t_{i}}{ES_{i}}-\frac{y_{j}+t_{j}}{ES_{j}}\right)\leq 0,~ \forall i,j = 1,\ldots,N . \end{array} $$
(38)
A Linear-Fractional Problem. The objective function of the modified problem (38) has a specific form: an affine function in the numerator and an affine function in the denominator. Problems of this type are called linear-fractional (see Boyd and Vandenberghe (2004, p. 151) or, originally, Charnes and Cooper (1962)). The equivalent problem is,
$$\begin{array}{@{}rcl@{}} &&\underset{\tilde{\mathbf{t}},\tilde{{\Delta}}_{ij},z}{\text{minimize}} \quad \sum\limits_{i = 1}^{N} w_{i} \sum\limits_{j = 1}^{N} w_{j} \tilde{{\Delta}}_{ij}\\ && \text{subject to}~ 0 \leq \tilde{t}_{i},~ i = 1,\ldots,N \\ && \sum\limits_{i = 1}^{N} \tilde{t}_{i} \leq zB \\ && -\tilde{{\Delta}}_{ij} + \left( \frac{zy_{i}+\tilde{t}_{i}}{ES_{i}}-\frac{zy_{j}+\tilde{t}_{j}}{ES_{j}}\right)\leq 0,~ \forall i,j = 1,\ldots,N \\ && -\tilde{{\Delta}}_{ij} - \left( \frac{zy_{i}+\tilde{t}_{i}}{ES_{i}}-\frac{zy_{j}+\tilde{t}_{j}}{ES_{j}}\right)\leq 0,~ \forall i,j = 1,\ldots,N . \\ && 2 W \sum\limits^{N}_{i = 1} w_{i} \frac{zy_{i}+\tilde{t}_{i}}{ES_{i}}= 1 \\ && z \ge 0, \end{array} $$
(39)
where we obtain the desired transfer schedule \(t_{i}=\frac {1}{z}\tilde {t}_{i} ~ \forall i\). The major advantage of solving this problem, instead of performing bisection on (38), is the immense saving in computational effort: we need only one run of the interior-point algorithm to solve (39) instead of several, as in the case of bisection.
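To make the transformation concrete, note that with unit weights and equivalence scales (w_i = ES_i = 1, so W = N) problem (39) is a linear program. The following Python sketch solves such a small instance with scipy.optimize.linprog; the income vector, the budget, and the variable layout are hypothetical choices of ours, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import linprog

y = np.array([10.0, 20.0, 30.0, 80.0])     # hypothetical incomes, unit weights and scales
B = 25.0                                   # transfer budget
N = len(y)
n_var = N + N * N + 1                      # variables: [t~_1..t~_N, Delta~_11..Delta~_NN, z]
Z = n_var - 1                              # index of z

# Objective of (39): sum of all Delta~_ij (all weights equal to one).
c = np.zeros(n_var)
c[N:Z] = 1.0

rows, b_ub = [], []
# Budget constraint: sum_i t~_i - z B <= 0.
row = np.zeros(n_var); row[:N] = 1.0; row[Z] = -B
rows.append(row); b_ub.append(0.0)

# Epigraph constraints: +/-[(z y_i + t~_i) - (z y_j + t~_j)] - Delta~_ij <= 0.
for i in range(N):
    for j in range(N):
        for sign in (+1.0, -1.0):
            row = np.zeros(n_var)
            row[i] += sign
            row[j] -= sign
            row[N + i * N + j] = -1.0
            row[Z] = sign * (y[i] - y[j])
            rows.append(row); b_ub.append(0.0)

# Normalization: 2 W sum_i (z y_i + t~_i) = 1 with W = N.
A_eq = np.zeros((1, n_var))
A_eq[0, :N] = 2.0 * N
A_eq[0, Z] = 2.0 * N * y.sum()

# Default bounds keep all variables non-negative, matching t~, Delta~, z >= 0.
res = linprog(c, A_ub=np.vstack(rows), b_ub=b_ub, A_eq=A_eq, b_eq=[1.0])
z = res.x[Z]
t = res.x[:N] / z                          # recover the transfer schedule t_i = t~_i / z
print("minimized Gini:", round(res.fun, 4), "transfers:", np.round(t, 2))
```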
Size of the Problem and Improving Performance. Further, there are two possibilities to reduce the number of variables and constraints for a given dataset. First, we can reduce the size of the dataset if two or more households are of the same type – in terms of their equivalence scale – and have the same income; we may then simply add up their population weights and optimize over the collapsed dataset. Second, we may restrict the number of households that may receive a transfer in the optimization by performing the following procedure:
1. Perform a bottom fill-up for every equivalence-scale type, where the entire budget at disposal is distributed only among households of this type.
2. Mark those households that are recipients of a positive transfer.
3. Perform the optimization of (39) with free transfer variables for the marked households only.
The justification is that, even in the most extreme case, where just one type of household experiences a bottom fill-up, only the marked households can be transfer recipients. Other households of the same type have a weaker effect on the Gini index than the marked households.
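A minimal sketch of the first possibility, assuming the micro data sit in a pandas DataFrame (the column names and figures are made up): households with identical income and equivalence-scale type are interchangeable in the optimization, so their population weights can simply be summed before the problem is set up.

```python
import pandas as pd

# Hypothetical micro data: income, equivalence-scale type, and population weight.
df = pd.DataFrame({
    "income": [10.0, 10.0, 25.0, 25.0, 40.0],
    "es":     [1.0,  1.0,  1.5,  1.5,  1.0],
    "weight": [2.0,  3.0,  1.0,  4.0,  2.0],
})

# Collapse duplicates: equal income and equal equivalence scale -> add up the weights.
collapsed = df.groupby(["income", "es"], as_index=False)["weight"].sum()
print(collapsed)    # 3 rows instead of 5: fewer variables and constraints in (39)
```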
Further, to save on memory and reduce computational effort, we provide the following guidelines to enhance the performance of the solver fmincon in Matlab (a Python analogue of the sparse-Jacobian idea is sketched after the list):
1. The gradient of the objective function, the gradient of the constraints and the Hessian should be generated as sparse matrices to save memory.
2. The gradient of the objective function is constant and its Hessian is zero everywhere; both should be supplied directly by the user.
3. The gradient of the constraints is constant and can be generated before the execution of the interior-point algorithm.
4. Parallel computations should be used to calculate the gradient of the constraints wherever possible.
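The third guideline can be illustrated with a Python/SciPy analogue (the paper's implementation uses Matlab's fmincon; this sketch, its variable layout, and the toy data are our own): the Jacobian of the 2N² epigraph constraints of (39) is constant, so it can be assembled once, in sparse format, before the solver starts.

```python
import numpy as np
from scipy.sparse import coo_matrix

y = np.array([10.0, 20.0, 30.0, 80.0])     # hypothetical incomes, unit weights and scales
N = len(y)
n_var = N + N * N + 1                      # [t~_1..t~_N, Delta~_11..Delta~_NN, z]

rows, cols, data = [], [], []
r = 0
for i in range(N):
    for j in range(N):
        for sign in (+1.0, -1.0):
            # Constraint: sign * [(z y_i + t~_i) - (z y_j + t~_j)] - Delta~_ij <= 0
            rows += [r, r, r, r]
            cols += [i, j, N + i * N + j, n_var - 1]
            data += [sign, -sign, -1.0, sign * (y[i] - y[j])]
            r += 1

# Duplicate (row, col) entries (the i == j case) are summed when converting formats.
J = coo_matrix((data, (rows, cols)), shape=(2 * N * N, n_var)).tocsr()
print(J.shape, "constraint Jacobian with", J.nnz, "stored entries")
```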