1 Introduction

The objective of this paper is twofold. The first objective is to investigate the relations between the Lorenz order (LO), the standard inequality criterion for income distributions, and the polarization order (PO), a polarization criterion for income distributions divided into two classes at the median that was introduced by Foster and Wolfson (2010) and Wang and Tsui (2000). More specifically, this paper aims to determine whether a distribution F can be either more or less polarized than a distribution H in terms of the PO when F is more unequal than H in terms of the LO. The second objective is to investigate the conditions for the PO of the standard parametric models for income distributions and those for the LO and PO of the double-Pareto lognormal (dPLN) distribution introduced by Reed (2003) and Reed and Jorgensen (2004). The derived conditions for dPLNs are particularly useful for discussions of appropriate polarization indices.

The concept of polarization and polarization measures were developed to measure the extent of clustering, i.e., population concentration in a few classes with wide gaps between classes. The PO is an underlying stochastic order for two-class polarization (bipolarization) indices such as the Foster-Wolfson index (2010) and Wang-Tsui index (2000). However, Zhang and Kanbur (2001) failed to empirically find clear differences between polarization and inequality indices in the measurement of intertemporal distributional changes. Their study led them and several other researchers to propose alternative bipolarization indices that are inconsistent with the PO (PO-inconsistent indices), with the aim of making the measurement of polarization more distinct from the measurement of inequality. To theoretically clarify this 'distinction' problem, this paper discusses differences in order relations between the distributions on the level of the respective underlying stochastic orders. The discussion reveals that, due to the super-sub relations of LO and PO to the star-shaped order (SO) (Marshall et al., 1967), a distribution F can be more polarized than H (in terms of PO) when F is more unequal than H (in terms of LO), but F essentially cannot be less polarized than H when F is more unequal than H. Hence, the polarization measurement cannot be clearly distinct from the inequality measurement if we adopt the PO as the polarization criterion. This finding is supportive of the empirical study of Zhang and Kanbur (2001). The paper then pursues the second objective in order to search for appropriate PO-inconsistent indices.

Modeling the size distributions of incomes has been one of the major topics in research on income distributions. Regarding this topic, the conditions for the LO of major size distribution models have been studied by researchers such as Taillie (1981), Wilfling and Kramer (1993), Wilfling (1996a, 1996b), Kleiber (1996, 1999), and Belzunce et al. (2013). Although neither the conditions for the PO of the standard distribution models nor those for the LO and PO of the dPLN distribution, a relatively new model, have been studied so far, those conditions are considered to be important in their own right for studies on modeling the size distributions of incomes; in particular, so are those for the dPLN because this model is known to be a well-fitted four-parameter model of income distributions like the generalized beta distribution of the 2nd kind (GB2) (McDonald, 1984). In addition, by utilizing the dPLN's analytical tractability, the derived conditions for the dPLN can serve as a basis for investigating the properties of inequality and polarization indices (including PO-inconsistent indices) regarding their sensitivity to distributional changes. The sensitivity analysis provides a useful suggestion for the consideration of appropriate polarization indices.

The remaining sections proceed as follows: In the next section, I discuss the super-sub relations of the LO and PO to SO and the possibility that the PO has the reverse order of the LO in general (distribution-free) terms. The SO is shown to play a key role in super-sub relations among several quasi-orders, including the LO, PO and relative deprivation order. In Section 3, I address the conditions for the SO, LO and PO of major parametric models, particularly for those of the dPLN. An example demonstrates how to apply the derived conditions for the dPLN to sensitivity analyses of inequality and polarization indices to distributional changes. Although the example is illustrative, it suggests that some PO-inconsistent bipolarization index may be appropriate. The last section contains concluding remarks.

2 Star-shaped, Lorenz, and polarization orderings

2.1 Lorenz order

Hereafter, I assume that distributions are positive with finite means. I frequently identify each distribution with its cumulative distribution function (c.d.f.). The Lorenz order, the standard underlying stochastic order for inequality in income distributions, is a quasi-order between distributions defined as follows:

$$ F{\succcurlyeq}_LH\iff {L}_F(t)\le {L}_H(t),0<\forall t<1, $$
(1)

where F and H denote the c.d.fs. of the distributions in question and \( {L}_F(t)=\frac{1}{m_F}{\int}_0^t{F}^{-1}(u) du \), \( {m}_F={\int}_0^{\infty } xdF(x)={\int}_0^1{F}^{-1}\left(\tau \right) d\tau \) and \( {F}^{-1}(t)=\operatorname{inf}\left\{x:F(x)\ge t\right\} \) denote the Lorenz curve, mean and quantile function of F, respectively. For N-element discrete distributions \( F=\left\{{x}_1,{x}_2,\cdots, {x}_N\right\} \) and \( H=\left\{{y}_1,{y}_2,\cdots, {y}_N\right\} \) with the same means, where both the \( {x}_i \)s and \( {y}_i \)s are arranged in ascending order, \( F{\succcurlyeq}_LH \) is equivalent to H being obtained from F by means of a finite sequence of rank-preserving Pigou-Dalton progressive (PD) transfers (Hardy et al., 1952). A rank-preserving PD-transfer is a transfer of a small amount between two elements that does not cause order changes among all elements in the distribution and is defined as follows:

$$ {y}_i={x}_i+\delta, {y}_j={x}_j-\delta, {y}_k={x}_k,\mathrm{where}\ \delta >0,i<j,k\ne i,j. $$

The principle of progressive transfer, an essential property required for inequality measures, asserts that inequality does not increase by a rank-preserving PD-transfer. The principle in the strong sense asserts that inequality decreases by a rank-preserving PD-transfer. \( F{\succcurlyeq}_LH \) is equivalent to I(F) ≥ I(H) for any relative inequality index I (satisfying the principle); hence, \( F{\succcurlyeq}_LH \) indicates that F is not less unequal than H. \( F{\succ}_LH \) means that \( F{\succcurlyeq}_LH \) and \( {L}_F\left(\tau \right)<{L}_H\left(\tau \right) \) for some τ, implying that F is more unequal than H because I(F) > I(H) holds for any relative inequality index satisfying the principle in the strong sense.
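The equivalence between the Lorenz order and rank-preserving PD-transfers can be illustrated numerically. The following is a minimal sketch for equal-size samples only; the function names `lorenz_points`, `lorenz_dominates` and `pd_transfer` and the sample values are hypothetical and chosen for illustration. Because both Lorenz curves are piecewise linear with knots at k/N, comparing them at the knots is sufficient in this setting.

```python
from itertools import accumulate

def lorenz_points(incomes):
    """Cumulative income shares L(k/N), k = 0..N, of a positive sample."""
    xs = sorted(incomes)
    total = sum(xs)
    return [0.0] + [c / total for c in accumulate(xs)]

def lorenz_dominates(f, h, tol=1e-12):
    """True if F is not less unequal than H: L_F <= L_H at every knot k/N."""
    if len(f) != len(h):
        raise ValueError("this sketch compares samples of equal size only")
    return all(lf <= lh + tol
               for lf, lh in zip(lorenz_points(f), lorenz_points(h)))

def pd_transfer(xs, i, j, delta):
    """Transfer delta from element j to the poorer element i (0-based ranks)."""
    if not (i < j and delta > 0):
        raise ValueError("need i < j and delta > 0")
    ys = sorted(xs)
    ys[i] += delta
    ys[j] -= delta
    if ys != sorted(ys):
        raise ValueError("transfer is not rank-preserving")
    return ys

F = [1.0, 2.0, 4.0, 9.0]
H = pd_transfer(F, 0, 3, 0.5)      # richest gives 0.5 to poorest
print(H, lorenz_dominates(F, H), lorenz_dominates(H, F))
```

In this example H is obtained from F by one progressive transfer, so F Lorenz-dominates H in the sense of (1) but not conversely.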

2.2 Polarization order

There are two strands of research on polarization measurements. One is based on identification–alienation models, which enable us to measure polarization in distributions consisting of two or more classes. The indices of Esteban and Ray (1994) and Duclos et al. (2004) are in this strand; however, those indices do not have any underlying stochastic order, as explained by Esteban and Ray (1994). The second strand addressed in this paper relates to polarization in distributions consisting of two classes partitioned at a given quantile (which I call the base quantile hereafter). The basic idea of the bipolarization measurement is to evaluate how distant lower and higher classes are from the center, defined as the base quantile. Foster and Wolfson (2010) originally studied this approach to observe a decline of the middle class (Footnote 1). I generalize the definition of the underlying stochastic order proposed by Wang and Tsui (2000), which is a slight modification of the definition of Foster and Wolfson (2010), to allow a nonmedian base quantile (Footnote 2).

First, I define the base quantile \( {\xi}_t \), 0 < t < 1, of a discrete distribution \( F=\left\{{x}_1,{x}_2,\cdots, {x}_N\right\} \) with two or more elements as follows:

$$ {\xi}_t=\frac{1}{2}\left({x}_{t\bullet N}+{x}_{t\bullet N+1}\right),{N}_t=t\bullet N+0.5\ \mathrm{if}\ t\bullet N\ \mathrm{coincides}\ \mathrm{with}\ \mathrm{an}\ \mathrm{integer}, $$
$$ {\xi}_t={x}_{N_t},{N}_t=\left\lceil t\bullet N\right\rceil\ \mathrm{otherwise}, $$

where the \( {x}_i \)s are arranged in ascending order and ⌈x⌉ denotes the ceiling function, i.e., the smallest integer not less than x. In the case that the base quantile is the median (i.e., t = 0.5), \( {\xi}_{0.5}=\left({x}_n+{x}_{n+1}\right)/2 \) and \( {N}_t=n+0.5 \) for N = 2n (even), and \( {\xi}_{0.5}={x}_{n+1} \) and \( {N}_t=n+1 \) for N = 2n + 1 (odd). Next, I introduce the quasi-order of increased spread \( {\succcurlyeq}_{IS,t} \) in (2.1) for two N-element discrete distributions \( F=\left\{{x}_1,{x}_2,\cdots, {x}_N\right\} \) and \( H=\left\{{y}_1,{y}_2,\cdots, {y}_N\right\} \), where both the \( {x}_i \)s and \( {y}_i \)s are arranged in ascending order.

$$ F{\succcurlyeq}_{IS,t}H\Longleftrightarrow \left|\frac{x_i-{\xi}_t}{\xi_t}\right|\ge \left|\frac{y_i-{\eta}_t}{\eta_t}\right|\mathrm{for}\ i=1,2,\cdots, N. $$
(2.1)

\( {\xi}_t \) and \( {\eta}_t \) denote the base quantiles of F and H, respectively. The polarization order \( {\succcurlyeq}_{P,t} \) in (2.2) is a kind of 2nd order dominance relation resembling the generalized Lorenz order (Shorrocks, 1983).

$$ F{\succcurlyeq}_{P,t}H\Longleftrightarrow {\displaystyle \begin{array}{c}{\sum}_{i\le j<{N}_t}\left(\frac{\xi_t-{x}_j}{\xi_t}\right)\ge {\sum}_{i\le j<{N}_t}\left(\frac{\eta_t-{y}_j}{\eta_t}\right)\ \mathrm{for}\ 1\le i<{N}_t,\\ {}{\sum}_{N_t<j\le i}\left(\frac{x_j-{\xi}_t}{\xi_t}\right)\ge {\sum}_{N_t<j\le i}\left(\frac{y_j-{\eta}_t}{\eta_t}\right)\ \mathrm{for}\ {N}_t<i\le N.\end{array}} $$
(2.2)

In the above, the 1st and 2nd inequalities are unnecessary for \( {N}_t=1 \) and \( {N}_t=N \), respectively. \( F{\succcurlyeq}_{P,t}H \) indicates that F is not less polarized than H. Conditions (2.3) and (2.4) are generalizations of (2.1) and (2.2), respectively, that encompass all distributions, including continuous distributions.

$$ F{\succcurlyeq}_{IS,t}H\Longleftrightarrow \left|\frac{F^{-1}(s)-{F}_m^{-1}(t)}{F_m^{-1}(t)}\right|\ge \left|\frac{H^{-1}(s)-{H}_m^{-1}(t)}{H_m^{-1}(t)}\right|\mathrm{for}\ s\ne t, $$
(2.3)
$$ F{\succcurlyeq}_{P,t}H\Longleftrightarrow {\displaystyle \begin{array}{c}{\int}_s^t\left(\frac{F_m^{-1}(t)-{F}^{-1}\left(\tau \right)}{F_m^{-1}(t)}\right) d\tau \ge {\int}_s^t\left(\frac{H_m^{-1}(t)-{H}^{-1}\left(\tau \right)}{H_m^{-1}(t)}\right) d\tau\ \mathrm{for}\ s<t,\\ {}{\int}_t^s\left(\frac{F^{-1}\left(\tau \right)-{F}_m^{-1}(t)}{F_m^{-1}(t)}\right) d\tau \ge {\int}_t^s\left(\frac{H^{-1}\left(\tau \right)-{H}_m^{-1}(t)}{H_m^{-1}(t)}\right) d\tau\ \mathrm{for}\ s>t,\end{array}} $$
(2.4)

where \( {F}_m^{-1}(t)=\left(\operatorname{inf}\left\{x:F(x)>t\right\}+{F}^{-1}(t)\right)/2 \). Note that \( {F}_m^{-1}(t)={F}^{-1}(t) \) if t is a continuity point of \( {F}^{-1} \).
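For finite samples, the base quantile and the partial-sum conditions in (2.2) can be sketched as follows. This is an illustrative sketch rather than a reference implementation: the function names `base_quantile` and `polarization_dominates`, the tolerance, and the two example samples are assumptions introduced here. The lower-class sums are accumulated from the rank just below \( {N}_t \) downwards and the upper-class sums from the rank just above \( {N}_t \) upwards, which generates every sum that appears in (2.2).

```python
import math

def base_quantile(xs, t):
    """Return (xi_t, N_t) for an ascending sample xs, using 1-based ranks."""
    n = len(xs)
    tn = t * n
    if abs(tn - round(tn)) < 1e-9:          # t*N coincides with an integer
        k = int(round(tn))
        return 0.5 * (xs[k - 1] + xs[k]), k + 0.5
    k = math.ceil(tn)
    return xs[k - 1], k

def polarization_dominates(f, h, t=0.5, tol=1e-12):
    """True if F is not less polarized than H in the sense of (2.2)."""
    xs, ys = sorted(f), sorted(h)
    if len(xs) != len(ys):
        raise ValueError("this sketch compares samples of equal size only")
    n = len(xs)
    xi, nt = base_quantile(xs, t)
    eta, _ = base_quantile(ys, t)
    lower = [r for r in range(1, n + 1) if r < nt]
    upper = [r for r in range(1, n + 1) if r > nt]
    sf = sh = 0.0
    for r in reversed(lower):               # sums over i <= j < N_t
        sf += (xi - xs[r - 1]) / xi
        sh += (eta - ys[r - 1]) / eta
        if sf < sh - tol:
            return False
    sf = sh = 0.0
    for r in upper:                         # sums over N_t < j <= i
        sf += (xs[r - 1] - xi) / xi
        sh += (ys[r - 1] - eta) / eta
        if sf < sh - tol:
            return False
    return True

F = [1, 3, 7, 9]   # both classes lie farther from the median 5
H = [2, 4, 6, 8]
print(base_quantile(F, 0.5), polarization_dominates(F, H))
```

Here every element of F is at least as far from the common median as the corresponding element of H, so F also dominates H in the increased-spread sense of (2.1).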

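In the case of finite-element discrete distributions, \( F{\succcurlyeq}_{P,t}H \) is equivalent to a ‘combined' condition of increased spread \( {\succcurlyeq}_{IS,t} \) in (2.1) and increased bipolarity \( {\succcurlyeq}_{IB,t} \) in (2.5) in the sense that there exists G such that \( F{\succcurlyeq}_{IS,t}G \) and \( G{\succcurlyeq}_{IB,t}H \).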

$$ {\left\{{x}_i/{\xi}_t\right\}}_{\Omega}\ \mathrm{a}\mathrm{re}\ \mathrm{derived}\ \mathrm{from}\ {\left\{{y}_i/{\eta}_t\right\}}_{\Omega}\ \mathrm{by}\ \mathrm{a}\ \mathrm{finite}\ \mathrm{sequence}\ \mathrm{of} \operatorname {rank}-\mathrm{preserving}\ \mathrm{PD}-\mathrm{transfers}, $$
(2.5)

where \( \Omega =\left\{i:1\le i<{N}_t\right\} \) or \( \left\{i:{N}_t<i\le N\right\} \). Wang and Tsui (2000) proved this equivalence for the case in which the base quantile is the median. Their proof can be readily generalized for any other base quantile. From the equivalence, the PO is characterized by two properties: the polarization does not decrease when the gap between the two classes widens (increased spread, IS) or when the inequality reduces within either of the two classes (increased bipolarity, IB). Foster and Wolfson (2010) proposed a PO-consistent bipolarization index with the base quantile at the median, represented as follows:

$$ {\displaystyle \begin{array}{c} FW(F)=4\left({\int}_0^{0.5} d s{\int}_s^{0.5}\frac{F_m^{-1}(0.5)-{F}^{-1}\left(\tau \right)}{F_m^{-1}(0.5)} d\tau +{\int}_{0.5}^1 ds{\int}_{0.5}^s\frac{F^{-1}\left(\tau \right)-{F}_m^{-1}(0.5)}{F_m^{-1}(0.5)} d\tau \right)\\ {}=2\frac{m_F}{F_m^{-1}(0.5)}\left[1-2{L}_F(0.5)- Gini(F)\right]=2\frac{m_F}{F_m^{-1}(0.5)}\left[{G}_{med}^b(F)-{G}_{med}^w(F)\right],\end{array}} $$
(2.6)

where Gini(F) denotes the Gini index for F, and \( {G}_{med}^b(F) \) and \( {G}_{med}^w(F) \) denote the between- and within-class Gini components, respectively. The two components are expressed as \( {G}_{med}^b(F)=\left({m}_{F^U}-{m}_{F^L}\right)/\left(4{m}_F\right)=0.5-{L}_F(0.5) \) and \( {G}_{med}^w(F)=\frac{1}{4}\frac{m_{F^L}}{m_F} Gini\left({F}^L\right)+\frac{1}{4}\frac{m_{F^U}}{m_F} Gini\left({F}^U\right)= Gini(F)-{G}_{med}^b(F) \), where \( {F}^L \) and \( {F}^U \) denote the c.d.fs. of the lower and upper classes, respectively; and \( {m}_{F^L} \) and \( {m}_{F^U} \) denote the means of \( {F}^L \) and \( {F}^U \), respectively. Several researchers have proposed PO-consistent bipolarization indices, including the index of Wang and Tsui (2000).
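As an illustration of the second line of (2.6), the index can be evaluated on a small sample. The sketch below assumes an even sample size (so that the lower class is exactly the bottom half) and uses the population Gini index computed from the mean absolute difference; the function names and sample values are hypothetical.

```python
def gini(xs):
    """Population Gini index: mean absolute difference over twice the mean."""
    n = len(xs)
    mean = sum(xs) / n
    mad = sum(abs(a - b) for a in xs for b in xs) / (n * n)
    return mad / (2 * mean)

def foster_wolfson(xs):
    """FW index via 2 (m / median) [1 - 2 L(0.5) - Gini], even N only."""
    ys = sorted(xs)
    n = len(ys)
    if n % 2:
        raise ValueError("this sketch handles even sample sizes only")
    half = n // 2
    mean = sum(ys) / n
    median = 0.5 * (ys[half - 1] + ys[half])
    l_half = sum(ys[:half]) / sum(ys)       # income share of the lower class
    return 2 * (mean / median) * (1 - 2 * l_half - gini(ys))

fw_F = foster_wolfson([1, 3, 7, 9])
fw_H = foster_wolfson([2, 4, 6, 8])
print(fw_F, fw_H)
```

For the sample {1, 3, 7, 9}, evaluating the double integral in the first line of (2.6) by hand with the step quantile function gives the same value as the closed form, and the index ranks the more spread-out sample as more polarized.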

In parallel with this research development, different types of bipolarization indices have emerged, mainly stimulated by the empirical study of Zhang and Kanbur (2001). They measured intertemporal changes in inequality and polarization in the regional per capita consumption distribution in China by using inequality and polarization indices; however, they failed to find clear differences between the two types of indices. They proposed PO-inconsistent alternative indices (such as an index in an example in Subsection 3.4), aiming to make the polarization measurement more distinct from the inequality measurement. Several other researchers such as Rodriguez and Salas (2003), Silber et al. (2007) and Palacios-González and Garcia-Fernández (2011) have also proposed PO-inconsistent indices. As this ‘distinction' problem has not yet been clarified theoretically in the literature, it would be useful to discuss the problem at the level of the respective underlying stochastic orders. The star-shaped order is an important quasi-order for this discussion.

2.3 Star-shaped order and super-sub relations among the Lorenz order, polarization order and other quasi-orders

Marshall et al. (1967) first introduced the following star-shaped order in the fields of majorization:

$$ F{\succcurlyeq}_{\ast }H\Longleftrightarrow \frac{F^{-1}(t)}{H^{-1}(t)}\ \mathrm{is}\ \mathrm{monotonically}\ \mathrm{increasing},0<t<1 $$

The name comes from \( {F}^{-1}\left(H(x)\right) \) being star-shaped in x from above at 0; i.e., \( {F}^{-1}\left(H(x)\right)/x \) is monotonically increasing if \( F{\succcurlyeq}_{\ast }H \). The star-shaped order is equivalent to

$$ \frac{F^{-1}(t)}{F^{-1}(s)}\ge \frac{H^{-1}(t)}{H^{-1}(s)},0<s<t<1. $$
(2.7)

(2.7) implies that the distance between any pair of quantiles of F measured by the absolute difference of the log values is no shorter than that of H. The SO has attracted attention in relation to an argument regarding the concepts of inequality and relative deprivation. Several researchers, e.g., Amiel and Cowell (1992) and Kolm (1999), among others, have criticized the principle of progressive transfer for being not widely acceptable as a requirement for inequality measures. Several researchers such as Preston (1990), Moyes (1994, 1999) and Chakravarty and Moyes (2003) have considered the SO as a suitable inequality criterion because of property (2.7). It may be notable that the variance of the logarithm VL1 or VL2, which is known to be inconsistent with the LO, is consistent with the SO.

$$ {\displaystyle \begin{array}{c}{VL}_1(F)=\int {\left(\log x-\overline{\log x}\right)}^2 dF(x)=\frac{1}{2}\iint {\left(\log x-\log y\right)}^2 dF(x) dF(y)=\frac{1}{2}\iint {\left(\log {F}^{-1}(t)-\log {F}^{-1}(s)\right)}^2 dtds,\\ {}{VL}_2(F)=\int {\left(\log x-\log {m}_F\right)}^2 dF(x)={VL}_1(F)+{\left[\int \log \left({m}_F/x\right) dF(x)\right]}^2,\end{array}} $$

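where \( \overline{\log x}=\int \log xdF(x) \). The 2nd term on the rightmost side in the formula for \( {VL}_2(F) \) is equal to the square of the 2nd Theil index (the mean logarithmic deviation), \( \int \log \left({m}_F/x\right) dF(x) \).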
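For two samples of equal size, the quantile functions are step functions on the same grid, so the star-shaped order reduces to the ratio of order statistics being nondecreasing in rank. The following sketch, with hypothetical function names and data, checks this condition and compares the variance of logarithms \( {VL}_1 \) for a pair constructed so that the order holds.

```python
import math

def star_dominates(f, h, tol=1e-12):
    """F star-dominates H if x_(i) / y_(i) is nondecreasing in the rank i."""
    xs, ys = sorted(f), sorted(h)
    if len(xs) != len(ys):
        raise ValueError("this sketch compares samples of equal size only")
    ratios = [x / y for x, y in zip(xs, ys)]
    return all(b >= a - tol for a, b in zip(ratios, ratios[1:]))

def var_log(xs):
    """VL_1: variance of the logarithms of the sample values."""
    logs = [math.log(x) for x in xs]
    mean = sum(logs) / len(logs)
    return sum((v - mean) ** 2 for v in logs) / len(logs)

H = [2.0, 4.0, 6.0, 8.0]
F = [y ** 1.5 for y in H]       # log-quantile gaps of F are 1.5 times those of H
print(star_dominates(F, H), var_log(F), var_log(H))
```

Because the example stretches every log-quantile gap by the same factor, the ratio in (2.7) is satisfied for every pair of ranks and the variance of logarithms rises accordingly.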

I introduce one more quasi-order. Belzunce et al. (2012) proposed a stochastic order based on the expected proportional shortfall (PSO), as follows:

$$ F{\succcurlyeq}_{PS}H\Longleftrightarrow {\int}_t^1\frac{F^{-1}\left(\tau \right)-{F}^{-1}(t)}{F^{-1}(t)} d\tau \ge {\int}_t^1\frac{H^{-1}\left(\tau \right)-{H}^{-1}(t)}{H^{-1}(t)} d\tau, 0<t<1. $$

The PSO and SO can be regarded as quasi-orders for the relative deprivation in distributions by defining the distance between the τ- and t-quantiles, t < τ, as (F−1(τ) − F−1(t))/F−1(t). The relative deprivation profile and relative deprivation orders of Chakravarty and Moyes (2003) coincide with the PSO and SO, respectively (if the domain of the stochastic orders is restricted to all finite-element discrete distributions). Belzunce et al. (2012) found that

$$ F{\succcurlyeq}_{\ast }H\Rightarrow F{\succcurlyeq}_{PS}H\Rightarrow F{\succcurlyeq}_LH; $$
(2.8)

hence, the PSO is in an intermediate position between the SO and LO.

Similar relations hold among the SO, IS and PO, as in the following theorem:

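  • Theorem 1. For a pair of distributions F and H, \( F{\succcurlyeq}_{\ast }H\Rightarrow F{\succcurlyeq}_{IS,t}H\Rightarrow F{\succcurlyeq}_{P,t}H \), 0 < t < 1. \( F{\succcurlyeq}_{\ast }H\iff F{\succcurlyeq}_{K,t}H \) for all 0 < t < 1, where K = IS or P. (The proof is given in the Appendix.)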

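From Theorem 1 and (2.8), both \( F{\succcurlyeq}_LH \) and \( F{\succcurlyeq}_{P,t}H \) hold simultaneously if \( F{\succcurlyeq}_{\ast }H \) (see Figure 1). However, the PO is inconsistent with the LO because both \( F{\preccurlyeq}_LH \) and \( F{\succcurlyeq}_{P,t}H \) hold if \( F{\succcurlyeq}_{IB,t}H \), \( {m}_F={m}_H \), \( {\xi}_t={\eta}_t \), and F ≠ H. Then, when do both \( F{\preccurlyeq}_{P,t}H \) and \( F{\succcurlyeq}_LH \) hold at the same time in general? Regarding this problem, the following theorem holds: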

  • Theorem 2. If \( F{\succcurlyeq}_LH \), then \( F{\preccurlyeq}_{P,t}H \) holds if and only if \( {L}_F(t)={L}_H(t) \) and \( {F}_m^{-1}(t)/{m}_F={H}_m^{-1}(t)/{m}_H \); i.e., the total income shares of the lower and higher classes and the ratios of the base quantile to the mean must be identical between F and H. (Note that \( {F}_m^{-1}(t)/{m}_F={L}_F^{\prime }(t) \) when \( {L}_F \) is differentiable at t.) (The proof is given in the Appendix.)

Fig. 1. Super-sub relations among the stochastic orders

The condition of the strict fulfillment of the two equations (eqs.) in Theorem 2 implies that \( F{\succcurlyeq}_LH \) and \( F{\preccurlyeq}_{P,t}H \) are essentially unable to hold at the same time in empirical studies. Hence, Theorems 1 and 2 do not conflict with the empirical study of Zhang and Kanbur (2001). If the polarization measurement should be distinct from the inequality measurement as Zhang and Kanbur (2001) asserted, a question arises: which PO-inconsistent polarization measures are appropriate? We obtain a suggestion for the question in the next section.
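The role of the two equations in Theorem 2 can be seen from a short calculation, given here only as a sketch (the formal proof is the one in the Appendix; the shorthand \( {a}_F={m}_F/{F}_m^{-1}(t) \) and \( {a}_H={m}_H/{H}_m^{-1}(t) \) is introduced for this illustration). Since \( {\int}_s^t{F}^{-1}\left(\tau \right) d\tau ={m}_F\left[{L}_F(t)-{L}_F(s)\right] \), the reverse polarization order, i.e., (2.4) with the roles of F and H exchanged, reads

$$ {a}_F\left[{L}_F(t)-{L}_F(s)\right]\ge {a}_H\left[{L}_H(t)-{L}_H(s)\right]\ \mathrm{for}\ s<t,\kern1em {a}_F\left[{L}_F(s)-{L}_F(t)\right]\le {a}_H\left[{L}_H(s)-{L}_H(t)\right]\ \mathrm{for}\ s>t. $$

Letting s → 0 in the first inequality and s → 1 in the second gives

$$ {a}_F{L}_F(t)\ge {a}_H{L}_H(t),\kern1em {a}_F\left[1-{L}_F(t)\right]\le {a}_H\left[1-{L}_H(t)\right]. $$

If F is not less unequal than H in the Lorenz sense, then \( 0<{L}_F(t)\le {L}_H(t)<1 \), so the first inequality forces \( {a}_F\ge {a}_H \) and the second forces \( {a}_F\le {a}_H \). Hence \( {a}_F={a}_H \), and then \( {L}_F(t)={L}_H(t) \). Conversely, when these two equalities hold, both lines of the first display reduce to \( {L}_F(s)\le {L}_H(s) \), which is the Lorenz order itself.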

3 Star-shaped, Lorenz, and polarization orderings of parametric models for income distributions

3.1 Star-shaped and Lorenz orderings of the standard parametric models

Modeling the size distributions of income is a major topic in research on income distributions. Many parametric models have been proposed to represent the size distributions of income. The properties of the models regarding the abovementioned quasi-orders are useful for evaluating whether the models have sufficient flexibility. For two-parameter classical models—the families of Pareto, lognormal, gamma, Lomax and Fisk distributions—it is known that the Lorenz curves do not intersect with each other within each of those families; i.e., the LO always holds within each family, although the Lorenz curves of empirical income distributions frequently intersect with each other. Three-parameter models—the families of Singh-Maddala, Dagum, generalized gamma distributions, and type II beta generalized distributions—are well fitted to some empirical income distributions. Within each of those families, the SO is shown to be equivalent to the LO (Wilfling and Kramer, 1993; Kleiber, 1996; Taillie, 1981; Belzunce et al., 2013). Can the SO be regarded as equivalent to the LO among real income distributions? To address this question, consider a four-parameter model. The generalized beta distribution of the 2nd kind (McDonald, 1984) often gives better fits than the existing three-parameter models. The GB2 distribution GB2(a, b, p, q) with parameters a, b, p, q > 0 has the following probability density function (p.d.f.):

$$ {f}_{GB2}\left(x;a,b,p,q\right)=\frac{a{x}^{ap-1}}{b^{ap}B\left(p,q\right){\left[1+{\left(x/b\right)}^a\right]}^{p+q}},x>0, $$

where b is the scale parameter and a, p, q are the shape parameters. Because \( {f}_{GB2}(x)\sim {x}^{ap-1} \) as x → 0 and \( {f}_{GB2}(x)\sim {x}^{- aq-1} \) as x → ∞, \( {f}_{GB2} \) has left and right power-law tails with exponents ap − 1 and −aq − 1, respectively. GB2 encompasses the abovementioned two- or three-parameter models as special cases or limiting cases. Belzunce et al. (2013) obtained a sufficient condition (3.1) and a necessary condition (3.2) for the SO between \( {F}_1\sim GB2\left({a}_1,{b}_1,{p}_1,{q}_1\right) \) and \( {F}_2\sim GB2\left({a}_2,{b}_2,{p}_2,{q}_2\right) \).

$$ {a}_1\le {a}_2,{a}_1{p}_1\le {a}_2{p}_2\ \mathrm{and}\ {a}_1{q}_1\le {a}_2{q}_2\Longrightarrow {F}_1{\succcurlyeq}_{\ast }{F}_2 $$
(3.1)
$$ {F}_1{\succcurlyeq}_{\ast }{F}_2\Longrightarrow {a}_1{p}_1\le {a}_2{p}_2\ \mathrm{and}\ {a}_1{q}_1\le {a}_2{q}_2 $$
(3.2)

The sufficient condition and necessary condition for the LO derived by Kleiber (1999) coincide with (3.1) and (3.2). However, in contrast to the equivalence between the SO and LO within the three-parameter families, the SO is not equivalent to the LO within the GB2 family if (3.1) does not hold. Due to sampling variation and duplication of values in survey data, a commonly observed phenomenon in which different people report exactly the same amount of income due to rounding or other reasons, it is difficult to directly examine the SO in empirical income distributions obtained from sample survey data. Here, I make use of GB2 distributions fitted by Bandourian et al. (2002) to the LIS data to investigate whether the SO and LO hold for each pair of GB2 distributions fitted to empirical distributions. Bandourian et al. (2002) fitted various parametric models to empirical income distributions of 23 countries in 1967-1997, 82 distributions in total. The pair-comparison results are listed in Table 1. Among all the pairs of fitted GB2 distributions (3,321 pairs), 496 pairs satisfy condition (3.1), and the LO also holds for 1,145 pairs that fail to satisfy (3.1). The SO does not hold for 473 of the 1,145 pairs; hence, the SO does not hold for 29% of pairs for which the LO holds. The GB2 model attains the best fit to 34 of the 82 distributions in terms of the likelihood ratio test among the tested models. The comparison results for all pairs of the 34 distributions are listed in the bottom row in Table 1. In the limited pair comparisons, the SO does not hold for 47 pairs (20%) among the 234 pairs for which the LO holds. The results indicate that cases in which \( F{\succcurlyeq}_{\ast }H \) fails to hold, i.e., \( F{\not\succcurlyeq}_{\ast }H \), occur at a non-negligible rate among the cases of \( F{\succcurlyeq}_LH \), implying that income distribution models should be sufficiently flexible to be capable of reproducing such cases.

Table 1. LO and SO between pairs of GB2 distributions fitted to the LIS income data
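Conditions (3.1) and (3.2) leave a gap between what is sufficient and what is necessary, and the pair counts quoted above can be checked by simple arithmetic. The sketch below is illustrative only: the function name `gb2_star_order_status`, the three status labels and the parameter tuples are hypothetical and are not taken from the fitted distributions.

```python
from math import comb

def gb2_star_order_status(params1, params2):
    """Classify the star-shaped order of GB2 tuples (a, b, p, q): F1 over F2."""
    a1, _, p1, q1 = params1
    a2, _, p2, q2 = params2
    necessary = a1 * p1 <= a2 * p2 and a1 * q1 <= a2 * q2   # condition (3.2)
    if necessary and a1 <= a2:                              # condition (3.1)
        return "holds"
    # (3.2) satisfied but (3.1) not: the conditions alone do not decide
    return "undetermined" if necessary else "fails"

print(gb2_star_order_status((2.0, 1.0, 1.0, 1.0), (3.0, 1.0, 1.0, 1.0)))
print(gb2_star_order_status((3.0, 1.0, 0.5, 0.5), (2.0, 1.0, 1.0, 1.0)))
print(gb2_star_order_status((3.0, 1.0, 1.0, 1.0), (2.0, 1.0, 1.0, 1.0)))

# Arithmetic behind the pair counts reported in the text
n_pairs = comb(82, 2)                    # all pairs of the 82 fitted models
share_all = 473 / (496 + 1145)           # SO fails among pairs with the LO
share_best = 47 / 234                    # same share for the 34 best-fit models
print(n_pairs, round(share_all, 3), round(share_best, 3))
```

The "undetermined" label marks exactly the region in which the SO and the LO can differ within the GB2 family, which is the region the pairwise comparison in the text explores numerically.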

3.2 The double-Pareto lognormal distribution

Reed (2003) and Reed and Jorgensen (2004) first introduced the double-Pareto lognormal distribution. Since then, the dPLN model has been fitted to income distributions (Reed and Wu, 2008; Okamoto, 2012; Toda, 2012), consumption distributions (Hajargasht and Griffiths, 2013; Toda, 2017) and city size distributions (Giesen et al., 2010). From those applications, the dPLN model has become known as another four-parameter distribution that can be well fitted to income distributions and other size distributions. It would be meaningful to study the conditions for the SO, LO and PO of dPLN distributions to investigate whether the conditions resemble or differ from those of GB2 distributions. Furthermore, the derived conditions can serve as a basis for investigating the properties of inequality and polarization indices by utilizing the dPLN's analytical tractability, which is an advantage over GB2, as explained below.

A dPLN distribution dPLN(μ, σ, α, β) is generated from a combination of multiplication and division of three mutually independent random variables (r.vs.) following a Pareto or lognormal distribution:

$$ {Z}_1\bullet {Z}_2^{-1}\bullet {Z}_3\sim dPLN\left(\mu, \sigma, \alpha, \beta \right), $$
(3.3)

where Z1~Pareto(α), Z2~Pareto(β), Z3~LN(μ, σ). dPLN(μ, σ, α, β) has the following p.d.f.: for σ, α, β > 0,

$$ {f}_{dPLN}\left(x;\mu, \sigma, \alpha, \beta \right)=\frac{\alpha \beta}{\alpha +\beta}\left[{x}^{\beta -1}{e}^{-\beta \mu +{\beta}^2{\sigma}^2/2}{\Phi}^c\left(\frac{\log x-\mu +\beta {\sigma}^2}{\sigma}\right)+{x}^{-\alpha -1}{e}^{\alpha \mu +{\alpha}^2{\sigma}^2/2}\Phi \left(\frac{\log x-\mu -\alpha {\sigma}^2}{\sigma}\right)\right], $$

where Φ denotes the c.d.f. of the standard normal distribution and \( {\Phi}^c \) denotes the complement of Φ; i.e., \( {\Phi}^c=1-\Phi \). The c.d.f. of the dPLN is given in the Appendix. As the r.v. X~dPLN(μ, σ, α, β) is equivalent to \( X={e}^{\mu }Y \), Y~dPLN(0, σ, α, β), μ is the log-scale parameter. The other three parameters are the shape parameters. Because \( {f}_{dPLN}(x)\sim {x}^{\beta -1} \) as x → 0 and \( {f}_{dPLN}(x)\sim {x}^{-\alpha -1} \) as x → ∞, \( {f}_{dPLN} \) has left and right power-law tails with exponents β − 1 and −α − 1, respectively. For β < 1, \( {f}_{dPLN}(x)\to \infty \) as x → 0; and for β = 1, \( {f}_{dPLN}(x)\nearrow \frac{\alpha \beta}{\alpha +\beta }{x}^{\beta -1}{e}^{-\beta \mu +{\beta}^2{\sigma}^2/2} \) as x → 0. Hence, \( {f}_{dPLN} \) does not have a peak in either case. In contrast, \( {f}_{dPLN} \) is unimodal for β > 1. When α, β → ∞, dPLN(μ, σ, α, β) approaches LN(μ, σ). As shown in Corollary 4 below, a dPLN distribution becomes more dispersed with either a decrease of α or β or an increase of σ; hence, a decrease of α or β brings an inequality increase with a heavier right or left tail, respectively, and an increase in σ brings an inequality increase with a more dispersed central part of the distribution without changes in the heaviness of the tails in power exponent order. Analytical expressions of the mean, higher-order moments, Lorenz curve, Gini index, and generalized entropy (GE) indices for the dPLN are given in the Appendix.
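The density above and the generating mechanism (3.3) can be checked against each other numerically. The sketch below makes one assumption that the text does not state explicitly: Pareto(α) is taken to be the standard Pareto distribution on [1, ∞) with shape α, for which independence of the three factors gives the mean \( \frac{\alpha \beta}{\left(\alpha -1\right)\left(\beta +1\right)}{e}^{\mu +{\sigma}^2/2} \) when α > 1. The function names, the integration grid and the parameter values are hypothetical.

```python
import math
import random

def _phi(z):
    """Standard normal c.d.f."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def _phi_c(z):
    """Complement of the standard normal c.d.f."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def dpln_pdf(x, mu, sigma, alpha, beta):
    """Density of dPLN(mu, sigma, alpha, beta) as displayed in the text."""
    lx = math.log(x)
    left = (math.exp((beta - 1) * lx - beta * mu + 0.5 * (beta * sigma) ** 2)
            * _phi_c((lx - mu + beta * sigma ** 2) / sigma))
    right = (math.exp(-(alpha + 1) * lx + alpha * mu + 0.5 * (alpha * sigma) ** 2)
             * _phi((lx - mu - alpha * sigma ** 2) / sigma))
    return alpha * beta / (alpha + beta) * (left + right)

def dpln_sample(mu, sigma, alpha, beta, rng=random):
    """One draw via (3.3): Z1 / Z2 * Z3 with standard Pareto factors (assumed)."""
    z1 = rng.paretovariate(alpha)
    z2 = rng.paretovariate(beta)
    z3 = rng.lognormvariate(mu, sigma)
    return z1 * z3 / z2

def integrate_log_grid(g, lo=-15.0, hi=15.0, n=6000):
    """Trapezoid rule for the integral of g(x) dx with x = exp(u), u in [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        u = lo + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * g(math.exp(u)) * math.exp(u)
    return total * h

mu, sigma, alpha, beta = 0.0, 0.5, 3.5, 3.5
mass = integrate_log_grid(lambda x: dpln_pdf(x, mu, sigma, alpha, beta))
mean_numeric = integrate_log_grid(lambda x: x * dpln_pdf(x, mu, sigma, alpha, beta))
mean_closed = (alpha * beta / ((alpha - 1) * (beta + 1))
               * math.exp(mu + 0.5 * sigma ** 2))
print(mass, mean_numeric, mean_closed)
```

Integrating on a logarithmic grid keeps both power-law tails well resolved; agreement between the numerical and closed-form means is a quick consistency check on the density and on the assumed Pareto convention.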

Regarding the goodness-of-fit of the dPLN in comparison with GB2, the argument is inconclusive at present. Reed and Wu (2008) and Okamoto (2012) empirically showed that the dPLN tends to be better fitted to income distributions than GB2, whereas Hajargasht and Griffiths (2013) were unable to make clear judgments on the goodness-of-fit to consumption distributions in 10 developing countries and regions. However, the dPLN has several notable advantages. Toda (2017) pointed out the analytical tractability of the dPLN. This advantage partly comes from the abovementioned clear correspondence of the shape parameters to parts of the distribution, which are mainly affected by changes in the respective parameter values. By utilizing this clear correspondence, it is possible to investigate the sensitivity of inequality and polarization indices to distributional changes, as illustrated in an example given in the last subsection. A theoretically important advantage of the dPLN is the existence of stochastic processes generating the distribution (Reed, 2003; Toda, 2012). There are also some advantages regarding the Gini index, as mentioned in the Appendix.

3.3 Star-shaped, Lorenz and polarization orderings of the double-Pareto lognormal distribution and the polarization ordering of other parametric models

The conditions for the existence of the mean of a dPLN distribution are α > 1 and β > 0, as shown in the Appendix. I assume that these conditions are satisfied hereafter. Theorem 3 gives a sufficient condition for the SO within the dPLN family in a different way from the proof of Belzunce et al. (2013) for the corresponding condition (3.1) within the GB2 family.

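  • Theorem 3. \( {\sigma}_1\ge {\sigma}_2 \), \( {\alpha}_1\le {\alpha}_2 \), and \( {\beta}_1\le {\beta}_2 \) imply \( dPLN\left({\mu}_1,{\sigma}_1,{\alpha}_1,{\beta}_1\right){\succcurlyeq}_{\ast } dPLN\left({\mu}_2,{\sigma}_2,{\alpha}_2,{\beta}_2\right) \). (The proof is given in the Appendix.)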

As \( F{\succcurlyeq}_{\ast }G \) implies \( F{\succcurlyeq}_LG \) from Taillie (1981), the following corollary is derived:

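  • Corollary 4. \( {\sigma}_1\ge {\sigma}_2 \), \( {\alpha}_1\le {\alpha}_2 \), and \( {\beta}_1\le {\beta}_2 \) imply \( dPLN\left({\mu}_1,{\sigma}_1,{\alpha}_1,{\beta}_1\right){\succcurlyeq}_L dPLN\left({\mu}_2,{\sigma}_2,{\alpha}_2,{\beta}_2\right) \).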

Corollary 4 can be derived directly from Whitt's lemma (1980) because the dPLN is generated by a combination of multiplication and division of three independent r.vs. following Pareto(α), Pareto(β) or LN(μ, σ) as in (3.3). Although the converses of Theorem 3 and Corollary 4 are false, the following theorem holds:

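  • Theorem 5. For \( {F}_1\sim dPLN\left({\mu}_1,{\sigma}_1,{\alpha}_1,{\beta}_1\right) \) and \( {F}_2\sim dPLN\left({\mu}_2,{\sigma}_2,{\alpha}_2,{\beta}_2\right) \), either \( {F}_1{\succcurlyeq}_{\ast }{F}_2 \) or \( {F}_1{\succcurlyeq}_L{F}_2 \) implies \( {\alpha}_1\le {\alpha}_2 \) and \( {\beta}_1\le {\beta}_2 \). (The proof is given in the Appendix.)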

From Theorems 1 and 3, Corollary 6 is obvious. Similarly, from Theorem 1 and (3.1), Corollary 7 is obvious.

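  • Corollary 6. For \( {F}_1\sim dPLN\left({\mu}_1,{\sigma}_1,{\alpha}_1,{\beta}_1\right) \) and \( {F}_2\sim dPLN\left({\mu}_2,{\sigma}_2,{\alpha}_2,{\beta}_2\right) \), \( {\sigma}_1\ge {\sigma}_2 \), \( {\alpha}_1\le {\alpha}_2 \), and \( {\beta}_1\le {\beta}_2 \) imply \( {F}_1{\succcurlyeq}_{K,t}{F}_2 \) for 0 < t < 1, K = IS or P.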

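  • Corollary 7. For \( {F}_1\sim GB2\left({a}_1,{b}_1,{p}_1,{q}_1\right) \) and \( {F}_2\sim GB2\left({a}_2,{b}_2,{p}_2,{q}_2\right) \), \( {a}_1\le {a}_2 \), \( {a}_1{p}_1\le {a}_2{p}_2 \), and \( {a}_1{q}_1\le {a}_2{q}_2 \) imply \( {F}_1{\succcurlyeq}_{K,t}{F}_2 \) for 0 < t < 1 and K = IS or P.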
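The parameter conditions for the dPLN family in Theorem 3, Corollary 4, Theorem 5 and Corollary 6 can be collected into a small decision helper. This is a sketch with hypothetical names and labels: "holds" means that the sufficient condition is met (so the SO, LO, IS and PO all hold for the first distribution over the second), "fails" means that the necessary tail condition of Theorem 5 for the SO and LO is violated, and "undetermined" means that the parameter conditions alone do not decide. A "fails" verdict says nothing about the PO.

```python
def dpln_order_status(shape1, shape2):
    """Classify the SO/LO of dPLN shape tuples (sigma, alpha, beta): F1 over F2."""
    sigma1, alpha1, beta1 = shape1
    sigma2, alpha2, beta2 = shape2
    tails_ok = alpha1 <= alpha2 and beta1 <= beta2   # necessary (Theorem 5)
    if tails_ok and sigma1 >= sigma2:                # sufficient (Theorem 3)
        return "holds"
    return "undetermined" if tails_ok else "fails"

# Hypothetical parameter values for illustration
print(dpln_order_status((0.5, 3.0, 3.0), (0.3, 3.5, 3.5)))
print(dpln_order_status((0.3, 2.5, 2.0), (0.5, 3.5, 3.5)))
print(dpln_order_status((0.5, 3.5, 3.5), (0.3, 2.5, 2.0)))
```

The second call shows the interesting region: heavier tails but a smaller σ, where numerical evaluation of the Lorenz curves or quantile ratios is needed to settle the ordering.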

Similar corollaries are derived for models in which conditions for the SO exist, such as the Singh-Maddala, Dagum and generalized gamma distributions. The converse of Corollary 6 is false. \( {F}_1{\succcurlyeq}_{K,t}{F}_2 \) for all t, K = IS or P, implies \( {F}_1{\succcurlyeq}_{\ast }{F}_2 \) from Theorem 1 and hence \( {\alpha}_1\le {\alpha}_2 \) and \( {\beta}_1\le {\beta}_2 \) from Theorem 5. If \( {F}_1{\succcurlyeq}_{IS,t}{F}_2 \) for some t, then, as \( {F}_{dPLN}^{-1}\left(\tau; \mu, \sigma, \alpha, \beta \right)\sim {\tau}^{1/\beta } \), τ → 0, and \( {F}_{dPLN}^{-1}\left(\tau; \mu, \sigma, \alpha, \beta \right)\sim {\left(1-\tau \right)}^{-1/\alpha } \), τ → 1, \( {\alpha}_1\le {\alpha}_2 \) and \( {\beta}_1\le {\beta}_2 \) are shown to hold from a similar argument to that for Theorem 5. However, \( {F}_1{\succcurlyeq}_{P,t}{F}_2 \) may hold even if \( {\alpha}_1>{\alpha}_2 \) and \( {\beta}_1>{\beta}_2 \) when \( {\sigma}_1>{\sigma}_2 \), as in the following example:

Consider a case in which both \( {F}_1{\preccurlyeq}_L{F}_2 \) and \( {F}_1{\succcurlyeq}_{P,t}{F}_2 \) hold within the dPLN family. If all shape parameters of \( {F}_1 \) and one of the shape parameters of \( {F}_2 \) are fixed, the remaining two shape parameters form a set of discrete points in two-dimensional space (if it is not empty) because of the two eqs. in Theorem 2. For dPLN distributions \( {F}_1 \) with \( \left\{{\sigma}_1,{\alpha}_1,{\beta}_1\right\}=\left\{0.5,3.5,3.5\right\} \) and \( {F}_2 \) with \( {\sigma}_2=0.3 \), the two sets of \( \left\{{\alpha}_2,{\beta}_2\right\} \) for which \( {F}_1{\preccurlyeq}_L{F}_2 \) and \( {F}_1{\succcurlyeq}_{P,t}{F}_2 \) hold adjoin at one point \( \left\{{\alpha}_2,{\beta}_2\right\}\approx \left\{2.567,2.146\right\} \) (Figure 2); hence, both relations hold at the same time only at this point (if the rest of the parameters are fixed). Numerical calculations show that the simultaneous fulfilment of the LO and reverse PO at only one point also holds for other \( \left\{{\sigma}_1,{\alpha}_1,{\beta}_1\right\} \) and \( {\sigma}_2\left(<{\sigma}_1\right) \). A similar argument holds for GB2. As with GB2, the SO is not equivalent to the LO in the case that the condition in Theorem 3 does not hold. In the above example, \( {F}_1{\preccurlyeq}_L{F}_2 \) holds if both \( {\alpha}_2<2.567 \) and \( {\beta}_2<2.146 \) are satisfied; however, \( {F}_1{\preccurlyeq}_{\ast }{F}_2 \) does not hold if \( 2.16<{\alpha}_2<2.567 \) and \( 1.76<{\beta}_2<2.146 \).

Fig. 2. Shape parameters of \( {F}_2 \) satisfying \( {F}_1{\preccurlyeq}_L{F}_2 \), \( {F}_1{\succcurlyeq}_{P,0.5}{F}_2 \), and \( {F}_1{\preccurlyeq}_{P,0.5}{F}_2 \)

For the application to the sensitivity analysis of inequality and polarization indices with respect to distributional changes, I provide one more theorem readily derived from Corollaries 4 and 6.

  • Theorem 8. Assume that index I(F) for F~dPLN(μ, σ, α, β) is differentiable in the shape parameters. If I is consistent with the LO or PO, i.e., if \( F{\succcurlyeq}_LH \) or \( F{\succcurlyeq}_{P,t}H \) implies I(F) ≥ I(H), then ∂I(F)/∂σ ≥ 0, ∂I(F)/∂α ≤ 0, and ∂I(F)/∂β ≤ 0. Conversely, if \( {\left.\partial I(F)/\partial \sigma \right|}_{\left\{{\sigma}_0,{\alpha}_0,{\beta}_0\right\}}<0 \), \( {\left.\partial I(F)/\partial \alpha \right|}_{\left\{{\sigma}_0,{\alpha}_0,{\beta}_0\right\}}>0 \), or \( {\left.\partial I(F)/\partial \beta \right|}_{\left\{{\sigma}_0,{\alpha}_0,{\beta}_0\right\}}>0 \) for some \( \left\{{\sigma}_0,{\alpha}_0,{\beta}_0\right\} \), then I is inconsistent with the LO and PO.
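The sign pattern in Theorem 8 can be checked numerically for an index with a closed form. The sketch below uses the squared coefficient of variation, a standard LO-consistent relative index, as a stand-in; it is not one of the indices analysed in the paper. Its closed form is derived here from (3.3) under the assumption that Pareto(α) is the standard Pareto distribution on [1, ∞), which gives \( E\left[{X}^r\right]=\frac{\alpha \beta}{\left(\alpha -r\right)\left(\beta +r\right)}{e}^{r\mu +{r}^2{\sigma}^2/2} \) for r < α. The function names and the evaluation point are hypothetical.

```python
import math

def dpln_cv2(sigma, alpha, beta):
    """Squared coefficient of variation of a dPLN (free of mu); needs alpha > 2."""
    if alpha <= 2:
        raise ValueError("the second moment requires alpha > 2")
    m1 = alpha * beta / ((alpha - 1) * (beta + 1)) * math.exp(0.5 * sigma ** 2)
    m2 = alpha * beta / ((alpha - 2) * (beta + 2)) * math.exp(2.0 * sigma ** 2)
    return m2 / m1 ** 2 - 1.0

def partials(index, point, h=1e-6):
    """Central finite-difference partial derivatives of index at point."""
    grads = []
    for k in range(len(point)):
        up, dn = list(point), list(point)
        up[k] += h
        dn[k] -= h
        grads.append((index(*up) - index(*dn)) / (2 * h))
    return grads

d_sigma, d_alpha, d_beta = partials(dpln_cv2, (0.5, 3.5, 3.5))
print(d_sigma, d_alpha, d_beta)
```

The same finite-difference helper can be applied to any index that can be evaluated on the dPLN family; a derivative with the wrong sign at a single point is enough, by the converse part of Theorem 8, to establish inconsistency with the LO and PO.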

3.4 An example of the use of the dPLN model to analyze the sensitivity of polarization indices to distributional changes

The example below illustrates how to apply the dPLN to an analysis of the sensitivity of polarization and inequality indices to distributional changes by calculating the elasticity of the indices with respect to the shape parameters of the dPLN. Broadly speaking, the elasticity with respect to α or β indicates the sensitivity of the respective index to a change in the heaviness of the right or left tail, respectively, and the elasticity with respect to σ indicates the sensitivity to a change in the dispersion in the central part of the distribution.

$$ \frac{\partial \log I\left(F\left(x;\boldsymbol{\theta} \right)\right)}{\partial \log {\theta}_i}=\frac{\theta_i}{I\left(F\left(x;\boldsymbol{\theta} \right)\right)}\frac{\partial I\left(F\left(x;\boldsymbol{\theta} \right)\right)}{\partial {\theta}_i}, $$

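where \( \boldsymbol{\theta}=\left\{{\theta}_i\right\}=\left\{\mu, \sigma, {\left(\alpha -0.5\right)}^{-1},{\left(\beta +0.5\right)}^{-1}\right\} \) and F(x; θ)~dPLN(μ, σ, α, β). The reason for the use of \( {\left(\alpha -0.5\right)}^{-1} \) and \( {\left(\beta +0.5\right)}^{-1} \) instead of α and β is as follows: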

Assume that the Lorenz curves \( {L}_F \) and \( {L}_H \) are mutually symmetric with respect to the diagonal other than the equality diagonal, as shown in Figure 3. I call this symmetric relation L-symmetry hereafter. The L-symmetric relation between F and H is equivalent to the equations

$$ {L}_F(p)=1-q\ \mathrm{and}\ p=1-{L}_H(q). $$
(3.4)

Fig. 3 Mutually symmetric Lorenz curves

If F and H are mutually L-symmetric, then the weighted two-element discrete distributions \( {D}_F=\left\{{m}_F^L,{m}_F^U;F\left({m}_F\right)=P\left(x\le {m}_F\right),1-F\left({m}_F\right)=P\left(x>{m}_F\right)\right\} \) and \( {D}_H=\left\{{m}_H^{-L},{m}_H^{-U};P\left(y<{m}_H\right),P\left(y\ge {m}_H\right)\right\} \) are also mutually L-symmetric, where \( {m}_F^L \) and \( {m}_F^U \) are the means of F conditional on \( x\le {m}_F \) and \( x>{m}_F \), respectively, and \( {m}_H^{-L} \) and \( {m}_H^{-U} \) are the means of H conditional on \( y<{m}_H \) and \( y\ge {m}_H \), respectively. Taguchi (1968) showed that F and H are mutually L-symmetric if and only if the following equations hold under the assumption that F and H have continuous p.d.f.s f and h with means \( {m}_F \) and \( {m}_H \), respectively:

$$ \frac{m_Hh(y)}{m_Ff(x)}={\left(\frac{x}{m_F}\right)}^3\ \mathrm{and}\ y=\frac{m_F{m}_H}{x}. $$
(3.5)

For an L-symmetric pair F and H,

$$ Gini(F)= Gini(H), $$
$$ Pietra(F)=F\left({m}_F\right)-{L}_F\left(F\left({m}_F\right)\right)=P\left(y<{m}_H\right)-{L}_H\left(P\left(y<{m}_H\right)\right)=H\left({m}_H\right)-{L}_H\left(H\left({m}_H\right)\right)= Pietra(H). $$

Note that the Pietra index for F, Pietra(F), is equal to \( Gini\left({D}_F\right)={G}_{mean}^b(F) \), which represents the Gini between-class component for F divided into two classes at \( {m}_F \). The GE index family satisfies the following equation (the proof is given in the Appendix):

$$ {GE}_{\varepsilon }(F)={GE}_{1-\varepsilon }(H). $$
(3.6)

Hence, \( {GE}_{0.5} \) satisfies the same property regarding L-symmetry as the Gini and Pietra indices. F~dPLN(μ, σ, α, β) and H~dPLN(μ, σ, β + 1, α − 1) are mutually L-symmetric because eqs. (3.5) hold (see Footnote 3). Given this property of the dPLN, it is preferable that the elasticities of Gini(F), Pietra(F) and \( {GE}_{\varepsilon }(F) \) with respect to the right- and left-tail parameters equal those of Gini(H), Pietra(H) and \( {GE}_{1-\varepsilon }(H) \) with respect to the left- and right-tail parameters, respectively. It is also preferable that the elasticities of the inequality and PO-consistent polarization indices be nonnegative. For these reasons, I replace parameters α and β with \( {\left(\alpha -0.5\right)}^{-1} \) and \( {\left(\beta +0.5\right)}^{-1} \), respectively, for the calculation of elasticity: under the exchange {α, β} → {β + 1, α − 1}, the two transformed parameters are simply interchanged, since (β + 1) − 0.5 = β + 0.5 and (α − 1) + 0.5 = α − 0.5, and both increase as the corresponding tail becomes heavier.
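The L-symmetry property (3.6) and the effect of the reparameterization can be illustrated numerically. The following sketch is not the paper's computation: it assumes closed forms for GE0 and GE1 of the dPLN derived from the moment formula E[X^r] = exp(rμ + r²σ²/2)·αβ/((α − r)(β + r)) for −β < r < α, omits μ because both indices are scale invariant, and approximates derivatives by central differences. The names `ge0_dpln`, `ge1_dpln`, `l_symmetric_partner` and `elasticities` are hypothetical.

```python
import math


def ge0_dpln(sigma, alpha, beta):
    """GE_0 (second Theil index) of a dPLN; assumed closed form, alpha > 1."""
    return (0.5 * sigma ** 2 + math.log(alpha / (alpha - 1.0))
            + math.log(beta / (beta + 1.0)) - 1.0 / alpha + 1.0 / beta)


def ge1_dpln(sigma, alpha, beta):
    """GE_1 (Theil index) of a dPLN; assumed closed form, alpha > 1."""
    return (0.5 * sigma ** 2 - math.log(alpha / (alpha - 1.0))
            - math.log(beta / (beta + 1.0))
            + 1.0 / (alpha - 1.0) - 1.0 / (beta + 1.0))


def l_symmetric_partner(sigma, alpha, beta):
    """Shape parameters of H ~ dPLN(mu, sigma, beta + 1, alpha - 1)."""
    return sigma, beta + 1.0, alpha - 1.0


def elasticities(index, sigma, alpha, beta, h=1e-6):
    """Elasticities w.r.t. theta = (sigma, 1/(alpha - 0.5), 1/(beta + 0.5))."""
    theta = [sigma, 1.0 / (alpha - 0.5), 1.0 / (beta + 0.5)]

    def index_theta(t):
        # Map the transformed parameters back to (sigma, alpha, beta).
        return index(t[0], 1.0 / t[1] + 0.5, 1.0 / t[2] - 0.5)

    base = index_theta(theta)
    result = []
    for i in range(3):
        up, down = list(theta), list(theta)
        up[i] += h
        down[i] -= h
        derivative = (index_theta(up) - index_theta(down)) / (2 * h)
        result.append(theta[i] / base * derivative)
    return tuple(result)


# Eq. (3.6) with epsilon = 1 for an arbitrary (illustrative) parameter set.
F = (0.5, 3.5, 3.5)
H = l_symmetric_partner(*F)
print(ge1_dpln(*F), ge0_dpln(*H))

# Self L-symmetric case alpha = beta + 1 (sigma = 0.3, alpha = 4, beta = 3).
e_ge0 = elasticities(ge0_dpln, 0.3, 4.0, 3.0)
e_ge1 = elasticities(ge1_dpln, 0.3, 4.0, 3.0)
print(e_ge0, e_ge1)
```

Under these assumptions, the right-tail elasticity of GE1 should coincide with the left-tail elasticity of GE0 at the self L-symmetric point, and vice versa, which is the behaviour the transformed parameters are chosen to produce.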

Table 2 presents the elasticities of four bipolarization indices and four inequality indices with respect to the shape parameters of a self L-symmetric dPLN distribution with σ = 0.3, α = 4 and β = 3 (self L-symmetric because α = β + 1). Among the four bipolarization indices, the FW index (2.6) is PO-consistent. The ZK index (3.7) is a special case of the modified indices of Zhang and Kanbur (2001) (see Footnote 4). The LS index (3.8) is a new index. The latter two indices are for distributions that are divided into two classes at the mean instead of the median.

Table 2. Elasticity of polarization and inequality indices with respect to the shape parameters of dPLN(μ, 0.3, 4, 3)

$$ {ZK}_{\varepsilon }(F)={GE}_{\varepsilon}^b(F)/{GE}_{\varepsilon }(F)={GE}_{\varepsilon}\left({D}_F\right)/{GE}_{\varepsilon }(F), $$
(3.7)

where \( {GE}_{\varepsilon}^b(F) \) denotes the GE between-class component.

$$ LS(F)=2{G}_{mean}^b(F)/ Gini(F)-1=2 Pietra(F)/ Gini(F)-1. $$
(3.8)

The new index LS represents the slimness, or inverse of the kurtosis, of the Lorenz curve because Pietra(F) equals the maximum distance between the equality diagonal and the Lorenz curve, and Gini(F) equals twice the average distance between them. The LS index is also a variant of the modified Zhang-Kanbur-type indices, defined as the ratio of the between-class inequality to the overall inequality, although it is transformed to normalize the index to the 0 to 1 range. The SDH bipolarization index, which was proposed by Silber et al. (2007) based on a kurtosis measure of the Lorenz curve, can be regarded as corresponding to the LS for distributions that are divided at the median, i.e., with \( {G}_{med}^b(F) \) instead of \( {G}_{mean}^b(F) \) in (3.8). The LS and \( {ZK}_{0.5} \) indices inherit the property regarding L-symmetry from the Gini, Pietra and \( {GE}_{0.5} \) indices. Among the four inequality indices, T1 = GE1 and T2 = GE0 represent the Theil and 2nd Theil indices, respectively. All the indices can be calculated using the analytical expressions of the c.d.f., moments, Lorenz curve, and Gini and GE indices for the dPLN in the Appendix.
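For intuition, the LS index (3.8) can also be evaluated on a finite sample. The sketch below is an illustration under stated assumptions rather than the procedure used for Table 2, which relies on the analytical dPLN expressions: it treats the sample as a discrete distribution with equal weights, uses the population (not bias-corrected) Gini, and computes Pietra as half the relative mean absolute deviation. The function names are hypothetical.

```python
def gini(xs):
    """Population Gini: mean absolute difference divided by twice the mean."""
    n = len(xs)
    mean = sum(xs) / n
    total = sum(abs(a - b) for a in xs for b in xs)
    return total / (2.0 * n * n * mean)


def pietra(xs):
    """Pietra index: half the relative mean absolute deviation from the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum(abs(x - mean) for x in xs) / (2.0 * n * mean)


def between_class_gini_at_mean(xs):
    """Gini of the two-point distribution D_F (classes x <= mean and x > mean)."""
    n = len(xs)
    mean = sum(xs) / n
    lower = [x for x in xs if x <= mean]
    upper = [x for x in xs if x > mean]
    if not upper:  # all values equal: no between-class inequality
        return 0.0
    p_lower, p_upper = len(lower) / n, len(upper) / n
    m_lower, m_upper = sum(lower) / len(lower), sum(upper) / len(upper)
    return p_lower * p_upper * (m_upper - m_lower) / mean


def ls_index(xs):
    """LS = 2 * Pietra / Gini - 1, as in eq. (3.8); undefined if Gini is zero."""
    g = gini(xs)
    if g == 0:
        raise ValueError("LS is undefined for a perfectly equal distribution")
    return 2.0 * pietra(xs) / g - 1.0


sample = [1.0, 2.0, 3.0, 10.0]  # illustrative data only
print(pietra(sample), between_class_gini_at_mean(sample), ls_index(sample))
```

In this sketch the between-class Gini at the mean coincides with the Pietra index, as noted in the text, and a two-point sample such as [1, 3] gives LS equal to 1 because its Gini and Pietra indices are identical.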

As the elasticities of \( {ZK}_{0.5} \) and LS with respect to the tail parameters and that of SDH with respect to the right-tail parameter are negative, these three indices are PO-inconsistent by Theorem 8 (a negative elasticity with respect to \( {\left(\alpha -0.5\right)}^{-1} \) or \( {\left(\beta +0.5\right)}^{-1} \) corresponds to a positive partial derivative with respect to α or β). Several possible features of the polarization indices emerge from Table 2. First, the sensitivity of FW to the left-tail parameter is higher than that to the right-tail parameter and even higher than that of T2 in relative terms. Second, the sensitivity of the polarization indices (except SDH) to the central-dispersion parameter relative to the absolute sensitivity to the tail parameters is higher than those of the inequality indices. In particular, that of the LS is substantially higher than those of all other indices. Third, the SDH is intuitively inappropriate because the signs of its elasticities with respect to the left- and right-tail parameters tend to differ. These results suggest that the LS may be an appropriate index to measure a rise or fall in the middle class if we can assume that the rise or fall closely relates to the decrease or increase of the dispersion in the central part of the distribution, although extensive and comprehensive discussions encompassing other properties of the LS are necessary to reach a conclusion.

4 Concluding remarks

Several points for the discussion of the concept and measurement methodology of bipolarization emerge from the findings of this paper. Should polarization indices be inequality measures consistent with a weaker inequality criterion based on the SO? All bipolarization indices including PO-inconsistent indices are consistent with IB. However, should polarization indices be consistent with IS? In particular, should polarization increase when an element on either end of a distribution moves far from the center of the distribution? Such movement of an element on either end increases inequality in terms of the SO. However, the increase in polarization due to this move does not necessarily appear intuitive. Should polarization indices be created based on the kurtosis of the Lorenz curve rather than based on the (relative) spread from a given center point? Should we divide the distributions at the medians or means for the bipolarization measurement? The choice of the center or division point relates to the definition of the middle class, i.e., whether the middle class occupies rank positions around the median or the mean.

The discussion should proceed by taking into account real distributions. A well-fitted and analytically tractable dPLN model may play a certain role as an alternative to real distributions for the discussion.