Exponential structure of income inequality: evidence from 67 countries


Economic competition between humans leads to income inequality, but, so far, there has been little understanding of underlying quantitative mechanisms governing such a collective behavior. We analyze datasets of household income from 67 countries, ranging from Europe to Latin America, North America and Asia. For all of the countries, we find a surprisingly uniform rule: income distribution for the great majority of populations (low and middle income classes) follows an exponential law. To explain this empirical observation, we propose a theoretical model within the standard framework of modern economics and show that free competition and Rawls’ fairness are the underlying mechanisms producing the exponential pattern. The free parameters of the exponential distribution in our model have an explicit economic interpretation and direct relevance to policy measures intended to alleviate income inequality.


  1. The emergence of income inequality can be traced back to the pioneering work of Angle (1986, 1992, 1993, 1996, and 2006).

  2. In neoclassical economics, monopolistic power implies that firm behavior is highly heterogeneous. Interestingly, Lux and Marchesi (1999) also showed that heterogeneous behavior among economic agents may lead to a power law in financial markets.

  3. Here we have considered \({\textit{lim}}_{i \rightarrow \infty } \, y_i \ne 0\).

  4. When labor L and capital K substitute with each other, we have \({\textit{MRTS}}_{{\textit{LK}}} > 0\).

  5. Full sample means \(\{x_1, \ldots , x_n\}\), where n denotes the sample size.


  • Acemoglu D, Robinson J (2009) Foundation of societal inequality. Science 326(5953):678–679

  • Angle J (1986) The surplus theory of social stratification and the size distribution of personal wealth. Soc Forces 65:293–326

  • Angle J (1992) The inequality process and the distribution of income to blacks and whites. J Math Sociol 17:77–98

  • Angle J (1993) Deriving the size distribution of personal wealth from “the rich get richer, the poor get poorer”. J Math Sociol 18:27–46

  • Angle J (1996) How the gamma law of income distribution appears invariant under aggregation. J Math Sociol 31:325–358

  • Angle J (2006) The inequality process as a wealth maximizing process. Physica A 367:388–414

  • Arrow KJ (1963) Social choice and individual values. Wiley, New York

  • Arrow KJ, Debreu G (1954) Existence of an equilibrium for a competitive economy. Econometrica 22(3):265–290

  • Atkinson AB, Piketty T, Saez E (2011) Top incomes in the long run of history. J Econ Lit 49(1):3–71

  • Autor DH (2014) Skills, education, and the rise of earnings inequality among the “other 99 percent”. Science 344(6186):843–851

  • Autor DH, Katz LF, Kearney MS (2008) Trends in U.S. wage inequality: revising the revisionists. Rev Econ Stat 90(2):300–323

  • Axtell RL (2001) Zipf distribution of U.S. firm sizes. Science 293(5536):1818–1820

  • Banerjee A, Yakovenko VM (2010) Universal patterns of inequality. New J Phys 12:075032

  • Banerjee A, Yakovenko VM, Di Matteo T (2006) A study of the personal income distribution in Australia. Physica A 370(1):54–59

  • Chakrabarti AS, Chakrabarti BK (2009) Microeconomics of the ideal gas like market models. Physica A 388(19):4151–4158

  • Chakrabarti BK, Chakraborti A, Chakravarty SR, Chatterjee A (2013) Econophysics of income and wealth distributions. Cambridge University Press, Cambridge

  • Cho A (2014) Physicists say it’s simple. Science 344(6186):828–828

  • Clementi F, Gallegati M, Kaniadakis G (2010) A model of personal income distribution with application to Italian data. Empir Econ 39:559–591

  • Clementi F, Gallegati M, Kaniadakis G (2012) A new model of income distribution: the \(\kappa \)-generalized distribution. J Econ 105:63–91

  • Derzsy N, Néda Z, Santos MA (2012) Income distribution patterns from a complete social security database. Physica A 391(22):5611–5619

  • Dopfer K (2004) The economic agent as rule maker and rule user: Homo Sapiens Oeconomicus. J Evol Econ 14:177–195

  • Dragulescu A, Yakovenko VM (2000) Statistical mechanics of money. Eur Phys J B 17(4):723–729

  • Dragulescu A, Yakovenko VM (2001a) Evidence for the exponential distribution of income in the USA. Eur Phys J B 20(4):585–589

  • Dragulescu A, Yakovenko VM (2001b) Exponential and power-law probability distributions of wealth and income in the United Kingdom and the United States. Physica A 299(1–2):213–221

  • Foley DK (1994) A statistical equilibrium theory of markets. J Econ Theory 62(2):321–345

  • Foster J, Metcalfe JS (2012) Economic emergence: an evolutionary economic perspective. J Econ Behav Organ 82(2–3):420–432

  • Golosov M, Maziero P, Menzio G (2013) Taxation and redistribution of residual income inequality. J Polit Econ 121(6):1160–1204

  • Harte J, Zillio T, Conlisk E, Smith AB (2008) Maximum entropy and the state-variable approach to macroecology. Ecology 89(10):2700–2711

  • Heathcote J, Storesletten K, Violante GL (2010) The macroeconomic implications of rising wage inequality in the United States. J Polit Econ 118(4):681–722

  • Hodgson GM (2004) The evolution of institutional economics: agency, structure and Darwinism in American Institutionalism. Routledge, London

  • Jagielski M, Kutner R (2013) Modelling of income distribution in the European Union with the Fokker–Planck equation. Physica A 392(9):2130–2138

  • Jones CI (2015) Pareto and Piketty: the macroeconomics of top income and wealth inequality. J Econ Perspect 29(1):29–46

  • Kakwani N (1980) Income inequality and poverty. Oxford University Press, Oxford

  • Katz L, Autor D (1999) Changes in the wage structure and earnings inequality. In: Ashenfelter O, Card D (eds) Handbook of labor economics, vol 3A. North-Holland, Amsterdam

  • Kuznets S (1955) Economic growth and income inequality. Am Econ Rev 45(1):1–28

  • Lai TL, Robbins H, Wei CZ (1979) Strong consistency of least squares estimates in multiple regression II. J Multivar Anal 9(3):343–361

  • Lambert PJ (1993) The distribution and redistribution of income: a mathematical analysis, 2nd edn. Manchester University Press, Manchester

  • Lux T, Marchesi M (1999) Scaling and criticality in a stochastic multi-agent model of a financial market. Nature 397:498–500

  • Mackmurdo AH (1940) The social organism. Nature 145(3666):187–187

  • Mandelbrot B (1960) The Pareto–Levy law and the distribution of income. Int Econ Rev 1(2):79–106

  • Mas-Colell A, Whinston MD, Green JR (1995) Microeconomic theory. Oxford University Press, Oxford

  • Moretti E (2013) Real wage inequality. Am Econ J Appl Econ 5(1):65–103

  • Nelson RR, Winter SG (1982) An evolutionary theory of economic change. The Belknap Press of Harvard University Press, Cambridge

  • Nirei M, Souma W (2007) A two factor model of income distribution dynamics. Rev Income Wealth 53(3):440–459

  • Nishi A, Shirado H, Rand DG, Christakis NA (2015) Inequality and visibility of wealth in experimental social networks. Nature 526(7573):426–429

  • Oancea B, Andrei T, Pirjol D (2016) Income inequality in Romania: the exponential-Pareto distribution. Physica A.

  • Pareto V (1897) Cours d’ Economie Politique. L’ Universite de Lausanne, Lausanne

  • Piketty T (2003) Income inequality in France, 1901–1998. J Polit Econ 111:1004–1042

  • Piketty T, Qian N (2009) Income inequality and progressive income taxation in China and India, 1986–2015. Am Econ J Appl Econ 1(2):53–63

  • Piketty T, Saez E (2003) Income inequality in the United States, 1913–1998. Q J Econ 118:1–39

  • Piketty T, Saez E (2014) Inequality in the long run. Science 344(6186):838–843

  • Potts J (2001) Knowledge and markets. J Evol Econ 11:413–431

  • Ravallion M (2014) Income inequality in the developing world. Science 344(6186):851–855

  • Rawls J (1999) A theory of justice (revised edition). Harvard University Press, Cambridge

  • Rudin W (1976) Principles of mathematical analysis, 3rd edn. McGraw-Hill, Inc, New York

  • Saez E, Zucman G (2016) Wealth inequality in the United States since 1913: evidence from capitalized income tax data. Q J Econ 131:519–578

  • Shaikh A (2016) Income distribution, econophysics and Piketty. Rev Polit Econ.

  • Shaikh A, Papanikolaou N, Wiener N (2014) Race, gender and the econophysics of income distribution in the USA. Physica A 415:54–60

  • Silva AC, Yakovenko VM (2005) Temporal evolution of the “thermal” and “superthermal” income classes in the USA during 1983–2001. Europhys Lett 69(2):304–310

  • Tao Y (2010) Competitive market for multiple firms and economic crisis. Phys Rev E 82(3):036118

  • Tao Y (2015) Universal laws of human society’s income distribution. Physica A 435:89–94

  • Tao Y (2016) Spontaneous economic order. J Evol Econ 26(3):467–500

  • Tao Y (2017) An index measuring the deviation of a real economy from the general equilibrium: evidence from the OECD Countries. Available at SSRN:

  • Tao Y, Wu X, Li C (2017) Rawls’ fairness, income distribution and alarming level of Gini coefficient. Economics discussion papers, no 2017-67. Kiel Institute for the World Economy.

  • Venkatasubramanian V, Luo Y, Sethuraman J (2015) How much inequality in income is fair? A microeconomic game theoretic perspective. Physica A 435:120–138

  • Walras L (2003) Elements of pure economics or the theory of social wealth. Routledge, London

  • Whitfield J (2007) Survival of the likeliest? PLoS Biol 5(5):e142

  • Yakovenko VM, Rosser JB Jr (2009) Statistical mechanics of money, wealth, and income. Rev Mod Phys 81(4):1703–1717



The authors would like to thank two anonymous referees and the editorial board for valuable comments and suggestions. All errors remain ours. Victor Yakovenko was supported by grant “Statistical Physics Approach to Income and Wealth Distribution” from the Institute for New Economic Thinking (INET), and Yong Tao by the Fundamental Research Funds for the Central Universities of China (Grant No. SWU1409444).

Author information


Corresponding authors

Correspondence to Yong Tao or Victor M. Yakovenko.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (docx 424 KB)



N-person non-cooperative game

Arrow–Debreu’s General Equilibrium Model (ADGEM) is based on the two well-known criteria of neoclassical economics: utility maximization and profit maximization. If there are N consumers, each of whom operates a firm, the ADGEM describes their optimal behavior by the following principles (Tao 2015, 2016):

  (a) Profit maximization: For each firm \(i=1,\ldots ,N\), \(y_i^*\in Y_i \) maximizes profits such that \(p\cdot y_i \le p\cdot y_i^*\) for all \(y_i \in Y_i \).

  (b) Utility maximization: For each consumer \(i=1,\ldots ,N\), \(x_i^*\in X_i \) maximizes the preference relation \(\succsim _i \) over the budget set \(\big \{ x_i \in X_i :p\cdot x_i \le p\cdot \omega _i +\mathop \sum \nolimits _{j=1}^N \theta _{ij} p\cdot y_j^*\big \}\).

  (c) Market clearing: \(\mathop \sum \nolimits _{i=1}^N x_i^*=\mathop \sum \nolimits _{i=1}^N \omega _i +\mathop \sum \nolimits _{i=1}^N y_i^*\).

Here \(x_i \) and \(X_i \) represent the consumption vector and consumption set of the \(i\hbox {th}\) consumer, respectively; \(y_i \) and \(Y_i \) represent the production vector and production set of the \(i\hbox {th}\) firm, respectively (Mas-Colell et al. 1995); \(\theta _{ij} \) represents the ownership share of firm \(j=1,\ldots ,N\) held by the \(i\hbox {th}\) consumer. The allocation \(\left( {x_1^*,\ldots ,x_N^*;y_1^*,\ldots ,y_N^*} \right) \) and a price vector \(p=\left( {p_1 ,\ldots ,p_L } \right) \) constitute a Pareto optimal solution to ADGEM (a)–(c).

Rawls’ fairness of “2-person allocation”

For illustration, let us consider a simple “2-person society” in which the GDP is $2 and each person can earn an equilibrium income of $0, $1 or $2. For this “2-person society”, Eq. (1) can be expressed in the form:

$$\begin{aligned} \left\{ {{\begin{array}{l} I_i =0,1,2\quad \hbox {for} \quad i=1,2 \\ \mathop \sum \nolimits _{i=1}^2 I_i =2 \\ \end{array} }} \right. \end{aligned}$$

By Eq. (B.1), the “2-person society” has three equilibrium income allocations (EIAs): \(A_1 =\left\{ {0,2} \right\} \), \(A_2 =\left\{ {2,0} \right\} \) and \(A_3 =\left\{ {1,1} \right\} \). They are shown below:

figure a

By Rawls’ principle of fair equality of opportunity, each EIA should occur with an equal probability (Tao 2015, 2016); therefore, each person’s expected income equals $1. The detailed calculation is as below:

$$\begin{aligned} \hbox {Probability}\left( A_{1} \right)= & {} 1/3,\hbox {Probability}\left( {A_2 } \right) =1/3,\hbox {Probability}\left( {A_3 } \right) =1/3 \nonumber \\ \hbox {Woman's Expected Income}= & {} 0\times \left( {1/3} \right) + 2\times \left( {1/3} \right) +1\times \left( {1/3} \right) =1 \end{aligned}$$
$$\begin{aligned} \hbox {Man's Expected Income}= & {} 2\times \left( {1/3} \right) +0\times \left( {1/3} \right) +1\times \left( {1/3} \right) =1 \end{aligned}$$

This means that each person has an equal opportunity of earning money. If we denote the equal income distribution by a and the unequal income distribution by b, we have \(a=\left\{ A_{3} \right\} \) and \(b=\left\{ {A_1 ,A_2 } \right\} \). By Rawls’ principle of fair equality of opportunity, a occurs with probability 1/3 and b occurs with probability 2/3. Following the rule of “survival of the likeliest”, b will be the result of natural selection.
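The counting argument above can be reproduced by brute force. The following is a minimal sketch, assuming integer income levels and equal probability for every equilibrium income allocation; the function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def equilibrium_allocations(n_people, total_income):
    """All ordered allocations of integer incomes that sum to total_income."""
    levels = range(total_income + 1)
    return [a for a in product(levels, repeat=n_people) if sum(a) == total_income]


def expected_incomes(allocations):
    """Expected income of each person when every allocation is equally likely."""
    weight = Fraction(1, len(allocations))
    n_people = len(allocations[0])
    return [sum(a[i] for a in allocations) * weight for i in range(n_people)]


def distribution_probabilities(allocations):
    """Probability of each income distribution (an allocation with identities ignored)."""
    counts = Counter(tuple(sorted(a)) for a in allocations)
    return {dist: Fraction(c, len(allocations)) for dist, c in counts.items()}


# The 2-person society with a GDP of $2 described above.
allocations = equilibrium_allocations(2, 2)
print(allocations)
print(expected_incomes(allocations))
print(distribution_probabilities(allocations))
```

For the 2-person society this yields three allocations, an expected income of $1 per person, and probabilities 1/3 and 2/3 for the equal and unequal distributions. Larger values of `n_people` and `total_income` can be tried, although the enumeration grows quickly.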

Density function of income distribution

For “N-person allocation”, Tao has shown that, by applying Rawls’ fairness to Eq. (1), where N and Y are large enough, one obtains the exponential income distribution, which occurs with the highest probability (Tao 2015, 2016):

$$\begin{aligned} a_k= & {} g_k e^{-\frac{( {\varepsilon _k -\mu } )}{\theta }}, \nonumber \\&\varepsilon _1<\varepsilon _2<\cdots <\varepsilon _n. \end{aligned}$$

Here \(\mu \) and \(\theta \) denote marginal labor-capital return and marginal technology return, respectively (Tao 2016); readers can find the origin of these two parameters in “Appendix D”.

The formula (C.1) indicates that there are \(a_k \) consumers, each of whom obtains \(\varepsilon _k \) units of revenue, where k runs from 1 to n. Because the income distribution (C.1) occurs with the highest probability, Tao calls it the “spontaneous economic order” (Tao 2016).

The formula (C.1) can be rewritten in the form of continuous function. To see this, let us first observe:

$$\begin{aligned} \mathop \sum \limits _{k=1}^n a_k =N, \end{aligned}$$

which leads to:

$$\begin{aligned} \mathop \sum \limits _{k=1}^n \frac{a_k }{N}=1. \end{aligned}$$

Here \(\frac{a_k}{N}\) denotes the proportion of the population earning \(\varepsilon _k \) units of income. Now we write \(\frac{a_k }{N}\) in the form of a continuous function \(f\left( x \right) \). To this end, let us set:

$$\begin{aligned} f\left( x \right) =w\cdot e^{\frac{- ( {x-\mu })}{\theta }}, \end{aligned}$$

where x, which replaces \(\varepsilon _k \), denotes a continuous income level, and by Rational Agent Hypothesis one has (Tao 2010) \(x\ge \mu \).

Here w is an undetermined constant, which is fixed by the sum formula (C.3). Let us replace \(\frac{a_k }{N}\) by (C.4), and convert the sum in formula (C.3) into an integral:

$$\begin{aligned} \mathop \int \nolimits _\mu ^{+\infty } w\cdot e^{\frac{-({x-\mu })}{\theta }}dx=1, \end{aligned}$$

which leads to \(w=\frac{1}{\theta }\).

Finally, we obtain the density function of income distribution:

$$\begin{aligned} f\left( x \right) =\frac{1}{\theta }e^{\frac{- ({x-\mu })}{\theta }}. \end{aligned}$$
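The normalization \(w=\frac{1}{\theta }\) can be checked numerically. The sketch below integrates the density with a plain trapezoidal rule; the parameter values and the function names are hypothetical and serve only as an illustration.

```python
import math


def exponential_income_density(x, mu, theta):
    """Density f(x) = (1/theta) * exp(-(x - mu)/theta) for x >= mu, and 0 below mu."""
    if x < mu:
        return 0.0
    return math.exp(-(x - mu) / theta) / theta


def integrate_density(mu, theta, upper_multiple=50.0, steps=200_000):
    """Trapezoidal integral of the density from mu to mu + upper_multiple * theta."""
    upper = mu + upper_multiple * theta
    h = (upper - mu) / steps
    total = 0.5 * (exponential_income_density(mu, mu, theta)
                   + exponential_income_density(upper, mu, theta))
    for i in range(1, steps):
        total += exponential_income_density(mu + i * h, mu, theta)
    return total * h


# Hypothetical parameter values, chosen only for illustration.
mu, theta = 10.0, 25.0
print(integrate_density(mu, theta))  # close to 1 when w = 1/theta
```

The upper limit is truncated at 50 values of \(\theta \) above \(\mu \), where the neglected tail is of order \(e^{-50}\).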

Technological progress and entropy

Because the firm consists of labor and capital, the Cobb–Douglas aggregate production function (or GDP) of neoclassical economics can be written in the form (Tao 2010, 2016):

$$\begin{aligned} Y=Y\left( {N\left( {L,K} \right) ,H} \right) , \end{aligned}$$

where L and K denote labor and capital, whereas \(N\hbox { and }H\) denote the number of firms and technological progress.

The total differential of (D.1) yields [see also Eq. (9) in Banerjee and Yakovenko (2010)]:

$$\begin{aligned} \hbox {d}Y\left( {N\left( {L,K} \right) ,H} \right) =\mu dN\left( {L,K} \right) +\theta dH, \end{aligned}$$

where \(\mu =\partial Y/\partial N\) and \(\theta =\partial Y/\partial H\) denote the marginal labor-capital return and the marginal technology return (Tao 2016), respectively.

Here Tao identifies the entropy \(\ln \varOmega \) with the technological progress H (Tao 2010, 2016):

$$\begin{aligned} H=\ln {\varOmega }, \end{aligned}$$

where \({\varOmega }\) denotes the number of equilibrium income allocations that a given income distribution contains; \({\varOmega }\) also measures the freedom of choice of social members (Tao 2016). For example, for the 2-person society described in “Appendix B”, we have \({\varOmega } \left( a \right) =1\) and \({\varOmega } \left( b \right) =2\). By maximizing \({\varOmega }\) (Tao 2010, 2015, 2016), one can obtain the exponential income distribution (2). Consequently, the technological progress H can be regarded as the entropy of socio-economic systems.
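As an illustration of how \({\varOmega }\) can be counted, the sketch below uses the multinomial coefficient \(N!/\prod _k a_k !\) for occupancy numbers \(a_k \). This assumes that each income level has a single state (\(g_k =1\)), which is a simplification made here and not necessarily the counting used in the cited works; the function names are hypothetical.

```python
import math


def allocation_count(occupancy):
    """Omega = N! / prod(a_k!), assuming one state per income level (g_k = 1)."""
    n_people = sum(occupancy)
    omega = math.factorial(n_people)
    for a_k in occupancy:
        omega //= math.factorial(a_k)
    return omega


def entropy(occupancy):
    """H = ln(Omega) for the given occupancy numbers."""
    return math.log(allocation_count(occupancy))


# 2-person society with income levels $0, $1, $2.
equal_distribution = (0, 2, 0)     # both people earn $1
unequal_distribution = (1, 0, 1)   # one person earns $0, the other $2

print(allocation_count(equal_distribution), entropy(equal_distribution))
print(allocation_count(unequal_distribution), entropy(unequal_distribution))
```

Under this assumption the counts reproduce \({\varOmega } \left( a \right) =1\) and \({\varOmega } \left( b \right) =2\) from the 2-person example, so the unequal distribution has the larger entropy.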

Furthermore, the total differential of Eq. (D.1) can be rewritten in the form:

$$\begin{aligned} dY=\omega dL+rdK+\theta dH, \end{aligned}$$

where \(\omega =\partial Y/\partial L\) and \(r=\partial Y/\partial K\) denote the marginal labor return and the marginal capital return, respectively (Tao 2017). On the one hand, we may assume that capital markets exhibit perfect competition, so r also denotes the interest rate. On the other hand, by the principle of diminishing marginal return in neoclassical economics, \(\omega \) denotes the minimum wage. Comparing Eqs. (D.2) and (D.4), we can obtain (Tao 2017):

$$\begin{aligned} \mu =\omega \cdot \sigma -r\cdot \sigma \cdot MRTS_{LK} , \end{aligned}$$

where \(\sigma =dL/dN\) and \(MRTS_{LK} =-dK/dL\). Here \(\sigma \) denotes the marginal employment level and \(MRTS_{LK} \) denotes the marginal rate of technical substitution of labor and capital (Tao 2017).
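The relation between \(\mu \), \(\omega \) and r can be checked with a small numerical example. The sketch below evaluates Eq. (D.5) and confirms that \(\mu \, dN=\omega \, dL+r\, dK\) when \(dL=\sigma \, dN\) and \(dK=-MRTS_{LK} \, dL\). All numbers and the function name are hypothetical.

```python
def marginal_labor_capital_return(omega, r, sigma, mrts_lk):
    """mu = omega * sigma - r * sigma * MRTS_LK, as in Eq. (D.5)."""
    return omega * sigma - r * sigma * mrts_lk


# Hypothetical values: minimum wage, interest rate, marginal employment level, MRTS.
omega, r, sigma, mrts_lk = 12.0, 0.05, 4.0, 20.0
mu = marginal_labor_capital_return(omega, r, sigma, mrts_lk)

# Consistency check against dY = omega*dL + r*dK at fixed H.
d_n = 0.5
d_l = sigma * d_n          # sigma = dL/dN
d_k = -mrts_lk * d_l       # MRTS_LK = -dK/dL
print(mu, mu * d_n, omega * d_l + r * d_k)
```

With these values \(\mu \) is about 44, and both sides of the consistency check are about 22.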

Main propositions

To obtain a consistent estimate of \(\mu \), we analyze the estimator in two cases: the full sample and the truncation sample. In this paper, \(lim_{n\rightarrow \infty } a_n =a\) means \(lim_{n\rightarrow \infty } P\left( {a_n =a} \right) =1\), where \(P\left( \xi \right) \) denotes the probability that \(\xi \) occurs.

1.1 Full sample

Let us first drop the constraint \(x\ge \mu \). For the full data (i.e., the population), Eq. (4) can be written in the form:

$$\begin{aligned} y_j= & {} \beta ^{*}x_j +\alpha ^{*}+\varepsilon _j , \end{aligned}$$
$$\begin{aligned} \mu ^{*}= & {} -\frac{\alpha ^{*}}{\beta ^{*}}, \end{aligned}$$

where \(\beta ^{*}=-\frac{1}{\theta ^{*}}\), \(\alpha ^{*}=\frac{\mu ^{*}}{\theta ^{*}}\), and \(\varepsilon _j \sim N\left( {0,\sigma ^{2}} \right) \) for \(j=1,2,\ldots ,\infty \). Here \(\left\{ {x_j } \right\} _{j=1}^\infty \) and \(\left\{ {y_j } \right\} _{j=1}^\infty \) denote the full data. \(\beta ^{*}\) and \(\alpha ^{*}\) are obtained by regressing \(\left\{ {y_j } \right\} _{j=1}^\infty \) on \(\left\{ {x_j } \right\} _{j=1}^\infty \).

For the full sample (see footnote 5), the sample estimates of Eqs. (E.1) and (E.2) yield:

$$\begin{aligned} \hat{y} _i= & {} \hat{\beta } x_i +\hat{\alpha } , \end{aligned}$$
$$\begin{aligned} \hat{\mu }= & {} -\frac{\hat{\alpha } }{\hat{\beta } }, \end{aligned}$$

where \(i=1,2,\ldots ,n\).

Because the constraint \(x\ge \mu \) is absent, Eq. (E.1) differs slightly from Eq. (3); therefore, we cannot be sure that \(\mu ^{*}=\mu \). In section E2, we will discuss the estimate of \(\mu \) when \(x\ge \mu \) holds. In this section, we mainly investigate the consistency of the estimate (E.4).

Taking the least squares estimation on Eq. (E.3) we have:

$$\begin{aligned} \hat{\beta }= & {} \frac{\mathop \sum \nolimits _{i=1}^n \left( {x_i -\bar{x} } \right) \left( {y_i -\bar{y} } \right) }{\mathop \sum \nolimits _{i=1}^n \left( {x_i -\bar{x} } \right) ^{2}}, \end{aligned}$$
$$\begin{aligned} \hat{\alpha }= & {} \bar{y} -\hat{\beta } \bar{x}, \end{aligned}$$

where \(\bar{x} =\frac{1}{n}\mathop \sum \nolimits _{i=1}^n x_i \) and \(\bar{y} =\frac{1}{n}\mathop \sum \nolimits _{i=1}^n y_i \).
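The least squares formulas (E.5) and (E.6), together with the estimate (E.4), can be sketched as follows. The synthetic data are generated from \(y=\beta x+\alpha +\varepsilon \) with \(\beta =-1/\theta \) and \(\alpha =\mu /\theta \); the parameter values, the noise level, and the function names are hypothetical and chosen only for illustration.

```python
import random


def ols_fit(xs, ys):
    """Closed-form least squares slope and intercept, as in Eqs. (E.5) and (E.6)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return beta_hat, alpha_hat


def estimate_mu(xs, ys):
    """mu_hat = -alpha_hat / beta_hat, as in Eq. (E.4)."""
    beta_hat, alpha_hat = ols_fit(xs, ys)
    return -alpha_hat / beta_hat


# Hypothetical "true" parameters and synthetic data with Gaussian noise.
rng = random.Random(0)
mu_star, theta_star = 10.0, 25.0
xs = [float(i) for i in range(1, 201)]
ys = [-(x - mu_star) / theta_star + rng.gauss(0.0, 0.05) for x in xs]

beta_hat, alpha_hat = ols_fit(xs, ys)
mu_hat = estimate_mu(xs, ys)
print(beta_hat, alpha_hat, mu_hat)
```

With small noise the printed slope should be close to \(-1/\theta ^{*}\) and the estimate \(\hat{\mu } \) close to \(\mu ^{*}\); the exact values depend on the random draw.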

Since the exponential distribution (3) is only suitable for the low and middle parts of the income data, we should drop the high income data. Moreover, given the economic meaning of \(x_i \) in Eq. (3), \(\left\{ {x_i } \right\} _{i=1}^n \) should be a monotonically increasing sequence. Thus, we can make the following assumptions.


  (a) \(\left| {x_i } \right| <\infty \) and \(\left| {y_i } \right| <\infty \) for \(i=1,2,\ldots ,n\).

  (b) \(\left\{ {x_i } \right\} _{i=1}^n \) is a strictly monotonic increasing sequence with \(x_i \ge 0\) for \(i=1,2,\ldots ,n\).

  (c) \(\varepsilon _j \) are i.i.d. \(N\left( {0,\sigma ^{2}} \right) \).

Next we verify that \(\hat{\beta } \) and \(\hat{\alpha } \) are consistent estimates.

Theorem 1

Assume that \(\varepsilon _j \) are i.i.d. \(N\left( {0,\sigma ^{2}} \right) \). If \(lim_{n\rightarrow \infty } \left( {{\varvec{X}}^{T}{\varvec{X}}} \right) ^{-1}={\varvec{0}}\), then one has:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\beta }= & {} \beta ^{*}, \end{aligned}$$
$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\alpha }= & {} \alpha ^{*}, \end{aligned}$$

where \({{\varvec{X}}}=\left( {{\begin{array}{lll} {x_1}&{} \cdots &{} {x_n } \\ 1&{} \cdots &{} 1 \\ \end{array} }} \right) ^{T}\).


Proof. See Lai et al. (1979). \(\square \)

To verify Eqs. (E.7) and (E.8), we need only prove the following proposition.

Proposition 1

\(lim_{n\rightarrow \infty } \left( {{{\varvec{X}}}^{T}{{\varvec{X}}}} \right) ^{-1}=\mathbf{0}\).


Proof. It is easy to compute:

$$\begin{aligned} \left( {{{\varvec{X}}}^{T}{{\varvec{X}}}} \right) ^{-1}=\frac{1}{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i } \right) ^{2}}\left( {{\begin{array}{cc} n &{} {-\mathop \sum \nolimits _{i=1}^n x_i } \\ {-\mathop \sum \nolimits _{i=1}^n x_i }&{} {\mathop \sum \nolimits _{i=1}^n x_i^2 } \\ \end{array} }} \right) , \end{aligned}$$

so proving \(lim_{n\rightarrow \infty } \left( {{{\varvec{X}}}^{T}{{\varvec{X}}}} \right) ^{-1}=\mathbf{0}\) is equivalent to verifying:

$$\begin{aligned} lim_{n\rightarrow \infty } \frac{1}{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i } \right) ^{2}}\left( {{ \begin{array}{cc} n&{} {-\mathop \sum \nolimits _{i=1}^n x_i } \\ {-\mathop \sum \nolimits _{i=1}^n x_i }&{} {\mathop \sum \nolimits _{i=1}^n x_i^2 } \\ \end{array} }} \right) =\left( {{\begin{array}{cc} 0&{} 0 \\ 0&{} 0 \\ \end{array} }} \right) . \end{aligned}$$

Obviously, proving Eq. (E.9) is equivalent to verifying the following three equations:

$$\begin{aligned}&lim_{n\rightarrow \infty } \frac{n}{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i} \right) ^{2}}=0, \end{aligned}$$
$$\begin{aligned}&\quad lim_{n\rightarrow \infty } \frac{\mathop \sum \nolimits _{i=1}^n x_i }{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i} \right) ^{2}}=0, \end{aligned}$$
$$\begin{aligned}&\quad lim_{n\rightarrow \infty } \frac{\mathop \sum \nolimits _{i=1}^n x_i^2 }{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i} \right) ^{2}}=0. \end{aligned}$$

One can compute:

$$\begin{aligned} n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i } \right) ^{2}=n^{2}\left[ {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\bar{x} } \right) ^{2}} \right] . \end{aligned}$$

Furthermore, we have the following result:

$$\begin{aligned}&\frac{1}{n}\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\bar{x} } \right) ^{2}=\frac{1}{n}\mathop \sum \nolimits _{i=1}^n x_i^2 -2\left( {\bar{x} } \right) ^{2} \nonumber \\&\quad + \left( {\bar{x} } \right) ^{2}=\frac{1}{n}\mathop \sum \nolimits _{i=1}^n \left( {x_i -\bar{x} } \right) ^{2}. \end{aligned}$$

By Assumption (b) we must have \(\mathop \sum \nolimits _{i=1}^n \left( {x_i -\bar{x} } \right) ^{2}\ne 0\); otherwise, \(x_i =\bar{x} \) for \(i=1,2,\ldots ,n\), contradicting the strict monotonicity. On the other hand, by the strict monotonicity, there is at most one number \(x_l \) such that \(x_l =\bar{x} \). Thus, if we set \(\mathop {\min }\limits _{i\ne l} \left| {x_i -\bar{x} } \right| =A\), then we have \(\mathop \sum \nolimits _{i=1}^n \left( {x_i -\bar{x} } \right) ^{2}\ge 0+\left( {n-1} \right) \cdot A^{2}\).

Consequently, by Eq. (E.14) we can obtain:

$$\begin{aligned} \left| {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\bar{x} } \right) ^{2}} \right| =\left| {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n \left( {x_i -\bar{x} } \right) ^{2}} \right| \ge \frac{n-1}{n}\cdot A^{2}. \end{aligned}$$

Using Eqs. (E.13) and (E.15) one has

$$\begin{aligned}&\left| {\frac{1}{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i } \right) ^{2}}} \right| \nonumber \\&\quad =\left| {\frac{1}{n^{2}\left[ {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\bar{x} } \right) ^{2}} \right] }} \right| \nonumber \\&\quad \le \frac{1}{n^{2}\cdot \frac{n-1}{n}\cdot A^{2}}=\frac{1}{n\cdot \left( {n-1} \right) \cdot A^{2}}. \end{aligned}$$

On the other hand, by Assumption (a), we can set \({\textit{max}}_i \left| {x_i } \right| =B\); therefore, using \(x_i \ge 0\) from Assumption (b), we have:

$$\begin{aligned} \left| {\mathop \sum \nolimits _{i=1}^n x_i } \right| =\mathop \sum \nolimits _{i=1}^n \left| {x_i } \right| \le n\cdot B, \end{aligned}$$
$$\begin{aligned} \left| {\mathop \sum \nolimits _{i=1}^n x_i^2 } \right| =\mathop \sum \nolimits _{i=1}^n x_i^2 \le n\cdot B^{2}. \end{aligned}$$

Using Eqs. (E.16)–(E.18), we can obtain:

$$\begin{aligned}&\left| {\frac{n}{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i } \right) ^{2}}} \right| \nonumber \\&\quad \le \frac{n}{n\cdot \left( {n-1} \right) \cdot A^{2}}=\frac{1}{\left( {n-1} \right) \cdot A^{2}}. \end{aligned}$$
$$\begin{aligned}&\quad \left| {\frac{\mathop \sum \nolimits _{i=1}^n x_i }{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i} \right) ^{2}}} \right| \le \frac{n\cdot B}{n\cdot \left( {n-1} \right) \cdot A^{2}}=\frac{B}{\left( {n-1} \right) \cdot A^{2}}. \end{aligned}$$
$$\begin{aligned}&\quad \left| {\frac{\mathop \sum \nolimits _{i=1}^n x_i^2 }{n\mathop \sum \nolimits _{i=1}^n x_i^2 -\left( {\mathop \sum \nolimits _{i=1}^n x_i } \right) ^{2}}} \right| \le \frac{n\cdot B^{2}}{n\cdot \left( {n-1} \right) \cdot A^{2}}=\frac{B^{2}}{\left( {n-1} \right) \cdot A^{2}}. \end{aligned}$$

Letting \(n\rightarrow \infty \) in Eqs. (E.19)–(E.21), and provided that A stays bounded away from zero and B stays bounded as n grows, one obtains Eqs. (E.10)–(E.12). \(\square \)
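The behaviour claimed in Proposition 1 can be illustrated numerically with the closed-form inverse given in the proof. The sketch below uses the equally spaced sequence \(x_i =i\), which is an assumption made here for illustration (it is strictly increasing, but it is not the income data of the paper); the function names are hypothetical.

```python
def xtx_inverse(xs):
    """Closed-form inverse of X^T X for the design matrix with rows (x_i, 1)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x * x for x in xs)
    det = n * s2 - s1 * s1
    return [[n / det, -s1 / det],
            [-s1 / det, s2 / det]]


def largest_entry(xs):
    """Largest absolute entry of (X^T X)^{-1}."""
    return max(abs(v) for row in xtx_inverse(xs) for v in row)


# Illustrative sequence x_i = i for increasing sample sizes.
for n in (10, 100, 1000):
    xs = [float(i) for i in range(1, n + 1)]
    print(n, largest_entry(xs))
```

For this sequence the largest entry shrinks as n grows, in line with the limit stated in Proposition 1.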

By Theorem 1, it is easy to compute:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\mu } =-\frac{\alpha ^{*}}{\beta ^{*}}=\mu ^{*}. \end{aligned}$$

Equation (E.22) indicates that, in the absence of the constraint \(x\ge \mu \), the estimate \(\hat{\mu } \) is consistent. However, imposing the constraint \(x\ge \mu \) may make the estimate \(\hat{\mu } \) inconsistent.

1.2 Truncation sample

Now let us restore the constraint \(x\ge \mu \). Since the constraint \(x\ge \mu \) holds, we attempt to construct a truncation estimate of \(\mu \). To this end, we may assume that \(\mu \) exists. Thus, the truncation of the full data \(x_j \) can be written as:

$$\begin{aligned} x_j \ge \mu , \end{aligned}$$

where \(j=g^{*},g^{*}+1,\ldots ,\infty \).

Using the truncation data (E.23), Eq. (4) can be written as:

$$\begin{aligned} y_k= & {} \beta x_k +\alpha +\varepsilon _k , \end{aligned}$$
$$\begin{aligned} x_k\ge & {} \mu , \end{aligned}$$

where \(\beta =-\frac{1}{\theta }\), \(\alpha =\frac{\mu }{\theta }\), and \(\varepsilon _k \sim N\left( {0,\sigma ^{2}} \right) \) for \(k=g^{*},g^{*}+1,\ldots ,\infty \). Here \(\beta \) and \(\alpha \) are obtained by regressing \(\left\{ {y_j } \right\} _{j=g^{*}}^\infty \) on \(\left\{ {x_j } \right\} _{j=g^{*}}^\infty \).

Thus, the sample estimates of Eqs. (E.24) and (E.25) yield:

$$\begin{aligned} \hat{y} _i= & {} \hat{\beta } _g x_i +\hat{\alpha } _g , \end{aligned}$$
$$\begin{aligned} x_i\ge & {} \hat{\mu } _g , \end{aligned}$$

where \(i=g,g+1,\ldots ,n\) and \(g=g\left( n \right) \). Here \(\left\{ {x_i } \right\} _{i=g}^n \) and \(\left\{ {y_i } \right\} _{i=g}^n \) denote the truncation sample. It is worth emphasizing that \(g^{*}\) and \(g=g\left( n \right) \) are undetermined.

Taking the least squares estimation on Eq. (E.26) we have:

$$\begin{aligned} \hat{\beta } _g= & {} \frac{\mathop \sum \nolimits _{i=g}^n \left( {x_i -\bar{x} _g } \right) \left( {y_i -\bar{y} _g } \right) }{\mathop \sum \nolimits _{i=g}^n \left( {x_i -\bar{x} _g } \right) ^{2}}, \end{aligned}$$
$$\begin{aligned} \hat{\alpha } _g= & {} \bar{y} _g -\hat{\beta } _g \bar{x} _g , \end{aligned}$$

where \(\bar{x} _g =\frac{1}{n-g+1}\mathop \sum \nolimits _{i=g}^n x_i\) and \(\bar{y} _g =\frac{1}{n-g+1}\mathop \sum \nolimits _{i=g}^n y_i\).

The main purpose of this section is to derive the estimate \(\hat{\mu } _g \). Assuming \(g^{*}<\infty \), we have the following theorem and proposition:

Theorem 2

Assume that \(\varepsilon _j \) are i.i.d. \(N\left( {0,\sigma ^{2}} \right) \). If \(lim_{n\rightarrow \infty } \left( {{{\varvec{X}}}_{g^{*}}^T {{\varvec{X}}}_{g^{*}}} \right) ^{-1}=\mathbf{0}\), then one has:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\beta } _{g^{*}}= & {} \beta , \end{aligned}$$
$$\begin{aligned} lim_{n\rightarrow \infty }\hat{\alpha } _{g^{*}}= & {} \alpha , \end{aligned}$$

where \({{\varvec{X}}}_{g^{*}} =\left( {{\begin{array}{lll} {x_{g^{*}} }&{} \cdots &{} {x_n } \\ 1&{} \cdots &{} 1 \\ \end{array}}} \right) ^{T}\).


Proof

The same as for Theorem 1. \(\square \)

Proposition 2

\(lim_{n\rightarrow \infty } \left( {{{\varvec{X}}}_{g^{*}}^T {{\varvec{X}}}_{g^{*}} } \right) ^{-1}=\mathbf{0}\).


Proof

The same as for Proposition 1. \(\square \)

Consistent with the form of Eq. (E.4), \(\hat{\mu } _g \) can be defined as:

$$\begin{aligned} \hat{\mu } _g =-\frac{\hat{\alpha } _g }{\hat{\beta } _g }. \end{aligned}$$

We now derive the consistency condition that guarantees the validity of the estimate (E.32).

Substituting Eq. (E.29) into Eq. (E.32), one has:

$$\begin{aligned} \hat{\mu } _g =\bar{x} _g -\frac{\bar{y} _g }{\hat{\beta } _g }, \end{aligned}$$

which guarantees that the constraint of Eq. (E.26) has been imposed on the estimate (E.32).

On the other hand, Eq. (E.27) indicates:

$$\begin{aligned} \bar{x} _g >\hat{\mu } _g +\delta , \end{aligned}$$

where we have used Assumption (b) and \(\delta >0\).

Inserting Eq. (E.33) into Eq. (E.34) yields:

$$\begin{aligned} \frac{\bar{y} _g }{\hat{\beta } _g }>\delta >0, \end{aligned}$$

which guarantees that the constraint of Eq. (E.27) has been imposed on the estimate (E.32).
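
The two forms of the estimate and the positivity constraint can be checked numerically. The sketch below assumes noise-free synthetic data with hypothetical values \(\theta =2\) and \(\mu =4\); `mu_hat_from_truncated_fit` is an illustrative name, not a routine from the paper.

```python
def mu_hat_from_truncated_fit(xs, ys):
    """Estimate mu from a truncated sample (xs, ys) that already starts at index g.

    Returns (mu_hat, ratio), where mu_hat = x_bar - y_bar / beta_hat and
    ratio = y_bar / beta_hat is the quantity required to exceed delta > 0.
    """
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    mu_hat = x_bar - y_bar / beta_hat
    # Both expressions for mu_hat agree algebraically: -alpha_hat / beta_hat.
    assert abs(mu_hat - (-alpha_hat / beta_hat)) < 1e-9
    return mu_hat, y_bar / beta_hat


theta, mu = 2.0, 4.0                      # hypothetical parameter values
xs = [mu + 0.5 * k for k in range(13)]    # 4.0, 4.5, ..., 10.0
ys = [-(xi - mu) / theta for xi in xs]    # y = beta * x + alpha, no noise
mu_hat, ratio = mu_hat_from_truncated_fit(xs, ys)
```

Here the ratio equals \(\bar{x} _g -\hat{\mu } _g \), so requiring it to exceed \(\delta >0\) is the same as requiring the truncated sample mean to lie strictly above the estimated threshold.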

Thus, we can obtain the core proposition of this Appendix as below:

Proposition 3

For a strictly monotonically increasing sequence \(\left\{ {x_j } \right\} _{j=1}^n \), if there exists an integer \(g=g\left( n \right) \) such that:

  (A) \(x_{i-1}<\mu <x_i \) or \(x_i =\mu \), where \(i=g<n\) and \(lim_{n\rightarrow \infty } \frac{g}{n}=0\);

  (B) \(\frac{\bar{y} _g }{\hat{\beta } _g }>\delta >0\) for any n;

then one has:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\mu } _g =lim_{n\rightarrow \infty } \left( {\bar{x} _g -\frac{\bar{y}_g }{\hat{\beta } _g }} \right) =\mu , \end{aligned}$$

where g is uniquely determined by n and \(g<\infty \). This means:

$$\begin{aligned} lim_{n\rightarrow \infty } g=g^{*}. \end{aligned}$$

To prove Proposition 3, we need the following four lemmas:

Lemma 1

If \(\left\{ {\xi _i } \right\} _{i=1}^n \) is a monotonic sequence and if \(\left| {\xi _i } \right| <\infty \) for any i, then one has:

$$\begin{aligned} lim_{n\rightarrow \infty } \xi _n =\xi , \end{aligned}$$

where \(\left| \xi \right| <\infty \).


Proof

See Theorem 3.14 in Rudin (1976). \(\square \)

Lemma 2

For the sequence \(\left\{ {\xi _i } \right\} _{i=1}^n \), if \(lim_{n\rightarrow \infty } \xi _n =\xi \), then one has:

$$\begin{aligned} lim_{n\rightarrow \infty } \frac{1}{n}\mathop \sum \nolimits _{i=1}^n \xi _i =\xi . \end{aligned}$$


Proof

Since \(lim_{n\rightarrow \infty } \xi _n =\xi \), by the definition of a limit, for every \(\epsilon >0\) there always exists a positive integer N so that when \(k>N\), one has:

$$\begin{aligned} \left| {\xi _k -\xi } \right| <\frac{\epsilon }{2}. \end{aligned}$$

To verify Eq. (E.39), we only need to prove:

$$\begin{aligned} lim_{n\rightarrow \infty } \left( {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n \xi _i -\xi } \right) =0; \end{aligned}$$

that is, for every \(\epsilon >0\) there always exists a positive integer \(N_0\) so that when \(n>N_0 \), one has:

$$\begin{aligned} \left| {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n \xi _i -\xi } \right| <\epsilon . \end{aligned}$$

It’s easy to compute:

$$\begin{aligned}&\left| {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n \xi _i -\xi } \right| \nonumber \\&\quad =\left| {\frac{1}{n}\left[ {\mathop \sum \nolimits _{i=1}^N \left( {\xi _i -\xi } \right) +\mathop \sum \nolimits _{j=N+1}^n \left( {\xi _j -\xi } \right) } \right] } \right| \nonumber \\&\quad \le \frac{1}{n}\left| {\mathop \sum \nolimits _{i=1}^N \left( {\xi _i -\xi } \right) } \right| +\frac{1}{n}\left| {\mathop \sum \nolimits _{j=N+1}^n \left( {\xi _j -\xi } \right) } \right| . \end{aligned}$$

Because \(lim_{n\rightarrow \infty } \xi _n =\xi \), it’s easy to verify that \(\left| {\xi _i } \right| <\infty \) and \(\left| \xi \right| <\infty \). Thus, one has \({\textit{max}}_i \left| {\xi _i -\xi } \right| <\infty \). Consequently, since \(j>N\) for every term in the second sum, Eq. (E.43) can be written in the form:

$$\begin{aligned}&\left| {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n \xi _i -\xi } \right| \nonumber \\&\quad \le \frac{N}{n} {\textit{max}}_i \left| {\xi _i -\xi } \right| +\frac{n-N}{n}\frac{\epsilon }{2} \nonumber \\&\quad <\frac{N}{n} {\textit{max}}_i \left| {\xi _i -\xi } \right| +\frac{\epsilon }{2}. \end{aligned}$$

where we have used Eq. (E.40).

It’s easy to compute \(lim_{n\rightarrow \infty } \frac{N}{n} {\textit{max}}_i \left| {\xi _i -\xi } \right| =0\). This means that for every \(\epsilon >0\) there always exists a positive integer \(N_1\) so that when \(k>N_1 \), one has:

$$\begin{aligned} \frac{N}{k} {\textit{max}}_i \left| {\xi _i -\xi } \right| <\frac{\epsilon }{2}. \end{aligned}$$

Let us set \(N_0 =max\left\{ {N,N_1 } \right\} \); thus, substituting Eq. (E.45) into Eq. (E.44), we conclude that for every \(\epsilon >0\), when \(n>N_0 \), there always holds:

$$\begin{aligned} \left| {\frac{1}{n}\mathop \sum \nolimits _{i=1}^n \xi _i -\xi } \right| <\epsilon . \end{aligned}$$

\(\square \)
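
Lemma 2 can be illustrated numerically. The sketch below uses a hypothetical sequence \(\xi _i =1+\frac{1}{i}\), chosen only because its limit is known to be 1; the helper names are illustrative assumptions.

```python
def running_mean(seq_term, n):
    """Arithmetic mean of seq_term(1), ..., seq_term(n)."""
    return sum(seq_term(i) for i in range(1, n + 1)) / n


def xi(i):
    # Hypothetical example sequence: xi_i = 1 + 1/i, which converges to 1.
    return 1.0 + 1.0 / i


# Distance between the mean of the first n terms and the limit, for growing n.
errors = [abs(running_mean(xi, n) - 1.0) for n in (10, 1_000, 100_000)]
```

The distances in `errors` shrink as n grows, as the lemma asserts, although the mean approaches the limit more slowly than the sequence itself.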

Lemma 3

If \(lim_{n\rightarrow \infty } \frac{g}{n}=0\), one has:

$$\begin{aligned}&lim_{n\rightarrow \infty } \bar{x} _g =lim_{n\rightarrow \infty } \bar{x} =x, \\&\quad lim_{n\rightarrow \infty } \bar{y} _g =lim_{n\rightarrow \infty } \bar{y} =y, \end{aligned}$$

where \(x=lim_{n\rightarrow \infty } x_n \) and \(y=\beta ^{*}x+\alpha ^{*}\).


Proof

We first verify \(lim_{n\rightarrow \infty } \bar{x} _g =lim_{n\rightarrow \infty } \bar{x} \). It’s easy to check:

$$\begin{aligned} \bar{x}= & {} \frac{1}{n}\mathop \sum \nolimits _{i=1}^n x_i =\frac{1}{n}\mathop \sum \nolimits _{i=1}^{g-1} x_i \nonumber \\&+\,\frac{n-g+1}{n}\frac{1}{n-g+1}\mathop \sum \nolimits _{j=g}^n x_j \nonumber \\= & {} \frac{1}{n}\mathop \sum \nolimits _{i=1}^{g-1} x_i +\frac{n-g+1}{n}\bar{x} _g. \end{aligned}$$

Since \(lim_{n\rightarrow \infty } \frac{g}{n}=0\), taking the limit \(n\rightarrow \infty \) in Eq. (E.46), one obtains:

$$\begin{aligned} lim_{n\rightarrow \infty } \bar{x} _g =lim_{n\rightarrow \infty } \bar{x} , \end{aligned}$$

where we have used \(\left| {x_i } \right| <\infty \).

Since Assumptions (a) and (b) hold, by using Lemma 1 one has: \(lim_{n\rightarrow \infty } x_n =x\). This means that by using Lemma 2 one obtains \(lim_{n\rightarrow \infty } \bar{x} =x\). Therefore, we verify \(lim_{n\rightarrow \infty } \bar{x} _g =lim_{n\rightarrow \infty } \bar{x} =x\).

We now verify \(lim_{n\rightarrow \infty } \bar{y} _g =lim_{n\rightarrow \infty } \bar{y} =y\).

Using the same technique as for Eq. (E.46), by Assumption (a) we can verify \(lim_{n\rightarrow \infty } \bar{y} _g =lim_{n\rightarrow \infty } \bar{y} \). By using Eq. (E.1), one has:

$$\begin{aligned} \bar{y} =\beta ^{*}\bar{x} +\alpha ^{*}+\bar{\varepsilon }, \end{aligned}$$

where \(\bar{\varepsilon } =\frac{1}{n}\mathop \sum \nolimits _{i=1}^n \varepsilon _i\).

By Assumption (c) \(\varepsilon _j \) are i.i.d. \(N\left( {0,\sigma ^{2}} \right) \), so by using the law of large numbers, it’s easy to obtain:

$$\begin{aligned} lim_{n\rightarrow \infty } \bar{\varepsilon } =E\left( {\varepsilon _i \hbox {|}x_1 ,\ldots ,x_n } \right) =0. \end{aligned}$$

Therefore, substituting the above equation into Eq. (E.47) leads to:

$$\begin{aligned} lim_{n\rightarrow \infty } \bar{y} =\beta ^{*}lim_{n\rightarrow \infty } \bar{x} +\alpha ^{*}+lim_{n\rightarrow \infty } \bar{\varepsilon } =\beta ^{*}x+\alpha ^{*}. \end{aligned}$$

\(\square \)
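
The decomposition of the full-sample mean used in this proof can be checked numerically. The sketch below assumes a hypothetical bounded, strictly increasing sequence \(x_i =1-\frac{1}{i+1}\) and an assumed cut \(g=\lfloor \sqrt{n}\rfloor \), which satisfies \(\frac{g}{n}\rightarrow 0\); neither choice comes from the paper.

```python
import math


def split_mean_check(x, g):
    """Return (x_bar, x_bar_g, head, weight) for the 1-indexed cut g, where
    x_bar = head + weight * x_bar_g, as in the decomposition of the full mean."""
    n = len(x)
    x_bar = sum(x) / n
    x_bar_g = sum(x[g - 1:]) / (n - g + 1)
    head = sum(x[:g - 1]) / n          # contribution of the first g - 1 points
    weight = (n - g + 1) / n           # tends to 1 when g / n -> 0
    return x_bar, x_bar_g, head, weight


gaps = []
for n in (100, 10_000, 1_000_000):
    # Hypothetical bounded, strictly increasing sequence x_i = 1 - 1/(i + 1).
    x = [1.0 - 1.0 / (i + 1) for i in range(1, n + 1)]
    g = math.isqrt(n)                  # assumed cut with g / n -> 0
    x_bar, x_bar_g, head, weight = split_mean_check(x, g)
    # The decomposition holds exactly, up to floating-point rounding.
    assert abs(x_bar - (head + weight * x_bar_g)) < 1e-8
    gaps.append(abs(x_bar - x_bar_g))
```

The entries of `gaps` decrease with n, consistent with the full and truncated means sharing the same limit.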

Lemma 4

If \(lim_{n\rightarrow \infty } \frac{g}{n}=0\), one has:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\beta }_g= & {} lim_{n\rightarrow \infty } \hat{\beta } =\beta =\beta ^{*}. \end{aligned}$$
$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\alpha } _g= & {} lim_{n\rightarrow \infty } \hat{\alpha } =\alpha =\alpha ^{*}. \end{aligned}$$


Proof

Here we only verify Eq. (E.48). By the same technique, one can verify Eq. (E.49).

It’s easy to check:

$$\begin{aligned} \hat{\beta }= & {} \frac{\mathop \sum \nolimits _{i=1}^n \left( {x_i -\bar{x} } \right) \left( {y_i -\bar{y} } \right) }{\mathop \sum \nolimits _{i=1}^n \left( {x_i -\bar{x} } \right) ^{2}} \nonumber \\= & {} \frac{\frac{1}{n}\mathop \sum \nolimits _{i=1}^{g-1} \left( {x_i -\bar{x} } \right) \left( {y_i -\bar{y} } \right) +\frac{1}{n}\mathop \sum \nolimits _{j=g}^n \left( {x_j -\bar{x} } \right) \left( {y_j -\bar{y} } \right) }{\frac{1}{n}\mathop \sum \nolimits _{i=1}^{g-1} \left( {x_i -\bar{x} } \right) ^{2}+\frac{1}{n}\mathop \sum \nolimits _{j=g}^n \left( {x_j -\bar{x} } \right) ^{2}}. \end{aligned}$$

Taking the limit \(n\rightarrow \infty \) in Eq. (E.50), one obtains:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\beta }= & {} \frac{lim_{n\rightarrow \infty } \frac{1}{n}\mathop \sum \nolimits _{j=g}^n \left( {x_j -\bar{x} } \right) \left( {y_j -\bar{y} } \right) }{lim_{n\rightarrow \infty } \frac{1}{n}\mathop \sum \nolimits _{j=g}^n \left( {x_j -\bar{x} } \right) ^{2}} \nonumber \\= & {} \frac{lim_{n\rightarrow \infty } \frac{1}{n}\mathop \sum \nolimits _{j=g}^n \left( {x_j -lim_{n\rightarrow \infty } \bar{x} } \right) \left( {y_j -lim_{n\rightarrow \infty } \bar{y} } \right) }{lim_{n\rightarrow \infty } \frac{1}{n}\mathop \sum \nolimits _{j=g}^n \left( {x_j -lim_{n\rightarrow \infty } \bar{x} } \right) ^{2}}, \end{aligned}$$

where we have used \(\left| {x_i } \right| <\infty \), \(\left| {y_i } \right| <\infty \) and \(lim_{n\rightarrow \infty } \frac{g}{n}=0\).

Using Lemma 3, Eq. (E.51) can be rewritten as:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\beta }= & {} \frac{lim_{n\rightarrow \infty } \frac{1}{n}\mathop \sum \nolimits _{j=g}^n \left( {x_j -lim_{n\rightarrow \infty } \bar{x} _g } \right) \left( {y_j -lim_{n\rightarrow \infty } \bar{y} _g } \right) }{lim_{n\rightarrow \infty } \frac{1}{n}\mathop \sum \nolimits _{j=g}^n \left( {x_j -lim_{n\rightarrow \infty } \bar{x} _g } \right) ^{2}} \nonumber \\= & {} lim_{n\rightarrow \infty } \frac{\mathop \sum \nolimits _{j=g}^n \left( {x_j -\bar{x} _g } \right) \left( {y_j -\bar{y} _g } \right) }{\mathop \sum \nolimits _{j=g}^n \left( {x_j -\bar{x} _g } \right) ^{2}} \nonumber \\= & {} lim_{n\rightarrow \infty } \, \hat{\beta }_g. \end{aligned}$$

Because \(g^{*}<\infty \), by the same technique for deriving Eq. (E.52), we can obtain: \(lim_{n\rightarrow \infty } \hat{\beta } =lim_{n\rightarrow \infty } \hat{\beta } _{g^{*}} \).

On the other hand, by Theorem 1 one has \(lim_{n\rightarrow \infty } \hat{\beta } =\beta ^{*}\) and by Theorem 2 one has \(lim_{n\rightarrow \infty } \hat{\beta } _{g^{*}} =\beta \). Therefore, we conclude that Eq. (E.48) holds. \(\square \)

We now prove Proposition 3.

Proof of Proposition 3

Taking the limit \(n\rightarrow \infty \) in Eq. (E.33), one obtains:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\mu } _g =lim_{n\rightarrow \infty } \bar{x} _g -\frac{lim_{n\rightarrow \infty } \bar{y} _g }{lim_{n\rightarrow \infty } \hat{\beta } _g }. \end{aligned}$$

Using Lemmas 1–4, Eq. (E.53) becomes:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\mu } _g =x-\frac{y}{\beta }. \end{aligned}$$

We already know that

$$\begin{aligned} \mu =-\frac{\alpha }{\beta }. \end{aligned}$$

Taking the limit \(n\rightarrow \infty \) in Eq. (E.29), one obtains:

$$\begin{aligned} \alpha =y-\beta x. \end{aligned}$$

Substituting Eqs. (E.55) and (E.56) into Eq. (E.54) yields:

$$\begin{aligned} lim_{n\rightarrow \infty } \hat{\mu } _g =x-\frac{y}{\beta }=\mu . \end{aligned}$$
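
Spelled out, the last equality is a one-line substitution. Writing Eq. (E.56) as \(y=\alpha +\beta x\) and noting that \(\beta =-\frac{1}{\theta }\ne 0\), one checks:

$$\begin{aligned} x-\frac{y}{\beta }=x-\frac{\alpha +\beta x}{\beta }=x-\frac{\alpha }{\beta }-x=-\frac{\alpha }{\beta }=\mu . \end{aligned}$$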

Because \(\frac{\bar{y} _g }{\hat{\beta }_g }>\delta >0\) for any n, by Lemmas 3 and 4 we have:

$$\begin{aligned} lim_{n\rightarrow \infty } \frac{\bar{y} _g }{\hat{\beta }_g }=\frac{y}{\beta }\ge \delta >0. \end{aligned}$$

Thus, by Eq. (E.57) we must conclude:

$$\begin{aligned} \mu<x<\infty , \end{aligned}$$

where we have used \(\left| {x_i } \right| <\infty \).

Since \(x_{g-1}<\mu <x_g \) or \(x_g =\mu \), by Assumption (b) we have:

$$\begin{aligned} 0\le \mu<x<\infty . \end{aligned}$$

Now we further verify that, for a given n, there is no other \( g^{\prime } \ne g \) satisfying \( x_{{g^{\prime } - 1}}< \mu < x_{{g^{\prime }}}\) or \( x_{{g^{\prime }}} = \mu \). We discuss this point in terms of two cases. First, if \( x_{{g^{\prime } - 1}}< \mu < x_{{g^{\prime }}} \) holds, we have to conclude \( x_{{g - 1}}< \mu < x_{{g^{\prime }}} \) and \( x_{{g^{\prime } - 1}}< \mu < x_{g}\). For this case, we might as well assume \( g^{\prime } > g\), which by Assumption (b) leads to \( x_{{g^{\prime }}} > x_{g}\). This means \( x_{g} \le x_{{g - 1}}\), contradicting Assumption (b). Likewise, we can refute \( g^{\prime } < g\). Second, if \(x_{{g^{\prime }}} = \mu \) and \( g^{\prime } \ne g\), then by Assumption (b) a contradiction occurs. In summary, we must conclude \( g^{\prime } = g\).

Finally, we verify \(g<\infty \). If \(g=\infty \), by \(x_{g-1}<\mu <x_g \) or \(x_g =\mu \), we have to conclude \(lim_{l\rightarrow \infty } x_l =\mu =x\), which contradicts \(\mu<x<\infty \).

Based on the results above, we should have \(lim_{n\rightarrow \infty } g=g^{*}\). To see this, we might as well assume that \(lim_{n\rightarrow \infty } g>g^{*}\). Then, by \(x_{g-1}<\mu =x_{g^{*}} <x_g \), one has \(lim_{n\rightarrow \infty } g-1=g^{*}\), which contradicts \(x_{lim_{n\rightarrow \infty } g-1} <x_{g^{*}} \), where we have used \(g<\infty \). \(\square \)

Description of data sources




Socio-Economic Database of Latin America and the Caribbean


Australian Bureau of Statistics




Statistics Canada


Hong Kong


Nepal Rastra Bank


Russian Federal State Statistics Service


Singapore Department of Statistics


Korean Statistical Information Service


National Statistical Office of Thailand





United Kingdom National Statistics


United States Census Bureau




OECD Statistics


Federated States of Micronesia


Department of Census and Statistics Ministry of Finance and Planning Sri Lanka


Bangladesh Bureau of Statistics—Ministry of Planning


Liberia Institute for Statistics and Geo-Information Services—Government of Liberia


Central Agency for Public Mobilization and Statistics (CAPMAS)—Arab Republic of Egypt


Namibia Statistics Agency


China Institute for Income Distribution


Cite this article

Tao, Y., Wu, X., Zhou, T. et al. Exponential structure of income inequality: evidence from 67 countries. J Econ Interact Coord 14, 345–376 (2019).


  • Income inequality
  • General equilibrium
  • Rawls’ fairness
  • Technological progress
  • Entropy

JEL Classification

  • D31
  • D51
  • D63
  • E14