
Testing the relationship between income inequality and life expectancy: a simple correction for the aggregation effect when using aggregated data

Abstract


In this paper, we show a simple correction for the aggregation effect when testing the relationship between income inequality and life expectancy using aggregated data. While there is evidence for a negative correlation between income inequality and a population’s average life expectancy, it is not clear whether this is due to an aggregation effect based on a non-linear relationship between income and life expectancy or to income inequality being a health hazard in itself. The proposed correction method is general and independent of measures of income inequality, functional form assumptions of the health production function, and assumptions on the income distribution. We apply it to data from the Human Development Report and find that the relationship between income inequality and life expectancy can be explained entirely by the aggregation effect. Hence, there is no evidence that income inequality itself is a health hazard.
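The aggregation effect described above can be illustrated with a small numerical sketch. Nothing in it comes from the paper's data: the individual-level function `life_expectancy`, its coefficients `ALPHA` and `BETA`, and both income vectors are hypothetical, chosen only to show that two populations with the same mean income but different income spreads differ in average life expectancy even though inequality does not enter the individual-level function.

```python
import math

# Hypothetical coefficients of a concave (logarithmic) individual-level relation.
ALPHA = 20.0
BETA = 6.0


def life_expectancy(income):
    """Illustrative individual-level life expectancy as a function of income."""
    return ALPHA + BETA * math.log(income)


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Two hypothetical populations with the same mean income (30,000).
equal_incomes = [30_000, 30_000, 30_000, 30_000]
unequal_incomes = [5_000, 15_000, 40_000, 60_000]

avg_le_equal = mean([life_expectancy(y) for y in equal_incomes])
avg_le_unequal = mean([life_expectancy(y) for y in unequal_incomes])

# Difference in average life expectancy caused purely by aggregation.
aggregation_gap = avg_le_unequal - avg_le_equal

# E[log(y)] - log(E[y]) for the unequal population (negative by Jensen's inequality).
theta = mean([math.log(y) for y in unequal_incomes]) - math.log(mean(unequal_incomes))

print(f"average life expectancy, equal incomes:   {avg_le_equal:.3f}")
print(f"average life expectancy, unequal incomes: {avg_le_unequal:.3f}")
print(f"gap: {aggregation_gap:.3f}, BETA * theta: {BETA * theta:.3f}")
```

In this sketch the gap equals `BETA * theta`, where `theta` has the same form as the logarithmic-case term \(E[\text{log}(y_{ik})] - \text{log}(E[y_{ik}])\) derived in Appendix 1.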




Notes

  1. Wagstaff and van Doorslaer (2000) use the term income inequality hypothesis instead. In their work, the relative income hypothesis is defined by health depending “on the deviation of the individual’s income from the population mean income” (ibid., p. 547). Because of the ambiguous use of the terminology in the literature and the popularity of the term relative income hypothesis, we employ the latter.

  2. For an overview, see Wagstaff and van Doorslaer (2000), Subramanian and Kawachi (2004), Lynch et al. (2004), as well as Wilkinson and Pickett (2006).

  3. We are grateful to Hugh Gravelle who proposed this exposition on the following two pages.

  4. In our applications below, this will basically be a test on assumptions of the income distribution and the health production function. In principle, the question of the appropriate assumed income distribution can be settled before estimating Eq. 3 and the test of equality of the parameters is mainly on the functional form assumptions of the life expectancy production function.

  5. Apart from the finding of Pinkovski and Sala-i-Martin that the log-normal distribution provides a good approximation to income data, this possibility of deriving s from the Gini index (which is reported in official data) is the other important reason for using this distribution. Chotikapanich et al. (2007) assume a generalised beta distribution which, however, requires more than one parameter to retrieve the standard deviation of log income from the Gini index.

  6. For an example from another field in which the Gini index is employed, here migration economics, see Chojnicki et al. (2011).

  7. Both lines are predicted values after the two regressions, where we set the Gini index to its mean value of 40.66 and insert different values of income between 0 and 60,000.

  8. According to the results in column (5), the effect of income is positive as long as 1.378 × Income/1,000 − 0.020 × (Income/1,000)² > 0, which holds for Income/1,000 < 68.9. The marginal effect of income is positive as long as 1.378 − 0.020 × 2 × Income/1,000 > 0, which holds for Income/1,000 < 34.45.

  9. Note that in our model, the shape of the log-normal distribution depends on two scale parameters: average log income and the standard deviation of log income/the Gini index. Since both parameters differ between countries, the shape of the log-normal distributions will also vary.

  10. In a recent study, Hupfeld (2011) finds that the relationship between income and life expectancy is non-monotonic (convex) for pensioners in the public pension system in Germany. However, the data used have at least two drawbacks. The first is that one cannot identify subjects who were self-employed or civil servants for some time, where low-pension claims might be due not to low income but to other labor conditions. Higher life expectancy in lower income deciles may therefore be an artefact. The second drawback is that the highest income decile is right-censored. Hence, higher life expectancy in the highest decile could also be due to an artefact. Both drawbacks can lead to a rather more convex than concave relationship between income and life expectancy.

  11. The same holds for share of GDP spent on health care or education.
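Note 5 mentions that s, the standard deviation of log income, can be derived from the Gini index under a log-normal income distribution. A minimal sketch of that step follows. It assumes the standard log-normal result G = 2Φ(s/√2) − 1, where Φ is the standard normal distribution function; this formula is not quoted in the text above, and the function names `s_from_gini` and `gini_from_s` are hypothetical. The input 40.66 is the mean Gini index reported in note 7, rescaled from the 0 to 100 scale to the unit interval.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()


def s_from_gini(gini):
    """Standard deviation of log income implied by a Gini coefficient in (0, 1),
    assuming log-normally distributed income: G = 2 * Phi(s / sqrt(2)) - 1."""
    if not 0.0 < gini < 1.0:
        raise ValueError("gini must lie strictly between 0 and 1")
    return sqrt(2.0) * _STD_NORMAL.inv_cdf((gini + 1.0) / 2.0)


def gini_from_s(s):
    """Inverse mapping: Gini coefficient implied by s under log-normality."""
    return 2.0 * _STD_NORMAL.cdf(s / sqrt(2.0)) - 1.0


# Mean Gini index from note 7, converted from the 0-100 scale.
s_at_mean_gini = s_from_gini(40.66 / 100.0)
print(f"s implied by a Gini index of 40.66: {s_at_mean_gini:.3f}")
```

Because the mapping is one-to-one, a single reported Gini index is enough to pin down s, which is the property note 5 contrasts with the generalised beta distribution.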


References

  • Aitchison J, Brown JAC (1981) The lognormal distribution with special reference to its uses in economics. Cambridge University Press, Cambridge

  • Chojnicki X, Docquier F, Ragot L (2011) Should the US have locked heaven’s door? J Popul Econ 24:317–359

  • Chotikapanich D, Valenzuela R, Prasada Rao DS (1997) Global and regional inequality in the distribution of income: estimation with limited and incomplete data. Empir Econ 22:533–546

  • Chotikapanich D, Griffiths WE, Prasada Rao DS (2007) Estimating and combining national income distributions using limited data. JBES 25:97–109

  • Cowell FA (2011) Measuring inequality. LSE perspectives in economic analysis. Oxford University Press, Oxford

  • Deaton A (2003) Health, inequality, and economic development. JEL 41:113–158

  • Frazis H, Stewart J (2011) How does household production affect measured income inequality? J Popul Econ 24:3–22

  • Gravelle H (1998) How much of the relation between population mortality and unequal distribution of income is a statistical artefact? BMJ 316:382–385

  • Gravelle H, Wildman J, Sutton M (2002) Income, income inequality and health: what can we learn from aggregate data? Soc Sci Med 54:577–589

  • Hey JD, Lambert PJ (1980) Relative deprivation and the Gini coefficient: comment. Q J Econ 95:567–573

  • Hupfeld S (2011) Non-monotonicity in the longevity–income relationship. J Popul Econ 24:191–211

  • Leon-Gonzalez R, Tseng FM (2011) Socio-economic determinants of mortality in Taiwan: combining individual and aggregate data. Health Policy 99:23–36

  • Lynch J, Smith GD, Harper S, Hillemeier M, Ross N, Kaplan GA, Wolfson M (2004) Is income inequality a determinant of population health? Part 1. A systematic review. Milbank Q 82:5–99

  • Pinkovski M, Sala-i-Martin X (2009) Parametric estimations of the world distribution of income. NBER Working Paper 15433

  • Preston SH (1975) The changing relation between mortality and level of economic development. Pop Stud-J Demog 29:231–248

  • Rodgers GB (1979) Income and inequality as determinants of mortality: an international cross-section analysis. Pop Stud-J Demog 33:343–351. Reprint 2002 in: Int J Epidemiol 31:533–538

  • Sala-i-Martin X (2006) The world distribution of income: falling poverty and . . . convergence, period. Q J Econ 121:351–397

  • Smith JP (1999) Healthy body and thick wallets: the dual relation between health and economic status. JEP 13:145–166

  • Subramanian SV, Kawachi I (2004) Income inequality and health: what have we learned so far? Epidemiol Rev 26:78–91

  • United Nations (2009) Human development report 2009—overcoming barriers: human mobility and development. United Nations Development Programme, New York et al. Accessed 20 Jun 2013

  • Waldmann RJ (1992) Income distribution and infant mortality. Q J Econ 107:1284–1302

  • Wagstaff A, van Doorslaer E (2000) Income inequality and health: what does the literature tell us? Annu Rev Public Health 21:543–567

  • Wildman J, Gravelle H, Sutton M (2003) Health and income inequality: attempting to avoid the aggregation problem. Appl Econ 35:999–1004

  • Wilkinson RG (2002) Commentary: liberty, fraternity, equality. Int J Epidemiol 31:538–543

  • Wilkinson RG (1996) Unhealthy societies—the afflictions of inequality. Reprint (2003). Routledge, London

  • Wilkinson RG, Pickett KE (2006) Income inequality and population health: a review and explanation of the evidence. Soc Sci Med 62:1768–1784

  • Wolfson M, Kaplan G, Lynch J, Ross N, Backlund E (1999) Relation between income inequality and mortality: empirical demonstration. BMJ 319:953–957

  • Yitzhaki S (1979) Relative deprivation and the Gini coefficient. Q J Econ 93:321–324



Acknowledgments

We are grateful to Hugh Gravelle and an anonymous referee, who greatly helped to improve the paper. We also thank Stefan Felder, Miriam Krieger, as well as the participants of the international conference “Health, Happiness, Inequality” (2010) in Darmstadt, Germany, the international conference “Distributive Justice in the Health System—Theory and Empirics” (2009) in Halle, Germany, and the 2009 Annual Meeting of the German Health Economics Association (dggö) in Hanover, Germany, for their very useful comments. All remaining errors are our own.

Author information



Corresponding author

Correspondence to Thomas Mayrhofer.

Additional information

Responsible editor: Alessandro Cigno


Appendix 1

Derivation of the correction factors

Let m and s be the mean and standard deviation of the natural logarithm of \(y_{ik}\), and let \(y_{ik}\) follow a log-normal distribution. Then \(\text{log}(y_{ik})\) follows a normal distribution. It is well known that:

  • \(E[\text {log}(y_{ik})] = m\)

  • \(\text {Var}[\text {log}(y_{ik})] = s^{2}\)

  • \(E[y_{ik}] = e^{m+\frac {s^{2}}{2}}\)

  • \(\text {Var}[y_{ik}] = e^{2m+s^{2}}\left (e^{s^{2}}-1\right )\).

Hence, in the logarithmic case:

$$\theta_{k} = E[\text{log}(y_{ik})] - \text{log}(E[y_{ik}]) = m - \text{log}\left(e^{m+\frac{s^{2}}{2}}\right) = m - \left(m + \frac{s^{2}}{2}\right) = -\frac{s^{2}}{2}. $$

In the quadratic case:

$$\theta_{k} = E\left[y_{ik}^{2}\right] - E[y_{ik}]^{2} = \text{Var}[y_{ik}] = e^{2m+s^{2}}\left(e^{s^{2}}-1\right) $$

according to the computational formula for the variance. To compute m, we use \(E[y_{ik}] = e^{m+\frac {s^{2}}{2}} \Leftrightarrow m = \text {log}(E[y_{ik}])- \frac {s^{2}}{2}\) and the log of the mean income for \(\text {log}(E[y_{ik}])\).
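The derivation above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names `lognormal_location`, `theta_log`, and `theta_quadratic` are hypothetical, as are the example inputs (a mean income of 10,000 and s = 0.75). The functions only evaluate the closed-form expressions given in this appendix.

```python
import math


def lognormal_location(mean_income, s):
    """m = log(E[y]) - s^2 / 2, with the log of mean income used for log(E[y])."""
    return math.log(mean_income) - s ** 2 / 2.0


def theta_log(s):
    """Logarithmic case: E[log(y)] - log(E[y]) = -s^2 / 2."""
    return -(s ** 2) / 2.0


def theta_quadratic(mean_income, s):
    """Quadratic case: E[y^2] - E[y]^2 = Var[y] = exp(2m + s^2) * (exp(s^2) - 1)."""
    m = lognormal_location(mean_income, s)
    return math.exp(2.0 * m + s ** 2) * (math.exp(s ** 2) - 1.0)


# Hypothetical inputs, for illustration only.
example_mean_income = 10_000.0
example_s = 0.75

print("m:", lognormal_location(example_mean_income, example_s))
print("theta (logarithmic case):", theta_log(example_s))
print("theta (quadratic case):", theta_quadratic(example_mean_income, example_s))
```

Since \(e^{2m+s^{2}} = E[y_{ik}]^{2}\), the quadratic-case factor can equivalently be written as the squared mean income times \(e^{s^{2}}-1\), so both correction factors depend only on mean income and s.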

Appendix 2

Table 4 Robustness checks: regression results, log income


About this article

Cite this article

Mayrhofer, T., Schmitz, H. Testing the relationship between income inequality and life expectancy: a simple correction for the aggregation effect when using aggregated data. J Popul Econ 27, 841–856 (2014).



Keywords

  • Income inequality
  • Life expectancy
  • Aggregation effect

JEL Classification

  • D31
  • I10
  • O15