International Journal of Public Health, Volume 55, Issue 4, pp 347–351

Decomposing socioeconomic health inequalities


    • Institute of Tropical Medicine
    • Institute of Health and Society, Université Catholique de Louvain
  • Peter Konings
    • Department of Electrical Engineering, SCD (SISTA)
  • John Lynch
    • Sansom Institute, University of South Australia
    • Department of Social Medicine, University of Bristol
  • Sam Harper
    • Department of Epidemiology, Biostatistics, and Occupational Health, McGill University
  • Dirk Berkvens
    • Institute of Tropical Medicine
  • Vincent Lorant
    • Institute of Health and Society, Université Catholique de Louvain
  • Andrea Geckova
    • Kosice Institute for Society and Health
  • Ahmad Reza Hosseinpoor
    • World Health Organization
Hints & Kinks

DOI: 10.1007/s00038-009-0105-z

Cite this article as:
Speybroeck, N., Konings, P., Lynch, J. et al. Int J Public Health (2010) 55: 347. doi:10.1007/s00038-009-0105-z


Keywords: Decomposition; Concentration index; Inequity; Health
This Hints & Kinks paper describes a technique for quantifying the contributions of determinants to socioeconomic inequality in health. This technique, which differs from an analysis of the determinants of average health levels, has received considerable attention from health economists (van Doorslaer and Gerdtham 2003; van Doorslaer and Jones 2003; van Doorslaer et al. 2004; Wagstaff et al. 2003) but only more recently from epidemiologists (Harper and Lynch 2007; Lynch 2006; Hosseinpoor et al. 2006).

This paper employs the relative concentration index (RCI), described in Konings et al. (2009), to summarize relative inequality across the entire socioeconomic distribution. The RCI of a continuous health outcome y is derived from a relative concentration curve. The x-axis of this curve shows the cumulative percentage of the sample, ranked by an indicator of socioeconomic position such as education or income, beginning with the poorest. The y-axis shows the cumulative percentage of the health outcome corresponding to each cumulative percentage of the distribution of the socioeconomic indicator. Figure 1 provides an example of a concentration curve, where the health variable is childhood malnutrition in Ghana in 2003. Because the curve lies above the diagonal, malnutrition accumulates faster among the poor than among the better-off. The RCI is defined as twice the area between the concentration curve and the line of equality (the 45° diagonal from the bottom-left corner to the top-right); by convention it is negative when the curve lies above the diagonal. Details on how to compute the RCI can be found in Konings et al. (2009).
Fig. 1

The relative concentration curve: an example with malnutrition (Ghana, 2003; Source Demographic and Health Survey). The relative concentration index equals 2 × the area between the 45° line and the relative concentration curve = A/(A + B)
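The definition of the RCI can be illustrated with a minimal Python sketch. It is not the computation of Konings et al. (2009) or of the decomp package; the function names are hypothetical, NumPy is assumed, and ties in the socioeconomic indicator are simply broken by input order. The sketch computes the index in two ways: with the covariance form, 2 × cov(y, R)/μ, and as one minus twice the trapezoidal area under the concentration curve. Both carry the sign convention in which a curve above the diagonal gives a negative index.

```python
import numpy as np

def fractional_rank(ses):
    """Fractional rank R_i = i/n: 1/n for the poorest, 1 for the richest."""
    order = np.argsort(np.asarray(ses, dtype=float), kind="stable")
    rank = np.empty(len(order))
    rank[order] = np.arange(1, len(order) + 1) / len(order)
    return rank

def concentration_index(y, ses):
    """Relative concentration index via 2 * cov(y, R) / mean(y)."""
    y = np.asarray(y, dtype=float)
    r = fractional_rank(ses)
    return 2.0 * np.mean((y - y.mean()) * (r - r.mean())) / y.mean()

def concentration_index_area(y, ses):
    """Same index as 1 - 2 * (trapezoidal area under the concentration curve)."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(np.asarray(ses, dtype=float), kind="stable")
    # Cumulative share of the health outcome, starting from the poorest.
    curve = np.concatenate(([0.0], np.cumsum(y[order]) / y.sum()))
    area = np.sum((curve[:-1] + curve[1:]) / 2.0) / len(y)
    return 1.0 - 2.0 * area

# Toy data (invented): ill health falls as socioeconomic position rises.
ses_toy = [1, 2, 3, 4]
ill_health_toy = [4, 3, 2, 1]
print(concentration_index(ill_health_toy, ses_toy))       # negative: concentrated among the poor
print(concentration_index_area(ill_health_toy, ses_toy))  # same value from the area definition
```

The covariance form is convenient because it is unchanged when a constant is added to every rank, so the choice between R_i = i/n and other common fractional-rank definitions does not affect the result.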

If a health variable y is modeled as a linear function of a set of k health determinants xk, the value yi for individual i can be expressed as:
$$ y_{i} = \alpha + \sum\limits_{k} {\beta_{k} x_{ki} } + \varepsilon_{i} \qquad (1) $$
where εi is an error term.
Given the relationship between yi and xki in Eq. 1, the RCI for y can be written as (Wagstaff et al. 2003):
$$ {\text{RCI}} = \sum\limits_{k} {\left( {{\frac{{\beta_{k} \bar{x}_{k} }}{\mu }}} \right)C_{k} } + {\frac{{{\text{GC}}_{\varepsilon } }}{\mu }} = C_{{\hat{y}}} + {\frac{{{\text{GC}}_{\varepsilon } }}{\mu }} \qquad (2) $$
with μ the mean of y, \( \bar{x}_{k} \) the mean of xk, Ck the RCI for xk (defined analogously to C, the RCI for y), and \( C_{\hat{y}} \) the RCI of the predicted values. In the last term, GCε is the generalized concentration index for εi. It can be computed as a residual or using the formula \( {\text{GC}}_{\varepsilon } = {\frac{2}{n}}\sum\nolimits_{i = 1}^{n} {\varepsilon_{i} R_{i} } , \) with n the sample size and Ri the fractional rank in the socioeconomic distribution corresponding to εi (i.e. Ri = 1/n for the poorest individual and Ri = n/n = 1 for the richest).

Equation 2 shows that C is made up of two components. The first is a deterministic, or “explained”, component (Wagstaff et al. 2003), equal to a weighted sum of the concentration indices of the explanatory variables, where the weights are \( \beta_{k} \bar{x}_{k} / \mu \). The second, a residual or “unexplained” component, reflects the inequality in health that is left unexplained by the determinants in the model.

A decomposition analysis can be conducted in a stepwise manner (Hosseinpoor et al. 2006). The first step is to estimate a regression model of the health variable to obtain the coefficients of the explanatory variables (βk) in Eq. 1. The second step is to calculate the mean of the health variable and of each determinant (μ and \( \bar{x}_{k} \)). The third step is to compute the RCIs for the health variable and for the determinants, as well as the generalized concentration index of the error term (GCε). The RCI of each determinant is obtained with the computation explained in Konings et al. (2009), with the determinant taking the place of the health variable: yi and μ are replaced by the value of the determinant for the ith individual and the mean of the determinant, respectively. These three steps provide all values required in Eq. 2. The final step quantifies the “pure” contribution of each determinant included in the model to the inequality in the health variable. This absolute contribution is computed by multiplying the “contribution weight” of the determinant by its RCI:
$$ \left( {\left( {{\frac{{\beta_{k} \bar{x}_{k} }}{\mu }}} \right)C_{k} } \right). $$
The relative (proportional) contribution of each determinant is then obtained by dividing its absolute contribution by the RCI of the health outcome:
$$ {\frac{{\left( {{\frac{{\beta_{k} \bar{x}_{k} }}{\mu }}} \right)C_{k} }}{C}}. $$
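The four steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the decomp R package: the function names `fractional_rank`, `rci` and `decompose_rci` are hypothetical, the regression is an ordinary least squares fit with an intercept, NumPy is assumed, and the health variable and every determinant are assumed to have a non-zero mean.

```python
import numpy as np

def fractional_rank(ses):
    """Fractional rank R_i = i/n in the socioeconomic distribution."""
    order = np.argsort(np.asarray(ses, dtype=float), kind="stable")
    rank = np.empty(len(order))
    rank[order] = np.arange(1, len(order) + 1) / len(order)
    return rank

def rci(v, rank):
    """Relative concentration index of v, as 2 * cov(v, R) / mean(v)."""
    return 2.0 * np.mean((v - v.mean()) * (rank - rank.mean())) / v.mean()

def decompose_rci(y, X, ses):
    """Decompose the RCI of y into contributions of the columns of X."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    rank = fractional_rank(ses)

    # Step 1: linear regression (Eq. 1) to obtain the coefficients beta_k.
    design = np.column_stack([np.ones(len(y)), X])
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    beta = coef[1:]
    resid = y - design @ coef

    # Step 2: means of the health variable and of each determinant.
    mu = y.mean()
    xbar = X.mean(axis=0)

    # Step 3: RCIs of y and of each determinant, and GC of the residuals.
    c_y = rci(y, rank)
    c_k = np.array([rci(X[:, k], rank) for k in range(X.shape[1])])
    gc_eps = 2.0 * np.mean(resid * rank)  # (2/n) * sum(eps_i * R_i)

    # Step 4: absolute and relative contributions of each determinant.
    absolute = (beta * xbar / mu) * c_k
    return {
        "C": c_y,
        "absolute": absolute,
        "relative": absolute / c_y,
        "residual": gc_eps / mu,
    }
```

Calling `decompose_rci(y, X, ses)` with a vector `y`, a matrix `X` holding one determinant per column, and a socioeconomic indicator `ses` returns the terms of Eq. 2. With an intercept in the regression the residuals average zero, so the absolute contributions plus the residual term add up to C apart from floating-point error.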

This sequence combines a regression analysis with distributional data (from the RCI) and thus allows estimating how determinants proportionally contribute to health inequality (e.g. the poor-rich gap). As an example, the contribution of breastfeeding to childhood malnutrition inequalities in Ghana (Van de Poel et al. 2007) can be calculated by noting that the average “height for age” z score is μ = 1.58 and the average duration of breastfeeding (in months) is \( \bar{x}_{k} = 16.98 \). The RCI for malnutrition was C = −0.079 and the RCI for breastfeeding was Ck = −0.0042. The coefficient in the regression for breastfeeding was \( \beta_{k} = 0.01 \). These parameters result in a pure contribution of \( \left( {{\frac{{\beta_{k} \bar{x}_{k} }}{\mu }}} \right)C_{k} = \left( {{\frac{0.01 \times 16.98}{1.58}}} \right) \times ( - 0.0042) = - 0.00045 \) and a relative contribution of breastfeeding of \( \left( {{\frac{{\beta_{k} \bar{x}_{k} }}{\mu }}} \right)C_{k} / C = {\frac{ - 0.00045}{ - 0.079}} = 0.0057 \), or 0.57%.
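The arithmetic of the breastfeeding example can be checked with a few lines of Python. The variable names are illustrative only, and C is taken as −0.079, the value used in the denominator of the relative contribution.

```python
# Values quoted in the text for breastfeeding and malnutrition in Ghana.
beta_k = 0.01     # regression coefficient for breastfeeding duration
xbar_k = 16.98    # mean breastfeeding duration (months)
mu = 1.58         # mean of the health variable
c_k = -0.0042     # RCI of breastfeeding duration
c = -0.079        # RCI of malnutrition (sign as in the denominator used in the text)

weight = beta_k * xbar_k / mu   # "contribution weight"
absolute = weight * c_k         # absolute ("pure") contribution
relative = absolute / c         # relative (proportional) contribution

print(f"{absolute:.5f} {100 * relative:.2f}%")  # prints "-0.00045 0.57%"
```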

Note that the 0.57% differs slightly from the value reported by Van de Poel et al. (2007) because of rounding. A negative relative contribution means that the effect of the explanatory variable on health (i.e. the regression coefficient), combined with the distribution of that variable over economic status, acts to “lower” socioeconomic inequality in health, favoring the poor. For example, age makes a negative contribution to malnutrition inequalities in Ghana (Van de Poel et al. 2007) because older children are both more likely to be malnourished and, probably due to higher child mortality rates among poorer households, more prevalent in the richer wealth quintiles.

The aforementioned methodology applies to continuous health outcomes. Non-normally distributed health outcomes require some modifications. A binary health outcome, death for example, requires a logit model, which is non-linear in the probability of death but linear in the natural logarithm of the odds of death. Moreover, since we are concerned with describing the inequality in predicted death, given the observed values of the determinants, attention is limited to the first term in Eq. 2, i.e. the predicted inequality measured by \( C_{\hat{y}} \), which is written as:
$$ C_{{\hat{y}}} = \sum\limits_{k} {\left( {{\frac{{\beta_{k} \bar{x}_{k} }}{\mu }}} \right)C_{k} } $$
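One possible reading of this equation can be sketched in Python. The sketch assumes that \( \hat{y} \) is the linear predictor of the logit model (the predicted log odds) and that μ is its mean; the text does not state this explicitly, and other treatments of non-linear models exist. The Newton-Raphson fit is included only to keep the example self-contained, where a statistics library would normally be used. All function names are hypothetical and NumPy is assumed.

```python
import numpy as np

def fit_logit(X, d, n_iter=25):
    """Logistic regression by Newton-Raphson; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(d)), X])
    coef = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(design @ coef)))
        grad = design.T @ (d - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        coef = coef + np.linalg.solve(hess, grad)
    return coef

def ci(v, rank):
    """Relative concentration index of v, as 2 * cov(v, R) / mean(v)."""
    return 2.0 * np.mean((v - v.mean()) * (rank - rank.mean())) / v.mean()

def decompose_log_odds(d, X, rank):
    """Decompose the concentration index of the predicted log odds."""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    coef = fit_logit(X, d)
    beta = coef[1:]
    eta = coef[0] + X @ beta              # predicted log odds (assumed y-hat)
    mu = eta.mean()                       # mean of the predicted log odds
    c_k = np.array([ci(X[:, k], rank) for k in range(X.shape[1])])
    contributions = (beta * X.mean(axis=0) / mu) * c_k
    return ci(eta, rank), contributions
```

Because the log odds are exactly linear in the determinants, the contributions returned by `decompose_log_odds` sum to the concentration index of the predicted log odds with no residual term. The weights become unstable when the mean of the predicted log odds is close to zero.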
For example, Hosseinpoor and colleagues (2006) decomposed the RCI of infant mortality in Iran into its determining factors, as shown in Fig. 2. While 36% of income-related inequality in infant mortality in Iran was due to economic inequality per se, 64% arose because important risk factors for infant mortality, such as maternal illiteracy and access to a hygienic toilet, were strongly correlated with income. The utility of the decomposition exercise is thus that it links monitoring health inequality with understanding its determinants (Harper and Lynch 2007).
Fig. 2

The size and sign of the different contributions to the socio-economic inequality of infant mortality in Iran (Hosseinpoor et al. 2006)

Of course, the extent to which the selected determinants provide a well-specified model for the underlying data may affect the decomposition. The method presented here is only one analytical approach to decompose socioeconomic inequalities in health and other approaches, like the Oaxaca decomposition, can be used for analyzing inequalities (Van de Poel and Speybroeck 2009).

Most of today’s software for inequality in health is written in Stata code (O’Donnell et al. 2008), but no publicly available implementation allows a decomposition of the RCI using generalized linear models. The methods described above are now implemented in an R package called decomp (available from the authors upon request). The R program is free of charge. We demonstrate code, including methods for bootstrapping confidence intervals, in an “Appendix”.

Decomposition results allow policy makers to move from tackling average health problems (the “level approach”) to tackling inequalities of health (the “gap approach”). Results in Ghana (Van de Poel et al. 2007) suggested that factors strongly associated with average malnutrition are not necessarily contributing to relative socioeconomic inequality in malnutrition. Variables such as duration of breastfeeding can be quite strongly associated with a child’s nutritional status, but do not contribute to socioeconomic inequality in malnutrition, because of a relatively equal distribution across socioeconomic groups.

A decomposition analysis allows a clear understanding of how factors affect inequality, i.e. through the more unequal distribution of determinants or through the greater association of determinants with health. Policies trying to reduce average bad health can be different from those aiming at lowering socioeconomic inequality in bad health. Such inequalities are often determined not only by health system functions, but also by factors beyond the scope of health authorities, requiring a multisectoral approach to reduce inequality in health across society.


This paper resulted from cooperation initiated through meetings organized by the WHO European Office for Investment for Health and Development, Venice, Italy.

Copyright information

© Birkhäuser Verlag, Basel/Switzerland 2010