# Decomposing socioeconomic health inequalities

DOI: 10.1007/s00038-009-0105-z

- Cite this article as:
- Speybroeck, N., Konings, P., Lynch, J. et al. Int J Public Health (2010) 55: 347. doi:10.1007/s00038-009-0105-z

### Keywords

Decomposition, Concentration index, Inequity, Health

The concentration curve plots on the *x*-axis the cumulative percentage of the sample, ranked by an indicator of socioeconomic position such as education or income, beginning with the poorest. The *y*-axis then indicates the cumulative percentage of the health outcome corresponding to each cumulative percentage of the distribution of the socioeconomic indicator. Figure 1 provides an example of a concentration curve, where the health variable is childhood malnutrition in Ghana in 2003. Because the curve lies above the diagonal, it shows that malnutrition accumulates faster among the poor than among the better-off. The relative concentration index (RCI) is defined as twice the area between the concentration curve and the line of equality (the 45° diagonal from the bottom-left corner to the top-right). Details on how to compute the RCI can be found in Konings et al. (2009).
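The geometric definition can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the function names are hypothetical, and the sign convention (negative when the curve lies above the diagonal, i.e. when the outcome is concentrated among the poor) is an assumption that matches the negative RCI reported for malnutrition later in the text.

```python
import numpy as np

def concentration_curve(health, ses):
    """Cumulative population share (x) and cumulative health share (y),
    with individuals ranked from poorest to richest by `ses`."""
    health = np.asarray(health, dtype=float)
    order = np.argsort(ses, kind="stable")        # poorest first
    h = health[order]
    n = len(h)
    pop_share = np.arange(0, n + 1) / n           # starts at 0, ends at 1
    health_share = np.concatenate(([0.0], np.cumsum(h) / h.sum()))
    return pop_share, health_share

def rci_from_curve(pop_share, health_share):
    """Twice the area between the diagonal and the curve (trapezoid rule).
    Assumed sign convention: negative if the curve is above the diagonal."""
    widths = np.diff(pop_share)
    heights = (health_share[1:] + health_share[:-1]) / 2.0
    area_under_curve = np.sum(widths * heights)
    return 1.0 - 2.0 * area_under_curve

# Toy data (invented for illustration): ill health falls as wealth rises.
x, y = concentration_curve(health=[4, 3, 2, 1], ses=[1, 2, 3, 4])
print(rci_from_curve(x, y))   # negative: ill health concentrated among the poor
```

With grouped or individual data, this trapezoid approximation and rank-based formulas can differ slightly in small samples.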

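If the health variable \( y_{i} \) of individual *i* is modeled linearly, linking *y* to a set of *k* health determinants \( x_{k} \), this can be expressed by:

\[ y_{i} = \alpha + \sum\limits_{k} \beta_{k} x_{ki} + \varepsilon_{i} \qquad (1) \]

where \( \alpha \) is the intercept, \( \beta_{k} \) are the regression coefficients, and *ε* is an error term.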

Given the relationship between \( y_{i} \) and \( x_{ki} \) in Eq. 1, the RCI for *y* can be written as (Wagstaff et al. 2003):

\[ C = \sum\limits_{k} \left( \frac{\beta_{k} \bar{x}_{k}}{\mu} \right) C_{k} + \frac{\text{GC}_{\varepsilon}}{\mu} \qquad (2) \]

with *μ* the mean of *y*, \( \bar{x}_{k} \) the mean of \( x_{k} \), and \( C_{k} \) the RCI for \( x_{k} \) (defined analogously to *C*). In the last term, \( \text{GC}_{\varepsilon} \) is the generalized concentration index for \( \varepsilon_{i} \). It can be computed as a residual or using the formula \( \text{GC}_{\varepsilon} = \frac{2}{n}\sum\nolimits_{i = 1}^{n} \varepsilon_{i} R_{i} \), with \( R_{i} \) the fractional rank in the socioeconomic distribution corresponding to \( \varepsilon_{i} \) (i.e. \( R_{i} = 1/n \) for the poorest individual and \( R_{i} = n/n \) for the richest).
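As an illustration of the formula for the generalized concentration index of the error term, the following Python sketch computes fractional ranks and the weighted sum of residuals. The function names are hypothetical, and ties in the socioeconomic indicator are ordered arbitrarily here, which is a simplifying assumption.

```python
import numpy as np

def fractional_rank(ses):
    """R_i = i/n, with i = 1 for the poorest and i = n for the richest.
    Ties are ordered arbitrarily (simplifying assumption)."""
    ses = np.asarray(ses)
    n = len(ses)
    ranks = np.empty(n, dtype=float)
    ranks[np.argsort(ses, kind="stable")] = np.arange(1, n + 1)
    return ranks / n

def generalized_ci(residuals, ses):
    """GC_eps = (2/n) * sum_i eps_i * R_i."""
    eps = np.asarray(residuals, dtype=float)
    return 2.0 / len(eps) * np.sum(eps * fractional_rank(ses))

# Toy residuals and incomes, invented for illustration.
print(fractional_rank([30, 10, 20]))                 # ranks 3/3, 1/3, 2/3
print(generalized_ci([0.5, -0.5, 0.0], [30, 10, 20]))
```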

Equation 2 shows that *C* is made up of two components. The first is a deterministic, or “explained”, component (Wagstaff et al. 2003), equaling a sum of weighted concentration indices of the explanatory variables, where the weights are \( \beta_{k} \bar{x}_{k} / \mu \). The second, a residual, or “unexplained”, component reflects the inequality in health that is left unexplained by the determinants in the model.

The first step consists of estimating the regression coefficients (\( \beta_{k} \)) in Eq. 1. The next step consists of calculating the average of the health variable and of each of the determinants (*μ* and \( \bar{x}_{k} \)). A third step requires computing the RCIs for the health variable and for the determinants, as well as the generalized concentration index of the error term (\( \text{GC}_{\varepsilon} \)). The RCI of each determinant is quantified using the computation explained in Konings et al. (2009), in which \( y_{i} \) and *μ* are now the value of the specific determinant for the *i*th individual and the average of the determinant, respectively. These three steps provide all values required in Eq. 2. The last and remaining step quantifies the “pure” contribution of each determinant included in the model to the inequality in the health variable. This absolute contribution of each determinant is computed by multiplying the “contribution weight” related to a determinant and its RCI:

\[ \left( \frac{\beta_{k} \bar{x}_{k}}{\mu} \right) C_{k} \]
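The four steps can be sketched end to end in Python. This is an illustration under stated assumptions, not the authors' `decomp` package: the function names are hypothetical, ordinary least squares stands in for the regression in Eq. 1, and the RCI is computed with the common covariance formula \( C = 2\,\text{cov}(y, R)/\mu \), which may differ from the computation in Konings et al. (2009).

```python
import numpy as np

def fractional_rank(ses):
    """R_i = i/n, poorest first; ties ordered arbitrarily (assumption)."""
    ses = np.asarray(ses)
    n = len(ses)
    ranks = np.empty(n, dtype=float)
    ranks[np.argsort(ses, kind="stable")] = np.arange(1, n + 1)
    return ranks / n

def rci(values, rank):
    """Covariance form of the concentration index (assumed formula)."""
    values = np.asarray(values, dtype=float)
    mu = values.mean()
    return 2.0 * np.mean((values - mu) * (rank - rank.mean())) / mu

def decompose(y, X, ses):
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = len(y)
    R = fractional_rank(ses)
    # Step 1: estimate the coefficients of Eq. 1 by OLS with an intercept.
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    beta = coef[1:]
    resid = y - design @ coef
    # Step 2: means of the health variable and of each determinant.
    mu = y.mean()
    xbar = X.mean(axis=0)
    # Step 3: RCIs and the generalized concentration index of the residual.
    C = rci(y, R)
    Ck = np.array([rci(X[:, k], R) for k in range(X.shape[1])])
    gc_eps = 2.0 / n * np.sum(resid * R)
    # Step 4: absolute and relative contribution of each determinant.
    absolute = beta * xbar / mu * Ck
    return {"C": C, "Ck": Ck, "beta": beta, "mu": mu,
            "absolute": absolute, "relative": absolute / C,
            "residual": gc_eps / mu}
```

Under these assumptions the explained contributions and the residual term add up to *C*, as in Eq. 2, because OLS residuals with an intercept have zero mean.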

This sequence combines a regression analysis with distributional data (from the RCI) and thus allows estimating how determinants proportionally contribute to health inequality (e.g. the poor-rich gap). As an example, the contribution of breastfeeding to childhood malnutrition inequalities in Ghana (Van de Poel et al. 2007) can be calculated by noting that the average “height for age” *z* score is *μ* = 1.58 and the average duration of breastfeeding (in months) is \( \bar{x}_{k} = 16.98 \). The RCI for malnutrition was *C* = −0.079 and the RCI for breastfeeding was \( C_{k} = -0.0042 \). The coefficient in the regression for breastfeeding was \( \beta_{k} = 0.01 \). These parameters result in a pure contribution of \( \left( \frac{\beta_{k} \bar{x}_{k}}{\mu} \right) C_{k} = \left( \frac{0.01 \times 16.98}{1.58} \right) \times (-0.0042) = -0.00045 \) and a relative contribution of breastfeeding of \( \left( \frac{\beta_{k} \bar{x}_{k}}{\mu} \right) C_{k} / C = \frac{-0.00045}{-0.079} = 0.0057 \), or 0.57%.
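The arithmetic of the breastfeeding example can be checked with a few lines of Python. The variable names are chosen here for illustration; the numbers are the ones quoted in the text.

```python
# Values quoted in the text for Ghana (Van de Poel et al. 2007).
beta_k = 0.01      # regression coefficient for breastfeeding duration
xbar_k = 16.98     # mean breastfeeding duration in months
mu = 1.58          # mean of the health variable
C_k = -0.0042      # RCI of breastfeeding duration
C = -0.079         # RCI of malnutrition

weight = beta_k * xbar_k / mu     # "contribution weight"
absolute = weight * C_k           # pure (absolute) contribution
relative = absolute / C           # share of total inequality

print(round(absolute, 5))         # -0.00045
print(round(100 * relative, 2))   # 0.57 (percent)
```

The contribution weight is multiplied by the determinant's RCI, not added to it, which is why the result is so small for a variable that is distributed almost equally across socioeconomic groups.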

Note that the 0.57% differs slightly from the value reported in Van de Poel et al. (2007) because of rounding. A negative contribution means that the effect of the explanatory variable on health (i.e. the regression coefficient), combined with the distribution of that variable over economic status, “lowers” socioeconomic inequality in health, favoring the poor. For example, age makes a negative contribution to malnutrition inequalities in Ghana (Van de Poel et al. 2007) because older children are both more likely to be malnourished and, probably due to higher child mortality rates among poorer households, more prevalent in the richer wealth quintiles.

Of course, the extent to which the selected determinants provide a well-specified model for the underlying data may affect the decomposition. The method presented here is only one analytical approach to decomposing socioeconomic inequalities in health; other approaches, like the Oaxaca decomposition, can also be used for analyzing inequalities (Van de Poel and Speybroeck 2009).

Most of today’s software for inequality in health is written in Stata code (O’Donnell et al. 2008), but there is no publicly available implementation that allows a decomposition of the RCI using generalized linear models. The methods described above are now implemented in an R package called decomp (available from the authors upon request). R itself is free of charge (http://www.r-project.org). The code, including methods for bootstrapping confidence intervals, is demonstrated in an “Appendix”.
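The Appendix and the decomp package are not reproduced here. As a rough idea of what bootstrapping a confidence interval can look like, the following Python sketch resamples individuals with replacement and takes percentile limits. It is an assumption-laden illustration, not the authors' R code: the function names are hypothetical, the percentile method is only one of several bootstrap variants, and the RCI uses the covariance formula \( 2\,\text{cov}(y, R)/\mu \).

```python
import numpy as np

def fractional_rank(ses):
    """R_i = i/n, poorest first; ties (frequent in resamples) are
    ordered arbitrarily, which is a simplification."""
    ses = np.asarray(ses)
    n = len(ses)
    ranks = np.empty(n, dtype=float)
    ranks[np.argsort(ses, kind="stable")] = np.arange(1, n + 1)
    return ranks / n

def rci(values, ses):
    """Covariance form of the concentration index (assumed formula)."""
    values = np.asarray(values, dtype=float)
    rank = fractional_rank(ses)
    mu = values.mean()
    return 2.0 * np.mean((values - mu) * (rank - rank.mean())) / mu

def bootstrap_ci(statistic, health, ses, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for statistic(health, ses)."""
    health = np.asarray(health, dtype=float)
    ses = np.asarray(ses, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(health)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)     # resample individuals
        draws[b] = statistic(health[idx], ses[idx])
    alpha = (1.0 - level) / 2.0
    lo, hi = np.percentile(draws, [100 * alpha, 100 * (1 - alpha)])
    return lo, hi
```

The same resampling loop could wrap a full decomposition so that each determinant's contribution receives its own interval, provided the regression is re-estimated within every resample.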

Decomposition results allow policy makers to move from tackling average health problems (the “level approach”) to tackling inequalities of health (the “gap approach”). Results in Ghana (Van de Poel et al. 2007) suggested that factors strongly associated with average malnutrition are not necessarily contributing to relative socioeconomic inequality in malnutrition. Variables such as duration of breastfeeding can be quite strongly associated with a child’s nutritional status, but do not contribute to socioeconomic inequality in malnutrition, because of a relatively equal distribution across socioeconomic groups.

A decomposition analysis allows a clear understanding of how factors affect inequality, i.e. through the more unequal distribution of determinants or through the greater association of determinants with health. Policies trying to reduce average bad health can be different from those aiming at lowering socioeconomic inequality in bad health. Such inequalities are often determined not only by health system functions, but also by factors beyond the scope of health authorities, requiring a multisectoral approach to reduce inequality in health across society.

## Acknowledgments

This paper resulted from cooperation initiated through meetings organized by the WHO European Office for Investment for Health and Development, Venice, Italy.