Assessing Changes in Household Socioeconomic Status in Rural South Africa, 2001–2013: A Distributional Analysis Using Household Asset Indicators

Understanding the distribution of socioeconomic status (SES) and its temporal dynamics within a population is critical to ensure that policies and interventions adequately and equitably contribute to the well-being and life chances of all individuals. This study assesses the dynamics of SES in a typical rural South African setting over the period 2001–2013 using data on household assets from the Agincourt Health and Demographic Surveillance System. Three SES indices, an absolute index, principal component analysis index and multiple correspondence analysis index, are constructed from the household asset indicators. Relative distribution methods are then applied to the indices to assess changes over time in the distribution of SES with special focus on location and shape shifts. Results show that the proportion of households that own assets associated with greater modern wealth has substantially increased over time. In addition, relative distributions in all three indices show that the median SES index value has shifted up and the distribution has become less polarized and is converging towards the middle. However, the convergence is larger from the upper tail than from the lower tail, which suggests that the improvement in SES has been slower for poorer households. The results also show persistent ethnic differences in SES with households of former Mozambican refugees being at a disadvantage. From a methodological perspective, the study findings demonstrate the comparability of the easy-to-compute absolute index to other SES indices constructed using more advanced statistical techniques in assessing household SES.


Introduction
An individual's or group's position within a hierarchical social structure, known as socioeconomic status (SES), influences access to and control over desired resources, including knowledge, money, power, prestige, and beneficial social connections, which shape well-being and life chances (Link and Phelan 1995; Mueller and Parcel 1981; Link and Phelan 2005; Link et al. 2008; Phelan et al. 2010). Therefore, it is important to understand the distribution of SES and its temporal dynamics within a population to ensure that policies and interventions adequately and equitably contribute to the well-being and life chances of all individuals.
In low- and middle-income settings, one of the most widely used measures of SES is a composite index constructed from a list of household asset items (Ataguba et al. 2011; Barros et al. 2010; Gwatkin et al. 2007; Hong and Mishra 2011; Hosseinpoor et al. 2006; Minujin and Delamonica 2004; Nkonki et al. 2011; Uthman 2009; Van de Poel et al. 2008; Ziraba et al. 2009). The index is often called a "wealth index" or "asset index" (Howe et al. 2012), and the household asset items from which it is derived include durable goods, housing characteristics, sanitation and access to services. Balen et al. (2010), Howe et al. (2009, 2012), Montgomery et al. (2000) and Sahn and Stifel (2003) have outlined the theoretical basis for preferring the asset index as a measure of SES in low- and middle-income settings over "direct" measures such as income, expenditure, and financial assets (e.g., savings and pensions). Supporting reasons range from reliability to time and cost effectiveness. For example, the information required to construct the asset index is relatively easy and inexpensive to collect. Additionally, in low- and middle-income settings, household assets provide a better proxy for a household's long-run wealth than information on income or expenditures; this is due to seasonal variability in earnings, income from potentially multiple and diverse informal activities, high rates of self-employment, likely recall bias and misreporting. Booysen et al. (2008), Sahn and Stifel (2003) and Ward (2014) are among those who have demonstrated that data on household asset ownership collected at more than one point in time using a standardized questionnaire can be used to construct an asset index to compare and follow up changes in the distribution of SES within populations.
The Agincourt Health and Demographic Surveillance System (HDSS), which is central to the research programme of the MRC/Wits Rural Public Health and Health Transitions Research Unit, has collected data on household asset ownership every 2 years since 2001 using a standardized questionnaire in the Agincourt sub-district in rural northeast South Africa. In this paper, we use these data to construct and compare asset indices and to assess the dynamics of SES in the Agincourt HDSS study population over the period 2001-2013. The focus is on the temporal changes in the ownership of various household asset items and the distribution of SES.

Data Sources
The analysis in this paper is based on data on asset indicators collected by the HDSS. The Agincourt system has collected detailed longitudinal data on vital events including births, deaths, in- and out-migrations, as well as complementary data covering health, social and economic indicators in a predominantly rural population in northeast South Africa every year since 1992 (Kahn et al. 2007, 2012). Until 2006, the study included 21 villages. The study area was extended to 26 villages in 2007. Another five villages were added between 2010 and 2012 in response to an expanding trials and evaluation portfolio. The population, of approximately 115,000 people in 2014, is largely Shangaan-speaking and almost a third are former Mozambican refugees who arrived in the area in the early to mid-1980s and their descendants.
Collection of data on household asset indicators that include construction materials of the main dwelling, type of toilet facilities, sources of water and energy, ownership of modern assets and livestock only started in 2001 and has been repeated every 2 years. To assess changes in the asset indicators over the period 2001-2013, we use only the data collected from households in the original 21 villages.

Statistical Analysis
There are three parts to the analysis. The first part summarizes changes in ownership of various household assets in the Agincourt study population from 2001 to 2013. The second part involves constructing, from the household asset items, three composite indices that can be used as measures of SES. The three indices, namely the absolute index, the principal components analysis (PCA) index and the multiple correspondence analysis (MCA) index, are among the most widely utilized indices in the literature. Using all three allows us to assess the robustness of our findings. Similar to the approach adopted by Howe et al. (2008), the three indices are compared with each other using scatter plots and the percentage of households classified into the same and different SES quintiles. The agreement of classification of households into SES quintiles between indices is assessed using Kappa statistics. The Kappa statistic, which takes values between 0 (no agreement better than chance) and 1 (perfect agreement), measures agreement in classification between two methods, taking into account the agreement that is expected based on chance alone (Howe et al. 2008). Also similar to the approach adopted by Balen et al. (2010), Spearman's rank correlation coefficient is utilised for further comparisons of the three indices. The last part of the analysis applies the method of relative distributions developed by Handcock and Morris (1998, 1999) to the asset indices to assess changes in the distribution of SES over time in terms of location and shape. This part of the analysis also takes into account ethnic differences in the distribution of SES, as a previous study by Sartorius and colleagues covering the period 2001-2007 showed persistent differentials in SES between the South African and Mozambican populations (Sartorius et al. 2013).
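The index comparisons described above can be illustrated with a minimal Python sketch. It assumes an unweighted Cohen's kappa, rank-based quintiles with ties broken by position, and the no-ties shortcut formula for Spearman's coefficient; the text does not specify these details, and the function names and the ten household values are hypothetical.

```python
from collections import Counter

def quintiles(scores):
    """Assign quintiles 1..5 by rank; ties are broken by list position."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, i in enumerate(order):
        groups[i] = rank * 5 // len(scores) + 1
    return groups

def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two classifications of the same households."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance, from the marginal class frequencies
    expected = sum(count_a[k] * count_b[k] for k in count_a) / n ** 2
    return (observed - expected) / (1 - expected)

def spearman_rho(a, b):
    """Spearman's rank correlation via the shortcut formula (assumes no ties)."""
    n = len(a)
    rank_a = {i: r for r, i in enumerate(sorted(range(n), key=lambda i: a[i]))}
    rank_b = {i: r for r, i in enumerate(sorted(range(n), key=lambda i: b[i]))}
    d2 = sum((rank_a[i] - rank_b[i]) ** 2 for i in range(n))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical values of two indices for the same ten households
index_a = [0.2, 0.5, 0.9, 1.4, 1.8, 2.1, 2.6, 3.0, 3.7, 4.4]
index_b = [-2.1, -1.0, -1.6, -0.4, 0.1, 0.3, 1.2, 0.9, 1.9, 2.5]

q_a, q_b = quintiles(index_a), quintiles(index_b)
share_same = sum(x == y for x, y in zip(q_a, q_b)) / len(q_a)
kappa = cohen_kappa(q_a, q_b)
rho = spearman_rho(index_a, index_b)
```

In this toy example the two indices place 8 of 10 households in the same quintile, and chance alone would be expected to produce agreement for 2 of 10, so kappa is (0.8 - 0.2) / (1 - 0.2) = 0.75.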

Construction of Asset Indices
The absolute index that we construct has been utilized by a number of other researchers who have analyzed data from the Agincourt HDSS (Houle et al. 2013; Gomez-Olive et al. 2014; Houle et al. 2014; Madhavan et al. 2012). To construct this index, first the items of each asset indicator are assigned a weight so that increasing values correspond to items associated with higher SES. For example, for the asset indicator wall material, 5 = brick; 4 = cement; 3 = other modern material; 2 = mud; and 1 = other traditional material. Thereafter, the value assigned to each item of an asset indicator is normalized by dividing it by the value assigned to the item associated with the highest SES. This results in items of a given asset indicator taking values within the range [0, 1]. The asset indicators are then grouped into five broad asset subcategories (modern assets, livestock, power supply, water and sanitation, and dwelling structure). The normalized values of the asset indicators within each subcategory are then summed to yield a subcategory-specific value. Each subcategory-specific value is further normalized so that it too is in the range [0, 1]. Finally, the five subcategory-specific normalized values are summed to produce an overall household asset index that falls in the range [0, 5].
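The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used in the study: only the 1 to 5 wall-material coding comes from the text, all other indicator names and codes are hypothetical, and each subcategory sum is assumed to be rescaled to [0, 1] by dividing by the number of indicators in the subcategory.

```python
# Top (highest-SES) code for each asset indicator, grouped by subcategory.
# Only the wall-material coding (1..5) is taken from the text; every other
# indicator name and code below is a hypothetical placeholder.
TOP_CODES = {
    "modern_assets": {"fridge": 1, "cellphone": 1, "car": 1},
    "livestock": {"cattle": 1, "goats": 1},
    "power_supply": {"lighting": 2, "cooking": 2},
    "water_sanitation": {"toilet": 4, "water_source": 4},
    "dwelling_structure": {"wall_material": 5, "floor_material": 4},
}

def absolute_index(household, top_codes=TOP_CODES):
    """Sum of five subcategory scores, each in [0, 1], giving a value in [0, 5]."""
    total = 0.0
    for indicators in top_codes.values():
        # Divide each item code by the code of the highest-SES item
        normalised = [household[name] / top for name, top in indicators.items()]
        # Assumed rescaling: divide the subcategory sum by its maximum possible value
        total += sum(normalised) / len(normalised)
    return total

# A hypothetical household: brick walls (5), fridge and cellphone, no livestock
household = {
    "fridge": 1, "cellphone": 1, "car": 0,
    "cattle": 0, "goats": 0,
    "lighting": 2, "cooking": 1,
    "toilet": 2, "water_source": 4,
    "wall_material": 5, "floor_material": 2,
}
score = absolute_index(household)

# A household holding the top item of every indicator reaches the maximum of 5
best = {name: top for group in TOP_CODES.values() for name, top in group.items()}
best_score = absolute_index(best)
```

Because the item codes are fixed in advance, the same household profile receives the same score in every survey round.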
The PCA index was first recommended by Filmer and Pritchett (2001) and is one of the most widely used asset indices (Gwatkin et al. 2007; McKenzie 2005; Minujin and Delamonica 2004). Construction of this index starts with an $n \times p$ matrix, $X$, representing ownership of $p$ asset items collected from $n$ households. Thereafter, each element of $X$ is normalized by subtracting from it the column mean and dividing the difference by the column standard deviation to produce another $n \times p$ matrix, $Y$. Next, a $p \times p$ correlation matrix, $R$, is computed from the normalized data matrix, $Y$. This is followed by solving the equation $(R - \lambda I)V = 0$ for $\lambda$ and $V$, where $\lambda$ is a vector of eigenvalues, $I$ is an identity matrix and $V$ is a matrix of eigenvectors associated with the eigenvalues in $\lambda$. Each eigenvector is then scaled so that its sum of squares equals the total variance. The product of the normalized matrix of asset variables, $Y$, and the matrix of scaled eigenvectors, $V^{*}$, produces a set of uncorrelated linear combinations of the asset variables for each household $j$, known as principal components. For each household, the number of principal components equals the number of asset items, and the rank of each component corresponds to the rank of its associated eigenvector. The first component is associated with the most dominant (largest) eigenvalue and explains as much as possible of the variation in the original data. The second component is associated with the second largest eigenvalue and explains as much as possible of the remaining variation in the data, subject to being uncorrelated with the first component. Similarly, each subsequent component explains as much as possible of the remaining variation in the data, while being uncorrelated with the other components.
Formally, for household $j$, the PCA index is computed as

$$\mathrm{PCA}_j = v^{*}_{11}\,\frac{x_{j1} - \bar{x}_1}{s_1} + \cdots + v^{*}_{p1}\,\frac{x_{jp} - \bar{x}_p}{s_p}$$

where $v^{*}_{i1}$ are the elements of the scaled eigenvector associated with the largest eigenvalue, $x_{ji}$ are the asset ownership values for household $j$ and asset $i$, $i \in [1, 2, \ldots, p]$, and $\bar{x}_i$ and $s_i$ are, respectively, the mean and standard deviation of the asset ownership values across all households for asset item $i$. In our description of the steps to derive the PCA index we have kept the mathematical details to a minimum. More detailed mathematical descriptions of the steps involved in the PCA technique can be found in Everitt and Hothorn (2011) and Rencher (2003).
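As a rough illustration of the first-component score, the sketch below standardizes a small hypothetical ownership matrix, forms the correlation matrix and extracts the dominant eigenvector. Two simplifications are assumptions of this sketch and not part of the description above: power iteration is swapped in for a full eigen-decomposition, and the eigenvector is left at unit length instead of being rescaled, which changes the index only by a constant factor. All function names and data are hypothetical.

```python
import math

def standardise(X):
    """Subtract each column mean and divide by the column standard deviation."""
    n, p = len(X), len(X[0])
    means = [sum(row[i] for row in X) / n for i in range(p)]
    sds = [math.sqrt(sum((row[i] - means[i]) ** 2 for row in X) / (n - 1))
           for i in range(p)]
    return [[(row[i] - means[i]) / sds[i] for i in range(p)] for row in X]

def correlation(Y):
    """Correlation matrix of already standardised columns."""
    n, p = len(Y), len(Y[0])
    return [[sum(row[a] * row[b] for row in Y) / (n - 1) for b in range(p)]
            for a in range(p)]

def first_eigenvector(R, iterations=500):
    """Dominant unit-length eigenvector of R by power iteration."""
    p = len(R)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(R[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def pca_index(X):
    """First principal component score for each household (row of X)."""
    Y = standardise(X)
    weights = first_eigenvector(correlation(Y))
    scores = [sum(w * y for w, y in zip(weights, row)) for row in Y]
    return weights, scores

# Hypothetical 0/1 ownership of three assets for six households
X = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1]]
weights, scores = pca_index(X)
```

Because the three hypothetical assets are positively correlated, all weights come out positive and the score rises with each additional asset owned.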
The procedure used to construct the MCA index is similar to the one used to construct the PCA index but does not assume that the data are continuous and that there is a linear relationship between the observations (Traissac and Martin-Prevel 2012; Booysen et al. 2008; Howe et al. 2012). Because all the asset indicators are discrete or categorical, others have argued that the MCA index is the most appropriate asset-based measure of SES (Booysen et al. 2008; Traissac and Martin-Prevel 2012; Asselin and Anh 2008). In constructing the MCA index we follow the guidelines provided by Booysen et al. (2008) and Asselin and Anh (2008). First, the indicators of asset ownership of all households are organized into a matrix $X$ of ones and zeros called the "indicator matrix". In the indicator matrix, each categorical asset indicator is decomposed into a set of mutually exclusive and exhaustive binary categories that each take only the value 0 or 1, such that every household has a '1' in exactly one of each asset's set of categories and a '0' in the rest of the asset's categories. Second, a matrix $S$ is calculated by taking the $\chi^2$ metric on row/column profiles of $X$. Greenacre (2007) provides the formula for computing $S$ as

$$S = D_r^{-1/2}\,(P - rc^{T})\,D_c^{-1/2}$$

where $P$ is the matrix formed by dividing each element of the matrix $X$ by the sum of its elements, $r$ is a vector whose elements are the sums of the row elements of the matrix $P$, $c$ is a vector whose elements are the sums of the column elements of the matrix $P$, and $D_r$ and $D_c$ are diagonal matrices formed from $r$ and $c$ respectively. Finally, singular value decomposition (SVD) is performed on the matrix $S$ to decompose it into three matrices such that $S = U D_{\alpha} V^{T}$ (Greenacre 2007).
The columns of the matrices $U$ and $V$, referred to as left and right singular vectors, are respectively the eigenvectors of the matrices $SS^{T}$ and $S^{T}S$, and the diagonal elements of the diagonal matrix $D_{\alpha}$, known as singular values, are the square roots of the common positive eigenvalues of $SS^{T}$ and $S^{T}S$. As in the PCA approach, in constructing a single asset index, the elements in the first column vector of the matrix $V$ derived by the SVD are used as weights of the asset categories. Consequently, as provided by Booysen et al. (2008), the MCA index score for household $i$ is calculated as

$$\mathrm{MCA}_i = \sum_{j} R_{ij} W_j$$

where $R_{ij}$ is the response of household $i$ to asset category $j$ and $W_j$ is the MCA weight of asset category $j$.
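The MCA steps can be traced on a toy data set with the following sketch. It builds the indicator matrix, forms $S$ from $P$, $r$ and $c$, and takes the first right singular vector as the category weights. Several choices here are assumptions of the sketch: power iteration on $S^{T}S$ stands in for a full SVD, the sign of the singular vector (which is arbitrary) is fixed through a hypothetical `positive_category` argument, and the two indicators and five households are invented for illustration.

```python
import math

def indicator_matrix(households):
    """One 0/1 column per (indicator, category) pair, in sorted order."""
    columns = sorted({(k, v) for h in households for k, v in h.items()})
    X = [[1 if h.get(k) == v else 0 for k, v in columns] for h in households]
    return columns, X

def standardised_residuals(X):
    """S = D_r^{-1/2} (P - r c^T) D_c^{-1/2}."""
    total = sum(map(sum, X))
    P = [[x / total for x in row] for row in X]
    r = [sum(row) for row in P]
    c = [sum(col) for col in zip(*P)]
    return [[(P[i][j] - r[i] * c[j]) / math.sqrt(r[i] * c[j])
             for j in range(len(c))] for i in range(len(r))]

def first_right_singular_vector(S, iterations=200):
    """Power iteration on S^T S; returns the unit vector and its eigenvalue."""
    J = len(S[0])
    v = [float(j + 1) for j in range(J)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        Sv = [sum(row[j] * v[j] for j in range(J)) for row in S]
        w = [sum(S[i][j] * Sv[i] for i in range(len(S))) for j in range(J)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Sv = [sum(row[j] * v[j] for j in range(J)) for row in S]
    return v, sum(x * x for x in Sv)

def mca_index(households, positive_category):
    """Category weights from the first right singular vector, and household scores."""
    columns, X = indicator_matrix(households)
    weights, inertia = first_right_singular_vector(standardised_residuals(X))
    # The sign of a singular vector is arbitrary; orient it so that the chosen
    # high-SES category receives a positive weight (an assumption of this sketch).
    if weights[columns.index(positive_category)] < 0:
        weights = [-w for w in weights]
    scores = [sum(x * w for x, w in zip(row, weights)) for row in X]
    return dict(zip(columns, weights)), scores, inertia

# Hypothetical households described by two categorical asset indicators
households = [
    {"toilet": "none", "floor": "dirt"},
    {"toilet": "none", "floor": "dirt"},
    {"toilet": "none", "floor": "cement"},
    {"toilet": "yard", "floor": "cement"},
    {"toilet": "yard", "floor": "cement"},
]
weights, scores, inertia = mca_index(households, ("toilet", "yard"))
```

In this example the first dimension separates households with no toilet and a dirt floor from those with a yard toilet and a cement floor, so the two groups receive the lowest and highest scores respectively.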
The PCA and MCA indices are derived from pooled data from all the available years. This approach ensures that indices explain variation over time as well as across households and are not affected by changes in the contribution of particular assets to household welfare (McKenzie 2005). Pooling of the data is not necessary for the absolute index as the procedure used to generate this index assigns the same weight to the same asset item across time.

Assessing Distributional Changes in SES
The method of relative distributions that we apply to the three indices to assess trends in the distribution of SES quantifies differences between the distributions of a set of measurements of an attribute of interest from a population at one time period and another set of measurements of the same attribute from a different population, or from the same population at a later time period. It takes the values of one distribution (the comparison distribution) and expresses them as positions in another distribution (the reference distribution) (Handcock and Morris 1998, 1999). Compared to the standard approach of comparing distributions using summary statistics such as mean, median and variance, which do not consider the entire distributions, the relative distribution analytic approach allows direct comparisons between outcomes across the entire distributions and provides insights that may be missed by the former approach.
Taking 2001 as the baseline year, we obtain the relative distribution for each later time period, $t$, using the density function of the percentile rank, $r$, of asset index value, $y$, in 2001 as

$$g_t(r) = \frac{f_t(y_r)}{f_0(y_r)}, \qquad 0 \le r \le 1,$$

where $f_0(y)$ and $f_t(y)$ are the density functions of the asset index values in 2001 and at a later time period respectively, and $y_r$ is the asset index value at percentile rank $r$ of the 2001 distribution. Basically, the relative distribution, $g_t(r)$, represents the ratio of the population density at asset index value, $y$, at each later time period, $t$, to the density in 2001. When there are no differences between the comparison and reference distributions, the relative distribution is uniform or "flat" (taking a value of 1 throughout). When there are differences between the distributions, the relative distribution "rises" or "falls" depending on the direction of the difference. For example, if the proportion of households at a later time period, $t$, with asset index values equal to the median asset index value in 2001 is less than 50 %, the relative distribution will have a value below 1 at a point on the vertical axis corresponding to 50 % on the horizontal axis. Following the approach by Handcock and Morris (1998, 1999), the changes in the relative distribution of the asset index values in 2001 and at later time periods are statistically summarized using the entropy statistic and median relative polarization (MRP) index. The entropy statistic used is based on the Kullback-Leibler divergence, which is a measure of the distance between two distributions and is defined by:

$$D(F; F_0) = \int_0^1 \log\big(g(r)\big)\, g(r)\, dr$$

where $g(r)$ is the probability density function of the relative distribution of asset index values in the reference and comparison distributions and $F_0$ and $F$ respectively represent the cumulative distribution functions of the reference and comparison distributions of asset index values.
We use the entropy statistic to quantify: (1) overall divergence between the comparison and reference distributions; (2) divergence between the location-adjusted reference distribution and the reference distribution; and (3) divergence between the comparison distribution and the location-adjusted reference distribution. The location adjustment used is median adjustment. This is preferred over mean adjustment because of the well-known drawbacks of the mean when distributions are skewed. As for the MRP index, we use it to quantify the extent to which the shape difference between the distributions of asset index values in 2001 and at later time periods takes the form of relative polarization or rising inequality. It is computed as:

$$\mathrm{MRP}_t = 4\int_0^1 \left| r - \tfrac{1}{2} \right| g_t(r)\, dr - 1$$

where $g_t(r)$ is the relative population density at asset index value, $y$, at each time period, $t$, weighted by the absolute difference between the baseline rank of $y$ and the median, $\left| r - \tfrac{1}{2} \right|$. Its value varies between -1 and 1, with 0 representing no change in the distribution of asset index values at time period $t$ relative to the baseline year, positive values representing more polarization (i.e. increases in the tails of the distribution) and negative values representing less polarization (i.e. convergence towards the center of the distribution). In order to distinguish the contributions from the lower and upper tails of the distribution to the overall polarization, the MRP index is decomposed into lower (LRP) and upper (URP) polarization indices defined respectively as:

$$\mathrm{LRP}_t = 8\int_0^{1/2} \left| r - \tfrac{1}{2} \right| g_t(r)\, dr - 1, \qquad \mathrm{URP}_t = 8\int_{1/2}^{1} \left| r - \tfrac{1}{2} \right| g_t(r)\, dr - 1$$

These indices also vary between -1 and 1 and are interpreted in the same way as the MRP index.
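A coarse, histogram-based version of the relative density and the entropy statistic can be sketched as follows. This is an illustration only, not the estimator implemented in reldist: it places each comparison value at its mid-rank position in the reference distribution, counts how many fall into each decile of the reference, and approximates the integral by a sum over the ten bins. The function names and the two toy samples are hypothetical.

```python
from bisect import bisect_left, bisect_right
import math

def relative_ranks(reference, comparison):
    """Mid-rank position of each comparison value in the reference distribution."""
    ref = sorted(reference)
    n = len(ref)
    return [(bisect_left(ref, y) + bisect_right(ref, y)) / (2 * n)
            for y in comparison]

def relative_density(reference, comparison, bins=10):
    """Share of comparison values in each reference quantile bin, scaled so that
    identical distributions give a value of 1 in every bin."""
    counts = [0] * bins
    for r in relative_ranks(reference, comparison):
        counts[min(int(r * bins), bins - 1)] += 1
    return [count * bins / len(comparison) for count in counts]

def entropy(density):
    """Discrete approximation of the integral of g(r) * log(g(r)) over [0, 1]."""
    width = 1 / len(density)
    return sum(g * math.log(g) * width for g in density if g > 0)

# Hypothetical index values: a baseline sample and a later, upward-shifted sample
baseline = list(range(1, 101))
later = [y + 20 for y in baseline]

flat = relative_density(baseline, baseline)   # no change: 1 in every decile
shifted = relative_density(baseline, later)   # mass moves to the upper deciles
```

Comparing the baseline with itself gives a flat relative density and zero entropy, whereas the upward shift empties the two lowest baseline deciles and triples the density in the top decile.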
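Sample analogues of the three polarization indices can be computed directly from relative ranks, as in the sketch below. It assumes an additive median shift as the location adjustment and replaces each integral with an average over the comparison sample; the function names and the toy data are hypothetical, and the reldist package should be preferred for real analyses.

```python
from bisect import bisect_left, bisect_right
from statistics import median

def relative_ranks(reference, comparison):
    """Mid-rank position of each comparison value in the reference distribution."""
    ref = sorted(reference)
    n = len(ref)
    return [(bisect_left(ref, y) + bisect_right(ref, y)) / (2 * n)
            for y in comparison]

def polarization_indices(reference, comparison):
    """Return (MRP, LRP, URP) after matching the comparison median to the reference."""
    # Assumed location adjustment: shift the comparison sample additively
    shift = median(reference) - median(comparison)
    ranks = relative_ranks(reference, [y + shift for y in comparison])
    n = len(ranks)
    lower = sum(0.5 - r for r in ranks if r < 0.5)   # distance below the median
    upper = sum(r - 0.5 for r in ranks if r > 0.5)   # distance above the median
    mrp = 4 * (lower + upper) / n - 1
    lrp = 8 * lower / n - 1
    urp = 8 * upper / n - 1
    return mrp, lrp, urp

# Hypothetical data: the later distribution is squeezed towards the median
baseline = list(range(1, 101))
later = [50.5 + (y - 50.5) / 2 for y in baseline]

no_change = polarization_indices(baseline, baseline)
mrp, lrp, urp = polarization_indices(baseline, later)
```

Halving every distance from the median gives values of -0.5 for all three indices in this symmetric example, and the MRP is always the average of the LRP and URP.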
The analysis of ethnic differences in the distribution of SES between the South African and Mozambican populations uses the distribution of the asset index values of the Mozambican households as the reference distribution and that of the asset index values of the South African households as the comparison distribution.

Software
We use STATA version 13.1 (Stata Corp., College Station, USA) to construct the asset indices and to perform the descriptive analyses. We also utilize the R statistical package reldist to conduct the relative distribution analysis (Handcock and Aldrich 2002).

Ethics Statement
The Human Research Ethics Committee (Medical) of the University of the Witwatersrand reviewed and approved the Agincourt HDSS (protocol M960720 and M081145). At the start of surveillance in 1992, community consent was secured from civic and traditional leadership and has continuously been reaffirmed for over two decades through frequent meetings. This is facilitated by the Agincourt Unit's LINC (Learning, Information dissemination and Networking with Community) Office. Three local people working under a coordinator in the LINC office regularly engage with Community Development Forums as well as a Community Advisory Group in the study site. Both are elected committees comprising village members. Community Development Forums, the lowest level of local government, include the Induna who represents the Traditional Council. The LINC office ensures that Forum members understand research objectives and results and are able to raise concerns about the Unit's research in their communities, and provides feedback of research results at community meetings. The Community Advisory Group ensures information flows between the Unit and the community, voices concerns, assesses the potential impact of the Unit's research on the community, and maintains ongoing dialogue and consultation. At the individual and household level, informed verbal consent is obtained from the head of the household or an eligible adult in the household at each annual follow-up surveillance visit. Prior to conducting any interview, a local fieldworker who is well-trained and versed in the Agincourt HDSS methods and the process of verbal informed consent explains in the local language to the respondent the purpose, aims and justification of the HDSS as well as information about confidentiality, privacy and the right to refuse to participate or withdraw from the HDSS.
The responsible fieldworker documents the consent process by marking out the respondent on the household roster as well as recording the fieldworker details and date on the spaces provided at the top of the household roster. A verbal consenting process is normal practice for HDSS and the processes followed in the Agincourt HDSS have continued to be accepted by the aforementioned ethics committee. Furthermore, additional ethical clearance was obtained from the same ethics committee for the primary study reported in this paper (protocol M120488).

Data Availability
Detailed documentation of the Agincourt HDSS data and an anonymized database containing data from 10 % of the surveillance households are freely available on the Agincourt HDSS website (www.agincourt.co.za). The specific customized data used in this study are available on request to interested researchers. Table 1

Comparison of Asset Indices
The last three columns of Table 1 present the weights assigned to each asset item in constructing the three asset indices. For the absolute index, the weights are assigned in such a way that increasing values correspond to items associated with higher SES. For the PCA and MCA indices, positive weights are assigned to items expected to be associated with higher SES (e.g. tiles and cement housing floor materials, bricks and cement housing wall materials and tiles and corrugated iron sheets housing roof materials) and negative weights are assigned to items expected to be associated with lower SES (e.g. mud and other traditional housing floor and wall materials, and thatch and other traditional housing roof materials). However, on average the absolute values of the weights in the MCA index are higher than those in the PCA index. In addition, the ranking of the asset items based on the weights in the MCA and PCA indices shows marked differences. From the PCA index, the highest weight is assigned to owning a toilet within the yard followed by owning a fridge, and the lowest weight is assigned to not owning any toilet facility followed by sources of power for lighting other than electricity, solar or battery. From the MCA index, the highest weight is assigned to owning a toilet inside the dwelling followed by owning a flush toilet, and the lowest weights are assigned to owning a house with the floor made of traditional materials such as dirt. Despite the differences in the weights assigned to the asset items in the three indices, as shown in Fig. 1 and Table 2, the indices are reasonably comparable. Pairwise comparisons between the values of the indices result in correlation coefficients of at least 0.95. In addition, each pair of indices assigns at least 71 % of households to the same SES quintile, with Kappa statistics of at least 0.64.
Where a pair of indices places households in different quintiles, movement is generally limited to one quintile, with less than 1 % of households moving between two or more quintiles. The median-adjusted relative distributions, which expose the effect of changes in distributional shape, show that for all the indices, the proportion of households with asset index values corresponding to the middle deciles (4th to 7th deciles) of the 2001 distribution has been increasing over time. Conversely, the proportion of households with asset index values corresponding to the lower and upper deciles of the 2001 distribution has been decreasing over time. This means that the distribution of SES has consistently become less polarized and is converging towards the middle over the years compared to 2001. Further details on the degree of convergence of the SES distribution from the two tails to the middle are provided by the median, lower and upper polarization indices and their corresponding 95 % confidence intervals, as reported in Table 3. The significantly negative values for the median index confirm that the SES distribution has been converging from the two tails to the middle. The significantly negative values for the lower and upper polarization indices confirm further that the convergence has occurred from both tails of the distribution. However, the larger negative values for the upper indices compared to the lower indices indicate that the convergence towards the middle deciles from the upper tail of the distribution has been larger than that from the lower tail.

Distributional Changes in SES
The analysis that takes into account the ethnic background of the household head shows that improvement in SES has occurred for both South Africans and Mozambicans (Fig. 6). (Fig. 7). A comparison of the distributions of the SES of the two ethnic groups using relative distribution methods indicates that the differences are mainly due to differences in the medians of the distributions (Table 4; Fig. 8). There is little effect of differences in the shape of the distributions.

Discussion
Using pooled data on household assets collected every 2 years from 2001 to 2013 from households of the residents of the Agincourt HDSS, we have assessed the dynamics of SES in a typical South African rural setting. We constructed three asset indices: absolute index, PCA index and MCA index from information on ownership of household assets that include construction materials of the main dwelling, type of toilet facilities, sources of water and energy, ownership of modern assets and livestock. Thereafter, we applied the method of relative distributions to the three indices to assess temporal trends in the distribution of SES. Our findings indicate that the proportion of households that own assets associated with modern wealth such as stove, fridge, cellphone, car, electricity for lighting and cooking and houses constructed with modern floor, wall and roof materials has substantially increased over time. The increase has persisted beyond the time period covered in an earlier study by Sartorius et al. (2013).
In contrast, ownership of assets associated with traditional wealth such as livestock has persistently been low. This indicates that unlike other rural populations in sub-Saharan Africa, such as a rural population in Senegal studied by Garenne (2015), traditional wealth contributes to the SES of few households in rural South Africa. This is not surprising since South Africa is a middle-income country. From a policy perspective, the general continuous increase in ownership of assets associated with modern wealth is a positive indicator of the impact of the wide-ranging reforms introduced in South Africa by the post-apartheid government that include the provision of free basic services, such as electricity (50 kWh per household per month), water, sanitation and housing, to previously disadvantaged populations, the majority of whom live in rural areas (Bhorat and van der Westhuizen 2013). Another important factor has been the implementation of non-contributory social grants provided by the state to vulnerable sectors of the population (Collinson 2010; Lund 2002).
Results from the relative distribution analysis in all three indices show that the median asset index values have shifted to the right and that the distribution of SES has become less polarized and is converging towards the middle. It is worth noting, however, that the convergence towards the middle is larger from the upper tail than from the lower tail of the SES distribution. This might be an indication that there has been little or no improvement in the SES of the very poor segment of the population. Further analysis of the characteristics of the individuals whose SES has persistently remained lower can assist in formulating policies that could bring further improvements in SES. The finding that the SES of the Mozambican households continues to be lower than that of South African households suggests that members of the Mozambican households should be among the targets of such policies. From a methodological perspective, the finding that the conclusions drawn from the analysis using the easy-to-compute absolute index are similar to those from the analysis using indices constructed using more advanced statistical techniques such as PCA and MCA demonstrates the utility of the absolute index in assessing people's SES based on household assets. This finding is consistent with findings by Howe et al. (2008) and Garenne (2015) that SES indices constructed using statistically advanced methods such as PCA offer little advantage over indices constructed using simpler and more intuitive methods such as the absolute index. Since the absolute index has the added property of comparability across time without pooling the data, it may be desirable in assessing temporal trends in SES.
Our study uses indices constructed from information on ownership of household assets to assess trends in SES. However, we acknowledge that our approach is by no means the only way to measure SES. Since our indices do not include other factors associated with social exclusion such as gender, education and ethnic background, they may provide only a partial view of the multi-dimensional concept of poverty, inequality and inequity. Nevertheless, our findings provide some interesting insights into the dynamics of SES in rural South Africa in recent years.

Conclusion
This study has shown that over the period 2001-2013 the rural population in northeast South Africa has experienced significant improvements in ownership of household assets associated with greater modern wealth and polarization of the distribution of SES has declined. However, the movement towards the middle of the SES distribution has been slower for poorer households. Methodologically, the results demonstrate that the absolute index is comparable to other indices constructed using more advanced statistical techniques in assessing people's SES based on household assets.