Background

India is urbanising rapidly. Current projections are that an estimated 534 million urban residents will make up 38% of the population in 2026 [1, 2]. Mumbai, one of India’s three megacities, has a population of over 16 million, projected to rise to 24 million over the next decade. Land is at a premium, existing infrastructure is overburdened, and rich and poor from diverse backgrounds coexist intimately. The 2001 census identified 1,959 slum settlements, home to 54% of Mumbai’s people but covering only 6% of its land area [3]. Localities were categorised as slums if they had been notified as such by state or local government, or recognised even if not formally notified. They were characteristically compact areas with populations of at least 300 (60–70 households), living in poorly built, congested dwellings in an unhygienic environment, usually with insufficient infrastructure, sanitation, and drinking water provision.

The loose nature of this definition reflects the diversity of informal settlements. The UN provisional operational definition of a slum (UN Expert Group Meeting, Nairobi 2002) includes five dimensions of vulnerability [4], and there is substantial heterogeneity both between and within communities [5, 6]. All of Mumbai’s slums share one characteristic, density, but “…slum pockets are highly differentiated by type, size and location, and occupy land held under a variety of ownership structures, including central government, state, municipal and private…” [7]. Localities with durable housing, metered electricity, piped water, well-maintained public toilets, enclosed drainage, thriving businesses, ubiquitous television, and (relative) security of tenure contrast with localities with rudimentary shelter built by polluted creeks, by railway lines or garbage dumps, with poor access to water and electricity, non-existent sanitation and drainage, and high population transience.

Our growing understanding of the social determinants of health underpins an emphasis on inequalities [8], of which Mumbai’s slums are a manifestation. Until relatively recently, health indicators were collected and viewed as aggregates comparing urban with rural processes and outcomes. However, since cities are engines of demographic and epidemiologic change, and because vast inequalities exist within them, there is a consensus that health interventions should promote urban equity [9, 10]. We need to disaggregate urban data both to understand the determinants of inequality and to try to address them [11–14].

One would assume that women and children who live in urban slum areas are less healthy than those who live in non-slum areas, and several studies confirm that this is the case [15–22]. For example, India’s National Family Health Survey (NFHS-3, 2005–6) found that women living in Mumbai informal settlements were less likely than women living in non-slum areas to make first trimester antenatal care visits (64% compared with 76%) and to deliver their babies at health facilities (83% compared with 91%) [23].

Unfortunately, in trying to plan interventions, we have been limited by a lack of disaggregated information and by the difficulty of untangling the knot of risk factors around life in informal settlements. Take, for example, a young married woman living in a Mumbai slum with her husband and his parents. Her health could be compromised by a wide range of factors. She may have been exposed to a heritage of poverty, societal marginalisation, the monetization of subsistence needs, and dietary insufficiency; limited education, early marriage and conception; hard work and gendered discrimination; infection from substandard water supply and sanitation, hazardous location and household crowding; harmful effects of home and local industry, noise, damp and insubstantial housing; and hazards associated with early sexual activity, spousal alcohol or drug dependency, domestic violence, informal sector livelihoods, identity politics, accidents, and crime.

A minimum conceptual framework for these determinants would include her physical environment, her social environment, and her access to and use of health and social services [24]. However, since risk factors cluster together and are mutually reinforcing, understanding their individual influences is difficult [25]. An experience in the planning phase for a major health project made us think about this. The Society for Nutrition, Education and Health Action (SNEHA), a Mumbai-based non-government organisation, works to improve the health of women and children in informal settlements. In partnership with the Municipal Corporation of Greater Mumbai and with community members themselves, we undertook the City Initiative for Newborn Health [26]. One component of the Initiative was a cluster randomised controlled trial of community mobilisation activities, which required us to identify slum localities at higher health risk. Data on health outcomes were not available at this level, so we undertook an extensive vulnerability assessment. The Environmental Health Program had carried out a model assessment in Indore, which began with municipal lists of slum areas, expanded them through participatory mapping, and classified localities on the basis of economic, social, and environmental conditions, access to and usage of public health services, disease incidence, and collective community efforts [27]. The views of staff of public sector, non-governmental and community-based organisations were solicited and validated through site visits.

We took an approach with a similar ethos. We used an iterative process to collect and triangulate information. In the first step, we discussed potential criteria for health risk with a range of local informants (women’s group leaders, health workers, ration shopkeepers, community and political leaders, members of community-based and non-government organisations, private practitioners, and preschool teachers). This yielded three sets of risk indicators: social (unemployment, groups in difficult circumstances, substandard housing), environmental (open drainage, informal water supply, informal electricity supply, sanitation), and healthcare utilisation (infrequent interaction with community health volunteers, home deliveries). The key informant process generated a provisional list of 117 slum localities. The second step was to undertake triangulation and transect walks in each area to classify vulnerability systematically. Vulnerability criteria were confirmed with municipal community health volunteers. They accompanied our team members on walks around locality boundaries and in diagonal transects, meeting local people and verifying vulnerability indices with them. These people included groups of women, Integrated Child Development Services anganwadi teachers, ration shopkeepers, doctors, tea stall owners, members of community-based organisations, local social workers, pharmacists, political party workers, and industry owners. The process suggested that 92 informal settlements in six municipal wards were at high risk of poorer maternal and child health outcomes. Of these, 48 were randomly selected for inclusion in our trial.

The vulnerability assessment raised three issues. First, since health outcomes in individual communities were unquantified, we assumed (reasonably, we thought) that a multidimensional risk assessment would correlate with the health outcomes of women and children. Second, although the process was crucial to understanding the community environment, and although it was essentially a first step in engagement, it was time-consuming and more than half of the informal settlements surveyed were not involved in the subsequent program. Third, it was possible that our evaluation had been over-comprehensive. Several commentators suggested that deprivation is obvious and we wondered if a rapid look at an informal settlement might provide enough information to rate its degree of vulnerability.

This paper describes our responses to these concerns, based on subsequent data collection. First, we examined the associations of presumptive indicators of vulnerability with selected maternal and newborn healthcare and outcomes. Second, we tried to develop a tool for rapid triage that used ‘obvious’ characteristics and could be applied during a brief visit to a slum locality. Third, we evaluated the performance of the tool.

Methods

Data Collection

The data for the analysis were collected by a surveillance system set up to document all births and infant and maternal deaths in a cluster randomised controlled trial. The trial was designed to test the effect on maternal and newborn health of an intervention involving a participatory cycle of meetings with community women’s groups [28]. The trial involved 48 slum clusters in six municipal wards (F North, G North, H East, K West, M East, P North). Each cluster covered 1,000–1,500 households; some clusters were entire informal settlements, while others were parts of larger settlements. Births and deaths were identified by two locally resident women per cluster, remunerated per confirmed identification. One of 12 interviewers visited households in each cluster to interview women or their families at about 6 weeks after delivery. Participants gave verbal informed consent to involvement and use of anonymised data. Interviewers used a quantitative tool with questions on housing, education, assets, maternity history, antenatal, delivery, postnatal, and newborn care, illness, and careseeking. Information provided by participants remained confidential, with access restricted to interviewers, data managers, and analysts. Questionnaires were checked routinely by supervisory cadres, as well as through systematic crosschecks and random observation visits. Data were entered in Microsoft Access databases (Microsoft Corporation) with validation constraints.

Sample Size

The sample size for the study was set by a 2-year period of surveillance, from October 1, 2005 to September 30, 2007. The first year served as a baseline and the women’s group intervention was introduced at the beginning of the second year. We recorded 13,467 births in this period, for which detailed information was available on 10,754 (80%). The main reasons for the shortfall in questionnaire completion were relocation or the fact that women who lived elsewhere had come to the city for delivery.

Choice of Indicators

We wanted to choose a small set of outcomes against which to examine the research questions. Available indicators of maternal and infant health included method of confirmation of pregnancy, the site, number and recalled content of antenatal care visits, place of delivery, birth weight, breastfeeding, newborn immunisation, maternal and newborn illness and site of care-seeking, and maternal, newborn and infant mortality. Our idea was to specify one antenatal, one intrapartum, and one postpartum outcome before the analysis. We could not justify confirmation of pregnancy, site, and content of antenatal care as clear indicators of health. Birth weight was a possibility, but is subject to transgenerational and life cycle influences that make it questionable as an indicator of recent environment. Newborn immunisation is probably not as useful an indicator of population health as full immunisation over infancy. Information on maternal and newborn illness, and breastfeeding, was based on self-report of a range of problems and our experience of developing simple variables to express significant morbidity suggested that validity and comparability would be limited. Rates of antenatal care are high in Mumbai (over 90%), but attendance for the recommended minimum of three visits shows more variability. Although this is actually an index of access to and uptake of healthcare, it is a common international measure of safer motherhood and would be transferable to other settings. Institutional delivery rates are also around 90% in Mumbai, but we had noted that there was substantial variability in slum areas. Skilled birth attendance—preferably at an institution—is currently a major international target and an index of population health. We chose newborn survival as an outcome because it was the focus of our work, and because maternal and post-neonatal mortality are less common. 
Since the research questions were about risk, we expressed the three chosen outcomes negatively: less than three antenatal care visits, non-institutional (home) delivery, and neonatal mortality (death of a live-born infant in the first 28 days).

Table 1 summarises the range of information on potential risk factors available in our dataset. We wanted to frame a scorecard according to the UN Guide to Monitoring for Millennium Development Goal Target 11 [4]. The table therefore locates information in five dimensions, with the addition of other categories for completeness. Water supply was defined as legal if delivered through piped systems or community tapstands, and as illegal if access required unauthorised tapping or if water was purchased from a tanker or from another area. Electricity supply was categorised as legal (metered), illegal, or unavailable. Hazardous location was defined on the basis of UN-Habitat criteria: landslide, earthquake or flood-prone areas, garbage mountains, high-industrial pollution areas, and risk zones such as beside railroads, airports and power lines. The most relevant of these in the local context were locations beside garbage dumps, highly polluted bodies of water, railway lines, and airports. As Table 1 shows, the next step was to translate our individual questionnaire data into a series of indicators. The criteria for these were that they should be estimable on a short visit to an informal settlement and that any quantification should be rough and ready.

Table 1 Available data on characteristics of slum residence, categorising variables from the maternity surveillance system against dimensions of the UN provisional operational definition

Data Analysis

Associations of Indicators of Vulnerability with Maternal and Newborn Healthcare and Outcomes

We used random effects multivariable logistic regression models in Stata 10 (StataCorp, College Station, TX, USA) to assess the association of candidate risk indicators with the three dependent variables. Quadrature checks supported the use of this approach to the clustered data. Although the second year of data collection coincided with the intervention phase of the randomised controlled trial, adjusting for allocation status did not affect the models and the adjusted results are not presented.

Developing a Triage Tool to Identify Informal Settlements at Higher Risk

Classification and Regression Tree analysis (CART) has been used successfully to design client triage systems for hospitals. Essentially, it tests the purity with which each of a series of binary determinants predicts an outcome, chooses the best predictor, splits the sample in two according to it, and then repeats the process. This recursive partitioning is computationally intensive but easy to interpret. It does not rely on assumptions of distribution, and is useful when there are many interactions between potential predictors (as in this case) and the objective is to allocate people or places to risk groups [29].

We used the Classification and Regression Tree module in SPSS 16.0 (SPSS Inc, Chicago, IL, USA) to assess candidate indicators for inclusion in a slum triage tool. We used a randomly selected 70% of the dataset for this step. For each of the three outcomes, we entered the independent variables used in the logistic regression analysis, apart from the variable describing open drainage which was not used because of its lack of effect. To keep the model simple, tree growth was set to a maximum depth of three levels (which had similar results to pruned models with more levels). We used a Gini measure of impurity, parent nodes with a minimum 100 cases, child nodes with a minimum of 50, and default surrogates for missing values of independent variables. We did not apply costs to misclassification.
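The recursive partitioning described above can be sketched in a few lines of Python. This is an illustration only, not the SPSS module used in the analysis: the function names, the dictionary-based tree, and the synthetic records are all assumptions, and surrogate handling of missing values is omitted. The defaults mirror the stated settings (Gini impurity, maximum depth of three levels, parent nodes of at least 100 cases, child nodes of at least 50).

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 outcomes: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, predictors, outcome, min_child):
    """Return (predictor, weighted impurity) for the best admissible binary split, or None."""
    best = None
    for name in predictors:
        absent = [r[outcome] for r in rows if not r[name]]
        present = [r[outcome] for r in rows if r[name]]
        if len(absent) < min_child or len(present) < min_child:
            continue  # a child node would be too small
        impurity = (len(absent) * gini(absent) + len(present) * gini(present)) / len(rows)
        if best is None or impurity < best[1]:
            best = (name, impurity)
    return best


def grow(rows, predictors, outcome, depth=0, max_depth=3, min_parent=100, min_child=50):
    """Recursively partition rows; each node records its size and outcome proportion."""
    outcomes = [r[outcome] for r in rows]
    node = {"n": len(rows), "risk": sum(outcomes) / len(rows)}
    if depth >= max_depth or len(rows) < min_parent:
        return node
    split = best_split(rows, predictors, outcome, min_child)
    if split is None or split[1] >= gini(outcomes):
        return node  # no admissible split reduces impurity
    name = split[0]
    remaining = [p for p in predictors if p != name]
    node["split_on"] = name
    node["absent"] = grow([r for r in rows if not r[name]], remaining, outcome,
                          depth + 1, max_depth, min_parent, min_child)
    node["present"] = grow([r for r in rows if r[name]], remaining, outcome,
                           depth + 1, max_depth, min_parent, min_child)
    return node


# Synthetic, hypothetical records (not study data): 400 births in four equal
# housing groups, with the share of poor outcomes rising from 0/5 to 3/5.
rows = []
for i in range(400):
    rental = i % 2
    non_durable = (i // 2) % 2
    poor_care = 1 if (i // 4) % 5 < 2 * rental + non_durable else 0
    rows.append({"rental": rental, "non_durable": non_durable, "poor_care": poor_care})

tree = grow(rows, ["rental", "non_durable"], "poor_care")
```

In this toy dataset the first split falls on rental tenancy because it yields the purer pair of child nodes, and each branch is then split on housing durability. Reading the `risk` values down one branch gives the kind of "overall proportion rising to a subgroup proportion" statement that a triage tree supports.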

In keeping with our research questions, we constructed a provisional scorecard for maternal and newborn health risk in informal settlements in the simplest manner possible. We selected risk factors that ranked highly on the CART diagrams and decided in advance that the highest possible score would be 10 and that categorization of factors should be crude: ‘hardly any’, ‘some’, or ‘most’.

Testing the Tool Against Existing Data

To test the scorecard’s utility, we collapsed the individual records into 48 single records summarising outcomes and predictors for each cluster, and then applied the scorecard. In practice, we would probably use the raw scores to rank slum areas in terms of risk. However, to evaluate utility we arbitrarily categorised the top quartile (with a score of more than 5) as high risk and calculated sensitivity, specificity, and positive and negative predictive values in two-by-two tables of high risk exposure and outcome.
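The two-by-two evaluation described above can be sketched as follows. The cluster scores and outcome flags in the example are hypothetical, invented only to show the arithmetic; the function name and the returned dictionary are likewise assumptions. The cut-off follows the text (a score of more than 5 counts as high risk).

```python
def two_by_two_metrics(high_risk, poor_outcome):
    """Sensitivity, specificity, PPV and NPV from paired cluster-level flags."""
    pairs = list(zip(high_risk, poor_outcome))
    tp = sum(1 for h, p in pairs if h and p)          # high risk, poor outcome
    fp = sum(1 for h, p in pairs if h and not p)      # high risk, outcome not poor
    fn = sum(1 for h, p in pairs if not h and p)      # not flagged, poor outcome
    tn = sum(1 for h, p in pairs if not h and not p)  # not flagged, outcome not poor

    def ratio(numerator, denominator):
        # None signals an undefined measure (empty row or column of the table).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical scores for eight clusters (not study data).
scores = [7, 6, 2, 3, 8, 1, 4, 0]
poor = [True, False, False, True, True, False, False, False]
high = [s > 5 for s in scores]  # "more than 5" is the high-risk cut-off in the text
result = two_by_two_metrics(high, poor)
```

With these invented values the table holds two true positives, one false positive, one false negative, and four true negatives, so sensitivity and positive predictive value are both 2/3 and specificity and negative predictive value are both 4/5.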

Results

Associations of Indicators of Vulnerability with Maternal and Newborn Healthcare and Outcomes

Table 2 summarises three pre-specified random effects multivariable logistic regression models: for less than three antenatal care visits, non-institutional (home) delivery, and neonatal mortality. We entered seven independent variables based on Table 1: no access to legal individual or communal piped water, no access to individual or communal toilets, illegal or no electricity supply, open drains, hazardous location, non-durable housing, and rental tenancy. Insufficient antenatal care was associated with non-durable and rental housing, with odds ratios (ORs) 1.45 (95% confidence interval [CI] 1.24–1.69) and 1.49 (1.30–1.70), respectively. Home delivery was associated with no access to toilets (OR 1.87, 95% CI 1.12–3.11), illegal or no electricity supply (1.47, 1.18–1.82), and non-durable housing (1.87, 1.55–2.26). Neonatal mortality was associated with no access to toilets (5.11, 2.31–11.32) and non-durable housing (1.70, 1.06–2.70). Our first inference was that easily observable characteristics of informal settlements might be predictors of maternal and infant health. A provisional list of indicators might include no access to toilets, poor electricity supply, non-durable housing, and rental tenancy.
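For readers who want to work with the reported estimates, the sketch below shows how an odds ratio and its confidence interval relate on the log scale. It assumes, and this is an assumption rather than something stated in the text, that the intervals are Wald-type intervals of the form exp(b ± 1.96 × SE). Under that assumption the point estimate is the geometric midpoint of the interval, and the standard error of the log odds ratio can be recovered from the interval width. The function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Recover the implied odds ratio and SE of the log odds ratio from a Wald-type CI.

    Assumes the interval was computed as exp(b - z*SE) to exp(b + z*SE).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    implied_or = math.exp((log_lower + log_upper) / 2)  # geometric midpoint
    standard_error = (log_upper - log_lower) / (2 * z)
    return implied_or, standard_error


# Intervals as reported in the text for two of the associations.
or_anc_housing, se_anc_housing = log_scale_summary(1.24, 1.69)  # reported OR 1.45
or_nmr_toilets, se_nmr_toilets = log_scale_summary(2.31, 11.32)  # reported OR 5.11
```

The implied midpoints agree with the reported odds ratios to within rounding, which is consistent with the assumed interval form. The much larger standard error for the toilet-access estimate reflects how few households lacked access to a toilet.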

Table 2 Random effects multivariable logistic regression models for three sentinel outcomes, against seven potential indicators that could be appraised rapidly

Developing a Triage Tool to Identify Informal Settlements at Higher Risk

Figures 1 and 2 summarise the results of the CART analysis, which we used because of its intelligibility to general audiences and to compare with the regression models. The analysis with neonatal death as an outcome did not produce any levels below the root node because of insufficient numbers. Figure 1 shows the tree for the outcome of less than three antenatal care visits (the tree represents the test phase on 30% of the dataset). In triage order, the important factors were rental tenancy, non-durable housing, and hazardous location. The analysis began with an overall proportion of 12%, which rose to 27% if women lived in rented, non-durable housing located beside a garbage dump, polluted water body, railway line or airport. Figure 2 shows the tree for the outcome of home delivery. In triage order, the important factors were non-durable housing, no access to legal water, and hazardous location. An overall proportion of 11% rose to 35% if women lived in non-durable housing without formal water supply, located by a garbage dump, polluted water body, railway, or airport.

Figure 1 Classification and regression tree analysis for less than three antenatal care visits, using test dataset (30% of whole dataset).

Figure 2 Classification and regression tree analysis for home delivery, using test dataset (30% of whole dataset).

These findings supported our idea of a simple scorecard as a predictor of maternal and newborn health risk. We favoured the characteristics suggested by the CART analysis: non-durable housing, rental tenancy, hazardous location, and no access to legal water. We added illegal or no electricity supply and lack of access to toilets because they were associated with at least one health outcome in the regression analysis, because we wanted the scorecard to cover the five parameters of the UN-HABITAT slum definition, and because we wanted to test the scorecard in different settings. Box 1 summarises the proposed scorecard. Quantitative categories were designed to be both uniform and simple.
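A scorecard of this kind is simple to apply in code. The sketch below is hypothetical: the point values are not those of Box 1 and were chosen only so that the six indicators sum to the stated maximum of 10 (two points for each of the four CART-favoured indicators and one point for each of the two added ones, with 'some' earning half). The indicator keys and the function name are also assumptions.

```python
# Share of the maximum points earned by each crude category (assumed, not from Box 1).
CATEGORY_SHARE = {"hardly any": 0.0, "some": 0.5, "most": 1.0}

# Maximum points per indicator: HYPOTHETICAL weights chosen to sum to 10.
MAX_POINTS = {
    "non_durable_housing": 2,
    "rental_tenancy": 2,
    "hazardous_location": 2,
    "no_legal_water": 2,
    "illegal_or_no_electricity": 1,
    "no_toilet_access": 1,
}


def settlement_score(observations):
    """Sum points over all six indicators; raises KeyError if one is missing or mislabelled."""
    return sum(
        MAX_POINTS[indicator] * CATEGORY_SHARE[observations[indicator]]
        for indicator in MAX_POINTS
    )


# Hypothetical visit record for one settlement (not study data).
visit = {
    "non_durable_housing": "most",
    "rental_tenancy": "some",
    "hazardous_location": "most",
    "no_legal_water": "hardly any",
    "illegal_or_no_electricity": "some",
    "no_toilet_access": "hardly any",
}
score = settlement_score(visit)
```

Under these assumed weights the example settlement scores 5.5 out of 10. Settlements visited in this way could then be ranked by raw score.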

Box 1. Proposed scorecard for health vulnerability of slum areas

Testing the Tool Against Existing Data

The CART model derived from 70% of the dataset correctly predicted 88% of the results for antenatal care in the test 30% dataset, and 89% of results for home deliveries. The first step in testing the score was to apply it to the existing data at cluster level. This is the level at which information would be collected in a real situation. After application to the collapsed dataset of 48 cluster-level records, the card produced scores ranging from 0 to the maximum 10, with the 50th centile at a score of 3 and the 75th at a score of 5. As we have said, we will probably use raw scores to rank informal settlements in terms of risk. However, for test purposes we arbitrarily classified an area as particularly vulnerable if its score was in the top 25% (13 clusters). We used the same criterion to split outcome proportions, taking the top 25% as particularly poor. Table 3 summarises a series of analyses of two-by-two tables for the three outcomes, and for low birth weight as a comparator that had not been included in the development of the scorecard. The scorecard had poor sensitivity and positive predictive value, but relatively high specificity and negative predictive value.

Table 3 Performance of scorecard against four sentinel cluster-level indicators of health vulnerability

Discussion

Our findings suggest that obvious characteristics of informal settlements that would be intuitively associated with health risks are associated with indicators of maternal and newborn vulnerability. A scorecard based on UN-HABITAT criteria was specific but not sensitive for three indicators, and for a fourth that was not used in its derivation. The study had three substantial limitations. First, because of our focus of work and available data, we used three outcomes: uptake of a recommended three or more antenatal care visits, home delivery, and neonatal mortality. We cannot assume that the findings would apply to other indicators such as infectious and chronic disease. It is rational to suggest that they might, but we have no supporting evidence. Since maternal and newborn survival are targets for two Millennium Development Goals [30], and since they are indicators of population health, we plan to take the scorecard forward at this stage.

The second issue is the external validity and utility of the scorecard. As a speculative exercise, the analysis gives us grounds for optimism, but the next step must be to test it in other environments. Variation is inherent in slum communities—after all, this is the main problem of definition and classification—and we need to see if the utility of the scorecard is a function of its cyclical development. Precisely because of variability, we have included indicators that have scope for wider distribution. For example, access to toilet facilities was the norm in the model areas (only 2% of households did not have access to a communal or private toilet), but we included it in the scorecard because this would not necessarily be the case. The levels of both risk indicators and outcomes will differ substantially between neighbourhoods, urban areas and countries.

Thirdly, because of our emphasis on utility in the field, we took a simplistic approach to assigning scores: ‘very few’, ‘some’, and ‘most’. To test the scorecard, we took the top quartiles of both independent and dependent variables as high-risk categories. This decision was made for the sake of evaluation rather than practical usage. We are about to begin a programme of work in 40 informal settlements, evaluating the effectiveness of community resource centres as a delivery strategy for interventions to improve women’s and children’s health. We would like the programme to run in areas of particular need, and face again the task of prioritising settlements for selection. This is precisely the kind of situation in which we think the scorecard will be useful, and we will test it along with more detailed data collection in order to see if it is as useful as we hope it might be. Our idea is to rank candidate settlements by score and then select the top 40. Despite its lack of sensitivity, the specificity of the scorecard may be an advantage, since the aim is to make sure that selected settlements are genuinely vulnerable. We would welcome opportunities to modify and test the scorecard in collaboration with groups working in other cities and countries.

Conclusion

We confirm the supposition that visible characteristics of urban slum localities do indicate maternal and newborn health vulnerability. We suggest that a simple scorecard based on six easily collected indices might be a useful tool to select areas at particularly high risk. The next steps will be to test external validity and to examine other health outcomes.