Under‑Five Child Growth and Nutrition Status: Spatial Clustering of Indian Districts

Variation in human growth and the genetic and environmental factors that influence it have been described worldwide. The objective of this study is to assess the geographical variance of under-five nutritional status and its related covariates across Indian districts. We use the most recent fourth round of the Indian National Family Health Survey conducted in 2015–2016, which for the first time offers district level information. We employ principal component analysis (PCA) on the demographic and socio-economic determinants of childhood morbidity and conduct hierarchical clustering analysis to identify geographical patterns in nutritional status at the district level. Our results reveal strong geographical clustering among the districts of India, often crossing state borders. Throughout most of Southern India, children are provided with relatively better conditions for growth and improved nutritional status, as compared to districts in the central, particularly rural parts of India along the so-called “tribal belt”. It is also here that girls on average weigh less and are shorter than boys. Looking at average weight, as well as the proportion of children that suffer from underweight and wasting, north-eastern Indian districts offer living conditions more conducive to healthy child development. The geographical clustering of malnutrition, as well as below-average child height and weight, coincides with high poverty, low female education, lower BMI among mothers, and higher prevalence of both parity 4+ births and teenage pregnancies. The present study highlights the importance of combining PCA and cluster analysis in studying variation in under-five child growth and of conducting this analysis at the district level. We identify the geographical areas where children are at severe risk of undernutrition, stunting and wasting and contribute to formulating policies to improve child nutrition in India.


Introduction
Child growth and nutritional development are largely influenced by living conditions, including socio-economic, cultural, demographic and climatic factors that can vary dramatically across nations. National variation in human growth patterns across the world has been well documented for decades (Eveleth et al. 1976, 1990) and average child height and weight are important indicators of population health and human capital (Coffey and Spears 2018). However, despite the strong role of neighbourhood and community level effects found in various national contexts (Fotso 2007; Uthman 2009; Avan and Kirkwood 2010; Alam 2012; Geruso and Spears 2018), disparities in child height and weight at smaller spatial scales within nations have been studied to a lesser extent.
The need to do so is particularly evident in a country like India. Despite recent advances, Indian children are still shorter than children elsewhere in the developing world at similar levels of economic development (Coffey and Spears 2018). According to the Global Hunger Index (GHI) 2017 report (von Grebmer et al. 2017), India ranks 100th out of the 119 countries investigated and has the third highest score in Asia. This severe situation is driven to a large extent by high child malnutrition and underlines the need for a stronger commitment to poverty alleviating social policies. As of 2015-2016, more than a fifth (21%) of children in India suffered from wasting (low weight for height) and although there has been progress with respect to stunting (low height for age), down by 20% since 2005, the rate still stands at a staggering 38.4%. Similarly, the underweight rate is down by 16% since 2005, but even that progress leaves India with a relatively high rate of 35.7% (IIPS and ICF 2017).
These elevated rates of childhood malnutrition have large public health implications. Most importantly, widespread undernutrition is strongly correlated and without a doubt causally linked to higher infant mortality (Schroeder and Brown 1994;Rice et al. 2000;Black et al. 2008;Liu et al. 2016). Even if child deaths are not directly due to undernutrition, it has been shown to be the underlying cause for a large number of child deaths from diarrhoea, pneumonia, malaria or measles occurring in the developing world (Caulfield et al. 2004). Undernourished children are more likely to suffer from poor health compared to well-nourished children (Latham 1997;Cunha 2000;Gillespie and Haddad 2003) and consequently, perform worse in educational terms. By the time children enter school, undernourished children already lag behind with regard to not just physical, but also cognitive, behavioural and emotional development, which cannot be compensated through schooling (Alderman et al. 2006;Ruel and Alderman 2013). In the aggregate, these effects amount to large-scale losses in productivity and economic costs (Heckman and Masterov 2007;Galasso et al. 2016;Masoud et al. 2018) that prevent countries like India from achieving further development goals.
But the risk of malnutrition is not evenly spread across India. Besides climatic variation, which leads to differences in agricultural patterns and yields, India is characterized by vast spatial heterogeneity in the social, cultural, demographic and economic environment (James 2011; Samir et al. 2018), all of which has strong implications for the circumstances of growing up. Guilmoto and Rajan (2001) first described the historical spatial variation in the trajectory of the fertility transition across Indian districts going back to 1951. Kumar et al. (2012) conduct a geospatial analysis of under-5 mortality and find strong district-level variation in the provision of maternal and child health services to play a leading role in explaining the observed mortality patterns. Together with increases in education, these reductions in under-5 mortality also seem to be driving spatial variation in fertility decline in the more recent past (Mohanty et al. 2016). Urbanization, which has been shown to follow a rather distinctive pattern in India (Denis and Zérah 2017), as well as development more broadly, varies heavily across the subcontinent. There is considerable geographical variation in the percentage of the population belonging to one of India's scheduled castes or scheduled tribes (SC/ST, Census of India 2011). Moreover, the widespread phenomenon of son preference, which might result in both higher fertility (Singh et al. 2017) and higher prevalence of malnutrition among girls (Pillai and Ortiz-Rodriguez 2015; Kugler and Kumar 2017), has been shown to follow marked spatial patterns (Guilmoto et al. 2018).
Consequently, due to their mere size Indian states are not always analytically meaningful and substantial inter-state, as well as intra-state differences in child health have been observed (IIPS and ICF 2017). In their geospatial analysis of infant and child mortality, Singh et al. (2011), for example, emphasize important interregional disparities, both within and across states. Adjacent districts in neighbouring states are often more similar than districts within the same state because administrative demarcations often disregard cultural, religious and linguistic homogeneity. While levels of malnutrition and differences in child growth have been examined at the individual and household level for India (Kanjilal et al. 2010;Sarkar et al. 2013;Nie et al. 2019), similar studies at the district level are mostly missing (Khan and Mohanty 2018), despite their strong policy relevance. Lower administrative level analysis is important because it gives a better picture of existing health inequalities within states or spreading across state borders and can thus help policy makers to respond appropriately in high-risk pocket areas. Financial assistance is typically provided by the state or central government, but the policy discourse is based on district-level estimates and policies are implemented by the administrative head of a particular district in collaboration with the local Member of the Legislative Assembly (MLA) (Swaminathan et al. 2019).
The aim of this study, therefore, is to understand the geographical variation in growth and nutritional patterns of children under the age of five among the districts of India. The three main objectives are (1) to identify geographical clusters based on the demographic and socioeconomic characteristics of the households within them; (2) to identify differences or similarities in height, weight and nutritional status of children living in different clusters; and (3) to assess whether the observed patterns differ by gender.
Clustering Indian districts based on socio-economic characteristics of the households with children under the age of five may help identify factors that affect child growth and development in these diverse communities. Our results can help in designing and implementing appropriate state-specific or regional strategies and intervention programs to prevent child undernutrition. The findings of this study can help improve our understanding of the district-level nutritional conditions children are exposed to in India today, which may in turn help reduce avoidable child deaths in the future and meet the targets set by the SDGs for India as a whole from the district level up.

Data and Methods
The anthropometric and socio-economic data used in this study are taken from the most recent wave of the National Family Health Survey (NFHS-4) conducted in 2015-2016 by the Health Ministry, Government of India. For the first time in the history of the NFHS surveys, district information corresponding to households is made publicly available (IIPS and ICF 2017). The survey, which does not contain any information that would make individual survey participants identifiable, is based on a sample of 1,315,617 children born to a total of 699,686 women aged 15-49 years from 601,509 households. In our analysis, we make use of the restricted sample of 259,627 children that were born in the 5 years before the survey. The sample was selected through a two-stage sample design and covers all the 640 districts as per the 2011 Census of India. The individual level data are available from the Demographic and Health Surveys (DHS) data repository and can be accessed upon request. For our study, the units of analysis are the 640 districts of India. In our analyses, we consistently use individual-level sampling weights.
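The aggregation from weighted individual records to district-level averages can be illustrated with a minimal sketch. The record layout, the district identifiers and the toy values below are hypothetical and are not taken from NFHS-4; the sketch only shows how a sampling weight enters a district mean.

```python
from collections import defaultdict

def weighted_district_means(records):
    """Return {district: weighted mean} from (district, value, weight) tuples."""
    sums = defaultdict(float)     # running sum of weight * value per district
    weights = defaultdict(float)  # running sum of weights per district
    for district, value, weight in records:
        sums[district] += weight * value
        weights[district] += weight
    # Districts whose weights sum to zero are skipped to avoid division by zero.
    return {d: sums[d] / weights[d] for d in sums if weights[d] > 0}

# Hypothetical toy records: (district id, child height in cm, sampling weight)
children = [
    ("D1", 80.0, 1.0),
    ("D1", 90.0, 3.0),
    ("D2", 85.0, 2.0),
]
means = weighted_district_means(children)
print(means)
```

In this toy example the second child in district D1 carries three times the weight of the first, so the district mean is pulled towards 90 cm rather than sitting at the unweighted midpoint.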
To measure growth of children aged 0-59 months, we calculate average height and weight for girls, boys and both sexes combined at the district level. Further indicators of nutritional deprivation derived from height and weight are stunting, wasting and underweight. Following WHO guidelines (World Health Organization 2006, 2011), in our analysis we classify children as stunted, wasted and underweight, respectively, if their height-for-age, weight-for-height and weight-for-age Z-scores are more than two standard deviations below the median of the reference population (set by the median of the WHO child growth standards). While the three phenomena often appear in combination, they generally measure different types of undernourishment: stunting results from chronic nutritional deprivation, wasting indicates acute nutritional deficiency and underweight represents the combination of the two.
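As a minimal sketch of the classification rule just described, the following applies the cut-off of minus two standard deviations to three Z-scores per child. The function names and the toy Z-scores are hypothetical, and the sketch assumes that the Z-scores have already been computed against the WHO growth standards; it does not handle missing or implausible values, nor sampling weights.

```python
THRESHOLD = -2.0  # cut-off in standard deviations from the reference median

def classify_child(haz, whz, waz, threshold=THRESHOLD):
    """Flag a child from height-for-age, weight-for-height and weight-for-age Z-scores."""
    return {
        "stunted": haz < threshold,      # chronic deprivation
        "wasted": whz < threshold,       # acute deficiency
        "underweight": waz < threshold,  # composite of the two
    }

def prevalence(children, indicator):
    """Unweighted share of children flagged for the given indicator."""
    flags = [classify_child(*child)[indicator] for child in children]
    return sum(flags) / len(flags)

# Hypothetical toy Z-scores: (height-for-age, weight-for-height, weight-for-age)
children = [
    (-2.5, -1.0, -2.1),
    (-1.0, -0.5, -0.8),
    (-3.0, -2.2, -2.9),
    (0.1, 0.0, 0.2),
]
for name in ("stunted", "wasted", "underweight"):
    print(name, prevalence(children, name))
```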
In accordance with the literature, we also use the NFHS to compose a set of district-level determinants of child growth and malnutrition. These determinants can be broadly differentiated into three main categories. Among the health-related determinants, we choose the proportion of children that were fully immunized at the age of 12-23 months, the proportion of children that have never been breastfed, and mothers' body mass index (BMI). For demographic determinants, we calculate the proportion of parity 4+ births, the proportion of teenage pregnancies, the average birth interval within a district, as well as (in our analysis of the combined spatial pattern for both sexes) the proportion of female children under five. Similarly, socio-economic determinants were derived at the district level using information on the proportion of households that are female-headed, the proportion of the district population living in urban areas, the proportion of mothers with mass media exposure, the proportion of mothers with at least secondary level of educational attainment, the proportion Hindu, the proportion of the population belonging to historically disadvantaged groups (SC/ST), as well as the prevalence of using safe cooking fuels, improved water supply and sanitation. Finally, we account for the geographic location of the district by using longitude and latitude information of the district centroid. Summary statistics for each of these variables can be found in Table 1.
NFHS does not contain information on the household's income situation. Therefore, to account for low income status at the district level, a wealth indicator was recreated following the procedure outlined in the DHS report on NFHS-4 (Rutstein 2015), where women's economic status was assessed through a composite index of household characteristics, such as possession of consumer durables and assets, as well as building characteristics. The three components from that wealth index that are particularly relevant to our study subject, namely access to modern sanitation, cooking fuels and clean sources of drinking water, were excluded from our reconstruction of the wealth index because they are explicitly controlled for as part of our analysis.
Due to the large number of often correlated variables, before identifying district clusters we first use principal component analysis (PCA) as a preprocessing step to denoise the data. Once the complexity of the data set has been reduced to its principal components, we perform cluster analysis on the principal components to identify regional child growth and nutritional status patterns across India. All the analyses are conducted using R Statistical Software, in particular the "FactoMineR" package (Escofier et al. 2017). The correlations between the variables are depicted in Fig. 1 sorted by hierarchical clustering such that similar items group together.
Combining PCA with cluster analysis is useful in various ways. PCA is an exploratory approach that serves to reduce the complexity of a data set which can suffer from problems such as linear dependence among its variables or non-normality. Standard OLS regression techniques fail under such circumstances. Most importantly, though, the results of any subsequent clustering will be more robust if the less meaningful information, which is already accounted for by other, correlated factors, has already been cleared from the data. The principal components drawn from the full set of variables contain only the information that contributes the most to describing statistical relations within any random subset of the data. This is particularly useful in case there are reasons to doubt the accuracy of specific survey items. The more variables there are in the dataset, the more useful PCA becomes. The contribution of each variable in terms of its correlation with the most important dimensions extracted by PCA is depicted in Fig. 2. The contribution of variables "poor" and "edu", for example, to the first principal component is highly significant, whereas there is no significant contribution of these variables to the second dimension. Vice versa, the proportion "hindu" appears to be highly correlated with the second dimension but is irrelevant for the first principal component. Overall, the first four principal components together explain about 60% of the variance.
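To make the notions of a variable's contribution to a principal component and of the share of variance explained more concrete, the following sketch extracts only the first principal component from a correlation matrix by power iteration. It is not the FactoMineR procedure used in the study; the three variable names are borrowed from the text, but the six district values are invented purely for illustration.

```python
import math

def standardize(rows):
    """Centre each column and scale it to unit (population) variance.
    Assumes no column is constant."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]

def first_component(c, iterations=500):
    """Leading eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    p = len(c)
    v = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-symmetric start vector
    for _ in range(iterations):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(c[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigenvalue, v

# Hypothetical district rows: (poor, edu, hindu), each a proportion
districts = [
    (0.1, 0.90, 0.8),
    (0.2, 0.80, 0.3),
    (0.3, 0.65, 0.9),
    (0.4, 0.60, 0.2),
    (0.5, 0.50, 0.7),
    (0.6, 0.35, 0.4),
]
corr = correlation_matrix(standardize(districts))
eigenvalue, loadings = first_component(corr)
share = eigenvalue / len(corr)  # variance explained by the first component
print(round(share, 3), [round(x, 3) for x in loadings])
```

In this toy data "poor" and "edu" are almost perfectly negatively correlated, so they load on the first component with opposite signs, while "hindu" loads more weakly. This mirrors the qualitative pattern described above but not the actual values in Fig. 2.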
Cluster analysis techniques enjoy widespread popularity in diverse scientific disciplines (Eveleth et al. 1990; Tanner 1962). The first application to anthropometric data can be traced back to Mahalanobis et al. (1949). Other studies have used cluster analysis for the study of health conditions such as diabetes (Guttula et al. 2010) or cardiac disease (Perumal and Mahalingam 2013). More specifically for the case of India, Vasulu and Pal (1989) showed the relationship between anthropometric variation and cultural diversity among the Yanadi, a tribe in the south eastern part of Andhra Pradesh. Similar methods have been applied to classify growth profiles of children in China (Cheng-Ye et al. 1991), as well as to create patterns of overall activity and inactivity in a diverse sample of Chinese youth to evaluate their use in predicting overweight status (Monda and Popkin 2005). A study by Tucker (2010) combined PCA and cluster analyses to study dietary patterns of different populations in the US and Ferenci et al. (2008) used the method to discuss obesity patterns observed among Hungarian children. More recently, Rao et al. (2013) combined PCA and cluster analysis on nutritional and anthropometric data of preschool children to analyse geographical variation in child growth among the districts of Uttar Pradesh.
Table 1 Descriptive statistics of the selected district level variables using survey weights, India
This approach is different from the usual approach to analysing spatial clustering through local indicators of spatial association (LISA, Anselin 1995). LISA allows for the decomposition of global indicators of spatial association, such as Moran's I, into the contributions of individual observations which can then be mapped and inspected visually. This method is particularly useful in identifying local hotspots or pockets of non-stationarity. However, Moran's I and the resulting LISA cluster maps are derived from information on only one indicator, in addition to a neighbourhood matrix. The way "neighbourhood" is defined is sometimes disputable. It could be based on simple contiguity, but depending on the problem to be analysed it might make more sense to define neighbourhood based on distance between neighbours or the length of their common border. The method we employ here uses information from a much wider set of inputs. These do tend to suffer from linear dependence. However, by using PCA as a preprocessing step, we can find meaningful spatial relations from clustering without providing any information on the neighbourhood relations, which can sometimes be ill-specified. Given the difference in the inputs used by our method and the method suggested by Anselin, it is not surprising that the two sets of results do not match entirely. For comparability, we show LISA cluster maps based on the first three principal components, as well as the main indicators of morbidity (stunting, wasting, and underweight) in the appendix.
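The decomposition of global Moran's I into local contributions can be sketched as follows. The sketch uses a binary contiguity matrix for four hypothetical districts arranged in a row and the common, unstandardized form of the local statistic; the choice of weights matrix is an assumption here, exactly as discussed above, and no significance testing (as required for LISA cluster maps) is attempted.

```python
def deviations(values):
    """Deviations of each value from the overall mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def morans_i(values, weights):
    """Global Moran's I for one indicator and a spatial weights matrix."""
    n = len(values)
    z = deviations(values)
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)

def local_morans(values, weights):
    """Local Moran statistic for each observation."""
    n = len(values)
    z = deviations(values)
    m2 = sum(d * d for d in z) / n  # second moment of the indicator
    return [z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n)) for i in range(n)]

# Hypothetical example: four districts in a row, neighbours share a border
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
stunting = [1.0, 2.0, 3.0, 4.0]  # invented indicator values

global_i = morans_i(stunting, contiguity)
local_i = local_morans(stunting, contiguity)
s0 = sum(sum(row) for row in contiguity)
print(global_i, local_i, sum(local_i) / s0)
```

With this definition the local statistics sum to the global statistic multiplied by the total of the weights, which is the sense in which the global indicator is decomposed into the contributions of individual observations.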

Results
Once we have extracted the principal components, we can perform hierarchical cluster analysis on the reduced number of variables using Ward's criterion. The crucial question in this step relates to the number of clusters to distinguish. This can be decided either through visual inspection of the dendrogram (depicted in Fig. 3) or by applying a quantifiable partitioning criterion, e.g. minimizing the ratio between two successive within-group inertias (Husson et al. 2017). In our case, both of these methods suggest distinguishing three separate clusters among Indian districts, two of them very large with 271 and 313 districts, respectively, and a third one with only 56 districts.
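Ward's criterion merges, at each step, the pair of clusters whose union produces the smallest increase in within-group inertia. The sketch below is a deliberately naive implementation on invented two-dimensional scores, meant only to illustrate the criterion; it is not the FactoMineR routine used in the study, and the number of clusters is passed in by hand rather than chosen from the dendrogram or the inertia ratio.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_clustering(points, n_clusters):
    """Agglomerative clustering with Ward's criterion (naive, for small inputs)."""
    clusters = [[i] for i in range(len(points))]  # start with singletons
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([points[i] for i in clusters[a]])
                cb = centroid([points[i] for i in clusters[b]])
                na, nb = len(clusters[a]), len(clusters[b])
                # Increase in within-group inertia caused by merging a and b
                cost = na * nb / (na + nb) * sq_dist(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical district scores on the first two principal components
scores = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),  # first group
    (5.0, 5.0), (5.1, 4.9),              # second group
    (10.0, 0.0), (9.9, 0.2),             # third group
]
clusters = ward_clustering(scores, 3)
print(sorted(sorted(c) for c in clusters))
```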
Fig. 3 Dendrogram of district clusters derived from applying cluster analysis to principal components across all 640 Indian districts
The spatial distribution of these clusters is shown in Fig. 4. The regional clustering is indeed very strong, with members of cluster 1 being concentrated in the southern and northern parts of India and cluster 3 districts being concentrated at intermediate latitudes along India's so-called "tribal belt" ranging from Rajasthan to West Bengal. Of the 8.6% of the Indian population that can claim SC/ST status, 75% are living in this area (Census of India 2011). On the other hand, of the 56 districts in cluster 2 only six are located in the western part of India at below 80 degrees longitude. The remaining 50 districts are in the East at above 90 degrees longitude. Since the quality of the data collected in Jammu & Kashmir might suffer from the ongoing conflict there, we also ran the analysis without that state and the results for the rest of India are robust to this modification. Table 2 shows the differences between district cluster 1 and the full sample of Indian districts by variable. As indicated by a positive value of the v.test statistic, describing the standardised deviation between the mean of the districts within the cluster and the general average (Husson et al. 2017, p. 155), districts with a high proportion of women with at least secondary education ("edu") are significantly overrepresented in cluster 1 compared to the whole of India. Whereas the mean in this proportion is 0.60, for the districts belonging to cluster 1 the proportion is 0.76. Districts inside cluster 1 also tend to be more "urban", children are more likely to be vaccinated ("vacc") and houses are more likely to be equipped with improved sanitation, a clean source of drinking water and safe fuel for cooking.
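A minimal sketch of how such a standardised deviation can be computed is given below. It follows the general form of the v.test (cluster mean minus overall mean, divided by the standard error of the mean of a sample of that size drawn without replacement). The use of the sample variance with an n - 1 denominator is an assumption of this sketch and may differ from the convention in FactoMineR; the six values are invented.

```python
import math

def v_test(values, in_cluster):
    """Standardised deviation of a cluster mean from the overall mean.

    values     : one district-level variable, one entry per district
    in_cluster : booleans marking which districts belong to the cluster
    """
    n = len(values)
    member_values = [v for v, flag in zip(values, in_cluster) if flag]
    n_q = len(member_values)
    overall_mean = sum(values) / n
    cluster_mean = sum(member_values) / n_q
    # Assumption: sample variance (n - 1 denominator) over all districts
    variance = sum((v - overall_mean) ** 2 for v in values) / (n - 1)
    # Finite-population correction for drawing n_q of n without replacement
    std_error = math.sqrt(variance / n_q * (n - n_q) / (n - 1))
    return (cluster_mean - overall_mean) / std_error

# Hypothetical values of one variable for six districts
edu = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
membership = [False, False, False, False, True, True]
v = v_test(edu, membership)
print(round(v, 4))
```

A positive value indicates that the variable is on average higher inside the cluster than across all districts, a negative value that it is lower.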
The negative v.test statistics in the bottom part of the table indicate an underrepresentation of districts in cluster 1 that have large proportions of poor people (according to our recreated poverty index) with less access to mass media ("mmedia"). Moreover, the proportion of low-BMI and otherwise deprived mothers ("SCST") is smaller within districts in cluster 1 and they might be characterized by lower fertility, as suggested by a smaller proportion of higher order births ("prop4pl"). Combining the results from PCA and cluster analysis, a visual representation of the variables most important in distinguishing cluster 1 can be found in Fig. 5. For the green districts that are part of cluster 1, the variables with a positive v.test in Table 2 display a strongly negative relation with the first principal component. All other variables show a positive relation with PC1 for cluster 1 districts. Table 3 shows the variables most distinctive for cluster 2 districts. In addition to the geographical pattern described earlier, the variables displaying the strongest positive deviation from the country overall are the proportion of population belonging to one of the scheduled castes or scheduled tribes, as well as the proportion of children ranking fourth or higher in the birth order and the proportion growing up with modern toilet facilities in the household ("ptoilet"). On the other hand, districts with low birth intervals, completed vaccination of the children, improved water supply and mothers with low BMI are underrepresented in cluster 2. In addition, with the exception of a few districts in the North of India, cluster 2 is concentrated in the North-eastern part of the country. Therefore, average longitude for districts belonging to cluster 2 is significantly higher than for a central Indian district. Judging from Table 4, cluster 3 appears to be the inversion of cluster 1.
The same variables that identified districts predominantly underrepresented in cluster 1 now identify districts that are overrepresented and vice versa. Do the clusters found by principal component and cluster analysis indeed have explanatory power with regard to child outcomes? Figure 6 suggests they do. While for cluster 1, children below the age of 5 in the median district are on average 84.84 cm tall, in the median district for cluster 3 children at that age measure only 82.86 cm. The median in cluster 2 is higher than in cluster 1 by about half a centimetre, yet districts in cluster 2, spreading over two geographically completely separate areas of India, are more heterogeneous and the difference between cluster 1 and cluster 2 cannot be said to be significant. Still, comparing height among cluster 1 and cluster 2 with cluster 3, we find no overlap in spreads, indicating a clear group difference: 75% of districts in cluster 3 lie below 75% of districts in cluster 1 and cluster 2. Differences become clearer using the same clusters to identify differences in district level average weight of children below the age of five. Cluster 2 is now clearly very different from cluster 3 as there is almost no district in cluster 2 falling within the 75% range of district averages in cluster 3. Looking at average weight, it is likely that cluster 1 is different from cluster 2, as the median in cluster 1 no longer reaches the spread of the box around the median for cluster 2. Moreover, districts within cluster 2 are more homogeneous with respect to children's weight than with respect to their height, as indicated by the spread of the box. Similarly, children within districts belonging to cluster 3 are more likely to suffer from stunting, wasting and underweight compared to those within clusters 1 and 2. Yet the difference between cluster 1 and cluster 2 is not significant with respect to stunting.
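The box plot reading used above, where the upper quartile of one cluster lies below the lower quartile of another, can be written down as a simple check. The helper name and the district averages below are hypothetical; the quartiles are computed with the default method of Python's statistics.quantiles, which may differ slightly from the quartile rule behind Fig. 6.

```python
import statistics

def boxes_separated(lower_group, upper_group):
    """True if the upper quartile of lower_group lies below the
    lower quartile of upper_group, i.e. the two boxes do not overlap."""
    _, _, q3_lower = statistics.quantiles(lower_group, n=4)
    q1_upper, _, _ = statistics.quantiles(upper_group, n=4)
    return q3_lower < q1_upper

# Hypothetical district-level average heights (cm) for two clusters
cluster_1 = [84.0, 84.5, 85.0, 85.5, 86.0]
cluster_3 = [82.0, 82.5, 83.0, 83.5, 84.0]

print(boxes_separated(cluster_3, cluster_1))
```

Such a check is purely descriptive. It does not replace a formal test of group differences.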
Finally, we want to look at whether these spatial patterns could be driven in part by gender disparities with respect to nutritional status. Previous studies have suggested that son preference is in fact the single most important factor in explaining India's high child malnutrition rates compared to countries at similar, sometimes lower levels of development (Jayachandran and Pande 2015). This factor benefits first-born sons in particular over their later-born siblings. Not only might Indian families concentrate resources in the form of nutrition and medical care to support their first-born sons, but the lack of male offspring also leads to higher than desired family sizes (Arnold et al. 1998; Clark 2000), which in turn leads to a higher likelihood of the mother being undernourished during later pregnancies and to lower per child investments, as expressed in e.g. levels of educational attainment (Kugler and Kumar 2017). According to the raw anthropometric data, Indian girls under the age of five are not necessarily shorter or lighter than boys. In fact, in 144 out of 640 districts girls are on average taller than boys, while girls weigh more on average in 56 of the 640 districts. For India as a whole, prevalence of undernutrition is almost the same among girls and boys (IIPS and ICF 2017, p. 293). The question we are interested in is whether gender disparities can still be found sub-nationally.
Fig. 6 Distribution of wasting, stunting, underweight, as well as average height and weight across Indian districts for three different groups of districts identified by cluster analysis
To answer this question, we standardize the weight and height measures separately for girls and boys to produce measures of stunting, wasting and underweight. According to Moran's I, global spatial autocorrelation for all the districts of India combined is significant only in the case of stunting. Nevertheless, looking at the LISA cluster maps of the gender-ratio (boys/girls) for stunting, wasting, and underweight z-scores shown in Fig. 7, there are significant local pockets of gender disparity across India. Red areas refer to pockets of districts where z-scores for boys are consistently higher than those for girls, i.e. districts with ratios larger than 1 are surrounded by similar districts. Dark blue pockets, on the other hand, indicate pockets where girls appear to be disadvantaged. Neither of the two types of clusters is confined to the less developed states. For example, gender disparities in stunting are strong not just in less developed states like Chhattisgarh, but even in some of the districts of Andhra Pradesh and Telangana. In the case of wasting, we find district pockets in developed states like Gujarat and Andhra Pradesh, as well as in less developed states like Odisha. Underweight is not equally prevalent among boys and girls in some of the districts from developed states like Gujarat, Andhra Pradesh, Telangana and Tamil Nadu. These findings confirm that looking only at state-level differences in development outcomes might hide important differences at smaller spatial scales.

Discussion
Since the National Nutritional Anemia Prophylaxis Program (NNAPP) of 1970, several ambitious programs have been implemented in India to tackle childhood malnutrition. In 1976 the government initiated the Integrated Child Development Services (ICDS) scheme, one of the world's largest programs for early childhood care and development. The scheme was implemented to provide children with health, nutrition, and education services from birth to age six, as well as nutritional and health services to pregnant and breastfeeding mothers. More recent initiatives include the National Food Security Mission (NFSM) launched in 2007, as well as the Nutri Farms scheme launched in 2013-2014. Yet despite these strong efforts in the past, India continues to score very poorly on the Global Hunger Index and lags behind other developing countries at similar or even lower levels of development. A recent study by Lucas et al. (2019) highlights India's crucial role in achieving the global SDG on child mortality.
Fig. 7 LISA cluster maps of gender-ratios (boys/girls) in z-scores for Stunting (left), Wasting (middle), and Underweight (right) at the Indian district level. NFHS-4, 2015-2016
One of the reasons why it takes so long to fully eradicate wasting and stunting is the intergenerational transmission of risk factors. Many societies are caught in a vicious cycle where malnourished girls, lagging behind in developmental terms, grow up to become low-BMI mothers whose underweight children again face a higher risk of developmental deficiencies. As described by de Onis and Branca (2016), this type of "growth faltering" starts already in utero and continues to affect children's further development throughout their lives, leading to increased morbidity and mortality. Specific to the Indian context is the strong and persistent pattern of son preference.
But it would be wrong to think that the phenomenon affects only those children in India falling below the specified threshold for stunting and wasting. In fact, India's entire length-for-age/height-for-age Z-score distribution is shifted to the left. This has dramatic consequences, most severely for the children affected and their families, but growth faltering also has to be seen from a public health standpoint and affects the development outlook of entire nations as these children will be more likely to suffer from disease and have lower productivity as adults (Sachs 2001;Narayan et al. 2019).
The findings presented in this study are important for public health planning and for targeting the underlying factors associated with child growth and malnutrition in India. The National Health Mission (NHM) program initiated by the Government of India has been set up to improve child and maternal health conditions across India. The National Nutritional Mission (NNM) focuses specifically on the high priority states (Falcao et al. 2015;Murray et al. 2014). Prioritising efforts within the districts identified by this study as most disadvantaged will help further alleviate the overall burden of malnutrition in India, while potentially increasing the efficiency and lowering the costs of intervention programs.
Moreover, even intervention programmes that are sensitive to different types of nutritional deficiencies might be ineffective if the underlying determinants of malnutrition are not addressed (Ruel and Alderman 2013). We therefore encourage the development of multi-sectoral plans to deal with malnutrition at district level by combining direct nutrition interventions with strategies linked to health, family planning, water supply and sanitation, as well as cultural norms such as preferential treatment of first born male offspring (del Casanovas et al. 2013). Given the scarcity of health resources, the more focused implementation of child-health inducing policy schemes in the identified geographical areas seems all the more germane. Stakeholders within government and international funding agencies should take an integrated approach coordinating efforts toward poverty reduction, increasing female education, improving sanitation, as well as care services provided to mothers and children, especially among the SC/ST communities and in the disadvantaged regions of the country where son preference is still prevalent.

Conclusion
Previous studies conducted both on India and on other developing countries have demonstrated that socio-economic and demographic variables are strongly related to the nutritional status of children, affecting their weight and height (Van de Poel et al. 2008; Kanjilal et al. 2010; Biswas and Bose 2011). Using PCA and cluster analysis on DHS survey data, we show strong regional clustering among 640 districts of India. Based on household-level information describing children's and mothers' living conditions, we find that growing up in the central, particularly rural, districts of India yields far worse child outcomes in terms of growth and nutritional development. On the other hand, our results suggest a significant advantage for children growing up in any of the districts of Southern India, particularly with respect to height. Looking at average weight, as well as the proportion of children that suffer from underweight and wasting, North-eastern districts seem to offer living conditions more conducive to healthy child development. With regard to gender differences in the prevalence of malnutrition, our results based on LISA cluster maps suggest several local pockets, both within states and crossing state boundaries, where one of the sexes seems to fare better. However, we were not able to relate these smaller clusters to our results obtained from PCA and cluster analysis.
With respect to the specific characteristics that distinguish regional clusters, we find that besides female education and urbanization, access to mass media, usage of clean cooking fuels, improved sanitation and drinking water are the most important determinants of child well-being. Conversely, low BMI of mothers, household poverty levels, high fertility (as indicated by large proportions of higher order births), a predominantly Hindu population, mothers' age at birth below 20 years and lack of immunization are significant determinants of lower child height and weight, as well as higher prevalence of nutritional deficiencies among under-five children.
These results confirm previous findings on the determinants of childhood malnutrition. The phenomenon is more widespread among poorer families in rural areas (Gwatkin et al. 2007; Wagstaff and Watanabe 2000), where access to food is less secure and living conditions are more likely to lead to infectious diseases and further undernutrition. Female educational status and BMI have been found to be positively associated with nutritional indicators at the community and regional level in India (Kravdal 2004; Khan and Mohanty 2018), whereas the negative association with the proportion Hindu corresponds with the "Muslim advantage in child mortality" observed earlier (Guillot and Allendorf 2010). Lack of clean water supply and improved sanitation have repeatedly been identified as major drivers of infection and premature mortality (Spears et al. 2013; Rah et al. 2015; Geruso and Spears 2018), and the possibility of making the necessary amendments to the household's infrastructure is again related to widespread economic inequalities.
One of the many seemingly paradoxical situations in contemporary India is that these patterns of nutritional deprivation coincide with increasing prevalence of obesity, not only among children from wealthier families but also among those from the lower social classes where underweight continues to be a major concern (Ranjani et al. 2016). This situation seems to be particularly severe in urban areas, where women have been shown to be at higher risk of overweight (Dev and Balk 2015). Our study does not touch upon this important form of malnutrition, which might well anticipate the future chronic disease pattern in India. Furthermore, the paper is limited in the range of data that are considered. We do not include climatic factors that might lead to differences in the availability of different types of grains (for an overview of crops available throughout the Indian subcontinent and their nutritive properties see Das et al. 2012), which might affect differences in the rates of malnutrition. In addition, while considering variation at the district level is an important step forward compared to looking only at the state level, we might miss important heterogeneity detectable at even finer levels of spatial resolution. This Modifiable Areal Unit Problem (MAUP, Wong 2004) could only partly be mitigated by looking at an even finer spatial resolution, such as that provided by DHS clusters. Moreover, the data available from DHS lack representativeness at the national level and are therefore not used here. Another limiting factor arises from differences in district-level population age structure that could be related to differences in average family size. Notwithstanding these caveats, the paper does demonstrate the existence of regional clusters of districts where malnutrition among children under the age of five is particularly severe and relates these clusters to socioeconomic and demographic risk factors.