Data sources
Leprosy
The Indian Ministry of Health reports annual new case counts for leprosy for the period 2008–2015 for each district in India (see Spatial boundaries) [18–32]. In accordance with case report data, we define each year as the twelve-month period ending March 31. The National Leprosy Eradication Program also provides annual estimated populations for each district, the number of new cases of Grade 2 disability (defined by the WHO as visible deformity to the hands or feet or severe visual impairment) at the district level, as well as state-level estimates for the fraction of multibacillary cases, the fraction of cases among children, and the fraction with Grade 2 disability at diagnosis.
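As an illustration of the reporting-year convention, the following minimal sketch maps a calendar date to the twelve-month period ending March 31 that contains it. Labeling each period by the calendar year in which it ends is an assumption made here for illustration; the function name is hypothetical.

```python
from datetime import date

def reporting_year(d):
    """Label of the April-to-March reporting period containing date d.

    Assumption: a period is labeled by the calendar year in which it ends.
    """
    return d.year + 1 if d.month >= 4 else d.year

last_day = reporting_year(date(2010, 3, 31))   # still in the period ending March 31, 2010
first_day = reporting_year(date(2010, 4, 1))   # first day of the period ending March 31, 2011
```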
Census
The 2011 Census of India contains district-level data on illiteracy, unemployment, scheduled caste and scheduled tribe populations, rural population, and poverty [33–37]. In our data set, the poverty index was based on the absence of a defined set of assets included in the census survey. A household was considered impoverished if it owned none of the following: a radio, a TV, a computer (with or without internet access), a mobile phone, a landline, a bicycle, or a motorized two- or four-wheel vehicle (including a scooter or car) [37, 38]. This definition is more restrictive than other economic measures of poverty (which routinely place 20–30% of the population in poverty); only about 18% of households meet this criterion.
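The asset-based criterion can be sketched as follows. This is an illustrative example only: it assumes household-level records, whereas the census data used here are district-level tabulations, and the asset keys and function names are hypothetical rather than census field names.

```python
# Asset categories named in the text; the key names are illustrative assumptions.
ASSETS = ("radio", "tv", "computer", "mobile_phone",
          "landline", "bicycle", "motor_vehicle")

def is_impoverished(household):
    """True if the household owns none of the listed assets."""
    return not any(household.get(asset, False) for asset in ASSETS)

def poverty_index(households):
    """Fraction of households that own none of the listed assets."""
    if not households:
        raise ValueError("no households supplied")
    return sum(is_impoverished(h) for h in households) / len(households)

# Four made-up households: two of them own none of the listed assets.
example = [
    {"radio": True},
    {},
    {"bicycle": True, "tv": True},
    {"mobile_phone": False},
]
index = poverty_index(example)
```

Under this definition, owning any single listed asset is enough to exclude a household, which is why the index is stricter than most income-based poverty measures.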
Illiteracy is defined as the inability to both read and write in any language; children 6 years old or younger are automatically considered illiterate in the census. An individual is considered unemployed (specifically, a “non-worker”) if he or she did not partake in an economically productive activity in the 12 months preceding the census survey. This includes students, homemakers, children, retirees, and beggars; it does not include subsistence farmers or others whose primary activity was producing food for self-consumption. Therefore, unemployment here does not necessarily indicate a desire to work, or an active pursuit of employment. The census also reports the fraction of a district’s population that lives in a rural area (defined as a region not registered as statutory town or municipality, with fewer than 5000 individuals, with greater than 75% of working individuals employed in agriculture, or with population density less than 400 per km²) [33].
The Constitution of India includes provisions for individuals in scheduled castes and scheduled tribes (indigenous tribal persons). Historically, these two groups experienced higher levels of discrimination, exclusion, and poverty [39]. The Census reports the number of individuals in scheduled castes and in scheduled tribes per district [35, 36], though this was not reported in 84 of the 604 analytic districts (Table 1; see Spatial boundaries, below).
Table 1 Data sources used in the analysis
State-level predictors
While we primarily focused on district-level analysis, we examined two possible state-level predictors of leprosy burden, collected from the Centre for Monitoring Indian Economy database [40]. The first, per-capita net domestic product (NDP), is thought to be a more direct measurement of community development and wealth than poverty or other socio-demographic variables. The second, the number of government hospitals in each state, may be related to healthcare availability and accessibility.
Per-capita income
For validation, we compared selected indices of poverty with district-level per-capita income data. Madhya Pradesh, one of India’s largest states, reported per-capita income (held constant relative to the 2004–2005 price index, thus adjusting for inflation) from 2008 to 2012 for 45 of its 48 districts [41].
Satellite imagery
Nighttime satellite imagery has proven useful in assessing economic conditions in the developing world [42–45]. We obtained nighttime cloud-free composites providing average visible lights and stable lights (which exclude impermanent sources of light, such as fires or other background noise), at 30 arc second resolution (roughly 1 km; Fig. 1) [46]. In the densest, most brightly lit areas, the satellite sensors become saturated and cannot record values above a certain threshold. In India, this threshold obscures subtle differences in illumination among the country’s largest cities, including Delhi, Kolkata, Bangalore, and Mumbai. Radiance, a readjusted illumination measure produced from the same satellite imagery, may provide a better indicator of economic activity and development [44, 46]. Radiance data were derived from images taken in 2010 and 2011, and were computed by averaging radiance over the area of each district. We computed the radiance divided by the estimated population, yielding a ratio which exhibits outliers (the largest value is approximately 14 times the average value). To minimize the occurrence of potential high-leverage points, we used the rank-transformed values as a predictor. Additionally, we calculated a binary low visibility indicator, defined as 1 if a district was in the lowest decile of mean visibility index, as well as a similar low radiance indicator. No effort was made to identify oil flares or other causes of high illuminance unrelated to socioeconomic development.
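The rank transformation and the lowest-decile indicator can be sketched as below. This is a minimal illustration with made-up values, not the code used in the analysis; the handling of ties (average ranks) and the decile cutoff (rank at or below 10% of the number of districts) are assumptions, and all names are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def lowest_decile_indicator(values):
    """1 if a value's rank falls in the bottom 10% of districts, else 0."""
    cutoff = 0.1 * len(values)
    return [1 if r <= cutoff else 0 for r in average_ranks(values)]

# Ten made-up districts; the third has an outlying radiance value.
radiance = [5.0, 1.0, 50.0, 2.0, 8.0, 3.0, 4.0, 6.0, 7.0, 9.0]
population = [1000] * 10
per_capita = [r / p for r, p in zip(radiance, population)]

ranked = average_ranks(per_capita)            # outlier becomes rank 10, not 50
low_flag = lowest_decile_indicator(radiance)  # flags only the dimmest district
```

After ranking, the outlying district sits only one step above its nearest neighbor, which limits its leverage in the regression.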
Spatial boundaries
There were several rearrangements of state and district boundaries over the study period. Spatial analysis was based on the GADM (Global Administrative Areas) database for administrative boundaries [47], supplemented by an updated version for selected jurisdictions [48]. If a district or state was divided into multiple districts or states during the study period, we combined data from the resulting new districts to estimate what the counts would have been for the old district boundaries, to obtain a longitudinally consistent set of reporting districts and states. Likewise, if two or more regions were merged, the data from these regions were combined throughout the study period into a single analytic district. This procedure yielded 604 analytic districts for 2008–2015 (Table 1; [15]).
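The harmonization step amounts to summing counts over a crosswalk from reporting districts to analytic districts. The sketch below illustrates the idea under stated assumptions: the district names, the crosswalk, and the record layout are all hypothetical, and the example data are made up.

```python
from collections import defaultdict

# Hypothetical crosswalk: reporting district -> analytic district.
# Districts absent from the crosswalk are assumed to be unchanged.
CROSSWALK = {"New North": "Old District", "New South": "Old District"}

def to_analytic_districts(records, crosswalk):
    """Sum cases and population by (analytic district, year)."""
    totals = defaultdict(lambda: {"cases": 0, "population": 0})
    for rec in records:
        district = crosswalk.get(rec["district"], rec["district"])
        key = (district, rec["year"])
        totals[key]["cases"] += rec["cases"]
        totals[key]["population"] += rec["population"]
    return dict(totals)

records = [
    {"district": "New North", "year": 2012, "cases": 40, "population": 600_000},
    {"district": "New South", "year": 2012, "cases": 25, "population": 400_000},
    {"district": "Unchanged", "year": 2012, "cases": 10, "population": 900_000},
]
analytic = to_analytic_districts(records, CROSSWALK)
```

The same mapping handles both splits and mergers, since in either case the analytic district is the union of the reporting districts assigned to it.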
Program activities
A group of 209 districts was identified as high leprosy districts, based on 2010–2011 reports [49], and these districts were targeted for subsequent enhanced surveillance activities through the National Leprosy Eradication Program. As in our previous analysis [15], we entered this list of districts for use as a binary regressor.
Statistical methods
Outcomes
The primary outcome variables were the leprosy annual new case detection rates (ANCDR), defined as the number of new cases in a district divided by the estimated population of the district during that year, and the rate of new Grade 2 disability per million population (Grade 2 rate). We also explored the heterogeneity in the proportion of reported leprosy cases that displayed Grade 2 disability (Grade 2 fraction).
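For concreteness, the three outcome measures can be computed as in the following sketch. The function names and the example figures are illustrative assumptions; the scaling (per 10 000 for the ANCDR, per million for the Grade 2 rate) follows the units stated in the text.

```python
def ancdr_per_10k(new_cases, population):
    """Annual new case detection rate per 10 000 inhabitants."""
    return 10_000 * new_cases / population

def grade2_rate_per_million(grade2_cases, population):
    """New Grade 2 disability cases per million inhabitants."""
    return 1_000_000 * grade2_cases / population

def grade2_fraction(grade2_cases, new_cases):
    """Share of new cases with Grade 2 disability; None if there are no new cases."""
    return grade2_cases / new_cases if new_cases else None

# Made-up district-year: 150 new cases, 6 with Grade 2 disability, 1.5 million people.
rate = ancdr_per_10k(150, 1_500_000)
g2_rate = grade2_rate_per_million(6, 1_500_000)
g2_frac = grade2_fraction(6, 150)
```

Returning None for the fraction when no cases are reported is a choice made for this sketch; the fraction is undefined in that situation.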
We computed Spearman’s rho (ρ) correlation coefficients for the outcomes of interest and the potential predictors. We then conducted multivariate linear mixed effects regression [50] of the longitudinal outcomes, using the district- and state-level predictors. All models include a random slope and intercept, year as a fixed effect, and a fixed effect in 2012 and 2013 for each of the 209 enhanced case finding districts mentioned above. We used a spatial block bootstrap (1000 replicates) to help account for spatial dependence; this approach often yields conservative confidence intervals [51]. The marginal and conditional R² values estimate the variability explained by fixed effect predictors, and by both fixed and random effects, respectively [52, 53]. To improve normality and homoskedasticity, we used the log transformation for the new case detection rates (per 10 000 inhabitants) and the per-capita Grade 2 rate (per million inhabitants), with zeros modeled as 0.5 divided by the district population (as in [15]). All analyses were conducted in R v. 3.2 for Macintosh (R Foundation for Statistical Computing, Vienna, Austria), using packages sp, maptools, spdep, lme4 (function lmer), and sperrorest.
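The treatment of zero counts in the log transformation can be sketched as follows. This is an illustration under assumptions rather than the analysis code: it takes the natural logarithm (the base is not stated in the text) and applies the same per-population scaling to the substituted value of 0.5 cases; the function name is hypothetical.

```python
import math

def log_rate(count, population, per=10_000):
    """Log of the rate per `per` inhabitants; zero counts are replaced by 0.5 cases.

    Assumptions: natural logarithm, and the 0.5 substitution is scaled like any other count.
    """
    effective = count if count > 0 else 0.5
    return math.log(per * effective / population)

zero_case = log_rate(0, 50_000)   # log(10 000 * 0.5 / 50 000) = log(0.1)
five_case = log_rate(5, 50_000)   # log(10 000 * 5 / 50 000) = log(1.0)
```

Because the substituted value depends on the district population, a zero count in a large district maps to a lower rate than a zero count in a small one.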