Determinants of COVID-19 pandemic in India: an exploratory study of Indian states and districts

Countries across the globe are facing one of the worst infectious disease outbreaks of modern times in the form of the COVID-19 pandemic, and have taken different measures to control and manage it. There are two propositions regarding the effective control and management of a pandemic like COVID-19. First, a strong and effective public health care system is essential for managing a public health crisis, and the uneven responses to COVID-19 are mainly due to inadequate health infrastructure. Second, the spread of COVID-19 depends on the interplay of other social determinants at the local level, and therefore addressing the gaps in these determinants at the local level is critical to controlling and managing the pandemic. The present paper examines these two propositions in the Indian context at the state and district levels, respectively. Using cross-sectional data and composite indices of COVID-19 intensity and level of health infrastructure constructed at the state level, the results show that there is no robust relationship between level of health infrastructure and management of COVID-19 at the state level, as states with better health infrastructure are also struggling to combat COVID-19. The district-level analysis indicates a significant relationship between concentration of COVID-19 and social determinants: the majority of districts with a higher concentration of COVID-19 are those that fare worse than the national average on social determinants.


Introduction
The world is witnessing an unanticipated public health crisis because of the COVID-19 pandemic, caused by 'severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2)' (Lahiri et al. 2020; Arif and Sengupta 2020). The outbreak of COVID-19 was first reported in the city of Wuhan, China, in December 2019 and has since spread worldwide (Ghosh et al. 2020a, b). The first case of COVID-19 in India was reported in the Thrissur district of Kerala on January 30, 2020 (Sarkar et al. 2020), and by the end of November 2020, India accounted for the second highest number of confirmed cases after the USA. The Government of India, along with state governments, has taken several measures to control the outbreak of COVID-19, which include four consecutive lockdowns, invoking special provisions of the Epidemic Diseases Act of 1897 and the Disaster Management Act 2005 (Naik et al. 2020), establishing containment zones (Malani et al. 2020), introducing a smartphone-based application, 'Aarogya Setu', to trace and detect COVID-19 cases, and using social, electronic and print media to spread awareness about COVID-19 among the population (Gohel et al. 2020c). Despite these efforts, the number of confirmed cases has increased significantly. There are two main propositions regarding the spread and management of COVID-19. First, a strong and effective public health infrastructure is essential not only to respond to any public health crisis, including epidemics, but also to address ongoing challenges such as preventing and managing communicable and non-communicable diseases (The National Academy of Sciences 2003). Studies (Strand et al. 2020; Sahoo et al. 2020) have tried to link the COVID-19 situation with the level of health infrastructure and explained that uneven responses to COVID-19 are mainly due to inadequate health infrastructure.
Second, health infrastructure is important for managing COVID-19 cases, but the spread of COVID-19 depends on the interplay of other social determinants at the local level, which include population density, availability of basic amenities such as drinking water and sanitation, and the share of elderly population. These social determinants shape how far social distancing can be maintained, which has been identified as the most widely used strategy for prevention and control of COVID-19 in the absence of a vaccine and widespread testing.
In this context, this study attempts to understand what governs COVID-19 numbers in India. In other words, how do we understand the variation in the intensity of COVID-19 cases across regions and states? A strong case has been made for understanding the intensity of COVID-19 cases through analysis and comparison between states (Ghosh et al. 2020a, b; Srivastava et al. 2020). States with better capacity in terms of efficient governance, economic development and health infrastructure were expected to cope with and control the spread of the SARS-CoV-2 virus better. This paper is a modest attempt to move away from the logic that states with better health infrastructure managed the COVID-19 situation better. The paper puts forth two interrelated arguments to explain the spread of COVID-19. First, the available evidence collated by the paper shows that the states with better health infrastructure are not necessarily the best performers when it comes to managing the spread of COVID-19. Second, the spread of COVID-19 and its related consequences can be better explained through a district-level analysis. Following this, the primary argument that the paper develops is that the intensity of COVID-19 spread at the district level is associated with a combination of factors that can be described as social determinants of COVID-19. These include basic amenities such as water and sanitation, population density, and the aged and migrant populations. Put differently, the paper indicates that the sudden shock of COVID-19 translating into a high intensity of cases at the district level is more prevalent among districts with high population density, a lack of basic civic entitlements and a high share of vulnerable populations such as the elderly and migrants. Accordingly, the paper is divided into five sections.
The following section provides the details of the different data sources and methodology used in the study. The third and fourth sections present the state- and district-level analysis, and the fifth section concludes the study.

COVID-19 related data
• The COVID-19 related data were sourced from the COVID-19 India dashboard, which offers an application programming interface (API) providing daily data on confirmed cases, deaths, recovered cases, active cases and testing of COVID-19 in India at the national, state and district levels. The present study has used the data available on this website with a cutoff date of November 1, 2020.

Data on health infrastructure
• The data on the number of government hospitals, hospital beds and doctors have been obtained from the National Health Profile 2019 published by the Ministry of Health and Family Welfare, Government of India.
• The estimated number of private hospitals, private hospital beds, public and private intensive care unit beds and ventilators has been taken from a report of The Centre for Disease Dynamics, Economics and Policy, Washington D.C. and New Delhi (Kapoor et al. 2020).
• The number of testing laboratory facilities (both public and private) up to November 1, 2020, has been obtained from the Indian Council of Medical Research, Government of India.

Estimation of population projections based on census of India
• The projected population of 2020 of different states is obtained from the Report of the Technical Group of Population Projection, Registrar General of India.
• The district-level population, however, is projected up to 2020 using the exponential growth of districts between 2001 and 2011.
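The district-level projection can be sketched as follows. This is a minimal illustration, assuming the annual exponential growth rate implied by the 2001 and 2011 census counts is carried forward unchanged to 2020; the function name and the example figures are hypothetical and not taken from the study.

```python
import math


def project_population(pop_2001: float, pop_2011: float, target_year: int = 2020) -> float:
    """Project a district population from the 2011 census to target_year,
    assuming the 2001-2011 exponential growth rate stays constant."""
    if pop_2001 <= 0 or pop_2011 <= 0:
        raise ValueError("census populations must be positive")
    # Annual exponential growth rate implied by the two census counts.
    rate = math.log(pop_2011 / pop_2001) / (2011 - 2001)
    # Carry the 2011 count forward at that rate.
    return pop_2011 * math.exp(rate * (target_year - 2011))


# Hypothetical district: 1.0 million in 2001 and 1.2 million in 2011.
print(round(project_population(1_000_000, 1_200_000)))
```

A district whose population did not change between the two censuses keeps its 2011 count under this assumption.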

Social determinants
• The district-level population density is calculated using the projected population of districts and area from Census of India 2011, and the district-level internal migration rate is calculated from the D-Series (migration series) of Census of India 2011.
• The other indicators at district level, such as percentage of elderly population, percentage of households with drinking water facility outside premises and percentage of households with no sanitation facility/shared facility, are estimated from the National Family Health Survey of India 2015-2016 (NFHS-IV), which is a cross-sectional sample survey covering 601,599 households and 2.87 million individuals from 29 states and 7 union territories (IIPS and ICF 2017).
The third section of the paper examines the interlinkage between level of health infrastructure and COVID-19 situation at the state level. To achieve this objective, a state-level COVID-19 situation index has been constructed to understand the current situation of COVID-19 in different states. The COVID-19 situation in the states is understood as a combination of positive cases, deaths and recovered population along with testing (contextualized and calculated in relation to positivity rate: the ratio of the number of positive cases to the number of tests administered). Therefore, the COVID-19 situation index is calculated from multiple indicators of the COVID-19 situation between March 25 (date of imposition of the first lockdown) and November 1, 2020. These indicators are (a) growth rate of COVID-19 cases per million population, (b) case fatality rate (percentage of deaths out of total cases as of November 1, 2020), (c) recovery rate (percentage of recovered cases out of total cases as of November 1, 2020), (d) growth rate of active cases per million population and (e) test positivity rate (percentage of positive cases out of total tests conducted). To adjust for the different scale and variance of the indicators across states, the Z-score method is used. This helps to compute scores without any distortion from the mean. The output is dimensionless, and the relative differences are maintained because the transformation is linear (Mei and Grummer-Strawn 2007). For each individual indicator $x_i$, the mean ($\bar{x}_i$) and standard deviation ($\sigma_i$) across the top 20 states, where COVID-19 cases had been reported at the time of the first lockdown, have been calculated. The normalization has been done using the Z-score formula (Eq. 1), so that the Z-score for each indicator ($z_i$) has similar dispersion:
$$ z_i = \frac{x_i - \bar{x}_i}{\sigma_i} \qquad (1) $$
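The Z-score normalization described above can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form of the standard deviation was applied.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one indicator across states: z = (x - mean) / sd."""
    mu = mean(values)
    # Assumption: population standard deviation (the source does not specify).
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("indicator has no variation across states")
    return [(x - mu) / sd for x in values]


# Hypothetical indicator values for three states.
print(z_scores([2.0, 4.0, 6.0]))
```

Because the transformation is linear, the ordering of states and their relative distances on each indicator are preserved, while every indicator ends up with mean 0 and unit dispersion.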
The actual minimum and maximum value of $z_i$ across states depends on the individual indicator (OECD and European Commission 2008). After obtaining the Z-score for each of these indicators, an average has been calculated to construct a composite index named the 'COVID-19 situation index.' The next step was to understand the COVID-19 situation in each state in relation to the availability of health infrastructure. Accordingly, a health infrastructure index has also been constructed using the same method. This index is constituted from multiple indicators: (a) total number of hospitals per million population, (b) hospital beds per million population, (c) intensive care units (ICUs) per million population, (d) COVID-19 testing laboratories per million population, and (e) doctors per million population (note 4). The states were ranked according to the COVID-19 situation index and the health infrastructure index separately (note 5). Thereafter, a relationship matrix was established between the two indices. Drawing on the relationship between these two indices, the states were classified into four groups using quartile classes of the scores achieved by states: better, average, poor and very poor. It was interesting to note that health infrastructure is not the determining factor reflecting the intensity of COVID-19 cases across states. The results were useful, but perplexing. Therefore, in order to understand the intensity of COVID-19 cases, a district-level analysis was undertaken. This analysis yielded some interesting results, indicating that the demographic profile and everyday living conditions were critical to understanding the spread of COVID-19. Section four elaborates on the district-level analysis. The first step was to identify the districts having a concentration of COVID-19 cases in relation to the national average (note 6). The concentration of the infection ratio, i.e., the percentage of confirmed cases in total population, is calculated at the district level using the following formula of location quotient (L.Q.):
$$ \mathrm{L.Q.}_i = \frac{CC_i / P_i}{CC / P} $$
where $CC_i$ = total number of confirmed cases in the ith district (i = 1, 2, ..., n), $P_i$ = total projected population of the ith district, $CC$ = total number of confirmed cases in India, and $P$ = total projected population of India.
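As a small worked illustration of the location quotient defined above, the sketch below divides the district infection ratio by the national infection ratio. The function name and the example numbers are hypothetical and are not taken from the study's data.

```python
def location_quotient(cases_district: float, pop_district: float,
                      cases_national: float, pop_national: float) -> float:
    """L.Q. = (CC_i / P_i) / (CC / P)."""
    district_ratio = cases_district / pop_district    # infection ratio in the district
    national_ratio = cases_national / pop_national    # infection ratio in the country
    return district_ratio / national_ratio


# Hypothetical example: a district with 200 cases per 1,000 people in a
# country with 1,000 cases per 10,000 people has twice the national ratio.
print(location_quotient(200, 1_000, 1_000, 10_000))
```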
Based on the value of the location quotients, districts were classified into five groups signifying the intensity of COVID-19 infection: very low (<=0.50), low (0.50-1.00), moderately high (1.00-2.00), high (2.00-3.00) and very high (>=3.00). An L.Q. value greater than 1 implies a concentration of COVID-19 infections higher than the national average. A spatial map is prepared using ArcGIS to show the distribution of L.Q. across districts. Existing studies point out several indicators of social determinants for higher concentration of cases in certain pockets.
These indicators are (a) population density (Arif and Sengupta 2020; Pathak et al. 2020; Tamrakar et al. 2020), (b) percentage share of elderly population, (c) percentage of households having drinking water facility outside premises, (d) percentage of households without sanitation facility/with shared sanitation facilities and (e) migration rate (Pathak et al. 2020; Tamrakar et al. 2020). These five indicators were tabulated for each district. For each of these indicators of social determinants, the district has been assigned 1 if the value is higher than the national average, and 0 otherwise. The scores obtained by districts on the five indicators are combined into a score that ranges from a minimum of 0 to a maximum of 5. A higher score indicates a worse situation of the district in terms of the selected social determinants as compared to the national average. The next step was to examine the relationship between high L.Q. districts and the combined scores of determining factors. Based on this relationship, the paper shows that districts that score 1 on at least two to three determining factors are those with the highest concentration of COVID-19 cases.
Note 4. The indicators used to construct the health infrastructure index are positive indicators; therefore, states with positive scores are the 'states with better health infrastructure' and states with negative scores are the 'states with low level of health infrastructure'.
Note 5. The rationale for using two different types of indicators, i.e., negative indicators for the COVID-19 situation index and positive indicators for the health infrastructure index, lies in the nature of the indicators. In general, most of the indicators of COVID-19 are negative in nature, i.e., they show the intensity of the spread of COVID-19. The health infrastructure indicators, however, show how well a state is prepared in terms of providing services and facing any pandemic.
Note 6. A combination of factors has been used to show the intensity of COVID-19 cases at the state level, mainly because, except for positive cases, all other indicators are largely connected to the health infrastructure of the states, and the use of these indicators in constructing the index was essential to establish the interlinkage between COVID-19 situation and health infrastructure. However, the use of a single indicator, i.e., infection ratio, is justified at the district level because, among all COVID-19 indicators such as death rate, recovery rate and test positivity rate, it is the only one that is mainly influenced by social determinants of COVID-19.
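The binary scoring of social determinants described above can be sketched as follows. The indicator keys and all numeric values are hypothetical placeholders chosen for illustration; only the rule (1 if the district value is higher than the national average, 0 otherwise, summed over five indicators) comes from the text.

```python
# Hypothetical keys for the five indicators (a) to (e).
INDICATORS = (
    "density",
    "elderly_pct",
    "water_outside_pct",
    "no_sanitation_pct",
    "migration_rate",
)


def determinant_flags(district: dict, national: dict) -> tuple:
    """Return a 0/1 flag per indicator: 1 if the district exceeds the national average."""
    return tuple(1 if district[name] > national[name] else 0 for name in INDICATORS)


def combined_score(district: dict, national: dict) -> int:
    """Sum of the five flags, ranging from 0 to 5."""
    return sum(determinant_flags(district, national))


# Placeholder national averages and district values (not real data).
national = {"density": 400, "elderly_pct": 10, "water_outside_pct": 30,
            "no_sanitation_pct": 40, "migration_rate": 35}
district = {"density": 20000, "elderly_pct": 11, "water_outside_pct": 31,
            "no_sanitation_pct": 20, "migration_rate": 50}
print(determinant_flags(district, national), combined_score(district, national))
```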

State-level situation of COVID-19 as of November 1, 2020
The purpose of this section is to understand the intensity with which the states are affected by the SARS-CoV-2 virus, and its relationship with the level of health infrastructure. To this end, three interrelated statistical exercises were undertaken: (a) COVID-19 situation index, (b) health infrastructure index and (c) a matrix examining the relationship between the COVID-19 situation index and the health infrastructure index.

COVID-19 situation index
As indicated in the preceding section, five indicators (infection rate, active cases, death rate, recovery rate and test positivity rate) were used to construct the COVID-19 situation index. The first four indicators are directly related to human exposure to the SARS-CoV-2 virus itself. The fifth indicator reflects the efforts of the government and individuals to detect symptomatic and asymptomatic cases. A high number of cases may have gone undetected because of a low level of testing; therefore, including the positivity rate alongside the other indicators is useful. Experts and commentators across the globe have used these five indicators to understand the COVID-19 situation at the global and national levels (Sahoo et al. 2020; Srivastava et al. 2020; Hamzah et al. 2020; Dehkordi et al. 2020). In other words, a set of indicators is necessary to comprehensively understand the COVID-19 situation. The detailed estimates of these indicators are available in Appendix Tables 5, 6 and 7. The intensity of the SARS-CoV-2 virus in each state has been understood by combining these indicators into a composite index using the Z-score method (explained earlier). We have termed this the 'COVID-19 situation index.' Note that the paper tabulates the indicators for the top 20 states, which cover 97% of confirmed cases in India. The states have been classified into four groups (very poor, poor, average and better) based on the rank achieved in the scores of the composite index (see Table 1 and Appendix Tables 8 and 9 for details).
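The construction of the composite index and the four-group classification can be sketched as below. This is a hedged illustration rather than the authors' procedure: the function names are hypothetical, the groups are formed as equal-sized rank quartiles, and the sketch simply averages the z-scores as given, because the text does not specify whether any indicator (for example, recovery rate) is reversed before averaging.

```python
LABELS = ("better", "average", "poor", "very poor")  # best to worst


def composite_index(z_by_indicator: dict) -> dict:
    """Average the z-scores of all indicators for each state.

    z_by_indicator maps indicator name -> {state: z-score}.
    """
    states = list(next(iter(z_by_indicator.values())).keys())
    k = len(z_by_indicator)
    return {s: sum(z[s] for z in z_by_indicator.values()) / k for s in states}


def classify_by_quartile(scores: dict, higher_is_worse: bool = True) -> dict:
    """Rank states from best to worst and split the ranking into four groups."""
    # Best state first: lowest score when higher is worse, highest otherwise.
    ordered = sorted(scores, key=scores.get, reverse=not higher_is_worse)
    n = len(ordered)
    return {state: LABELS[i * 4 // n] for i, state in enumerate(ordered)}


# Hypothetical composite scores for four states.
scores = {"A": -1.0, "B": -0.2, "C": 0.3, "D": 1.1}
print(classify_by_quartile(scores, higher_is_worse=True))
```

Under these assumptions the same helper could serve an index built from positive indicators by passing `higher_is_worse=False`.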
Why do states perform differently? Popular explanations (Sadanandan 2020; Seethalakshmi and Nandan 2020) point out that the states with better health infrastructure were able to manage the spread and consequences of COVID-19 relatively more effectively. The following sub-sections examine this proposition (Tables 2, 3).

State level of health infrastructure
Health expenditure in India is the lowest among emerging economies (Jakovljevic et al. 2017). The budget estimates of 2019-2020 show that expenditure on health in India is only 1.6% of gross domestic product (GDP), while the available actual estimates for 2017-2018 show that it was only 1.4% of GDP (Economic Survey 2019-2020). This low investment in health is reflected in the available health infrastructure. There are only 51 hospitals and 1405 beds per million population in India. The numbers of ICU beds and ventilators per million population are 70 and 35, respectively. Despite the increasing number of testing laboratories in different states, India had only 2 testing laboratories per million population as of November 1, 2020. There are 854 doctors per million population in India. It is evident from these estimates that the current status of health infrastructure and medical human resources in India is inadequate to address any pandemic, including COVID-19. The state-wise estimates show large variation in the level of health infrastructure and medical human resources (see Appendix Table 10 for details).

Health infrastructure index
In order to understand the variations in the level of health infrastructure and medical human resources, a health infrastructure index is constructed using the following indicators: (a) total number of hospitals per million population, (b) hospital beds per million population, (c) intensive care units (ICUs) per million population, (d) COVID-19 testing laboratories per million population and (e) doctors per million population. The composite scores for the selected states have been derived using the Z-score method (see Appendix Table 11 for details), and states have been classified into four groups (very poor, poor, average and better) based on the ranking of states' scores in the composite index (see Appendix Table 12 for details).

Ranking matrix between COVID-19 situation index and health infrastructure index
A matrix has been prepared using the states' positions in the COVID-19 situation index and the health infrastructure index to show the interrelationship between the COVID-19 situation and state-level health infrastructure. The states are placed in the appropriate groups (very poor, poor, average and better) as per their scores in both indices. An examination of the relationship between the COVID-19 situation index and the health infrastructure index reveals two clusters of states. The first cluster comprises two groups of states: (a) states with average to better health infrastructure having poor to very poor ranking in the COVID-19 situation index and (b) states with poor to very poor health infrastructure having average to better ranking in the COVID-19 situation index. The states in these two groups indicate that there is no robust relationship between level of health infrastructure and COVID-19 situation. Together these states account for 59% of confirmed cases and 64% of active cases in India, which supports the proposition that the level of health infrastructure is not the sole factor behind the COVID-19 situation at the state level.
In contrast, the second cluster comprises the other two groups of states: (a) states with average to better health infrastructure having average to better ranking in the COVID-19 situation index and (b) states with poor to very poor health infrastructure having poor to very poor ranking in the COVID-19 situation index. The states in these two groups show a significant relationship between level of health infrastructure and COVID-19 situation, but together account for only 38% of confirmed cases and 32% of active cases, which is less than the first cluster.
On the basis of this relationship, it can be concluded that the level of health infrastructure has a limited role in controlling the outbreak of COVID-19 at the state level and that there is no robust relationship between the two. This is not to argue that health infrastructure has no role in managing the spread of COVID-19. Instead, the analysis points to the serious limitations of aggregating and interrelating health infrastructure and the COVID-19 situation to explain the spread and persistence of COVID-19. Therefore, to pursue this line of argument further, the following section explores the district-level concentration of COVID-19 cases and the determining factors.
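The two clusters in the ranking matrix can be illustrated with a short sketch that labels each state as concordant (both indices on the same side) or discordant (the indices disagree) and sums the share of confirmed cases in each cluster. The function name, the state labels and the case counts are hypothetical, and shares here are computed relative to the states passed in rather than to the all-India total used in the text.

```python
GOOD = {"better", "average"}  # the remaining groups are "poor" and "very poor"


def cluster_shares(covid_class: dict, health_class: dict, cases: dict) -> dict:
    """Percentage of cases in states where the two indices agree or disagree."""
    total = sum(cases.values())
    shares = {"discordant": 0.0, "concordant": 0.0}
    for state, n in cases.items():
        # Concordant: both rankings are good, or both rankings are poor.
        same_side = (covid_class[state] in GOOD) == (health_class[state] in GOOD)
        shares["concordant" if same_side else "discordant"] += 100.0 * n / total
    return shares


# Hypothetical classifications and case counts for four states.
covid_class = {"A": "poor", "B": "better", "C": "very poor", "D": "average"}
health_class = {"A": "better", "B": "average", "C": "poor", "D": "very poor"}
cases = {"A": 40, "B": 30, "C": 20, "D": 10}
print(cluster_shares(covid_class, health_class, cases))
```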

COVID-19 hot spot districts and social determinants of COVID-19 among districts with high infections
In the previous section, a combination of indicators has been used to construct the COVID-19 situation index. In that exercise, all five indicators were important for examining the interlinkages between COVID-19 and the level of health infrastructure at the state level because, except for confirmed cases, all other indicators such as deaths, recovered cases and test positivity rate are direct outcomes of the public health system. However, the number of confirmed cases in any region indicates the pace of spread of COVID-19, and is largely interlinked with non-medical factors such as social determinants of health. Therefore, in this section, we have used the infection ratio, i.e., total confirmed cases in total population, to examine the concentration of COVID-19 cases and its relationship with social determinants at the district level.
As explained earlier, the location quotient is a suitable measure of the concentration of COVID-19 at the district level. It is a widely used statistical method for identifying the concentration of any item in a particular area or region with respect to the national average. A location quotient above 1 (L.Q. > 1) indicates a concentration of COVID-19 infections higher than the national average. The districts are divided into five clusters based on the value of L.Q.: very high (L.Q. >= 3), high (L.Q. 2-3), moderately high (L.Q. 1-2), low (L.Q. 0.50-1) and very low (L.Q. <= 0.50).
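The five-way classification can be sketched as a simple threshold function. The stated ranges share their end points, so the handling of the boundary values 1 and 2 below is an assumption made for illustration; only the end points 0.50 (very low) and 3 (very high) are fixed by the text.

```python
def classify_lq(lq: float) -> str:
    """Map a location quotient to one of the five concentration clusters."""
    if lq <= 0.50:
        return "very low"
    if lq <= 1.0:   # assumption: exactly 1 is not above the national average
        return "low"
    if lq <= 2.0:   # assumption: exactly 2 stays in the lower class
        return "moderately high"
    if lq < 3.0:
        return "high"
    return "very high"


# L.Q. values reported in the text for Mumbai, Khordha and Virudhunagar.
for name, lq in [("Mumbai", 14.7), ("Khordha", 3.02), ("Virudhunagar", 1.19)]:
    print(name, classify_lq(lq))
```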
The first cluster comprises the districts with very high COVID-19 concentration. The values of the location quotients across districts show that Mumbai had the highest concentration of COVID-19 infection (L.Q. 14.7), followed by Bijapur (12.8) and Chennai (6.5). There were 20 districts with 'very high' concentration of COVID-19 infection (L.Q. >= 3). Majority of these districts are in Maharashtra (4), Andhra Pradesh (3) and Puducherry (3). These districts include a few big metropolitan cities of India such as Mumbai, Chennai, Bengaluru, Gurugram, Nagpur and Kolkata. If all the districts of the National Capital Territory of Delhi (NCT of Delhi) are combined, it also falls under this category, and the number of districts with 'very high' concentration of COVID-19 infection then increases to 21. The districts in this category account for 27% of confirmed cases in India (Fig. 1).
The next cluster comprises the districts with high concentration of COVID-19 infection. There were 31 districts with 'high' concentration of COVID-19 infection (L.Q. 2-3) as of November 1. These districts are mainly located in Andhra Pradesh (8), Kerala (8), Karnataka (5) and Maharashtra (4). Udupi, Thiruvananthapuram, Faridabad, Hyderabad, Srinagar, Thane, Coimbatore, Satara, Kozhikode, Mysuru, Nashik and Thrissur are some of the districts in this category. The districts with high concentration of COVID-19 infection account for 18.24% of confirmed cases in India. Together, the 'very high' and 'high' concentration districts account for 45.21% of the total confirmed cases in India. The next cluster is the moderately high concentration districts (L.Q. 1-2). It comprises 107 districts with 20.46% of the total confirmed cases in India (Fig. 1).
The fourth and fifth clusters consist of 165 and 317 districts with low (L.Q. 0.50-1) and very low (L.Q. <= 0.50) concentration of COVID-19 cases, respectively. Together these districts account for 34.33% of the total confirmed cases in India as of November 1, 2020. The majority of the districts (415) in these two clusters had fewer than 10,000 confirmed cases (Fig. 1).
The results show that the three clusters of districts with very high, high and moderately high concentration of COVID-19 infections account for 66% of confirmed cases in India (see Appendix Table 13). Out of 159 districts in these three clusters, more than 100 are located in only seven states: Karnataka, Maharashtra, Kerala, Andhra Pradesh, Tamil Nadu, Odisha and Jammu and Kashmir. We have tried to find the possible reasons why certain districts have a higher concentration of COVID-19 cases than others through an in-depth analysis of districts in these three clusters. Experts (Arif and Sengupta 2020; Pathak et al. 2020; Tamrakar et al. 2020) point to five possible social determinants that could explain this concentration: (a) population density (V1), (b) percentage share of elderly population (V2), (c) percentage of households having drinking water facility outside premises (V3), (d) percentage of households without sanitation facility or with shared sanitation facilities (V4) and (e) migration rate (V5). Together these five indicators can be termed 'social determinants of COVID-19.' As pointed out in the methodology section, we have extended the scope of social determinants to include demographic factors such as elderly and migrant population along with basic amenities and population density. All five indicators are examined for each district in these clusters. If a district has a value higher than the national average on a specific indicator, it scores '1,' otherwise '0'. This exercise is done for all five indicators separately. Thereafter, the values of the indicators for each district are added to arrive at a combined score. For instance, the scores of Mumbai on the five indicators (V1 to V5) are 1, 1, 1, 0 and 1, respectively, and therefore the combined score of Mumbai is 4 (see Appendix Table 13). A higher combined score indicates a worse situation of the district in terms of the selected social determinants.
The districts with highest concentration of COVID-19 infection have already been classified in three groups-very high (L.Q. >= 3), high (L.Q. 2-3) and moderate concentration (L.Q. 1-2) of COVID-19 infections in Fig. 1. In the next step, we have tabulated the number of districts with different combined scores in each of these three categories in Table 4.
The table offers a few interesting takeaways. It is evident from Table 4 that districts where 3 social determinant indicators are above the national average are the most numerous (54), followed by districts where 2 indicators are above the national average (53). There are 29 districts with 4 indicators above the national average. Together these three types of districts account for 60% of confirmed cases of COVID-19 in India. Khordha (Odisha) (L.Q. 3.02) and Virudhunagar (Tamil Nadu) (L.Q. 1.19) have all five social determinant indicators above the national average. Some of the important districts with 4 indicators above the national average are Mumbai, Pune, Delhi, Nagpur, Nashik and Chandigarh. Bangalore Rural, Kolkata, Faridabad, Thane, Kozhikode, Mysore, Visakhapatnam and Cuttack are some of the districts with 3 indicators above the national average. A few important districts with only 2 indicators above the national average are Bangalore, Gurugram, Hyderabad, Coimbatore, Lucknow and Ahmedabad (see Appendix Table 13). These figures indicate that in a significant number of districts with higher concentration of COVID-19 infections, at least 2-3 of the five selected social determinants play an important role in the spread of COVID-19. If we examine each of the five selected social determinants separately, three indicators (drinking water facility outside premises, elderly population and migration) are more significant in explaining the higher concentration of COVID-19 infections: more than 50% of districts with higher COVID-19 infections are above the national average on each of these three indicators, which suggests that these are probable factors behind the high outbreak of COVID-19 infections in these districts.
Population density and lack of sanitation facility are the other two social determinants associated with higher COVID-19 infections, though less prominently: among districts with higher concentration of COVID-19, 48% and 30% are above the national average on these two indicators, respectively (see Appendix Table 13). The results of the study indicate that the interplay of social determinants at the local level is the explanatory factor for the higher concentration of COVID-19 in certain pockets of India.
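The two tabulations used in this section, the number of districts by combined score and the percentage of districts above the national average on each indicator, can be sketched as follows. The function name and the districts "X" and "Y" are hypothetical; only the Mumbai flags (1, 1, 1, 0, 1) are taken from the text.

```python
from collections import Counter


def tally(flags_by_district: dict):
    """Return (districts per combined score, % of districts flagged on V1..V5).

    flags_by_district maps district -> tuple of five 0/1 flags (V1 to V5).
    """
    score_counts = Counter(sum(flags) for flags in flags_by_district.values())
    n = len(flags_by_district)
    pct_above = [
        100.0 * sum(flags[i] for flags in flags_by_district.values()) / n
        for i in range(5)
    ]
    return score_counts, pct_above


flags = {
    "Mumbai": (1, 1, 1, 0, 1),   # flags reported in the text
    "X": (0, 1, 0, 0, 1),        # hypothetical district
    "Y": (1, 1, 1, 1, 1),        # hypothetical district
}
counts, pct = tally(flags)
print(dict(counts), [round(p, 1) for p in pct])
```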

Conclusion
The main objective of this paper is to examine two propositions: (a) states with a better level of health infrastructure have better chances of managing COVID-19, and (b) social determinants play an important role in the spread and control of COVID-19. The results show that there is no robust relationship between level of health infrastructure and management of COVID-19 at the state level, as states with better health infrastructure are also struggling to combat COVID-19. Better health infrastructure can manage COVID-19 cases and prevent deaths, but it cannot stop the outbreak of infection. The outbreak depends on other social determinants such as population density and availability of basic amenities such as drinking water and sanitation in households, along with demographic factors such as migrant population, elderly population and people with comorbidities. The results from the present study show that districts with higher concentration of COVID-19 infection include some of the big metropolitan cities and other major urban centers of India such as Mumbai, Chennai, Delhi, Bengaluru, Gurugram, Nagpur, Kolkata, Hyderabad, Thane, Thiruvananthapuram and Faridabad. The majority of districts with higher concentration of COVID-19 infection are above the national average on the selected social and demographic determinants such as population density, lack of access to basic amenities, migration and elderly population. The argument is straightforward. SARS-CoV-2 thrives among clusters of human beings living in dense localities with minimal basic amenities, a situation that is further aggravated during health emergencies like COVID-19 by inadequate public health infrastructure. COVID-19 has exposed the vulnerability of our cities and provided us a chance to 'revisit' the policies and programmes at the local level to make our cities sustainable, healthy and resilient.

Source: Author's calculation based on https://api.covid19india.org/documentation/csv/. Accessed on 2nd November 2020

Per million population:
State            Hospitals   Hospital beds   ICU beds   Ventilators   Testing laboratories   Doctors
... Pradesh             12             775         39            19                      1       456
Maharashtra             26            1874         94            47                      2      1402
Odisha                  57             584         29            15                      1       513
Punjab                  77            2021        101            51                      2      1602
Rajasthan               72            1185         59            30                      1       552
Tamil Nadu              32            2040        102            51                      3      1758
Telangana              109            2662        133            67                      2       132
Uttar Pradesh           75            1229         61            31                      1       339
West Bengal             23            1162         58            29                      1       737
Mean                    50            1441         72            36                      2       885
SD                      40             928         46            23                      1       615

Source: Author's calculation based on https://api.covid19india.org/documentation/csv/, accessed on 2nd November 2020; Census, 2001 and 2011; and NFHS-IV (2015-16)

Funding
No funding was received for this research.

Conflicts of interest
The authors have no conflicts of interest to declare.

Ethical approval
The study is based on secondary datasets available in the public domain for research use, and therefore, no approval is required from institutional review board. The map used in this study is representative and does not include the actual administrative boundary of India drawn by Survey of India.