Financial inclusionFootnote 1 is considered a crucial element in fostering the economic growth and development of a country, by making credit, savings, payment, and insurance options more easily available to a large section of people (Chibba 2009). This idea has motivated a large amount of research as well as policy initiatives across different countries in the world (see World Bank 2014 for a review). The Indian government has, over the last 50 years, undertaken various such policy measures to improve financial inclusion in the country. Between 1969 and 1990, the government carried out a large-scale social banking programme to provide access to formal credit to the rural poor. More recently, in August 2014, the government launched another ambitious financial inclusion programme. The programme, called Pradhan Mantri Jan Dhan Yojna, or PMJDY, aims to open one bank account for every household in the country.Footnote 2 The government has started providing social security benefits to its citizens through these accounts, thereby creating an incentive to open the accounts. This has resulted in a sudden jump in the number of bank accounts held in the country. But how do such financial inclusion measures affect the welfare of households? We investigate this question in the present article.

Even though there are some studies that have looked into the link between financial inclusion and different dimensions of economic development in the recent past (such as access to banking, agricultural credit, saving behaviour, inequality etc. see Singh 2017 for a review of the literature), the evidence is rather scarce, especially in the context of developing countries like India. Further, to the best of our knowledge, there is no systematic study so far analysing this relationship in the context of the recent nation-wide financial inclusion programme (PMJDY).

One plausible mechanism through which financial inclusion may impact household welfare is easier access to savings and credit facilities (Bharadwaj and Suri 2020), which can alter production and employment choices (Aghion and Bolton 1997; Banerjee and Newman 1993). Better production and employment choices in turn lead to higher income and, thus, change the pattern of consumption expenditure,Footnote 3 including a possible widening of the consumption basket (Chai et al. 2015; Chakrabarty and Mandi 2019; Falkinger and Zweimüller 1996; Theil and Finke 1983). We examine this link in this article by investigating whether inclusion in the formal financial system through the opening of bank accounts can increase consumption diversification through an increase in income, proxied by monthly per capita expenditure (MPCE). As evidenced in the literature, the change in savings and borrowing behaviour due to financial inclusion not only creates better employment opportunities leading to increased income, but also changes the pattern of consumption expenditure through increased dietary diversity (Annim and Frempong 2018), consumption smoothing etc. (Lai et al. 2020). We do not, however, explicitly explore these channels in the present paper because data on borrowing, consumption smoothing etc. are not available.

We use diversification in consumption expenditure as a measure of welfare in this study. Increase in diversity of consumption expenditure, and not only growth in consumption expenditure, is widely accepted as an important determinant of economic welfare (Barro and Sala-i-Martin 2004; Grossman and Helpman 1991; Romer 1990). The poorer households, who cannot fulfil a threshold level of consumption for the inelastic basic goods, are not able to allocate expenditure on non-essential elastic goods. With increase in level of income, expenditure allocation on such goods increases, thereby widening the consumption basket. Hence, an increase in variety in the consumption basket is considered as welfare enhancing (Chai et al. 2015; Clements et al. 2006; Jackson 1984; Prais 1952). To estimate consumption diversification, we use the entropy-based measure of expenditure share proposed by Theil (1967) and Theil and Finke (1983). At low levels of income, the consumption basket tends to be homogeneous in nature. But as income rises, the basket becomes more heterogeneous, consisting of different types of goods. In this sense, our entropy-based measure will increase, indicating more diversity (see Eq. 1 in Sect. 4.1). Our first set of hypotheses test this impact of income enhanced through the opportunities of financial inclusion programmes, on consumption diversification among households.

We initially test the three following hypotheses on consumption diversification as a result of financial inclusion. First, if financial inclusion programmes are welfare enhancing, then it should decrease expenditure share on food items. At low income levels, share of expenditure on food items is higher, following Engel’s law. Banerjee and Duflo (2007) suggested this share to be around 50–70 percent of household budget for the poor. But, as income grows, this share comes down substantially, to around 30 percent (Clements and Chen 1996). Along with this, the diversification in consumption within the food items is expected to increase. Our first hypothesis tests this proposition. Further, the welfare enhancing effect of financial inclusion should lead to diversification in expenditure on non-food items as well due to increased income. The second hypothesis investigates this. Finally, the third hypothesis assesses whether increasing income achieved through access to more financial resources leads to a shift in expenditure from food basket to non-food consumption basket. We carry out additional tests to examine the independent impact of financial inclusion on diversification of consumption due to the presence of other plausible transmission channels as discussed above. However, we do not explicitly identify these specific channels.

We employ a panel data method to carry out our analysis. The panel of households comes from a survey conducted by the Centre for Monitoring Indian Economy (CMIE). The survey collects data on consumption expenditure, demographic information, caste, religion, asset holding, region of residence etc. along with the information on bank account opening status for each individual within every household, three times a year (defined as waves). We construct a strongly balanced panel of 49,739 households for each time period, from this dataset using data for first nine waves of the survey, covering the period of January 2014 to December 2016. In order to capture the effect of the financial inclusion programme, PMJDY, we construct two time-dummy variables. The first dummy variable investigates the immediate impact of PMJDY and the second dummy variable helps us to account for the effect one year after the incubation of the programme. To examine the transmission path from financial inclusion to consumption diversification through increase in income, proxied through monthly per capita expenditure (MPCE), we employ a two-stage regression approach. In the first stage, we regress monthly per capita consumption expenditure on the financial inclusion dummies, as well as a set of control variables. Then, in the second stage, we regress the Theil’s entropy-based index on the predicted values of the per capita expenditure obtained from the first stage. Even though we do not explicitly explore additional channels in this paper, we run a separate set of second-stage regressions including the financial inclusion dummy variables, to account for the separate effect of financial inclusion.

Our empirical analysis shows that there is indeed an increase in diversity due to increase in total expenditure after introduction of PMJDY scheme. However, the separate effect of PMJDY on diversification of food expenditure is ambiguous. Immediately after the initiation of the PMJDY programme, there was a drop in diversification. But, the effect of financial inclusion is positive after a year. This increase may be anomalous, possibly due to a sudden jump in the index immediately after one year. A closer inspection reveals a drop in the longer run as well. This observation is possibly due to lack of information on detailed items and quality of items within the food group. Regarding heterogeneity in non-food expenditure, and shift in expenditure from food to non-food items, our results are as expected. Both measures show an increase in diversity in the shorter run as well as in the longer run after the initiation of the programme. Our examination of plausible mechanism shows that the rise in diversity is indeed through an increase in total consumption expenditure as evidenced through the significant positive coefficients of predicted values of the per capita expenditure. This signifies an improvement in household welfare. However, we also notice an independent positive effect of financial inclusion on consumption diversification in addition to the above-mentioned channel. We have discussed some of the plausible pathways in the previous paragraphs. Apart from that, this effect could also arise due to measurement error in household per capita consumption expenditure (MPCE) since it does not include certain components of consumption expenditure such as housing, car etc. It may also be because of failure to capture adequately the economies of scale in MPCE due to the presence of heterogeneous demographic compositions within households.Footnote 4

This research contributes to the literature in several ways. First, we employ an entropy-based diversity as a measure of household welfare. The concept of diversity has been widely used to measure various dimensions of concentration in an industry/market, using primarily the Hirschman-Herfindahl index. The application of diversity measures, however, is quite scarce in the existing literature on the economics of household welfare. Recently, Chakrabarty and Mandi (2019) have used Theil’s entropy-based diversification measure on cross-sectional household survey data to study the determinants of consumption diversification. This study is possibly the closest to our approach. However, this study does not explore the role of a social policy, such as the financial inclusion programme we focus on, on diversification. Moreover, we use a country-wide longitudinal data for our analysis.

Second, our study contributes to the studies on the role of financial inclusion programmes on economic development. The determinants and impact of financial inclusion, especially that through opening of bank accounts, has been well studied in the context of the industrialized countries. Lusardi and Mitchell (2014) provide a review of such studies conducted in 12 developed countries. There are some recent studies focussing on the issues of financial inclusion in developing country settings as well. Chin et al. (2011) studied the effect of opening bank accounts among the Mexican immigrants in the USA. They found an increase in savings share out of income. Dupas et al. (2018) investigated the effect of increased access to basic bank accounts to unbanked rural households in three countries across two continents: Uganda, Malawi and Chile. They do not find discernible welfare effects due to this increased access. One possible reason of this finding may be that overwhelming majority of the poor did not make much use of the bank accounts. Further, in a study covering eight African countries, De Koker and Jentzsch (2013) find that access to formal financial services through financial inclusion programmes may not substitute usage of informal financial services. They cite existence of informal employment (and therefore payments in cash) in the economies as a probable reason.

In the context of India, Burgess and Pande (2005) found that increasing access to banking can be instrumental in reducing rural poverty through higher deposit mobilization and credit disbursement. In a relatively recent study, Young (2015) has reinforced the beneficial impacts of access to banking services through robust empirical analysis. On the other hand, Kochar (2011) finds that expansion in access to banking may actually increase consumption inequality, through unequal access to credit between the rural rich and the poor. All these studies used the rural bank branch expansion initiative in India, undertaken between 1969 and 1990, as the context. In contrast, our study uses the recent financial inclusion programme, named Pradhan Mantri Jan Dhan Yojna or PMJDY, initiated at the national level, as the backdrop.

Third, our research strategy also distinguishes itself from most other existing studies on financial inclusion. Many of the studies on this topic are limited to a smaller geographical area, and therefore lack generalizability in a larger context. The handful of studies using country-wide data also confine themselves to cross-sectional analysis (Badarinza et al. 2016). Even though the study by Burgess and Pande (2005) uses panel data for its analysis, the data are aggregated at state level. In our study, on the other hand, we use a nationally representative sample of roughly 150,000 households to pursue our research objective. These households were surveyed once every four months, allowing us to construct a panel of the sample households. This dataset and our research approach potentially allow us to arrive at more precise estimates of the effects of interest.

Fourth, apart from the standard fixed effect model, we also use the Hausman-Taylor estimation method to estimate the coefficients in order to address endogeneity issues of some of our variables (Hausman and Taylor 1981). One disadvantage of the fixed effect estimator is that it cannot provide the effects of the time-invariant covariates in the model. However, some of those variables may be crucial in understanding the variations in consumption diversity, and may be informative in making policy decisions. Hausman-Taylor estimation method enables us to recapture the effects of those factors in our model (See for example Poprawe 2015; Quayes 2015). Finally, most of the studies on financial inclusion have confined themselves to studying the effect on the rural poor population only. We conduct our analysis both on the rural and the urban sample. This extension can potentially increase generalizability of our results.

The rest of the article is organized as follows. Section two gives a brief overview of the financial inclusion programmes in India, focussing on the PMJDY scheme. Section three discusses the dataset used for the empirical analysis of the paper. The empirical strategy adopted is discussed in Section four, while the results are presented in Section five. Section six concludes.

Financial Inclusion Programmes in India

In this section we provide a brief overview of the financial inclusion programmes in India, focussing on the recent PMJDY programme, since this programme is the main context of the paper. Attempts to widen financial inclusion in India can be traced back to 1969, when the central bank mandated the commercial banks of the country to open branches in rural unbanked locations. More recently, in 2006, the central bank implemented financial extension services, under which Non-Governmental Organizations, Micro Finance Institutions and other Civil Society Organizations could be employed as intermediaries to increase outreach of the banking sector. Simultaneously, commercial banks were encouraged to provide zero minimum balance accounts access to the economically weaker section.

The Pradhan Mantri Jan Dhan Yojna, or PMJDY, programme is an extension of these initiatives. The programme was announced by the Prime Minister of India on Independence Day (15th August) in 2014, and was formally launched on 28th August of the same year. The primary aim of the programme is to open at least one bank account for every household. Towards this goal, the scheme allows opening of zero balance bank accounts in public sector, private sector or regional rural banks. Additionally, the PMJDY accounts are equipped with an overdraft facility of up to Rs. 10,000, life insurance cover of Rs. 30,000, and a debit card with in-built accidental insurance coverage of Rs. 100,000. The programme also aims at providing Direct Benefit Transfers (DBTs) through these accounts instead of handing over cash to the beneficiaries. These additional benefits, together with large-scale awareness building campaigns, have resulted in a substantial increase in the number of bank accounts opened: approximately 180 million new bank accounts were opened within the first year after launch of the programme, and close to 325 million new accounts had been opened by the end of August 2018 (i.e. four years after launch of the programme). Since August 2018, the scope of the programme has been extended to opening one bank account for every adult in the country. By August 2019, i.e. five years after launch, the number of accounts opened had reached over 367 million. Table 1 shows the annual progress of the scheme between September 2014 and August 2019.

Table 1 Number of beneficiaries of PMJDY Scheme, from September 2014 to August 2019


The empirical analysis in this paper primarily relies on the Consumer Pyramids Survey conducted by the Centre for Monitoring Indian Economy (CMIE) since January 2014. This is a household-level longitudinal survey, covering roughly 150,000 households spread across all states and union territories in India. Around two thirds of the sampled households are from urban areas. The survey captures information on household demographics, composition of income and expenses, details on assets and liabilities, employment details etc. The data for each household are collected three times a year, in rounds called waves. Waves start in January, May and September of every year. The first wave of the survey was conducted between January and April 2014. For our study, we consider household-level data for the first nine waves of the survey, covering the period of January 2014 to December 2016. We do not consider data from subsequent waves in order to avoid the confounding effects of other large exogenous macroeconomic shocks such as demonetisation (announced on 8th November 2016) and the Goods and Services Tax (GST) (implemented from July 2017).

The survey data capture information on bank account opening status for each individual within every household, separately for each wave. We exploit this information, combined with the timing of PMJDY implementation,Footnote 5 to analyse its effects on consumption patterns. The PMJDY scheme was formally launched on 28th August 2014. By that time, two waves of the survey had been completed. Given the exogenous nature of the scheme, this gives us the opportunity to compare outcomes before and after the scheme was introduced. As expected, the data show a positive discontinuous shift in the number of bank accounts opened right after the introduction of the scheme, as shown in Table 2 and Fig. 1.Footnote 6 Table 2 presents the mean of the proportion of household members within a household holding a bank account. The data are shown for each of the 9 waves, separately for rural and urban regions. The difference between waves 2 and 3 (i.e. immediately before and after the launch of the PMJDY scheme) is approximately 6 percentage points, which is the highest between any two consecutive waves we consider. The difference in mean values is similar between rural and urban regions. Two-sample t-tests show that this difference is highly statistically significant. Therefore, it is reasonable to assume that introduction of the PMJDY scheme led to an increase in opening of bank accounts in India.

Table 2 Mean of the proportion of household members holding a bank account, in each wave
Fig. 1 Trend of the proportion of household members holding a bank account

We proceed as follows to arrive at our final sample of households. First, we keep data for the 20 major states, which account for 94% of the whole sample, and exclude the union territories and the north eastern hilly regions of the country from the sample.Footnote 7 Second, we drop all records with missing or invalid values for the variables included in our empirical models (for example, we do not include records if age is specified as -99, or caste is specified as ‘not stated’). Third, we keep records only for those households that appeared in every wave of the survey. The final pooled sample size, after carrying out these cleaning operations, is 447,651. This implies that we have a balanced panel of 49,739 households for each of the nine waves we consider. Further, there is a systematic difference in consumption patterns between households in rural and urban regions in India (Krishnaswamy 2012). Table 3 shows that this difference is present in our sample of households as well. Since a consumption-based measure is our main impact variable, we carry out our analysis separately for urban and rural regions. The number of households in the rural region for each wave in our sample is 16,218 and that for the urban region is 33,521.Footnote 8 Table 12 in the Appendix shows the list of states and the number of households included from each state, separately for rural and urban regions.
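The third cleaning step and the reported sample sizes can be illustrated with a minimal sketch. The record layout (dictionaries with the keys `hh_id` and `wave`) and the toy households are assumptions made purely for illustration; they are not the survey's actual variable names.

```python
from collections import defaultdict

N_WAVES = 9  # the study uses the first nine survey waves


def balanced_panel(records, n_waves=N_WAVES):
    """Keep only records of households observed in every wave 1..n_waves.

    `records` is a list of dicts with the hypothetical keys 'hh_id' and 'wave'.
    """
    waves_seen = defaultdict(set)
    for rec in records:
        waves_seen[rec["hh_id"]].add(rec["wave"])
    required = set(range(1, n_waves + 1))
    complete = {hh for hh, waves in waves_seen.items() if waves >= required}
    return [
        rec for rec in records
        if rec["hh_id"] in complete and rec["wave"] in required
    ]


# Toy data: household "A" appears in all nine waves, "B" misses wave 4.
toy = [{"hh_id": "A", "wave": w} for w in range(1, 10)]
toy += [{"hh_id": "B", "wave": w} for w in range(1, 10) if w != 4]
panel = balanced_panel(toy)  # only the nine records of household "A" remain

# Arithmetic check of the sample sizes reported in the text.
assert 49_739 * 9 == 447_651      # households x waves = pooled sample size
assert 16_218 + 33_521 == 49_739  # rural + urban households per wave
```

The two closing assertions only confirm that the reported pooled sample size and the rural and urban household counts are mutually consistent.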

Table 3 Summary statistics

Table 3 shows the descriptive statistics for the variables used in our analysis, separately for the rural and urban regions. The variables include monthly per capita food and non-food expenditure,Footnote 9 per capita total expenditure,Footnote 10 caste category, religion, number of children and an asset ownership index for the households.Footnote 11

We use monthly per capita expenditure as a proxy for the income of the households. Similarly, we use the asset ownership index as a proxy for the households’ wealth. This variable is built in the following way. The survey captures asset ownership for each household in several categories. We select the 19 categories that are relevant to our study (the list of all asset variables used in our study is presented in the appendix, in Table 14). Among these 19 variables, 16 are indicator variables, taking the value ‘1’ if the household owns the asset and ‘0’ otherwise. The remaining three variables are counts (number of houses owned, number of tractors owned, and number of cattle owned). We convert these into binary indicator variables, similar to the other indicator variables, by assigning the value ‘1’ whenever the count is greater than zero, and ‘0’ otherwise. Next, we take a simple average of the values of all 19 items to arrive at the asset index. A lower value of the index signifies lower asset holding, and vice versa.
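The construction of the asset index described above can be sketched as follows. The asset names and the example household are hypothetical; only the logic (16 indicators, three counts converted to indicators, simple average over 19 items) follows the text.

```python
def asset_index(indicators, counts):
    """Simple average of binary asset-ownership items.

    indicators: dict of item name -> 0/1 (16 items in the study)
    counts:     dict of item name -> number owned (houses, tractors, cattle)
    """
    # Convert each count into an ownership indicator: 1 if any are owned.
    binary_from_counts = [1 if value > 0 else 0 for value in counts.values()]
    items = list(indicators.values()) + binary_from_counts
    return sum(items) / len(items)


# Hypothetical household: owns 4 of the 16 indicator assets,
# one house, no tractor and three cattle.
indicators = {f"asset_{k}": (1 if k < 4 else 0) for k in range(16)}
counts = {"houses": 1, "tractors": 0, "cattle": 3}
index = asset_index(indicators, counts)  # (4 + 2) / 19
```

By construction the index lies between 0 (none of the 19 assets owned) and 1 (all of them owned).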

The dataset captures information on castes in eight categories.Footnote 12 We reorganize these data into the following four categories: Upper Caste, Intermediate Caste,Footnote 13 OBC and SC & ST. We do not include the households that did not reveal their caste. Similarly, we rearrange the religion reported by the households into three categories: Hindu, Muslim and others (consisting of Buddhist, Christian, Jain and Sikh religion categories). We assume that caste and religion are exogenous as well as time-invariant in nature. Finally, we include a variable that captures the number of children within a family for each period. We define a family member as a child if his or her age is below 16.

In Table 3, Panel A shows the descriptive statistics for the rural sample, whereas Panel B shows the same for the urban sample. The differences in mean of the variables between rural and urban regions are on the expected lines. Per capita food expenditure, non-food expenditure and per capita total expenditure for the urban households is higher than the rural households.Footnote 14 On the other hand, urban households report fewer number of children compared to rural ones on average. Share of Hindu population is higher in the rural areas at 89% vs. 85% in urban areas. This is almost entirely covered by the larger share in Muslim population from 6% in rural areas to 9% in urban areas. Distribution of the households by caste composition shows that the proportion of upper caste households residing in urban areas is double that of rural areas. 30% of the sampled households in urban areas belong to upper caste, compared to 15% in rural areas. This distribution is reversed for the SC and ST communities (24% in urban areas vs. 36% in rural areas) and remains almost same for OBC communities.

We also report mean expenditure shares for food and non-food group items for all the waves, separately for rural and urban regions, as shown in Table 4. For both regions, the share of expenditure spent on food items has decreased over time. On the other hand, the share of expenses on non-food items has increased marginally. This trend is as expected, and follows Engel’s law, assuming households’ real income has gone up.Footnote 15

Table 4 Trend of mean expenditure shares on food and non-food items, separately for rural and urban regions

Empirical Strategy

In order to address the central question of whether households’ welfare improves through a more diverse expenditure pattern within sub-groups of commodities due to the financial inclusion programme, we construct two time-dummy variables to capture the effects of financial inclusion.Footnote 16 The PMJDY scheme was officially initiated on 28th August 2014. Therefore, the first dummy, named PMJDY1, holds the value ‘0’ for all households for the first two waves (January to April and May to August), and ‘1’ for all households for all subsequent waves. However, the PMJDY scheme was launched as an initiative to increase the number of bank accounts in the country over a period of time. In practice, therefore, only a fraction of the households opened their bank accounts as part of the financial inclusion programme immediately after its initiation. Hence, we include another indicator variable, named PMJDY2, which takes the value ‘0’ for the first five waves, and ‘1’ thereafter (i.e. from the sixth to the ninth wave). This variable helps us account for the effect on consumption patterns one year after the launch of the programme. Lastly, some of the households in our sample may have opened bank accounts due to supply-side and demand-side factors other than the PMJDY scheme. Also, in developing countries, access to financial resources may happen through informal sources (Banerjee 2004). We include a time trend variable (named wave) to account for these and other general macroeconomic trends.
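The coding of the two time dummies can be written out as a short sketch. The function name `pmjdy_dummies` is hypothetical; the wave cut-offs follow the definitions in the text (PMJDY1 switches on from the third wave, PMJDY2 from the sixth).

```python
def pmjdy_dummies(wave):
    """Return (PMJDY1, PMJDY2) for a survey wave numbered 1..9."""
    if not 1 <= wave <= 9:
        raise ValueError("the study uses waves 1 to 9 only")
    pmjdy1 = 1 if wave >= 3 else 0  # 0 for waves 1-2, 1 from wave 3 onwards
    pmjdy2 = 1 if wave >= 6 else 0  # 0 for waves 1-5, 1 for waves 6-9
    return pmjdy1, pmjdy2


# Dummy values for every wave, identical across households.
table = {wave: pmjdy_dummies(wave) for wave in range(1, 10)}
for wave, (d1, d2) in table.items():
    print(f"wave {wave}: PMJDY1={d1}, PMJDY2={d2}")
```

Because both dummies vary only over time, they are identified separately from the linear `wave` trend only through the discrete jumps at the third and sixth waves.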

Theil Diversification Index

We use diversification of consumption expenditure as the measure of household welfare in this article (see for example, Chai et al. 2015; Chakrabarty and Mandi 2019; Clements et al. 2006 for such association). There are different approaches to measure consumption diversification, such as Gini-Simpson index (Simpson 1949), distance-based measures (Lieberson 1969), absolute value-based measures (Reardon and Firebaugh 2002) etc. In our analysis, we consider entropy-based measure proposed by Henry Theil (Theil 1967; Theil and Finke 1983). Entropy in general captures the degree of ‘dividedness’ in a system. Theil introduces this concept as a ‘measure of dividedness’ of economic variables, such as racial division, industrial diversification, political diversification etc. (Theil 1972). One advantage of using Theil’s measure is that it allows decomposition of the diversity index into two components: the diversity within separate entities and also the diversity between these entities (Palan 2010; Reardon and Firebaugh 2002). We extend this concept to measure the diversification in consumption expenditure of households.Footnote 17 The rest of this subsection describes this measure in brief, with specific references to our study.

Suppose there are \(n\) commodities that a household consumes, and \({w}_{i}\) is the share of the total budget the household spends on the \(i\)th commodity. Clearly, \(\sum _{i=1}^{n}{w}_{i}=1.\) Then, Theil’s measure of diversification of consumption expenditure \(H(w)\) for the household is given by:

$$H\left(w\right)=-\sum _{i=1}^{n}{w}_{i}\log {w}_{i} \qquad (1)$$

Theil calls the measure obtained from this expression “entropy”. Minimum value of entropy is obtained when all expenses are incurred on only one commodity, and maximum value is obtained when equal share of expenditure (i.e. \(1/n\)) is spent on each commodity.
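As a minimal sketch of the entropy measure, the function below computes \(H(w)\) from a list of expenditures. Two assumptions are made here that the text does not state: the natural logarithm is used, and a commodity with zero share contributes zero (the usual convention \(0 \log 0 = 0\)). The expenditure figures are invented.

```python
import math


def theil_entropy(expenditures):
    """Entropy H(w) = -sum_i w_i * log(w_i) of the expenditure shares."""
    total = sum(expenditures)
    if total <= 0:
        raise ValueError("total expenditure must be positive")
    shares = [x / total for x in expenditures]
    # Zero shares are skipped, i.e. 0 * log(0) is treated as 0 (assumption).
    return -sum(w * math.log(w) for w in shares if w > 0)


concentrated = theil_entropy([100, 0, 0, 0])  # all spending on one commodity
equal = theil_entropy([25, 25, 25, 25])       # equal shares of 1/n each
mixed = theil_entropy([70, 20, 5, 5])         # an intermediate case
```

With \(n = 4\) commodities, the concentrated basket gives the minimum value of zero and the equal-share basket gives the maximum value \(\log n\); any other allocation, such as the mixed one, lies in between.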

Further, Theil’s Entropy Decomposition Theorem (Theil 1972) establishes that if the system can be divided into smaller sub-groups, then the overall entropy of the whole system can be decomposed into the sum of two separate entropies: the within-group entropy, and the between-group entropy, as shown in Eq. (2) below:

$$\text{Overall entropy} = \text{Between-group entropy} + \text{Weighted average of within-group entropies} \qquad (2)$$

In our analysis, we consider two broad groups of commodities in which a typical household incurs its expenditure: food items and non-food items.Footnote 18 The entropy measure between these two broad groups will yield the between-set entropy. Further, within each of these broader groups, we construct several sub-groups. There are 13 sub-groups within the food items group and 7 sub-groups in the non-food items group (details of the commodities considered in each group and sub-group is presented in Table 13 in the appendix.). Therefore, the entropy between the food items group and the non-food items group is the first term in Eq. (2), whereas the weighted average of entropies of food items group and non-food items group is the second term. Equations (3) and (4) show the expressions for between-groups and within-group entropies, with reference to our study.

$$\begin{aligned} \text{Entropy between food and non-food groups} & = -\Bigg[ \left( \frac{\text{Expenditure on food items}}{\text{Total expenditure on food and non-food items}} \right) \log \left( \frac{\text{Expenditure on food items}}{\text{Total expenditure on food and non-food items}} \right) \\ & \quad + \left( \frac{\text{Expenditure on non-food items}}{\text{Total expenditure on food and non-food items}} \right) \log \left( \frac{\text{Expenditure on non-food items}}{\text{Total expenditure on food and non-food items}} \right) \Bigg] \qquad (3) \end{aligned}$$


$$\begin{aligned} \text{Weighted average of entropies within food items group and non-food items group} & = \left( \frac{\text{Expenditure on food items}}{\text{Total expenditure on food and non-food items}} \right) \\ & \quad \times \left( - \sum_{i = 1}^{13} \text{share of total expenditure on } i\text{th food item} \times \log \left( \text{share of total expenditure on } i\text{th food item} \right) \right) \\ & \quad + \left( \frac{\text{Expenditure on non-food items}}{\text{Total expenditure on food and non-food items}} \right) \\ & \quad \times \left( - \sum_{i = 1}^{7} \text{share of total expenditure on } i\text{th non-food item} \times \log \left( \text{share of total expenditure on } i\text{th non-food item} \right) \right) \qquad (4) \end{aligned}$$

In our econometric analysis, we use both within-group entropies as well as between-groups entropy as measures of household welfare.
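The decomposition can be checked numerically with a small sketch. It rests on one interpretive assumption: inside each group, the shares are taken relative to that group's own expenditure, which is the form under which the identity in Eq. (2) holds exactly. The natural logarithm and the expenditure figures (three food and two non-food sub-groups instead of 13 and 7) are likewise assumptions for illustration.

```python
import math


def entropy(shares):
    """-sum w*log(w) over positive shares (0*log 0 treated as 0)."""
    return -sum(w * math.log(w) for w in shares if w > 0)


def decompose(food, nonfood):
    """Split overall entropy into between-group and within-group parts.

    food, nonfood: lists of expenditures on the sub-groups of each group;
    both groups are assumed to have positive total expenditure.
    """
    food_total, nonfood_total = sum(food), sum(nonfood)
    total = food_total + nonfood_total
    w_food, w_nonfood = food_total / total, nonfood_total / total

    between = entropy([w_food, w_nonfood])
    within_food = entropy([x / food_total for x in food])
    within_nonfood = entropy([x / nonfood_total for x in nonfood])
    weighted_within = w_food * within_food + w_nonfood * within_nonfood
    overall = entropy([x / total for x in food + nonfood])
    return overall, between, weighted_within, within_food, within_nonfood


# Hypothetical monthly expenditures of one household.
overall, between, weighted_within, h_food, h_nonfood = decompose(
    food=[300.0, 120.0, 80.0], nonfood=[250.0, 250.0]
)
gap = overall - (between + weighted_within)  # zero up to rounding error
```

In this sketch `h_food`, `h_nonfood` and `between` correspond to the three diversification indices used as outcome variables in the econometric analysis.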


In our discussions so far, we hypothesized the primary link from financial inclusion to welfare in the following way: financial inclusion may improve employment opportunities, and thus household income. Further, increased income leads to both increase and diversification of consumption expenditure. And finally, diversified consumption causes welfare gain.

Specifically, for our welfare measure, welfare gain will translate into diversification in consumption within the food group and non-food commodity group, as well as between the food and non-food commodity groups. Accordingly, we formulate the hypotheses as follows:


H1: Financial inclusion will lead to diversification in food expenditure.


H2: Financial inclusion will lead to diversification in non-food expenditure.


H3: Financial inclusion will lead to a shift in expenditure from ‘within only food group’ to ‘between food group and non-food group’.

Econometric Model

To analyse the impact of financial inclusion on consumption diversification, we employ a panel data regression method. The Consumer Pyramid database collects information about the same households over repeated periods, making it well suited to tracing changes in the economic behaviour of the same households over time. We exploit this feature of the dataset to investigate the effect of a large-scale financial inclusion programme on households’ welfare.

Two-Stage Regression: Income Channel

To examine the path from financial inclusion to consumption diversification through an increase in consumption expenditure, we employ a two-stage regression approach. In the first stage, we regress the log of monthly per capita consumption expenditure (logMPCE) on the financial inclusion indicators, as well as a set of control variables (Eq. 5). Then, in the second stage, we regress the three diversification indices (food index, non-food index, and between index) on the predicted values of the per capita expenditure (\(\widehat{logMPCE}_{it}\)) obtained from the first stage, along with a smaller set of control variables (Eq. 6).

$${\text{First stage:}}\;\log MPCE_{it} = \alpha_{0} + \alpha_{1} PMJDY1_{t} + \alpha_{2} PMJDY2_{t} + \alpha_{3} Asset\;Index_{it} + \alpha_{4} Number\;of\;Children_{it} + \alpha_{5} Mean\_education_{it} + \alpha_{6} Proportion\_working_{it} + \alpha_{7} wave_{t} + \sum\limits_{k = 1}^{3} \alpha_{k}^{caste} Caste_{i} + \sum\limits_{j = 1}^{2} \alpha_{j}^{Religion} Religion_{i} + \sum\limits_{s = 1}^{19} \alpha_{s}^{State} State_{i} + \rho_{i} + u_{it}$$
$${\text{Second stage:}}\;Y_{it} = \beta_{0} + \beta_{1} \widehat{\log MPCE}_{it} + \beta_{2} Asset\;Index_{it} + \beta_{3} Number\;of\;Children_{it} + \beta_{4} Mean\_education_{it} + \beta_{5} wave_{t} + \sum\limits_{k = 1}^{3} \beta_{k}^{caste} Caste_{i} + \sum\limits_{j = 1}^{2} \beta_{j}^{Religion} Religion_{i} + \sum\limits_{s = 1}^{19} \beta_{s}^{State} State_{i} + \alpha_{i} + u_{it}$$

We use these equations to run six sets of regressions. First, following the hypotheses mentioned above, we run a regression for each type of Theil Diversification Index: diversification within food group, diversification within non-food group and diversification between food and non-food group. Further, as mentioned earlier, we carry out our analysis separately for rural and urban regions, for each diversification index. \({Y}_{it}\) is the dependent variable that captures the appropriate diversification index (Theil Food Index, Theil non-food Index and Theil Between Index), where ‘i’ denotes the households and ‘t’ denotes the waves.

PMJDY1 is a binary variable capturing the immediate impact of the PMJDY scheme. It takes the value ‘0’ for the first two waves and ‘1’ for the subsequent seven waves. However, as mentioned before, given the scale and scope of the PMJDY financial inclusion programme, its take-up, and therefore its impact on households’ consumption might not be apparent immediately after the launch of the scheme. To address this concern, we include another time-dummy variable that takes the value ‘0’ for the first five waves, and ‘1’ for the next four waves. This dummy, denoted as PMJDY2 in the model, measures the effect of the financial inclusion programme after one year of inception.
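The two time dummies described above can be written down directly from the wave number. The following Python sketch only restates the coding rule in the text; the function name is hypothetical.

```python
def pmjdy_dummies(wave):
    """Return (PMJDY1, PMJDY2) for a wave number between 1 and 9."""
    if not 1 <= wave <= 9:
        raise ValueError("wave must be between 1 and 9")
    pmjdy1 = 1 if wave >= 3 else 0  # 0 for waves 1-2, 1 for waves 3-9
    pmjdy2 = 1 if wave >= 6 else 0  # 0 for waves 1-5, 1 for waves 6-9
    return pmjdy1, pmjdy2

# Coding table for all nine waves.
table = {wave: pmjdy_dummies(wave) for wave in range(1, 10)}
for wave, (d1, d2) in table.items():
    print(wave, d1, d2)
```

Because PMJDY2 switches on only within the period in which PMJDY1 already equals one, its coefficient measures the additional change from the sixth wave onwards, on top of the PMJDY1 effect.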

Therefore, based on our model, our three hypotheses H1a, H2a and H3a for three different Theil Indices (food, non-food and between) that capture the effect of financial inclusion through the channel of increasing income, can be written as shown below:

$${\text{H1a, H2a, H3a: Null hypothesis: }}\beta_{1} \le 0;\;{\text{alternate hypothesis: }}\beta_{1} > 0.$$

Consumption diversity could also arise due to demographic composition and heterogeneity in tastes. The extant literature has also shown that demographic attributes play an important role in consumption decisions. We include three variables to control for these attributes. First, the impact of the number of children on household decision making, as well as on consumption diversification, has been studied in the literature (Chakrabarty and Mandi 2019; Flurry 2007; Ray 1986). Accordingly, we include a variable Number of Children that captures the number of family members below age 16 in a household. Second, educational attainment is also found to affect consumption expenditure (for example, see Maitra and Ray 2004). Therefore, we include the variable \(Mean\_education\) in our regressions. As the name suggests, this variable depicts the level of education in a household, averaged over all family members. Finally, the relation between employment and consumption is also well established in the literature (see Mincer 1960, for example). To account for this, we include another explanatory variable, \(Proportion\_working\). This variable is constructed by dividing the number of family members working within a household by the total number of household members. All three variables are calculated for each household for each wave separately. We must note that the variable \(Proportion\_working\) is included only in the first-stage regression as an additional variable.
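As an illustration, the three demographic controls can be computed from a household roster for a given wave as follows. This is a minimal Python sketch: the record layout and field names are hypothetical, and education is assumed to be recorded in years, which the text does not specify.

```python
def demographic_controls(members):
    """Compute the three household-level controls for one household in one wave.

    members: list of dicts with hypothetical keys
    'age', 'education_years' and 'working' (True/False).
    """
    n = len(members)
    return {
        # Family members below age 16.
        "Number_of_Children": sum(1 for m in members if m["age"] < 16),
        # Education level averaged over all family members.
        "Mean_education": sum(m["education_years"] for m in members) / n,
        # Working members divided by total household members.
        "Proportion_working": sum(1 for m in members if m["working"]) / n,
    }

# Made-up household of four members.
roster = [
    {"age": 40, "education_years": 12, "working": True},
    {"age": 38, "education_years": 10, "working": False},
    {"age": 14, "education_years": 8, "working": False},
    {"age": 9, "education_years": 3, "working": False},
]
controls = demographic_controls(roster)
print(controls)
```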

Ando and Modigliani (1963) first stated the role of assets in consumption behaviour in their life-cycle hypothesis of consumption. Since then there has been extensive empirical research in this field using macroeconomic time series data (see Altissimo et al. 2005). There are a few recent studies on this effect using microeconomic data as well. For example, Campbell and Cocco (2007) have studied the effect of home price fluctuations on consumption for homeowners of different age groups. Caceres (2019) studied the effects of various types of wealth, such as housing, financial assets, and total net worth, on consumption. Bostic et al. (2009) have investigated the impact of wealth on consumption expenditure using a household-level consumer expenditure survey. Following this literature, we include Asset Index as an explanatory variable in the empirical model.

Social characteristics such as caste and religion play an important role in the consumption patterns of households, especially in developing countries like India. Bailey and Sood (1993) have studied the effect of religious affiliation on consumption behaviour. Borooah et al. (2014) have studied the link between caste and consumption pattern. Khamis et al. (2012) have shown the impact of both caste and religion on visible consumption expenditure. Hence, we include time-invariant Religion and Caste dummies in our empirical model. Since the religion and caste of a person are ascribed at the time of birth, and there is very little possibility of their changing over a lifetime, we consider these factors as time-invariant and exogenous in nature. Additionally, to capture the general macroeconomic linear trend, we include the wave variable, which takes values from 1 to 9 covering our whole time period. Finally, some of the unobserved state-level heterogeneity that might affect consumption patterns is captured through state-level fixed effects.
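The mechanics of the two-stage approach with household fixed effects can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' estimation code: all function names are hypothetical, the household effect is removed by the within (demeaning) transformation, and the time-invariant caste, religion and state dummies are left out because they drop out under that transformation. Standard errors from such a manual second stage would need separate treatment and are not computed here.

```python
def demean_by_group(values, groups):
    """Within transformation: subtract each household's mean from its observations."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def ols(y, columns):
    """Least-squares coefficients from the normal equations (no intercept)."""
    k = len(columns)
    # Augmented matrix [X'X | X'y].
    a = [[sum(ci * cj for ci, cj in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(ci * yi for ci, yi in zip(columns[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def two_stage_fe(y, log_mpce, first_stage_only, shared, household):
    """first_stage_only: regressors that enter only the first stage;
    shared: regressors that enter both stages."""
    z = [demean_by_group(c, household) for c in first_stage_only]
    x = [demean_by_group(c, household) for c in shared]
    m = demean_by_group(log_mpce, household)
    alpha = ols(m, z + x)                       # first stage
    m_hat = [sum(a * c[i] for a, c in zip(alpha, z + x)) for i in range(len(m))]
    beta = ols(demean_by_group(y, household), [m_hat] + x)  # second stage
    return alpha, beta

# Made-up, noise-free panel: 3 households observed over 9 waves.
household = [h for h in range(3) for _ in range(9)]
wave = [t for _ in range(3) for t in range(1, 10)]
pmjdy1 = [1.0 if t >= 3 else 0.0 for t in wave]
asset = [0.1 * t * (h + 1) + (t % 2) for h, t in zip(household, wave)]
log_mpce = [1.0 + 0.5 * p + 0.3 * a + 0.2 * h for p, a, h in zip(pmjdy1, asset, household)]
y = [2.0 * m + 0.7 * a - 0.4 * h for m, a, h in zip(log_mpce, asset, household)]

alpha, beta = two_stage_fe(y, log_mpce, [pmjdy1], [asset], household)
print([round(a, 3) for a in alpha])  # coefficients on PMJDY1 and asset (first stage)
print([round(b, 3) for b in beta])   # coefficients on predicted logMPCE and asset (second stage)
```

Variables passed in `first_stage_only` are exactly those that appear in the first-stage equation but not in the second stage, such as the financial inclusion dummies and the proportion of working members. Since the synthetic data contain no noise, the sketch should recover the coefficients used to generate them.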

Two-Stage Regression: Independent Effect of Financial Inclusion

Apart from the income channel mentioned above, financial inclusion may affect welfare through other channels as well. For example, greater access to savings and credit due to financial inclusion (Bharadwaj and Suri 2020) may lead to more diversity in consumption (Annim and Frempong 2018) and also to consumption smoothing (Lai et al. 2020), thereby indicating a change in the pattern of consumption. This may lead to a possible widening of the consumption basket. We, therefore, run another six sets of regressions employing Eq. 7, in which we additionally include the purely exogenous financial inclusion dummy variables in the second stage to account for the existence of such a separate effect of financial inclusion over and above the income channel. The coefficients \({\beta }_{1}\) and \({\beta }_{2}\) in Eq. 7 correspond to the short-run and long-run independent impact of the PMJDY scheme after controlling for all other relevant explanatory variables included in the second-stage regression along with the income channel:

$$\begin{aligned} Y_{it} & = \beta_{0} + \beta_{1} PMJDY1_{t} + \beta_{2} PMJDY2_{t} + \beta_{3} \widehat{\log MPCE}_{it} + \beta_{4} Asset\;Index_{it} \\ & \quad + \beta_{5} Number\;of\;Children_{it} + \beta_{6} Mean\_education_{it} + \beta_{7} wave_{t} \\ & \quad + \sum\limits_{k = 1}^{3} \beta_{k}^{caste} Caste_{i} + \sum\limits_{j = 1}^{2} \beta_{j}^{Religion} Religion_{i} + \sum\limits_{s = 1}^{19} \beta_{s}^{State} State_{i} + \alpha_{i} + u_{it} \\ \end{aligned}$$

Therefore, based on our model, the separate effect of financial inclusion on consumption diversification can be tested using the following hypothesesFootnote 19:


H1b: Null hypothesis: \(\beta_{1}\le 0\), alternate hypothesis: \(\beta_{1}>0.\) H1b’: Null hypothesis: \(\beta_{2}\le 0\), alternate hypothesis: \(\beta_{2}>0.\)


H2b: Null hypothesis: \(\beta_{1}\le 0\), alternate hypothesis: \(\beta_{1}>0.\) H2b’: Null hypothesis: \(\beta_{2}\le 0\), alternate hypothesis: \(\beta_{2}>0.\)


H3b: Null hypothesis: \(\beta_{1}\le 0\), alternate hypothesis: \(\beta_{1}>0.\) H3b’: Null hypothesis: \(\beta_{2}\le 0\), alternate hypothesis: \(\beta_{2}>0.\)

If we are able to reject the null hypotheses, that might indicate the presence of separate channels.
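For illustration, a one-sided test of this form can be carried out from an estimated coefficient and its standard error. The Python sketch below assumes a normal approximation to the test statistic, which is an assumption of this example rather than a statement about the paper's procedure; the function names and the numbers are made up.

```python
import math

def one_sided_p_value(coef, std_err):
    """P-value for H0: beta <= 0 against H1: beta > 0 (normal approximation)."""
    z = coef / std_err
    # Upper-tail probability 1 - Phi(z) of the standard normal distribution.
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

def reject_null(coef, std_err, level=0.05):
    """True when the null hypothesis beta <= 0 is rejected at the given level."""
    return one_sided_p_value(coef, std_err) < level

# Made-up estimates: a positive and a negative coefficient with the same standard error.
print(reject_null(0.030, 0.010))
print(reject_null(-0.030, 0.010))
```

With a one-sided alternative, a negative estimate can never lead to rejection of the null, however precisely it is estimated.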

We start our analysis by testing for the presence of heteroskedasticity using the Modified Wald Test (Greene 2011). We find strong evidence of heteroskedasticity, and therefore use the alternate Hausman test to choose between the Fixed Effects and Random Effects estimators.Footnote 20 Upon confirmation that the Fixed Effects (FE) estimator is the appropriate model for our data,Footnote 21 we use the FE estimator with heteroskedasticity-robust standard errors. Additionally, we employ an alternate approach proposed by Hausman and Taylor (1981), based on the 2SLS estimation method, to account for plausible endogeneity of the wealth variable. This approach also allows us to estimate the effects of the time-invariant variables, such as caste and religion.


We start the discussion of this section with the results obtained by running two-stage panel data regressions using Eqs. 5 and 6. This approach is employed to examine the effect of financial inclusion on consumption diversification through one plausible channel, i.e. an increase in consumption expenditure. Table 5 shows results from the first-stage regression. The coefficients for the financial inclusion indicators are positive and significant, implying that financial inclusion increases monthly per capita expenditure. We use the predicted values of monthly per capita expenditure as an independent variable in the second stage. Additionally, we run a separate set of second-stage regressions (using Eq. 7), where we include the financial inclusion indicator variables in the second-stage equation to account for any separate effect apart from the effect through the increase in income.

Table 5 First stage regression results

Thus, we run twelve sets of second-stage regressions: for three dependent variables (Theil Food Index, Theil non-food Index and Theil Between Index), using Eqs. 6 and 7, separately for rural and urban regions. We run the regressions separately for rural and urban regions because, given the systematic differences in consumption expenditure between these two regions (refer to Table 3), patterns of consumption diversity are also expected to differ. Table 6 shows the Theil Diversification Index values for ‘within food group’, ‘within non-food group’ and ‘between food and non-food group’, separately for rural and urban regions. While both ‘within non-food group’ and ‘between-group’ consumption diversity increase, ‘within food group’ diversity decreases over time.

Table 6 Trend of the different Theil diversification indices

Table 7 shows results for the Theil Diversification Index for the food items group. Columns 1 and 2 show that monthly per capita expenditure (MPCE) is positively correlated with diversification in food expenditure (i.e. we reject the null hypothesis for H1a). This is expected, and corroborates previous findings (Clements et al. 2006; Falkinger and Zweimüller 1996). It should be noted, however, that this variable in the second stage is the value of monthly per capita expenditure predicted from the set of explanatory variables mentioned in Eq. 5.

Table 7 Regression results for diversification of food index

Also, there is a negative correlation between asset holding and diversification in food expenditure. Given the inelastic nature of the food group commodities, this result is also along expected lines. Further, an increase in wealth should shift a part of expenditure away from the food group to the non-food group.Footnote 22 This is evident from the values and signs of the coefficients for the Asset Index variable in the between-group regressions shown in Table 11. The coefficient for number of children in the household is positive and significant. This corroborates findings from recent studies (see Chakrabarty and Mandi 2019, for example). The reason diversification increases with the number of children is that the presence of children requires households to spend on items such as milk, fruits etc. Columns 3 and 4 of Table 7 show estimates for the time-invariant variables, using the Hausman-Taylor estimation method.Footnote 23 Coefficients for socially disadvantaged castes and religious groups are mostly positive for food group diversification. We argue that this result is along expected lines too, since relatively poorer groupsFootnote 24 would diversify their expenditure within the food group, rather than consuming non-food group items. This is also evident from the negative coefficient for these groups when we estimate the shift between the food and non-food groups (Table 11). Finally, the wave variable, which captures the general macroeconomic trend, shows a negative impact on food diversification, as is expected from the descriptive statistics in Table 6.

Columns 5 and 6 show results including the financial inclusion indicators. The PMJDY1 dummy shows a statistically significant negative impact, and the PMJDY2 dummy shows a statistically significant positive impact on ‘within food group’ consumption diversification. These dummies capture the separate effect of financial inclusion, independent of other control variables, including the total expenditure variable.

We reiterate that the PMJDY1 dummy captures the effect immediately after the launch of the financial inclusion scheme, i.e. from the third wave onwards, whereas PMJDY2 captures the effect one year after the launch of the scheme, i.e. from the sixth wave onwards. According to our hypotheses H1b and H1b’, we expected the diversity within food items to increase. However, we are not able to reject the null hypothesis for H1b, whereas the null hypothesis for H1b’ is rejected. A robustness check using alternative dummy variables (one taking the value ‘0’ for waves one to seven and ‘1’ for the subsequent two waves, and another taking the value ‘0’ for waves one to four and ‘1’ for the subsequent five waves) in fact shows that the coefficient for the PMJDY2 variable is negative in both cases. The positive sign of the PMJDY2 variable in Table 7 may therefore be an aberration due to a sudden positive jump in the Theil Diversification Index for food items from wave 5 to wave 6 (see Table 6).

According to some medical literature, dietary diversification increases with an increase in financial resources (Morseth et al. 2017). Our analysis does not reveal this pattern, possibly due to data limitations. Our dataset contains expenditure information only for broad sub-items within the food group; expenditure on detailed items is not available. This limits our scope for capturing variation in consumption within the food group.

However, there is a possibility that households may shift within the food group from lower quality food items to selected higher quality items (and hence higher welfare), which may be inferred from Table 8. The table shows that mean of proportion of food items consumed by households over the waves has decreased by 2% from 1st to 9th wave. Secondly, it may be noted that the Theil’s within-group diversification index does not reveal the shift in quality of the items consumed; rather, it only captures the heterogeneity of items consumed. However, a separate analysis using the data shows that there is a substitution from lower income-elastic to higher income-elastic (luxury) goods, substantiating a shift in quality of consumption as well.Footnote 25
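A statistic of the kind reported in Table 8 can be illustrated as follows. The Python sketch assumes that an item counts as consumed when the household reports a positive expenditure on it; this definition and the function names are assumptions of the example, and the numbers are made up.

```python
def proportion_consumed(expenditures):
    """Share of sub-items in a group on which the household spent a positive amount."""
    return sum(1 for x in expenditures if x > 0) / len(expenditures)

def mean_proportion(households):
    """Average of the household-level proportions within one wave."""
    return sum(proportion_consumed(h) for h in households) / len(households)

# Made-up wave with two households and four food sub-items each.
wave_data = [
    [5.0, 0.0, 3.0, 0.0],  # consumes 2 of 4 items
    [1.0, 1.0, 1.0, 1.0],  # consumes all 4 items
]
print(mean_proportion(wave_data))
```

A fall in this mean alongside rising total expenditure is compatible with households concentrating spending on fewer, possibly higher quality, items, which the entropy index alone cannot distinguish.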

Table 8 Mean of proportion of food items consumed by households, over time

Results clearly show that omitting the financial inclusion indicator variables increases the magnitude of coefficients for predicted total consumption substantially. This indicates a possibility of bias due to omission of significant separate impact of financial inclusion indicator variables. Therefore, this might suggest the presence of other plausible channels, as mentioned before. Further, this bias may arise due to measurement error of the consumption variable, or the possible failure to capture the presence of economies of scale in MPCE. We must mention that the coefficients for other control variables are qualitatively similar.

Table 9 shows results for the non-food items group, using both Eqs. 6 and 7. Columns 1 and 2 show results obtained from Eq. 6. There is an increase in diversification for both rural and urban households, as indicated by the positive and statistically significant coefficient \(\beta_{1}\). This effect confirms the impact of financial inclusion on consumption diversification through an increase in income. Hence, we reject the null hypothesis for H2a. We also notice a positive and significant impact of the financial inclusion indicator variables over and above the income effect (hence we reject the null hypotheses for H2b and H2b’).

Table 9 Regression results for diversification of non-food index

Descriptive statistics showing the trend in mean shares of expenditure within non-food group (Table 4), and also the mean of the proportion of items consumed within this group (Table 10), support our findings. The last column of Table 10 indicates that over time there is a considerable increase in the number of non-food items consumed by both rural as well as urban households.

Table 10 Mean of proportion of non-food items consumed by households, over time

All other variables affecting non-food diversity are positive and significant, as expected. Given that the non-food items group consists of relatively elastic commodities, asset holding is expected to have a positive impact on diversification. Interestingly, the effect on diversification in non-food consumption is higher for the socially disadvantaged castes than for the socially advantaged castes. This may be explained through the differences in initial endowments between the advantaged and disadvantaged groups. In other words, since the historically socially advantaged groups possess higher wealth and also earn higher income on average, their existing consumption basket for non-essential commodities is expected to be broader than that of the other groups. Therefore, the marginal improvement in consumption diversification should be higher for the groups that start from a lower level of initial consumption.

Finally, Table 11 shows results for the shift in expenditure from the food items group to the non-food items group, i.e. the first component of Eq. (2). The effect of financial inclusion on between-group diversification through the channel of income is positive and significant, implying the rejection of the null hypothesis for H3a. All other time-varying coefficients are positive and statistically significant as well. The decrease in between-group expenditure for the socially disadvantaged groups (relative to the base group) possibly signifies that they are able to diversify within the food group and within the non-food group of commodities (as shown in earlier tables), but that the lower MPCE of these social groups does not allow them to shift expenditure from essential commodities (i.e. food items) to relatively non-essential commodities (i.e. non-food items). We would like to re-emphasize that the separate and significant effect of financial inclusion is evident even for the between-group diversity index (columns 5 and 6). Hence, we reject the null hypotheses for H3b and H3b’. This shows that there could be channels other than income which drive the change in consumption pattern due to financial inclusion.

Table 11 Regression results for diversification of between index


The main contribution of this paper is to assess whether state-led financial inclusion programmes can lead to higher economic welfare. We conduct this study using the recent thrust on financial inclusion (primarily through the PMJDY programme) in India. The country-wide launch of the programme, the sharp increase in the number of bank accounts opened since its launch, the varied opinions on the programme, and the lack of previous studies evaluating it make this an issue of considerable interest.

The study also contributes in using diversification in consumption expenditure as a measure of welfare. As noted earlier, the existing literature in development economics has used this welfare measure relatively sparsely. However, apart from improvement in the level of expenditure, its diversification has also been used as an indicator of overall welfare. We employ Theil’s (1967) entropy-based measure as the estimate of diversification.

The key insight from the study is that financial inclusion matters towards improvement of welfare. To reiterate, one link from financial inclusion to economic welfare is through better production and/or employment opportunities, leading to higher income and, thus, change in pattern of consumption expenditure. There can be other channels through which this transmission may happen. For example, financial inclusion has been shown to affect saving and borrowing behaviour, which can improve consumption smoothing, possibly leading to widening of the consumption basket.

Using a two-stage regression approach, we find robust evidence that households diversify their consumption expenditure as a result of increased access to the formal financial system. This increase is especially visible within non-food items, as well as between food and non-food items. Given that total (real) expenditure on consumption has increased in both these baskets (see Table 15), this too indicates an improvement in welfare. Separate analysis for rural and urban samples shows that a similar diversification pattern exists across the population in rural and urban regions. We obtain our results after controlling for a set of plausible confounding factors. The results also suggest a separate and significant effect of financial inclusion after controlling for the channel of increased income opportunities due to financial inclusion. This may indicate the existence of additional channels, or the presence of measurement error in the total consumption expenditure variable. We do not explicitly explore these channels in the present paper, but we believe they are worth exploring in future research.

The study makes a few additional observations. Diversification is generally positively associated with the level of income and wealth. However, the impact is heterogeneous across social as well as religious groups. While the socially disadvantaged groups diversify more within food group items as well as within non-food group items, they are relatively less able to shift expenditure between these two groups. This is possibly due to differences in initial conditions among the socio-religious groups. Further, demographic attributes also act as a moderating factor towards ultimate benefit of such programmes.

One critique of financial inclusion schemes in general is that many of the accounts opened through these initiatives lie dormant (for example, see Goedecke et al. (2018) for the Indian context in general, and Bijoy (2018) for the PMJDY programme specifically). However, our empirical analysis indicates that the scheme has improved the economic welfare of households, even after controlling for plausible confounding factors. In this sense, our estimates may be interpreted as a lower bound on the improvement in well-being. In recent times, the use of these accounts has actually shown a rising trend (Chopra et al. 2018; GoI 2020). Future research can be directed towards exploring the marginal effect of this increasing trend, and the mechanisms through which this gain occurs.