1 Introduction

The Indian government initially responded relatively quickly to the COVID-19 pandemic, with a severe nationwide lockdown that was imposed at a few hours’ notice. Almost all economic activity initially came to an abrupt halt. The result was one of the greatest declines in economic activity among all the nations of the world. The lockdown began in late March, 2020, and was relaxed gradually, beginning in late May of that year, though—as was the case in most countries—unequal access to digital tools and different job requirements led to an uneven return to normal activity. India’s GDP shrank by 23.9% year-on-year in the April–June quarter of 2020. The subsequent economic recovery was impacted by a resurgence of the pandemic in 2021, which began to overwhelm the health system. As is the case for other countries, new variants and local and regional waves have continued to challenge policymakers. Again, as was the case for other nations, various state and local restrictions were implemented in response to resurgent cases, but there was not another national lockdown, because of the fear of economic damage. However, even sub-national lockdowns, or simply the behavioral responses to the pandemic itself, such as postponing investment or self-restricting mobility, can have significant economic impacts.

These continued threats and challenges imply that quantifying and mapping the economic impacts of the initial lockdowns is still relevant. Indeed, there is a growing literature that seeks to assess the effects of the pandemic and lockdowns. A major focus has been on income and employment effects, but food consumption and food security have also been of concern to analysts. In this paper, we use data from an ongoing large-scale survey to quantify the impacts of the initial lockdown on household income and its components, particularly wages and business income, which are by far the two largest components of household income. We distinguish between the effects on rural versus urban households, and investigate the possibility of more severe impacts on potentially vulnerable households, specifically, those with daily laborers, and those headed by women. We also examine how the impacts differed across different parts of the income distribution.

In this analysis, we focus on the state of Punjab in India. Several factors influenced this choice. Punjab is in the middle of state per capita income rankings within India, with a fairly representative mix of agriculture, industry and services.Footnote 1 It does not have any major metropolitan city, nor any very poor regions. It is fairly compact and relatively geographically homogeneous. While it is a smaller state in the Indian context, its size is not trivial: in population size, it is comparable to the world’s 50th most populous nation.Footnote 2 We also have some qualitative information on the progression of the pandemic in Punjab, which provides additional context for the formal empirical analysis. Another possible advantage of focusing on Punjab is that it has a relatively high penetration of cellphones, reducing the impact of the lockdown on accessibility of survey respondents.

While no individual state can be fully representative of the whole country, we will argue that the positive characteristics of Punjab, combined with our results on the impacts of the lockdown on relatively vulnerable segments of the population, justify our choice. Specifically, the fact that we estimate significant additional negative impacts on income for rural laborer households in a state such as Punjab, with relatively favorable conditions, reinforces concerns about the damage done by the lockdowns to vulnerable populations in India (e.g., Afridi et al. 2022). We will also discuss our results in the context of studies using national or multi-state data, as well as studies using data that is more concentrated, either geographically or by population segment. Our analysis can be viewed as filling a gap between these two kinds of studies, in terms of geographic scope.

Our data is from the Consumer Pyramids Household Survey (CPHS), a household-level survey conducted by the Centre for Monitoring Indian Economy (CMIE). This dataset has been extensively used by researchers to examine the economic impacts of the pandemic and national lockdown in India, while many other studies have used explicitly commissioned post-pandemic surveys.Footnote 3 While the CPHS dataset has been questioned in terms of its post-pandemic coverage (and its representativeness) (Dreze and Somanchi 2021; Somanchi 2021), it has a major advantage of providing more extensive pre-pandemic data, which can be used to establish the impacts of the pandemic, rather than recall for a short period, to which solely post-pandemic surveys are restricted.Footnote 4 We will argue in our later discussion that any biases in the coverage of the CPHS would strengthen the main message of our analysis. CMIE collects this data across almost all of India’s states, and while there are clearly advantages to a nationwide analysis, it is also true that each state has different economic and social characteristics, and separate analysis can be helpful. For example, a post-pandemic survey of farmers in two states, Odisha and Haryana, which have quite different agricultural economies, found very different responses and accounting of impacts (Ceballos et al. 2020). Another reason for a focused analysis is that the spread of the pandemic and policy responses also differed across the various states, even in the context of a national lockdown. In our later discussion, we compare the results with more geographically concentrated studies, and with other studies for India as a whole, as well as including our own analysis comparing Punjab with the neighboring state of Haryana.

As already noted, several recent studies have documented the impact of India’s severe lockdown and the resulting economic disruption and human distress. As also noted, to better understand the magnitude of this effect, numerous surveys were instituted by various organizations and researchers, soon after the lockdown, going well beyond normal data-collection efforts. Dreze and Somanchi (2021) review and analyze much of the early evidence, documenting sharp drops in employment and income. In some of these studies, urban households were found to be worse affected than rural households: we compare this finding with our own results in our later discussion. These authors also provide their own analysis of the CPHS data, tracking per capita income by quartile, and finding greater percentage declines for the bottom two quartiles versus the top quartile (Dreze and Somanchi 2021, Figure 1). Income declines in the various surveys summarized by these authors were over 40% in magnitude (Dreze and Somanchi 2021, Table 1). Also, in a very early post-lockdown analysis, Bertrand et al. (2020) used April 2020 CPHS data to report that the second and third quintiles of the income distribution had been worst affected, as measured by the proportion reporting income losses. Subsequently, Gupta et al. (2021) used the CPHS data at the all-India level, and found that incomes of daily laborers fell much more than those of salaried workers (75% vs 35%), but that incomes fell more for individuals from households in the highest income quartile. These authors also provided a detailed analysis of occupational adjustments associated with the lockdown’s disruptions of the economy, and of the labor market in particular.

Table 1 Summary statistics of real income from trimmed sample

Focusing on gender inequalities in impacts, Abraham et al. (2022), Afridi et al. (2021) and Deshpande (2020) all used the CPHS data to demonstrate that women’s employment was much more severely affected by the lockdowns, along with other disproportionate negative impacts. For example, Deshpande conditioned this conclusion on the fact that India has a very unequal gender employment ratio, with women much less likely to be in the (formal) labor force: allowing for this, they were worse affected by the lockdown. Afridi et al. (2022) used national CPHS data, combined with administrative data, and found that regions of India that had greater state capacity to deliver on the MG-NREGA scheme did better in cushioning job losses from the pandemic, especially for rural women, and even more for the less skilled in that category.

Various other studies have used a range of data sources. For example, based on their own extensive post-lockdown survey, Kesar et al. (2021) also found substantial declines in employment and income. Other survey-based studies, typically geographically narrow, that estimated the impacts of the lockdowns on agricultural livelihoods include Ceballos et al. (2020), Ceballos et al. (2021), Jaacks et al. (2021), Kesar et al. (2021), Habanyati et al. (2022), Gupta et al. (2020), and Suresh et al. (2022). As an alternative to relying on survey data, Beyer et al. (2023) and Beyer et al. (2021) documented the pattern of declines in economic activity across India using both light intensity and electricity consumption data.Footnote 5

We contribute to this literature by using an approach that is different from other studies in several dimensions. By focusing on a single state, we are able to reduce possible complications arising from differences in economic structure, policy implementation, and initial conditions across states. Also, our focus is specifically on differences in the effects of the pandemic and lockdown on rural versus urban households, and on households that might be more vulnerable, including those at the lower end of the income distribution, those with daily laborers (overlapping significantly with the first category), and those headed by women. Other studies have found that women’s livelihoods were affected disproportionately by the lockdown. In our analysis, we cannot isolate this impact in general households (those not necessarily female-headed), since the data on individual contributions to household income is not complete. However, in our data for Punjab, it appears that there was an additional negative impact that was concentrated among urban female-headed households in the upper part of the income distribution.

We also contribute to the existing literature by using CPHS data in a manner that allows for a more rigorous comparison of pre-pandemic and post-lockdown economic outcomes, something that is typically not possible with studies relying solely on post-lockdown surveys. Compared to other studies that employ CPHS data, we use a longer stretch of pre-pandemic data to anchor our estimates of the impact of the lockdown. Our paper also makes a methodological contribution by using the Poisson Pseudo-Maximum Likelihood (PPML) method to estimate the impacts of the lockdown on incomes. We follow Wooldridge (1999) in using PPML to consistently estimate the effect of COVID-19 and its induced lockdown on different components of income, even when there are a large number of zero-valued observations of the dependent variable. This is the case with some of the components of the income data. Usually, when there are a large number of zeroes in the data, a standard practice is to ignore the clustering at zero, or to limit the sample to include only nonzero observations, or to use a two-step method based on a truncation model, which is not quite appropriate in this context. We do investigate these alternative methods for robustness of our results.

The remainder of the paper is structured as follows. The next section describes our data, and describes the empirical framework we use. Specifically, we use PPML as an estimation method that is meant to deal with the skewness and zero observations in the distribution of income and its components. Additionally, we supplement that approach with the alternative of quantile regressions, to examine differences in impacts across the income distribution. We allow for differences between rural and urban households, and for different impacts for households with daily laborers or that are headed by females. We also examine time patterns of impacts in the months following the initial lockdown. The third section presents all our regression results. The fourth section follows with a discussion of these results in the context of the existing literature. The final section is a summary conclusion.

2 Data and empirical framework

2.1 Data

For the purpose of our analysis we use monthly income data collected through household surveys by CMIE for the Indian state of Punjab, covering the period from June 2014 to August 2020. The survey design for this period was revised in 2013 based on the sampling frame of Census 2011 (versus the earlier frame of the 2001 Census), due to a revised classification of towns and the introduction of town-size-based strata. It uses multi-stage sampling, with villages and towns as primary sampling units and households as ultimate sampling units. Households are selected using circular systematic sampling. Our data consists of 318,953 observations on urban households and 120,906 observations on rural households. While it is a panel survey, with each household sampled thrice every year (i.e. every four months), the scope of the survey was expanded during the sample period, so there is not a single value for how often households were sampled over that period. Therefore, we organize and analyze the observations as monthly cross-sections, rather than as a panel of individual households.

The data includes information on monthly time series of income for sample households along with its composition by sources. In the survey, total income is recorded independently, and not just as a sum of the various components. In some cases, information on specific components or sources is not recorded in the survey, so the sum of reported components does not always equal reported total income. The sources consist of wages earned by members of the household, dividends received on equity investments, interest income from savings, business income from various sources, and various types of transfers, including provident fund withdrawals, pension payments, insurance payments, government welfare transfers and miscellaneous private transfers such as gifts. We analyze the data by combining the different sources into four major income categories. Firstly, we consider wages and secondly, estimate the impact on business income, defined as the sum of income from businesses, production activities and rental income. Business income includes income from farming, as well as other kinds of business profits (Jha and Basole 2022, pp. 8–9). In the case of business income for agricultural households, components include income from leasing out land, selling crops, animal husbandry and non-farm businesses (Ministry of Statistics and Programme Implementation 2021).Footnote 6 In non-agricultural households, based on the employment structure of the economy (Singh and Singh 2022), business income is associated with light industry, hospitality, retail and wholesale trade, and transportation. We also consider capital income and transfers, where capital income is the income from dividends and interest payments, while transfers include monetary payments to households in form of provident fund payments, insurance, government transfers, private transfers and pensions.

Since the original data are in nominal terms, we deflated all the data by the monthly national CPI.Footnote 7 We also trim our sample by 0.1% on each tail, using total income to do so, to ensure that outliers are not driving our results. We make use of the weights provided by the CPHS (Vyas 2020b) to generate population estimates for Punjab from our representative sample for the state. Household weights and household-member weights for state-level estimates assign a weight to each household and to each household member, such that each household weight represents a different number of households, and each household-member weight represents the number of people in the total population for each wave. We only use state-level household weights, since our unit of analysis is the household. This is defined as the number of households in a given wave in the Punjab sample, divided by CMIE’s projected number of households in the state. CMIE makes population projections at the town level for urban areas, and for the relevant rural regions in each district. Because they use growth rates of population observed between 2001 and 2011, the projections are quite granular, though there is a chance that they might overestimate population numbers if growth rates were declining.
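The deflation and trimming steps described above can be sketched as follows. This is a minimal illustration only: the function names, the base value of 100 for the price index, and the toy numbers are assumptions made for the example, not CPHS fields or values, and the sketch trims a single income series rather than a full household dataset.

```python
# Sketch: deflate nominal income by a CPI index, then trim 0.1% from each tail.
# All names and numbers are hypothetical; they are not CPHS variables or data.

def deflate(nominal, cpi, base=100.0):
    """Convert nominal values to real terms using a CPI index (base = 100)."""
    return [y * base / p for y, p in zip(nominal, cpi)]

def trim_tails(values, share=0.001):
    """Drop the lowest and highest `share` of observations."""
    ordered = sorted(values)
    k = int(len(ordered) * share)  # number of observations cut from each tail
    return ordered[k:len(ordered) - k] if k > 0 else ordered

nominal_income = [float(v) for v in range(1, 2001)]  # 2,000 toy observations
cpi = [125.0] * len(nominal_income)                  # flat toy price index
real_income = deflate(nominal_income, cpi)
trimmed = trim_tails(real_income)
print(len(real_income), len(trimmed))
```

With 2,000 toy observations, a 0.1% trim removes two observations from each tail.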

Two issues that arise for the CPHS data, or potentially any other COVID-related surveys, are those of selection and attrition. The basic CPHS survey methodology for capturing poorer households has been questioned (Somanchi 2021), and this problem was compounded when the lockdown was imposed, since the data collection had to shift exclusively to telephonic means. Dreze and Somanchi (2021) point to the lower response rates after lockdown, and raise questions about claims that the survey remained representative (Vyas 2020a). In particular, the concern is that poorer households would be under-represented after the lockdown. However, Vyas (2021) addresses these criticisms, arguing that the adjustment for non-response rates used is standard, and non-distorting. Furthermore, this is likely to be less of an issue in our case, since our analysis allows us to consider pre-pandemic and lockdown data in ways that mitigate concerns of under-representation of poor households in the lockdown-period data.

Specifically, we follow the procedure laid out in Vyas (2020b), and multiply state-level weights with an adjustment factor to account for non-response in the sample. The adjustment factor is defined as the probability of responding to the survey conditional on the household being selected for sample. As a result, our sample weight for a household can be defined as the probability of the survey being completed for the household.Footnote 8 In any case, we will argue that the direction of our results means that their message would be strengthened if there are biases associated with selection and/or attrition.Footnote 9

$$\begin{aligned} \text {Sample weight}= & {} P(\text {HH is surveyed}) = P(\text {Response} | \text {HH is selected}) \times P(\text {HH is selected}) \\ \text {Sample weight}= & {} \text {HH weight for state} \times \text {HH weight for non-response for state} \end{aligned}$$

Table 1 presents the summary statistics of real income for the trimmed sample. Even after trimming, as one would expect, there is considerable dispersion or inequality in the distribution of income and its components. Several of the components of income are highly concentrated, and the distributions are skewed: this is also a consideration for our estimation approach. For a large fraction of households, capital income, i.e., income from dividends on equity investment and interest on savings, is zero: indeed, capital income is received by less than a quarter of the sample. Similarly, a large portion of households did not receive any transfers, or received only small ones: the 75th percentile of the distribution of this component is Rs. 300. On the other hand, some individuals received very large transfers, reflecting the heterogeneous nature of the category, which has a somewhat bi-modal distribution as a result. For most households, therefore, total monthly income consists of wages/salaries and, to a lesser extent, business income: the latter category tends to be positive for the top half of the overall distribution, though it is much more substantial on average than capital income or transfers for those who do receive it.

Two other characteristics of the data are worth noting, beyond the summary statistics in Table 1.Footnote 10 First, households typically do not have both wage income and business income, reflecting occupational or class distinctions. For urban households, 37.18 percent had positive wage income, and 54.42 percent had positive business income, but only 4.38 percent had positive income in both categories. For rural households, there was a little more overlap, with the corresponding percentages being 48.48, 60.50 and 9.50. Second, there is clear seasonality in the income of rural households, but much less so for urban households. Moreover, this seasonality is manifested in business income, concentrated at the times of the two harvests, in April–May and October–November. More specifically, based on month-by-month averages for the sample, in the case of urban households, wage income ranges from Rs. 10,016 to Rs. 10,604, while business income is slightly more variable, ranging from Rs. 7915 to Rs. 10,440, with October, November and April being the months with highest business income. In the case of rural households, wage income is also quite steady, monthly averages ranging from Rs. 6956 to Rs. 8020. On the other hand, business income is highly seasonal for rural households: the monthly low is Rs. 2751, in February, while the high is Rs. 41,300, in April. The monthly averages are also especially high for October (Rs. 30,041) and November (Rs. 21,683), with still-high levels of around Rs. 10,000 in May, December and September. By contrast, in the remaining months, average business income is around Rs. 5000 or less.

A few other remarks on Table 1 are in order. Some categories of income are not included in the table, and total income is not always equal to the sum of income attributed to all sources—that is, the source of some part of total income may not be identified for some observations. It is also important to remember that the median and quartiles for total income and components are calculated for each individual distribution. For example, median wage income and median business income can occur in very different observations, and neither necessarily corresponds to the household observation that is the median of the total income distribution. In some cases, different income components are negatively correlated. For example, there are 46,145 observations for which both wage and business income are zero (just over 10% of the sample). Just over half of these observations correspond to retired or elderly people, who receive pensions. For this subgroup, transfers have a relatively tight distribution, with quartiles of Rs. 14,622, Rs. 18,328 and Rs. 23,367. This contrasts with the smaller transfers in the population at large, where the median is only Rs. 108, reflecting minimal transfers associated with government welfare schemes.Footnote 11 Finally, note that, for conciseness, we have combined different kinds of transfers, as for business income and capital income, which may mask some heterogeneity in their character.

Since much of our analysis compares the impacts of the lockdown across three characteristics—rural versus urban, female-headed versus not, and laborer household versus not—Table 1 is supplemented with a breakdown by each of these characteristics. Observations on female-headed households make up 17.4 percent of the sample, and their mean income is 94.5 percent of that of non-female-headed households. Rural households are 27.5 percent of the sample, and their mean income is 94.5 percent of that of urban households. In these two cases, therefore, there is relative parity in incomes. However, households with laborers are 20.6 percent of the sample, and their mean income is only 37 percent of that of households without laborers. This category will therefore be an important focus of the analysis.

2.2 Empirical strategy

The empirical strategy has two parts—the first part analyzes the impact of COVID-induced lockdowns on household-level income, using the Poisson Pseudo Maximum Likelihood (PPML) method. The use of pseudo maximum likelihood methods for the estimation of Poisson regression models has been extended beyond count data, mainly in the estimation of multiplicative models where the dependent variables are non-negative and have a skewed distribution. The consistency of the Poisson estimator is only dependent upon the correct specification of the conditional mean without specific distributional assumptions (Gourieroux et al. 1984), including the presence of heteroscedasticity (Santos Silva and Tenreyro 2006). In the second part, we use standard quantile regressions to estimate the differential effects of the lockdowns across different parts of the income distribution.

As noted earlier (Table 1), the income distributions are skewed and, in the case of individual components, have a large number of zeroes. For example, the 75th percentile for capital income is zero. While there are different possible methodologies to help deal with the large number of zeroes, we use PPML because this non-linear estimator is consistent even under the weakest assumptions.Footnote 12 This rationale holds irrespective of the presence of zeroes in the dependent variable, and, as Santos Silva and Tenreyro (2022) observe in the context of a gravity model of trade, “the PPML estimates change very little if the estimation is performed excluding the observations for which the dependent variable is zero... observations where the conditional mean is close to zero have low variance and therefore the residuals are close to zero for observations for which the value of trade is small or zero. This implies that observations for which the dependent variable is equal to zero have a very small contribution to the value of the pseudo log-likelihood function, and therefore contribute little to the estimation results.”

The PPML estimator, \({\hat{\beta }}\), is defined by Eq. (1).

(1)
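For reference, a standard statement of the PPML estimator, in the form used by Santos Silva and Tenreyro (2006), is sketched here. The notation is an assumption for illustration and may differ from Eq. (1): \(y_i\) denotes the dependent variable in levels, \(x_i\) the vector of regressors, and \(n\) the number of observations.

$$\begin{aligned} {\hat{\beta }} = \arg \max _{b} \sum _{i=1}^{n} \left[ y_{i} \, x_{i}' b - \exp (x_{i}' b) \right] , \qquad \text {with first-order condition} \quad \sum _{i=1}^{n} \left[ y_{i} - \exp (x_{i}' {\hat{\beta }}) \right] x_{i} = 0. \end{aligned}$$

Under this form the conditional mean is \(E[y_i | x_i] = \exp (x_i' \beta )\), so the coefficients act on the log of the conditional mean even though \(y_i\) enters in levels and may equal zero.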

As suggested in our discussion of the data, the nature of the survey data, even though observations are at the household level, makes it preferable to focus on district-level variation in terms of fixed effects. Districts are also a key administrative unit, and unobserved heterogeneity is naturally captured at the district level. Therefore, in all of our specifications that follow, we use district-year and district-month fixed effects, with standard errors clustered at the level of both district-years and district-months. The district-year fixed effects capture time trends as well as cyclical factors, and the district-month fixed effects capture seasonality. Since we deflate our income data, the district-year fixed effects should be picking up real growth effects: India’s growth rate was relatively high over this period, though it was declining before the pandemic hit. In any case, removing these growth effects is important for isolating the impact of the pandemic.

Equation (2) presents our base specification to estimate the impact of COVID on income, where \(Y_{ijt}\) is the log of income from different categories for household i, in district j at time t, and we include district-month and district-year fixed effects to account for geographical variation as well as the cyclical and seasonal nature of income. The log transformationFootnote 13 is an implication of the PPML method, as can be seen from Equation (1).Footnote 14 Our coefficient of interest is \(\theta _0\), which estimates the effect of the COVID-induced lockdown on (the log of) total income, wages, business income, capital income and transfers. We also perform our analysis separately for urban households and rural households, which is captured in Eq. (3). Next, Equation (4) considers the possibility that there could be differential impacts of COVID for households headed by females, and those with daily-wage laborers, using a standard difference-in-differences specification.Footnote 15

$$\begin{aligned}{} & {} Y_{ijt} = \alpha _{0} + \beta _{jt}^{Month} + \beta _{jt}^{Year} + \theta _{0} COVID_{t} + \epsilon _{ijt} \end{aligned}$$
(2)
$$\begin{aligned}{} & {} Y^{Rural/Urban}_{ijt} = \alpha _{0} + \beta _{jt}^{Month} + \beta _{jt}^{Year} + \theta _{0} COVID_{t} + \epsilon _{ijt} \end{aligned}$$
(3)
$$\begin{aligned}{} & {} Y_{ijt} = \alpha _{0} + \beta ^{Month}_{jt} + \beta ^{Year}_{jt} + \theta _{0} COVID_{t} + \gamma _{1} {\textbf{1}}_{Laborer} +\gamma _{2} {\textbf{1}}_{Female} \nonumber \\ {}{} & {} \quad +\theta _{2} COVID_{t} \times {\textbf{1}}_{Female} + \theta _{3} COVID_{t} \times {\textbf{1}}_{Laborer} + \epsilon _{ijt} \end{aligned}$$
(4)
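To make the estimation idea concrete, the following is a minimal sketch of a Poisson pseudo-maximum-likelihood fit for the simplest case: an intercept and a single 0-1 COVID dummy. It deliberately omits the district-year and district-month fixed effects, the survey weights, the interaction terms and the clustered standard errors used in the paper, and all names and numbers are hypothetical. In this stripped-down case the PPML coefficient on the dummy equals the log of the ratio of group means, and zero-income observations stay in the sample.

```python
import math

def ppml_dummy(y, d, iters=50):
    """Fit E[y | d] = exp(a + theta * d) by Poisson pseudo-ML (Newton steps)."""
    a = math.log(sum(y) / len(y))  # start at the log of the overall mean
    theta = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, di in zip(y, d):
            mu = math.exp(a + theta * di)
            g0 += yi - mu            # score with respect to a
            g1 += (yi - mu) * di     # score with respect to theta
            h00 += mu                # entries of the negative Hessian
            h01 += mu * di
            h11 += mu * di * di
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        theta += (h00 * g1 - h01 * g0) / det
    return a, theta

# Toy incomes (hypothetical): zeros are kept, not dropped.
y_pre = [0.0, 0.0, 100.0, 300.0]    # mean 100 before the lockdown
y_post = [0.0, 0.0, 120.0, 200.0]   # mean 80 after the lockdown
y = y_pre + y_post
d = [0] * len(y_pre) + [1] * len(y_post)

a_hat, theta_hat = ppml_dummy(y, d)
print(round(theta_hat, 4), round(math.exp(theta_hat) - 1, 4))
```

Here the group means are 100 and 80, so the fitted coefficient is the log of 0.8 and the implied proportional change is a 20% decline.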

While the baseline Eq. (2) estimates the average effect of the pandemic on different categories of income, we also use an alternative specification to capture time-varying effects by including monthly dummies for March through August. Each monthly dummy takes the value of 1 when income or an income component corresponds to that month, and 0 otherwise. This allows one to capture the month-by-month effects after the beginning of the pandemic on different income categories. Specifically, Eq. (5) is used to estimate these effects, where \({\textbf{1}}_{month}\) is the corresponding monthly dummy. Our coefficients of interest are \(\theta _0\), \(\theta _1\),... \(\theta _5\), where \(\theta _0\) measures the effect of the COVID pandemic and lockdown on income or its component in March, \(\theta _1\) measures the effect for April and so on.

$$\begin{aligned} \begin{aligned} Y_{ijt}&= \alpha _{0} + \beta _{jt}^{Year} + \beta _{jt}^{Month} + \theta _{0} COVID \times {\textbf{1}}_{March} + \theta _{1} COVID \times {\textbf{1}}_{April} \\&\quad + \theta _{2} COVID \times {\textbf{1}}_{May} + \theta _{3} COVID \times {\textbf{1}}_{June} + \theta _{4} COVID \times {\textbf{1}}_{July} \\&\quad + \theta _{5} COVID \times {\textbf{1}}_{August} + \epsilon _{ijt} \end{aligned} \end{aligned}$$
(5)
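The interaction terms in Eq. (5) can be built as in the sketch below. It assumes, for illustration only, that the COVID indicator equals 1 for observations dated March through August 2020 and 0 otherwise; the function and variable names are hypothetical.

```python
# Sketch: COVID x month interaction dummies for one observation.
# Assumption (not stated in this form in the text): COVID = 1 for
# March-August 2020, and 0 for all earlier observations.

MONTHS = ["March", "April", "May", "June", "July", "August"]

def covid_month_dummies(year, month):
    """Return the six COVID x month dummies for a given survey year and month."""
    covid = 1 if (year == 2020 and month in MONTHS) else 0
    return {m: covid * (1 if month == m else 0) for m in MONTHS}

april_2020 = covid_month_dummies(2020, "April")
april_2019 = covid_month_dummies(2019, "April")
print(april_2020)
print(april_2019)
```

Exactly one interaction is switched on for a lockdown-period observation, and none for the same calendar month in an earlier year, which is what lets the district-month fixed effects absorb ordinary seasonality.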

Finally, a plausible hypothesis, supported by other surveys and analyses, is that the effect of the COVID pandemic and lockdowns differed by income levels of households. To examine this hypothesis in a precise manner, we estimate quantile regressions. Quantile regressions allow us to look at the impact on households at different parts of the total income distribution. In our analysis, the estimation is performed for each decile of the income distribution, starting from the 10th percentile and going up to the 90th percentile. Equation (6) below presents our base specification for the quantile regression model, associated with quantile \(\tau \), where \(\theta _{0\tau }\) represents the effect of the pandemic for households in decile \(\tau \) of the natural logarithm of total income.Footnote 16 As in the case of the PPML estimates, we also analyze the data separately for rural and urban households.

$$\begin{aligned} Q(log(Y_{ijt}))_{\tau } = \alpha _{0\tau } + \beta _{jt\tau }^{Month} + \beta _{jt\tau }^{Year} + \theta _{0\tau } COVID_{t} + \epsilon _{ijt\tau } \end{aligned}$$
(6)

where \(Q(log(Y_{ijt}))_{\tau }\) is the \(\tau \)-th percentile.
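As a rough illustration of the quantile regression in Eq. (6), the sketch below handles the special case with only an intercept and the COVID dummy. In that case, minimizing the check (pinball) loss separates by group, so \(\theta _{0\tau }\) reduces to the difference between the post-lockdown and pre-lockdown \(\tau \)-th quantiles of log income. The fixed effects and weights are omitted, and the incomes are toy values, not CPHS data.

```python
import math

def check_loss(u, tau):
    """Check (pinball) loss for residual u at quantile tau."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def sample_quantile(values, tau):
    """Observed value that minimizes the summed check loss."""
    return min(sorted(values),
               key=lambda q: sum(check_loss(v - q, tau) for v in values))

def quantile_effect(log_pre, log_post, tau):
    """theta_tau for a model with only an intercept and a COVID dummy."""
    return sample_quantile(log_post, tau) - sample_quantile(log_pre, tau)

# Toy monthly incomes in rupees (hypothetical).
pre = [math.log(v) for v in (5000, 8000, 10000, 20000, 40000)]
post = [math.log(v) for v in (3000, 6000, 9000, 19000, 40000)]

effects = {tau: quantile_effect(pre, post, tau) for tau in (0.1, 0.5, 0.9)}
for tau, theta in effects.items():
    print(tau, round(theta, 3))
```

In this toy example the decline is largest at the bottom of the distribution and zero at the top, which is the kind of pattern the decile-by-decile estimates are designed to detect.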

Furthermore, we also allow for differential impacts on potentially vulnerable households at different points of the income distribution. Therefore, Eq. (7) is used to estimate the impact of belonging to different types of households—laborer and female-headed—at different parts of the income distribution.

$$\begin{aligned} Q(log(Y_{ijt}))_{\tau }&= \alpha _{0\tau } + \beta ^{Month}_{jt\tau } + \beta ^{Year}_{jt\tau } + \theta _{0\tau } COVID_{t} + \gamma _{1\tau } {\textbf{1}}_{Laborer} +\gamma _{2\tau } {\textbf{1}}_{Female} \nonumber \\&\quad +\theta _{2\tau } COVID_{t} \times {\textbf{1}}_{Female}+ \theta _{3\tau } COVID_{t} \times {\textbf{1}}_{Laborer} + \epsilon _{ijt\tau } \end{aligned}$$
(7)

3 Results

In this section, we first present and summarize the results for our PPML regressions, which use a method designed to cope with various distributional issues, including, but not limited to, clusters of zero observations. The second part of this section presents the quantile regression results, which allow one to see how the impacts of the lockdown differed across the income distribution (Tables 2, 3, 4).

Table 2 Summary income for female headed versus non-female headed HH
Table 3 Summary income for Rural versus Urban HH
Table 4 Summary income for laborer versus non-laborer HH
Table 5 COVID impact on income and components

3.1 PPML estimates

The top panel of Table 5 presents the baseline results from using the PPML method to estimate the effect of COVID on income and some of its components. As discussed earlier, this method deals with the skewness of the distributions, zero values for some cases of the dependent variable, and heteroscedasticity. As one would expect, the average effect of COVID and the ensuing lockdown on total income and its various components was strongly negative. On average, in the first months after the lockdown, total income decreased by 18%. Normally, the coefficient, in this case \(-\) 0.204, would be interpreted as the proportional response of income to a change in the independent variable. However, since the latter, i.e., the COVID dummy, is a 0-1 variable, it is more accurate to use the formula \(\Delta Y/Y = e^\beta - 1\). So, for example, when \(\beta \) = \(-\) 0.204, \(e^\beta - 1\) = \(-\) 0.184, or an 18% decline. This adjustment is followed in all subsequent calculations of marginal impacts of the pandemic or membership in different subgroups such as laborers.
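The conversion from a dummy-variable coefficient to a percentage change can be checked with a few lines of code. This is only a sketch of the formula \(\Delta Y/Y = e^\beta - 1\) stated above; the function name is hypothetical, and the only estimate used is the coefficient of \(-\) 0.204 quoted in the text.

```python
import math


def dummy_coefficient_to_percent_change(beta):
    """Percentage change in the outcome implied by a coefficient `beta`
    on a 0-1 dummy in a log-link or log-linear model: 100 * (exp(beta) - 1)."""
    return 100.0 * (math.exp(beta) - 1.0)


beta_total_income = -0.204  # coefficient quoted in the text for total income
exact = dummy_coefficient_to_percent_change(beta_total_income)
naive = 100.0 * beta_total_income  # reading the coefficient directly as a percentage
print(f"exact: {exact:.2f}%, naive reading: {naive:.2f}%")
```

The gap between the two readings is modest for a coefficient of this size, but it widens quickly for the larger coefficients associated with business and capital income, which is why the adjustment matters.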

Examining the impacts on components of income, there was a larger decrease in business and capital income. Business income decreased by 36% while capital income declined by 84%. However, one should be cautious in interpreting the result for capital income, given the extreme skewness of the distribution and its concentration in the upper quartile of the sample. The general impact on wages was also negative and statistically significant, but was smaller in magnitude than the decline in business income, with a point estimate of 15%. Transfers, which are also relatively small and with a skewed distribution, did not display a statistically significant decline in this baseline regression.

Since the PPML model depends on the precise nature of the conditional expectation of the error term conditioned on the dependent variable being positive (Santos Silva and Tenreyro 2006; Bellego et al. 2022), we compare the PPML results to the standard log-linear model estimates with zeros excluded. These results are in the middle panel of Table 5. For total income, where there are no zero values, the estimated impact of the lockdown is similar to the PPML case. For wages, business income and capital income, the impacts are smaller, and statistically insignificant for the case of wages. However, for transfers, which have an unusual bimodal distribution, the estimated average impact changes sign and becomes statistically significant. Intuitively, these log-linear results will underestimate the negative impacts of the pandemic and lockdown, since they exclude cases where income from a particular source was lost, even temporarily. On the other hand, if the zero observations are the result of exogenous qualification restrictions, such as being a pensioner in the case of transfers, these latter results may be more accurate. As noted, which model is preferable depends on the process that generates zero values. A further discussion of alternative methods for handling zeros is provided in the subsection on robustness checks, later in this section.
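The intuition for why dropping zeros understates income losses can be seen in a deliberately simplified setting. The sketch below assumes a model with only an intercept and a 0-1 COVID dummy (no fixed effects, unlike the regressions reported in Table 5). In that case the PPML first-order conditions set fitted means equal to group means, so the PPML coefficient is the log ratio of group means with zeros included, while the log-linear coefficient is the difference in mean log income among positive observations only. The function names and the wage figures are hypothetical.

```python
import math


def ppml_dummy_coefficient(pre, post):
    """PPML coefficient on the dummy in an intercept-plus-dummy model:
    log of the ratio of group means, with zero incomes kept in the sample."""
    mean_pre = sum(pre) / len(pre)
    mean_post = sum(post) / len(post)
    return math.log(mean_post / mean_pre)


def log_linear_dummy_coefficient(pre, post):
    """OLS coefficient on the dummy when regressing log(y) on the dummy
    after excluding zeros: difference in mean logs among positive incomes."""
    pre_logs = [math.log(y) for y in pre if y > 0]
    post_logs = [math.log(y) for y in post if y > 0]
    return sum(post_logs) / len(post_logs) - sum(pre_logs) / len(pre_logs)


# Hypothetical wage incomes: half of the households lose all wage income.
pre_wages = [100.0, 100.0, 100.0, 100.0]
post_wages = [100.0, 100.0, 0.0, 0.0]

b_ppml = ppml_dummy_coefficient(pre_wages, post_wages)
b_log = log_linear_dummy_coefficient(pre_wages, post_wages)
print(f"PPML: {100 * (math.exp(b_ppml) - 1):.1f}%")
print(f"log-linear, zeros excluded: {100 * (math.exp(b_log) - 1):.1f}%")
```

In this constructed example the households that lost all wage income simply disappear from the log-linear sample, so that estimate registers no decline at all, while the PPML estimate reflects the halving of average wage income.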

The bottom panel of Table 5 extends the basic PPML estimation to include dummies for rural households, and for rural households post-COVID. In the sample, rural households had lower total incomes pre-COVID than their urban counterparts, though business incomes were higher for rural households. The impact of the lockdowns was particularly severe for wage incomes of rural households. Though business income made up for this on average, only a small percentage of households have income from both of these categories, so the rural households that lost wage income may well have been different from those that gained business income. Some of the business income gains could have come from early attempts to stock up on supplies in anticipation of continued mobility restrictions, as documented in Singh et al. (2020).

Table 6 extends the comparison of rural and urban households from the bottom panel of Table 5 through separate estimations for the two categories. The top panel reports PPML results, and the bottom panel reports log-linear estimates. Focusing on the PPML results, from the first two columns for total income, we can see that the impacts were fairly similar for rural and urban households, and, therefore, also similar to the aggregate impact reported in the top part of Table 5. As one would expect, the differences in the coefficients match the coefficients of the dummies in the previous table (e.g., for total income the rural and urban coefficients in Table 6 differ by 0.08, versus the dummy coefficient of 0.074 in Table 5). But in the case of wages, the disaggregation leads to markedly different results from the aggregate estimates. While there is still no significant effect on wages for urban households separately, the separate estimation reveals a large and statistically significant negative impact on wage incomes of rural households: indeed, the coefficient implies a 32% decline in wage income for rural households in the sample. In the case of business income, while the coefficient suggests a decline for rural households, it is not statistically significant, so we cannot be confident about this effect. However, business income for urban households declined by approximately 50%, according to the point estimate of the regression coefficient. The results for capital income indicate that it decreased for both urban and rural households, by similar magnitudes. By contrast, the results for transfers differ between rural and urban households, with the former category showing a significant decline (24%) in this component of income, even if that was small in absolute monetary terms. The comparison of the log-linear results to the PPML results is very similar to that of the combined rural–urban estimations in the previous table.

Table 6 COVID impact on income and components: rural versus urban
Table 7 Monthly impact of COVID on income and components

The nationwide lockdown was imposed in late March, with a few hours’ notice, and by all accounts it was immediately severe and effective in many places. However, restrictions began to be relaxed in May, with a gradual easing in the following two months, but with some tightening again in August, as cases began to rise again. The estimations in Tables 5 and 6 do not allow for the changing severity of the lockdown. Therefore, to investigate the pattern of impacts over time, Table 7 presents the PPML results for rural and urban households again, but now with a separate dummy for each month of the initial lockdown, rather than a single post-COVID dummy. District-month fixed effects continue to control for seasonality in these regressions, along with district-year fixed effects to control for other temporal variations. Comparing these results with the “average” impacts in Table 6, we see that the impacts on rural and urban households had very different trajectories over time. Putting aside the impacts in March, since the lockdown only came in that month’s final week, Table 7 shows that the decline in total income was considerable for urban households in April and May, with no statistically significant impact on rural households in those two months. However, in the following three months, while urban households’ incomes began to recover, rural households experienced significant income declines, even as the lockdown began to be eased.

Examining the components of income, additional differences emerge from these regression results. The wage income of rural households fell significantly in April and May, but began to recover thereafter. The wage income of urban households showed a similar pattern, but with smaller initial declines for April and May, and a complete recovery thereafter. The stark difference in the time pattern of income declines between rural and urban households is therefore traceable to business income, where declines increased after May for rural households, while being partially reversed for urban households. While capital income and transfers are of much smaller magnitude, and more concentrated, they also display patterns of larger and more persistent declines for rural households, reinforcing the picture that the shock to the economy took longer to affect rural households, but that these households had a more difficult path to recovery. In the case of transfers, while pensions and government welfare transfers were likely unaffected, private remittances (including from abroad) might have been disrupted by the pandemic. While we do not report the log-linear results here in the interests of brevity, the comparison is similar to the previous ones, with a similar underlying intuition: estimated negative impacts are smaller for wages, business income and capital income, while the sign of the estimated impacts is mostly reversed in the case of transfers.

Table 8 Impact on income and components by household type

Next, we explore the impact on categories of households that were potentially more vulnerable to the lockdown, namely, those with laborers in the household, and those headed by females. We begin with the basic pre- and post-COVID specification, before tracing the monthly pattern of impacts. Table 8 presents estimates after adding dummies for female-headed households and for households with a member who is a daily-wage laborer or in another marginal occupation. Furthermore, these dummies are interacted with the COVID dummy variable, to identify the differential impacts of the pandemic and lockdown on these two categories of household as a difference in differences. As before, we estimate separate regressions for rural and urban households, and we examine total income as well as the four components defined earlier. Even before the pandemic, total incomes and most of the components were lower on average for both types of potentially vulnerable households, but much more strongly for those with laborers, with wage income a partial exception to this relative ranking. For laborer households in rural areas, the pandemic and lockdown had a strong additional impact as compared to other households, with a decline of 19% in total income beyond that experienced by non-laborer, non-female-headed households. However, for urban laborer households, while the additional impact on total income was negative, the point estimate is small and insignificant. For this category, the decline in wages was large and significant: an additional 64%. Interestingly, business income increased for both rural and urban laborer households, albeit from a low base. This suggests that laborer households responded to the loss of wages in the lockdown with income from business activities, though we do not have information on the exact nature of these alternatives.Footnote 17 Interestingly, transfers, which were much lower for these households before the pandemic, increased more than for other households after the lockdown, which does suggest that there was successful targeting of crisis-induced income support. By contrast, female-headed households were not relatively as poor before the pandemic and lockdown, but urban female-headed households did suffer additional negative impacts. While the average decline was small in magnitude, it was large relative to the initial disparity. Specifically, for these households, an initial income disadvantage of 5% was compounded by a further 8% after the lockdown. We estimated these regressions with the log-linear specification and zeros excluded, and the comparison mostly follows the earlier patterns, so the results are again omitted for brevity.Footnote 18

Table 9 Impact on income by month and household type

Table 9 continues the analysis of differential impacts for the two types of households, and breaks down the results for laborer households and female-headed households by month, in this case focusing on total income. Including monthly post-lockdown dummies as in Table 7, and interacting them with the household-type dummies, creates some problems for the PPML procedure. Therefore, we adopt an alternative approach, and estimate a separate regression for each post-lockdown month, thereby benchmarking the impacts against pre-pandemic data for that month alone. This alternative benchmarking has some advantages with respect to dealing with seasonality, so it should not be viewed as an inferior approach to estimating monthly impacts. Compared to Table 8, the month-by-month impact on female-headed households is somewhat more visible than in the regressions that estimate average impacts over time. In the case of laborer households, there appears to have been a rapid recovery in total incomes, once lockdown measures were eased, the process of which began in mid-May and continued in phases thereafter. Interestingly, these regressions also reveal more clearly the seasonality of the income of rural laborer households, since the baseline coefficients are much lower in April and May than in the subsequent three months. This implies that the immediate post-lockdown income reductions for these households were on an already-low base. These kinds of considerations were perhaps not factored into the manner in which the lockdown was implemented, although one has to recall the urgencies and uncertainties at the time.Footnote 19
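The month-by-month benchmarking can be illustrated with a minimal sketch. It assumes, for simplicity, that each monthly regression contains only an intercept and a post-lockdown dummy, in which case the PPML coefficient reduces to the log ratio of mean income in the post-lockdown month to mean income in the same calendar month of earlier years. The household-type dummies, interactions and fixed effects of the actual regressions are omitted, and the function name and all figures are hypothetical.

```python
import math
from collections import defaultdict


def monthly_benchmarked_effects(records, covid_year, post_months):
    """Estimate one effect per post-lockdown month.

    `records` is an iterable of (year, month, income) tuples. Each month in
    `covid_year` is compared only with the same calendar month in earlier
    years, so seasonal differences in the baseline do not contaminate the
    comparison.
    """
    cells = defaultdict(list)
    for year, month, income in records:
        if year == covid_year:
            cells[(month, "post")].append(income)
        elif year < covid_year:
            cells[(month, "pre")].append(income)

    effects = {}
    for month in post_months:
        pre = cells[(month, "pre")]
        post = cells[(month, "post")]
        effects[month] = math.log((sum(post) / len(post)) / (sum(pre) / len(pre)))
    return effects


# Hypothetical data with a seasonal baseline that is lower in April than in July.
records = [
    (2018, 4, 100.0), (2019, 4, 100.0), (2020, 4, 60.0),
    (2018, 7, 150.0), (2019, 7, 150.0), (2020, 7, 150.0),
]
effects = monthly_benchmarked_effects(records, covid_year=2020, post_months=[4, 7])
for month, coefficient in effects.items():
    print(f"month {month}: {100 * (math.exp(coefficient) - 1):.1f}%")
```

Because each month has its own pre-pandemic benchmark, a low April baseline and a high July baseline are handled separately rather than being averaged together.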

3.2 Quantile regressions

The analysis for potentially vulnerable households suggests that the impacts of the lockdown were more severe at the lower end of the income distribution: for example, laborer households, already with much lower incomes on average, were impacted more severely than female-headed households or households without these demographic or occupational characteristics. This conjecture is investigated and confirmed by quantile regressions. Results for total income by decile, using quantile regressions, are in Table 10. The decline in income for the lowest decile after the lockdown was 50%, and this negative impact was lower as one moves up the income distribution. In the middle of the distribution, the average decline was 29%, and there was actually a small increase in incomes at the top of the distribution, though this was only in the case of rural households. Urban households in the lower deciles of the income distribution were relatively worse affected than their rural counterparts, although one should note that deciles can represent different levels for the two sub-populations.

Table 10 Quantile regressions—total income, rural and urban

Table 11 adds the two household-type dummies and interactions with the post-COVID dummy to the quantile regression for all households, as reported in Table 10. The reported coefficients are those on the post-COVID dummy and its interactions, capturing impacts on total income at each decile. We see that the general pattern of decline in income across the income distribution is similar to the previous regressions. For female-headed households, their pre-COVID income shortfall compared to general households was approximately the same across the income distribution. The pandemic and lockdown did not result in any significant decline in their incomes, except at the very top of the distribution, where it was about 7%. On the other hand, laborer households were relatively much poorer than the general population in the upper deciles of the income distribution. However, the decline in income for these households was much greater at the lower end of the income distribution. In fact, a laborer household in the bottom decile of the income distribution would have seen a decline of 67% in total income as a result of the lockdown, with almost half of this decline being associated with their laborer-household status.
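To make the arithmetic behind such statements explicit, the coefficients of Eq. (7) can be combined as follows. This is a sketch that relies only on the structure of Eq. (7) and on the fact that quantiles are preserved under monotone transformations, so that exponentiating a quantile of log income gives the corresponding quantile of income; no estimates beyond those in Table 11 are involved.

$$\begin{aligned} \text {Non-laborer, non-female-headed:}\quad&\frac{\Delta Q_{\tau }(Y)}{Q_{\tau }(Y)} = e^{\theta _{0\tau }} - 1 \\ \text {Laborer household:}\quad&\frac{\Delta Q_{\tau }(Y)}{Q_{\tau }(Y)} = e^{\theta _{0\tau } + \theta _{3\tau }} - 1 \\ \text {Female-headed household:}\quad&\frac{\Delta Q_{\tau }(Y)}{Q_{\tau }(Y)} = e^{\theta _{0\tau } + \theta _{2\tau }} - 1 \end{aligned}$$

Because the interaction terms are additive in logs, the part of the decline associated with household type compounds multiplicatively with the general decline, rather than adding to it in percentage points.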

Table 11 Quantile regressions and household type—all

Tables 12 and 13 disaggregate the analysis of Table 11 into separate quantile regressions for rural and urban households. It is particularly noteworthy that the relative income of laborer households across the income distribution is very different for rural and urban areas. Rural laborer households at the lower end of the income distribution were actually better off than non-laborer households in the same deciles, just the opposite of what was true for urban households with daily laborers. This is consistent with the occupational makeup of these households being different in rural versus urban areas, the former including marginal farmers in the category, for example. Again comparing rural and urban households across Tables 12 and 13, this difference in relative income over the different deciles of the distribution is also true, to some extent, of female-headed households. The results show very clearly that the lockdown had its most severe impact on the incomes of rural laborer households at the bottom of the income distribution, more than wiping out any pre-pandemic advantage over non-laborer households. For female-headed households, negative impacts occurred only at the upper end of the income distribution for those in urban areas, but the magnitude of these impacts, in percentage terms, was much lower than what was endured by poorer laborer households, both rural and urban.

Table 12 Quantile regressions and household type—rural
Table 13 Quantile regressions and household type—urban

3.3 Robustness checks

As discussed earlier in the section, we performed robustness checks for the PPML approach by excluding observations with zero values for the dependent variable and estimating the corresponding log-linear models. The results of these were incorporated in Tables 5 and 6, and briefly discussed in the context of Tables 7 and 8, though the latter results are not reported in detail in the paper. The results were qualitatively similar, and are available from the authors. Specifically, in the aggregate, the COVID impacts on total income, wage income, business income and capital income were almost all qualitatively similar when we excluded the zeros. The results for transfer income were somewhat different because of the unusual distribution of transfers, and the exogenous factors determining transfers for pensioners. Similar patterns were also observed in the case of the differential effects of COVID on the incomes of rural versus urban households, month-wise impacts, and female-headed or laborer households, with the exception again of transfers. As another alternative to PPML, instead of excluding zeros, we also estimated a two-part model. In the first stage, a logit model is estimated to create predicted values for the dependent variable, and these are used in a second-stage regression. For the basic estimation for the whole sample, the results were not qualitatively different. Specifically, the estimates for the decline in the two main income components, wages and business income, had magnitudes in between those obtained from PPML and those from excluding observations with zeros. The estimates for capital income and transfers did not fit this pattern, being smaller in magnitude for the former, and larger in magnitude for the latter. However, estimations for other specifications and subsamples had convergence problems, which limited the use of this alternative estimation approach.Footnote 20
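For readers unfamiliar with two-part models, the following sketch shows the textbook version of the idea in the simplest possible setting. It is an assumption-laden illustration, not the specification estimated for this paper: with only a 0-1 COVID dummy as regressor, the fitted probabilities from a first-stage logit equal the share of households with positive income in each period, and the second part is taken here to be the mean income among those with positive income. The function names and figures are hypothetical.

```python
import math


def two_part_components(incomes):
    """Return (share with positive income, mean income among positives,
    implied expected income) for one group of households."""
    positives = [y for y in incomes if y > 0]
    share_positive = len(positives) / len(incomes)
    mean_if_positive = sum(positives) / len(positives)
    return share_positive, mean_if_positive, share_positive * mean_if_positive


def two_part_log_changes(pre, post):
    """Split the log change in expected income into an extensive margin
    (having any income from the source) and an intensive margin
    (amount received, conditional on having some)."""
    p0, m0, e0 = two_part_components(pre)
    p1, m1, e1 = two_part_components(post)
    return {
        "extensive": math.log(p1 / p0),
        "intensive": math.log(m1 / m0),
        "total": math.log(e1 / e0),
    }


# Hypothetical business incomes before and after the lockdown.
pre_business = [100.0, 200.0, 300.0, 0.0]
post_business = [80.0, 160.0, 0.0, 0.0]
print(two_part_log_changes(pre_business, post_business))
```

The decomposition makes clear why results can differ across methods: excluding zeros captures only the intensive margin, whereas approaches that keep the zeros also reflect households losing an income source entirely.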

We also estimated the various specifications and subsamples for data from the neighboring state of Haryana.Footnote 21 This is not strictly a robustness check, but indicates the generalizability of our conclusions. Almost all the results were qualitatively similar, with the difference arising in the case of capital income (especially dividend income) and transfer income (chiefly private transfers in this case). For both these categories, the Haryana results indicated impacts in the opposite direction to those for Punjab. One possible explanation is that Haryana borders the National Capital Region (NCR) of Delhi, and includes the information technology hub of Gurugram (formerly Gurgaon). The presence of these very high-income pockets in Haryana could be influencing the specific difference in results for Haryana. Much of the rest of Haryana’s economic structure is similar to that of Punjab, and the otherwise similar results suggest at least some connection between economic structure and lockdown impacts. The effect of COVID on total income, wages and business income was more pronounced in Haryana than in Punjab. In the aggregate, the impacts were somewhat greater for Haryana, but, with the exception of capital income and transfers, rural and urban household impacts were similar in the two states. The monthly patterns of impacts were also mostly similar. One difference between the two states was that laborer households in Punjab were more badly affected than in Haryana, with urban laborer households in Haryana not doing worse than general households. Again, this could reflect the more diverse economic structure of Haryana, or its border with the NCR, which could have provided alternative income sources not available in Punjab.

4 Discussion

Punjab is a compact, relatively homogeneous state. It has relatively weak local governments (as compared to some other states such as Kerala), and district-level bureaucrats of the Indian Administrative Service, the elite Indian bureaucratic cadre, play an important role in local administration. This was especially true during the pandemic, and—based on monitoring daily reports and other documents—the state-level bureaucracy ensured a uniformly strict lockdown across the state, as well as a carefully measured relaxation, reversed at some points when cases swelled. Punjab’s incidence of COVID cases was in the middle of the distribution of major Indian states, but its mortality rate was on the high end, reflecting pre-existing conditions such as diabetes and cardiac disease. One of the main features of Punjab’s economy is its importance in the national food procurement system, especially for wheat, but also for rice. The initial lockdown occurred when the spring wheat harvest was near, and the state government embarked on a major effort to overcome the sudden loss of migrant labor by helping with the coordination and movement of harvesting machines, and the relaxation of curfews to permit harvesting (Vatta et al. 2020), though there were some disruptions of marketing and procurement. While this effort reduced initial impacts on the rural economy, disruptions arising from the lack of labor, and responses from farmers to suppress wages in this situation (Kaur and Kaur 2020) began to take a toll in subsequent months.

Our results are quite striking in several dimensions, and consistent with the above account of the progression of the Punjab economy in the first months of the lockdown. In our estimates, the impact of the lockdown is immediately visible on the total incomes of households in the month following the lockdown. The average impact was similar for rural and for urban households, but rural households experienced these initial impacts as a decline in wage income, whereas urban households experienced the effect of the lockdown as a decline in business income. Rural households also experienced a decline in transfer payments, though this was a relatively small figure on average. The timing of the impact was also very different for rural and urban households. The latter experienced a steep initial decline in income after the lockdown, followed by a gradual recovery as the economy reopened.Footnote 22 However, rural household incomes, after initial stability, declined even as the economy reopened, without a pattern of recovery. This is possible evidence of a kind of "scarring" that has been a matter of concern: persistent effects of the lockdowns on livelihoods, even after restrictions are relaxed. In the case of Punjab, this may reflect the fact that, despite the importance of government-procured wheat, and the focus of the state government on protecting that part of the economy, the rural economy is much broader, and government protection in other markets was difficult to provide. These observations are also consistent with the analysis of wheat farmers in the neighboring state of Haryana by Ceballos et al. (2020) and Ceballos et al. (2021), as well as the various local or geographically-focused studies of impacts on agriculture listed in the introduction.

The pattern of recovery of urban versus rural incomes may also be related to the manner in which the economy opened up. One possible measure of how this reopening occurred is the pattern of return migration. After the lockdown was eased, the Punjab government began collecting daily data by district on the number of people entering Punjab.Footnote 23 We performed the following calculation to examine the relationship between where travellers to Punjab were registering, and the urbanization of that district. The cumulative number of incoming travellers to each district, as of July 31, 2020, was transformed to a percentage of the district’s population. The correlation of this percentage with the percentage urbanization of districts was 0.625, which is at least consistent with a hypothesis that urban areas were recovering faster.Footnote 24
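The calculation described above can be sketched as follows. The district names and all figures in this example are invented purely for illustration (the actual district data are not reproduced here), and the function names are hypothetical; only the two steps, conversion to a percentage of population and a Pearson correlation, follow the description in the text.

```python
import math


def arrivals_as_percent_of_population(cumulative_arrivals, population):
    """Cumulative incoming travellers expressed as a percentage of district population."""
    return 100.0 * cumulative_arrivals / population


def pearson_correlation(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented districts: (name, cumulative arrivals, population, percent urban).
districts = [
    ("District A", 12000, 1_000_000, 60.0),
    ("District B", 4000, 800_000, 30.0),
    ("District C", 9000, 1_500_000, 45.0),
    ("District D", 1500, 500_000, 20.0),
]

arrival_shares = [arrivals_as_percent_of_population(a, p) for _, a, p, _ in districts]
urban_shares = [u for _, _, _, u in districts]
print(round(pearson_correlation(arrival_shares, urban_shares), 3))
```

A correlation of this kind is descriptive only: it indicates whether more urbanized districts registered proportionately more arrivals, without establishing why.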

While female-headed households and households with daily laborers both have lower incomes than the general population, with the difference especially large for the latter category, female-headed households did not suffer as much from the lockdown as did households with laborers. Urban laborer households experienced smaller declines in income than their rural counterparts in the first two months of the lockdown, but the latter also saw an increase in income in subsequent months, suggesting that there was some substitution across time in the rural economy. Urban laborer households also received higher transfer income after the lockdown, suggesting some successful targeting—although these transfers were by no means large enough to make up for income loss from other sources. Note that this time pattern of impacts for rural laborer households was quite different from the overall time pattern for all rural households taken together, as discussed in the beginning of this section. The time pattern of impacts on wage and business income also differed between rural and urban households. For example, wage income in urban households recovered quickly as the economy reopened, whereas business income in rural households continued to suffer declines relative to the pre-pandemic period. These findings of differing time patterns of impact on income components of rural and urban households, as well as differences for subcategories based on household characteristics within each broader category, have not been highlighted in the literature, although we should acknowledge that the rural economy of Punjab has a somewhat distinctive structure within India. At least for the Punjab sample, the main differential impact was for laborer households, and this characteristic is what dominated for female-headed households, rather than any additional gender effect. However, this result does not generalize to other parts of India.

The results of the quantile regressions are in concordance with what other, purely post-COVID, surveys have indicated: the impacts of the lockdown were proportionately much more severe at the bottom of the income distribution. In our data, this skewing of negative impacts toward the poor was exacerbated in the case of laborer households, though there was no such additional effect for female-headed households. In fact, for urban female-headed households, there were significant differential negative effects of the pandemic and lockdown, but only at the upper end of the income distribution, possibly reflecting the kind of job loss discussed in studies focused on gender (Abraham et al. 2022; Afridi et al. 2021; Deshpande 2020).

Several of the other studies cited in our introduction (e.g., Kesar et al. 2021), using CPHS data or specially commissioned surveys, have examined job loss in particular, as well as occupational switching. Some have also examined differential impacts by age, gender and caste. Our analysis complements such studies, focusing on total income and its main sources. Changes in the latter are indicative of short-term adjustments of employment, particularly the substitution of business income (presumably from the informal sector) for wage income among laborer households. Our results on the severe income losses of daily laborers (households in our case, individuals for other studies) are consistent with those of Gupta et al. (2021), though we are able to distinguish more finely between rural and urban households, and to examine the differential impacts over the income distribution through quantile regressions. By contrast to Gupta et al. (2021), we do not find significant income gains for higher-income urban households, though this may reflect the fact that our focus on Punjab excludes all of India’s major metropolitan cities, and even the state capital of Chandigarh, which is a centrally-administered Union Territory, whereas theirs is an all-India analysis. Individuals who benefited from the situation were more likely to be located in such better-off urban areas.

The fact that Punjab is relatively homogeneous, economically and geographically, is a positive for an analysis conducted at the state-level, in that it reduces the impact of unobserved or unmeasured differences in economic structures within the sample. Of course, we allowed for district-level fixed effects, but a study of their magnitudes did not indicate any systematic reasons for differences across districts. Furthermore, our review of district-level lockdown orders and implementation information for Punjab suggests that the state government was able to impose restrictions quite uniformly across the state, even as the number of COVID-19 cases and their spread differed across districts and over time.Footnote 25 Hence, our results arguably provide an accurate picture of what the state’s residents experienced during the initial lockdown and its gradual relaxation.

Finally, we offer some thoughts on the broader implications of our analysis and results. We have found that rural laborer households were the worst affected in Punjab by the lockdown and disruptions of the economy. This result is, in a sense, no different than results for other states and local regions in India. What is noteworthy is that Punjab has a relatively prosperous agricultural economy, with significant government procurement of wheat and rice, by far the state’s two main crops.Footnote 26 Punjab has relatively few marginal farmers, women-run farms, or farms operated by members of the Scheduled Castes (Ministry of Agriculture and Farmers Welfare 2019). Its average farm sizes are considerably greater than the national average. But even in this situation, there was a segment of the population, rural laborer households, that was especially vulnerable, as our results show. In that sense, the message of our results is that social protection is not just a concern for poorer states, or ones without good infrastructure for access to social services. The comparison with Haryana, in terms of broad similarities but some differences in the impacts on laborer households, is consistent with this perspective. Here, one can emphasize that concerns about under-representation of poor or migrant households in the survey, and differential attrition in sampling post-COVID, only strengthen our results, since they would imply underestimation of impacts, even in a relatively favorable situation. We do not have data on caste composition, but it is plausible that these laborer households are disproportionately from lower castes, since this aspect of caste is a nationwide characteristic.Footnote 27

5 Conclusion

Our analysis contributes to an important literature on the impacts of the sudden and severe lockdown that the Indian government imposed once the COVID-19 pandemic became an obvious threat to public health. Many studies have documented the negative impacts on the population at large, and on specific categories of the population, especially women and the poor. Our results add to this evidence, and quantify the time pattern of impacts over the period immediately following the lockdown. We also identify differences in impacts between rural and urban households, and for households with laborers, female-headed households, and poorer households. We perform the analysis using a relatively robust dataset, with a longer period of benchmarking against pre-pandemic data. We innovate on the existing literature by using Poisson Pseudo Maximum Likelihood estimation to deal with large numbers of zero observations and other distributional issues, and by using quantile regressions to examine differential impacts at different parts of the income distribution.

While we concentrate on the state of Punjab, these data are available for other Indian states, and have been used for national-level analyses as well. However, the major states of India are comparable in population to medium-sized or large countries, and are amenable to individual analysis. Arguably, there is enough heterogeneity across India’s states to warrant separate analyses. Our results for Haryana, which shares many aspects of economic structure with Punjab, but also differs in per capita income, thanks to its proximity to the NCR and the presence of a sizeable information technology sector in Gurugram, suggest that pairwise comparisons can be useful. But even single-state analyses are of value. In particular, while India’s lockdown was a national policy, implementation relied heavily on state governments, and state-by-state analysis is potentially more useful for policymakers on the ground in managing pandemic-induced disruptions in economic activity, especially if they recur at different times in different states or regions.