## Abstract

We investigate how the social welfare gain of subsidizing work-related goods depends on whether the underlying income tax system is linear, piecewise linear or fully nonlinear, focusing on child care services as a paradigmatic example of goods/services that are complements with labor supply. Our quantitative analysis employs an empirically relevant labor supply model and shows that the welfare gain of an optimally chosen subsidy is negligible when the optimal income tax is restricted to be linear but about the same as under fully nonlinear taxation when the optimal income tax is restricted to be piecewise linear. Our findings enhance the policy relevance of the optimal tax argument in favor of providing subsidies to work-related goods and also shed light on the relative welfare gains of employing piecewise linear rather than fully nonlinear income taxes.

## Introduction

According to the seminal contribution of Atkinson and Stiglitz (1976), if the income tax is allowed to be optimally nonlinear, commodity taxes are a redundant policy instrument when preferences are separable between leisure and other goods and individuals only differ in their innate market ability (skill). If the separability condition is not satisfied and the desired direction of redistribution goes from higher- to lower-skilled agents, one should use commodity taxes and subsidies to discourage the consumption of goods/services that are substitutes with labor supply and encourage the consumption of goods/services that are complements with labor supply.^{Footnote 1}

There is now a sizable literature emphasizing that, in Mirrleesian income tax settings, subsidizing (or publicly providing) goods or services that are consumed in conjunction with labor supply can be welfare-enhancing. The argument is that, by subsidizing (or subjecting to a relatively more lenient tax treatment) the purchase of goods or services that are complements with labor supply, one can slacken the binding incentive constraints faced by a government designing a nonlinear income tax for redistributive purposes.^{Footnote 2} These constraints arise because the government does not directly observe an individual’s earning ability (skill), and therefore cannot levy taxes or transfers that are directly conditioned on innate ability. Instead, the government pursues its redistributive goals by means of an anonymous nonlinear income tax schedule, i.e., it designs a menu of combinations of gross incomes and taxes, and hence disposable income, and lets agents choose their preferred income point. To achieve redistribution, taxes on low-income earners must be lower than on high-income earners, and conceivably negative. High-skilled agents might then find it attractive to mimic low-skilled agents by lowering their labor supply and earning an income qualifying for a lower tax burden. The tax schedule must then be designed in such a way that mimicking is deterred or, in other words, that it satisfies the incentive-compatibility constraints (self-selection constraints) requiring that each agent has no incentive to choose a point other than the one intended for his/her skill type on the income tax schedule set by the government.

One question that has so far been neglected in the literature is whether the welfare-enhancing effects from subsidizing work-related goods/services hinge on the ability of the government to optimize a fully nonlinear income tax, or whether similar welfare gains can be obtained in settings where the government relies on less sophisticated income tax systems like the ones that we typically observe in real economies.^{Footnote 3} The aim of this paper is to fill this gap in the literature by focusing on child care services as a paradigmatic example of goods/services that are complements with labor supply.^{Footnote 4} Besides the fact that real-world governments do not typically tax income on a fully nonlinear scale, one reason why the question that we address is interesting is that it is in principle ambiguous whether the effectiveness of subsidies to work-related goods as a welfare-enhancing instrument is increasing or decreasing in the degree of sophistication of the underlying income tax schedule. On the one hand, as the income tax becomes more sophisticated, social welfare increases and it becomes more difficult to reap welfare gains by supplementing income taxation with other policy instruments. On the other hand, as the income tax becomes more sophisticated (flexible), it becomes easier for the government to offset, for each agent, the distortionary effects generated by subsidizing the work-related good.

We employ a fairly canonical model where a subset of agents need child care services in order to work, and evaluate the welfare gains from subsidizing this work-related good under income tax schedules that exhibit different degrees of sophistication. In particular, we consider (1) a linear tax system, (2) a two-bracket piecewise linear income tax, (3) a four-bracket piecewise linear income tax, and (4) a fully nonlinear income tax.^{Footnote 5} We partition the population into “users” (parents) and “nonusers” (non-parents) of the work-related good, and based on the assumption that the attribute (parental status) identifying “users” is publicly observable, we allow the government to select different tax schedules for each group. This allows us to see how the presence of a need for a work-related consumption good affects optimal marginal income taxes, and how the optimal tax schedules change as a subsidy for the work-related good is introduced.^{Footnote 6} To simplify the analysis, we focus on the case where for each unit of labor supply, one unit of the work-related good needs to be acquired by “users.” In other words, we assume that parents need one hour of child care services for every hour of market work.

Our contribution is mainly quantitative. We present numerical simulations comparing the welfare gains of subsidies for work-related goods under different income tax systems. We characterize individual behavior based on an empirically relevant labor supply model, and we use administrative wage data from Sweden as our source of taxpayer heterogeneity.^{Footnote 7} The labor supply model adopts a quadratic utility function specification, inspired by the contributions by Stern (1986) and Tuomala (2010). This utility specification produces realistic labor supply behavior and is also computationally convenient as it admits a closed-form solution for the labor supply choice subject to a linear budget segment. This is especially practical when the government is optimizing piecewise linear tax schedules and the optimal choice of each individual needs to be calculated repeatedly, as in the nonlinear budget set procedure of Hausman (1979).^{Footnote 8} To compute the optimal fully nonlinear tax systems, we employ a specification with a large number of discrete types, following the simulation approach outlined in Bastani (2015), which also enables us to use exactly the same representation of the wage distribution in all our simulations.

Our results indicate that the effectiveness of subsidies to work-related goods as a welfare-enhancing instrument is increasing in the degree of sophistication of the underlying income tax schedule. While under a linear income tax, the magnitude of the welfare gains obtained by subsidizing the work-related good is negligible, the welfare gains that can be achieved by subsidizing the work-related good under a piecewise linear income tax amount to about the same as the gains that can be achieved by subsidizing the work-related good under an optimal nonlinear income tax. This finding enhances the policy relevance of the optimal tax argument in favor of providing subsidies to work-related goods. The optimal value of the subsidy rate on the work-related good is in general quite large and is also (weakly) increasing in the degree of sophistication of the underlying income tax schedule. Finally, our results also shed light on the relative welfare gains of employing piecewise linear rather than fully nonlinear income taxes, showing that piecewise linear taxes are able to reap a major part of the welfare gains associated with fully nonlinear income taxes.

The paper is organized as follows. In Sect. 2, we outline the nonlinear income tax problem, which serves as our theoretical benchmark. In Sect. 3, we present the government’s problem in the case of linear and piecewise linear tax structures. Section 4 describes our empirical calibration as well as the linear and piecewise linear optimal tax problems. Section 5 describes our results, and finally, Sect. 6 concludes.

## The nonlinear income tax problem

Consider a setting where agents differ in terms of their labor productivity (wage rates) and their need for a work-related good (child care services). Those who need child care services in order to work are for simplicity labeled “parents” and those who do not need child care services are labeled “non-parents.”^{Footnote 9}

We let *Y* denote before-tax labor income, given by the product of an agent’s wage rate *w* and labor supply *h*. We also make the standard assumption that the policy maker can observe *Y* but not *w* or *h* separately. This rules out first-best personalized lump-sum taxes and transfers but allows labor income to be taxed on a nonlinear scale. Given our focus on child care services as a primary example of a work-related good/service, and given that parental status is an individual characteristic that can reasonably be regarded as publicly observable, we also assume that parental status can be used as a tag in the optimal tax problem, i.e., parents and non-parents face two distinct nonlinear income tax schedules.

The wage rate of an agent of skill type *i* belonging to group \(j=p,np\), where *p* refers to parents and *np* refers to non-parents, is denoted by \( w^{i,j}\). Without loss of generality, we assume that agents are ordered in such a way that \(w^{1,j}<w^{2,j}<\cdots <w^{N,j}\), \(j=p,np\). The total population size is normalized to unity, and the proportion of type-*ij* agents in the population is denoted by \(\pi ^{ij}\) and is known to the government. The (exogenous) per unit resource cost of child care services (which would be the price in a competitive market) is denoted by *q*. Non-parents do not need child care services. For parents, on the other hand, the demand for child care services is strictly related to the hours of work. Assuming that every parent has only one child, for every hour of work parents need one hour of child care services.^{Footnote 10} Child care services do not enter the parents’ utility function directly; for parents, they represent a real cost of working, a good that must be acquired in order to work. Thus, in an economy without taxes and public expenditure, the opportunity cost of leisure, which governs the agents’ decisions in an undistorted optimum, is equal to \( {\overline{w}}\equiv w-q\) and *w* for, respectively, parents and non-parents. All agents have identical preferences over consumption (net of expenditures on child care) *c* and hours of work *h*; these are represented by the utility function *u*(*c*, *h*), possessing the standard properties.

### Work-related good not subsidized

Let us start with a characterization of the solution to the government’s problem when the work-related good is not subsidized. The government’s objective is to maximize a weighted sum of agents’ utilities. Based on the link between pre-tax earnings and post-tax earnings implied by the tax schedule that applies to them, agents choose labor supply to maximize their utility. This allows us to implicitly express the marginal tax rates faced by agents as \(T^{\prime }\left( Y\right) =1-\mathrm{MRS}\), where MRS denotes the marginal rate of substitution between gross labor income and consumption. Defining by \(B\equiv Y-T\left( Y\right) \) the after-tax income associated with gross labor income *Y*, the government’s problem can be equivalently stated as the problem of selecting bundles in the \(\left( Y,B\right) \)-space subject to a set of self-selection constraints and a public budget constraint. The self-selection constraints require that each agent (weakly) prefers the bundle intended for him/her rather than behaving as a mimicker by choosing a bundle intended for some other agent.

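Given that consumption is determined for parents as \(C=B-qh=B-qY/w\) and for non-parents as \(C=B\), we can define the agents’ indirect utility at any given point in the \(\left( Y,B\right) \)-space as \(V^{i,j}\left( B,Y\right) =u\left( B-\mathbf {1}[j=p]qY/w^{i,j},Y/w^{i,j}\right) \) where \(\mathbf {1} [\cdot ]\) denotes an indicator function. The slope of individuals’ indifference curves in the (*Y*, *B*)-space is given by the MRS expression:

\[
\mathrm{MRS}^{ij}\equiv -\frac{V_{Y}^{i,j}}{V_{B}^{i,j}}=\mathbf {1}[j=p]\frac{q}{w^{i,j}}-\frac{1}{w^{i,j}}\frac{u_{h}}{u_{c}},
\]

where \(u_{c}\) and \(u_{h}\) denote the partial derivatives of *u* with respect to consumption and hours of work, evaluated at \(\left( B-\mathbf {1}[j=p]qY/w^{i,j},Y/w^{i,j}\right) \).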

As can be seen from the expression above, the presence of a need for the work-related good affects the shape of the parents’ indifference curves. As a consequence, and in contrast to what happens in models where agents differ only in terms of skills, (weak) normality of *c* is no longer a sufficient condition to ensure that, at any given point in the \(\left( Y,B\right) \)-space, the indifference curves are flatter the higher the wage rate of an agent. Notice however that, although this agent-monotonicity property does not hold for the population as a whole, it still holds within each of the two groups. Thus, as we are assuming that the government is optimizing separate tax schedules for parents and non-parents, it is sufficient to restrict attention to constraints linking pairs of adjacent types when formalizing the government’s problem.^{Footnote 11}

Denote by \(\alpha ^{ij}\) the welfare weight used by the government for agents of type *ij*, with \(\sum _{ij}\alpha ^{ij}=1\). Furthermore, assume that the chosen welfare weights imply that, for each of the two tagged groups, the government wants to redistribute from higher- to lower-ability agents so that the only (potentially) binding self-selection constraints are those running downwards and linking pairs of adjacent types. Then, the problem solved by the government can be formally written as:

subject to:

and

where Lagrange multipliers are within parentheses. The first set of constraints represents the self-selection (incentive-compatibility) constraints, and the second constraint is the government’s budget constraint. Implicit in the formulation of the problem above is the idea that the possibility to tag agents based on parental status allows the government to solve two separate optimal income tax problems, one for parents and one for non-parents, with the possibility of accomplishing lump-sum inter-group transfers. Obviously, tagging is always welfare-improving compared to the case where a single tax schedule applies to the whole population. The welfare-enhancing potential of a tagging scheme derives from the fact that all self-selection constraints linking agents belonging to two separate tagged groups are eliminated.^{Footnote 12} In the above problem, this is reflected by the fact that we have written the self-selection constraints conditional on *j*.
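For reference, the program described in this subsection can be sketched as follows. The multiplier symbols \(\lambda ^{i+1,j}\) and \(\mu \), and the absence of an exogenous revenue requirement, are assumptions made for this sketch; everything else uses the notation defined above.

\[
\max _{\{B^{i,j},Y^{i,j}\}}\;\sum _{j\in \{p,np\}}\sum _{i=1}^{N}\alpha ^{ij}V^{i,j}\left( B^{i,j},Y^{i,j}\right)
\]

subject to

\[
V^{i+1,j}\left( B^{i+1,j},Y^{i+1,j}\right) \ge V^{i+1,j}\left( B^{i,j},Y^{i,j}\right) ,\quad i=1,\ldots ,N-1,\; j=p,np\qquad \left( \lambda ^{i+1,j}\right)
\]

and

\[
\sum _{j\in \{p,np\}}\sum _{i=1}^{N}\pi ^{ij}\left( Y^{i,j}-B^{i,j}\right) \ge 0\qquad \left( \mu \right) .
\]

Because the self-selection constraints are indexed by *j*, no constraint links a parent to a non-parent; the two groups interact only through the common budget constraint.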

As shown in Appendix A, manipulating the first-order conditions of the above problem, the general expression for the marginal tax rate faced by a type *i* agent, \(i\in \{1,\ldots ,N-1\}\), belonging to group \(j=p,np\) is given by:

where \({\widehat{V}}_{B}^{i+1,j}\equiv \frac{d}{dB^{i,j}} V^{i+1,j}(B^{i,j},Y^{i,j})\). Instead, for the highest skilled agent in each group, the standard no-distortion at the top result applies, i.e., for agents \((i,j)=(N,p)\) and \((i,j)=(N,np)\), \(T^{\prime ij}=0\).
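A form of expression (1) that is consistent with the definitions in this section can be sketched as follows. The symbols \(\lambda ^{i+1,j}\) (multiplier on the constraint deterring type \(i+1\) from mimicking type *i* in group *j*), \(\mu \) (multiplier on the government’s budget constraint) and \(\widehat{\mathrm{MRS}}^{i+1,j}\) (the mimicker’s marginal rate of substitution at the bundle intended for type *i*) are notational assumptions made here:

\[
T^{\prime }\left( Y^{ij}\right) =1-\mathrm{MRS}^{ij}=\frac{\lambda ^{i+1,j}\,{\widehat{V}}_{B}^{i+1,j}}{\mu \,\pi ^{ij}}\left( \mathrm{MRS}^{ij}-\widehat{\mathrm{MRS}}^{i+1,j}\right) .
\]

The sketch follows from combining the first-order conditions with respect to \(B^{i,j}\) and \(Y^{i,j}\) of a program that maximizes a weighted sum of utilities subject to downward adjacent self-selection constraints and a budget constraint. The welfare weights drop out, and the right-hand side vanishes when no agent is tempted to mimic type *i*, which is the case for the top type in each group.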

The result provided by (1) is a standard one in the optimal tax literature, and we do not discuss it at length. It states that the only reason to distort agents’ (labor supply) behavior is the presence of binding self-selection constraints. Moreover, given that the agent-monotonicity property holds within each of the two tagged groups, (1) implies that the labor supply of all agents, except the highest skilled within each group, is distorted downwards (\(T^{\prime }\left( Y^{ij}\right) >0\) for \( i\in \{1,\ldots ,N-1\}\) and \(j=p,np\)).

Let us now consider how the government’s problem is modified when nonlinear income taxation is supplemented by a child care subsidy.

### Work-related good subsidized

As in our model child care services enter the individual decision problem of parents as a ‘needs constraint’ and are not subject to a separate individual choice, it is straightforward to show that the optimal child care subsidy is 100% when two separate nonlinear tax schedules apply to parents and non-parents. To provide an intuition for this result, suppose that a fully separating equilibrium with \(Y^{1,p}<\cdots <Y^{N,p}\) is achieved as a solution to the government’s problem described in the previous subsection. To show that a Pareto-improvement can be obtained by supplementing income taxation with a child care subsidy, consider the following tax reform. Denote, respectively, by \(\left( Y^{*j,p},B^{*j,p}\right) \) and \(\left( Y^{*j,np},B^{*j,np}\right) \) the bundle offered to parents and non-parents of skill type \(j=1,...,N\) at the solution to the problem where \( s=0\) (i.e., the problem described in the previous subsection). Now introduce a child care subsidy at rate \(s\in (0,1]\) and, while leaving unchanged the set of bundles \(\left( Y^{*j,np},B^{*j,np}\right) \) offered to non-parents, change the set of bundles for parents by offering the following packages: \(\left( Y^{*1,p},B^{*1,p}-sqY^{*1,p}/w^{1,p}\right) \) ,..., \(\left( Y^{*N,p},B^{*N,p}-sqY^{*N,p}/w^{N,p}\right) \).

Notice that, by keeping their labor supply after the reform at the original pre-reform level, the utility of all agents would be unaffected and the government’s budget constraint would still be satisfied since the income tax payment of each type of parents has been increased just enough to cover the cost of the subsidy that they receive (\(sqY^{*j,p}/w^{j,p}\) for \( j=1,\ldots ,N \)). The only effects of the reform that are left to evaluate are those on the binding self-selection constraints.

Regarding this, no effects whatsoever are generated on the self-selection constraints that are relevant in the design of the nonlinear income tax faced by non-parents.^{Footnote 13} Consider now the self-selection constraints requiring higher-ability parents to be prevented from mimicking lower-ability parents. After implementation of the proposed reform, the consumption that a parent of skill type *j* can get by mimicking a parent of skill type \(j-1\) is now lower (by the amount \(sq\left[ \left( Y^{*j-1,p}/w^{j-1,p}\right) -\left( Y^{*j-1,p}/w^{j,p}\right) \right] \)) than before the reform, whereas the labor effort that he/she has to exert has not changed. We can therefore conclude that a child care subsidy is an unambiguously welfare-enhancing instrument in this case. Moreover, we can also notice that the consumption for a *j*-type parent behaving as a mimicker is lowered by an amount that is increasing in *s*, which in turn implies that the optimal subsidy rate is in this case 100%.^{Footnote 14}
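The effect of the reform on a mimicker can be illustrated numerically. The sketch below is only an illustration: the helper `consumption_after_reform` and all numbers are hypothetical, and it assumes, as in the text, that a parent with wage *w* who earns *Y* buys *Y*/*w* hours of child care at unit cost *q*.

```python
def consumption_after_reform(b, y, s, q, w_intended, w_agent):
    """Consumption of a parent with wage w_agent who picks the bundle (y, b)
    designed for a parent with wage w_intended, once the reform is in place.

    Hypothetical helper: the reform lowers after-tax income by the subsidy
    that the intended type receives, s * q * y / w_intended.
    """
    reformed_b = b - s * q * y / w_intended  # after-tax income after the reform
    hours = y / w_agent                      # hours this agent needs to earn y
    out_of_pocket = (1.0 - s) * q * hours    # unsubsidized part of child care
    return reformed_b - out_of_pocket


# Illustrative numbers (assumptions, not taken from the paper's calibration).
b, y, q = 150_000.0, 200_000.0, 40.0
w_low, w_high = 100.0, 125.0

for s in (0.0, 0.5, 1.0):
    intended = consumption_after_reform(b, y, s, q, w_low, w_low)
    mimicker = consumption_after_reform(b, y, s, q, w_low, w_high)
    print(f"s={s:.1f}  intended type: {intended:.0f}  mimicker: {mimicker:.0f}")
```

With these numbers the intended type's consumption is the same for every subsidy rate, while the mimicker's consumption falls as *s* rises, by \(sq\left( Y/w^{low}-Y/w^{high}\right) \) in total.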

Based on the discussion above, we can then proceed to analyze the case where our work-related good is fully subsidized. In such a setting, the indirect utility is given by \(V^{i,j}\left( B,Y\right) =u\left( B,Y/w^{i,j}\right) \) for both \(j=p\) and \(j=np\) as child care purchases no longer appear in the (private) budget constraints of parents.^{Footnote 15} Instead, these expenditures enter the government’s budget constraint. The problem solved by the government in the presence of the subsidy is given by:

subject to:

and

where Lagrange multipliers appear within parentheses.

As shown in Appendix B, manipulating the first-order conditions of the government’s problem, a general expression for the marginal tax rate faced by a type *i* agent, \(i\in \{1,\ldots ,N-1\}\), belonging to group \(j=p,np\) can be derived:

where, again, \({\widehat{V}}_{B}^{i+1,j}\equiv \frac{{\text {d}}}{{\text {d}}B^{i,j}} V^{i+1,j}(B^{i,j},Y^{i,j})\). For agents of type \((i,j)=(N,np)\), we still have that \(T^{\prime ij}=0\), whereas for agents of type \((i,j)=(N,p)\) we have \( T^{\prime ij}=\frac{q}{w^{i,p}}\).
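With full subsidization, the child care outlays \(qY^{i,p}/w^{i,p}\) move from the parents’ private budget to the government’s budget. A sketch of the resulting budget constraint and of a form of expression (2) consistent with the text is given below. The symbols \(\lambda ^{i+1,j}\), \(\mu \) and \(\widehat{\mathrm{MRS}}^{i+1,j}\) (multiplier on the constraint deterring type \(i+1\) from mimicking type *i*, multiplier on the budget constraint, and the mimicker’s marginal rate of substitution at the bundle intended for type *i*) and the zero revenue requirement are assumptions made here:

\[
\sum _{j\in \{p,np\}}\sum _{i=1}^{N}\pi ^{ij}\left( Y^{i,j}-B^{i,j}\right) -\sum _{i=1}^{N}\pi ^{i,p}\,q\,\frac{Y^{i,p}}{w^{i,p}}\ge 0\qquad \left( \mu \right) ,
\]

\[
T^{\prime }\left( Y^{ij}\right) =\mathbf {1}[j=p]\frac{q}{w^{i,j}}+\frac{\lambda ^{i+1,j}\,{\widehat{V}}_{B}^{i+1,j}}{\mu \,\pi ^{ij}}\left( \mathrm{MRS}^{ij}-\widehat{\mathrm{MRS}}^{i+1,j}\right) .
\]

Dropping the second term for the top type in each group recovers the rates stated above: zero for non-parents and \(q/w^{N,p}\) for parents.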

Comparing (1) and (2), it can thus immediately be seen that the only difference comes from the presence of a term \(\frac{q}{ w^{i,p}}\) in the expressions for the marginal tax rates faced by parents when income taxation is supplemented with a child care subsidy. The introduction of a subsidy is therefore likely to lead to an increase in the marginal tax rates for parents. However, the total distortions in the economy may in fact still be reduced. Intuitively, the *q* / *w* terms that enter the expressions for the marginal tax rates faced by parents do not represent distortionary terms but serve the same role as a market price in letting parents face the right incentives.^{Footnote 16} At the same time, the subsidy serves the purpose of weakening the self-selection constraints thwarting the government in the design of the nonlinear income tax that applies to parents. For these constraints, the mimicking-deterring effect reduces the need to distort agents for self-selection purposes. It therefore allows a reduction of the truly distortionary component (i.e., the \(\lambda \)-terms) in the formulas for the marginal tax rates.

Notice that the expressions for the marginal tax rates that apply to non-parents do not incorporate the *q* / *w* terms. This is important, since for them, these terms would represent a truly distortionary component. Notice also that the fact that the cost of child care is not mirrored in the expressions for the marginal tax rates that apply to non-parents does not mean that the additional resources needed to finance the child care subsidy are raised only from parents. It means that, if non-parents were also to participate in the financing of the child care subsidy, the additional revenue extracted from them may to a large extent be collected in a non-distortionary way through an increase in inframarginal income tax rates.^{Footnote 17}

Having analyzed the role of subsidies to work-related goods under a fully nonlinear income tax, in the next section we describe the quantitative model that we employ to compare the welfare-enhancing power of subsidies under different assumptions regarding the flexibility of the income tax at the disposal of the government. Before doing this, however, a final remark is in order. As we have pointed out, in our setting a 100% subsidy rate is optimal under fully nonlinear taxation. This result does not necessarily extend to the case of less sophisticated tax systems such as linear and piecewise linear tax systems. The reason is that with less sophisticated income tax systems the government no longer has the required flexibility to fully offset for each agent the distortion on the leisure-labor choice generated by subsidizing the work-related good. Thus, even though incentive-compatibility constraints are implicitly present also under piecewise linear income taxes,^{Footnote 18} and therefore one can still regard a subsidy to work-related goods as an instrument exerting mimicking-deterring effects, full subsidization is not necessarily optimal. This also implies that it is in principle ambiguous whether the effectiveness of subsidies to work-related goods as a welfare-enhancing instrument is increasing or decreasing in the degree of sophistication of the underlying income tax schedule. On the one hand, as the income tax becomes more sophisticated, social welfare increases and it becomes more difficult to reap welfare gains by supplementing income taxation with other policy instruments. On the other hand, as we have just remarked, when the income tax becomes more sophisticated, it becomes easier for the government to offset for each agent the distortionary effects generated by subsidizing the work-related good.

## The linear and piecewise linear tax problems

We now present the government maximization problem under a four-bracket piecewise linear tax.^{Footnote 19} As before, we assume that the population consists of 2*N* different types of agents with wage rates \(w^{1,j}<w^{2,j}<\cdots <w^{N,j}\) where \(j\in \{np,p\}\). The total population size is normalized to one, and \(\pi ^{ij}\) denotes the population share of a type (*i*, *j*) agent, \( i=1,\ldots ,N\), \(j\in \{np,p\}\). The piecewise linear tax function is described by four slope parameters \( t_{1},t_{2},t_{3},t_{4}\), and three ‘break points’ \(Z_{i}\) defined as the points on the *x*-axis where the slope of *T* changes. The demogrant is denoted by *G*. Formally, the tax function as a function of income *Y* is defined as:

The set of parameters of the tax function is denoted by \(\theta =\left( t_{1},t_{2},t_{3},t_{4},Z_{1},Z_{2},Z_{3},G\right) \).

The government designs two piecewise linear tax schedules, one for parents and one for non-parents, denoted by \(T(Y;\theta ^{p})\) and \(T(Y;\theta ^{np}) \), respectively. The consumption for an individual belonging to group \(j\in \{p,np\}\), with productivity *w*, choosing to earn an income of *Y* under a tax schedule described by the tax parameters \(\theta \), is given by:

where \(s\in \left[ 0,1\right] \) denotes the subsidy rate to child care.
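The tax function and the consumption expression described above can be sketched in code. This is a minimal sketch under stated assumptions: the sign convention \(T(0)=-G\), the consumption rule \(C^{j}(Y)=Y-T(Y;\theta ^{j})-\mathbf {1}[j=p](1-s)qY/w\), the function names and all numbers are illustrative choices, not taken from the paper.

```python
def piecewise_linear_tax(y, rates, breaks, demogrant):
    """Tax liability T(Y) with marginal rates `rates` on the brackets
    delimited by `breaks`; T(0) = -demogrant (sign convention assumed here)."""
    if len(rates) != len(breaks) + 1:
        raise ValueError("need exactly one more rate than break points")
    tax = -demogrant
    lower = 0.0
    # Walk through the brackets, taxing the part of y that falls in each one.
    for rate, upper in zip(rates, list(breaks) + [float("inf")]):
        if y <= lower:
            break
        tax += rate * (min(y, upper) - lower)
        lower = upper
    return tax


def consumption(y, w, rates, breaks, demogrant, s=0.0, q=0.0, parent=False):
    """Consumption net of child care outlays for an agent with wage w earning y."""
    # Parents buy y / w hours of child care and pay the unsubsidized share.
    child_care = (1.0 - s) * q * y / w if parent else 0.0
    return y - piecewise_linear_tax(y, rates, breaks, demogrant) - child_care


# Illustrative four-bracket schedule (all numbers are made up).
rates, breaks, demogrant = [0.2, 0.3, 0.4, 0.5], [100.0, 200.0, 300.0], 10.0
print(piecewise_linear_tax(150.0, rates, breaks, demogrant))
print(consumption(150.0, 50.0, rates, breaks, demogrant, s=0.5, q=10.0, parent=True))
# A linear tax is the special case with one rate and no break points.
print(piecewise_linear_tax(150.0, [0.3], [], demogrant))
```

Passing a single rate and an empty list of break points gives the linear schedule, so the same two functions cover every restricted tax system considered in this section.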

Agents choose *Y* to maximize \(U(C^{j}(Y),Y)\equiv u(C^{j}(Y),Y/w)\) leading to the indirect utility function:

Under a max–min social welfare function, the government solves the following problem:

subject to the resource constraint:

The solution to the problem above yields an optimal piecewise linear tax system with associated optimized tax schedules \(T^{*j}=T(Y;\theta ^{*j})\), \(j\in \{p,np\}\).^{Footnote 20} The solution also provides a value for the inter-group transfer, which will be denoted by \(G^{np,p}\), and which can be calculated as \(\sum _{i=1}^{N}\pi ^{i,np}\left( Y^{*np}(\theta ^{np};w^{i,np})-C^{*np}(\theta ^{np};w^{i,np})\right) \).^{Footnote 21} We solve this problem using numerical optimization techniques. A similar procedure is used to solve numerically the government’s problem under a two-bracket piecewise linear tax.

The case where the tax system is linear can be thought of as a limit case of the piecewise linear structure that we have described above. Simply, in the linear case, each of the two separate tax schedules features a single income bracket and a single marginal tax rate.^{Footnote 22}

## Quantitative model

In this paper, we use wages as a proxy for skills and calibrate the wage distribution to Swedish register data using the population distribution. Our wage data consist of individuals who worked at least part time in 2005. Parents are defined as women with at least one child of child care age (for Sweden, this corresponds to ages one to six); non-parents are defined as all men (with and without children) and all women without any child of child care age.^{Footnote 23} According to this definition, in 2005 the fraction of parents in Sweden was slightly below 10%.^{Footnote 24} As an estimate of the hourly price for child care, we have chosen a price of 40% of the median wage for parents.

In order to capture empirically relevant behavioral elasticities and facilitate a tractable comparison with different optimum tax models, we choose the following quadratic specification of the direct utility function:^{Footnote 25}

where \(\alpha ,\beta <0\), \(\gamma ,\delta ,\epsilon >0\).^{Footnote 26} The annual time endowment *J* is set to 5840 hours. The labor supply function is:

where *m* is virtual income and *w* is the wage rate. Finally, the (uncompensated) elasticity of labor supply is:

We make the normalization \(\alpha =-1\) and impose the constraint that the labor supply function evaluated at a (net) wage rate of zero is (on average) equal to zero. This pins down \(\beta \). The remaining parameters that need to be chosen are \(\gamma \), \(\delta \), and \(\epsilon \). We choose \(\gamma =0.07\), \(\delta =95\), and \(\epsilon =2000\), which produce empirically relevant substitution- and income effects on labor supply.

The uncompensated labor supply elasticity as a function of the (hourly) wage rate (denoted in SEK) is shown in the top panel of Fig. 1. Given that the distribution of wages for parents lies to the left of the wage distribution for non-parents, and that parents are interpreted as women with small children, the parameterization is consistent with the empirical finding that the labor supply of women with small children is more responsive to taxation.^{Footnote 27} The income elasticities of labor supply are shown in the middle panel and range between −0.05 and −0.08, consistent with the empirical literature, which usually documents small income effects. Finally, in the bottom panel of Fig. 1 the labor supply function is graphed.^{Footnote 28} Compared to parameterizations used in the earlier optimal tax literature, we believe the implied behavioral elasticities depicted in the graphs do, by and large, match more closely estimates found in the contemporary empirical labor supply literature.^{Footnote 29}

To obtain a revenue-based measure of the welfare gains attainable by subsidizing child care under different income tax systems, we consider an equivalent variation type of welfare gain measure, taking as a benchmark the solution to the government’s problem under the linear income tax optimum.^{Footnote 30} We first calculate the minimum amount of extra revenue that should be injected into the government’s budget, in the linear income tax optimum without child care subsidies, in order to achieve the same social welfare level as under a different tax system (piecewise linear or fully nonlinear income tax, with or without child care subsidies). Once we have found this minimum amount of extra revenue, we divide it by the aggregate GDP at the linear income tax optimum without child care subsidies, to get a revenue-based measure of the welfare gains.
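The two-step procedure just described amounts to a one-dimensional root search. The sketch below uses bisection as one possible way to carry it out; it is not claimed to be the procedure used for the reported results. The callable `welfare_at` is hypothetical: it stands in for re-solving the linear income tax optimum without child care subsidies after a given amount of extra revenue has been added to the budget, and it is assumed to be nondecreasing in that amount.

```python
def revenue_equivalent_gain(welfare_at, target_welfare, gdp_benchmark,
                            max_revenue, tol=1e-9):
    """Smallest extra revenue R with welfare_at(R) >= target_welfare,
    expressed as a share of benchmark GDP.

    welfare_at is a hypothetical callable returning social welfare at the
    linear-tax optimum without subsidies when R is injected into the budget.
    """
    if welfare_at(0.0) >= target_welfare:
        return 0.0
    if welfare_at(max_revenue) < target_welfare:
        raise ValueError("max_revenue is too small to reach the target welfare")
    lo, hi = 0.0, max_revenue
    # Invariant: welfare_at(lo) < target_welfare <= welfare_at(hi).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if welfare_at(mid) >= target_welfare:
            hi = mid
        else:
            lo = mid
    return hi / gdp_benchmark


def toy_welfare(extra_revenue):
    """Toy stand-in: welfare rises one-for-one with extra revenue."""
    return 100.0 + extra_revenue


gain = revenue_equivalent_gain(toy_welfare, target_welfare=103.0,
                               gdp_benchmark=300.0, max_revenue=50.0)
print(f"welfare gain as a share of GDP: {gain:.4f}")
```

In an actual application, each call to `welfare_at` would involve solving an optimal tax problem, so the number of bisection steps (controlled by `tol`) determines most of the computational cost.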

Regarding the social welfare function, we focus on the max–min, approximating this social welfare objective with the maximization of the demogrant. This is always a valid approach when the least well-off individual does not work.^{Footnote 31} In the simulation exercises presented below, the government optimizes two separate income tax schedules for the groups (parents and non-parents) and can transfer resources across the groups. In the case of a max–min social welfare function, this implies that the utility of the least well-off individual has to be the same in each group. When these agents do not work, a social welfare maximum requires the demogrant to be the same for both groups. For all the various tax systems that we consider (fully nonlinear, piecewise linear and linear), we represent the population distribution with 1998 agents and 999 wage rates from each group. These correspond to the quantiles of each distribution, with the exclusion of the extreme values.^{Footnote 32}
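A grid of 999 wage rates per group can be built from a wage sample in the way sketched below. This is one reading of the construction, not a description of the authors' code: it assumes the 999 rates are the interior permilles of each group's wage distribution, and the lognormal sample is synthetic, used only to make the example self-contained.

```python
import random
import statistics


def wage_grid(wages, n_types=999):
    """Return n_types interior quantiles of an observed wage sample."""
    # statistics.quantiles(data, n=k) returns the k - 1 cut points that split
    # the data into k equal-probability groups, so with k = n_types + 1 the
    # grid has n_types points and leaves out the extremes of the sample.
    return statistics.quantiles(wages, n=n_types + 1, method="inclusive")


# Synthetic wages (an assumption for illustration, not the Swedish data).
rng = random.Random(0)
sample = [rng.lognormvariate(5.0, 0.3) for _ in range(20_000)]

grid = wage_grid(sample)
print(len(grid), round(grid[0], 1), round(grid[-1], 1))
```

Applying the same function separately to the parents' and the non-parents' wage samples would give the two sets of 999 wage rates, which can then be reused unchanged across the linear, piecewise linear and fully nonlinear simulations.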

## Quantitative results

Our main results are contained in Tables 1 and 2 and in Fig. 2. In the figure, we have plotted the optimal fully nonlinear income tax system together with the optimal four-bracket piecewise linear tax system. The two top graphs display the marginal tax rate schedules for parents, whereas the two bottom graphs show the corresponding graphs for non-parents. The graphs to the left refer to the case with an optimally chosen subsidy to child care, whereas the graphs to the right refer to the case with no subsidy. The location of the break points in the piecewise linear tax system is indicated with vertical dashed lines. The marginal tax rates associated with the allocations chosen by agents under an optimal fully nonlinear income tax are indicated with blue dots, and the solid red line represents a kernel density approximation of the optimal schedule.^{Footnote 33}

The values of the marginal tax rates for the linear and piecewise linear tax schedules are displayed in Table 1 together with the value of the optimal subsidy rate and of the demogrant. As we can see from the table, the optimal subsidy rate drops below 100% only when the income tax system is linear. Thus, we get that in general very large subsidy rates are still optimal when the degree of sophistication of the income tax schedule is significantly lower than under a fully nonlinear income tax. At first sight, this may appear counterintuitive given that, under a max–min social welfare function, the government aims at maximizing the utility of the least well off, who are likely not to be working and therefore cannot directly benefit from a subsidy. However, notice that, since in our model all parents, irrespective of their market productivity, face an identical marginal cost of working (given by *q*, when \(s=0\)), a proportional subsidy on work-related expenditures becomes equivalent to a progressive wage subsidy. Formally, denoting by \(\overline{w}\) the net wage rate of a parent, i.e., \(\overline{w}\equiv \left( 1-t\right) w-\left( 1-s\right) q\), the combined effect of *t* and *s* is equivalent to a wage subsidy levied at rate \(\overline{s}=-t+sq/w\):

\[\overline{w}=\left( 1+\overline{s}\right) w-q,\]

which turns out to be progressive as \(\partial \overline{s}/\partial w=-sq/w^{2}<0\) for \(s>0\).

Put differently, by supplementing a linear income tax with a proportional subsidy on work-related expenditures, the marginal effective income tax rate (MEITR) faced by an agent is given by \(\tau \equiv t-sq/w\). Thus, even though any given individual faces a constant MEITR, the value of the MEITR is increasing in the market productivity of an agent. Despite the fact that the government does not directly observe the market productivity of an agent, the combination of a flat tax rate *t* and a flat subsidy *s* allows the government to offer parents a set of skill-dependent marginal income tax schedules. In a sense, this can be seen as the possibility to introduce in the tax system an additional layer of tagging, albeit of an imperfect kind given that all parents, irrespective of their skill type, face the same demogrant, and given that the MEITR \(\tau \) is constrained to vary in skill according to the function \(\partial \tau /\partial w=sq/w^{2}\).
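
The equivalence between the tax-subsidy pair and a skill-dependent wage subsidy can be checked numerically. The sketch below simply encodes the three expressions given in the text; the parameter values are hypothetical and chosen only for illustration.

```python
def net_wage(w: float, t: float, s: float, q: float) -> float:
    """Net wage of a parent: (1 - t) * w - (1 - s) * q."""
    return (1 - t) * w - (1 - s) * q


def wage_subsidy_rate(w: float, t: float, s: float, q: float) -> float:
    """Equivalent wage subsidy rate: s_bar = -t + s * q / w."""
    return -t + s * q / w


def meitr(w: float, t: float, s: float, q: float) -> float:
    """Marginal effective income tax rate: tau = t - s * q / w."""
    return t - s * q / w


# Hypothetical parameter values (not taken from the paper's tables).
t, s, q = 0.5, 0.6, 40.0
wages = [60.0, 100.0, 200.0, 400.0]

# The MEITR is constant for a given agent but rises with the wage rate.
rates = [meitr(w, t, s, q) for w in wages]
```

For every wage in the list, `net_wage(w, t, s, q)` coincides with `(1 + wage_subsidy_rate(w, t, s, q)) * w - q`, and the values in `rates` increase with the wage, which is the sense in which the flat pair (*t*, *s*) delivers skill-dependent marginal rates.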

The last column of Table 1 shows that, when supplementing the income tax that applies to parents with an optimal subsidy on work-related expenditures, there is less need to engage in inter-group redistribution (the value of \(G^{np,p}\) drops in all cases when *s* is optimally chosen). This is due to the fact that, by relying on the subsidy, the government succeeds in raising the demogrant that can be self-financed via taxation of parents’ aggregate labor income.

In terms of the effects of the subsidy on the structure of optimal statutory marginal income tax rates (as opposed to MEITR), we can see from Table 1 that, while the subsidy has minor effects on the structure of marginal tax rates for non-parents, it shifts up the structure of statutory marginal tax rates for parents. In the linear income tax case, the optimal marginal tax rate increases by about 22 percentage points, from 33.29 to 55.63%. Taking into account that in our simulations *q* is set equal to 40% of the median wage for parents, a subsidy at 60% coupled with an increase from 33.29 to 55.63% in *t* implies, roughly, that the MEITR for parents is lowered for those with a productivity below the median level and is increased for those with a productivity above the median level. For the piecewise linear income tax cases (two brackets and four brackets), we can also see that the introduction of the subsidy, rather than simply shifting up uniformly the structure of statutory marginal tax rates for parents, is accompanied by an increase in the statutory marginal tax rates that becomes smaller as one considers higher-income brackets.^{Footnote 34} This implies that the statutory marginal tax rate structure faced by parents becomes more regressive as income taxation is supplemented with a subsidy on work-related expenditures.
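
The arithmetic behind the statement on the MEITR in the linear case can be made explicit. The sketch below uses only the figures quoted in the paragraph above and normalizes the parents' median wage to one, which is a convenience of this illustration.

```python
T_NO_SUBSIDY = 0.3329   # optimal linear rate for parents without a subsidy
T_SUBSIDY = 0.5563      # optimal linear rate for parents with the subsidy
S = 0.60                # subsidy rate quoted in the text
Q_OVER_MEDIAN = 0.40    # q as a fraction of the parents' median wage


def meitr_change(w_rel: float) -> float:
    """Change in the MEITR for a parent whose wage is w_rel times the median."""
    return (T_SUBSIDY - S * Q_OVER_MEDIAN / w_rel) - T_NO_SUBSIDY


# Relative wage at which the MEITR is unchanged by the reform.
break_even = S * Q_OVER_MEDIAN / (T_SUBSIDY - T_NO_SUBSIDY)

for w_rel in (0.5, 1.0, break_even, 2.0):
    print(f"w/median = {w_rel:.3f}: change in MEITR = {meitr_change(w_rel):+.4f}")
```

With these figures the MEITR falls by about 1.7 percentage points at the median wage, and the break-even wage lies roughly 7% above the median, which is why the statement holds only approximately.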

Finally, the welfare comparisons are contained in Table 2. The reported results show that, although very large subsidy rates are in general optimal, the magnitude of the welfare gains that can be achieved by using this additional policy instrument varies significantly depending on the degree of sophistication of the underlying income tax schedule. In particular, whereas the welfare gains are negligible under a linear income tax, they are roughly of the same magnitude under a piecewise linear income tax and under a fully nonlinear income tax (0.52% for the case of a two-bracket piecewise linear tax, 0.48% for the case of a four-bracket piecewise linear tax, and 0.50% for the case of a fully nonlinear income tax).

Irrespective of whether work-related expenditures are subsidized or not, Table 2 also sheds light on the relative merits of increasingly sophisticated tax schedules. For the case when \(s=0\), the results show that, while a fully nonlinear income tax delivers large welfare gains compared to a linear income tax, a two-bracket piecewise linear tax already captures about 86% of the welfare gains achievable through a fully nonlinear optimal income tax, with the share increasing to about 91% for the case of a four-bracket piecewise linear income tax.^{Footnote 35} An almost identical picture emerges comparing the welfare gains of the various tax systems when the subsidy is optimally chosen.
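
The shares quoted above follow from simple ratios of the welfare gains. The sketch below reproduces them from the rounded figures reported in the notes for the max–min case with \(s=0\); since the inputs are rounded, the resulting shares may differ from the reported ones by a fraction of a percentage point.

```python
# Welfare gains relative to the linear optimum without subsidy, in percent of
# GDP (max-min case, s = 0), as reported in the paper's notes.
GAIN_TWO_BRACKET = 10.91
GAIN_FOUR_BRACKET = 11.62
GAIN_NONLINEAR = 12.68


def share_captured(gain_simple: float, gain_nonlinear: float) -> float:
    """Fraction of the fully nonlinear welfare gain captured by a simpler schedule."""
    return gain_simple / gain_nonlinear


share_two = share_captured(GAIN_TWO_BRACKET, GAIN_NONLINEAR)
share_four = share_captured(GAIN_FOUR_BRACKET, GAIN_NONLINEAR)
```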

### Robustness with respect to the choice of social welfare function

Up until now, we have considered a max–min social welfare objective, which in a setting where a nonzero fraction of the population is non-working is equal to the objective of tax revenue maximization from the working population. This represents a simple and transparent benchmark case that has been analyzed extensively in the optimal tax literature. In Appendix D, we examine the sensitivity of our results to the choice of social objective by examining the results when the government is maximizing the following social welfare function:

\[W=\sum _{j\in \{p,np\}}\sum _{i=1}^{N}\pi ^{i,j}\log \left( u^{i,j}\right) .\]

This is equivalent to a formulation where the social planner is of the Utilitarian type and preferences are given by \(\log (u)\) where *u* is defined in (4).^{Footnote 36} The results from this exercise are displayed in Tables 3 and 4 and in Fig. 3 (mirroring Tables 1 and 2 and Fig. 2). As can be seen from the summary of the welfare gains in Table 4, the gains from nonlinear income taxation under the above social welfare function specification are significantly reduced due to the substantial decrease in the government’s desire for redistribution.^{Footnote 37} However, it is still the case that the welfare gains of an optimally chosen child care subsidy are about the same under the fully nonlinear income tax as under a piecewise linear tax.

Finally, one may also note from Table 4 that the piecewise linear income tax is able to capture, in comparison with what happened for the case of a max–min social welfare function, a substantially smaller part (about 59% in the case of a two-bracket piecewise linear tax and about 66% in the case of a four-bracket piecewise linear tax) of the welfare gain associated with a fully nonlinear income tax.^{Footnote 38}

## Concluding remarks

The previous literature has shown that, in the presence of a fully nonlinear income tax, subsidizing complementary-to-labor private goods may be beneficial due to its role in alleviating the self-selection constraints faced by the government when trying to achieve redistributive goals. In this paper, we have set out to examine whether this finding is a theoretical curiosity, namely, that such gains are achievable only when the government is optimizing a fully nonlinear income tax, or whether sizable welfare gains can be obtained also when the government is optimizing simpler income tax systems of the kind used in real economies. This comparison is made possible through a computational approach where we are able to compute fully nonlinear optimal income taxes and piecewise linear taxes under identical circumstances in terms of the components of the optimal income tax model (social welfare function, distribution of productivities, and the model of household behavior).

The message that we provide is overall positive. Using a quantitative simulation model with behavioral foundations consistent with the empirical labor supply literature, our analysis indicates that, while the effectiveness of subsidies to work-related goods as a welfare-enhancing instrument is indeed increasing in the degree of sophistication of the underlying income tax schedule, the welfare gains that can be achieved by subsidizing the work-related good under a piecewise linear income tax are roughly the same as the gains that can be achieved by subsidizing the work-related good under an optimal fully nonlinear income tax. Regarding the optimal value of the subsidy, our results indicate that it is in general quite large and is (weakly) increasing in the degree of sophistication of the underlying income tax schedule.

Our results also indicate that, in general, an optimal nonlinear income tax delivers significant welfare gains compared to a linear income tax, even though the magnitudes of these welfare gains vary substantially depending on the chosen social welfare function. In particular, the welfare gains appear to be increasing in the degree of social aversion to inequality embedded in the social welfare function. However, in the context of our stylized model, between 66 and 91% of the gains of a fully nonlinear optimal income tax (over a linear income tax) can be captured by a four-bracket piecewise linear income tax, depending on the choice of social welfare function, and even with a two-bracket piecewise linear tax one could capture between 59 and 86% of the gains of a fully nonlinear optimal income tax.

To conclude, we would like to emphasize that the purpose of this paper has not been to provide realistic measures of the welfare gains that can be derived from subsidizing child care in real economies, as such exercises would require a more sophisticated model of household behavior. Instead, we have used a simple and computationally tractable model to illustrate how the welfare gains that derive from subsidizing a complementary-to-work good in a nonlinear income tax setting depend on the degree of sophistication of the income tax instrument.

## Notes

- 1.
- 2.
See, e.g., Blomquist and Christiansen (1995), Boadway and Marchand (1995), Cremer and Gahvari (1997), Balestrino (2000), Pirttilä and Tuomala (2002), and Blomquist et al. (2010). More recently, Koehne and Sachs (2017) analyze the desirability of providing tax breaks for work-related goods in the context of Pareto-efficient tax structures, whereas Bastani et al. (2017), and Ho and Pavoni (2016) analyze optimal child care subsidies.

- 3.
As we have already mentioned, the Mirrleesian literature provides a rationale for subjecting work-related goods/services to a relatively more lenient tax treatment (compared to other goods) based on its effects on the incentive-compatibility constraints faced by the government in the design of the nonlinear income tax. But it is worth noticing that incentive constraints are also implicitly present under less sophisticated tax schedules, as, for instance, under a piecewise linear tax, albeit they take a different form.

- 4.
For a recent empirical assessment of the relation between working hours and demand for child care services, see Pirttilä and Suoniemi (2014).

- 5.
- 6.
The simpler case where all individuals have work-related consumption requirements and the government designs a single tax schedule is of course a special case of our analysis.

- 7.
It is more appropriate to focus on a model of labor supply rather than a model of taxable income in the current context, as we are interested in needs for work-related consumption goods directly related to labor supply.

- 8.
However, in contrast to this study, we allow for tax schedules that generate non-convex budget sets.

- 9.
The empirically most relevant counterpart to what we label as “parents” is, most likely, the so-called secondary earner in couples with children of child care age, or the lone parent (of young children) when the household is not a couple. The reason is that these are the agents whose labor supply is primarily affected by the availability of child care services, and in this sense, they can be singled out as the “users” of the work-related good which we use in our illustration. Thus, although we use for simplicity the labels “parents” and “non-parents,” one has to bear in mind that for our purposes the group of parents represents a subset of real-world parents, consisting to a large extent of mothers of young children.

- 10.
This assumption is made for simplicity and does not affect the qualitative results.

- 11.
See Guesnerie and Seade (1982) for a further elaboration on the single-crossing condition in the discrete optimal income tax model.

- 12.
The term “tagging” was coined by Akerlof (1978) to describe the use of taxes that are contingent on personal characteristics. More recent contributions on tagging and taxation include Immonen et al. (1998), Boadway and Pestieau (2006), Blomquist and Micheletto (2008), Cremer et al. (2010), Bastani (2013), Bastani et al. (2013), Bastani et al. (2015), and Kanbur and Tuomala (2016).

- 13.
This is due to the fact that non-parents do not demand child care services and the fact that tagging implies that non-parents can only choose income points on the tax schedule that applies to non-parents.

- 14.
In our model, child care services enter the individual decision problem as a ‘needs constraint’ and are not subject to a separate individual choice. Thus, a child care subsidy does not distort how individuals allocate their disposable income across consumption goods. The only margin of choice that it distorts is the individual leisure-labor choice. However, under a fully nonlinear income tax, where marginal income tax rates can be varied independently at each income level, the government has enough flexibility to offset for all parents, through a proper adjustment in their income tax schedule, the distortionary effect generated by a variation in the subsidy rate.

- 15.
Notice that, as compared to the case considered in the previous subsection, the parents’ indifference curves in the \(\left( Y,B\right) \)-space are likely to become flatter. This certainly happens when the agents’ preferences are quasi-linear in consumption, in which case the parents’ indifference curves flatten by the amount *q*/*w*. More generally, the parents’ indifference curves flatten after the introduction of the child care subsidy provided that the income effects on labor supply are not very large.

- 16.
It forces parents to internalize the resource cost of child care which they would face in a competitive market where child care services are privately purchased.

- 17.
Nonetheless, it would be wrong to expect that the actual values of the optimal marginal income tax rates faced by non-parents would not change once the government supplements income taxation with a child care subsidy. The reason why only the first-order conditions for the optimal marginal taxes faced by parents change their form is that, in the government’s budget constraint, the public outlays associated with the child care subsidy are only a function of the labor supply of households with children. However, the change in the public budget constraint generated by the inclusion of the cost of the child care subsidy will affect the value of the Lagrange multiplier associated with the public budget constraint. In turn, this will change quantitatively also the value of the marginal tax rates for non-parents, even though their first-order conditions do not change.

- 18.
However, they take a different form than under a fully nonlinear tax since individuals on the same budget segment are pooled together.

- 19.
Our focus on piecewise linear tax systems is motivated by the fact that most real-world tax systems take this form. One reason for the widespread use of piecewise linear taxes might be that such taxes are relatively easy to understand for taxpayers which might be necessary for the tax system to be perceived as transparent and legitimate. Empirically, most individuals locate in the interior of segments of piecewise linear taxes, and hence when determining their marginal tax burden, face a much simpler calculation under a piecewise linear tax as compared to under a fully nonlinear income tax.

- 20.
Notice that this is not a concave programming problem. Although utility is continuous in \(\Theta \), if the tax schedule displays in some intervals marginal rate regressivity, the budget set is non-convex and tax revenue is not continuous. For this reason, an algorithmic approach suited for non-smooth problems needs to be used in the numerical analysis.

- 21.
When \(\sum _{i=1}^{N}\pi ^{i,np}\left( Y^{*np}(\theta ^{np};w^{i,np})-C^{*np}(\theta ^{np};w^{i,np})\right) >0\), we have that the inter-group transfer runs from non-parents to parents (\(G^{np,p}>0\)). When instead \(G^{np,p}<0\), the inter-group transfer implies a redistribution of resources from parents to non-parents.

- 22.
The government’s problem under a linear tax system is described in detail in Appendix C where we also provide a characterization of the optimal marginal tax rates that apply to parents and non-parents, and a characterization of the optimal child care subsidy.

- 23.
This choice is motivated by the fact that what we have in mind as an empirical counterpart to the label “parent” is, more properly, the so-called secondary earner in couples with children in child care age, or the lone parents (of young children) when the household is not a couple. The reason is that these are the agents whose labor supply is primarily affected by the availability of child care services, and in this sense, they can be singled out as the “users” of the subsidized private good on which we focus.

- 24.
Data have been combined from three sources, “Flergenerationsregistret,” “Louise-databasen” and “Lönestrukturstatistiken,” covering men and women working in the public sector and in large companies but not in small companies. According to Statistics Sweden, there were 2,143,775 women aged 25–60 in Sweden in 2005. Our data set includes 1,457,931 wages for women and 1,519,921 wages for men. Among women, 17.43% had at least one child of day care age. This represents 8.53% of the entire population.

- 25.
A similar utility function is described by Stern (1986) as a good candidate for representing labor supply behavior. The quadratic specification has also been used by Tuomala (2010) and is computationally convenient as it permits a closed-form solution for the labor supply choice. This is useful especially when dealing with piecewise linear tax schedules.

- 26.
To ensure concavity, we require \(4\alpha \beta -\gamma ^{2}>0\).

- 27.
See, e.g., the review of the literature provided by Meghir and Phillips (2010).

- 28.
The labor supply function is evaluated at an (annual) non-labor income of \( m=150{,}000\) (SEK) which is of the same order of magnitude as the demogrant arising endogenously in the optimal tax problems that we solve.

- 29.
One should keep in mind that in this simulation exercise we focus on the labor supply elasticity rather than on the taxable income elasticity. It should therefore not be surprising that the labor supply elasticity generated by our utility function is decreasing in the wage rate of agents.

- 30.
A characterization of the linear income tax optimum, with and without child care subsidies, is provided in Appendix C.

- 31.
- 32.
When approximating actual wage distributions, one can either use a set of equally spaced wage rates together with heterogeneous probabilities, or use the percentiles of the wage distribution, along with (by construction) uniform probabilities. We have chosen the latter approach to represent the wage distribution as it makes sense to use more data points in more populated regions of the wage distribution.

- 33.
As already mentioned, to approximate the actual wage distributions we have used the percentiles of the wage distribution, along with (by construction) uniform probabilities. This implies that some wage rates lie quite close together in regions of the wage distribution where many individuals are located. This in turn explains why we observe some bunching in the four graphs of Fig. 2.

- 34.
For the two-bracket piecewise linear income tax case, we have that \( dt_{1}^{p}=56.43\%>dt_{2}^{p}=28.74\%\); for the four-bracket piecewise linear income tax case, we have that \(dt_{1}^{p}=52.06\%>dt_{2}^{p}=37.58 \%>dt_{3}^{p}=25.69\%>dt_{4}^{p}=18.44\%\).

- 35.
The value 86% is found by dividing the welfare gains obtained under a two-bracket piecewise linear income tax (and zero subsidy), i.e., 10.91%, by the corresponding figure under a fully nonlinear income tax, i.e., 12.68%. The value 91% is obtained by dividing the welfare gains under a four-bracket piecewise linear income tax (and zero subsidy), i.e., 11.62%, by the corresponding figure under a fully nonlinear income tax.

- 36.
This is a common type of social welfare function used, for example, by Saez (2001) in his optimal tax simulations for the case when utility is quasi-linear in consumption. As is well known, the desire for redistribution in optimal tax models depends on the joint curvature of the utility function and the social welfare function. As our utility function (4) (realistically) has moderate income effects as compared to some other utility specifications in the literature, the \( \log \) transformation of utility serves the purpose of introducing additional motives for redistribution.

- 37.
Notice, however, that the inter-group transfer is substantially larger under the (generalized) Utilitarian social objective than under the max–min case. The reason is that in the latter case the planner was concerned with equalizing the demogrants (in order to equalize the utility of the least well off in the two groups), and therefore, except for agents at the bottom of the skill distribution, the planner did not attach weight to a reduction in the difference between the marginal utility of consumption for parents and non-parents. In the (generalized) Utilitarian case, the planner is instead concerned with the well-being of a broader set of agents, which implies that the inter-group transfer is an instrument to equalize the average net social marginal utility of income for parents and non-parents. Another difference with respect to the max–min case refers to the structure of the statutory marginal tax rates. In the absence of a subsidy, the two- and four-bracket piecewise linear taxes were generally characterized by a decreasing profile of marginal tax rates in the max–min case, a standard feature when the government’s objective is the maximization of the demogrant. Under our (generalized) Utilitarian objective function, instead, the marginal tax profile is increasing in the absence of a subsidy. It becomes decreasing only for parents when an optimal subsidy is used. But also in this case the decrease in the statutory marginal tax rates over the income brackets is smaller than under the max–min objective.

- 38.
The value 59% is found by dividing the welfare gains under a two-bracket piecewise linear income tax (and zero subsidy), i.e., 0.26%, by the corresponding figure for the fully nonlinear income tax, i.e., 0.44%. The value 66% is found by dividing the welfare gains under a four-bracket piecewise linear income tax (and zero subsidy), i.e., 0.29%, by the corresponding figure for the fully nonlinear income tax. Similar, but slightly larger, numbers would be obtained comparing piecewise linear and fully nonlinear taxes under the assumption that *s* is always optimally chosen. In this case, a two-bracket piecewise tax would capture about 61% of the welfare gains of a fully nonlinear tax (calculated as 0.30/0.49), and the corresponding figure for a four-bracket piecewise tax would be 69% (calculated as 0.34/0.49).

- 39.
Apart from the fact that our definition of \(b^{i}\) also incorporates a term depending on *q*, the condition \(E\left( b\right) =1\), which implicitly defines the optimal level of the demogrant, is the same that one obtains in a standard model of optimal linear income taxation without public provision.

## References

Aaberge, R., & Colombino, U. (2013). Using a microeconometric model of household labour supply to design optimal income taxes. *Scandinavian Journal of Economics*, *115*(2), 449–475.

Akerlof, G. A. (1978). The economics of tagging as applied to the optimal income tax, welfare programs, and manpower planning. *American Economic Review*, *68*, 8–19.

Andrienko, Y., Apps, P., & Rees, R. (2016). Optimal taxation and top incomes. *International Tax and Public Finance*, *23*(6), 981–1003.

Apps, P., Van Long, N., & Rees, R. (2013). Optimal piecewise linear taxation. *Journal of Public Economic Theory*, *16*(4), 523–545.

Atkinson, A. B., & Stiglitz, J. E. (1976). The design of tax structure: Direct versus indirect taxation. *Journal of Public Economics*, *6*(1–2), 55–75.

Balestrino, A. (2000). Mixed tax systems and the public provision of private goods. *International Tax and Public Finance*, *7*, 463–478.

Bastani, S. (2013). Gender-based and couple-based taxation. *International Tax and Public Finance*, *20*(4), 653–686.

Bastani, S. (2015). Using the discrete model to derive optimal income tax rates. *FinanzArchiv: Public Finance Analysis*, *71*(1), 106–117.

Bastani, S., Blomquist, S., & Micheletto, L. (2013). The welfare gains of age-related optimal income taxation. *International Economic Review*, *54*(4), 1219–1249.

Bastani, S., Blomquist, S., & Micheletto, L. (2017). Child care subsidies, quality, and optimal income taxation. CESifo Working Paper No. 6533.

Bastani, S., Blomquist, S., & Pirttilä, J. (2015). How should commodities be taxed? A counter-argument to the recommendation in the Mirrlees Review. *Oxford Economic Papers*, *67*(2), 455–478.

Blomquist, S., & Christiansen, V. (1995). Public provision of private goods as a redistributive device in an optimum income tax model. *Scandinavian Journal of Economics*, *97*(4), 547–567.

Blomquist, S., Christiansen, V., & Micheletto, L. (2010). Public provision of private goods and nondistortionary marginal tax rates. *American Economic Journal: Economic Policy*, *2*(2), 1–27.

Blomquist, S., & Micheletto, L. (2008). Age-related optimal income taxation. *The Scandinavian Journal of Economics*, *110*(2003:7), 45–71.

Boadway, R., & Marchand, M. (1995). The use of public expenditures for redistributive purposes. *Oxford Economic Papers*, *47*(1), 45–59.

Boadway, R., & Pestieau, P. (2006). Tagging and redistributive taxation. *Annales d’Economie et de Statistique*, *83*(84), 123–147.

Christiansen, V. (1984). Which commodity taxes should supplement the income tax? *Journal of Public Economics*, *24*(2), 195–220.

Corlett, W. J., & Hague, D. C. (1953). Complementarity and the excess burden of taxation. *The Review of Economic Studies*, *21*(1), 21–30.

Cremer, H., & Gahvari, F. (1997). In-kind transfers, self-selection and optimal tax policy. *European Economic Review*, *41*(1), 97–114.

Cremer, H., Gahvari, F., & Lozachmeur, J. M. (2010). Tagging and income taxation: Theory and an application. *American Economic Journal: Economic Policy*, *2*(1), 31–50.

Guesnerie, R., & Seade, J. (1982). Nonlinear pricing in a finite economy. *Journal of Public Economics*, *17*, 157–179.

Hausman, J. A. (1979). The econometrics of labor supply on convex budget sets. *Economics Letters*, *3*(2), 171–174.

Ho, C., & Pavoni, N. (2016). Efficient child care subsidies. Working Papers 572. IGIER (Innocenzo Gasparini Institute for Economic Research), Bocconi University.

Immonen, R., Kanbur, R., Keen, M., & Tuomala, M. (1998). Tagging and taxing: The use of categorical and income information in designing tax/transfer schemes. *Economica*, *65*, 179–192.

Kanbur, R., & Tuomala, M. (2016). Groupings and the gains from tagging. *Research in Economics*, *70*(1), 53–63.

Koehne, S., & Sachs, D. (2017). Pareto-efficient tax breaks. CESifo Working Paper Series No. 6147. Available at SSRN: https://ssrn.com/abstract=2877112.

Meghir, C., & Phillips, D. (2010). Labour supply and taxes. In J. Mirrlees (Ed.), *Dimensions of tax design: The Mirrlees Review*. Oxford: OUP.

Pirttilä, J., & Suoniemi, I. (2014). Public provision, commodity demand, and hours of work: An empirical analysis. *The Scandinavian Journal of Economics*, *116*(4), 1044–1067.

Pirttilä, J., & Tuomala, M. (2002). Publicly provided private goods and redistribution: A general equilibrium analysis. *Scandinavian Journal of Economics*, *104*, 173–188.

Saez, E. (2001). Using elasticities to derive optimal income tax rates. *Review of Economic Studies*, *68*, 205–229.

Sheshinski, E. (1972). The optimal linear income tax. *Review of Economic Studies*, *39*, 297–302.

Slemrod, J., Yitzhaki, S., Mayshar, J., & Lundholm, M. (1994). The optimal two-bracket linear income tax. *Journal of Public Economics*, *53*, 269–290.

Stern, N. H. (1986). On the specification of labour supply functions. In W. Blundell & I. Walker (Eds.), *Unemployment, search and labour supply* (pp. 121–142). Cambridge: Cambridge University Press.

Tuomala, M. (2010). On optimal non-linear income taxation: Numerical results revisited. *International Tax and Public Finance*, *17*, 259–270.


## Appendices

### Optimal marginal tax rates with optimal nonlinear income taxation

The first-order conditions of the government’s problem for \(Y^{i,j}\) and \( B^{i,j}\) (for \(i\in \{1,\ldots ,N-1\}\) and \(j=p,np\)) are, respectively, given by:

and therefore:

which, using the definition of (implicit) marginal tax rate \(T^{\prime }=1-\mathrm{MRS}\), implies (1).

For \(i=N\) and \(j=p,np\) we instead have that the first-order conditions with respect to \(Y^{i,j}\) and \(B^{i,j}\) are, respectively, given by:

implying \(\left[ 1+\frac{\partial V^{N,j}}{\partial Y^{N,j}}/\frac{\partial V^{N,j}}{\partial B^{N,j}}\right] \mu \pi ^{N,j}=0\) and therefore \(T^{\prime }\left( Y^{N,j}\right) =0\).

### Optimal marginal tax rates with optimal nonlinear income taxation and a child care subsidy

The first-order conditions of the government’s problem with respect to \( Y^{i,np}\) and \(B^{i,np}\) are identical to those characterizing the government’s problem in the absence of a child care subsidy. Thus, the formulas characterizing the marginal tax rates faced by non-parents remain unaffected. Instead, the first-order conditions of the government’s problem for \(Y^{i,p}\) and \(B^{i,p}\) (for \(i\in \{1,\ldots ,N-1\}\)) become, respectively:

and therefore:

which, using the definition of (implicit) marginal tax rate \(T^{\prime }=1-\mathrm{MRS}\), implies (2).

For \(i=N\) and \(j=p\) we instead have that the first-order conditions with respect to \(Y^{i,p}\) and \(B^{i,p}\) are, respectively, given by:

implying \(\left[ 1+\frac{\partial V^{N,p}}{\partial Y^{N,p}}/\frac{\partial V^{N,p}}{\partial B^{N,p}}\right] \mu \pi ^{N,p}=\mu q\pi ^{N,p}/w^{N,p}\) and therefore \(T^{\prime }\left( Y^{N,p}\right) =q/w^{N,p}\).

### Characterization of an optimum under a linear tax

Under a linear income tax characterized by a marginal tax rate \(t^{p}\) and demogrant \(G^{p}\), and supplemented by a child care subsidy levied at rate *s*, parents solve the problem \(\max _{h}u\left( G^{p}+\left( 1-t^{p}\right) wh-(1-s)qh,h\right) \). Under a linear income tax characterized by a marginal tax rate \(t^{np}\) and demogrant \(G^{np}\), non-parents solve the problem \(\max _{h}u\left( G^{np}+\left( 1-t^{np}\right) wh,h\right) \).
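
To fix ideas, the two individual problems can be solved in closed form for a simple utility function. The sketch below does not use the quadratic specification adopted in the paper: it assumes, purely for illustration, the quasi-linear isoelastic form \(u(c,h)=c-h^{1+1/e}/(1+1/e)\), under which there are no income effects and the demogrant does not affect hours. All function names and parameter values are hypothetical.

```python
def hours(net_wage: float, elasticity: float = 0.5) -> float:
    """Optimal hours for u(c, h) = c - h**(1 + 1/e) / (1 + 1/e), c = G + net_wage * h.

    The first-order condition net_wage = h**(1/e) gives h = net_wage**e;
    agents with a non-positive net wage do not work.
    """
    return max(net_wage, 0.0) ** elasticity


def parent_hours(w: float, t_p: float, s: float, q: float, elasticity: float = 0.5) -> float:
    """Parents face the net wage (1 - t_p) * w - (1 - s) * q."""
    return hours((1 - t_p) * w - (1 - s) * q, elasticity)


def non_parent_hours(w: float, t_np: float, elasticity: float = 0.5) -> float:
    """Non-parents face the net wage (1 - t_np) * w."""
    return hours((1 - t_np) * w, elasticity)


# Hypothetical values: the same parent with no subsidy and with a full subsidy.
h_no_subsidy = parent_hours(w=100.0, t_p=0.5, s=0.0, q=40.0)
h_full_subsidy = parent_hours(w=100.0, t_p=0.5, s=1.0, q=40.0)
```

Under this assumed specification a full subsidy makes a parent's labor supply coincide with that of a non-parent who has the same wage and faces the same tax rate, since the child care cost no longer enters the net wage.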

Denoting, respectively, by \(V^{i,p}\left( t^{p},G^{p},s\right) \) and \( V^{i,np}\left( t^{np},G^{np}\right) \) the indirect utility of parents and non-parents of ability type *i*, and denoting by \(\alpha ^{ij}\) the welfare weight used by the government for agents of type *ij*, the design problem solved by the government can be written as:

subject to:

where \(\mu \) is the Lagrange multiplier associated with the government’s budget constraint.

Denote \(\pi ^{i,p}/\underset{k=1}{\overset{N}{\sum }}\pi ^{k,p}\) by \( {\overline{\pi }}^{ip}\) and \(\pi ^{i,np}/\underset{k=1}{\overset{N}{\sum }} \pi ^{k,np}\) by \({\overline{\pi }}^{i,np}\). The first-order conditions with respect to \(G^{p}\) and \(G^{np}\) are, respectively, given by:

Define the net social marginal valuation of a lump-sum transfer to a parent of type *i* and to a non-parent of type *i* as, respectively:

Having defined \(b^{i,p}\) and \(b^{i,np}\), we can easily see that conditions (9) and (10) boil down to requiring \(E\left( b^{p}\right) =E\left( b^{np}\right) =1\), where \(E\left( \cdot \right) \) denotes the expectation operator.^{Footnote 39} In other words, at an optimum the lump-sum component is adjusted so that \(b^{j}\), the government’s net social marginal valuation of a transfer of 1 currency unit (measured in terms of government revenue) to agents of group *j* (with \(j=p,np\)), is on average equal to its marginal cost.
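Written out, and assuming that the expectation is taken within each group using the normalized population shares \({\overline{\pi }}^{ip}\) and \({\overline{\pi }}^{i,np}\) defined above, the condition reads:

\[
\sum_{i=1}^{N}{\overline{\pi }}^{ip}\,b^{i,p}=\sum_{i=1}^{N}{\overline{\pi }}^{i,np}\,b^{i,np}=1.
\]

Since the shares sum to one within each group, this is a weighted average of the type-specific valuations, so some types may have \(b^{i,j}>1\) provided others have \(b^{i,j}<1\).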

The first-order condition with respect to \(t^{np}\) is the following:

or, equivalently, applying the Slutsky equation and denoting by a tilde symbol a compensated variable:

Noticing that \(\frac{\partial V^{i,np}\left( t^{np},G^{np}\right) }{\partial t^{np}}=-\frac{\partial V^{i,np}\left( t^{np},G^{np}\right) }{\partial G^{np} }w^{i,np}h^{i,np}\) (by applying Roy’s identity) and \(\frac{\partial {\widetilde{h}}^{i,np}}{\partial t^{np}}=-w^{i,np}\frac{\partial {\widetilde{h}} ^{i,np}}{\partial w^{i,np}(1-t^{np})}\), and using (10), we can derive the following implicit expression for the optimal \(t^{np}\):

or, equivalently, denoting by \({\widetilde{\eta }} _{h^{i,np},w^{i,np}(1-t^{np})}\) the compensated elasticity of labor supply with respect to the net wage rate for a non-parent of skill type *i*:

The first-order conditions with respect to \(t^{p}\) and *s* are, respectively, given by:

Using the Slutsky equation and denoting by a tilde symbol a compensated variable, we can rewrite Eqs. (11) and (12), respectively, as:

Notice that, by applying Roy’s identity, we can write \(\frac{\partial V^{i,p}\left( t^{p},G^{p},s\right) }{\partial t^{p}}=-\frac{\partial V^{i,p}\left( t^{p},G^{p},s\right) }{\partial G^{p}}w^{i,p}h^{i,p}\) and \( \frac{\partial V^{i,p}\left( t^{p},G^{p},s\right) }{\partial s}=\frac{ \partial V^{i,p}\left( t^{p},G^{p},s\right) }{\partial G^{p}}qh^{i,p}\). Moreover, we have that \(\frac{\partial {\widetilde{h}}^{i,p}}{\partial t^{p}} =-w^{i,p}\frac{\partial {\widetilde{h}}^{i,p}}{\partial \left[ w^{i,p}(1-t^{p})-\left( 1-s\right) q\right] }\) and \(\frac{\partial {\widetilde{h}}^{i,p}}{\partial s}=q\frac{\partial {\widetilde{h}}^{i,p}}{ \partial \left[ w^{i,p}(1-t^{p})-\left( 1-s\right) q\right] }\). Thus, using (9), we can rewrite (13) and (14) in matrix form as:

Denoting by \(\Delta \) the determinant of the \(2\times 2\) matrix on the left-hand side of (15), we have:

Thus, we have:

### Robustness with respect to the social welfare function

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

### Cite this article

Bastani, S., Blomquist, S. & Micheletto, L. Nonlinear and piecewise linear income taxation, and the subsidization of work-related goods. *Int Tax Public Finance* **26**, 806–834 (2019). https://doi.org/10.1007/s10797-019-09532-1

### Keywords

- Optimal income taxation
- Tagging
- Piecewise linear taxation
- Child care subsidies

### JEL Classification

- H21
- H42