Abstract
We provide a full characterization of a two-type optimal nonlinear income tax model where the single-crossing condition is violated due to an assumption that agents differ both in terms of market abilities and in terms of their needs for a work-related good. We set up a Pareto-efficient tax problem and analyze the entire second-best Pareto-frontier, highlighting several nonstandard results, such as the possibility of income re-ranking relative to the laissez-faire and gaps in the Pareto-frontier.
Introduction
The important and influential literature growing out of Mirrlees’ (1971) seminal paper on optimal income taxation has stressed the tradeoffs between incentive and distributional considerations in the design of income tax schedules. These tradeoffs arise from an information friction that endogenizes the feasible tax instruments: the government knows the distribution of types in the population and it can also observe the actual earned income of each individual, but is not able to observe the specific type of any given individual. Personalized lump-sum taxes and transfers are therefore not available, but public observability of earned income at the individual level allows the government to tax earned income on a nonlinear scale.
The vast majority of papers in the optimal tax literature assume that agents differ along a single dimension (market ability). This is due to tractability considerations. Given certain assumptions on the utility function, one-dimensional heterogeneity yields a monotonic relationship between an agent’s unobserved type and the slope of his/her indifference curve in the earnings-consumption space. This property, referred to as ‘single-crossing’ (hereafter, SC), allows the researcher to provide a full characterization of the set of implementable contracts while restricting attention to local incentive constraints linking adjacent types. In the case of a continuum of types, it also implies that the incentive constraints can conveniently be expressed in terms of differential equations. When agents differ along multiple dimensions, however, the SC property will generally be violated, as there is no natural way to order agents in a multidimensional space.^{Footnote 1}
A comparatively small literature analyzes optimal income taxation with multidimensional unobserved heterogeneity, and these contributions can roughly be divided into four strands. A first strand assumes that the additional dimensions of heterogeneity enter the utility function in an additively separable way, thereby not affecting individuals’ tradeoffs between pre-tax and after-tax income (see e.g., Kleven et al. 2009; Jacquet et al. 2013; Scheuer 2014; Bastani et al. 2020). A second strand imposes restrictions such that the various dimensions of heterogeneity can be collapsed into one dimension and parameterized by a single index (see, e.g., Boadway et al. 2002; Choné and Laroque 2010; Golosov et al. 2013; Rothschild and Scheuer 2014; Lockwood and Weinzierl 2015). A third strand analyzes more general forms of heterogeneity, but restricts attention to quantitative analysis of models with a small discrete number of types (see, e.g., Bastani et al. 2013; Judd et al. 2018). Finally, a fourth strand comprises papers that provide a characterization of optimal marginal tax rates while remaining agnostic about which incentive-compatibility constraints are binding in equilibrium (see, e.g., Cremer et al. 1998; Cremer and Gahvari 2002; Micheletto 2008).
Compared to the existing literature referred to above, the purpose of this paper is to provide a more thorough investigation of the consequences of abandoning the SC condition. For this purpose, we set up a simple two-type model where the SC condition is naturally violated, and we characterize the properties of a second-best optimum by considering the entire second-best Pareto frontier (hereafter, PF).^{Footnote 2} The model that we consider is a standard intensive-margin optimal income tax model where agents have identical preferences and heterogeneous market abilities, but where we also allow for heterogeneity in “needs” for a work-related good/service, i.e. a good/service that some agents need to purchase in order to work.^{Footnote 3} It is this bi-dimensional heterogeneity that implies a violation of the SC condition.
Our analysis highlights several results, each of them representing an anomaly with respect to what is obtained in an optimal income tax model under SC. First of all, a second-best optimum might not preserve the ranking of earned income that prevails under laissez-faire. Second, redistribution via income taxation might be feasible even when the laissez-faire equilibrium is a pooling equilibrium. Third, a second-best optimum might not be unique, in the sense that there might be more than one set of allocations in the (pre-tax income, after-tax income)-space that solve the government’s maximization problem. Fourth, the second-best PF can be disconnected. Fifth, supplementing an optimal nonlinear income tax with an optimal subsidy on work-related expenses may imply that redistribution is achieved through a separating or pooling equilibrium where both self-selection constraints are binding. A final result that we show is that the labor supply of some agents may be distorted even though no self-selection constraint is (locally) binding in equilibrium.
The paper is organized as follows. In Sect. 2 we present our setting and highlight how it implies that the SC condition does not hold. In Sect. 3 we evaluate the properties of the second-best PF and of the allocations that allow implementing the various points on the second-best PF. To simplify the exposition we make the assumption that, for agents who incur a cost for the purchase of a work-related good, the cost is proportional to their labor supply. In Sect. 4 we discuss how our results change when work-related expenses are subsidized by the government, and in Sect. 5 we briefly consider the possibility that job-related expenses vary nonlinearly with hours of work. Finally, Sect. 6 offers concluding remarks.
The model
Consider an economy populated by two groups of individuals who have identical preferences represented by the quasilinear utility function
$$\begin{aligned} u\left( c,h\right) =c-\frac{h^{2}}{2}, \end{aligned} \tag{1}$$
where c denotes consumption and h denotes labor supply.^{Footnote 4}
The two groups of agents are assumed to differ with respect to their market ability, reflected in their hourly wage rate, and their needs for a work-related good. One group has no need for any work-related good, whereas agents belonging to the other group incur a monetary cost \(\varphi (h)=qh\), where q is a positive constant. Throughout the paper we will refer to these groups of agents as “nonusers” and “users”, and denote their hourly wage rates by, respectively, \(w^{n}\) and \(w^{u}\) (superscript “n” referring to nonusers, and superscript “u” referring to users). Moreover, normalizing to 1 the size of the total population, we will denote by \(\pi \) the proportion of users. Furthermore, we will assume that \(w^{u}>w^{n}\), implying that the high-skilled agents are disadvantaged along our second dimension of heterogeneity, and that \(q<w^{u}\), which ensures that the labor supply of users is strictly positive under laissez-faire.
Assume that the government levies a nonlinear income tax T(wh) and let earned income be denoted by Y (i.e., \(Y\equiv wh\)) and after-tax income be denoted by B (i.e., \(B\equiv Y-T\left( Y\right) \)). It is straightforward to verify that the SC property is not satisfied in our two-type economy. This property requires that, at any bundle in the (Y, B)-space, the indifference curves are flatter the higher the wage rate of an agent. In our model, and for a given (Y, B)-bundle, users and nonusers have utilities that are respectively given by:
$$\begin{aligned} U^{u}=B-\frac{q}{w^{u}}Y-\frac{1}{2}\left( \frac{Y}{w^{u}}\right) ^{2},\qquad U^{n}=B-\frac{1}{2}\left( \frac{Y}{w^{n}}\right) ^{2}. \end{aligned}$$
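Therefore, at a given (Y, B)-bundle, the slope of a user’s indifference curve is equal to
$$\begin{aligned} MRS_{YB}^{u}=\frac{q}{w^{u}}+\frac{Y}{\left( w^{u}\right) ^{2}}, \end{aligned} \tag{2}$$
whereas nonusers have an indifference curve with slope equal to
$$\begin{aligned} MRS_{YB}^{n}=\frac{Y}{\left( w^{n}\right) ^{2}}. \end{aligned} \tag{3}$$
From (2) and (3), it follows that users and nonusers have equally sloped indifference curves at bundles where
$$\begin{aligned} Y=\frac{qw^{u}\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}-\left( w^{n}\right) ^{2}}\equiv \Omega , \end{aligned} \tag{4}$$
whereas at any bundle where \(Y>\left( <\right) \Omega \), users have flatter (steeper) indifference curves than nonusers.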
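The crossing of the two families of indifference curves can be checked numerically. The following Python sketch assumes the quasilinear utility \(c-h^{2}/2\) implied by the laissez-faire problem, so that a user’s indifference curve has slope \(q/w^{u}+Y/\left( w^{u}\right) ^{2}\) and a nonuser’s has slope \(Y/\left( w^{n}\right) ^{2}\); the function names and the parameter values are illustrative assumptions, not part of the model.

```python
def slope_user(Y, w_u, q):
    """Slope of a user's indifference curve in (Y, B)-space (assumed form)."""
    return q / w_u + Y / w_u**2


def slope_nonuser(Y, w_n):
    """Slope of a nonuser's indifference curve in (Y, B)-space (assumed form)."""
    return Y / w_n**2


def omega(w_u, w_n, q):
    """Income level at which the two slopes coincide."""
    return q * w_u * w_n**2 / (w_u**2 - w_n**2)


# Illustrative (hypothetical) parameters with w_u > w_n and q < w_u.
w_u, w_n, q = 2.0, 1.0, 1.0
Om = omega(w_u, w_n, q)

for Y in (0.5 * Om, Om, 2.0 * Om):
    gap = slope_user(Y, w_u, q) - slope_nonuser(Y, w_n)
    print(f"Y = {Y:.3f}: user slope minus nonuser slope = {gap:+.3f}")
```

The sign of the difference switches at \(\Omega \): users have the steeper curve below it and the flatter curve above it, which is what single-crossing rules out.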
The fact that the SC property is not satisfied shows that our bi-dimensional heterogeneity (in skills and needs) cannot be reduced to one dimension. Although this complicates the analysis, it also allows us to highlight some interesting results that can arise due to the violation of SC.
In the next section we will evaluate the properties of the second-best PF and of the allocations that allow implementing the various points on the second-best PF. In doing that, we will restrict our attention to the case when \(\pi \), the proportion of users, is lower than \(1-\left( w^{n}\right) ^{2}/\left( w^{u}\right) ^{2}\); this represents the most interesting case for the purpose of illustrating the anomalies that can arise due to the violation of SC.^{Footnote 5}
Before turning to the analysis of the second-best PF, however, we will devote the remainder of this section to first provide a characterization of the laissez-faire equilibrium, and then characterize the properties of the first-best PF.
The laissez-faire equilibrium
Under laissez-faire, users choose h to maximize \(\left( w^{u}-q\right) h-h^{2}/2\), implying \(h^{u}=w^{u}-q\), whereas nonusers choose h to maximize \(w^{n}h-h^{2}/2\), implying \(h^{n}=w^{n}\).
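Therefore, denoting by \(Y_{LF}^{i}\) the laissez-faire income of an individual i, for \(i=n,u\), we have that \(Y_{LF}^{n}=\left( w^{n}\right) ^{2}\), \(Y_{LF}^{u}=\left( w^{u}-q\right) w^{u}\). It then follows that
$$\begin{aligned} Y_{LF}^{u}\gtreqqless Y_{LF}^{n}\iff \left( w^{u}-q\right) w^{u}\gtreqqless \left( w^{n}\right) ^{2}. \end{aligned}$$
Equivalently, defining \({\overline{q}}\) as
$$\begin{aligned} {\overline{q}}\equiv \frac{\left( w^{u}\right) ^{2}-\left( w^{n}\right) ^{2}}{w^{u}}, \end{aligned}$$
we have that
$$\begin{aligned} Y_{LF}^{u}\gtreqqless Y_{LF}^{n}\iff q\lesseqqgtr {\overline{q}}. \end{aligned}$$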
Consider the case when \(q>{\overline{q}}\), so that \(Y_{LF}^{u}<Y_{LF}^{n}\). Since \(\Omega \) in (4) can be re-expressed as \(\left( w^{n}\right) ^{2}q/{\overline{q}}\), it also follows that \(\Omega >Y_{LF}^{n}\) when \(q> {\overline{q}}\). Similarly, when \(q<{\overline{q}}\), we have that \(\Omega <Y_{LF}^{n}\), and when \(q={\overline{q}}\) we have that \(\Omega =Y_{LF}^{n}\). Thus, whether q is smaller than, equal to, or larger than \({\overline{q}}\) also determines the relative sizes of both types’ MRS at their laissez-faire bundles (i.e. the relations between \(Y_{LF}^{u}\), \(Y_{LF}^{n}\) and the threshold \(\Omega \)).
The following Lemma summarizes the relationship between the value of q and the three possible configurations of a laissez-faire equilibrium.
Lemma 1
Assume that \(w^{u}>w^{n}\).

(i) When \(q<{\overline{q}}\), the laissez-faire equilibrium will be such that \( Y_{LF}^{u}>Y_{LF}^{n}>\Omega \);

(ii) When \(q={\overline{q}}\), the laissez-faire equilibrium will be such that \( Y_{LF}^{u}=Y_{LF}^{n}=\Omega \);

(iii) When \(q>{\overline{q}}\), the laissez-faire equilibrium will be such that \(Y_{LF}^{u}<Y_{LF}^{n}<\Omega \).
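The three configurations of the Lemma can be reproduced with a few lines of Python. The sketch below assumes \({\overline{q}}=[\left( w^{u}\right) ^{2}-\left( w^{n}\right) ^{2}]/w^{u}\), which is the value of q at which \(\left( w^{u}-q\right) w^{u}=\left( w^{n}\right) ^{2}\), together with \(\Omega =\left( w^{n}\right) ^{2}q/{\overline{q}}\); the function name and the wage values are hypothetical.

```python
import math


def laissez_faire_case(w_u, w_n, q):
    """Return the Lemma 1 configuration with Y_LF^u, Y_LF^n and Omega."""
    q_bar = (w_u**2 - w_n**2) / w_u   # value of q at which incomes coincide
    Y_u = (w_u - q) * w_u             # users' laissez-faire income
    Y_n = w_n**2                      # nonusers' laissez-faire income
    Om = Y_n * q / q_bar              # income at which the slopes coincide
    if math.isclose(q, q_bar):
        case = "ii"
    elif q < q_bar:
        case = "i"
    else:
        case = "iii"
    return case, Y_u, Y_n, Om


# Hypothetical wages w_u = 2 and w_n = 1, so that q_bar = 1.5.
for q in (1.0, 1.5, 1.8):
    case, Y_u, Y_n, Om = laissez_faire_case(2.0, 1.0, q)
    print(f"q = {q}: case ({case}), Y_u = {Y_u:.2f}, Y_n = {Y_n:.2f}, Omega = {Om:.2f}")
```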
A graphical illustration of the laissez-faire equilibrium for the case when \( q>{\overline{q}}\), and of the violation of SC, is provided in Fig. 1 below.
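Regarding utilities, denoting by \(U_{LF}^{i}\) the laissez-faire utility of an individual i, for \(i=n,u\), we have that \(U_{LF}^{u}=\left( w^{u}-q\right) ^{2}/2\), \(U_{LF}^{n}=\left( w^{n}\right) ^{2}/2\), and therefore
$$\begin{aligned} U_{LF}^{u}\gtreqqless U_{LF}^{n}\iff \left( w^{u}-q\right) ^{2}\gtreqqless \left( w^{n}\right) ^{2}, \end{aligned}$$
or, equivalently,
$$\begin{aligned} U_{LF}^{u}\gtreqqless U_{LF}^{n}\iff q\lesseqqgtr w^{u}-w^{n}. \end{aligned}$$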
One thing to notice is that the utility ranking and the income ranking may differ. In particular, while \(Y_{LF}^{u}\le Y_{LF}^{n}\) implies that \( U_{LF}^{u}<U_{LF}^{n}\), knowing that \(Y_{LF}^{u}>Y_{LF}^{n}\) is not sufficient to establish who is better off under laissez-faire. When \( Y_{LF}^{u}>Y_{LF}^{n}\), we can have that \(U_{LF}^{u}<U_{LF}^{n}\) (when \( \left( w^{u}-q\right) w^{u}>\left( w^{n}\right) ^{2}>\left( w^{u}-q\right) ^{2}\)), \(U_{LF}^{u}=U_{LF}^{n}\) (when \(\left( w^{u}-q\right) w^{u}>\left( w^{n}\right) ^{2}=\left( w^{u}-q\right) ^{2}\)), or \(U_{LF}^{u}>U_{LF}^{n}\) (when \(\left( w^{u}-q\right) w^{u}>\left( w^{u}-q\right) ^{2}>\left( w^{n}\right) ^{2}\)).
The shape of the first-best Pareto frontier
In a first-best setting where asymmetric information is not an issue, the shape of the PF can be straightforwardly characterized. The first-best PF goes through the point with coordinates (\(U_{LF}^{n},U_{LF}^{u}\)) and has slope \(dU^{u}/dU^{n}=-(1-\pi )/\pi \) for values of \(U^{n}\) such that \( -\left( w^{n}\right) ^{2}/2\le U^{n}\le \left[ \left( w^{u}-q\right) ^{2}\pi /\left( 1-\pi \right) \right] +\left( w^{n}\right) ^{2}/2\). For \( U^{n}>\left[ \left( w^{u}-q\right) ^{2}\pi /\left( 1-\pi \right) \right] +\left( w^{n}\right) ^{2}/2\) the slope of the PF is such that \( dU^{u}/dU^{n}<-(1-\pi )/\pi \); for \(U^{n}<-\left( w^{n}\right) ^{2}/2\) the slope is such that \(-(1-\pi )/\pi<dU^{u}/dU^{n}<0\).
The intuition is as follows. Starting from the laissez-faire equilibrium, a $1 lump-sum tax levied on nonusers, which reduces by 1 the utility of each nonuser, allows the government to collect $\((1-\pi )\), which implies that each user can receive a lump-sum transfer of $\(\left( 1-\pi \right) /\pi \), raising utility by \(\left( 1-\pi \right) /\pi \). This kind of income- and utility-redistribution, from nonusers to users, can go on until all the income earned by nonusers under laissez-faire, i.e. \(\left( w^{n}\right) ^{2}\), is confiscated by the government. At that point we have that \(U^{n}=-\left( w^{n}\right) ^{2}/2\) (consumption for nonusers is equal to zero and, with no income effects on labor supply, their labor supply is undistorted at its laissez-faire level) and \(U^{u}=\left[ \left( w^{n}\right) ^{2}\left( 1-\pi \right) /\pi \right] +\left( w^{u}-q\right) ^{2}/2\). Once this point on the first-best PF is reached, and assuming that zero represents the lower bound for individual consumption,^{Footnote 6} a further increase in \(U^{u}\) can only be obtained by pushing the labor supply of nonusers above its undistorted level \(h^{n}=w^{n}\) (while keeping at zero their consumption), so that additional resources can be transferred to users. However, due to the distortion on the labor supply of nonusers, redistribution becomes costlier and the slope of the PF becomes equal to \(dU^{u}/dU^{n}=-\left( 1-\pi \right) w^{n}/\left( \pi h^{n}\right) \), which is greater than \(-(1-\pi )/\pi \) when \(h^{n}\) exceeds \(w^{n}\), i.e. its laissez-faire value.^{Footnote 7}
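The linear segment of the first-best frontier described above can be traced with a short Python sketch. It assumes lump-sum redistribution at undistorted labor supplies, so that each unit of utility taken from a nonuser raises a user’s utility by \(\left( 1-\pi \right) /\pi \); the function name and the parameter values are hypothetical.

```python
def first_best_user_utility(U_n, w_u, w_n, q, pi):
    """U^u on the linear segment of the first-best frontier for a given U^n."""
    U_n_lf = w_n**2 / 2                             # nonusers' laissez-faire utility
    U_u_lf = (w_u - q)**2 / 2                       # users' laissez-faire utility
    lower = -w_n**2 / 2                             # nonusers' consumption reaches zero
    upper = U_n_lf + pi * (w_u - q)**2 / (1 - pi)   # users' consumption reaches zero
    if not (lower <= U_n <= upper):
        raise ValueError("U_n lies outside the linear segment")
    return U_u_lf - (1 - pi) / pi * (U_n - U_n_lf)


# Hypothetical parameters.
w_u, w_n, q, pi = 2.0, 1.0, 1.0, 0.3
print(first_best_user_utility(w_n**2 / 2, w_u, w_n, q, pi))    # laissez-faire point
print(first_best_user_utility(-w_n**2 / 2, w_u, w_n, q, pi))   # nonusers fully taxed
```

At \(U^{n}=-\left( w^{n}\right) ^{2}/2\) the function returns \(\left[ \left( w^{n}\right) ^{2}\left( 1-\pi \right) /\pi \right] +\left( w^{u}-q\right) ^{2}/2\), the endpoint discussed in the text.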
The fact that the nonnegativity constraint on consumption becomes binding along some portions of the first-best PF, and consequently the fact that there are portions of the first-best PF where the labor supply of some agents is upward distorted, is an artifact of our assumption that utility is linear in consumption.^{Footnote 8} Most importantly, it has nothing to do with the fact that the SC property does not hold in our model. For this reason, in our analysis we will hereafter impose the following lower bounds on the utility of, respectively, nonusers and users:
$$\begin{aligned} U^{n}\ge -\frac{\left( w^{n}\right) ^{2}}{2}, \end{aligned} \tag{5}$$
$$\begin{aligned} U^{u}\ge -\frac{\left( w^{u}-q\right) ^{2}}{2}. \end{aligned} \tag{6}$$
Conditions (5) and (6) ensure that, at each point along the relevant part of the first-best PF, the labor supply of all agents will be left undistorted.
Pareto efficient income taxation
Consider now a second-best setting with asymmetric information. Specifically, assume that the government knows the distribution of types in the population but does not know “who is who”. Although individual wages, hours of work and job-related expenses are not observed by the government, earned income is assumed to be publicly observable at an individual level. This allows earned income to be taxed on a nonlinear scale and the government’s problem consists in optimally choosing the nonlinear income tax \(T\left( Y\right) \). Notice however that, while \(T\left( Y\right) \) defines a link between earned income Y and after-tax income B which is a single-valued function, the link that it establishes between earned income and consumption is a multi-valued function. This is because, for given Y and corresponding tax payment \(T\left( Y\right) \), an individual’s consumption depends on the amount of job-related expenses.
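As customary in the optimal tax literature, we will adopt a mechanism design approach assuming that the government optimally chooses two bundles in the (Y, B)-space subject to the requirement that the chosen set of bundles satisfies public-budget balance, incentive-compatibility, and nonnegativity constraints on both consumption and labor supply. Denoting by (\(Y^{u},B^{u}\)) the bundle intended for users and by (\(Y^{n},B^{n}\)) the one intended for nonusers, a Pareto efficient tax problem can be formalized as follows:
$$\begin{aligned} \max _{Y^{u},B^{u},Y^{n},B^{n}}\quad B^{u}-\frac{q}{w^{u}}Y^{u}-\frac{1}{2}\left( \frac{Y^{u}}{w^{u}}\right) ^{2} \end{aligned}$$
subject to:
$$\begin{aligned} B^{n}-\frac{1}{2}\left( \frac{Y^{n}}{w^{n}}\right) ^{2}&\ge {\overline{V}}^{n},&(\nu ) \\ \pi \left( Y^{u}-B^{u}\right) +\left( 1-\pi \right) \left( Y^{n}-B^{n}\right)&\ge 0,&(\mu ) \\ B^{n}-\frac{1}{2}\left( \frac{Y^{n}}{w^{n}}\right) ^{2}&\ge B^{u}-\frac{1}{2}\left( \frac{Y^{u}}{w^{n}}\right) ^{2},&(\lambda ) \\ B^{u}-\frac{q}{w^{u}}Y^{u}-\frac{1}{2}\left( \frac{Y^{u}}{w^{u}}\right) ^{2}&\ge B^{n}-\frac{q}{w^{u}}Y^{n}-\frac{1}{2}\left( \frac{Y^{n}}{w^{u}}\right) ^{2},&(\phi ) \end{aligned}$$
together with the nonnegativity constraints \(Y^{u}\ge 0\), \(Y^{n}\ge 0\), \(B^{u}-qY^{u}/w^{u}\ge 0\) and \(B^{n}\ge 0\).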
In the problem above, the \(\nu \)-constraint prescribes a lower bound \( {\overline{V}}^{n}\) for the utility of nonusers, the \(\mu \)-constraint represents the government’s budget constraint (the resource constraint of the economy), the \(\lambda \)-constraint is the self-selection constraint requiring nonusers not to be tempted to choose the bundle intended for users, and the \(\phi \)-constraint is the self-selection constraint requiring users not to be tempted to choose the bundle intended for nonusers. For a given value of \({\overline{V}}^{n}\), we define the set of admissible bundles as the set of bundles \(\{(Y^{u},B^{u}),(Y^{n},B^{n})\}\) satisfying the constraints in the above optimization problem (including the nonnegativity constraints on labor supply and consumption for each agent). For given values of \(\pi \), q, \(w^{u}\) and \(w^{n}\), the value function of the optimization problem above defines a value for \(U^{u}\) which is a function of \({\overline{V}}^{n}\), that can be written as \(U_{SB}^{u}\left( {\overline{V}}^{n}\right) \). Repeatedly solving the optimization problem for different values of \({\overline{V}}^{n}\) allows tracing the entire second-best PF. In particular, we have that:
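The notion of an admissible pair of bundles can be made concrete with a small Python check. The sketch below encodes the four constraints as they are described in words above, under the assumptions that utility is \(c-h^{2}/2\), that the budget constraint takes the form \(\pi \left( Y^{u}-B^{u}\right) +\left( 1-\pi \right) \left( Y^{n}-B^{n}\right) \ge 0\), and that a user’s consumption equals \(B^{u}-qY^{u}/w^{u}\); function names, the tolerance and the numerical values are hypothetical.

```python
def utility_nonuser(Y, B, w_n):
    """Nonuser's utility from the bundle (Y, B)."""
    return B - (Y / w_n)**2 / 2


def utility_user(Y, B, w_u, q):
    """User's utility from the bundle (Y, B), net of work-related expenses."""
    return B - q * Y / w_u - (Y / w_u)**2 / 2


def is_admissible(user_bundle, nonuser_bundle, V_bar, w_u, w_n, q, pi, tol=1e-9):
    """Check the nu-, mu-, lambda- and phi-constraints plus nonnegativity."""
    Y_u, B_u = user_bundle
    Y_n, B_n = nonuser_bundle
    own_n = utility_nonuser(Y_n, B_n, w_n)
    own_u = utility_user(Y_u, B_u, w_u, q)
    checks = {
        "nu": own_n >= V_bar - tol,
        "mu": pi * (Y_u - B_u) + (1 - pi) * (Y_n - B_n) >= -tol,
        "lambda": own_n >= utility_nonuser(Y_u, B_u, w_n) - tol,
        "phi": own_u >= utility_user(Y_n, B_n, w_u, q) - tol,
        "nonneg": min(Y_u, Y_n, B_n, B_u - q * Y_u / w_u) >= -tol,
    }
    return all(checks.values()), checks


# Hypothetical parameters; the laissez-faire bundles lie on the 45-degree line.
w_u, w_n, q, pi = 2.0, 1.0, 1.0, 0.3
lf_user = ((w_u - q) * w_u, (w_u - q) * w_u)
lf_nonuser = (w_n**2, w_n**2)
ok, checks = is_admissible(lf_user, lf_nonuser, w_n**2 / 2, w_u, w_n, q, pi)
print(ok, checks)
```

Solving the problem for a grid of \({\overline{V}}^{n}\) values would additionally require a constrained optimizer, which this sketch deliberately leaves out.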
Definition 1
The second-best Pareto-frontier is defined by the graph of the function \(U_{SB}^{u}\left( {\overline{V}}^{n}\right) \) over the domain of values \({\overline{V}}^{n}\) such that the set of admissible bundles is nonempty and the \(\nu \)-constraint is binding.
We will present our results by means of three Propositions which separately consider the three cases described in Lemma 1 above. In each Proposition we will denote by \(T^{\prime }\left( Y_{SB}^{u}\right) \) and \(T^{\prime }\left( Y_{SB}^{n}\right) \) the marginal income tax rate faced by, respectively, users and nonusers at the allocation which allows implementing a given point on the second-best PF. As customary in the optimal tax literature, the marginal income tax rate faced by an individual at a given bundle in the (Y, B)-space is defined as \(1-MRS_{YB}\).
As we will see, the nonstandard outcomes which are due to the violation of SC only arise when \(q\ge {\overline{q}}\). For this reason, discussing the results when \(q<{\overline{q}}\) can be regarded as a useful starting point. Proposition 1 summarizes the main findings for this case.
Proposition 1
Assume that \(0<q<{\overline{q}}\), so that \(Y_{LF}^{n}<Y_{LF}^{u}\). Then,

(i) the domain of the function \(U_{SB}^{u}\left( {\overline{V}} ^{n}\right) \) describing the second-best PF is given by \({\overline{V}}^{n}\in [-U_{LF}^{n},U_{LF}^{n}+\frac{\pi }{2}\frac{ (Y_{LF}^{u}-Y_{LF}^{n})^{2}}{\left( w^{u}\right) ^{2}-\pi \left( w^{n}\right) ^{2}}]\);

(ii) for \(U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{ \left( w^{n}\right) ^{2}}\le {\overline{V}}^{n}\le U_{LF}^{n}+\frac{\pi }{2} \frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{\left( w^{u}\right) ^{2}}\), the second-best PF coincides with the first-best PF and it is attained through an allocation where \(T^{\prime }\left( Y_{SB}^{n}\right) =T^{\prime }\left( Y_{SB}^{u}\right) =0\);

(iii) for \(U_{LF}^{n}+\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{ \left( w^{u}\right) ^{2}}<{\overline{V}}^{n}\le U_{LF}^{n}+\frac{\pi }{2} \frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{\left( w^{u}\right) ^{2}-\pi \left( w^{n}\right) ^{2}}\), the second-best PF is attained through an allocation where \(T^{\prime }\left( Y_{SB}^{u}\right) =0\) and \(T^{\prime }\left( Y_{SB}^{n}\right) >0\);

(iv) for \(-U_{LF}^{n}\le {\overline{V}}^{n}<U_{LF}^{n}-\frac{\pi }{2} \frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{\left( w^{n}\right) ^{2}}\), the second-best PF is attained through an allocation where \(T^{\prime }\left( Y_{SB}^{n}\right) =0\) and \(T^{\prime }\left( Y_{SB}^{u}\right) <0\).
Proof
See Appendix A. \(\square \)
The results provided in Proposition 1 are qualitatively similar to those that would be obtained in a standard two-type setting where agents only differ in market ability (\(q=0\)).
Part (ii) shows that when the amount of intergroup redistribution is sufficiently small (i.e., \({\overline{V}}^{n}\) is sufficiently close to \( U_{LF}^{n}\)), no distortion is needed to satisfy incentive-compatibility; this means that asymmetric information does not prevent the government from attaining a point on the first-best PF.
Together, parts (iii) and (iv) show instead that, when the amount of redistribution becomes sufficiently large, incentive-compatibility considerations require distorting the labor supply of the transfer-recipients. When these are represented by nonusers, as in part (iii) of Proposition 1, their labor supply will be downward distorted by letting them face a positive marginal tax rate. When transfer-recipients are instead represented by users, as in part (iv), their labor supply will be upward distorted by letting them face a negative marginal tax rate. In either case, since Proposition 1 refers to the case when \(Y_{LF}^{n}<Y_{LF}^{u}\), the direction of the distortion imposed on the labor supply of the transfer-recipients is always “coherent” with the income ranking prevailing under laissez-faire. Thus, when \(q<{\overline{q}}\), the laissez-faire income-ranking is preserved at all points on the second-best PF.
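The thresholds appearing in Proposition 1 are easy to evaluate numerically. The Python sketch below computes them and, as a consistency check, recovers the largest attainable \({\overline{V}}^{n}\) by a grid search. The second function rests on an assumption of this sketch rather than on the Appendix: users are left undistorted, the \(\phi \)-constraint binds and the budget balances, so that nonusers receive a per-capita transfer of \(\pi (Y_{LF}^{u}-Y^{n})^{2}/[2\left( w^{u}\right) ^{2}]\). All names and parameter values are hypothetical.

```python
def proposition1_thresholds(w_u, w_n, q, pi):
    """Bounds on V_bar^n that delimit the regions of Proposition 1."""
    gap2 = ((w_u - q) * w_u - w_n**2) ** 2   # squared laissez-faire income gap
    U_n_lf = w_n**2 / 2
    return {
        "V_min": -U_n_lf,
        "fb_low": U_n_lf - pi / 2 * gap2 / w_n**2,
        "fb_high": U_n_lf + pi / 2 * gap2 / w_u**2,
        "V_max": U_n_lf + pi / 2 * gap2 / (w_u**2 - pi * w_n**2),
    }


def nonuser_utility_phi_binding(Y_n, w_u, w_n, q, pi):
    """Nonusers' utility at income Y_n when the phi-constraint binds (assumed form)."""
    Y_u = (w_u - q) * w_u                       # users stay undistorted
    transfer = pi * (Y_u - Y_n) ** 2 / (2 * w_u**2)
    return Y_n - Y_n**2 / (2 * w_n**2) + transfer


# Hypothetical parameters with q < q_bar = 1.5 and pi < 1 - (w_n / w_u)**2.
w_u, w_n, q, pi = 2.0, 1.0, 1.0, 0.3
bounds = proposition1_thresholds(w_u, w_n, q, pi)
grid_max = max(
    nonuser_utility_phi_binding(i * 1e-4, w_u, w_n, q, pi) for i in range(10001)
)
print(bounds)
print(f"grid-search maximum of V_bar^n: {grid_max:.6f}")
```

Under these assumptions the grid-search maximum agrees with the upper end of the domain in part (i), and it is attained at an income for nonusers below \(Y_{LF}^{n}\), in line with \(T^{\prime }\left( Y_{SB}^{n}\right) >0\) in part (iii).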
Let’s now consider the case when \(q={\overline{q}}\).
Proposition 2
Assume that \(q={\overline{q}}\), so that \(Y_{LF}^{n}=Y_{LF}^{u}\). Then,

(i) the domain of the function \(U_{SB}^{u}\left( {\overline{V}} ^{n}\right) \) describing the second-best PF is given by \({\overline{V}}^{n}\in \left[ -U_{LF}^{n},U_{LF}^{n}\right] \);

(ii) for \((1-\pi )U_{LF}^{n}\le {\overline{V}}^{n}<U_{LF}^{n}\), the second-best PF can be attained through two different allocations, one where \( T^{\prime }\left( Y_{SB}^{n}\right) =0\) and \(T^{\prime }\left( Y_{SB}^{u}\right) <0\), and another one where \(T^{\prime }\left( Y_{SB}^{n}\right) =0\) and \(T^{\prime }\left( Y_{SB}^{u}\right) >0\);

(iii) for \(-U_{LF}^{n}\le {\overline{V}}^{n}<(1-\pi )U_{LF}^{n}\), the second-best PF is attained through an allocation where \(T^{\prime }\left( Y_{SB}^{n}\right) =0\) and \(T^{\prime }\left( Y_{SB}^{u}\right) <0\).
Proof
See Appendix B. \(\square \)
A key insight to understand the properties of the PF when \(q={\overline{q}}\) is that the indifference curve on which nonusers locate under laissez-faire lies everywhere above the indifference curve on which users locate under laissez-faire (except at the point \(Y_{LF}^{n}=Y_{LF}^{u}\) where the two indifference curves are tangent). This is illustrated in Fig. 2.
According to Proposition 2, the government can use a nonlinear income tax to redistribute towards users even in cases when both types earn the same income under laissez-faire. This stands in contrast to models where SC holds; under SC, an anonymous nonlinear income tax does not allow the government to convert a pooling laissez-faire equilibrium into a separating equilibrium. However, as shown in part (ii), the labor supply of users is always distorted for \({\overline{V}}^{n}<U_{LF}^{n}\), which shows that the \(\lambda \)-constraint is binding for any degree of redistribution from nonusers to users.
The indifference curves represented in Fig. 2 are helpful to get an intuition for the result that redistribution towards users is feasible. Suppose in fact that nonusers were offered an undistorted bundle on an indifference curve that is below the one on which they locate under laissez-faire. Looking at Fig. 2 it is easy to see that a downward shift in the indifference curve of nonusers would make it possible to find a set of bundles that are at the same time above the users’ laissez-faire indifference curve and below the downward shifted indifference curve of nonusers. This means that, starting from the equilibrium described in Fig. 2, it is feasible to move nonusers onto a lower indifference curve without violating the incentive-compatibility constraint requiring them not to be tempted to mimic users (the \(\lambda \)-constraint).
According to part (ii) of Proposition 2, for each \({\overline{V}} ^{n}\in [(1-\pi )U_{LF}^{n},U_{LF}^{n})\) the corresponding point on the second-best PF can be achieved through two different allocations. The two allocations are equivalent in the sense that they induce the same utility distribution. Although at both allocations nonusers get the same (Y, B)-bundle and face no distortion on their labor supply (\(T^{\prime }\left( Y_{SB}^{n}\right) =0\) and \(Y_{SB}^{n}=Y_{LF}^{n}\)), one implementing allocation entails a downward distortion on the labor supply of users (\(T^{\prime }\left( Y_{SB}^{u}\right) >0\) and \(Y_{SB}^{u}<Y_{LF}^{u}\)), whereas the other implementing allocation entails an upward distortion on their labor supply (\(T^{\prime }\left( Y_{SB}^{u}\right) <0\) and \(Y_{SB}^{u}>Y_{LF}^{u}\)). Intuitively, the reason why there are two different allocations that allow achieving the same point on the second-best PF is that, with \(q={\overline{q}}\), the magnitude of the distortion on users’ labor supply that is needed to deter mimicking by nonusers is the same regardless of its direction.
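The equivalence of the two implementing allocations can be illustrated numerically. The Python sketch below assumes that nonusers keep their undistorted income and pay a tax \(U_{LF}^{n}-{\overline{V}}^{n}\), that the budget balances, and that the users’ bundle is placed where the \(\lambda \)-constraint binds exactly; the function name and the numerical values are hypothetical.

```python
import math


def two_implementing_bundles(V_bar, w_u, w_n, pi):
    """Both user bundles at which the lambda-constraint binds when q = q_bar."""
    q = (w_u**2 - w_n**2) / w_u           # q set equal to q_bar (pooling case)
    Y_n = w_n**2                          # nonusers' undistorted income
    T_n = w_n**2 / 2 - V_bar              # tax paid by each nonuser
    T_u = -(1 - pi) * T_n / pi            # transfer to each user (budget balance)
    dev = w_n * math.sqrt(2 * T_n / pi)   # income distance that deters mimicking
    results = []
    for Y_u in (Y_n - dev, Y_n + dev):    # downward and upward distortion
        B_u = Y_u - T_u
        U_u = B_u - q * Y_u / w_u - (Y_u / w_u) ** 2 / 2
        mimic = B_u - (Y_u / w_n) ** 2 / 2   # nonuser's utility from this bundle
        results.append((Y_u, B_u, U_u, mimic))
    return results


# Hypothetical parameters with (1 - pi) * U_LF^n <= V_bar < U_LF^n.
down, up = two_implementing_bundles(V_bar=0.45, w_u=2.0, w_n=1.0, pi=0.3)
print("downward distortion:", down)
print("upward distortion:  ", up)
```

Both bundles give users the same utility and leave a mimicking nonuser exactly at \({\overline{V}}^{n}\), because with \(q={\overline{q}}\) the two undistorted incomes coincide and the users’ utility loss is symmetric around that income.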
According to part (iii), for \({\overline{V}}^{n}<(1-\pi )U_{LF}^{n}\), a point on the second-best PF always requires that the labor supply of users is upward distorted (\(T^{\prime }\left( Y_{SB}^{u}\right) <0\)). To understand why this is the case, consider the point on the second-best PF that corresponds to \({\overline{V}}^{n}=(1-\pi )U_{LF}^{n}\). Of the two allocations that allow implementing this point, the allocation entailing a downward distortion on the labor supply of users prescribes offering them the bundle \(\left( Y,B\right) =\left( 0,(1-\pi )U_{LF}^{n}\right) \). At this bundle their labor supply is pushed to its lower bound. Given that incentive-compatibility (the \(\lambda \)-constraint) requires that a reduction in \({\overline{V}}^{n}\) is accompanied by a larger (in absolute value) distortion on users, it follows that once \({\overline{V}}^{n}\) has reached \((1-\pi )U_{LF}^{n}\), a further reduction cannot be accommodated by magnifying the downward distortion on the labor supply of users. Therefore, for \({\overline{V}}^{n}<(1-\pi )U_{LF}^{n}\), the implementing allocation becomes unique and it requires distorting upwards the users’ labor supply.
Finally, Proposition 2 shows that when the two types are pooled at the laissez-faire equilibrium, it is never possible to use a nonlinear income tax to redistribute from users to nonusers, i.e. there is no point on the second-best PF where nonusers get a utility higher than \(U_{LF}^{n}\). An intuition for this result can again be grasped by looking at the indifference curves depicted in Fig. 2. Given that the laissez-faire indifference curve of users lies everywhere below the laissez-faire indifference curve of nonusers (except at \(Y_{LF}^{u}=Y_{LF}^{n}\) where they are tangent), it is impossible to move users onto a lower indifference curve without violating the incentive-compatibility constraint requiring them not to be tempted to mimic nonusers (the \(\phi \)-constraint). Taking into account that, as previously noticed, for \({\overline{V}}^{n}<U_{LF}^{n}\) the labor supply of users is always distorted, it also follows that when the laissez-faire equilibrium features pooling, the first-best and the second-best PF share only one point, i.e. the laissez-faire utility distribution.
Let’s now move to the last case that is left to consider, i.e. the case when \(q>{\overline{q}}\).
Proposition 3
Assume that \(q>{\overline{q}}\), so that \(Y_{LF}^{n}>Y_{LF}^{u}\). Then,

(i) when \(q<{\overline{q}}\frac{\sqrt{2}+\sqrt{\pi }}{2\sqrt{\pi }}- \frac{\left( \sqrt{2}-\sqrt{\pi }\right) \sqrt{\pi }w^{u}}{2}\), the second-best PF is disconnected and the domain of the function \( U_{SB}^{u}\left( {\overline{V}}^{n}\right) \) is given by
$$\begin{aligned} {\overline{V}}^{n}\in [-U_{LF}^{n},\left( 1-\pi \right) U_{LF}^{n}-\delta )\cup \left[ \left( 1-\pi \right) U_{LF}^{n},{\overline{V}} _{\max }^{n}\right] , \end{aligned}$$
where \(\delta >0\) and \({\overline{V}}_{\max }^{n}>U_{LF}^{n}+\frac{\pi }{2} \frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{\left( w^{u}\right) ^{2}}\); when \(q\ge {\overline{q}}\frac{\sqrt{2}+\sqrt{\pi }}{2\sqrt{\pi }}-\frac{\left( \sqrt{2} -\sqrt{\pi }\right) \sqrt{\pi }w^{u}}{2}\), the domain is instead given by
$$\begin{aligned} {\overline{V}}^{n}\in \left[ \left( 1-\pi \right) U_{LF}^{n},{\overline{V}} _{\max }^{n}\right] ; \end{aligned}$$

(ii) for \(U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{ \left( w^{n}\right) ^{2}}\le {\overline{V}}^{n}\le U_{LF}^{n}+\frac{\pi }{2} \frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{\left( w^{u}\right) ^{2}}\), the second-best PF coincides with the first-best PF and any point on the frontier is attained through an allocation where \(T^{\prime }\left( Y_{SB}^{n}\right) =T^{\prime }\left( Y_{SB}^{u}\right) =0\);

(iii) for \(U_{LF}^{n}+\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{ \left( w^{u}\right) ^{2}}<{\overline{V}}^{n}\le {\overline{V}}_{\max }^{n}\), any point on the second-best PF corresponds to an allocation at which \( T^{\prime }\left( Y_{SB}^{u}\right) =0\) and \(T^{\prime }\left( Y_{SB}^{n}\right) <0\);

(iv) for \((1-\pi )U_{LF}^{n}\le {\overline{V}}^{n}<U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{\left( w^{n}\right) ^{2}}\), any point on the second-best PF corresponds to an allocation at which \(T^{\prime }\left( Y_{SB}^{n}\right) =0\) and \(T^{\prime }\left( Y_{SB}^{u}\right) >0\);

(v) when the second-best PF includes a region where \(-U_{LF}^{n}\le {\overline{V}}^{n}<(1-\pi )U_{LF}^{n}\), any point on that region corresponds to an allocation at which \(T^{\prime }\left( Y_{SB}^{n}\right) =0\) and \( T^{\prime }\left( Y_{SB}^{u}\right) <0\).
Proof
See Appendix C. \(\square \)
Qualitatively, some of the results provided in Proposition 3 are standard. For instance, according to part (ii), no distortion is needed to satisfy incentive-compatibility when the amount of intergroup redistribution is sufficiently small (i.e., for values of \({\overline{V}}^{n}\) sufficiently close to \(U_{LF}^{n}\)). Another standard result is represented by part (iii) which states that, for \({\overline{V}}^{n}>U_{LF}^{n}\), if incentive-compatibility considerations require distorting the bundle offered to the transfer-recipients (in this case, nonusers), the direction of the distortion is “coherent” with the income ranking under laissez-faire.
Two results stand out instead as nonstandard and are specifically due to the violation of the SC condition. The first, stated in part (i), highlights the possibility that the second-best PF is disconnected. The second, which is a consequence of parts (iv) and (v), highlights that moving along the portion of the second-best PF where \( {\overline{V}}^{n}<U_{LF}^{n}\), the sign of \(T^{\prime }\left( Y_{SB}^{u}\right) \) may change. In particular, despite the fact that \( Y_{LF}^{n}>Y_{LF}^{u}\), users do not necessarily face a downward distortion on their labor supply at all points on the PF where the \(\lambda \)-constraint is binding, i.e. at all points where the labor supply of users needs to be distorted to prevent mimicking by nonusers.
These two results are closely related due to the fact that the sign of \( T^{\prime }\left( Y_{SB}^{u}\right) \) is not everywhere nonnegative if and only if the domain of the function \(U_{SB}^{u}\left( {\overline{V}}^{n}\right) \) is a disconnected set, which in turn happens when \(q<{\overline{q}}\frac{ \sqrt{2}+\sqrt{\pi }}{2\sqrt{\pi }}-\frac{\left( \sqrt{2}-\sqrt{\pi }\right) \sqrt{\pi }w^{u}}{2}\).
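The threshold on q can be evaluated directly. The Python sketch below also compares the users’ utility at the two ends of the candidate gap. That comparison is one possible reading of where the threshold comes from, offered as an assumption of this sketch rather than as a statement taken from Appendix C: at \({\overline{V}}^{n}=-U_{LF}^{n}\) users receive the upward-distorted bundle at which the \(\lambda \)-constraint binds, while at \({\overline{V}}^{n}=(1-\pi )U_{LF}^{n}\) they receive the bundle \(\left( 0,(1-\pi )U_{LF}^{n}\right) \). Function names and parameter values are hypothetical.

```python
import math


def q_bar(w_u, w_n):
    """Value of q at which the two laissez-faire incomes coincide."""
    return (w_u**2 - w_n**2) / w_u


def q_star(w_u, w_n, pi):
    """Threshold on q below which the second-best PF is disconnected."""
    s2, sp = math.sqrt(2.0), math.sqrt(pi)
    return q_bar(w_u, w_n) * (s2 + sp) / (2 * sp) - (s2 - sp) * sp * w_u / 2


def user_utility_at_lowest_V(w_u, w_n, pi, q):
    """Users' utility when V_bar^n = -U_LF^n (upward-distorted bundle, assumed)."""
    T_n = w_n**2                                   # nonusers' income fully taxed
    Y_u = w_n**2 + w_n * math.sqrt(2 * T_n / pi)   # lambda-constraint binds
    B_u = Y_u + (1 - pi) * T_n / pi                # budget balance
    return B_u - q * Y_u / w_u - (Y_u / w_u) ** 2 / 2


def user_utility_at_gap_edge(w_n, pi):
    """Users' utility at the bundle (0, (1 - pi) * U_LF^n)."""
    return (1 - pi) * w_n**2 / 2


# Hypothetical parameters with pi < 1 - (w_n / w_u)**2.
w_u, w_n, pi = 2.0, 1.6, 0.3
qs = q_star(w_u, w_n, pi)
print(f"q_bar = {q_bar(w_u, w_n):.4f}, threshold = {qs:.4f}")
for q in (qs - 0.01, qs, qs + 0.01):
    diff = user_utility_at_lowest_V(w_u, w_n, pi, q) - user_utility_at_gap_edge(w_n, pi)
    print(f"q = {q:.4f}: utility difference = {diff:+.6f}")
```

Under these assumptions the difference changes sign exactly at the threshold: for smaller q, users can be made better off than at the right edge of the gap by moving to the upward-distorted branch, so a left portion of the frontier exists.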
To understand these results, consider first Fig. 3, which illustrates the qualitative features of the solution to the government’s problem for any given value of \({\overline{V}}^{n}\) such that \({\overline{V}}^{n}\in [(1-\pi )U_{LF}^{n},U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2} }{\left( w^{n}\right) ^{2}})\).
In the figure, the dashed 45\(^\circ \) line represents the laissez-faire budget line (neither taxes nor transfers), and points I and V represent the bundles chosen under laissez-faire by, respectively, non-users and users (\(Y_{LF}^{n}>Y_{LF}^{u}\)). Bundle II represents the undistorted bundle offered to non-users on their indifference curve associated with \(U^{n}={\overline{V}}^{n}\). The blue 45\(^\circ \) line represents the virtual budget line on which, given the revenue extracted from non-users, a bundle for users can be offered.^{Footnote 9} On this virtual budget line, incentive-compatibility considerations (the need to satisfy the \(\lambda \)-constraint) prevent the government from offering users the undistorted bundle labelled VI. To prevent non-users from behaving as mimickers, users can only be offered, on the virtual budget line, either bundles to the left of III or bundles to the right of IV, with both bundle III and bundle IV belonging to the set of admissible bundles. The difference between these two sets of bundles is that, whereas with bundle III, or bundles to the left of it, type separation is achieved by imposing a sufficiently large downward distortion on the users’ labor supply, in the case of bundle IV, or bundles to the right of it, type separation is achieved by imposing a sufficiently large upward distortion on the users’ labor supply.
The black curve passing through bundle III is an indifference curve pertaining to users. The figure shows that, among all the admissible bundles that can be offered to users, bundle III is the one at which their utility is maximized. In particular, notice that the utility of users is strictly higher at bundle III than at bundle IV. The intuition is that, even though the \(\lambda \)-constraint can be satisfied by imposing either a sufficiently large downward or a sufficiently large upward distortion on the labor supply of users, the size of the required distortion is smaller when type separation is obtained by distorting downwards the users’ labor supply (\(T^{\prime }\left( Y_{SB}^{u}\right) >0\)). This allows achieving type separation at a lower efficiency cost.
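The comparison between bundles III and IV can be reproduced with a short computation. This is a hedged sketch, assuming \(U=c-h^{2}/2\) and linear work-related costs \(qh\), so that the non-users' indifference curve is \(B={\overline{V}}^{n}+Y^{2}/(2(w^{n})^{2})\) and the virtual budget line is \(B=Y+\left( U_{LF}^{n}-{\overline{V}}^{n}\right) (1-\pi )/\pi \); the function name, the dictionary layout, and the parameter values are hypothetical. The implicit marginal tax rate is computed as one minus the users' marginal rate of substitution.

```python
import math

def bundles_on_virtual_budget_line(V_bar, w_n, w_u, q, pi):
    """Intersections (III and IV) of the virtual budget line with the
    non-users' indifference curve at utility V_bar."""
    Y_n = w_n**2                               # Y_LF^n, with U_LF^n = Y_n / 2
    T = (Y_n / 2 - V_bar) * (1 - pi) / pi      # intercept of the budget line
    root = math.sqrt(Y_n**2 - 2 * Y_n * (V_bar - T))

    def users_utility(Y):
        # consumption net of work-related costs minus disutility of labor
        return (Y + T) - q * Y / w_u - (Y / w_u) ** 2 / 2

    result = {}
    for label, Y in (("III", Y_n - root), ("IV", Y_n + root)):
        mrs_u = (q + Y / w_u) / w_u
        result[label] = {"Y": Y, "U_u": users_utility(Y), "T_prime": 1 - mrs_u}
    return result

# Hypothetical parameters with Y_LF^u = 0.8 < Y_LF^n = 1 and V_bar in the
# interval considered for Fig. 3.
res = bundles_on_virtual_budget_line(V_bar=0.4, w_n=1.0, w_u=2.0, q=1.6, pi=0.5)
for label, values in res.items():
    print(label, values)
```

In this example bundle III carries a positive marginal tax rate and bundle IV a negative one, and users are better off at III. The two bundles are symmetric around \(Y_{LF}^{n}\), so with \(Y_{LF}^{u}<Y_{LF}^{n}\) bundle III is always the one closer to the users' undistorted income.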
Consider now Fig. 4, which illustrates the solution to the government’s problem for the case when \({\overline{V}}^{n}\) is lowered to \((1-\pi )U_{LF}^{n}\).
In Fig. 4, the dashed 45\(^\circ \) line represents the laissez-faire budget line, and the point labelled I on this line represents the bundle selected by non-users under laissez-faire. Bundle II represents the undistorted bundle offered to non-users lying on the indifference curve where \({\overline{V}}^{n}=\left( 1-\pi \right) U_{LF}^{n}\). The blue 45\(^\circ \) line represents the virtual budget line on which a bundle for users can be offered given the revenue extracted from non-users. Incentive compatibility requires that, on the blue virtual budget line, users can only be offered either bundle III or bundles to the right of IV, with bundle IV belonging to the set of admissible bundles. The black curve passing through bundle III is an indifference curve pertaining to users and it shows that bundle III is strictly preferred by users to bundle IV. Comparing bundle III in Fig. 4 with the corresponding bundle in Fig. 3, we can also see that the size of the downward distortion on the users’ labor supply is larger in Fig. 4.^{Footnote 10} The important thing to notice, however, is that at bundle III the users’ labor supply has been pushed to its lower bound (\(Y^{u}=0\)).^{Footnote 11}
In a standard model where the SC condition holds, the utility achieved by users at bundle III in Fig. 4 would represent their maximal utility along the second-best PF. The reason is straightforward. Suppose that single-crossing were satisfied and that at all bundles in the (Y, B)-space users had steeper indifference curves. Then, the users’ indifference curve represented in Fig. 4 would lie everywhere above the indifference curve of non-users, except at bundle III. But this would necessarily imply that, if non-users were to be offered a bundle on a lower indifference curve (to increase the tax revenue collected from them), any (Y, B)-bundle that makes users better off (compared to bundle III in Fig. 4) would violate incentive-compatibility since it would induce non-users to behave as mimickers.
With SC being violated, instead, things are different. In Fig. 4 all the bundles that are included in the gray area represent bundles that would at the same time: (i) make users better off (compared to the utility that they achieve at bundle III), and (ii) be incentive-compatible in the sense that they would not induce non-users to reject bundle II. Even though the bundles in the gray area cannot be offered to users since they violate the public-budget constraint (when \({\overline{V}}^{n}=\left( 1-\pi \right) U_{LF}^{n}\) and non-users are offered the bundle II), users might be offered a bundle in the gray area if more revenue were collected from non-users, so that the blue virtual budget line could be shifted up. However, since collecting more revenue from non-users implies moving them on a lower indifference curve, and since this implies that the set of bundles in the gray area shrinks, the violation of SC is in general not sufficient to guarantee that the utility of users can be raised above the utility reached at bundle III. What is required is that the simultaneous upward shift in the virtual budget line, and downward shift in the indifference curve of non-users, push their point of intersection (currently at point IV in Fig. 4) inside the gray area. This is more likely to happen the smaller is \(\pi \) and the smaller the difference \(Y_{LF}^{n}-Y_{LF}^{u}\left( >0\right) \).^{Footnote 12}
Notice also that at any bundle inside the gray area the labor supply of users is upward distorted (\(T^{\prime }\left( Y_{SB}^{u}\right) <0\)). Thus, if it is indeed possible, by lowering \({\overline{V}}^{n}\) below \(\left( 1-\pi \right) U_{LF}^{n}\), to raise \(U^{u}\) above the level that it achieves at bundle III, users will need to be assigned a bundle at which their labor supply is upward distorted. Moreover, since the users’ utility is strictly higher at bundle III than at bundle IV, raising \(U^{u}\) above the value achieved at bundle III would necessarily require a discrete downward jump in \({\overline{V}}^{n}\). This is illustrated in Fig. 5 below, which shows the second-best PF with the property that the domain of the function \(U_{SB}^{u}\left( {\overline{V}}^{n}\right) \) is disconnected.
Finally, notice that when the second-best PF looks as in Fig. 5, the earned-income ranking that corresponds to the various points on the frontier is not always consistent with the income ranking under laissez-faire. Along the region where \({\overline{V}}^{n}<U_{LF}^{n}\), one moves from a portion of the second-best PF that coincides with the first-best frontier (the green part with slope \(-\left( 1-\pi \right) /\pi \)), to a portion where \(T^{\prime }\left( Y_{SB}^{u}\right) >0\) (the red part of the curve in Fig. 5), and finally to a portion where \(T^{\prime }\left( Y_{SB}^{u}\right) <0\) (the blue part of the curve in Fig. 5). When entering this last portion, the earned-income ranking is no longer consistent with the one under laissez-faire since we have \(Y_{LF}^{n}>Y_{LF}^{u}\) but \(Y_{SB}^{n}<Y_{SB}^{u}\).
Both the possibility that the second-best PF is disconnected and the possibility of income re-ranking follow from the circumstance that in our setting the SC condition is violated.^{Footnote 13} Similarly, it is because of the violation of the SC condition that, when redistribution favors users, it might be optimal to let them face a negative marginal tax rate even in cases when they earn less than non-users under laissez-faire. This shows that the violation of SC can provide a novel rationale for negative marginal tax rates.^{Footnote 14}
Subsidizing work-related expenses
In our analysis we have so far maintained the assumption that the only policy instrument is a nonlinear income tax. In this setting we have highlighted the consequences descending from the violation of the SC condition. Most governments, however, allow special tax treatments for work-related expenses.^{Footnote 15} To consider this possibility, and given that a “special” tax treatment usually implies a more lenient one, we will now investigate how our results are affected when job-related expenses are subsidized at a flat rate \(s>0\) that is optimally chosen.^{Footnote 16} Moreover, since a subsidy on job-related expenses is only valuable to users, we will confine our attention to the portion of the PF where \({\overline{V}}^{n}<U_{LF}^{n}\), i.e. to the portion of the PF where redistribution goes from non-users to users.
The first thing to notice is that the subsidy has a flattening effect on the indifference curves for users in the \(\left( Y,B\right) \)-space. For a given (positive) value of s and a given bundle in the (Y, B)-space, we have that \(MRS_{YB}^{u}=\left( \left( 1-s\right) q+\frac{Y}{w^{u}}\right) /w^{u}\). Thus, the threshold value for Y, separating the bundles where \(MRS_{YB}^{u}>MRS_{YB}^{n}\) from those where \(MRS_{YB}^{u}<MRS_{YB}^{n}\), lowers from \(\Omega \), as defined in (4), to \(\left( 1-s\right) \Omega \).
Hence, the SC property is restored if \(s\ge 1\).^{Footnote 17}
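The effect of the subsidy on the crossing point of the two families of indifference curves can be illustrated as follows. This is a minimal sketch, assuming \(MRS_{YB}^{n}=Y/(w^{n})^{2}\) (which follows from \(U=c-h^{2}/2\)) and \(w^{u}>w^{n}\); the closed form used for the threshold is obtained by equating the two marginal rates of substitution and is our own rearrangement, and all function names and parameter values are hypothetical.

```python
def mrs_nonusers(Y, w_n):
    """Slope of the non-users' indifference curve in the (Y, B)-space."""
    return Y / w_n**2

def mrs_users(Y, w_u, q, s=0.0):
    """Slope of the users' indifference curve with a subsidy at rate s."""
    return ((1 - s) * q + Y / w_u) / w_u

def crossing_threshold(w_n, w_u, q, s=0.0):
    """Income level at which the two slopes coincide (requires w_u > w_n)."""
    return (1 - s) * q * w_u * w_n**2 / (w_u**2 - w_n**2)

# Hypothetical parameters.
w_n, w_u, q = 1.0, 2.0, 1.6
for s in (0.0, 0.25, 1.0):
    Y_cross = crossing_threshold(w_n, w_u, q, s)
    print(s, Y_cross, mrs_users(Y_cross, w_u, q, s), mrs_nonusers(Y_cross, w_n))
```

The threshold scales linearly with \(1-s\) and reaches zero at \(s=1\), so for \(s\ge 1\) the users' indifference curves are flatter than those of non-users at every positive income level.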
Most importantly, notice that in our setting a subsidy on job-related expenses represents a very effective instrument to redistribute towards users. This is because non-users derive no benefit from the subsidy. Therefore, channeling at least part of the resources transferred to users through a subsidy on job-related expenses makes it less attractive for non-users to behave as mimickers. One can then expect that, by supplementing an optimal nonlinear income tax with an optimally chosen s, the first-best PF and the second-best PF will coincide over a larger set of values for \({\overline{V}}^{n}\). In particular, since we know from the analysis in Sect. 3 that an optimal nonlinear income tax is sufficient to implement a first-best optimum (i.e., a point on the first-best PF) when \({\overline{V}}^{n}\in [U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{Y_{LF}^{n}},U_{LF}^{n})\), one can expect that using s as an additional policy tool allows implementing a first-best optimum also for a range of values for \({\overline{V}}^{n}\) that are strictly lower than \(U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{Y_{LF}^{n}}\). As shown in Proposition 4 below, which looks at the solution to the government’s problem for values of \({\overline{V}}^{n}\) such that \(-U_{LF}^{n}\le {\overline{V}}^{n}<U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{Y_{LF}^{n}}\), this intuition is indeed correct.^{Footnote 18}\(^{\text {,}}\)^{Footnote 19}
Proposition 4
Assume that \(-U_{LF}^{n}\le {\overline{V}}^{n}<U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{Y_{LF}^{n}}\) and that the government is optimizing a nonlinear income tax and a proportional subsidy on work-related expenses. Moreover, let \({\widehat{V}}\equiv U_{LF}^{n}-\frac{\pi }{2}\frac{2w^{u}-q}{w^{u}}\left( Y_{LF}^{n}-Y_{LF}^{u}\right) \) be a threshold value for \({\overline{V}}^{n}\).

(i)
Suppose that \(q\le {\overline{q}}\) (i.e., \(Y_{LF}^{u}\ge Y_{LF}^{n}\)); then, the second-best PF will coincide with the first-best PF.

(ii)
Suppose that \(q>{\overline{q}}\) (i.e., \(Y_{LF}^{u}<Y_{LF}^{n}\)). For \({\overline{V}}^{n}\ge {\widehat{V}}\), the second-best PF coincides with the first-best PF. For \({\overline{V}}^{n}<{\widehat{V}}\), instead, both self-selection constraints will be binding and any point on the second-best PF corresponds to an allocation at which both types of agents face a distortion on their labor supply.
Proof
See Appendix D. \(\square \)
According to Proposition 4, there is a crucial difference between cases where \(q\le {\overline{q}}\) and cases where \(q>{\overline{q}}\). In the first scenario, using s as an additional policy instrument always allows implementing a first-best optimum. Instead, when \(q>{\overline{q}}\), a first-best optimum can only be implemented as long as the utility of non-users does not fall below a given threshold value \({\widehat{V}}\). Below, we discuss in two separate subsections the results provided by Proposition 4.
Part (i)
Consider an initial equilibrium where an optimal nonlinear income tax is used in isolation (\(s=0\)) and users are offered a distorted bundle to prevent mimicking by non-users. The transfer received by each user is equal to \(B^{u}-Y^{u}\) at the initial equilibrium. Introducing a small subsidy on job-related expenses (\(ds>0\)), while at the same time adjusting \(B^{u}\) downwards by \(dB^{u}=-\left( qY^{u}/w^{u}\right) ds\), would leave unchanged the net transfer received by each user.^{Footnote 20} Such a reform, however, would make mimicking less attractive for non-users.^{Footnote 21} Therefore, by relaxing the incentive-compatibility constraint for non-users, the reform would pave the way for the possibility to offer users a bundle where their labor supply is less distorted and their utility is higher. When \(Y_{LF}^{u}\ge Y_{LF}^{n}\), one can replicate the kind of reform described above (which hinges on raising s, lowering \(B^{u}\) and moving \(Y^{u}\) closer to its undistorted level) until a first-best optimum is achieved where no agent’s labor supply is distorted. This is because one can set s with the sole purpose of deterring mimicking by non-users, safely disregarding the other self-selection constraint, i.e. the one requiring users not to behave as mimickers. The intuition is provided in Fig. 6 below.
In Fig. 6, the bundle labelled I represents the undistorted bundle offered to non-users and lying on the red indifference curve where \(U^{n}={\overline{V}}^{n}<U_{LF}^{n}\). The blue 45\(^\circ \) line represents the virtual budget line on which a bundle for users can be offered, given the revenue extracted from non-users, when a nonlinear income tax is used in isolation (\(s=0\)). Incentive compatibility prevents the government from offering users the (first-best) undistorted bundle labelled IV. Instead, users will be offered the incentive-compatible bundle labelled III. Keeping \({\overline{V}}^{n}\) fixed and supplementing a nonlinear income tax with a subsidy on job-related expenses implies that users can be offered a bundle on a virtual budget line that is flatter than the one prevailing when \(s=0\). In particular, while its intercept does not change,^{Footnote 22} its slope drops from 1 to \(1-sq/w^{u}\). The dashed blue line represents the virtual budget line generated by supplementing a nonlinear income tax with a subsidy which is just large enough to allow the government to offer an undistorted bundle to users (bundle labelled II) without inducing mimicking by non-users. Notice that the vertical distance between bundle IV and bundle II is equal to \(sqY_{LF}^{u}/w^{u}\). Taking into account that, at bundle IV, the subsidy was set equal to zero, whereas at bundle II users save an amount \(sqY_{LF}^{u}/w^{u}\) on job-related expenses, users get the same net consumption at both bundles, and therefore enjoy the same utility (since labor supply is the same). It is also obvious from the figure that users, whose indifference curve is depicted in black, have no incentive to behave as mimickers since they strictly prefer bundle II to bundle I. The reason is easy to grasp. At bundle II their indifference curve is tangent to the virtual budget line generated by supplementing the income tax with a subsidy on job-related expenses.
Thus, along the black indifference curve, at all bundles to the left of II we have that \(MRS_{YB}^{u}<1-sq/w^{u}<1\). Instead, along the red indifference curve (for non-users), at all bundles between I and II we have that \(MRS_{YB}^{n}>1\). Therefore, the fact that the two indifference curves cross at bundle II necessarily implies that bundle II is strictly preferred by users to bundle I.
Part (ii)
Things are instead different when \(Y_{LF}^{u}<Y_{LF}^{n}\). In this case, setting s large enough to deter mimicking by non-users might imply that users have an incentive to mimic non-users. The intuition why this other self-selection constraint cannot always be disregarded is provided in Fig. 7 below.
In Fig. 7 the bundle labelled I represents the undistorted bundle offered to non-users. The blue 45\(^\circ \) line represents the virtual budget line on which a bundle for users can be offered, given the revenue extracted from non-users, when a nonlinear income tax is used in isolation. Incentive compatibility prevents the government from offering users the undistorted bundle labelled IV; instead users will be offered the incentive-compatible bundle labelled III. The dashed blue line is the virtual budget line generated by supplementing a nonlinear income tax with a subsidy which is just large enough to allow the government to offer an undistorted bundle to users (bundle labelled II) without inducing mimicking by non-users. As was the case in Fig. 6, users get the same net consumption at both bundle IV (without the subsidy) and bundle II (with the subsidy), and therefore enjoy the same utility at both bundles. The figure shows that users, whose indifference curve is depicted in black, are indifferent between choosing the bundle II, intended for them by the government, and choosing the bundle intended for non-users.^{Footnote 23}
The case represented in Fig. 7 shows a situation where both self-selection constraints are binding but the government is still able to implement a first-best optimum.^{Footnote 24} This happens when \(Y_{LF}^{u}<Y_{LF}^{n}\) and \({\overline{V}}^{n}={\widehat{V}}\). Further lowering \({\overline{V}}^{n}\) would no longer allow the government to implement a first-best optimum. A higher subsidy would be needed to still offer an undistorted bundle to users without inducing non-users to mimic. But a higher subsidy would induce users to mimic non-users. Thus, lowering \({\overline{V}}^{n}\) below \({\widehat{V}}\) will induce the government to raise s, but not as much as would be needed to offer users an undistorted bundle. The optimal s will then represent a trade-off between the desirable effects in terms of deterring mimicking by non-users and the undesirable effects of making it more tempting for users to mimic non-users. At the resulting second-best optimum both self-selection constraints are binding and both types face a distortion on their labor supply.^{Footnote 25}
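The role of the threshold \({\widehat{V}}\) can be verified with a small computation. The sketch below assumes \(U=c-h^{2}/2\), linear work-related costs \(qh\), and a public budget in which the subsidy paid to users is financed by the revenue collected from non-users; the function names, the parameter values, and the expressions for the bundles are our own rearrangement and are not taken from Appendix D. For a given \({\overline{V}}^{n}\), it computes the smallest subsidy that deters non-users from mimicking when users are offered their undistorted bundle, and then reports the slack in the users' self-selection constraint.

```python
def v_hat(w_n, w_u, q, pi):
    """Threshold value of V_bar^n from Proposition 4."""
    Y_n, Y_u = w_n**2, w_u * (w_u - q)
    return Y_n / 2 - (pi / 2) * ((2 * w_u - q) / w_u) * (Y_n - Y_u)

def first_best_check(V_bar, w_n, w_u, q, pi):
    """Return (s, slack): the minimal subsidy deterring non-users from
    mimicking, and the users' utility from their own undistorted bundle
    minus their utility from mimicking non-users."""
    Y_n, Y_u = w_n**2, w_u * (w_u - q)      # undistorted incomes
    h_u = w_u - q                           # undistorted hours of users
    T = (Y_n / 2 - V_bar) * (1 - pi) / pi   # resources available per user
    # Users' bundle: B_u = Y_u * (1 - s*q/w_u) + T; non-users mimicking it
    # obtain B_u - Y_u**2 / (2*Y_n), which must not exceed V_bar.
    s = max((Y_u - Y_u**2 / (2 * Y_n) + T - V_bar) / (q * h_u), 0.0)
    B_n = V_bar + Y_n / 2                   # non-users' undistorted bundle
    u_own = T + h_u**2 / 2
    u_mimic = B_n - (1 - s) * q * Y_n / w_u - (Y_n / w_u) ** 2 / 2
    return s, u_own - u_mimic

# Hypothetical parameters with Y_LF^u = 0.8 < Y_LF^n = 1.
w_n, w_u, q, pi = 1.0, 2.0, 1.6, 0.5
threshold = v_hat(w_n, w_u, q, pi)
for V_bar in (threshold + 0.02, threshold, threshold - 0.02):
    print(V_bar, first_best_check(V_bar, w_n, w_u, q, pi))
```

In this example the slack is positive above \({\widehat{V}}\), zero at \({\widehat{V}}\), and negative below it, which mirrors the discussion of Fig. 7.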
For \({\overline{V}}^{n}\) lower than but sufficiently close to \({\widehat{V}}\), the second-best optimum will be a separating equilibrium where each group is offered a distinct (Y, B)-bundle and the labor supply of both types is downward distorted (\(Y_{SB}^{u}<Y_{LF}^{u}\), \(Y_{SB}^{n}<Y_{LF}^{n}\) and \(Y_{SB}^{u}<Y_{SB}^{n}\)). As one keeps lowering \({\overline{V}}^{n}\), the distortions needed to implement a separating equilibrium become larger and larger, and one finally reaches a value for \({\overline{V}}^{n}\) below which it is no longer possible to further increase the users’ utility.
However, notice that when s is an additional policy instrument, the redistributive goals of the government do not necessarily require the implementation of a separating equilibrium, i.e., an equilibrium where each group is offered a distinct (Y, B)-bundle. Given that only users benefit from the subsidy s, redistribution can also be achieved by implementing a pooling equilibrium where both groups are offered the same (Y, B)-bundle (but have, nonetheless, different consumption). In particular, at a pooling equilibrium the government would solve the following optimization problem:
subject to
Substitute \(B={\overline{V}}^{n}+\left( Y/w^{n}\right) ^{2}/2\) and \(sqY/w^{u}=\left( Y-B\right) /\pi \), from constraints (8) and (9), respectively, into the objective function. The constrained optimization problem above can then be rewritten in an unconstrained way as
From the first order condition of the problem above, denoting by \(Y^{p}\) the optimal value of Y, one gets:
Moreover, when \(w^{u}\left( w^{u}-q\right) <\left( w^{n}\right) ^{2}\) (i.e., \(Y_{LF}^{u}<Y_{LF}^{n}\)), it is straightforward to show that
From (12) we can conclude that, at a pooling equilibrium, the labor supply of users is upward distorted and the labor supply of non-users is downward distorted. Moreover, from (11) we can also see that, since \(Y^{p}\) does not depend on \({\overline{V}}^{n}\), the magnitude of these distortions does not depend on the specific value of \({\overline{V}}^{n}\). Substituting (11) into the objective function of the unconstrained problem above we get that, at a pooling equilibrium, the users’ utility is given by
which implies that \(\partial U^{u}/\partial {\overline{V}}^{n}=-\left( 1-\pi \right) /\pi \), i.e. the same slope that characterizes the first-best PF.
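The properties of the pooling equilibrium can be illustrated numerically. The sketch below is a reconstruction under stated assumptions: it takes the two substitutions given in the text, \(B={\overline{V}}^{n}+\left( Y/w^{n}\right) ^{2}/2\) and \(sqY/w^{u}=\left( Y-B\right) /\pi \), inserts them into the users' utility \(B-(1-s)qY/w^{u}-\left( Y/w^{u}\right) ^{2}/2\), and solves the resulting first-order condition. The closed form in `pooling_income`, the function names, and the parameter values are ours and should be read as an illustration rather than as the paper's numbered equations.

```python
def users_utility_pooling(Y, V_bar, w_n, w_u, q, pi):
    """Users' utility at a pooling bundle with income Y, after substituting
    the two constraints into the objective."""
    B = V_bar + (Y / w_n) ** 2 / 2          # keeps non-users at utility V_bar
    subsidy_received = (Y - B) / pi         # equals s*q*Y/w_u by the budget
    return B - q * Y / w_u + subsidy_received - (Y / w_u) ** 2 / 2

def pooling_income(w_n, w_u, q, pi):
    """Income level solving the first-order condition (assumed closed form)."""
    return w_n**2 * w_u * (w_u - pi * q) / ((1 - pi) * w_u**2 + pi * w_n**2)

# Hypothetical parameters with Y_LF^u = 0.8 < Y_LF^n = 1.
w_n, w_u, q, pi = 1.0, 2.0, 1.6, 0.5
Y_p = pooling_income(w_n, w_u, q, pi)
print(Y_p)                                   # lies between Y_LF^u and Y_LF^n

# Finite-difference slope of the users' utility with respect to V_bar.
u_low = users_utility_pooling(Y_p, 0.30, w_n, w_u, q, pi)
u_high = users_utility_pooling(Y_p, 0.40, w_n, w_u, q, pi)
print((u_high - u_low) / 0.10, -(1 - pi) / pi)
```

Because the maximizing income level does not involve \({\overline{V}}^{n}\), the users' utility is linear in \({\overline{V}}^{n}\) along the pooling branch, and the finite-difference slope matches \(-\left( 1-\pi \right) /\pi \).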
Clearly, for \({\widehat{V}}\le {\overline{V}}^{n}<U_{LF}^{n}\), a pooling equilibrium will never be chosen by the government. The reason is that, for \({\widehat{V}}\le {\overline{V}}^{n}<U_{LF}^{n}\), the government can implement a separating equilibrium which allows attaining a point on the first-best PF. Under a pooling equilibrium, instead, it is never possible to reach a point on the first-best PF (given that the labor supply of both groups of agents is distorted). For values of \({\overline{V}}^{n}\) that are smaller than but sufficiently close to \({\widehat{V}}\), a separating equilibrium will again dominate a pooling equilibrium; even though both equilibria entail a distortion on the labor supply of both groups and a point on the first-best PF can no longer be attained, the distortions are less severe under a separating equilibrium. However, for sufficiently low values of \({\overline{V}}^{n}\), a pooling equilibrium will dominate a separating equilibrium. The reason is that the distortions needed to implement a separating equilibrium become larger and larger as one keeps lowering \({\overline{V}}^{n}\); under a pooling equilibrium, instead, the magnitude of the distortions does not depend on the specific value of \({\overline{V}}^{n}\). The possibility of both types of second-best equilibria (separating and pooling), depending on the chosen value for \({\overline{V}}^{n}\), is illustrated by means of a numerical example in Appendix F.^{Footnote 26} The example also illustrates the fact that the second-best PF can be disconnected even when the nonlinear income tax is supplemented by an optimal subsidy on job-related expenses.
Pareto efficient taxation when job-related expenses are a nonlinear function of hours of work
In Sect. 3 we have emphasized three main anomalies descending from the violation of SC: (i) an anonymous nonlinear income tax may allow the government to convert a pooling laissez-faire equilibrium into a separating equilibrium; (ii) the second-best PF may be disconnected; (iii) a second-best optimum may not preserve the income ranking prevailing under laissez-faire.
As we show in a background version of this paper (see Bastani et al. 2019), similar qualitative results generalize, with some nuances, to a setting where the function \(\varphi \left( h\right) \) (describing the work-related monetary costs) is convex or concave. However, when \(\varphi \left( h\right) \) is concave, one additional anomaly may arise. In particular, when redistribution goes from non-users to users, it is possible that a second-best optimum entails a distortion on the labor supply of users even when no self-selection constraint is (locally) binding in equilibrium. The reason is that, when \(\varphi \left( h\right) \) is sufficiently concave, it is no longer the case that \(MRS_{YB}^{u}\) is monotonically increasing in Y.^{Footnote 27} To see this, notice that, for individual preferences given by \(U=c-h^{2}/2\) and a general nonlinear function \(\varphi \left( h\right) \), \(MRS_{YB}^{u}\) is given by \(MRS_{YB}^{u}=\left[ \varphi ^{\prime }\left( Y/w^{u}\right) +Y/w^{u}\right] /w^{u}\). Assume that \(\varphi \left( h\right) \) is an increasing and concave function which also satisfies the conditions \(\varphi ^{\prime }\left( 0\right) >w^{u}\), \(\varphi ^{\prime \prime }\left( 0\right) <-1\), and \(\varphi ^{\prime \prime \prime }\left( h\right) >0\). Then, while the value of \(MRS_{YB}^{u}\) is always positive for \(Y\ge 0\), it is larger than 1 and decreasing in Y for sufficiently small values of Y. The fact that \(MRS_{YB}^{u}>1\) for sufficiently low values of Y implies that, when incentive-compatibility considerations require that \(Y^{u}\) must be very small (to prevent mimicking by non-users), it may be optimal to offer users a bundle where \(Y^{u}=0\) even though it would be incentive-compatible to let them increase to some extent their labor supply (and enjoy a slightly larger value of consumption). This possibility is illustrated in Fig. 8 below and a numerical example is provided in Appendix F.
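The non-monotonicity of \(MRS_{YB}^{u}\) can be seen with a simple parametric example. The cost function used below, \(\varphi \left( h\right) =a\left( 1-e^{-bh}\right) \), is a hypothetical choice made only for illustration (it is not the specification used in Appendix F); with \(a=1.5\), \(b=2\) and \(w^{u}=2\) it is increasing and concave and satisfies \(\varphi ^{\prime }\left( 0\right) =ab>w^{u}\), \(\varphi ^{\prime \prime }\left( 0\right) =-ab^{2}<-1\) and \(\varphi ^{\prime \prime \prime }\left( h\right) >0\).

```python
import math

def mrs_users_concave(Y, w_u, a, b):
    """MRS of users in the (Y, B)-space when phi(h) = a * (1 - exp(-b*h)),
    a hypothetical concave cost function, and U = c - h**2 / 2."""
    h = Y / w_u
    phi_prime = a * b * math.exp(-b * h)    # marginal work-related cost
    return (phi_prime + h) / w_u

# Hypothetical parameters satisfying the three conditions in the text.
w_u, a, b = 2.0, 1.5, 2.0
Y_min = w_u * math.log(a * b**2) / b        # income at which the MRS is lowest
for Y in (0.0, 0.2, Y_min, 6.0):
    print(round(Y, 3), round(mrs_users_concave(Y, w_u, a, b), 3))
```

In this example the marginal rate of substitution exceeds 1 at \(Y=0\), falls below 1 at intermediate income levels, and rises again for large Y, so the users' indifference curves are first steeper and then flatter than the 45\(^\circ \) line.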
In Fig. 8, the point I represents the bundle selected by non-users under laissez-faire. Bundle II represents the undistorted bundle offered to non-users lying on the indifference curve where \(U^{n}={\overline{V}}^{n}<U_{LF}^{n}\). The blue 45\(^\circ \) line represents the virtual budget line on which a bundle for users can be offered given the revenue extracted from non-users. Incentive compatibility requires that users can only be offered bundles to the left of bundle V and to the right of bundle VI, with both V and VI belonging to the set of admissible bundles. The three black curves passing through bundles V, IV and III are three different indifference curves pertaining to users.
From the figure, one can see that bundle IV is strictly preferred by users to both bundle V and bundle VI. But if users are offered the bundle IV, the self-selection constraint requiring non-users not to mimic users is slack. Notice also that users would be better off if they could get bundle III on the blue virtual budget line, i.e. the bundle at which their labor supply is undistorted. However, offering them this bundle would induce mimicking by non-users. Therefore, at a second-best optimum users are offered bundle IV and non-users are offered bundle II; the labor supply of users is downward distorted even though no self-selection constraint is binding at the second-best optimum.^{Footnote 28}\(^{\text {,}}\)^{Footnote 29}
Concluding remarks
In this paper, we have considered a two-type optimal nonlinear income tax model where agents differ both in terms of market ability and in terms of “needs” for a work-related good/service, i.e. a good/service that some agents need to purchase in order to work. Because of this bidimensional heterogeneity, the single-crossing condition fails to hold. Ruling out public observability of individual types, we have characterized the properties of a second-best optimum by looking at the entire second-best Pareto frontier.
We have highlighted that, due to the violation of single-crossing, some non-standard results arise. First of all, a second-best optimum might not preserve the earned-income ranking that prevails under laissez-faire. Second, redistribution via income taxation might be feasible even when the laissez-faire equilibrium is a pooling equilibrium. Third, a second-best optimum might not be unique, in the sense that there might be more than one set of allocations in the (pre-tax income, after-tax income)-space that solve the government’s maximization problem. Fourth, the second-best Pareto frontier may be disconnected. Fifth, supplementing an optimal nonlinear income tax with an optimal subsidy on work-related expenses may imply that redistribution is achieved through a separating or pooling equilibrium where both self-selection constraints are binding. Sixth, we have shown that the labor supply of some agents might be distorted even though no self-selection constraint is (locally) binding in equilibrium.
Before concluding, a final remark is in order. For tractability reasons, we have focused our analysis on a simplified two-type model where skills and needs are perfectly correlated. However, insofar as our non-standard results hinge on the violation of the single-crossing condition, they generalize, with some nuances, to settings with a larger number of types and imperfect correlation between skills and needs.
Notes
 1.
 2.
A similar exercise has been undertaken by Bierbrauer and Boyer (2014) for a two-type optimal nonlinear income tax model where individuals have linear effort costs and the SC condition holds.
 3.
Several interpretations are possible. One example is child care services which are needed by parents of young kids in order to work. Other groups who might face needs constraints include workers with relatives who require elderly care, or workers who incur commuting costs or work-related health costs.
 4.
The specific isoelastic form of the utility function is here mainly adopted for analytical convenience.
 5.
In a background version of the paper we also consider the case where \(\pi \ge 1-\left( w^{n}\right) ^{2}/\left( w^{u}\right) ^{2}\). See Bastani et al. (2019).
 6.
One can think that individual consumption cannot fall below a subsistence level \({\overline{c}}\). From this perspective, assuming that \({\overline{c}}=0\) is simply a matter of normalization.
 7.
A similar reasoning can be adopted to show that the slope of the first-best PF is equal to \(-(1-\pi )/\pi \) for values of \(U^{n}>U_{LF}^{n}\) and such that \(\left( w^{n}\right) ^{2}/2<U^{n}\le \left[ \left( w^{u}-q\right) ^{2}\pi /\left( 1-\pi \right) \right] +\left( w^{n}\right) ^{2}/2\). When \(U^{n}=\left[ \left( w^{u}-q\right) ^{2}\pi /\left( 1-\pi \right) \right] +\left( w^{n}\right) ^{2}/2\), all the resources available for consumption by users under laissez-faire have been transferred to non-users. Since consumption for users has then reached its lower bound, a further increase in the utility of non-users can only be obtained by requiring users to increase their labor supply, while keeping their consumption at zero, so that additional resources can be transferred to non-users. However, since the required increase in \(h^{u}\) entails a distortion on the labor supply of users, redistribution becomes costlier and the slope of the PF becomes \(dU^{u}/dU^{n}=-\left( 1-\pi \right) h^{u}/\left[ \pi \left( w^{u}-q\right) \right] \), which is lower than \(-(1-\pi )/\pi \) when \(h^{u}\) exceeds \(w^{u}-q\), i.e. its laissez-faire value.
 8.
The non-negativity constraint on consumption could be safely disregarded if the marginal utility of consumption goes to infinity as consumption approaches zero.
 9.
The value of the intercept of the blue 45\(^\circ \) line is given by \(\left( U_{LF}^{n}-{\overline{V}}^{n}\right) \left( 1-\pi \right) /\pi \). Thus, the intercept is higher the smaller is \({\overline{V}}^{n}\) (i.e., the larger is the tax collected from each non-user) and the smaller is \(\pi \) (i.e., the smaller is the fraction of transfer-recipients).
 10.
This is easily understood by looking at Fig. 3 and thinking about how bundle III would be affected by a downward shift in the indifference curves of non-users. Since such a downward shift would also entail an upward shift in the intercept of the 45\(^\circ \) virtual budget line (as more revenue is collected from non-users), the new bundle III would be necessarily associated with a lower value of the users’ labor supply.
 11.
One can also notice that when \({\overline{V}}^{n}=\left( 1-\pi \right) U_{LF}^{n}\) and users are offered bundle III, utilities are equalized: \(U^{u}={\overline{V}}^{n}=(1-\pi )U_{LF}^{n}\).
 12.
Regarding the effect of \(\pi \), the reason is that a smaller \(\pi \) implies that a given upward shift in the blue virtual budget line can be accommodated by a smaller downward shift in the indifference curves of non-users. Regarding the effect of \(Y_{LF}^{n}-Y_{LF}^{u}\), assume that, for given \(w^{n}\), either \(w^{u}\) increases or q decreases (while still satisfying the inequality \(\left( w^{n}\right) ^{2}>w^{u}\left( w^{u}-q\right) \) so that \(Y_{LF}^{n}-Y_{LF}^{u}>0\)). This would produce a flattening effect on the indifference curve for users that is displayed in Fig. 4, which would in turn imply that its second intersection with the indifference curve of non-users would occur at a lower value of Y. A smaller upward shift in the blue virtual budget line would then be needed to move bundle IV inside the gray area.
 13.
Notice that in a model without income effects on labor supply, such as the one that we have been considering, the income ranking under a first-best optimum is always consistent with the income ranking under laissez-faire (since whenever their labor supply is left undistorted, agents will always work the same amount as under laissez-faire, no matter how large is the tax that they pay or the transfer that they receive). Thus, the fact that the income ranking under a second-best optimum may differ from the one prevailing under laissez-faire also implies that the income ranking under a second-best optimum may differ from the one prevailing under a first-best optimum.
 14.
Previous contributions that have highlighted the possibility that negative marginal tax rates are optimal include Stiglitz (1982), Saez (2002) and Choné and Laroque (2010). In these papers the SC condition is satisfied and the justification for negative marginal tax rates either comes from the assumption that wages are endogenous or from specific assumptions on the profile of social weights that apply to the different types of agents in the economy.
 15.
Recent contributions that have analyzed the optimal tax treatment of work-related expenses include Koehne and Sachs (2017), Bastani et al. (2020) and Ho and Pavoni (2020), where the last two papers explicitly focus on the case of child care expenditures. A common feature of these papers is that they consider a setting where all agents are, according to our terminology, “users”.
 16.
We are implicitly assuming that job-related expenses are not observable by the government at the individual level so that a nonlinear subsidy scheme is not an option. Lack of public observability of personal purchases is an assumption that is often made in the optimal tax literature (see, e.g., Anderberg and Balestrino 2000; Cremer et al. 2001; Blomquist et al. 2010; Jacobs and Boadway 2014; Casarico et al. 2015). In our setting it appears a realistic case to consider since individuals often have the possibility to misreport their true work-related expenses to the tax authority. For purchases of work-related goods, as opposed to work-related services, the possibility of reselling by agents exacerbates the problem of observing consumption at the individual level.
 17.
For \(s<1\) the SC property remains violated. For \(s=1\) users would have flatter (steeper) indifference curves at any point in the (Y, B)-space whenever \(w^{u}>\left( <\right) w^{n}\). From the perspective of agents, \(s=1\) is equivalent to granting them a refundable tax credit for all their work-related expenses (since offering agents a refundable tax credit for a fraction s of their work-related expenses is equivalent to subsidizing work-related expenses at the rate s). Obviously, the SC property would also be restored for \(s>1\).
 18.
The reason for restricting attention to cases where \({\overline{V}}^{n}\ge -U_{LF}^{n}\) is that it allows us to neglect the possibility that the labor supply of non-users is distorted at a first-best optimum due to the non-negativity constraint on private consumption. See the discussion in Sect. 2.2.
 19.
As explained at the beginning of Sect. 4, because a “special” tax treatment for work-related expenses usually means that these kinds of expenses are subject to a more lenient tax treatment, in our analysis we restrict attention to the case when work-related expenses are subsidized. However, one can show that a positive tax on work-related expenses (\(s<0\)) can be used as an instrument that makes it less attractive for users to behave as mimickers. Thus, supplementing a nonlinear income tax with a tax on work-related expenses would allow the PF to be shifted outwards when \({\overline{V}}^{n}>U_{LF}^{n}\).
 20.
When a nonlinear income tax is supplemented with a subsidy on job-related expenses, the net transfer received by each user is equal to \( B^{u}-Y^{u}+sqY^{u}/w^{u}\).
 21.
For non-users, the subsidy s is of no value; their utility when behaving as mimickers is given by \(B^{u}-\left( Y^{u}/w^{n}\right) ^{2}/2\).
 22.
The intercept is always equal to \(\left( U_{LF}^{n}-{\overline{V}}^{n}\right) \left( 1-\pi \right) /\pi \), which represents the per-user transfer that can be financed when the utility of non-users is set at \({\overline{V}}^{n}<U_{LF}^{n}\).
 23.
Notice that this can only happen when job-related expenses are subsidized. With \(s=0\) and \(Y_{LF}^{u}<Y_{LF}^{n}\), at any given bundle to the left of \( Y_{LF}^{n}\), users have an indifference curve that is steeper than the one pertaining to non-users. When the two indifference curves cross the second time, it will happen at a bundle where \(Y>Y_{LF}^{n}\). Therefore, with \(s=0\), if non-users were indifferent between bundle I and bundle II, users would strictly prefer bundle II.
 24.
Notice that when a nonlinear income tax is used in isolation, as assumed in Sect. 3, the solution to the government’s problem can never be a separating equilibrium where both self-selection constraints are binding. To see why, start from a separating equilibrium where both self-selection constraints are binding. With a binding public budget constraint, one (Y, B)-bundle will be associated with a positive tax payment and another one with a negative tax payment. Then the government could improve upon the initial set of bundles by implementing a pooling allocation where all agents are offered the bundle associated with a positive tax payment (the utility of all agents would be unaffected and the government would run a positive surplus). But this cannot be an optimum either, since the government’s budget constraint would be slack. Consider instead the case when the income tax is supplemented by a subsidy on job-related expenses. In Fig. 7, the government budget constraint would be violated if both groups were to choose bundle II; it would also be violated if both groups were to choose bundle I (since the dashed blue line, on which a bundle for users can be offered without violating public-budget balance, lies below bundle I).
 25.
As shown in Appendix E, when \(Y_{LF}^{u}<Y_{LF}^{n}\) such a second-best equilibrium will be the necessary outcome under a max–min planner.
 26.
See also Bastani et al. (2015) for another example of a two-type model where both self-selection constraints may be binding at a separating equilibrium and where a pooling equilibrium may dominate a separating equilibrium.
 27.
In other words, the indifference curves for users are not everywhere convex.
 28.
Nonetheless, the reason why users are offered a distorted bundle is ultimately due to the need to prevent mimicking from non-users and ensure proper self-selection by agents.
 29.
It should be noticed that the labor supply of users is downward distorted even though at bundle IV the users’ MRS is larger than 1, i.e. it satisfies the standard definition of upward distortion. This happens because the standard definition of downward and upward distortion is only valid insofar as an individual’s indifference curves are everywhere convex in the (Y, B)-space. To clarify this point, suppose that the indifference curves are everywhere convex and that an individual is located at a bundle A where his/her MRS is larger (resp.: smaller) than 1. Then, the conclusion that the labor supply of this agent is upward (resp.: downward) distorted is based on the observation that, if the individual could freely choose any bundle along a 45\(^\circ \) line going through bundle A, he/she would choose a bundle to the left (resp.: right) of bundle A. However, if the indifference curves are not everywhere convex, the fact that \(MRS>\left( <\right) 1\) at bundle A does not imply that, if the agent were free to choose any bundle along a 45\(^\circ \) line going through bundle A, he/she would necessarily choose a bundle to the left (resp.: right) of bundle A.
 30.
This is true as long as the undistorted bundle on the indifference curve \( {\overline{V}}^{n}\) does not violate the constraint \(B^{n}\ge 0\), i.e. as long as \({\overline{V}}^{n}\ge -U_{LF}^{n}\). In our characterization of the PF we impose the restriction that the utility of non-users cannot fall below \(-U_{LF}^{n}\) (see Sect. 2.2 and in particular condition (5)).
 31.
Notice that, for sufficiently low values of \({\overline{V}}^{n}\) (in particular, \({\overline{V}}^{n}<\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2 \)), the lower root of (A3) is negative; when this happens, the set of incentive-compatible bundles on the virtual budget line (A1) is given by those bundles where Y is greater than or equal to the larger root.
 32.
 33.
Details of the calculations are available upon request.
 34.
This self-selection constraint is trivially non-binding at a second-best optimum when \({\overline{V}}^{n}<U_{LF}^{n}\) and the only policy instrument used by the government is a nonlinear income tax.
 35.
Notice that the right-hand side of (D6) can be rewritten as \( B^{n}-\left[ \left( 1-s\right) qY^{n}/w^{u}\right] -\left( Y^{n}/w^{u}\right) ^{2}/2\), where the term \(\left( 1-s\right) qY^{n}/w^{u}\) represents the effective outlay for job-related costs when users mimic non-users and job-related expenses are subsidized at rate s.
 36.
It can be easily verified that the RHS of (D8) is larger than \( U_{LF}^{n}=\left( w^{n}\right) ^{2}/2\).
 37.
Assume that \(\left( w^{n}\right) ^{2}-w^{u}\left( w^{u}-q\right) >0\). The condition \(\left( w^{n}\right) ^{2}-\pi \left( 2w^{u}-q\right) \left[ \left( w^{n}\right) ^{2}-w^{u}\left( w^{u}-q\right) \right] /w^{u}<\left( w^{n}\right) ^{2}-\pi \left[ \left( w^{n}\right) ^{2}-\left( w^{u}-q\right) w^{u}\right] ^{2}/\left( w^{n}\right) ^{2}\) can be restated as \(\left[ \left( w^{n}\right) ^{2}-\left( w^{u}-q\right) w^{u}\right] ^{2}w^{u}-\left( 2w^{u}-q\right) \left[ \left( w^{n}\right) ^{2}-w^{u}\left( w^{u}-q\right) \right] \left( w^{n}\right) ^{2}<0\) and therefore \(\left[ \left( w^{n}\right) ^{2}-\left( w^{u}-q\right) w^{u}\right] w^{u}<\left( 2w^{u}-q\right) \left( w^{n}\right) ^{2}\). Simplifying terms one gets \( -\left( w^{u}-q\right) \left( w^{u}\right) ^{2}<\left( w^{u}-q\right) \left( w^{n}\right) ^{2}\).
 38.
In this case a first-best optimum is implemented and both self-selection constraints are binding: \(U^{u}\left( Y^{u},B^{u}\right) =U^{u}\left( Y^{n},B^{n}\right) \) and \(U^{n}\left( Y^{n},B^{n}\right) =U^{n}\left( Y^{u},B^{u}\right) \).
 39.
Setting \({\overline{V}}^{n}=32.69\), one would get the second-best optimum that would be chosen by a max–min government. At this second-best optimum \( Y^{u}=Y^{n}=84.62\), \(B^{u}=B^{n}=68.49\), \(s=83.85\%\) and \(U^{u}=U^{n}=32.69\). As shown in Appendix E, with \(Y_{LF}^{u}<Y_{LF}^{n}\) a max–min social welfare function always delivers a second-best optimum where both self-selection constraints are binding. However, this second-best optimum is not necessarily a pooling equilibrium as in our example. For instance, assume that \(w^{u}=10.2\), \(w^{n}=10\), \(q=5\) and \(\pi =1/5\). The max–min optimum would be a separating equilibrium where \(U^{u}<U^{n}\).
 40.
We also have \(B_{SB}^{u}=39.96\) and \(B_{SB}^{n}=90.01\). Notice also that the second-best optimum features income re-ranking with respect to the laissez-faire equilibrium.
References
Anderberg D, Balestrino A (2000) Household production and the design of the tax structure. Int Tax Public Financ 7:563–584
Bastani S, Blomquist S, Micheletto L (2013) The welfare gains of age-related optimal income taxation. Int Econ Rev 54:1219–1249
Bastani S, Blomquist S, Micheletto L (2019) Optimal income taxation without single-crossing. Dondena Working Paper no. 129
Bastani S, Blomquist S, Micheletto L (2020) Child care subsidies, quality, and optimal income taxation. Am Econ J Econ Policy
Bastani S, Blumkin T, Micheletto L (2015) Optimal wage redistribution in the presence of adverse selection in the labor market. J Public Econ 131:41–57
Bierbrauer FJ, Boyer PC (2014) The Pareto-frontier in a simple Mirrleesian model of income taxation. Ann Econ Stat 113(114):185–206
Blomquist S, Christiansen V, Micheletto L (2010) Public provision of private goods and nondistortionary marginal tax rates. Am Econ J Econ Policy 2:1–27
Boadway R, Marchand M, Pestieau P, del Mar Racionero M (2002) Optimal redistribution with heterogeneous preferences for leisure. J Public Econ Theory 4:475–498
Casarico A, Micheletto L, Sommacal A (2015) Intergenerational transmission of skills during childhood and optimal public policy. J Popul Econ 28:353–372
Choné P, Laroque G (2010) Negative marginal tax rates and heterogeneity. Am Econ Rev 100:2532–2547
Cremer H, Gahvari F (2002) Nonlinear pricing, redistribution and optimal tax policy. J Public Econ Theory 4:139–161
Cremer H, Gahvari F, Ladoux N (1998) Externalities and optimal taxation. J Public Econ 70:343–364
Cremer H, Pestieau P, Rochet JC (2001) Direct versus indirect taxation: the design of the tax structure revisited. Int Econ Rev 42:781–799
Gahvari F (2007) On optimal commodity taxes when consumption is time consuming. J Public Econ Theory 9:1–27
Golosov M, Troshkin M, Tsyvinski A, Weinzierl M (2013) Preference heterogeneity and optimal capital income taxation. J Public Econ 97:160–175
Ho C, Pavoni N (2020) Efficient child care subsidies. Am Econ Rev 110:162–199
Jacobs B, Boadway R (2014) Optimal linear commodity taxation under optimal nonlinear income taxation. J Public Econ 117:201–210
Jacquet L, Lehmann E, Van der Linden B (2013) Optimal redistributive taxation with both extensive and intensive responses. J Econ Theory 148:1770–1805
Judd K, Ma D, Saunders MA, Su CL (2018) Optimal income taxation with multidimensional taxpayer types. Mimeo
Kleven HJ, Kreiner CT, Saez E (2009) The optimal income taxation of couples. Econometrica 77:537–560
Koehne S, Sachs D (2017) Pareto efficient tax breaks. CESifo Working Paper No. 6147
Lockwood BB, Weinzierl M (2015) De gustibus non est taxandum: heterogeneity in preferences and optimal redistribution. J Public Econ 124:74–80
Micheletto L (2008) Redistribution and optimal mixed taxation in the presence of consumption externalities. J Public Econ 92:2262–2274
Mirrlees JA (1971) An exploration in the theory of optimum income taxation. Rev Econ Stud 38:175–208
Rothschild C, Scheuer F (2014) A theory of income taxation under multidimensional skill heterogeneity. NBER Working Paper No. 19822
Saez E (2002) Optimal income transfer programs: intensive versus extensive labor supply responses. Q J Econ 117:1039–1073
Scheuer F (2014) Entrepreneurial taxation with endogenous entry. Am Econ J Econ Policy 6:126–163
Stiglitz JE (1982) Selfselection and Pareto efficient taxation. J Public Econ 17:213–240
Acknowledgements
Open access funding provided by Linnaeus University.
We are grateful to Floris Zoutman and two anonymous referees for valuable comments on an earlier draft of the paper.
Appendices
Appendix A
Proof of Proposition 1
Assume first that \({\overline{V}} ^{n}<U_{LF}^{n}\), so that users are offered a (Y, B)-bundle such that \( Y^{u}-B^{u}<0\) and non-users a (Y, B)-bundle such that \(Y^{n}-B^{n}>0\). With income tax revenue collected from each non-user being equal to \( Y^{n}-B^{n}\), the revenue that can be transferred to each user is equal to \( \left( Y^{n}-B^{n}\right) \left( 1-\pi \right) /\pi \). With non-users being offered a bundle on their indifference curve with associated utility value \( {\overline{V}}^{n}\), the maximum revenue that the government can collect from them is obtained at the bundle where their labor supply is undistorted, implying a zero implicit marginal income tax rate for non-users.^{Footnote 30} Thus, independently of the value of \( {\overline{V}}^{n}\), we will have that \(Y^{n}=\left( w^{n}\right) ^{2}\).
With \({\overline{V}}^{n}<U_{LF}^{n}\) and \(Y^{n}=\left( w^{n}\right) ^{2}\), the government collects from each non-user an amount of revenue equal to \( Y^{n}-B^{n}=\left( w^{n}\right) ^{2}-\left[ {\overline{V}}^{n}+\left( 1/2\right) \left( w^{n}\right) ^{2}\right] =\left( 1/2\right) \left( w^{n}\right) ^{2}-{\overline{V}}^{n}\). This implies that the revenue that can be transferred to each user is equal to \(\left( 1-\pi \right) \left[ \left( 1/2\right) \left( w^{n}\right) ^{2}-{\overline{V}}^{n}\right] /\pi \), which in turn implies that users will be offered a bundle on the virtual budget line
\[ B=Y+\frac{1-\pi }{\pi }\left[ \frac{1}{2}\left( w^{n}\right) ^{2}-{\overline{V}}^{n}\right] . \tag{A1} \]
On this virtual budget line, however, some bundles cannot be offered since they would induce mimicking by non-users. To find the set of incentive-compatible bundles on the virtual budget line (A1), one has to identify the values for Y at which the relevant indifference curve for non-users (i.e. the one associated with utility \({\overline{V}}^{n}\)) intersects the virtual budget line.
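Taking into account that the relevant indifference curve for non-users has equation
\[ B={\overline{V}}^{n}+\frac{1}{2}\left( \frac{Y}{w^{n}}\right) ^{2}, \tag{A2} \]
by equating (A1) and (A2) one can find two values for Y. These are given by:
\[ Y=\left( w^{n}\right) ^{2}\pm w^{n}\sqrt{\frac{1}{\pi }\left[ \left( w^{n}\right) ^{2}-2{\overline{V}}^{n}\right] }, \tag{A3} \]
where the term within the square root is positive due to the initial assumption that \({\overline{V}}^{n}<U_{LF}^{n}=\left( w^{n}\right) ^{2}/2\).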
On the virtual budget line (A1) only the bundles with \(Y\le \left( w^{n}\right) ^{2}-w^{n}\sqrt{\frac{1}{\pi }\left[ \left( w^{n}\right) ^{2}-2 {\overline{V}}^{n}\right] }\) and \(Y\ge \left( w^{n}\right) ^{2}+w^{n}\sqrt{ \frac{1}{\pi }\left[ \left( w^{n}\right) ^{2}-2{\overline{V}}^{n}\right] }\) are incentive-compatible (i.e., do not induce non-users to behave as mimickers).^{Footnote 31} If incentive-compatibility considerations were not an issue, users could be offered on the virtual budget line (A1) the undistorted bundle
\[ \left( Y,B\right) =\left( \left( w^{u}-q\right) w^{u},\ \left( w^{u}-q\right) w^{u}+\frac{1-\pi }{\pi }\left[ \frac{1}{2}\left( w^{n}\right) ^{2}-{\overline{V}}^{n}\right] \right) . \]
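Thus, if it is either the case that
\[ \left( w^{u}-q\right) w^{u}\le \left( w^{n}\right) ^{2}-w^{n}\sqrt{\frac{1}{\pi }\left[ \left( w^{n}\right) ^{2}-2{\overline{V}}^{n}\right] } \tag{A4} \]
or
\[ \left( w^{u}-q\right) w^{u}\ge \left( w^{n}\right) ^{2}+w^{n}\sqrt{\frac{1}{\pi }\left[ \left( w^{n}\right) ^{2}-2{\overline{V}}^{n}\right] }, \tag{A5} \]
the labor supply of users can be left undistorted (\(T^{\prime }\left( Y_{SB}^{u}\right) =0\)). Solving (A4) and (A5) for \( {\overline{V}}^{n}\), one finds that \(T^{\prime }\left( Y_{SB}^{u}\right) =0\) when
\[ {\overline{V}}^{n}\ge \frac{\left( w^{n}\right) ^{2}}{2}-\frac{\pi }{2}\frac{\left[ \left( w^{n}\right) ^{2}-\left( w^{u}-q\right) w^{u}\right] ^{2}}{\left( w^{n}\right) ^{2}}, \tag{A6} \]
where the RHS of (A6) is strictly lower than \(\left( w^{n}\right) ^{2}/2=U_{LF}^{n}\).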
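The intersection and threshold logic described above can be checked numerically. The following Python sketch is purely illustrative: it assumes the quasi-linear utility \(B-(Y/w^{n})^{2}/2\) for non-users that underlies this appendix, and the function names and the parameter values used in the example (\(w^{u}=12\), \(w^{n}=10\), \(q=2\), \(\pi =0.2\)) are hypothetical choices, not values taken from the paper.

```python
import math


def transfer_per_user(w_n, pi, v_bar):
    """Revenue available per user when non-users work the undistorted amount."""
    return (1.0 - pi) * (0.5 * w_n**2 - v_bar) / pi


def budget_line_roots(w_n, pi, v_bar):
    """Incomes at which the non-users' indifference curve (utility v_bar)
    crosses the users' virtual budget line. Requires v_bar <= w_n**2 / 2."""
    spread = w_n * math.sqrt((w_n**2 - 2.0 * v_bar) / pi)
    return w_n**2 - spread, w_n**2 + spread


def undistorted_threshold(w_u, w_n, q, pi):
    """Smallest v_bar at which users can still be left undistorted."""
    gap = w_n**2 - (w_u - q) * w_u
    return 0.5 * w_n**2 - pi * gap**2 / (2.0 * w_n**2)


def users_can_be_undistorted(w_u, w_n, q, pi, v_bar):
    """True if the users' undistorted income lies outside the two roots."""
    y_star = (w_u - q) * w_u
    low, high = budget_line_roots(w_n, pi, v_bar)
    return y_star <= low or y_star >= high


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    w_u, w_n, q, pi = 12.0, 10.0, 2.0, 0.2
    print(budget_line_roots(w_n, pi, 40.0))
    print(undistorted_threshold(w_u, w_n, q, pi))
    print(users_can_be_undistorted(w_u, w_n, q, pi, 49.0))
```

With these hypothetical numbers the lower root is zero exactly when \({\overline{V}}^{n}=(1-\pi )(w^{n})^{2}/2\), in line with the remark in Footnote 31 that the lower root turns negative for smaller values of \({\overline{V}}^{n}\).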
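Suppose instead that inequality (A6) does not hold. Offering users an undistorted bundle along the virtual budget line (A1) would then violate the incentive-compatibility constraint for non-users. This implies that users will either be offered the bundle \(\left( Y_{A},B_{A}\right) \) where
\[ Y_{A}=\left( w^{n}\right) ^{2}-w^{n}\sqrt{\frac{1}{\pi }\left[ \left( w^{n}\right) ^{2}-2{\overline{V}}^{n}\right] }, \tag{A7} \]
\[ B_{A}=Y_{A}+\frac{1-\pi }{\pi }\left[ \frac{1}{2}\left( w^{n}\right) ^{2}-{\overline{V}}^{n}\right] , \tag{A8} \]
and their labor supply is distorted downwards (\(T^{\prime }\left( Y_{SB}^{u}\right) >0\)), or the bundle \(\left( Y_{B},B_{B}\right) \) where
\[ Y_{B}=\left( w^{n}\right) ^{2}+w^{n}\sqrt{\frac{1}{\pi }\left[ \left( w^{n}\right) ^{2}-2{\overline{V}}^{n}\right] }, \tag{A9} \]
\[ B_{B}=Y_{B}+\frac{1-\pi }{\pi }\left[ \frac{1}{2}\left( w^{n}\right) ^{2}-{\overline{V}}^{n}\right] , \tag{A10} \]
and their labor supply is distorted upwards (\(T^{\prime }\left( Y_{SB}^{u}\right) <0\)).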
For later purposes, notice that from (A7), since \(Y_{A}\) cannot take negative values, \(U^{n}\) can never fall below \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) when users are offered the bundle (\(Y_{A},B_{A}\)).
Evaluating the utility of users at the bundle characterized by (A7)–(A8), we have:
whereas the utility of users at the bundle characterized by (A9)–(A10) is
Before comparing the utility of users at \(\left( Y_{A},B_{A}\right) \) and \( \left( Y_{B},B_{B}\right) \), notice that a necessary condition for \(\left( Y_{A},B_{A}\right) \) to be part of the second-best PF is that \(\partial U^{u}\left( Y_{A},B_{A}\right) /\partial {\overline{V}}^{n}<0\) (and similarly, a necessary condition for \(\left( Y_{B},B_{B}\right) \) to be part of the second-best PF is that \(\partial U^{u}\left( Y_{B},B_{B}\right) /\partial {\overline{V}}^{n}<0\)).
Consider first \(\partial U^{u}\left( Y_{A},B_{A}\right) /\partial {\overline{V}}^{n}\). This is given by:
With \(q<{\overline{q}}\), we have that \(\left( w^{u}-q\right) w^{u}-\left( w^{n}\right) ^{2}>0\). Therefore, \(\partial U^{u}\left( Y_{A},B_{A}\right) /\partial {\overline{V}}^{n}<0\) when
Under our assumption that \(\pi <1-\left( w^{n}\right) ^{2}/\left( w^{u}\right) ^{2}\), it follows that (A14) is satisfied as long as
where the RHS of (A15) is larger than \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) for \(\pi <q/w^{u}\).
Noticing that \(q<{\overline{q}}\Longrightarrow 1-\frac{\left( w^{n}\right) ^{2} }{\left( w^{u}\right) ^{2}}>\frac{q}{w^{u}}\), we can conclude that, with \(q< {\overline{q}}\), offering users the bundle \(\left( Y_{A},B_{A}\right) \) can never be optimal when \(\pi \ge \frac{q}{w^{u}}\).
Consider now \(\partial U^{u}\left( Y_{B},B_{B}\right) /\partial {\overline{V}} ^{n}\). This is given by:
With \(q<{\overline{q}}\), we have that \(\partial U^{u}\left( Y_{B},B_{B}\right) /\partial {\overline{V}}^{n}<0\) when
and our assumption that \(\pi <1-\left( w^{n}\right) ^{2}/\left( w^{u}\right) ^{2}\) implies that (A17) is always satisfied.
Let us now compare \(U^{u}\left( Y_{A},B_{A}\right) \) and \(U^{u}\left( Y_{B},B_{B}\right) \) as given by (A11)–(A12). Simple algebra can be used to show that
Therefore, we can conclude that \(U^{u}\left( Y_{B},B_{B}\right) >U^{u}\left( Y_{A},B_{A}\right) \) for \(q<{\overline{q}}\). This shows that, when \(q< {\overline{q}}\) and (A6) is violated, a second-best optimum will necessarily entail an upward distortion on the labor supply of users (\( T^{\prime }\left( Y_{SB}^{u}\right) <0\)).
Assume now that \({\overline{V}}^{n}>U_{LF}^{n}\). This implies that the optimal bundles offered by the government will entail \(Y^{n}-B^{n}<0\) and \( Y^{u}-B^{u}>0\). With revenue collected from each user being equal to \( Y^{u}-B^{u}\), the revenue that can be transferred to each non-user is equal to \(\left( Y^{u}-B^{u}\right) \pi /\left( 1-\pi \right) \). With users being offered a bundle on their indifference curve with associated utility value \( U_{SB}^{u}\), the maximum revenue that the government can collect from them is obtained at the bundle where their labor supply is undistorted (implying a zero implicit marginal income tax rate for users). In our setting this implies that, independently of the value of \(U_{SB}^{u}\), we will have that \( Y^{u}=\left( w^{u}-q\right) w^{u}\).^{Footnote 32} Thus, when the utility obtained by users at a second-best optimum is \( U_{SB}^{u}<U_{LF}^{u}\) and their labor supply is left undistorted, the government collects from each user an amount of revenue equal to \( Y^{u}-B^{u}=\left( w^{u}-q\right) w^{u}-\left[ U_{SB}^{u}+\frac{1}{2}\left( w^{u}-q\right) ^{2}+\left( w^{u}-q\right) q\right] =\frac{1}{2}\left( w^{u}-q\right) ^{2}-U_{SB}^{u}\). This implies that the revenue that can be transferred to each non-user is equal to \(\left[ \frac{1}{2}\left( w^{u}-q\right) ^{2}-U_{SB}^{u}\right] \pi /\left( 1-\pi \right) \), which in turn implies that non-users will be offered a bundle on the virtual budget line:
\[ B=Y+\frac{\pi }{1-\pi }\left[ \frac{1}{2}\left( w^{u}-q\right) ^{2}-U_{SB}^{u}\right] . \tag{A19} \]
On this virtual budget line, however, some bundles cannot be offered since they would induce mimicking by users. To find the set of incentive-compatible bundles, one has to identify the two values for Y at which the relevant indifference curve for users (i.e. the one associated with utility \(U_{SB}^{u}\)) intersects the virtual budget line.
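Taking into account that the relevant indifference curve for users has equation
\[ B=U_{SB}^{u}+\frac{qY}{w^{u}}+\frac{1}{2}\left( \frac{Y}{w^{u}}\right) ^{2}, \tag{A20} \]
by equating (A19) and (A20) we can find the two relevant values for Y. These are given by
\[ Y=\left( w^{u}-q\right) w^{u}\pm w^{u}\sqrt{\frac{1}{1-\pi }\left[ \left( w^{u}-q\right) ^{2}-2U_{SB}^{u}\right] }, \]
where the term within the square root is positive due to our assumption that \( U_{SB}^{u}<U_{LF}^{u}=\left( w^{u}-q\right) ^{2}/2\).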
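On the virtual budget line (A19), the incentive-compatible bundles (which do not induce users to behave as mimickers) are those satisfying either of the following two conditions:
\[ Y\le \left( w^{u}-q\right) w^{u}-w^{u}\sqrt{\frac{1}{1-\pi }\left[ \left( w^{u}-q\right) ^{2}-2U_{SB}^{u}\right] }\quad \text{or}\quad Y\ge \left( w^{u}-q\right) w^{u}+w^{u}\sqrt{\frac{1}{1-\pi }\left[ \left( w^{u}-q\right) ^{2}-2U_{SB}^{u}\right] }. \]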
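If incentive-compatibility considerations were not an issue, non-users could be offered on the virtual budget line (A19) the undistorted bundle
\[ \left( Y,B\right) =\left( \left( w^{n}\right) ^{2},\ \left( w^{n}\right) ^{2}+\frac{\pi }{1-\pi }\left[ \frac{1}{2}\left( w^{u}-q\right) ^{2}-U_{SB}^{u}\right] \right) . \]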
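Thus, if it is either the case that
\[ \left( w^{n}\right) ^{2}\le \left( w^{u}-q\right) w^{u}-w^{u}\sqrt{\frac{1}{1-\pi }\left[ \left( w^{u}-q\right) ^{2}-2U_{SB}^{u}\right] } \tag{A21} \]
or
\[ \left( w^{n}\right) ^{2}\ge \left( w^{u}-q\right) w^{u}+w^{u}\sqrt{\frac{1}{1-\pi }\left[ \left( w^{u}-q\right) ^{2}-2U_{SB}^{u}\right] }, \tag{A22} \]
the labor supply of non-users can be left undistorted (\(T^{\prime }\left( Y_{SB}^{n}\right) =0\)). Solving (A21) and (A22) for \( U_{SB}^{u}\), one finds that \(T^{\prime }\left( Y_{SB}^{n}\right) =0\) when
\[ U_{SB}^{u}\ge \frac{\left( w^{u}-q\right) ^{2}}{2}-\frac{1-\pi }{2}\frac{\left[ \left( w^{n}\right) ^{2}-\left( w^{u}-q\right) w^{u}\right] ^{2}}{\left( w^{u}\right) ^{2}}, \tag{A23} \]
where the RHS of (A23) is strictly lower than \(\left( w^{u}-q\right) ^{2}/2=U_{LF}^{u}\).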
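The mirror-image case, with redistribution running from users to non-users, can be verified in the same way. The sketch below is again only illustrative: it assumes that users' utility is \(B-qY/w^{u}-(Y/w^{u})^{2}/2\), as implied by the expressions in this appendix, and all function names and numerical values are hypothetical.

```python
import math


def transfer_per_nonuser(w_u, q, pi, u_sb):
    """Revenue available per non-user when users work the undistorted amount."""
    return pi * (0.5 * (w_u - q) ** 2 - u_sb) / (1.0 - pi)


def user_curve_roots(w_u, q, pi, u_sb):
    """Incomes at which the users' indifference curve (utility u_sb) crosses
    the non-users' virtual budget line. Requires u_sb <= (w_u - q)**2 / 2."""
    center = (w_u - q) * w_u
    spread = w_u * math.sqrt(((w_u - q) ** 2 - 2.0 * u_sb) / (1.0 - pi))
    return center - spread, center + spread


def nonusers_undistorted_threshold(w_u, w_n, q, pi):
    """Smallest u_sb at which non-users can still be left undistorted."""
    gap = w_n**2 - (w_u - q) * w_u
    return 0.5 * (w_u - q) ** 2 - (1.0 - pi) * gap**2 / (2.0 * w_u**2)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    w_u, w_n, q, pi = 12.0, 10.0, 2.0, 0.2
    threshold = nonusers_undistorted_threshold(w_u, w_n, q, pi)
    print(threshold)
    # At the threshold, one root coincides with the non-users'
    # undistorted income w_n**2.
    print(user_curve_roots(w_u, q, pi, threshold))
```

At the threshold value of \(U_{SB}^{u}\) one of the two roots coincides with \((w^{n})^{2}\), which is the point at which offering non-users an undistorted bundle just stops inducing mimicking by users.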
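Taking into account that when non-users are offered an undistorted bundle, their utility is
\[ U^{n}=\frac{\left( w^{n}\right) ^{2}}{2}+\frac{\pi }{1-\pi }\left[ \frac{1}{2}\left( w^{u}-q\right) ^{2}-U_{SB}^{u}\right] , \tag{A24} \]
and substituting for \(U_{SB}^{u}\) in (A24) the value provided by the RHS of (A23), one gets the maximum utility that can be enjoyed by non-users without resorting to distorting their labor supply:
\[ \frac{\left( w^{n}\right) ^{2}}{2}+\frac{\pi }{2}\frac{\left[ \left( w^{n}\right) ^{2}-\left( w^{u}-q\right) w^{u}\right] ^{2}}{\left( w^{u}\right) ^{2}}. \]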
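Suppose now that inequality (A23) does not hold. Offering non-users an undistorted bundle along the virtual budget line (A19) would violate the incentive-compatibility constraint for users. This implies that non-users will either be offered the bundle \(\left( Y_{C},B_{C}\right) \) where
\[ Y_{C}=\left( w^{u}-q\right) w^{u}-w^{u}\sqrt{\frac{1}{1-\pi }\left[ \left( w^{u}-q\right) ^{2}-2U_{SB}^{u}\right] }, \tag{A25} \]
\[ B_{C}=Y_{C}+\frac{\pi }{1-\pi }\left[ \frac{1}{2}\left( w^{u}-q\right) ^{2}-U_{SB}^{u}\right] , \tag{A26} \]
and their labor supply is distorted downwards (\(T^{\prime }\left( Y_{SB}^{n}\right) >0\)), or the bundle \(\left( Y_{D},B_{D}\right) \) where
\[ Y_{D}=\left( w^{u}-q\right) w^{u}+w^{u}\sqrt{\frac{1}{1-\pi }\left[ \left( w^{u}-q\right) ^{2}-2U_{SB}^{u}\right] }, \tag{A27} \]
\[ B_{D}=Y_{D}+\frac{\pi }{1-\pi }\left[ \frac{1}{2}\left( w^{u}-q\right) ^{2}-U_{SB}^{u}\right] , \tag{A28} \]
and their labor supply is distorted upwards (\(T^{\prime }\left( Y_{SB}^{n}\right) <0\)).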
For later purposes, notice that from (A25), since \(Y_{C}\) cannot take negative values, \(U^{u}\) can never fall below \(\left( w^{u}-q\right) ^{2}\pi /2\) when non-users are offered the bundle (\(Y_{C},B_{C}\)).
Evaluating the utility of non-users at the bundle characterized by (A25)–(A26), we find that
whereas the utility of non-users at the bundle characterized by (A27)–(A28) is
Before comparing the utility of non-users at \(\left( Y_{C},B_{C}\right) \) and \(\left( Y_{D},B_{D}\right) \), notice that a necessary condition for \( \left( Y_{C},B_{C}\right) \) to be part of the second-best PF is that \( \partial U^{n}\left( Y_{C},B_{C}\right) /\partial U_{SB}^{u}<0\) (and similarly, a necessary condition for \(\left( Y_{D},B_{D}\right) \) to be part of the second-best PF is that \(\partial U^{n}\left( Y_{D},B_{D}\right) /\partial U_{SB}^{u}<0\)).
Consider first \(\partial U^{n}\left( Y_{C},B_{C}\right) /\partial U_{SB}^{u}\) . This is given by:
Thus, we have that \(\partial U^{n}\left( Y_{C},B_{C}\right) /\partial U_{SB}^{u}<0\) when
For \(q<{\overline{q}}\), condition (A32) holds for
where the RHS of (A33) defines a lower bound for \(U_{SB}^{u}\) along the second-best PF.
Substituting for \(U_{SB}^{u}\) into (A29) the value provided by the RHS of (A33) allows deriving an upper bound for \(U_{SB}^{n}\), and therefore \({\overline{V}}^{n}\). After tedious calculations one gets:^{Footnote 33}
It is easy to verify that the RHS of (A33) is larger than \(\frac{1}{ 2}\left( w^{u}-q\right) ^{2}-\frac{1-\pi }{2}\left[ \frac{\left( w^{u}-q\right) w^{u}-\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}-\left( w^{n}\right) ^{2}}w^{u}\right] ^{2}\), which represents the value of \(U_{SB}^{u}\) that implies \(Y_{C}=\Omega \) (where \(Y_{C}\) is defined by (A25) and \(\Omega \equiv q\frac{\left( w^{n}\right) ^{2}w^{u}}{\left( w^{u}\right) ^{2}-\left( w^{n}\right) ^{2}}=\left( w^{n}\right) ^{2}q/ {\overline{q}}\) represents the threshold value for Y separating the bundles where \(MRS_{YB}^{u}>MRS_{YB}^{n}\), i.e. those bundles where \(Y<\Omega \), from the bundles where \(MRS_{YB}^{u}<MRS_{YB}^{n}\), i.e. those bundles where \(Y>\Omega \)). This shows that it can never be optimal to discourage the labor supply of non-users to the point where \(Y_{SB}^{n}=0\).
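The role of \(\Omega \) as the income level at which the two types' indifference curves have the same slope can be illustrated with a short computation. This is a sketch under stated assumptions: the marginal rates of substitution are taken to be \(q/w^{u}+Y/(w^{u})^{2}\) for users and \(Y/(w^{n})^{2}\) for non-users (as implied by the quadratic disutility of labor used in this appendix), \({\overline{q}}\) is taken to be the value of q at which \((w^{u}-q)w^{u}=(w^{n})^{2}\), and the function names and parameter values are hypothetical.

```python
import math


def mrs_user(y, w_u, q):
    """Slope of a user's indifference curve in (Y, B)-space at income y."""
    return q / w_u + y / w_u**2


def mrs_nonuser(y, w_n):
    """Slope of a non-user's indifference curve in (Y, B)-space at income y."""
    return y / w_n**2


def omega(w_u, w_n, q):
    """Income at which the two slopes coincide (requires w_u > w_n)."""
    return q * w_n**2 * w_u / (w_u**2 - w_n**2)


def q_bar(w_u, w_n):
    """Value of q for which laissez-faire incomes of the two types coincide."""
    return (w_u**2 - w_n**2) / w_u


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    w_u, w_n, q = 12.0, 10.0, 2.0
    y_cross = omega(w_u, w_n, q)
    print(y_cross, mrs_user(y_cross, w_u, q), mrs_nonuser(y_cross, w_n))
    # Users have the steeper curve to the left of omega, the flatter one
    # to the right.
    print(mrs_user(y_cross - 1.0, w_u, q) > mrs_nonuser(y_cross - 1.0, w_n))
    print(mrs_user(y_cross + 1.0, w_u, q) < mrs_nonuser(y_cross + 1.0, w_n))
```

The reversal of the slope ranking on either side of \(\Omega \) is precisely the failure of single-crossing that drives the results of this appendix.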
Consider now \(\partial U^{n}\left( Y_{D},B_{D}\right) /\partial U_{SB}^{u}\). This is given by:
Thus, we have that \(\partial U^{n}\left( Y_{D},B_{D}\right) /\partial U_{SB}^{u}<0\) when
However, condition (A36) is never satisfied when \(q<{\overline{q}}\). Therefore, when \(q<{\overline{q}}\) and (A23) is violated, a second-best optimum will necessarily entail a downward distortion on the labor supply of non-users (\(T^{\prime }\left( Y_{SB}^{n}\right) >0\)). \(\square \)
Appendix B
Proof of Proposition 2
Consider first the case when the intended direction of redistribution is from non-users to users. When \(q= {\overline{q}}\), so that \(\left( w^{u}-q\right) w^{u}=\left( w^{n}\right) ^{2}\), the RHS of inequality (A6) simplifies to \(\left( w^{n}\right) ^{2}/2\), which is the utility achieved by non-users under laissez-faire. This shows that, when \(q={\overline{q}}\), it is never possible to redistribute from non-users to users without distorting the labor supply of the latter. In order not to violate the incentive-compatibility constraint for non-users, users can either be offered the distorted bundle characterized by (A7)–(A8), where \(T^{\prime }\left( Y_{SB}^{u}\right) >0\), or the distorted bundle characterized by (A9)–(A10), where \( T^{\prime }\left( Y_{SB}^{u}\right) <0\). But from (A18) we can see that, when \(\left( w^{n}\right) ^{2}=\left( w^{u}\right) \left( w^{u}-q\right) \), users are indifferent between the two bundles. Thus, as long as users prefer these bundles to their laissez-faire bundle, there will be two equivalent second-best optima.
For users to be better off at their laissez-faire bundle than at either bundle (A7)–(A8) or (A9)–(A10), i.e. for \(U_{LF}^{u}= \frac{\left( w^{u}-q\right) ^{2}}{2}>U^{u}\left( Y_{B},B_{B}\right) =U^{u}\left( Y_{A},B_{A}\right) \), it must be that (taking into account that \(\left( w^{n}\right) ^{2}=\left( w^{u}\right) \left( w^{u}-q\right) \)):
which, after simplifying and collecting terms, can be restated as
When \(q={\overline{q}}\), our assumption that \(\pi <1-\frac{\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}}\) can be equivalently restated as \(\pi <q/w^{u}\). This implies that \(w^{u}-\frac{q}{\pi }<0\). Then, (B1) holds when \(\left( w^{u}-q\right) w^{u}/2<{\overline{V}}^{n}\). But since \(q={\overline{q}}\) implies \(\left( w^{n}\right) ^{2}=\left( w^{u}\right) \left( w^{u}-q\right) \), we also have that (B1) holds when \({\overline{V}}^{n}>\left( w^{n}\right) ^{2}/2=U_{LF}^{n}\). Hence, for any \({\overline{V}}^{n}<U_{LF}^{n}\) users are better off than at their laissez-faire bundle, which means that redistribution from non-users to users is feasible and users will face a nonzero marginal tax rate at a second-best optimum. For \((1-\pi )U_{LF}^{n}\le {\overline{V}}^{n}<U_{LF}^{n}\) there are two equivalent second-best optima, one where \(T^{\prime }\left( Y_{SB}^{u}\right) >0\) and one where \(T^{\prime }\left( Y_{SB}^{u}\right) <0\). For \({\overline{V}} ^{n}<(1-\pi )U_{LF}^{n}\), since the bundle characterized by (A7)–(A8) becomes inadmissible (it would require \(Y^{u}<0\)), the second-best optimum is unique: users are offered the bundle characterized by (A9)–(A10) and they face a negative marginal tax rate.
Consider now the case when the intended direction of redistribution is from users to non-users. In this case the RHS of inequality (A23) simplifies to \(\left( w^{u}-q\right) ^{2}/2\), which is the utility achieved by users under laissez-faire. This shows that, when \(\left( w^{n}\right) ^{2}=\left( w^{u}\right) \left( w^{u}-q\right) \), it is never possible to redistribute from users to non-users without distorting the labor supply of the latter. In order not to violate the incentive-compatibility constraint for users, non-users can either be offered the distorted bundle characterized by (A25)–(A26) or the distorted bundle characterized by (A27)–(A28). With \(\left( w^{n}\right) ^{2}=\left( w^{u}\right) \left( w^{u}-q\right) \), non-users are indifferent between the two bundles. However, from (A31) and (A35) we also have that, when \( \left( w^{n}\right) ^{2}=\left( w^{u}\right) \left( w^{u}-q\right) \), \(\frac{ \partial U^{n}\left( Y_{C},B_{C}\right) }{\partial U_{SB}^{u}}=\frac{ \partial U^{n}\left( Y_{D},B_{D}\right) }{\partial U_{SB}^{u}}=\left[ -\pi + \frac{\left( w^{u}\right) ^{2}}{\left( w^{n}\right) ^{2}}\right] \frac{1}{ 1-\pi }>0\), which implies that there is no point on the PF where non-users get a higher utility than under laissez-faire. \(\square \)
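The sign of this derivative can be checked by finite differences. The sketch below rests on assumptions that should be read as such: bundles C and D are taken to lie at the lower and upper intersections between the users' indifference curve and the non-users' virtual budget line, utilities are the quasi-linear forms used in this appendix, and the function names and parameter values (\(w^{u}=12\), \(w^{n}=10\), \(\pi =0.2\), with q set to the value for which \((w^{u}-q)w^{u}=(w^{n})^{2}\)) are hypothetical.

```python
import math


def nonuser_utility(w_u, w_n, q, pi, u_sb, upper):
    """Non-users' utility at the upper (D) or lower (C) intersection between
    the users' indifference curve at utility u_sb and the virtual budget line."""
    center = (w_u - q) * w_u
    spread = w_u * math.sqrt(((w_u - q) ** 2 - 2.0 * u_sb) / (1.0 - pi))
    y = center + spread if upper else center - spread
    transfer = pi * (0.5 * (w_u - q) ** 2 - u_sb) / (1.0 - pi)
    b = y + transfer
    return b - (y / w_n) ** 2 / 2.0


def slope(w_u, w_n, q, pi, u_sb, upper, step=1e-4):
    """Central finite-difference estimate of dU^n / dU^u_SB."""
    up = nonuser_utility(w_u, w_n, q, pi, u_sb + step, upper)
    down = nonuser_utility(w_u, w_n, q, pi, u_sb - step, upper)
    return (up - down) / (2.0 * step)


if __name__ == "__main__":
    # Hypothetical parameters with q chosen so that (w_u - q) * w_u == w_n**2.
    w_u, w_n, pi = 12.0, 10.0, 0.2
    q = (w_u**2 - w_n**2) / w_u
    closed_form = (w_u**2 / w_n**2 - pi) / (1.0 - pi)
    print(slope(w_u, w_n, q, pi, 30.0, upper=False), closed_form)
    print(slope(w_u, w_n, q, pi, 30.0, upper=True), closed_form)
```

Under these assumptions both numerical slopes agree with the closed-form expression and are positive, so lowering the utility of users also lowers the utility of non-users, which is why no point with \(U^{n}>U_{LF}^{n}\) belongs to the frontier in this knife-edge case.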
Appendix C
Proof of Proposition 3
Consider first the case when \( {\overline{V}}^{n}<U_{LF}^{n}\). For \(q>{\overline{q}}\), (A13) takes a negative sign, i.e. \(\partial U^{u}\left( Y_{A},B_{A}\right) /\partial {\overline{V}} ^{n}<0\), when
Under our assumption that \(\pi <1-\left( w^{n}\right) ^{2}/\left( w^{u}\right) ^{2}\), inequality (C1) is always satisfied. Therefore, one can keep raising the utility of users until \({\overline{V}}^{n}\) is pushed down to the value \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) (implying that \(Y_{A}\), as defined by (A7), reaches its lower bound \( Y_{A}=0\), and \(U^{u}\left( Y_{A},B_{A}\right) =U_{SB}^{n}=\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\)).
Consider now the expression for \(\partial U^{u}\left( Y_{B},B_{B}\right) /\partial {\overline{V}}^{n}\) provided by (A16). When \(q>{\overline{q}} \), we have that \(\partial U^{u}\left( Y_{B},B_{B}\right) /\partial {\overline{V}}^{n}<0\) when
Under our assumption that \(\pi <1-\left( w^{n}\right) ^{2}/\left( w^{u}\right) ^{2}\), (C2) is satisfied as long as
where the RHS of (C3) is smaller than \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) for \(\left( 1-\frac{\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}}>\right) \pi >1-\frac{\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}}+\frac{w^{u}\left( w^{u}-q\right) -\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}}\) and it is larger than or equal to \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) for \(\pi \le 1-\frac{\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}}+\frac{w^{u}\left( w^{u}-q\right) -\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}}\). Notice in particular that, when \(\pi \) is lower than but sufficiently close to \(1-\left( w^{n}\right) ^{2}/\left( w^{u}\right) ^{2}\), the RHS of () defines a value that is smaller than \(\left( w^{n}\right) ^{2}/2\), i.e. it violates our constraint (5).
From (A18) we can also see that, for \(q>{\overline{q}}\) (i.e., \(\left( w^{u}-q\right) w^{u}<\left( w^{n}\right) ^{2}\)), \(U^{u}\left( Y_{A},B_{A}\right) >U^{u}\left( Y_{B},B_{B}\right) \). Thus, when \(q>{\overline{q}}\) and (A6) is violated, a second-best optimum will entail a downward distortion on the labor supply of users (\(T^{\prime }\left( Y_{SB}^{u}\right) >0\)) as long as \({\overline{V}}^{n}\ge \left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\). As we have noticed above, when \({\overline{V}}^{n}=\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) and users are offered the corresponding \(\left( Y_{A},B_{A}\right) \)-bundle, their utility is \(U^{u}\left( Y_{A},B_{A}\right) =U_{SB}^{n}=\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\). Moreover, when \({\overline{V}}^{n}=\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\), \(Y_{A}\) is pushed to its lower bound, i.e. \(Y_{A}=0\), which means that one has reached the limit of redistribution that can be accomplished by downward distorting the users’ labor supply. Therefore, the second-best PF can include points where \({\overline{V}}^{n}<\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) if and only if, by pushing the utility of non-users below \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) and offering users the corresponding \(\left( Y_{B},B_{B}\right) \)-bundle (i.e., distorting upwards their labor supply), it is possible to raise the users’ utility above \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\). To verify whether this is indeed possible, notice first that, according to (C3), \(\partial U^{u}\left( Y_{B},B_{B}\right) /\partial {\overline{V}}^{n}<0\) requires \({\overline{V}}^{n}\) to be sufficiently small. Taking into account that in our analysis we require the constraint (5) to be satisfied, it then follows that the second-best PF will include points where \({\overline{V}}^{n}<\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) if and only if the following condition holds:
Evaluating (A11) at \({\overline{V}}^{n}=\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) and (A12) at \({\overline{V}}^{n}=\left( w^{n}\right) ^{2}/2\), condition (C4) can be restated as
Simplifying and collecting terms, the inequality above can be rewritten as
or equivalently, dividing all terms by \(\left( w^{n}\right) ^{2}\), as
Based on (C5), we can then conclude that (C4) is satisfied provided that
One should however not forget that Proposition 3 deals with the case when \(q>{\overline{q}}\), i.e. \(\left( w^{n}\right) ^{2}>w^{u}\left( w^{u}-q\right) \). Therefore, we should also check that the condition (C6) is indeed compatible with \(\left( w^{n}\right) ^{2}>w^{u}\left( w^{u}-q\right) \). For this purpose, what is required is to show that the following inequality is satisfied:
where the RHS of (C7) is a restatement of the RHS of (C6).
Dividing all terms by \(w^{u}\), one can rewrite (C7) as
from which, after some simple manipulations, one obtains
or equivalently \(\pi <q/w^{u}\).
Given that \(q>{\overline{q}}\Longrightarrow 1-\frac{\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}}<\frac{q}{w^{u}}\), inequality (C7) is always satisfied under our assumption that \(\pi <1-\frac{\left( w^{n}\right) ^{2}}{\left( w^{u}\right) ^{2}}\).
Finally, since it can be easily established that the RHS of (C6) is strictly smaller than \(\left( w^{u}\right) ^{2}\), one can conclude that the second-best PF also contains values of \({\overline{V}}^{n}\) that are lower than \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\) when the following condition holds:
or equivalently, for values of q such that
When (C8) is satisfied so that the second-best PF includes a region where \(U_{LF}^{n}\le {\overline{V}}^{n}<(1-\pi )U_{LF}^{n}\), any point on that region corresponds to an allocation at which \(T^{\prime }\left( Y_{SB}^{n}\right) =0\) and \(T^{\prime }\left( Y_{SB}^{u}\right) <0\).
Having established under which conditions the second-best PF also contains values of \({\overline{V}}^{n}\) that are lower than \(\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\), what is left to prove is that, when this happens, the second-best PF is disconnected. For this purpose, taking into account that \(\forall \,{\overline{V}}^{n}\in [\left( 1-\pi \right) \frac{\left( w^{n}\right) ^{2}}{2},\frac{\left( w^{n}\right) ^{2}}{2})\) we have \(U^{u}\left( Y_{A},B_{A}\right) -U^{u}\left( Y_{B},B_{B}\right) >0\), it is sufficient to show that the following condition holds:
Evaluating both (A13) and (A16) at \({\overline{V}}^{n}=\left( 1-\pi \right) \left( w^{n}\right) ^{2}/2\), the inequality above requires that
Multiplying both sides by \(\left( w^{u}\right) ^{2}\pi \), simplifying and collecting terms, one can rewrite (C9) as
which is satisfied given that Proposition 3 refers to the case when \(q>{\overline{q}}\) (which implies \(w^{u}\left( w^{u}-q\right) <\left( w^{n}\right) ^{2}\)).
Consider now the case when \({\overline{V}}^{n}>U_{LF}^{n}\). For \(q>{\overline{q}}\), (A32) is never satisfied, which implies that it is never the case that \(\partial U^{n}\left( Y_{C},B_{C}\right) /\partial U_{SB}^{u}<0\). Regarding the sign of \(\partial U^{n}\left( Y_{D},B_{D}\right) /\partial U_{SB}^{u}\), we have instead that (A36) holds, and therefore \(\partial U^{n}\left( Y_{D},B_{D}\right) /\partial U_{SB}^{u}<0\), for values of \(U_{SB}^{u}\) that satisfy (A33). Thus, when \(q>{\overline{q}}\) and (A23) is violated, a second-best optimum will necessarily entail an upward distortion on the labor supply of non-users (\(T^{\prime }\left( Y_{SB}^{n}\right) <0\)). The maximum value that the non-users’ utility can achieve along the second-best PF depends on whether the lower bound for \(U_{SB}^{u}\) provided by (A33) is larger than the lower bound that we have assumed in (6). If the RHS of (A33) is larger than \(\left( w^{u}-q\right) ^{2}/2\), i.e. if
the maximum utility that can be achieved by non-users along the second-best PF is found by evaluating (A30) at the value for \(U_{SB}^{u}\) provided by the RHS of (A33). In this case, after tedious calculations, one obtains that
If instead the RHS of (A33) is weakly smaller than \(\left( w^{u}-q\right) ^{2}/2\), i.e. if
the maximum utility that can be achieved by non-users along the second-best PF is found by evaluating (A30) at \(U_{SB}^{u}=\left( w^{u}-q\right) ^{2}/2\). This gives:
\(\square \)
Appendix D
Proof of Proposition 4
Assume that \(\left( w^{u}-q\right) w^{u}\ne \left( w^{n}\right) ^{2}\) and that \(s=0\). If incentive-compatibility considerations were not an issue, the government would assign, respectively, non-users and users to the undistorted bundles
and
With \(U_{LF}^{n}\le {\overline{V}}^{n}<U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{Y_{LF}^{n}}\), we know from Appendix A that users cannot be offered the bundle (D2) since it violates the self-selection constraint requiring non-users not to be tempted to choose the bundle intended for users. What we want to ascertain is whether, by properly choosing the subsidy rate s, the government can offer users an undistorted bundle while at the same time preventing mimicking from non-users.
Assume that the government introduces a subsidy at rate \(s>0\) and that it offers to users the bundle
while keeping unchanged at (D1) the bundle for non-users.
Comparing (D2) and (D3), we can see that, whereas \(Y^{u}\) is the same, the value of \(B^{u}\) in (D3) has been lowered by an amount \(\left( w^{u}-q\right) qs=\left( Y^{u}/w^{u}\right) qs=h^{u}qs\), which exactly offsets the saving that users enjoy due to the subsidy on job-related expenses. Therefore, the bundle (D3) represents an undistorted bundle that allows users to achieve the same utility as under the bundle (D2). The difference is that, while offering (D2) with \(s=0\) is not incentive-compatible, offering (D3) with \(s>0\) prevents mimicking by non-users provided that the following condition holds:
i.e. provided that
Solving for the minimum value for s, denoted by \(s^{*}\), that satisfies inequality (D4), one gets:
Thus, when s is set according to (D5) the government could offer users an undistorted bundle, without inducing mimicking by non-users, even when \({\overline{V}}^{n}<U_{LF}^{n}-\frac{\pi }{2}\frac{(Y_{LF}^{u}-Y_{LF}^{n})^{2}}{Y_{LF}^{n}}\), a result that could not be achieved if the government only relied on a nonlinear income tax.
However, this does not allow concluding that the second-best optimum will necessarily coincide with the first-best optimum. In fact, once s is chosen according to (D5), the other self-selection constraint, i.e. the one requiring users not to be tempted to mimic non-users, may become binding.^{Footnote 34} The reason is that, since non-users are still offered the undistorted bundle (D1), the consumption available for a user behaving as a mimicker, i.e. choosing the bundle intended for non-users, increases by the amount \(\left( w^{n}\right) ^{2}sq/w^{u}\), where \(\left( w^{n}\right) ^{2}/w^{u}\) represents the labor supply of a user behaving as a mimicker. In particular, users will not have an incentive to mimic non-users if the following condition holds:
where the LHS of the inequality above represents the utility achieved by users at the undistorted bundle offered to them by the government, and the RHS represents the utility that they would achieve if they were to choose the bundle (D1) intended for non-users.^{Footnote 35}
Rewriting (D6) as
and substituting for s the value provided by (D5) gives:
Multiplying both sides of the inequality above by \(w^{u}q\), one obtains
which can be rewritten as
Multiplying both sides of the inequality above by \(2\left( w^{n}\right) ^{2}w^{u}\) gives
or, equivalently:
so that the no-mimicking condition (D6) can be restated as follows:
From (D7) one can see that, when \(w^{u}\left( w^{u}-q\right) -\left( w^{n}\right) ^{2}\ge 0\), users have no incentive to mimic non-users. When instead \(\left( w^{n}\right) ^{2}>w^{u}\left( w^{u}-q\right) \), users have no incentive to mimic non-users when the following condition holds:
namely when
Therefore, when \(\left( w^{n}\right) ^{2}>w^{u}\left( w^{u}-q\right) \) and (D8) is violated, an optimal nonlinear income tax coupled with an optimal subsidy on job-related expenses will not allow implementing a first-best optimum.^{Footnote 36}
Notice also that the RHS of (D8) defines a value for \({\overline{V}}^{n}\) that is lower than \(\frac{\left( w^{n}\right) ^{2}}{2}-\frac{\pi \left[ \left( w^{u}-q\right) w^{u}-\left( w^{n}\right) ^{2}\right] ^{2}}{2\left( w^{n}\right) ^{2}}\) when \(\left( w^{n}\right) ^{2}>w^{u}\left( w^{u}-q\right) \).^{Footnote 37} Thus, provided that \({\overline{V}}^{n}\) is not too low, an optimal subsidy allows implementing the first-best allocation even when \(\left( w^{n}\right) ^{2}>w^{u}\left( w^{u}-q\right) \). In particular, the range of values for \({\overline{V}}^{n}\) for which this occurs is given by:
So far, our analysis has relied on the assumption that \(\left( w^{u}-q\right) w^{u}\ne \left( w^{n}\right) ^{2}\) so that \(Y_{LF}^{u}\ne Y_{LF}^{n}\). If instead \(Y_{LF}^{u}=Y_{LF}^{n}\), it is easy to see that supplementing a nonlinear income tax with a subsidy on job-related expenses allows implementing a first-best optimum. In fact, assume that \(\left( w^{n}\right) ^{2}/2\le {\overline{V}}^{n}<\left( w^{n}\right) ^{2}/2=U_{LF}^{n}\). By offering to all agents, users and non-users, the bundle \(\left( Y,B\right) =\left( \left( w^{n}\right) ^{2},\frac{\left( w^{n}\right) ^{2}}{2}+{\overline{V}}^{n}\right) \) and setting \(s=\frac{\frac{\left( w^{n}\right) ^{2}}{2}-{\overline{V}}^{n}}{\left( w^{u}-q\right) q\pi }\), one achieves redistribution (\(U_{SB}^{n}={\overline{V}}^{n}<U_{LF}^{n}\); \(U_{SB}^{u}=U_{LF}^{u}+\frac{1-\pi }{\pi }\left( U_{LF}^{n}-{\overline{V}}^{n}\right) >U_{LF}^{u}\)), while at the same time leaving undistorted the labor supply of all agents (\(Y_{LF}^{n}=Y_{SB}^{n}=Y_{LF}^{u}=Y_{SB}^{u}\)), maintaining incentive-compatibility (given that all agents are offered the same bundle in the (Y, B)-space), and satisfying the public budget constraint (since the cost of the subsidy benefiting users, i.e. \(\left( w^{u}-q\right) sq\pi \), is exactly matched by the total revenue collected through the income tax, i.e. \(\frac{\left( w^{n}\right) ^{2}}{2}-{\overline{V}}^{n}\)). \(\square \)
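The pooling construction in the last paragraph of this proof can be checked numerically. The sketch below is illustrative only: it assumes the quasi-linear quadratic utilities implied by the laissez-faire values reported in the paper, \(U^{n}=B-\left( Y/w^{n}\right) ^{2}/2\) for non-users and \(U^{u}=B-(1-s)qY/w^{u}-\left( Y/w^{u}\right) ^{2}/2\) for users, and the parameter values are hypothetical ones chosen so that \(\left( w^{n}\right) ^{2}=w^{u}\left( w^{u}-q\right) \).

```python
import math


def pooling_with_subsidy(w_u, w_n, q, pi, v_bar_n):
    """Evaluate the common bundle and subsidy rate for the case Y_LF^u = Y_LF^n.

    Assumed utilities (inferred, not quoted from the paper):
      non-users: U^n = B - (Y / w_n)^2 / 2
      users:     U^u = B - (1 - s) * q * Y / w_u - (Y / w_u)^2 / 2
    """
    # The construction only applies when laissez-faire incomes coincide.
    assert math.isclose(w_n ** 2, w_u * (w_u - q)), "requires (w_n)^2 = w_u (w_u - q)"

    y = w_n ** 2                      # common, undistorted income
    b = w_n ** 2 / 2 + v_bar_n        # common consumption
    h_u = y / w_u                     # users' hours, equal to w_u - q here
    s = (w_n ** 2 / 2 - v_bar_n) / ((w_u - q) * q * pi)

    u_n = b - (y / w_n) ** 2 / 2
    u_u = b - (1 - s) * q * h_u - h_u ** 2 / 2

    # Closed form for the users' utility stated in the text.
    u_lf_u, u_lf_n = (w_u - q) ** 2 / 2, w_n ** 2 / 2
    u_u_formula = u_lf_u + (1 - pi) / pi * (u_lf_n - v_bar_n)

    return {
        "s": s,
        "u_n": u_n,
        "u_u": u_u,
        "u_u_formula": u_u_formula,
        "revenue": y - b,                  # income tax collected per capita
        "subsidy_cost": pi * s * q * h_u,  # subsidy paid to the share pi of users
    }


# Hypothetical parameters: 5 * (5 - 1.8) = 16 = 4^2, and pi < 1 - 16/25.
res = pooling_with_subsidy(w_u=5.0, w_n=4.0, q=1.8, pi=0.25, v_bar_n=7.0)
for key, value in res.items():
    print(f"{key}: {value:.4f}")
```

With these inputs the income tax revenue and the subsidy cost coincide, non-users obtain exactly \({\overline{V}}^{n}\), and the users' utility computed directly from the bundle matches the closed-form expression in the text.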
Appendix E
Proof that, under a max-min social welfare function, both self-selection constraints are binding in equilibrium when s is optimally chosen and \(Y_{LF}^{u}<Y_{LF}^{n}\).
Notice first that, when \(Y_{LF}^{u}<Y_{LF}^{n}\), we have that \(U_{LF}^{u}<U_{LF}^{n}\) so that a max-min planner will redistribute from non-users to users. From Appendix D we know that, optimizing s along with a nonlinear income tax, a first-best optimum can be implemented as long as \({\overline{V}}^{n}\ge U_{LF}^{n}-\frac{\pi }{2}\frac{2w^{u}-q}{w^{u}}\left( Y_{LF}^{n}-Y_{LF}^{u}\right) \); for \({\overline{V}}^{n}<U_{LF}^{n}-\frac{\pi }{2}\frac{2w^{u}-q}{w^{u}}\left( Y_{LF}^{n}-Y_{LF}^{u}\right) \), the government will instead implement a second-best optimum where both self-selection constraints are binding. Since the LHS of (D6) provides an expression for \(U^{u}\left( {\overline{V}}^{n}\right) \) under a first-best optimum, we have that, for \({\overline{V}}^{n}=U_{LF}^{n}-\frac{\pi }{2}\frac{2w^{u}-q}{w^{u}}\left( Y_{LF}^{n}-Y_{LF}^{u}\right) =\frac{1}{2}\left\{ \left( w^{n}\right) ^{2}-\pi \frac{2w^{u}-q}{w^{u}}\left[ \left( w^{n}\right) ^{2}-w^{u}\left( w^{u}-q\right) \right] \right\} \),
which implies that \(U^{u}\left( {\overline{V}}^{n}\right) <{\overline{V}}^{n}\). In fact, we have:
This shows that a max-min planner will implement an equilibrium where \({\overline{V}}^{n}<U_{LF}^{n}-\frac{\pi }{2}\frac{2w^{u}-q}{w^{u}}\left( Y_{LF}^{n}-Y_{LF}^{u}\right) \). But at this equilibrium, both self-selection constraints will be binding.
Appendix F
In the numerical examples below all numbers are rounded to two decimals to enhance readability. The exact numbers are available upon request.
Switching from a separating to a pooling equilibrium when \({\overline{V}}^{n}\) is gradually reduced and work-related expenses are subsidized. Assume that \(w^{u}=11\), \(w^{n}=10\), \(q=5\) and \(\pi =1/2\). Under laissez-faire we have that \(Y_{LF}^{u}=\left( w^{u}-q\right) w^{u}=66\) and \(Y_{LF}^{n}=\left( w^{n}\right) ^{2}=100\), with \(U_{LF}^{u}=\left( w^{u}-q\right) ^{2}/2=18\) and \(U_{LF}^{n}=\left( w^{n}\right) ^{2}/2=50\). Set \({\overline{V}}^{n}=U_{LF}^{n}-\frac{\pi }{2}\frac{2w^{u}-q}{w^{u}}\left( Y_{LF}^{n}-Y_{LF}^{u}\right) =36.86\), which represents the minimum value for \({\overline{V}}^{n}\) that still allows the government to implement a first-best optimum when a nonlinear income tax is supplemented with an optimal subsidy. At the solution to the government’s problem we get \(s=68.31\%\), \(\left( Y^{u},B^{u}\right) =\left( 66,58.64\right) \), \(\left( Y^{n},B^{n}\right) =\left( 100,86.86\right) \), \(U^{u}\left( Y^{u},B^{u}\right) =31.1364\), \(U^{n}\left( Y^{n},B^{n}\right) =36.86\). We also have that \(T^{\prime }\left( Y^{n}\right) =0\) and \(T^{\prime }\left( Y^{u}\right) =31.05\%\). Even though users face a positive marginal income tax rate, their marginal effective tax rate, which is given by \(T^{\prime }\left( Y^{u}\right) -sq/w^{u}\), is equal to 0.^{Footnote 38}
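The figures in this first example can be reproduced with a few lines of arithmetic. The sketch below is not the authors' code: it assumes the quadratic quasi-linear utilities implied by the laissez-faire values above, and it recovers the subsidy rate by making the non-users' self-selection constraint bind, since the closed form (D5) is not reproduced here.

```python
# Parameters of the numerical example in Appendix F.
w_u, w_n, q, pi = 11.0, 10.0, 5.0, 0.5

# Laissez-faire incomes and utilities.
y_u, y_n = (w_u - q) * w_u, w_n ** 2
u_lf_u, u_lf_n = (w_u - q) ** 2 / 2, w_n ** 2 / 2

# Lowest V^n for which the first best is implementable (formula from the text).
v_bar_n = u_lf_n - (pi / 2) * ((2 * w_u - q) / w_u) * (y_n - y_u)

# First-best utility of users: the tax raised on non-users is passed to users.
u_u = u_lf_u + (1 - pi) / pi * (u_lf_n - v_bar_n)

# Bundles: non-users stay undistorted; B^u is pinned down by the binding
# constraint that a non-user mimicking users obtains exactly V^n (assumption).
b_n = v_bar_n + (y_n / w_n) ** 2 / 2
b_u = v_bar_n + (y_u / w_n) ** 2 / 2

# Subsidy rate that delivers u_u to users at (y_u, b_u).
h_u = y_u / w_u
s = 1 - (b_u - u_u - h_u ** 2 / 2) / (q * h_u)

# Marginal income tax rate of users and their marginal effective tax rate.
mtr_u = 1 - ((1 - s) * q / w_u + y_u / w_u ** 2)
metr_u = mtr_u - s * q / w_u

# Utility of a user who picks the non-users' bundle, and the public budget.
u_u_mimic = b_n - (1 - s) * q * y_n / w_u - (y_n / w_u) ** 2 / 2
budget = pi * (y_u - b_u) + (1 - pi) * (y_n - b_n) - pi * s * q * h_u

print(f"V^n = {v_bar_n:.2f}, s = {100 * s:.2f}%, B^u = {b_u:.2f}, B^n = {b_n:.2f}")
print(f"U^u = {u_u:.4f}, T'(Y^u) = {100 * mtr_u:.2f}%, effective rate = {metr_u:.4f}")
print(f"user mimicking utility = {u_u_mimic:.4f}, budget balance = {budget:.6f}")
```

Under these assumptions the user who mimics non-users obtains the same utility as at the bundle intended for users, which is consistent with this value of \({\overline{V}}^{n}\) being the threshold below which both self-selection constraints bind.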
Lowering \({\overline{V}}^{n}\) to 36.5 one gets a second-best optimum which is a separating equilibrium where both self-selection constraints bind. In particular, \(s=69.79\%\), \(\left( Y^{u},B^{u}\right) =\left( 60.88,55.03\right) \), \(\left( Y^{n},B^{n}\right) =\left( 97.35,83.89\right) \), \(U^{u}\left( Y^{u},B^{u}\right) =31.36\), \(U^{n}\left( Y^{n},B^{n}\right) =36.5\). Moreover, \(T^{\prime }\left( Y^{n}\right) =2.65\%\), \(T^{\prime }\left( Y^{u}\right) =35.95\%\) and the marginal effective tax rate faced by users is equal to 4.23%. The labor supply of both types is downward distorted.
Setting \({\overline{V}}^{n}=35\), one gets a separating equilibrium where \( U^{u}=31.42\) and \(U^{n}=36.05\); since \(U^{n}>{\overline{V}}^{n}\), it follows that \({\overline{V}}^{n}=35\) does not belong to the domain of the function \( U^{u}\left( {\overline{V}}^{n}\right) \) which describes the Pareto frontier.
Finally, assume that \({\overline{V}}^{n}=33\). In this case the solution to the government’s problem is a pooling equilibrium where \(Y^{u}=Y^{n}=84.62\), \(B^{u}=B^{n}=68.80\) and \(s=82.25\%\). At this pooling equilibrium \(U^{u}=32.38\), \(U^{n}=33\) and both the labor supply and the consumption of users are lower than for non-users. The labor supply of users is upward distorted and the labor supply of non-users is downward distorted.^{Footnote 39}
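The separating allocation at \({\overline{V}}^{n}=36.5\) and the pooling allocation at \({\overline{V}}^{n}=33\) can be cross-checked against the reported utilities and tax rates. The sketch below takes the rounded bundles and subsidy rates from the text as inputs, so agreement is only expected up to rounding; the helper `evaluate` and the assumed utility functions are illustrative and not taken from the paper.

```python
def evaluate(y_u, b_u, y_n, b_n, s, w_u=11.0, w_n=10.0, q=5.0, pi=0.5):
    """Utilities, marginal rates and budget balance at a given allocation.

    Assumed utilities (inferred from the laissez-faire values in the text):
      non-users: U^n = B - (Y / w_n)^2 / 2
      users:     U^u = B - (1 - s) * q * Y / w_u - (Y / w_u)^2 / 2
    """
    h_u, h_n = y_u / w_u, y_n / w_n
    u_u = b_u - (1 - s) * q * h_u - h_u ** 2 / 2
    u_n = b_n - h_n ** 2 / 2
    mtr_n = 1 - y_n / w_n ** 2                        # T'(Y^n)
    mtr_u = 1 - ((1 - s) * q / w_u + y_u / w_u ** 2)  # T'(Y^u)
    metr_u = mtr_u - s * q / w_u                      # users' effective rate
    budget = pi * (y_u - b_u) + (1 - pi) * (y_n - b_n) - pi * s * q * h_u
    return {"u_u": u_u, "u_n": u_n, "mtr_n": mtr_n,
            "mtr_u": mtr_u, "metr_u": metr_u, "budget": budget}


# Separating equilibrium reported for V^n = 36.5 (rounded inputs).
sep = evaluate(y_u=60.88, b_u=55.03, y_n=97.35, b_n=83.89, s=0.6979)

# Pooling equilibrium reported for V^n = 33 (rounded inputs).
pool = evaluate(y_u=84.62, b_u=68.80, y_n=84.62, b_n=68.80, s=0.8225)

for name, out in (("separating", sep), ("pooling", pool)):
    print(name, {k: round(v, 4) for k, v in out.items()})
```

A positive rate signals a downward distortion of labor supply and a negative effective rate an upward one, so the signs of `mtr_n` and `metr_u` can be compared directly with the distortions described in the text.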
Numerical example showing the possibility that a distortion arises even though no self-selection constraint is binding at a second-best optimum. Assume that the users’ work-related costs are given by the concave function \(\varphi \left( h\right) =5h+0.5\sqrt{h}\). Furthermore, assume that \(w^{u}=12.87\), \(w^{n}=10\), and \(\pi =1/5\). Under laissez-faire we have that \(Y_{LF}^{u}=100.13\) and \(Y_{LF}^{n}=\left( w^{n}\right) ^{2}=100\), with \(U_{LF}^{u}=29.57\) and \(U_{LF}^{n}=50\). Assume that in the Pareto efficient tax problem \({\overline{V}}^{n}\) is set equal to 40.01. At a second-best optimum we get that \(Y_{SB}^{u}=0\), so that the labor supply of users is distorted downwards, \(Y_{SB}^{n}=100\) (no distortion on the labor supply of non-users), \(U_{SB}^{n}=40.01\) and \(U_{SB}^{u}=39.96\).^{Footnote 40} However, since the utility for a non-user choosing the bundle intended for users would be equal to 39.96, and the utility for a user choosing the bundle intended for non-users would be equal to 19.58, it follows that no self-selection constraint is binding at the second-best optimum. Nonetheless, observe that without a self-selection constraint requiring non-users not to be tempted to mimic users, the latter could have been offered an undistorted bundle (in our example, the bundle \(\left( Y,B\right) =\left( 100.13,140.09\right) \)).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bastani, S., Blomquist, S. & Micheletto, L. Pareto efficient income taxation without single-crossing. Soc Choice Welf 55, 547–594 (2020). https://doi.org/10.1007/s00355-020-01257-z