“Educational policies deserve to be programmed not only with a view to improving education in the widest sense, but also in order to influence the income distribution” (Tinbergen 1975, p. 18).

1 Introduction

In his book Income Distribution, the Dutch Nobel-prize winner Tinbergen (1975) extensively discusses the merits of increasing the supply of skilled workers relative to unskilled workers to reduce wage inequality. As the relative supply of skilled workers rises, the skill premium is lowered, and wage inequality diminishes. Tinbergen’s concern with growing inequality between skilled and unskilled workers is more relevant today than it was in the 1970s. Many Western countries are currently confronted with sharply increasing skill premiums. Skill-biased technological change causes the demand for skilled workers to increase more rapidly than the supply of skilled workers (Katz and Autor 1999). In Tinbergen’s terminology: the race between education and technological change is currently lost by education. Also, globalization may jeopardize the prospects for low-skilled workers. In light of the deteriorating labor market position of low-skilled workers, it is not surprising that subsidies to foster skill formation have a strong policy appeal. By boosting human capital formation, equality may be served because general equilibrium effects on wages reduce the skill premium. As there is less pre-tax inequality, the need to redistribute incomes through distorting income taxes may diminish at the same time.

The main question of this paper is: Should general equilibrium effects on the wage distribution be exploited in an optimal redistributive tax-cum-education system? To answer this question, this paper analyzes optimal redistributive tax and education policies in a Mirrlees (1971)-framework. Due to imperfect substitution between different skill types in labor demand, the skill premium is determined by both demand and supply conditions in the labor market. Furthermore, skill levels are endogenously determined by human capital investments, and not exogenous as in Mirrlees (1971).

Naturally, governments use the income tax and transfer system to redistribute resources. An important question is, therefore, whether education policies should optimally complement the redistributive tax system. Bovenberg and Jacobs (2005) derive, in a partial equilibrium framework, that human capital formation should neither be taxed nor subsidized (on a net basis) in an optimal redistributive program with linear or nonlinear taxes and subsidies. The intuition is as follows. On the one hand, subsidies on education are implicit subsidies on work effort, since working and learning are complementary in generating income. On the other hand, education subsidies are regressive, since high-ability individuals invest more in human capital (the ‘ability bias’). With the earnings functions used by Bovenberg and Jacobs (2005), both effects cancel out, and the sole role of education subsidies is to offset the distortions of the income tax on skill formation. Indeed, the Diamond and Mirrlees (1971) production efficiency theorem dictates that all investments in human capital (an intermediate input) are efficient under these conditions (Jacobs and Bovenberg 2011).

This paper closely follows the analysis of Bovenberg and Jacobs (2005). However, the main difference is that the wages of workers are now endogenous. This paper maintains their assumptions to ensure that human capital is optimally not subsidized in the absence of general equilibrium effects on wages. In particular, the earnings function is assumed to be weakly separable between ability, labor, and education, both with linear and nonlinear policies. Additionally, the production function for human capital is assumed to have a constant elasticity if linear policies are considered, see also Jacobs and Bovenberg (2011). Individuals also have an iso-elastic utility function, so that labor supply elasticities are constant, and income effects are absent. This simple utility function highlights the crucial role of human capital supply elasticities under linear policy instruments. Finally, the analysis is restricted to two types of agents that differ in their ability to acquire human capital. We refer to the high-ability type as the ‘skilled’ and the low-ability type as the ‘unskilled’ agent.Footnote 1

Whether education policy should optimally be employed to provoke redistributive general equilibrium effects on the skill premium is shown to depend crucially on two things. First, can education policy affect the relative supply of human capital in such a way that the skill premium falls? The skill premium will be a downward sloping function of the relative supplies of total ‘effective labor supply’. Effective labor supply is the product of the quantity of labor supply (i.e., hours worked) and the quality of labor supply (i.e., human capital), which is determined by human capital investment. Hence, education subsidies must increase the relative effective labor supply to be helpful in reducing wage inequality. Second, if education subsidies do indeed increase relative supply of effective labor, should education subsidies, when optimally combined with an income tax, also be used? The answers to both questions are not obvious, and they differ fundamentally for linear and nonlinear policy instruments.

The first part of this paper considers optimal linear tax and education subsidies. Linear education subsidies tend to increase effective labor supply of both workers at the same time, and it is not clear whether relative supply of human capital is increased at all. The effective labor supply of each agent depends on his human capital and his labor effort. It is demonstrated that general equilibrium effects can never be exploited for redistributional reasons when effective labor supply elasticities are equal across agents, and income effects are absent. The reason is that relative effective labor supply remains fixed. Indeed, linear education subsidies will only compress wage differentials if the skilled worker’s supply of effective labor is more elastic with respect to the subsidy than the unskilled worker’s effective labor supply. As agents are assumed to have human capital production functions with a constant elasticity (under linear instruments only), the labor supply elasticity of the skilled worker has to be higher than that of the unskilled worker for education subsidies to have the potential to compress the wage distribution.

The second question is: If the skilled worker has a higher labor supply elasticity, and education subsidies therefore can reduce the skill premium, should education subsidies also be employed in an optimal redistributive program alongside optimal linear taxes? With linear policies, the answer to this question is no. A linear education subsidy is distributionally equivalent to a linear income tax in the absence of general equilibrium effects. The reason is that gross income is linear in education as a result of the constant elasticity of the production function for human capital. Hence, it is not optimal to exploit general equilibrium effects with the education subsidy because the income tax can do this equally well, while avoiding excessive investment in human capital. By allowing for general equilibrium effects, this paper demonstrates that the distributional equivalence between income taxes and education taxes is preserved. The linear income tax generates the same general equilibrium effects on the wage distribution as the linear education subsidy, while the former avoids distortions in human capital investments. As a result, the efficiency results of Bovenberg and Jacobs (2005) derived in partial equilibrium carry over to the general equilibrium case.

In contrast, the optimal linear income tax is importantly affected by general equilibrium effects on wages. The assumption that skilled workers have a higher labor supply elasticity than unskilled workers implies that higher income taxes will reduce effective labor supply of the skilled workers relatively more than that of unskilled workers. Linear income taxation thus increases the skill premium, and before-tax income inequality increases. Consequently, general equilibrium effects on wages run against the distributional benefits of a higher income tax, and optimal income taxes are lowered as a result. Theoretically, the optimal linear income tax may even turn negative if the indirect general equilibrium effects on the pre-tax wage distribution are strong enough to offset the direct effects of higher marginal tax rates on the post-tax wage distribution. That optimal linear tax rates could be negative is an important new finding in the literature on optimal linear income taxation (Sheshinski 1972; Dixit and Sandmo 1977; Atkinson and Stiglitz 1980).

The second part of the paper considers the simultaneous setting of optimal nonlinear tax and education policies. Optimal nonlinear policies differ fundamentally from linear policies when general equilibrium effects on wages are present. The key to understanding why they differ is that the relative effective labor supply can, by definition, be directly steered by the nonlinear tax and education schedules—even if all supply elasticities of labor and human capital are equal. By giving a marginal education subsidy on the high type, and a marginal education tax on the low type, the government can directly increase the relative supply of skilled human capital to lower the skill premium. Therefore, the first question—can education policy affect the relative supply of education in such a way that the skill premium falls?—can be answered affirmatively.

The second question is: Should education subsidies be optimally employed in an optimal redistributive program? The answer with nonlinear policies is yes. The skilled worker optimally faces a marginal subsidy on education, while the unskilled worker faces a marginal tax on education. The optimal marginal income tax on the skilled worker is negative, while the optimal marginal tax on the unskilled worker is positive, as in Stern (1982) and Stiglitz (1982). Hence, under nonlinear taxation, the efficiency results break down. We demonstrate that marginal education taxes/subsidies are directly linked to the top rate: marginal subsidies are zero when the top rate is zero. As the skilled worker faces a positive subsidy on both labor effort and education, and the unskilled worker faces a positive tax on both education and labor effort, the skill premium will be reduced. A skilled worker will be less tempted to mimic an unskilled worker, incentive compatibility constraints will be relaxed, and the government can redistribute more income. In contrast to the linear policies, nonlinear instruments do exploit general equilibrium effects for redistribution.

We analyze the quantitative importance of general equilibrium effects by simulating optimal nonlinear income taxes and education subsidies. We find that the marginal top rate is negative, and rather small for plausible elasticities of substitution between skilled and unskilled labor, which confirms the findings of Stern (1982). Optimal education subsidies are not large either, since there is a direct link between the top rate and education subsidies. However, we demonstrate quantitatively that general equilibrium effects should be exploited for redistribution when the elasticity of substitution between skilled and unskilled labor is very low.

Tinbergen’s suggestion to promote skill formation so as to provoke a decline in the skill premium for redistributional reasons is possible only under nonlinear policies. The case is lost under linear tax instruments. Intuitively, the generic linear education subsidy is an inefficient instrument to reduce the skill premium because it increases human capital supply of all agents simultaneously. Nonlinear education subsidies avoid this simultaneous increase in human capital supplies and can be tailored to increase supply of the high-ability types, while simultaneously lowering the supply of the low-ability types. Numerical simulations of optimal nonlinear policies demonstrate education policies can only be usefully employed for redistribution when the high-skilled and low-skilled workers are poor substitutes in production. However, for empirically plausible values of the elasticity of substitution between skilled and unskilled workers, it appears that education subsidies have only a limited role to play in an optimal tax-cum-education policy.Footnote 2

This paper is related to a number of other papers. First, by allowing for endogenous wage rates, this paper contributes to a growing literature on optimal income tax and education policies; for earlier contributions, see, for example, Ulph (1977), Hare and Ulph (1979), Tuomala (1990), and Nielsen and Sørensen (1997). More recent contributions include Maldonado (2008), Alstadsaeter et al. (2008), Grossman and Poutvaara (2009), and Schindler (2011). Second, a number of papers analyze optimal taxation if different skill types earn endogenously determined wages, and the government cannot tax or subsidize all production inputs at different rates. Naito (1999, 2004, 2007) and Pirttilä and Tuomala (2001) show that both the Diamond and Mirrlees (1971) production efficiency theorem, and the Atkinson and Stiglitz (1976) theorem break down with endogenous wages. We also demonstrate that when wages are endogenous the zero commodity tax theorem is not applicable to education and production efficiency in human capital formation ceases to be optimal under nonlinear income taxation.

The rest of this paper is structured as follows. The next section presents the model with optimal linear income taxation and education policies in general equilibrium. Section 3 studies the same problem using nonlinear instruments. Section 4 simulates optimal nonlinear policies. Section 5 concludes. The Appendix contains all the main derivations. Some non-essential derivations and more simulation results are available in an unpublished appendix.Footnote 3

2 Model

The standard models of optimal income taxation with general equilibrium effects on wages are extended with human capital formation.Footnote 4 This section presents the base-line model with linear instruments. Nonlinear instruments will be discussed later. Individuals differ in their capacity to accumulate human capital and earnings capacities of individuals are endogenous rather than exogenously given. Furthermore, individuals with higher ability have a comparative advantage in skill formation. A ‘one-shot’ model of human capital investments is analyzed. One may view this model as describing life-time investments in human capital, life-time labor supply and life-time consumption, where there are no inter-temporal distortions due to capital taxes or capital market failures, for example.Footnote 5 To fully track down the general equilibrium impact of tax and education policies analytically, the analysis is restricted to two types, as in almost the entire literature.

2.1 Individuals

There is a unit mass of high-ability and low-ability workers who are indexed by n=1 and n=2, respectively. The fraction of high-ability workers is \(g_{1}\) and the fraction of low-ability workers is \(g_{2}\). Each worker has an iso-elastic utility function \(u_{n}(c_{n},l_{n})\), which is defined over consumption \(c_{n}\) and work effort \(l_{n}\) according to

$$ u_{n}(c_{n},l_{n})\equiv c_{n}- \frac{l_{n}^{1+1/\varepsilon _{n}}}{1+1/\varepsilon_{n}},\quad n=1,2, $$
(1)

where \(\varepsilon_{n}>0\) is the (un)compensated wage elasticity of labor supply of individual n. Since income effects are absent, compensated and uncompensated elasticities coincide, and labor supply is always upward sloping. This utility function is also used for its analytical simplicity in Diamond (1998), Saez (2001), and Naito (2004). This specification of utility is sufficiently general to stress the main points at stake, while not introducing additional analytical complexity due to income effects, as in Allen (1982). Moreover, it highlights the crucial role of different labor supply elasticities under linear policy instruments. Indeed, elasticities of labor supply are assumed to differ, so that \(\varepsilon_{1}\neq \varepsilon_{2}\). In the absence of income effects, different elasticities of labor supply or human capital formation are necessary to obtain general equilibrium effects of policy. As human capital elasticities are assumed to be equal (see below), linear policy instruments would have no general equilibrium effects if labor supply elasticities were identical.

Gross labor income \(z_{n}\) of each individual is

$$ z_{n}\equiv w_{n}h_{n}l_{n},\quad n=1,2, $$
(2)

where \(w_{n}\) denotes the gross wage rate per unit of human capital of an individual of skill type n, \(h_{n}\) is the level of human capital of each agent, i.e., the number of efficiency units of labor, and \(l_{n}\) denotes work effort. Total ‘effective labor supply’ thus equals \(h_{n}l_{n}\), which encompasses both the quality of labor and the quantity of labor supply.

Human capital is accumulated on the intensive margin. Individuals invest \(e_{n}\) of their resources in education. One can think of \(e_{n}\) as the years enrolled in education or the quality of education where each individual has access to the same educational inputs, but transforms them differently into human capital depending on ability. The production function for human capital is given by

$$ h_{n}\equiv a_{n}\phi (e_{n}),\qquad \phi^{\prime }(e_{n})>0,\qquad \phi^{\prime \prime }(e_{n})<0, \quad n=1,2, $$
(3)

where \(a_{n}\) is the exogenous productivity of investment in human capital; \(a_{1}>a_{2}\), i.e., high-ability types have a comparative advantage in learning. High-ability types thus generate more human capital with the same amount of educational effort, and their marginal return to education is higher because \(\frac{\partial^{2} z_{n}}{\partial e_{n}\partial a_{n}}=w_{n}\phi^{\prime }(e_{n})l_{n}>0\).Footnote 6

To ensure that optimal education subsidies are zero in the absence of general equilibrium effects, the earnings function is weakly separable in ability, education, and labor. The elasticity of the production function is also assumed to be constant under linear policies, and is denoted by \(\beta \equiv \frac{\phi^{\prime }(e_{n})e_{n}}{\phi (e_{n})}\) (Jacobs and Bovenberg 2011). In general equilibrium, the high-ability type is assumed to earn a higher gross wage than the low-ability type so as to obtain an economically meaningful redistribution problem, i.e., \(w_{1}h_{1}l_{1}>w_{2}h_{2}l_{2}\). This assumption guarantees that gross labor earnings for the high-ability type are always larger than gross labor earnings of the low-ability type.

The price of one unit of education is denoted by p and is common for both individuals.Footnote 7 All costs are assumed to be tax deductible since the major costs of education consist of taxed opportunity costs.Footnote 8 Investments in education \(e_{n}\) are subsidized at flat rate s. Gross incomes \(z_{n}\) are taxed at a constant marginal rate t. In addition, every individual may receive a non-individualized lump-sum transfer b. Hence, the income tax is progressive in the sense that average tax rates increase with income. The fundamental informational requirements to levy a linear income tax and to provide linear education subsidies are that aggregate gross incomes and aggregate investment in human capital must be verifiable by the government.

Consumption \(c_{n}\) equals after-tax labor income net of subsidized, tax-deductible education expenditures, plus the lump-sum transfer:

$$ c_{n}=(1-t) \bigl( w_{n}a_{n}\phi (e_{n})l_{n}-(1-s)pe_{n} \bigr) +b,\quad n=1,2. $$
(4)

The first-order conditions for utility maximization yield the following constant elasticity labor supply functions for each individual

$$ l_{n}= \bigl( (1-t)w_{n}a_{n}\phi (e_{n}) \bigr)^{\varepsilon _{n}},\quad n=1,2. $$
(5)

Labor supply \(l_{n}\) increases with the net marginal wage rate, so taxes depress labor supply. The first-order condition for optimal human capital investment is given by

$$ w_{n}a_{n}\phi^{\prime }(e_{n})l_{n}=(1-s)p, \quad n=1,2. $$
(6)

Marginal benefits of learning (the left-hand side) should be equal to the marginal costs of learning (the right-hand side). Subsidies increase investment in human capital. Taxes have no direct effect on learning because both marginal costs and marginal benefits are equally affected. Taxation does, however, reduce labor supply and thereby indirectly lowers the returns to investments in human capital.

First-order conditions are necessary, but not sufficient. Second-order conditions are satisfied by imposing the following restriction on the parameters:Footnote 9

$$ \mu_{n}\equiv 1-\beta (1+\varepsilon_{n})>0,\quad n=1,2. $$
(7)
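The household problem in (1) to (7) can be checked numerically. The following is a minimal sketch, assuming the constant-elasticity form \(\phi (e)=e^{\beta }\) and purely hypothetical parameter values; the function names `utility` and `interior_choice` are illustrative and do not appear in the paper. It solves the first-order conditions (5) and (6) in closed form and confirms that, when \(\mu_{n}>0\), no perturbation of education or labor supply on a small grid yields higher utility.

```python
def utility(e, l, t=0.3, s=0.1, b=0.0, w=1.0, a=1.0, p=1.0, eps=0.3, beta=0.4):
    """Utility (1) evaluated on the budget constraint (4), assuming phi(e) = e**beta."""
    c = (1 - t) * (w * a * e ** beta * l - (1 - s) * p * e) + b
    return c - l ** (1 + 1 / eps) / (1 + 1 / eps)


def interior_choice(t=0.3, s=0.1, w=1.0, a=1.0, p=1.0, eps=0.3, beta=0.4):
    """Education and labor supply that jointly solve FOCs (5) and (6)."""
    mu = 1 - beta * (1 + eps)          # second-order condition (7)
    if mu <= 0:
        raise ValueError("second-order condition (7) violated")
    e = (beta * (w * a) ** (1 + eps) * (1 - t) ** eps / ((1 - s) * p)) ** (1 / mu)
    l = ((1 - t) * w * a * e ** beta) ** eps
    return e, l


e_star, l_star = interior_choice()
u_star = utility(e_star, l_star)

# Residual of FOC (6): w*a*phi'(e)*l - (1-s)*p, with phi'(e) = beta*e**(beta-1).
foc_residual = 1.0 * 1.0 * 0.4 * e_star ** (0.4 - 1) * l_star - (1 - 0.1) * 1.0

# Scale e and l away from the candidate optimum and compare utilities.
factors = (0.5, 0.9, 0.99, 1.0, 1.01, 1.1, 2.0)
is_maximum = all(
    utility(e_star * fe, l_star * fl) <= u_star + 1e-12
    for fe in factors
    for fl in factors
)
```

The grid check is only a spot test, not a proof; the analytical argument is the restriction on \(\mu_{n}\) in (7).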

The tax elasticities of labor supply \(\varepsilon_{n}^{lt}\equiv -\frac{ \partial l_{n}}{\partial t}\frac{(1-t)}{l_{n}}=\frac{\varepsilon_{n}(1-\beta )}{\mu_{n}}\), and education \(\varepsilon_{n}^{et}\equiv - \frac{\partial e_{n}}{\partial t}\frac{(1-t)}{e_{n}}=\frac{\varepsilon_{n}}{ \mu_{n}}\) are important determinants of the optimal tax rates. The tax elasticity of labor earnings \(z_{n}=w_{n}a_{n}\phi (e_{n})l_{n}\) amounts to \(\varepsilon_{n}^{zt}=\varepsilon_{n}^{lt}+\beta \varepsilon_{n}^{et}= \frac{\varepsilon_{n}}{\mu_{n}}\). Note that the tax elasticity of total labor earnings \(z_{n}\) equals the tax elasticity of effective labor supply \(h_{n}l_{n}\), since both wages and ability are exogenous to the individual. The tax elasticity of gross income \(\frac{\varepsilon_{n}}{\mu_{n}}\) exceeds the wage elasticity of gross income \(\varepsilon_{n}\). The reason is that the tax rate t reduces the after-tax wage \((1-t)w_{n}a_{n}\phi (e_{n})\) both directly (by raising the tax wedge t between the before-tax wage and the after-tax wage) and indirectly (by depressing the before-tax wage rate \(w_{n}a_{n}\phi (e_{n})\) through its negative impact on learning \(e_{n}\)). Learning is harmed indirectly because lower labor supply depresses the utilization rate of human capital. Similarly, the subsidy elasticities are given by: \(\varepsilon_{n}^{ls}\equiv \frac{\partial l_{n}}{\partial s}\frac{(1-s)}{l_{n}}=\frac{\beta \varepsilon_{n}}{\mu_{n}}\), \(\varepsilon_{n}^{es}\equiv \frac{\partial e_{n}}{\partial s}\frac{(1-s)}{e_{n}}=\frac{1}{\mu_{n}}\), and \(\varepsilon_{n}^{zs}\equiv \frac{\partial z_{n}}{\partial s}\frac{(1-s)}{z_{n}}=\frac{\beta (1+\varepsilon_{n})}{\mu_{n}}\).
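The closed-form elasticities above can be compared with finite-difference elasticities of the household's choices. The sketch below assumes \(\phi (e)=e^{\beta }\), holds the wage fixed (as the individual does), and uses hypothetical parameter values; `household`, `numeric_elasticities`, and `closed_form` are illustrative names only.

```python
import math


def household(t, s, w=1.0, a=1.0, p=1.0, eps=0.3, beta=0.4):
    """Choices implied by FOCs (5)-(6), assuming phi(e) = e**beta and a given wage."""
    mu = 1.0 - beta * (1.0 + eps)
    e = (beta * (w * a) ** (1 + eps) * (1 - t) ** eps / ((1 - s) * p)) ** (1 / mu)
    l = ((1 - t) * w * a * e ** beta) ** eps
    z = w * a * e ** beta * l
    return {"l": l, "e": e, "z": z}


def numeric_elasticities(t=0.3, s=0.1, d=1e-6, **params):
    """Central-difference versions of the tax and subsidy elasticities."""
    out = {}
    for key in ("l", "e", "z"):
        dlog_dt = (math.log(household(t + d, s, **params)[key])
                   - math.log(household(t - d, s, **params)[key])) / (2 * d)
        dlog_ds = (math.log(household(t, s + d, **params)[key])
                   - math.log(household(t, s - d, **params)[key])) / (2 * d)
        out[key + "t"] = -dlog_dt * (1 - t)   # -(dx/dt) * (1-t)/x
        out[key + "s"] = dlog_ds * (1 - s)    #  (dx/ds) * (1-s)/x
    return out


def closed_form(eps=0.3, beta=0.4):
    """Elasticity formulas as stated in the text."""
    mu = 1.0 - beta * (1.0 + eps)
    return {
        "lt": eps * (1 - beta) / mu, "et": eps / mu, "zt": eps / mu,
        "ls": beta * eps / mu, "es": 1 / mu, "zs": beta * (1 + eps) / mu,
    }


numeric = numeric_elasticities(eps=0.3, beta=0.4)
exact = closed_form(eps=0.3, beta=0.4)
max_gap = max(abs(numeric[k] - exact[k]) for k in exact)
```

Under these assumptions `max_gap` should be of the order of the finite-difference error, which illustrates that the earnings elasticity \(\varepsilon_{n}/\mu_{n}\) combines the labor supply and education responses.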

2.2 Firms

There is one sector of production. A representative firm maximizes profits while taking wage rates \(w_{1}\) and \(w_{2}\) for each skill type as given. The firm produces output Y with a neoclassical production function, which features constant returns to scale in labor inputs \(H_{1}\) and \(H_{2}\):

$$ Y\equiv F(H_{1},H_{2}), $$
(8)

where \(F_{n}(\cdot )>0\), \(F_{nn}(\cdot )<0\), \(F_{12}(\cdot )\geq 0\), n=1,2, and subscript n refers to differentiation with respect to the nth argument. The income share of the low-income earner is denoted by \(\alpha \equiv \frac{w_{2}H_{2}}{w_{1}H_{1}+w_{2}H_{2}}\). Therefore, \(1-\alpha \equiv \frac{w_{1}H_{1}}{w_{1}H_{1}+w_{2}H_{2}}\) is the income share of the high-income earner. Further, if α<1/2, the low-ability type earns less than the high-ability type. \(\sigma \equiv \frac{F_{1}(\cdot)F_{2}(\cdot)}{F_{12}(\cdot)F(\cdot)}\) denotes the partial elasticity of substitution between \(H_{1}\) and \(H_{2}\) in the production function F(⋅).

First-order conditions for profit maximization are necessary and sufficient, and given by

$$ w_{n}=F_{n}(H_{1},H_{2}),\quad n=1,2. $$
(9)

The skill premium π is the ratio of wages of skilled and unskilled workers, i.e., \(\pi \equiv w_{1}/w_{2}\). With constant returns to scale in production, π is only a function of the relative supplies of skilled and unskilled workers, \(H_{1}/H_{2}\) (using \(f(H_{1}/H_{2})\equiv F(H_{1},H_{2})/H_{2}\)):

$$ \pi ( H_{1}/H_{2} ) \equiv \frac{w_{1}}{w_{2}}= \frac{f^{\prime } ( H_{1}/H_{2} ) }{f ( H_{1}/H_{2} ) -(H_{1}/H_{2})f^{\prime } ( H_{1}/H_{2} ) }, $$
(10)

where \(\pi^{\prime }(H_{1}/H_{2})<0\). The skill premium decreases if the relative supply of skilled workers, \(H_{1}/H_{2}\), increases.
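As an illustration of (9) and (10), consider a CES technology. This functional form is an assumption made here for concreteness (the text only requires a neoclassical production function with constant returns to scale), and the share parameter `THETA`, the value of `SIGMA`, and all function names are hypothetical. The sketch checks that the skill premium falls in relative supply, that factor payments exhaust output, and that the definition of σ in terms of \(F_{1}F_{2}/(F_{12}F)\) returns the CES substitution elasticity.

```python
THETA, SIGMA = 0.5, 1.5          # hypothetical share parameter and elasticity
RHO = 1.0 - 1.0 / SIGMA


def output(H1, H2):
    """Assumed CES production function with constant returns to scale."""
    return (THETA * H1 ** RHO + (1 - THETA) * H2 ** RHO) ** (1.0 / RHO)


def wages(H1, H2):
    """Marginal products F_1 and F_2, i.e., the wage conditions (9)."""
    Y = output(H1, H2)
    return THETA * (Y / H1) ** (1 / SIGMA), (1 - THETA) * (Y / H2) ** (1 / SIGMA)


def skill_premium(x):
    """pi as a function of relative supply x = H1/H2 only, cf. (10)."""
    w1, w2 = wages(x, 1.0)
    return w1 / w2


def substitution_elasticity(H1, H2, d=1e-5):
    """sigma = F_1 F_2 / (F_12 F), with F_12 from a central difference of F_1."""
    w1, w2 = wages(H1, H2)
    F12 = (wages(H1, H2 + d)[0] - wages(H1, H2 - d)[0]) / (2 * d)
    return w1 * w2 / (F12 * output(H1, H2))


premiums = [skill_premium(x) for x in (1.0, 2.0, 4.0)]
w1, w2 = wages(2.0, 1.0)
euler_gap = abs(w1 * 2.0 + w2 * 1.0 - output(2.0, 1.0))
sigma_hat = substitution_elasticity(2.0, 1.0)
```

For this CES case the premium equals \(\frac{\theta }{1-\theta }(H_{1}/H_{2})^{-1/\sigma }\), so a lower σ makes the premium more sensitive to relative supply.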

Note that if both types have the same labor supply elasticity \(\varepsilon_{n}\), all the tax and subsidy elasticities will be equal across individuals (see Sect. 2.1). Hence, linear policy instruments cannot affect the skill premium π in that case.

2.3 General equilibrium

Labor market clearing requires that supply equals demand for each labor type:

$$ H_{n}=h_{n}l_{n}g_{n},\quad n=1,2. $$
(11)

Further, goods market equilibrium requires that total output equals total consumption, plus investments in human capital, plus exogenously given government expenditures Λ:

$$ F(H_{1},H_{2})= ( c_{1}+pe_{1} ) g_{1}+ ( c_{2}+pe_{2} ) g_{2}+\varLambda. $$
(12)
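The labor market equilibrium defined by (9) and (11) can be sketched numerically for given linear instruments. The code below is an illustration under stated assumptions, not the paper's simulation: it assumes \(\phi (e)=e^{\beta }\), a CES technology, and hypothetical parameter values, and all names are illustrative. Because constant returns make wages a function of \(H_{1}/H_{2}\) alone, the equilibrium reduces to a one-dimensional root-finding problem in relative supply, solved here by bisection.

```python
import math

BETA, P, THETA = 0.4, 1.0, 0.5      # hypothetical parameter values
ABILITY = (1.2, 1.0)                # a_1 > a_2
SHARES = (0.5, 0.5)                 # g_1 + g_2 = 1


def effective_labor(w, a, eps, t, s):
    """h_n * l_n implied by FOCs (5)-(6), assuming phi(e) = e**BETA."""
    mu = 1.0 - BETA * (1.0 + eps)
    e = (BETA * (w * a) ** (1 + eps) * (1 - t) ** eps / ((1 - s) * P)) ** (1 / mu)
    l = ((1 - t) * w * a * e ** BETA) ** eps
    return a * e ** BETA * l


def ces_wages(x, sigma):
    """Marginal products (9) of an assumed CES technology at x = H1/H2."""
    rho = 1.0 - 1.0 / sigma
    y_per_h1 = (THETA + (1 - THETA) * x ** (-rho)) ** (1 / rho)
    y_per_h2 = (THETA * x ** rho + (1 - THETA)) ** (1 / rho)
    return THETA * y_per_h1 ** (1 / sigma), (1 - THETA) * y_per_h2 ** (1 / sigma)


def excess(log_x, eps, t, s, sigma):
    """Log relative supply implied by wages at x, minus log x; zero in equilibrium."""
    w = ces_wages(math.exp(log_x), sigma)
    H = [effective_labor(w[n], ABILITY[n], eps[n], t, s) * SHARES[n] for n in range(2)]
    return math.log(H[0] / H[1]) - log_x


def skill_premium(eps, t, s, sigma=1.5):
    """Equilibrium w1/w2 found by bisection on log(H1/H2)."""
    lo, hi = -10.0, 10.0
    assert excess(lo, eps, t, s, sigma) > 0 > excess(hi, eps, t, s, sigma)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid, eps, t, s, sigma) > 0:
            lo = mid
        else:
            hi = mid
    w1, w2 = ces_wages(math.exp(0.5 * (lo + hi)), sigma)
    return w1 / w2


equal_eps = [skill_premium((0.3, 0.3), t, 0.0) for t in (0.2, 0.5)]
unequal_eps = [skill_premium((0.5, 0.2), t, 0.0) for t in (0.2, 0.5)]
with_subsidy = skill_premium((0.5, 0.2), 0.2, 0.3)
```

In this sketch the premium is invariant to t when both types share the same labor supply elasticity, rises with t when the skilled type is more elastic, and falls with s in that case, in line with the argument in the text.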

2.4 Government

The government maximizes a social welfare function over indirect utilities \(v_{n}(b,t,s,w_{n})\):

$$ \omega_{1}v_{1}(b,t,s,w_{1})g_{1}+ \omega_{2}v_{2}(b,t,s,w_{2})g_{2},\qquad \omega_{n}\geq 0,\quad n=1,2, $$
(13)

where \(\omega_{n}\) denotes the Pareto weight of type n in social welfare. The Pareto weights sum to one: \(\omega_{1}+\omega_{2}=1\). If \(\omega_{1}=\omega_{2}=\frac{1}{2}\), the social welfare function is utilitarian, and there is no social preference for redistribution due to the constancy of marginal utility of income at the individual level (no income effects). \(\omega_{2}>\omega_{1}\) implies a social preference for redistribution.Footnote 10

The government collects taxes to finance the lump-sum transfer, the education subsidies, and the exogenous revenue requirement Λ. The government budget constraint reads as:

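$$ \sum_{n} \bigl( t \bigl( w_{n}a_{n}\phi (e_{n})l_{n}-(1-s)pe_{n} \bigr) -spe_{n} \bigr) g_{n}=b+\varLambda. $$
(14)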

3 Optimal linear taxes and education subsidies

The government maximizes social welfare by optimally choosing the lump-sum transfer b, the linear marginal tax t and the linear education subsidy s. Formally, the following Lagrangian is maximized:Footnote 11

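$$ \mathcal{L}\equiv \sum_{n}\omega_{n}v_{n}(b,t,s,w_{n})g_{n}+\eta \biggl( \sum_{n} \bigl( t \bigl( w_{n}a_{n}\phi (e_{n})l_{n}-(1-s)pe_{n} \bigr) -spe_{n} \bigr) g_{n}-b-\varLambda \biggr) , $$
(15)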

where η denotes the Lagrange multiplier of the government budget constraint, and the labor market clearing conditions have to be imposed, i.e., \(w_{n}=F_{n}(H_{1},H_{2})\), where \(H_{n}=h_{n}l_{n}g_{n}\), n=1,2.

The first-order condition for the optimal lump-sum transfer is

$$ \frac{\omega_{1}}{\eta }g_{1}+\frac{\omega_{2}}{\eta }g_{2}=1, $$
(16)

where we used Roy’s lemma (\(\frac{\partial v_{n}}{\partial b}=1\)), and \(\frac{\omega_{n}}{\eta }\) is the social marginal utility of income of type n. The average social marginal benefits of a higher b (i.e., the left-hand side of (16)) should equal the marginal resource costs of a higher b (i.e., the right-hand side of (16)).

With the aid of the first-order condition for b (16), the distributional characteristic \(\xi^{z}\) of labor income is defined as (minus) the normalized covariance between the social marginal utility of income \(\frac{\omega_{n}}{\eta }\), and gross labor income \(z_{n}\) (Atkinson and Stiglitz 1980):

$$ \xi^{z}\equiv -\frac{\sum_{n}\frac{\omega_{n}}{\eta }z_{n}g_{n}-\sum_{n}z_{n}g_{n}\sum_{n}\frac{\omega_{n}}{\eta }g_{n}}{\sum_{n}z_{n}g_{n}\sum_{n}\frac{\omega_{n}}{\eta }g_{n}}=\frac{\sum_{n}(1-\frac{\omega_{n}}{\eta })z_{n}g_{n}}{\sum_{n}z_{n}g_{n}}, $$
(17)

where the second equality follows from (16). With a positive distributional characteristic \(\xi^{z}\), taxing labor income yields distributional benefits because the high-ability worker has a lower welfare weight than the low-ability worker, i.e., \(\frac{\omega_{1}}{\eta }<\frac{\omega_{2}}{\eta }\), and earns a higher income, \(z_{1}>z_{2}\). Indeed, a zero distributional characteristic implies either that the government is utilitarian (\(\omega_{2}=\omega_{1}\)), and not interested in redistribution, or that the marginal contribution to the tax base is equal for both ability types (i.e., taxable income \(z_{n}\) is the same for both types).
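A short numerical sketch of definition (17) follows. It takes Pareto weights, incomes, and population shares as given inputs (all values are hypothetical, and the function name is illustrative) and uses condition (16) to recover η as the population-weighted average of the Pareto weights.

```python
def distributional_characteristic(omega, z, g):
    """xi^z from (17), with eta pinned down by the first-order condition (16)."""
    eta = sum(w_n * g_n for w_n, g_n in zip(omega, g))      # (16): sum_n (omega_n/eta) g_n = 1
    numerator = sum((1 - w_n / eta) * z_n * g_n for w_n, z_n, g_n in zip(omega, z, g))
    denominator = sum(z_n * g_n for z_n, g_n in zip(z, g))
    return numerator / denominator


g = (0.5, 0.5)                                               # hypothetical population shares
xi_redistributive = distributional_characteristic((0.4, 0.6), (2.0, 1.0), g)
xi_utilitarian = distributional_characteristic((0.5, 0.5), (2.0, 1.0), g)
xi_equal_income = distributional_characteristic((0.4, 0.6), (1.5, 1.5), g)
```

With a higher weight on the low-income type and \(z_{1}>z_{2}\), the characteristic is positive; it is zero in the two cases mentioned in the text, equal weights or equal incomes.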

Similarly, the distributional characteristic of education \(\xi^{e}\) is defined as

$$ \xi^{e}\equiv -\frac{\sum_{n}\frac{\omega_{n}}{\eta }e_{n}g_{n}- \sum_{n}e_{n}g_{n}\sum_{n}\frac{\omega_{n}}{\eta }g_{n}}{\sum_{n} \frac{\omega_{n}}{\eta }g_{n}\sum_{n}e_{n}g_{n}} =\frac{\sum_{n}(1-\frac{\omega_{n}}{\eta })e_{n}g_{n}}{\sum_{n}e_{n}g_{n}}= \xi^{z}. $$
(18)

A positive distributional characteristic implies that subsidizing (taxing) education results in distributional losses (gains). If education levels are equal for both workers, there is no educational inequality, and subsidizing education yields no distributional losses. The absence of a redistributional motive renders the distributional characteristic zero.

Note that the distributional characteristic of education is equal to the distributional characteristic of income because gross earnings are linear in education due to the constant elasticity of the production function for human capital. This assumption ensures that (negative) education subsidies are distributionally equivalent to income taxes, see also Jacobs and Bovenberg (2011). In the remainder, the superscripts are dropped, and \(\xi \equiv \xi^{z}=\xi^{e}\).

Simplifying the first-order condition for t yields the optimal linear income tax at optimal education policy (see Appendix)

$$ \frac{t}{1-t}=\frac{1}{\bar{\varepsilon}^{zt}} \biggl( \xi - \biggl( \frac{\omega_{2}}{\eta }- \frac{\omega_{1}}{\eta } \biggr) \biggl( \frac{\varepsilon_{1}}{\mu_{1}}-\frac{\varepsilon_{2}}{\mu_{2}} \biggr) \frac{\alpha (1-\alpha )}{(1-\beta )}\frac{1}{\sigma } \biggr) , $$
(19)

where \(\bar{\varepsilon}^{zt}= ( (1-\alpha )\varepsilon_{1}^{lt} +\alpha \varepsilon_{2}^{lt} ) /(1-\beta )\) is the income weighted average of the tax elasticity of total labor earnings (i.e., effective labor supply). This expression generalizes the standard expression for the optimal linear income tax by allowing for human capital formation and general equilibrium effects on wages (Sheshinski 1972; Dixit and Sandmo 1977; Atkinson and Stiglitz 1980).
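To see how the terms in (19) interact, the right-hand side can be evaluated for given inputs. The sketch below is only an evaluation of the formula, not a solution of the optimal tax problem: the social marginal utilities \(\omega_{n}/\eta \), the income share α, and hence ξ are endogenous in the model but are held fixed here at hypothetical values, chosen so that (16) holds for equal population shares. It uses the fact that \(z_{n}g_{n}=w_{n}H_{n}\), so that (17) can be written as \(\xi =(1-\alpha )(1-\omega_{1}/\eta )+\alpha (1-\omega_{2}/\eta )\). The function name is illustrative.

```python
import math


def linear_tax_rhs(sw1, sw2, eps1, eps2, alpha, beta, sigma):
    """Tax rate t implied by the right-hand side of (19) for fixed inputs.

    sw1, sw2 stand for omega_1/eta and omega_2/eta; alpha is the income
    share of the low-ability type. All are treated as given (an assumption).
    """
    mu1 = 1 - beta * (1 + eps1)
    mu2 = 1 - beta * (1 + eps2)
    e1, e2 = eps1 / mu1, eps2 / mu2                     # earnings elasticities eps_n/mu_n
    xi = (1 - alpha) * (1 - sw1) + alpha * (1 - sw2)    # (17) in income-share form
    avg_elasticity = (1 - alpha) * e1 + alpha * e2      # income-weighted average
    ge_term = (sw2 - sw1) * (e1 - e2) * alpha * (1 - alpha) / (1 - beta) / sigma
    ratio = (xi - ge_term) / avg_elasticity             # equals t/(1-t)
    return ratio / (1 + ratio)


inputs = dict(sw1=0.8, sw2=1.2, eps1=0.5, eps2=0.2, alpha=0.4, beta=0.4)
t_perfect_substitutes = linear_tax_rhs(sigma=math.inf, **inputs)
t_high_sigma = linear_tax_rhs(sigma=5.0, **inputs)
t_low_sigma = linear_tax_rhs(sigma=1.5, **inputs)
```

For these illustrative inputs the implied rate falls as σ declines and turns negative at the lowest value of σ, while with equal elasticities the general equilibrium term drops out for any σ. The magnitudes carry no empirical content.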

The optimal income tax formula shows the trade-off between equity (numerator) and efficiency (denominator). The larger the social preference for redistribution, the larger the ξ, and the higher the optimal marginal income tax. If both groups have an equal weight in social welfare (ω 2=ω 1), the optimal marginal income tax is zero because ξ=0 and \(\frac{\omega_{2}}{\eta }=\frac{\omega_{1}}{\eta }\). The larger the income-weighted average elasticity of labor earnings, the lower the optimal linear income tax because the labor-income tax more heavily distorts labor supply. Both the elasticities of labor supply and human capital formation determine the effective elasticity of earnings, which is due to the feedback between labor supply and human capital formation; see also Bovenberg and Jacobs (2005) for a more elaborate discussion.

The expression for the optimal income tax in general equilibrium differs from its partial-equilibrium counterpart (Bovenberg and Jacobs 2005) due to the presence of the term \(( \frac{\omega_{2}}{\eta }-\frac{\omega_{1}}{\eta } ) ( \frac{\varepsilon_{1}}{\mu_{1}}-\frac{\varepsilon_{2}}{\mu_{2}} ) \frac{\alpha (1-\alpha )}{(1-\beta )}\frac{1}{\sigma }\). This term measures the distributional losses (or gains) arising from general equilibrium effects on wages, and is subtracted from the direct welfare gains of higher income taxes in reducing inequality (as measured by ξ). The optimal marginal linear income tax is lower with general equilibrium effects than without if: (i) substitution between labor types is finite (σ<∞), and (ii) the (uncompensated) earnings elasticity of the skilled worker is larger than that of the unskilled type (\(\varepsilon_{1}^{zt}=\frac{\varepsilon_{1}}{\mu_{1}}>\varepsilon_{2}^{zt}=\frac{\varepsilon_{2}}{\mu_{2}}\)).

The effects of imperfect substitution in labor types have been the subject of most of the papers in this field, so these results do not come as a surprise. If labor types are perfect substitutes (σ=∞), the last term in the expression of the optimal income tax (19) vanishes. In this case, wage rates per hour worked are not affected by changes in relative factor supplies, and the wage distribution is exogenous.

The second condition on the relative sizes of labor supply elasticities has received no attention so far. The formula demonstrates that general equilibrium effects only contribute to more equality if high-skilled labor earnings respond less elastically to an increase in taxes than low-skilled labor earnings (\(\varepsilon_{1}^{zt}=\frac{\varepsilon_{1}}{\mu_{1}}<\varepsilon_{2}^{zt}=\frac{\varepsilon_{2}}{\mu_{2}}\)). In that case, higher taxes raise relative supply of skilled labor (H 1/H 2), the skill premium π(H 1/H 2) declines, see (10), and smaller before-tax wage inequality results. If, however, skilled labor supply responds more elastically to taxes than unskilled labor supply (\(\varepsilon_{1}^{zt} =\frac{\varepsilon_{1}}{\mu_{1}}>\varepsilon_{2}^{zt} =\frac{\varepsilon_{2}}{\mu_{2}}\)), general equilibrium effects work against redistribution of incomes through income taxes by increasing before-tax income inequality as the relative supply of skilled labor declines.

If the compensated elasticities of effective total labor supply \(\varepsilon_{1}^{zt}\) and \(\varepsilon_{2}^{zt}\) are identical (\(\varepsilon_{1}^{zt}=\varepsilon_{2}^{zt}\) so that \(\frac{\varepsilon_{1}}{\mu_{1}}= \frac{\varepsilon_{2}}{\mu_{2}}\) and ε 1=ε 2), the general equilibrium term vanishes as well. The intuition is straightforward. With equal elasticities, linear taxation does not affect relative labor supply. If relative effective labor supply (H 1/H 2) remains constant, relative wages (w 1/w 2) remain constant as well, cf. the skill premium (10). This is the case, for example, if all individuals have identical preferences (ε 1=ε 2). Hence, this example illustrates the necessity of heterogeneous preferences (in the absence of income effects) for our model to make sense in a general equilibrium context.Footnote 12
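The sign pattern of the general equilibrium term can be checked numerically. The sketch below is illustrative only: it assumes \(\mu_{n}\equiv 1-\beta (1+\varepsilon_{n})\), which is consistent with the elasticities reported in this section but is not restated here, and the value chosen for η and the function names are hypothetical.

```python
def mu(eps, beta):
    # Assumed definition (not restated in this section): mu_n = 1 - beta * (1 + eps_n).
    return 1.0 - beta * (1.0 + eps)


def ge_term(omega1, omega2, eta, eps1, eps2, alpha, beta, sigma):
    """General equilibrium term in the optimal linear income tax expression."""
    weights = (omega2 - omega1) / eta
    elasticities = eps1 / mu(eps1, beta) - eps2 / mu(eps2, beta)
    return weights * elasticities * alpha * (1.0 - alpha) / (1.0 - beta) / sigma


# Hypothetical parameter values, chosen for illustration only.
base = dict(omega1=0.25, omega2=0.75, eta=1.0, alpha=0.35, beta=0.2)

equal = ge_term(eps1=0.2, eps2=0.2, sigma=1.5, **base)            # identical elasticities
skilled_more = ge_term(eps1=0.3, eps2=0.2, sigma=1.5, **base)     # skilled more elastic
perfect = ge_term(eps1=0.3, eps2=0.2, sigma=float("inf"), **base)  # perfect substitutes
```

With identical elasticities or perfect substitution the term is zero. When the skilled type is more elastic and the low type carries the larger welfare weight, the term is positive, so it reduces the optimal tax rate.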

A higher elasticity of skilled labor supply (ε 1>ε 2) is a necessary requirement for education subsidies to work in favor of redistribution via general equilibrium effects under linear policy instruments. Intuitively, only if the high-skilled agent has a higher elasticity of human capital investments with respect to subsidies (i.e., \(\varepsilon_{1}^{es}=\frac{1}{1-\beta ( 1+\varepsilon_{1} ) }>\frac{1}{1-\beta ( 1+\varepsilon_{2} ) } =\varepsilon_{2}^{es}\)), subsidies on human capital can compress the wage distribution by boosting high-skilled human capital more than low-skilled human capital. Hence, we will assume that ε 1>ε 2. Also from an empirical point of view, a larger elasticity of skilled labor earnings relative to unskilled labor earnings is plausible (Gruber and Saez 2003).
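As a quick numerical illustration of the elasticity of human capital investment with respect to subsidies, \(\varepsilon_{n}^{es}=1/(1-\beta (1+\varepsilon_{n}))\), the following sketch evaluates it for two labor supply elasticities. The parameter values and the function name are hypothetical; only the formula is taken from the text.

```python
def education_subsidy_elasticity(eps, beta):
    """eps_es = 1 / (1 - beta * (1 + eps)), as stated in the text."""
    denom = 1.0 - beta * (1.0 + eps)
    if denom <= 0.0:
        # The formula is only meaningful when beta * (1 + eps) < 1.
        raise ValueError("requires beta * (1 + eps) < 1")
    return 1.0 / denom


# Illustrative values: beta = 0.2, skilled eps = 0.5, unskilled eps = 0.2.
eps_es_skilled = education_subsidy_elasticity(0.5, 0.2)
eps_es_unskilled = education_subsidy_elasticity(0.2, 0.2)
```

Because the expression is increasing in \(\varepsilon_{n}\), a more elastic labor supply implies a stronger response of human capital investment to subsidies.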

The expression for the optimal income tax (19) suggests that optimal marginal income taxes may even become negative if general equilibrium effects are strong enough and the elasticities of effective total labor supply diverge sufficiently. In that case, subsidizing work effort provokes such strong general equilibrium effects that low-ability workers are better off by paying subsidies to the high-ability workers; their before-tax wages increase by more than is needed to offset the rise in lump-sum taxes to finance the subsidies on work. Jacobs (2007) shows that this can happen even for values of the model's elasticities that are not implausible.

The general solution for optimal education subsidies in the presence of general equilibrium effects at the optimal tax system is (see Appendix)

$$ s=0. $$
(20)

The government sets the subsidies on education to zero. Therefore, educational investments are efficient in the presence of an optimal income tax. The intuition is similar to that in Bovenberg and Jacobs (2005) and Jacobs and Bovenberg (2011).

Education subsidies do not reduce the labor supply distortion by more than an equally costly reduction in income taxes. Intuitively, the constant elasticity in the human capital production function ensures that there is a linear relation between \(e_{n}\) and \(z_{n}\). Hence, subsidizing \(e_{n}\) is equivalent to subsidizing \(z_{n}\). However, besides affecting labor supply, subsidies on education also distort human capital investment. This distortion in education (over-investment) can be avoided by not subsidizing education. Therefore, the government does not want to use education subsidies to reduce the tax wedge on labor supply.
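A short derivation may make the linearity explicit. It is a sketch under two assumptions: that the private first-order condition for education equates the marginal return \(w_{n}a_{n}\phi^{\prime }(e_{n})l_{n}\) to the net cost \(1-s\) (mirroring the definition of the marginal subsidy used in Sect. 4), and that \(\phi (e_{n})=e_{n}^{\beta }\), so that \(\phi^{\prime }(e_{n})=\beta \phi (e_{n})/e_{n}\). With gross earnings \(z_{n}\equiv w_{n}a_{n}\phi (e_{n})l_{n}\):

$$ w_{n}a_{n}\phi^{\prime }(e_{n})l_{n}=1-s\quad \Longrightarrow \quad (1-s)e_{n}=\beta w_{n}a_{n}\phi (e_{n})l_{n}=\beta z_{n}\quad \Longrightarrow \quad e_{n}=\frac{\beta }{1-s}z_{n}. $$

Under these assumptions, education is proportional to gross earnings with the same factor for both types, which is why a subsidy on \(e_{n}\) has the same distributional incidence as a subsidy on \(z_{n}\).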

The government is also indifferent between education taxes and income taxes to redistribute income. Again, because education is linear in income, any redistribution that education taxes can achieve, can also be achieved with income taxes. Indeed, the distributional characteristics of income (ξ z) and education (ξ e) are equal. While both taxes on income and education reduce labor effort, taxing education e n additionally causes under-investment in human capital. The income tax does not directly distort human capital investment because costs and benefits of education are equally affected by the marginal tax rate. The government can therefore avoid distortions in human capital accumulation by taxing income instead of taxing education to redistribute resources.

In contrast to Bovenberg and Jacobs (2005), both subsidies and income taxes do cause general equilibrium effects. Nevertheless, their combined effect cancels in the expression for optimal education subsidies, since general equilibrium effects do not upset the linear relationship between earnings and education. Intuitively, (lower) income taxes can, in principle, achieve the same wage compression as (higher) education subsidies without causing inefficiencies in human capital formation. Hence, education subsidies remain distributionally equivalent to income taxes even in the presence of general equilibrium effects. For the same reasons, education subsidies still cannot be used as implicit taxes on leisure. Optimal subsidies thus ensure efficiency in human capital formation. It follows that education subsidies should not be employed to compress the wage distribution in the optimal redistributive program.

Due to the specific form of the earnings function and the linearity of policy instruments, the government can effectively separate production decisions from consumption choices, also in general equilibrium. By ensuring that all costs of education are effectively tax deductible, the government has access to a perfect profit tax to tax the rents from ability in human capital formation. Hence, the Diamond and Mirrlees (1971) production efficiency theorem still applies as in Bovenberg and Jacobs (2005).

The results in this section depend on specific assumptions regarding the utility function and the production function for human capital. Allowing for income effects (Allen 1982) and non-constant elasticities in the human capital production function will generally result in a role for general equilibrium effects in determining optimal education policies. However, non-constant elasticities would give a role for education policies even in the absence of general equilibrium effects (Maldonado 2008; Jacobs and Bovenberg 2011). Moreover, income effects in labor supply would generate an additional channel (through the feedback of labor supply with human capital formation) whereby general equilibrium effects affect optimal education policies. We have switched off this (second-order) feedback by assuming that the income effects in labor supply are identical among individuals (i.e., both have a zero income effect). As long as income elasticities in labor supply are identical, not necessarily zero, the results of the current section remain valid. Hence, our findings are a reasonable first-order approximation, given that there is no compelling evidence indicating that income elasticities in labor supply and human capital elasticities differ much among skill types.

4 Optimal nonlinear taxes and subsidies

This section derives the optimal nonlinear tax and education policies. We maintain the assumption that the earnings function is weakly separable in ability, labor, and education, but we can now allow for general utility and production functions for human capital. These conditions still ensure that optimal education subsidies are zero in the absence of general equilibrium effects on wages (Jacobs and Bovenberg 2011). All results presented in this section are qualitatively the same with the specifications for the utility and the human capital production functions that we used to study the linear tax case. We also provide simulations of the optimal nonlinear taxes and subsidies, while maintaining the same structure on preferences and technologies as in the linear case.

The optimal nonlinear tax and subsidy rates are found by deriving the optimal second-best allocation of consumption, gross income and education, as in Stern (1982) and Stiglitz (1982). By using the first-order conditions for individual optimization, we can compute the optimal marginal income taxes and optimal marginal education subsidies that would decentralize the optimal second-best allocation.

Any solution to the optimal second-best problem has to satisfy the incentive compatibility constraints stating that each individual n derives weakly higher utility from choosing the bundle \((c_{n},z_{n},e_{n})\) of consumption, gross income and education, which is intended for them, than from choosing the alternative bundle \((c_{m},z_{m},e_{m})\), which is intended for the other individual m (n,m=1,2, n≠m). That is,

(21)
(22)
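The two constraints can be sketched as follows. This is a reconstruction from the verbal statement above and from the definition of \(u_{1}^{\ast }\) given after the Lagrangian, not the original typesetting, and it assumes gross earnings \(z_{n}=w_{n}a_{n}\phi (e_{n})l_{n}\). An individual n who mimics m must earn \(z_{m}\) with education \(e_{m}\), which requires labor effort \(z_{m}/(w_{n}a_{n}\phi (e_{m}))=\frac{w_{m}a_{m}}{w_{n}a_{n}}l_{m}\). The first constraint then applies to the high-ability type and the second to the low-ability type:

$$ u_{1}(c_{1},l_{1})\geq u_{1} \biggl( c_{2},\frac{w_{2}a_{2}}{w_{1}a_{1}}l_{2} \biggr) ,\qquad u_{2}(c_{2},l_{2})\geq u_{2} \biggl( c_{1},\frac{w_{1}a_{1}}{w_{2}a_{2}}l_{1} \biggr) . $$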

The government maximizes the social welfare function (13) subject to the economy’s resource constraint (12), and incentive compatibility constraint (21). Under normal circumstances, the second incentive constraint (22) is not binding at an optimal solution, and it will be ignored in the remainder (Stern 1982; Stiglitz 1982). Assuming for simplicity that the price of a unit of education is one, and the number of high-ability and low-ability persons are both equal to one, the following Lagrangian for the maximization of social welfare is formulatedFootnote 13

(23)

where the inverse of the skill premium is denoted by π −1, see (10). The conditions for labor market equilibrium (11) are substituted. To avoid confusion in notation, utility of the high-skilled type mimicking the low-skilled type is designated by \(u_{1}^{\ast }\equiv u_{1} ( c_{2},\frac{w_{2}a_{2}}{w_{1}a_{1}}l_{2} ) \). θ is the Lagrange multiplier on the incentive compatibility constraint.

First-order conditions for an optimal allocation are

(24)
(25)
(26)
(27)
(28)
(29)

From the first-order conditions for c 1 (24) and l 1 (25) follows the marginal tax rate, \(T_{1}^{\prime }\equiv 1+\frac{1}{w_{1}a_{1}\phi (e_{1})}\frac{\partial u_{1}/\partial l_{1}}{\partial u_{1}/\partial c_{1}}\), on the high-ability type (see Appendix):Footnote 14

$$ T_{1}^{\prime }=\frac{\theta }{\eta }\frac{1}{\sigma } \frac{\alpha }{(1-\alpha )}\frac{1}{w_{1}a_{1}\phi (e_{2})}\frac{\partial u_{1}^{\ast }}{\partial l_{2}}<0. $$
(30)

Consequently, the marginal top rate is exploited for redistribution with general equilibrium effects on wages. Indeed, the general expression of the optimal marginal tax on the skilled type is almost identical to the one without human capital formation (Stern 1982; Stiglitz 1982). The basic intuition of these papers carries over to the present case. A marginal subsidy on work for the high-ability type increases relative supply of skilled human capital, and lowers the skill premium π. Hence, the utility costs of mimicking the low-ability type increase, as it takes the skilled type more labor time to mimic the unskilled type. In the absence of general equilibrium effects on wages (σ=∞), there is no distortion at the top, and the marginal tax rate is zero; see also Seade (1977). Further, if the government is not interested in redistribution, the incentive constraint is slack, θ is zero, and optimal marginal taxes are zero as well.

Manipulation of the first-order condition for c 2 (27) and the first-order condition for l 2 (28) yields the marginal tax on labor of the low-ability type \(T_{2}^{\prime }\equiv 1+\frac{1}{w_{2}a_{2}\phi (e_{2})}\frac{\partial u_{2}/\partial l_{2}}{\partial u_{2}/\partial c_{2}}\) (see Appendix):

$$ \frac{T_{2}^{\prime }}{1-T_{2}^{\prime }}=\frac{\theta }{\eta }\frac{\partial u_{1}^{\ast }}{\partial c_{2}}+\frac{T_{1}^{\prime }}{(1-T_{2}^{\prime })} \frac{(1-\alpha )}{\alpha } ( \sigma -1 ) . $$
(31)

In the absence of general equilibrium effects, the marginal tax rate on the low-ability type is unambiguously positive, since the top rate is zero (\(T_{1}^{\prime }=0\)) in that case (Mirrlees 1971; Stiglitz 1982). The presence of general equilibrium effects (lower σ) increases the marginal tax rate \(T_{2}^{\prime }\) on the low type (recall \(T_{1}^{\prime }<0\)). This is intuitive, since a higher marginal tax rate on the low type reduces their labor supply, and therefore results in more before-tax wage equality. Hence, the high type is less tempted to mimic the low type, which relaxes the incentive compatibility constraint, and therefore results in more redistribution.

The optimal marginal education subsidy rate for the high type is derived by combining the first-order condition for \(e_{1}\) (26) with the optimal marginal income tax for skilled workers (30). We then find that the optimal marginal education subsidy for high-ability workers \(S_{1}^{\prime }\equiv 1-w_{1}a_{1}\phi^{\prime }(e_{1})l_{1}\) is positive (see Appendix):

$$ \frac{S_{1}^{\prime }}{1-S_{1}^{\prime }}=-\frac{\theta }{\eta }\frac{1}{\sigma }\frac{\alpha }{(1-\alpha )} \frac{1}{w_{1}a_{1}\phi (e_{2})}\frac{\partial u_{1}^{\ast }}{\partial l_{2}}=-T_{1}^{\prime }>0. $$
(32)

The intuition is that a marginal subsidy on human capital investment of the high type (like the marginal subsidy on work) lowers the skill premium, and makes it more costly for the high-ability type to mimic the low-ability type.

Similarly, we find the optimal marginal subsidy rate on education for the low-ability type \(S_{2}^{\prime }\equiv 1-w_{2}a_{2}\phi^{\prime }(e_{2})l_{2}\) from the first-order condition for e 2 (29) and the tax rate (31) (see Appendix):

$$ \frac{S_{2}^{\prime }}{1-S_{2}^{\prime }}=\frac{\theta }{\eta }\frac{1}{\sigma }\frac{\alpha }{(1-\alpha )} \frac{1}{w_{2}a_{1}\phi (e_{2})}\frac{\partial u_{1}^{\ast }}{\partial l_{2}}=\frac{(1-\alpha )}{\alpha }T_{1}^{\prime }<0. $$
(33)

Therefore, education is taxed on the margin for the low-ability type. Again, the mechanism is that general equilibrium effects relax the incentive compatibility constraint, and the government can redistribute more income.

Note that the expressions for the marginal education subsidies are all directly related to the top rate. Indeed, subsidies or taxes on education are larger if the top rate is lower, i.e., when general equilibrium effects are more important (lower σ), and if the government wishes to redistribute incomes more heavily (larger θ). If labor types are perfect substitutes (σ=∞), the optimal marginal education subsidies for both high and low-skilled workers are zero, i.e., \(S_{1}^{\prime }=S_{2}^{\prime }=0\), cf. Bovenberg and Jacobs (2005).
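The link between the marginal education subsidies and the top rate can be illustrated by solving (32) and (33) for \(S_{1}^{\prime }\) and \(S_{2}^{\prime }\). The sketch below takes the right-hand sides \(-T_{1}^{\prime }\) and \(\frac{(1-\alpha )}{\alpha }T_{1}^{\prime }\) as given; the top rate of −5 % and the function name are hypothetical, and α=0.35 corresponds to a skilled income share of 0.65.

```python
def subsidy_rates_from_top_rate(t1, alpha):
    """Solve S/(1 - S) = r for S, with r taken from (32) and (33)."""
    r1 = -t1                            # S1'/(1 - S1') according to (32)
    r2 = (1.0 - alpha) / alpha * t1     # S2'/(1 - S2') according to (33)
    if r1 == -1.0 or r2 == -1.0:
        raise ValueError("S/(1 - S) = -1 has no finite solution")
    # x / (1 - x) = r  implies  x = r / (1 + r)
    return r1 / (1.0 + r1), r2 / (1.0 + r2)


# Hypothetical top rate of -5 percent.
s1, s2 = subsidy_rates_from_top_rate(t1=-0.05, alpha=0.35)

# With a zero top rate (for example, perfect substitutes) both rates vanish.
s1_zero, s2_zero = subsidy_rates_from_top_rate(t1=0.0, alpha=0.35)
```

A negative top rate thus maps into a marginal subsidy on education for the high type and a marginal tax on education for the low type, and both shrink toward zero as the top rate approaches zero.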

General equilibrium effects on the skill premium should indeed be exploited for redistribution under nonlinear policies, in contrast to optimal linear policies. The intuition is that, by using nonlinear instruments, the government can directly influence the skill premium π(H 1/H 2) by setting different marginal tax and subsidy rates for each worker, as long as the policy remains incentive compatible. This holds true irrespective of preferences or technologies. Hence, by optimally employing marginal subsidies on the high-ability type and marginal taxes on the low-ability type, the skill premium falls, the incentive compatibility constraint is relaxed, and the policy achieves more income redistribution.

Indirect taxes/subsidies, such as education subsidies, are not optimally zero under nonlinear income taxation with weakly separable preferences. This bolsters the findings by Naito (1999, 2004, 2007) and Pirttilä and Tuomala (2001) who investigated the desirability of nonzero commodity taxes and deviations from production efficiency in general equilibrium settings with endogenous wages. In the current model, education should optimally be taxed or subsidized under nonlinear income taxation to exploit factor price changes for redistribution, and the Atkinson and Stiglitz (1976) theorem does not apply to education subsidies. Moreover, since education choices are generally not efficient, the Diamond–Mirrlees production efficiency theorem ceases to apply to individual human capital production as well. The intuition is that, due to the nonlinearity of the policy schedules, the optimal policies do not constitute a perfect profit tax on the rents from human capital formation. As a result, consumption and investment choices cannot be (weakly) separated, which is a prerequisite for the production efficiency theorem to apply. In the absence of general equilibrium effects, education would not be taxed or subsidized on a net basis, and would be efficient.

5 Simulations of optimal nonlinear policies

To check whether general equilibrium effects can be quantitatively important for optimal nonlinear tax and education policies, we simulate the model following Stern (1982). We resort to the standard iso-elastic utility function with a constant wage elasticity of labor supply, which is augmented with a preference parameter δ to calibrate the dis-utility of labor supply:

$$ u_{n}(c_{n},l_{n})\equiv c_{n}- \delta \biggl( \frac{l_{n}^{1+1/\varepsilon _{n}}}{1+1/\varepsilon_{n}} \biggr) ,\quad n=1,2. $$
(34)
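To see why ε is the wage elasticity of labor supply under (34), consider an individual facing a constant net return ω per unit of labor effort. This linear budget is an assumption made purely for illustration, and the function name is hypothetical. The first-order condition \(\omega =\delta l^{1/\varepsilon }\) gives \(l=(\omega /\delta )^{\varepsilon }\):

```python
import math


def labor_supply(net_return, delta, eps):
    """Labor supply implied by u = c - delta * l**(1 + 1/eps) / (1 + 1/eps)
    when one unit of effort yields a constant net return (illustrative assumption)."""
    return (net_return / delta) ** eps


# Illustrative net return of 5, with delta = 10 and eps = 0.2 as in the baseline.
l_base = labor_supply(5.0, 10.0, 0.2)

# Numerical check of the elasticity: a 1 percent higher return raises l by about eps percent.
l_up = labor_supply(5.0 * 1.01, 10.0, 0.2)
measured_elasticity = math.log(l_up / l_base) / math.log(1.01)
```

Because utility is quasi-linear in consumption, there are no income effects, so this elasticity is both the compensated and the uncompensated one.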

In the baseline, we set the wage elasticities of labor supply equal for both types at ε 1=ε 2=0.2, cf. Blundell and MaCurdy (1999). The preference parameter for leisure is calibrated at δ=10, so as to keep labor effort of both types between 0 and 1. The production function for human capital is Cobb–Douglas, \(a_{n}\phi (e_{n})\equiv a_{n}e_{n}^{\beta }\), with an elasticity β=0.2. This is probably a lower bound.Footnote 15 The ability parameters are set at a 1=10 and a 2=8. The aggregate production function features a constant elasticity of substitution between skilled and unskilled labor:

$$ Y= \bigl( \gamma H_{1}^{\rho }+(1-\gamma )H_{2}^{\rho } \bigr)^{1/\rho },\qquad \sigma \equiv \frac{1}{1-\rho }. $$
(35)

The elasticity of substitution between high and low-skilled labor is set at σ=1.5. The survey provided in Katz and Autor (1999) suggests values for σ between 1 and 2.5.Footnote 16
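The role of σ in the skill premium can be illustrated with the CES technology (35). The sketch below assumes competitive factor markets, so that wages equal marginal products, and defines the skill premium as \(w_{1}/w_{2}\); both are assumptions for this illustration, and the input levels and function names are hypothetical.

```python
def ces_output(h1, h2, gamma, sigma):
    """Y = (gamma * H1**rho + (1 - gamma) * H2**rho)**(1/rho), with sigma = 1/(1 - rho)."""
    if sigma == 1.0:
        raise ValueError("sigma = 1 (Cobb-Douglas limit) is not handled in this sketch")
    rho = 1.0 - 1.0 / sigma
    return (gamma * h1 ** rho + (1.0 - gamma) * h2 ** rho) ** (1.0 / rho)


def wages(h1, h2, gamma, sigma):
    """Marginal products of skilled and unskilled labor (competitive-market assumption)."""
    y = ces_output(h1, h2, gamma, sigma)
    w1 = gamma * (y / h1) ** (1.0 / sigma)
    w2 = (1.0 - gamma) * (y / h2) ** (1.0 / sigma)
    return w1, w2


def skill_premium(h1, h2, gamma, sigma):
    w1, w2 = wages(h1, h2, gamma, sigma)
    return w1 / w2


# Baseline values gamma = 0.63 and sigma = 1.5; the input levels are hypothetical.
premium_low_supply = skill_premium(1.0, 1.0, 0.63, 1.5)
premium_high_supply = skill_premium(2.0, 1.0, 0.63, 1.5)
```

Doubling the relative supply of skilled labor scales the premium by \(2^{-1/\sigma }\), so the premium responds more strongly to relative supplies the lower is σ.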

The share parameter is calibrated in the baseline at γ=0.63 to get an income share of skilled labor in total output of 1−α=0.65 in the absence of government intervention (Stern 1982). The baseline welfare weights in the social welfare function are ω 1=1−ω 2=0.25. Setting ω 1=ω 2=0.5 corresponds to a utilitarian criterion, which is non-redistributive because the private marginal utility of income is unity. ω 1=1−ω 2=0 corresponds to a Rawlsian social welfare function. At the baseline simulation, the government revenue requirement is set at zero: Λ=0. The simulation results are given in Table 1.

Table 1 Optimal non-linear tax and education policies

Calculations by Stern (1982) in models without endogenous human capital formation reveal that general equilibrium effects have little impact on simulated optimum tax rates. We largely confirm this in all simulations. The top rate is indeed only slightly negative, even with endogenous learning. Since the expressions for optimal education subsidies are all directly related to the top rate, we see that the general equilibrium impact on optimum nonlinear subsidies is rather limited as well.

As expected, the optimal tax expressions are most sensitive to changes in the elasticity of substitution σ. In the top rows of Table 1, we calibrate the parameter γ so as to keep the income share of skilled labor fixed at 1−α=0.65 in the absence of government intervention.Footnote 17 We find that marginal education subsidies and taxes differ substantially from zero only when the elasticity of substitution between skilled and unskilled labor falls below unity, which is empirically less plausible.

Further, the elasticity of high-ability labor supply ε 1 is important in explaining the pattern of marginal taxes and subsidies. It is a crucial determinant of the incentive compatibility constraint. The higher this elasticity, the easier it is for the high type to mimic the low type, and the larger are marginal subsidies on high-ability labor supply and education. It follows from the table that the main results are not driven by the labor supply elasticity of the low type, by the human capital elasticity, or by the social welfare function.Footnote 18

6 Conclusions

This paper analyzed the simultaneous setting of optimal linear and nonlinear income taxes and education subsidies in two-type models with endogenous labor supply, endogenous human capital formation, and endogenous wage rates. To investigate the potential role of general equilibrium effects in shaping optimal linear tax and education policies, we ensured that optimal subsidies on education are zero in the absence of general equilibrium effects on wages. This required weakly separable earnings functions, and, for linear instruments only, a constant elasticity in the human capital production function. For linear education subsidies to work in favor of redistribution, we further assumed that labor supply of the high-ability (‘skilled’) type is more elastic than labor supply of the low-ability (‘unskilled’) type. Linear taxes and subsidies cannot—by assumption—affect the skill premium if all behavioral elasticities are equal.

We showed that optimal linear education subsidies are zero, even if linear tax and education policies have the potential to provoke equilibrium effects on wage rates. The intuition is that linear income taxes are distributionally equivalent to (negative) linear education subsidies, and more efficient because income taxes do not generate distortions in human capital formation, whereas linear subsidies cause over-investment. This holds true whether general equilibrium effects are present or not. The optimal linear income tax is, however, lowered due to general equilibrium effects on wages if skilled labor supply is more elastic than unskilled labor supply. A higher income tax increases the skill premium because skilled labor supply falls more than unskilled labor supply. These general equilibrium effects work against the direct redistributional gains of a higher income tax rate. The optimal linear income tax rate may even turn negative when general equilibrium effects on wages are sufficiently strong.

The results for optimal nonlinear policies are found to be fundamentally different. With nonlinear instruments, the government can directly affect the relative supply of skilled human capital using specific instruments, such as marginal subsidies on human capital or labor supply of the high type, and marginal taxes on human capital or labor supply of the low type. Consequently, one does not need to impose restrictions on preferences or the production function for human capital to obtain an impact of nonlinear policies on the skill premium. The nonlinearity of tax and subsidy schedules is sufficient. Optimal nonlinear policies do exploit general equilibrium effects on wages for redistributional purposes. The skilled worker optimally faces marginal subsidies on both work effort and education, whereas the unskilled worker optimally faces marginal taxes on work and education. As a result, wage differences are reduced, and the incentive compatibility constraint is relaxed because the skilled worker finds it harder to mimic the unskilled worker. However, simulations of optimal nonlinear policies revealed that the impact of general equilibrium effects on optimal policies is modest. Only at low levels of the elasticity of substitution between skilled and unskilled labor are the marginal subsidies on skilled education and skilled work, and the marginal taxes on unskilled education, found to differ substantially from zero.

In future research, the current analysis can be cast in a model with two production sectors, each exhibiting different factor intensities, so as to investigate the desirability of aggregate production efficiency, and the optimality of zero commodity taxation. Our conjecture is that deviations from aggregate production efficiency and nonzero commodity taxes will be optimal, as Naito (1999, 2004, 2007) and Pirttilä and Tuomala (2001) have demonstrated in similar settings, but without education policies and human capital investment. However, it remains unclear how optimal education policies will be affected. The current two-type analysis of nonlinear income taxation may also be extended to a setting with a continuum of skill types in order to further investigate how factor prices should be exploited for redistribution under nonlinear income taxation with more realistic skill distributions, as in, for example, Diamond (1998) and Saez (2001). Nevertheless, the expressions for the incentive compatibility constraints reveal that each of them depends on the entire wage distribution, and therefore on the actions of all other agents. As a consequence, one may need to resort to numerical simulations. Our results under linear income taxation will also change when more general utility or earnings functions are used to analyze the importance of income effects, and to allow for the possibility that education has a varying degree of complementarity with work effort (Jacobs and Bovenberg 2011). Also, extensions with imperfections in labor markets due to, for example, minimum wages, search frictions, unions, and efficiency wages may be interesting avenues for future research.