1 Introduction

Skill-biased technical change (SBTC) has been an important driver of rising income inequality in many developed countries over the last decades (see, e.g., Van Reenen 2011). Skill-biased technology raises the relative demand for skilled workers. If relative demand grows faster than relative supply, the skill premium increases and so does income inequality.Footnote 1 The idea that income inequality is the result of the ‘race between education and technology’ dates back to Tinbergen (1975). He suggested that, in order to contain inequality, governments should increase investment in education, so as to increase the relative supply of skilled workers and win the race with technology. Goldin and Katz (2010, Ch.9, pp. 350-351) take up Tinbergen’s metaphor and argue that US policy should respond to SBTC by making the tax system more progressive and by increasing financial aid for higher education.

Despite the obvious relevance of SBTC for explaining rising skill premia and wage inequality, very little analysis exists on the normative question of whether it is a good idea to make tax systems more progressive or to stimulate investments in higher education in response to SBTC. Therefore, this paper studies how skill-biased technical change affects optimal linear taxes and education subsidies. We do so by extending the standard model of optimal linear income taxation of Sheshinski (1972) with endogenous skill formation and embedding it in the ‘canonical model’ of SBTC, where high-skilled and low-skilled workers are imperfect substitutes in production (Katz and Murphy, 1992; Violante, 2008; Acemoglu and Autor, 2011).Footnote 2 In our model, individuals differ in their earning ability. They decide how much to work and whether to enroll in higher education. Only individuals with a sufficiently high ability become high-skilled; everyone else remains low-skilled. The wages of high-skilled and low-skilled workers are endogenously determined by relative demand and relative supply, where skill-biased technical change drives relative demand. An inequality-averse government maximizes social welfare by optimally setting linear income taxes and education subsidies. Our findings are the following.

We start our analysis by deriving optimal tax and education policies for given skill bias. As is usual in optimal taxation, optimal policies trade off the redistributional benefits against the efficiency costs of each policy instrument. The benefits consist of the direct distributional impact and the indirect redistributional impacts that originate from the general-equilibrium effects on wages. The linear income tax directly reduces income inequality, but also generates general-equilibrium effects on wages that raise pre-tax income differentials: by discouraging investment in education, the relative supply of skilled workers falls and the relative wage of skilled workers increases.Footnote 3 Moreover, education subsidies result in distributional losses, since high-skilled individuals have higher incomes than low-skilled individuals. However, as suggested by Tinbergen, these direct distributional losses can be countered by general-equilibrium effects on wages. By increasing the relative supply of skilled workers, the skill premium declines, and this reduces income inequality. For both the income tax and the education subsidy, the direct distributional impacts and indirect distributional impacts that result from general-equilibrium effects are traded off against distortions in labor supply and in investment in education.

We then analyze how optimal policy should respond to a change in skill bias. In doing so, we demonstrate that the policy recommendations of Tinbergen (1975) and Goldin and Katz (2010) need not be correct. Our analysis shows that the optimal policy response depends on the effect of SBTC on (1) direct distributional impacts, (2) general-equilibrium effects, and (3) distortions in education.Footnote 4 We derive analytically that the effect of SBTC on direct distributional impacts, general-equilibrium effects, and distortions in skill formation are all theoretically ambiguous.

To resolve these theoretical ambiguities, we quantify the impact of SBTC on optimal tax and education policy by calibrating our model to the US economy using data from the US Current Population Survey and empirical evidence on labor market responses to tax and education policy. Given that our model is stylized, these simulations should be taken with caution. Our main aim is to get a quantitative sense of the impacts of SBTC on the main determinants of optimal policy in our model: direct distributional effects, general-equilibrium effects, and tax distortions on education. We simulate the response of optimal taxes and education subsidies to a rise in skill bias that is in line with the observed increase in the skill premium between 1980 and 2016. Moreover, we show that education is optimally subsidized on a net basis before the shock in skill bias to exploit general-equilibrium effects on wages for income redistribution. Hence, investment in higher education is optimally distorted upwards. Our main finding is that the optimal income tax rate increases with SBTC, while the optimal education subsidy declines with SBTC.

To understand which mechanisms drive these policy responses to SBTC, we numerically decompose the impact of SBTC into the main theoretical determinants of tax and education policy: direct distributional impacts, indirect general-equilibrium effects, and education distortions. We find that the optimal tax rate increases because the direct distributional benefits of taxing income increase and the distortions of net subsidies on education become larger, which overturn the larger general-equilibrium effects of taxing labor income. The optimal education subsidy declines with SBTC, since both the direct distributional losses and the distortions of net subsidies on education increase more than the stronger indirect general-equilibrium effects of subsidizing education.

The main lesson of our paper is that the impact of SBTC on optimal tax and education policy is far from straightforward. Indeed, while our model is stylized, it conveys a number of important messages for thinking about the optimal policy response to SBTC. While SBTC typically raises income inequality and the skill premium, thus calling for higher taxes and lower education subsidies for redistributional reasons, SBTC also affects the power of tax and education policy to generate general-equilibrium effects, which work in the opposite direction. Moreover, we show that it is not obvious how tax distortions in education change in response to SBTC; in our simulations, the distortions of net subsidies on education increase.

Our simulations demonstrate that SBTC calls for a more progressive income tax system, while the subsidy rate on investment in higher education should decline. Therefore, we show that the suggestions of Tinbergen (1975) and Goldin and Katz (2010) to promote investment in higher education to win the race against technology need not be correct. Although these authors are right to emphasize the larger benefits of exploiting general-equilibrium effects with SBTC, our analysis reveals that (at least) two other effects need to be taken into account as well to judge whether education subsidies should optimally increase: larger inequality between high-skilled and low-skilled workers and larger upward distortions in education. We show that these latter two effects quantitatively dominate over general-equilibrium effects.

The remainder of this paper proceeds as follows. Section 2 reviews the literature and outlines our contributions. Section 3 sets up the model. Section 4 analyzes optimal policy. Section 5 presents the simulations. Finally, Section 6 concludes. Proofs of all propositions, additional derivations, and background materials are contained in the Appendix, part of which is available online only.

2 Related literature

We analyze optimal linear income taxes and education subsidies in an extension of the optimal linear tax model due to Sheshinski (1972) with an endogenous education decision on the extensive margin and endogenous wage rates for high-skilled and low-skilled labor as in Roy (1951).Footnote 5 We merge this model with the canonical model of SBTC, which goes back to Katz and Murphy (1992), Violante (2008), and Acemoglu and Autor (2011). This allows us to analyze optimal linear education subsidies and to explore the consequences of SBTC for optimal policies. Our paper makes a number of contributions to four strands in the literature.

First, we contribute to the literature that analyzes optimal income taxes jointly with optimal education subsidies; see, for example, Bovenberg and Jacobs (2005), Maldonado (2008), Bohacek and Kapicka (2008), Anderberg (2009), Jacobs and Bovenberg (2011), and Stantcheva (2017). We contribute to these papers by analyzing optimal tax and education policies with education on the extensive margin rather than on the intensive margin. We find that education subsidies are employed to alleviate tax distortions on education, but do not fully eliminate all tax-induced distortions on education, as in Bovenberg and Jacobs (2005). The government likes to tax away infra-marginal rents in education to redistribute income from high-skilled to low-skilled workers—ceteris paribus—see also Findeisen and Sachs (2016) and Colas et al. (2021).

Second, we contribute to the literature on optimal income taxation and education subsidies in the presence of general-equilibrium effects on the wage distribution. Dur and Teulings (2004) analyze optimal log-linear tax and education policies in an assignment model of the labor market.Footnote 6 Like Dur and Teulings (2004), we find that education might be subsidized on a net basis to exploit general-equilibrium effects for income redistribution. Jacobs (2012) analyzes optimal linear taxes and education subsidies in a two-type version of the model of Bovenberg and Jacobs (2005) and shows that optimal education subsidies are not employed to generate general-equilibrium effects so as to compress the wage distribution. The reason is that with education on the intensive margin, the general-equilibrium effect of education subsidies is identical to the general-equilibrium effect of income taxes. Hence, education subsidies have no distributional value added over income taxes, but generate additional distortions in education. Our model does not have this property, since we analyze education on the extensive margin. We add to this literature by analyzing the optimal response of income taxes and education subsidies to skill-biased technical change.

Third, this paper is most closely related to papers that study the response of optimal tax and education policies to technical change (Ales et al., 2015; Heathcote et al., 2014; Jacobs and Thuemmel, 2021; Loebbing, 2020).Footnote 7 Jacobs and Thuemmel (2021) study optimal nonlinear tax and education policies in a random participation model with education-dependent nonlinear taxes and individuals differing along two dimensions: earning ability and costs of education. As a result, the income distributions of high- and low-skilled workers are overlapping, since there can be high- and low-skilled workers at each income level. They find that general-equilibrium effects never determine optimal policy. Intuitively, any redistribution from high-skilled to low-skilled workers via general-equilibrium effects can be achieved as well with the education-dependent tax system, while the distortions of exploiting general-equilibrium effects on skill formation can be avoided. This paper adds to Jacobs and Thuemmel (2021) by showing how optimal policy should be set if the government can, realistically, not employ skill-dependent income tax rates. Moreover, doing so allows us to simplify the model structure considerably by allowing for one-dimensional heterogeneity and a non-overlapping wage distribution.Footnote 8 In this setting, the government can redistribute more income beyond what can be achieved with the income tax system by exploiting general-equilibrium effects on wages. For this reason, education may even be subsidized on a net basis, which can never occur in Jacobs and Thuemmel (2021).

Furthermore, we add to the analyses in Loebbing (2020), Heathcote et al. (2014), and Ales et al. (2015), who study the response of optimal taxes to technical change, but not how education policy should respond. Loebbing (2020) studies the interaction between optimal nonlinear income taxes and directed technical change. Heathcote et al. (2014) study the impact of SBTC on the optimal degree of tax progressivity using a parametric tax function in a model with endogenous human capital formation and imperfect substitutability of skills.Footnote 9 Ales et al. (2015) analyze how the nonlinear income tax should adjust to technical change in a task-based model of the labor market with exogenous human capital decisions.Footnote 10 In line with all these papers, we confirm that the tax system becomes more progressive in response to technical change. Our contribution is to also analyze optimal education policy jointly with optimal tax policy and show that SBTC reduces optimal education subsidies.

3 Model

This section presents our model consisting of individuals, firms, and a government. Utility-maximizing individuals supply labor on the intensive margin and optimally decide to become high-skilled or to remain low-skilled on the extensive margin. Profit-maximizing firms employ high-skilled and low-skilled labor, while facing SBTC. The government optimally sets progressive income taxes and education subsidies to maximize social welfare.

3.1 Individuals

There is a continuum of individuals of unit mass. Each worker is endowed with earnings ability \(\theta \in [\underline{\theta },\overline{ \theta }]\), which is drawn from distribution \(F(\theta )\) with corresponding density \(f(\theta )\). Individuals derive utility from consumption c and disutility from labor supply l. Individuals have identical quasi-linear preferences:

$$\begin{aligned} U(c,l)\equiv c-\frac{l^{1+1/\varepsilon }}{1+1/\varepsilon },\ \ \varepsilon >0, \end{aligned}$$
(1)

where \(\varepsilon \) is the constant wage elasticity of labor supply.Footnote 11 Consumption is the numéraire commodity and its price is normalized to unity.

In addition to optimally choosing consumption and labor supply, each individual makes a discrete choice whether to become high-skilled (H) or to remain low-skilled (L). We indicate an individual’s education type by \(j\in \left\{ L,H\right\} \) and define \({\mathbb {I}}\) as an indicator function for being high-skilled:

$$\begin{aligned} {\mathbb {I}}\equiv {\left\{ \begin{array}{ll} 1 &{} \text {if }j=H, \\ 0 &{} \text {if }j=L. \end{array}\right. } \end{aligned}$$
(2)

To become high-skilled, workers need to invest a fixed amount of resources \( p(\theta )\), which captures expenses such as tuition fees, books, and the (monetary value of) effort. High-skilled individuals also forgo earnings as a low-skilled worker. The wage rate per efficiency unit of labor is denoted by \(w^{j}\). Gross earnings are thus equal to \(z_{\theta }^{j}\equiv w^{j}\theta l_{\theta }^{j}\).

We model the direct costs of education as a weakly decreasing function of the worker’s ability \(\theta \):

$$\begin{aligned} p(\theta )\equiv \pi \theta ^{-\psi },\ \ \ \pi \in (0,\infty ),\ \ \ \psi \in [0,\infty ). \end{aligned}$$

If \(\psi >0\), individuals with higher ability \(\theta \) have lower direct costs of education. Hence, more able students need to spend less on education, e.g., because they have lower costs of effort, lower tuition fees, require less tutoring, or obtain grants. Parameter \(\psi \) is only introduced to control the elasticity of enrollment in higher education in the simulations, which will be calibrated at empirically plausible values. However, in the theoretical derivations one can set \(\psi =0\) without any loss of generality, such that all individuals face the same direct costs of education \(\pi \).

The government levies linear taxes t on labor income and provides a non-individualized lump-sum transfer b. The tax system is progressive if both the tax rate t and transfer b are positive.Footnote 12 In addition, high-skilled individuals receive a flat rate education subsidy s on total resources \(p(\theta )\) invested in education. We do not restrict the education subsidy to be positive; hence, we allow for the possibility that high-skilled individuals may have to pay an education tax. Workers of type \(\theta \) with education j thus face the following budget constraint:

$$\begin{aligned} c_{\theta }^{j}=(1-t)z_{\theta }^{j}+b-(1-s)p(\theta )\mathrm {{\mathbb {I}}}. \end{aligned}$$
(3)

The informational assumptions of our model are that individual ability \( \theta \) and labor effort \(l_{\theta }^{j}\) are not verifiable, but aggregate labor earnings \({\bar{z}}\equiv \int _{\underline{\theta }}^{{\overline{\theta }} }z_{\theta }^{j}\mathrm {d}F(\theta )\) and aggregate education expenditures \( \int _{\underline{\theta }}^{{\overline{\theta }}}p(\theta )\mathrm {{\mathbb {I}}} \mathrm {d}F(\theta )\) are. Hence, the government can levy linear taxes on income and provide linear subsidies on education.Footnote 13\(^,\)Footnote 14 Importantly, the tax implementation does not exploit all information available to the government. In particular, we realistically assume that marginal tax rates are not conditioned on education choices, in contrast to Jacobs and Thuemmel (2021). Consequently, income taxes can no longer achieve the same income redistribution as reducing inequality via general-equilibrium effects on wage rates. Hence, exploiting general-equilibrium effects becomes socially desirable for income redistribution.

Workers maximize utility by choosing consumption, labor supply, and education, taking wage rates and government policy as given. For a given education choice, the first-order condition for maximizing utility in Eq. (1), subject to the budget constraint in Eq. (3), yields optimal labor supply for all workers of type \(\theta \) and education j:

$$\begin{aligned} l_{\theta }^{j}=[(1-t)w^{j}\theta ]^{\varepsilon }. \end{aligned}$$
(4)

Labor supply increases in net earnings per hour \((1-t)w^{j}\theta \), and more so if labor supply is more elastic (higher \(\varepsilon \)). Income taxation distorts labor supply downward as it drives a wedge between the social rewards of labor supply (\(w^{j}\theta \)) and the private rewards of labor supply (\((1-t)w^{j}\theta \)).
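As a minimal numerical sketch of Eq. (4), the Python snippet below evaluates optimal labor supply and checks that it maximizes utility in Eq. (1) given the budget constraint in Eq. (3). The function names and the parameter values (\(\theta =1.5\), \(w^{j}=2\), \(t=0.3\), \(\varepsilon =0.3\)) are illustrative assumptions, not the calibration used in the paper.

```python
import math

def labor_supply(theta, w, t, eps):
    """Optimal labor supply from Eq. (4): l = [(1 - t) * w * theta] ** eps."""
    return ((1.0 - t) * w * theta) ** eps

def utility(l, theta, w, t, eps, b=0.0):
    """Utility from Eq. (1), with consumption from Eq. (3) for a low-skilled worker."""
    c = (1.0 - t) * w * theta * l + b
    return c - l ** (1.0 + 1.0 / eps) / (1.0 + 1.0 / eps)

# Illustrative (assumed) parameter values.
theta, w, t, eps = 1.5, 2.0, 0.3, 0.3
l_star = labor_supply(theta, w, t, eps)

# Utility at l_star compared with utility at nearby labor supplies.
u_star = utility(l_star, theta, w, t, eps)
u_lower = utility(0.95 * l_star, theta, w, t, eps)
u_higher = utility(1.05 * l_star, theta, w, t, eps)

# Elasticity of labor supply with respect to the wage rate, measured in logs.
elasticity = math.log(labor_supply(theta, 1.1 * w, t, eps) / l_star) / math.log(1.1)
```

In this sketch, `u_star` exceeds both neighboring utility levels, and `elasticity` coincides with `eps`, in line with the interpretation of \(\varepsilon \) as the constant wage elasticity of labor supply.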

By substituting the first-order condition in Eq. (4) into the utility function in Eq. (1), and using the budget constraint Eq. (3), the indirect utility function is obtained for all \(\theta \) and j:

$$\begin{aligned} V_{\theta }^{j}\equiv \frac{[(1-t)w^{j}\theta ]^{1+\varepsilon }}{ 1+\varepsilon }+b-((1-s)p(\theta )){\mathbb {I}}. \end{aligned}$$
(5)
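The substitution step behind Eq. (5) can be made explicit. As a sketch using only Eqs. (1), (3), and (4), write \(n\equiv (1-t)w^{j}\theta \) for net earnings per hour; the shorthand \(n\) is introduced here for illustration only. Then \(l_{\theta }^{j}=n^{\varepsilon }\), so that net labor income equals \(n^{1+\varepsilon }\) and \((l_{\theta }^{j})^{1+1/\varepsilon }=n^{1+\varepsilon }\). Using \(1/(1+1/\varepsilon )=\varepsilon /(1+\varepsilon )\):

$$\begin{aligned} V_{\theta }^{j}=n^{1+\varepsilon }-\frac{\varepsilon }{1+\varepsilon }n^{1+\varepsilon }+b-(1-s)p(\theta ){\mathbb {I}}=\frac{n^{1+\varepsilon }}{1+\varepsilon }+b-(1-s)p(\theta ){\mathbb {I}}. \end{aligned}$$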

An individual chooses to invest in education if and only if she derives higher utility from being high-skilled than from remaining low-skilled, i.e., if \(V_{\theta }^{H}\ge V_{\theta }^{L}\). The critical level of ability \(\Theta \) that separates the high-skilled from the low-skilled individuals is determined by \(V_{\Theta }^{H}=V_{\Theta }^{L}\) and is given by

$$\begin{aligned} \Theta =\left[ \frac{\pi (1-s)(1+\varepsilon )}{(1-t)^{1+\varepsilon }((w^{H})^{1+\varepsilon }-(w^{L})^{1+\varepsilon })}\right] ^{\frac{1}{ 1+\varepsilon +\psi }}. \end{aligned}$$
(6)

All individuals with ability \(\theta <\Theta \) remain low-skilled, whereas all individuals with \(\theta \ge \Theta \) become high-skilled. A decrease in \(\Theta \) implies that more individuals become high-skilled. If \( w^{H}/w^{L}\) rises, more individuals invest in higher education. The same holds true for a decrease in the net cost of education \((1-s)\pi \). The income tax potentially distorts the education decision, since the direct costs of education are not tax-deductible, while the returns to education are taxed. Income taxes also reduce investment in education because they reduce labor supply, and thereby lower the ‘utilization rate’ of human capital. If labor supply were exogenous (\(\varepsilon =0\)), and education subsidies made all education expenses effectively deductible (i.e., \(s=t\)), education would be at its first-best level: \( \Theta =[\pi /(w^{H}-w^{L})]^{{1}/({1+\varepsilon +\psi })}\) (see Jacobs 2005; Bovenberg and Jacobs 2005).
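The cutoff in Eq. (6) can be illustrated numerically. The following Python sketch, with assumed wage rates, policy parameters, and function names chosen purely for illustration, computes \(\Theta \) and checks that the marginal individual is indifferent between both education types according to Eq. (5).

```python
def education_cost(theta, pi, psi):
    """Direct cost of education p(theta) = pi * theta ** (-psi)."""
    return pi * theta ** (-psi)

def indirect_utility(theta, w, t, s, b, eps, pi, psi, high_skilled):
    """Indirect utility from Eq. (5) for a worker facing wage rate w."""
    v = ((1.0 - t) * w * theta) ** (1.0 + eps) / (1.0 + eps) + b
    if high_skilled:
        v -= (1.0 - s) * education_cost(theta, pi, psi)
    return v

def cutoff(w_l, w_h, t, s, eps, pi, psi):
    """Critical ability level Theta from Eq. (6); needs w_h > w_l."""
    if w_h <= w_l:
        raise ValueError("Eq. (6) requires a positive skill premium (w_h > w_l)")
    num = pi * (1.0 - s) * (1.0 + eps)
    den = (1.0 - t) ** (1.0 + eps) * (w_h ** (1.0 + eps) - w_l ** (1.0 + eps))
    return (num / den) ** (1.0 / (1.0 + eps + psi))

# Illustrative (assumed) values; wages are taken as given here.
params = dict(eps=0.3, pi=1.0, psi=2.0)
w_l, w_h, t, s, b = 1.0, 1.8, 0.3, 0.2, 0.5

Theta = cutoff(w_l, w_h, t, s, **params)

# Utility difference between both education types at the cutoff.
gap = (indirect_utility(Theta, w_h, t, s, b, high_skilled=True, **params)
       - indirect_utility(Theta, w_l, t, s, b, high_skilled=False, **params))

# Partial-equilibrium comparative statics, holding wages fixed.
Theta_more_subsidy = cutoff(w_l, w_h, t, 0.4, **params)
Theta_more_tax = cutoff(w_l, w_h, 0.5, s, **params)
```

Holding wage rates fixed, a higher subsidy lowers the cutoff (more individuals become high-skilled), while a higher tax rate raises it. The general-equilibrium feedback through \(w^{H}\) and \(w^{L}\) is deliberately left out of this sketch.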

3.2 Firms

A representative firm produces the homogeneous consumption good by using aggregate low-skilled labor L and aggregate high-skilled labor H as inputs. Output Y is produced with a constant-returns-to-scale CES production technology:

$$\begin{aligned}&Y(L,H,A)=B\left( \omega L^{\frac{\sigma -1}{\sigma }}+(1-\omega )(AH)^{\frac{ \sigma -1}{\sigma }}\right) ^{\frac{\sigma }{\sigma -1}},\nonumber \\&\quad A,B>0,\ \ \omega \in (0,1),\ \ \ \sigma >1, \end{aligned}$$
(7)

where \(\omega \) governs the income shares of low- and high-skilled workers, \(\sigma \equiv Y_{H}Y_{L}/(Y_{HL}Y)\) is the elasticity of substitution between low- and high-skilled labor, and skill bias is parameterized by A. We denote by \(\alpha \equiv HY_{H}(\cdot )/Y(\cdot )\) the income share of high-skilled workers. We model technology as in the canonical model of SBTC. We assume that \(\sigma >1\) to ensure that skill-biased technical change increases the relative wage of high-skilled to low-skilled workers (Acemoglu and Autor, 2011; Katz and Murphy, 1992; Violante, 2008). All theoretical results generalize to a general constant-returns-to-scale production technology that satisfies the Inada conditions and has an elasticity of substitution \(\sigma \) larger than unity (see the Appendix).

The representative firm is competitive and maximizes profits by taking wage rates as given. The first-order conditions are:

$$\begin{aligned} w^{L}=Y_{L}(L,H,A), \end{aligned}$$
(8)
$$\begin{aligned} w^{H}=Y_{H}(L,H,A). \end{aligned}$$
(9)

In equilibrium, the marginal product of each labor input equals its marginal cost. Moreover, in equilibrium, wage rates \(w^{L}\) and \(w^{H}\) depend on skill bias A. To improve readability, we suppress arguments L, H, and A in the derivatives of the production function in the remainder of the paper.
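To illustrate Eqs. (7) to (9), the Python sketch below computes the marginal products of the CES technology in closed form and compares the skill premium before and after a rise in skill bias A. The default parameter values (\(B=1\), \(\omega =0.5\), \(\sigma =2\)) and the labor inputs are assumptions made for illustration only.

```python
def ces_output(L, H, A, B=1.0, omega=0.5, sigma=2.0):
    """CES output from Eq. (7)."""
    rho = (sigma - 1.0) / sigma
    return B * (omega * L ** rho + (1.0 - omega) * (A * H) ** rho) ** (1.0 / rho)

def wages(L, H, A, B=1.0, omega=0.5, sigma=2.0):
    """Wage rates from Eqs. (8) and (9): marginal products Y_L and Y_H."""
    rho = (sigma - 1.0) / sigma
    inner = omega * L ** rho + (1.0 - omega) * (A * H) ** rho
    common = B * inner ** (1.0 / rho - 1.0)
    w_l = common * omega * L ** (rho - 1.0)
    w_h = common * (1.0 - omega) * A ** rho * H ** (rho - 1.0)
    return w_l, w_h

# Illustrative (assumed) labor inputs, held fixed while skill bias rises.
L, H = 1.0, 0.5
w_l_0, w_h_0 = wages(L, H, A=1.0)
w_l_1, w_h_1 = wages(L, H, A=1.5)

premium_before = w_h_0 / w_l_0
premium_after = w_h_1 / w_l_1

# With constant returns to scale, factor payments exhaust output.
payments = w_l_1 * L + w_h_1 * H
output = ces_output(L, H, A=1.5)
```

For given labor inputs and \(\sigma >1\), `premium_after` exceeds `premium_before`. Factor payments equal output, which is the property behind Eq. (10). The sketch holds L and H fixed and therefore ignores the supply responses that operate in general equilibrium.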

Since we have normalized the mass of individuals to one, average (gross) labor earnings \({\bar{z}}\) equals total income, which in turn equals output Y:

$$\begin{aligned} {\overline{z}}\equiv \int _{\underline{\theta }}^{\Theta }z_{\theta }^{L}\mathrm {d} F(\theta )+\int _{\Theta }^{{\overline{\theta }}}z_{\theta }^{H}\mathrm {d} F(\theta )=Y. \end{aligned}$$
(10)

3.3 Government

The government maximizes social welfare, which is given by

$$\begin{aligned} \int _{\underline{\theta }}^{\Theta }\Psi (V_{\theta }^{L})\mathrm {d}F(\theta )+\int _{\Theta }^{{\overline{\theta }}}\Psi (V_{\theta }^{H})\mathrm {d}F(\theta ),\ \ \ \Psi ^{\prime }>0,\ \ \ \Psi ^{\prime \prime }<0, \end{aligned}$$
(11)

where \(\Psi (\cdot )\) is a concave transformation of indirect utilities of low- and high-skilled workers. The government budget constraint states that total tax revenue equals spending on education subsidies, non-individualized transfers, and an exogenous government revenue requirement R:

$$\begin{aligned} t\left[ \int _{\underline{\theta }}^{\Theta }w^{L}\theta l_{\theta }^{L}\mathrm {d }F(\theta )+\int _{\Theta }^{{\overline{\theta }}}w^{H}\theta l_{\theta }^{H} \mathrm {d}F(\theta )\right] =s\int _{\Theta }^{{\overline{\theta }}}p(\theta ) \mathrm {d}F(\theta )+b+R. \end{aligned}$$
(12)

3.4 General equilibrium

In equilibrium, factor prices \(w^{L}\) and \(w^{H}\) ensure that labor markets and the goods market clear. Labor market clearing implies that aggregate effective labor supplies for each skill type equal aggregate demands:

$$\begin{aligned} L= & {} \int _{\underline{\theta }}^{\Theta }\theta l_{\theta }^{L}\mathrm {d} F(\theta ), \end{aligned}$$
(13)
$$\begin{aligned} H= & {} \int _{\Theta }^{{\overline{\theta }}}\theta l_{\theta }^{H}\mathrm {d} F(\theta ). \end{aligned}$$
(14)

Goods market clearing implies that total output Y equals aggregate demand for private consumption, education expenditures, and exogenous government spending R:

$$\begin{aligned} Y=\int _{\underline{\theta }}^{\Theta }c_{\theta }^{L}\mathrm {d}F(\theta )+\int _{\Theta }^{{\overline{\theta }}}(c_{\theta }^{H}+p(\theta ))\mathrm {d} F(\theta )+R. \end{aligned}$$
(15)

Due to the Inada conditions on the production technology, there will be a strictly positive mass of both high-skilled individuals and low-skilled individuals in general equilibrium (i.e., \(0<\Theta <\infty \)) if \(\varepsilon >0\) and \(0\le t<1\). Moreover, the skill premium will then always be positive, i.e., \(w^{H}>w^{L}\). That the equilibrium features a positive mass of low- and high-skilled workers jointly with \(w^{H}>w^{L}\) can be proven by contradiction. Suppose that there were an equilibrium in which the high-skilled wage is lower than the low-skilled wage, i.e., \(w^{H}<w^{L}.\) Then, it follows from the education decision underlying the cutoff \(\Theta \) in Eq. (6) that nobody wants to become high-skilled (\(\Theta \rightarrow \infty \)), since there are positive fixed costs of education (\(p(\theta )>0\)). However, if nobody wants to become high-skilled, then the wage of the high-skilled workers goes to infinity in view of the Inada conditions on the production function in Eq. (7), i.e., \(w^{H}\rightarrow \infty \), which contradicts that \(w^{H}<w^{L}\).

3.5 Behavioral elasticities

Before analyzing the optimal tax formulas, it is instructive to derive the general-equilibrium comparative statics of the model variables with respect to the income tax and education subsidy. Table 5 in Appendix A derives these behavioral elasticities.

The comparative statics of taxes and subsidies on behavior and wage rates can be summarized as follows. A higher income tax rate discourages both labor supply and investment in education, the latter because the direct costs of investment in human capital are not deductible from the income tax. The education subsidy boosts investment in education. Tax and education policy both affect the skill premium, i.e., the high-skilled wage relative to the low-skilled wage, by changing the relative supply of high-skilled to low-skilled labor. This occurs only via a change in investment in education, and not via changes in labor supply, since the labor-supply elasticity is the same for high-skilled and low-skilled workers. Larger income taxes raise the skill premium as fewer people become high-skilled. This implies that the adverse effect of taxation on high-skilled (low-skilled) labor supply is alleviated (exacerbated) by a rise in the skill premium. Similarly, education subsidies reduce the skill premium as more people become educated. As a result, education subsidies reduce high-skilled labor supply and increase low-skilled labor supply.

4 Optimal policy and SBTC

4.1 Optimal policy

The government maximizes social welfare in Eq. (11) by choosing the marginal income tax rate t, the lump-sum transfer b, and the education subsidy s, subject to the government budget constraint in Eq. (12). In order to interpret the expressions for the optimal tax rate t and the subsidy rate s, we introduce some additional notation.

First, we define the net tax wedge on skill formation \(\Delta \) as

$$\begin{aligned} \Delta \equiv tw^{H}\Theta l_{\Theta }^{H}-tw^{L}\Theta l_{\Theta }^{L}-sp(\Theta ). \end{aligned}$$
(16)

\(\Delta \) gives the increase in government revenue if the marginal individual with ability \(\Theta \) decides to become high-skilled rather than remaining low-skilled. If \(\Delta >0\), education is taxed on a net basis. \(tw^{H}\Theta l_{\Theta }^{H}\ \) gives the additional tax revenue when the marginal individual with ability \(\Theta \) becomes high-skilled. \(tw^{L}\Theta l_{\Theta }^{L}\) gives the loss in tax revenue as this individual no longer pays taxes as a low-skilled worker. The government also loses \(sp(\Theta )\) in revenue due to subsidizing education of this individual.
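The three components of the net tax wedge in Eq. (16) can be sketched in Python as follows. The cutoff \(\Theta \) and the wage rates are taken as given, and all numerical values and the function name are assumptions chosen for illustration.

```python
def net_tax_wedge(Theta, w_l, w_h, t, s, eps, pi, psi):
    """Net tax wedge on skill formation from Eq. (16)."""
    l_h = ((1.0 - t) * w_h * Theta) ** eps  # Eq. (4), marginal worker if high-skilled
    l_l = ((1.0 - t) * w_l * Theta) ** eps  # Eq. (4), marginal worker if low-skilled
    extra_tax = t * w_h * Theta * l_h       # tax paid as a high-skilled worker
    lost_tax = t * w_l * Theta * l_l        # tax no longer paid as a low-skilled worker
    subsidy_cost = s * pi * Theta ** (-psi)  # subsidy on education costs p(Theta)
    return extra_tax - lost_tax - subsidy_cost

# Illustrative (assumed) values; Theta is not solved for here.
Theta, w_l, w_h, eps, pi, psi = 1.1, 1.0, 1.8, 0.3, 1.0, 2.0

wedge_tax_only = net_tax_wedge(Theta, w_l, w_h, t=0.3, s=0.0, eps=eps, pi=pi, psi=psi)
wedge_subsidy_only = net_tax_wedge(Theta, w_l, w_h, t=0.0, s=0.2, eps=eps, pi=pi, psi=psi)
```

With a positive tax rate and no subsidy, the wedge is positive whenever \(w^{H}>w^{L}\), so education is taxed on a net basis. With a subsidy and no tax, the wedge equals \(-sp(\Theta )\), so education is subsidized on a net basis.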

Second, we derive a measure for the distributional benefits of taxing income. In particular, let the social welfare weight of an individual of type \(\theta \) be defined as \(g_{\theta }\equiv \Psi ^{\prime }(V_{\theta })/\eta \), where \(\eta \) is the Lagrange multiplier on the government budget constraint—see below. Following Feldstein (1972), we define the distributional characteristic \(\xi \) of the income tax as

$$\begin{aligned} \xi \equiv \frac{\int _{\underline{\theta }}^{\Theta }(1-g_{\theta })z_{\theta }^{L} \mathrm {d}F(\theta )+\int _{\Theta }^{{\overline{\theta }}}(1-g_{\theta })z_{\theta }^{H}\mathrm {d}F(\theta )}{{\overline{z}}{\bar{g}}}>0. \end{aligned}$$
(17)

\(\xi \) equals minus the normalized covariance between social welfare weights \( g_{\theta }\) and labor earnings \(z_{\theta }^{j}\). \(\xi \) measures the social marginal value of income redistribution via the income tax, expressed in monetary equivalents, as a fraction of taxed earnings. Marginal distributional benefits of income taxation are positive, since the social welfare weights \(g_{\theta }\) decline with ability \(\theta \). We have \(0\le \xi \le 1\), where \(\xi \) is larger if the government has stronger redistributive social preferences. For a Rawlsian/maxi-min social welfare function, which features \(\Psi _{\underline{\theta }}^{\prime }=1/f(\underline{\theta })\gg 1\) and \( \Psi _{\theta }^{\prime }=0\) for all \(\theta >\underline{\theta }\), we obtain \( \xi =1\) if the lowest ability is zero (\(\underline{\theta }=0\)). In contrast, for a utilitarian social welfare function with constant weights \( \Psi ^{\prime }=1\), we obtain \(\xi =0\).Footnote 15 We also derive that \(\xi =0\) if \(z_{\theta }^{j}\) is equal for everyone so that the government is not interested in income redistribution with the income tax.

An alternative intuition for the distributional characteristic \(\xi \) is that it measures the social value of raising an additional unit of revenue with the income tax. It gives the income-weighted average of the additional unit of revenue (the ‘1’) minus the utility losses (\(g_{\theta }\)) that raising this unit of revenue inflicts on tax payers.
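A discrete-population sketch may help fix ideas about Eq. (17). In the Python snippet below, the three ability groups, their earnings, and their welfare weights are hypothetical. The weights are assumed to average one across the population, which is an assumption of this illustration and consistent with the utilitarian case \(\Psi ^{\prime }=1\) discussed above.

```python
def distributional_characteristic(z, g, f):
    """Discrete analogue of xi in Eq. (17).

    z: gross earnings per group, g: social welfare weights per group,
    f: population shares per group (summing to one).
    """
    z_bar = sum(zi * fi for zi, fi in zip(z, f))
    g_bar = sum(gi * fi for gi, fi in zip(g, f))
    numerator = sum((1.0 - gi) * zi * fi for zi, gi, fi in zip(z, g, f))
    return numerator / (z_bar * g_bar)

# Hypothetical population with three equally sized groups.
f = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
z = [1.0, 2.0, 3.0]

# Welfare weights declining in earnings (average weight equal to one).
xi_declining = distributional_characteristic(z, [1.5, 1.0, 0.5], f)

# Constant welfare weights: no distributional gain from taxing income.
xi_constant = distributional_characteristic(z, [1.0, 1.0, 1.0], f)

# Equal earnings: nothing to redistribute with the income tax.
xi_equal_incomes = distributional_characteristic([2.0, 2.0, 2.0], [1.5, 1.0, 0.5], f)
```

In this example, `xi_declining` is positive because welfare weights and earnings covary negatively, whereas `xi_constant` and `xi_equal_incomes` are zero, matching the two limiting cases described in the text.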

Third, we similarly derive a measure for the distributional benefits of taxing education:

$$\begin{aligned} \zeta \equiv \int _{\Theta }^{{\overline{\theta }}}\theta ^{-\psi }(1-g_{\theta }) \mathrm {d}F(\theta )\ge 0. \end{aligned}$$
(18)

\(\zeta \) captures the marginal benefits of income redistribution from the high-skilled to the low-skilled via a higher net tax on education (lower net education subsidy). It gives the value of an additional unit of revenue (the ‘1’) minus the utility losses (\(g_{\theta }\)) that raising this unit of revenue inflicts on high-skilled tax payers.

In contrast to the expression for \(\xi \), the distributional benefits in \(\zeta \) are not divided by average earnings and the average welfare weight of the high-skilled, since the education choice is discrete.Footnote 16 However, the distributional benefits \(\zeta \) are scaled with the cost of education by the term \(\theta ^{-\psi }\) because the costs of education decline with \(\theta \), and more so if \(\psi \) is larger. If the costs of education are larger for individuals with a lower ability \(\theta \), and every individual receives a linear subsidy on total costs, low-ability individuals receive higher education subsidies in absolute amounts. If the costs of education are the same for each individual, we have that \(\psi =0\), and the distributional characteristic \(\zeta \) only depends on the social welfare weights \( g_{\theta } \).

Fourth, we define the income-weighted social welfare weights of each education group as

$$\begin{aligned} {\tilde{g}}^{L}\equiv \frac{\int _{\underline{\theta }}^{\Theta }g_{\theta }z_{ \theta }^{L}\mathrm {d}F(\theta )}{\int _{\underline{\theta }}^{\Theta }z_{ \theta }^{L}\mathrm {d}F(\theta )}>{\tilde{g}}^{H}\equiv \frac{\int _{\Theta }^{ {\overline{\theta }}}g_{\theta }z_{\theta }^{H}\mathrm {d}F(\theta )}{ \int _{\Theta }^{{\overline{\theta }}}z_{\theta }^{H}\mathrm {d}F(\theta )}. \end{aligned}$$
(19)

The social welfare weights of the low-skilled are on average higher than the social welfare weights of the high-skilled, since the social welfare weights continuously decline in income.

Fifth, we define the ‘general-equilibrium elasticity’ \(\varepsilon _{GE}\) as

$$\begin{aligned} \varepsilon _{GE}\equiv & {} \frac{\alpha (1-\alpha )\varsigma \delta }{ \sigma +\varepsilon +\varsigma \delta (\beta -\alpha )},\ \ \ \varsigma \equiv \frac{1+\varepsilon }{1+\varepsilon +\psi }, \ \ \ \beta \equiv \frac{1}{1-(w^{L}/w^{H})^{1+\varepsilon }}, \\ \delta\equiv & {} \left( \frac{\Theta l_{\Theta }^{L}f(\Theta )}{L}+\frac{\Theta l_{\Theta }^{H}f(\Theta )}{H}\right) \Theta , \nonumber \end{aligned}$$
(20)

where \(\varsigma \) is a parameter combination of labor-supply and education elasticities (\(\varepsilon \) and \(\psi \)), \(\beta \) is a measure of the inverse skill premium, and \(\delta \) is a measure for effective labor supply around the skill margin \(\Theta \). The general-equilibrium elasticity \(\varepsilon _{GE}\) measures the response of the relative wage \(w^H/w^L\) in general equilibrium to the relative change in H/L in general equilibrium, taking into account simultaneous changes in relative demand (\(\sigma \)) and relative supply (\(\varepsilon \) and \(\psi \)). This term captures the quantitative importance of general-equilibrium effects of tax and education policies on the relative wages of high-skilled and low-skilled workers. The general-equilibrium elasticity \(\varepsilon _{GE}\) decreases if labor-supply and education responses are more elastic (\(\psi \) and \(\varepsilon \) higher) and there is a larger mass of labor around the skill cutoff (\(\delta \) higher). In that case, the quantities of high-skilled labor relative to low-skilled labor respond more strongly to tax policy changes, leaving less room for relative wage effects to clear the labor markets for high-skilled and low-skilled labor. The general-equilibrium elasticity also increases if the skill premium is higher (inverse skill premium \(\beta \) is lower) and it is ambiguous in the share of high-skilled labor income (\(\alpha \)).Footnote 17
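
Eq. (20) is straightforward to evaluate once its ingredients are known. The sketch below is a direct transcription of the formula under stated assumptions: the values for \(\alpha \), \(\psi \), \(\delta \), and the wage ratio \(w^{L}/w^{H}\) are hypothetical, while \(\sigma =2.9\) and \(\varepsilon =0.3\) are the values used in the calibration section. The function and argument names are ours.

```python
def varsigma(eps, psi):
    """varsigma = (1 + eps) / (1 + eps + psi)."""
    return (1.0 + eps) / (1.0 + eps + psi)


def beta(wage_ratio_low_high, eps):
    """beta = 1 / (1 - (w^L / w^H)**(1 + eps)); requires w^L < w^H."""
    return 1.0 / (1.0 - wage_ratio_low_high ** (1.0 + eps))


def eps_ge(alpha, sigma, eps, psi, delta, wage_ratio_low_high):
    """General-equilibrium elasticity as written in Eq. (20)."""
    vs = varsigma(eps, psi)
    b = beta(wage_ratio_low_high, eps)
    return alpha * (1.0 - alpha) * vs * delta / (sigma + eps + vs * delta * (b - alpha))


# Hypothetical inputs, except sigma and eps (calibration values).
base = dict(alpha=0.4, sigma=2.9, eps=0.3, psi=2.0, delta=1.0, wage_ratio_low_high=0.6)

e_base = eps_ge(**base)
e_high_premium = eps_ge(**{**base, "wage_ratio_low_high": 0.5})
e_no_ge = eps_ge(**{**base, "sigma": 1e9})
print(e_base, e_high_premium, e_no_ge)
```

In this example, a lower \(w^{L}/w^{H}\) (a higher skill premium, hence a lower \(\beta \)) raises the elasticity, and letting \(\sigma \) become very large drives it toward zero.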

Armed with the additional notation, we are able to state the conditions for optimal policy in the next proposition.

Proposition 1

The optimal lump-sum transfer, income tax, and net tax on education are determined by

$$\begin{aligned}&{\bar{g}}\equiv \int _{\underline{\theta }}^{{\overline{\theta }}}g_{\theta }\mathrm {d }F(\theta )=1, \end{aligned}$$
(21)
$$\begin{aligned}&\frac{t}{1-t}\varepsilon +\frac{\Delta }{(1-t){\bar{z}}}\Theta f(\Theta )\varepsilon _{\Theta ,t}=\xi -({\tilde{g}}^{L}-{\tilde{g}} ^{H})\varepsilon _{GE}, \end{aligned}$$
(22)
$$\begin{aligned}&\frac{\Delta }{(1-t){\bar{z}}}\Theta f(\Theta )\varepsilon _{\Theta ,s}=\frac{s\pi }{(1-t){\bar{z}}}\zeta -\rho ({\tilde{g}}^{L}-{\tilde{g}}^{H})\varepsilon _{GE}, \end{aligned}$$
(23)

where \(\varepsilon _{\Theta ,t}\equiv \frac{\partial \Theta }{\partial t}\frac{1-t}{\Theta }\) is the elasticity of \(\Theta \) with respect to the net-of-tax rate \(1-t\), \(\varepsilon _{\Theta ,s}\equiv -\frac{\partial \Theta }{\partial s}\frac{s}{\Theta }\) is the elasticity of \(\Theta \) with respect to the subsidy rate s, and \(\rho \equiv \frac{s}{(1-s)(1+\varepsilon )}>0\) captures the importance of education subsidies in the total direct costs of education.

Proof

See Appendix B. \(\square \)

We interpret each optimality condition in Proposition 1 in the following subsections and relate our results to earlier findings in the literature.

4.1.1 Optimal transfer b

The optimality condition for the lump-sum transfer b in Eq. (21) equates the average social marginal benefit of giving all individuals one euro more in transfers (left-hand side) to the marginal costs of doing so (right-hand side). This expression is standard in optimal linear tax models; see also Sheshinski (1972), Dixit and Sandmo (1977), and Hellwig (1986).Footnote 18

4.1.2 Optimal income tax t

The optimal income tax in Eq. (22) equates the total marginal distortions of income taxation on the left-hand side with its distributional benefits on the right-hand side, for any value of the education subsidy, including the optimal level.Footnote 19 On the left-hand side, \(\frac{t}{1-t}\varepsilon \) captures the marginal deadweight loss of distorting labor supply. The larger the wage elasticity of labor supply \(\varepsilon \), the more income taxes distort labor supply. \(\frac{\Delta }{(1-t) {\bar{z}}}\Theta f(\Theta )\varepsilon _{\Theta ,t}\) denotes the marginal distortion of the education decision due to the income tax. A higher marginal tax rate discourages individuals from becoming high-skilled. The larger the elasticity \(\varepsilon _{\Theta ,t}\), the larger the distortions of income taxation on education. The higher the net tax wedge on education (in terms of net income) \({\Delta }/((1-t){\bar{z}})\), the more income taxation distorts education, and the lower the optimal tax rate should be. If education subsidies are higher, they counter the distortions of income taxes on education by lowering \(\Delta \), and optimal income taxes will be set higher, ceteris paribus. Hence, education subsidies allow for more progressive income taxes by alleviating the distortions on skill formation, as in Bovenberg and Jacobs (2005). \(\Theta f(\Theta )\) measures the ‘size of the tax base’ at the marginal graduate \(\Theta \). The higher the mass of individuals \(f(\Theta )\) and the larger their ability \(\Theta \), the more important are tax distortions in education.

The right-hand side of Eq. (22) gives the distributional benefits of income taxes. The larger the marginal distributional benefits of income taxes, as captured by \(\xi \), the higher the optimal tax rate should be. This is the standard term in optimal linear tax models; see also Sheshinski (1972), Dixit and Sandmo (1977), and Hellwig (1986). In addition, \(({\tilde{g}}^{L}-{\tilde{g}}^{H})\varepsilon _{GE}>0\) captures the distributional losses of income taxes due to general-equilibrium effects on the wage structure. Income taxation reduces skill formation. Hence, the supply of high-skilled labor falls relative to low-skilled labor. This raises high-skilled wages and depresses low-skilled wages. Consequently, before-tax inequality goes up and social welfare declines, since the income-weighted social welfare weights of the low-skilled workers are larger than the income-weighted social welfare weights of the high-skilled workers (\({\tilde{g}}^{L}>{\tilde{g}}^{H}\)). The direct gains of income redistribution (\( \xi \)) are therefore reduced by indirect, redistributional losses due to general-equilibrium effects on the wage distribution \(({\tilde{g}}^{L}-{\tilde{g}}^{H})\varepsilon _{GE}\). The general-equilibrium elasticity \(\varepsilon _{GE}\) captures the strength of these general-equilibrium effects of income taxes. A lower elasticity of substitution \(\sigma \), a lower labor-supply elasticity \(\varepsilon \), and a lower education elasticity \(\psi \) provoke stronger general-equilibrium responses that erode the distributional powers of income taxation. Intuitively, if quantities adjust only little, relative wages will need to adjust relatively more to clear labor markets.
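
To see how the terms in Eq. (22) interact, the condition can be rearranged for \(t\) when every other term is held fixed. The sketch below is a partial illustration only: in the model, \(\xi \), \(\varepsilon _{GE}\), \(\Theta \), and the elasticities all depend on policy, so Eq. (22) is a fixed-point condition rather than a closed-form solution. The sketch uses \(\Delta =tD-sp(\Theta )\) with \(D\equiv w^{H}\Theta l_{\Theta }^{H}-w^{L}\Theta l_{\Theta }^{L}\). The shorthand \(D\), the argument names, and all numerical inputs other than \(\varepsilon =0.3\) are hypothetical.

```python
def tax_rate_from_eq22(rhs, eps, wage_gap, subsidy_outlay, base_term):
    """Solve Eq. (22) for t, treating all other terms as given numbers.

    rhs            : xi - (g_L - g_H) * eps_GE, the right-hand side of Eq. (22)
    eps            : labor-supply elasticity
    wage_gap       : D, earnings gap at the marginal graduate, so Delta = t*D - s*p
    subsidy_outlay : s * p(Theta), subsidy received by the marginal graduate
    base_term      : Theta * f(Theta) * eps_{Theta,t} / mean earnings
    """
    return (rhs + subsidy_outlay * base_term) / (eps + wage_gap * base_term + rhs)


def eq22_residual(t, rhs, eps, wage_gap, subsidy_outlay, base_term):
    """Left-hand side minus right-hand side of Eq. (22) at a candidate t."""
    net_tax = t * wage_gap - subsidy_outlay
    lhs = t / (1.0 - t) * eps + net_tax / (1.0 - t) * base_term
    return lhs - rhs


# Hypothetical inputs (only eps = 0.3 is taken from the calibration).
common = dict(rhs=0.12, eps=0.3, wage_gap=8.0, base_term=0.02)

t_no_subsidy = tax_rate_from_eq22(subsidy_outlay=0.0, **common)
t_with_subsidy = tax_rate_from_eq22(subsidy_outlay=2.0, **common)
print(round(t_no_subsidy, 4), round(t_with_subsidy, 4))
```

Holding everything else fixed, a higher subsidy lowers \(\Delta \) and supports a higher tax rate, which is the ceteris paribus statement made above.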

In the absence of general-equilibrium effects (\(\sigma =\infty \)), the general-equilibrium elasticity is zero (\(\varepsilon _{GE}=0\)). In that case, education decisions are also (required to be) exogenous (\(\varepsilon _{\Theta ,t}=0\)); see Appendix A.2. Consequently, the standard linear income tax in the absence of general-equilibrium effects and human capital distortions is obtained: \(\varepsilon{t}/({1-t}) = \xi \). See also Sheshinski (1972), Dixit and Sandmo (1977), and Hellwig (1986).
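
As a worked illustration, this benchmark rule can be rearranged for the tax rate when \(\xi \) is treated as a given number. The value \(\xi =0.15\) is purely hypothetical, while \(\varepsilon =0.3\) is the elasticity adopted in the calibration section. Since \(\xi \) itself depends on the tax system, the expression is a fixed-point condition rather than a closed-form solution.

$$\begin{aligned} \frac{t}{1-t}\varepsilon =\xi \quad \Longleftrightarrow \quad t=\frac{\xi }{\xi +\varepsilon },\qquad \text{e.g.,}\quad t=\frac{0.15}{0.15+0.3}=\frac{1}{3}\approx 33\%. \end{aligned}$$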

As a special case, we can also derive the optimal income tax without education subsidies. This allows us to relate the optimal income tax to Feldstein (1972), Allen (1982), and Jacobs (2012), who also study optimal linear income taxes with general-equilibrium effects on wages. The optimal income tax in the absence of education policy can be found by setting \(s=0\) in Eq. (22):

$$\begin{aligned} \frac{t}{1-t}\varepsilon +\frac{t}{1-t}\left( \frac{w^{H}\Theta l_{\Theta }^{H}-w^{L}\Theta l_{\Theta }^{L}}{{\bar{z}}}\right) \Theta f(\Theta )\varepsilon _{\Theta ,t}=\xi -({\tilde{g}}^{L}-{\tilde{g}} ^{H})\varepsilon _{GE}. \end{aligned}$$
(24)

We find that optimal linear income taxes are determined by general-equilibrium effects on wages. Our economic mechanism differs from that in Feldstein (1972), Allen (1982), and Jacobs (2012). In all these papers, general-equilibrium effects depend on differences in (uncompensated) wage elasticities of labor supply between high-skilled and low-skilled workers. In particular, if high-skilled workers have the largest uncompensated wage elasticity of labor supply, then linear income taxes depress labor supply of high-skilled workers more than that of low-skilled workers, and this generates general-equilibrium effects on wages, which results in larger before-tax income inequality. Optimal income taxes are lowered accordingly.Footnote 20 High- and low-skilled individuals can have different uncompensated labor-supply elasticities due to differences in income elasticities or compensated elasticities. However, this mechanism is not relevant here, since we assume no income effects and compensated wage elasticities of labor supply are equal for both skill types. Indeed, the relative supply of skilled labor does not change due to changes in relative hours worked, but due to endogenous education choices. Income taxes unambiguously generate larger pre-tax income inequality due to general-equilibrium effects, since they reduce skill formation. This contrasts with the contributions that abstract from an endogenous education decision on the extensive margin.

If education subsidies are constrained to be zero, the optimality condition for optimal income taxes in Eq. (24) does not fundamentally change compared to the condition for optimal income taxes with non-zero, and potentially optimal, education subsidies in Eq. (22). The main difference is that the net tax wedge on education is now unambiguously positive, i.e., \(\Delta \equiv tw^{H}\Theta l_{\Theta }^{H}-tw^{L}\Theta l_{\Theta }^{L}>0\). Since direct costs of education are not subsidized, income taxes distort skill formation, besides labor supply. This additional tax distortion lowers optimal income taxes below the levels that would be obtained if skill formation were exogenous, i.e., where \(\varepsilon _{\Theta ,t}=0\); see also Jacobs (2005).

4.1.3 Optimal net tax on education \(\Delta \)

The optimality condition for education subsidies is given in Eq. (23). The left-hand side gives the marginal distortions of taxing education on a net basis, while the right-hand side gives the distributional benefits of doing so, for any value of the income tax rate, including the optimal level.Footnote 21 If \(\Delta >0\), education is taxed on a net basis. The optimal education subsidy s follows from the net tax on education \(\Delta \equiv tw^{H}\Theta l_{\Theta }^{H}-tw^{L}\Theta l_{\Theta }^{L}-sp(\Theta )\). Education is distorted downwards more if the optimal net tax on education \(\frac{\Delta }{(1-t){\bar{z}}}\) is larger. Distortions on education are larger (higher \(\Delta \)) if the income tax t is set at a higher level—ceteris paribus. \(\Theta f(\Theta )\) is the same as in Eq. (22). It captures the size of the tax base at the marginal graduate \(\Theta \). The larger the education elasticity \(\varepsilon _{\Theta ,s}\) with respect to the subsidy rate s, the more skill formation responds to net taxes on education, and the lower should be the optimal net tax on education.

For given distributional benefits of net taxes on education on the right-hand side of Eq. (23), and for a given elasticity of education on the left-hand side of Eq. (23), the optimal subsidy s on education rises if the income tax rate t increases, so as to keep the net tax \(\Delta \) constant. These results are similar to Bovenberg and Jacobs (2005) who show that education subsidies should increase if income taxes are higher so as to alleviate the distortions of the income tax on skill formation—ceteris paribus.Footnote 22
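
The ceteris paribus link between \(s\) and \(t\) follows directly from the definition of the net tax. As a short sketch, write \(D\equiv w^{H}\Theta l_{\Theta }^{H}-w^{L}\Theta l_{\Theta }^{L}\) (a shorthand introduced here for illustration) and hold \(\Delta \), \(D\), and \(p(\Theta )\) fixed, which abstracts from the response of wages and of the marginal graduate \(\Theta \) to policy:

$$\begin{aligned} \Delta =tD-sp(\Theta )\quad \Longrightarrow \quad s=\frac{tD-\Delta }{p(\Theta )},\qquad \left. \frac{\partial s}{\partial t}\right| _{\Delta }=\frac{D}{p(\Theta )}>0\quad \text{if }D>0. \end{aligned}$$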

Note that there is no impact of education subsidies on labor-supply distortions. Intuitively, a marginally higher education subsidy does not directly affect labor supply on the intensive margin. However, the subsidy does affect labor supply indirectly via changes in the wage distribution.

The distributional gains of net taxes on education are given on the right-hand side of Eq. (23). Since \(\zeta >0\), taxing education yields net distributional benefits. The higher the distributional gain of taxing education \(\zeta \), the higher the net tax on education, ceteris paribus. In contrast to Bovenberg and Jacobs (2005), it is generally not optimal to set the education subsidy exactly equal to the tax rate (i.e., \(s=t\)) to obtain a zero net tax on education (i.e., \(\Delta =0\)). Since investment in education generates infra-marginal rents for all but the marginally skilled individual, the government likes to tax these rents and redistribute income from high-skilled to low-skilled workers. This finding is in line with Findeisen and Sachs (2016) and Colas et al. (2021), who also analyze optimal education policies with discrete education choices.Footnote 23

Furthermore, lower net taxes (or even net subsidies) on education generate general-equilibrium effects on wages that are captured by \(\rho ({\tilde{g}}^{L}-{\tilde{g}}^{H})\varepsilon _{GE}\). Lower taxes (or higher subsidies) give distributional gains, since pre-tax income inequality declines. Social welfare then increases, since the income-weighted social welfare weights of the low-skilled are higher than those of the high-skilled (\({\tilde{g}}^{L}>{\tilde{g}}^{H}\)). The general-equilibrium elasticity \(\varepsilon _{GE}\) captures the strength of general-equilibrium effects. If general-equilibrium effects are sufficiently strong, education may even be subsidized on a net basis rather than taxed on a net basis (i.e., \(\Delta <0\)), which is in fact the case in our baseline simulation below. This finding confirms Dur and Teulings (2004) who analyze optimal log-linear tax and education policies in an assignment model of the labor market and find that optimal education subsidies may need to be positive. In the absence of general-equilibrium effects (\(\sigma =\infty \)), the general-equilibrium elasticity is zero (\(\varepsilon _{GE}=0\)), and education subsidies are not deployed to exploit general-equilibrium effects for income redistribution.

The finding that education may be subsidized on a net basis contrasts with Jacobs (2012), who also analyzes optimal linear taxes and education subsidies with general-equilibrium effects. However, he models education on the intensive rather than the extensive margin, as in Bovenberg and Jacobs (2005). Education subsidies should then not be employed to generate general-equilibrium effects, because the general-equilibrium effect of linear education subsidies is identical to the general-equilibrium effect of linear income taxes. Hence, education subsidies have no distributional value added over income taxes, but only generate additional distortions in education.

If the government does not have access to income taxes at all (\(t=0\)), then Eq. (23) reduces to

$$\begin{aligned} -s\frac{\pi }{{\bar{z}}}\Theta ^{1-\psi } f(\Theta )\varepsilon _{\Theta ,s}=\frac{s\pi }{{\bar{z}}}\zeta -\rho ({\tilde{g}}^{L}-{\tilde{g}}^{H})\varepsilon _{GE} . \end{aligned}$$
(25)

In this case, as in the general case, optimal subsidies on education remain ambiguous in sign. On the one hand, direct income redistribution, as captured by the first term on the right-hand side, calls for a tax on education. On the other hand, general-equilibrium effects, as captured by the second term on the right-hand side, call for a subsidy on education.

Our findings also differ from Jacobs and Thuemmel (2021). They analyze optimal nonlinear income taxes that can be conditioned on skill type in a random participation model with education-dependent nonlinear taxes and individuals differing along two dimensions: earning ability and costs of education. They find that education is always taxed on a net basis, in contrast to this paper. In their framework, general-equilibrium effects do not enter the optimal policy rules for either income taxes or education subsidies. The reason is that any redistribution from high-skilled to low-skilled workers via general-equilibrium effects can equally be achieved with the income tax system, while the distortions in education can be avoided. Our analysis shows that tax and education policies should exploit general-equilibrium effects on the wage distribution in the realistic case that tax schedules cannot be conditioned on education. By generating general-equilibrium effects on wages, the government can redistribute more income beyond what can be achieved with the income tax system alone.Footnote 24

4.2 Effects of SBTC on optimal policy

To understand the mechanisms behind the optimal policy response to SBTC, we study the comparative statics of the optimal policy rules with respect to SBTC. SBTC affects optimal policy through three channels: (1) distributional benefits, (2) education distortions, and (3) general-equilibrium effects. We do not report the effect of SBTC on labor-supply distortions, since the marginal excess burden of income taxes (\(\varepsilon{ t}/(1-t) \)) is not affected by SBTC because the labor-supply elasticity \(\varepsilon \) is the same for all individuals.

We analytically derive how an increase in skill bias affects the terms in the formula for the optimal income tax rate in Eq. (22) and in the formula for the optimal subsidy rate in Eq. (23). Online Appendix A contains the formal derivations and more detailed explanations. Table 1 summarizes the analytical comparative statics and shows that the impact of SBTC on all elements of the expressions for optimal income taxes and optimal education subsidies in Proposition 1 is theoretically ambiguous. To gain a better understanding of the sign and quantitative size of these effects, we proceed by numerically analyzing the impact of SBTC on optimal policy. Table 1 also summarizes the outcomes of our simulations of the impact of SBTC on optimal policy rules, to which we turn next.

Table 1 Effect of SBTC on determinants of optimal tax and subsidy rate

5 Simulation

In this section, we simulate the consequences of SBTC for optimal tax and education policy. To do so, we first calibrate the model to the US economy. Then, we analyze the comparative statics of SBTC on optimal policy and uncover the channels through which SBTC affects optimal tax and education policy. Given the ambiguous theoretical effects of SBTC on optimal policy, the purpose of the simulations is to better understand how optimal policy should respond to SBTC in a reasonably quantified model.

5.1 Calibration

We calibrate our model to the US economy using data from the US Current Population Survey.Footnote 25 We choose 1980 as the base year for the calibration, since evidence of SBTC emerges around that time. The final year is 2016. To compute levels and changes in the skill premium and the share of high-skilled workers in the data, we classify individuals with at least a college degree as high-skilled and all other individuals as low-skilled. The share of high-skilled workers in the working population was 24% in 1980 and 47% in 2016. We define the skill premium as average hourly earnings of high-skilled workers relative to average hourly earnings of low-skilled workers minus one:

$$\begin{aligned} \text{ skill } \text{ premium }\equiv \frac{w^{H}}{w^{L}}\frac{\frac{1}{1-F(\Theta )} \int _{\Theta }^{{\overline{\theta }}}\theta \mathrm {d}F(\theta )}{\frac{1}{ F(\Theta )}\int _{\underline{\theta }}^{\Theta }\theta \mathrm {d}F(\theta )} - 1. \end{aligned}$$
(26)

In the data, the skill premium changed from 47% in 1980 to 83% in 2016, which is an increase of 76%.
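
Eq. (26) can be evaluated on a discretized ability distribution. The sketch below is illustrative only: it assumes a small hypothetical grid of abilities with equal probabilities rather than the log-normal distribution with Pareto tail used in the calibration, and the function name is ours.

```python
def skill_premium(abilities, probs, cutoff, wage_high, wage_low):
    """Discrete analogue of Eq. (26).

    Ratio of mean hourly earnings of the high-skilled (ability >= cutoff)
    to mean hourly earnings of the low-skilled (ability < cutoff), minus one.
    """
    high = [(a, p) for a, p in zip(abilities, probs) if a >= cutoff]
    low = [(a, p) for a, p in zip(abilities, probs) if a < cutoff]
    mean_high = sum(a * p for a, p in high) / sum(p for _, p in high)
    mean_low = sum(a * p for a, p in low) / sum(p for _, p in low)
    return (wage_high / wage_low) * (mean_high / mean_low) - 1.0


# Hypothetical grid: four ability types with equal mass, cutoff at ability 3.
abilities = [1.0, 2.0, 3.0, 4.0]
probs = [0.25, 0.25, 0.25, 0.25]

premium_equal_wages = skill_premium(abilities, probs, 3.0, wage_high=1.0, wage_low=1.0)
premium_wage_gap = skill_premium(abilities, probs, 3.0, wage_high=1.2, wage_low=1.0)
print(round(premium_equal_wages, 4), round(premium_wage_gap, 4))
```

Even with \(w^{H}=w^{L}\), the measured premium is positive in this example, because the high-skilled are drawn from the upper part of the ability distribution.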

We set the compensated wage elasticity of labor supply to \(\varepsilon =0.3\), based on evidence reported in the surveys of Blundell and Macurdy (1999) and Meghir and Phillips (2010). Although estimated uncompensated labor-supply elasticities are typically lower, we use a higher value to approximate the compensated labor-supply elasticity. Moreover, in our model, \(\varepsilon \) can also be interpreted as the elasticity of taxable income, encompassing more intensive margins, such as tax avoidance and evasion. The empirical literature reports figures in the range of 0.15 to 0.40 for the elasticity of taxable income; see the survey by Saez et al. (2012). We study alternative values for the elasticity \(\varepsilon \) in the robustness checks. For the ability distribution \(F(\theta )\), we choose a log-normal distribution with a Pareto tail.Footnote 26

The production technology is modeled according to the production function in Eq. (7). We set the elasticity of substitution between skilled and unskilled workers to \(\sigma =2.9\), following Acemoglu and Autor (2012).Footnote 27 We normalize the level of skill bias in 1980 to \(A_{1980}=1\). SBTC between 1980 and 2016 then corresponds to an increase from \(A_{1980}\) to \(A_{2016}\), while keeping all other parameters in the production function constant.

We adopt a social welfare function with a constant elasticity of inequality aversion \(\phi >0\):

$$\begin{aligned} \Psi (V_{\theta })= {\left\{ \begin{array}{ll} \frac{V_{\theta }^{1-\phi }}{1-\phi }, &{} \phi \ne 1 \\ \ln (V_{\theta }), &{} \phi =1 \end{array}\right. } . \end{aligned}$$
(27)

\(\phi \) captures the government’s desire for income redistribution. \(\phi =0\) corresponds to a utilitarian social welfare function, whereas for \(\phi \rightarrow \infty \) the social welfare function converges to a Rawlsian social welfare function.Footnote 28 In the simulations, we assume \(\phi =1\) as this leads to optimal marginal tax rates that are in a similar range as the marginal tax rates observed in the data. We also consider other parameters for \(\phi \) in the robustness checks.
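
The social welfare function in Eq. (27) and its derivative can be written down in a few lines. The sketch below is a plain transcription under the assumption that indirect utility \(V_{\theta }\) is strictly positive; the function names are ours.

```python
import math


def social_welfare(v, phi):
    """Psi(V) from Eq. (27) for strictly positive utility v."""
    if v <= 0:
        raise ValueError("utility must be strictly positive")
    if phi == 1:
        return math.log(v)
    return v ** (1 - phi) / (1 - phi)


def marginal_welfare(v, phi):
    """Psi'(V) = V**(-phi), which holds for every phi, including phi = 1."""
    if v <= 0:
        raise ValueError("utility must be strictly positive")
    return v ** (-phi)


# With phi = 0 all marginal weights equal one (utilitarian case).
print(marginal_welfare(5.0, 0))
# With phi = 1 the marginal weight halves when utility doubles.
print(marginal_welfare(1.0, 1), marginal_welfare(2.0, 1))
```

A larger \(\phi \) makes the marginal weight fall faster in utility, which is how stronger inequality aversion enters the simulations.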

We calibrate the model taking the tax rate, the transfer, and the education subsidy for 1980 and 2016 as given. The marginal tax rate in 1980 was on average \(t=34.1\%\), while it was \(t=27.5\%\) in 2016 (National Bureau of Economic Research, 2021). The transfer b is pinned down by the average tax rate, which was 18.1% in 1980 and 15.8% in 2016. The subsidy rate is set at \(s=47\%\) for 1980 (Gumport et al., 1997) and at \(s=35\%\) for 2016 (OECD, 2018).Footnote 29 The subsidy rate corresponds to the share of government spending in total spending on higher education. At the calibrated equilibrium, the tax system also pins down the level of government expenditure R. When we compute optimal policy, we set the revenue requirement to this level of government expenditure.

Finally, we calibrate the parameters of the cost function for education (\(\pi \) and \(\psi \)) as well as the parameters of the production function (\(\omega \) and \(A_{2016}\)). We calibrate \(\psi \) in the cost function for education by targeting an enrollment elasticity of 0.17. We base this elasticity on estimates in Dynarski (2000). Like many other studies, Dynarski (2000) reports semi-elasticities, which are based on the effect of changes in tuition subsidies (in percent) on college enrollment (in percentage points).Footnote 30 It is commonly estimated that a $1000 increase in tuition subsidies increases college enrollment by 3 to 5 percentage points; see Nielsen et al. (2010) for an overview.Footnote 31 We also include robustness checks for the enrollment elasticity. The parameters of the production function (\(\omega \) and \(A_{2016}\)) are calibrated by targeting levels and changes in the skill premium.

We calibrate the model by setting parameters so as to minimize the sum of squared relative errors between the moments generated by our model and the corresponding empirical moments from the data. Since some moments are much easier to match than others, we impose the constraint that the relative deviation of model moments from their empirical counterparts does not exceed 30%. All calibrated parameters are summarized in Table 2. The implied moments are reported in Table 3.
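
The calibration criterion can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for Table 2: the list of moments is our reading of the targets mentioned in the text, the data values are the ones reported above, and the model values are hypothetical placeholders.

```python
def relative_errors(model, data):
    """Relative deviation of each model moment from its empirical counterpart."""
    return {name: (model[name] - data[name]) / data[name] for name in data}


def calibration_loss(model, data):
    """Sum of squared relative errors across all targeted moments."""
    return sum(err ** 2 for err in relative_errors(model, data).values())


def within_tolerance(model, data, tol=0.30):
    """True if no moment deviates by more than tol in relative terms."""
    return all(abs(err) <= tol for err in relative_errors(model, data).values())


# Empirical moments as reported in the text.
data = {
    "share_high_1980": 0.24,
    "share_high_2016": 0.47,
    "premium_1980": 0.47,
    "premium_2016": 0.83,
    "enrollment_elasticity": 0.17,
}

# Hypothetical model moments (placeholders, not the values in Table 3).
model = {
    "share_high_1980": 0.27,
    "share_high_2016": 0.48,
    "premium_1980": 0.55,
    "premium_2016": 0.90,
    "enrollment_elasticity": 0.14,
}

loss = calibration_loss(model, data)
feasible = within_tolerance(model, data)
print(round(loss, 4), feasible)
```

A numerical optimizer would search over the free parameters, recompute the model moments at each candidate, and keep the candidate with the lowest loss among those that satisfy the 30% bound.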

Table 2 Calibration: parameters

As expected, our model generates a skill premium that is generally too high, since the wage distributions of low-skilled and high-skilled workers do not overlap: the lowest-earning high-skilled worker still earns a higher wage than the highest-earning low-skilled worker. At the same time, the relative change in the skill premium is lower than in the data (59% instead of 76%). The share of high-skilled workers in the model is somewhat higher than in the data for the year 1980, but only slightly higher for the year 2016. The subsidy elasticity of enrollment is lower than our empirical target. Overall, the calibration nevertheless strikes a reasonable balance in matching the various moments in the data.

Table 3 Calibration: model versus data

To illustrate how our model responds to SBTC, we first simulate an increase in skill bias while keeping taxes, subsidies, and transfers at their calibrated values, which we refer to as the ‘status-quo’ economy. The outcomes are plotted in Fig. 1. The share of high-skilled workers is slightly concave in skill bias, while the skill premium increases almost linearly with skill bias. As a benchmark, we also simulate an economy without taxes and education subsidies, which we refer to as the ‘laissez-faire’ economy.Footnote 32 Comparing the laissez-faire and the status-quo economy shows the effect of policy: under laissez-faire, the share of high-skilled workers is lower, and correspondingly, the skill premium is higher. We attribute this difference primarily to the education subsidy in the status-quo tax system. However, the differences between the two economies are small. Moreover, in both cases the effect of SBTC on the share of high-skilled and the skill premium is very similar.

Fig. 1 Effect of SBTC under status-quo tax system and under laissez-faire. Note: The horizontal axis corresponds to skill bias A. Status quo refers to the tax system used in the calibration, and summarized in Table 2. Laissez-faire corresponds to \(t=0\) and \(s=0\)

5.2 Optimal policy and SBTC

We compute optimal policy for different levels of skill bias and show the results in Fig. 2. The optimal marginal tax rate t increases monotonically with skill bias from about 21% to 31% (Panel 2a). The optimal transfer—expressed as share of average earnings \({\bar{z}}\)—increases monotonically from about 4% to 20% (Panel 2b). The optimal subsidy rate s falls monotonically from about 68% to 49% (Panel 2c).Footnote 33 Finally, Panel 2d shows the optimal net tax on skill formation \(\Delta \) as a fraction of average earnings \({\bar{z}}\). Since the optimal net tax is negative, education is subsidized on a net basis. This means that education is distorted upwards compared to the efficient level, i.e., there is ‘over-investment.’ From the simulations, we conclude that the general-equilibrium effects of education subsidies are stronger than the direct distributional losses of education subsidies. The net tax as a fraction of average earnings increases monotonically with SBTC from about \(-10\%\) to \(-7\%\). In other words, the net subsidy on education becomes smaller with SBTC.

Fig. 2 Optimal policy under SBTC (all in %), skill bias A on the horizontal axis

5.3 Decomposition into different channels

We now uncover the three channels through which SBTC affects optimal policy. To do so, we start out from the optimum at \(A=1\) and then increase the level of skill bias, while holding s and t fixed. We then compute how each of the terms in the first-order conditions in Eqs. (22) and (23) is affected by the increase in skill bias. For each term, we report its initial level and its change due to SBTC in Table 4. The terms in the table are grouped according to the three channels by which SBTC affects optimal policy: (1) distributional benefits, (2) distortions in education, and (3) general-equilibrium effects. The effects have already been summarized in Table 1. Table 4 provides the quantification. We now discuss them in detail.

Table 4 Decomposition into different channels

5.3.1 Comparative statics of the optimal tax rate

5.3.1.1 Distributional benefits of income taxes \(\xi \)

The effect of SBTC on the distributional benefits of income taxes \(\xi \) is determined by changes in the income distribution and in the social welfare weights.

By raising the ratio of wage rates \(w^{H}/w^{L}\), SBTC changes the income distribution: directly by increasing before-tax wage differentials and indirectly by affecting labor-supply and education decisions. Since the increase in labor supply is larger if the wage rate or a worker’s ability is higher, income inequality between and within skill groups increases. Moreover, investment in education rises with SBTC, which also increases income inequality. General-equilibrium effects dampen the labor-supply and education responses by compressing wage differentials, but do not fully offset the direct increase in inequality. For given social welfare weights \( g_{\theta }\), SBTC thus increases the distributional benefits of taxing income \(\xi \).

However, the social welfare weights also change with SBTC. Social welfare weights decline with utility, since the government is inequality averse. High-ability workers experience the largest infra-marginal utility gain due to SBTC. As a consequence, social welfare weights for high-ability workers fall more than for low-ability workers.

The impact of SBTC on \(\xi \) is, therefore, analytically ambiguous: it raises the utility of the high-ability individuals relatively more, but it also lowers their social welfare weights more. In the numerical comparative statics, we find that SBTC raises the distributional benefits of taxing income (Table 4). The immediate effects on inequality thus dominate changes in social welfare weights. Ceteris paribus, higher distributional benefits of income taxes \(\xi \) thus call for an increase in the optimal tax rate.

5.3.1.2 Education distortions of income taxes \(\frac{\Delta }{(1-t){\bar{z}}} \Theta f(\Theta )\varepsilon _{\Theta ,t}\)

To disentangle the various effects of SBTC on education distortions, we begin with the first term in the expression for education distortions, \(\frac{\Delta }{(1-t){\bar{z}}}\). On the one hand, the net tax on education \( \Delta \equiv tw^{H}\Theta l_{\Theta }^{H}-tw^{L}\Theta l_{\Theta }^{L}-sp(\Theta )\) increases, because SBTC raises the wage differential between the marginally high-skilled and the marginally low-skilled worker—ceteris paribus. On the other hand, if education is subsidized (\(s>0\)), the net tax \(\Delta \) falls, because subsidies increase as SBTC lowers the marginal graduate \(\Theta \), who has higher costs of education—ceteris paribus.Footnote 34 Turning to the denominator, SBTC raises average income \({\overline{z}}\). The overall impact of SBTC on \(\frac{\Delta }{(1-t){\bar{z}}}\) is that it declines in our simulations.

Next, we turn to the size of the tax base at the marginal graduate, \( \Theta f(\Theta )\). Analytically, the impact on this expression is ambiguous. SBTC lowers \(\Theta \), but whether or not \(\Theta f(\Theta )\) increases depends on the location of \(\Theta \) in the skill distribution, i.e., before or after the mode. We find numerically that the tax base \(\Theta f(\Theta )\) increases with SBTC; hence, distortions on education become larger for that reason.
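The dependence on the location of \(\Theta \) can be illustrated with a short sketch. It assumes, purely for illustration, a lognormal ability density with hypothetical parameters `mu` and `sigma`; the distribution used in the calibration may differ. Under this assumption, \(\Theta f(\Theta )\) peaks at \(e^{\mu }\), so a fall in \(\Theta \) raises the tax base above that point and lowers it below.

```python
import math

def lognormal_pdf(theta, mu, sigma):
    """Density of a lognormal ability distribution (assumed for illustration)."""
    z = (math.log(theta) - mu) / sigma
    return math.exp(-0.5 * z * z) / (theta * sigma * math.sqrt(2.0 * math.pi))

def tax_base_at_cutoff(theta, mu=0.0, sigma=0.5):
    """Theta * f(Theta): the size of the tax base at the marginal graduate."""
    return theta * lognormal_pdf(theta, mu, sigma)

# Cutoff falls from 1.5 to 1.3 (both above exp(mu) = 1): the tax base grows.
high_region_rises = tax_base_at_cutoff(1.3) > tax_base_at_cutoff(1.5)

# Cutoff falls from 0.8 to 0.6 (both below exp(mu) = 1): the tax base shrinks.
low_region_falls = tax_base_at_cutoff(0.6) < tax_base_at_cutoff(0.8)

print(high_region_rises, low_region_falls)
```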

Finally, SBTC changes the elasticity of education with respect to the tax rate \(\varepsilon _{\Theta ,t}=\varsigma \frac{\sigma +\varepsilon }{\sigma +\varepsilon +\varsigma \delta (\beta -\alpha )}>0\); see Appendix A. It raises the income share of the high-skilled workers \(\alpha \) and reduces the measure of the inverse skill premium \(\beta \). However, the impact of SBTC on \(\delta \)—the mass of labor at the skill cutoff \(\Theta \)—is ambiguous, making its overall impact on \(\varepsilon _{\Theta ,t}\) ambiguous as well. In the numerical comparative statics, \(\varepsilon _{\Theta ,t}\) slightly increases.
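As a reading aid, the partial effects behind this argument can be written out. Writing \(D\equiv \sigma +\varepsilon +\varsigma \delta (\beta -\alpha )\) for the denominator (a shorthand introduced here only for this sketch) and treating \(\alpha \), \(\beta \), and \(\delta \) as if they could be varied independently, differentiation of the expression above gives

\[ \frac{\partial \varepsilon _{\Theta ,t}}{\partial \alpha }=\frac{\varsigma ^{2}\delta (\sigma +\varepsilon )}{D^{2}},\qquad \frac{\partial \varepsilon _{\Theta ,t}}{\partial \beta }=-\frac{\varsigma ^{2}\delta (\sigma +\varepsilon )}{D^{2}},\qquad \frac{\partial \varepsilon _{\Theta ,t}}{\partial \delta }=-\frac{\varsigma ^{2}(\sigma +\varepsilon )(\beta -\alpha )}{D^{2}}. \]

For \(\delta >0\) and \(\sigma +\varepsilon >0\), a higher \(\alpha \) and a lower \(\beta \) both raise \(\varepsilon _{\Theta ,t}\), whereas the contribution of \(\delta \) depends on the sign of \(\beta -\alpha \) and on the direction in which SBTC moves \(\delta \).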

Numerically, we find that education is distorted upwards; the net tax on education is negative (\(\Delta <0\)). Moreover, SBTC exacerbates these upward distortions. As upward education distortions become even larger with SBTC, the tax rate should increase, ceteris paribus (Table 4).

5.3.1.3 General-equilibrium effects of income taxes \(({\tilde{g}}^{L}-{\tilde{g}}^{H})\varepsilon _{GE}\)

To understand how SBTC affects general-equilibrium effects of income taxes, we need to know how the difference in income-weighted social welfare weights for the low- and the high-skilled workers, \({\tilde{g}}^{L}-{\tilde{g}}^{H}\), is affected by SBTC. First, SBTC raises income inequality between and within education groups, as argued above. Second, SBTC affects the composition of education groups as more individuals become high-skilled. Since the highest low-skilled worker and the lowest high-skilled worker now have a lower ability, both \({\tilde{g}}^{L}\) and \({\tilde{g}}^{H}\) increase when keeping the schedule of social welfare weights fixed. However, even for a fixed schedule of social welfare weights, the net impact on \({\tilde{g}}^{L}-{\tilde{g}}^{H}\) is not clear, as it is ambiguous whether \({\tilde{g}}^{L}\) or \({\tilde{g}}^{H}\) increases more. Third, the schedule of social welfare weights changes, as also argued above. Social welfare weights for individuals with higher ability or education decrease relative to social welfare weights of the individuals with lower ability or education, so that \({\tilde{g}}^{L}-{\tilde{g}}^{H}\) increases. Taking these effects together, the analytical impact of SBTC on \({\tilde{g}}^{L}-{\tilde{g}}^{H}\) is ambiguous, while the numerical impact is negative. Although the average social welfare weights of the low-skilled and the high-skilled workers both increase, this increase is found to be smaller for the low-skilled than for the high-skilled workers. Hence, the impact of larger inequality on social welfare weights is offset by the change in the composition of high- and low-skilled workers, and the impact of declining social welfare weights due to larger inequality.

Next, we turn to the impact of SBTC on the general-equilibrium elasticity \( \varepsilon _{GE}=\frac{\alpha (1-\alpha )\varsigma \delta }{\sigma +\varepsilon +\varsigma \delta (\beta -\alpha )}\). SBTC raises the income share \(\alpha \) of high-skilled workers and reduces the measure of the inverse skill premium \( \beta \). However, the analytical impact of SBTC on \(\delta \), and thus on \( \varepsilon _{GE}\) overall, is ambiguous. Numerically, SBTC increases \( \varepsilon _{GE}\). Hence, if SBTC becomes more important, the skill premium responds more elastically to changes in policy. Since \(\varepsilon _{GE}\) increases relatively more than \( {\tilde{g}}^{L}-{\tilde{g}}^{H}\) decreases, we find that general-equilibrium effects of income taxes become more important with SBTC. Ceteris paribus, this calls for lower income taxes (Table 4).
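A minimal numerical sketch of the expression for \(\varepsilon _{GE}\) may help to see why \(\delta \) matters for the sign. The parameter values are hypothetical placeholders chosen for illustration, not the calibrated values behind Table 4.

```python
def ge_elasticity(sigma, eps, varsigma, delta, alpha, beta):
    """eps_GE = alpha*(1-alpha)*varsigma*delta
                / (sigma + eps + varsigma*delta*(beta - alpha))."""
    denominator = sigma + eps + varsigma * delta * (beta - alpha)
    return alpha * (1.0 - alpha) * varsigma * delta / denominator

# Hypothetical baseline (illustrative values only).
params = dict(sigma=2.9, eps=0.3, varsigma=1.0, delta=0.5, alpha=0.4, beta=0.6)
ge_base = ge_elasticity(**params)

# SBTC raises alpha and lowers beta; with delta unchanged, eps_GE rises here.
ge_fixed_delta = ge_elasticity(**{**params, "alpha": 0.5, "beta": 0.5})

# If delta falls enough at the same time, eps_GE can instead decline.
ge_lower_delta = ge_elasticity(**{**params, "alpha": 0.5, "beta": 0.5,
                                  "delta": 0.3})

print(ge_base, ge_fixed_delta, ge_lower_delta)
```

Note that \(\alpha (1-\alpha )\) rises with \(\alpha \) only for \(\alpha <1/2\), so the sketch's outcome with fixed \(\delta \) also depends on the assumed starting value of \(\alpha \).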

5.3.1.4 All effects combined

Whether the income tax rate rises or falls with SBTC depends on which effects dominate. The increase in distributional benefits and the larger distortions of net subsidies on education call for an increase in the income tax, whereas stronger general-equilibrium effects tend to lower income taxes. Numerically, we find that the first two effects dominate (Table 4). As a consequence, SBTC leads to a higher optimal income tax rate.

5.3.2 Comparative statics of the optimal subsidy rate

5.3.2.1 Distributional losses of education subsidies \(\frac{s\pi }{ (1-t){\bar{z}}}\zeta \)

SBTC affects the distributional characteristic of education \(\zeta \) by changing the social welfare weights \(g_{\theta }\), and by lowering the threshold \(\Theta \) as more individuals become high-skilled. As before, the impact of SBTC on social welfare weights is ambiguous. In contrast to the impact of \(\Theta \) on the distributional characteristic of income \(\xi \), the decrease in \(\Theta \) leads to a higher distributional characteristic of education \(\zeta \), ceteris paribus. Intuitively, as more individuals with lower social welfare weights become high-skilled, the average welfare weight of low-skilled workers increases, and it becomes more desirable to tax education on a net basis. General-equilibrium effects dampen the labor-supply and education responses, thereby limiting the rise in pre-tax inequality. Numerically, we find that SBTC raises the distributional benefits of taxing education \(\zeta \). Hence, as the distributional losses of education subsidies increase, the subsidy rate should decrease with SBTC, ceteris paribus (Table 4).

5.3.2.2 Education distortions of education subsidies \(\frac{\Delta }{(1-t) {\bar{z}}}\Theta f(\Theta )\varepsilon _{\Theta ,s}\)

The distortions of taxes and subsidies on education only differ by a factor \( \rho \equiv \frac{s}{(1-s)(1+\varepsilon )}>0\), which captures the importance of education subsidies in the total direct costs of education (see also Table 5). This factor is not affected by SBTC. As a consequence, the direction in which SBTC affects distortions on skill formation is the same for taxes and subsidies. As we have argued above, we cannot analytically sign the effect. Numerically, the optimal net tax on education is negative, i.e., there is optimally a net subsidy on education resulting in over-investment. Moreover, we find that SBTC exacerbates these education distortions. Hence, ceteris paribus, the optimal subsidy rate should decrease with SBTC (Table 4).
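The role of the factor \(\rho \) can be sketched in a few lines. The subsidy rate `s`, the elasticity `eps`, and the tax-side term are hypothetical inputs; the only point illustrated is that scaling by a positive \(\rho \) preserves the sign of the tax-side term, which is how the statement that the two distortions differ only by \(\rho \) is read here.

```python
def rho(s, eps):
    """rho = s / ((1 - s) * (1 + eps)), defined for 0 < s < 1 and eps > -1."""
    if not 0.0 < s < 1.0:
        raise ValueError("subsidy rate s must lie strictly between 0 and 1")
    if eps <= -1.0:
        raise ValueError("eps must exceed -1")
    return s / ((1.0 - s) * (1.0 + eps))

def subsidy_side_term(tax_side_term, s, eps):
    """Scale a tax-side term by rho to obtain its subsidy-side counterpart."""
    return rho(s, eps) * tax_side_term

# Hypothetical inputs: s = 0.5, eps = 0.3, and a negative tax-side distortion.
factor = rho(0.5, 0.3)
scaled = subsidy_side_term(-0.02, 0.5, 0.3)
print(factor, scaled)
```

Because \(\rho \) depends only on \(s\) and \(\varepsilon \), it stays constant when the skill bias changes and these two inputs are held fixed.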

5.3.2.3 General-equilibrium effects of education subsidies \(\rho ( {\tilde{g}}^{L}-{\tilde{g}}^{H})\varepsilon _{GE}\)

The general-equilibrium effects of taxes and subsidies also differ only by the factor \(\rho \). It follows from our discussion of the general-equilibrium effects of taxes that we cannot analytically sign the effect, while numerically we find an increase (Table 4). As the general-equilibrium effect of education subsidies becomes more important with SBTC, the optimal subsidy rate should increase, ceteris paribus.

5.3.2.4 Combined effect

While increased distributional losses and larger distortions due to over-investment in education call for a lower subsidy rate, the increased importance of general-equilibrium effects tends to increase optimal subsidies on education. Numerically, we find that the first two effects dominate (Table 4). As a consequence, the optimal subsidy rate falls with SBTC.

5.4 Robustness

We find numerically that the optimal tax rate increases with SBTC, while the optimal subsidy rate falls. We now investigate whether these findings are robust to changes in the most important model parameters. If so, even though we cannot sign the impact of SBTC analytically, we can be confident that the impact of SBTC on optimal policy holds more generally. We study the robustness of our findings with regard to (1) the government’s inequality aversion, (2) the labor-supply elasticity, and (3) the subsidy elasticity of enrollment into higher education.

To study the effect of different labor-supply elasticities and enrollment elasticities, we recalibrate our model to match the moments in the data under the different elasticities. We present the calibration outcomes in Table 6 in Appendix E. In all calibrations, the change in the skill premium and the share of high-skilled workers is kept at baseline values. All models thus capture SBTC in a comparable way. To study the robustness of our results with respect to inequality aversion, we do not need to recalibrate our model, since the corresponding parameter \(\phi \) does not interact with the other model parameters and can thus be set independently. The range of skill bias \(A\) differs by scenario, as it is calibrated to match SBTC in the data.

5.4.1 Inequality aversion

The baseline assumes an elasticity of inequality aversion of \(\phi =1\). Figure 3 in Appendix E presents robustness checks for two additional levels of inequality aversion: \(\phi =0.5 \) and \(\phi =1.5\). Larger values of inequality aversion correspond to higher optimal tax and subsidy rates. Yet, the qualitative pattern is the same as in the baseline (represented by the solid black line); the optimal tax rate increases with SBTC, while the optimal subsidy rate falls.

5.4.2 Labor-supply elasticity

We compute optimal policy for \(\varepsilon =0.1\) and \(\varepsilon =0.5\) and present the results, alongside the baseline of \(\varepsilon =0.3\), in Fig. 4 in Appendix E. As expected, if the labor-supply elasticity is lower, then optimal tax rates are higher. The optimal subsidy rate at \(\varepsilon =0.5\) is similar to the baseline, while it is higher at \(\varepsilon =0.1\). Again, the qualitative pattern of a rising optimal tax and falling optimal subsidy rate with SBTC remains.

5.4.3 Subsidy elasticity of enrollment

In the baseline calibration, the subsidy elasticity of enrollment is 0.12. We compute results for two alternative scenarios with enrollment elasticities of 0.10 and 0.14, which we plot in Fig. 5 in Appendix E. If the enrollment elasticity is higher, then optimal tax rates are higher, while the opposite holds for optimal subsidy rates. Once more, we confirm the qualitative pattern of a rising optimal tax and falling optimal subsidy rate with SBTC.

5.5 Limitations and future research

We have studied a stylized model of how tax and education policy should respond to SBTC. In doing so, we focused on three first-order issues when thinking about optimal redistributive tax and education policy: direct distributional impacts, distortions in labor supply and education, and general-equilibrium effects of income taxes and education subsidies. Nevertheless, we ignored a number of potentially important real-world features to keep our analysis tractable. In particular, the income distributions of low- and high-skilled workers are not overlapping, and there are no credit constraints, information frictions, or externalities. The latter three features might justify government intervention in education (Barr 2004). Allowing for them might change our conclusions, which should, therefore, be taken with caution.

However, for any of these factors to affect our conclusions, they would need to interact with SBTC. It is clearly possible that SBTC—by generating larger income inequality—exacerbates problems with credit constraints. Jacobs and Yang (2016) demonstrate that tighter credit constraints typically raise optimal taxes and education subsidies (lower net taxes on education), thereby strengthening the main findings of this paper. Related is Colas et al. (2021), who analyze the optimal allocation of education subsidies across the income distribution (and other dimensions). They find that education subsidies should optimally be targeted toward students from lower socioeconomic backgrounds, because credit constraints, externalities or information frictions are more severe for them. Their findings also suggest that optimal policies may need to become more progressive in response to SBTC.

The analysis of optimal tax and education policy with SBTC and capital market constraints, externalities or information frictions is therefore an interesting and important avenue for future research.

6 Conclusion

This paper studies how optimal linear income tax and education policy should respond to skill-biased technical change (SBTC). To do so, we introduce intensive-margin labor supply and a discrete education choice into the canonical model of SBTC based on Katz and Murphy (1992), Violante (2008), and Acemoglu and Autor (2011). We start by deriving expressions for the optimal income tax and education subsidy for a given level of skill bias. The income tax and subsidy trade off direct distributional benefits and general-equilibrium effects of each policy against distortions of each policy on labor supply and education. Then, we analyze SBTC, which is shown to have theoretically ambiguous impacts on both optimal income taxes and education subsidies, since SBTC simultaneously changes (i) distributional benefits, (ii) distortions in education, and (iii) general-equilibrium effects.

To gauge the importance of each channel, we calibrate the model to the US economy and quantify the impact of SBTC on optimal policy. SBTC is found to make the tax system more progressive, since the distributional benefits of higher income taxes rise more than the tax distortions on education and the general-equilibrium effects of taxes. Moreover, education is subsidized on a net basis and is therefore above its efficient level. Hence, the subsidy indeed exploits general-equilibrium effects for redistribution. However, SBTC lowers optimal education subsidies, since the distributional losses and the distortions of higher education subsidies increase more than the general-equilibrium effects of subsidies.

In line with Tinbergen (1975) and Dur and Teulings (2004), we find that general-equilibrium effects do matter for the optimal design of tax and education policy. Moreover, our findings support the push for more progressive taxation in light of SBTC brought forward by Goldin and Katz (2010). However, Tinbergen and Goldin and Katz also advocate raising education subsidies to win the race against technology and to compress the wage distribution via general-equilibrium effects on wages. Our findings do not support this idea. The reason is that education subsidies not only compress wages, but also entail larger distributional losses and cause more over-investment in education as SBTC becomes more important. The latter are found to be quantitatively stronger than the larger benefits of using education subsidies to exploit general-equilibrium effects.

These results should be taken with caution as they have been derived in a stylized model, which made a number of important simplifications. Future research could fruitfully extend our analysis of optimal tax and education policy under SBTC to allow for overlapping wage distributions, borrowing constraints, information frictions, and externalities.