The marginal cost of public funds is one at the optimal tax system

This paper develops a Mirrlees framework with skill and preference heterogeneity to analyze optimal linear and nonlinear redistributive taxes, optimal provision of public goods, and the marginal cost of public funds (MCF). It is shown that the MCF equals one at the optimal tax system, for both lump-sum and distortionary taxes, for linear and nonlinear taxes, and for both income and consumption taxes. By allowing for redistributional concerns, the marginal excess burden of distortionary taxes is shown to be equal to the marginal distributional gain at the optimal tax system. Consequently, the modified Samuelson rule should not be corrected for the marginal cost of public funds. Outside the optimum, the marginal cost of public funds for distortionary taxes can be either smaller or larger than one. The findings of this paper have potentially important implications for applied tax policy and social cost–benefit analysis.

"Where there is indirect damage, it ought to be added to the direct loss of satisfaction involved in the withdrawal of the marginal unit of resources by taxation, before this is balanced against the satisfaction yielded by the marginal expenditure. It follows that, in general, expenditure ought not to be carried out so far as to make the real yield of the marginal unit of resources expended by the government equal to the real yields of the last unit left in the hands of the representative citizen."

Introduction
The marginal cost of public funds is the ratio of the social marginal value of a unit of resources raised by the government to the social marginal value of a unit of resources in the private sector. The marginal cost of public funds is therefore a measure indicating the scarcity of public resources. Ever since Pigou (1947), many scholars and policymakers have been convinced that the marginal cost of public funds must be larger than one, since the government relies on distortionary taxes to finance its outlays. If the marginal cost of public funds is indeed larger than one, this has important normative consequences for the determination of optimal public policy in many fields.
In early theoretical contributions, Stiglitz and Dasgupta (1971) and Atkinson and Stern (1974) demonstrated that the Samuelson (1954) rule for the optimum provision of public goods needs to be modified to account for tax distortions. The optimal level of public goods provision should be lower, and the optimal size of the government should thus be smaller, if the marginal cost of public funds is higher. Many applied cost-benefit analyses multiply the cost of public projects by a measure for the marginal cost of public funds that is larger than one. As a result, public projects are less likely to pass a cost-benefit test. For example, Heckman et al. (2010) evaluate the Perry Preschool Program and add 50 cents per dollar spent to account for the deadweight costs of taxation. Many other examples can be given, but the message is clear: The marginal cost of public funds has a tremendous impact on how governments should evaluate the desirability of public policies.
This paper questions the conventional wisdom that the marginal cost of public funds is necessarily larger than one by explicitly introducing distributional concerns to motivate tax distortions. Most of the literature has focused on Ramsey (1927) frameworks with homogeneous agents where redistributional concerns are absent by assumption; see, e.g., Browning (1976, 1987), Wildasin (1984), and Ballard and Fullerton (1992). Hence, the marginal cost of public funds for distortionary taxes is analyzed without paying attention to the ultimate reasons why there are distortionary taxes.⁴
This paper contributes to the literature on the marginal cost of public funds in a number of ways. First, it follows Mirrlees (1971) by providing a microeconomic foundation for tax distortions. Earning abilities and labor supplies are private information. These informational constraints in combination with distributional concerns of the government are the reason why distortionary taxation is optimal in second-best settings. Second, unlike the representative-agent literature, this paper does not need to rule out non-distortionary, non-individualized lump-sum taxes to obtain a non-trivial second-best policy problem.
The main finding of this paper is that, at the optimal tax system, the marginal cost of public funds is equal to one for all tax instruments. The marginal cost of public funds for the non-distortionary non-individualized lump-sum tax equals one at the optimal tax system. This comes as no surprise, since non-individualized lump-sum taxes feature neither distortions nor distributional benefits. By adjusting non-individualized lump-sum transfers, the government ensures that the social marginal value of resources is equalized in the public and the private sector. Moreover, the marginal cost of public funds for distortionary taxes should be one as well, since it should be equal to the marginal cost of public funds for non-distortionary taxes. In settings with heterogeneous agents, the marginal cost of public funds for a distortionary tax is thus shown to depend not only on the marginal excess burden of the tax, but also on the marginal distributional benefits of the tax. If the marginal cost of public funds is one at the optimal tax system, the marginal excess burden of a distortionary tax is exactly compensated by its marginal redistributional benefits.
To demonstrate how allowing for heterogeneous agents and non-individualized lump-sum taxes drives our main findings, this paper analyzes the special case in which the government cannot optimize lump-sum transfers. Even then it is not correct to conclude that the marginal cost of public funds of a distortionary tax is necessarily larger than one. The marginal cost of public funds for distortionary taxes can be either larger or smaller than one, depending on whether the marginal distributional gains are smaller or larger than the excess burden of the distortionary tax. The representative-agent models are nested as a special case of the model without non-individualized lump-sum taxes and where all distributional benefits of taxation are zero. Only in that special case can one unambiguously conclude that the marginal cost of public funds is larger than one, since all available tax instruments only cause distortions, but yield no redistributional gains.
Footnote 4: Some authors have allowed for heterogeneous agents, see, for example, the contributions by Browning and Johnson (1984) and Allgood and Snow (1998), but these studies focus mainly on the efficiency costs of taxation. Others have explicitly introduced distributional aspects of public goods and taxes, see, e.g., Christiansen (1981), Boadway and Keen (1993), Kaplow (1996), Sandmo (1998), Slemrod and Yitzhaki (2001), and Dahlby (2008).
The literature on the marginal cost of public funds has generated substantial confusion, see also the reviews in Ballard and Fullerton (1992), Dahlby (2008), and Jacobs (2009). In particular, earlier literature has not settled on a commonly agreed definition of the marginal cost of public funds.⁵ Moreover, the most regularly used definition, e.g., in Atkinson and Stern (1974), Ballard and Fullerton (1992), and Sandmo (1998), has some undesirable properties. First, the marginal cost of public funds for lump-sum taxes is generally not equal to one, even though there is no theoretical presumption that the marginal cost of public funds for lump-sum taxes should differ from one if lump-sum taxes are optimized. Second, measures for the marginal cost of public funds of distortionary taxes do not directly relate to the marginal excess burden of taxation, even though this relationship is suggested by, e.g., Pigou (1947), Harberger (1964), and Browning (1976). Finally, standard measures for the marginal cost of public funds are highly sensitive to the choice of the untaxed numéraire good. All these properties of standard measures for the marginal cost of public funds make them difficult to apply in policy, for example, in social cost-benefit analysis.
This paper aims to resolve the issues in the literature by defining the marginal cost of public funds as the ratio of the social marginal value of public income to Diamond's (1975) measure of (the average of) the social marginal value of private income. Intuitively, if the private sector receives an additional unit of funds, social welfare not only increases because the private sector experiences higher utility, but social welfare also changes if the additional unit of funds in the private sector causes income effects in behavior that result in revenue losses or revenue gains for the government. These income effects on taxed bases need to be taken into account to correctly calculate the marginal cost of public funds.
However, the standard marginal cost of public funds is defined as the ratio of the social marginal value of public income (measured in 'social utils') to (the average of) the private marginal value of private income (measured in 'private utils'). The traditional measure therefore ignores the income effects on taxed bases, which is shown to cause all its undesirable properties.
Footnote 5: On the one hand, the so-called Pigou-Harberger-Browning approach (also called 'differential analysis') equates the marginal cost of public funds to one plus the marginal excess burden of taxation, which is determined by the compensated tax elasticity of earnings, see also Pigou (1947), Harberger (1964), and Browning (1976, 1987). On the other hand, the Atkinson-Stern-Ballard-Fullerton approach (also called 'balanced-budget approach') bases the marginal cost of public funds on the ratio of the marginal value of income of the public and the private sector (i.e., the ratio of Lagrange multipliers on the government budget constraint and the private budget constraint). In that case, the uncompensated tax elasticity of earnings supply determines the marginal cost of funds, see also Atkinson and Stern (1974) and Ballard and Fullerton (1992). By using the latter approach to the marginal cost of funds, Sandmo (1998) and Slemrod and Yitzhaki (2001) include distributional aspects of distortionary linear taxation, while Gahvari (2006) extends it to nonlinear taxation. Using models with representative agents, Triest (1990), Håkonsen (1998), and Dahlby (1998) develop yet another MCF concept relying on correcting the standard MCF measures with a ratio of the shadow value of public resources before and after the introduction of distortionary taxes. To date, it remains unclear which MCF measure should be used in applied policy analysis.
By using the Diamond-based measure for the marginal cost of public funds, it is demonstrated that, at the optimal tax system, the marginal cost of public funds for lump-sum taxes is one, the marginal cost of public funds for distortionary taxes is directly related to the excess burden (in the absence of distributional concerns), and the marginal cost of public funds measures are no longer sensitive to the normalization of the tax system.
The remainder of this paper is structured as follows. Section 2 introduces the model. Section 3 is devoted to optimal taxation, optimal provision of public goods, and the marginal cost of public funds under linear tax instruments. Section 4 shows that the main result extends to nonlinear taxation using a tax perturbation. Section 5 discusses the policy implications of the analysis. Section 6 concludes. An online Appendix derives the marginal cost of public funds using compensating variations and rigorously proves the main findings under nonlinear instruments.

Model
The model consists of heterogeneous individuals optimally supplying labor on the intensive margin and a benevolent government optimally setting taxes and public goods.⁷ Without loss of generality, a partial-equilibrium setting is assumed in which prices are fixed, so that firms can be ignored.⁸ The paper mainly focuses on linear tax instruments. Later it is demonstrated that the main findings carry over to nonlinear instruments.

Individuals
There is a mass $N$ of individuals who differ by a single-dimensional parameter $n \in \mathcal{N} = [\underline{n}, \overline{n}]$, where the upper bound $\overline{n}$ could be infinite. $n$ is the individual's earning ability ('skill level'), which equals the productivity per hour worked. The density of individuals of type $n$ is denoted by $f(n)$ and the cumulative distribution function by $F(n)$. Note that $N \int_{\mathcal{N}} f(n)\,dn = N$.
Each individual $n$ derives utility $u(n)$ from consumption $c(n)$ and pure public goods $G$. Furthermore, each individual derives disutility from supplying labor $l(n)$. Each individual has an endowment of time, which is allocated between leisure and working. Consumption and leisure are both assumed to be normal goods. This paper allows for preference heterogeneity: The utility function depends on the skill level $n$. The utility function is strictly (quasi-)concave and is twice continuously differentiable:
$$u(n) \equiv u(c(n), l(n), G, n), \quad u_c, u_G > 0, \quad u_l < 0. \tag{1}$$
Subscripts denote partial derivatives.
The government employs a tax schedule consisting of a linear tax rate $t$ on gross labor earnings $z(n) \equiv nl(n)$, a linear tax rate $\tau$ on consumption goods $c(n)$, and a non-individualized lump-sum transfer $g$. The informational requirements for employing linear taxes are that the government should be able to verify aggregate labor income or consumption.
Footnote 7: The model is based on earlier contributions by Ramsey (1927), Mirrlees (1971), Diamond and Mirrlees (1971), Sheshinski (1972), Diamond (1975), Christiansen (1981), Boadway and Keen (1993), Kaplow (1996) and Sandmo (1998).
Footnote 8: Jacobs (2010) analyzes the model in general-equilibrium settings allowing for a representative firm operating a constant returns-to-scale technology and none of the results change. Almost all of the papers in the literature fix the marginal rates of transformation between all commodities at one. Hence, all prices are constant, and allowing for general equilibrium provides no additional insights. Moreover, our partial-equilibrium results fully generalize to general-equilibrium settings with non-constant prices, since optimal second-best tax rules in general equilibrium are identical to the ones in partial equilibrium as long as there are constant returns to scale in production and all labor types are perfect substitutes in production, see also Diamond and Mirrlees (1971).
Note that non-individualized lump-sum transfers are always part of the instrument set of the government, since the government can always provide each individual with an equal amount of resources. The individual budget constraint states that net expenditures on consumption are equal to net labor earnings plus the lump-sum transfer:
$$(1+\tau)\,c(n) = (1-t)\,n\,l(n) + g. \tag{2}$$
One tax instrument is redundant, since the consumption tax is equivalent to the income tax. Thus, without loss of generality, one tax instrument can always be normalized to zero. The individual maximizes utility (1) subject to its budget constraint (2). This yields the standard first-order condition for labor supply:
$$-\frac{u_l}{u_c} = \frac{(1-t)\,n}{1+\tau}. \tag{3}$$
Taxation is distortionary as it drives a wedge between the marginal social benefits ($n$) and the marginal private benefits ($(1-t)n/(1+\tau)$) of an increase in labor supply. Indirect utility of individual $n$ can be written as $v(n) \equiv v(t, \tau, g, G, n)$. Straightforward application of Roy's identity yields the following properties of $v(\cdot)$: $\frac{\partial v(n)}{\partial t} = -\lambda(n)z(n)$, $\frac{\partial v(n)}{\partial \tau} = -\lambda(n)c(n)$, $\frac{\partial v(n)}{\partial g} = \lambda(n)$, and $\frac{\partial v(n)}{\partial G} = \lambda(n)\frac{u_G}{u_c}$, where $\lambda(n)$ is the private marginal utility of income of individual $n$. Given that preferences depend on the skill level $n$, the private marginal utility of income is not necessarily non-increasing in $n$. To ensure that a well-defined redistribution problem is obtained, it is assumed that $\frac{\partial \lambda(n)}{\partial n} \leq 0$. This inequality always holds if preferences are identical for all individuals due to the assumptions on the derivatives of the utility function.
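To illustrate the first-order condition for labor supply, the following minimal Python sketch assumes a quasi-linear, iso-elastic utility function $u = c - l^{1+1/e}/(1+1/e)$. This functional form, the elasticity parameter `e`, and all function names are illustrative assumptions; the model above allows for general preferences. Under this assumption the first-order condition reduces to $l^{1/e} = (1-t)n/(1+\tau)$.

```python
def net_wage(n, t, tau):
    """Marginal private benefit of labor supply, (1 - t) n / (1 + tau)."""
    return (1.0 - t) * n / (1.0 + tau)


def labor_supply(n, t, tau, e=0.5):
    """Labor supply solving l**(1/e) = net wage (assumed utility only)."""
    return net_wage(n, t, tau) ** e


def marginal_rate_of_substitution(l, e=0.5):
    """-u_l / u_c for the assumed utility u = c - l**(1 + 1/e) / (1 + 1/e)."""
    return l ** (1.0 / e)


def consumption(n, t, tau, g, e=0.5):
    """Consumption from the budget constraint (1 + tau) c = (1 - t) n l + g."""
    z = n * labor_supply(n, t, tau, e)
    return ((1.0 - t) * z + g) / (1.0 + tau)
```

Under these assumptions, an income tax $t$ with $\tau = 0$ and a consumption tax with $1+\tau = 1/(1-t)$ and $t = 0$ imply the same net wage and hence the same labor supply. Consumption also coincides once the transfer is rescaled to $g/(1-t)$, which is one way to see why one instrument can be normalized to zero.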
Some additional notation is introduced to express the optimal policy rules in terms of well-known elasticity concepts. In particular, the compensated, uncompensated and income elasticities of labor supply and consumption demand with respect to the tax rates and the public good are denoted by: [...] $\frac{\partial c(n)}{\partial G}\frac{G}{c(n)}$, and $\varepsilon_{cg} \equiv (1+\tau)\frac{\partial c(n)}{\partial g} > 0$, where the superscript $u$ ($c$) denotes an uncompensated (compensated) change. In the remainder of the paper, a bar is used to indicate an income-weighted elasticity, e.g., $\bar{\varepsilon}^{c}_{lt} \equiv \int_{\mathcal{N}} \varepsilon^{c}_{lt}\, z(n)\,dF(n) \left[\int_{\mathcal{N}} z(n)\,dF(n)\right]^{-1}$.

Government
The social objective is a utilitarian social welfare function, i.e., the sum of individual utilities $u(n)$ over the population. Maximizing a utilitarian social welfare function implies a social preference for income redistribution, since the private marginal utility of income $\lambda(n)$ declines with skill $n$ at the individual level.¹⁰ The government budget constraint states that total tax revenues equal spending on transfers and public goods, where $p$ denotes the constant marginal rate of transformation (i.e., the price) of public goods in terms of private goods.
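As a sketch of the two expressions described in this paragraph, and assuming that both are scaled by the population mass $N$ (which is consistent with the later definition of $\gamma_t$), the social welfare function and the government budget constraint could be written as follows. The symbol $W$ is introduced here for illustration only.

```latex
\begin{align}
  W &\equiv N \int_{\mathcal{N}} u(n)\, dF(n), \\
  N \int_{\mathcal{N}} \bigl( t\, n\, l(n) + \tau\, c(n) \bigr)\, dF(n) &= N g + p\, G .
\end{align}
```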

Optimal taxation and public goods provision
The government maximizes social welfare subject to its budget constraint by choosing the non-individualized lump-sum transfer $g$, the tax rate on income $t$ or consumption $\tau$, and the level of public goods provision $G$. Optimal policies are derived under both tax normalizations (i.e., either consumption or income remains untaxed). The social marginal value of one unit of public resources is denoted by $\eta$. The first-order conditions of the Lagrangian for maximizing social welfare with respect to $g$, $t$, $\tau$, and $G$ are Eqs. (7)–(10), where the derivatives of indirect utility have been used in each first-order condition.
Footnote 10: The utilitarian case allows for the clearest representation of the main ideas in this paper. All results can be generalized to allow for a general, concave social welfare function $\Psi(u(n))$, with $\Psi'(\cdot) > 0$, $\Psi''(\cdot) < 0$, where the government may exhibit a stronger preference for redistribution than individuals do. See also the working paper version of this article (Jacobs 2010). As a corollary, all results also generalize to the case with exogenously given Pareto weights $\delta(n)$, i.e., where $\Psi(u(n)) \equiv \delta(n)u(n)$, $\Psi' = \delta(n)$, and $\Psi'' = 0$.
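The Lagrangian and first-order conditions can be sketched as follows. This is a reconstruction from the verbal description, the properties of indirect utility derived from Roy's identity, and a budget constraint scaled by $N$; the ordering of the conditions ($g$, $t$, $\tau$, $G$) is inferred from how Eqs. (7), (8), and (10) are used in the proofs and should be read as an assumption.

```latex
\begin{align}
\mathcal{L} &\equiv N\!\int_{\mathcal{N}} v(n)\,dF(n)
  + \eta\Bigl[ N\!\int_{\mathcal{N}} \bigl(t\,n\,l(n) + \tau\,c(n)\bigr)\,dF(n) - N g - p\,G \Bigr], \\
\frac{\partial \mathcal{L}}{\partial g} &= N\!\int_{\mathcal{N}} \Bigl[\lambda(n) - \eta
  + \eta\,t\,n\,\frac{\partial l(n)}{\partial g} + \eta\,\tau\,\frac{\partial c(n)}{\partial g}\Bigr] dF(n) = 0, \\
\frac{\partial \mathcal{L}}{\partial t} &= N\!\int_{\mathcal{N}} \Bigl[-\lambda(n)\,n\,l(n) + \eta\,n\,l(n)
  + \eta\,t\,n\,\frac{\partial l(n)}{\partial t} + \eta\,\tau\,\frac{\partial c(n)}{\partial t}\Bigr] dF(n) = 0, \\
\frac{\partial \mathcal{L}}{\partial \tau} &= N\!\int_{\mathcal{N}} \Bigl[-\lambda(n)\,c(n) + \eta\,c(n)
  + \eta\,t\,n\,\frac{\partial l(n)}{\partial \tau} + \eta\,\tau\,\frac{\partial c(n)}{\partial \tau}\Bigr] dF(n) = 0, \\
\frac{\partial \mathcal{L}}{\partial G} &= N\!\int_{\mathcal{N}} \Bigl[\lambda(n)\,\frac{u_G}{u_c}
  + \eta\,t\,n\,\frac{\partial l(n)}{\partial G} + \eta\,\tau\,\frac{\partial c(n)}{\partial G}\Bigr] dF(n) - \eta\,p = 0 .
\end{align}
```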

Marginal cost of public funds-the Diamond approach
This section derives optimal policies by employing a new definition for the marginal cost of public funds. The findings of this paper are contrasted with the more traditional definition in Sect. 3.2. By defining the marginal cost of public funds based on the social marginal value of income of Diamond (1975), it will be demonstrated that a number of issues with the traditional definition disappear. Moreover, new theoretical insights are derived that are policy relevant. The social marginal value of income of Diamond (1975) is given in the following definition.
Definition 1 The Diamond definition for the social marginal value of one unit of private income accruing to individual $n$ equals
$$\alpha(n) \equiv \lambda(n) + \eta\, t\, n\, \frac{\partial l(n)}{\partial g} + \eta\, \tau\, \frac{\partial c(n)}{\partial g}.$$
As in the traditional definition, the social marginal value of private income captures the rise in social welfare if individual $n$ has one unit of additional resources as measured by $\lambda(n)$. The social marginal value of private funds is larger if the direct utility gains of a unit of private funds $\lambda(n)$ are larger. However, the Diamond (1975) definition also includes the social value of the income effects on taxed bases. Intuitively, if the private sector has one unit of additional funds, this not only changes social welfare by providing utility to individuals, but it also changes social welfare if that unit of funds changes public revenue via income effects on taxed bases. In particular, if an individual receives an additional unit of funds, labor supply is reduced ($\frac{\partial l(n)}{\partial g} < 0$) and consumption demand is increased ($\frac{\partial c(n)}{\partial g} > 0$), since both leisure and consumption are assumed to be normal goods. Hence, the government loses $-tn\frac{\partial l(n)}{\partial g}$ in tax revenues from the income tax (if $t > 0$) or gains $\tau\frac{\partial c(n)}{\partial g}$ in revenues from the consumption tax (if $\tau > 0$) if the individual receives an additional unit of funds. The social welfare effects of these revenue changes are obtained by multiplication of the revenue changes with $\eta$, the social marginal value of public resources. Thus, $\alpha(n)$ measures the total increase in social welfare if individual $n$ has one unit of additional funds. Adopting $\alpha(n)$ as the social marginal value of private income of individual $n$ makes it possible to define the marginal cost of public funds.

Definition 2
The marginal cost of public funds based on the Diamond measure of the social marginal value of private income is given by
$$MCF \equiv \frac{\eta}{\int_{\mathcal{N}} \alpha(n)\,dF(n)}. \tag{12}$$
Analogously to the standard measure for the marginal cost of public funds, the Diamond-based measure for the marginal cost of public funds $MCF$ thus measures the social marginal value of one unit of funds in the public sector $\eta$ relative to the average social marginal value of one unit of funds in the private sector, i.e., $\int_{\mathcal{N}} \alpha(n)\,dF(n)$. Using the Diamond (1975)-based social marginal value of income, we can define the Feldstein (1972) distributional characteristics of the tax bases and public goods.
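Definitions 1 and 2 can be illustrated with a minimal Python sketch for a discrete set of types. The three-type economy, all numerical values, and the function names are hypothetical and chosen only to show how the income effects on taxed bases enter $\alpha(n)$ and how the MCF is then formed as $\eta$ over the population-weighted average of $\alpha(n)$.

```python
def social_marginal_value(lam, eta, t, tau, n, dl_dg, dc_dg):
    """alpha(n) = lambda(n) + eta * t * n * dl/dg + eta * tau * dc/dg."""
    return lam + eta * t * n * dl_dg + eta * tau * dc_dg


def marginal_cost_of_public_funds(eta, alphas, weights):
    """MCF = eta divided by the population-weighted average of alpha(n)."""
    average_alpha = sum(a * w for a, w in zip(alphas, weights))
    return eta / average_alpha


# Hypothetical three-type economy (all numbers are made up for illustration).
weights = [0.5, 0.3, 0.2]               # population shares, summing to one
skills = [1.0, 2.0, 4.0]                # earning abilities n
lambdas = [1.4, 1.0, 0.6]               # private marginal utilities of income
income_effects = [-0.05, -0.05, -0.05]  # dl/dg < 0 because leisure is normal
t, tau, eta = 0.3, 0.0, 1.1             # income tax only (tau normalized to 0)

alphas = [
    social_marginal_value(lam, eta, t, tau, n, dl_dg, 0.0)
    for lam, n, dl_dg in zip(lambdas, skills, income_effects)
]
mcf = marginal_cost_of_public_funds(eta, alphas, weights)
```

With a positive income tax and normal leisure, each $\alpha(n)$ in this sketch lies below the corresponding $\lambda(n)$, because an extra unit of private income reduces labor supply and therefore tax revenue.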

Definition 3
The distributional characteristics $\xi_y$ of tax bases $y(n) = \{z(n), c(n)\}$ based on the Diamond measure of the social marginal value of private income are given by Eq. (13). $\xi_y$ is a normalized covariance between a tax base and the welfare weights. It represents the gain in social welfare (expressed in monetary equivalents and then divided by the taxed base) of redistributing a marginal unit of resources through base $y(n) = \{z(n), c(n)\}$. The distributional characteristics of the tax bases are positive, since the covariance between tax base $y(n)$ and the social welfare weights $\alpha(n)$ is negative. Individuals with higher incomes or consumption levels feature lower welfare weights because the social marginal utility of income is diminishing in income or consumption due to diminishing private marginal utility of income. The positive distributional characteristic $\xi_y$ therefore implies that redistribution through taxing income or consumption yields distributional benefits. A stronger social desire for redistribution or greater inequality in the skill distribution raises the distributional characteristic.
Since $\xi_y$ is a positive normalized covariance, it ranges between zero and one. $\xi_y = 0$ is obtained either if the government is not interested in redistribution because it attaches the same social welfare weights $\alpha(n)$ to all individuals or if the base $y(n)$ is the same for all $n$ so that there is no inequality.
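The properties of the distributional characteristic described above can be checked with a small Python sketch. It assumes the usual Feldstein normalization, $\xi_y = -\mathrm{cov}[\alpha, y] / (\bar{\alpha}\,\bar{y})$, where bars denote population means; this normalization, the function name, and the sample numbers are assumptions made for illustration.

```python
def distributional_characteristic(alphas, base, weights):
    """Negative covariance of welfare weights and tax base, normalized by
    the product of their population means (assumed normalization)."""
    mean_alpha = sum(a * w for a, w in zip(alphas, weights))
    mean_base = sum(y * w for y, w in zip(base, weights))
    mean_product = sum(a * y * w for a, y, w in zip(alphas, base, weights))
    covariance = mean_product - mean_alpha * mean_base
    return -covariance / (mean_alpha * mean_base)


# Hypothetical data: welfare weights fall with income.
weights = [0.5, 0.3, 0.2]
alphas = [1.4, 1.0, 0.6]
incomes = [1.0, 2.0, 4.0]

xi_z = distributional_characteristic(alphas, incomes, weights)
xi_no_redistribution = distributional_characteristic([1.0, 1.0, 1.0], incomes, weights)
xi_no_inequality = distributional_characteristic(alphas, [2.0, 2.0, 2.0], weights)
```

In this sketch `xi_z` is strictly between zero and one, while equal welfare weights or an equal tax base both give a characteristic of zero, matching the two limiting cases in the text.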

Definition 4
The distributional characteristic of the public good based on the Diamond measure of the social marginal value of private income is given by Eq. (14). The distributional characteristic for the public good $\xi_G$ is the negative normalized covariance between the social marginal valuation of income $\alpha(n)$ and the marginal willingness to pay for the public good $\frac{u_G}{u_c}$. $\xi_G > 0$ if the public good mainly benefits the rich, who feature the lowest social welfare weights $\alpha(n)$, and vice versa. $\xi_G = 0$ if the government is not interested in redistribution and attaches the same welfare weights $\alpha(n)$ to all individuals or if all individuals benefit equally from the public good, i.e., $\frac{u_G}{u_c}$ is equal for all $n$.
The next Lemma derives the marginal excess burden of the income or consumption tax. The excess burden measures the reduction in social welfare, measured in monetary units, expressed as a fraction of the tax base, of raising the distortionary income or consumption tax.
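As a sketch of the displayed definitions in Definitions 3 and 4, and assuming the Feldstein normalization by the product of the means (with $F$ integrating to one), the distributional characteristics could be written as follows. The exact normalization is an assumption that is consistent with the verbal description of a normalized covariance.

```latex
\begin{align}
\xi_y &\equiv -\,\frac{\int_{\mathcal{N}} \alpha(n)\,y(n)\,dF(n)
      - \int_{\mathcal{N}} \alpha(n)\,dF(n)\int_{\mathcal{N}} y(n)\,dF(n)}
      {\int_{\mathcal{N}} \alpha(n)\,dF(n)\int_{\mathcal{N}} y(n)\,dF(n)},
      \qquad y(n) = \{z(n), c(n)\}, \\
\xi_G &\equiv -\,\frac{\int_{\mathcal{N}} \alpha(n)\,\frac{u_G}{u_c}\,dF(n)
      - \int_{\mathcal{N}} \alpha(n)\,dF(n)\int_{\mathcal{N}} \frac{u_G}{u_c}\,dF(n)}
      {\int_{\mathcal{N}} \alpha(n)\,dF(n)\int_{\mathcal{N}} \frac{u_G}{u_c}\,dF(n)} .
\end{align}
```

Under this form, $\int \alpha\, y\, dF = (1 - \xi_y)\int \alpha\, dF \int y\, dF$, which is the step that turns a weighted sum of tax bases into one minus the distributional characteristic.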

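Lemma 1 The marginal excess burdens of the income tax and the consumption tax are given by
$$MEB_t \equiv -\frac{t}{1-t}\,\bar{\varepsilon}^{c}_{lt}, \qquad MEB_\tau \equiv -\frac{\tau}{1+\tau}\,\bar{\varepsilon}^{c}_{c\tau}. \tag{15}$$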
Proof The welfare effect of a rise in the tax rate is evaluated, while public goods provision remains constant. A rise in the tax rate is considered, while each individual $n$ receives an individual-specific lump-sum transfer $s(n)$ so as to keep its utility constant. The excess burden is equal to the resulting loss in tax revenue, which is summed over all individuals. To determine the excess burden of the income tax, assume that $\tau = 0$. The change in taxes $dt$ and lump-sum income $ds(n)$ for each individual $n$ should keep utility constant: $dv(n) = \lambda(n)ds(n) - \lambda(n)nl(n)dt = 0$. Hence, $ds(n) = nl(n)dt$ for all $n$. Public revenue changes according to $dR_n \equiv -ds(n) + nl(n)dt + tn\frac{\partial l^{c}(n)}{\partial t}dt = tn\frac{\partial l^{c}(n)}{\partial t}dt$ for all $n$. Since utility remains constant, changes in labor supply are compensated changes. Rewriting with the compensated elasticity $\varepsilon^{c}_{lt} \equiv \frac{\partial l^{c}(n)}{\partial t}\frac{1-t}{l(n)}$ yields a revenue loss of $dR = \frac{t}{1-t}\, N\int_{\mathcal{N}} \varepsilon^{c}_{lt}\, nl(n)\,dF(n)\,dt$ over all individuals, and dividing by taxable income $N\int_{\mathcal{N}} nl(n)\,dF(n)$ yields the marginal excess burden as a fraction of taxed income:
$$MEB_t \equiv -\frac{dR/dt}{N\int_{\mathcal{N}} nl(n)\,dF(n)} = -\frac{t}{1-t}\,\bar{\varepsilon}^{c}_{lt}.$$
Recall that the bar indicates an income-weighted elasticity. Using similar steps and setting $t = 0$ gives the marginal excess burden of the consumption tax as a fraction of taxed consumption:
$$MEB_\tau = -\frac{\tau}{1+\tau}\,\bar{\varepsilon}^{c}_{c\tau}.$$
Using Definitions 2, 3, 4, and Lemma 1, Proposition 1 characterizes optimal tax policies and public goods provision under the Diamond measure for the marginal cost of public funds, while assuming that the consumption tax is normalized to zero. Each tax instrument has its own marginal cost of public funds. Therefore, $MCF_g$ and $MCF_t$ are introduced to denote the marginal cost of public funds of the lump-sum tax and the tax rate, respectively, see also Sandmo (1998).
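The excess-burden calculation in the proof of Lemma 1 can be checked numerically with a short Python sketch. It assumes a discrete set of types and the compensated elasticity $\varepsilon^{c}_{lt} \equiv \frac{\partial l^{c}}{\partial t}\frac{1-t}{l}$, so that $tn\,\partial l^{c}/\partial t = t\,\varepsilon^{c}_{lt}\,z/(1-t)$; the function names and numbers are hypothetical.

```python
def income_weighted_elasticity(elasticities, incomes, weights):
    """Average of type-specific elasticities, weighted by income z(n)."""
    numerator = sum(e * z * w for e, z, w in zip(elasticities, incomes, weights))
    denominator = sum(z * w for z, w in zip(incomes, weights))
    return numerator / denominator


def meb_income_tax(t, elasticities, incomes, weights):
    """MEB_t = -t / (1 - t) times the income-weighted compensated elasticity."""
    return -t / (1.0 - t) * income_weighted_elasticity(elasticities, incomes, weights)


def compensated_revenue_change(t, elasticities, incomes, weights):
    """Per-capita dR/dt: sum over types of t * n * dl^c/dt,
    using n * dl^c/dt = eps * z / (1 - t)."""
    return sum(
        t * e * z / (1.0 - t) * w
        for e, z, w in zip(elasticities, incomes, weights)
    )


# Hypothetical data: compensated elasticities are negative.
weights = [0.5, 0.3, 0.2]
incomes = [1.0, 2.0, 4.0]
elasticities = [-0.2, -0.3, -0.5]
t = 0.4

meb = meb_income_tax(t, elasticities, incomes, weights)
average_income = sum(z * w for z, w in zip(incomes, weights))
meb_from_revenue = -compensated_revenue_change(t, elasticities, incomes, weights) / average_income
```

Both routes give the same number in this sketch, which is the content of the final step of the proof: the revenue loss per unit of taxed income equals $-\frac{t}{1-t}$ times the income-weighted compensated elasticity.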
Proposition 1 Under the Diamond-based MCF definition, and with the consumption tax normalized to zero, the optimal rules for public goods provision and the linear income tax are given by Eqs. (18)–(21).
Proof Equation (7) is simplified by setting $\tau = 0$ and substituting Eq. (12) to find Eq. (19). Equation (8) is simplified by using Eq. (13), the Slutsky equation $\frac{\partial l(n)}{\partial t} = \frac{\partial l^{c}(n)}{\partial t} - nl(n)\frac{\partial l(n)}{\partial g}$, and setting $\tau = 0$ to find the first part of Eq. (20). The second part follows from substituting Eq. (15). Equation (10) is simplified by using Eq. (14), the Slutsky equation $\frac{\partial l(n)}{\partial G} = \frac{\partial l^{c}(n)}{\partial G} + \frac{u_G}{u_c}\frac{\partial l(n)}{\partial g}$, setting $\tau = 0$, and using $\gamma_t \equiv N t \int_{\mathcal{N}} nl(n)\,dF(n)/(pG)$ to find Eq. (18).
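The rules referred to in Proposition 1 can be sketched as follows. This is a reconstruction obtained by applying the steps named in the proof to the government's first-order conditions, with $\tau = 0$ and $\varepsilon^{c}_{lG} \equiv \frac{\partial l^{c}(n)}{\partial G}\frac{G}{l(n)}$ assumed as the compensated elasticity of labor supply with respect to the public good. The equation numbers follow the references in the surrounding text; the exact layout of the original display is an assumption.

```latex
\begin{align}
(1 - \xi_G)\, N\!\int_{\mathcal{N}} \frac{u_G}{u_c}\, dF(n)
  &= \bigl(1 - \gamma_t\, \bar{\varepsilon}^{c}_{lG}\bigr)\, p, \tag{18} \\
MCF_g &\equiv \frac{\eta}{\int_{\mathcal{N}} \alpha(n)\, dF(n)} = 1, \tag{19} \\
MCF_t &\equiv \frac{\eta}{\int_{\mathcal{N}} \alpha(n)\, dF(n)}
  = \frac{1 - \xi_z}{1 + \frac{t}{1-t}\,\bar{\varepsilon}^{c}_{lt}}
  = \frac{1 - \xi_z}{1 - MEB_t}, \tag{20} \\
MCF_g &= MCF_t = 1. \tag{21}
\end{align}
```

For example, the step to Eq. (20) uses $\int \alpha\, z\, dF = (1 - \xi_z)\int \alpha\, dF \int z\, dF$ together with the Slutsky equation, so that the first-order condition for $t$ becomes $(1 - \xi_z)\int \alpha\, dF = \eta\bigl(1 + \tfrac{t}{1-t}\bar{\varepsilon}^{c}_{lt}\bigr)$.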
Equation (18) is the modified Samuelson rule for the optimal provision of public goods. The modified Samuelson rule equates the sum of the marginal social benefits to the marginal social costs of providing the public good. The benefits, the sum of the marginal rates of substitution $N\int_{\mathcal{N}} \frac{u_G}{u_c}\,dF(n)$, are deflated by the distributional characteristic of the public good $\xi_G$. If poor individuals value the public good more (less) than rich individuals do, then $\xi_G < 0$ ($\xi_G > 0$), and the level of public goods provision increases (decreases), ceteris paribus. The right-hand side gives the marginal cost of providing the public good. The main result of this paper is that the cost side of the Samuelson rule does not include a measure for the marginal cost of public funds $MCF$. Indeed, there should be no correction for the marginal cost of public funds on the cost side of the modified Samuelson rule, since $MCF = 1$ at the optimal tax system. Consequently, tax distortions do not affect the decision rule for the optimal supply of public goods.¹⁴
Providing public goods may reduce (exacerbate) preexisting labor tax distortions if public goods boost (reduce) compensated labor supply, i.e., if $\bar{\varepsilon}^{c}_{lG} > 0$ ($< 0$). Hence, by overproviding (underproviding) public goods compared to the first-best rule, the government alleviates the distortions of the labor tax in the labor market, but this comes at the cost of inefficiencies in public goods provision, see also Atkinson and Stern (1974). $\gamma_t \equiv N t \int_{\mathcal{N}} z(n)\,dF(n)/(pG)$ denotes the share of public goods that is financed with distortionary taxes. $\gamma_t$ captures the importance of reducing labor market distortions compared to introducing inefficiencies in public goods provision.
Equation (19) demonstrates that, at the optimal tax system, the marginal cost of public funds for the lump-sum tax ($MCF_g$) is always equal to one. The reason is that the lump-sum tax neither causes deadweight losses nor yields distributional gains (losses). Indeed, there is a zero covariance between the lump-sum tax $g$ and the social welfare weights $\alpha(n)$, so that the distributional characteristic $\xi$ for the lump-sum tax is zero. The marginal cost of public funds being equal to one is, therefore, merely a statement that the tax system is optimal: one unit of resources should be equally valuable in the private as in the public sector. Thus, the government should be indifferent to transferring a marginal unit of funds from the public to the private sector.
Equation (21) shows that the marginal cost of public funds for all tax instruments should be equalized at the optimal tax system. Hence, the marginal cost of public funds for the income tax should be equal to the marginal cost of public funds for the lump-sum tax. Therefore, from Eq. (20) it follows that the marginal deadweight losses of income taxes should be exactly equal to the marginal distributional gains of income taxes:
$$MEB_t \equiv -\frac{t}{1-t}\,\bar{\varepsilon}^{c}_{lt} = \xi_z.$$
The marginal excess burden of a distorting tax rate (expressed in monetary units, as a fraction of taxed income) exactly equals the marginal benefits of redistribution (expressed in monetary units, as a fraction of taxed income).¹⁵ The more society cares about distribution, the larger is $\xi_z$, and the higher is the optimal income tax. The more elastically labor supply responds to taxes, the larger is $-\bar{\varepsilon}^{c}_{lt}$, and the lower is the optimal income tax. This is the standard trade-off between equity and efficiency.
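The equity-efficiency trade-off can be made concrete with a small Python sketch that solves $-\frac{t}{1-t}\bar{\varepsilon}^{c}_{lt} = \xi_z$ for $t$. It treats $\xi_z$ and the elasticity as fixed numbers, which is a simplifying assumption: in the model both are endogenous and depend on the tax system. The function names are hypothetical.

```python
def meb_from_elasticity(t, elasticity):
    """MEB_t = -t / (1 - t) * elasticity, where the elasticity is the
    income-weighted compensated elasticity (a negative number)."""
    return -t / (1.0 - t) * elasticity


def optimal_linear_tax(xi_z, elasticity):
    """Tax rate at which MEB_t equals xi_z, holding both inputs fixed.
    Solving t / (1 - t) * e = xi_z with e = -elasticity gives xi_z / (xi_z + e)."""
    e = -elasticity
    return xi_z / (xi_z + e)


t_star = optimal_linear_tax(xi_z=0.2, elasticity=-0.3)
```

Holding the other input fixed, a larger $\xi_z$ raises the implied tax rate in this sketch, and a more elastic labor supply lowers it.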
From Eq. (21) it follows that the government is indifferent between using non-distortionary and distortionary marginal sources of finance at the tax optimum. There should be no correction of the modified Samuelson rule in Eq. (18) if the public good is financed at the margin with the lump-sum tax, since there is no deadweight loss involved (and no distributional gains either). However, neither should it contain a correction if the marginal source of finance for the public good is the distortionary tax. This is an application of the envelope theorem: The deadweight loss of a marginally higher tax exactly cancels against the distributional gain of the tax if the tax system is optimal.
Footnote 14 (continued): [...] and second-best. Therefore, one cannot conclude that tax distortions do not affect the second-best level of public goods provision.
Footnote 15: Gahvari (2014) suggested that MCF = 1 is a definition rather than a result in this setting. This is not correct, since MCF = 1 is an optimality condition, not a definition. Outside the tax optimum, the marginal cost of public funds for the lump-sum tax does not equal one (i.e., $MCF_g \neq 1$) if the average social marginal value of private income ($\int_{\mathcal{N}} \alpha(n)\,dF(n)$) is unequal to the social marginal value of public income ($\eta$). Similarly, outside the optimum, the marginal cost of public funds for the distortionary income tax is not equal to one if the marginal distributional benefits ($\xi_z$) are not equal to marginal deadweight losses $-\frac{t}{1-t}\bar{\varepsilon}^{c}_{lt}$. See also the section on sub-optimal taxation.
The marginal cost of funds for the income tax is directly related to the marginal excess burden of the tax if redistributional concerns are absent ($\xi_z = 0$): without distributional concerns, the MCF is exactly equal to the inverse of $1 - MEB$, i.e., $MCF_t = 1/(1 - MEB_t)$. This confirms earlier literature suggesting an explicit link between the marginal cost of public funds and the excess burden of taxation, see also Pigou (1947), Harberger (1964), and Browning (1976). Indeed, at low levels of taxation, the marginal cost of public funds can be approximated by: $MCF \approx 1 + MEB$. In Mirrlees (1971) analyses, however, the government only introduces distortionary taxes if doing so contributes to equality (i.e., if taxing labor income yields distributional benefits). With distributional concerns, $\xi_z > 0$, and $MCF_t$ is lowered, as Eq. (20) reveals.
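The relation between the MCF, the excess burden, and the distributional characteristic can be illustrated with a short Python sketch. It assumes that the MCF of the income tax takes the form $(1 - \xi_z)/(1 - MEB_t)$, which reduces to the inverse of $1 - MEB_t$ when $\xi_z = 0$ as stated above; the general form with $\xi_z > 0$ is a reconstruction, and the function name and numbers are hypothetical.

```python
def mcf_income_tax(xi_z, meb):
    """Assumed form MCF_t = (1 - xi_z) / (1 - MEB_t)."""
    return (1.0 - xi_z) / (1.0 - meb)


# Without distributional concerns: MCF = 1 / (1 - MEB), close to 1 + MEB
# when the excess burden is small.
mcf_no_equity = mcf_income_tax(0.0, 0.05)
approximation = 1.0 + 0.05

# At the optimum the excess burden equals the distributional gain.
mcf_at_optimum = mcf_income_tax(0.2, 0.2)

# Outside the optimum the MCF can fall on either side of one.
mcf_tax_too_low = mcf_income_tax(0.3, 0.2)   # distributional gain > MEB
mcf_tax_too_high = mcf_income_tax(0.1, 0.2)  # distributional gain < MEB
```

The last two cases mirror the statement that, outside the optimum, the marginal cost of public funds for distortionary taxes can be either smaller or larger than one.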
If the government were not interested in income redistribution (i.e., $\xi_z = \xi_G = 0$), distortionary income taxes would be optimally zero ($t = 0$); see Proposition 1. Thus, in the absence of a preference for redistribution, all public goods would be financed with non-distortionary non-individualized taxes. Tax distortions are introduced only for redistributional reasons, not for public goods provision. Therefore, the marginal excess burden of the income tax is the price of equality and not the price of public goods provision.
The next Proposition demonstrates that all results remain valid using a different tax normalization, where consumption rather than income is taxed.
Proposition 2 Under the Diamond-based MCF definition, and with the income tax normalized to zero, the optimal rules for public goods provision and the linear consumption tax are given by

Proof Equation (7) is simplified by setting $t = 0$ and substituting Eq. (12) to find Eq. (23). Equation (8) is simplified by using Eq. (13), the Slutsky equation, and setting $t = 0$ to find the first part of Eq. (24). The second part follows from substituting Eq. (15). Equation (10) is simplified by using Eq. (14), the Slutsky equation, setting $t = 0$, and using $\gamma_\tau \equiv N\tau \int_{\mathcal{N}} c(n)\,dF(n)/(pG)$ to find Eq. (22).
Proposition 2 demonstrates that the marginal cost of public funds measures are independent of the particular normalization of the tax system. At the optimal tax system, the marginal cost of public funds remains equal to one for all tax instruments, as shown in Eqs. (23) and (25). Equation (24) shows that the excess burden of the consumption tax equals its distributional gain: $\mathrm{MEB}_\tau = -\frac{\tau}{1+\tau}\bar{\varepsilon}^c_{c\tau} = \xi_c$. Moreover, the characterization of the optimal policy rule for public good provision in Eq. (22) is independent of the particular tax normalization.
Although the characterization of the optimal policy rules does not depend on the normalization of the tax system, the marginal excess burdens of consumption and income taxes, as defined in Lemma 1, are not quantitatively identical in the absence of distributional concerns ($\xi_z = \xi_c = 0$); see Håkonsen (1998). The explanation for the difference in the MCF measures is that, at identical allocations, the marginal excess burdens are expressed as fractions of different tax bases (income or consumption). However, both tax instruments have equal marginal excess burdens in absolute terms. This issue is moot once distributional concerns are included in the analysis. The reason is that both the marginal excess burden and marginal distributional gains are expressed as fractions of the same tax base. Consequently, the normalization of the excess burdens or distributional benefits of a tax with a particular tax base becomes immaterial if taxes are optimized.
The next Proposition derives a special case for a separable and quasi-linear utility function in which the first-best Samuelson rule for public goods provision is obtained in second-best settings with distortionary taxation. 16

Proposition 3 If utility is given by u
≤ 0, then the optimal provision of public goods follows the first-best Samuelson rule in second-best settings with optimal distortionary taxation:

Proof The first-order condition for labor supply is given by $v'(l(n)) = \frac{(1-t)n}{1+\tau}$, $\forall n$. Therefore, $\bar{\varepsilon}^c_{lG} = 0$. Furthermore, $u_G/u_c$ depends only on $G$ and is independent of skill $n$; hence, $\xi_G = 0$. Substitution of $\bar{\varepsilon}^c_{lG} = 0$ and $\xi_G = 0$ in Eq. (18) yields the result.
This simple, special case demonstrates why it is misleading to ignore distributional concerns in the analysis of the marginal cost of public funds. When distributional concerns are ignored, the cost side of the Samuelson rule would include a measure for the deadweight loss of taxation, whereas the distributional benefits of the distortionary tax would be ignored. A first-best Samuelson rule is found in a second-best setting for two reasons. First, the public good does not increase or decrease compensated labor supply. Therefore, the public good cannot be used to alleviate the tax distortions on labor supply. Second, the public good does not affect the welfare distribution, since every individual benefits to an equal extent from the public good. Indeed, the public good is a perfect substitute for the lump-sum cash transfer g. Since the provision of the public good neither affects efficiency nor equity, provision of the public good should follow the first-best policy rule.

Marginal cost of public funds: the standard approach
This section compares the main findings of this paper to the papers adopting the standard definition for the marginal cost of public funds, see, e.g., Wilson (1991), Ballard and Fullerton (1992), Sandmo (1998), and Gahvari (2006). This section shows that standard MCF measures have three properties: (i) the MCF for lump-sum taxes is not equal to one; (ii) the MCF of distortionary tax instruments cannot be related to the excess burden if distributional concerns are absent; (iii) the MCF is highly sensitive to the normalization of the tax system. It is argued that these properties are caused by not including income effects on taxed bases in the average social marginal value of private income.
The traditional measure for the marginal cost of public funds is given in the following definition.

Definition 5
The marginal cost of public funds based on the standard measure of the private marginal value of private income is given by $\mathrm{MCF}^s \equiv \eta \big/ \int_{\mathcal{N}} \lambda(n)\,dF(n)$. The traditional marginal cost of public funds $\mathrm{MCF}^s$ is the ratio of the social marginal value of public income $\eta$ and the average of the private marginal value of private income $\lambda(n)$; see, e.g., Wilson (1991), Ballard and Fullerton (1992), Sandmo (1998), and Gahvari (2006). This MCF measure is not economically appealing, because it compares the social marginal value of funds in the public sector $\eta$ (in social 'utils') with the average private marginal value of funds in the private sector $\int_{\mathcal{N}} \lambda(n)\,dF(n)$ (in private 'utils'). While $\int_{\mathcal{N}} \lambda(n)\,dF(n)$ indeed measures the average increase in private utility, it does not measure the increase in social welfare if all individuals in the private sector receive an additional euro, because the (welfare-relevant) income effects on the taxed bases are not included. 17 To characterize the optimal tax expressions, the definitions for the Feldstein (1972) distributional characteristics of the tax bases and public goods are adjusted by using $\lambda(n)$ instead of $\alpha(n)$ as the social welfare weights.

Definition 6
The distributional characteristics $\xi^s_y$ of tax bases $y(n) = \{z(n), c(n)\}$ based on the standard measure of the social marginal value of private income are given by

17 Of course, this does not mean that one cannot define the marginal cost of public funds as in the traditional definition, since a definition cannot be wrong in and of itself. However, it is logically impossible that two mathematically distinct definitions within the same model have the same economic meaning. If the marginal cost of public funds is supposed to measure the social marginal value of additional public resources relative to the social marginal value of additional private resources, then the traditional measure does not correctly measure the marginal cost of public funds.

Definition 7
The distributional characteristic of the public good based on the standard measure of the social marginal value of private income is

The next proposition replicates Sandmo (1998) and derives optimal tax policies and public goods provision under the standard measure for the marginal cost of public funds, while the consumption tax is normalized to zero.

Proposition 4 With the standard MCF definition, and with the consumption tax normalized to zero, the optimal rules for public goods provision and the linear income tax are given by
Proof Equation (7) is simplified by setting $\tau = 0$ and substituting Eq. (27) to find Eq. (31). Equation (8) is simplified by using Eq. (28) and setting $\tau = 0$ to find the first part of Eq. (32). The second part of Eq. (32) follows upon substitution of the Slutsky equation $\bar{\varepsilon}^u_{lt} = \bar{\varepsilon}^c_{lt} - \bar{\varepsilon}_{lg}$ and using Eq. (15). Equation (10) is simplified by using Eq. (29), setting $\tau = 0$, and using $\gamma_t \equiv N t \int_{\mathcal{N}} n l(n)\,dF(n)/(pG)$ to find Eq. (30).
Proposition 4 is mathematically equivalent to Proposition 1, since both Propositions are derived from the same first-order conditions in equations (7)-(10). However, the difference lies in the economic interpretation of the optimal policy rules in both Propositions because different definitions for the MCF are adopted.
The most important difference is that the modified Samuelson rule in Eq. (30) now does have a correction for the standard marginal cost of public funds. 18 The standard measure for the marginal cost of public funds is generally not equal to one at the optimal tax system. In particular, Eq. (31) shows that the standard measure for the marginal cost of public funds for the lump-sum tax is always smaller than one ($\mathrm{MCF}^s_g < 1$) if there is a positive income tax ($t > 0$) and leisure is a normal good ($\bar{\varepsilon}_{lg} < 0$). Intuitively, transferring an extra unit of funds from individuals to the government via a larger lump-sum tax generates an income effect in labor supply, and this raises revenues from the income tax if it is positive ($t > 0$). Consequently, one can raise the lump-sum tax by less than one unit to raise one unit of public funds. The reason why the lump-sum tax does not have a marginal cost of public funds of unity, as it does with the Diamond-based measure, is that the standard measure compares the social marginal value of public resources to the private, not the social, marginal value of private resources. By ignoring the income effects on taxed bases, the average social marginal value of private income is 'overestimated', and, hence, the traditional MCF of lump-sum taxes is driven down below 1 if income is taxed. However, one would theoretically expect the lump-sum tax to have a marginal cost of public funds equal to one for three reasons. First, the lump-sum tax does not cause distortions. Second, the lump-sum tax does not feature distributional effects, in the sense that the normalized covariance between the social welfare weights $\lambda(n)$ and the lump-sum tax $g$ is zero. Third, the social marginal value of both public and private resources should be exactly the same if taxes are optimized. However, the standard definition suggests otherwise. That the marginal cost of public funds for lump-sum taxes is not equal to one in the tax optimum is the first property of the traditional MCF definition.

18 Note also that the uncompensated cross-elasticity of labor supply with respect to public goods $\bar{\varepsilon}^u_{lG}$ enters the expression rather than the compensated cross-elasticity. Whereas the uncompensated elasticity is zero with utility functions exhibiting (weak) separability between public goods and labor (leisure), the compensated cross-elasticity is generally different from zero and positive for a wide class of utility functions including the separable ones; see Jacobs (2009).
Furthermore, Eq. (32) gives the marginal cost of public funds for the tax rate ($\mathrm{MCF}^s_t$). $\mathrm{MCF}^s_t$ depends on the income-weighted uncompensated tax elasticity of labor supply $\bar{\varepsilon}^u_{lt}$ and the distributional benefits of income taxes $\xi^s_z$. In the absence of distributional concerns (i.e., $\xi^s_z = 0$), Eq. (32) shows that it is not possible to directly relate the marginal cost of public funds of the distortionary income tax to the marginal excess burden of the income tax $\mathrm{MEB} \equiv -\frac{t}{1-t}\bar{\varepsilon}_{lt}$. Yet many papers in the literature have suggested that the marginal cost of public funds should be a measure of the welfare costs of taxation in the absence of distributional concerns; see, for example, Pigou (1947), Harberger (1964) and Browning (1976). Moreover, the sign of $\bar{\varepsilon}^u_{lt}$ is theoretically ambiguous due to offsetting income and substitution effects. 19 $\mathrm{MCF}^s_t > 1$ is obtained only if the labor supply curve is upward-sloping ($\bar{\varepsilon}^u_{lt} < 0$). $\mathrm{MCF}^s_t < 1$ if there is a backward-bending labor supply curve ($\bar{\varepsilon}^u_{lt} > 0$). The result that the MCF of a distortionary tax can be smaller than one led to a large literature trying to explain this counterintuitive finding and to relate it to the marginal excess burden of the tax; see, for example, Triest (1990), Ballard and Fullerton (1992) and Dahlby (2008).
Once again, the reason why the standard definition of the marginal cost of public funds for a distortionary tax cannot be related to the excess burden of the tax is that it substitutes the average of the private marginal value of private income for the average social marginal value of private income. However, the average private marginal value of private income does not include the income effects on taxed bases. Therefore, these income effects on taxed bases show up in the denominator for the marginal cost of public funds of the tax; see Eq. (32). As a result, the marginal cost of funds measure for the distorting income tax rate is driven down, and may even fall below unity, if income is taxed. That the marginal cost of public funds for a distortionary tax cannot be directly related to the excess burden of taxation in the absence of distributional concerns, and may even be smaller than one, is the second property of the standard definition.
The Diamond-based and the standard MCF definitions coincide if income effects on taxed bases are absent, i.e., $\frac{\partial l(n)}{\partial g} = \frac{\partial c(n)}{\partial g} = 0$ so that $\lambda(n) = \alpha(n)$; see also Eq. (11). Intuitively, the private marginal value of private income $\lambda(n)$ is then a sufficient statistic for the social marginal value of private income $\alpha(n)$. This special case applies as well to money metric indirect utility functions, where the marginal utility of income is constant and equal to one. 20

An important strand in the literature has, alternatively, derived the marginal cost of public funds of a distortionary tax in representative-agent settings in terms of the compensating variation ($CV$) and the change in government revenue ($dR$); see, e.g., Ballard (1990), Mayshar (1990), and Håkonsen (1998). In particular, if taxes are optimized and distributional concerns are absent, the marginal cost of public funds is equal to: 21

Online Appendix A shows that $-\frac{CV}{dR}$ is equal to the standard measure of the marginal cost of public funds in Definition 5. This alternative approach to the marginal cost of public funds also needs reconsideration, because it expresses the compensating variation in terms of the uncompensated change in tax revenue. This is not logical, since the MCF measure is derived by (implicitly) assuming that individuals are perfectly compensated if the tax is marginally increased. Consequently, public revenue can only change due to compensated behavioral responses, while income effects are absent. Therefore, the compensating variation should be expressed in terms of compensated revenue changes ($dR^c$), and not in terms of uncompensated revenue changes ($dR$). If the compensating variation is expressed in terms of compensated revenue changes, the Diamond-based measure for the marginal cost of public funds in Definition 2 is found (see online Appendix A):

This is another indication that the Diamond-based measure for the marginal cost of public funds has more desirable economic properties than the standard measure.

Triest (1990), Håkonsen (1998) and Dahlby (2008) also aim to derive a relationship between the standard MCF measure and the marginal excess burden in models without distributional concerns. These contributions develop an MCF measure that adjusts the MEB with the ratio of the shadow value of public resources in the absence of taxation ($\eta^{fb}$, 'first-best') and after the introduction of distortionary taxation ($\eta^{sb}$, 'second-best'). In particular, the relationship is given by:

$\eta^{sb}$ captures the increased scarcity of public resources due to (higher) taxation that is unrelated to the excess burden of taxation. It is implicitly related to the income effects of (higher) distortionary taxes on taxed bases. 22 Like the standard approach, this approach still does not yield a direct correspondence between MCF and MEB, due to the multiplier term $\eta^{fb}/\eta^{sb}$. Moreover, there is no obvious way to estimate the alternative MCF measure of Triest (1990), Håkonsen (1998), and Dahlby (2008): the term $\eta^{fb}/\eta^{sb}$ is not measurable empirically. The Diamond-based definition for the MCF, in the absence of distributional concerns, features a direct correspondence between MCF and MEB and is expressed in empirically measurable sufficient statistics: compensated elasticities and tax rates.
The next Proposition demonstrates that the standard MCF measure is very sensitive to the normalization of the tax system.

Proposition 5 With the standard MCF definition, and with the income tax normalized to zero, the optimal rules for public goods provision and the linear consumption tax are given by
Proof Equation (7) is simplified by setting $t = 0$ and substituting Eq. (27) to find Eq. (37). Equation (8) is simplified by using Eq. (28) and setting $t = 0$ to find the first part of Eq. (38). The second part of Eq. (38) follows upon substitution of the Slutsky equation $\bar{\varepsilon}^u_{c\tau} = \bar{\varepsilon}^c_{c\tau} - \bar{\varepsilon}_{cg}$ and using Eq. (15). Equation (10) is simplified by using Eq. (29), setting $t = 0$, and using $\gamma_\tau \equiv N\tau \int_{\mathcal{N}} c(n)\,dF(n)/(pG)$ to find Eq. (36).
Proposition 5 shows that the standard definition for the marginal cost of public funds is highly sensitive to the normalization of the tax system. Equation (37) reveals that the marginal cost of public funds for the lump-sum tax is always higher than (or equal to) one if consumption is taxed. Recall that it is smaller than (or equal to) one if income is taxed. The reason is that the income elasticity of consumption is positive (if consumption is a normal good): $\bar{\varepsilon}_{cg} > 0$. If the lump-sum tax is increased (transfer is reduced), there is a negative income effect in consumption demand, which reduces the revenue from the consumption tax. With the different tax normalization, income effects on the consumption tax base now make the average social marginal value of private income larger than the average private marginal value of private income. Therefore, the standard measure for the marginal cost of public funds is larger. As the income effect on the taxed base switches in sign, $\mathrm{MCF}^s_g$ switches from a number below one under income taxation to a number above one under consumption taxation. 23 Equation (38) shows that, in the absence of distributional concerns ($\xi^s_c = 0$), and no lump-sum taxes, the marginal cost of funds for the consumption tax is always larger than one, i.e., $\mathrm{MCF}^s_\tau > 1$, since substitution effects and income effects in consumption demand are reinforcing rather than offsetting. Although the sign of the MCF measure is now intuitively correct, its magnitude is not. Since the average private marginal value of private income ignores the income effects on taxed bases, the traditional MCF measure overestimates the marginal cost of public funds.

22 This can be seen by taking an approximation of the $\mathrm{MCF}^s_t$: . From this follows that The approximation is valid if tax rates and the uncompensated elasticity are not too high.
Proposition 5 shows that a different normalization of the tax system therefore produces completely different marginal cost of public funds measures for both lump-sum and distortionary taxes even though the optimal second-best allocation is the same under both normalizations. 24 This is the third property of the standard marginal cost of public funds measures.
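The sign switch of the standard lump-sum measure across normalizations can be sketched with a small discrete example. This is only an illustration under stated assumptions: it assumes that the social and private marginal values of income are linked by $\alpha(n) = \lambda(n) + \eta \cdot (\text{income effect of a transfer on tax revenue from type } n)$, with the revenue effect taken to be $t\,n\,\partial l(n)/\partial g$ under income taxation and $\tau\,\partial c(n)/\partial g$ under consumption taxation. The three skill types, weights, and all numerical values are hypothetical.

```python
def diamond_mcf_lump_sum(eta, alpha, weights):
    """Diamond-based measure: eta over the average social marginal value alpha(n)."""
    return eta / sum(w * a for w, a in zip(weights, alpha))


def standard_mcf_lump_sum(eta, alpha, revenue_income_effect, weights):
    """Standard measure: eta over the average private marginal value lambda(n).

    Assumes lambda(n) = alpha(n) - eta * revenue_income_effect(n), i.e. the
    private value omits the income effect on the taxed base.
    """
    lam = [a - eta * r for a, r in zip(alpha, revenue_income_effect)]
    return eta / sum(w * l for w, l in zip(weights, lam))


# Hypothetical economy with three skill types.
weights = [0.3, 0.4, 0.3]   # population shares
skills = [1.0, 2.0, 3.0]    # n
eta = 1.0                   # social marginal value of public income
alpha = [1.4, 1.0, 0.6]     # averages to eta, as at an optimized transfer

# Income-tax normalization: leisure is normal, so dl/dg < 0.
t, dl_dg = 0.4, -0.1
income_tax_effect = [t * n * dl_dg for n in skills]

# Consumption-tax normalization: consumption is normal, so dc/dg > 0.
tau, dc_dg = 0.25, 0.6
consumption_tax_effect = [tau * dc_dg for _ in skills]

print("Diamond-based MCF_g:", diamond_mcf_lump_sum(eta, alpha, weights))
print("Standard MCF_g, income tax:",
      standard_mcf_lump_sum(eta, alpha, income_tax_effect, weights))
print("Standard MCF_g, consumption tax:",
      standard_mcf_lump_sum(eta, alpha, consumption_tax_effect, weights))
```

In this sketch the Diamond-based measure equals one by construction, while the standard measure falls below one under the income-tax normalization and rises above one under the consumption-tax normalization, purely because the income effect on the taxed base changes sign.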
The normalization of the tax code explains the findings of Wilson (1991) and Sandmo (1998). They both suggest that distributional concerns are the reason why the marginal cost of public funds is smaller than one. However, Proposition 5 demonstrates that the conclusion would be reversed if consumption is taxed rather than income. In this case, MCF s g and MCF s τ denote the marginal cost of public funds of the lump-sum tax and the consumption tax.
To summarize, the standard MCF measure has three properties, which are economically unappealing. First, the marginal cost of public funds for lump-sum taxes is not equal to one in the tax optimum, irrespective of the normalization of the tax system. Hence, the social marginal value of public resources seems to be unequal to the social marginal value of private resources in the tax optimum. Second, in the absence of distributional concerns, the marginal cost of public funds for a distortionary tax instrument cannot be directly related to the excess burden of the tax instrument, irrespective of the normalization of the tax system. Hence, the MCF measure does not properly capture the welfare costs of taxation in the absence of distributional concerns. Third, the MCF measures for both distortionary and lump-sum taxes are shown to be highly sensitive to the particular normalization of the tax system. From a practical point of view, all these properties render the applicability of standard MCF measures in applied policy analysis problematic: which number for the MCF should policy makers employ?
The properties of the standard definition of the MCF are caused by ignoring income effects on taxed bases in calculating the social marginal value of private income. By including the income effects on taxed bases in the social marginal value of private income, Sect. 3.1 has shown that (i) the marginal cost of public funds for lump-sum taxes always equals one in a tax optimum; (ii) there exists an explicit and direct link between the marginal cost of public funds and the excess burden of taxation (in the absence of distributional concerns), and (iii) MCF measures are not sensitive to the normalization of the tax system.

Sub-optimal taxation
To conclude the discussion on linear taxation, this section explores to what extent the results are driven by allowing for heterogeneous agents and non-individualized lump-sum transfers. To that end, suppose that the government cannot optimize the lump-sum tax. Then, the government has to resort to distortionary taxation as the marginal source of finance for public goods. For brevity, this section only discusses income taxation. The following Proposition derives optimal policy for any level of lump-sum taxation. 25

Proposition 6 Under the Diamond-based MCF definition, with the lump-sum tax exogenously given and the consumption tax normalized to zero, the optimal rules for public goods provision and the marginal cost of public funds are given by

Proof This result follows immediately from Proposition 1.
The modified Samuelson rule in Eq. (40) now features a correction ($\mathrm{MCF}_t = \frac{1-\xi_z}{1-\mathrm{MEB}_t}$) for the marginal cost of public funds of the income tax. Even when lump-sum taxes are unavailable, one cannot conclude that the marginal cost of public funds for distortionary taxation is necessarily larger than one. This depends on both the excess burden of the income tax $\mathrm{MEB}_t$ and the distributional benefits $\xi_z$ of the income tax. If the income tax is sub-optimally low (high) from a distributional perspective (i.e., $\xi_z > (<)\ \mathrm{MEB}_t$), then the marginal cost of public funds is smaller (bigger) than one, i.e., $\mathrm{MCF}_t < 1$ ($> 1$), and optimal public goods provision is larger (smaller), everything else equal. Intuitively, if $\mathrm{MCF}_t < 1$, the government over-provides public goods, relative to the second-best rule with optimized transfers, to compensate for the sub-optimal income redistribution by the income tax (and vice versa if $\mathrm{MCF}_t > 1$).
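The three cases discussed above can be checked with a minimal numerical sketch of the correction term $\mathrm{MCF}_t = (1-\xi_z)/(1-\mathrm{MEB}_t)$. The function name and the input values are illustrative assumptions, not calibrated numbers.

```python
def mcf_income_tax(xi_z, meb_t):
    """Diamond-based MCF of the income tax: (1 - xi_z) / (1 - MEB_t).

    xi_z  : marginal distributional benefit of the income tax
    meb_t : marginal excess burden of the income tax (must be below one)
    """
    if meb_t >= 1.0:
        raise ValueError("MEB_t must be below one")
    return (1.0 - xi_z) / (1.0 - meb_t)


meb_t = 0.2  # hypothetical marginal excess burden

cases = {
    "tax too low  (xi_z > MEB_t)": 0.3,
    "tax optimal  (xi_z = MEB_t)": 0.2,
    "tax too high (xi_z < MEB_t)": 0.1,
    "no redistribution (xi_z = 0)": 0.0,
}
for label, xi_z in cases.items():
    print(f"{label}: MCF_t = {mcf_income_tax(xi_z, meb_t):.3f}")
```

With these assumed inputs, the measure is below one when distributional benefits exceed the excess burden, exactly one when they are equal, and above one otherwise; the last case reproduces the representative-agent expression 1/(1 − MEB_t).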
The representative-agent models are nested as a special case of the model where lump-sum taxes are excluded and the distributional effects of taxation or public goods are absent ($\xi_z = \xi_G = 0$). In that particular case, the marginal cost of public funds is unambiguously larger than one, since $\mathrm{MCF}_t = \frac{1}{1-\mathrm{MEB}_t} > 1$. Consequently, only in this specific case does distortionary taxation unambiguously lower public goods provision compared to the first-best rule. However, this case is of limited practical interest for the simple reason that if everyone were identical, everyone would prefer non-individualized lump-sum taxes over distortionary income taxes to finance public goods.
If tax systems are not optimized because lump-sum taxes are not available, the Samuelson rule in Eq. (40) is modified compared to the first-best rule. Four additional factors appear in second-best settings with distortionary taxation: (i) interactions of the public good with distorted labor supply ($\bar{\varepsilon}^c_{lG}$), (ii) distributional benefits (or costs) of the public good ($\xi_G$), (iii) distortions of the financing of the public good, i.e., the marginal excess burden of distortionary taxation ($\mathrm{MEB}_t$), and (iv) distributional benefits of the financing of the public good ($\xi_z$). Only if non-individualized lump-sum taxes are available and taxes are optimized do the distributional gains equal the deadweight losses of financing the public good, so that $\mathrm{MEB}_t = \xi_z$, and effects (iii) and (iv) cancel from Eq. (40).

Nonlinear taxation
The analysis has so far been confined to linear policy instruments. In the real world, however, most tax systems are nonlinear. This short section extends the model to allow for nonlinear income taxation, as in Mirrlees (1971). 26 In doing so, previous literature is extended and amended by analyzing optimal taxation and public goods provision with preference heterogeneity (as in the linear case). By using a perturbation of the optimal tax schedule, all major findings derived under linear policies are shown to carry over to nonlinear policies.
In particular, let the nonlinear tax schedule be denoted by $T(z(n))$, where $T'(z(n)) \equiv dT(z(n))/dz(n)$ denotes the marginal income tax rate at income $z(n)$. The tax function is assumed to be continuous and differentiable. Proposition 7 derives the marginal cost of public funds under optimal nonlinear income taxation using a tax perturbation. Online Appendices B and C to this paper provide mathematically rigorous proofs. Moreover, online Appendices B and C derive the optimal nonlinear tax schedule and the modified Samuelson rule, and provide an elaborate discussion of the consequences of preference heterogeneity for optimal public good provision.

Proposition 7 The marginal cost of public funds is equal to one under optimal nonlinear income taxation:

Proof Consider a small tax perturbation where the intercept of the tax function $T(0)$ is marginally raised such that it raises one unit of income from all taxpayers. This tax reform has the following three effects. First, the government gains a marginal unit of resources per capita. Second, this policy mechanically decreases social welfare, measured in monetary equivalents, for each individual by $\lambda(n)/\eta$. Third, a smaller $-T(0)$ generates behavioral changes in labor supply. Since marginal tax rates are unaffected by this policy reform, substitution effects are zero and only the income effects matter. The income effect on labor supply changes tax revenues by $T'(z(n))\frac{\partial z(n)}{\partial(-T(0))} = T'(z(n))\,n\,\frac{\partial l(n)}{\partial(-T(0))}$ for each individual. The total change in social welfare should be zero if the intercept of the tax function is optimized:

Rewriting yields Proposition 7.
The marginal cost of public funds is also one under optimal nonlinear taxation. Intuitively, the government has access to a non-distortionary marginal source of public finance: the intercept of the tax function $T(0)$. $-T(0)$ is equivalent to the non-individualized lump-sum transfer $g$ under linear taxation. The government lowers the intercept of the tax function (i.e., raises $-T(0)$), which provides all individuals with more income, until the marginal utility of private income plus income effects ($\alpha(n)$) is on average equal to its marginal cost in terms of revenue ($\eta$). Hence, if the tax system is optimized, the government is indifferent to a marginal redistribution of income from the public to the private sector. In an optimal tax system, the marginal cost of public funds for all distortionary marginal tax rates $T'(z(n))$ should then be equal to the marginal cost of public funds for the non-distortionary tax $T(0)$. Thus, tax distortions should be equal to distributional gains for marginal tax rates at each point in the income distribution.
Online Appendix B shows that the expression for the optimal provision of public goods under nonlinear taxation is the same as Eq. (18) for optimal public goods provision under linear taxation, except for a correction for nonlinear marginal tax rates in the γ t term. Online Appendix B further shows that the expression for the optimal nonlinear income tax is the same as in Mirrlees (1971) and Saez (2001). Both will not be discussed further here.

Instrument set
Some may question whether in the real world governments in fact have access to non-distortionary lump-sum taxes as a marginal source of public finance, due to practical or even legal problems in implementing such taxes. However, from a theoretical point of view, the lump-sum part of the tax system generally consists of subsidies (i.e., transfers) to individuals. So, even if one likes to rule out non-individualized lump-sum taxes, it is certainly feasible to marginally reduce lump-sum transfers. Moreover, most real-world tax systems entail a general tax credit or a general tax exemption, which acts as a lump-sum subsidy and can be (and often is) changed by policy makers without running into all kinds of practical or legal obstacles. Hence, this paper's assumption that non-individualized lump-sum transfers are a marginal source of public finance can be defended on both theoretical and empirical grounds.

Kaplow and sub-optimal taxation
Kaplow (1996, 2004, 2008) argues that, even if the tax system is not optimal, neither distributional concerns nor incentive effects of the financing of public goods with distortionary taxes should be included in the discussion of second-best policy analysis. 27 In this argument, the government simultaneously does two things: (i) it changes the provision of the public good, and (ii) it applies a benefit-absorbing change in the nonlinear tax schedule that fully extracts each individual's willingness to pay for the public good. Given that the utility function is identical across agents and weakly separable, this tax change does not affect incentives to supply labor. Since the tax adjustment perfectly imitates a pure benefit tax, the first-best Samuelson rule can be used to judge whether public goods provision should increase or not, without making corrections for the marginal cost of public funds.
Kaplow's approach has a number of disadvantages. First, preferences must be weakly separable and identical; otherwise the benefit-absorbing change in the nonlinear tax schedule is generally not incentive compatible and, as a result, cannot be implemented; see also Laroque (2005) and Gahvari (2006). Second, while the tax system is not assumed to be optimal, the nonlinear income tax schedule is assumed to be flexible enough to perfectly offset any distributional effect of public good provision at each income level. However, if binding constraints make the nonlinear tax schedule sub-optimal, the very same constraints may prevent the implementation of the benefit-absorbing tax changes. Third, Kaplow's analysis cannot be readily generalized to linear tax schedules, because linear taxes generally do not neutralize all distributional effects of public goods, except in the knife-edge case where the benefits of public goods are proportional to income. Fourth, in real-world policy making, changes in the provision of public goods generally occur without neutralizing the distributional effects with benefit-absorbing changes in the tax system (Gahvari 2006). This paper avoided these disadvantages by analyzing public good provision and income taxation with non-separable and heterogeneous preferences, allowing for both linear and nonlinear tax schedules, and without adjusting the (linear or nonlinear) tax schedules to fully neutralize the distributional impact of public goods. It demonstrated that the marginal cost of public funds should not be present in the modified Samuelson rule for public goods provision. Intuitively, if the tax system is optimized, a marginal change in distortionary taxes to finance public goods produces exactly offsetting distortions and distributional gains.
This envelope property of optimal taxes explains why this result generalizes to non-separable and heterogeneous preferences, linear and nonlinear tax systems, and without the need to implement benefit-absorbing tax changes.

Implications for other public policies
The results of this paper are relevant to public policies beyond public goods provision. First, Jacobs and Boadway (2014) analyze optimal linear or nonlinear commodity taxes jointly with optimal nonlinear income taxes in the frameworks of Atkinson and Stiglitz (1976) and Mirrlees (1976). They show that the marginal cost of public funds, based on Diamond's social value of income, remains equal to one in the full tax optimum. Hence, distortions from commodity taxation do not drive the marginal cost of public funds above one if income and commodity taxes are optimized.
Second, Kleven and Kreiner (2006) extend the literature on the marginal cost of public funds, based on the traditional definition, to account for extensive-margin distortions in labor supply in addition to intensive-margin distortions. They argue that participation distortions tend to raise the marginal cost of public funds. However, Jacquet, Lehmann and Van der Linden (2013) merge the Mirrlees model of optimal income taxation with labor supply on the intensive margin with Diamond (1980)'s model of optimal income taxation with labor supply on the extensive margin. Jacobs et al. (2017) follow their analysis and demonstrate that in this model the marginal cost of public funds is once more equal to one in the tax optimum if the Diamond-based measure for the marginal cost of public funds is employed. Participation distortions are therefore not a reason why the marginal cost of public funds would rise above one in the tax optimum.
Third, Sandmo (1975), Bovenberg and De Mooij (1994) and Bovenberg and Van der Ploeg (1994) show that the optimal corrective tax is driven below the Pigouvian level if the marginal cost of public funds is larger than one. Intuitively, providing environmental quality directly competes with provision of ordinary public goods. Jacobs and de Mooij (2015) analyze optimal corrective taxes alongside optimal linear and nonlinear income taxes in the models of Atkinson and Stiglitz (1976) and Mirrlees (1976), which are extended with externalities. They demonstrate that the marginal cost of public funds is equal to one in the tax optimum with optimal corrective taxation. Hence, it is generally incorrect to set optimal corrective taxes below Pigouvian levels even in second-best settings with distortionary taxes. Consequently, governments should not pursue less ambitious environmental policies if tax rates are high, as Sandmo (1975) and Bovenberg and De Mooij (1994) suggested.
Fourth, Barro (1979) and Lucas and Stokey (1983) show that under distortionary taxation Ricardian equivalence breaks down, and debt-financing and tax financing of public spending cease to be equivalent. Since tax distortions are convex in tax rates, it is better to smooth tax rates over time, rather than having time-varying tax rates. Therefore, public debt should optimally be used to smooth tax rates over time. Werning (2007), however, demonstrates that ignoring distributional concerns to motivate tax distortions has fundamental consequences for the theory of optimal debt management. The reason is that, at the margin, the government always has access to non-distortionary sources of finance (i.e., non-individualized lump-sum transfers). Werning (2007)'s analysis implies that the marginal cost of public funds is equal to one at all times if governments optimize tax systems, see also Jacobs (2009). As a result, debt and (lump-sum) tax financing are equivalent, Ricardian equivalence is restored, and the optimal path of public debt becomes indeterminate.
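The convexity premise behind the tax-smoothing argument of Barro (1979) can be illustrated with a minimal numerical sketch. It assumes a stylized deadweight loss that is quadratic in the tax rate; the function names, the scale parameter and the two tax paths are hypothetical and not taken from the paper. Discounting and differences in revenue between the paths are ignored. The sketch only illustrates the premise that Werning (2007) qualifies, not the result that the marginal cost of public funds equals one.

```python
def deadweight_loss(tax_rate, scale=0.5):
    """Stylized deadweight loss, assumed quadratic (hence convex) in the tax rate."""
    return scale * tax_rate ** 2


def total_loss(tax_path, scale=0.5):
    """Sum of per-period deadweight losses along a path of tax rates (no discounting)."""
    return sum(deadweight_loss(t, scale) for t in tax_path)


# Two hypothetical two-period paths with the same average tax rate of 30 percent.
smooth_path = [0.30, 0.30]
volatile_path = [0.10, 0.50]

smooth_loss = total_loss(smooth_path)
volatile_loss = total_loss(volatile_path)

print(f"smooth path loss:   {smooth_loss:.3f}")
print(f"volatile path loss: {volatile_loss:.3f}")
```

Under these assumptions the smooth path yields the smaller total loss, which is the sense in which convex distortions favor constant tax rates when distributional concerns are left out of the analysis.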
Fifth, if the marginal cost of public funds is equal to one, revenue-raising instruments (such as taxes) are not superior to revenue-neutral instruments (such as regulation), or to revenue-reducing instruments (such as subsidies), because resources are valued equally at the margin in the public and the private sector. Hence, one needs to include additional, instrument-specific constraints in the analysis in order to assess the desirability of revenue-raising over revenue-neutral or revenue-reducing policy instruments. Therefore, theories on optimal regulation and procurement that rely on a marginal cost of funds larger than one need reconsideration, see, e.g., Laffont and Tirole (1993).

Social cost-benefit analysis
The common practice of adding, say, 50 cents for every dollar spent on a public project to account for tax distortions, while completely ignoring the distributional benefits of tax instruments, is incorrect, see for example Heckman et al. (2010). Such a practice is bound to yield policy errors, because many public projects may fail the social cost-benefit test even though they could be socially desirable. It is probably most practical not to make corrections for the marginal cost of public funds in social cost-benefit analysis. 28 Tax distortions cancel out against distributional gains if taxes are optimally set, which renders the marginal cost of public funds equal to one. Alternatively, if one does not want to assume that taxes are optimally set, then not only the deadweight costs of taxation, but also the distributional benefits of taxation should be included in social cost-benefit analysis. See also Sect. 3.3 on sub-optimal taxation.
To justify setting the marginal cost of public funds equal to one in social cost-benefit analysis, policy makers may invoke Becker (1983)'s efficient redistribution hypothesis, which argues that the political system should achieve an outcome in which all opportunities for political gains to redistribute incomes are exhausted. Alternatively, one may invoke the principle of insufficient reason, since the calculation of deadweight costs and distributional benefits of taxation is fraught with difficulties. In particular, if tax systems are not considered to be optimal, a social cost-benefit analysis requires estimates of both the marginal distortions and the marginal distributional benefits of the taxes that are used to finance the public project. It is in principle possible to estimate the deadweight losses of taxation, even though uncertainties in deadweight loss estimates can be substantial. The estimation of the distributional benefits of taxation, however, is a hazardous task for any policy analyst, since this inevitably involves political judgments about the desirability of income redistribution from which policy analysts should preferably abstain. 29
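The policy error described above can be made concrete with a small sketch. It assumes a simple decision rule in which a project passes if its social benefit is at least the resource cost scaled by the marginal cost of public funds. The function name and the project numbers are hypothetical and chosen only for illustration; the 1.5 markup corresponds to the 50 cents per dollar mentioned in the text.

```python
def passes_cost_benefit_test(benefit, resource_cost, mcf=1.0):
    """Return True if the social benefit covers the resource cost scaled by the MCF."""
    return benefit >= mcf * resource_cost


# Hypothetical project: 1.20 of social benefit per 1.00 of resources spent.
benefit = 1.20
resource_cost = 1.00

# MCF equal to one, as implied by an optimal tax system.
passes_at_mcf_one = passes_cost_benefit_test(benefit, resource_cost, mcf=1.0)

# Markup of 50 cents per dollar that ignores distributional benefits.
passes_with_markup = passes_cost_benefit_test(benefit, resource_cost, mcf=1.5)

print("passes with MCF = 1.0:", passes_at_mcf_one)
print("passes with MCF = 1.5:", passes_with_markup)
```

With these illustrative numbers the project is accepted when the marginal cost of public funds is one and rejected under the markup, so the markup alone drives the rejection.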

Conclusion
This paper analyzed the simultaneous setting of optimal taxes and the provision of public goods in standard optimal tax models with heterogeneous agents having heterogeneous preferences. Both optimal linear and nonlinear tax schedules are analyzed. It has been demonstrated that tax distortions are the ultimate result of redistributional concerns. Using Diamond (1975)'s definition for the social marginal value of private income, which includes policy-induced income effects on tax bases, the marginal cost of public funds at the optimal tax system is shown to be one, for optimal linear and nonlinear taxes, and for income and consumption taxes. At the optimal tax system, the marginal cost of public funds for all tax instruments should be equalized. Hence, the marginal cost of public funds for distortionary taxation equals the marginal cost of public funds for non-distortionary taxation, which equals one. The distributional benefits of distortionary tax rates are therefore equal to the marginal excess burden of tax rates. The modified Samuelson rule is derived using general preference structures. It is demonstrated that the marginal cost of public funds does not determine optimal public goods provision under optimal taxation. Modified Samuelson rules for public good provision under linear and nonlinear taxation are identical.
Applied policy economists should therefore use the marginal excess burden as the social cost of income redistribution, and the marginal cost of public funds as the social cost of financing public activities that are unrelated to income redistribution. It should be remembered that the marginal cost of public funds of a distortionary tax captures not only the deadweight loss of that tax, but also its distributional benefits. When tax systems are not optimal, the marginal cost of public funds can be either smaller or larger than one, depending on whether the distributional benefits are larger or smaller than the marginal excess burden of distortionary taxes.
This paper followed common practice in the second-best literature by assuming that the government is a benevolent social planner and markets are Pareto-efficient in the absence of government intervention. Government failure could explain why taxes and public goods are not set at second-best optimal levels. Moreover, not only governments, but also markets can fail. The main lesson of this paper is that without explicitly incorporating the fundamental reasons why governments or markets fail into the analysis, it is premature to conclude that the marginal cost of public funds lies above or below one. In future research, the marginal cost of public funds should be derived in settings where market or government failure is explicitly modeled from first principles and not derived from ad hoc constraints.