Optimal tax problems with multidimensional heterogeneity: a mechanism design approach

Jacquet, Laurence; Lehmann, Etienne

doi:10.1007/s00355-021-01349-4

Original Paper
Published: 10 July 2021

Volume 60, pages 135–164, (2023)
Social Choice and Welfare

Laurence Jacquet¹ &
Etienne Lehmann²

Abstract

We propose a new method, which we call an allocation perturbation, to derive optimal nonlinear income tax schedules when individuals differ along multiple characteristics on which taxes cannot be conditioned. It is well established that, when individuals differ in preferences as well as in skills, optimal marginal tax rates can be negative. In contrast, we show that when individuals differ in behavioral responses and skills, optimal marginal tax rates are positive under both utilitarian and maximin preferences.

Notes

John Weymark has contributed extensively to the literature on optimal income taxation; see e.g. Weymark (1986), Weymark (1987) and Brett and Weymark (2011).
Our paper studies the optimal tax system when individual characteristics, despite being observable by the tax authority, cannot be used as tags (Akerlof 1978), due to legal and/or horizontal equity reasons.
Our definition of “group” is identical to the one in Werning (2007), p. 13.
A smoothly increasing (decreasing) function is also called an increasing (decreasing) diffeomorphism for which the derivative maps the positive real line onto itself.
In (Jacquet and Lehmann 2021, Proposition 5), we show that the assumption of a smoothly-increasing-in-types allocation amounts to assuming: (i) twice differentiability of the tax function $T(\cdot )$, that (ii) for all $(w,\theta )\in \mathbb {R}_+^*\times \Theta$, the second-order condition associated to the individual maximization program holds strictly and that (iii) for all $(w,\theta )\in \mathbb {R}_+^*\times \Theta$, the function $y\mapsto \mathscr {U}\left( y-T(y),y;w,\theta \right)$ admits a unique global maximum over $\mathbb {R}_+$.
For instance, we never found cases where the second-order incentive-compatibility constraints were violated in the large set of simulations we ran on US data with taxpayers differing in terms of gender and labor supply elasticities; see Jacquet and Lehmann (2021).
More precisely, on the left-hand side of Eq. (14a), the term $-\frac{v_{y}\left[ w,\theta \right] }{w\cdot v_{yw}\left[ w,\theta \right] }$, which is equal to the ratio of $\varepsilon (w,\theta )$ to $\alpha (w,\theta )$ [see Eq. (35) in the Appendix], is weighted by the skill times the conditional density, $W(w,\theta )\ f(W(w,\theta )|\theta )$. On the right-hand side of (14a), which encapsulates the mechanical and income effects, the weights are the conditional skill densities.
In Hellwig (2007), under a utilitarian criterion, positive optimal tax rates are obtained with more general preferences.
If the utility function $u(\cdot )$ in (1) were parameterized by type w and $\theta$ while $v(\cdot )$ were simply parameterized by w, individuals who earn the same income would have distinct social marginal welfare weights. This could drive negative marginal tax rates. Similarly, if both $u(\cdot )$ and $v(\cdot )$ were parameterized by w and $\theta$, one would also expect negative marginal tax rates. Let us stress that our method could not be used in this framework since the pooling function (10) cannot depend simultaneously on Y and C.
Hence function ${\underline{W}}(\cdot ,\theta )$ coincides with the pooling function $W(\cdot ,\theta )$.
Indeed, at $m=0$, $\mathscr {Y}^R_y$ no longer depends on the direction R of the tax reform.

References

Akerlof GA (1978) The economics of tagging as applied to the optimal income tax, welfare programs, and manpower planning. Am Econ Rev 68(1):8–19
Blumkin T, Sadka E, Shem-Tov Y (2015) International tax competition: zero tax rate at the top re-established. Int Tax Public Finance 22(5):760–776
Boadway R, Marchand M, Pestieau P, del Mar Racionero M (2002) Optimal redistribution with heterogeneous preferences for leisure. J Public Econ Theory 4(4):475–498
Brett C, Weymark JA (2003) Financing education using optimal redistributive taxation. J Public Econ 87(11):2549–2569
Brett C, Weymark JA (2011) How optimal nonlinear income taxes change when the distribution of the population changes. J Public Econ 95(11):1239–1247
Choné P, Laroque G (2010) Negative marginal tax rates and heterogeneity. Am Econ Rev 100(5):2532–47
Cremer H, Lozachmeur J-M, Pestieau P (2012) Income taxation of couples and the tax unit choice. J Popul Econ 25(2):763–778
Cuff K (2000) Optimality of workfare with heterogeneous preferences. Can J Econ 33(1):149–174
Diamond P (1998) Optimal income taxation: an example with U-shaped pattern of optimal marginal tax rates. Am Econ Rev 88(1):83–95
Gomes R, Lozachmeur J-M, Pavan A (2018) Differential taxation and occupational choice. Rev Econ Stud 85(1):511–557
Guesnerie R (1995) A contribution to the pure theory of taxation. Cambridge University Press, Cambridge
Hammond P (1979) Straightforward individual incentive compatibility in large economies. Rev Econ Stud 46(2):263–282
Hellwig MF (2007) A contribution to the theory of optimal utilitarian income taxation. J Public Econ 91(7):1449–1477
Hendren N (2020) Measuring economic efficiency using inverse-optimum weights. J Public Econ 2020:187
Jacquet L, Lehmann E (2013) Optimal redistributive taxation with both extensive and intensive responses. J Econ Theory 148(5):1770–1805
Jacquet L, Lehmann E (2021) Optimal income taxation with composition effects. J Eur Econ Assoc 19(2):1299–1341
Kleven HJ, Kreiner CT, Saez E (2009) The optimal income taxation of couples. Econometrica 77(2):537–560
Lehmann E, Simula L, Trannoy A (2014) Tax me if you can! Optimal nonlinear income tax between competing governments. Q J Econ 129(4):1995–2030
Lockwood BB, Weinzierl M (2015) De Gustibus non est Taxandum: heterogeneity in preferences and optimal redistribution. J Public Econ 124:74–80
Mirrlees J (1971) An exploration in the theory of optimum income taxation. Rev Econ Stud 38(2):175–208
Piketty T (1997) La Redistribution fiscale contre le chômage. Revue française d’économie 12(1):157–203
Ramsey FP (1927) A contribution to the theory of taxation. Econ J 37(145):47–61
Rochet J (1985) The taxation principle and multi-time Hamilton-Jacobi equations. J Math Econ 14(2):113–128
Rochet J-C, Stole LA (2002) Nonlinear pricing with random participation. Rev Econ Stud 69(1):277–311
Rochet J-C, Choné P (1998) Ironing, sweeping, and multidimensional screening. Econometrica 66(4):783–826
Rothschild C, Scheuer F (2013) Redistributive taxation in the Roy model. Q J Econ 128(2):623–668
Rothschild C, Scheuer F (2016) Optimal taxation with rent-seeking. Rev Econ Stud 83(3):1225–1262
Sachs D, Tsyvinski A, Werquin N (2020) Nonlinear tax incidence and optimal taxation in general equilibrium. Econometrica 88(2):469–493
Saez E (2001) Using elasticities to derive optimal income tax rates. Rev Econ Stud 68(1):205–229
Saez E (2002) Optimal income transfer programs: intensive versus extensive labor supply responses. Q J Econ 117:1039–1073
Salanié B (2011) The economics of taxation, 2nd edn. MIT Press
Scheuer F (2013) Adverse selection in credit markets and regressive profit taxation. J Econ Theory 148(4):1333–1360
Scheuer F (2014) Entrepreneurial taxation with endogenous entry. Am Econ J Econ Policy 6(2):126–63
Scheuer F, Werning I (2016) Mirrlees meets Diamond-Mirrlees. Working Paper 22076, National Bureau of Economic Research, March
Werning I (2007) Pareto efficient income taxation. MIT Working Paper 2007
Weymark JA (1986) A reduced-form optimal nonlinear income tax problem. J Public Econ 30(2):199–217
Weymark JA (1987) Comparative static properties of optimal nonlinear income taxes. Econometrica 55(5):1165–1185
Wilson RB (1993) Nonlinear pricing. Oxford University Press, Oxford


Author information

Authors and Affiliations

THEMA-CY Cergy Paris Université, THEMA, Université de Cergy-Pontoise, 33 boulevard du Port, 95011, Cergy-Pontoise Cedex, France
Laurence Jacquet
CRED (TEPP) University Panthéon Assas Paris II, 12 place du Panthéon, 75231, Paris Cedex 05, France
Etienne Lehmann

Corresponding author

Correspondence to Laurence Jacquet.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We thank Craig Brett and two anonymous referees, Pierre Boyer, Vidar Christiansen, Guy Laroque, Emmanuel Saez, Laurent Simula, Stefanie Stantcheva, Kevin Spiritus, Alain Trannoy and Nicolas Werquin. This research was partly realized while Laurence Jacquet was research associate at Oslo Fiscal Studies. She gratefully acknowledges support by Labex MME-DII. The usual disclaimer applies.

Etienne Lehmann is also research fellow at IZA, CESifo and CEPR.

A Appendix

A.1 Proof of Lemma 4

Proof

The proof consists of two steps. In step (i), we show that there exists at most one incentive-compatible allocation $(w,\theta )\mapsto ({\underline{C}}(w,\theta ),{\underline{Y}}(w,\theta ))$ that verifies Assumption 3 and such that $({\underline{C}}(w,\theta _0),{\underline{Y}}(w,\theta _0))=(C(w,\theta _0),Y(w,\theta _0))$. In step (ii), we show that this allocation verifies the whole set of incentive constraints (6).

Step (i). To build up the entire incentive-compatible allocation $(w,\theta )\mapsto ({\underline{C}}(w,\theta ),{\underline{Y}}(w,\theta ))$, we must choose $({\underline{C}}(w,\theta _0),{\underline{Y}}(w,\theta _0))=(C(w,\theta _0),Y(w,\theta _0))$ at any skill level. For each group $\theta$, ${\underline{Y}}(\cdot ,\theta )$ verifies Assumption 3 if and only if its inverse ${\underline{Y}}^{-1}(\cdot ,\theta )$ is smoothly increasing. Let $y\in \mathbb {R}_+$ be an income level. As $Y(\cdot ,\theta _0)$ is smoothly increasing from Assumption 3, there exists a unique skill level w such that $y=Y(w,\theta _0)$. Then according to Lemma 3, among individuals of group $\theta$, only those of skill ${\underline{W}}(w,\theta )$ must be assigned to the income level $y=Y(w,\theta _0)$ to verify incentive-compatibility.^{Footnote 10} Therefore, ${\underline{Y}}^{-1}(\cdot ,\theta )$ must be defined by:

$$\begin{aligned} {\underline{Y}}^{-1}(\cdot ,\theta ):\qquad y\overset{Y^{-1}(\cdot ,\theta _0)}{\longmapsto }w=Y^{-1}(y,\theta _0) \overset{{\underline{W}}(\cdot ,\theta )}{\longmapsto }{\underline{Y}}^{-1}(y,\theta )={\underline{W}}(w,\theta ). \end{aligned}$$

${\underline{Y}}^{-1}(\cdot ,\theta )$ is then smoothly increasing as the composition of two smoothly increasing functions. Moreover, since for each type $(\omega ,\theta )$ there exists a single skill level $w$ such that ${\underline{Y}}(\omega ,\theta )=\textit{Y}(w,\theta _0)$, incentive compatibility requires that ${\underline{C}}(\omega ,\theta )$ be equal to $\textit{C}(w,\theta _0)$. This ends the proof of step (i).
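The construction in step (i) can be illustrated numerically. The sketch below is not taken from the paper: it assumes quasi-linear utility $u(c)=c$, a flat tax $T(y)=\tau y$, and the isoelastic disutility $v(y;w,\theta )=(y/w)^{1+1/\theta }/(1+1/\theta )$, so that incomes have a closed form. The names `income`, `income_inverse` and `pooling`, and the values of `TAU` and `THETA_0`, are hypothetical. Under these assumptions, the pooling skill is obtained by composing the reference group's income function with the inverse income function of group $\theta$.

```python
# Assumed example (not from the paper): u(c) = c, T(y) = TAU * y and
# v(y; w, theta) = (y / w) ** (1 + 1 / theta) / (1 + 1 / theta).
TAU = 0.3        # hypothetical flat marginal tax rate
THETA_0 = 0.5    # hypothetical reference group


def income(w, theta, tau=TAU):
    """Income Y(w, theta) solving the first-order condition 1 - tau = v_y."""
    return (1.0 - tau) ** theta * w ** (1.0 + theta)


def income_inverse(y, theta, tau=TAU):
    """Skill level in group theta that earns income y."""
    return (y / (1.0 - tau) ** theta) ** (1.0 / (1.0 + theta))


def pooling(w, theta, theta_0=THETA_0):
    """Skill in group theta earning the same income as skill w in group theta_0."""
    return income_inverse(income(w, theta_0), theta)
```

In this example `pooling(w, THETA_0)` returns `w` itself, and `pooling(., theta)` is increasing because it composes two increasing maps, which mirrors the argument used for ${\underline{Y}}^{-1}(\cdot ,\theta )$.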

Step (ii). Note that the allocation $(w,\theta )\mapsto ({\underline{Y}}(w,\theta ),{\underline{C}}(w,\theta ))$ is built in such a way that one has ${\underline{Y}}(\omega ,\theta )=\textit{Y}(w,\theta _0)\text { and } {\underline{C}}(\omega ,\theta )=\textit{C}(w,\theta _0)$ if and only if $\omega ={\underline{W}}(w,\theta )$ and (10) holds. Differentiating in w both sides of these two equations and rearranging terms, we obtain

$$\begin{aligned} \frac{{\dot{C}}\left( w,\theta _0\right) }{{\dot{Y}}\left( w,\theta _0\right) }= \frac{\dot{{\underline{C}}}\left( {\underline{W}}(w,\theta ),\theta \right) }{\dot{{\underline{Y}}}\left( {\underline{W}}(w,\theta ),\theta \right) }. \end{aligned}$$

As $w\mapsto (C(w,\theta _0),Y(w,\theta _0))$ is assumed to verify the within-group incentive constraints in Eq. (8b), we know that the left-hand side of the above equation is equal to

$$\begin{aligned} \mathscr {M}(C(w,\theta _0),Y(w,\theta _0);w,\theta _0). \end{aligned}$$

Using the definition of ${\underline{W}}(\cdot ,\theta )$, we have that $w\mapsto ({\underline{C}}(w,\theta ),{\underline{Y}}(w,\theta ))$ also verifies Eq. (8b). From Lemma 2, it thus verifies the within-group incentive constraints (7). We now check whether the inequality (6) is verified for any $(w,w',\theta ,\theta ')\in \mathbb {R}_+^2\times \Theta ^2$. We know there exists $\omega \in \mathbb {R}_+$ such that ${\underline{Y}}(\omega ,\theta )={\underline{Y}}(w',\theta ')\text { and } {\underline{C}}(\omega ,\theta )={\underline{C}}(w',\theta ')$. The incentive constraints in (6) are therefore equivalent to:

$$\begin{aligned} \mathscr {U}\left( C(w,\theta ),Y(w,\theta );w,\theta \right) \ge \mathscr {U}\left( C(\omega ,\theta ),Y(\omega ,\theta );w,\theta \right) . \end{aligned}$$

The latter inequality is verified as $w\mapsto ({\underline{C}}(w,\theta ),{\underline{Y}}(w,\theta ))$ satisfies Eq. (8b). $\square$

A.2 Derivation of Eq. (17)

Proof

To derive (17), we must compute the various Gâteaux derivatives at $t=0$. For $A=C,Y,U$ and a given $\delta$, the Gâteaux derivative of A in the direction $\Delta _Y(\cdot ,\cdot ;\delta )$ at $t=0$ is denoted $\hat{{\hat{A}}}(x,\theta ;\delta )$. Recall its definition:

$$\begin{aligned} \hat{{\hat{A}}}(x,\theta ;\delta )\overset{\text {def}}{\equiv }\underset{t\rightarrow 0}{\lim } \frac{{\hat{A}}(x,\theta ;t,\delta )-A(x,\theta )}{t}. \end{aligned}$$
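As a generic illustration of this definition, and not of the paper's specific perturbation, the following sketch approximates a Gâteaux derivative by a finite difference in $t$. The functional (the integral of the squared allocation over $[0,1]$), the baseline allocation and the direction are all assumptions chosen so that the exact derivative, $2\int Y\Delta _Y$, is known.

```python
def integral(fn, lo=0.0, hi=1.0, n=2000):
    """Midpoint-rule approximation of the integral of fn over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(n))


def gateaux_derivative(functional, allocation, direction, t=1e-6):
    """Finite-difference proxy for the limit of [F(A + t*D) - F(A)] / t as t -> 0."""
    def perturbed(x):
        return allocation(x) + t * direction(x)
    return (functional(perturbed) - functional(allocation)) / t


def square_integral(y):
    """Hypothetical functional F(Y): integral of Y(x)^2 over [0, 1]."""
    return integral(lambda x: y(x) ** 2)


def baseline(x):
    """Assumed baseline allocation Y(x) = 1 + x."""
    return 1.0 + x


def direction(x):
    """Assumed perturbation direction, vanishing at both endpoints."""
    return x * (1.0 - x)


numeric = gateaux_derivative(square_integral, baseline, direction)
exact = integral(lambda x: 2.0 * baseline(x) * direction(x))
```

The two numbers agree up to a term of order `t`, which is the sense in which the derivatives $\hat{{\hat{A}}}$ capture first-order effects of the perturbation.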

By definition we get: $\hat{{\hat{Y}}}(x,\theta _0;\delta )=\Delta _Y(x;\delta )$, and from (15b) we obtain:

$$\begin{aligned} \hat{{\hat{Y}}}(x,\theta ;\delta )=0\qquad \text {if}\qquad x \in \left[ 0,W(w-\delta ,\theta )\right] \cup \left[ W(w,\theta ),+\infty \right) . \end{aligned}$$

(24a)

Equation (15c) implies that the Gâteaux derivatives of utilities are nil for skills below $W(w-\delta ,\theta )$. For skills x between $W(w-\delta ,\theta )$ and $W(w,\theta )$, Eq. (15e) implies:

$$\begin{aligned} \hat{{\hat{U}}}(x,\theta ;\delta )= -\int _{W(w-\delta ,\theta )}^{x} \upsilon _{yw}\left( Y(\omega ,\theta );\omega ,\theta \right) \ \hat{{\hat{Y}}}(\omega ,\theta ;\delta )\ d\omega . \end{aligned}$$

(24b)

For skill x above $W(w,\theta )$, according to (15f), we have:

$$\begin{aligned} \hat{{\hat{U}}}(x,\theta ;\delta )= -\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(\omega ,\theta );\omega ,\theta \right) \ \hat{{\hat{Y}}}(\omega ,\theta ;\delta )\ d\omega . \end{aligned}$$

(24c)

Moreover, Eq. (15h) implies that the Gâteaux derivatives of income must verify:

$$\begin{aligned} \int _{w-\delta }^{w} \upsilon _{yw}\left( Y(\omega ,\theta _0);\omega ,\theta _0\right) \ \hat{{\hat{Y}}}(\omega ,\theta _0;\delta ) \ d\omega = \int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(\omega ,\theta );\omega ,\theta \right) \ \hat{{\hat{Y}}}(\omega ,\theta ;\delta ) \ d\omega . \end{aligned}$$

(24d)

Using Eqs. (12), (24a) and (24c), the Gâteaux derivative of the Lagrangian (16) is:

$$\begin{aligned}&\frac{\partial \hat{{\mathscr {L}}}}{\partial t}(0;\delta )= \int _{\theta \in \Theta }\left\{ \int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( 1-\frac{\upsilon _y(Y(x,\theta );x,\theta )}{u^\prime (C(x,\theta ))}\right) \hat{{\hat{Y}}}(x,\theta ;\delta ) f(x|\theta )dx\right. \nonumber \\&+ \left. \int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( \frac{\Phi _U[x,\theta ]}{\lambda }-\frac{1}{u^\prime [x,\theta ]}\right) \hat{{\hat{U}}}(x,\theta ;\delta ) f(x|\theta )dx\right. \nonumber \\&- \left( \int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx\right) \nonumber \\&\times \left. \left( \int _{W(w,\theta )}^{\infty } \left( \frac{\Phi _U[x,\theta ]}{\lambda }-\frac{1}{u^\prime [x,\theta ]}\right) f(x|\theta )dx \right) \right\} d\mu (\theta ). \end{aligned}$$

(25)

Dividing the first-order condition $\frac{\partial \hat{\mathscr {L}}}{\partial t}(0;\delta )=0$ by $\int _{w-\delta }^{w} \upsilon _{yw}\left( Y(x,\theta _0);x,\theta _0\right) \ \hat{{\hat{Y}}}(x,\theta _0;\delta )\ dx$ implies, using (24b) and (24d), that:

$$\begin{aligned}&\int _{\theta \in \Theta } \dfrac{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( 1-\dfrac{\upsilon _y(Y(x,\theta );x,\theta )}{u^\prime (C(x,\theta ))}\right) \hat{{\hat{Y}}}(x,\theta ;\delta ) f(x|\theta )dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx} \ d\mu (\theta ) \nonumber \\&\quad = \int _{\theta \in \Theta }\left\{ \int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( \dfrac{\Phi _U[x,\theta ]}{\lambda }-\dfrac{1}{u^\prime [x,\theta ]}\right) \dfrac{\int _{W(w-\delta ,\theta )}^{x} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx } f(x|\theta )dx \right. \nonumber \\&\qquad + \left. \int _{W(w,\theta )}^{\infty } \underset{}{\left( \frac{\Phi _U[x,\theta ]}{\lambda }-\frac{1}{u^\prime [x,\theta ]}\right) } f(x|\theta )dx\right\} d\mu (\theta ). \end{aligned}$$

(26)

We finally take the limit of the latter equality when $\delta$ tends to 0. Let us consider the first term in the right-hand side of (26). Since

$$\begin{aligned} \dfrac{\int _{W(w-\delta ,\theta )}^{x} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx } \in [0,1] \end{aligned}$$

we get that:

$$\begin{aligned}&\left| \int _{\theta \in \Theta } \int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( \dfrac{\Phi _U[x,\theta ]}{\lambda }-\dfrac{1}{u^\prime [x,\theta ]}\right) \dfrac{\int _{W(w-\delta ,\theta )}^{x} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx } f(x|\theta )dx d\mu (\theta )\right| \\&\quad \le \left| \int _{\theta \in \Theta } \int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( \dfrac{\Phi _U[x,\theta ]}{\lambda }-\dfrac{1}{u^\prime [x,\theta ]}\right) f(x|\theta )dx d\mu (\theta )\right| . \end{aligned}$$

As the right-hand side of the latter inequality tends to 0 when $\delta$ tends to 0, the limit of (26) when $\delta$ tends to zero leads to:

$$\begin{aligned}&\underset{\delta \rightarrow 0}{\lim }\quad \int _{\theta \in \Theta } \dfrac{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( 1-\dfrac{\upsilon _y(Y(x,\theta );x,\theta )}{u^\prime (C(x,\theta ))}\right) \hat{{\hat{Y}}}(x,\theta ;\delta ) f(x|\theta )dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx} d\mu (\theta ) \nonumber \\&\quad = \iint _{\theta \in \Theta ,x\ge W(w,\theta )} \left( \frac{\Phi _U[x,\theta ]}{\lambda }-\frac{1}{u^\prime [x,\theta ]}\right) f(x|\theta )dx\ d\mu (\theta ). \end{aligned}$$

(27)

By continuity, the variations of $f(x|\theta )$, $\upsilon _y(Y(x,\theta );x,\theta )$, $\upsilon _{yw}(Y(x,\theta );x,\theta )$ and $u^\prime (C(x,\theta ))$ within the skill intervals $[W(w-\delta ,\theta ),W(w,\theta )]$ are of second order when $\delta$ tends to 0. As $\Theta$ and the intervals $[W(w-\delta ,\theta ),W(w,\theta )]$ are compact, for any small $e>0$, there always exists ${\tilde{\delta }}(e)$ such that for all $(x,\theta )\in [W(w-{\tilde{\delta }}(e),\theta ),W(w,\theta )]\times \Theta$, one has:

$$\begin{aligned}&\left( \left( 1-\frac{\upsilon _y[ W(w,\theta ) ,\theta ]}{u^\prime [ W(w,\theta ),\theta ]}\right) f(W(w,\theta )|\theta )-e\right) \hat{{\hat{Y}}}(x,\theta ;\delta ) \le \left( 1-\frac{\upsilon _y[ x ,\theta ]}{u^\prime [ x,\theta ]}\right) f(x|\theta )\ \hat{{\hat{Y}}}(x,\theta ;\delta ) \\&\quad \le \left( \left( 1-\frac{\upsilon _y[W(w,\theta ) ,\theta ]}{u^\prime [ W(w,\theta ),\theta ]}\right) f(W(w,\theta )|\theta )+e \right) \hat{{\hat{Y}}}(x,\theta ;\delta ) \end{aligned}$$

and

$$\begin{aligned}&\left( \upsilon _{yw} [W(w,\theta ) ,\theta ] - e\right) \hat{{\hat{Y}}}(x,\theta ;\delta ) \le \upsilon _{yw} [x ,\theta ] \ \hat{{\hat{Y}}}(x,\theta ;\delta ) \le \left( \upsilon _{yw} [W(w,\theta ) ,\theta ] +e \right) \hat{{\hat{Y}}}(x,\theta ;\delta )<0 \end{aligned}$$

so that for all $\delta <{\tilde{\delta }}(e)$:

$$\begin{aligned}&\int _{\theta \in \Theta } \dfrac{\left( 1-\dfrac{\upsilon _y(Y(W(w,\theta ),\theta );W(w,\theta ),\theta )}{u^\prime (C(W(w,\theta ),\theta ))}\right) f(W(w,\theta )|\theta )+e}{\upsilon _{yw}(Y(W(w,\theta ),\theta );W(w,\theta ),\theta )-e}\ \dfrac{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \hat{{\hat{Y}}}(x,\theta ;\delta ) dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx} d\mu (\theta ) \\&\quad \le \int _{\theta \in \Theta } \dfrac{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( 1-\dfrac{\upsilon _y(Y(x,\theta );x,\theta )}{u^\prime (C(x,\theta ))}\right) \hat{{\hat{Y}}}(x,\theta ;\delta ) f(x|\theta )dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx} d\mu (\theta ) \\&\quad \le \int _{\theta \in \Theta } \dfrac{\left( 1-\dfrac{\upsilon _y(Y(W(w,\theta ),\theta );W(w,\theta ),\theta )}{u^\prime (C(W(w,\theta ),\theta ))}\right) f(W(w,\theta )|\theta )-e}{\upsilon _{yw}(Y(W(w,\theta ),\theta );W(w,\theta ),\theta )+e}\ \dfrac{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \hat{{\hat{Y}}}(x,\theta ;\delta ) dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx} d\mu (\theta ) \end{aligned}$$

and therefore, for all $\delta <{\tilde{\delta }}(e)$:

$$\begin{aligned}&\int _{\theta \in \Theta } \dfrac{\left( 1-\dfrac{\upsilon _y(Y(W(w,\theta ),\theta );W(w,\theta ),\theta )}{u^\prime (C(W(w,\theta ),\theta ))}\right) f(W(w,\theta )|\theta )+e}{\upsilon _{yw}(Y(W(w,\theta ),\theta );W(w,\theta ),\theta )-e} d\mu (\theta ) \\&\quad \le \int _{\theta \in \Theta } \dfrac{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \left( 1-\dfrac{\upsilon _y(Y(x,\theta );x,\theta )}{u^\prime (C(x,\theta ))}\right) \hat{{\hat{Y}}}(x,\theta ;\delta ) f(x|\theta )dx}{\int _{W(w-\delta ,\theta )}^{W(w,\theta )} \upsilon _{yw}\left( Y(x,\theta );x,\theta \right) \ \hat{{\hat{Y}}}(x,\theta ;\delta )\ dx} d\mu (\theta ) \\&\quad \le \int _{\theta \in \Theta } \dfrac{\left( 1-\dfrac{\upsilon _y(Y(W(w,\theta ),\theta );W(w,\theta ),\theta )}{u^\prime (C(W(w,\theta ),\theta ))}\right) f(W(w,\theta )|\theta )-e}{\upsilon _{yw}(Y(W(w,\theta ),\theta );W(w,\theta ),\theta )+e} d\mu (\theta ). \end{aligned}$$

Since $e$ can be chosen arbitrarily small, both bounds converge to the same limit. Hence, the left-hand side of (27) is equal to the left-hand side of (17). $\square$

A.3 Proof of Lemma 5

With one-dimensional heterogeneity, we only consider within-group incentive constraints. Adopting a first-order approach, only (8a) is considered when building up the Hamiltonian:

$$\begin{aligned} \left( Y(w,\theta )-\mathscr {C}\left( Y(w,\theta ),U(w,\theta );w,\theta \right) +\dfrac{\Phi \left( U(w,\theta );w,\theta \right) }{\lambda } \right) \cdot f(w|\theta )-q(w|\theta )\cdot v_w\left( Y(w,\theta );w,\theta \right) . \end{aligned}$$

where $Y(w,\theta )$ and $U(w,\theta )$ are the control and state variables respectively. Using (12), the necessary conditions are:

$$\begin{aligned} 0= & {} \left( 1-\dfrac{v_y\left[ w,\theta \right] }{u^\prime \left[ w,\theta \right] }\right) \cdot f(w|\theta )- q(w|\theta )\cdot v_{yw}\left[ w,\theta \right] \end{aligned}$$

(28a)

$$\begin{aligned} -{\dot{q}}\left( w|\theta \right)= & {} \left( \dfrac{\Phi _U\left[ w,\theta \right] }{\lambda }-\dfrac{1}{u^\prime \left[ w,\theta \right] }\right) \cdot f(w|\theta ) \end{aligned}$$

(28b)

$$\begin{aligned} 0= & {} q(0|\theta ) \end{aligned}$$

(28c)

$$\begin{aligned} 0= & {} \underset{w\rightarrow \infty }{\lim }q(w|\theta ). \end{aligned}$$

(28d)

Combining (28b) with (28d) leads to

$$\begin{aligned} q(w|\theta )=\int _w^\infty \left( \dfrac{\Phi _U\left[ \omega ,\theta \right] }{\lambda }-\dfrac{1}{u^\prime \left[ \omega ,\theta \right] }\right) \cdot f(\omega |\theta )d\omega . \end{aligned}$$

(28e)

Combining (3), (2), (28a) and (28e) leads to (18a). Combining (28c) with (28e) leads to (18b).
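To see how the co-state in (28e) and the transversality condition (28c) interact, the sketch below evaluates $q(w|\theta )$ numerically for a single group. Everything specific in it is an assumption made for illustration: quasi-linear utility (so $u^\prime =1$), an exponential skill density $f(\omega |\theta )=e^{-\omega }$, and a decreasing weight $\Phi _U/\lambda =2e^{-\omega }$ whose population average equals one. With these choices the integral has the closed form $q(w|\theta )=e^{-2w}-e^{-w}$.

```python
import math


def density(w):
    """Assumed conditional skill density f(w | theta) = exp(-w)."""
    return math.exp(-w)


def welfare_weight(w):
    """Assumed Phi_U / lambda, decreasing in skill and averaging to one."""
    return 2.0 * math.exp(-w)


def costate(w, upper=40.0, n=20000):
    """q(w | theta) from (28e) with u' = 1, using the midpoint rule on [w, upper]."""
    h = (upper - w) / n
    total = 0.0
    for i in range(n):
        omega = w + (i + 0.5) * h
        total += (welfare_weight(omega) - 1.0) * density(omega)
    return h * total
```

Here `costate(0.0)` is numerically zero, as (28c) requires, because the assumed weights average to one. For positive skills the co-state is negative, which combined with $v_{yw}<0$ in (28a) gives a positive wedge $1-v_y/u^\prime$ in this example.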

A.4 Proof of Proposition 2

Define a reform of a tax schedule $y\mapsto T(y)$ by its direction, which is a differentiable function $y\mapsto R(y)$ defined on ${\mathbb {R}}_+$, and by its algebraic magnitude $m\in {\mathbb {R}}$. After a reform, the tax schedule becomes $y\mapsto T(y)-m \ R(y)$ and the utility of an individual of type $(w,\theta )$ is:

$$\begin{aligned} U^R(m;w,\theta )\overset{\text {def}}{\equiv }\quad \underset{y}{\max }\quad u(y-T(y)+m\ R(y))-\upsilon (y;w,\theta ). \end{aligned}$$

(29)

We denote by $Y^R(m;w,\theta )$ the income of a worker of type $(w,\theta )$ after the reform; her consumption becomes $C^R(m;w,\theta )=Y^R(m;w,\theta )-T(Y^R(m;w,\theta ))+m\ R(Y^R(m;w,\theta ))$. When $m=0$, we have $Y^R(0;w,\theta )=Y(w,\theta )$ and $C^R(0;w,\theta )=C(w,\theta )$. Applying the envelope theorem to (29), we get:

$$\begin{aligned} \frac{\partial U^R}{\partial m}(m;w,\theta )=u^\prime \left( C^R(m;w,\theta )\right) \ R\left( Y^R(m;w,\theta )\right) . \end{aligned}$$

(30)

Using (3), the first-order condition associated to (29) sets the following expression equal to zero:

$$\begin{aligned} {\mathscr {Y}}^R(y,m;w,\theta ) \overset{\text {def}}{\equiv }1-T^\prime (y)+m\ R^\prime (y)-\mathscr {M}\left( y-T(y)+m\ R(y),y;w,\theta \right) . \end{aligned}$$

(31)

For simplicity, we drop the superscript R and write $\mathscr {Y}_y(Y(w,\theta );w,\theta )$ for $\mathscr {Y}_y^R(Y(w,\theta ),0;w,\theta )$.^{Footnote 11} We define behavioral responses to tax reforms of direction R by applying the implicit function theorem to $\mathscr {Y}^R(y,m;w,\theta )=0$ at $m=0$, which yields:

$$\begin{aligned} \frac{\partial Y^R}{\partial m}(0;w,\theta )=-\frac{\mathscr {Y}^R_{m}(Y(w,\theta ),0;w,\theta )}{\mathscr {Y}_{y}(Y(w,\theta );w,\theta )} \end{aligned}$$

(32)

where:

$$\begin{aligned} {\mathscr {Y}}_y^R(y,m;w,\theta )= & {} -T^{\prime \prime }(y)-\mathscr {M}_y(y-T(y)+m\ R(y),y;w,\theta ) \nonumber \\&-{\mathscr {M}}(y-T(y)+m\ R(y),y;w,\theta )\ {\mathscr {M}}_c(y-T(y)+m\ R(y),y;w,\theta ), \end{aligned}$$

(33a)

$$\begin{aligned} \mathscr {Y}_m^R(y,m;w,\theta )= & {} R^\prime (y)- R(y)\ {\mathscr {M}}_c(y-T(y)+m\ R(y),y;w,\theta ). \end{aligned}$$

(33b)
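The implicit-function formula (32) can be checked numerically in a simple case. The sketch below is an illustration under assumed functional forms, none of which come from the paper: $u(c)=c$, $T(y)=\tau y$, $v(y;w)=(y/w)^{1+1/e}/(1+1/e)$ and reform direction $R(y)=y$. The constants `E`, `TAU` and `W` are hypothetical. With quasi-linear utility $\mathscr {M}_c=0$, so (33b) reduces to $R^\prime (y)=1$.

```python
E = 0.25     # hypothetical elasticity parameter of the disutility function
TAU = 0.3    # hypothetical flat marginal tax rate, T(y) = TAU * y
W = 2.0      # hypothetical skill level


def foc(y, m, w=W):
    """Expression (31) with u(c) = c, R(y) = y and isoelastic disutility."""
    return 1.0 - TAU + m - (y / w) ** (1.0 / E) / w


def solve_income(m, w=W, lo=1e-9, hi=1e3):
    """Bisection on the first-order condition, which is decreasing in y."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid, m, w) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


y0 = solve_income(0.0)
h = 1e-6
foc_y = (foc(y0 + h, 0.0) - foc(y0 - h, 0.0)) / (2 * h)   # derivative in y at m = 0
foc_m = (foc(y0, h) - foc(y0, -h)) / (2 * h)              # derivative in m at m = 0
ift_response = -foc_m / foc_y                             # right-hand side of (32)
direct_response = (solve_income(h) - solve_income(-h)) / (2 * h)
```

Both `ift_response` and `direct_response` approximate the closed-form value $e\,w^{1+e}(1-\tau )^{e-1}$ implied by these assumptions, and `foc_y` is negative, consistent with the strict second-order condition.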

Using (2) and plugging $R(Y(w,\theta ))=0$ and $R^\prime (Y(w,\theta ))=1$ into (33b), the compensated elasticity of earnings (19a) can be rewritten as:

$$\begin{aligned} \varepsilon (w,\theta )=\frac{\mathscr {M}(C(w,\theta ),Y(w,\theta );w,\theta )}{-Y(w,\theta )\ \mathscr {Y}_y(Y(w,\theta );w,\theta ) }>0 \end{aligned}$$

(34a)

which is positive since $\mathscr {Y}_y\left( Y(w,\theta );w,\theta \right) <0$. Plugging $R(Y(w,\theta ))=1$ and $R^\prime (Y(w,\theta ))=0$ into (33b), the income effect (19b) can be rewritten as:

$$\begin{aligned} \eta (w,\theta )= \frac{\mathscr {M}_c(C(w,\theta ),Y(w,\theta );w,\theta )}{\mathscr {Y}_y(Y(w,\theta );w,\theta )} \end{aligned}$$

(34b)

which is negative if leisure is a normal good, since then $\mathscr {M}_c>0$. The elasticity $\alpha (w,\theta )$ of earnings with respect to the skill level can be expressed as:

$$\begin{aligned} \alpha (w,\theta )= \frac{w\ \mathscr {M}_w(C(w,\theta ),Y(w,\theta );w,\theta )}{Y(w,\theta )\ \mathscr {Y}_y(Y(w,\theta );w,\theta )} >0. \end{aligned}$$

(34c)

Dividing (34a) by (34c) we get:

$$\begin{aligned} \frac{\varepsilon (w,\theta )}{\alpha (w,\theta )}=-\frac{v_{y}\left[ w,\theta \right] }{w\cdot v_{yw}\left[ w,\theta \right] }. \end{aligned}$$

(35)
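Equation (35) can be verified on a closed-form example. The sketch below assumes, purely for illustration, $u(c)=c$ and $v(y;w)=(y/w)^{1+1/e}/(1+1/e)$, for which income is $(1-T^\prime )^{e}\,w^{1+e}$ and compensated and uncompensated responses coincide. The names and parameter values are hypothetical. The elasticities are computed by finite differences and compared with the right-hand side of (35).

```python
E = 0.25    # hypothetical elasticity parameter
TAU = 0.3   # hypothetical flat marginal tax rate


def income(w, retention):
    """Solves retention = v_y(y; w) for v(y; w) = (y / w)**(1 + 1/E) / (1 + 1/E)."""
    return retention ** E * w ** (1.0 + E)


def v_y(y, w):
    """Marginal disutility of income for the assumed v."""
    return (y / w) ** (1.0 / E) / w


def v_yw(y, w):
    """Cross derivative of the assumed v, negative as in the text."""
    return -(1.0 + 1.0 / E) * (y / w) ** (1.0 / E) / w ** 2


w, r, h = 2.0, 1.0 - TAU, 1e-6
y = income(w, r)
# epsilon: elasticity of income with respect to the retention rate 1 - T'.
epsilon = (income(w, r + h) - income(w, r - h)) / (2 * h) * r / y
# alpha: elasticity of income with respect to the skill level w.
alpha = (income(w + h, r) - income(w - h, r)) / (2 * h) * w / y
rhs_of_35 = -v_y(y, w) / (w * v_yw(y, w))
```

Under these assumptions `epsilon` is close to `E`, `alpha` is close to `1 + E`, and their ratio matches `rhs_of_35`, which equals `E / (1 + E)`.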

Plugging (34a) into (34b) leads to:

$$\begin{aligned} \eta (w,\theta )= Y(w,\theta ) \cdot \frac{u^{\prime \prime }\left[ w,\theta \right] }{u^{\prime }\left[ w,\theta \right] } \cdot \varepsilon (w,\theta ). \end{aligned}$$

It is then straightforward to obtain:

$$\begin{aligned} {\hat{\eta }}(Y(w,\theta _0))=Y(w,\theta _0)\cdot \frac{u^{\prime \prime }\left[ w,\theta _0\right] }{u^{\prime }\left[ w,\theta _0\right] }\cdot {\hat{\varepsilon }}(Y(w,\theta _0)). \end{aligned}$$

(36)

Let $y\in \mathbb {R}_+$. Since $\mathscr {Y}_y\left( Y(w,\theta );w,\theta \right) <0$, there exists a single skill level w such that $y=Y(w,\theta _0)$. From (2), we know that:

$$\begin{aligned} 1-T^\prime \left[ w,\theta \right] =\frac{v_y\left[ w,\theta \right] }{u^\prime \left[ w,\theta \right] }. \end{aligned}$$

(37)

The term in the left-hand side integral of (14a) can be rewritten as:

$$\begin{aligned} \frac{v_y\left[ W(w,\theta ),\theta \right] }{- W(w,\theta )\ v_{yw}\left[ W(w,\theta ),\theta \right] }\ W(w,\theta ) \ f(W(w,\theta )|\theta )= & {} \frac{\varepsilon \left( W(w,\theta ),\theta \right) }{\alpha \left( W(w,\theta ),\theta \right) }\cdot W(w,\theta ) \ f(W(w,\theta )|\theta )\\= & {} \varepsilon \left( W(w,\theta ),\theta \right) \ Y(w,\theta _0)\ h(Y(w,\theta _0)|\theta ). \end{aligned}$$

The first equality is obtained using Eq. (35). The second equality uses (21). It implies with (22b) that Eq. (14a) can be rewritten as:

$$\begin{aligned} \frac{T^\prime \left[ w,\theta _0\right] }{1-T^\prime \left[ w,\theta _0\right] }\cdot {\hat{\varepsilon }}\left( Y(w,\theta _0)\right) \cdot Y(w,\theta _0)\cdot \hat{h}(Y(w,\theta _0)) = J(w) \end{aligned}$$

(38)

where J(w) is defined by the right-hand side of (14a). The derivative of $J(\cdot )$ is ${\dot{J}}(w)$, where:

$$\begin{aligned}&{\dot{J}}(w)={\dot{C}}(w,\theta _0) \frac{u^{\prime \prime }\left[ w,\theta _0\right] }{ u^{\prime }\left[ w,\theta _0\right] } J(w)\\&\qquad + \int \limits _{\theta \in \Theta }\left\{ \frac{ \Phi _U\left[ W(w,\theta ),\theta \right] \ u^{\prime }\left[ W(w,\theta ),\theta \right] }{\lambda } -1\right\} {\dot{W}}(w,\theta )\ f\left( W(w,\theta )|\theta \right) d\mu (\theta ) \\&\quad = \int _{\theta \in \Theta }\left\{ g\left( W(w,\theta ),\theta \right) -1\right\} \cdot {\dot{W}}(w,\theta )\cdot f\left( W(w,\theta )|\theta \right) \cdot d\mu (\theta ) +{\dot{C}}(w,\theta _0) \cdot \frac{ u^{\prime \prime }\left[ w,\theta _0\right] }{ u^{\prime }\left[ w,\theta _0\right] } \cdot J(w) \end{aligned}$$

where (20) has been used. Differentiating both sides of (9) and of $C(w,\theta _0)=Y(w,\theta _0)-T\left( Y(w,\theta _0)\right)$ with respect to the skill w, we obtain:

$$\begin{aligned} {\dot{W}}(w,\theta )=\frac{{\dot{Y}}\left( w,\theta _0\right) }{{\dot{Y}}\left( W(w,\theta ),\theta \right) } \qquad \text {and}\qquad {\dot{C}}(w,\theta _0) =\left( 1-T^\prime \left( Y(w,\theta _0)\right) \right) \ {\dot{Y}}(w,\theta _0). \end{aligned}$$
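
The intermediate step can be sketched as follows, assuming (as the change of variables in A.5 indicates) that (9) states $Y(W(w,\theta ),\theta )\equiv Y(w,\theta _0)$ and that ${\dot{Y}}\left( W(w,\theta ),\theta \right) \ne 0$. Applying the chain rule to each identity gives:

$$\begin{aligned} {\dot{Y}}\left( W(w,\theta ),\theta \right) \ {\dot{W}}(w,\theta )&={\dot{Y}}(w,\theta _0), \\ {\dot{C}}(w,\theta _0)&={\dot{Y}}(w,\theta _0)-T^\prime \left( Y(w,\theta _0)\right) \ {\dot{Y}}(w,\theta _0). \end{aligned}$$

Dividing the first line by ${\dot{Y}}\left( W(w,\theta ),\theta \right)$ and factoring ${\dot{Y}}(w,\theta _0)$ out of the second yields the two expressions stated above.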

We thus obtain:

$$\begin{aligned} {\dot{J}}(w)=\left( \int \limits _{\theta \in \Theta }\left\{ g\left( W(w,\theta ),\theta \right) -1\right\} \frac{f\left( W(w,\theta )|\theta \right) }{{\dot{Y}}(W(w,\theta ),\theta )} d\mu (\theta ) + \left( 1-T^\prime \left[ w,\theta _0\right] \right) \frac{u^{\prime \prime }\left[ w,\theta _0\right] }{u^{\prime }\left[ w,\theta _0\right] } J(w) \right) {\dot{Y}}(w,\theta _0). \end{aligned}$$

Using (21) and (38), ${\dot{J}}(w)$ can be rewritten as:

$$\begin{aligned} {\dot{J}}(w)= & {} \left( \int \limits _{\theta \in \Theta }\left\{ g\left( W(w,\theta ),\theta \right) -1\right\} h\left( Y(w,\theta _0)|\theta \right) d\mu (\theta ) \right. \\&+ \left. T^\prime \left( Y(w,\theta _0)\right) Y(w,\theta _0) \frac{u^{\prime \prime }\left( C(w,\theta _0)\right) }{ u^{\prime }\left( C(w,\theta _0)\right) } {\hat{\varepsilon }}(Y(w,\theta _0)) \hat{h}(Y(w,\theta _0)) \right) {\dot{Y}}(w,\theta _0). \end{aligned}$$

Using (36) and (22d), we get:

$$\begin{aligned} -{\dot{J}}(w)= & {} \left\{ 1-\hat{g}(Y(w,\theta _0))- {\hat{\eta }}(Y(w,\theta _0))\cdot T^\prime \left( Y(w,\theta _0)\right) \right\} \cdot \hat{h}\left( Y(w,\theta _0)\right) \cdot {\dot{Y}}(w,\theta _0). \end{aligned}$$

As $J(w)=\int _{x\ge w} (-{\dot{J}}(x))dx$, we get

$$\begin{aligned} J(w)= & {} \int _{x \ge w} \left\{ 1-\hat{g}(Y(x,\theta _0))- {\hat{\eta }}(Y(x,\theta _0))\cdot T^\prime \left( Y(x,\theta _0)\right) \right\} \cdot \hat{h}\left( Y(x,\theta _0)\right) \cdot {\dot{Y}}(x,\theta _0)\cdot dx. \end{aligned}$$

Changing variables by setting $z=Y(x,\theta _0)$, we get

$$\begin{aligned} J(w)=\int _{z \ge Y(w,\theta _0)} \left\{ 1-\hat{g}(z)- {\hat{\eta }}(z)\cdot T^\prime \left( z\right) \right\} \cdot \hat{h}\left( z\right) \cdot dz. \end{aligned}$$

(39)
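
The substitution can be spelled out as follows. This is a sketch that assumes $Y(\cdot ,\theta _0)$ is increasing and differentiable, as the one-to-one mapping between y and w at the start of this proof suggests:

$$\begin{aligned} z=Y(x,\theta _0),\qquad dz={\dot{Y}}(x,\theta _0)\ dx,\qquad x\ge w \ \Longleftrightarrow \ z\ge Y(w,\theta _0). \end{aligned}$$

Every occurrence of $Y(x,\theta _0)$ in the integrand then becomes z, and the factor ${\dot{Y}}(x,\theta _0)\ dx$ is absorbed into dz.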

Plugging (39) into (38) gives (23a). Combining (14b) and (39) leads to (23b).

1.5 A.5 Proof of Proposition 3

Let us denote

$$\begin{aligned} K(w) \overset{\text {def}}{\equiv }\iint \limits _{\theta \in \Theta ,x\ge W(w,\theta )} \left( \frac{1}{u^{\prime }(C(x,\theta ))}\ -\frac{\Phi _{U}(U(x,\theta );x,\theta )}{\lambda }\right) f(x|\theta )dx\ d\mu (\theta ) \end{aligned}$$

(40)

that is, the right-hand side of (14a) at the skill level w divided by $u^\prime \left( Y(w,\theta _0)-T(Y(w,\theta _0))\right)$. According to Proposition 1, Eq. (14a) and $v_y>0>v_{yw}$, the sign of $T^\prime (Y(w,\theta _0))$ is the sign of K(w).

Under utilitarian preferences, $\Phi _{U}=1$. Changing the variable of integration in (40) from x to t, with $x=W(t,\theta )$ (i.e. $Y(x,\theta )\equiv Y(t,\theta _0)$ and $C(x,\theta )\equiv C(t,\theta _0)$ according to (9)), we get:

$$\begin{aligned} K(w) = \int _{t\ge w}\left( \frac{1}{u^{\prime }(C(t,\theta _0))}-\frac{1}{\lambda }\right) \ \left( \int _{\theta \in \Theta } {\dot{W}}(t,\theta )\ f\left( W(t,\theta )\vert \theta \right) d\mu \left( \theta \right) \right) \ dt \end{aligned}$$

The derivative of K(w) has the sign of $1/\lambda -1/u^{\prime }(C(w,\theta _0))$, which is decreasing in w because of the concavity of $u(\cdot )$. Moreover, $\underset{w\rightarrow \infty }{\lim }K(w)=0$ and Eq. (14b) implies that $K(0)=0$. Therefore, $K(\cdot )$ first increases and then decreases. Since it starts and ends at zero, it is positive for all (interior) skill levels. So, optimal marginal tax rates are positive.
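
The sign argument can be illustrated numerically. The sketch below is not taken from the paper: it assumes log utility (so $1/u^{\prime }(c)=c$), a hypothetical consumption path $C(t,\theta _0)=1+t$, and a hypothetical pooled density $e^{-t}$ standing in for the inner integral over $\theta$. It picks $1/\lambda$ so that $K(0)=0$, mimicking the role of (14b). Under these assumptions the closed form is $K(w)=w\,e^{-w}$, which is zero at both ends and positive in between.

```python
import numpy as np

def tail_integral(values, grid):
    """Trapezoid approximation of the integral of `values` from grid[i] to grid[-1], for every i."""
    areas = 0.5 * (values[:-1] + values[1:]) * np.diff(grid)
    # Reverse cumulative sum: entry i adds up all panels to the right of grid[i].
    return np.concatenate([np.cumsum(areas[::-1])[::-1], [0.0]])

# Hypothetical primitives (assumptions for illustration, not from the paper).
t = np.linspace(0.0, 40.0, 40001)   # skill grid, truncated at 40
consumption = 1.0 + t               # assumed increasing C(t, theta_0)
mass = np.exp(-t)                   # assumed pooled density over t
inv_marginal_utility = consumption  # 1/u'(c) = c under log utility

# Choose 1/lambda so that K(0) = 0, mimicking the role of (14b).
inv_lambda = (tail_integral(inv_marginal_utility * mass, t)[0]
              / tail_integral(mass, t)[0])

# K(w) = integral over t >= w of (1/u'(C(t)) - 1/lambda) * mass(t) dt.
K = tail_integral((inv_marginal_utility - inv_lambda) * mass, t)
peak = t[np.argmax(K)]  # skill level at which the integrand changes sign
```

In this example the integrand is negative at low skills and positive at high skills, so the tail integral `K` rises from zero, peaks where the integrand crosses zero, and falls back to zero, which is the single-crossing pattern the proof relies on.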

Under maximin, one has $U(x,\theta )>U(0,\theta )$ for all $x>0$ from (8a). Therefore, within each group, the most deserving individuals are those with skill $w=0$. The maximin objective implies $\Phi _{U}\left[ x,\theta \right] =0$ for all $x>0$. Hence, Eq. (40) simplifies to:

$$\begin{aligned} K(w) = \iint \limits _{\theta \in \Theta ,x\ge W(w,\theta )} \frac{1}{u^{\prime }(C(x,\theta ))}\ f(x|\theta )dx\ d\mu (\theta ) \end{aligned}$$

for all $w>0$, which is always positive, thereby leading to positive optimal marginal tax rates.

About this article

Cite this article

Jacquet, L., Lehmann, E. Optimal tax problems with multidimensional heterogeneity: a mechanism design approach. Soc Choice Welf 60, 135–164 (2023). https://doi.org/10.1007/s00355-021-01349-4

Received: 19 January 2021
Accepted: 05 May 2021
Published: 10 July 2021
Issue Date: January 2023
DOI: https://doi.org/10.1007/s00355-021-01349-4
