1 Introduction

Lifelong learning is high on the policy agenda. Societal and technological changes increase the need to invest in lifelong learning. For example, effective retirement ages in developed economies have risen dramatically over the past decade.Footnote 1 Also, technological change and globalization seem to reduce the life spans of sectors, firms and products (Goos et al. 2014; Autor et al. 2015). As a result, individuals are more likely to switch jobs and careers during their working life and are more likely to switch tasks within a given job. In the face of these changes, maintaining and investing in human capital during working life becomes increasingly important. At the same time, policymakers worry that individuals and/or their employers underinvest in lifelong learning, due to, for example, holdup problems (Malcomson 1997, 1999).Footnote 2 Although it is difficult to determine empirically whether there is underinvestment in lifelong learning in general, policymakers seem particularly worried about certain subgroups of the population that have a distaste for formal learning, such as lower educated individuals (see, e.g., Eurostat 2016) and workers in sectors that seem particularly ‘at risk’ due to technological change and globalization. Policymakers therefore try to mitigate potential underinvestment in lifelong learning, by providing financial support to employees and their employers that undertake lifelong learning, regulating and funding post-initial education and training, informing employees and their employers about the possibilities for lifelong learning and scrutinizing labor market regulations for adverse side effects on lifelong learning. Recently, a literature has emerged that investigates the effectiveness of different policy measures. However, so far only direct financial support measures have been investigated systematically and even then the empirical evidence on the effectiveness of this type of policy remains scarce. 
On the prospects for tax incentives to stimulate lifelong learning, we know very little.

In this paper, we study whether a tax deduction for lifelong learning can stimulate investment in lifelong learning. Specifically, we consider the effects of a tax deduction in the Netherlands, where individuals can deduct their expenditures on post-initial work-related training and education from their pre-tax personal income. This includes tuition fees, books, necessary clothing and depreciation on a computer when the computer is necessary for a work-related course. Jumps in marginal tax rates provide exogenous variation in the financial incentives to undertake lifelong learning. We study the effect of this exogenous variation on the probability of filing lifelong learning expenditures and on the amount of lifelong learning expenditures filed, for different subgroups and at different points in the income distribution.

We employ a regression kink design to estimate the impact of the tax deduction on lifelong learning expenditures. The Dutch income tax system features two discontinuous jumps in the statutory marginal tax rate in our data period. Moving from the left to the right of the discontinuity, the upward jump in the marginal tax rate implies a lower effective cost for lifelong learning to the right of the discontinuity. This results in kinks in the net financial cost of lifelong learning. Because we observe (almost) no bunching for singles around the kinks, we can apply a regression kink design for singles.Footnote 3 For couples, we show that bunching at the kinks due to the shifting of tax deductions between fiscal partners (to reduce their tax burden) complicates the analysis.Footnote 4 We use a high-quality administrative dataset of tax returns on the universe of Dutch taxpayers for the years 2006–2013. This dataset provides information on all relevant earnings activities of the Dutch population and also contains all the information on tax deductions.

Our main findings are as follows. First, at the kink for low-income singles (approximately 18 thousand euro) we find no statistically significant effect on the probability to file lifelong learning expenditures. However, at the kink for high-income singles (approximately 52 thousand euro) we find an increase in the probability to file for lifelong learning expenditures of about 10%, though the effect on the average amount filed is not statistically significant. Second, when looking at the effects for subgroups of high-income singles, we find larger effects for natives, the higher educated and individuals in the interval 40–45 years of age. Third, we show that shifting of tax deductions between fiscal partners results in bunching around the kinks for couples, inflating the effects for primary earners (the partner with the highest income) and resulting in counterintuitive negative effects for secondary earners (the partner with the lowest income).

We make a number of contributions to the literature. We contribute to the scarce literature on the causal effects of tax incentives for lifelong learning. We build on the analysis by Leuven and Oosterbeek (2012), but make substantial improvements. The authors use a sample of about 100 thousand Dutch tax returns, of which only a subsample of individuals is close to the relevant tax bracket thresholds. Our paper uses about 10 million tax returns. Furthermore, we estimate separate regressions for singles and couples, which turns out to be very important for the results. Also, we use the more novel regression kink design approach, which seems more appropriate than the regression discontinuity design used in Leuven and Oosterbeek (2012) given the kink in the financial incentive driving the result. Finally, our results for couples provide another striking example of how manipulation of the running variable may give rise to results that are substantially biased.

The only other paper, to the best of our knowledge, to directly study the effectiveness of tax stimuli for lifelong learning, but then targeted at employers, is Leuven and Oosterbeek (2004). They find that a tax advantage for employers for training activities of their workers over the age of 40 only shifted training expenses from employees just below 40 to those just over 40, with little to no effect on overall training expenses.

Furthermore, we contribute to the general literature on the impact of financial incentives on lifelong learning. These papers typically find positive but limited effects. For example, Schwerdt et al. (2012) investigate a general voucher program in Switzerland, Hidalgo et al. (2014) look at a voucher program for specific sectors in the Netherlands and Görlitz and Tamm (2016) analyze a large co-financing instrument in Germany. In all cases, employees could pick a short training program at lower than regular costs. Training participation increased by between 13 and 20 percentage points due to these subsidies. Furthermore, Schwerdt et al. (2012) also consider heterogeneous treatment effects and find that lower educated individuals seem to benefit somewhat more by participating in additional training in terms of higher wages. Other papers in this literature investigate policies in which employers receive (part of) the subsidy directly (Görlitz 2010; Abramovsky et al. 2011; Van der Steeg and van Elk 2015).

Our paper also relates to a relatively new literature studying the effects of tax incentives on initial education (Dynarski and Scott-Clayton 2016). In countries with many private schools, tuition expenses can be substantial and sometimes the tax authorities are subsidizing these expenditures directly. Also, savings for future college tuition expenditures are in certain cases deductible. These tax subsidies are both meant to increase private school and college attendance and to give income support to low- and middle-income families with children. A few papers have been able to identify causal effects on higher education participation, and these papers found small effects of these tax subsidies at best. Bulman and Hoxby (2015) find negligible effects on several outcomes in higher education of three tax credits for households who pay tuition and fees. Hoxby and Bulman (2016) argue that this might be due to the price inelasticity of marginal households, but that limited knowledge about the deduction and the delay in receiving the financial benefit also matters.

The outline of the paper is as follows. Section 2 gives a brief description of relevant elements of the Dutch income tax system and the tax deduction for lifelong learning. Section 3 outlines a stylized life cycle model that makes predictions about the relationship between the tax deduction and marginal tax rates and investments in lifelong learning, which motivates the setup of our empirical analysis. Section 4 discusses our empirical methodology. A description of the dataset, including descriptive statistics, is given in Sect. 5. Section 6 presents the main results as well as a number of robustness checks. Section 7 discusses our findings and presents supplementary evidence that the effects we measure are at least in part true effects on lifelong learning and not merely reporting effects. Section 8 concludes. An online appendix contains supplementary material.

2 Institutional setting

We exploit differences in marginal tax rates to identify the effect of the tax deduction on lifelong learning expenditures in the Netherlands. In this section, we explain the tax deduction for lifelong learning and outline the relevant characteristics of the Dutch income tax system for our sample period (2006–2013).

The tax deduction for lifelong learning is an income tax deduction for out-of-pocket expenditures on post-initial work-related training and education. The financial gain of the tax deduction is equal to the expenditures (minus a threshold) multiplied by the marginal income tax rate. The marginal income tax rate is a stepwise increasing function of individual taxable income.Footnote 5 Figure 1 shows the marginal tax rates by taxable income levels for the years 2006, 2012 and 2013.Footnote 6 The difference between the tax rates in the first and second bracket is approximately 8 percentage points over the period 2006 to 2012 and drops to 5 percentage points following the increase in the marginal tax rate for the first bracket in 2013. The difference between the tax rates in the third and fourth bracket is 10 percentage points throughout the entire sample period. The end of the tax brackets has increased gradually over the period 2006–2012, due to indexation with inflation. In 2013, the end of the first bracket increased somewhat, while the end of the second and third brackets decreased somewhat.

Fig. 1 Marginal tax rates: 2006, 2012 and 2013. The figure shows the marginal tax rates (MTRs) by taxable income for the years 2006, 2012 and 2013. In 2006, the change in the MTR is 7.3 percentage points at the end of the first tax bracket and 10 percentage points at the end of the third tax bracket. In 2012, the change in the MTR is 8.85 percentage points at the end of the first tax bracket and 10 percentage points at the end of the third tax bracket. In 2013, the change in the MTR is 5 percentage points at the end of the first tax bracket and 10 percentage points at the end of the third tax bracket

Lifelong learning expenditures are only deductible if the goal is to stimulate human capital formation and/or to improve one’s labor market position. This includes tuition fees, books, necessary clothing and depreciation on a computer when the computer is necessary for a work-related course. Living and travel expenses are excluded, as are expenditures on courses for strictly personal development, ‘hobbies’ and materials used for self study. Furthermore, untaxed benefits for lifelong learning, such as a study grant from the government or a private institution, or a reimbursement from an employer for training expenses, should be subtracted from the deducted amount. Over the period 2006–2012, a threshold of 500 euro is applied to all deductible lifelong learning expenditures in a given year. The maximum deductible amount each year is 15,000 euro.
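The rules above can be summarized in a short sketch. It is illustrative only: the function names are hypothetical, and the order in which reimbursements, the threshold and the maximum are applied is an assumption that the text does not spell out. It also assumes that the whole deduction falls within a single tax bracket.

```python
def deductible_amount(expenses, untaxed_benefits=0.0,
                      threshold=500.0, cap=15_000.0):
    """Deductible lifelong learning expenditures (euro) for one tax year.

    Assumption (not confirmed by the text): untaxed benefits are netted
    out first, then the threshold is subtracted, and the cap applies last.
    """
    net = max(expenses - untaxed_benefits, 0.0)
    return min(max(net - threshold, 0.0), cap)


def tax_gain(expenses, marginal_rate, untaxed_benefits=0.0):
    """Financial gain when the whole deduction is taken at one marginal rate."""
    return marginal_rate * deductible_amount(expenses, untaxed_benefits)


# 2500 euro of expenses at a 42% marginal rate: 0.42 * (2500 - 500).
print(round(tax_gain(2500.0, 0.42)))
```

Under these assumptions, expenses below the threshold yield no gain, and the gain rises one-for-one with the marginal tax rate for a given deductible amount.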

The deduction for lifelong learning expenses is entered when filing taxes each year. Most people file taxes through a software program provided by the tax authorities. Other deductions are shown on the same page as the deduction for lifelong learning expenses, which makes each of them more salient for people who qualify for any of the deductions.Footnote 7 For each possible deduction, there is a short explanation and a box where taxpayers can enter the amount. Only when individuals are audited do they have to provide proofs of expenses for any deduction they filed. This trust-based system could provide opportunities for tax evasion and fraud. Fines for fraud or errors add up to three times the amount that was wrongly filed. Unfortunately, we have no data on the probability of being audited. We also do not observe whether individuals use a tax advisor to file their taxes.

The deductible for lifelong learning expenditures changed quite substantially in 2013. First, the threshold was reduced from 500 euro to 250 euro. Second, the deductible became limited to tuition fees and compulsory additional learning tools, such as books and protection materials. This meant, for example, that the depreciation of a computer was no longer deductible. These changes are the main reason (next to stable differences in marginal tax rates) why we limit ourselves to the 2006–2012 period in the main analyses.

While training expenditures are typically individual expenditures, fiscal partners can choose whether they deduct the expenditures from their own taxable income or whether they transfer the expenditures to their partner who can then subtract it from his or her taxable income. To minimize the household tax burden, partners typically shift the tax deduction to the partner that has the higher marginal tax rate (see Appendix F).Footnote 8

3 Theoretical framework

Following Leuven and Oosterbeek (2012), we illustrate the basic mechanism via which a tax deduction for lifelong learning expenditures in combination with differences in marginal tax rates affects the investment in lifelong learning in a stylized life cycle model.

Lifetime utility depends on consumption in periods 1 and 2: \(U(C_1,C_2)\). We assume that the utility function is additively separable in period 1 utility and period 2 utility, and that period 2 utility is discounted by a factor \(1/(1+\delta )\), where \(\delta\) is the subjective discount rate:

$$\begin{aligned} U(C_1,C_2) = U(C_1) + \frac{1}{1+\delta }U(C_2). \end{aligned}$$
(1)

Consumption in period 1 depends on gross income \(w_1\), lifelong learning expenditures L, the tax rate \(\tau _1\) and savings S:

$$\begin{aligned} C_1 = (1-\tau _1) (w_1 - L) - S, \end{aligned}$$
(2)

noting that lifelong learning expenditures are deducted from gross income rather than net income. Also note that for simplicity we assume that agents face a flat tax system. Consumption in period 2 then depends on gross income \(w_2\), the return on lifelong learning expenditures, the tax rate \(\tau _2\) and the return on period 1 savings:

$$\begin{aligned} C_2 = (1-\tau _2) (w_2 + f(L)) + (1+r)S, \end{aligned}$$
(3)

where f(L) is the return on lifelong learning expenditures in terms of a higher gross period 2 income, for which we assume \(f(0)=0\), \(f' > 0\) and \(f''<0\), and r is the return on savings.

Maximizing lifetime utility with respect to lifelong learning expenditures and savings gives, respectively:

$$\begin{aligned} \frac{\partial U(.)}{\partial L}= & {} 0 \Rightarrow U'_{C_1} (-(1-\tau _1)) + \frac{1}{1+\delta }U'_{C_2}(1-\tau _2)f'(L) = 0, \end{aligned}$$
(4)
$$\begin{aligned} \frac{\partial U(.)}{\partial S}= & {} 0 \Rightarrow U'_{C_1} (-1) + \frac{1}{1+\delta }U'_{C_2}(1+r) = 0. \end{aligned}$$
(5)

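Rearranging (4) and (5) and dividing one by the other eliminates the marginal utilities and, with them, the discount factor. Solving for L then gives the implicit function:

$$\begin{aligned} f'(L)=\frac{(1-\tau _1)}{(1-\tau _2)}(1+r). \end{aligned}$$
(6)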

In the empirical application below, we compare individuals with a lower \(\tau _1\), whose taxable income is just below a tax bracket threshold, with individuals with a higher \(\tau _1\), whose taxable income is just above that threshold. Equation (6) shows that, ceteris paribus, individuals with a higher \(\tau _1\) will invest more in lifelong learning than individuals with a lower \(\tau _1\). Indeed, when \(\tau _1\) is higher, the right-hand side of (6) is lower. Hence, at the optimum, \(f'(L)\) will be lower as well, and given that \(f''(L) < 0\), this implies that L should be higher. Intuitively, the investment cost of lifelong learning is lower when \(\tau _1\) is higher.Footnote 9 In Appendix Figure D.1, we show that ceteris is indeed very close to paribus as individuals just below and just above income tax bracket thresholds are very similar in observable characteristics (and hence in r and \(\delta\) in terms of our simple stylized model). They also face very similar tax rates \(\tau _2\) in the years after the lifelong learning investment; see Figure B.1 for singles using the deduction in 2006.Footnote 10
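The comparative static can be illustrated numerically. The sketch below is not part of the model in the text: it assumes a hypothetical return function \(f(L) = a \ln (1+L)\), which satisfies \(f(0)=0\), \(f'>0\) and \(f''<0\), and it holds the factor on the right-hand side of Eq. (6) that does not depend on the tax rates fixed at a hypothetical value `k`. The values of `a` and `k` are made up for illustration.

```python
def optimal_investment(tau1, tau2, a=2500.0, k=1.03):
    """Solve f'(L) = k * (1 - tau1) / (1 - tau2) for L, with f(L) = a*ln(1+L).

    a and k are hypothetical: k stands in for the part of the right-hand
    side of Eq. (6) that does not depend on the tax rates.
    """
    required_return = k * (1.0 - tau1) / (1.0 - tau2)
    # f'(L) = a / (1 + L); if f'(0) is below the required return, L = 0.
    return max(a / required_return - 1.0, 0.0)


# Same period 2 tax rate, period 1 rate just below vs. just above a threshold.
low = optimal_investment(tau1=0.42, tau2=0.42)
high = optimal_investment(tau1=0.52, tau2=0.42)
print(round(low), round(high))
```

With these illustrative numbers, the individual facing the higher \(\tau _1\) chooses a larger L, in line with the argument above; the size of the difference depends entirely on the assumed curvature of f.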

4 Empirical methodology

The tax deduction introduces a kink in the effective costs of lifelong learning expenditures. Therefore, we prefer to use a regression kink design, provided that the conditions for using a regression kink design hold.Footnote 11 A crucial condition for a regression kink design is that there is no bunching around the kink. Below, we show that this condition holds for singles, but not for couples. Hence, in our main analysis we focus on singles.

We exploit the differences in the marginal tax rates in a regression kink design to identify the causal effect of the tax deduction on lifelong learning expenditures. The general idea is that the outcome variable is a continuous function of income in the absence of the tax deduction, but that the tax deduction in combination with a discontinuity in the marginal tax rate creates an exogenous kink in the effective costs of lifelong learning, which potentially results in a kink in the use of, and expenditures on, lifelong learning as well.Footnote 12

Figure 2 illustrates the kink when going from the third to the fourth bracket, located at a taxable income of 52,000 euro. Suppose that an individual has 2500 euro lifelong learning expenditures. The marginal tax rate to the left of the kink is 42%. The effective costs of the lifelong learning expenditures to the left of the kink then are \((1 - 0.42)*(2500 - 500) + 500 = 1660\) euro. When the individual has taxable income (before the tax deduction is applied) in the fourth tax bracket, the effective costs of lifelong learning expenditures are lower. For example, at 1000 euro to the right of the threshold, the first 1000 euro of the deduction is taken at the 52% rate and the remaining 1000 euro at the 42% rate, so that the effective costs of lifelong learning are \((1-0.52)*(2500-1500)+(1-0.42)*(1500-500)+500=1560\) euro, or \(6\%\) less than on the left-hand side of the threshold. Finally, for individuals with a taxable income 2000 euro to the right of the threshold and beyond, the effective costs of lifelong learning are \((1-0.52)*(2500-500)+500=1460\) euro, or 12% less than on the left-hand side of the threshold. This suggests running a regression kink design using observations up to the point where the financial gain becomes constant again, and for symmetry we then also use observations from the same distance to the kink on the left-hand side. Hence, we use observations from the interval [50,000, 54,000], indicated by the dashed lines in Figure 2.

Fig. 2 Effective costs of 2500 euro lifelong learning expenditures. The figure shows the effective costs of 2500 euro on lifelong learning expenditures by taxable income (before the deduction for lifelong learning expenditures) in 2006 at ‘kink 2’ (end of the third tax bracket). Individuals can only deduct lifelong learning expenditures in excess of 500 euro. (The first 500 euro are not deductible.) We see that individuals with a taxable income above 54 thousand euro have the lowest effective costs for lifelong learning expenditures. Given their marginal tax rate of 52%, they pay only 48% of the remaining 2000 euro that is deductible from taxable income. Individuals with a taxable income of 52 thousand euro or less have a marginal tax rate of 42% and pay 58% of the remaining 2000 euro. Individuals with a taxable income between 52 and 54 thousand euro pay between 48 and 58% of the remaining 2000 euro
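The effective cost schedule around kink 2 can be sketched as a small function. The function name and signature are hypothetical; the default parameter values (a kink at 52,000 euro, rates of 42% and 52%, a threshold of 500 euro) are the ones used in the example in the text. The sketch ignores the lower bound of the third bracket, so it only applies to deductions that do not reach below that bracket.

```python
def effective_cost(income, expenses=2500.0, kink=52_000.0,
                   rate_below=0.42, rate_above=0.52, threshold=500.0):
    """Net cost (euro) of `expenses`, given taxable income before the deduction."""
    deduction = max(expenses - threshold, 0.0)
    # Part of the deduction that is taken off income above the kink.
    at_high_rate = min(max(income - kink, 0.0), deduction)
    at_low_rate = deduction - at_high_rate
    tax_saved = rate_above * at_high_rate + rate_below * at_low_rate
    return expenses - tax_saved


for income in (51_000, 53_000, 54_000):
    print(income, round(effective_cost(income)))
# 51000 1660
# 53000 1560
# 54000 1460
```

The cost is flat to the left of the kink, falls linearly over the next 2000 euro of income and is flat again beyond that point, which is the shape that motivates the interval used in the estimation.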

We estimate the effect of the tax deduction on (i) the probability of filing lifelong learning expenditures, and (ii) the amount of lifelong learning expenditures filed (including the zeros), using the following linear model:

$$\begin{aligned} Y_{it} = \alpha + \beta R_{it} + \delta 1(R_{it}>0) * R_{it} + \gamma X_{it} + \eta _t + \epsilon _{it}, \end{aligned}$$
(7)

where i denotes the individual and t denotes the calendar year. \(R_{it}\) is (recentered) taxable income before deducting lifelong learning expenditures; the parameter \(\delta\) measures the treatment effect, that is, the change in the slope at the kink. \(X_{it}\) is a set of demographic control variables, \(\eta _{t}\) are year fixed effects, and \(\epsilon _{it}\) is the error term. To account for correlation in the error term at a level higher than the individual, we use cluster-robust standard errors for income groups of 100 euro (Bertrand et al. 2004; Donald and Lang 2007).Footnote 13
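A minimal sketch of the slope-change estimate in Eq. (7) is given below. It is a simplification and not the estimation code behind the results: it omits the control variables, the year fixed effects and the cluster-robust standard errors, and the function names and the synthetic data are hypothetical.

```python
def ols(rows, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * p for x, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def kink_estimate(r, y):
    """Return (alpha, beta, delta) from Y = alpha + beta*R + delta*1(R>0)*R."""
    design = [[1.0, ri, ri if ri > 0 else 0.0] for ri in r]
    return ols(design, y)


# Synthetic, noise-free data: recentered income in thousands of euro and a
# take-up rate whose slope increases by 0.0038 at the kink (made-up values).
r = [i / 10 for i in range(-20, 21)]
y = [0.035 + 0.001 * ri + 0.0038 * max(ri, 0.0) for ri in r]
alpha, beta, delta = kink_estimate(r, y)
print(round(delta, 4))
```

Because the regressor \(1(R_{it}>0) * R_{it}\) is zero to the left of the kink, \(\delta\) picks up only the change in slope, while the level of the outcome is restricted to be continuous at the kink.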

A potential threat to the identification using the regression kink design is that individuals may be able to manipulate their income. Indeed, as suggested by the analysis of Saez (2010), individuals may bunch in income at kink points in the budget constraint due to changes in marginal tax rates, by adjusting their effort or working hours. Another way individuals can manipulate their income, relevant for couples, is by shifting tax deductions between fiscal partners. As a result of this manipulation of income, observable and unobservable characteristics may no longer be smooth around the kink points, and as a result we may observe differences in the use of lifelong learning not only due to differences in financial incentives but for other reasons as well. It is therefore essential to test for this type of bunching, and whether the observable characteristics are ‘smooth’ around the kink. We therefore start our empirical analysis with a bunching analysis of income to see whether this is a problem or not.

5 Data

For the empirical analysis, we use the universe of Dutch taxpayers, available via the remote access server of Statistics Netherlands. We have data for the period 2006–2013, but we focus on the period 2006–2012. During the period 2006–2012, the tax deduction for lifelong learning expenditures remained largely unchanged.

We make the following selections. We drop all individuals younger than 25 years of age or older than 60 years of age. Furthermore, we drop individuals who are enrolled at a full-time higher education institution, because students can use the tax deduction for other reasons than lifelong learning expenditures. We also exclude individuals on retirement benefits, on other types of social insurance and individuals without income, because their demographic characteristics are quite different from the rest of the sample.

As dependent variables, we consider the take-up rate of the lifelong learning tax deduction and the deducted amount. We subtract the threshold of 500 euro from the deducted amount before we calculate the take-up rate (dummy) and the deducted amount.

Table 1 shows the distribution of the use of the deductible by income level. Of all singles, 26% have a taxable income below 20,000 euros, and about 2.7% of this group use the deductible in 2006–2012. Singles with an income between 20,000 and 40,000 euros form the largest group, and about 3% of them use the deductible. Higher-income singles make up a much smaller share of the population, but are more likely to use the deductible. In addition to their more frequent use, they also deduct higher amounts. In particular, singles with a taxable income of more than 60,000 euros (about 5% of the population of singles) deduct a relatively high amount of close to 3400 euros, conditional on using the deductible. In Appendix H, we report the full distribution of nonzero deducted amounts as well as the distribution up to 5000 euro for singles over 2006–2012.

Table 1 The distribution of the use of the deductible for single households
Table 2 Descriptive statistics for singles around the kinks

We study the effect of two discontinuities in marginal tax rates on lifelong learning expenditures: (1) the increase in the marginal tax rate when we move from the first to the second tax bracket, which we indicate as ‘kink 1’, and (2) the increase in the marginal tax rate when we move from the third to the fourth tax bracket, which we indicate as ‘kink 2’.

Descriptive statistics for singles around these two kinks are given in Table 2. In the first column, we present descriptive statistics for the sample around kink 1. Specifically, these are statistics for the sample in our preferred specification with individuals from \(-1330\) to \(+1330\) euro around kink 1. In this sample, 2.8% deduct lifelong learning expenditures, and the average amount deducted, including the zeros, is about 38 euros. Per person who uses the deduction, the average amount is 1359 euro (1330 euro to the left of the kink), which motivates the sample interval that we use. Of the singles around kink 1, 63% are female; they are on average 40 years of age and have 0.8 children on average, and 14% of them are either born outside the Netherlands or have at least one parent born outside the Netherlands. We have about 735 thousand observations in this sample.

The second column gives the descriptive statistics for the sample around kink 2 for our preferred specification with a bandwidth of 2000 euros around the kink. The take-up rate is higher for this group, 3.5%, and the average amount, including the zeros, is also higher at about 74 euro. The average amount is 2097 euro per person that uses the deduction, which again motivates the sample interval we use.

There are fewer females in the sample around kink 2 (31%) than around kink 1; on average, they are also somewhat older, are less likely to have children, and are less likely to be born outside the Netherlands or from foreign parents. This sample is also smaller, with about 230 thousand observations. These individuals are in the top 10% of the full income distribution in the Netherlands.

6 Results

We start our analysis with a formal bunching analysis to highlight that singles exhibit very little bunching around the kinks.Footnote 14 We then continue to estimate the effect of the kinks on lifelong learning expenditures using the regression kink design for singles and also consider heterogeneous effects for a large number of subgroups. Next, we briefly discuss the results for couples, where we do observe substantial bunching around the kinks. We close this section with a discussion of our findings and their interpretation.

6.1 Bunching analysis singles

We perform a formal bunching analysis closely following Best and Kleven (2018). We group the data over the period 2006–2012 into bins of 100 euros, around kink 1 and 2, respectively. We then estimate the following model to construct the counterfactual density of households around the kinks:

$$\begin{aligned} c_i = \sum _{j=0}^5 \beta _j (z_i)^j + \sum _{k=-2000}^{2000} \gamma _k \mathbbm {1}\{i=k\} + \varepsilon _i, \end{aligned}$$
(8)

where \(c_i\) is the number of households in bin i and \(z_i\) is the distance between bin i and the kink (either kink 1 or kink 2). We follow Best and Kleven (2018) and estimate a polynomial of order 5.Footnote 15 The second term collects a set of dummies for whether the bins are in the omitted range around the kink. In our baseline, we set this at \(-2000\) euros and 2000 euros. Since we observe no clear (visual) bunching for singles, there is no clear guideline for where to set the thresholds. However, using larger and smaller omitted regions yielded similar results.

Fig. 3 Bunching taxable income singles kink 1

Fig. 4 Bunching taxable income singles kink 2. Own calculations based on register data from Statistics Netherlands. The figures show the actual density and the counterfactual density based on estimates of Eq. (8). The estimate of b is the amount of excess mass to the left of the kink divided by the counterfactual density at the kink, and m is the amount of missing mass at the right side of the kink divided by the counterfactual density at the kink. Standard errors are obtained by bootstrapping 200 times

The counterfactual density without bunching around the kinks is constructed using the predicted values from Eq. (8), where we exclude the omitted region. We compare the counterfactual density with the actual density to get an estimate of the bunching around the kinks. We present estimates of the amount of bunching—the difference between the actual and counterfactual density of households in the area just before the kink—divided by the counterfactual density at the kink. If there is bunching, we would also expect to see missing mass on the right side of the kink. We also present estimates of this, again divided by the counterfactual density at the kink. We obtain standard errors by bootstrapping this procedure 200 times.
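The construction of the counterfactual density and the two bunching statistics can be sketched as follows. This is a simplified illustration, not the code behind Figs. 3 and 4: the function names are hypothetical, the bootstrap for the standard errors is omitted, and assigning the bin at the kink itself to the right-hand side is an assumption. Giving each bin in the omitted range its own dummy in Eq. (8) is equivalent to fitting the polynomial on the remaining bins only, which is what the sketch does.

```python
def solve(a):
    """Gauss-Jordan elimination with partial pivoting on an augmented matrix."""
    k = len(a)
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * p for x, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def bunching(z, counts, omitted=2.0, degree=5):
    """Return (b, m): excess mass left of the kink and missing mass right of
    it, both divided by the counterfactual count at the kink.

    z is the distance of each bin to the kink, in thousands of euro.
    """
    scale = max(abs(zi) for zi in z)  # rescale to [-1, 1] for stability
    keep = [(zi / scale, ci) for zi, ci in zip(z, counts) if abs(zi) > omitted]
    powers = range(degree + 1)
    normal = [[sum(x ** (i + j) for x, _ in keep) for j in powers]
              + [sum(x ** i * c for x, c in keep)]
              for i in powers]
    beta = solve(normal)

    def counterfactual(zi):
        return sum(b_j * (zi / scale) ** j for j, b_j in enumerate(beta))

    at_kink = counterfactual(0.0)
    excess = sum(c - counterfactual(zi)
                 for zi, c in zip(z, counts) if -omitted <= zi < 0)
    missing = sum(counterfactual(zi) - c
                  for zi, c in zip(z, counts) if 0 <= zi <= omitted)
    return excess / at_kink, missing / at_kink
```

For example, with bins of 100 euro one would pass `z = [i / 10 for i in range(-50, 51)]` together with the observed bin counts. Values of b and m close to zero indicate that the actual density follows the counterfactual density in the omitted range.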

We find (virtuallyFootnote 16) no bunching in taxable incomeFootnote 17 around kink 1; see Fig. 3. In the figures, b gives the amount of bunching to the left of the kink and m the amount of ‘missing mass’ to the right of the kink. For kink 2, we find a small amount of bunching, with some excess mass to the left and to the right of the kink; see Fig. 4.Footnote 18 This likely results from the use of other deductions, like the deduction for mortgage interest payments, as there is less bunching in gross income around kink 2; see Figure C.2 in the Appendix.Footnote 19 However, this small amount of bunching is unlikely to affect the results of our regression kink design. First, the distribution of the demographic control variables is smooth around both kink 1 and kink 2; see Figure D.1a–D.1g in the Appendix. Second, we do not observe a sudden increase in the take-up of the deduction for lifelong learning expenditures or the amount of lifelong learning expenditures filed (see below).

6.2 Base results lifelong learning expenditures singles

Figures 5a and b show the take-up rate of the deductible for lifelong learning expenditures for kinks 1 and 2, respectively.Footnote 20 We present averages per taxable income bin of 100 euro. The solid red lines give the predicted take-up rate, using a linear regression model without demographic control variables, allowing for a different slope to the left and the right of the kink. The dashed red lines give the corresponding 95% confidence intervals.

Fig. 5 Probability to use the deductible and the deducted amount for singles. Own calculations based on register data from Statistics Netherlands. The regression lines are linear functions without any control variables, with a separate intercept and slope on the right-hand side of the kink. The deducted amount includes the zeros for nonusers. Estimates for kink 1 include observations from minus 1330 to plus 1330 euro relative to the kink. Estimates for kink 2 include observations from minus 2000 to plus 2000 euro relative to the kink. N for kink 1 is 665,604; N for kink 2 is 198,815

Above each graph, we report the corresponding coefficient for the change in the slope on the right-hand side. The graphs and the estimated coefficients suggest no effect for kink 1, but a positive and statistically significant effect for kink 2. Figures 5c and d plot the declared amount of schooling expenditures for singles around kink 1 and kink 2 (above the threshold, and including the zeros). Again, there is no apparent kink in the relation between the declared amount and taxable income at kink 1, but there appears to be a kink, albeit not statistically significantly different from zero, in the relation between the declared amount and taxable income at kink 2. Furthermore, for kink 2, we also see a ‘flattening out’ of the effect on the take-up rate and the deducted amount, which is consistent with the flattening out of the financial gain to the right of the kink (see Sect. 4).

In these figures, we have not taken into account observable characteristics. Our model suggests that it could be important to control for these. In Panel A in Table 3, we present regression results for the regression kink coefficient (change in the slope) without and with demographic control variables and for different bandwidths for singles around kink 1. Column (1) gives the results for the probability of using the lifelong learning deduction without demographic control variables. For all bandwidths, we find a small and statistically insignificant effect. The results are very similar when we include demographic control variables in Column (2). Our preferred specification includes demographic control variables and uses a bandwidth of 1330 euro. Here, we find an effect of \(-0.0016\). The running variable is in thousands of euro; hence, the interpretation is that the additional financial gain of having an income 1000 euro to the right of the kink leads to a statistically insignificant drop in the take-up rate of the lifelong learning deduction of 0.16 percentage points. Our preferred bandwidth is 1330 euro because this is the average amount of schooling expenditures deducted at 1330 euro to the right of the kink, which is where the kink ends on average.Footnote 21 Also, for the deducted amount we find a small and insignificant (negative) effect, without and with demographic control variables; see Columns (3) and (4), respectively.
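To make the specification concrete, the following is a minimal sketch of a local linear regression kink estimate of the kind described above, with a separate intercept and slope on the right-hand side of the kink. It runs on simulated, noise-free data. The function name `kink_estimate`, the simulated outcome and all parameter values are illustrative assumptions, not the authors' code or the register data.

```python
import numpy as np

def kink_estimate(income_rel, outcome, bandwidth):
    """Change in slope at the kink from a local linear model.

    income_rel: taxable income relative to the kink, in thousands of euro.
    outcome:    take-up indicator or deducted amount.
    bandwidth:  half-width of the estimation window, in thousands of euro.
    """
    x = np.asarray(income_rel, dtype=float)
    y = np.asarray(outcome, dtype=float)
    keep = np.abs(x) <= bandwidth          # restrict to the bandwidth
    x, y = x[keep], y[keep]
    right = (x > 0).astype(float)          # indicator: right of the kink
    # Columns: intercept, slope, intercept shift (right), slope change (right).
    X = np.column_stack([np.ones_like(x), x, right, right * x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]                         # regression kink coefficient

# Simulated data (assumption): a take-up rate whose slope rises by 0.0038
# per 1000 euro to the right of the kink, without noise.
x = np.linspace(-1.33, 1.33, 2001)
y = 0.04 + 0.002 * x + 0.0038 * np.maximum(x, 0.0)
est = kink_estimate(x, y, bandwidth=1.33)
print(f"estimated change in slope: {est:.4f}")
```

Because the running variable is in thousands of euro, the returned coefficient reads as the change in the outcome per additional 1000 euro of income to the right of the kink. Demographic control variables, if available, would enter as further columns of `X`.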

Table 3 Treatment effect estimates for singles on the probability to use the deductible and the deducted amount, for different bandwidths around the kink

Panel B in Table 3 gives the regression results for the regression kink coefficient for singles around kink 2, again without and with demographic control variables and for different bandwidths. For kink 2, our preferred bandwidth is 2000 euro, which is very close to the average lifelong learning expenditures of 2060 euro, which are deducted at 2060 euro to the right of the kink. For this bandwidth, we find a statistically significant positive effect of 0.38 percentage points. A somewhat smaller or larger bandwidth yields a somewhat lower coefficient that is not statistically significantly different from the estimate at our preferred bandwidth, though the coefficient becomes statistically insignificant when we limit the bandwidth to 1500 euro.

The regression results for the average deducted amount for different bandwidths for singles around kink 2 are given in Columns (3) and (4) of panel B, without and with demographic control variables, respectively. Again, accounting for demographic control variables hardly affects the results. For our preferred bandwidth of 2000 euro, and including demographic control variables, we find a positive, but statistically insignificant, coefficient of 5.8 euro. Since there is a lot of variation in the amounts filed, statistical power may also play a role here. The coefficient becomes statistically significant when we use a somewhat wider bandwidth.

To further investigate the robustness of our results, we performed the permutation test outlined in Ganong and Jäger (2018). We construct a distribution of placebo estimates in regions where we know there is no kink in the tax system and regions where we know there is a kink. Using our preferred bandwidth of 1330 euro, we find no statistically significant treatment effect for filing lifelong learning expenditures at kink 1.Footnote 22 For kink 2, using our preferred bandwidth of 2000 euro, we do find a highly statistically significant treatment effect on filing lifelong learning expenditures.Footnote 23 This is consistent with our main results.
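The logic of such a placebo-based permutation test can be sketched as follows. This is a simplified illustration in the spirit of Ganong and Jäger (2018), not their exact procedure or the authors' implementation: it re-estimates the change in slope at placebo locations where the simulated data contain no kink and reports the share of placebo estimates that are at least as large in absolute value as the estimate at the true kink. The data, the function names and the placebo grid are all assumptions.

```python
import numpy as np

def slope_change(x, y, kink, bandwidth):
    """Local linear estimate of the change in slope at a given kink location."""
    z = x - kink
    keep = np.abs(z) <= bandwidth
    z, yy = z[keep], y[keep]
    right = (z > 0).astype(float)
    X = np.column_stack([np.ones_like(z), z, right, right * z])
    beta, *_ = np.linalg.lstsq(X, yy, rcond=None)
    return beta[3]

def permutation_p_value(x, y, true_kink, placebo_kinks, bandwidth):
    """Two-sided p-value: share of placebo estimates at least as extreme."""
    actual = slope_change(x, y, true_kink, bandwidth)
    placebo = np.array([slope_change(x, y, k, bandwidth) for k in placebo_kinks])
    return actual, float(np.mean(np.abs(placebo) >= abs(actual)))

# Simulated data (assumption): income in thousands of euro, true kink at 0.
rng = np.random.default_rng(0)
x = np.linspace(-20.0, 20.0, 40001)
y = 0.04 + 0.001 * x + 0.004 * np.maximum(x, 0.0) + rng.normal(0.0, 0.001, x.size)

# Placebo kinks far enough from 0 that their windows exclude the true kink.
placebo_kinks = np.concatenate([np.arange(-17, -2), np.arange(3, 18)]).astype(float)
actual, p_value = permutation_p_value(x, y, 0.0, placebo_kinks, bandwidth=2.0)
print(f"estimate at true kink: {actual:.4f}, permutation p-value: {p_value:.3f}")
```

The placebo distribution captures how much curvature in the underlying relation alone could produce an apparent kink, which is why it serves as a check on the conventional standard errors.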

Furthermore, in our baseline estimates we use a local linear specification, estimated separately for data to the left and to the right of the kink. We have also estimated quadratic and cubic specifications. For kink 1, we find no evidence for a kink using these more flexible specifications. For kink 2 and using our preferred bandwidth, we find smaller estimates that are no longer statistically significant. However, if we use a somewhat larger bandwidth of 2500 euro, the treatment effect again becomes statistically significant (as in the local linear specification) for both the quadratic and the cubic specification.

Hence overall, we find no statistically significant effect for either filing lifelong learning expenditures or the average amount filed for singles around the low-income kink. However, we do find some evidence of a positive effect on filing lifelong learning expenditures for singles around the high-income kink, though the effect on the average amount filed is not statistically significant at our preferred bandwidth.

We can convert our estimate at kink 2 to an elasticity of the probability of (deducting) lifelong learning expenditures with respect to the effective costs of lifelong learning expenditures. Consider an individual that has 2500 euro in lifelong learning expenditures, or 2000 euro above the threshold (which is close to the average around kink 2). Furthermore, suppose that this individual has an income that is 1000 euro to the right of the kink, which is in the middle of the region where the financial gain increases. For this individual, we predict an increase in the take-up rate of 0.38 percentage points, or about +10% relative to the baseline take-up rate of 3.8% left of the kink. The effective costs of lifelong learning are about \(6\%\) lower than to the left of the kink; see Sect. 4. The elasticity of the take-up rate of (deducting) lifelong learning expenditures with respect to the effective costs of lifelong learning expenditures is then \(+10\%/(-6\%) \approx -1.7\) with a 95% confidence interval of \([-2.5,-0.8]\).Footnote 24
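The arithmetic behind this elasticity can be retraced in a few lines. The inputs are the point estimates quoted in the text; the variable names are illustrative.

```python
# Inputs quoted in the text (kink 2, singles, preferred bandwidth).
baseline_take_up = 3.8   # take-up rate left of the kink, in percent
effect_pp = 0.38         # estimated effect, in percentage points
cost_change = -0.06      # effective costs are about 6% lower (Sect. 4)

# Relative change in take-up: 0.38 / 3.8, i.e. about +10%.
relative_change = effect_pp / baseline_take_up

# Elasticity of take-up with respect to the effective costs.
elasticity = relative_change / cost_change
print(f"relative change: {relative_change:.0%}, elasticity: {elasticity:.1f}")
```

The negative sign reflects that take-up rises when the effective costs of lifelong learning fall.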

6.3 Heterogeneity analysis singles

Table 4 gives the estimated treatment effect on lifelong learning expenditures for subgroups of singles.Footnote 25 In the text, we focus on the effect on the take-up rate of the deductible around kink 2. The treatment effects around kink 1 and for the deducted amount (both at kink 1 and 2) are typically small or statistically insignificant.Footnote 26

Table 4 Treatment effects by subgroups of singles

We find that the treatment effect is somewhat larger for men than for women, although the difference between the two is not statistically significant. The treatment effect is, however, substantially larger for single natives than for singles born outside the Netherlands or with parents born outside the Netherlands. This may be the result of better opportunities for lifelong learning or better information about the tax deduction for lifelong learning for natives. The treatment effect is also substantially larger for higher-educated than for lower-educated singles, which again may be due to differences in opportunities or differences in knowledge of the tax system.Footnote 27 Turning to differences by age group, we find that the treatment effect is relatively large for ‘middle-aged’ persons, in particular those 40–44 years of age, compared to younger and older workers. Indeed, middle age may be the time to invest in another job, whereas skills are typically more up-to-date for younger workers and the return period for investments in work-related human capital is typically shorter for older workers.

Looking at different sectors and contract types, we see that the effect for singles in the private sector or with a temporary or flexible contract type is typically somewhat lower than the main estimate for all singles, and the same holds for workers that have changed jobs in the last year.

Finally, we also see substantial persistence in the use of the lifelong learning deduction. Indeed, singles that used the lifelong learning deduction in the previous year have an almost 8 percentage point higher probability of using the lifelong learning deduction this year. This could be because some courses simply take multiple years to complete. Another reason could be that once people are aware of the deductible, they tend to use it more frequently.

6.4 Results for couples

Couples can shift several tax deductions from one partner to the other. Indeed, many couples actually do shift these deductibles between partners, because after-tax household income will increase when deductibles are shifted from the partner with the lower marginal tax rate to the partner with the higher marginal tax rate. As a result, we observe very limited bunching around kink 1 and kink 2 for primary earners in gross income (see Figures F.1 and F.2 in the Appendix), but substantial bunching around kink 1 and kink 2 for primary earners in taxable income (see Figs. 6 and 7). This also results in discontinuities in the means of the demographic control variables around the kinks for primary (and secondary) earners; see Figs. F.3a–F.3h (and Figs. F.4a–F.4g). This invalidates the conditions necessary to conduct a regression kink analysis.

Fig. 6 Bunching taxable income primary earners kink 1

Fig. 7 Bunching taxable income primary earners kink 2. Own calculations based on register data from Statistics Netherlands. The figures show the actual density and the counterfactual density based on estimates of equation 8. The estimate of b is the amount of excess mass to the left of the kink divided by the counterfactual density at the kink, and m is the amount of missing mass at the right side of the kink divided by the counterfactual density at the kink. Standard errors are obtained by bootstrapping 200 times.

Things are even more complicated when analyzing the effect of the changes in marginal tax rates on using the lifelong learning deduction, because couples can also shift lifelong learning expenditures between partners.Footnote 28 As we show in Appendix F, this results in artificially high lifelong learning expenditures for primary earners to the right side of the kinks and artificially low and counterintuitive negative expenditures for secondary earners.

Due to these complications for couples, we prefer to focus on singles for studying the causal effects of the tax deduction on filing lifelong learning expenditures. Appendix F gives some further analyses for couples using a so-called donut regression discontinuity design to mitigate the complications due to bunching.Footnote 29 Furthermore, using the rich tax return data that we have, we also consider the different effects on so-called own expenditures and filed expenditures, where the former only includes the own expenditures on lifelong learning, before potential shifting of these expenditures between partners. If we take the results on own expenditures at face value, keeping all the complications mentioned above in mind, we find a positive impact on own lifelong learning expenditures for primary earners at kink 1, but not at kink 2. We find no impact for secondary earners.

7 Discussion

An important question that remains is whether we measure the effect on actual lifelong learning expenditures, or simply the reporting of lifelong learning expenditures (Slemrod and Yitzhaki 2002; Doerrenberg et al. 2017). Indeed, the financial incentive to file expenditures is higher for people above the kink than for people below the kink.

A first piece of evidence that people using the deductible actually do take up more training is that enrollment in publicly funded education is substantially higher for people using the deductible than for those who do not. Table 5 reports enrollment rates for people using the deductible and those not declaring any training expenditures. We find that from short tertiary education upwards, enrollment rates are significantly higher for those using the deductible. The remaining 80% are either not enrolled or enrolled in private education, for which we have no data.

Second, people also predominantly report using the deductible for training expenses. Table 6 shows the frequency of voluntarily given descriptions accompanying a random sample of 50,000 deductions in 2013. We find that about 23% report using the deductible for tuition fees or similar terms, 14% for books and 9% for ‘study costs.’ People also often report the names of private education institutes or universities.Footnote 30

Table 5 Enrollment in publicly funded education (%)
Table 6 Frequency of words reported in tax filings

Another question is why the effect seems to be smaller for low-income singles than for high-income singles. One potential explanation is that differences in statutory tax rates are perhaps less important than differences in effective marginal tax rates due to income-dependent tax credits and subsidies for low-income singles. Indeed, Figures G.1a and G.1b in the Appendix, for childless singles and lone parents, respectively, show that effective marginal tax rates can be quite different from statutory tax rates for low-income singles. (For high-income singles, they are almost identical in the period we consider.) As a result, low-income singles may not respond to the differences in statutory marginal tax rates.Footnote 31

One could also argue that the tax deduction for lifelong learning is not very salient for low-income singles or that friction costs prevent them from filing lifelong learning expenditures (Ladner et al. 2009; Chetty et al. 2013). One piece of evidence that the tax deduction is not very salient comes from the Labor Force Survey: of all respondents who claim that they took (partly) self-paid training during the year, and who hence are potentially eligible for the tax deduction, only about a quarter actually declare their expenditures. This is similar for low- and high-income individuals. On the other hand, when we consider the change in the system in 2013, when the threshold was reduced from 500 euro to 250 euro, we see a sharp increase in filed lifelong learning expenditures between 250 and 500 euro, for both low- and high-income singles (see Fig. H.2a and H.2b). This suggests that the tax deduction was as salient to low-income singles as it was to high-income singles. However, the difference in effective marginal tax rates may have been less salient. The same figures show that both groups also report very small lifelong learning expenditures, which suggests that filing frictions are small for both low- and high-income groups.

Another reason why low-income individuals may be less likely to respond has to do with other costs associated with post-initial education. Time constraints may be considerable, especially when young children are present, and many people—particularly lower-educated individuals—dislike formal learning and hence may experience psychic costs from doing so. Furthermore, people may be myopic—again particularly lower-educated individuals—or more generally underestimate the gains of lifelong learning (see, e.g., Heckman et al. 2006).

8 Conclusion

In this paper, we have studied the effectiveness of a tax deduction for lifelong learning expenditures in terms of the take-up rate of lifelong learning expenditures and the average amount of lifelong learning expenditures. For singles, which is our preferred group because for them we do not observe manipulation of the running variable, we find heterogeneous effects of the tax deduction. For the high-income group, the take-up rate of lifelong learning expenditures increases by a statistically significant 10%, although the effect on the average amount of lifelong learning expenditures filed is not statistically significant. The effect is relatively large for men, natives, the higher educated and the middle-aged. However, at a relatively low level of income the additional effect of the tax deduction is essentially zero, with tight confidence intervals. We find hardly any bunching by singles at the low- and high-income kinks, suggesting small behavioral responses in income with respect to marginal tax rates (following the methodology of Saez 2010). For couples, we find substantial bunching around the tax kinks, due to the shifting of tax deductibles between partners. As a result, individuals to the left and the right of the tax kinks differ in their demographic characteristics. Furthermore, the possibility to shift the tax deduction for lifelong learning between fiscal partners results in large, biased positive treatment effects for primary earners, and counterintuitive negative treatment effects for secondary earners. Hence, we prefer the estimates for singles to learn about the causal effect of the tax deduction on lifelong learning expenditures.

Our study contributes to the scarce literature on tax incentives for lifelong learning. We show that, at the margin, tax incentives provide an incentive for high-income workers to pursue and report training. For low-income workers, tax incentives do not increase their level of reported training. Overall, we might wonder whether our findings show that the policy is effective. On the one hand, the marginal deadweight loss seems rather high. At kink 1, it is \(100\%\), since we do not find an increase in training at kink 1. At kink 2, it is around \(90\%\) at 1000 euro above the kink. Compared to the literature on schooling vouchers, which shows deadweight losses of \(30\%\) (Schwerdt et al. 2012) to \(60\%\) (Hidalgo et al. 2014), the fiscal incentive seems less effective. The apparent lack of effectiveness is in line with the literature on the effects of fiscal incentives for initial education (Bulman and Hoxby 2015). On the other hand, the deadweight loss does not take into account the size of the effect relative to the costs. Indeed, the elasticity of lifelong learning with respect to effective costs is non-negligible. At kink 2, effective costs of lifelong learning drop by \(6\%\), while take-up of the deductible increases by \(10\%\). This still implies a (sizeable) elasticity of \(-1.7\) for higher incomes.Footnote 32
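One possible reading of the deadweight loss figures, offered here as an assumption for illustration because the text above does not spell out the formula, is the share of users of the deductible who would have filed lifelong learning expenditures even without the additional incentive. Under that reading, the sketch below reproduces both reported magnitudes from the estimates quoted earlier. The function name is hypothetical, and the baseline used for kink 1 is a placeholder that does not affect the result.

```python
def marginal_deadweight_share(baseline_rate, induced_change):
    """Share of users who would have filed anyway (assumed definition).

    baseline_rate:  take-up rate without the additional incentive.
    induced_change: estimated increase in take-up; estimates that are
                    zero or negative are treated as no induced take-up.
    """
    induced = max(induced_change, 0.0)
    return baseline_rate / (baseline_rate + induced)

# Kink 2: baseline take-up of 3.8% and an effect of 0.38 percentage points.
dw_kink2 = marginal_deadweight_share(3.8, 0.38)

# Kink 1: no increase in take-up is found, so any baseline gives a share of 1.
# The baseline of 1.0 is a placeholder, not a figure from the paper.
dw_kink1 = marginal_deadweight_share(1.0, 0.0)

print(f"kink 1: {dw_kink1:.0%}, kink 2: {dw_kink2:.0%}")
```

Under this definition a 10% increase in take-up implies that roughly nine in ten users at kink 2 receive the additional tax benefit without changing their behavior.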