The Spanish personal income tax: facts and parametric estimates

In this paper, we use administrative data on tax returns to characterize the distributions of before- and after-tax income, tax liabilities and tax credits in Spain for individuals and households. We use the most recent available data, 2015 for individuals and 2013 for households, but also discuss how the income distribution and taxes have changed since 2002. We also estimate effective tax functions that capture the underlying heterogeneity of the data in a parsimonious way. These parametric functions can be used to calculate after-tax incomes in surveys where this information is not directly available, and can also be used in quantitative work in macroeconomics and public finance.

The Working Paper Series seeks to disseminate original research in economics and finance. All papers have been anonymously refereed. By publishing these papers, the Banco de España aims to contribute to economic analysis and, in particular, to knowledge of the Spanish economy and its international environment.
The opinions and analyses in the Working Paper Series are the responsibility of the authors and, therefore, do not necessarily coincide with those of the Banco de España or the Eurosystem.
The Banco de España disseminates its main reports and most of its publications via the Internet at the following website: http://www.bde.es.

Introduction
This paper makes two contributions. First, we use administrative data on tax returns to characterize the distributions of before- and after-tax income, tax liabilities, and tax credits in Spain. We also calculate the effective average and marginal tax rates that individuals and households face. We use the most recent available data, 2015 for individuals and 2013 for households, but also discuss how the income distribution and taxes have changed since 2002. Second, we provide estimates of effective tax functions. These functions map gross incomes of individuals or households into the taxes that they pay, summarizing the complicated structure of taxes in easy-to-interpret and easy-to-use parametric forms. As such, they provide valuable inputs for quantitative studies of fiscal policy in models with heterogeneous agents. 1, 2 Our approach follows Gouveia and Strauss (1994) [...] two lowest quintiles. The top quintile faces an average effective tax rate of 19.0%, while the tax rate for the top 1% is 30.6%.
The tax system in Spain taxes the so-called general income, which mainly consists of labor and self-employment income, and savings income, which mainly consists of capital income, at different rates. Taxes on general income are higher and more progressive than taxes on savings income. 6 Hence, to each of these income categories certain deductions are applied and then the corresponding tax liabilities are calculated. Tax liabilities corresponding to these two categories are then summed, and tax credits are applied to the total tax liabilities to figure out what the taxpayer owes to the state. Given this structure, for the estimation of the effective tax function, we follow two different approaches. First, we estimate one single function for the final tax liabilities as a function of gross income for each year between 2002 and 2015. We focus on two different specifications: one proposed by Benabou (2002) and Heathcote, Storesletten, and Violante (2017), which we call the HSV specification, and the GS specification, used by Gouveia and Strauss (1994).
In our estimation we account for the fact that low incomes are subject to zero effective tax rates, and estimate an income threshold below which tax liabilities are zero. In the second approach, we estimate three different functions: a function that relates general income to general tax rates; a second function that links the savings income to the savings tax rates; and a third function that accounts for the amount of tax credits as a function of total gross income. We show that both approaches result in tax functions that accurately estimate both the level and the distribution of the tax liabilities observed in the data. As an illustration of the use of these tax functions, we apply them to the EFF survey data and calculate after-tax incomes for each household, a variable not available in the original survey.
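As a purely illustrative sketch of the two functional forms, the Python snippet below writes the average tax rate as $1 - \lambda y^{-\tau}$ for the HSV specification and as $b[1 - (s y^{p} + 1)^{-1/p}]$ for the GS specification, the forms commonly associated with Heathcote, Storesletten, and Violante (2017) and Gouveia and Strauss (1994). Income y is assumed to be expressed in multiples of mean income, the zero-tax threshold is passed as an argument rather than estimated, and all parameter values are made up for the example; they are not the estimates reported in this paper.

```python
def hsv_average_rate(y, lam, tau, threshold=0.0):
    """HSV-type average tax rate 1 - lam * y**(-tau); zero at or below `threshold`."""
    if y <= threshold:
        return 0.0
    return 1.0 - lam * y ** (-tau)


def gs_average_rate(y, b, s, p, threshold=0.0):
    """GS-type average tax rate b * (1 - (s * y**p + 1)**(-1/p)); zero at or below `threshold`."""
    if y <= threshold:
        return 0.0
    return b * (1.0 - (s * y ** p + 1.0) ** (-1.0 / p))


def after_tax_income(y, average_rate):
    """After-tax income implied by an average tax rate."""
    return y * (1.0 - average_rate)


# Hypothetical parameters, chosen only to make the example run.
LAM, TAU = 0.9, 0.15
B, S, P = 0.4, 0.1, 1.0

if __name__ == "__main__":
    for y in (0.25, 0.5, 1.0, 2.0, 5.0):
        t_hsv = hsv_average_rate(y, LAM, TAU, threshold=0.5)
        t_gs = gs_average_rate(y, B, S, P, threshold=0.5)
        print(f"y={y:4.2f}  HSV rate={t_hsv:6.3f}  GS rate={t_gs:6.3f}  "
              f"after-tax (HSV)={after_tax_income(y, t_hsv):6.3f}")
```

Applied to survey data, a function of this kind would be evaluated at each household's gross income (divided by mean income) to impute an after-tax income. Without a threshold, the HSV form returns negative rates at sufficiently low incomes, which is one reason a zero-tax income threshold is useful.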
The rest of the paper is organized as follows. Section 2 describes the Spanish Personal Income Tax. Section 3 describes the dataset and lays out the definitions and sample restrictions. Section 4 presents the basic facts of the income and tax distributions. Section 5 presents the parametric estimates of the tax functions. Section 6 presents the basic facts of the after-tax income distributions for administrative and survey data. Section 7 concludes.

Overview
The Spanish Personal Income Tax (PIT), or Impuesto sobre la Renta de las Personas Físicas (IRPF), taxes the income of Spanish residents. 7 Table 1 documents different sources of tax revenue for Spain, the Euro Area and the OECD countries in 2015. The total tax collection with the PIT is [...]
7 Income subject to the PIT corresponds to worldwide income, although a number of bilateral agreements eliminate double taxation.
Electronic copy available at: https://ssrn.com/abstract=3420702
The tax is withheld at source and each year, between April and June, taxpayers must file a tax return based on the previous calendar year's total income. In 2015, all taxpayers with labor income above €22,000, or with capital income (excluding income from real estate) above €1,600, or with real-estate income above €1,000, or with any income from self-employment had to file a tax return. Many taxpayers below the labor income threshold, around 81% of them in 2015, still choose to file a tax return, since they are likely to obtain a refund due to tax credits. Tax returns can be filed individually or jointly. Single tax returns are filed at the individual level, whereas joint tax returns can be filed by spouses or single-parent families with at least one dependent child.
[...] self-employment income. From these gross income sources, a set of deductible expenses can be subtracted, which include, among others, social security contributions paid by the employee, a deduction for earning any labor income, and business expenses associated with self-employment. 9 This subtraction yields adjusted gross income.
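The 2015 filing obligation stated earlier in this subsection can be written as a simple rule. The sketch below is only an illustration: the function and argument names are hypothetical, the four thresholds are the ones quoted in the text, and "any income from self-employment" is read here as a strictly positive amount, which is an assumption.

```python
def must_file_2015(labor, capital_ex_real_estate, real_estate, self_employment):
    """Return True if a taxpayer had to file a 2015 return under the quoted thresholds.

    All amounts are annual incomes in euros. Reading 'any self-employment income'
    as a strictly positive amount is an assumption of this sketch.
    """
    return (
        labor > 22_000
        or capital_ex_real_estate > 1_600
        or real_estate > 1_000
        or self_employment > 0
    )


if __name__ == "__main__":
    # A wage earner below every threshold is not obliged to file,
    # although the text notes that most such taxpayers file anyway.
    print(must_file_2015(20_000, 500, 0, 0))
    print(must_file_2015(20_000, 2_000, 0, 0))
```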
Adjusted gross income is then grouped into two categories, which are subsequently taxed at different rates. The first type of income is called general income and includes labor income, self-employment income and some forms of capital income (mainly, income from real estate). 10 The [...] (e.g. realized capital gains, dividend payments, and interest income). 11 To each type of income a set of tax deductions is applied. Deductions that can be applied to general income include a tax deduction for couples filing jointly and contributions to private pension plans. 12 [...] Figure 2 shows the tax schedules in the two selected regions in 2015.
9 There are two deductions for earning labor income. First, all taxpayers are eligible for a €2,000 deduction. Second, an additional deduction of up to €3,700 is given to taxpayers whose labor income is below €14,450. These quantities are further increased for some groups of taxpayers, such as disabled workers, or unemployed workers who moved to a different location in order to start a new job.
10 Other forms of capital income that are in general income include incomes that come from the participation in common property regimes and other civil associations, such as unsettled estates or communities of property owners.
11 In 2015 savings income covered slightly more than 60% of total capital income.
12 In 2015, the deduction for couples filing jointly amounted to €3,400, while the limit on contributions to private pension plans was set to €8,000.
13 In practice, the Spanish system of regional financing is complex; see de la Fuente (2010) for a detailed description. Roughly speaking, regions keep 25% of their tax collection and either receive or contribute money in net terms from two funds aiming at ensuring sufficient financing for each region and a homogeneous provision of public services deemed essential, such as health and education. Regions can also raise money from financial markets by issuing debt.
Gross tax liabilities, which are calculated by applying state and region tax schedules to general and savings taxable income, are then reduced by a series of tax credits. First, a family allowance is subtracted from the gross tax liabilities from general taxable income. The amount of the family allowance depends on the characteristics of the taxpayer and their family, such as age, number of dependent children, number of dependent parents, and disability status of the taxpayer and other family members. 14 The actual amount that is subtracted from gross tax liabilities is calculated by applying the general tax schedules to the family allowance. For example, if the total family allowance is €5,500, which is below the first income threshold in panel A of Figure 2, then tax liabilities are reduced by €5,500 × 0.095 = €522.5. If the general taxable income of a taxpayer is less than their family allowance, then the remaining amount of the family allowance can be used to reduce the gross tax liabilities from savings taxable income.
14 In 2015, this allowance was €5,550 for the taxpayer (€6,700 and €6,950 for taxpayers older than 65 and 75, respectively), plus €2,400 for the first child, €2,700 for the second, €4,000 for the third, etc.; plus €1,150 for each dependent parent older than 65, and €1,400 for each dependent parent older than 75; plus €3,000 for each disabled member of the household (€9,000 for severe disabilities). Furthermore, the allowance for children is increased if they are less than 3 years old. Also note that regions can modify these amounts.
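To make the arithmetic of the family allowance concrete, the following sketch adds up the 2015 amounts quoted in the footnote and reproduces the €522.5 example. It is a simplified illustration with hypothetical function names: it ignores the increase for children under 3 and any regional modification, it stops at three children because the text gives no amounts beyond the third, and the credit function is only valid while the allowance stays inside the first bracket, where the rate quoted in the text is 9.5%.

```python
# Amounts for the first, second and third child quoted in the text (euros).
CHILD_AMOUNTS = (2_400, 2_700, 4_000)


def family_allowance_2015(age, n_children=0, parents_65_to_75=0,
                          parents_over_75=0, disabled=0, severely_disabled=0):
    """Family allowance in euros from the 2015 amounts quoted in the text."""
    if n_children > len(CHILD_AMOUNTS):
        raise ValueError("amounts beyond the third child are not given in the text")
    if age > 75:
        base = 6_950
    elif age > 65:
        base = 6_700
    else:
        base = 5_550
    return (
        base
        + sum(CHILD_AMOUNTS[:n_children])
        + 1_150 * parents_65_to_75
        + 1_400 * parents_over_75
        + 3_000 * disabled
        + 9_000 * severely_disabled
    )


def allowance_credit_first_bracket(allowance, first_rate=0.095):
    """Tax reduction when the whole allowance falls within the first bracket."""
    return allowance * first_rate


if __name__ == "__main__":
    print(family_allowance_2015(age=40, n_children=2))
    print(allowance_credit_first_bracket(5_500))
```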
After subtracting the family allowance, the tax liabilities from the state general income and state savings income are pooled together. Similarly, the region tax liabilities (from general and savings income) are also added up. To these two types of tax liabilities a set of non-refundable tax credits is applied. Non-refundable tax credits include part of mortgage payments (if the house was purchased before 2013) and an extended set of regional and state tax credits. 15 Finally, tax liabilities are further reduced by a set of refundable tax credits. In 2015, such credits were provided for employed mothers with children below 3 years old, taxpayers with disabled parents or children, single-parent families with at least two children, and large families (those with 3 or more children, or 2 children when at least one of them is disabled). The amount of the tax credit given to large families is limited to €2,400, while the rest cannot be larger than €1,200. 16
In order to summarize the structure of taxes, let $GI_j$ for $j = l, k, e$ be the gross income from labor ($l$), capital ($k$) and self-employment ($e$). Adjusted gross income ($AGI_j$) is obtained by subtracting deductions ($D_j$) from the gross income:
$$AGI_j = GI_j - D_j, \quad j = l, k, e.$$
Adjusted gross income from labor, capital and self-employment are then grouped together under two categories: general income ($g$) and savings income ($s$), i.e., [...]
Then another set of deductions ($OD_g$) is subtracted from $AGI_g$ to obtain general taxable income:
$$TI_g = AGI_g - OD_g.$$
The family allowance ($FA$) is calculated as a function of the taxpayer and their family characteristics. The allowance pertaining to the general income ($FA_g$) is computed as:
$$FA_g = \min\{FA, TI_g\}.$$
The gross tax liabilities that correspond to $TI_g$ are then calculated as:
$$GTL_g = \tau_g(TI_g) - \tau_g(FA_g),$$
where $\tau_g$ is the general tax schedule.
15 The region-specific tax credits, which can be means-tested, include credits for taking care of disabled or elderly persons, births, adoptions, large families, school expenses, donations, housing expenses, etc. Other state tax credits are granted for, among others, charity donations and renters earning income below a certain threshold. The state tax credit for renters has been phased out since 2015.
16 The most important refundable credit is the one provided to employed mothers with children below 3. In 2015, close to 750,000 women received it, which represented close to 4% of the total number of tax returns, with an average credit of around €935. The refundable tax credit granted to large families comes next, which accrued to close to 500,000 taxpayers (2.6% of the total) and amounted to €945 on average.
In order to obtain the gross tax liabilities for savings income ($GTL_s$), the savings adjusted gross income ($AGI_s$) is reduced by unused portions of $OD_g$ (denoted by $OD_s$) to obtain the savings taxable income ($TI_s = AGI_s - OD_s$). 17 The family allowance pertaining to savings income ($FA_s$) is computed as:
$$FA_s = \min\{FA - FA_g, TI_s\}.$$
Then, the tax liabilities for savings income are calculated as follows:
$$GTL_s = \tau_s(TI_s) - \tau_s(FA_s),$$
where $\tau_s$ is the savings tax schedule.
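The chain of calculations in this subsection, including the final netting of tax credits described in the next paragraph, can be sketched in a few lines of Python. This is a stylized illustration and not the actual tax code: the two bracket schedules are made-up round numbers, the split of the family allowance between general and savings income follows the verbal description (the part not absorbed by general taxable income is carried over), all unused deductions are assumed to be transferable to savings income (the text notes that in practice only some are), and non-refundable credits are assumed not to push liabilities below zero.

```python
def apply_schedule(income, brackets):
    """Tax from a progressive schedule given as [(lower_threshold, marginal_rate), ...]."""
    tax = 0.0
    for i, (lower, rate) in enumerate(brackets):
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if income > lower:
            tax += (min(income, upper) - lower) * rate
    return tax


def tax_liabilities(agi_g, agi_s, od_g, fa, tau_g, tau_s, ntc=0.0, tc=0.0):
    """Tax liabilities from adjusted gross incomes (assumed non-negative), in euros."""
    # General taxable income: deductions cannot push it below zero.
    od_used = min(od_g, agi_g)
    ti_g = agi_g - od_used
    # Unused deductions reduce savings income (simplification: all of them carry over).
    od_s = min(od_g - od_used, agi_s)
    ti_s = agi_s - od_s
    # Family allowance: first against general income, the remainder against savings income.
    fa_g = min(fa, ti_g)
    fa_s = min(fa - fa_g, ti_s)
    gtl_g = apply_schedule(ti_g, tau_g) - apply_schedule(fa_g, tau_g)
    gtl_s = apply_schedule(ti_s, tau_s) - apply_schedule(fa_s, tau_s)
    # Non-refundable credits stop at zero; refundable credits can make the result negative.
    after_ntc = max(gtl_g + gtl_s - ntc, 0.0)
    return after_ntc - tc


# Hypothetical schedules, for illustration only.
GENERAL = [(0, 0.10), (10_000, 0.20), (30_000, 0.30)]
SAVINGS = [(0, 0.20)]

if __name__ == "__main__":
    print(tax_liabilities(25_000, 1_000, 0, 5_000, GENERAL, SAVINGS, ntc=300))
    print(tax_liabilities(3_000, 1_000, 0, 5_000, GENERAL, SAVINGS, tc=1_200))
```

In the second call the family allowance exceeds general taxable income, so the remainder offsets the savings tax and the refundable credit leaves the taxpayer with negative tax liabilities.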
Finally, the two gross tax liabilities are summed and non-refundable and refundable tax credits ($NTC$ and $TC$) are subtracted to obtain tax liabilities:
$$TL = GTL_g + GTL_s - NTC - TC.$$
17 In practice, only certain elements of $OD_g$ can be used in $OD_s$.

Recent Reforms of the Personal Income Tax (2002-2015)
The Spanish PIT has undergone several changes during recent years. In general, the taxes are [...] a slowing economy. In contrast, after 2008, the sharp fall in GDP and the subsequent deterioration of the budget balance led to sizable tax increases between 2010 and 2012. Once again, following the recent economic recovery, significant tax cuts took place in 2015.
The first major reform of the personal income tax during the 21st century was in 2003. It involved a reduction in the number of tax brackets (from 6 to 5) and in tax rates (the top marginal tax rate was reduced from 48% to 45%). There was also an increase in the family allowance (e.g. for a taxpayer with 2 children, by about €600), and a tax credit of €1,200 for employed mothers with at least one child below age 3 was introduced. In 2007 the government implemented a major reform, which consisted of a further reduction of tax brackets (from 5 to 4) and tax rates (the top marginal tax rate was reduced from 45% to 43%). The family allowance was also increased (e.g. for a taxpayer with 2 children, one of them below age 3, by close to €5,000) and was redefined as a general income tax credit instead of a deduction. Three other important changes were a rise in savings tax rates (from 15% to 18%), a reshuffling of tax bases, which moved many capital income items to the savings schedule, and the introduction of a tax credit of €2,500 on births and adoptions. In 2008, a €400 tax credit for labor and self-employment income earners was introduced in order to spur private expenditure. Furthermore, a non-refundable tax credit for house renters was also implemented.
Between 2010 and 2012, successive governments increased taxes or reduced deductions and credits in the context of the economic crisis and the deterioration of the budget balance. In 2010 the €400 tax benefit was eliminated and the savings tax rates were increased (from 18% to 21% for taxpayers earning more than €6,000 of savings income). In 2011 the tax credit on births and adoptions was eliminated and the top marginal tax rates were increased from 43% to a range of 44.9% to 49%, depending on the region. In 2012 the government approved a significant increase of marginal rates, which affected the entire tax schedule (for instance, the top marginal rates were increased by 7 percentage points). This tax increase, which was initially intended to last for two years, was later extended until 2014. Furthermore, a deduction associated with house purchases was eliminated in 2013.
After the crisis, the government adopted a major reform. It consisted of a reduction of tax brackets and tax rates, which partly reversed the 2012 tax increase and resulted in the tax system outlined in Figure 2. Also, the family allowance was increased, and a set of new refundable tax credits that depend on family characteristics was introduced (such as the one accruing to large families).

Micro data on Tax Returns (2002-2015)
We use an administrative dataset containing a (stratified) random sample of tax returns, which includes almost the complete set of fiscal and socio-demographic information taxpayers provide in their returns. Hence, the dataset provides a very detailed account of income from different sources, tax benefits, tax liabilities and household characteristics (number of dependent relatives, disability, location, etc.). The income and taxes paid are not censored either at the bottom or at the top of the distribution.
The unit of observation in the dataset is a tax return, which can be of two types: single or joint. As mentioned, single tax returns are filed at the individual level, whereas joint tax returns represent two spouses filing together, or single-parent families with at least one child. In joint tax returns incomes are pooled together and taxpayers are entitled to an additional tax deduction on top of those accruing to single filings (see Figure 1). Other than this additional deduction, the [...] respectively. In these repeated cross-sections, it is not possible to match household members, e.g. to match a husband and wife who file two independent single tax returns. As a result, it is not possible to study taxes at the household level.
The panel dataset covers the period 1999-2013 and has a smaller sample size (around 3.2% of the universe of taxpayers in 2013). The main advantage of the panel is that it is possible to match spouses who file single tax returns. Therefore, it is possible to compute total taxes paid by households. Furthermore, computing incomes and taxes at the household level allows us to compare the household income distribution from tax data with that obtained from survey data, such as the EFF. Below we use the cross-section and the panel data to describe and estimate the tax functions for individual taxpayers and households, respectively. Table 2 provides a comparison between the cross-section sample aggregates in 2015 and their population. The data provides a very accurate representation of income and tax liabilities of the 19.5 million tax return filers, the differences being less than 1% on the selected items, except for gross income reported by the self-employed, for which the discrepancy is larger.

Definitions and Sample Restrictions
In this section we explain in detail the definition of the main variables used in the paper. Specifically, we describe the different income types we account for, the characterization of tax liabilities [...]
[...] amount that is calculated by the application of the tax schedule to taxable income and the final tax liabilities. Tax liabilities correspond to the amount that the taxpayer effectively has to pay, i.e. they are net of all tax credits, refundable or non-refundable. As a result, they can be negative.
The average effective tax rates are computed as tax liabilities over gross income. 21 We also define the average effective general tax rate as tax liabilities resulting from the application of the general tax schedule net of the family allowance (the box Gross Tax Liabilities 1 in Figure 1) over general income. We subtract the family allowance because for many (low income) taxpayers, this is equal to the general taxable income, hence by subtracting it from the numerator we avoid an artificial overestimation of the general tax rate (for these taxpayers the resulting average general tax rate is zero). Average savings tax rates are computed similarly. 22 Finally, the statutory marginal tax rates for a particular income level (or income window) are calculated as the average of the marginal rates of general and savings income, weighted by the corresponding income shares. We also calculate effective marginal tax rates as the change in tax liabilities that results from marginal changes in gross income. 23
In all calculations we restrict the sample to taxpayers with positive total gross income, non-negative gross income from different sources (labor, capital and self-employment), and average tax rates below the maximum statutory marginal tax rate. We do not restrict the sample by the age of the taxpayer. These restrictions only affect about 3% of all taxpayers in the sample. 24
20 It also includes any income of employees (wage and salary earners) who set up an economic activity to generate income.
21 If the tax liabilities are non-positive, then we set the tax rate to zero. Note that we could also compute tax rates as the ratio of tax liabilities to adjusted gross income. We favor the broader definition of income to compute average tax rates and total tax deductions.
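The definitions of this section translate directly into code. The sketch below is an illustration with hypothetical function names. It floors the average effective rate at zero, weights the statutory marginal rates by income shares, approximates the effective marginal rate as the average of the forward and backward changes in tax liabilities (the approximation described in the footnote to the effective marginal rate definition, with income measured in multiples of mean income and a step of 0.4), and applies the sample restrictions. The tax function passed to `effective_marginal_rate` is an assumption of the example: any function mapping euros of gross income to tax liabilities, defined for non-positive incomes as well, would do.

```python
def average_effective_rate(tax_liabilities, gross_income):
    """Tax liabilities over gross income, set to zero when liabilities are non-positive."""
    if tax_liabilities <= 0:
        return 0.0
    return tax_liabilities / gross_income


def statutory_marginal_rate(mtr_general, mtr_savings, general_income, savings_income):
    """Marginal rates of general and savings income weighted by income shares."""
    total = general_income + savings_income
    return (mtr_general * general_income + mtr_savings * savings_income) / total


def effective_marginal_rate(tax_fn, y0, mean_income, dy=0.4):
    """Average of the forward and backward slopes of tax liabilities around y0.

    y0 and dy are in multiples of mean income; tax_fn takes income in euros.
    """
    step = dy * mean_income
    base = tax_fn(y0 * mean_income)
    up = (tax_fn((y0 + dy) * mean_income) - base) / step
    down = (base - tax_fn((y0 - dy) * mean_income)) / step
    return 0.5 * (up + down)


def keep_in_sample(gi_labor, gi_capital, gi_self, tax_liabilities, max_statutory_rate):
    """Sample restrictions: positive total, non-negative sources, rate below the top rate."""
    total = gi_labor + gi_capital + gi_self
    if total <= 0 or min(gi_labor, gi_capital, gi_self) < 0:
        return False
    return average_effective_rate(tax_liabilities, total) < max_statutory_rate


# Income grid in multiples of mean income: 0.2, 0.6, ..., 9.8.
GRID = [round(0.2 + 0.4 * k, 1) for k in range(25)]

if __name__ == "__main__":
    flat_tax = lambda income: 0.2 * max(income, 0.0)  # hypothetical tax function
    print([round(effective_marginal_rate(flat_tax, y, 24_000.0), 3) for y in GRID[:3]])
```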

Survey of Household Finances
As mentioned above, we compare the estimated household income distribution from the tax return data with that obtained from the Survey of Household Finances (EFF). The EFF is a survey conducted by the Bank of Spain that collects information on socio-economic characteristics, income, assets, debts and spending of around 6,000 households in each wave. Moreover, the survey oversamples high-wealth households, in order to allow for a sufficient number of observations to study the financial behavior at the top of the wealth distribution and to accurately measure aggregate wealth. The [...]
Note that households in the tax data are defined as the taxpayer and their spouse, i.e. excluding other members of the household filing a tax return. Therefore, in order to compare the income aggregates between the tax and the survey data, we construct two household definitions in the EFF. The first is denoted "fiscal household" and adds up the gross income of the household's reference person and their spouse. Note that the EFF provides information for each household member on labor and self-employment income items. The capital income items are, however, reported for the whole household. We assume that all capital income belongs to the household's reference person (even if a particular asset could belong, e.g., to an elderly person living with the reference person). 25 Note also that we classify the income sources provided by the EFF so as to mimic the labor, capital and self-employment groups defined in the tax data. 26 Second, we construct a larger household definition encompassing all the household members, which we denote by the term "whole household".
As with the tax data, we restrict the sample to households earning positive gross income and non-negative gross income from all sources (labor, capital and self-employment). 27 This amounts to dropping around 2% of the households.
22 According to the 2015 tax code, the boxes (in Modelo 100) corresponding to each definition are the following. [...]
23 [...] Guner et al. (2014), Section 6. For each income level $y_0$, represented as a ratio of income over mean income, the marginal tax rate is approximated as the average of the variation in tax liabilities when income increases to $y_0 + \Delta y$ and when income decreases to $y_0 - \Delta y$, with $\Delta y = 0.4$. Below we compute effective marginal rates from income levels ranging from 0.2 to 9.8 in steps of 0.4.
24 Table A.1 in the online appendix shows that the average income and other characteristics of the restricted sample do not differ significantly from those of the universe of taxpayers.
25 Note that since we focus on aggregate household income, it is irrelevant for two-person households to assign capital income to the reference person, their partner, or to split it between the two.
26 According to the 2014 EFF wave, we define gross income as: i (p6 64 i + p6 66 i + p6 68 i p6 70 + p6 74b i + p6 74 i) + p6 75d1 + p6 75d3 + p6 75d4 (labor) + i p6 72 i (self-employment) + p7 2 + p7 10 + p7 12 + p7 12a + p7 14 + p7 4a + p7 4b + p7 6a + p7 6b + p7 8a + p7 8b + p6 76b + p6 75f (capital), where i indexes each household member (the reference person and their spouse in two-person households and the former in one-person households).
27 Notice that under the two household definitions we impose this rule on the added income of the reference person and their spouse. Additionally, for the case of "whole households" we apply the restriction to each household member. Hence, a member who does not fulfill the restriction is excluded from the household.

Basic Facts of the Income and Tax Distributions
In this section we report basic facts on income, tax liabilities, and tax benefits for samples of individuals in 2015 and households in 2013. Moreover, we compare the results for the households with those obtained from the EFF.

Individuals
Table 3 summarizes how different notions of income are distributed among individuals in 2015.
The inequality in gross incomes is significant. The top quintile accounts for about 47.1% of total gross income, while the bottom quintile's share is only 4.6%, a ratio of about 10 to 1. The income share of the top 1%, a popular measure of income inequality, is about 9.5%. This is lower than in other large euro area countries, such as Germany (11.1%) and France (10.8%), and much smaller than what we observe in Anglo-Saxon economies (12.8% in the UK and 20.2% in the US). Nevertheless, it is higher than the top 1% income share in Scandinavian countries (for example, 8.8% in Sweden and 8.5% in Norway) and in Italy (7.3%).
[...] income share of the top 1% and the Gini coefficient increase as we move from gross to taxable income. This is not surprising, since most of the taxes are paid by richer households. Indeed, for many taxpayers at the bottom quintile (about 20% of them), taxable income becomes zero once deductions are applied to their gross income.
Finally, columns (4) to (6) of Table 3 show the distribution of income from different sources. Capital and self-employment income are much more unequally distributed than labor income. Compared with gross income, capital income is more concentrated in both the bottom and the top quintiles. For example, the bottom 20% accounts for just 4.6% of gross income but 5.4% of capital income, while the top 1% accounts for 9.5% and 32.7%, respectively. Self-employment income is also concentrated at the very top, but the lower end of the income distribution accumulates a substantial amount as well.
Notes: This table displays the distribution of gross income, adjusted gross income and taxable income, as well as the distribution of gross income sources (labor, capital, and self-employment) for the sample of individuals in 2015. The Gini coefficients in columns (4) to (6) are computed including the observations with zero income, while the percentile ratios of those columns exclude them.
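For readers who want to reproduce statistics of this kind on their own data, a minimal sketch of an income share and a Gini coefficient follows. It is an unweighted illustration with hypothetical function names and toy numbers; with a stratified sample such as the one used here, each observation would also need its sampling weight, which the sketch omits.

```python
def top_share(incomes, top_fraction):
    """Share of total income held by the top `top_fraction` of observations."""
    ordered = sorted(incomes, reverse=True)
    k = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)


def gini(incomes):
    """Gini coefficient from the rank formula, with observations sorted in ascending order."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    rank_weighted = sum((i + 1) * x for i, x in enumerate(ordered))
    return 2.0 * rank_weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    capital_income = [0, 0, 1, 1]  # toy data: half of the observations report zero
    # Keeping the zeros raises measured inequality relative to dropping them.
    print(gini(capital_income))
    print(gini([x for x in capital_income if x > 0]))
```

The toy example mirrors the choice described in the notes to Table 3: whether observations with zero income are kept or dropped changes the measured inequality of an income source.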
When we move to adjusted and taxable incomes in Table 3  In Table 5 we decompose the sources of income across the income distribution. As columns (1) to (3) show, labor income is by far the largest source of income. Its importance increases monotonically from quintiles 1 to 4, where it represents between 80% and 90% of total income. In distribution (columns 2 to 5). It is worth noticing that there is only a small number of taxpayers that report relatively large incomes in their tax returns, which would put them in higher income brackets (see Figure 2). Average individual gross income in the data is about e24,000. Hence, 80% of households report gross incomes that are below the mean gross income. Indeed, 99% of taxpayers report total gross income below e105,000 (about 5 times the mean income). Also, columns (2) to (5) show that average income levels across income sources are low. For instance, the top 1% earns on average slightly above e120,000 of labor income, while average self-employment 29 Note that self-employment income is net of deductible expenses associated to the business activity, see Section 3.2. As a result, the figures might underestimate the actual pre-tax income from self-employment. 30 As we document in Tables A.2   Electronic copy available at: https://ssrn.com/abstract=3420702  (4) and (5) shows the decomposition of gross income between general and savings income. Note that columns (1) to (3) and columns (4) to (5) add up to 100.
for 7.8% of gross income in the second quintile, while it drops to around 4% to 6% for richer individuals. At the top of the distribution it accounts for slightly more than 12% of total income.
In columns (4) and (5) we show the decomposition of gross income between general and savings income. While general income is by far the largest income source, for taxpayers in the top 1% income taxed under the savings scale is significant, reaching on aggregate 30% of total income.

Households
In Table 6 we compare the household income distribution in 2013 computed from the tax data and from the EFF. Regarding the latter, the column (2) depicts the income distribution under the fiscal household definition (the household head and their spouse), whereas the column (4) shows the distribution under the whole household definition (all the household members). We find that the EFF and the tax data provide very similar estimates of the income distribution, especially if the top decile income from labor is less important; although even for the top 1% the share of labor income is very high, close to 65%. In the lowest end of the distribution, especially in the bottom 1%, capital income appears very significant, although this reflects the very low income levels of this group (see Table 4). Excluding the lowest quintile, capital income accounts for around 6% to 9% of gross income, reaching 24.1% for the richest taxpayers. Self-employment income accounts Electronic copy available at: https://ssrn.com/abstract=3420702 between the tax and the survey data tend be larger.

Tax Rates and Tax Liabilities
In Table 7 we summarize the distribution of tax liabilities and tax rates. In columns (1) and (2) we also depict the corresponding distributions of gross income and taxable income (already shown in  amounts to around 50% in both the tax and the survey data, while the bottom 20% receives around 5% of earnings. In general, the discrepancies between the tax and the survey data tend to be larger Electronic copy available at: https://ssrn.com/abstract=3420702 The high concentration of tax liabilities is reflected in the small average tax rates at the lower end of the income distribution and the larger rates at the upper end, which average 19.0% in the top quintile and 30.6% in the top 1%. Average statutory marginal tax rates are also highest for richer individuals, reaching almost 40% for the top 1%, while they are significantly lower as we move down the income distribution.
These averages hide a substantial degree of heterogeneity across individuals. Panel A of Figure   3 depicts the average effective tax rates across different multiples of mean gross income, together 18 with 2 standard error bands. 31 As can be seen, there is wide variation of tax rates even for individuals with the same gross income, being this the result of different family characteristics and tax benefit entitlements. The shape of this curve is what the parametric estimates of Section 5 are meant to approximate. 32 In panel B of Figure 3 we represent the corresponding curves of statutory and effective marginal tax rates. The figure shows that marginal rates increase rapidly with income, but stabilize at 31 Note that mean individual gross income in 2015 was e24,291, while household mean income in 2013 amounted to e30,839. 32 Figure A.1 in the online appendix shows that median tax levels are almost identical to mean tax levels up to 4 times mean income (about e100,000) and slightly higher above that. Notes: This table shows the distribution of individual tax liabilities (column 3), average effective tax rates (column 4) and statutory marginal tax rates (column 5) across the gross income distribution. In columns (1) and (2) the distribution of gross income and taxable income are summarized in order to highlight the progressivity of the tax code.
for 9.5% of gross income, but pays about 21% of total taxes. As a matter of fact, close to 93% of tax payments are concentrated in the top 40%, while the bottom two deciles account for only 0.5% of the tax.
around 3 times mean income (€75,000) and start to decline linearly at a slow rate. The set of tax benefits places the effective curve below the statutory one, the difference being about 4 percentage points on average. While the effective tax rates increase from 0 to about 30%, most taxpayers face much lower rates.
For about 75% of all taxpayers, the effective tax rates are below 15% (the sum of the first 3 bars in Figure 4). As a result, while most discussion of tax increases and tax cuts focuses on top marginal rates, for the great majority of households the relevant tax rates are much lower. 33

Tax Benefits (Deductions and Credits)
We next turn to the distribution of tax benefits. In Table 8 we describe the distribution of the most important tax deductions, which, as we mentioned in Section 3.2, are tax benefits that reduce

(the last row). When we consider the aggregate, the most important tax deduction is the one granted to labor income earners, which accounts for about 63% of total deductions. It is followed by social security contributions paid by employees (20%), the tax benefit associated with joint tax returns (10%), and the contributions to private pension plans (4%). There are, however, differences in the importance of these deductions along the income distribution. For instance, the deduction for contributions to private pension plans accounts for 27% of all tax deductions for the top 1% of taxpayers, while it represents less than 2% for the first two quintiles. 34 The top quintile benefits from more than 25% of the total tax deductions, while the bottom quintile receives around 16% (see the first column of

Parametric Estimates
In this section we present the estimated effective average tax functions. We proceed as follows.

Effective Tax Functions of Individuals in 2015
the online appendix shows their relative importance for different income groups. The family allowance is by far the largest tax credit, representing more than 95% of these benefits for the bottom 20% and more than 80% for the top 20%. Next come the tax credit associated with house purchases, those granted to employed mothers and large families, and a battery of region-specific tax credits. 35 As for the distribution of these benefits, the family allowance is evenly distributed, since it depends solely on family characteristics. Note that the smaller share accruing to the lower end of the income distribution is explained by the exhaustion of tax liabilities as a result of the application of (part of) this allowance. By contrast, the tax credits associated with house purchases and large families benefit the richer individuals, whereas the benefits granted to employed mothers and the set of region-specific benefits go mainly to the middle of the income distribution.

In order to account for the fact that a significant number of Spanish taxpayers face a zero tax rate (panel A of Figure 4), we estimate:

$$t(\tilde{I}) = \begin{cases} 0 & \text{if } \tilde{I} < \bar{I} \\ f(\tilde{I}) & \text{if } \tilde{I} \geq \bar{I} \end{cases} \qquad (1)$$

where t is the average tax rate, Ĩ stands for multiples of mean gross income, Ī is the income threshold, chosen so as to minimize the mean squared error, and f(Ĩ) is a parsimonious non-linear
and the GS specification, used in Gouveia and Strauss (1994): Note that in this case Ĩ is replaced by I, i.e. by the income level. 37 Table 10 shows the parameter estimates. 38 In general, the parameters are estimated with a high degree of precision. The income cutoffs are estimated between 49% and 55% of mean income for individuals in 2015, and between 36% and 42% of mean income for households in 2013.

In panel A of Figure 5 we plot the estimated average tax rates resulting from the specifications together with the data. The observed average tax rates show a steep increase at lower income levels and then flatten out at the right end of the income distribution. From equation (2), the marginal tax rate of the HSV specification is given by: while from equation (3) we can derive the marginal tax rate function of the GS specification as:

39 For the OECD tax and benefit calculator, see: http://www.oecd.org/els/soc/benets-and-wages/tax-benetwebcalculator/.

Using the parametric estimates reported in Table 10, panel B of Figure 5 shows the resulting marginal tax rate functions, as well as the data. The data for marginal tax rates correspond to effective marginal tax rates. As mentioned in Section 4.2, effective marginal rates increase rapidly and flatten out at a certain income level. This last feature is well accounted for by the shape of the GS function. On the other hand, marginal tax rates under this specification increase and flatten too quickly compared to the data. At around 5 times mean income, the marginal tax rates are 33.5% under the GS specification, while they are 36.8% in the data. In contrast, at 1.5 times mean income, the GS tax function overestimates the marginal tax rates by around 3.5 percentage points. The HSV tax function, for its part, captures the marginal tax rates very well up to 4 times mean income. After 4 times mean income, however, the marginal tax rates keep increasing under the HSV function, while they are flat in the data. By 5 times mean income, for example, the marginal tax rate under the HSV function is about 3 percentage points higher than in the data.
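The comparison of the two specifications can be illustrated with a short sketch. The functional forms below are the ones commonly associated with the HSV and GS labels in the literature (an average rate of 1 - λ·Ĩ^(-τ) for HSV and b·[1 - (s·I^p + 1)^(-1/p)] for GS); since the section's own equations are not reproduced in this text, these forms, the function names and the parameter values are assumptions made for illustration and are not the estimates of Table 10.

```python
def hsv_average_rate(i_rel, lam, tau):
    """Assumed HSV-type average rate; i_rel is income in multiples of the mean."""
    return 1.0 - lam * i_rel ** (-tau)


def hsv_marginal_rate(i_rel, lam, tau):
    """Derivative of the liability i_rel * t(i_rel) under the assumed HSV form."""
    return 1.0 - lam * (1.0 - tau) * i_rel ** (-tau)


def gs_average_rate(income, b, s, p):
    """Assumed GS-type average rate; converges to b as income grows."""
    return b * (1.0 - (s * income ** p + 1.0) ** (-1.0 / p))


def gs_marginal_rate(income, b, s, p):
    """Derivative of the liability income * t(income) under the assumed GS form."""
    return b * (1.0 - (s * income ** p + 1.0) ** (-1.0 / p - 1.0))


def with_threshold(rate_fn, threshold):
    """Zero rate below the income threshold, rate_fn at or above it."""
    def rate(i_rel):
        return 0.0 if i_rel < threshold else rate_fn(i_rel)
    return rate


def numerical_marginal(avg_fn, x, h=1e-6):
    """Central-difference derivative of the liability x * avg_fn(x)."""
    return ((x + h) * avg_fn(x + h) - (x - h) * avg_fn(x - h)) / (2.0 * h)


# Illustrative (hypothetical) parameter values, not the paper's estimates.
HSV = {"lam": 0.9, "tau": 0.15}
GS = {"b": 0.4, "s": 0.5, "p": 1.2}

if __name__ == "__main__":
    for x in (1.0, 2.0, 5.0, 10.0):
        print(x, round(hsv_marginal_rate(x, **HSV), 4),
              round(gs_marginal_rate(x, **GS), 4))
```

Under these assumed forms the GS marginal rate is bounded above by b, whereas the HSV marginal rate keeps rising with income, which is the qualitative difference the text describes at high income levels.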
Overall, the HSV function provides a good fit for the tax rates of the well-off, but it is unable to capture the near-constant marginal tax rates at very high income levels, which leads to an overestimation of

Three-function Approach
In this section we provide an alternative approach to parametrize the Spanish Personal Income Tax. We estimate three different functions that connect income from different sources (general vs. savings) to the tax liabilities. Specifically, we estimate a function that relates general income with general tax rates; a second function that links the savings income to the savings tax rates; and a third function that accounts for the amount of tax credits as a function of total gross income. In this way, starting from gross income by income source, the final tax liabilities of the taxpayer can be easily estimated by going through each of these functions. It must be noted that one advantage of this three-function approach is that it allows simulating more detailed reforms, such as a change in capital tax rates.
For the general tax rate function we posit the same functional form as in the effective tax function estimated in Section 5.1, i.e. that described in equation (1). We follow this approach given that the shape of general tax rates resembles that of effective tax rates. We

where t_s is the average savings tax rate, I_s stands for multiples of mean savings income, κ is the sample mean of the savings tax rate if I_s ≥ S̄, and S̄ is again chosen so as to minimize the MSE.
Finally, for the tax credit function, we follow Guner et al. (2017) and estimate the following Ricker model: where c stands for total tax credits as a fraction of gross income and I refers to multiples of mean gross income. 43, 44 The three estimated functions are depicted in Figure 6, while the parametric estimates are shown in Table 11. Panel A of the figure indicates that both the HSV and the GS specifications

43 Note that total tax credits are computed net of the family allowance, since the latter is subsumed in the computation of the general tax rate; see the definition thereof in Section 3.2. Also, note that the general tax rate function and the tax credit function are estimated by NLS, while the savings tax rate function is estimated by OLS.
In the estimation of the tax credit functions, we exclude a few observations whose tax credits are larger than their gross income. 44 The second term in equation (7) is known as the Ricker function, after Ricker (1954).

HSV and GS specifications), the savings income tax function, and the tax credits function for individuals. Ī stands for mean general income below which tax rates are estimated to be zero, while S̄ is the estimated kink of the savings tax function.
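How the three functions chain together can be sketched as follows. The text specifies only the broad shapes (a savings rate that is linear up to a kink S̄ and constant at κ above it, and a credit function with a Ricker term), so the exact parametrization, the way the pieces are combined into a liability, the definition of gross income as the sum of the two sources, and every name and value below are assumptions made for illustration.

```python
import math


def savings_rate(i_s, alpha, beta, kink, kappa):
    """Assumed kinked form: linear in i_s below the kink, kappa at or above it."""
    return alpha + beta * i_s if i_s < kink else kappa


def credit_share(i_rel, a, b, d):
    """Assumed form: constant a plus a Ricker term b * i_rel * exp(-d * i_rel).

    Returns tax credits as a fraction of gross income.
    """
    return a + b * i_rel * math.exp(-d * i_rel)


def tax_liability(general_income, savings_income, general_rate_fn,
                  mean_general, mean_savings, mean_gross,
                  savings_params, credit_params):
    """Liability = general tax + savings tax - tax credits (assumed combination)."""
    # Assumption: gross income is the sum of general and savings income.
    gross = general_income + savings_income
    general_tax = general_rate_fn(general_income / mean_general) * general_income
    savings_tax = (savings_rate(savings_income / mean_savings, **savings_params)
                   * savings_income)
    credits = credit_share(gross / mean_gross, **credit_params) * gross
    return general_tax + savings_tax - credits


if __name__ == "__main__":
    # Hypothetical parameters; the constant a mirrors the roughly 0.62% of
    # gross income to which the text says the credit share converges.
    savings = {"alpha": 0.19, "beta": 0.001, "kink": 13.0, "kappa": 0.21}
    credits = {"a": 0.0062, "b": 0.01, "d": 1.0}
    print(tax_liability(20000.0, 1000.0, lambda x: 0.2,
                        20000.0, 500.0, 21000.0, savings, credits))
```

A reform to capital taxation of the kind mentioned in the text could then be simulated by changing only `savings_params` while holding the other two functions fixed.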

Evaluation of Tax Functions
capture well the shape of the general income tax function. Panel B also makes apparent that the shape of the savings income tax function is well approximated by a piecewise function of the form estimated, where the tax rate increases linearly and flattens out at around 13 times mean savings income. Finally, as a fraction of gross income, tax credits seem to benefit most the taxpayers earning around mean income. From that point on, the incidence of tax credits diminishes until it converges at around 0.62% of gross income. This shape is captured reasonably well by the proposed model, although the tax credits at the right end of the income distribution are overstated (see panel C).

(3) is based on the GS function. In columns (4) and (5) we report the results from the three-function approach. This entails estimating one function each for general income tax rates, savings income tax rates and tax credits. In column (4) the general income function is the HSV specification, whereas in column (5) it is the GS function. The savings income tax rates are modeled by a linear function with a kink and tax credits are estimated as in Guner et al. (2017). See Section 5.2 for more details.

How well do these functions capture the level and the distribution of tax liabilities? In this section we provide an assessment. In the first column of Table 12 we depict the distribution of tax revenue by income quantile in the data. The remaining columns show the percentage deviation of the estimates from the data. We can see that the tax functions approximate quite well total
tax collection, except the HSV specification of the single-function approach, which tends to overpredict it. For example, both specifications in the three-function approach render a deviation of less than 1.5%, while the GS function in the one-function approach underestimates total revenue by less than 1%. As already observed in Figure 5, the fact that this function converges to a top marginal tax rate below the one observed in the data leads to an underprediction of taxes paid by the top 1%, although the revenue raised by the top 20% is well accounted for. In contrast, the ever-increasing top marginal tax rate of the HSV function results in an overprediction of taxes paid by the 20% and 1% richest taxpayers in the one-function approach.

Notes: This table shows the percentage point difference of the distribution of tax liabilities across income groups estimated from each tax function and the data. Columns (2) and (3) are based on tax functions estimated from final tax liabilities, i.e. the one-function approach, see Section 5.1. Column (2) displays the results of the HSV function, while column (3) is based on the GS function. In columns (4) and (5) we report the results from the three-function approach. This entails estimating one function each for general income tax rates, savings income tax rates and tax credits. In column (4) the general income function is the HSV specification, whereas in column (5) it is the GS function. The savings income tax rates are modeled by a linear function with a kink and tax credits are estimated as in Guner et al. (2017). See Section 5.2 for more details.

by the top 40% (see the first column), a degree of progressiveness that is well captured by the tax
functions. Also, as noted before, the main challenge is to account for the average rates of the very rich. In this regard, it is worth noting that the differences are reasonably small, being lower than 1.5 percentage points in all specifications, except the HSV function in the one-function approach.
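The two evaluation metrics used in this section, the percentage deviation of predicted from actual revenue and the percentage point difference in each group's share of total taxes, can be computed along the following lines. This is a minimal unweighted sketch: the function names, the equal-sized grouping rule and the toy data are assumptions for illustration, and any sampling weights in the underlying data are ignored.

```python
def quantile_groups(incomes, n_groups=5):
    """Indices of observations split into n_groups equal-sized income groups."""
    order = sorted(range(len(incomes)), key=lambda k: incomes[k])
    size = len(order) / n_groups
    groups = [[] for _ in range(n_groups)]
    for rank, idx in enumerate(order):
        groups[min(int(rank / size), n_groups - 1)].append(idx)
    return groups


def evaluate(incomes, actual_tax, predicted_tax, n_groups=5):
    """Per-group fit metrics plus the deviation in total revenue (in percent)."""
    total_actual = sum(actual_tax)
    total_pred = sum(predicted_tax)
    rows = []
    for group in quantile_groups(incomes, n_groups):
        a = sum(actual_tax[k] for k in group)
        p = sum(predicted_tax[k] for k in group)
        rows.append({
            # Share of actual tax revenue paid by the group, in percent.
            "actual_share": 100.0 * a / total_actual,
            # Percentage deviation of predicted from actual group revenue.
            "pct_deviation": 100.0 * (p - a) / a if a else float("nan"),
            # Predicted minus actual share of total taxes, in percentage points.
            "share_diff_pp": 100.0 * (p / total_pred - a / total_actual),
        })
    total_deviation = 100.0 * (total_pred - total_actual) / total_actual
    return rows, total_deviation


if __name__ == "__main__":
    # Toy data, purely illustrative.
    incomes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    actual = [0, 0, 1, 2, 3, 4, 6, 8, 12, 14]
    predicted = [0, 0, 1, 2, 3, 4, 6, 8, 12, 15]
    rows, total_dev = evaluate(incomes, actual, predicted)
    for row in rows:
        print(row)
    print("total deviation (%):", total_dev)
```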

Changes in Effective Tax Rates since 2000
In  Table A.7 reports the parameter estimates. 45 The picture encompassing the full set of years as well as the parameter estimates are available upon request.
For this reason, we use the tax functions to estimate, given gross income, the tax liabilities faced by the household, as we explain below. After-tax income, in both the administrative and survey data, is computed as gross income minus tax liabilities.
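The imputation of after-tax income from gross income can be sketched as below, for the two household definitions that the section goes on to describe (a household function for the reference person and spouse, an individual function for any remaining members). The function names, the flat illustrative rate functions and the treatment of non-positive incomes as untaxed are assumptions; the mean incomes are the ones reported in the text for individuals in 2015 and households in 2013.

```python
MEAN_INDIVIDUAL_2015 = 24291.0  # mean individual gross income reported in the text
MEAN_HOUSEHOLD_2013 = 30839.0   # mean household gross income reported in the text


def after_tax(gross, rate_fn, mean_income):
    """Gross income minus liability, with liability = average rate * gross."""
    if gross <= 0.0:
        # Assumption: non-positive incomes carry no tax liability.
        return gross
    return gross * (1.0 - rate_fn(gross / mean_income))


def fiscal_household_after_tax(head, spouse, hh_rate_fn,
                               hh_mean=MEAN_HOUSEHOLD_2013):
    """Household function applied to the pooled income of head and spouse."""
    return after_tax(head + spouse, hh_rate_fn, hh_mean)


def whole_household_after_tax(head, spouse, others, hh_rate_fn, ind_rate_fn,
                              hh_mean=MEAN_HOUSEHOLD_2013,
                              ind_mean=MEAN_INDIVIDUAL_2015):
    """Household function for head and spouse, individual function for the rest."""
    core = after_tax(head + spouse, hh_rate_fn, hh_mean)
    rest = sum(after_tax(x, ind_rate_fn, ind_mean) for x in others)
    return core + rest


if __name__ == "__main__":
    # Flat rates used only to keep the example transparent (hypothetical).
    print(whole_household_after_tax(20000.0, 10000.0, [5000.0, 0.0],
                                    lambda x: 0.1, lambda x: 0.2))
```

In practice the two rate functions would be replaced by the estimated household and individual tax functions.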
Starting with the tax data, columns (1) to (4) of Table 14 illustrate the progressiveness of the tax code, by depicting the distribution of gross and after-tax income, for both individuals and households. It is worth noting that after-tax income is substantially less unequal than gross

Effective Tax Functions for Households in 2013
The second column of Table 10

After-tax Income
In this section we provide a brief account of after-tax income in both the tax and the survey (the EFF) data. This allows us to evaluate the progressiveness of the tax code, by comparing gross income and after-tax income figures. Note that in the survey data after-tax income is not observed.
income. The Gini coefficient, for instance, declines by about 4 to 5 percentage points (from 0.42 to 0.38 for individuals, and from 0.45 to 0.40 for households), and the 90th to 10th percentile ratio is reduced from 7.31 to 6.00 for individuals and from 7.89 to 6.50 for households. Along the income distribution, the income share of the top 20% falls by around 4 percentage points as a result of the tax, while the remaining quintiles experience an increase in their income share.
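The two inequality measures quoted above can be computed for gross and after-tax income as in the following sketch. It is unweighted and uses linear interpolation for percentiles; both choices, the function names and the toy progressive schedule are assumptions for illustration and need not match the procedure behind the reported figures.

```python
def gini(values):
    """Gini coefficient of non-negative values (unweighted)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def percentile(values, q):
    """q-th percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = q / 100.0 * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def p90_p10(values):
    """Ratio of the 90th to the 10th percentile."""
    return percentile(values, 90) / percentile(values, 10)


if __name__ == "__main__":
    # Toy incomes and a hypothetical progressive average-rate schedule.
    gross = [10.0, 20.0, 30.0, 40.0, 100.0]
    net = [g * (1.0 - min(0.4, g / 250.0)) for g in gross]
    print("Gini:", gini(gross), "->", gini(net))
    print("P90/P10:", p90_p10(gross), "->", p90_p10(net))
```

With a progressive schedule both measures fall when moving from gross to after-tax income, which is the direction of the change reported in the text.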
We also depict the corresponding data and functions for single households (panel B) and married households (panel C). Each data point corresponds to the mean average tax rate of taxpayers whose income is larger than or equal to the point in the x-axis and less than the next point. The tax rate functions are evaluated at the corresponding x-axis point.

Notes: This table depicts the gross and after-tax income distributions from tax data for individuals in 2015 (columns 1 and 2), households in 2013 (columns 3 and 4) and the gross and after-tax income distributions of the Survey of Household Finances (EFF, columns 5 to 8). After-tax income is computed by subtracting the tax liabilities from gross income. Households in the EFF are constructed under two alternative definitions: fiscal household (columns 5 and 6), comprising the head and their partner, and whole household (columns 7 and 8), including all household members. Note that the gross income in column (5) is the result of aggregating incomes from all members of the household even if they are negative, while column (7) is estimated for all members of the household restricted to have positive gross income and non-negative income sources. The after-tax income in the EFF (columns 6 and 8) is estimated by applying the estimated household tax function of the GS specification to the gross household income in the EFF.

In columns (5) to (8) of the same table we present the after-tax income distribution estimated in the survey data. We show the results for the two household definitions: fiscal household (comprising the reference person and their spouse, in columns 5 and 6) and whole household (comprising all household members, in columns 7 and 8). As mentioned above, the EFF provides income solely in gross terms. Hence, we make use of the estimated GS function from the tax data to approximate
the tax liabilities faced by each household in the survey, and then compute after-tax income. For the definition of fiscal household, we apply the household tax function. For the whole household definition, we apply the household tax function for the reference person and their spouse, the individual function for the remaining household members, and then we aggregate each member's after-tax income at the household level. We find that the estimated after-tax income distributions in the survey data are able to capture the shift from the gross to net income distribution that we observed in the tax data. Specifically, the first four quintiles experience an increase in their income share, while the top 20% undergoes a reduction, the magnitude of the changes being similar to those observed in the tax data. 47 Hence, the application of the tax functions to the survey data can provide a fruitful approach to analyze after-tax income in this type of dataset, even if the actual information is missing.

In Tables 15 and 16 we report how gross and after-tax income inequality have changed in recent years. In the individual data the Gini coefficient remains relatively stable during the sample period, while there is an increase of the 90th to 10th and 50th to 10th percentile ratios in the wake of the financial crisis, suggesting larger inequality among taxpayers. This increase can be explained by the evolution of income shares along the income distribution, which are depicted in Tables A.10
and A.11 in the online appendix. In this respect, it is worth noting the income share decline of the bottom 20% of taxpayers. Regarding the household tax data, while overall inequality, as captured by the Gini index, seems to have increased in the run-up to the crisis and decreased thereafter, the percentile ratios show somewhat the opposite trend, see Table 16. 48 In Table 17 we depict the evolution of the household gross income distribution as computed from the different waves of the EFF. They point to a rather stable distribution, at least with respect to the selected inequality indices. Tables A.14 and A.15 in the online appendix show the corresponding evolution in the gross and after-tax income shares, respectively. Interestingly, the over time pattern is comparable to that found in the tax data (see Tables A.12

48 See also Tables A.12 and A.13 for the evolution of household gross and after-tax income shares during the sample period, respectively.

distribution of gross income and its sources, taxable income, tax benefits, tax liabilities and after-tax income, as well as effective average and marginal tax rates. We do so for individuals and for households, defining the latter as either joint declarations or as two individual declarations from the same household, and differentiate between single and married. We also briefly review how the PIT legislation and the effective tax rates have changed during the period of the analysis.
A second contribution of the paper is the estimation of parametric functions of the effective average tax rates that can be readily used in applied work. We follow two different approaches.
First, we estimate a single expression for the final tax liabilities as a function of gross income.
Second, we estimate three different functions, one for the general tax rates that apply to the general taxable income, one for the savings tax rates, applied to the savings taxable income, and one for the tax credits. Both approaches generate a distribution of tax liabilities that is very close to the one we observe in the data.

Compliance with Ethical Standards
Guner acknowledges financial support from the Spanish Ministry of Economy and Competitiveness, Grant ECO2014-54401-P. The authors declare that they have no conflict of interest. This article does not contain any studies with human participants or animals performed by any of the authors.
This article does not contain any information that requires informed consent.

Conclusions
In this paper we exploit a rich uncensored administrative dataset of tax returns for the years 2002 to 2015 to present key facts about the Spanish Personal Income Tax system. We focus on the

Notes: This table shows the distribution of gross and after-tax income from the tax return data aggregated at the household level and the Survey of Household Finances (EFF). Households in the EFF are constructed under two alternative definitions: fiscal household (columns 3 and 4), which consists of the head and their partner, and whole household (columns 5 and 6), which includes all household members. Note that the gross income in column (3) is the result of aggregating incomes from all members of the household even if they are negative, while column (5) is estimated for all members of the household restricted to have positive gross income and non-negative income sources. The after-tax income in the EFF (columns 6 and 8) are estimated by applying the estimated household tax function of the HSV specification to the gross household income in EFF.

Notes: This table shows the evolution of after-tax income as estimated in the EFF. Households are defined as fiscal households, i.e. they include the reference person and their spouse. After-tax income is estimated by applying the GS tax function as estimated in column (2) of Table 10 to the household gross income.

Notes: This table depicts the over time evolution of after-tax income as estimated in the EFF. Households are defined as fiscal households, i.e. they include the reference person and their spouse. After-tax income is estimated by applying the HSV tax function as estimated in column (2) of Table 10 to the household gross income.