1 Introduction

With more than 75,000 pages, US federal tax bills are three times larger today than when President Ronald Reagan complained about their complexity, even though the Internal Revenue Service has revised tax schedules and instructions to improve readability over the past decades. Hence, despite the regulatory efforts made to improve the readability of tax bills (US Comptroller General, 1978), several pieces of evidence indicate that their readability remains low (see, e.g., Bradbury et al., 2018). This can have sizable and detrimental effects on economic activity since legislative complexity can be seen as an indicator of a poor quality of government and a source of corruption (Rose-Ackerman, 2007).Footnote 1

Persistent legislative complexity might be the result of the rational behavior of self-interested politicians aiming to appropriate public rent and to modify the income distribution in their favor (Tullock et al., 2002). Indeed, complex legislation makes the applicable law ambiguous and enables members of the legislature to take advantage of the asymmetric information with private agents (Acemoglu, 2006, 2013; Aidt, 2013). One channel through which this might be possible is the textual complexity of tax bills and the resulting underreaction of investors to (complex) information.

To date, however, the literature has considered the effect of tax complexity mainly through the lens of the number of tax rates, tax bases, tax payments, and special provisions. For example, Saez (2010) finds that taxpayers do not bunch at kinks if marginal incentives are fully understood and labor supply could be freely adjusted, while Baake et al. (2004) suggest that a simplified tax system, which does not distinguish between consumption goods and work-related goods, provides higher incentives to work and is less redistributive. Tax complexity, however, has other dimensions, such as the readability of tax documents, which has not been studied thus far.Footnote 2 Indeed, textually complex financial documents, such as those related to tax bills, might limit investors’ ability to extract information and could lead to an (initial) underreaction. Following this line of thought, Abeler and Jaeger (2015) conduct a controlled laboratory experiment and find that people underreact to complex tax incentives, measured through the number of pages of legislation, the readability of the legislation, and the number of exemptions. Looking at consumers, Chetty et al. (2009) document that consumers underreact to changes in non-salient taxes. Both papers explain their findings with the cognitive costs that individuals have to bear to digest all available information.

Against this background, we study the textual complexity of federal tax bills and its consequences on financial markets. We argue that textual complexity might affect financial market participants’ behavior through at least two channels. First, tax bill complexity can be considered as a “barrier to entry” into markets due to private costs of acquiring information about new legislation (Kaplow, 1996). In this context, complexity reflects additional transaction costs that might prevent investors from accessing more data. This reduces their willingness to collect or process the information and, ultimately, affects their trading behavior (Bloomfield, 2008; Grossman & Stiglitz, 1980; Lundholm, 1991). This channel is referred to as the “incomplete revelation hypothesis.”

The second channel is based on the literature on human psychological and cognitive processes and examines the role of financial information in shaping investor decision-making processes (Sims, 2005). Specifically, given the cognitive limitations of investors, information complexity might affect their ability to fully incorporate such information into investment decisions. Put differently, textual complexity adds noise to messages and receivers are unable to fully internalize this information (Jin et al., 2022). As a consequence, investors might underreact to information that is more difficult to read, which results in an inertial reaction of asset prices to new information (see, for instance, Dellavigna and Pollet, 2009). This channel is referred to as the “limited attention hypothesis.” In contrast, improved processing fluency associated with more readable disclosures enhances investors’ confidence so that they can rely on the information in their decision-making and react more adequately to the disclosures (Loughran & McDonald, 2014).Footnote 3

Based on these considerations, we hypothesize that financial market participants underreact when they face a complex tax bill. To test this hypothesis, we first collect the 32 tax bills identified by Romer and Romer (2010) (henceforth RR) in the period 1962–2003 from the congressional archives.Footnote 4 We rely on this particular dataset since it is commonly used in the literature to identify fiscal policy shocks and allows differentiating the tax bills according to their revenue effects, the nature of the changes, and their motivation. Second, to measure the textual complexity of tax bills, we rely on the Flesch-Kincaid (FK) grade level index to generate a readability measure based on education grade levels (Kincaid et al., 1975). The most important benefit of this measure is that it is based on objective elements of the underlying texts. Hence, despite the recent advancements in text analysis, the FK index remains very efficient and is widely used in the social science literature.Footnote 5 Finally, we assess the relationship between the tax bills’ textual complexity and financial markets’ behavior by considering the changes in 10-year government bond yields and S&P 500 returns in windows from one day before the signing of a tax bill until up to four days after the signing of a bill by the President.Footnote 6

Our results show that financial market participants react significantly to the characteristics of tax bills after their signing. Specifically, we find a negative (positive) and significant relationship between the present value of tax bills and changes in the 10-year bond yields (S&P 500 returns), suggesting that market participants actually appreciate an increase in taxes. The magnitude of this relationship increases over time, indicating that market participants underreact at first and need a couple of days to digest the information contained in the tax bills. This delay can be explained by the textual characteristics in the case of the 10-year bond yields as a lower readability (proxied by the years of education required to understand a bill) partly counteracts the negative relationship for up to three days after the signing of a tax bill, but not thereafter. In the case of the stock market, we find similar evidence, but only for a part of the readability measures employed in this paper. We also test if the results differ according to the nature of the tax change (endogenous or exogenous) and the phase of the business cycle during which the bill is signed. Lastly, we document the robustness of our results with respect to (i) the usage of abnormal changes in bond yields (abnormal stock returns) instead of their raw counterparts, (ii) other textual characteristics of the tax bills, such as length, textual uncertainty, and sentiment, (iii) the timing of the tax bills (using the date of the vote in the Congress instead of the Presidential signing), (iv) the electoral cycle, (v) parameter instability, and (vi) outliers. Hence, we provide robust empirical evidence for an underreaction of investors around the day a bill is signed, which is explained by the textual complexity of the tax bills.

This paper relates to at least three strands of the literature. The first one analyzes the effect of tax changes using a narrative identification approach and the same dataset (e.g., Romer & Romer, 2010; Perotti, 2012; Mertens & Ravn, 2012, 2013). The second strand examines the effect of the complexity of financial disclosures on investors’ behavior (e.g., Miller, 2010; You & Zhang, 2009). The third branch analyzes the connection between financial markets and tax policy (e.g., Gaertner et al., 2020; Ardagna, 2009; Wagner et al., 2018).Footnote 7

The remainder of this paper is organized as follows. Section 2 describes our approach to measure the textual complexity of tax bills. Section 3 presents the empirical methodology. Section 4 shows the results. Section 5 concludes.

2 Measuring the textual complexity of tax bills

We first collect the dataset of tax bills identified by RR from the congressional archives for the period from October 1962 to May 2003 (see Table 12 in Appendix A for a list of the bills). Hence, the Revenue Act of 1962 (signed on 16-Oct-62) marks the first event in our sample, given the availability of daily 10-year government bond yields in the FRED database, and the Jobs and Growth Tax Relief Reconciliation Act of 2003 (signed on 28-May-03) marks the final one. We obtain a set of 32 scanned PDF files that need to be converted into text files in order to be processed. We use the Tesseract package in R, which relies on a powerful optical character recognition (OCR) engine. OCR is the process of finding and recognizing text inside a scanned document, using language-specific training data to recognize words. Specifically, Tesseract converts each PDF file into PNG images and then scans for text within the created images.Footnote 8

As a second step, we rely on the definition of Schuck (1992) to measure the complexity of the tax bills.Footnote 9 Out of the four dimensions he identifies, we focus on density and conjecture that the complexity of a tax bill can be objectively measured with the characteristics of its text. Indeed, social scientists usually proxy text complexity by its readability, that is, the ease with which an individual can read a text and understand the information. Following this line of thought, past research (Flesch, 1948) has identified text characteristics, such as the length of words and sentences, as good predictors of readability.

Several measures exist to quantify the readability of tax bills and, thus, their complexity. One of them is the Flesch-Kincaid (FK) grade level (Kincaid et al., 1975), which can be interpreted as the number of years of education needed to sufficiently comprehend a text.Footnote 10 The FK index has been used in a variety of fields, spanning from medicine to psychology, also including many studies in economics (e.g., Jansen, 2011a) and political science (e.g., Schoonvelde et al., 2019). The FK index is calculated as follows:

$$\begin{aligned} 0.39 \cdot \mathrm{{(\# words /\# sentences)}} + 11.8 \cdot \mathrm{{(\# syllables/\# words)}} \ - \ 15.59 \end{aligned}$$
(1)

(# words /# sentences) reflects the number of words per sentence and (# syllables/# words) the number of syllables per word. The rationale underlying this index is that many words per sentence or many syllables per word in a text decrease (increase) its readability (complexity) and, therefore, require more years of education to be understood.
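As an illustration of Eq. (1), the following minimal sketch computes the FK grade level either from raw counts or from a text. The syllable counter is a crude vowel-group heuristic and the tokenization rules are our own assumptions for the example; they are not the procedure used to produce the results in this paper.

```python
import re


def fk_grade_from_counts(n_words, n_sentences, n_syllables):
    """Flesch-Kincaid grade level as in Eq. (1), computed from raw counts."""
    return 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59


def count_syllables(word):
    """Assumption: approximate syllables by counting groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fk_grade(text):
    """FK grade level of a text, using simple (hypothetical) tokenization rules."""
    # Assumption: sentences end with '.', '!' or '?'; words are alphabetic runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade_from_counts(len(words), len(sentences), syllables)
```

With 100 words, 5 sentences, and 150 syllables, the formula gives 0.39 · 20 + 11.8 · 1.5 − 15.59 = 9.91, that is, roughly ten years of education.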

Fig. 1 The Flesch-Kincaid Grading Level of Tax Bills. Notes: The figure shows the textual complexity of the tax bills (as measured by the FK grade level). The corresponding tax bills can be found in Table 12

Figure 1 shows the FK index of the tax bills from 1962 until 2003. Most of the federal tax bills in our sample were implemented during the period between 1960 and 1985. There is no clear upward or downward trend in the complexity of tax bills over time.Footnote 11 Additional information about the present value of the tax bills (see Fig. 3 in Appendix A) does not reveal any substantial correlation between these two dimensions (see Table 14 in Appendix A). Lastly, it has to be noted that the tax bill from 15-Mar-1966 (Tax Adjustment Act of 1966) is clearly an outlier in terms of its textual complexity and will be excluded as part of our robustness tests in Sect. 4.3.

3 Empirical methodology

Our sample consists of the 32 legislative texts identified by RR for which we have financial market data at hand. We consider the signing of each bill as an event and are interested in the movement of financial markets around the events. Figure 4 shows the average daily changes in the 10-year government bond yields as well as the average daily returns of the S&P 500 from two days before until four days after the signing of a tax bill. We do not observe a clear pattern in the data, but this should not be surprising. Tax bills have at least two dimensions that should matter for market participants.

The first one is the present value of taxes (as % of nominal GDP), which has been used in various studies to analyze the effect of tax shocks (see, e.g., Romer & Romer, 2010; Perotti, 2012; Mertens & Ravn, 2012, 2013). Our sample consists of 17 bills that lead to an increase in the present value of taxes and 15 tax cuts, with the Economic Recovery Tax Act of 1981 (signed on 13-Aug-81) being the largest (with a present value of \(-\)3.96% of GDP).Footnote 12 An increase in taxes should lead to lower refinancing costs of the government and—as a contractionary fiscal policy measure—should also dampen economic growth (at least in the short-run). Both channels indicate a negative relationship between the present value of a tax bill and bond yields after the signing of the tax bill. The effect on stock returns, however, is a priori unclear. An improvement in public finances should lead to a lower discount rate, but a higher tax burden should reduce (future) cash flows. Hence, the discounted present value of firms could either increase or decrease.

The second dimension is the textual complexity of tax bills. As indicated in Sect. 1, market participants might need some time to digest the information in the bill in order to update their information set and to adjust their trading behavior. Hence, we expect that the maximum effect of a tax bill on bond yields and stock returns materializes with a time lag due to an underreaction of the investors. In particular, we conjecture that a higher level of textual complexity contributes to this delay and we expect an opposing sign compared to the one of the present value. Accordingly, our two hypotheses are as follows:

Hypothesis 1

Tax increases (cuts) lead to a decline (an increase) in 10-year bond yields. The effect of tax changes on stock returns is ambiguous.

Hypothesis 2

The maximum effect of a tax change can be found after a time lag. Higher textual complexity contributes to this transmission delay.

To test how tax bills are processed by financial market participants, we consider several windows around each event corresponding to the President’s signing date of a bill. Indeed, even though the legislative process in the US starts long before a bill is signed by the President and the fiscal measures are publicly debated in the meantime, several papers find no evidence in favor of anticipation effects. For instance, Mertens and Ravn (2012) use a timing convention to distinguish between anticipated and unanticipated changes in taxes.Footnote 13 Their finding of no strong anticipation effects is in line with several studies showing that it is the implementation date of tax changes that mainly affects macroeconomic variables (Poterba, 1988; Parker, 1999; Souleles, 1999, 2002; Mertens & Ravn, 2010). More recently, Hayo and Mierzwa (2020) find a significant reaction of financial markets on the days changes in tax legislation are actually implemented. These findings might also be explained by the prevailing uncertainty until the signing date of a tax bill, which could be due to the relatively large number of vetoes made by US Presidents in the 1960s and 1970s (for instance, Richard M. Nixon made 26 vetoes while Gerald R. Ford made 48 vetoes).Footnote 14 This raises one crucial question for financial markets: Will the bill actually pass the final step of the legislative process?

Each window begins one day before the event. The end of a window can be either on the day after the signing or up to four days thereafter to capture the information processing. This yields a total of four event windows, which are summarized in Fig. 2. We calculate the changes in the 10-year bond yields as well as the growth rate in the S&P 500 index over each of these windows.

Fig. 2 Event Windows Around the Signing of a Tax Bill
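The construction of the four event windows can be sketched as follows. The function names, the list-based data layout, and the numbers in the usage note are hypothetical and serve only to illustrate how changes from t−1 to t+k (k = 1, ..., 4) are obtained; days are assumed to be consecutive trading days.

```python
def window_changes(levels, event_idx, max_horizon=4):
    """Change in a level series (e.g., bond yields) from t-1 to t+k, k = 1..max_horizon.

    `levels` is a list of daily observations and `event_idx` the position of
    the signing day t (both are illustrative assumptions).
    """
    base = levels[event_idx - 1]  # value on the day before the signing
    return {
        f"[t-1; t+{k}]": levels[event_idx + k] - base
        for k in range(1, max_horizon + 1)
    }


def window_returns(prices, event_idx, max_horizon=4):
    """Growth rate (in percent) of an index from t-1 to t+k, k = 1..max_horizon."""
    base = prices[event_idx - 1]
    return {
        f"[t-1; t+{k}]": 100.0 * (prices[event_idx + k] / base - 1.0)
        for k in range(1, max_horizon + 1)
    }
```

For a made-up yield series `[4.00, 4.02, 4.01, 3.98, 3.97, 3.95, 3.96]` with the signing on the second observation, `window_changes(series, 1)` returns one change per window, each measured relative to the value of 4.00 on day t−1.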

Next, we explain these changes in bond yields and stock returns using the following specification:

$$\begin{aligned} y_{e,w} = \alpha + \beta PV_e + \gamma FK_e + \epsilon _{e,w} \end{aligned}$$
(2)

\(y_{e,w}\) is either (i) the change in 10-year government bond yields or (ii) the growth rate in the S&P 500 index for event e during one of the windows w introduced in Fig. 2. \(\alpha \) is a constant term, \(\beta \) and \(\gamma \) are slope parameters for the present value (\(PV_e\)) of a tax bill and its textual complexity (\(FK_e\)), and \(\epsilon _{e,w}\) is the error term.Footnote 15 Eq. (2) is estimated using least squares with heteroskedasticity-consistent standard errors. Both explanatory variables are demeaned, so that the constant term represents the conditional average change in the 10-year bond yields or the conditional average growth rate of the S&P 500 for an event window. It has to be noted that we also considered including an interaction term between the two explanatory variables. This term, however, generates no additional insights and is excluded from the analysis.Footnote 16
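A minimal sketch of how Eq. (2) could be estimated is given below. It uses plain least squares on demeaned regressors together with White's HC0 covariance estimator; the choice of HC0 is an assumption on our side, since several heteroskedasticity-consistent variants exist, and the helper names are hypothetical.

```python
from math import sqrt


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def ols_hc0(y, pv, fk):
    """Estimate y = alpha + beta*PV + gamma*FK + eps with demeaned regressors.

    Returns (coefficients, HC0 standard errors), ordered as alpha, beta, gamma.
    """
    n, k = len(y), 3
    pv_bar, fk_bar = sum(pv) / n, sum(fk) / n
    X = [[1.0, p - pv_bar, f - fk_bar] for p, f in zip(pv, fk)]
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    coef = solve(XtX, Xty)
    resid = [yi - sum(c * xi for c, xi in zip(coef, x)) for x, yi in zip(X, y)]
    # "Meat" of the sandwich estimator: sum of e_i^2 * x_i x_i'
    meat = [[sum(e * e * x[i] * x[j] for x, e in zip(X, resid)) for j in range(k)]
            for i in range(k)]
    # Inverse of X'X, built column by column from unit vectors
    cols = [solve(XtX, [1.0 if r == j else 0.0 for r in range(k)]) for j in range(k)]
    inv = [[cols[j][i] for j in range(k)] for i in range(k)]
    cov = matmul(matmul(inv, meat), inv)
    return coef, [sqrt(cov[i][i]) for i in range(k)]
```

Because both regressors are demeaned, the estimated constant equals the sample mean of the dependent variable, which is why it can be read as the average change (or return) in the respective event window.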

Our analysis is based on a couple of assumptions. First, we assume that there are no confounding factors. This assumption can be easily justified as the signing of tax bills is the result of a (lengthy) political process and the actual signing day should not be systematically related to, for instance, monetary policy decisions or the publication of important macroeconomic news.Footnote 17 Second, it is typically the unexpected component of an announcement that should actually matter for financial market participants. In an ideal world, we would extract such a news component from market expectations as it is done in the literature on macroeconomic news or monetary policy decisions and their impact on financial markets. However, in the absence of a series for the expected present value (textual complexity), we consider the actually observed value as a second-best proxy.

Finally, we also considered studying the response of the volatility of both financial series around the signing of a tax bill. For that purpose, we extract the conditional standard deviation of the change in the 10-year bond yields and the S&P 500 returns with the help of an AR(1)-GARCH(1,1) model and t-distributed errors. The results (available on request) indicate no systematic relationship of the conditional standard deviations with the present value of tax bills or their textual complexity in the four event windows.

4 Results

4.1 Baseline results

Tables 1 and 2 show the results of Eq. (2) when using the 10-year bond yields and the S&P 500 returns as the dependent variable, respectively.

Table 1 Change in 10-year bond yields
Table 2 S&P 500 returns

We find a negative and significant relationship between the present value of tax bills and the change in the 10-year bond yields (in line with H1). In line with H2, this relationship increases over time, suggesting that financial market participants underreact at first and need a couple of days to digest the information contained in the tax bills. In terms of economic magnitude, we find that a one percentage point (pp) increase in the present value of tax bills lowers the 10-year bond yield by 3.1 basis points (bps) on the day after the bill is signed (as compared to t−1), while the response increases to −6.8 bps three and −9.3 bps four days after the signing. Stock market returns, on the other hand, increase significantly by 63.5 bps two days after a bill is signed, and the response amounts to 84.9 bps for the event window [t−1; t+4]. Hence, we find evidence that tax consolidation is actually appreciated by financial markets.

The complexity of tax bills, measured by the FK grading level, provides additional insights about market participants’ behavior when confronted with difficult-to-read texts. We observe that a lower readability (i.e., higher complexity) is positively associated with the changes in the 10-year bond yields. While the effect of textual complexity on bond yields slightly increases over time (up to 1.2 bps after two days), it is no longer significant four days after the signing. Hence, in line with H2, higher textual complexity contributes to the delay in the processing of information by market participants. This result, however, is not replicated for the stock market as the coefficients for complexity are insignificant at the 10% level.

4.2 Subsample analysis

One of the most important contributions of the narrative approach by RR is to demonstrate that exogenous fiscal policy shocks have a larger and more significant effect on output as compared to broader measures of tax changes. Additional studies, such as Auerbach and Gorodnichenko (2012) and Fazzari et al. (2015), find that the effect of exogenous tax changes on output varies over the business cycle and that it is larger during economic slack. Against this background, we test whether the effect of the present value and the textual complexity of the tax bills on financial markets’ behavior is conditional on the nature of tax bills or the business cycle phase during which they are signed. Specifically, we distinguish between (i) endogenous and (ii) exogenous tax bills relying on the classification of RR and between tax bills implemented during (iii) an expansion and (iv) a recession relying on the NBER business cycle chronology.Footnote 18 We focus on [t−1; t+2] in this subsample analysis as the baseline results show the largest effects for textual complexity in this event window. Tables 3 and 4 show the results.

Table 3 Change in 10-year bond yields: subsample analysis

When considering the changes in the 10-year bond yields, the results for the present value and textual complexity are qualitatively similar for exogenous (albeit insignificant at the 10% level for the present value) and endogenous tax changes as well as for bills signed during an expansion. The only difference can be found for bills signed during a recession. However, the latter results have to be taken with a grain of salt due to the low number of observations in this subsample. For the S&P 500 returns, we find that the textual complexity of exogenous tax changes and those signed during an expansion significantly matter in a way consistent with H2 as there is an underreaction to more complex bills. Other than that, we find qualitatively similar results for the present value of exogenous tax changes, those signed during an expansion, and for endogenous tax changes (albeit not significant at the 10% level).

Table 4 S&P 500 returns: subsample analysis

4.3 Robustness tests

4.3.1 “Abnormal” changes in bond yields and stock returns

First, we replace the “raw” changes/returns in Eq. (2) with the “abnormal” changes/returns over each of the four event windows. For that purpose, we calculate the average daily changes of the 10-year bond yields (S&P 500 returns) over a window from 180 days to 30 days before each event. We subtract these daily averages from the actual daily changes in bond yields (daily stock returns) to calculate the abnormal changes (returns) over the two- to five-day windows. The results of this robustness test are shown in Tables 5 and 6.
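The adjustment can be illustrated with the following sketch. The function name and data layout are hypothetical, and we assume for the example that the 180-to-30-day estimation window is counted in trading days and includes both endpoints.

```python
def abnormal_window_change(daily_changes, event_idx, horizon,
                           est_start=180, est_end=30):
    """Cumulative abnormal change over the window [t-1; t+horizon].

    `daily_changes[s]` is the change from day s-1 to day s, and `event_idx`
    is the position of the signing day t (illustrative assumptions).
    """
    # "Normal" daily change: average over days t-180, ..., t-30
    estimation = daily_changes[event_idx - est_start: event_idx - est_end + 1]
    normal = sum(estimation) / len(estimation)
    # The change from t-1 to t+horizon is the sum of the daily changes on
    # days t, ..., t+horizon, i.e., horizon + 1 daily observations.
    window = daily_changes[event_idx: event_idx + horizon + 1]
    return sum(change - normal for change in window)
```

For `horizon=1` the window covers two daily changes and for `horizon=4` it covers five, which corresponds to the two- to five-day windows mentioned above.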

The coefficients are qualitatively and quantitatively almost unchanged when compared to the baseline results in Tables 1 and 2, in particular for the bond market. Hence, our results are robust to using the “abnormal” changes/returns instead of the “raw” changes/returns. This is also the reason why we stick to the usage of the non-transformed left-hand side variables throughout the remainder of this paper.

Table 5 “Abnormal” changes in 10-year bond yields
Table 6 “Abnormal” S&P 500 returns

Different readability measures

To test if our results are robust to the readability measure used in the specification, we use different approaches gauging text clarity: (i) the Flesch reading ease (FRE) index, (ii) the Gunning Fog (GF) index, (iii) the Simple Measure of Gobbledygook (SMOG) grade, and (iv) the new Dale-Chall (DC) readability formula. The FRE, the GF, and the SMOG indexes are similar to the FK grading level as these indicate the ease of reading a text or the years of formal education a person needs to understand the text on the first reading. While the FRE index is computed from the same inputs as the FK grading level but with different weights, the GF index and the SMOG grade are calculated using complex words (polysyllables), that is, words consisting of three or more syllables.Footnote 19 The new DC index uses a count of “hard” words and sentence length to calculate the US grade level of a text.Footnote 20 Since all five measures (FK, FRE, GF, SMOG, and DC) show a substantial correlation (see also Table 14 in the Appendix), we additionally use the (standardized) first principal component (PC1) of these as explanatory variable in the robustness test.Footnote 21 We replace the FK grading level with the different readability measures and re-estimate Eq. (2) in the event window [t−1; t\(+\)2]. It has to be noted that the FRE measure is defined as an actual reading ease measure (and not a complexity measure). Hence, we multiply its coefficient with −1 to facilitate a comparison of the results across readability measures. Tables 7 and 8 show the results.
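For reference, the commonly cited textbook versions of three of these alternative measures can be sketched as follows. The functions take counts as given inputs; how words, sentences, and complex words are counted in practice depends on the software used, and the implementation behind the results in this paper may differ in such details.

```python
from math import sqrt


def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """FRE index: higher values indicate an easier text (not a grade level)."""
    return 206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (n_syllables / n_words)


def gunning_fog(n_words, n_sentences, n_complex_words):
    """GF index: combines sentence length and the percentage of complex words."""
    return 0.4 * ((n_words / n_sentences) + 100.0 * (n_complex_words / n_words))


def smog_grade(n_sentences, n_polysyllables):
    """SMOG grade: based on the number of polysyllables per 30 sentences."""
    return 1.0430 * sqrt(n_polysyllables * 30.0 / n_sentences) + 3.1291
```

Note that the FRE index moves in the opposite direction of the grade-level measures, which is the reason for the sign adjustment described above.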

Table 7 Change in 10-year bond yields: different readability measures
Table 8 S&P 500 returns: different readability measures

The results for the 10-year bond yields are robust, irrespective of the readability measure used in the estimation. Higher textual complexity leads to an underreaction of bond yields. In the case of stock returns, we now find some support for a significant effect of the complexity of tax bills for four (FRE, SMOG, DC, and PC1) out of the six complexity measures, as these partly counteract the positive effects of the present value on stock returns. Hence, when replacing the FK index, we also find results that are consistent with H2 as there is an initial underreaction of the stock market, which can be (partly) explained by the textual complexity of the tax bills.

4.3.2 Length, Uncertainty and Sentiment

In a third step, we test whether our results are confounded by other textual characteristics, such as the length of the text, textual uncertainty, and the sentiment conveyed in a tax bill. Tan et al. (2014) suggest that low readability can magnify the effect of language sentiment if the former obfuscates readers’ understanding. Hence, we extend our baseline specification and control for the number of words in tax bills (in logs), the textual uncertainty of tax bills, and the sentiment of tax bills. The measures for textual uncertainty (the share of words with an uncertain connotation) and sentiment (the share of words with a positive connotation minus the share of words with a negative connotation) are obtained using the dictionary of Loughran and McDonald (2011). Tables 9 and 10 show the results.
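The three control variables can be illustrated with a short sketch. The word lists below are tiny hypothetical placeholders chosen for the example; they are not the Loughran and McDonald (2011) dictionary, and the tokenization rule is likewise an assumption.

```python
import re
from math import log

# Hypothetical placeholder lists, NOT the Loughran-McDonald dictionary
POSITIVE = {"benefit", "improve"}
NEGATIVE = {"penalty", "loss"}
UNCERTAIN = {"may", "approximately"}


def text_controls(text, positive=POSITIVE, negative=NEGATIVE, uncertain=UNCERTAIN):
    """Return log length, uncertainty share, and net sentiment share of a text."""
    tokens = re.findall(r"[a-z]+", text.lower())  # assumption: alphabetic tokens
    n = len(tokens)
    n_pos = sum(t in positive for t in tokens)
    n_neg = sum(t in negative for t in tokens)
    n_unc = sum(t in uncertain for t in tokens)
    return {
        "log_words": log(n),
        "uncertainty": n_unc / n,            # share of uncertain words
        "sentiment": (n_pos - n_neg) / n,    # positive share minus negative share
    }
```

For the made-up sentence "The credit may improve benefit levels but a penalty may apply", which has 11 tokens, the sketch counts two uncertain, two positive, and one negative word, so the uncertainty share is 2/11 and the sentiment measure is 1/11.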

The results for the 10-year bond yields are robust, irrespective of any additionally included covariate. If anything, the coefficient for the present value loses significance once sentiment is included in Eq. (2). In the case of the S&P 500 returns, we find some variation in the size of the coefficient for the present value (between 50.6 bps and 74.1 bps). The smallest coefficient can be explained by the inclusion of the sentiment variable, which positively affects stock returns as a 1pp higher sentiment leads to an increase of 124.3 bps in the S&P 500 returns. Lastly, textual complexity remains insignificant on the stock market.

Table 9 Change in 10-year bond yields: length and sentiment
Table 10 S&P 500 returns: length and sentiment

4.3.3 Congressional vote vs. presidential signing

Every bill in the US has to pass both chambers of the Congress before the President has to sign or veto the enrolled bill. Hence, there is some uncertainty in the legislative process even after the final vote of the Senate. Still, market participants might already react to the present value and the complexity of a bill after it has passed the Congress. Accordingly, we collect the dates of the final vote in the Senate and test if market participants react at that time. Table 11 sets out the results with column ‘President’ replicating the baseline results using the restricted sample.Footnote 22 A couple of things are worth highlighting. First, the \(R^2\) is higher in the three-day windows around the signing of the bill by the President. In addition, we observe larger (absolute) coefficients with a higher level of significance in the columns ‘President’ as compared to the columns ‘Congress.’ Hence, this reaffirms our choice of using the signing day by the President as endpoint for the legislative process as the weaker results for the Congress vote might indeed reflect the remaining uncertainty in the process.

Table 11 Congressional vote vs. presidential signing

4.3.4 Electoral cycle

Three of the bills in our sample (8-Nov-1966, 6-Nov-1978, and 5-Nov-1990) were signed on an election day or on the day before. Hence, the results could be confounded by the financial markets’ reaction to these three midterm elections. Alternatively, market participants could react differently to bills signed shortly before an election. As a consequence, we exclude the three bills and re-estimate Eq. (2) without these in a first step. Furthermore, four additional bills (16-Oct-1962, 30-Oct-1972, 4-Oct-1976, and 22-Oct-1986) were signed in the month before a presidential or midterm election. Accordingly, we exclude all seven bills to account for the potentially confounding electoral cycle in a second step. The resultsFootnote 23 remain robust for the bond market with a minor shrinkage in the significance of the coefficient for the present value. On the stock market, the coefficient for the present value becomes smaller, but remains quantitatively robust. Hence, we are confident that our results are not confounded by the electoral cycle.

4.3.5 Parameter stability

Next, we test for the stability of the estimated parameters over time. With the emergence of the Internet, information might spread more quickly and computer technology might also be helpful in digesting the complexity of the tax bills. To test for potential breaks in the parameters, we re-estimate Eq. (2) recursively using an expanding window with a minimum size of ten tax bills. The results can be found in Fig. 5 in Appendix B. The coefficients for the present value and textual complexity are significant on the bond market for all windows considered in the recursive estimations. If anything, the estimate for the present value decreases over time (in absolute terms), whereas the estimate for textual complexity increases. In the case of the stock market, the estimate for the present value does not show a distinct tendency over time with respect to its size; however, it becomes statistically significant only if a certain number of observations is included in the analysis. In general, we do not find evidence of a distinct structural break in the data.

4.3.6 US monetary policy decisions

Three of the bills in our sample (9-Jul-73, 20-Dec-77, and 18-Jul-84) were signed on the day of a US monetary policy decision or the day thereafter. Removing these from the analysis leads to a virtually unchanged coefficient for textual complexity on the bond market. On the stock market, the coefficient becomes slightly larger (in absolute value) and significant at the 10% level. Hence, we find no evidence that decisions by the Federal Open Market Committee confound our results.

4.3.7 Outlier analysis

As mentioned in Sect. 2, the Tax Adjustment Act of 1966 (Text No 5 in Table 12) is an outlier. It contains a lot of numbers and short half-sentences and, therefore, the FK measure fails to grade this text in an “appropriate” way. Accordingly, we test whether our results are robust when excluding this text. In the case of the 10-year bond yields, the coefficients are slightly larger when omitting the “questionable” text; for the stock returns, the results are qualitatively unchanged despite some slightly smaller coefficients without the outlier.

Our results also remain robust if we exclude the Economic Recovery Tax Act of 1981 (signed on 13-Aug-81, Text No 21 in Table 12) from the analysis. This tax cut (the largest one in our sample as mentioned in Sect. 3) was at the center of Ronald Reagan’s Presidential campaign and, therefore, largely anticipated by market participants. In this case, the coefficient for the present value is slightly larger for bond yields and stock returns without the outlier, whereas the coefficients for textual complexity remain virtually unchanged.

5 Conclusions

In this paper, we analyze whether the textual complexity of tax bills affects financial market participants’ behavior. We first collect the 32 tax bills identified by Romer and Romer (2010) in the period 1962–2003 from the congressional archives. Second, to measure the textual complexity of tax bills, we mainly rely on the Flesch-Kincaid grade level index to generate a readability measure based on education grade levels. Lastly, we assess the relationship between tax bills’ textual complexity and financial markets’ behavior by considering the changes in 10-year government bond yields and S&P 500 returns in windows from one day before the signing of a tax bill until up to four days after the signing.

Our results show that financial market participants react significantly to the characteristics of tax bills after their signing. Specifically, we find a negative (positive) and significant relationship between the present value of tax bills and changes in the 10-year bond yields (S&P 500 returns), suggesting that market participants actually appreciate an increase in taxes. The magnitude of this relationship increases over time, indicating that market participants underreact at first and need a couple of days to digest the information contained in the tax bills. This delay can be explained by the textual characteristics in the case of the 10-year bond yields as a lower readability (proxied by the years of education required to understand a bill) partly counteracts the negative relationship for up to three days after the signing of a tax bill, but not thereafter. In the case of the stock market, we find similar evidence, but only for a part of the readability measures employed in this paper. We also test if our results differ according to the nature of the tax change (endogenous or exogenous) and the phase of the business cycle. Lastly, we document the robustness of our results with respect to (i) the usage of abnormal changes in bond yields (abnormal stock returns) instead of their raw counterparts, (ii) other textual characteristics of the tax bills, such as length, textual uncertainty, and sentiment, (iii) the timing of the tax bills (using the date of the vote in the Congress instead of the Presidential signing), (iv) the electoral cycle, (v) parameter instability, and (vi) outliers.

This paper considers changes in government bond yields and the overall stock market. However, there might be differences in the response to complex tax bills of, say, accounting firms and tech companies. Hence, an interesting avenue for future research would be to analyze the reaction of different industries or firms.