1 Introduction

Since the seminal work of Jin and Myers (2006), firm-specific stock price crashes have been linked to bad news hoarding by firm managers. Due to incentives such as compensation packages, empire building, and reputation concerns (Ball 2009; Graham et al. 2005), managers tend to hide adverse information from outside investors, which leads to bad news being stockpiled within the firm. Once the accumulated bad news reaches a tipping point where it can no longer be contained, stock prices are prone to crash. Given the devastating effects of crash risk on investor welfare, literature in the field of accounting and finance has increasingly turned to understanding the determinants of bad news hoarding behaviour. In this study, we examine managers’ intentions to offer equity (hereafter: equity intent) as a potential incentive to hoard bad news. In the case of equity offerings, managers are known to time the market with regard to the current market valuation (Graham and Harvey 2001). More specifically, firm managers seeking equity are incentivised to provide more optimistic information prior to the equity offering to raise more capital, even if this contrasts with the firm’s true performance (Lang and Lundholm 2010). Given this evidence, we believe that managers who try to persuade investors to invest in their company are more likely to hoard bad news. However, whether managers’ equity intent serves as a driver for bad news hoarding behaviour as reflected in future stock price crash risk remains an open empirical question. To address this issue, we apply modern techniques of information retrieval to firm disclosures to identify cues revealing managers’ equity intent and quantify its effects on future stock price crash risk.

A prominent example of managerial bad news hoarding with severe effects on investor welfare is the case of Marrone Bio Innovations. Former COO Hector M. Absi Jr. allegedly profited from hiding various sales concessions from independent auditors and outside investors.Footnote 1 When the SEC uncovered the alleged fraud in August 2014, the company’s share price plunged by more than 40%. Given the severe effects of opportunistic management behaviour, section 954 of the 2010 Dodd-Frank Act mandates that publicly listed companies adopt clawback provisions that require managers to return part or all of the benefits they received as the result of misstated accounting numbers. However, Bao et al. (2018) provide empirical evidence that clawback adoptions do not help mitigate crash risk, as managers are induced to hide bad news through channels other than manipulated accounting numbers. Since stock price crash risk cannot be mitigated by portfolio diversification either (Sunder 2010), literature in the field of accounting and finance has focused on identifying determinants of managerial bad news hoarding to help investors detect crash-prone companies via screening.Footnote 2 Interestingly, we find that, prior to the stock price crash of Marrone Bio Innovations in 2014, the company openly stated in its 10-K filing for the fiscal year 2013 that it was searching for additional external funding (i.e. equity intent) in order to expand its operational business. Considering our opening arguments, the intention to raise equity could have served as an incentive for the management to provide an overly optimistic outlook and hide bad news from investors. In this case, screening companies for managers’ equity intent could have helped investors to better assess firm-specific crash risk and preserve shareholder welfare.

Although firm fundamentals (e.g., leverage ratio, market-to-book ratio) help to explain firm financing choices on average, balance sheet items cannot reveal corporate financing intentions. To grasp managers’ equity intent, we build on a concept proposed by Hoberg and Maksimovic (2014) and search the Liquidity and Capital Resources (LIQ + CAP) section, a mandated disclosure of the MD&A section in 10-K files, for phrases that managers commonly use to express their equity intent. However, managers’ equity intent is likely to come in shades. The authors suggest that, even if managers do not eventually offer equity, the mere intention to do so should be reflected in their choice of words when discussing the capital resources of their firm. Consequently, we follow Hoberg and Maksimovic (2014) to compile a continuous measure of equity intent that allows us to consider early signals that may mark the start of managers’ tendencies to hoard bad news. Specifically, we estimate cosine similarities between vector representations of LIQ + CAP sections in which managers explicitly state that they are issuing equity and all other LIQ + CAP sections in our sample. Instead of using a standard Bag-of-Words (BOW) model, which is frequently used in accounting and finance research to estimate document similarities, we construct neural network embeddings of each LIQ + CAP section using the Doc2Vec model developed by Le and Mikolov (2014), which has been shown to outperform standard BOW models as it accounts for semantic relationships.

After identifying equity intent, we quantify its effect on firm-specific future stock price crash risk. We follow the extant literature and analyse the effect of equity intent in conjunction with earnings manipulation, another popular determinant of stock price crash risk related to bad news hoarding (Hutton et al. 2009; Kim et al. 2019). Our results suggest that firms with stronger equity intent are more likely to suffer from stock price crashes in the future. These findings hold for different popular stock price crash risk measures and remain robust after controlling for firms’ earnings manipulation, which is a popular vehicle to hide adverse information, as well as for an extensive set of control variables. Moreover, the effects are comparable in magnitude to the effects that earnings manipulation and textual obfuscation exert on stock price crash risk. To strengthen the case that equity intent is in fact a motive for managers to hoard bad news, we conduct a causal mediation analysis in which we use the fraction of negative words in each firm’s MD&A section as a mediator variable. In line with the theory of bad news hoarding, our results show that stronger equity intent incentivises managers to withhold negative information by using less negative language in their MD&A, supporting the view that bad news hoarding serves as an important economic mechanism predicting the likelihood of future price crashes following a sudden release of accumulated negative information.

In additional analyses we examine how managers’ equity intent relates to the usage of earnings manipulation and textual obfuscation. The results show that equity intent reinforces the effects of these two popular vehicles of bad news hoarding. While earnings manipulation and textual obfuscation offer only marginal predictive power for future stock price crash risk for firms with low equity intent, they are strongly significant determinants for future stock price crash risks for high equity intent firms. Furthermore, we conduct a series of analyses to assess the robustness of our main results. Our robustness tests show that equity intent is an early warning indicator for stock price crash risk up to a prediction window of three years while predictive power decreases for wider prediction windows. In the next analysis we evaluate whether equity intent also predicts upward stock price jumps. If equity intent is a motive for bad news hoarding, we should not be able to predict upward stock price jumps in a similar way as stock price crashes. In fact, the results are generally consistent with this notion. We further check whether our findings are driven by model specifications of the applied neural network to estimate the LIQ + CAP embeddings. By exploring an alternative approach to estimate LIQ + CAP embeddings, we find that our main results remain robust for all stock price crash risk measures. Finally, we test the sensitivity of our results by controlling for several additional business risk and textual proxies that are associated with stock price crash risk. Again, our results remain virtually unchanged.

This paper contributes to the literature in the following way: In line with the SEC’s plan to use textual analyses to detect managerial opportunistic behaviour (Eaglesham 2013), we contribute to the literature that uses textual information to complement the prediction of stock price crashes. Consistent with El‐Haj et al. (2019), who state that “mainstream accounting and finance research appears to be behind the curve in terms of computational linguistic sophistication” (p. 266), prior works in the field of stock price crash risk mainly rely on basic content analysis methods, providing evidence that more complex and ambiguous company disclosures obfuscate financial information and thus facilitate bad news hoarding (Ertugrul et al. 2017; Kim et al. 2019). Instead of identifying a new vehicle that managers use to hide bad news from outside investors, we draw on the bad news hoarding theory of Jin and Myers (2006), which requires the existence of certain incentives that drive managers’ decision to hoard bad news in the first place. Hence, by raising the question of whether company disclosures contain information about managerial incentives to hide bad news, we add a new perspective on how to use company filings to identify determinants of future crash risk. Furthermore, we argue that, compared to prior works relying on broad textual characteristics of company files, our approach is superior in detecting managers’ bad news hoarding as it identifies a theoretically grounded incentive to use asymmetric information in efforts to time markets within managers’ capital structure decisions.

The remainder of this paper is structured as follows: Sect. 2 reviews literature in the field of stock price crash risk and provides our research hypothesis. Section 3 describes our data, the crash risk measures, the neural network embeddings we use to identify equity intent and the models to predict future crash risk. In Sect. 4 we present our prediction results before Sect. 5 provides additional analyses and Sect. 6 concludes the paper.

2 Literature review and hypothesis

Since Jensen and Meckling (1976) first introduced agency costs, a large stream of literature has analysed how linking the performance measurement of managers to certain budgetary or financial targets exacerbates existing conflicts of interest between them and outside investors, which incentivises managers to engage in earnings management (Franco-Santos et al. 2012; Jensen 2005). Likewise, since the current market valuation of a firm can influence its managers' strategic capital structure decisions, in theory, managers are incentivised to withhold information that could negatively affect the firm’s market valuation (Alti 2003; Baker and Wurgler 2002; DeAngelo et al. 2010; Graham and Harvey 2001). Unsurprisingly, managers who engage in earnings management in this context also try to maximize their seasoned equity offering (SEO) proceeds by increasing SEO pricing (Kim and Park 2005), but at the same time experience declining operating performance as well as distinct post-SEO underperformance as the real consequences of their earnings management decisions reach the market (Cohen and Zarowin 2010; DuCharme et al. 2004; Walker and Yost 2008). However, thus far the literature has neglected that managerial opportunistic behaviour may also have much larger implications for investors’ welfare due to its impact on stock price crash risk. Hence, we contribute to this literature by analysing how a firm’s equity intent serves as a motive for opportunistic behaviour in the form of bad news hoarding, which leads to stock price crashes.

In this regard, Jin and Myers (2006) set up a model in which the degree of opaqueness affects the division of risk bearing between firm insiders (managers) and outside investors. Managers’ ambitions to capture parts of the firm’s cash flow are limited by outside investors’ perception of the firm’s cash flow and value. If cash flows exceed investors’ expectations, managers are able to capture a larger fraction of them. Conversely, if managers underperform relative to investors’ expectations, they suffer reduced compensation and might even risk losing their job. Thus, managers are incentivised to overstate financial performance and increase opacity by strategically withholding negative information from outside investors. In this sense, managers’ disclosure preferences are not aligned with those of outside investors (Kothari et al. 2009). However, there is a natural limit to the accumulation of bad news, because the probability that outside investors grasp the firm’s true value continuously increases the more negative news is accumulated. When further obfuscation becomes too costly or difficult to maintain, all the negative information is released at once, causing the firm’s stock price to crash. Given the subsequent destruction of shareholder welfare, managers’ incentives to engage in bad news hoarding and the underlying mechanisms of this practice have received increasing attention from financial research. Previous literature has already established the positive relation between information asymmetries and stock price crash risk. For instance, incentive-based management compensation, CEO overconfidence (Kim et al. 2016), and overall career concerns (Kothari et al. 2009) provide motives for managers to uphold and expand information asymmetries against outside investors, thus increasing the risk of a stock price crash (Kim et al. 2011).

In search for other motives for managers to obfuscate corporate disclosure, we turn our focus on corporate financing decisions, more precisely managers’ equity intent. Besides the obvious importance for the continuation and success of a firm’s business activities, financial management decisions are subject to agency conflicts between managers and outside investors, especially in the case of equity financing. While we do not plan to extend the comprehensive literature on choosing the optimal financial strategy, we are interested in the possibilities of opportunistic management behaviour (and its financial consequences) that are associated with managers’ equity intent.

In general, managers should be interested in a successful SEO for at least two reasons. Firstly, their future courses of action inside the firm benefit from greater financial resources. Secondly, they may fear the negative reaction following a disappointing SEO from a career perspective. Conversely, a positive stock performance and a successful equity issuance should enhance their job security and increase their value on the labour market (Chi and Gupta 2009). Consistently, the influential survey of corporate financing decisions by Graham and Harvey (2001) shows that managers proactively choose their equity issuances with regard to the current market valuation of the firm. Since firm outsiders are unable to assess the true value of a firm (given semi-informationally efficient markets), managers are able to exploit these information asymmetries to maximize the corresponding capital inflow. Lang and Lundholm (2010) observe changes in corporate disclosure policies around the period of an SEO. They show that firms seeking equity tend to provide more detailed and more optimistic disclosure prior to the equity offering, leading to an increase in their stock price prior to the offering announcement. While the increase in the stock price may also be the result of improved economic conditions, Lang and Lundholm (2010) provide evidence that this effect also holds true for firms without an economic improvement which “hype up” their stock. Interestingly, firms dialling up their disclosure suffer from comparably higher price declines after the actual announcement of the equity offering, indicating a market correction for “hyped up” stocks. Bergmann and Roychowdhury (2008) show that managers strategically adapt the level of disclosure (long-horizon earnings forecasts) to mould investor expectations and foster the firm’s valuation. In the case of high investor sentiment, managers are expected to decrease the level of voluntary disclosure to manage future expectations and maintain optimistic firm valuations. Thus, managers are incentivised to uphold high investor sentiment by concealing negative information in the interest of optimizing their SEO. Due to the limits of bad news accumulation, an opportunistic manager should offer equity when the (positive) difference between outside investors’ assessment and the firm’s true value is the highest. In this sense, the search for equity could be interpreted as a signal that managers do not expect the market’s perception of the firm’s value to further increase in the near future (Krasker 1986). Following the notion of Bergmann and Roychowdhury (2008), the search for equity would mark the beginning of a decrease in a firm’s disclosure level and the starting point of negative news concealment. Given this evidence, managers who intend to issue equity in the foreseeable future might be more likely to withhold negative news from outside investors to secure a successful SEO and maximize their capital inflow. We thus posit the following hypothesis:

Hypothesis

Managers’ equity intent incentivises bad news hoarding behaviour as reflected by increasing future stock price crash risk.

The analysis above suggests that managers’ equity intent may help investors to better grasp potential risks arising from bad news hoarding behaviour. However, managers’ intentions cannot be extracted from simple balance sheet data and thus, require a rigorous identification strategy. Therefore, we construct neural network embeddings from textual disclosures to estimate managers’ equity intent based on linguistic characteristics. Moreover, since firms seeking equity might generally be associated with a riskier business profile, we employ a battery of variables capturing fundamental risk to alleviate the concern that our results are simply driven by business risk.

3 Sample selection, data, and research design

3.1 Sample selection

Table 1 documents the sample selection process. Following Schmidt et al. (2019) in screening data from Refinitiv Datastream and Worldscope, our sample starts with available US firm-years from 1994 to 2019. After removing financial firms (SIC codes 6000–6999) and utilities (SIC codes 4900–4999), we collect weekly returns for each firm-year. To avoid bias arising from thin trading, we remove firms with a fiscal year-end price lower than $1.Footnote 3 In addition, we remove observations with less than 26 weeks of stock return data and observations with nonpositive total assets at the beginning of a fiscal year.Footnote 4 In order to estimate managers’ equity intent, we next collect all available 10-K files from the SEC’s EDGAR system for our sample of firms and extract the MD&A section using the parsing procedure proposed by Reichmann and Reichmann (2022).Footnote 5 Having extracted the MD&A section, we compile Regular Expressions to extract the LIQ + CAP section. Since the following methodology relies on a clean identification of the LIQ + CAP section, we describe the parsing procedure in greater detail in Appendix 1. After dropping observations that have no machine-readable LIQ + CAP section and insufficient financial data to construct control variables, 28,382 observations from 2825 firms remain in the final sample.

Table 1 Sample selection

3.2 Measuring managers’ equity intent

While common balance sheet data does not provide any cues on managers’ equity intent, Hoberg and Maksimovic (2014) show that managers address financing needs within the LIQ + CAP section, a mandated disclosure of the MD&A section in 10-K files. Thus, the authors compile a set of phrases managers commonly use in the LIQ + CAP section to express their equity intent. These include the following:

issuing equity securities OR expects equity securities OR through equity financing OR sources equity financing OR seek equity investments OR seek equity financings OR access equity markets OR raised equity arrangements OR undertake equity offerings OR sell common stock OR issuing common stock OR selling common stock OR use equity offerings OR offering equity securities OR planned equity offering OR seek equity offering OR raise equity offering OR equity offering would add OR additional equity offering OR considering equity offering OR seek equity financing OR pursue equity offering OR consummates equity offering OR raises equity capital OR raise equity offering OR sources equity offering

Using the above list of words and phrases, we identify a sample of firms that explicitly express equity intent.Footnote 6 However, managers’ equity intent is likely to come in shades. While some managers will explicitly state that they seek equity, others may still consider it even though they do not explicitly say so. Based on the idea that managers with equity intent are likely to discuss similar content (i.e. use similar words) within the LIQ + CAP section, Hoberg and Maksimovic (2014) suggest constructing continuous measures of managers’ equity intent. Specifically, the authors estimate cosine similarities to score how proximate the overall LIQ + CAP vocabulary is to that of firms that explicitly express their equity intent. To this end, the authors employ a BOW approach to convert each LIQ + CAP section to a vector representation of word counts. Formally, if the full sample of LIQ + CAP sections is made of \(|V|\) unique words (the total vocabulary), the BOW approach represents a given LIQ + CAP section \(j\) as a vector with a length of \(|V|\), where each entry corresponds to a word \(i\) in the vocabulary and holds its word count, \(coun{t}_{i,j}\). However, research in the field of computational linguistics points out two major drawbacks of this method (Le and Mikolov 2014): firstly, the BOW approach does not account for word order. As long as the same words are used, different documents can have exactly the same vector representations. Secondly, BOW vectors do not account for the semantics of words. For instance, when using the BOW approach, the words “profit”, “revenue”, and “water” are considered equally distant, even though “profit” and “revenue” are semantically much closer than “profit” and “water”. To address these issues, Le and Mikolov (2014) developed Doc2Vec, an unsupervised machine learning algorithm that learns dense vector representations from a sample of documents.
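The two drawbacks of the BOW approach can be made concrete with a minimal Python sketch. It is illustrative only: the toy sentences and the helper names `bow_vector` and `cosine_similarity` are assumptions introduced here and are not taken from Hoberg and Maksimovic (2014). The sketch shows that two documents with the same words in a different order receive identical count vectors, and that single-word vectors for “profit”, “revenue”, and “water” are all equally dissimilar.

```python
import math
from collections import Counter


def bow_vector(tokens, vocabulary):
    """Return the word counts of `tokens`, ordered by `vocabulary`."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy documents (not taken from the sample).
doc_a = "we plan to sell common stock".split()
doc_b = "common stock we plan to sell".split()  # same words, different order
doc_c = "we plan to repay bank debt".split()

vocabulary = sorted(set(doc_a) | set(doc_b) | set(doc_c))
vec_a = bow_vector(doc_a, vocabulary)
vec_b = bow_vector(doc_b, vocabulary)
vec_c = bow_vector(doc_c, vocabulary)

# Drawback 1: word order is lost, so doc_a and doc_b are indistinguishable.
same_vector = vec_a == vec_b
sim_ab = cosine_similarity(vec_a, vec_b)
sim_ac = cosine_similarity(vec_a, vec_c)  # about 0.5: three of six words shared

# Drawback 2: semantics are lost, so distinct words are always orthogonal.
words = ["profit", "revenue", "water"]
one_hot = {w: bow_vector([w], words) for w in words}
sim_profit_revenue = cosine_similarity(one_hot["profit"], one_hot["revenue"])
sim_profit_water = cosine_similarity(one_hot["profit"], one_hot["water"])
```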

The authors build on the famous Word2Vec model developed by Mikolov et al. (2013a, b) which represents a neural network that attempts to estimate so-called word embeddings of each word found in a sample of documents. Word embeddings can be understood as learned vector representations of words that capture their semantic information, or put differently, their meaning. Therefore, Word2Vec relies on an old concept in linguistics suggesting that words that have similar meaning are more likely to co-occur with similar word-neighbours (Harris 1954). Hence, the neural network “reads” through a sample of documents and thereby learns to predict the neighbouring words of each word it encounters. To improve its predictions, the model continually updates the weights of hidden layers that are used to predict the neighbouring words. When learning is completed after a certain number of iterations, these layer weights are used as effective vector representations of words (i.e. word embeddings). The word embeddings all have a fixed, pre-defined dimension \(d\) that captures the semantic information learned from the co-occurrence relationship of each word and its neighbours. As a result, words that co-occur with similar neighbouring words are assigned with more similar word embeddings, that, when plotted into vector space, occupy closer locations to one another.

While Word2Vec was developed to estimate vector representations of words, Doc2Vec has become a widely used technique to create vector representations of whole documents such as our LIQ + CAP sections. Le and Mikolov (2014) propose two different architectures to estimate vector representations of documents. The Distributed Memory (DM) model is based on Word2Vec but also adds a \(d\times N\) document matrix to the modelling process, where \(d\) denotes a given number of dimensions and \(N\) denotes the total number of documents in the sample. Therefore, besides training word embeddings, the model also trains a \(d\)-dimensional feature vector that is unique to each of the \(N\) documents in the sample. In comparison, the Distributed Bag Of Words (DBOW) model is less complex as it ignores word order and only uses the document vector to predict a randomly sampled set of words from the respective document. In both cases, the models run against a loss function and continuously update the document matrix to achieve better predictions. After finalizing a given number of iterations, the document matrix contains \(d\)-dimensional document vectors for each of the \(N\) documents. In their influential paper, Le and Mikolov (2014) test how these so-called document embeddings compare to BOW vectors by estimating cosine similarities between the content of web pages and respective search queries. Their findings show that, compared to a standard BOW model, the usage of Doc2Vec gives a 53% relative improvement in terms of error rate when performing information retrieval tasks.Footnote 7 Consequently, we construct document embeddings in the spirit of Le and Mikolov (2014).

We begin by employing extensive text-preprocessing steps to reduce textual noise, improve the generalizability of our Doc2Vec model and form phrases that are specific to the LIQ + CAP sections. Specifically, we closely follow the recommendations of Reichmann and Reichmann (2022) by (i) splitting the document into sentences for further processing, (ii) replacing named entities such as firm names, dates, or money values with predefined tags, (iii) employing lemmatization to reduce feature dimensionality,Footnote 8 (iv) forming phrases in the spirit of Mikolov et al. (2013b) and (v) removing stopwords that are defined as words that do not add meaning to a document.Footnote 9 In the following, we illustrate the impact of the text preprocessing steps on an exemplary sentence taken from our sample of LIQ + CAP sectionsFootnote 10:

“Investments in property, plant and equipment were $128 million in 2009 and $154 million in 2008”.

After employing preprocessing steps (ii) to (v), the sentence reads as follows:

“investment property_plant_and_equipment -money- -date- -money- -date-”.

(ii) Named entities such as the money value “$128 million” and dates like “2009” are replaced with the tags “-money-” and “-date-”Footnote 11; (iii) lemmatization reduces the word “Investments” (plural) to its base form “investment” (singular); (iv) the phrase “property, plant and equipment” is found to significantly co-occur within our sample of LIQ + CAP sections, so that it is concatenated to a single token using underscores (“property_plant_and_equipment”);Footnote 12 and (v) stopwords outside of phrases like “and”, “in”, and “were” are removed.Footnote 13 As the result shows, the sentence has become shorter and is less likely to reflect linguistic cues that are specific to a certain point in time, which helps to reduce model dimensionality and improves the generalizability of the document embeddings.
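Steps (ii) to (v) can be sketched on the exemplary sentence with a few lines of Python. This is a simplified illustration under loud assumptions, not the procedure of Reichmann and Reichmann (2022): the regular expressions stand in for a named-entity recogniser, and the lookup tables `LEMMAS`, `PHRASES`, and `STOPWORDS` are hypothetical, hand-written stand-ins for a lemmatizer, a statistically learned phrase model, and a full stopword list.

```python
import re

# Hypothetical stand-ins for the real components (assumptions for illustration).
MONEY_RE = re.compile(r"\$\d[\d,.]*(?:\s(?:thousand|million|billion))?")
DATE_RE = re.compile(r"\b(?:19|20)\d{2}\b")
LEMMAS = {"investments": "investment"}
PHRASES = [("property", "plant", "and", "equipment")]
STOPWORDS = {"and", "in", "were"}


def merge_phrases(tokens, phrases):
    """Join each known multi-word phrase into one underscore-separated token."""
    merged, i = [], 0
    while i < len(tokens):
        for phrase in phrases:
            n = len(phrase)
            if tuple(tokens[i:i + n]) == phrase:
                merged.append("_".join(phrase))
                i += n
                break
        else:  # no phrase starts at position i
            merged.append(tokens[i])
            i += 1
    return merged


def preprocess(sentence):
    text = sentence.lower()
    # (ii) replace money values and years with predefined tags
    text = MONEY_RE.sub("-money-", text)
    text = DATE_RE.sub("-date-", text)
    tokens = [t.strip(",.;:") for t in text.split()]
    tokens = [t for t in tokens if t]
    # (iii) reduce words to their base form
    tokens = [LEMMAS.get(t, t) for t in tokens]
    # (iv) concatenate phrases before stopwords are dropped
    tokens = merge_phrases(tokens, PHRASES)
    # (v) remove stopwords that are not part of a phrase
    return [t for t in tokens if t not in STOPWORDS]


sentence = ("Investments in property, plant and equipment were "
            "$128 million in 2009 and $154 million in 2008")
processed = " ".join(preprocess(sentence))
```

Phrases are merged before stopwords are removed so that the “and” inside “property_plant_and_equipment” survives, in line with the remark that only stopwords outside of phrases are dropped.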

Next, we use the Doc2Vec module of the gensim library in Python to estimate vector representations of each LIQ + CAP section. While the DBOW and the DM model only differ slightly, the simpler DBOW model ignores word order, which could reduce its ability to capture semantic information. Therefore, for our baseline analysis, we employ the DM model and later apply the DBOW model to test the robustness of our findings. For the implementation of the Doc2Vec model, we first have to define the dimensionality \(d\) of the document embeddings which the model learns during the training process. While larger vector representations may yield better results, they also require more costly computations. However, based on the findings of Pennington et al. (2014), who show that embedding dimensions greater than 300 offer little improvement in quality, we follow the vast majority of the literature that employs textual embedding models and set the number of embedding dimensions equal to 300 (e.g. Lau and Baldwin 2016; Li et al. 2021; Matin et al. 2019). For the remaining parameters, we closely follow the recommendations of Lau and Baldwin (2016), who provide optimal Doc2Vec hyper-parameter settings for semantic textual similarity tasks.Footnote 14 After 1000 iterations of training over the whole sample of LIQ + CAP sections, we assign each section a 300-dimensional document vector. We then construct an equity focus vector by averaging the vectors of all LIQ + CAP sections in which managers explicitly express their equity intent. Finally, we construct \({EQUITY\_INTENT}_{it}\) as the cosine similarity between each LIQ + CAP section and the equity focus vector. While \({EQUITY\_INTENT}_{it}\) naturally ranges between 0 and 1, higher values indicate a stronger equity intent.
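The last two steps, averaging the explicit-intent vectors into an equity focus vector and scoring every section against it, can be sketched as follows. The sketch assumes that the document embeddings have already been trained. The three-dimensional toy vectors, the firm labels, and the helper names are hypothetical and chosen only for illustration; the embeddings in the paper have 300 dimensions.

```python
import math


def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings of three LIQ + CAP sections.
embeddings = {
    "firm_a": [0.9, 0.1, 0.0],
    "firm_b": [0.8, 0.2, 0.1],
    "firm_c": [0.1, 0.9, 0.2],
}
# Sections assumed to contain one of the explicit equity phrases.
explicit_intent = ["firm_a", "firm_b"]

# Equity focus vector: average over the explicit-intent sections.
equity_focus = mean_vector([embeddings[name] for name in explicit_intent])

# Continuous score: similarity of every section to the focus vector.
equity_intent = {
    name: cosine_similarity(vector, equity_focus)
    for name, vector in embeddings.items()
}
```

In this toy example, `firm_c` never uses an explicit phrase but still receives a score, which is the point of the continuous measure: sections are ranked by how close their content is to that of explicit equity seekers.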

In order to better understand how our main variable of interest is measured, Fig. 1 graphically displays a random sample of 500 LIQ + CAP embeddings and their distance to the equity intent vector by employing t-Distributed Stochastic Neighbour Embedding (t-SNE) techniques to visualize the 300-dimensional vectors in a 3D scatter plot.Footnote 15 Each dot represents the document embedding of a randomly drawn LIQ + CAP section. The redder the dots, the higher \({EQUITY\_INTENT}_{it}\) (i.e. the closer they are to the equity intent vector). Consequently, a red dot that occupies the exact same space as the equity intent vector would have a value of \({EQUITY\_INTENT}_{it}=1\). Put simply, we would expect that redder firms have a higher propensity to experience a one-year-ahead stock price crash, as they exhibit a stronger intent to issue equity.

Fig. 1 Measuring managers’ equity intent

In order to validate our measure of equity intent, we examine the correlations between a firm’s equity intent and various firm characteristics such as firm size (\(LOGM{V}_{it}\)), the market-to-book ratio (\(MT{B}_{it}\)), leverage (\(LE{V}_{it}\)), operating performance (\(RO{A}_{it}\)) as well as risk measures including firm-specific return volatility (\(SIGM{A}_{it}\)) and cash flow volatility (\(CFVO{L}_{it}\)). In addition, we include proxies for firms’ inherent information asymmetry using an intangibility measure (\(ADJROT{A}_{it}\)) and R&D expense (\(PROP\_COS{T}_{it}\)). Finally, we also test correlations with firm age (\(AG{E}_{it}\)). The results are reported in Table 2A. We find that smaller, younger, poorly performing, and high-growth firms have a higher equity intent. Furthermore, firms that are less capable of using debt financing, firms with higher cash flow volatility, and firms with higher information asymmetry are more likely to search for equity. In summary, we conclude that our measure of equity intent correlates with common firm characteristics that are likely to constrain firms’ financing options to equity financing.

Table 2 Descriptive statistics

3.3 Measuring crash risk

To measure firm-specific stock price crash risk, we calculate firm-specific weekly returns (\(W\)) using the following market-industry index model estimated for each firm and fiscal year:

$$r_{i\tau } = \alpha_{i} + \beta_{1i} r_{m\tau - 1} + \beta_{2i} r_{j\tau - 1} + \beta_{3i} r_{m\tau } + \beta_{4i} r_{j\tau } + \beta_{5i} r_{m\tau + 1} + \beta_{6i} r_{j\tau + 1} + \varepsilon_{i\tau }$$

where \({r}_{i}\), \({r}_{j}\), and \({r}_{m}\) are the returns in week \(\tau\) for stock \(i\), the value-weighted Fama–French index for industry \(j\), and the CRSP value-weighted market index \(m\), respectively. Since 10-K filings are usually submitted within a three-month period after a firm’s official fiscal year-end, we define a fiscal year as the 12-month period ending three months after a firm’s official fiscal year-end to avoid look-ahead bias (Kim et al. 2019).Footnote 16 To account for non-synchronous trading in our estimation (Dimson 1979), we augment the index model with lead and lag terms for market and industry returns.Footnote 17 The firm-specific weekly return for firm \(i\) and week \(\tau\), \({W}_{i\tau }\), is the natural logarithm of one plus the residual of the equation above. For the selection of crash risk measures, we follow the existing literature and calculate the following four measures (Al Mamun et al. 2021; Chen et al. 2001; Hong et al. 2017; Jin and Myers 2006; Kim et al. 2011, 2019; Kim and Zhang 2016). First, we use the negative skewness of firm-specific weekly returns:
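The estimation of firm-specific weekly returns can be sketched as follows. This is a minimal illustration that assumes one firm-year of weekly returns is available as three aligned arrays; the function name and the simulated inputs are hypothetical. Because the model contains one lag and one lead, the first and last week of the window drop out of the regression.

```python
import numpy as np

def firm_specific_weekly_returns(r_i, r_m, r_j):
    """Fit the market-industry index model by OLS and return W = ln(1 + residual)."""
    r_i, r_m, r_j = (np.asarray(x, dtype=float) for x in (r_i, r_m, r_j))
    if not (len(r_i) == len(r_m) == len(r_j)):
        raise ValueError("return series must have the same length")
    y = r_i[1:-1]                              # weeks with both a lag and a lead
    X = np.column_stack([
        np.ones(len(y)),                       # alpha_i
        r_m[:-2], r_j[:-2],                    # lagged market and industry returns
        r_m[1:-1], r_j[1:-1],                  # contemporaneous returns
        r_m[2:], r_j[2:],                      # lead returns
    ])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ beta
    return np.log1p(residual)                  # natural log of one plus the residual

# Simulated inputs for one firm-year of 52 weeks (illustrative only).
rng = np.random.default_rng(0)
r_m = rng.normal(0.0, 0.02, 52)
r_j = rng.normal(0.0, 0.02, 52)
r_i = 0.001 + 0.8 * r_m + 0.3 * r_j + rng.normal(0.0, 0.03, 52)
W = firm_specific_weekly_returns(r_i, r_m, r_j)
```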

$$NCSKEW_{it} = - \left[ {n\left( {n - 1} \right)^{\frac{3}{2}} \mathop \sum \limits_{\tau = 1}^{n} W_{i\tau }^{3} } \right]/\left[ {\left( {n - 1} \right)\left( {n - 2} \right)\left( {\mathop \sum \limits_{\tau = 1}^{n} W_{i\tau }^{2} } \right)^{\frac{3}{2}} } \right]$$

\(NCSKEW_{it}\) is calculated by dividing the negative of the third moment of firm-specific weekly returns by the standard deviation of firm-specific weekly returns raised to the third power over all n weeks of the fiscal year. By scaling the raw third moment by the standard deviation cubed, we are able to compare firms with different stock price variances (Chen et al. 2001). By putting a minus sign in front of the numerator, high values of \(NCSKEW_{it}\) correspond with a more left-skewed distribution and an increased stock price crash risk for firm i in year t. Hence, we adopt the common convention that more left-skewed return distributions are associated with frequent small gains but bear the risk of extreme negative outliers (i.e. stock price crashes).
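As an illustration, the skewness measure can be computed directly from the formula above. The sketch below takes a list of firm-specific weekly returns for one firm-year; the sample values are hypothetical.

```python
def ncskew(w):
    """Negative conditional skewness of firm-specific weekly returns, as in the formula above."""
    n = len(w)
    if n < 3:
        raise ValueError("at least three weekly returns are required")
    sum_sq = sum(x ** 2 for x in w)
    sum_cu = sum(x ** 3 for x in w)
    if sum_sq == 0.0:
        raise ValueError("returns must not all be zero")
    numerator = n * (n - 1) ** 1.5 * sum_cu
    denominator = (n - 1) * (n - 2) * sum_sq ** 1.5
    return -numerator / denominator

# Hypothetical examples: many small gains and one large loss versus a symmetric sample.
crash_prone = [0.01] * 9 + [-0.09]
symmetric = [-0.02, -0.01, 0.0, 0.01, 0.02]

ncskew_crash_prone = ncskew(crash_prone)
ncskew_symmetric = ncskew(symmetric)
```

The first sample yields a positive value, reflecting the left-skewed distribution, while the symmetric sample yields a value of approximately zero.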

As measures based on third moments are potentially overly affected by extreme observations, we apply the “down-to-up” volatility \(DUVOL_{it}\) as our second crash risk variable. \(DUVOL_{it}\) equals the natural logarithm of the ratio of the standard deviation of firm-specific weekly returns in down weeks to the standard deviation of firm-specific weekly returns in up weeks:

$$DUVOL_{it} = log\left[ {\frac{{\left( {n_{u} - 1} \right)\mathop \sum \nolimits_{DOWN} W_{i\tau }^{2} }}{{\left( {n_{d} - 1} \right)\mathop \sum \nolimits_{UP} W_{i\tau }^{2} }}} \right]$$

where \(n_{u}\) and \(n_{d}\) represent the number of up and down weeks defined as weeks in which the weekly return \(W_{i\tau }\) exceeds or falls below the average return of a fiscal year, respectively. Based on the pattern of incremental gains and larger losses, “crash prone” firms should be characterised by higher standard deviation of firm-specific weekly returns in down weeks compared to up weeks. Thus, high values of \(DUVOL_{it}\) should correspond with an increased stock price crash risk for firm i in year t.
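A minimal sketch of this measure, following the formula above term by term, is given below. The input is a list of firm-specific weekly returns for one firm-year; the sample values are hypothetical.

```python
import math

def duvol(w):
    """Down-to-up volatility following the formula above."""
    mean_w = sum(w) / len(w)
    down = [x for x in w if x < mean_w]    # weeks below the annual mean
    up = [x for x in w if x > mean_w]      # weeks above the annual mean
    n_d, n_u = len(down), len(up)
    if n_d < 2 or n_u < 2:
        raise ValueError("need at least two down weeks and two up weeks")
    ss_down = sum(x ** 2 for x in down)
    ss_up = sum(x ** 2 for x in up)
    if ss_up == 0.0 or ss_down == 0.0:
        raise ValueError("sums of squared returns must be positive")
    return math.log(((n_u - 1) * ss_down) / ((n_d - 1) * ss_up))

# Hypothetical examples: a balanced sample and one with larger losses than gains.
balanced = [-0.04, -0.02, 0.02, 0.04]
larger_losses = [-0.10, -0.06, 0.02, 0.03, 0.03, 0.08]

duvol_balanced = duvol(balanced)
duvol_larger_losses = duvol(larger_losses)
```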

Thirdly, we use \(COUNT_{it}\), defined as the difference between the number of stock price crashes and the number of upward stock price jumps in firm-specific returns (Jin and Myers 2006). Stock price crashes (upward jumps) are defined as firm-specific weekly returns that fall (rise) 3.09 standard deviations below (above) the annual mean.Footnote 18 Following the same reasoning as for our previous measures, “crash prone” firms with skewed return distributions should be characterised by more stock price crashes than upward jumps. Hence, higher values of \(COUNT_{it}\) indicate increased stock price crash risk for firm i in year t. Finally, we include the indicator variable \(CRASH_{it}\) that equals one if a firm experiences a stock price crash within a fiscal year and zero otherwise. Therefore, \(CRASH_{it}\) captures the realization of a firm-specific stock price crash.
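The two event-based measures can be sketched as follows. The example assumes that the annual mean and the sample standard deviation are computed over the same firm-year of weekly returns; the helper names and the sample values are hypothetical.

```python
import statistics

def crash_and_jump_weeks(w, threshold=3.09):
    """Number of weeks more than `threshold` standard deviations below / above the annual mean."""
    mean_w = statistics.mean(w)
    sd_w = statistics.stdev(w)             # sample standard deviation (an assumption)
    crashes = sum(1 for x in w if x < mean_w - threshold * sd_w)
    jumps = sum(1 for x in w if x > mean_w + threshold * sd_w)
    return crashes, jumps

def count_measure(w):
    """COUNT: number of crash weeks minus number of jump weeks."""
    crashes, jumps = crash_and_jump_weeks(w)
    return crashes - jumps

def crash_indicator(w):
    """CRASH: one if at least one crash week occurs in the firm-year, zero otherwise."""
    crashes, _ = crash_and_jump_weeks(w)
    return 1 if crashes > 0 else 0

# Hypothetical firm-year: 51 quiet weeks and one week with a return of -50%.
weekly_returns = [0.01, -0.01] * 25 + [0.0, -0.5]
count_value = count_measure(weekly_returns)
crash_value = crash_indicator(weekly_returns)
```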

To evaluate our hypothesis that managers’ equity intent increases future stock price crash risk through bad news hoarding behaviour, we regress our crash risk measures \(NCSKEW_{it + 1} , DUVOL_{it + 1} , COUNT_{it + 1}\), and \(CRASH_{it + 1}\) on our text-based equity intent variable \(EQUITY\_INTENT_{it}\) as well as a set of control variables:

$$CRASH RISK_{it + 1} = \beta_{0} + \beta_{1} EQUITY\_INTENT_{it} + \sum \beta_{k} CONTROLS_{it}^{k} + \varepsilon_{it}$$

The set of control variables includes financial opacity, measured as the three-year moving sum of discretionary accruals \((OPAQUE_{it} )\), and its squared term, since Hutton et al. (2009) document a concave relation between financial opacity and crash risk. Firm size, measured as the natural logarithm of a firm’s market capitalization (\(LOGMV_{it}\)), controls for various aspects of a firm’s operational and business environment. Prior works show that growth firms, captured by the market-to-book ratio (\(MT{B}_{it}\)), are more crash-prone (Hutton et al. 2009; Kim et al. 2019), whereas firms that are less likely to experience stock price crashes should be more capable of obtaining debt (\(LE{V}_{it}\)). Since Graham et al. (2005) suggest that poorly performing firms are more likely to withhold bad news, we control for operating performance (\(RO{A}_{it}\)) and the mean of firm-specific weekly returns during a fiscal year (\(RE{T}_{it}\)). Chen et al. (2001) find that investor heterogeneity, measured as the detrended level of turnover (\(DTUR{N}_{it}\)), is positively associated with crash risk. Finally, firm-specific skewness (\(NCSKE{W}_{it}\)) and volatility (\(SIGM{A}_{it}\)) are documented to correlate with crash risk (Kim et al. 2019; Wu and Lai 2020). All of our regressions also include year and industry fixed effects (based on the 2-digit SIC industry classification) to account for year- and industry-wide variation in crash risk patterns. The reported standard errors are robust to heteroscedasticity (MacKinnon and White 1985).
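The structure of the regression can be sketched with simulated data. The example below is an illustration only: it uses a single stand-in control variable, omits the fixed effects, and applies the HC1 variant of the heteroscedasticity-robust covariance estimator, which is an assumption because the text does not state which variant is used. The function name and all simulated values are hypothetical.

```python
import numpy as np

def ols_hc1(y, X):
    """OLS coefficients with HC1 heteroscedasticity-robust standard errors."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    residual = y - X @ beta
    meat = X.T @ (X * residual[:, None] ** 2)        # sum of e_i^2 * x_i x_i'
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)     # HC1 degrees-of-freedom correction
    return beta, np.sqrt(np.diag(cov))

# Simulated data (not the sample used in the paper).
rng = np.random.default_rng(0)
n = 2000
equity_intent = rng.normal(size=n)                   # standardised stand-in
control = rng.normal(size=n)                         # stand-in for one control variable
noise = rng.normal(size=n) * (0.5 + 0.5 * np.abs(equity_intent))  # heteroscedastic errors
crash_risk_next = 0.1 + 0.5 * equity_intent - 0.2 * control + noise

X = np.column_stack([np.ones(n), equity_intent, control])
beta, robust_se = ols_hc1(crash_risk_next, X)
```

In an application, the design matrix would additionally contain the full set of controls and the year and industry indicator variables.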

4 Results

4.1 Univariate statistics

Panel A of Table 2 contains descriptive statistics for our main variables \({NCSKEW}_{it+1}\), \({DUVOL}_{it+1}\), \({COUNT}_{it+1}\), \(CRAS{H}_{it+1}\), and \({EQUITY\_INTENT}_{it}\), as well as our additional control variables. The average cosine similarity between an individual firm’s LIQ + CAP embedding and the equity intent vector equals 0.281. The mean values (medians) of the variables \({NCSKEW}_{it+1}\) and \({DUVOL}_{it+1}\) of 0.128 (0.030) and 0.022 (−0.035), respectively, indicate that the distribution of weekly returns of the firms in our sample exhibits a pronounced negative skewness. Consistently, we find that, on average, 23.4% of the observations experience a stock price crash as indicated by the mean value of \(CRAS{H}_{it+1}\), which is generally consistent with the descriptive statistics reported by Kim et al. (2019). Interestingly, for \({COUNT}_{it+1}\) we find a marginally negative mean of −0.001, which indicates that we observe slightly more stock price jumps than crashes. However, the direct comparison with the other two measures demonstrates that if a crash happens, it tends to be larger than a corresponding jump and is thus detrimental to investor welfare.Footnote 19

As a first step of our analysis, we conduct univariate comparisons between equity intent and our crash risk measures. To this end, we sort our sample into tercile groups by the equity intent variable \({EQUITY\_INTENT}_{it}\) and present the mean values of the one-year-ahead crash risk measures for each group in Panel B of Table 2. The results demonstrate that firms with low equity intent are more likely to experience upward stock price jumps rather than crashes as indicated by the negative means for two of our dependent variables. More importantly, we find that stock price crash risk increases monotonically from the low-equity intent group to the high-equity intent group when we measure crash risk by \({NCSKEW}_{it+1}\) and \({DUVOL}_{it+1}\). Furthermore, the differences between the high- and low-equity intent groups are statistically significant at the 1%-level. For \({COUNT}_{it+1}\), we find that crash risk decreases from the low- to the medium-equity intent group, only to peak for the high-equity intent group. The difference in crash risk between the high- and low-equity intent groups is statistically significant at the 5%-level. Similarly, turning to the \(CRAS{H}_{it+1}\) measure, we find that 22.6% of the observations in the low-equity intent group experience at least one stock price crash during a fiscal year, whereas for the high-equity intent group, 25.2% of the observations experience at least one stock price crash during a fiscal year. The difference in means between the low- and high-equity intent groups is statistically significant at the 1%-level. However, to gain a deeper understanding of how equity intent affects the level of crash risk, we need to consider other factors influencing crash risk within a multivariate analysis.
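The tercile sort can be illustrated with a short sketch. It assumes that each observation is a pair of an equity intent score and a one-year-ahead crash risk value; the function name and the sample values are hypothetical, and the significance tests for the group differences are not included.

```python
def tercile_means(equity_intent, outcome):
    """Mean outcome within the low, medium, and high tercile of equity intent."""
    if len(equity_intent) != len(outcome) or len(outcome) < 3:
        raise ValueError("need two equal-length series with at least three observations")
    pairs = sorted(zip(equity_intent, outcome))      # sort observations by equity intent
    n = len(pairs)
    cut1, cut2 = n // 3, 2 * n // 3
    groups = {
        "low": pairs[:cut1],
        "medium": pairs[cut1:cut2],
        "high": pairs[cut2:],
    }
    return {name: sum(y for _, y in group) / len(group) for name, group in groups.items()}

# Hypothetical observations (equity intent score, one-year-ahead crash risk measure).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
crash_risk = [-0.2, -0.1, 0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
group_means = tercile_means(scores, crash_risk)
```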

4.2 Main results

Table 3 shows the results of our regressions to predict one-year-ahead crash risk. The regression models differ only with respect to the dependent variable: column 1 shows the results for \({NCSKEW}_{it+1}\), column 2 for \({DUVOL}_{it+1}\), and column 3 for \({COUNT}_{it+1}\). We find that the coefficient of \({EQUITY\_INTENT}_{it}\) is positive and statistically significant at the 1%-level for all four crash risk measures, \({NCSKEW}_{it+1}, {DUVOL}_{it+1}\), \({COUNT}_{it+1}\), and \(CRAS{H}_{it+1}\). These results are in line with our hypothesis that a stronger equity intent helps to predict future stock price crash risk as it may represent a motive for bad news hoarding. Consistent with the extant literature on the relationship between earnings management and crash risk (Hutton et al. 2009), we find the coefficients for \({OPAQUE}_{it}\) (\({{OPAQUE}_{it}}^{2}\)) to be positive (negative) and statistically significant at the 1%-level or 5%-level in all model specifications. Even more importantly, these results demonstrate that a firm’s equity intent predicts crash risk in \(t+1\) after controlling for the level of earnings management and is thus a potent predictor of bad news hoarding that goes beyond the effects of earnings manipulation. In line with expectations, the R-squared values of the regression models for \({COUNT}_{it+1}\) and \(CRAS{H}_{it+1}\) indicate a weaker model fit due to the relatively small variation in these variables over the whole sample period, since both only contain information on tail events of the return distribution. Moreover, this might also contribute to less statistical significance for some of the control variables because of relatively higher standard errors. However, the overall fit of our models is consistent with results reported in prior literature on stock price crash risk, which reflects the complex nature of prediction models for rare events such as stock price crashes.
We also estimate the economic significance of \({EQUITY\_INTENT}_{it}\) for the realization of a one-year-ahead crash event as captured by \({CRASH}_{it+1}\). To this end, we use a margins model to estimate partial derivatives of the regression equation with respect to each variable in the model for each unit in the data. We find that a one standard deviation increase in \({EQUITY\_INTENT}_{it}\) is associated with a 0.80% increase in the probability of a one-year-ahead crash.Footnote 20 This effect is comparable to other determinants of crash risk identified by prior research. For instance, Kim et al. (2019) show that a one standard deviation increase in their modified readability measure is associated with a 0.52% increase in crash probability, whereas Hutton et al. (2009) show that a one standard deviation increase in \(OPAQU{E}_{it}\) is associated with a 1.73% increase in the probability of a future crash event. In summary, we consider the effect of \({EQUITY\_INTENT}_{it}\) to be economically significant. Therefore, our findings may help investors secure their investment performance against stock price crashes.
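The idea of averaging partial derivatives over all observations can be illustrated with a short sketch. It assumes a logit specification for the crash indicator, which is an assumption made for illustration because the estimator is not named in the text; the coefficients and observations below are hypothetical and are not the estimates reported in Table 3.

```python
import math

def average_marginal_effect(beta, rows, k):
    """Average over observations of dP/dx_k for a logit model P = 1 / (1 + exp(-x'beta))."""
    total = 0.0
    for x in rows:
        z = sum(b * v for b, v in zip(beta, x))
        p = 1.0 / (1.0 + math.exp(-z))
        total += p * (1.0 - p) * beta[k]   # derivative of the logistic function times beta_k
    return total / len(rows)

# Hypothetical coefficients (intercept, equity intent) and observations (1, equity intent).
beta = [-1.2, 0.4]
rows = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]]
ame = average_marginal_effect(beta, rows, k=1)
```

Multiplying such an average marginal effect by the sample standard deviation of the variable gives an approximate effect of a one standard deviation increase on the crash probability.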

Table 3 Impact of equity intent on stock price crash risk

Up to this point, our results present a positive association between a firm’s equity intent and its crash risk. To further examine the role of equity intent as a motive for bad news hoarding, we conduct a mediation analysis. In this analysis we assess the extent to which the effect of equity intent can be explained by a mediator variable that is likely to capture bad news hoarding by managers. In this regard, we assume that managers who try to block the flow of negative information to the public intentionally or unintentionally use less negative language to explain their assessment of their firm. Therefore, we construct the mediator variable \({NEGW}_{it}\) that represents the fraction of negative words used by firm managers within the MD&A section. We estimate the fraction of non-negated negative words within the MD&A section by using the Fin-Neg wordlist proposed by Loughran and McDonald (2011).Footnote 21 Following the recommendations of Baron and Kenny (1986), the subsequent mediation analysis proceeds in three steps. Firstly, as documented under “Step 1” in Table 4, we regress the mediator \(NEG{W}_{it}\) on our main independent variable \(EQUITY\_INTEN{T}_{it}\). The results show that higher equity intent is associated with significantly (at the 1%-level) fewer negative words within the MD&A section, indicating that managers who are seeking equity allow less negative news flow. Secondly, we regress the crash risk measures on \(EQUITY\_INTEN{T}_{it}\) as shown in “Step 2” of Table 4. Hence, the second step simply repeats our baseline analysis and shows that higher equity intent corresponds to higher crash risk. Thirdly, we regress the crash risk measures on both our main independent variable \(EQUITY\_INTEN{T}_{it}\) and the mediator \(NEG{W}_{it}\).
We find that while the coefficient of \(EQUITY\_INTEN{T}_{it}\) remains positive and statistically significant, negative news flow helps mitigate future crash risk as indicated by the negative and statistically significant coefficient of \(NEG{W}_{it}\).Footnote 22 Finally, we estimate the average causal mediation effect (ACME). The ACME is the indirect effect of equity intent on crash risk that runs through managers’ negative language, as assessed by the negative wording in the MD&A section of a firm’s 10-K report. For the calculation of the ACME we use 1000 simulations and non-parametric bootstrapped confidence intervals. The results are presented in columns (2)–(4). The ACME is statistically significant at the 1%-level. This finding indicates that a significant part of the effect of equity intent on crash risk can be attributed to bad news hoarding through less negative language used by firm managers in their MD&A section. In line with the predictions of Jin and Myers (2006) as well as the vast majority of empirical research on crash risk, these findings indicate that managers’ bad news hoarding behaviour serves as an important economic mechanism through which managers’ equity intent affects future crash risk.
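The three regression steps and a bootstrapped indirect effect can be sketched as follows. This is a simplified illustration under linear models without control variables: the indirect effect is computed as the product of the Step 1 and Step 3 coefficients, which is a stand-in for the ACME estimator and not necessarily the procedure used for Table 4. All function names and simulated values are hypothetical.

```python
import numpy as np

def ols(y, X):
    """OLS coefficient vector."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def mediation_steps(x, m, y):
    """Baron-Kenny style regressions; returns (a, b, total effect c, direct effect c')."""
    ones = np.ones_like(x)
    a = ols(m, np.column_stack([ones, x]))[1]         # step 1: mediator on equity intent
    c = ols(y, np.column_stack([ones, x]))[1]         # step 2: crash risk on equity intent
    step3 = ols(y, np.column_stack([ones, x, m]))     # step 3: crash risk on both
    c_prime, b = step3[1], step3[2]
    return a, b, c, c_prime

def bootstrap_indirect_effect(x, m, y, n_boot=1000, seed=0):
    """Percentile bootstrap interval (95%) for the indirect effect a * b."""
    rng = np.random.default_rng(seed)
    n = len(x)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)                   # resample observations with replacement
        a, b, _, _ = mediation_steps(x[idx], m[idx], y[idx])
        draws[i] = a * b
    return np.percentile(draws, [2.5, 97.5])

# Simulated data (illustrative only).
rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)                                # stand-in for equity intent
m = -0.5 * x + rng.normal(size=n)                     # stand-in for the fraction of negative words
y = 0.3 * x - 0.4 * m + rng.normal(size=n)            # stand-in for a crash risk measure

a, b, c, c_prime = mediation_steps(x, m, y)
ci_low, ci_high = bootstrap_indirect_effect(x, m, y)
```

In this linear setting the total effect from Step 2 equals the direct effect from Step 3 plus the indirect effect, which makes the decomposition easy to verify.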

Table 4 Bad news hoarding—mediating effect of negative wording in the MD&A section

Our paper illustrates that a firm’s equity intent acts as a motive for bad news hoarding. Given that managers actively engage in earnings manipulation (Hutton et al. 2009) as well as textual obfuscation to hide negative news (Kim et al. 2019), one could naturally ask whether these instruments of bad news hoarding are more likely to increase crash risk when managers are incentivised to hoard bad news due to their equity intent. Consistent with the literature, we use the FOG Index to assess textual obfuscation based on the readability of the 10-K report. We calculate the FOG Index as (words per sentence + percentage of complex words) × 0.4, where complex words are defined as words with three or more syllables and a higher FOG Index indicates that the report is more difficult to read. The FOG Index indicates the number of years of formal education a reader of average intelligence needs to understand a given text on the first reading. We follow the recommendations of Kim et al. (2019) and adjust the FOG Index for more than 2000 multisyllabic words that occur frequently in financial reports but are unlikely to be considered complex from an investor’s perspective. After modifying the FOG Index, we derive \(MODFO{G}_{it}\) as proposed by Kim et al. (2019). If a firm’s equity intent reinforces the effects of earnings manipulation and textual obfuscation on crash risk, we would expect that \({OPAQUE}_{it}\) and \({MODFOG}_{it}\) have a stronger predictive power when managers have equity intent. Hence, we examine the effects of earnings manipulation and textual obfuscation for two different groups based on their equity intent. Similar to the univariate comparison in Table 2, we conduct a split sample analysis by comparing firms within the lower tercile (low equity intent) and the upper tercile (high equity intent) group of \({EQUITY\_INTENT}_{it}\).
In Panel A of Table 5 the results for the low-equity intent group indicate that textual obfuscation is significant at the 5%-level only for our second crash risk measure \({DUVOL}_{it+1}\). Moreover, \({OPAQUE}_{it}\) significantly predicts crash risk in \(t+1\) for \({NCSKEW}_{it+1}\) (at the 5%-level), \({DUVOL}_{it+1}\) (at the 1%-level), and \(CRAS{H}_{it+1}\) (at the 5%-level only for the squared term of \(OPAQU{E}_{it}\)). In line with our expectations, the results for the high-equity intent group presented in Panel B of Table 5 demonstrate that the effects of both instruments of bad news hoarding (earnings manipulation and textual obfuscation) increase in effect size and statistical significance. Thus, textual obfuscation increases crash risk in \(t+1\) at least at the 5%-level for all crash risk measures. Similarly, earnings manipulation also increases crash risk in \(t+1\) at least at the 5%-level for all measures of crash risk.Footnote 23 Overall, these findings indicate that managers’ equity intent increases the adverse consequences of common bad news hoarding tools, which further strengthens our view that equity intent encourages managers to hide adverse information.
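The readability formula can be illustrated with a minimal sketch. The syllable counter below is a rough vowel-group heuristic, and the exclusion set is a hypothetical three-word stand-in for the list of more than 2000 common financial terms; both are assumptions for illustration. Complex words are taken to have at least three syllables by default, following the standard definition of the FOG Index.

```python
import re

def count_syllables(word):
    """Rough vowel-group heuristic; applied studies typically rely on more careful counts."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1                                   # treat a trailing 'e' as silent
    return max(count, 1)

def fog_index(text, not_complex=frozenset(), min_syllables=3):
    """0.4 * (words per sentence + percentage of complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    complex_words = [
        w for w in words
        if count_syllables(w) >= min_syllables and w.lower() not in not_complex
    ]
    words_per_sentence = len(words) / len(sentences)
    pct_complex = 100.0 * len(complex_words) / len(words)
    return 0.4 * (words_per_sentence + pct_complex)

# Hypothetical text and exclusion list.
text = "The company needs additional financing. We will sell equity."
common_financial_words = frozenset({"company", "financing", "equity"})

fog = fog_index(text)
modified_fog = fog_index(text, not_complex=common_financial_words)
```

Excluding common financial terms from the set of complex words lowers the index, which is the intuition behind the modified measure.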

Table 5 Impact of equity intent on the association between earnings manipulation, textual obfuscation and stock price crash risk

5 Additional analyses

In order to test the sensitivity of our results, we conduct several robustness tests. Firstly, because investors who try to hedge against future crash risk might be interested in the persistence with which equity intent signals bad news hoarding, we expand the prediction window of all four crash risk measures. While our baseline analysis employs a 12-month-ahead prediction window, managers may accumulate bad news over several years. Thus, we test the relationship between managers’ equity intent and the 24-, 36-, and 48-month-ahead stock price crash risk. The results are displayed in Table 6. We find that the coefficients of \({EQUITY\_INTENT}_{it}\) remain statistically significant at least at the 5%-level for all four crash variables when employing a 24-month window, and remain significant at least at the 5%-level for \(NCSKE{W}_{it+3}\) and \(DUVO{L}_{it+3}\) when employing a 36-month window. Even for a 48-month prediction window, the coefficient of \({EQUITY\_INTENT}_{it}\) remains statistically significant at the 10%-level when predicting \(DUVO{L}_{it+4}\). Consistent with the notion that bad news hoarding becomes more difficult to maintain over time, the coefficients of \({EQUITY\_INTENT}_{it}\) and their respective t-values steadily decrease with an increasing prediction window, as does the models’ R-squared. Given the decreasing importance of \(EQUITY\_INTEN{T}_{it}\) over time, our results suggest that investors should screen 10-K filings on a yearly basis to obtain better predictions of future crash risk.

Table 6 Persistence of the impact of equity intent on stock price crash risk

Secondly, we examine the association between managers’ equity intent and upward stock price jumps. Our results generally suggest that managers’ equity intent is associated with increasing bad news hoarding behaviour. While a gradual increase of the stock price as a result of managers “hyping up” the firm’s stock price prior to an SEO seems plausible, sudden upward price jumps seem comparably unlikely, as they would indicate a positive market reaction following the revelation of unexpected (good) news. In this case, one would expect that \({EQUITY\_INTENT}_{it}\) is associated with future stock price crashes as a consequence of the sudden release of the accumulated negative information, but has either no or a negative correlation with upward stock price jumps. Therefore, we decompose the crash risk measure \(COUN{T}_{it+1}\) into stock price crashes and upward jumps and estimate both proxies separately. Unreported results show that \({EQUITY\_INTENT}_{it}\) is positively associated only with stock price crashes and shows a negative correlation with upward jumps, which is statistically significant only at the 10%-level. Similar to the conception of a placebo test, the theoretical consistency of these results suggests that our inferences are unlikely to be driven by unobservable forces.

Thirdly, we construct an alternative measure for our main variable of interest \({EQUITY\_INTENT}_{it}\). Given the usage of neural networks, we acknowledge that the variable is subject to a large parameter space and countless interactions between neurons. While our baseline analysis employs the DM model, we next estimate document embeddings using the DBOW model proposed by Le and Mikolov (2014). The DBOW model is generally considered to be less complex than the DM model as it only uses the document matrix to predict a randomly sampled set of consecutive words from a document. For the implementation of the DBOW model, we follow the recommendations of Lau and Baldwin (2016) and estimate 300-dimensional document embeddings in 400 iterations over the whole sample of LIQ + CAP sections.Footnote 24 Similar to our baseline analysis, we use these document embeddings to estimate an equity intent vector and calculate the cosine similarity between each LIQ + CAP embedding and the equity intent vector to derive \(EQUITY\_INTENT{\_DBOW}_{it}\). Table 7 shows the results of estimating our crash risk measures using \(EQUITY\_INTENT{\_DBOW}_{it}\). The results remain virtually unchanged, indicating that our inferences are not merely driven by model and parameter choices.

Table 7 Alternative measurement of equity intent

Fourthly, we include additional control variables to address concerns of omitted variable bias.Footnote 25 Specifically, we address concerns that managers’ equity intent is merely a reflection of a firm’s risk profile by controlling for additional variables capturing business risks including a firm’s cash flow volatility \(CFVO{L}_{it}\), sales volatility \(SALESVO{L}_{it}\), earnings volatility \(EARNVO{L}_{it}\) as well as the Herfindahl–Hirschman Index based on 3-digit SIC codes \(HH{I}_{it}\) (Giroud and Mueller 2010; Kim et al. 2019). Since our main variable of interest is based on textual data, we also control for textual proxies that prior works found to be significantly associated with crash risk. These include the modified FOG Index \(MODFO{G}_{it}\) proposed by Kim et al. (2019), managers’ usage of ambiguous language measured by the fraction of weak modal words within the MD&A \(WMO{D}_{it}\) and the natural logarithm of a 10-K’s file size in megabytes \(FILESIZ{E}_{it}\) as in Ertugrul et al. (2017).Footnote 26 Finally, based on the findings of Wu and Lai (2020), we include two additional measures for firms’ inherent information asymmetry, namely, intangible intensity \(ADJROT{A}_{it}\) as suggested by Clausen and Hirth (2016) and a firm’s proprietary costs \(PROP\_COS{T}_{it}\) measured as a firm’s R&D expense scaled by total assets, as well as firm age (\(AG{E}_{it}\)). The results displayed in Table 8 show that the coefficient of \({EQUITY\_INTENT}_{it}\) remains statistically significant for all four crash measures, alleviating concerns of omitted variable bias.Footnote 27
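Among the additional controls, the Herfindahl–Hirschman Index lends itself to a short worked example. The sketch assumes that market shares are based on the sales of all firms within one 3-digit SIC industry and year, which is an assumption about the construction; the sales figures are hypothetical.

```python
def herfindahl_index(sales):
    """Sum of squared market shares of all firms in one industry-year."""
    total = sum(sales)
    if total <= 0:
        raise ValueError("total industry sales must be positive")
    return sum((s / total) ** 2 for s in sales)

# Hypothetical industries: a monopoly, a duopoly, and four equally sized firms.
hhi_monopoly = herfindahl_index([100.0])
hhi_duopoly = herfindahl_index([50.0, 50.0])
hhi_four_firms = herfindahl_index([25.0, 25.0, 25.0, 25.0])
```

Higher values indicate a more concentrated industry, with a value of one for a single-firm industry.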

Table 8 Additional controls

Finally, we test whether earnings manipulation, a common vehicle managers use to hoard bad news (Hutton et al. 2009), serves as another mediator for the effect of equity intent on stock price crash risk. Specifically, one could argue that managers not only block negative news flow but also engage in more aggressive earnings management to obfuscate financial disclosures. Therefore, in Table 9, we use \(OPAQU{E}_{it}\) as a mediator variable for \(EQUITY\_INTEN{T}_{it}\). “Step 1” of the mediation analysis shows that higher equity intent is associated with higher levels of earnings management, suggesting that managers who search for equity are more likely to engage in earnings management. Similarly, the results of “Step 2” and “Step 3” of the mediation analysis are largely consistent with our baseline analysis, indicating that earnings management serves as a significant mediator for the effects of managers’ equity intent on future crash risk. In summary, the above analyses suggest that our finding is robust: managers’ equity intent incentivises managers to hoard bad news and hence increases firm-specific crash risk.

Table 9 Bad News Hoarding – Mediating effect of earnings manipulation

6 Conclusion

Previous research has established the notion that managers are incentivised to strategically withhold negative information from outside investors in order to maximize their own capture of the firm’s cash flows (Jin and Myers 2006; Kothari et al. 2009). Since the accumulation of bad news is limited, all negative information is released at once when further obfuscation becomes too costly or difficult to maintain, causing the firm’s stock price to crash. Given the subsequent destruction of shareholder welfare, the motives for managers to engage in this behaviour have received increasing attention in the finance literature. While incentive-based management compensation, CEO overconfidence (Kim et al. 2016), and overall career concerns (Kothari et al. 2009) have already been identified as motives to obfuscate corporate disclosure, we turn our focus to corporate financing decisions, thus drawing the connection to arguably one of the most important areas of corporate decision making. Capital structure decisions, especially in the case of equity financing, are known to create agency conflicts between managers and outside investors, which should incentivise managers to obfuscate corporate disclosure.

Further extending the analysis of drivers of bad news hoarding behaviour, we analyse whether the intention to issue equity incentivises managers to strategically withhold negative information to maximize the corresponding capital inflow and strengthen their position of power inside the firm. We examine the role of managers’ intentions within capital structure decisions in shaping firm-specific stock price crash risk by analysing how managers express the financing needs of their firms in the LIQ + CAP sections of their MD&A. Using neural network embeddings of the LIQ + CAP sections, we identify different degrees of equity intent. Differences in the management’s inclination to raise more equity may lead to different levels of bad news hoarding, which in turn may impact firm-specific stock price crash risk. We expect that managers expressing equity intent in the firms’ disclosures are more prone to bad news hoarding behaviour, which helps to predict firms’ future stock price crash risk. In fact, we find that a firm’s current equity intent has strong predictive power for firm-specific stock price crash risk in \(t+1\). As a central motive for bad news hoarding, equity intent exerts effects on crash risk that are comparable to those of other popular obfuscation vehicles. By conducting a mediation analysis, we find that stronger equity intent affects future stock price crash risk through the channel of bad news hoarding. Consistent with this notion, we find that managers’ equity intent incentivises managers to obfuscate financial disclosure by manipulating earnings and writing overly complex reports. Moreover, our results remain robust for wider prediction windows of up to three years, alternative measures of equity intent, as well as a diverse set of additional factors that potentially influence stock price crash risk.

This paper contributes to the stock price crash risk literature by describing the mechanisms between management intentions in capital structure decisions and future stock price crash risk. As financing decisions are among the most essential business activities, our results are relevant for both investors and a broad range of corporate decision makers. In addition, our approach fits in with current SEC efforts to use data analytics to uncover EPS manipulation as part of their current policy agenda requiring stronger corporate disclosures to better protect investors.Footnote 28 In contrast to the rather basic textual information complementing prediction models applied in previous research, our approach holds additional predictive power for stock price crashes as it is sensitive to various textual cues in management statements within firm disclosures that are indicative of bad news hoarding. More generally, it makes it possible to gauge management intentions from written statements and therefore provides an interesting avenue for future research analysing the effects of opportunistic management behaviour on financial market outcomes. For example, using our method to identify managers’ intentions with regard to capital structure decisions, future research may provide a deeper analysis of how managers time the release of good news prior to raising capital (Lang and Lundholm 2010). While this paper suggests that information retrieval techniques that go beyond simple summarization techniques such as word counts and readability measures add predictive value, future works may further explore textual cues of bad news hoarding by applying more sophisticated approaches such as named entity recognition techniques, topic models, or word embeddings.