Decision Analytics for Initial Public Offerings: How Filing Sentiment Influences Stock Market Returns

Companies issuing stocks through an initial public offering (IPO) are obligated to publish relevant information as part of a prospectus. Besides quantitative figures from accounting, this document also contains qualitative information in the form of text. In this chapter, we analyze how sentiment in the prospectus influences future stock returns. In addition, we investigate the impact of pre-IPO sentiment in financial announcements on first-day returns. The results of our empirical analyses using 572 IPOs from US companies suggest a negative link between words linked to uncertainty and future stock market returns for up to 10 trading days. Conversely, we find that uncertainty expressed in pre-IPO announcements is positively linked to first-day stock returns. These insights have implications for research on IPOs by demonstrating that future stock returns are also driven by textual information from the prospectus and assist investors in placing their orders.


Introduction
Companies are increasingly turning to stock markets to raise expansion capital or to monetize the investments of early private investors. An initial public offering (IPO) refers to a type of public offering where shares of stock in a company are sold to the general public, for the first time, on a securities exchange. As a result of regulatory publication requirements, firms must file a prospectus with, e.g., the Securities and Exchange Commission (SEC) in the USA prior to their initial public stock offering (cf. Patton 2013a, Ritter and Welch 2002). Details of the proposed offering are disclosed to potential purchasers in the form of a detailed document known as a prospectus. This prospectus contains key figures from financial accounting, as well as information on current and future business models, potential opportunities, and risks threatening the business. The prospectus is used as an instrument to remove or at least alleviate disadvantages stemming from information asymmetries among different investor groups. The prospectus thus plays a major role in the formation of IPO prices, which ultimately determine the amount of capital that accrues from the IPO. Hence, all stakeholders involved in the IPO need to understand how the prospectus affects the final IPO prices. While the issuing company has a natural incentive to cast a positive light on itself, investors need to decipher the textual messages for relevant signals on the future performance of the company.
The theory of market microstructure may fill the bill in explaining the relationship between the prospectus and the final IPO price, as it refers to the branch of finance that is concerned with the details of how exchange occurs in markets. More specifically, market microstructure theory epitomizes the study of the process and outcomes of exchanging assets under a specific set of rules (O'Hara 1995). One crucial aspect refers to the role of information and how prices reflect it. Interestingly, market microstructure theory is still in its infancy when dealing with information that consists of text messages, such as the prospectus. The theory is, thus, nearly silent in explaining how text messages are processed into prices (cf. Jegadeesh and Wu 2013, Loughran and McDonald 2011, Tetlock et al. 2008). This is where decision analytics, as a subfield of Information Systems (IS) research, comes into play (Chen et al. 2012) in order to contribute to the understanding of how human agents process information when facing textual news. Besides quantitative information, such as quarterly earnings, the textual content also provides valuable insights. Typically, the sentiment of financial news is utilized to analyze how information is processed (Siering 2013), as investors and analysts perceive novel information differently. Liebmann et al. (2012) come to the conclusion that investors rapidly translate novel information into transactions, whereas analysts wait to respond. Similarly, commodity markets seem to be mostly driven by negative news (Feuerriegel and Neumann 2013, Feuerriegel et al. 2014). Hence, information processing has been subject to many decision analytics research publications since problems span several dimensions, namely, volume, velocity, variety, and veracity (IBM 2013).
Thus, these research questions are placed at the intersection between behavioral finance, Information Systems, and decision analytics, which has been receiving great attention lately (Chen et al. 2012).
Altogether, information from an IPO prospectus provides the basis for the decision-making process of potential investors. Thus, it is a logical consequence to study the information processing of investors facing IPO filings in the context of efficient electronic markets. Though IPO filings contain a few quantitative facts, a large proportion consists of qualitative information. While prior research focuses on quantitative facts, such as the number of offered shares, knowledge of the textual components is rare. Hence, this paper examines the statistical relationship between sentiment in IPO filings and the subsequent stock market performance.
As a main contribution to Information Systems research, this chapter sheds some light on information processing along IPO filings. The soft data contained in these filings conveys insights and potentially valuable information that is absent in traditional quantitative numbers. A recent study by Ferris et al. (2013) points out that the extent to which management is confident about the success of its issue and the implications that such beliefs have on IPO pricing are largely ignored in the literature. To close this gap, individual research questions addressing the content of IPO prospectuses, as well as contributions, are as follows.
Confirmatory Analysis: To what extent are the stock market returns of firms driven by the sentiment of their IPO filings?
As a confirmatory analysis of previous research (Ferris et al. 2013, Loughran and McDonald 2013), we investigate how investors react to IPO prospectuses and analyze information processing empirically by exploiting sentiment analysis. We find that sentiment in IPO prospectuses correlates with first-day returns. In fact, a one standard deviation increase in the sentiment measure is linked to a decrease in first-day returns by an economically significant 10.56%.
Research Question 1: How persistent is the influence of filing tone on long-term returns?
According to our results, sentiment is not only negatively correlated with returns in the short term, but the polarity of filings also remains linked to stock market performance for up to 10 trading days.
Research Question 2: How is the IPO performance affected by pre-IPO news sentiment?
The confirmatory analysis indicates that IPO filings are linked to the subsequent stock market returns. However, the media coverage of firms going public occurs in more depth, since journalists also report on the progress of the IPO, the previous performance of the firm, and the general outlook. For these reasons, we investigate the impact of pre-IPO news announcements on first-day returns and, finally, compare the influence of sentiment in IPO filings and in news announcements in terms of strength.
The remainder of this chapter is structured as follows. In Sect. 2, we combine previous research on both initial public offerings and approaches for sentiment analysis in order to study literature on their intersection, which deals with the information processing theory of IPO filings. To gauge the sentiment of these IPO filings, we detail our research model in Sect. 3. Using this research model, Sect. 4 analyzes how the sentiment in an IPO prospectus is linked to the corresponding stock market performance and how information processing is influenced by pre-IPO news sentiment. Section 5 concludes the chapter with a summary and an outlook on future research.

Related Work
In this section, we present related literature grouped into three categories. First, we revisit literature covering different types of IPO filings. Second, we compare approaches to measure the sentiment of financial documents. Third, we review previous work on the information processing of investors when facing IPO filings. All in all, the following references provide evidence that investigating the effect of sentiment in IPO filings across different time spans, as well as the impact of pre-IPO news sentiment, is both a novel and relevant research question to the Information Systems community.
To a large extent, market efficiency relies upon the availability of information (Fama 1965). Access to market information is promoted with ease in electronic markets, and because of straightforward access, decision-makers (i.e., consumers, suppliers, and intermediaries) can use more detailed information to make purchases and sales more beneficial (e.g., Granados et al. 2010). In fact, it is both native and crucial to IS research regarding how decision-makers process and act upon (qualitative) information in IPO filings. Though information processing has been extensively studied in capital markets, this is not the case when it comes to literature focusing on IPO filings: "An examination of the textual or soft information contained in prospectuses is less common" (Ferris et al. 2013).

IPO Filings
This section introduces the basic filing types required by US firms going public. For US firms, one of the first steps in going public is filing a Form S-1 on the Securities and Exchange Commission's (SEC) Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system. Typically, the S-1 offers investors their first detailed glimpse of a firm's business model and financial statements. It is a pre-effective registration statement submitted when a company decides to go public. Hence, an investor can use its content to evaluate potential investment opportunity. If the S-1 is followed by a successful initial public offering, firms file a final prospectus (so-called Form 424), which is published via EDGAR on the day of or a few days following the IPO. Loughran and McDonald (2013) report a median time lag of 88 calendar days between S-1 and Form 424. According to the authors, more calendar time between the S-1 and 424 filings (the final IPO prospectus) is associated with significantly lower first-day returns. Companies with more problematic S-1 filings or facing adverse market conditions are often delayed in issuing stock. Some of the delay in going public could be due to the many days of management needed to properly respond to SEC concerns about the IPO document or a sharp decline in IPO market conditions. However, the overall tone of the S-1 is typically very similar to the tone of the final IPO prospectus. In fact, changes in tone between the S-1 and 424 filings seem to be unrelated to positive revisions in the offer price (Loughran and McDonald 2013).

Sentiment Analysis in the Finance Discipline
Methods that use the textual representation of documents to measure the positivity and negativity of the content are referred to as opinion mining or sentiment analysis. In fact, sentiment analysis can be utilized to extract subjective information from text sources, as well as to measure how market participants perceive and react to financial materials. Here, one uses the observed price reactions following the financial text to validate the accuracy of the sentiment analysis routines. Based upon sentiment measures, one can study the relationship between financial documents and their effect on markets. For example, empirical evidence shows that a discernible relation between news content and its stock market reaction exists (e.g., Antweiler and Frank 2004, Tetlock 2007, among the first).
As sentiment analysis is applied to a broad variety of domains and text sources, research has devised various approaches to measure sentiment. For example, Pang and Lee (2008) provide a comprehensive domain-independent survey. Within finance, recent literature surveys (Minev et al. 2012, Mittermayer and Knolmayer 2006b) compare studies aimed at stock market prediction. For example, dictionary-based approaches are very frequently used in recent financial text mining research (cf. Demers and Vega 2010, Henry 2008, Jegadeesh and Wu 2013, Loughran and McDonald 2011, Tetlock et al. 2008). These methods count the frequency of predefined positive and negative words from a given dictionary, producing results that are straightforward and reliable. Similarly, term-weighting methods also rely upon positive and negative words, but these are classified based on word frequencies in a training set (i.e., the prior probabilities) to assign a weight to each term (Liebmann et al. 2012). Machine learning approaches (e.g., Antweiler and Frank 2004, Li 2010, Mittermayer and Knolmayer 2006a, Schumaker and Chen 2009) offer a broad range of methods but may suffer from overfitting (Sharma and Dey 2012).
In our research framework, we have only a few hundred IPO filings, each linked to a large text basis (i.e., a length of several thousand words) along with a continuous return value. Thus, we have experienced difficulties in achieving robust results with term-weighting as well as with machine learning approaches and focused, instead, on dictionary-based methods. In fact, we have tested all of the above dictionary-based metrics in combination with various dictionaries. In short, we find that dictionaries vary strongly in the strength of the link between sentiment and stock market reaction. In line with previous research on IPOs, the ratio of words labeled as uncertainty according to the Loughran and McDonald Financial Sentiment Dictionary (McDonald 2012) outperforms all other approaches.

Information Processing Theory of IPO Filings
This section provides an overview of related research investigating the information processing of IPO filings. First, we review other research studying the information processing of IPOs, but without considering the polarity of the prospectus content.
• Arnold et al. (2010) examine the risk factors section of a prospectus. They not only count the number of risk factors disclosed in this section but also measure the number of words used to explain each of the risk factors. They find that the soft information is significantly related to both the initial and subsequent IPO returns, but not to ex post measures of investor sentiment.
• Hanley and Hoberg (2012) find that issuers trade off first-day returns and disclosure in amended filings (S-1/A) as a hedge against the risk of future lawsuits. The two authors show that IPOs having a higher risk of material omissions in their filings are more likely to hedge litigation exposure with higher levels of underpricing. A greater level of disclosure by issuers, as proxied by more meaningful revisions in their S-1/A filings during the bookbuilding period, lowers the probability of being sued by investors. The key assumption of Hanley and Hoberg (2012) is that an issuer with a large pre-IPO price adjustment (but having only revised their filings slightly) is likely to have a material omission in the prospectus as the issuer did not disclose the information underlying the price change.
• To evaluate disclosure style, Loughran and McDonald (2014) create a standardized statistic that aggregates a series of writing components specifically identified by the SEC. The six components are average sentence length, average word length, passive voice, legalese, personal pronouns, and negative/superfluous phrases. In the context of Form 424 filings, the authors find substantial differences in the content of IPO filings. This is a result of the 1998 rule implemented by the SEC requiring firms to use plain English in their prospectus filings.
Closest to our research are the approaches by Ferris et al. (2013), Hanley and Hoberg (2010), and Loughran and McDonald (2013), in terms of both research objectives and methodology. We individually address each of these papers as follows:
• Ferris et al. (2013) examine the effect that conservative or cautionary language (measured using negative tone) in the prospectus might have on IPO performance. As a result, greater conservatism in the prospectus is related to increased underpricing. The authors observe that this relation is stronger for technology than non-technology IPOs. Besides that, they study how conservative language evolves across time. In addition, they find a significant relationship between conservative language from the Loughran-McDonald dictionary and an average industry-adjusted return on assets for the 3 years following the IPO, but give no insights on how returns evolve through time.
• Hanley and Hoberg (2010) examine the information content of IPO initial prospectuses and its effect on pricing. The authors split the information contained in the S-1 into standard and informative components. Interestingly, they find that IPOs with more informative content in their S-1 have lower offer price revisions and first-day returns. They argue that IPOs with more standard S-1 content would be more likely to solicit information from investors during bookbuilding.
• Liu et al. (2014) explore the effects of pre-IPO media coverage (in the Factiva database) measured by the number of newspaper articles. According to the authors, this news count correlates positively with the stock's long-term value, liquidity, analyst coverage, institutional investor ownership, and first-day returns. Overall, the paper provides evidence for a long-term role for media coverage, consistent with Merton's attention or investor recognition hypothesis.
• Loughran and McDonald (2013) study the relationship between S-1, as well as Form 424 filings, and the IPO first trading day returns. As a result, IPOs with a more uncertain or weak modal tone experience higher first trading-day returns. Moreover, the percentages of uncertain, weak modal, and negative words in the S-1 are much more powerful variables in explaining levels of underpricing than many commonly used IPO control variables, such as venture capital dummy, top-tier underwriter dummy, or trailing annual sales. In fact, the t-statistics are higher when using Form 424 instead of S-1. In addition, the tone is positively linked with higher volatility in a 60-day period following the offering, but there is no similar evidence for stock market returns.
Hence, information processing has been the subject of many IS research publications. However, the above research papers concentrate on a partial examination of prospectus sentiment, and, thus, many questions are left unanswered. All listed publications lack (1) an in-depth evaluation of how persistent the influence of filing content on long-term returns is and (2) a comparison between the reception of IPO filings and pre-IPO news sentiment. Consequently, an empirical study analyzing the effects of filing sentiment in the long term and in pre-IPO news seems to be an open research question. Therefore, we pursue a rigorous IS approach: First, we utilize sentiment analysis to extract the polarity of IPO filings. Second, we specify a model to measure the impact of prospectus sentiment empirically.

Preprocessing IPO Filings
Before performing the actual sentiment analysis, several operations are involved in the preprocessing phase. The individual steps are as follows:
• Cleaning. We use the textual component of IPO filings only, and then, all HTML tags and XBRL syntax are removed. This is consistent with Ferris et al. (2013), as well as Loughran and McDonald (2013).
• Tokenization. Each announcement is split into sentences and single words named tokens (Grefenstette and Tapanainen 1994).
• Negations. Negations invert the meaning of words and sentences. According to Loughran and McDonald (2011), as well as Tetlock (2007), handling negations is an inevitable prerequisite to measure positive tone. However, in all previous papers, handling negations is neglected. To close this gap, we adopt the following approach (Dadvar et al. 2011): When encountering the word no, each of the subsequent three words (i.e., the object) is counted as words from the opposite dictionary. When other negating terms are encountered (rather, hardly, couldn't, wasn't, didn't, wouldn't, shouldn't, weren't, don't, doesn't, haven't, hasn't, won't, hadn't, never), the meaning of all succeeding words is inverted.
• Stop word removal. Words without a deeper meaning, such as the, is, of, etc., are named stop words and, thus, can be removed. We use a list of 571 stop words (Lewis et al. 2004).
• Synonym merging. Synonyms, though spelled differently, convey the same meaning. Thus, approximately 150 frequent synonyms are grouped and aggregated by their meaning, a method referred to as pseudoword generation (Manning and Schütze 1999).
• Stemming. Stemming refers to the process of reducing inflected words to their stem (Manning and Schütze 1999). Here, we use the so-called Porter stemming algorithm.
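The first steps of this pipeline can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this chapter: the regular-expression tokenizer, the naive sentence splitter, the tiny stop word list, and all function names are hypothetical, and the scope of "all succeeding words" is assumed to end with the sentence. Synonym merging and stemming are omitted.

```python
import re

# Negating terms listed in the text; "no" is handled separately.
NEGATORS = {"rather", "hardly", "couldn't", "wasn't", "didn't", "wouldn't",
            "shouldn't", "weren't", "don't", "doesn't", "haven't", "hasn't",
            "won't", "hadn't", "never"}
# Tiny illustrative stop word list (the chapter uses a list of 571 words).
STOP_WORDS = {"the", "is", "of", "a", "an", "and", "to", "in"}


def strip_markup(raw):
    """Remove HTML/XBRL-style tags, keeping only the text content."""
    return re.sub(r"<[^>]+>", " ", raw)


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens; apostrophes are kept for contractions."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", sentence.lower())


def mark_negations(tokens):
    """Return (token, negated) pairs for one sentence.

    After "no", the next three tokens are flagged; after any other negating
    term, all remaining tokens of the sentence are flagged.
    """
    marked, window, rest_of_sentence = [], 0, False
    for tok in tokens:
        negated = rest_of_sentence or window > 0
        if window > 0:
            window -= 1
        if tok == "no":
            window = 3
        elif tok in NEGATORS:
            rest_of_sentence = True
        marked.append((tok, negated))
    return marked


def preprocess(raw):
    """Clean, tokenize, mark negations, then drop stop words."""
    result = []
    for sentence in split_sentences(strip_markup(raw)):
        marked = mark_negations(tokenize(sentence))
        result.extend((t, n) for t, n in marked if t not in STOP_WORDS)
    return result


print(preprocess("<p>There is no material risk exposure today. Results may vary.</p>"))
```

Negations are marked before stop words are removed so that the three-word window after "no" is counted on the original word sequence; the text does not state which order was used for this detail, so this is a design choice of the sketch.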

Method for Sentiment Analysis
As shown in a recent empirical evaluation on the robustness of sentiment analysis (Feuerriegel and Neumann 2013), the correlation between sentiment in financial materials and the corresponding stock market reaction varies strongly across different sentiment metrics. Out of the potential dictionaries, the ratio of words labeled as uncertain, according to the Loughran and McDonald Financial Sentiment Dictionary (McDonald 2012), achieves the highest robustness, and, consequently, we rely upon this approach in the following evaluation. Let us briefly recapitulate this sentiment analysis approach. Let $W_{\text{total}}(P)$ denote the total number of words in the prospectus $P$ and $W_{\text{uncertain}}(P)$ denote the number of uncertainty words in the prospectus $P$. Then, the sentiment is defined by

$$\text{Sentiment}(P) = \frac{W_{\text{uncertain}}(P)}{W_{\text{total}}(P)}. \tag{1}$$

Thus, the sentiment variable $\text{Sentiment}(P)$ measures the number of uncertainty words normalized by the number of total words. We improve the metric, in contrast to Ferris et al. (2013), as well as Loughran and McDonald (2013), by using rules to invert the meaning when encountering negations. Thus, tokens following a negation are not counted as uncertainty words. This allows us to further boost the accuracy.
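The following Python sketch illustrates how such a ratio could be computed once tokens carry a negation flag. The word list is a small hypothetical sample and not the actual uncertainty list of the Loughran and McDonald dictionary; the function name and the convention that negated tokens still count toward the total are assumptions of this sketch.

```python
# Hypothetical sample of uncertainty words; the actual list comes from the
# Loughran and McDonald Financial Sentiment Dictionary.
UNCERTAINTY_WORDS = {"may", "might", "uncertain", "approximately", "risk", "could"}


def uncertainty_sentiment(marked_tokens):
    """Share of uncertainty words among all tokens of a prospectus.

    marked_tokens: iterable of (token, negated) pairs. Tokens flagged as
    negated are not counted as uncertainty words, but still count toward
    the total number of words (an assumption of this sketch).
    """
    total = 0
    uncertain = 0
    for token, negated in marked_tokens:
        total += 1
        if token in UNCERTAINTY_WORDS and not negated:
            uncertain += 1
    return uncertain / total if total else 0.0


# "risk" follows a negation here, so only "may" is counted: 1 of 5 tokens.
tokens = [("results", False), ("may", False), ("vary", False),
          ("no", False), ("risk", True)]
print(uncertainty_sentiment(tokens))
```

If stemming is applied to the text, the dictionary entries would have to be stemmed in the same way before matching; that step is left out here.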

Empirical Evaluation: Analyzing Sentiment of IPO Filings
Having discussed the steps to compute sentiment values, we apply the sentiment analysis to investigate how stock market returns of initial public offerings are driven by their prospectus. Initially, we choose the IPO filing corpus and gain first insights by descriptive statistics. Then, we specify all control variables, which, ultimately, give the regression design that inspects information processing by linking returns with prospectus sentiment. By varying the time span of returns, we study the persistence of sentiment on stock market performance, and in addition to that, we analyze the reception of news covering initial public offerings.

IPO Filing Corpus, News Corpus, and Stock Market Data
A structured dataset of US initial public offerings is found in the so-called Kenney-Patton IPO database (Kenney and Patton 2013a;b). This database consists of all de novo firms going public on American exchanges and filed with the Securities and Exchange Commission (SEC), where we restrict the corpus to those initial public offerings between the years 2003 and 2010. This accounts for a total of 625 IPOs.
In a next step, we collect the final prospectus (i.e., Form 424) for each of these initial public offerings from the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system. Filings are available online for 621 out of the 625 initial public offerings.
Afterward, we combine the Form 424 filings with their corresponding stock market performance. Here, we use the return after the n-th trading day, defined as

$$R_n = \frac{P_n - P_0}{P_0}, \tag{2}$$

where $P_0$ denotes the initial offer price and $P_n$ the closing price on the n-th trading day. Daily data from stock markets is (mostly) provided by Thomson Reuters Datastream, where price data is available for 577 initial public offerings. As in Ferris et al. (2013) and Loughran and McDonald (2013), we require an initial offer price of at least $5. In order to adjust for sector-specific differences like previous research (Ferris et al. 2013, Loughran and McDonald 2013), we introduce dummy variables for each industry. We use the dummies provided by the Thomson Reuters Business Classification (TRBC), which are classified into 52 industry groups. Because of data availability, we use a total of 572 out of these 577 observations.
Our news corpus originates from the Thomson Reuters News Archive for Machine Readable News. We choose Reuters news deliberately for three reasons: (1) Reuters conveys, in particular, news about stock markets. (2) Reuters news is third-party content and, thus, gives a certain level of objectivity. (3) In contrast to newspapers, news agencies feature a shorter time lag and lack perturbations by edits. All announcements provided by Reuters date from January 1, 2003, onward. The announcements come along with stock symbols denoting the firm that the content deals with. Based upon these labels, the news corpus is filtered such that we extract the announcements of firms going public. In addition, we require announcements to be published during the 365 days prior to the IPO. All in all, this set of criteria filters a total of 1085 announcements covering initial public offerings. If no announcement on an IPO is available, we set the sentiment value to 0. If more than one announcement is available, we use the average sentiment value. Furthermore, we incorporate the number of news announcements linked to each IPO into a separate control variable #News.
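The aggregation of news sentiment per IPO can be sketched as follows. The data layout, the function names, and the ticker symbols are hypothetical, and excluding announcements published on the IPO day itself is an assumption, since the text only specifies "the 365 days prior to the IPO".

```python
from collections import defaultdict
from datetime import date


def in_pre_ipo_window(published, ipo_date, days=365):
    """True if the announcement appeared within `days` days before the IPO.

    Excluding the IPO day itself is an assumption of this sketch.
    """
    delta = (ipo_date - published).days
    return 0 < delta <= days


def aggregate_news(ipos, announcements):
    """Return {ticker: (mean sentiment, #News)} for every IPO.

    ipos: {ticker: ipo_date}
    announcements: iterable of (ticker, published, sentiment) triples.
    IPOs without any announcement get a sentiment of 0.0 and a count of 0.
    """
    by_ticker = defaultdict(list)
    for ticker, published, sentiment in announcements:
        if ticker in ipos and in_pre_ipo_window(published, ipos[ticker]):
            by_ticker[ticker].append(sentiment)
    result = {}
    for ticker in ipos:
        values = by_ticker.get(ticker, [])
        mean = sum(values) / len(values) if values else 0.0
        result[ticker] = (mean, len(values))
    return result


# Made-up example data
ipos = {"AAA": date(2006, 5, 1), "BBB": date(2007, 3, 15)}
news = [("AAA", date(2006, 4, 1), 0.010),
        ("AAA", date(2006, 2, 1), 0.030),
        ("AAA", date(2004, 1, 1), 0.500)]  # more than 365 days early, ignored
summary = aggregate_news(ipos, news)
print(summary)
```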
Examples of headlines are given in Table 1. From this table, we can see that the news not only reports that a firm is going public but also provides further insights into the pre-IPO activities.

Descriptive Statistics of Stock Market Returns
In our evaluation, we vary the time span of returns to study the persistence of prospectus sentiment. All descriptive statistics of issued share prices, as well as returns after the n-th trading day, are presented in Table 2. Accordingly, the initial issued prices range from $4.00 to $85.00, while the mean initial share price accounts for $14.2613 with a standard deviation of $6.5506. Furthermore, the returns after 21 trading days span approximately from −0.8337 to 30.8000, with a nonzero mean of approximately 0.4304. Interestingly, the kurtosis of 134.3806 on the first day of trading is substantially higher than 3, indicating the existence of heavy tails, which are probably caused by price spikes. Both characteristics of nonzero mean and a high kurtosis are consistent with related literature (cf. Jenkinson and Ljungqvist 2001, Ritter and Welch 2002, for recent reviews), where these effects are referred to as underpricing.
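As a hedged illustration of these statistics, the sketch below computes returns and their mean, standard deviation, and kurtosis for made-up prices. It assumes that the return after the n-th trading day is the simple relative change with respect to the offer price, and it uses the Pearson (non-excess) kurtosis, which equals 3 for a normal distribution and therefore matches the comparison made in the text. The function names are hypothetical.

```python
from statistics import fmean, pstdev


def nth_day_return(offer_price, close_n):
    """Simple return relative to the offer price (assumed definition)."""
    return (close_n - offer_price) / offer_price


def kurtosis(values):
    """Pearson (non-excess) kurtosis; equals 3 for a normal distribution."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])
    m4 = fmean([(v - mean) ** 4 for v in values])
    return m4 / m2 ** 2


# Made-up closing prices for an offer price of 10.0, including one price spike
returns = [nth_day_return(10.0, p) for p in (10.5, 9.8, 10.2, 10.1, 30.0)]
print(fmean(returns), pstdev(returns), kurtosis(returns))
```

A single extreme observation, as in the example, is enough to push the kurtosis above the normal benchmark, which mirrors the price-spike explanation given above.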

Control Variables
When isolating and extracting the effect of sentiment, we use a wide range of control variables. We control for the development of the stock market and, in addition to that, further control variables which comprise the number of shares, the share overhang, and the underwriter discount. The latter three variables are retrieved from the Kenney-Patton IPO database (Kenney and Patton 2013a, b), which we improved with additional corrections. Similar variables are used by a number of prior papers to explain first-day returns (see, e.g., Ferris et al. 2013, Loughran and McDonald 2013). Individual control variables are as follows:
• UpRevision. This control variable measures the percentage of the upward revision from the midpoint of the filing range if the offer price is greater than the midpoint; otherwise it is set to zero. As in Ferris et al. (2013) and Loughran and McDonald (2013), including this variable is crucial; when this variable is added last, many others are no longer significant.
• $\text{Days}_{\text{S-1,IPO}}$. To control for possible delays of IPOs, we include a variable measuring the logarithmized calendar days between S-1 and the initial public offering.
• $\text{NASDAQ}_{-15\,\text{d}}$. We include the general development of a stock market index to ensure that the stock market performance is driven by sentiment instead of the economic cycle or the general mood of investors. Thus, we choose the return of the NASDAQ-100 stock market index of the 15 days prior to the initial public offering. Here, values originate from Thomson Reuters Datastream.
• lnSales. This variable is defined as the logarithm of the number of shares sold to the public in the offering multiplied by the initial share price. Thus, the logarithm of all sales controls for the volume of the initial public offering.
• ShareOverhang. This variable indicates the number of shares retained divided by the number of shares in the initial offering. More precisely, it is defined as

$$\text{ShareOverhang} = \frac{S_{\text{outstand}} - S_{\text{sold}}}{S_{\text{sold}}}, \tag{3}$$

where $S_{\text{outstand}}$ is the number of shares outstanding after the offer and $S_{\text{sold}}$ is the number of shares sold to the public in this offering.
• UnderwriterDiscount. This variable includes the per share discounts and commissions taken by the underwriters.
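The control variables that are computed from deal characteristics can be sketched as below. Function and parameter names are hypothetical; expressing UpRevision in percent (rather than as a fraction) and using the natural logarithm are assumptions, since the text does not fix either choice.

```python
import math
from datetime import date


def up_revision(offer_price, range_low, range_high):
    """Percentage upward revision from the filing-range midpoint, else 0."""
    midpoint = (range_low + range_high) / 2
    if offer_price > midpoint:
        return (offer_price - midpoint) / midpoint * 100
    return 0.0


def share_overhang(shares_outstanding, shares_sold):
    """Shares retained divided by shares sold in the offering."""
    return (shares_outstanding - shares_sold) / shares_sold


def ln_sales(shares_sold, offer_price):
    """Logarithm of the offering volume (shares sold times offer price)."""
    return math.log(shares_sold * offer_price)


def log_days(s1_date, ipo_date):
    """Logarithm of the calendar days between the S-1 and the IPO."""
    return math.log((ipo_date - s1_date).days)


# Made-up deal: filing range $9 to $11, priced at $12, 10m of 40m shares sold
print(up_revision(12.0, 9.0, 11.0))
print(share_overhang(40_000_000, 10_000_000))
print(ln_sales(10_000_000, 12.0))
print(log_days(date(2010, 1, 1), date(2010, 3, 30)))
```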
By using these control variables, we want to assure that our results measure the impact that comes from the sentiment only and avoid perturbations by other causes, such as changes in fundamental variables or external events. Next, descriptive statistics of all variables are presented in Table 3. In addition to that, we incorporate additional dummies into our model to adjust for both seasonal and sector-specific effects.

Regression Design
This section presents the corresponding regression design, i.e.,

$$R_n = \beta_0 + \beta_1\,\text{Sentiment}(P) + \beta_2\,\text{UpRevision} + \beta_3\,\text{Days}_{\text{S-1,IPO}} + \beta_4\,\text{NASDAQ}_{-15\,\text{d}} + \beta_5\,\text{lnSales} + \beta_6\,\text{ShareOverhang} + \beta_7\,\text{UnderwriterDiscount} + \text{Dummies} + \varepsilon, \tag{4}$$

where $R_n$ denotes the return after the n-th day of trading, $\beta_i$ are coefficients to be estimated, and $\varepsilon$ is the error term. Dummies are added for each sector, as well as year, to consider additional external events not covered by the control variables and to handle nonseasonally adjusted time series. The correlation of sentiment values and first-day returns is depicted in Fig. 2. This diagram also features a so-called LOWESS trend line, which is a locally weighted scatterplot smoothing calculated via local regressions. From this LOWESS trend line (smoothing parameter f set to 2/3), we can identify a visible relationship between the sentiment variable and stock market returns. Finally, we account for extreme stock price effects and remove outliers at the 0.05% level at both ends.
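To make the estimation step concrete, the sketch below fits a linear model with an intercept, a sentiment regressor, one control, and year dummies by ordinary least squares. It solves the normal equations with plain Gaussian elimination so that it runs without external libraries; in practice, a statistics package that also reports t-values and diagnostics would be used. All names and the synthetic data are hypothetical, and the first year serves as the baseline category to avoid perfect collinearity with the intercept.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def design_row(sentiment, controls, year, years):
    """Intercept, sentiment, controls, and year dummies (first year = baseline)."""
    dummies = [1.0 if year == y else 0.0 for y in years[1:]]
    return [1.0, sentiment] + list(controls) + dummies


# Synthetic observations: (sentiment, one control, IPO year)
years = [2003, 2004]
observations = [(0.010, 1.0, 2003), (0.012, 2.0, 2003), (0.015, 1.5, 2004),
                (0.011, 3.0, 2004), (0.014, 2.5, 2003)]
X = [design_row(s, [c], yr, years) for s, c, yr in observations]
# Returns generated without noise from known coefficients
y = [0.5 - 60.0 * row[1] + 0.02 * row[2] + 0.1 * row[3] for row in X]
beta = ols(X, y)
print(beta)
```

Because the synthetic returns contain no noise, the estimated coefficients reproduce the ones used to generate the data, which serves as a quick sanity check of the estimator.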

Results: Linking Sentiment and Stock Market Performance
In this section, we investigate how investors react to IPO prospectuses and analyze the information processing empirically. To succeed in this goal, we exploit sentiment analysis to measure the relationship between investor behavior and the content of IPO filings.

Confirmatory analysis: To what extent are stock market returns of firms driven by the sentiment of their IPO filings?
We use the above regression design from Eq. (4) to measure the impact of sentiment in prospectuses on IPO underpricing. We tested for heteroskedasticity, constant variance, serial correlation, and normally distributed residuals at the 0.01% level to ensure that the results are not confounded. When checking variance inflation factors, we also see no indication of multicollinearity. Independence across IPO filings is given as long as all prospectuses are entirely novel and not based on an interrelated course of events. Under the assumption that first-day returns are jointly multivariate normal, as well as independently and identically distributed through time, the model can be estimated using ordinary least squares (OLS).
Regression results are given in Table 4. According to this table, we observe that prospectus sentiment influences first-day returns significantly. The corresponding t-value accounts for −2.505 with a p-value of 0.013. Furthermore, we find that a one standard deviation increase in the sentiment measure is linked to a decrease in first-day returns by an economically significant 10.56%. When additionally comparing the coefficients of the fundamental variables and sentiment, we find that the magnitude of the sentiment coefficient (accounting for −61.879) greatly exceeds that of all other coefficients originating from the fundamental variables. In addition to that, except for UpRevision, none of the control variables is significant at the 10% level. Altogether, these results provide evidence that the sentiment in IPO prospectuses correlates significantly with first-day returns.

Results: Analyzing the Persistence of Sentiment Impact Over Time
Having discussed the regression design, we proceed to analyze how the sentiment affects n-th trading day returns after the initial public offering.

Research Question 1: How persistent is the influence of filing sentiment on long-term returns?
Thus, we vary the time span of returns across 10 and 21 days of trading, which is equivalent to 2 weeks and 1 month, respectively. The corresponding regression results are presented in Tables 5 and 6. Again, we checked for heteroskedasticity, serial correlation, normally distributed residuals, and multicollinearity to ensure that we can estimate the model using OLS. In short, the results can be summarized as a declining influence of sentiment for longer time lags.
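How the return horizon can be varied is sketched below. The exact return definition is part of the regression design introduced earlier and is not restated here; the sketch assumes the common convention of measuring the percentage change from the offer price to the closing price on the n-th trading day. The function name and the price series are hypothetical.

```python
from typing import Sequence


def return_after_n_days(offer_price: float, closes: Sequence[float], n: int) -> float:
    """Percentage return from the offer price to the close of trading day n (1-based)."""
    if not 1 <= n <= len(closes):
        raise ValueError("n must lie within the available trading days")
    return (closes[n - 1] - offer_price) / offer_price * 100.0


# Hypothetical closing prices for the first 21 trading days after an IPO
# priced at 20.00; the values are made up for illustration only.
toy_closes = [22.0] + [21.5] * 8 + [21.0] + [20.5] * 10 + [20.0]
horizon_returns = {n: return_after_n_days(20.0, toy_closes, n) for n in (1, 10, 21)}
```

Each horizon yields its own dependent variable, and the same regression is then re-estimated once per horizon.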
In more detail, the coefficient of the sentiment variable starts at −61.879 for first-day returns but changes slightly to −54.302 after the 10th day of trading. Similarly, the t-statistic of the sentiment variable moves from −2.505 for first-day returns to −2.266 after the 10th day of trading. Its absolute value shrinks further to −1.033 when looking at returns after the 21st day of trading. The influence of sentiment on both the first and the 10th day of trading is significant at the 5% level.

Results: Prospectus Sentiment Versus News Sentiment
While the previous sections analyzed the influence of prospectus sentiment, the question regarding the relevance of general news coverage is left unanswered.

Research Question 2: How is the IPO performance affected by pre-IPO news coverage?
We extend the regression of Eq. (4) by an additional variable measuring the sentiment of news announcements related to that initial public offering. Out of all 572 initial public offerings, we find at least one announcement in the news corpus for 207 firms, i.e., 36.2%. We tested for heteroskedasticity, serial correlation, and normally distributed residuals at the 0.01% level to ensure that the results are not confounded. When checking variance inflation factors, we also see no indication of multicollinearity. Thus, the model can be estimated using ordinary least squares (OLS). The regression results are presented in Table 7. The influence of IPO sentiment remains similar with a coefficient of −66.045 and a corresponding t-value of −2.456. However, the influence of news announcements seems much stronger. Here, the news sentiment has a coefficient of 20.907 with a p-value of 0.00048 and a t-value of 3.518. This is interesting for two reasons. First, the larger absolute t-value indicates a significant link between returns and news that is even stronger than that of the IPO sentiment. Second, while more uncertainty words in the prospectus lower the first-day returns, this does not hold for the news announcements: the more uncertainty words in the news announcements, the higher the first-day return. While a one standard deviation increase in the IPO sentiment measure is linked to a decrease in first-day returns by an economically significant 11.30%, the news sentiment behaves differently. Here, a one standard deviation increase in news sentiment correlates with an increase in first-day returns by 14.72%. Furthermore, the number of news announcements (given by #News) has no impact on IPO prices at any common significance level. Astonishingly, the news effect is larger in magnitude and, at the same time, points in the opposite direction: the more negative the pre-IPO news coverage, the higher the first-day returns.
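The economic magnitudes quoted above can be related to the regression coefficients with a small worked example. It assumes that the effect of a one standard deviation change equals the coefficient multiplied by the standard deviation of the respective sentiment measure. The standard deviations are not reported in this section, so the values computed below are merely implied by the reported figures under that assumption; the function name is hypothetical.

```python
def implied_standard_deviation(coefficient: float, effect_per_sd: float) -> float:
    """Standard deviation implied by effect_per_sd = coefficient * std_dev."""
    if coefficient == 0:
        raise ValueError("coefficient must be non-zero")
    return effect_per_sd / coefficient


# Prospectus sentiment: coefficient -66.045, reported decrease of 11.30%.
sd_prospectus = implied_standard_deviation(-66.045, -11.30)

# News sentiment: coefficient 20.907, reported increase of 14.72%.
sd_news = implied_standard_deviation(20.907, 14.72)
```

Under this reading, the news sentiment measure varies considerably more across IPOs than the prospectus sentiment measure, which explains why the smaller news coefficient still translates into the larger effect per standard deviation.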
A possible explanation lies in the investors' psychological constraints in information processing: the processing depends on the attention that different information receives. This phenomenon is already addressed in the accounting literature (Hirshleifer et al. 2011): limited investor attention theory for earnings announcements suggests that investors neglect information about future earnings in current-period announcements, creating a post-earnings announcement drift. Subsequently, this mispricing is corrected over time. The same explanation may apply to IPO prospectuses, where mainly the risks of the company are recognized, which may entail the negative coefficient of our IPO sentiment variable. Underestimating future opportunities may create a post-prospectus announcement drift that was visible in our previous analysis. News announcements prior to the IPO are apparently less focused on the companies' risks. On the contrary, the positive coefficient suggests that more emphasis is put on the companies' opportunities. Apparently, the degree of investor attention depends on the specific type of information.

Conclusion
Before shares of a company are sold to the general public on a securities exchange for the first time, regulatory publication requirements force US firms to file an initial public offering prospectus. Studying the information processing of investors facing these filings is an active research question in the context of efficient electronic markets, though knowledge is scarce (e.g., Liebmann et al. 2012): "While the accounting numbers in IPO prospectuses are closely studied by investors, analysts, and others involved in the equity issuance process, an examination of the textual or soft information contained in prospectuses is less common" (Ferris et al. 2013).
To close this gap, research must harness decision analytics in order to explain the relationship between the textual content of filings and final IPO prices.
In this chapter, we address information processing in IPO filings by analyzing how the sentiment in prospectuses influences stock market performance. Using the final filings of 572 US initial public offerings (Form 424) between 2003 and 2010, we can identify soft content as a major driver of stock returns. All in all, we can empirically establish a relationship between sentiment in IPO filings and stock market reaction. A one standard deviation increase in the sentiment measure is linked to a decrease in first-day returns by an economically significant 10.56%. This dependency appears not only on the first day of trading but also over longer periods of up to 10 days of trading. Apparently, "soft information can offer context to financial numbers and share values, provide insight into managerial expectations, and identify important qualifiers or caveats that are absent from purely numerical data" (Ferris et al. 2013). As an explanation, we find supporting evidence that we can expect initial public offerings with substantial uncertain language to have, on average, lower preliminary offer prices. This effect occurs due to the need for bankers to compensate investors for their information production (Loughran and McDonald 2013). According to Ferris et al. (2013), the findings "suggest that when hard information is noisier, textual information is seen as having greater usefulness and is more likely to be factored into IPO pricing." However, a much stronger impact on stock market prices comes not from the IPO prospectus but from the pre-IPO news sentiment. Interestingly, the more uncertainty words that appear in pre-IPO news, the higher the subsequent first-day stock market return. Consequently, this uncertainty puzzle remains to be resolved. It seems that, when reading news, investors mainly focus on the opportunities of a company rather than on the risks. This stands in contrast to the prospectus, where risk considerations play a major role.
The work presented in this chapter opens several avenues for future research. First, a pooled regression can be utilized to analyze possible differences between non-crisis and crisis periods, in which investors shift their information processing toward more cautious behavior. Second, further effort is needed to validate our approach in terms of robustness, and, thus, we plan to extend our analysis to further filings, such as S-1 or, alternatively, transcripts of court decisions. Third, an obvious next step is to bridge the gap between explanatory regression models and a prediction model (Shmueli and Koppius 2011).

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.