Investor sentiment and the cross‑section of stock returns: new theory and evidence

We extend the noise trader risk model of Delong et al. (J Polit Econ 98:703–738, 1990) to a model with multiple risky assets to demonstrate the effect of investor sentiment on the cross-section of stock returns. Our model formally demonstrates that market-wide sentiment leads to relatively higher contemporaneous returns and lower subsequent returns for stocks that are more prone to sentiment and difficult to arbitrage. Our extended model is consistent with the existing empirical evidence on the relationship between sentiment and cross-sectional stock returns. Guided by the extended model, we also decompose investor sentiment into long- and short-run components and predict that long-run sentiment is negatively associated with the cross-sectional return, while short-run sentiment varies positively with it. Consistent with these predictions, we find a negative relationship between the long-run sentiment component and subsequent stock returns and a positive association between the short-run sentiment component and contemporaneous stock returns.

Ding, Wenjie ORCID: https://orcid.org/0000-0003-2774-5777, Mazouz, Khelifa ORCID: https://orcid.org/0000-0001-6711-1715 and Wang, Qingwei ORCID: https://orcid.org/0000-0002-3695-7846 2019. Investor sentiment and the cross-section of stock returns: New theory and evidence. Review of Quantitative Finance and Accounting 53, pp. 493-525. 10.1007/s11156-018-0756-z
Publisher's page: http://dx.doi.org/10.1007/s11156-018-0756-z
This version is being made available in accordance with publisher policies. See http://orca.cf.ac.uk/policies.html for usage policies. Copyright and moral rights for publications made available in ORCA are retained by the copyright holders.

Introduction
Several theoretical studies, such as Delong et al. (1990) (DSSW hereafter), demonstrate that investor sentiment affects asset prices when rational arbitrageurs face limits to arbitrage. 1 These studies focus on a single risky asset, and accordingly, their models are more suitable for empirical tests involving aggregate market portfolios (Huang et al. 2015). However, while there is ample evidence that market sentiment affects the cross-section of asset returns, 2 little has been done to explain the theoretical basis for the role of investor sentiment in the context of multiple assets.
In this study, we provide a parsimonious and tractable model of how investor sentiment affects the cross-section of stock returns. We extend the DSSW model by introducing multiple risky assets that differ in their exposure to market-wide sentiment. Our analysis is motivated by the premise that predictions from a single-asset model do not necessarily hold in multi-asset economies (Verrecchia 2001). For example, Cochrane et al. (2008) show that the price-dividend ratio is constant in the one-tree model of Lucas (1978), but varies over time in a two-tree model. Therefore, it is unclear whether DSSW predictions can be generalized to markets with more than one risky asset.
To shed some light on this issue, we develop a simple model that formalizes Baker and Wurgler's (2006) idea that sentiment-prone assets are also more difficult to arbitrage and provides the theoretical intuition for the widely documented evidence that investor sentiment affects cross-sectional asset returns. Our model assumes that there are two types of risky assets, A and B, and that irrational investors' beliefs are biased upwards (downwards) more towards A than B when market sentiment is high (low). That is to say, asset A has higher exposure to market-wide sentiment (is more sentiment-prone) than asset B. When investor sentiment is uncertain, this assumption also implies that the equilibrium returns of asset A will fluctuate more with shifts in market sentiment than those of asset B, and hence pose higher noise trader risk to rational arbitrageurs. Thus, the contemporaneous returns of asset A are expected to exhibit greater sensitivity to changes in investor sentiment than the contemporaneous returns of asset B. The returns of asset A are also expected to reverse more than the returns of asset B as investor sentiment eventually reverts to its mean. Therefore, the return difference between the more sentiment-prone asset and the less sentiment-prone asset should be positively associated with the change in contemporaneous sentiment and negatively related to the level of lagged sentiment. These predictions are in line with the existing empirical evidence on the relationship between sentiment and cross-sectional stock returns.
Similar to DSSW, our model of the sentiment effect features long- and short-run investor sentiment components. The long-run sentiment reflects the average bullishness of noise traders, while the short-run sentiment represents the transitory deviations from the long-run sentiment. Both components affect the price of the single risky asset in the DSSW model. Unlike DSSW, the two components in our model have cross-sectional implications. When the long-run sentiment remains the same and the short-run component increases, irrational investors become more bullish and drive up the relative returns of more sentiment-prone stocks. Hence, we predict a positive correlation between contemporaneous changes in the short-run sentiment and the relative returns of sentiment-prone stocks over sentiment-immune stocks. However, a higher long-run sentiment exerts more upward pressure on the prices of more sentiment-prone stocks and reduces their expected future returns. 3 Therefore, we predict the long-run component to be a contrarian predictor of subsequent cross-sectional returns.
Motivated by our theoretical model, we empirically decompose investor sentiment into a short-run sentiment component, constructed as incremental changes of sentiment, and a long-run sentiment component, measured by a moving average of investor sentiment in the past 2 years. 4 We test the cross-sectional pricing effect of both components with common stocks listed on NYSE, AMEX, and NASDAQ between July 1965 and September 2015. We follow Baker and Wurgler (2006) and construct sixteen long-short portfolios that buy more sentiment-prone stocks and short less sentiment-prone stocks. Sentiment-prone stocks tend to be small, young, more volatile, unprofitable, non-dividend-paying, and distressed, with extreme growth potential or with a relatively high percentage of intangible assets.
Consistent with our theoretical predictions, we find a positive relationship between short-run sentiment and contemporaneous cross-sectional stock returns and a negative association between long-run sentiment component and subsequent cross-sectional stock returns. Further analysis suggests that the effect of the sentiment components on returns is generally stronger for stocks that are small, young, volatile, unprofitable/non-dividend paying, financially distressed and have high growth potential. These findings are robust after accounting for systematic risk and time-varying factor loadings as well as to alternative sentiment measures, to alternative constructions of portfolios, and to alternative decomposition of sentiment. Additionally, the effect of decomposed investor sentiment remains strong after taking the effect of investor attention into account.
The contribution of our paper is twofold. First, we contribute to the theory of investor sentiment by presenting a parsimonious model that explicitly examines the effect of market-wide sentiment on cross-sectional asset returns. Such a model provides a theoretical underpinning for the existing empirical evidence that market-wide sentiment does not have a uniform impact on stock returns. It also captures well the idea of Baker and Wurgler (2006) that more sentiment-prone assets are also more difficult to arbitrage. The type of limits to arbitrage we consider here is noise trader risk, while limits to arbitrage in Baker and Wurgler (2006) take many other forms, including transaction costs and idiosyncratic risk. Admittedly, our multiple-asset version of the noise trader risk model is a simple and straightforward extension of DSSW (1990), and is not intended to model the belief formation process that originates from the behavioral heuristics evidenced in the psychology literature (Barberis et al. 1998) or sentiment that arises endogenously. Instead, our simple model aims to bridge the gap between the theoretical studies, which focus on the role of sentiment in the context of a single risky asset, and the empirical literature, which provides ample evidence on the cross-sectional pricing impact of investor sentiment.
DSSW (1990) propose a simple overlapping generations model of a market with one risky asset and one risk-free asset and two types of two-period-lived agents, sophisticated investors and irrational noise traders with stochastic misperception. The uncertainty of noise traders' misperception creates "noise trader risk" that deters rational investors from fully arbitraging. Because of its single risky asset setting, the DSSW model is presumably better suited for explaining the impact of investor sentiment at the aggregate level (Huang et al. 2015).
Since the claims or results in a single-asset model can sometimes be reversed in multi-asset economies (Verrecchia 2001), it is unclear whether predictions of DSSW can be generalized to markets with more than one risky asset. In this study, we extend the single risky asset model of DSSW to a noise trader risk model with multiple risky assets that vary in their exposure to market-wide investor sentiment.

A cross-sectional noise trader risk model
In the DSSW noise trader risk model, sophisticated investors (denoted as "i") have rational expectations, and noise traders (denoted as "n") hold biased beliefs and trade on noise. The percentage of noise traders in the market is set as $\mu$, and the percentage of sophisticated investors is $1 - \mu$. The utility of both noise traders and sophisticated investors is a constant absolute risk aversion (CARA) function of wealth, $U = -e^{-(2\gamma)w}$, where $\gamma$ is the coefficient of absolute risk aversion and $w$ is wealth. If holding period returns are normally distributed, maximizing expected utility is equivalent to maximizing $\bar{w} - \gamma\sigma_w^2$, where $\bar{w}$ is the expected final wealth, and $\sigma_w^2$ is the one-period-ahead variance of wealth.
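The step from CARA utility to a mean-variance objective rests on the moment-generating function of a normal variable. A brief sketch, assuming final wealth $w \sim N(\bar{w}, \sigma_w^2)$ and the DSSW convention of writing the utility exponent as $2\gamma$ (the symbols $\gamma$, $w$, $\bar{w}$ and $\sigma_w^2$ are notation adopted here for illustration):

```latex
\begin{aligned}
E[U] &= -E\left[e^{-2\gamma w}\right]
      = -\exp\left(-2\gamma \bar{w} + \tfrac{1}{2}(2\gamma)^2 \sigma_w^2\right) \\
     &= -\exp\left(-2\gamma\left(\bar{w} - \gamma \sigma_w^2\right)\right).
\end{aligned}
```

Because $-\exp(-2\gamma x)$ is strictly increasing in $x$ for $\gamma > 0$, the portfolio that maximizes expected utility is the one that maximizes $\bar{w} - \gamma\sigma_w^2$.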
There are two assets in the DSSW model, one risky asset and one risk-free asset. The difference between the risk-free asset and the risky asset lies in their supply. The risk-free asset is in perfectly elastic supply, which implies that its price is fixed. However, the supply of the risky asset is fixed at one unit, which means that its price fluctuates with changes in demand. In each period the risk-free asset has a fixed real rate of r and the risky asset has a fixed dividend rate r.
Unlike DSSW, our model has two risky assets, $u_1$ and $u_2$, in addition to the risk-free asset. 5 Sophisticated investors choose to hold $\lambda^i_{t,1}$ of the risky asset $u_1$ and $\lambda^i_{t,2}$ of the risky asset $u_2$ to maximize their expected utility. Noise traders, given their misperception, maximize their expected utility by choosing a portfolio that holds $\lambda^n_{t,1}$ of risky asset $u_1$ and $\lambda^n_{t,2}$ of risky asset $u_2$. Due to the various types of limits to arbitrage described in Shleifer and Vishny (1997), sophisticated investors and noise traders cannot take unlimited positions against each other.
Assume that the overall market sentiment $\rho_t$ follows a normal distribution, $\rho_t \sim N(\rho^*, \sigma_\rho^2)$. To examine the cross-sectional effect of investor sentiment and noise trader risk, we further assume that noise traders have different misperceptions of the risky assets $u_1$ and $u_2$.
Equations (1) and (2) show that noise traders' misperception of a risky asset contains a systematic component proportional to market sentiment and an idiosyncratic component. The new disturbance terms that we add to the DSSW model are the idiosyncratic investor sentiment terms, $\varepsilon_{t,i}$. Such idiosyncratic disturbances capture the firm-specific investor sentiment that is often observed in financial markets. Our focus, however, is not the pricing role of the idiosyncratic investor sentiment, but the pricing effect of market sentiment, $\rho_t$.
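The structure described above, a systematic component proportional to market sentiment plus an idiosyncratic component, can be illustrated with a small simulation. This is a minimal sketch under assumptions: the linear form `alpha[i] * rho_t + eps` is inferred from the verbal description rather than quoted from Eqs. (1) and (2), and the function name, parameter values and the symbol `eps` are hypothetical.

```python
import random
import statistics


def simulate_misperceptions(alpha, rho_star, sigma_rho, sigma_eps, n_periods, seed=0):
    """Draw market sentiment and asset-level misperceptions.

    Assumed form (inferred from the text, not quoted from it):
        rho_t      ~ N(rho_star, sigma_rho^2)        market-wide sentiment
        rho_{t,i}  = alpha[i] * rho_t + eps_{t,i}    misperception of asset i
        eps_{t,i}  ~ N(0, sigma_eps[i]^2)            idiosyncratic sentiment
    The idiosyncratic draws are independent of market sentiment and of each other.
    """
    rng = random.Random(seed)
    market = [rng.gauss(rho_star, sigma_rho) for _ in range(n_periods)]
    assets = []
    for a, s in zip(alpha, sigma_eps):
        assets.append([a * rho + rng.gauss(0.0, s) for rho in market])
    return market, assets


# Asset 1 is more exposed to market sentiment than asset 2 (alpha_1 > alpha_2 > 0).
market, (mis_1, mis_2) = simulate_misperceptions(
    alpha=(1.5, 0.5), rho_star=0.1, sigma_rho=1.0,
    sigma_eps=(0.2, 0.2), n_periods=20000,
)
var_1 = statistics.pvariance(mis_1)  # close to 1.5**2 * 1.0 + 0.2**2
var_2 = statistics.pvariance(mis_2)  # close to 0.5**2 * 1.0 + 0.2**2
```

With equal idiosyncratic variances, the misperception of the more exposed asset is more volatile, which is the sense in which it carries more noise trader risk in this sketch.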
While there is no theory supporting a particular stochastic structure of investor sentiment, Eqs. (1) and (2) can be motivated in three ways. First, we do observe episodes of bull and bear markets and associated waves of market-wide sentiment (e.g., during the bubble period in American stocks in the 1920s), as well as sentiment towards individual stocks (e.g., the South Sea company bubble). Second, while existing empirical studies mainly focus on market-wide sentiment, there is a growing literature on the role of idiosyncratic sentiment, often measured by the degrees of positive and negative tones in a firm's media coverage, corporate disclosures, internet posts, and overnight returns (e.g., Tetlock 2007; Bhattacharya et al. 2009; Kim and Kim 2014; Aboody et al. 2018). Finally, some theoretical models assume that future dividends or firm profitability are linear in the market and firm-specific factors (e.g., Peng and Xiong 2006; Jackson et al. 2017). 6 If noise traders form their (biased) beliefs about future price on the basis of these models, then their beliefs are likely to be linearly related to market and firm-specific misperceptions.
$\mathrm{cov}(\varepsilon_{t,1}, \rho_t) = 0$, $\mathrm{cov}(\varepsilon_{t,2}, \rho_t) = 0$, $\mathrm{cov}(\varepsilon_{t,1}, \varepsilon_{t,2}) = 0$ (4)
5 Extending the model further with more than two risky assets is straightforward. We focus on the two risky assets model since it is sufficient to shed light on the cross-sectional effect of investor sentiment.
6 A number of empirical studies document that market and firm-specific factors help explain analyst forecast accuracy (e.g., Hutton et al. 2012). Practitioners also place great emphasis on the market factor when estimating a firm's future profitability (Jackson et al. 2018).
Equations (1)-(4) above imply that 2 . Later we show that the equilibrium price volatility of asset $u_1$ is larger than that of asset $u_2$ due to higher noise trader risk ( 2 1 > 2 2 ). Higher noise trader risk poses stronger limits to arbitrage for rational investors who trade against irrational investors. As a result, our model parsimoniously captures the intuitive observation of Baker and Wurgler (2006) that more sentiment-prone assets are also more difficult to arbitrage, although the limits to arbitrage in Baker and Wurgler (2006) are broader and not necessarily related to noise trader risk.
For sophisticated investors, maximizing expected utility is equivalent to maximizing
For noise traders, maximizing expected utility is equivalent to maximizing
$\dots\,\lambda^n_{t,2}\,{}_t\mathrm{cov}(p_{t+1,1}, p_{t+1,2}) + \lambda^n_{t,1}(\alpha_1\rho_t + \varepsilon_{t,1}) + \lambda^n_{t,2}(\alpha_2\rho_t + \varepsilon_{t,2})$
where ${}_t\sigma^2_{p_{t+1,1}}$ is the conditional expectation of the one-step-ahead variance of $p_{t+1,1}$, ${}_t\sigma^2_{p_{t+1,2}}$ is the conditional expectation of the one-step-ahead variance of $p_{t+1,2}$, and ${}_t\mathrm{cov}(p_{t+1,1}, p_{t+1,2})$ is the conditional expectation of the covariance of the one-step-ahead risky asset prices $p_{t+1,1}$ and $p_{t+1,2}$. Solving the above optimization problem with first-order conditions yields the portfolio holdings of the two risky assets:
7 The proposition derived from this extended model will be the same if 2 1 > 2 2 . Generally speaking, we consider the idiosyncratic sentiment to have a much smaller sample variance compared with the systematic sentiment component. Our conclusions do not rely on the relationship between 2 1 and 2 2 .
8 We relax the assumption of positive $\alpha_1$ and $\alpha_2$ in the discussions at the end of this section.
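The first-order conditions of a two-asset mean-variance problem have a simple closed form, which may help in reading the holdings equations. The sketch below is a generic illustration, not a reproduction of the paper's Eqs. (5)-(10): `excess` stands for the expected excess payoff per unit of each asset, the variances and covariance are those of next-period prices, and both function names are hypothetical. Noise traders are assumed to add their misperception to the expected payoff, as in DSSW.

```python
def two_asset_demand(excess, var_1, var_2, cov_12, gamma):
    """Solve the first-order conditions of
        max  l1*m1 + l2*m2 - gamma*(l1**2*var_1 + 2*l1*l2*cov_12 + l2**2*var_2)
    for the holdings (l1, l2), where (m1, m2) = excess.
    """
    m1, m2 = excess
    det = var_1 * var_2 - cov_12 ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Holdings equal the inverse covariance matrix times m, scaled by 1/(2*gamma).
    l1 = (var_2 * m1 - cov_12 * m2) / (2.0 * gamma * det)
    l2 = (var_1 * m2 - cov_12 * m1) / (2.0 * gamma * det)
    return l1, l2


def noise_trader_demand(excess, misperception, var_1, var_2, cov_12, gamma):
    """Same problem, with the perceived payoff shifted by the misperception."""
    perceived = (excess[0] + misperception[0], excess[1] + misperception[1])
    return two_asset_demand(perceived, var_1, var_2, cov_12, gamma)


# Illustrative numbers only.
soph = two_asset_demand((0.02, 0.02), var_1=0.04, var_2=0.02, cov_12=0.01, gamma=1.0)
noise = noise_trader_demand((0.02, 0.02), (0.01, 0.0),
                            var_1=0.04, var_2=0.02, cov_12=0.01, gamma=1.0)
```

In this example a bullish misperception of the first asset raises the noise traders' holding of that asset relative to the sophisticated investors' holding, all else equal.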
By assuming that equilibrium prices in all periods have identical distributions, we can obtain the equilibrium pricing functions by solving the following function recursively:
If the conditional variance of the price is constant, substituting the conditional one-step-ahead price forward yields:
The equilibrium price is a function of both the misperception level and the noise trader risk induced by stochastic misperception. The second term in the equilibrium price function shows the change in the price caused by the fluctuations of the overall market misperception around its long-run mean. When noise traders become more bullish relative to the average overall market misperception, their demand pushes the price up. The third term captures the deviation of the price from the fundamental value caused by the average overall market misperception. The fourth term captures the compensation for bearing the "noise trader risk" created by the uncertainty of the next period's misperception. Noise trader risk makes sophisticated investors unwilling to trade entirely against noise traders since future misperceptions of noise traders can become extreme. The type of limits to arbitrage that deters sophisticated investors from trading against noise relates mainly to the uncertainty about the idiosyncratic and systematic components of sentiment. 9 The last term captures the fluctuation in price caused by the variation of idiosyncratic misperception towards risky asset $u_1$. Equations (11) and (12) imply that the unconditional price volatility and the price covariance of the two risky assets at time t + 1 are given as:
(1 + r) 2 + 2 2 1 (1 + r) 2
9 Limits to arbitrage in Baker and Wurgler (2006) take many other forms, such as transaction costs and idiosyncratic risk.
We also solve the Eqs. (7)-(10) to obtain the portfolio holdings for sophisticated investors and noise traders: Given the obtained price volatility and covariance, then 2 k 2 − 2 1 2 2 < 0 , k 2 − 2 2 1 < 0 , and k 1 − 2 1 2 < 0 . Thus, the sophisticated investors' holdings of the two risky assets are inversely proportional to current market sentiment, while the noise traders' holdings of these risky assets are positively associated with the current market sentiment. That is, sophisticated investors reduce their demand for sentiment-prone assets while noise traders increase their demand for sentiment-prone assets when overall market sentiment becomes more bullish.
The exposure of risky assets to the overall market misperception also affects the portfolio holdings. The sophisticated investors' holdings of $u_1$ ($u_2$) are positively related to $\alpha_1$ ($\alpha_2$). Since $\alpha_1 > \alpha_2$, the current market sentiment will have a greater effect on investors' (both sophisticated and irrational) holdings of asset $u_1$. This also means that when overall market sentiment becomes more bullish, sophisticated investors will reduce their demand for asset $u_1$ more than for asset $u_2$, and irrational investors will increase their demand for asset $u_1$ more than for asset $u_2$.
Recall that the excess return from date t to date t + 1 is denoted as $R_{t+1} = r + {}_t p_{t+1} - p_t(1 + r)$. Plugging in the equilibrium price, we obtain the return for asset $u_1$ and asset $u_2$ at time t + 1
where the asset-specific terms for $u_1$ ($u_2$) are functions of $\varepsilon_{t,1}$ and $\varepsilon_{t+1,1}$ ($\varepsilon_{t,2}$ and $\varepsilon_{t+1,2}$). Hence, the return difference between the two risky assets is
The equilibrium returns can also be expressed as functions of the deviation of current sentiment from its long-run mean, denoted as $\eta_t$ ($\eta_t = \rho_t - \rho^*$).
Hence, the return difference between asset $u_1$ and asset $u_2$ at time t + 1 can be written as
Taking the unconditional expectations of Eq. (25), we obtain:
We consider $\rho^*$, the mean of the overall market misperception, as the long-run sentiment component, and the incremental sentiment change as the short-run sentiment component. The latter is measured by either the sentiment increment, $\rho_t - (1 + r)\rho_{t-1}$, or the change in sentiment's deviation from its long-run mean, $\eta_{t+1} - (1 + r)\eta_t$.
Equations (20) and (21) show that the short-run sentiment is positively related to the returns of both risky assets. The effect is stronger for asset $u_1$ than for $u_2$ because $\alpha_1 > \alpha_2$ [see Eq. (25)]. As a result, we have the following hypothesis on the pricing effect of the short-run component of investor sentiment:

Hypothesis 1
The short-run sentiment component is positively related to the contemporaneous difference in returns of more sentiment-prone and less sentiment-prone assets.
Inspection of Eqs. (20) and (21) also reveals that returns of a risky asset depend on the exposure of the misperception about its price to the overall market misperception. High long-run sentiment implies lower equilibrium returns for both risky assets ($\alpha_1$ and $\alpha_2$ are positive), and the returns of $u_1$ are more adversely affected by the long-run sentiment than those of $u_2$ ($\alpha_1 > \alpha_2$). This leads to our hypothesis on the pricing effect of the long-run component of investor sentiment.

Hypothesis 2
The long-run sentiment component predicts lower returns of more sentiment-prone assets relative to returns of less sentiment-prone assets.
If we relax the assumption that $\alpha_1 > \alpha_2 > 0$ by allowing $\alpha_2 < 0$ while keeping $\alpha_1 > 0$, the effects of long- and short-run investor sentiment are reversed for the risky asset $u_2$. In this case, when the average market sentiment becomes more bullish, it exerts a negative effect on contemporaneous returns due to the short-run sentiment effect, and a positive effect on future returns due to the long-run sentiment effect. The opposite is true for the risky asset $u_1$, as it still has a positive exposure to market sentiment. Hence, the returns of the two risky assets move in opposite directions, and the effect of long- and short-run sentiment will be muted at the aggregate market level. Baker and Wurgler (2007) make a similar argument that if the price of a low sentiment-prone stock is inversely related to sentiment, due to reasons such as "flight to quality", then the effect of sentiment on the aggregate market return is unlikely to be significant. This implication may also explain the inconclusive evidence on the impact of investor sentiment on aggregate market returns (Chen et al. 1993; Elton et al. 1998; Brown and Cliff 2004; Baker and Wurgler 2007; Baker et al. 2012). However, for a given positive $\alpha_1$ when $\alpha_1 > \alpha_2$, a negative (instead of positive) $\alpha_2$ makes the cross-sectional return difference between $u_1$ and $u_2$ more dispersed. Hence, it is not surprising that several empirical studies find that market sentiment affects asset returns in the cross-section (Baker and Wurgler 2006; Lemmon and Portniaguina 2006; among others).

Portfolio construction
To test our theory, we rely on Baker and Wurgler (2006) to identify proxies for how sensitive a stock is to the overall market sentiment (i.e., $\alpha_1$ and $\alpha_2$ in our model) and use the Baker-Wurgler sentiment index as our baseline measure of market sentiment (i.e., $\rho_t$ in our model). Baker and Wurgler (2006) argue that firms that are small, young, volatile, non-dividend-paying, non-profitable, informationally opaque, financially distressed or have more growth opportunities are more sentiment-prone. Similar to Baker and Wurgler (2006), we construct sixteen long-short portfolios based on ten characteristics representing how sentiment-prone a stock is. The ten characteristics are market capitalization (ME), firm age (Age), total risk (Sigma), earnings-book ratio for profitable firms (E/BE), dividend-book ratio for dividend payers (D/BE), fixed assets ratio (PPE/A), research and development ratio (RD/A), book-to-market ratio (BE/ME), external finance over assets (EF/A) and sales growth ratio (GS). 10 These proxies could measure the exposure to systematic sentiment in our model because they reflect how attractive a stock is to speculative demand (and how difficult it is for sophisticated investors to arbitrage mispricing). 11
The monthly stock returns are from CRSP. The stock market data include the returns of all common stocks (share codes 10 and 11) on NYSE, AMEX, and NASDAQ (stock exchange codes 1, 2, and 3) between January 1962 and September 2015. Our final sample consists of 18,175 firms. The firm-level accounting data are from Compustat. We adopt the Fama and French (1992) approach and match the year-end accounting data of year t − 1 to monthly returns from July t to June t + 1. The ten firm characteristics are winsorized at the 99.5% and 0.5% levels annually. The breakpoints for deciles are defined using only NYSE firms. The top three, middle four and bottom three decile portfolios used to construct the long-short portfolios are denoted as H, M and L, respectively. We follow Baker and Wurgler's (2006) approach to construct the long-short portfolios and obtain the relative returns of more sentiment-prone stocks over less sentiment-prone stocks when possible. 12 To facilitate the interpretation of regression results, we use the returns of more sentiment-prone stocks minus the returns of less sentiment-prone stocks as the dependent variables. For example, BE/ME(L-M) represents the return difference between the bottom three deciles and the middle four deciles when using BE/ME as the sorting characteristic. All the portfolio returns are equal-weighted. 13
The sixteen long-short portfolios can be categorized into six groups. The first group is named "Size, Age, and Risk". Small, young, and volatile firms are sentiment-prone, and therefore the long-short portfolios constructed on ME, Age and Sigma are denoted as ME(L-H), Age(L-H), and Sigma(H-L), respectively. The second group is referred to as "Profitability and Dividend Policy". In this group, non-profitable stocks or stocks with low dividend payments are considered to be more prone to sentiment. The third group is labelled "Tangibility" and contains portfolios constructed from PPE/A and RD/A. Stocks with more intangible assets are considered to be more informationally opaque and therefore more prone to sentiment. The last three groups are called "Growth Opportunity and Distress", "Growth Opportunities" and "Distress", respectively, and consist of portfolios that are constructed from BE/ME, EF/A, and GS.
10 Details on these characteristics variables are provided in Table 7 in the "Appendix".
11 Several studies also support the view that these ten characteristics measure the elasticity of a stock towards investor sentiment. Berger and Turtle (2012) test whether the ten characteristics measure the sentiment-prone level effectively by calculating the average sentiment beta in the cross-sectional decile portfolios sorted on these characteristics. They conclude that investor sentiment sensitivities are significantly correlated with their categorization measures. Sharma et al. (2015) find that time-varying herding behaviour is more prevalent in certain stock sectors.
The reason for constructing nine long-short portfolios based on the last three characteristics (BE/ME, EF/A, and GS) is that those three characteristics have a multidimensional nature. Stocks in the top and bottom deciles sorted on BE/ME, EF/A or GS represent either extreme growth or extreme value stocks, while firms in the middle deciles are more stable and safer. At the same time, those three characteristics are also seen as generic pricing factors in the literature. Take BE/ME as an example. High BE/ME implies that a firm is in distress, while low BE/ME indicates extreme growth potential. On the one hand, because financially distressed stocks are highly appealing to the speculative demand of irrational investors, firms with high BE/ME, low EF/A or low GS are likely to be more prone to sentiment. On the other hand, as high growth firms are hard to value, the returns of firms with low BE/ME, high EF/A, and high GS are likely to be affected more by shifts in investor sentiment.
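The sorting procedure described in this section (NYSE-only decile breakpoints, H/M/L groups, equal-weighted returns) can be sketched for a single month as follows. This is an illustration under assumptions rather than the authors' code: the record layout (`char`, `ret`, `nyse`), the function names and the simple empirical quantile rule used for breakpoints are all hypothetical choices.

```python
from statistics import mean


def nyse_breakpoints(values, n_groups=10):
    """Cut points taken from the sorted characteristic values of NYSE firms."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_groups] for k in range(1, n_groups)]


def assign_group(value, breakpoints):
    """Map a characteristic value to L (deciles 1-3), M (4-7) or H (8-10)."""
    decile = 1 + sum(value >= b for b in breakpoints)
    if decile <= 3:
        return "L"
    if decile <= 7:
        return "M"
    return "H"


def long_short_return(stocks, long_group, short_group):
    """Equal-weighted return of long_group minus that of short_group.

    Each stock is a dict with keys 'char' (sorting characteristic),
    'ret' (monthly return) and 'nyse' (True for NYSE-listed firms).
    """
    cuts = nyse_breakpoints([s["char"] for s in stocks if s["nyse"]])
    buckets = {"L": [], "M": [], "H": []}
    for s in stocks:
        buckets[assign_group(s["char"], cuts)].append(s["ret"])
    return mean(buckets[long_group]) - mean(buckets[short_group])


# Toy cross-section: ten NYSE firms sorted on size, smaller firms earning more.
example = [{"char": float(c), "ret": 0.01 * (11 - c), "nyse": True}
           for c in range(1, 11)]
me_l_minus_h = long_short_return(example, "L", "H")  # analogue of ME(L-H)
```

Non-NYSE firms enter the portfolios but not the breakpoints, so the groups need not contain equal numbers of stocks.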
Panel A of Table 1 presents the summary statistics for the long-short portfolio returns. Most of the long-short portfolios have positive average returns. The negative average portfolio returns of EF/A(H-L) and GS(H-L) are not surprising, because EF/A and GS measure growth opportunity and financial distress in the opposite way from BE/ME. 14 Returns of the sixteen long-short portfolios are all positively skewed. The last two columns in Panel A report the first-order autoregression coefficients (AR(1)) and the correlation between the variable and 1-month lagged Baker-Wurgler sentiment (Corr), respectively. Although the portfolio returns of ME(L-H) and BE/ME(H-L) have little autocorrelation, the returns of all other long-short portfolios are significantly autocorrelated. Furthermore, except for BE/ME(L-H), GS(H-L) and BE/ME(H-M), Baker-Wurgler sentiment negatively predicts future returns with strong statistical significance. The negative correlation coefficients between the long-short portfolio returns and 1-month lagged investor sentiment accord with previous studies showing that investor sentiment is a good contrarian predictor of future cross-sectional returns.
12 We only consider RD/A from 1972 because the RD/A data are not available until 1971. We use the same variable definitions as Baker and Wurgler (2006) except for RD/A, for which we do not replace missing values with zero. Replacing missing R&D values with zeros causes problems after mid 2000, where almost half of the observations are zero. We focus on the stocks that have non-missing R&D values. Monthly returns are adjusted for delisting.
13 We also conduct tests with value-weighted returns and find similar results.
14 One may find it more appropriate to construct long-short portfolios as EF/A(L-H) and GS(L-H). However, due to the multidimensional nature of EF/A and GS, we would not expect decomposed investor sentiment to perform well in predicting long-short portfolio returns calculated with the top and bottom portfolios of EF/A or GS, whichever the long legs or short legs are. Accordingly, we use EF/A(H-L) and GS(H-L) to be consistent with Baker and Wurgler (2006).

Table 1 Summary statistics
This table reports the descriptive statistics of the dependent variables and the decomposed Baker-Wurgler sentiment measures. H is a portfolio of the top three deciles; M is the middle four deciles; L contains the bottom three deciles. AR(1) represents the first-order autocorrelation. Corr represents the correlation between the summarised variable and the one-month lagged Baker-Wurgler sentiment index. Panel A contains summary statistics for the return premiums of the sixteen long-short strategies. In Panel A, the first column is the characteristic used to sort portfolios and the second column is the portfolio construction. Panel B contains the statistics of the long-run sentiment measure $\rho_{LR,t}$ and two proxies for short-run sentiment. $\eta_t - \eta_{t-1}$ is the change in the sentiment deviation from long-run sentiment, and $(\rho_t - \rho_{t-1})^{\perp}$ is the sentiment increment orthogonalized to the long-run sentiment over previous periods. ***p < 0.01, **p < 0.05, *p < 0.
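The AR(1) and Corr statistics described for Table 1 can be computed along the following lines. This is a minimal sketch assuming AR(1) is the OLS slope from regressing a series on its own first lag with an intercept, and Corr is the Pearson correlation between returns at month t and sentiment at month t − 1; the function names are hypothetical and the snippet does not reproduce the significance tests.

```python
def _moments(x, y):
    """Return the sums of squares and cross-products of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def ar1_coefficient(series):
    """OLS slope from regressing x_t on a constant and x_{t-1}."""
    sxx, _, sxy = _moments(series[:-1], series[1:])
    return sxy / sxx


def corr_with_lagged(returns, sentiment, lag=1):
    """Pearson correlation between returns_t and sentiment_{t-lag}."""
    sxx, syy, sxy = _moments(sentiment[:-lag], returns[lag:])
    return sxy / (sxx * syy) ** 0.5


# Toy inputs: returns fall one month after sentiment rises.
toy_sentiment = [1.0, 2.0, 3.0, 4.0]
toy_returns = [0.0, -1.0, -2.0, -3.0]
toy_corr = corr_with_lagged(toy_returns, toy_sentiment)
```

A negative `corr_with_lagged` value corresponds to the contrarian pattern discussed in the text, where high sentiment precedes low relative returns.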

Decomposition of investor sentiment
To test our theoretical predictions on the cross-sectional effect of long- and short-run sentiment, we empirically decompose the overall market sentiment, as measured by the original monthly sentiment index of Baker and Wurgler (2006), into long- and short-run components. We choose the Baker and Wurgler index as our baseline sentiment measure to make it easier to compare our results with theirs. Baker and Wurgler (2006) use principal component analysis to extract the common component of five sentiment proxies, including the closed-end fund discount (CEFD), the number and the first-day returns of IPOs (NIPO, RIPO), the equity share in total new issues (S), and the dividend premium (P). 15 The Baker-Wurgler index, Sent_BW, is orthogonalized to macroeconomic variables, including the growth in industrial production, the growth in durable, nondurable, and services consumption, the growth in employment and the NBER dummy variable for recessions. The sample period is from July 1965 to September 2015. We also use other survey-based sentiment measures such as the Consumer Confidence index to obtain the long- and short-run components and find similar results.
We implement two approaches to decompose the original investor sentiment proxy. The first one uses a moving average of the original sentiment proxy as a crude yet intuitive measure for the long-run sentiment component. More specifically, at each time t, the long-run sentiment component $\rho_{LR,t}$ is the moving average of the original sentiment index over the 2-year window [t − 25, t − 2]. While the choice of a 24-month window is admittedly somewhat arbitrary, it is partially motivated by the observation that periods of high/low sentiment often persist for around 2 years. 16 When the long-run sentiment component is measured crudely by a moving average, there are two ways to construct the corresponding short-run sentiment component.
One measure of the short-run component, $(\rho_t - \rho_{t-1})^{\perp}$, is the change in current sentiment from its previous level, orthogonalized to the long-run sentiment component so that the measure of short-run sentiment fluctuation is uncorrelated with the long-run sentiment.
15 We thank Jeffrey Wurgler for providing these data. The Baker and Wurgler sentiment index used to be a common component of six proxies, among which TURNOVER was included. The index no longer uses TURNOVER due to the concern that TURNOVER does not represent sentiment well when high-frequency trading prevails in the market.
16 For example, the US stock market experienced a "new-issue mania" between 1961 and 1962, high investor sentiment for firms with strong growth potential between 1967 and 1968, and a bubble in gambling issues in 1977 and 1978. Concerning the bursting of bubbles, it also usually takes around 2 years for stock prices to come back to earth in the anecdotal history. For instance, following the high-tech bubble in the early 1980s, investors' demand shifted to dividend-paying stocks between 1987 and 1988. For robustness purposes, we also consider alternative windows of the moving average for long-run sentiment, including 12-month, 36-month and 48-month windows, and our conclusion remains unchanged.
Another measure of short-run sentiment, $\eta_t - \eta_{t-1}$, is the change in the deviation of current sentiment from its corresponding long-run sentiment, $(\rho_t - \rho_{LR,t}) - (\rho_{t-1} - \rho_{LR,t-1})$. [17]

Our second approach to decomposing sentiment follows Beveridge and Nelson (1981). [18] The Beveridge-Nelson decomposition splits an ARIMA(p, 1, q) process into two components: a permanent component that is a random walk with drift and a transitory component that is a stationary process with a mean of zero. We consider the permanent component of the decomposed sentiment index as the long-run sentiment (BN_LR), and the transitory component as the short-run sentiment (BN_SR).

[17] Based on our model, the short-run sentiment should be $\rho_t - (1 + r)\rho_{t-1}^{\perp}$. We nevertheless follow the previous literature and ignore the effect of the risk-free rate to obtain a short-run sentiment proxy. We also use $\rho_t - (1 + r)\rho_{t-1}^{\perp}$ to run the tests, and the regression results are strongly consistent with those obtained using $\rho_t - \rho_{t-1}^{\perp}$. The monthly risk-free rate is small and does not affect our main results.

[18] We thank Dominique Ladiray for providing the algorithm codes.

Figure 1 depicts the time series of the decomposed long- and short-run sentiment and the original Baker-Wurgler index when a moving average is used to obtain long-run sentiment. The long-run sentiment is $\rho_{LR}$ and the short-run sentiment is $\eta_t - \eta_{t-1}$. The graph shows that the long-run sentiment is strongly correlated with the original Baker-Wurgler sentiment index, albeit with some lags. The long-run sentiment is smoother than the original Baker-Wurgler index, while the short-run sentiment fluctuates around zero and is generally smaller in magnitude than the long-run component.

[Fig. 1: time series by year, 1965 to 2015. Series: Sent_BW; Long-term Sent_BW; $\eta_t - \eta_{t-1}$]

Figure 2 plots the Beveridge-Nelson decomposed sentiment and the original Baker-Wurgler index. It shows that BN_LR is highly correlated with the original Baker-Wurgler sentiment. Comparing Fig. 2 with Fig. 1, the long-run sentiment is no longer a lagged version of the original sentiment. The correlation coefficient between the long-run sentiment and the original sentiment is higher when we use BN_LR as the long-run sentiment indicator. Figure 2 also shows that BN_SR has a broader range than the other short-run sentiment measures, $\eta_t - \eta_{t-1}$ and $\rho_t - \rho_{t-1}^{\perp}$.

[Fig. 2: time series by year, 1965 to 2015. Series: Sent_BW; BN Long-term Sent_BW; BN Short-term Sent_BW]

Panel B of Table 1 presents the descriptive statistics of the decomposed investor sentiment during the sample period from July 1965 to September 2015. In terms of magnitude, the long-run sentiment is much larger than the short-run sentiment. The standard deviations of the long-run sentiment measures $\rho_{LR}$ and BN_LR are 0.91 and 1.06, respectively. The standard deviations of the two short-run sentiment components, $\eta_t - \eta_{t-1}$ and $\rho_t - \rho_{t-1}^{\perp}$, are both 0.02. The Beveridge-Nelson decomposition generates a short-run sentiment of larger magnitude: BN_SR has a standard deviation of 0.22. Panel B also shows that the long-run sentiment measures, $\rho_{LR}$ and BN_LR, have significant first-order autocorrelation coefficients of 0.99. The short-run sentiment measure $\rho_t - \rho_{t-1}^{\perp}$ is not significantly correlated with its own lagged term, as it has been orthogonalized to the strongly persistent long-run sentiment component. The short-run sentiment BN_SR is still significantly autocorrelated, with a first-order autocorrelation coefficient of 0.91. The last column of Panel B presents the correlation between each decomposed sentiment measure and the one-term lagged Baker-Wurgler sentiment. Apart from $\rho_t - \rho_{t-1}^{\perp}$, the long- and short-run sentiment measures are significantly associated with the original sentiment, although the correlation coefficients for the short-run sentiment are relatively small in magnitude. With the exception of $\rho_t - \rho_{t-1}^{\perp}$, the short-run sentiment measures are negatively associated with the one-period lagged original Baker-Wurgler sentiment.
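The moving-average decomposition described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the synthetic series are hypothetical, and the orthogonalization step is assumed to be the residual from an OLS regression of the sentiment increment on a constant and the contemporaneous long-run component, which the text does not spell out.

```python
import math
from statistics import mean

def long_run_component(sent, start_lag=25, end_lag=2):
    """Moving average of sent over months [t - start_lag, t - end_lag].

    With the defaults this is a 24-month window; months without a
    full window get None.
    """
    out = []
    for t in range(len(sent)):
        lo, hi = t - start_lag, t - end_lag
        out.append(mean(sent[lo:hi + 1]) if lo >= 0 else None)
    return out

def deviation_change(sent, long_run):
    """eta_t - eta_{t-1}, where eta_t = sent_t - long_run_t."""
    eta = [None if lr is None else s - lr for s, lr in zip(sent, long_run)]
    out = [None]
    for prev, cur in zip(eta, eta[1:]):
        out.append(None if prev is None or cur is None else cur - prev)
    return out

def orthogonalized_increment(sent, long_run):
    """Residual of (sent_t - sent_{t-1}) regressed on a constant and long_run_t.

    Assumption: a simple one-regressor OLS stands in for the paper's
    orthogonalization.
    """
    pairs = [(long_run[t], sent[t] - sent[t - 1])
             for t in range(1, len(sent)) if long_run[t] is not None]
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    out = [None] * len(sent)
    for t in range(1, len(sent)):
        if long_run[t] is not None:
            out[t] = (sent[t] - sent[t - 1]) - intercept - slope * long_run[t]
    return out

# Made-up monthly sentiment series (60 months), for illustration only
sent = [math.sin(0.3 * i) + 0.01 * i for i in range(60)]
long_run = long_run_component(sent)
eta_change = deviation_change(sent, long_run)
ortho = orthogonalized_increment(sent, long_run)
```

By construction the orthogonalized increment has zero sample covariance with the long-run component, which is the property the text relies on when it describes this measure as uncorrelated with long-run sentiment.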

Cross-sectional returns on decomposed sentiment
Our theoretical model predicts that both the long- and short-run sentiment components affect cross-sectional stock returns. To test this prediction, we run the following regression:

$$R_{t,1} - R_{t,2} = \alpha + \beta_1 \rho_{LR,t} + \beta_2 \Delta\rho_{s,t} + \gamma X + u_t, \tag{27}$$

where $R_{t,1} - R_{t,2}$ represents the return of a sentiment-prone portfolio relative to a sentiment-immune portfolio, $\rho_{LR,t}$ refers to the long-run sentiment component at time t, $\Delta\rho_{s,t}$ represents the short-run sentiment increment, and X is a vector of control variables. The control variables include the Fama-French (2015) five factors (RMRF, SMB, HML, RMW, CMA) and the Carhart (1997) momentum factor (UMD). [19] Specifically, RMRF is the market return premium over the risk-free rate; SMB is the average return on the three small portfolios minus the average return on the three big portfolios; HML is the average return on the two value portfolios minus the average return on the two growth portfolios; RMW is the average return on the two robust operating profitability portfolios minus the average return on the two weak operating profitability portfolios; CMA is the average return on the two conservative investment portfolios minus the average return on the two aggressive investment portfolios; and UMD is the average return of the high prior return portfolio over the low prior return portfolio. The control variable SMB (HML) is excluded when the long-short portfolio is constructed with ME (BE/ME). The control variable RMW is excluded when the long-short portfolio is constructed with E/BE or D/BE.

The strong persistence of the long-run sentiment measure may raise the concern of spurious regressions. Stambaugh (1999) points out that, in small samples, the coefficient estimates of a predictive regression can be biased and the t-statistics distorted when the predictor is highly persistent. Stambaugh bias exists if the autoregressive disturbance of a lagged stochastic regressor correlates with the regression error term. Under this circumstance, OLS regression results can lead to the erroneous conclusion that the lagged regressors have predictive power when in fact they do not. Thus, it is paramount that we account for Stambaugh bias in our predictive regressions. To this end, we adopt the multi-predictor augmented regression method of Amihud et al. (2009) to adjust for the Stambaugh bias in the estimated coefficients and report the t-statistics of coefficients estimated from a wild bootstrap procedure. The detailed methodology of this wild bootstrap procedure is in the "Appendix". We also calculate Newey-West standard errors (Newey and West 1986) that are robust to heteroscedasticity and serial correlation, and we choose a maximum lag of 12 throughout the regressions.

Table 2 reports the estimated coefficients on the standardized long- and short-run sentiment components of the Baker-Wurgler sentiment indicator. All independent variables (including the long- and short-run sentiment measures) are standardised before running the regression to facilitate comparisons of coefficients. The short-run sentiment in Panel A, $\eta_t - \eta_{t-1}$, is measured by changes in the sentiment deviation from the long-run sentiment. The short-run sentiment in Panel B, $\rho_t - \rho_{t-1}^{\perp}$, is measured by the sentiment increment orthogonalized to the long-run sentiment component. The long-run sentiment component $\rho_{LR,t}$ in both Panels A and B is the moving average of investor sentiment over the prior months [−25, −2]. The long- and short-run sentiment components in Panel C are the Beveridge-Nelson decomposed long- and short-run sentiment, BN_LR and BN_SR, respectively. We report the Stambaugh bias-adjusted coefficients with bootstrapped p values in parentheses.

Table 2 Regressions of monthly cross-sectional returns on decomposed sentiment. This table reports the regressions of long-short portfolio returns on both the long-run and short-run sentiment: $R_{t,1} - R_{t,2} = \alpha + \beta_1 \rho_{LR,t} + \beta_2 \Delta\rho_{s,t} + \gamma X + u_t$. $R_{t,1} - R_{t,2}$ represents the return disparity of the more sentiment-prone portfolio over the less sentiment-prone portfolio. The control variables (X) include the Fama-French five factors (RMRF, HML, SMB, RMW, CMA) and the momentum factor (UMD). SMB (HML) is not included in the regression when the return premium is constructed by ME (BE/ME). The first two columns show how the portfolio is constructed. H, M and L represent the top three, middle four and bottom three decile portfolios, respectively. The long-run sentiment component $\rho_{LR,t}$ in Panel A and Panel B is the standardized moving average of investor sentiment over the prior months [−25, −2]. The short-run components in Panel A and Panel B are, respectively, the standardized incremental change of the sentiment deviation from the long-run sentiment average, $\eta_t - \eta_{t-1}$, and the standardized incremental sentiment orthogonalized to long-run sentiment, $\rho_t - \rho_{t-1}^{\perp}$. Panel C presents the effect of BW sentiment decomposed with the Beveridge and Nelson (1981) method; the long- and short-run sentiment are denoted BN_LR and BN_SR, respectively. All coefficients are adjusted for Stambaugh bias. The p values reported in parentheses are obtained from wild bootstrap procedures in which all simulations use Newey-West robust t-statistics. See "Appendix" for details of the bootstrap simulation. ***p < 0.01, **p < 0.05, *p < 0.1
Panel A in Table 2 shows that the long-run sentiment component $\rho_{LR,t}$ is a significant contrarian predictor for 13 out of 16 long-short portfolio returns, consistent with our theory. Furthermore, the short-run sentiment component $\eta_t - \eta_{t-1}$ is significantly and positively associated with 11 out of 16 long-short portfolio returns, which is consistent with existing evidence that contemporaneous excess returns are positively related to shifts in sentiment (Lee et al. 2002; Ben-Rephael et al. 2012).
In general, the empirical results in Panel A provide strong support for our two main hypotheses. The signs of the coefficients on both the long- and short-run sentiment components are consistent with the predictions of our theory in 11 out of 16 regressions. However, the coefficients on the long-run sentiment are significantly positive for the regressions involving PPE/A(L-H) and EF/A(H-L), while the coefficients on the short-run sentiment are significantly negative in the case of GS(H-L) and BE/ME(L-H). The portfolios with unexpected coefficient signs belong to the "Tangibility" and "Growth Opportunity and Distress" groups. Baker and Wurgler (2006) also find that investor sentiment is not a good predictor of the future returns of the portfolios in the "Tangibility" group and argue that the multidimensional nature of BE/ME, EF/A, and GS makes the results unclear in the "Growth Opportunity and Distress" group.
Panel B reports the estimation results of the long-run sentiment $\rho_{LR,t}$ and the short-run sentiment $\rho_t - \rho_{t-1}^{\perp}$. The coefficients on the long-run sentiment component are significantly negative in 13 of the 16 regressions and the coefficients on the short-run sentiment component are significantly positive in 11 out of 16 regressions, consistent with the results in Panel A. The coefficients on the short-run sentiment are significantly positive in 5 out of the 16 regressions. The dependent variables in those regressions with unexpected coefficients on the long- and short-run sentiment belong to the "Tangibility" and the "Growth Opportunity and Distress" groups. The magnitude of the coefficients on the long- and short-run sentiment measures in Panel B is in most cases comparable to their counterparts in Panel A.
Panel C reports the estimation results of the Beveridge-Nelson decomposed long- and short-run sentiment. The coefficients on the long-run sentiment BN_LR are almost all significantly negative, while those on the short-run sentiment BN_SR are significantly positive in six out of the 16 regressions. Under the Beveridge-Nelson decomposition, the long-run sentiment no longer lags the original sentiment (recall Fig. 2), and yet it is still negatively associated with the subsequent long-short portfolio returns. Despite some differences in magnitude, the coefficients on the long- and short-run sentiment components in Panel C have the same signs as their counterparts in Panel A.
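To make the mechanics of regression (27) concrete, the sketch below estimates it by OLS on standardized regressors and computes Newey-West standard errors with a maximum lag of 12. It is a simplified illustration under stated assumptions: the data are simulated, the function names are hypothetical, the Bartlett-kernel formula is the textbook one, and neither the Amihud et al. (2009) bias adjustment nor the wild bootstrap used in the paper is applied.

```python
import numpy as np

def standardize(x):
    """Rescale a series to zero mean and unit sample standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def ols_newey_west(y, X, max_lag=12):
    """OLS coefficients with Newey-West (Bartlett kernel) standard errors.

    X must already contain a column of ones for the intercept.
    No small-sample degrees-of-freedom correction is applied.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    scores = X * resid[:, None]            # row t holds x_t * e_t
    S = scores.T @ scores                  # lag-0 (White) term
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)
        gamma = scores[lag:].T @ scores[:-lag]
        S += weight * (gamma + gamma.T)
    cov = xtx_inv @ S @ xtx_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated data for illustration only (not the paper's sample)
rng = np.random.default_rng(0)
n = 600
long_run = standardize(rng.normal(size=n))
short_run = standardize(rng.normal(size=n))
control = standardize(rng.normal(size=n))    # stands in for the factor controls X
spread = (0.1 - 0.5 * long_run + 0.3 * short_run + 0.2 * control
          + 0.1 * rng.normal(size=n))        # R_{t,1} - R_{t,2}

X = np.column_stack([np.ones(n), long_run, short_run, control])
beta, se = ols_newey_west(spread, X, max_lag=12)
t_stats = beta / se
```

In this simulated setup the estimated coefficient on the long-run component is negative and the one on the short-run component is positive, mirroring the sign pattern that the theory predicts for sentiment-prone minus sentiment-immune portfolios.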
Regressions with the long-short portfolio returns as the dependent variables may obscure the effects of the two sentiment components on individual decile portfolios. To address this issue, we run regressions of decile portfolio returns on both the long- and short-run sentiment variables and control variables:

$$R_{t,i,j} = \alpha + \beta_{1,i,j}\,\rho_{LR,t} + \beta_{2,i,j}\,(\eta_t - \eta_{t-1}) + \gamma X + u_t,$$

where $R_{t,i,j}$ represents the return of the ith decile portfolio sorted by variable j, i is the decile portfolio rank and takes values from 1 to 10, and j is one of the ten firm characteristic variables used to construct the decile portfolios. The control variables (X) include the Fama-French five factors (RMRF, HML, SMB, RMW, CMA) and the momentum factor (UMD). Table 3 reports the coefficients on the long- and short-run sentiment components.

Panel A documents a large variation in the coefficients on the long-run sentiment component across the decile portfolios. The observed patterns are consistent with the predictions of our theoretical model. Specifically, we find that decile portfolios that are more prone to market-wide sentiment are usually more affected by the long-run sentiment. For Size and Age sorted portfolios, the coefficients on the long-run sentiment increase from Decile 1 to Decile 10 most of the time, indicating that the reversal effect of long-run sentiment on decile returns is larger for small and young stocks than for large and old stocks. The coefficients on the long-run sentiment for Sigma sorted decile portfolios decrease almost monotonically with the decile rank, implying that high long-run sentiment leads to lower returns on more volatile decile portfolios than on less volatile portfolios. For the two variables in the "Tangibility" group, PPE/A and RD/A, there is no clear pattern in the long-run sentiment coefficients across deciles. This finding is consistent with Baker and Wurgler (2006) and is in line with the results in Table 2.
For firm characteristics with a multidimensional nature, namely BE/ME, EF/A, and GS, we find that the coefficients on the long-run sentiment across deciles have an inverse U-shape. This indicates that the middle deciles are less prone to overall market sentiment, while distressed stocks and stocks with strong growth potential are more prone to it. When sorted by ME, Sigma, and D/BE, the less sentiment-prone deciles and the more sentiment-prone deciles show different exposure to the long-run sentiment. For example, for deciles sorted on Sigma, the coefficients of the bottom two deciles are significantly positive, indicating that the "bond-like" stocks with low return volatility have negative exposure to the long-run overall market sentiment. One plausible explanation proposed by Baker and Wurgler (2007) is "flights to quality". When the overall market is pessimistic, bond-like stocks are more appealing not only to sophisticated investors but also to noise traders, leading to rising prices of those stocks during low market sentiment periods. The reverse pattern we find in "bond-like" stocks helps explain the weak relationship between investor sentiment and aggregate market return documented in the literature. Indeed, in an untabulated regression, we test the effects of decomposed sentiment on aggregate market returns and find that the coefficients on both the long- and short-run sentiment components are insignificant.
Panel B of Table 3 reports the coefficients of the short-run sentiment component. The coefficients on the short-term sentiment are positive and significant in almost all more sentiment-prone decile portfolios. There is a decreasing (increasing) pattern in the coefficients of the short-run sentiment across the deciles sorted by ME, Age, E/BE and D/BE (Sigma). However, the coefficients on the short-run sentiment follow a U-shaped pattern across the deciles sorted by BE/ME, EF/A, and GS and exhibit no clear pattern in the deciles sorted by PPE/A and RD/A. We also find significantly negative coefficients on the short-run sentiment in most of the less sentiment-prone deciles. In general, the results in Panel B confirm the conclusions drawn from Panel A that the effect of short-run sentiment on returns varies across deciles and the bond-like stocks have negative exposure to overall market sentiment.

Robustness checks
This paper presents a behavioural explanation for the variations in the cross-sectional stock returns. However, it is possible that variation in investor sentiment reflects changes in systematic risk, in which case our results may not be entirely consistent with the behavioural story. For example, changes in our decomposed sentiment measures may coincide with time variation in the market beta. If that is the case, the cross-sectional patterns conditional on the ten characteristics would represent a rational compensation for systematic risk. We test this possibility with a time-varying CAPM beta model:

$$R_{t,1} - R_{t,2} = \alpha + \beta_1 \rho_{LR,t} + \beta_2 \Delta\rho_{s,t} + \left(b + \delta_1 \rho_{LR,t} + \delta_2 \Delta\rho_{s,t}\right) RMRF_t + u_t, \tag{29}$$

where $R_{t,1} - R_{t,2}$ represents the returns of portfolios that are long the more sentiment-prone portfolios and short the less sentiment-prone portfolios, $\rho_{LR,t}$ refers to the long-run sentiment component at time t, $\Delta\rho_{s,t}$ represents the short-run sentiment increment, and $RMRF_t$ is the market return premium. If the negative (positive) effect of long-run (short-run) sentiment on the cross-sectional return is driven by its negative (positive) effect on the beta loading of the market return premium, the coefficients on the interaction terms will be significantly different from zero, and the sign of $\delta_i$ will be the same as the sign of $\beta_i$ in Table 2; otherwise, the behavioural story holds. That is to say, $\delta_1$ should be significantly negative and $\delta_2$ should be significantly positive if the rational explanation holds. Table 4 shows that the sign and magnitude of the coefficients on both the long- and short-run sentiment components remain consistent with their counterparts in Table 2 even after including the interaction terms in the regressions. This evidence suggests that the long- and short-run sentiment components do indeed affect the cross-sectional stock returns.

This table reports the regressions of decile portfolio returns on both the long-run and short-run sentiment: $R_{t,i,j} = \alpha + \beta_{1,i,j}\,\rho_{LR,t} + \beta_{2,i,j}\,(\eta_t - \eta_{t-1}) + \gamma X + u_t$. $R_{t,i,j}$ represents the return of the ith decile portfolio sorted by variable j, where i is from 1 to 10 and j is one of the ten sentiment-prone characteristic variables. The control variables (X) include the Fama-French five factors (RMRF, HML, SMB, RMW, CMA) and the momentum factor (UMD). SMB (HML) is not included in the regression when the decile portfolios are sorted by ME (BE/ME). The long-run sentiment component $\rho_{LR,t}$ is the standardized moving average of investor sentiment over the prior months [−25, −2], and the short-run component is the standardized incremental change of the sentiment deviation from the long-run sentiment average, $\eta_t - \eta_{t-1}$. Panels A and B report the coefficients on the long- and short-run sentiment, respectively. The label "a" indicates that the sign of the coefficient is contrary to the hypothesis. The coefficients are adjusted for Stambaugh bias. The p values are obtained from wild bootstrap procedures in which all simulations use Newey-West robust t-statistics. ***p < 0.01, **p < 0.05, *p < 0.1

We also consider another potential systematic risk explanation, which posits that even when the market beta is constant, the decomposed investor sentiment may reflect variations in the market return premium. If this story holds, the decomposed investor sentiment should perform well in predicting the market return premium. However, in an unreported regression of the market return premium on decomposed investor sentiment, we find little evidence that the decomposed investor sentiment components affect aggregate market returns.
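The time-varying beta test amounts to adding the market premium and its interactions with the two sentiment components to the baseline regression. The sketch below builds that design matrix and fits it by OLS on simulated data in which sentiment operates only through the market beta, which is the "rational" case the test is designed to detect. The coefficient names, the function name and all numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fit_time_varying_beta(ret_spread, long_run, short_run, rmrf):
    """OLS fit of the return spread on sentiment, RMRF and sentiment x RMRF.

    Returns a dict of point estimates keyed by hypothetical coefficient
    names: direct effects (beta_1, beta_2), the constant market loading (b)
    and the interaction terms (delta_1, delta_2).
    """
    lr = np.asarray(long_run, dtype=float)
    sr = np.asarray(short_run, dtype=float)
    mkt = np.asarray(rmrf, dtype=float)
    X = np.column_stack([np.ones_like(lr), lr, sr, mkt, lr * mkt, sr * mkt])
    coef, *_ = np.linalg.lstsq(X, np.asarray(ret_spread, dtype=float), rcond=None)
    names = ["alpha", "beta_1", "beta_2", "b", "delta_1", "delta_2"]
    return dict(zip(names, coef))

# Simulated example: sentiment moves the market beta but has no direct effect
rng = np.random.default_rng(1)
n = 600
lr, sr, mkt = rng.normal(size=(3, n))
spread = (1.0 + 0.4 * lr - 0.2 * sr) * mkt + 0.1 * rng.normal(size=n)

est = fit_time_varying_beta(spread, lr, sr, mkt)
```

Here the interaction estimates recover the simulated beta shifts while the direct sentiment coefficients stay near zero. The paper reports the opposite pattern in its data: the direct coefficients keep their sign and magnitude once the interactions are included, which is why the authors favour the behavioural interpretation.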
One may also argue that the effect of investor sentiment may come from the cross-sectional variation in investor attention. Because several studies show that rising investor attention leads to predictable returns both in the time series and in the cross-section (e.g., Barber and Odean 2008; Da et al. 2015; Peng and Xiong 2006; Li and Yu 2012), it remains unclear whether the stocks that appear to be most affected by investor sentiment are actually those that attract the most attention from investors. To shed some light on this issue, we investigate the correlation between investor sentiment and the difference in attention between more sentiment-prone stocks and less sentiment-prone stocks. We use two investor attention measures proposed by Barber and Odean (2008), i.e., the abnormal trading volume index and the abnormal return index. We first calculate the monthly abnormal trading volume and abnormal return indexes for each firm and then calculate the average investor attention for each portfolio. The abnormal trading volume (return) index is calculated as the ratio of the stock's trading volume (return) in a given month to its average trading volume (return) over the prior year. We construct the cross-sectional attention disparity in the same way as we calculate the return disparity of the sentiment-prone portfolio over the sentiment-immune portfolio. Taking the ME sorted long-short portfolio as an example, the attention disparity is the difference in average attention between the bottom three deciles and the top three deciles of the size sorted portfolios. In an unreported test, we find that the correlation coefficients between attention disparity and the change in market sentiment are usually relatively small and mostly insignificant.
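As a small worked example of the attention measures described above, the following sketch computes an abnormal trading volume index (the month's volume divided by its average over the prior 12 months) and the attention disparity between two groups of stocks. The function names and the numbers are hypothetical, and firm-level indexes are averaged with equal weights, which is an assumption here.

```python
from statistics import mean

def abnormal_index(series, window=12):
    """Ratio of each month's value to its average over the prior `window` months.

    Months without a full prior window get None.
    """
    out = []
    for t, value in enumerate(series):
        if t < window:
            out.append(None)
        else:
            out.append(value / mean(series[t - window:t]))
    return out

def attention_disparity(prone, immune):
    """Average attention of sentiment-prone firms minus that of sentiment-immune firms."""
    return mean(prone) - mean(immune)

# Made-up volume history: 12 flat months followed by a spike
volume = [100.0] * 12 + [150.0]
abnormal_volume = abnormal_index(volume)

# Made-up firm-level abnormal volume indexes for one month
small_firms = [1.5, 1.0]    # e.g. bottom three ME deciles
large_firms = [1.0, 0.5]    # e.g. top three ME deciles
disparity = attention_disparity(small_firms, large_firms)
```

The same ratio applied to returns can be unstable when the prior-year average return is close to zero or negative, so the return-based index may need more careful handling than this sketch provides.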
We also examine whether the long- and short-run sentiment components lose their explanatory and predictive power after adding the contemporaneous cross-sectional investor attention disparity measure as an additional control variable in our regressions. The regression results are shown in Table 5. Panel A of Table 5 presents the regression results when investor attention $A_{t,1}$ is measured by abnormal trading volume and Panel B shows the results when attention disparity is measured by abnormal return. The results suggest that the investor attention disparity is significantly related to cross-sectional stock returns. However, the effects of the decomposed sentiment components on returns remain strong even after controlling for investor attention. The results in Panels A and B are almost identical in terms of the sign and significance of the coefficients on the long- and short-run sentiment. As predicted, we show that the long-run sentiment negatively predicts future cross-sectional returns and the short-run sentiment is positively correlated with the contemporaneous cross-sectional return premium. Although unreported, including lagged investor attention disparity measures does not change our results either. Therefore, we conclude that cross-sectional investor attention is unlikely to drive our results.

This table reports regressions of long-short portfolio returns on the market risk premium and its interactions with both long-run sentiment and short-run sentiment: $R_{t,1} - R_{t,2} = \alpha + \beta_1 \rho_{LR,t} + \beta_2 \Delta\rho_{s,t} + \left(b + \delta_1 \rho_{LR,t} + \delta_2 \Delta\rho_{s,t}\right) RMRF_t + u_t$ (Eq. 29). $R_{t,1} - R_{t,2}$ represents the return disparity of the more sentiment-prone portfolio over the less sentiment-prone portfolio. The long-short portfolios are constructed based on firm characteristics following the Baker and Wurgler (2006) approach. The Baker-Wurgler sentiment index is decomposed. The long-run sentiment component $\rho_{LR,t}$ is the standardized moving average of investor sentiment over the prior months [−25, −2]. The short-run sentiment component $\Delta\rho_{s,t}$ is measured by $\eta_t - \eta_{t-1}$ and $\rho_t - \rho_{t-1}^{\perp}$ in Panel A and Panel B, respectively. All regressors are standardized. The coefficients are adjusted for Stambaugh bias. The label "a" marks interaction-term coefficients that significantly support the rational story. The p values reported in parentheses are obtained from wild bootstrap procedures in which all simulations use Newey-West robust t-statistics. ***p < 0.01, **p < 0.05, *p < 0.1

It is commonly known that measures of investor sentiment are noisy (Baker and Wurgler 2006). While the Baker-Wurgler sentiment index has become a workhorse sentiment indicator, it relies on variables such as returns, volatility, and trading volume, which are themselves equilibrium outcomes. Another way to measure sentiment is to use survey data. Following prior literature (e.g., Lemmon and Portniaguina 2006; Ho and Hung 2009), we also use two survey-based sentiment indicators, the Consumer Confidence Index by Michigan (ICS) and the Conference Board Consumer Confidence Index (CCI) from Bloomberg, as alternative sentiment measures. We report the results in Table 6. Panel A shows that the long-run sentiment exhibits a significantly negative effect on future returns in 9 out of 16 cases, with 8 of these 9 cases having significantly positive coefficients on the short-run sentiment. Most of the cases that do not fit our expectation are related to BE/ME-, EF/A-, and GS-based portfolios. For these portfolios, the results are also mixed in Baker and Wurgler's (2006) study. The results in Panel B are largely consistent with their counterparts in Panel A, especially for the coefficients on long-run sentiment. Thus, overall, the results for the survey-based sentiment indicators are generally consistent with those from the decomposed Baker-Wurgler sentiment.
We also conduct some other robustness checks (the results are not reported for the sake of brevity). First, we re-run our regressions with decomposed sentiment indicators from other widely accepted investor sentiment proxies, such as the closed-end fund discount (CEFD) proposed by Lee et al. (1991) and the aligned sentiment indicator (Sent_PLS) from Huang et al. (2015). We find similar results when using decomposed CEFD and Sent_PLS in the regression, especially for the coefficients on the long-run sentiment. Second, we construct different measures of the sentiment components by taking the moving average of the Baker-Wurgler sentiment index over different horizons as the measure of the long-run sentiment. Specifically, our conclusions remain unchanged when the 12-month and 36-month horizons are used to calculate long-run sentiment. Third, we divide the sample into high and low sentiment periods, where a low (high) sentiment period is defined as a period when the current sentiment is lower (higher) than the moving average of sentiment over the previous 2 years. We find that the long- and short-run sentiment components perform better at explaining the cross-sectional returns during periods of high sentiment. This evidence may be explained by more binding short-selling constraints during high sentiment periods (e.g., Nagel 2005; Stambaugh et al. 2012; Yu and Yuan 2011). Fourth, because of the close association between the investor sentiment premium and macroeconomic announcements (Du and Hu 2018), we include a set of additional macroeconomic control variables (dividend-price ratio, price-earnings ratio, term spread, and default spread) in our regressions. Furthermore, we include the liquidity factor and bid-ask spreads to account for the effect of liquidity. The results are consistent with the baseline analysis. Fifth, it might be argued that the financial and firm characteristics used to construct our long-short portfolios are noisy proxies of the degree of exposure to market-wide sentiment. To address this concern, we use principal component analysis (PCA) to build an index of exposure to investor sentiment based on these accounting and financial variables, and test whether the decomposed sentiment can explain or predict the index. Using PCA helps remove the idiosyncratic noise embedded in the individual accounting and financial variables. We find that our results continue to hold. Finally, our conclusions remain unchanged when we re-run our tests using value-weighted portfolio returns to isolate the size effect on the portfolio returns.

Table 5 notes: $R_{t,1} - R_{t,2}$ ($A_{t,1} - A_{t,2}$) represents the return (investor attention) disparity of the more sentiment-prone portfolio over the less sentiment-prone portfolio. The control variables (X) include the Fama-French five factors (RMRF, HML, SMB, RMW, CMA) and the momentum factor (UMD). The first two columns show how the portfolio is constructed. H, M and L represent the top three, middle four and bottom three decile portfolios, respectively. The long-run sentiment component $\rho_{LR,t}$ is the standardized moving average of investor sentiment over the prior months [−25, −2]. $\eta_t - \eta_{t-1}$ is the standardized incremental change of the sentiment deviation from the long-run sentiment average. Panel A (B) reports the results when $A_t$ is measured by abnormal trading volume (abnormal return). The p values reported in parentheses are obtained from wild bootstrap procedures in which all simulations use Newey-West robust t-statistics. ***p < 0.01, **p < 0.05, *p < 0.1

Conclusion
This study uses a simple and straightforward model to show that investor sentiment affects the returns of different assets disproportionately. We extend the DeLong et al. (1990) model to a noise trader risk model with multiple risky assets. In our model, we allow the risky assets to have different exposure to overall market investor sentiment and provide theoretical predictions that are consistent with the empirical evidence on the effect of investor sentiment on cross-sectional stock returns. Motivated by the model, we also decompose investor sentiment into long- and short-run components. Consistent with the theory, we find that the long-run sentiment component is a contrarian predictor of future long-short portfolio returns and the short-run sentiment is positively correlated with contemporaneous long-short portfolio returns, where the long-short portfolios are long sentiment-prone stocks and short sentiment-immune stocks. Furthermore, we test whether the effect of the sentiment components can be attributed to the time-varying beta loading of the market premium (or other risk factors). We show that the effect of the sentiment components on the cross-sectional return is not related to systematic risk. Accordingly, the behavioural story holds. Further analysis suggests that our results are robust to alternative sentiment measures, different sample periods, additional control variables, and the use of value-weighted returns.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Table 7 gives a detailed description of the variables needed to construct the portfolios.

Appendix 1: Detailed description of variables representing sentiment-prone level
(CCI). The p values reported in parentheses are obtained from wild bootstrap procedures in which all simulations use Newey-West robust t-statistics. ***p < 0.01, **p < 0.05, *p < 0.1

RD/A
Formula: RD/A = XRD/AT

BE/ME (Book-to-market ratio)
Description: This is the log of the ratio of book equity to market equity. We match fiscal year ending calendar year t − 1 ME with June t BE.
Formula: log(1 + BE/DEC_ME)

EF/A (External finance over assets)
Description: External finance (EF) is equal to the change in assets (Item 6) less the change in retained earnings (Item 36). When the change in retained earnings is not available, we use net income (Item 172) less common dividends (Item 21) instead.
Formula: EF1 = dif(RE); EF2 = dif(NI-DVC); EF/A = (dif(AT) - coalesce(EF1, EF2, 0))/AT

GS (Sales growth)
Description: Sales growth is the percentage change in net sales (Item 12). We first calculate the original sales growth ratio and then use its decile rank as GS, so GS ranges from 1 to 10.
Formula: GS = dif(SALE)/lag(SALE)