
Removing biases in computed returns

  • Original Research
  • Published in: Review of Quantitative Finance and Accounting

Abstract

This paper presents a straightforward method for asymptotically removing the well-known upward bias in observed returns of equally-weighted portfolios. Our method removes all of the bias due to any random transient errors such as bid-ask bounce and allows for the estimation of short horizon returns. We apply our method to the CRSP equally-weighted monthly return indexes for the NYSE, Amex, and NASDAQ and show that the bias is cumulative. In particular, a NASDAQ index (with a base of 100 in 1973) grows to the level of 17,975 by 2006, but nearly half of the increase is due to cumulative bias. We also conduct a simulation in which we generate true prices and set spreads according to a discrete pricing grid. True prices are then not necessarily at the midpoint of the spread. In the simulation we compare our method to calculating returns based on observed closing quote midpoints and find that the returns from our method are statistically indistinguishable from the (simulated) true returns. While the mid-quote method results in an improvement over using closing transaction prices, it still results in a statistically significant amount of upward bias. We demonstrate that applying our methodology results in a reversal of the relative performance of NASDAQ stocks versus NYSE stocks over a 25-year window.
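The cumulative nature of the bias can be illustrated with a small simulation. The sketch below is not the simulation described in the abstract (which uses a discrete pricing grid); it assumes, purely for illustration, that true prices never move, that each observed close lands at the bid or the ask with equal probability, and that the proportional half-spread is 2%. The function name and all parameter values are hypothetical.

```python
import random


def rebalanced_index(n_stocks=1000, n_periods=120, half_spread=0.02, seed=1):
    """Compound equally-weighted observed relatives when true prices never move."""
    rng = random.Random(seed)

    def observed_prices():
        # True price is 1.0; each close lands at the bid or the ask with equal odds.
        return [1.0 + rng.choice((-half_spread, half_spread)) for _ in range(n_stocks)]

    previous = observed_prices()
    index = 1.0
    for _ in range(n_periods):
        current = observed_prices()
        # Equally-weighted mean of observed price relatives for this period.
        mean_relative = sum(c / p for c, p in zip(current, previous)) / n_stocks
        index *= mean_relative
        previous = current
    return index


print(f"observed index: {rebalanced_index():.4f} (true index stays at 1.0000)")
```

Because every true return is zero, any growth in the compounded index is bias. Each period contributes only a small amount (roughly the squared half-spread), but the rebalanced index compounds it.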

Notes

  1. See Macaulay (1938) and Fisher (1966), among others.

  2. A notable exception is Fama et al. (1969), who use continuously compounded returns.

  3. For example, the explanation of the weighting method for the Dow Jones Turkey Equal Weighted 15 Index from the company’s web site (http://www.djindexes.com/mdsidx/?event=showTurkey15) is that “The index includes the largest stocks traded on the Istanbul Stock Exchange, and is equal weighted to limit the influence of the biggest companies on overall index performance.” Also in 2005, NASDAQ began constructing an equally-weighted version (rebalanced quarterly) of several of their indexes including the NASDAQ 100. See also Hamza et al. (2006) for a discussion of the efficacy of different index weighting methods for emerging markets.

  4. In Blume and Stambaugh (1983) a buy-and-hold portfolio sets equal weights in a portfolio at the beginning of the period and no rebalancing is done before the end of the multi-period investment horizon. In contrast a rebalanced portfolio is rebalanced each period. See Roll (1983) for a comparison of rebalanced and buy-and-hold portfolio returns.

  5. For example Blume and Stambaugh (1983) examine one-year investment horizons and Conrad and Kaul (1993) examine 3-year investment horizons.

  6. Blume and Stambaugh (1983) and Bessembinder and Kalcheva (2007) note that while short horizon continuously compounded rates of return contain no bias, they also possess certain properties that limit their use in many tests.

  7. The CRSP Equally-Weighted Index methodology is first developed by Cohen and Fitch (1966).

  8. This upward bias in equally-weighted returns is first observed by Macaulay (1938, pp. 149–154) in his analysis of railroad stocks. Macaulay concludes that the bias, which he calls mathematical drift, is larger than can be caused by chance, and is much larger than the bias observed for a value-weighted index of the same stocks.

  9. Although Blume and Stambaugh (1983) consider other possible pricing errors (see their Sect. 2.4), they assert that the bias from them is negligible.

  10. Equally weighting portfolios is the method most commonly used in event studies.

  11. Buy-and-hold portfolios reduce the bias because the weights used after the first period have a negative correlation with subsequent observed returns which offsets the upward bias. It then follows that other portfolio weighting methods will also reduce the bias if the weights are similarly negatively correlated with observed returns. Bessembinder and Kalcheva (2007) show that the method presented in this paper is one such method.

  12. While the quoted bid-ask spread is defined as \( \hat{P}_{a} - \hat{P}_{b} \), the effective spread takes into account the fact that trades can occur at prices other than the posted bid and ask. The effective spread measures the distance between the midpoint of the spread and the trade price. Mathematically it can be expressed as \( 2\left| \hat{P}_{t} - \frac{\hat{P}_{a} + \hat{P}_{b}}{2} \right| \), where \( \hat{P}_{t} \) is the observed trade price.

  13. Discrete pricing may cause errors if the amount of expected price adjustment to information is less than the minimum tick size on a market. In addition, discrete pricing may cause true prices to deviate from the midpoint of the bid-ask spread.

  14. Fisher and Lorie (1964, 1968, and 1977) do just that. However, they find that initial year returns are almost always higher than second and subsequent year returns.

  15. Multiplying each stock’s “return weight” by one plus the stock’s observed return and cumulating over stocks yields our Eq. (11).

  16. Additional examples of random transient errors include errors in observed stock prices caused by incorrect order entry or transaction recording.

  17. Observed variables will be indicated with a hat and true values of variables will have no notation.

  18. As shown in Blume and Stambaugh (1983) footnote 6.

  19. Blume and Stambaugh (1983), and some others, implicitly assume continuous pricing so that the distribution of error terms is binomial. That is, the true price of a stock is the midpoint of the spread. Given the reality of tick-induced discrete pricing, the true price is not necessarily at the midpoint of the spread. Therefore, a log normal distribution of error terms is more representative. In “Appendix 1” we show that the bias arising from a log normally distributed error term is equal to that of a binomially distributed error.

  20. Subtracting 1 from a wealth relative yields the return on an index or portfolio.

  21. Assuming no lagged adjustment is equivalent to assuming no serial covariance, thus the product of expectations is equal to the expectation of the product.

  22. Since \( \left\{ {\overline{{ 1\, + \,\sigma^{ 2} \left( {e_{i,t - 2} } \right)}} } \right\} \) is in both the numerator and denominator, Jensen’s inequality does not apply.

  23. The conditional expected index relative table is provided in “Appendix 2”.

  24. As a check, we first re-construct the CRSP Equally-Weighted Index, so that we can find possible differences in the data bases.

  25. If the index value for December 1926 is set to 100, then the unbiased NYSE index would have a value of 361,016 by 2002, while the CRSP NYSE index would reach a value of 836,852. This further illustrates the impact of bias in the construction of long series of equally-weighted indexes.

  26. See Jones and Lipson (2001) and Bessembinder (2003), among others.
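The two spread measures defined in Note 12 can be written out as a short sketch. The function names are illustrative, and the effective spread is taken as twice the absolute distance between the trade price and the quote midpoint, which is an assumption about how the bracketed expression in the note is meant to be read.

```python
def quoted_spread(bid, ask):
    """Quoted bid-ask spread: posted ask minus posted bid."""
    return ask - bid


def effective_spread(trade_price, bid, ask):
    """Twice the absolute distance between the trade price and the quote midpoint."""
    midpoint = (bid + ask) / 2.0
    return 2.0 * abs(trade_price - midpoint)


# A trade inside the quotes has an effective spread below the quoted spread.
print(quoted_spread(10.00, 10.10), effective_spread(10.08, 10.00, 10.10))
```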

References

  • Aitchison J, Brown JAC (1957) The lognormal distribution, with special reference to its uses in economics. University Press, Cambridge

  • Bessembinder H (2003) Trade execution costs and market quality after decimalization. J Financ Quant Anal 38:747–777

  • Bessembinder H, Kalcheva I (2007) Liquidity biases in asset pricing tests. Working Paper, David Eccles School of Business, University of Utah

  • Blume ME, Stambaugh RF (1983) Biases in computed returns: an application to the size effect. J Financ Econ 12:387–404

  • Canina L, Michaely R, Thaler R, Womack K (1998) Caveat compounder: a warning about using the daily CRSP equal-weighted index to compute long-run excess returns. J Financ 53(1):403–416

  • Cohen KJ, Fitch BP (1966) The average investment performance index. Manage Sci 12(6):B195–B215 (Series B, Managerial)

  • Conrad J, Kaul G (1993) Long-term market overreaction or biases in computed returns. J Financ 48(1):39–63

  • Cootner PH (1962) Stock prices: random versus systematic changes. Ind Manag Rev 3:24–45

  • Fama EF (1991) Efficient capital markets: II. J Financ 46:1575–1617

  • Fama EF, Fisher L, Jensen MC, Roll R (1969) The adjustment of stock prices to new information. Int Econ Rev 10(1):1–21

  • Ferson WE, Korajczyk RA (1995) Do arbitrage pricing models explain the predictability of stock returns? J Bus 68:309–349

  • Fisher L (1966) Some new stock-market indexes. J Bus 39:191–225

  • Fisher L, Lorie JH (1964) Rates of return on investments in common stock. J Bus 37:1–21

  • Fisher L, Lorie JH (1968) Rates of return on investments in common stock: the year-by-year record, 1926–65. J Bus 41:291–316

  • Fisher L, Lorie JH (1977) A half century of returns on stocks and bonds: Rates of return on investments in common stocks and on U.S. Treasury securities, 1926–1976. University of Chicago Graduate School of Business, Chicago

  • Hamza O, Kortas M, L’Her J-F, Roberge M (2006) International equity portfolios: selecting the right benchmark for emerging markets. Emerg Market Rev 7:111–128

  • Hou KW, Moskowitz T (2005) Market frictions, price delay, and the cross-section of expected returns. Rev Financ Stud 18(3):981–1020

  • Jones CM, Lipson ML (2001) Sixteenths: direct evidence on institutional execution costs. J Financ Econ 59:253–278

  • Keim D (1983) Size related anomalies and stock market seasonality: further empirical evidence. J Financ Econ 12:13–32

  • Macaulay FR (1938) Some theoretical problems suggested by the movements of interest rates, bond yields, and stock prices in the United States since 1856. National Bureau of Economic Research, New York

  • Niederhoffer V, Osborne MFM (1966) Market making and reversal on the stock exchange. J Am Stat Assoc 61:897–916

  • Reinganum MR (1982) A direct test of Roll’s conjecture on the firm size effect. J Financ 37:27–35

  • Roll R (1983) On computing mean returns and the small firm premium. J Financ Econ 12:371–386

  • Roll R (1984) A simple measure of the effective bid-ask spread in an efficient market. J Financ 39:1127–1139

  • Samuelson PA (1965) Proof that properly anticipated prices fluctuate randomly. Ind Manag Rev 6:41–49

  • Weaver DG (1991) Sources of short-term errors in the relative prices of common stocks. Ph.D Dissertation, Rutgers University

Acknowledgments

We thank Marshall Blume, Ivan Brick, Stephen Brown, Douglas Jones, Jay Ritter, Scott Linn, Michael Pagano, David C. Porter, Robert Stambaugh, Yusif Simaan, and David Whitcomb for their comments on earlier versions of this study. Fisher and Weaver thank the Whitcomb Center for Research in Financial Services for research support. Fisher also thanks the donors of the First Fidelity Bank Research Professorship of Finance.

Author information

Corresponding author

Correspondence to Daniel G. Weaver.

Appendices

Appendix 1

1.1 Proof that a log normally distributed error term is approximately equal to a binomially distributed error term

Let \( Y_{it} = \log_{e} (1 + e_{it}) \). If \( (1 + e_{it}) \) is log normally distributed, \( Y_{it} \) is normally distributed with mean \( \mu_{it} \) and variance \( \sigma^{2} (Y_{it} ) \). From Aitchison and Brown (1957, pp. 8–10),

$$ {\text{E}}(1 + e_{i,t - 1} ) = \exp \left[ {\mu_{it} + 0.5\sigma^{2} (Y_{it} )} \right] $$
(A1)

and

$$ {\text{E}}\left[ {{\frac{1}{{1 + e_{i,t - 1} }}}} \right] = \exp \left[ { - \mu_{it} + 0.5\sigma^{2} (Y_{it} )} \right] $$
(A2)

By assumption \( {\text{E}}(1 + e_{it}) = 1 \); then, since exp(0) = 1, it must be that

$$ \mu_{it} + 0.5\sigma^{2} (Y_{it} ) = 0 $$
(A3)

Solving for \( \mu_{it} \) and substituting the result into Eq. (A2) yields

$$ {\text{E}}\left[ {{\frac{1}{{1 + e_{i,t - 1} }}}} \right] = \exp \left[ {\sigma^{2} (Y_{it} )} \right] $$
(A4)

Aitchison and Brown (1957, Eq. 2.9) state that the exponential function of the variance of the transformed distribution (i.e., the right-hand side of Eq. (A4)) is equal to one plus the squared coefficient of variation of the parent, \( (1 + e_{it} ) \), distribution or \( e^{{\sigma^{2} }} = 1 + \eta^{2} \). Therefore, Eq. (A4) can be rewritten as

$$ {\text{E}}\left[ {{\frac{1}{{1 + e_{i,t - 1} }}}} \right] = 1 + {\frac{{\sigma^{2} \left( {1 + e_{i,t - 1} } \right)}}{{\left[ {{\text{E}}(1 + e_{i,t - 1} )} \right]^{2} }}} $$
(A5)

Since \( {\text{E}}(1 + e_{i,t - 1}) = 1 \) and \( \sigma^{2}(1 + e_{i,t - 1}) = \sigma^{2}(e_{i,t - 1}) \), Eq. (A5) becomes

$$ {\text{E}}\left[ {{\frac{1}{{1 + e_{i,t - 1} }}}} \right] = 1 + \sigma^{2} \left( {e_{i,t - 1} } \right) $$
(A6)
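As a numerical sanity check on (A4) through (A6), the sketch below compares the closed-form lognormal result with a Monte Carlo estimate and with the binomial (bid or ask) case mentioned in Note 19. The function names, the error size of 0.05, and the seed are illustrative assumptions, not values from the paper.

```python
import math
import random


def lognormal_inverse_mean(sigma_y):
    """E[1/(1+e)] from (A4), with log(1+e) ~ N(-0.5*sigma_y**2, sigma_y**2)."""
    return math.exp(sigma_y ** 2)


def lognormal_error_variance(sigma_y):
    """Variance of e when (1+e) is lognormal with mean 1."""
    return math.exp(sigma_y ** 2) - 1.0


def binomial_inverse_mean(h):
    """E[1/(1+e)] when e is +h or -h with equal probability (variance h**2)."""
    return 0.5 * (1.0 / (1.0 + h) + 1.0 / (1.0 - h))


def simulated_inverse_mean(sigma_y, n_draws=100_000, seed=0):
    """Monte Carlo estimate of E[1/(1+e)] for the lognormal case."""
    rng = random.Random(seed)
    mu = -0.5 * sigma_y ** 2  # (A3): forces E(1+e) = 1
    total = sum(1.0 / rng.lognormvariate(mu, sigma_y) for _ in range(n_draws))
    return total / n_draws


s = 0.05
print("lognormal, closed form :", lognormal_inverse_mean(s))
print("1 + variance of e      :", 1.0 + lognormal_error_variance(s))
print("lognormal, simulated   :", simulated_inverse_mean(s))
print("binomial, closed form  :", binomial_inverse_mean(s))
```

In the lognormal case the first two figures agree exactly, which is (A6). The binomial figure equals 1/(1 − h²), which matches 1 + h² up to terms of order h⁴.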

Appendix 2

2.1 Conditional one-period expected index relatives assuming lagged adjustments to new information

Recall Eq. (5) from the text:

$$ \hat{P}_{i,t} = \left[ {\left( {1 - a_{it} } \right)P_{i,t - 1} + a_{it} P_{it} } \right]\left( {1 + e_{it} } \right) $$
(5)

where \( a_{i,t} \) is the adjustment coefficient, which shows the extent to which prices have adjusted to information released since t − 1. If there is no lag process, \( a_{i,t} = 1 \). For tractability we assume either that all prices fully adjust or that they do not adjust at all.

| \( a_{i,t-1} \) | \( a_{i,t} \) | \( \hat{P}_{i,t-1} \) | \( \hat{P}_{i,t} \) | Conditional \( {\text{E}}( \hat{W}_{t} ) \) |
|---|---|---|---|---|
| 0 | 0 | \( P_{i,t-2} ( 1 + e_{i,t-1} ) \) | \( P_{i,t-1} ( 1 + e_{it} ) \) | \( W_{t-1} \overline{( 1 + \sigma^{2} (e_{t-1}) )} \) |
| 0 | 1 | \( P_{i,t-2} ( 1 + e_{i,t-1} ) \) | \( P_{it} ( 1 + e_{it} ) \) | \( W_{t-1} W_{t} \overline{( 1 + \sigma^{2} (e_{t-1}) )} \) |
| 1 | 0 | \( P_{i,t-1} ( 1 + e_{i,t-1} ) \) | \( P_{i,t-1} ( 1 + e_{it} ) \) | \( \overline{( 1 + \sigma^{2} (e_{t-1}) )} \) |
| 1 | 1 | \( P_{i,t-1} ( 1 + e_{i,t-1} ) \) | \( P_{it} ( 1 + e_{it} ) \) | \( W_{t} \overline{( 1 + \sigma^{2} (e_{t-1}) )} \) |

Define θ as the probability that a = 0 and α as (1 − θ). Weighting the four cases in the table by their probabilities (θ², θα, αθ, and α²) gives

$$ {\text{E}}\left( {\hat{W}_{t} } \right) \approx \overline{{\left( 1 + \sigma^{2} (e_{t - 1} ) \right)}} \left[ {\theta^{2} W_{t - 1} + \alpha \theta W_{t - 1} W_{t} + \alpha \theta + \alpha^{2} W_{t} } \right] $$
$$ \approx \overline{{\left( 1 + \sigma^{2} (e_{t - 1} ) \right)}} \left( {\alpha + \theta W_{t - 1} } \right)\left( {\theta + \alpha W_{t} } \right) $$
(A7)

and since W = (1 + R), where R is the return on the true aggregate portfolio, and α = 1 − θ, Eq. (A7) can be rewritten as

$$ {\text{E}}(\hat{W}_{t} ) \approx \overline{{(1 + \sigma^{2} (e_{t - 1} ))}} \left[ {1 - \theta + \theta \left( {1 + R_{t - 1} } \right)} \right]\left\langle {\theta + \left( {1 - \theta } \right)\left( {1 + R_{t} } \right)} \right\rangle $$
(A8)

or

$$ {\text{E}}(\hat{W}_{t} ) \approx \overline{{(1 + \sigma^{2} (e_{t - 1} ))}} \left( {1 + \theta R_{t - 1} } \right)\left( {1 + \alpha R_{t} } \right) $$
(18)
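The step from the conditional table to Eq. (18) can be checked numerically. The sketch below weights the four cases by their probabilities and compares the result with the factored form; it assumes, as the θ² and α² weights imply, that the adjustment indicators in the two periods are independent. The function names and input values are illustrative only.

```python
def expected_relative_by_cases(r_prev, r_now, theta, var_bar):
    """Probability-weighted sum of the four conditional expectations."""
    alpha = 1.0 - theta
    w_prev, w_now = 1.0 + r_prev, 1.0 + r_now
    cases = [
        (theta * theta, w_prev),          # a_{t-1} = 0, a_t = 0
        (theta * alpha, w_prev * w_now),  # a_{t-1} = 0, a_t = 1
        (alpha * theta, 1.0),             # a_{t-1} = 1, a_t = 0
        (alpha * alpha, w_now),           # a_{t-1} = 1, a_t = 1
    ]
    return (1.0 + var_bar) * sum(prob * rel for prob, rel in cases)


def expected_relative_factored(r_prev, r_now, theta, var_bar):
    """Factored form: (1 + var_bar)(1 + theta R_{t-1})(1 + alpha R_t)."""
    alpha = 1.0 - theta
    return (1.0 + var_bar) * (1.0 + theta * r_prev) * (1.0 + alpha * r_now)


args = dict(r_prev=0.03, r_now=-0.01, theta=0.25, var_bar=0.0004)
print(expected_relative_by_cases(**args), expected_relative_factored(**args))
```

With θ = 0 (no lagged adjustment) both functions reduce to the average variance term times (1 + R_t), so only the transient-error bias remains.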

Appendix 3

3.1 The ratio of a two-period expected index relative to a one-period expected index relative assuming lagged adjustment to new information

Following the methodology of “Appendix 2”, finding the expected one-period index relative ending at time t − 1 yields the following conditional table.

| \( a_{i,t-2} \) | \( a_{i,t-1} \) | \( \hat{P}_{i,t-2} \) | \( \hat{P}_{i,t-1} \) | Conditional \( {\text{E}}( \hat{W}_{t-1} ) \) |
|---|---|---|---|---|
| 0 | 0 | \( P_{i,t-3} ( 1 + e_{i,t-2} ) \) | \( P_{i,t-2} ( 1 + e_{i,t-1} ) \) | \( W_{t-2} \overline{( 1 + \sigma^{2} (e_{t-2}) )} \) |
| 0 | 1 | \( P_{i,t-3} ( 1 + e_{i,t-2} ) \) | \( P_{i,t-1} ( 1 + e_{i,t-1} ) \) | \( W_{t-2} W_{t-1} \overline{( 1 + \sigma^{2} (e_{t-2}) )} \) |
| 1 | 0 | \( P_{i,t-2} ( 1 + e_{i,t-2} ) \) | \( P_{i,t-2} ( 1 + e_{i,t-1} ) \) | \( \overline{( 1 + \sigma^{2} (e_{t-2}) )} \) |
| 1 | 1 | \( P_{i,t-2} ( 1 + e_{i,t-2} ) \) | \( P_{i,t-1} ( 1 + e_{i,t-1} ) \) | \( W_{t-1} \overline{( 1 + \sigma^{2} (e_{t-2}) )} \) |

Also as in “Appendix 2”, define θ as the probability that a = 0 and α as 1 − θ. Then

$$ {\text{E}}\left( {\hat{W}_{t - 1} } \right) \approx \overline{{\left( 1 + \sigma^{2} (e_{t - 2} ) \right)}} \left[ {\theta^{2} W_{t - 2} + \alpha \theta W_{t - 2} W_{t - 1} + \alpha \theta + \alpha^{2} W_{t - 1} } \right] $$
(A9)
$$ \approx \overline{{\left( 1 + \sigma^{2} (e_{t - 2} ) \right)}} \left( {\alpha + \theta W_{t - 2} } \right)\left( {\theta + \alpha W_{t - 1} } \right) $$
(A10)

and once again, since W = (1 + R), where R is the return on the true aggregate portfolio, and α = 1 − θ, Eq. (A10) can be rewritten as

$$ {\text{E}}(\hat{W}_{t - 1} ) \approx \overline{{\left( 1 + \sigma^{2} (e_{t - 2} ) \right)}} \left[ {1 - \theta + \theta \left( {1 + R_{t - 2} } \right)} \right]\left\langle {\theta + \left( {1 - \theta } \right)\left( {1 + R_{t - 1} } \right)} \right\rangle $$
(A11)

or

$$ {\text{E}}(\hat{W}_{t - 1} ) \approx \overline{{\left( 1 + \sigma^{2} (e_{t - 2} ) \right)}} \left( {1 + \theta R_{t - 2} } \right)\left( {1 + \alpha R_{t - 1} } \right) $$
(A12)

Similarly, the following table gives the two-period aggregate wealth relative.

| \( a_{i,t-2} \) | \( a_{it} \) | \( \hat{P}_{i,t-2} \) | \( \hat{P}_{it} \) | Conditional \( {\text{E}}( {}_{2}\hat{W}_{t} ) \) |
|---|---|---|---|---|
| 0 | 0 | \( P_{i,t-3} ( 1 + e_{i,t-2} ) \) | \( P_{i,t-1} ( 1 + e_{it} ) \) | \( W_{t-2} W_{t-1} \overline{( 1 + \sigma^{2} (e_{t-2}) )} \) |
| 0 | 1 | \( P_{i,t-3} ( 1 + e_{i,t-2} ) \) | \( P_{it} ( 1 + e_{it} ) \) | \( W_{t-2} W_{t-1} W_{t} \overline{( 1 + \sigma^{2} (e_{t-2}) )} \) |
| 1 | 0 | \( P_{i,t-2} ( 1 + e_{i,t-2} ) \) | \( P_{i,t-1} ( 1 + e_{it} ) \) | \( W_{t-1} \overline{( 1 + \sigma^{2} (e_{t-2}) )} \) |
| 1 | 1 | \( P_{i,t-2} ( 1 + e_{i,t-2} ) \) | \( P_{it} ( 1 + e_{it} ) \) | \( W_{t-1} W_{t} \overline{( 1 + \sigma^{2} (e_{t-2}) )} \) |

Defining θ as the probability that a = 0 and α as (1 − θ), then

$$ {\text{E}}\left( {{}_{2}\hat{W}_{t} } \right) \approx \left( {\overline{{1 + \sigma^{2} (e_{t - 2} )}} } \right)\left[ {\theta^{2} W_{t - 2} W_{t - 1} + \alpha \theta W_{t - 2} W_{t - 1} W_{t} + \alpha \theta W_{t - 1} + \alpha^{2} W_{t - 1} W_{t} } \right] $$
(A13)

and in return form

$$ {\text{E}}({}_{2}\hat{W}_{t} ) \approx \left( {\overline{{1 + \sigma^{2} (e_{t - 2} )}} } \right)\left[ {\left( {1 + R_{t - 1} } \right)\left( {1 + \theta R_{t - 2} } \right)\left( {1 + \alpha R_{t} } \right)} \right] $$
(A14)

then the ratio of (A14) to (A12) is

$$ {\frac{{{\text{E}}\left( {_{2} \hat{W}_{t} } \right)}}{{{\text{E}}\left( {\hat{W}_{t - 1} } \right)}}} = {\frac{{\left( {\overline{{1 + \sigma^{2} (e_{t - 2} )}} } \right)\left[ {\left( {1 + R_{t - 1} } \right)\left( {1 + \theta R_{t - 2} } \right)\left( {1 + \alpha R_{t} } \right)} \right]}}{{\left( {\overline{{1 + \sigma^{2} (e_{t - 2} )}} } \right)\left( {1 + \theta R_{t - 2} } \right)\left( {1 + \alpha R_{t - 1} } \right)}}} $$
(A15)

Since \( \left\{ {\overline{{ 1+ \sigma^{ 2} \left( {e_{i,t - 2} } \right)}} } \right\} \) is in the numerator and denominator, Jensen’s inequality does not apply, thus

$$ {\text{E}}\left( {{\frac{{{}_{2}\hat{W}_{t} }}{{\hat{W}_{t - 1} }}}} \right) \approx {\frac{{\left( {1 + R_{t - 1} } \right)\left( {1 + \alpha R_{t} } \right)}}{{\left( {1 + \alpha R_{t - 1} } \right)}}} $$
(A16)

Therefore, in the presence of a lagged adjustment process longer than the holding period under consideration, our method will remove all of the random transient error bias, but not all of the lagged adjustment process bias.
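The conclusion can be illustrated with a short sketch. The first function evaluates the right-hand side of (A16); the second simulates the no-lag case (α = 1) and compares a naive equally-weighted one-period return with the ratio of the two-period relative to the lagged one-period relative. This is a simplified illustration, not the authors' implementation: the function names, a common true return of 1% per period, lognormal errors with a log standard deviation of 0.1, and the seed are all assumptions.

```python
import random


def ratio_a16(r_prev, r_now, alpha):
    """Right-hand side of (A16): expected two-period over lagged one-period relative."""
    return (1 + r_prev) * (1 + alpha * r_now) / (1 + alpha * r_prev)


def simulate_no_lag(n_stocks=40_000, true_r=0.01, sigma_y=0.1, seed=7):
    """Naive and ratio-based return estimates when prices adjust fully (alpha = 1)."""
    rng = random.Random(seed)
    mu = -0.5 * sigma_y ** 2  # keeps E(1 + e) = 1

    def noise():
        return rng.lognormvariate(mu, sigma_y)  # one draw of (1 + e)

    one_now = one_prev = two = 0.0
    for _ in range(n_stocks):
        p_t2 = noise()                      # observed price at t-2 (true price 1)
        p_t1 = (1 + true_r) * noise()       # observed price at t-1
        p_t0 = (1 + true_r) ** 2 * noise()  # observed price at t
        one_now += p_t0 / p_t1              # one-period relative ending at t
        one_prev += p_t1 / p_t2             # one-period relative ending at t-1
        two += p_t0 / p_t2                  # two-period relative ending at t
    naive = one_now / n_stocks - 1
    corrected = (two / n_stocks) / (one_prev / n_stocks) - 1
    return naive, corrected


naive, corrected = simulate_no_lag()
print(f"true 0.0100, naive {naive:.4f}, ratio-based {corrected:.4f}")
print("A16 with alpha = 1.0:", ratio_a16(0.02, 0.01, 1.0))
print("A16 with alpha = 0.5:", ratio_a16(0.05, 0.01, 0.5))
```

With α = 1 the expression in (A16) collapses to 1 + R_t, so the ratio recovers the true relative. With α below 1 it does not, which is the residual lagged-adjustment bias the text describes.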

Cite this article

Fisher, L., Weaver, D.G. & Webb, G. Removing biases in computed returns. Rev Quant Finan Acc 35, 137–161 (2010). https://doi.org/10.1007/s11156-009-0161-8
