
Can Sentiment Analysis and Options Volume Anticipate Future Returns?

Computational Economics

“Prediction is very difficult, especially about the future.”

Niels Bohr

Abstract

This paper evaluates whether sentiment extracted from social media and options volume anticipate future asset returns. The research used both text-based data and a call-put ratio derived from market data, collected between July 2009 and September 2012. It shows that: (1) features derived from market data and a call-put ratio can improve model performance, (2) sentiment derived from StockTwits, a social media platform for the financial community, further enhances model performance, (3) aggregating all features together also improves performance, and (4) sentiment from social media and market data can be used as risk factors in an asset pricing framework.
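As a minimal illustration of the kind of market-data feature named in the abstract, a call-put ratio can be sketched as call option volume divided by put option volume. The abstract does not state the paper's exact definition, so the formula, the function name `call_put_ratio`, and the sample volumes below are assumptions made for illustration only.

```python
from typing import Optional


def call_put_ratio(call_volume: float, put_volume: float) -> Optional[float]:
    """Return call volume divided by put volume.

    Assumed definition for illustration; the paper's own construction
    of the ratio may differ. Returns None when put volume is zero,
    since the ratio is undefined in that case.
    """
    if call_volume < 0 or put_volume < 0:
        raise ValueError("volumes must be non-negative")
    if put_volume == 0:
        return None
    return call_volume / put_volume


# Hypothetical daily (label, call volume, put volume) records; not real data.
daily_volumes = [
    ("day_1", 1200, 800),
    ("day_2", 500, 1000),
    ("day_3", 300, 0),
]

# Map each day label to its ratio (None where the ratio is undefined).
ratios = {day: call_put_ratio(calls, puts) for day, calls, puts in daily_volumes}

for day, ratio in ratios.items():
    print(day, ratio)
```

Under this assumed definition, a value above 1 means more calls than puts traded that day. Whether such a value carries any information about future returns is the empirical question the paper addresses.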



Acknowledgements

The authors would like to thank StockTwits for providing the messages. The authors also thank Shu-Heng Chen, Blake LeBaron, Jon Kaufman, David Starer, Hamed Ghoddusi, Khaldoun Khashanah, and three anonymous referees for suggestions and informal discussions about this research. The opinions presented are the exclusive responsibility of the authors.

Author information

Corresponding author

Correspondence to Patrick Houlihan.

About this article

Cite this article

Houlihan, P., Creamer, G.G. Can Sentiment Analysis and Options Volume Anticipate Future Returns? Comput Econ 50, 669–685 (2017). https://doi.org/10.1007/s10614-017-9694-4

  • DOI: https://doi.org/10.1007/s10614-017-9694-4
