Can Sentiment Analysis and Options Volume Anticipate Future Returns?

“Prediction is very difficult, especially about the future.”

Niels Bohr

Abstract

This paper evaluates whether sentiment extracted from social media, together with options volume, anticipates future asset returns. The study uses both text-based data and a call-put ratio derived from market data, collected between July 2009 and September 2012. It shows that: (1) features derived from market data and a call-put ratio can improve model performance, (2) sentiment derived from StockTwits, a social media platform for the financial community, further enhances model performance, (3) aggregating all features also improves performance, and (4) sentiment from social media and market data can be used as risk factors in an asset pricing framework.

Acknowledgements

The authors would like to thank StockTwits for providing the messages. The authors also thank Shu-Heng Chen, Blake LeBaron, Jon Kaufman, David Starer, Hamed Ghoddusi, Khaldoun Khashanah, and three anonymous referees for suggestions and informal discussions about this research. The opinions presented are the exclusive responsibility of the authors.

Author information

Correspondence to Patrick Houlihan.

About this article

Cite this article

Houlihan, P., Creamer, G.G. Can Sentiment Analysis and Options Volume Anticipate Future Returns? Comput Econ 50, 669–685 (2017). https://doi.org/10.1007/s10614-017-9694-4

Keywords

  • Social media
  • Investor sentiment
  • Behavioral finance
  • Machine learning