Leveraging Social Media to Predict Continuation and Reversal in Asset Prices

  • Patrick Houlihan
  • Germán G. Creamer


Using features extracted from StockTwits messages posted between July 2009 and September 2012, we show through simulations that message volume and sentiment (1) can be used as a risk factor in an asset pricing model framework; (2) help explain the diffusion of price information over several days; and (3) can be used as features to predict directional moves in asset prices. Our findings suggest that statistics derived from message volume and sentiment can improve asset price forecasts and lead to a profitable trading strategy.


Social media; Crowdsourcing; Sentiment analysis; Machine learning; Computational finance



Partial funding for this research was provided by Stevens Alliance for Innovation and Leadership.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Stevens Institute of Technology, Hoboken, USA
