Big Data Analysis of StockTwits to Predict Sentiments in the Stock Market

  • Conference paper
Discovery Science (DS 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8777)

Abstract

Online stock forums have become a vital investing platform for publishing relevant and valuable user-generated content (UGC), such as investment recommendations, allowing investors to view the opinions of a large number of users and to share and exchange trading ideas. This paper combines text mining, feature selection and Bayesian Networks to analyze and extract sentiments from stock-related micro-blogging messages called “StockTwits”. We investigate whether the collective sentiment of StockTwits can be predicted and how these predicted sentiments might help investors and their peers make profitable investment decisions in the stock market. Specifically, we build Bayesian Networks from terms in the tweets that are selected using wrapper feature selection. We then use textual visualization to provide a better understanding of the predicted relationships among sentiments and their related features.
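As a rough illustration of the wrapper idea mentioned in the abstract, the sketch below runs greedy forward selection over message terms, scoring each candidate subset by the leave-one-out accuracy of a classifier trained on it. It rests on several assumptions that are not from the paper: the messages and labels are invented toy data, a Bernoulli naive Bayes model (the simplest Bayesian network structure) stands in for whatever networks the authors learn, and all function names are hypothetical.

```python
import math

# Hypothetical toy messages with hand-assigned labels (not data from the paper).
MESSAGES = [
    ("buy calls strong breakout", "bullish"),
    ("strong earnings buy now", "bullish"),
    ("breakout soon buy more", "bullish"),
    ("long and strong today", "bullish"),
    ("sell now weak earnings", "bearish"),
    ("weak chart short it", "bearish"),
    ("short today sell calls", "bearish"),
    ("sell the weak bounce", "bearish"),
]

# Each message becomes the set of terms it contains (term presence only).
ROWS = [(set(text.lower().split()), label) for text, label in MESSAGES]
VOCAB = sorted(set().union(*(tokens for tokens, _ in ROWS)))


def train_nb(rows, features):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing."""
    model = {}
    for label in sorted({y for _, y in rows}):
        docs = [x for x, y in rows if y == label]
        prior = len(docs) / len(rows)
        probs = {f: (sum(f in d for d in docs) + 1) / (len(docs) + 2)
                 for f in features}
        model[label] = (prior, probs)
    return model


def predict(model, tokens):
    """Return the label with the highest log posterior score."""
    best_label, best_score = None, -math.inf
    for label, (prior, probs) in model.items():
        score = math.log(prior)
        for f, p in probs.items():
            score += math.log(p if f in tokens else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def loo_accuracy(rows, features):
    """Leave-one-out accuracy of the classifier restricted to `features`."""
    correct = 0
    for i, (tokens, label) in enumerate(rows):
        model = train_nb(rows[:i] + rows[i + 1:], features)
        correct += predict(model, tokens) == label
    return correct / len(rows)


def wrapper_forward_selection(rows, candidates):
    """Greedily add the term that most improves accuracy; stop when none does."""
    selected, best = [], loo_accuracy(rows, [])
    improved = True
    while improved:
        improved = False
        for f in sorted(set(candidates) - set(selected)):
            acc = loo_accuracy(rows, selected + [f])
            if acc > best:
                best, choice, improved = acc, f, True
        if improved:
            selected.append(choice)
    return selected, best


selected, accuracy = wrapper_forward_selection(ROWS, VOCAB)
print("selected terms:", selected)
print("leave-one-out accuracy:", accuracy)
```

The point of the wrapper approach is that the classifier itself judges each feature subset, instead of a classifier-independent score such as term frequency. The cost is that the model is retrained for every candidate subset, which is why a greedy search is used here rather than an exhaustive one.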



Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Al Nasseri, A., Tucker, A., de Cesare, S. (2014). Big Data Analysis of StockTwits to Predict Sentiments in the Stock Market. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science, vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_2

  • DOI: https://doi.org/10.1007/978-3-319-11812-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11811-6

  • Online ISBN: 978-3-319-11812-3

  • eBook Packages: Computer Science, Computer Science (R0)
