Stock prediction: an event-driven approach based on bursty keywords

  • Feature Article
Frontiers of Computer Science in China

Abstract

Many real applications rely on a decision-making process that depends on a model built from information collected across different data sources. Take the stock market as an example: the decision-making process depends on a model that is influenced by factors such as stock prices, exchange volumes, market indices (e.g., the Dow Jones Index), news articles, and government announcements (e.g., an increase in stamp duty). Nevertheless, modeling the stock market is challenging because (1) the process governing market states (rise state/drop state) is stochastic and hard to capture with a deterministic approach, and (2) the market state is hidden, yet it is influenced by visible market information such as stock prices and news articles. In this paper, we propose an approach that models the stock market process using a Non-homogeneous Hidden Markov Model (NHMM), which takes both stock prices and news articles into consideration. A unique feature of our approach is that it is event-driven: when building the NHMM, we identify the events associated with a specific stock using a set of bursty features (keywords) that have a significant impact on stock price changes. We apply the model to predict the trend of future stock prices, and the encouraging results indicate that the proposed approach is practically sound and highly effective.
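
To make the idea of a non-homogeneous hidden Markov model concrete, the following is a minimal sketch and not the authors' implementation. It assumes two hidden market states (rise/drop), a discretized daily price move ("up"/"down") as the visible observation, and a hypothetical 0/1 flag marking days on which a bursty-keyword event was detected. Every probability, name, and the flag itself are illustrative assumptions; the abstract does not specify them. What makes the model non-homogeneous is that the transition matrix used at each step is selected by that day's flag rather than being fixed over time.

```python
from typing import Dict, List, Sequence

STATES = ("rise", "drop")  # hidden market states, indexed 0 and 1

# Hypothetical transition matrices P(next state | current state),
# one per value of the event flag (0 = no bursty event, 1 = bursty event).
TRANSITIONS: Dict[int, List[List[float]]] = {
    0: [[0.8, 0.2], [0.3, 0.7]],
    1: [[0.5, 0.5], [0.1, 0.9]],
}

# Hypothetical emission probabilities P(observed move | hidden state).
EMISSIONS: List[Dict[str, float]] = [
    {"up": 0.7, "down": 0.3},  # rise state
    {"up": 0.2, "down": 0.8},  # drop state
]

INITIAL = [0.5, 0.5]  # assumed uniform prior over hidden states


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def filter_states(moves: Sequence[str], event_flags: Sequence[int]) -> List[List[float]]:
    """Forward filtering: P(state_t | observations and flags up to day t)."""
    n = len(STATES)
    belief = normalize([INITIAL[s] * EMISSIONS[s][moves[0]] for s in range(n)])
    history = [belief]
    for move, flag in zip(moves[1:], event_flags[1:]):
        trans = TRANSITIONS[flag]  # time-varying transition matrix
        predicted = [sum(belief[i] * trans[i][j] for i in range(n)) for j in range(n)]
        belief = normalize([predicted[j] * EMISSIONS[j][move] for j in range(n)])
        history.append(belief)
    return history


if __name__ == "__main__":
    beliefs = filter_states(["up", "up", "down"], [0, 0, 1])
    for day, belief in enumerate(beliefs):
        print(day, {s: round(p, 3) for s, p in zip(STATES, belief)})
```

With these assumed numbers, a "down" move that coincides with an event flag of 1 shifts more probability onto the drop state than the same move without the flag, which is the kind of news-dependent behavior a fixed-transition HMM cannot express.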

Author information

Correspondence to Jeffrey Xu Yu.

About this article

Cite this article

Wu, D., Fung, G.P.C., Yu, J.X. et al. Stock prediction: an event-driven approach based on bursty keywords. Front. Comput. Sci. China 3, 145–157 (2009). https://doi.org/10.1007/s11704-009-0029-z

  • DOI: https://doi.org/10.1007/s11704-009-0029-z
