Forecasting Financial Market Volatility Using a Dynamic Topic Model

Abstract

This study uses big data and text mining techniques to forecast financial market volatility by incorporating financial information from online news sources into time series volatility models. Using time stamps, we assign a topic to each news article and analyze how the topics evolve chronologically across the set of articles with a dynamic topic model. After calculating a topic score, we develop time series models that incorporate the score to estimate and forecast realized volatility. Our empirical results suggest that the proposed models can improve forecasting accuracy.
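The final modeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the model specification, so everything here is an assumption made for illustration: a one-lag regression of log realized volatility on its own lag and on the lagged topic score, fitted by ordinary least squares, with synthetic data standing in for both series. The function names (`realized_variance`, `fit_rv_with_topic_score`, `forecast_next`) are hypothetical and do not come from the article. In the study the topic score is derived from a dynamic topic model; here it is simply simulated.

```python
import random


def realized_variance(intraday_returns):
    """Daily realized variance: the sum of squared intraday returns."""
    return sum(r * r for r in intraday_returns)


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def fit_rv_with_topic_score(log_rv, topic_score):
    """Assumed model: log_rv[t] = b0 + b1*log_rv[t-1] + b2*topic_score[t-1] + error."""
    rows = [[1.0, log_rv[t - 1], topic_score[t - 1]] for t in range(1, len(log_rv))]
    return fit_ols(rows, log_rv[1:])


def forecast_next(beta, last_log_rv, last_topic_score):
    """One-step-ahead forecast of log realized volatility."""
    return beta[0] + beta[1] * last_log_rv + beta[2] * last_topic_score


# Synthetic illustration only: simulated topic scores and log realized volatility.
random.seed(0)
n_days = 500
topic_score = [random.random() for _ in range(n_days)]
log_rv = [-9.0]
for t in range(1, n_days):
    noise = random.gauss(0.0, 0.1)
    log_rv.append(-4.0 + 0.5 * log_rv[t - 1] + 0.8 * topic_score[t - 1] + noise)

beta = fit_rv_with_topic_score(log_rv, topic_score)
print("estimated coefficients (b0, b1, b2):", beta)
print("next-day log RV forecast:", forecast_next(beta, log_rv[-1], topic_score[-1]))
```

Comparing out-of-sample forecast errors from this regression with those from the same regression without the topic score term is one simple way to check whether the score adds predictive information, which is the kind of comparison the abstract's conclusion refers to.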

Author information

Corresponding author

Correspondence to Takayuki Morimoto.

About this article

Cite this article

Morimoto, T., Kawasaki, Y. Forecasting Financial Market Volatility Using a Dynamic Topic Model. Asia-Pac Financ Markets 24, 149–167 (2017). https://doi.org/10.1007/s10690-017-9228-z

  • DOI: https://doi.org/10.1007/s10690-017-9228-z

Keywords

  • Big data
  • Online news
  • Dynamic topic model
  • Topic score
  • Forecasting
  • Realized volatility

JEL Classification

  • C10
  • C80
  • G17