Abstract
Modeling the continuously changing trends in financial markets and generating real-time, meaningful predictions of significant changes in those markets has drawn considerable interest from economists and data scientists alike. Beyond traditional market indicators, the growth of varied social media has enabled economists to leverage micro-level and real-time indicators of factors that may influence the market, such as public emotion, anticipation, and behavior. We propose several specific market-related features that can be mined from varied sources such as news, Google search volumes, and Twitter, and we investigate the correlation between these features and financial market fluctuations. We present a Delta Naive Bayes (DNB) approach to generating predictions about financial markets, together with a detailed prospective analysis that compares the accuracy of predictions generated from multiple, combined sources with that of predictions generated from a single source. We find that multi-source predictions consistently outperform single-source predictions, albeit with some limitations.
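As a rough illustration of what correlating a mined feature with market fluctuations can look like, the sketch below computes a lagged Pearson correlation between a daily feature series and a daily return series. It is not the paper's method or its DNB model; the function names `pearson` and `lagged_correlation`, the one-day lag, and the toy numbers are all assumptions made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(feature, returns, lag):
    """Correlate feature[t] with returns[t + lag] for a non-negative lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(feature, returns)
    # Drop the last `lag` feature values and the first `lag` returns
    # so that each feature value is paired with a later return.
    return pearson(feature[:-lag], returns[lag:])


# Hypothetical toy data (not from the paper): a daily sentiment score and
# daily returns constructed so that each return is twice the previous
# day's sentiment score.
sentiment = [0.1, 0.4, -0.2, 0.3, -0.5, 0.2]
returns = [0.0, 0.2, 0.8, -0.4, 0.6, -1.0]

same_day = lagged_correlation(sentiment, returns, 0)
next_day = lagged_correlation(sentiment, returns, 1)
print(f"lag 0: {same_day:.3f}, lag 1: {next_day:.3f}")
```

Because the toy returns are built from the previous day's score, the lag-1 correlation is essentially perfect while the same-day correlation is not. Real feature and price series would be far noisier, and a strong lagged correlation alone does not establish predictive value.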
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Jin, F., Wang, W., Chakraborty, P., Self, N., Chen, F., Ramakrishnan, N. (2017). Tracking Multiple Social Media for Stock Market Event Prediction. In: Perner, P. (eds) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2017. Lecture Notes in Computer Science, vol 10357. Springer, Cham. https://doi.org/10.1007/978-3-319-62701-4_2
DOI: https://doi.org/10.1007/978-3-319-62701-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62700-7
Online ISBN: 978-3-319-62701-4
eBook Packages: Computer Science, Computer Science (R0)