Abstract
Cryptocurrencies are increasingly used in official cash flows and the exchange of goods. Bitcoin and the underlying blockchain technology have attracted the attention of big companies, which are adopting and investing in this technology. The CRIX Index of cryptocurrencies http://hu.berlin/CRIX indicates a wider acceptance of cryptos. One reason for this prosperity is certainly security, since the underlying network of cryptos is decentralized. It is also unregulated and highly volatile, which makes risk assessment at any given moment difficult. Message boards offer a huge source of information in the form of unstructured text written by, e.g., Bitcoin developers and investors. We collect texts, user information and associated time stamps from a popular cryptocurrency message board. We then provide an indicator for fraudulent schemes, constructed using dynamic topic modelling, text mining and unsupervised machine learning. We study how opinions and the evolution of topics are connected with big events in the cryptocurrency universe. Furthermore, the predictive power of these techniques is investigated by comparing the results to known events in the cryptocurrency space. We also test hypotheses of self-fulfilling prophecies and herding behaviour using the results.
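As a rough illustration of turning time-stamped forum posts into a time-indexed signal, the sketch below groups posts by month and measures the share of tokens that fall in a small watchlist. This is a deliberately crude stand-in and not the chapter's method: a dynamic topic model learns its topics from the data, whereas here the watchlist `WATCHLIST`, the function `monthly_term_share`, and the sample posts are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical watchlist (an assumption); a topic model would learn
# its vocabulary from the corpus instead of using hand-picked terms.
WATCHLIST = {"scam", "ponzi", "fraud", "hack"}


def monthly_term_share(posts, watchlist=WATCHLIST):
    """Return {"YYYY-MM": share of tokens in that month found in watchlist}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for timestamp, text in posts:
        month = datetime.fromisoformat(timestamp).strftime("%Y-%m")
        # Naive tokenisation: split on whitespace, trim punctuation, lowercase.
        tokens = [t.strip(".,!?;:()\"'").lower() for t in text.split()]
        tokens = [t for t in tokens if t]
        totals[month] += len(tokens)
        hits[month] += sum(t in watchlist for t in tokens)
    return {m: hits[m] / totals[m] for m in sorted(totals) if totals[m]}


# Made-up posts, for illustration only.
posts = [
    ("2014-01-05T10:00:00", "New mining rig arrived today"),
    ("2014-02-10T09:30:00", "Is this exchange a scam?"),
    ("2014-02-11T12:00:00", "Looks like a Ponzi, total fraud"),
]
shares = monthly_term_share(posts)
```

A month whose share rises sharply relative to earlier months could then be compared against known events, which is the kind of comparison the abstract describes for the learned topics.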
Copyright information
© 2017 Springer-Verlag GmbH Germany
About this chapter
Cite this chapter
Linton, M., Teo, E.G.S., Bommes, E., Chen, C.Y., Härdle, W.K. (2017). Dynamic Topic Modelling for Cryptocurrency Community Forums. In: Härdle, W., Chen, CH., Overbeck, L. (eds) Applied Quantitative Finance. Statistics and Computing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-54486-0_18
DOI: https://doi.org/10.1007/978-3-662-54486-0_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-54485-3
Online ISBN: 978-3-662-54486-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)