Abstract
The growth of machine-readable data in finance, such as alternative data, requires new modeling techniques that can handle non-stationary and non-parametric data. Due to the underlying causal dependence and the size and complexity of the data, we propose a new modeling approach for financial time series data, the \(\alpha _{t}\)-RIM (recurrent independent mechanism). This architecture makes use of key–value attention to integrate top-down and bottom-up information in a context-dependent and dynamic way. To model the data in such a dynamic manner, the \(\alpha _{t}\)-RIM utilizes an exponentially smoothed recurrent neural network, which can model non-stationary time series data, combined with a modular and independent recurrent structure. We apply our approach to the closing prices of three selected stocks of the S&P 500 universe as well as their news sentiment scores. The results suggest that the \(\alpha _{t}\)-RIM is capable of reflecting the causal structure between stock prices and news sentiment, as well as the seasonality and trends. Consequently, this modeling approach markedly improves the generalization performance, that is, the prediction of unseen data, and outperforms state-of-the-art networks such as long short-term memory models.
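The exponential smoothing at the core of the architecture can be illustrated with a minimal sketch: each raw hidden state \(h_t\) is blended with the previous smoothed state via a smoothing factor. In the sketch below the factor is a fixed constant for clarity, whereas in the \(\alpha _{t}\)-RIM the time-varying \(\alpha _{t}\) is learned by the network; the function name and shapes are illustrative, not the paper's implementation.

```python
import numpy as np

def exp_smooth_states(hidden_states, alpha):
    """Exponentially smooth a sequence of RNN hidden states.

    hidden_states: array of shape (T, d) with the raw states h_t.
    alpha: smoothing factor in (0, 1]; alpha = 1 reproduces the raw states.
    Returns h_tilde_t = alpha * h_t + (1 - alpha) * h_tilde_{t-1}.
    """
    smoothed = np.zeros_like(hidden_states, dtype=float)
    smoothed[0] = hidden_states[0]
    for t in range(1, len(hidden_states)):
        smoothed[t] = alpha * hidden_states[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed

# A spike at t=1 is damped in the smoothed trajectory:
states = np.array([[1.0], [3.0], [2.0]])
print(exp_smooth_states(states, 0.5))
```

Smoothing the recurrent state in this way filters high-frequency noise while retaining slow-moving structure such as trends, which is one motivation for combining it with the modular RIM units on non-stationary financial series.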
Data availability
The datasets generated during and/or analyzed during the current study are available in this repository: https://github.com/QuantLet/alpha_t-RIM.
Funding
No funding was received for conducting this study.
Ethics declarations
Conflict of interest
The author has no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements
The author would like to thank YUKKA Lab, Berlin, for providing the raw data for this research. Furthermore, the author would like to thank Matthew Dixon and Saeed Amen, who provided significant support to the research with their insights and expertise. Finally, the author would like to thank Jörg Osterrieder for his comments and suggestions on this paper.
Appendix A
A.1 The \(\alpha _{t}\)-RIM hyper-parameters
Due to the model's constraints on the hyper-parameters (e.g., the number of RIMs has to be smaller than or equal to the number of k modules), standard cross-validation could not be performed. Therefore, a special function was implemented to generate a list of dictionaries to be fed into the grid search as a parameter grid. The list encompasses the following parameters:
- Units: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 45, 50
- Number of RIMs: 4, 6, 8, 10, 12, 14
- K modules: 4, 6, 8, 10, 12, 14
- Input key size: 4, 6, 8, 10, 12
- Input value size: 4, 6, 8, 10, 12
- Input query size: 4, 6, 8, 10, 12
- Input keep probability: 0.6, 0.7, 0.8, 0.9
- Number of communication heads: 2, 4, 6, 8
- Communication key size: 4, 6, 8, 10, 12
- Communication value size: 4, 6, 8, 10, 12
- Communication query size: 4, 6, 8, 10, 12
- Communication keep probability: 0.6, 0.7, 0.8, 0.9
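The constrained grid-generation step described above can be sketched as follows. This is a hypothetical reimplementation over a small subset of the parameters (names such as `make_param_grid` are illustrative); the constraint it enforces is the one stated in the text, namely that the number of RIMs must not exceed the number of k modules.

```python
from itertools import product

def make_param_grid(units, n_rims, k_modules, input_key_sizes):
    """Build a list of hyper-parameter dictionaries for grid search,
    skipping combinations that violate num_rims <= k_modules."""
    grid = []
    for u, n, k, ks in product(units, n_rims, k_modules, input_key_sizes):
        if n > k:  # invalid: more active RIMs than available modules
            continue
        grid.append({
            "units": u,
            "num_rims": n,
            "k_modules": k,
            "input_key_size": ks,
        })
    return grid

# Only valid (num_rims, k_modules) pairs survive the filter:
grid = make_param_grid([2, 4], [4, 6], [4, 6], [4])
```

Pre-filtering the grid this way lets an off-the-shelf grid search consume the list of dictionaries directly, instead of failing mid-training on an invalid configuration.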
A.2 Complete training results
A.2.1 Evaluation metrics
AMAZON
See Tables 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38.
BROWN FORMAN
THERMO FISHER
A.2.2 Re-scaled metrics
AMAZON
BROWN FORMAN
THERMO FISHER
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Königstein, N. Dynamic and context-dependent stock price prediction using attention modules and news sentiment. Digit Finance 5, 449–481 (2023). https://doi.org/10.1007/s42521-023-00089-7