Abstract
We propose an eXplainable Risk Ranking (XRR) model that uses multilevel encoders and attention mechanisms to analyze financial risks among companies. Specifically, the proposed method uses the textual information in financial reports to rank the relative risks among companies and locate the highest-risk companies; moreover, via its attention mechanisms, XRR can highlight the critical words and sentences within financial reports that are most likely to influence financial risk, and thus offers better model explainability. Experimental results on 10-K financial reports show that XRR significantly outperforms several baselines, yielding up to a 7.4% improvement in terms of ranking correlation metrics. Furthermore, in our experiments, model explainability is evaluated by examining the learned attention models with finance-specific sentiment lexicons at the word level and a newly provided annotated reference list at the sentence level.
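To make "ranking correlation metrics" concrete, the following is a minimal sketch of one such metric, Kendall's tau, which compares a predicted risk ranking with a realized one. The abstract does not name the metrics used, so the choice of Kendall's tau (suggested only by the Kendall entry in the reference list), the tau-a variant, the function name `kendall_tau`, and the sample scores are all assumptions made for illustration.

```python
from itertools import combinations


def kendall_tau(predicted, actual):
    """Kendall's tau-a between two equal-length lists of scores.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(predicted) != len(actual):
        raise ValueError("score lists must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two items to compare")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether both lists order the pair alike.
        sign = (predicted[i] - predicted[j]) * (actual[i] - actual[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical example: model scores vs. realized volatilities for four firms.
model_scores = [0.1, 0.4, 0.3, 0.9]
realized_volatility = [0.02, 0.03, 0.05, 0.08]
tau = kendall_tau(model_scores, realized_volatility)
```

A value of 1 means the two rankings agree on every pair of companies, and a value of -1 means they disagree on every pair.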
Notes
1. Here we split the volatilities at the 30th, 60th, 80th, and 90th percentiles, yielding per-year averages of 702, 702, 467, 234, and 234 for the five categories, respectively.
2. Note that in BERT models, words in different sentences (or documents) are associated with different representations; to reflect this, we treat words in different documents as different words.
3.
4. We also adopt RankSVM with TF-IDF features, following [24]; its results are close to those of TFIDF-Rank.
5. Due to resource limitations, we could not train a domain-specific BERT model; however, we speculate that using a domain-specific BERT would yield further improvements.
6. We omit Fasttext from this comparison because its performance in Table 1 is far from that of the other three models.
7. The firm size is defined as the logarithm of the sum of all current and long-term assets held by a company (in million dollars).
8. The stock return is the appreciation in the price plus any dividends paid, divided by the original price of the stock.
9.
10. The list will be made publicly available upon publication.
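The quantities defined in the notes above (the percentile-based volatility categories, the firm size, and the stock return) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the nearest-rank percentile method, the natural logarithm, the handling of values equal to a cut-off, and the sample data are all hypothetical.

```python
import math
from bisect import bisect_left


def percentile_cutoffs(values, percentiles=(30, 60, 80, 90)):
    """Cut-off values at the given integer percentiles (nearest-rank method, assumed)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")
    cutoffs = []
    for p in percentiles:
        # Integer ceiling of p * n / 100 avoids floating-point rounding surprises.
        rank = max(1, (p * n + 99) // 100)
        cutoffs.append(ordered[rank - 1])
    return cutoffs


def risk_category(volatility, cutoffs):
    """Category index: 0 (lowest volatility) up to len(cutoffs) (highest)."""
    # bisect_left keeps a value equal to a cut-off in the lower category.
    return bisect_left(cutoffs, volatility)


def firm_size(current_assets, long_term_assets):
    """Logarithm of total assets in million dollars (natural log assumed)."""
    return math.log(current_assets + long_term_assets)


def stock_return(start_price, end_price, dividends=0.0):
    """Price appreciation plus dividends paid, divided by the original price."""
    return (end_price - start_price + dividends) / start_price


# Hypothetical data: 100 evenly spaced volatilities.
volatilities = [i / 1000 for i in range(1, 101)]
cutoffs = percentile_cutoffs(volatilities)
counts = [0] * 5
for v in volatilities:
    counts[risk_category(v, cutoffs)] += 1
print(counts)  # [30, 30, 20, 10, 10]
```

The 30/30/20/10/10 split in this toy example has the same proportions as the per-year averages of 702, 702, 467, 234, and 234 reported in the first note.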
References
Aikman, D., et al.: Funding liquidity risk in a quantitative model of systemic stability. Cent. Bank. Anal. Econ. Policies Book Ser. 15, 371–410 (2011)
Akhtar, M.S., Kumar, A., Ghosal, D., Ekbal, A., Bhattacharyya, P.: A multilayer perceptron based ensemble technique for fine-grained financial sentiment analysis. In: Proceedings of EMNLP, pp. 540–546 (2017)
Buehlmaier, M.M., Whited, T.M.: Are financial constraints priced? Evidence from textual analysis. Rev. Financ. Stud. 31(7), 2693–2728 (2018)
Burges, C., et al.: Learning to rank using gradient descent. In: Proceedings of ICML, pp. 89–96 (2005)
Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp. 4171–4186 (2019)
Ding, X., Zhang, Y., Liu, T., Duan, J.: Deep learning for event-driven stock prediction. In: Proceedings of IJCAI, pp. 2327–2333 (2015)
Dos Santos, C., Gatti, M.: Deep convolutional neural networks for sentiment analysis of short texts. In: Proceedings of COLING, pp. 69–78 (2014)
Fama, E.F., French, K.R.: Common risk factors in the returns on stocks and bonds. J. Financ. Econ. 33(1), 3–56 (1993)
Kendall, M.G.: A new measure of rank correlation. Biometrika 30(1/2), 81–93 (1938)
Grave, E., Mikolov, T., Joulin, A., Bojanowski, P.: Bag of tricks for efficient text classification. In: Proceedings of EACL, pp. 427–431 (2017)
Hu, Z., Liu, W., Bian, J., Liu, X., Liu, T.Y.: Listening to chaotic whispers: a deep learning framework for news-oriented stock trend prediction. In: Proceedings of WSDM, pp. 261–269 (2018)
Ito, T., Lyons, R.K., Melvin, M.T.: Is there private information in the FX market? The Tokyo experiment. J. Financ. 53(3), 1111–1130 (1998)
Kogan, S., Levin, D., Routledge, B.R., Sagi, J.S., Smith, N.A.: Predicting risk from financial reports with regression. In: Proceedings of NAACL, pp. 272–280 (2009)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., et al.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Loughran, T., McDonald, B.: When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. J. Financ. 66(1), 35–65 (2011)
Luo, L., et al.: Beyond polarity: interpretable financial sentiment analysis with hierarchical query-driven attention. In: Proceedings of IJCAI, pp. 4244–4250 (2018)
Myers, J.L., Well, A., Lorch, R.F.: Research Design and Statistical Analysis, vol. 30. Lawrence Erlbaum (2003)
Nopp, C., Hanbury, A.: Detecting risks in the banking system by sentiment analysis. In: Proceedings of EMNLP, pp. 591–600 (2015)
Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Proceedings of EMNLP, pp. 1532–1543 (2014). http://www.aclweb.org/anthology/D14-1162
Rekabsaz, N., Lupu, M., Baklanov, A., Hanbury, A., Dür, A., Anderson, L.: Volatility prediction using financial disclosures sentiments with word embedding-based IR models. arXiv preprint arXiv:1702.01978 (2017)
Schumaker, R.P., Chen, H.: Textual analysis of stock market prediction using breaking financial news: the AZFin text system. ACM Trans. Inform. Syst. (TOIS) 27(2), 12 (2009)
Toma, A., Dedua, S.: Quantitative techniques for financial risk assessment: a comparative approach using different risk measures and estimation methods. Proc. Econ. Financ. 8, 712–719 (2014)
Tsai, M.F., Wang, C.J.: On the risk prediction and analysis of soft information in finance reports. Eur. J. Oper. Res. 257(1), 243–250 (2016)
Tsai, M.F., Wang, C.J., Chien, P.C.: Discovering finance keywords via continuous-space language models. ACM Trans. Manage. Inform. Syst. (TMIS) 7(3), 7 (2016)
Wiegreffe, S., Pinter, Y.: Attention is not not explanation. arXiv preprint arXiv:1908.04626 (2019)
Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., Hovy, E.: Hierarchical attention networks for document classification. In: Proceedings of NAACL, pp. 1480–1489 (2016)
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, TW., Sun, RY., Chang, HL., Wang, CJ., Tsai, MF. (2021). XRR: Explainable Risk Ranking for Financial Reports. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12978. Springer, Cham. https://doi.org/10.1007/978-3-030-86514-6_16
DOI: https://doi.org/10.1007/978-3-030-86514-6_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86513-9
Online ISBN: 978-3-030-86514-6
eBook Packages: Computer Science, Computer Science (R0)