Abstract
Accurate prediction of financial distress benefits investors and allows banks and other financial institutions to build early warning systems that limit risk contagion. This study investigated financial distress prediction using textual sentiment extracted from the annual reports of firms listed in the Chinese market. The sentiment expressed in the management discussion and analysis (MD&A) sections and in the audit reports was extracted separately using deep learning algorithms. We found that the sentiment score extracted from MD&A sections was more optimistic than that extracted from audit reports. Moreover, the experimental results demonstrated that modeling performance improved significantly when textual sentiment scores were incorporated, and that including sentiment from audit reports led to a larger incremental improvement than including sentiment from the MD&A sections. However, when both sentiment scores were included in the modeling input, the improvement in predictive accuracy was insignificant compared with the model using audit report scores only. Our study highlights the predictive power of textual information in annual reports and shows that the textual sentiment of annual reports should be applied in distress modeling. The results have implications for the use of soft information in credit risk modeling in the Chinese market, and such applications can be further explored in other areas of operational research.
Notes
The Python code is available at the following link:
https://github.com/wu-maud/textual-analysis/tree/master/MDAFDPre.
Acknowledgements
The authors gratefully acknowledge funding from the National Natural Science Foundation of China (Grant Nos. 71703162 and 71901230).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendices
Appendix A: Example sentences with sentiment labels in MD&A sections and audit reports
Segment | Sentiment | Example sentences |
---|---|---|
MD&A sections | Positive | It is planned to actively extend the intelligent government services business in Shanghai and Guangxi Province |
MD&A sections | Positive | Sales of social security card chip has experienced substantial growing in 2017 due to the policy adjustment in the short term, and more efforts should be made for the competition in the market of the third-generation social security card chip |
MD&A sections | Neutral | We will make an overall plan to connect all sections together hierarchically and horizontally |
MD&A sections | Neutral | We aim to strengthen the cash management and explore more financing opportunities |
MD&A sections | Negative | Deferred income has experienced an increase of 15,057,000 yuan due to the delay of the settlement of national scientific funding |
MD&A sections | Negative | Because the corporate has initiated the procedure of material asset restructuring since 2013, and thus high service fee is incurred |
MD&A sections | Negative | Iron ore is the main raw materials of steel products, of which the price has strong correlation with national economic structure, economic cycle and industrialisation level, and the demand of iron ore is highly dependent with importation without possession of pricing power which is also impacted by ocean freight |
Audit reports | Positive | In our opinion, the financial statements give a true and fair view of the financial position of the corporation at 31 December 2012, and the corporate operations and cash flows in 2012 in accordance with Chinese Financial Reporting Standards |
Audit reports | Neutral | We believe that the audit evidence we have obtained is sufficient and appropriate to provide a basis for our audit opinion |
Audit reports | Neutral | In making those risk assessments, the auditor considers internal control relevant to the entity’s preparation and true and fair presentation of the financial statements in order to design audit procedures that are appropriate in the circumstances, but not for the purpose of expressing an opinion on the effectiveness of the entity’s internal control |
Audit reports | Neutral | The property management incomes are the main sources of the firm profit in 2013 |
Audit reports | Negative | Until 31 December 2017, the receivables of Shanghai Putian accumulates to 748,956,944 yuan, in that its obligor (Zhejiang Dawei Co., Ltd.) has filed bankruptcy since July 2017 |
Audit reports | Negative | The corporate has filed an accusation to Liyang Industrial Development Co.,Ltd. due to the dispute in the contract of Tianmu Lake Hotel decoration |
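Sentence-level labels such as those above must be aggregated to obtain a single score per document. The sketch below shows one simple way this could be done. The aggregation rule (share of positive sentences minus share of negative sentences) and the function name `sentiment_score` are assumptions made for illustration; they are not taken from the paper and may differ from the authors' implementation.

```python
from collections import Counter


def sentiment_score(labels):
    """Net tone of a document computed from sentence-level labels.

    Assumed aggregation (hypothetical, not the paper's definition):
    (positive - negative) / total sentences, a value in [-1, 1].
    """
    counts = Counter(label.lower() for label in labels)
    total = sum(counts.values())
    if total == 0:
        # No labelled sentences: treat the document as neutral.
        return 0.0
    return (counts["positive"] - counts["negative"]) / total


# Hypothetical labels for the sentences of one document.
example_labels = ["positive", "negative", "neutral", "positive"]
example_score = sentiment_score(example_labels)
print(example_score)
```

Under this rule a document with only neutral sentences scores 0, and the score moves toward 1 or -1 as positive or negative sentences dominate.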
Appendix B: Full list of candidate variables
No. | Variables | No. | Variables |
---|---|---|---|
1 | MDA score | 28 | Current assets to turnover |
2 | Audit score | 29 | Long-term assets to turnover |
3 | Working capital to total assets | 30 | Net assets to turnover |
4 | Retained earnings to total assets | 31 | Account payable to turnover |
5 | Earnings before interest and taxes to total assets | 32 | Working capital to turnover |
6 | Market equity to total liabilities | 33 | Cash to turnover |
7 | Total sales to total assets | 34 | Operating revenue per share |
8 | Net income to total assets | 35 | Earnings per share |
9 | Total liabilities to total assets | 36 | Retention ratio |
10 | Current ratio | 37 | Rate of capital accumulation |
11 | Quick ratio | 38 | Price to earnings |
12 | Cash ratio | 39 | Dividend to price |
13 | Owner’s equity to total assets | 40 | Market value to cash flow |
14 | Fixed assets to total assets | 41 | Market value to sales |
15 | Owner’s equity to fixed assets | 42 | Net cash flow per share |
16 | Current liability to total liabilities | 43 | Net cash flow from operating activities to net profit ratio |
17 | Long-term liability to total liabilities | 44 | Net assets per share |
18 | Owner’s equity to liability | 45 | Surplus reserves per share |
19 | Debt to tangible assets ratio | 46 | Undistributed profits per share |
20 | Gross operating profit ratio | 47 | Net cash flow from operating activities per share |
21 | Net profit to operating revenue | 48 | Net cash flow from investing activities per share |
22 | Net profit to current assets | 49 | Net cash flow from financing activities per share |
23 | Net profit to fixed assets | 50 | Current assets to owner’s equity |
24 | Net profit to owner’s equity | 51 | Price to book value |
25 | Return on invested capital | 52 | Relative size |
26 | Account receivables to turnover | 53 | Excess return |
27 | Inventory to turnover | 54 | Market volatility |
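As an illustration of how a few of the ratio variables listed above (Nos. 3, 8, 9 and 10) could be derived from raw statement items, a minimal sketch follows. The input field names and the function `candidate_ratios` are hypothetical, and the formulas are the conventional textbook definitions, which may differ in detail from those used in the study.

```python
import math


def safe_div(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def candidate_ratios(fs):
    """Compute four Appendix B ratios from a dict of statement items.

    Field names in `fs` are hypothetical placeholders.
    """
    working_capital = fs["current_assets"] - fs["current_liabilities"]
    return {
        # No. 3: working capital to total assets
        "working_capital_to_total_assets": safe_div(working_capital, fs["total_assets"]),
        # No. 8: net income to total assets
        "net_income_to_total_assets": safe_div(fs["net_income"], fs["total_assets"]),
        # No. 9: total liabilities to total assets
        "total_liabilities_to_total_assets": safe_div(fs["total_liabilities"], fs["total_assets"]),
        # No. 10: current ratio = current assets / current liabilities
        "current_ratio": safe_div(fs["current_assets"], fs["current_liabilities"]),
    }


# Made-up figures for a single firm-year, for illustration only.
example_firm = {
    "current_assets": 500.0,
    "current_liabilities": 250.0,
    "total_assets": 1000.0,
    "total_liabilities": 400.0,
    "net_income": 50.0,
}
example_ratios = candidate_ratios(example_firm)
print(example_ratios)
```

Returning NaN for a zero denominator keeps the firm-year in the dataset so that missing-value handling can be decided at the modeling stage.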
About this article
Cite this article
Huang, B., Yao, X., Luo, Y. et al. Improving financial distress prediction using textual sentiment of annual reports. Ann Oper Res 330, 457–484 (2023). https://doi.org/10.1007/s10479-022-04633-3
DOI: https://doi.org/10.1007/s10479-022-04633-3