Abstract
The New Third Board (NTB) market is a market for non-publicly traded stocks in China and an essential component of the Chinese capital market. Its distinctive features are low entry barriers, high flexibility, and relatively minimal information disclosure requirements, which in turn introduce higher levels of risk. Effectively predicting the financial risks of NTB-listed companies therefore calls for a predictive model based on data mining and machine learning. This research constructs a financial risk prediction model for NTB-listed companies that integrates feature engineering with an ensemble learning model, with the aim of improving the capability and accuracy of risk warning. Fifteen predictive indicators were derived from the collected financial data of listed companies, and the F-score was used to compute the ground-truth risk values. An ensemble learning model, CatBoost, was then trained by supervised learning to assess and predict risk over different time periods. The results indicate that the framework's predictions align with professional scoring trends, and that it significantly outperforms traditional machine learning methods on mean squared error (MSE) and mean absolute error (MAE). Notably, the MAE is as low as 0.124, suggesting a high level of precision in intelligent risk prediction and offering new perspectives for the future financial risk assessment of NTB-listed companies.
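For readers unfamiliar with the two error metrics named in the abstract, the following is a minimal sketch of how MAE and MSE are conventionally computed for a regression model. The function names and the sample scores are hypothetical illustrations, not the authors' code or data, and the values are not drawn from the study.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the absolute differences between true and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between true and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ground-truth risk scores and model predictions (illustrative only).
true_scores = [3.0, 5.0, 7.0, 4.0]
predicted_scores = [3.5, 4.5, 7.0, 4.0]

mae = mean_absolute_error(true_scores, predicted_scores)
mse = mean_squared_error(true_scores, predicted_scores)
print(f"MAE = {mae}, MSE = {mse}")
```

MAE weights every error equally, while MSE penalizes large errors more heavily, which is why the two are usually reported together.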
Data Availability
The dataset can be accessed upon request.
Acknowledgements
The authors would like to thank the Key Research and Practice Project of Higher Education Teaching Reform in Henan Province in 2021 and the Research and Practice of Higher Education Teaching Reform in Henan Province in 2021 General Project for their strong support of this article.
Funding
This study was funded by the Key Research and Practice Project of Higher Education Teaching Reform in Henan Province in 2021: Research and practice of Marketing Skills Master Studio to improve the professional ability of higher vocational teachers, project number: 2021SJGLX693, and the Research and Practice of Higher Education Teaching Reform in Henan Province in 2021 General Project: Research and Practice of integration of accounting competition and teaching under the background of Vocational College Skills Competition, project number: 2021SJGLX826.
Author information
Contributions
Conceptualization and research methods: Haitao Lu; data collection and analysis: Xiaofeng Hu; investigation: Haitao Lu, Xiaofeng Hu; writing: Haitao Lu, Xiaofeng Hu.
Ethics declarations
Ethics Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
The authors declare that all authors have given informed consent.
Conflict of Interest
The authors declare no competing interests.
Additional information
This article is part of the Topical Collection on Innovation Management in Asia
About this article
Cite this article
Lu, H., Hu, X. Enhancing Financial Risk Prediction for Listed Companies: A Catboost-Based Ensemble Learning Approach. J Knowl Econ (2023). https://doi.org/10.1007/s13132-023-01601-5
DOI: https://doi.org/10.1007/s13132-023-01601-5