Abstract
Many products and services meet quality standards, are inclusive, and do not harm customers because regulations imposed on their manufacturers or providers require it. The finance sector is subject to the same kind of requirement. Governments, international agencies, and civil institutions are responsible for creating, applying, and inspecting these regulations. Regulators at every level (federal, state, and municipal) constantly demand changes from the finance sector so that it keeps pace with current needs. This paper presents the ongoing evolution of a banking compliance application in Brazil. The application classifies regulatory documents, published by more than 100 Brazilian regulators, as relevant or irrelevant to the business of each of more than 40 departments of Banco do Brasil. It uses a hybrid strategy that combines machine learning and rules, treating the task as a binary classification problem for each department. This work also presents a particular type of corpus imbalance called The Imbalance Within Class.
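To make the hybrid strategy concrete, the following is a minimal sketch of one binary relevance classifier per department, with keyword rules layered over a learned model. It is not the application described in the paper: the tiny Naive Bayes model, the rule precedence (rules override the model), the department name `credit`, the keyword sets, and the English toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for the sketch)."""
    return text.lower().split()


@dataclass
class NaiveBayes:
    """Tiny multinomial Naive Bayes; a stand-in for the ML component."""
    counts: dict = field(default_factory=lambda: {True: Counter(), False: Counter()})
    docs: Counter = field(default_factory=Counter)

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.docs[label] += 1
            self.counts[label].update(tokenize(text))
        return self

    def predict(self, text):
        vocab = set(self.counts[True]) | set(self.counts[False])
        n_docs = sum(self.docs.values())
        scores = {}
        for label in (True, False):
            total = sum(self.counts[label].values())
            # Log prior and log likelihoods, both with add-one smoothing.
            score = math.log((self.docs[label] + 1) / (n_docs + 2))
            for tok in tokenize(text):
                if tok in vocab:  # tokens never seen in training are ignored
                    score += math.log((self.counts[label][tok] + 1) / (total + len(vocab)))
            scores[label] = score
        return scores[True] > scores[False]


@dataclass
class DepartmentClassifier:
    """Hybrid classifier for one department: rules first, model as fallback."""
    model: NaiveBayes
    force_relevant: set = field(default_factory=set)    # hypothetical keyword rules
    force_irrelevant: set = field(default_factory=set)  # hypothetical keyword rules

    def is_relevant(self, text):
        tokens = set(tokenize(text))
        # Assumed precedence: exclusion rules, then inclusion rules, then the model.
        if tokens & self.force_irrelevant:
            return False
        if tokens & self.force_relevant:
            return True
        return self.model.predict(text)


# Toy training data for a hypothetical "credit" department.
credit = DepartmentClassifier(
    model=NaiveBayes().fit(
        [
            "new rules for rural credit lines",
            "credit limits for exporters",
            "municipal holiday calendar",
            "public tender for office supplies",
        ],
        [True, True, False, False],
    ),
    force_relevant={"rural"},
    force_irrelevant={"holiday"},
)

# One classifier per department; each document gets an independent yes/no answer.
classifiers = {"credit": credit}


def relevant_departments(text):
    """Return the departments for which the document is classified as relevant."""
    return [name for name, clf in classifiers.items() if clf.is_relevant(text)]
```

In this sketch each department owns its model and its rules, so adding a department means adding one entry to `classifiers` without retraining the others. Whether rules should override the model, or only adjust its output, is a design choice that the abstract does not specify.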
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
de Azevedo, R.F. et al. (2022). Banking Regulation Classification in Portuguese. In: Pinheiro, V., et al. Computational Processing of the Portuguese Language. PROPOR 2022. Lecture Notes in Computer Science, vol 13208. Springer, Cham. https://doi.org/10.1007/978-3-030-98305-5_13
DOI: https://doi.org/10.1007/978-3-030-98305-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98304-8
Online ISBN: 978-3-030-98305-5
eBook Packages: Computer Science (R0)