Abstract
Traditional invoice text classification methods are labor-intensive and inefficient. To identify invoice types effectively, a Chinese text classification model based on deep learning, BERT-TextCNN, is designed. A short-text invoice classification dataset obtained from a municipal tax bureau is used to train and test the model and to compare the performance of the BERT-TextCNN, BERT, and TextCNN models. Compared with traditional neural network models, the BERT-TextCNN model classifies Chinese text accurately, effectively prevents overfitting, and generalizes well. Its classification performance also improves on that of the BERT model and the TextCNN model used alone. The experiments show that the BERT-TextCNN model achieves good classification performance and good stability.
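The abstract gives no architectural details, so the following is only a minimal sketch of the general idea behind a TextCNN head placed on top of per-token encoder outputs: convolution filters of several widths, max-over-time pooling, and a linear output layer. Everything specific here is an assumption made for illustration and not the authors' configuration: the random vectors standing in for BERT outputs, the kernel widths (2, 3, 4), the layer sizes, and all function names. The weights are untrained, so the predicted class is meaningless; the sketch shows only the data flow.

```python
import random


def conv1d_max_pool(tokens, kernel, bias):
    """Slide one filter of width len(kernel) over the token sequence,
    apply ReLU, and keep the largest response (max-over-time pooling)."""
    width = len(kernel)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        act = bias + sum(
            w * x
            for k_row, t_row in zip(kernel, window)
            for w, x in zip(k_row, t_row)
        )
        best = max(best, act)
    return best


def textcnn_head(tokens, filters, out_weights, out_bias):
    """Concatenate the pooled feature of every filter, then apply a
    linear layer to obtain one score per class."""
    features = [conv1d_max_pool(tokens, k, b) for k, b in filters]
    return [
        b + sum(w * f for w, f in zip(row, features))
        for row, b in zip(out_weights, out_bias)
    ]


def make_filters(rng, widths, per_width, dim):
    """Random (untrained) filters: per_width filters for each kernel width."""
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(w)], 0.0)
        for w in widths
        for _ in range(per_width)
    ]


rng = random.Random(0)
dim, seq_len, num_classes = 8, 12, 4  # hypothetical sizes

# Stand-in for the encoder's per-token output vectors (one row per token).
# In a real pipeline these would come from a pretrained BERT model.
tokens = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]

filters = make_filters(rng, widths=(2, 3, 4), per_width=2, dim=dim)
out_weights = [
    [rng.uniform(-0.1, 0.1) for _ in filters] for _ in range(num_classes)
]
out_bias = [0.0] * num_classes

scores = textcnn_head(tokens, filters, out_weights, out_bias)
predicted = max(range(num_classes), key=scores.__getitem__)
print(len(filters), len(scores), predicted)
```

Using several kernel widths lets the head respond to character n-grams of different lengths, which is one plausible reason such a head might suit short invoice item names; the chapter itself would need to be consulted for the actual design.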
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, J., Li, L., Yu, B. (2024). Short Text Classification of Invoices Based on BERT-TextCNN. In: Kountchev, R., Patnaik, S., Nakamatsu, K., Kountcheva, R. (eds) Proceedings of International Conference on Artificial Intelligence and Communication Technologies (ICAICT 2023). ICAICT 2023. Smart Innovation, Systems and Technologies, vol 368. Springer, Singapore. https://doi.org/10.1007/978-981-99-6641-7_13
DOI: https://doi.org/10.1007/978-981-99-6641-7_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6640-0
Online ISBN: 978-981-99-6641-7
eBook Packages: Engineering, Engineering (R0)