Abstract
Text classification is an important topic in natural language processing. With the development of social networks, many question-and-answer pairs about health care and medicine have flooded social platforms. Mining and classifying medical text to provide targeted medical services for patients is of great social value. Existing text classification algorithms can handle semantically simple text, but Chinese medical text is harder: its structure is complex, and it contains a large amount of medical nomenclature and professional terminology that is difficult for patients to understand. We propose a Chinese medical text classification model that combines ZEN, a BERT-based Chinese text encoder enhanced by N-gram representations, with a capsule network: the ZEN model produces the text representation, and the capsule network extracts features from it. We also design an N-gram medical dictionary to enhance medical text representation and feature extraction. The experimental results show that our model improves precision, recall and F1-score by 10.25%, 11.13% and 12.29%, respectively, on average compared with the baseline models, demonstrating that our model performs better.
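The abstract does not say how the N-gram medical dictionary is applied to the input. As an illustration only, the sketch below assumes a ZEN-style lookup: every dictionary entry that occurs in the text is collected, and a character-by-n-gram matching matrix records which characters each matched n-gram covers. The function names, the two-entry lexicon and the `max_n` limit are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple


def match_ngrams(text: str, lexicon: Set[str], max_n: int = 4) -> List[Tuple[int, str]]:
    """Return (start_index, ngram) for every lexicon entry found in text."""
    matches = []
    for start in range(len(text)):
        # Try every span length from 2 up to max_n characters.
        for n in range(2, max_n + 1):
            span = text[start:start + n]
            if len(span) == n and span in lexicon:
                matches.append((start, span))
    return matches


def matching_matrix(text: str, matches: List[Tuple[int, str]]) -> List[List[int]]:
    """Rows are characters, columns are matched n-grams.

    An entry is 1 when the character lies inside the matched n-gram.
    """
    matrix = [[0] * len(matches) for _ in text]
    for col, (start, ngram) in enumerate(matches):
        for row in range(start, start + len(ngram)):
            matrix[row][col] = 1
    return matrix


# Hypothetical two-entry medical lexicon, for illustration only.
lexicon = {"高血压", "血压"}
text = "高血压患者"
matches = match_ngrams(text, lexicon)
matrix = matching_matrix(text, matches)
```

In an encoder of this kind, a matrix like `matrix` would typically be used to add n-gram representations to the character representations they overlap; how the proposed model does this is described in the full paper, not here.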
Data availability
The datasets used in our paper can be downloaded from the following links. CMDD: https://github.com/Toyhom/Chinese-medical-dialogue-data; webMedQA: https://github.com/hejunqing/webMedQA; CHIP-CTC: https://github.com/Monst1016/CHIP-CTC
Funding
This work was supported in part by the FDCT Funding Scheme for Postdoctoral Researchers of Higher Education Institutions, Macau (0003/2021/APD), the Joint Research and Development Fund of Wuyi University and Hong Kong and Macau (2019WGALH21), and the Key Scientific Research Projects of Universities in Henan Province, China (23B520005).
Author information
Contributions
SL contributed to conceptualization (lead), review and editing (lead), funding acquisition (lead) and methodology (equal). FS contributed to methodology (lead), software (lead), original draft (lead), conceptualization (equal) and review and editing (supporting). HS contributed to formal analysis (lead), supervision (equal) and review and editing (equal). TC contributed to supervision (lead), formal analysis (equal) and review and editing (equal). WD contributed to review and editing (equal), conceptualization (supporting) and original draft (supporting).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
The submitted work is original and is not published elsewhere in any form or language (partially or in full).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liang, S., Sun, F., Sun, H. et al. A medical text classification approach with ZEN and capsule network. J Supercomput 80, 4353–4377 (2024). https://doi.org/10.1007/s11227-023-05612-6
DOI: https://doi.org/10.1007/s11227-023-05612-6