Automatic ICD Coding Based on Segmented ClinicalBERT with Hierarchical Tree Structure Learning

Kang, Beichen; Wang, Xiaosu; Xiong, Yun; Zhang, Yao; Zhou, Chaofan; Zhu, Yangyong; Zhang, Jiawei; Tang, Chunlei

doi:10.1007/978-3-031-30678-5_19

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13946)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Automatic ICD coding aims to assign International Classification of Diseases (ICD) codes to clinical notes documented by clinicians. It is crucial for saving human resources and has attracted much research attention in recent years. However, clinical notes are complex, long textual narratives, and ICD codes follow a long-tailed distribution. Existing studies struggle to extract key information from the notes and to handle the many small-data learning problems on the tail codes, which makes it hard to achieve satisfactory performance. In this paper, we present a ClinicalBERT-based model for automatic ICD coding. It copes with complex long clinical narratives via a segmentation learning mechanism and takes advantage of the tree-like structure of ICD codes to transmit information among code nodes. Specifically, a novel hierarchical tree structure learning module is proposed to enable each code to utilize information from both upper and lower nodes of the tree, so that better code classifiers are learned for both head and tail codes. Experiments on the MIMIC-III dataset show that our model outperforms current state-of-the-art (SOTA) ICD coding methods.
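The abstract does not spell out how the segmentation mechanism works, so the following is only a minimal sketch of the general idea: a long note is cut into consecutive pieces short enough for a BERT-style encoder, and each piece can then be encoded separately. The function name `split_into_segments`, the 510-token limit (BERT's usual 512 positions minus two special tokens), and the absence of overlap between segments are all assumptions for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def split_into_segments(tokens: Sequence[str], max_len: int = 510) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most max_len tokens.

    Hypothetical helper: the segment length and the lack of overlap are
    illustrative assumptions, not the configuration used in the paper.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    # Step through the sequence in strides of max_len; the last chunk may be shorter.
    return [list(tokens[i:i + max_len]) for i in range(0, len(tokens), max_len)]


# Toy "clinical note" of 1200 placeholder tokens.
note = ["tok%d" % i for i in range(1200)]
segments = split_into_segments(note)
segment_lengths = [len(s) for s in segments]
```

In a full pipeline each segment would be passed through the encoder on its own and the per-segment representations combined before classification; how that combination is done is left open here.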

Acknowledgements

This work is partially supported by the National Key Research and Development Plan Project 2022YFC3600901, CNKLSTISS, and the NSF through grants IIS-1763365 and IIS-2106972.

Author information

Authors and Affiliations

Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai, China
Beichen Kang, Xiaosu Wang, Yun Xiong, Yao Zhang, Chaofan Zhou & Yangyong Zhu
IFM Lab, Department of Computer Science, University of California, Davis, CA, USA
Jiawei Zhang
Harvard Medical School, Boston, MA, USA
Chunlei Tang

Corresponding author

Correspondence to Xiaosu Wang.

Editor information

Editors and Affiliations

Tianjin University, Tianjin, China
Xin Wang
University of Torino, Turin, Italy
Maria Luisa Sapino
POSTECH, Pohang, Korea (Republic of)
Wook-Shin Han
University of California Santa Barbara, Santa Barbara, CA, USA
Amr El Abbadi
University of Auckland, Auckland, New Zealand
Gill Dobbie
Tianjin University, Tianjin, China
Zhiyong Feng
Beijing University of Posts and Telecommunications, Beijing, China
Yingxiao Shao
The University of Queensland, Brisbane, QLD, Australia
Hongzhi Yin

About this paper

Cite this paper

Kang, B. et al. (2023). Automatic ICD Coding Based on Segmented ClinicalBERT with Hierarchical Tree Structure Learning. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13946. Springer, Cham. https://doi.org/10.1007/978-3-031-30678-5_19

DOI: https://doi.org/10.1007/978-3-031-30678-5_19
Published: 14 April 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30677-8
Online ISBN: 978-3-031-30678-5
eBook Packages: Computer Science, Computer Science (R0)