
Automatic ICD Coding Based on Segmented ClinicalBERT with Hierarchical Tree Structure Learning

  • Conference paper in: Database Systems for Advanced Applications (DASFAA 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13946)


Abstract

Automatic ICD coding aims to assign International Classification of Diseases (ICD) codes to clinical notes written by clinicians; it saves considerable human effort and has attracted much research attention in recent years. However, the long, complex narratives in clinical notes and the long-tailed distribution of ICD codes pose two challenges: existing methods struggle to extract key information from the notes and must solve many small-data learning problems on the tail codes, which makes satisfactory performance hard to achieve. In this paper, we present a ClinicalBERT-based model for automatic ICD coding that copes with long clinical narratives via a segmentation learning mechanism and exploits the tree-like structure of ICD codes to transmit information among code nodes. Specifically, a novel hierarchical tree structure learning module enables each code to use information from both upper and lower nodes of the tree, so that better code classifiers are learned for both head and tail codes. Experiments on the MIMIC-III dataset show that our model outperforms current state-of-the-art (SOTA) ICD coding methods.
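The two ideas in the abstract can be sketched in code. The snippet below is a hypothetical illustration, not the authors' implementation: `segment` splits a long tokenized note into encoder-sized chunks (BERT-style encoders typically accept at most 512 tokens), and `propagate_tree` performs one round of bidirectional message passing on the ICD code tree, so each code's representation mixes in information from its parent and children. All names, sizes, and the mixing rule are illustrative assumptions.

```python
import numpy as np

MAX_LEN = 512  # typical BERT-style input limit, in tokens (assumption)

def segment(tokens, max_len=MAX_LEN, stride=MAX_LEN):
    """Split a token sequence into encoder-sized segments."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), stride)]

def propagate_tree(embeddings, parent, alpha=0.5):
    """One round of bidirectional message passing on the ICD code tree.

    Each node's new embedding mixes its own vector with the mean of its
    parent's and children's vectors, so head and tail codes share
    information. `parent` maps a child index to its parent index.
    """
    n = embeddings.shape[0]
    children = {i: [] for i in range(n)}
    for child, p in parent.items():
        children[p].append(child)
    new = embeddings.copy()  # synchronous update: read old, write new
    for i in range(n):
        neighbours = []
        if i in parent:
            neighbours.append(embeddings[parent[i]])
        neighbours.extend(embeddings[c] for c in children[i])
        if neighbours:
            new[i] = (1 - alpha) * embeddings[i] + alpha * np.mean(neighbours, axis=0)
    return new

# Example: a 1200-token note becomes three 512/512/176-token segments,
# and a 3-node code tree (root 0 with children 1 and 2) exchanges information.
segs = segment(list(range(1200)))
emb = propagate_tree(np.eye(3), parent={1: 0, 2: 0})
```

In the paper's setting the segment vectors would come from ClinicalBERT and the mixing would be learned rather than a fixed average; the sketch only shows the data flow.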



Acknowledgements

This work is partially supported by the National Key Research and Development Plan Project 2022YFC3600901, CNKLSTISS, and NSF through grants IIS-1763365, IIS-2106972.

Author information

Corresponding author: Xiaosu Wang.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kang, B. et al. (2023). Automatic ICD Coding Based on Segmented ClinicalBERT with Hierarchical Tree Structure Learning. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13946. Springer, Cham. https://doi.org/10.1007/978-3-031-30678-5_19

  • DOI: https://doi.org/10.1007/978-3-031-30678-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30677-8

  • Online ISBN: 978-3-031-30678-5

  • eBook Packages: Computer Science, Computer Science (R0)
