Domain Robust Pipeline for Medical Causal Entity and Relation Extraction Task

  • Conference paper
Health Information Processing. Evaluation Track Papers (CHIP 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1773)

Abstract

Medical entity and relation extraction is an essential task for building medical knowledge graphs, which can provide explanatory answers for medical search engines. Recently, PL-Marker, a deep-learning-based pipeline that follows the NER & ER paradigm, has been proposed. In this method, medical entities are first identified by an NER model and then combined in pairs and fed into an ER model, which learns the causal relations among them. However, this pipeline cannot handle the complex entity relationships contained in CMedCausal because of its inherent defects, such as exposure bias and a lack of relevance between entities and relationships. In this paper, we propose a novel pipeline, the Domain Robust Pipeline (DRP), which tackles these challenges by introducing noisy entities to address exposure bias, adding a KL loss to learn from samples with noisy labels, applying multitask learning to escape semantic traps, and re-targeting the relationships to increase the robustness of the pipeline.
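
The pairing step and two of the ideas named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the random-span sampling strategy, and the use of a symmetric KL term as an agreement measure between two predicted distributions are all assumptions made for illustration only.

```python
import math
import random
from itertools import permutations

# A span is (start, end) in token offsets, end exclusive (assumed convention).
Span = tuple[int, int]


def add_noisy_entities(gold, seq_len, n_noise, max_width=4, seed=0):
    """Return the gold spans plus up to n_noise random non-gold spans.

    Hypothetical helper: mixing wrong spans into the ER training input is one
    way to expose the ER model to the kind of errors an NER model makes.
    """
    rng = random.Random(seed)
    gold_set = set(gold)
    noisy = set()
    attempts = 0
    while len(noisy) < n_noise and attempts < 100 * max(n_noise, 1):
        attempts += 1
        start = rng.randrange(seq_len)
        end = min(seq_len, start + rng.randint(1, max_width))
        if (start, end) not in gold_set:
            noisy.add((start, end))
    return sorted(gold_set | noisy)


def candidate_pairs(entities):
    """Enumerate every ordered (subject, object) pair of distinct entities."""
    return list(permutations(entities, 2))


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Symmetric KL, usable as an agreement penalty between two predictions."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


if __name__ == "__main__":
    gold = [(0, 2), (5, 7)]
    entities = add_noisy_entities(gold, seq_len=10, n_noise=2)
    print(len(candidate_pairs(entities)), "candidate pairs")
    print(symmetric_kl([0.9, 0.1], [0.6, 0.4]))
```

In such a setup, pairs that involve a noisy span would presumably be labelled as having no relation, so the ER model learns to reject imperfect NER output; the agreement penalty would be added to the usual classification loss.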

Notes

  1. CHIP is an annual conference aiming to explore the mystery of life, improve the quality of health, and raise the level of medical treatment with the help of information processing technologies. http://www.cips-chip.org.cn/.

  2. We use BERT as the pre-trained language model for the entity extraction task.

  3. A PL-Marker pipeline usually consists of two serial models: an NER model and an ER model.

  4. https://synalp.gitlabpages.inria.fr/webnlg-challenge.

  5. https://open.nytimes.com/data/home.

  6. That is, the result is treated as the subject and the reason is treated as the object.

References

  1. Zihao, L., Mosha, C., Zhenxin, M., et al.: CMedCausal-Chinese medical causal relation extraction dataset. biomedrxiv. 43(12), 23–27 (2022). https://doi.org/10.12201/bmr.202211.00004

  2. Deming, Y., Yankai, L., Peng, L., et al.: Packed levitated marker for entity and relation extraction. In: Smaranda, M., Preslav, N., Aline, V. (eds.) Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, May 2022, pp. 4904–4917 (2022)

  3. Yucheng, W., Bowen, Y., Yueyang, Z., et al.: TPLinker: single-stage joint extraction of entities and relations through token pair linking. In: Donia, S., Nuria, B., Chengqing, Z. (eds.) Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, December 2020, pp. 1572–1582 (2020)

  4. Wenxuan, Z., Muhao, C.: Learning from noisy labels for entity-centric information extraction. In: Marie-Francine, M., Xuanjing, H., Lucia, S., et al. (eds.) Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, November 2021, pp. 5381–5392 (2021)

  5. Jianlin, S.: GlobalPointer: A Unified Method for Nested and Non-nested NER. https://kexue.fm/archives/8373 (2021). Accessed 10 Jan 2023

  6. Jianlin, S.: GPLinker: GlobalPointer-based Joint Extraction of Entities. https://kexue.fm/archives/8888 (2022). Accessed 10 Jan 2023

  7. Ahmed, F.G.: Evaluating Deep Learning Models: The Confusion Matrix, Accuracy, Precision, and Recall. https://blog.paperspace.com/deep-learning-metrics-precision-recall-accuracy (2021). Accessed 13 Jan 2023

  8. Diederik, P.K., Jimmy, B.: Adam: A Method for Stochastic Optimization (2017). arXiv:1412.6980

  9. Yinhan, L., Myle, O., Naman, G., et al.: RoBERTa: A Robustly Optimized BERT Pretraining Approach (2019). arXiv:1907.11692

  10. Vikas, Y., Steven, B.: A survey on recent advances in named entity recognition from deep learning models. In: Emily, M.B., Leon, D., Pierre, I. (eds.) Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, August 2018, pp. 2145–2158 (2018)

Author information

Corresponding author

Correspondence to Shengjun Yuan.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liang, T., Yuan, S., Zhou, P., Fu, H., Wu, H. (2023). Domain Robust Pipeline for Medical Causal Entity and Relation Extraction Task. In: Tang, B., et al. Health Information Processing. Evaluation Track Papers. CHIP 2022. Communications in Computer and Information Science, vol 1773. Springer, Singapore. https://doi.org/10.1007/978-981-99-4826-0_6

  • DOI: https://doi.org/10.1007/978-981-99-4826-0_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4825-3

  • Online ISBN: 978-981-99-4826-0

  • eBook Packages: Computer Science, Computer Science (R0)
