Domain Robust Pipeline for Medical Causal Entity and Relation Extraction Task

Liang, Tao; Yuan, Shengjun; Zhou, Pengfei; Fu, Hangcong; Wu, Huizhe

doi:10.1007/978-981-99-4826-0_6

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1773)

Included in the following conference series:

China Health Information Processing Conference

Abstract

Medical entity and relation extraction is an essential task for building medical knowledge graphs, which can provide explanatory answers for medical search engines. Recently, PL-Marker, a deep learning based pipeline that follows the NER & ER paradigm, has been proposed. In this method, medical entities are first identified by a NER model and then combined in pairs and fed into an ER model, which learns the causal relations among them. However, such a pipeline cannot handle the complex entity relationships in CMedCausal because of inherent defects such as exposure bias and the lack of relevance between entities and relationships. In this paper, we propose a novel pipeline, the Domain Robust Pipeline (DRP), which tackles these challenges by introducing noisy entities to mitigate exposure bias, adding a KL loss to learn from samples with noisy labels, applying multitask learning to escape semantic traps, and re-targeting the relationships to increase the robustness of the pipeline.
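To make the pipeline setting in the abstract more concrete, the following is a minimal sketch of two ideas it mentions: pairing recognized entities as input to the relation model, and mixing noisy entities into the gold entities during training so that the relation model also sees imperfect spans. The abstract does not say how the noisy entities are generated, so the boundary-shifting strategy here is an assumption, and the names `candidate_pairs`, `add_noisy_entities`, and `max_shift` are hypothetical and not taken from the paper.

```python
import random
from itertools import permutations
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def candidate_pairs(entities: List[Span]) -> List[Tuple[Span, Span]]:
    """Enumerate ordered (subject, object) pairs for a relation model."""
    return list(permutations(entities, 2))


def add_noisy_entities(
    gold: List[Span],
    text_len: int,
    n_noise: int,
    rng: random.Random,
    max_shift: int = 2,
) -> List[Span]:
    """Return the gold spans followed by up to n_noise perturbed spans.

    Assumption: noise is simulated by shifting the boundaries of a randomly
    chosen gold span by at most max_shift characters. The paper may use a
    different strategy (for example, sampling predictions of the NER model).
    """
    if not gold:
        return []
    noisy = set()
    # Bounded number of attempts so the loop always terminates.
    for _ in range(n_noise * 10):
        if len(noisy) >= n_noise:
            break
        start, end = rng.choice(gold)
        new_start = max(0, start + rng.randint(-max_shift, max_shift))
        new_end = min(text_len, end + rng.randint(-max_shift, max_shift))
        span = (new_start, new_end)
        # Keep only valid, non-empty spans that differ from every gold span.
        if new_start < new_end and span not in gold:
            noisy.add(span)
    return list(gold) + sorted(noisy)


if __name__ == "__main__":
    gold_spans = [(0, 2), (5, 8)]
    mixed = add_noisy_entities(gold_spans, text_len=20, n_noise=2,
                               rng=random.Random(0))
    print(mixed)
    print(len(candidate_pairs(mixed)))
```

Under this sketch, pairs that involve a noisy span would need a label of their own during training (for instance "no relation"); how DRP labels such pairs is not stated in the abstract.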

Notes

1. CHIP is an annual conference that aims to explore the mysteries of life, improve the quality of health, and raise the level of medical treatment with the help of information processing technologies. http://www.cips-chip.org.cn/.
2. We use BERT as the pre-trained language model for the entity extraction task.
3. A PL-Marker pipeline usually consists of two serial models: a NER model and an ER model.
4. https://synalp.gitlabpages.inria.fr/webnlg-challenge.
5. https://open.nytimes.com/data/home.
6. That is, the result is treated as the subject and the reason is treated as the object.

References

Zihao, L., Mosha, C., Zhenxin, M., et al.: CMedCausal-Chinese medical causal relation extraction dataset. biomedrxiv. 43(12), 23–27 (2022). https://doi.org/10.12201/bmr.202211.00004
Deming, Y., Yankai, L., Peng, L., et al.: Packed levitated marker for entity and relation extraction. In: Smaranda, M., Preslav, N., Aline, V. (eds.) Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, May 2022, pp. 4904–4917 (2022)
Yucheng, W., Bowen, Y., Yueyang, Z., et al.: TPLinker: single-stage joint extraction of entities and relations through token pair linking. In: Donia, S., Nuria, B., Chengqing, Z. (eds.) Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, December 2020, pp. 1572–1582 (2020)
Wenxuan, Z., Muhao, C.: Learning from noisy labels for entity-centric information extraction. In: Marie-Francine, M., Xuanjing, H., Lucia, S., et al. (eds.) Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, November 2021, pp. 5381–5392 (2021)
Jianlin, S.: GlobalPointer: A Unified Method for Nested and Non-nested Ner. https://kexue.fm/archives/8373 (2021). Accessed 10 Jan 2023
Jianlin, S.: GPLinker: GlobalPointer-based Joint Extraction of Entities. https://kexue.fm/archives/8888 (2022). Accessed 10 Jan 2023
Ahmed, F.G.: Evaluating Deep Learning Models: The Confusion Matrix, Accuracy, Precision, and Recall. https://blog.paperspace.com/deep-learning-metrics-precision-recall-accuracy (2021). Accessed 13 Jan 2023
Diederik, P.K., Jimmy, B.: Adam: A Method for Stochastic Optimization (2017). arXiv:1412.6980
Yinhan, L., Myle, O., Naman, G., et al.: RoBERTa: A Robustly Optimized BERT Pretraining Approach (2019). arXiv:1907.11692
Vikas, Y., Steven, B.: A survey on recent advances in named entity recognition from deep learning models. In: Emily, M.B., Leon, D., Pierre, I. (eds.) Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, August 2018, pp. 2145–2158 (2018)

Author information

Authors and Affiliations

Pacific Insurance Technology Co., Ltd., Shanghai, 200010, China
Tao Liang, Shengjun Yuan, Pengfei Zhou, Hangcong Fu & Huizhe Wu

Corresponding author

Correspondence to Shengjun Yuan.

Editor information

Editors and Affiliations

Harbin Institute of Technology, Shenzhen, China
Buzhou Tang
Harbin Institute of Technology, Shenzhen, China
Qingcai Chen
Dalian University of Technology, Dalian, China
Hongfei Lin
Zhejiang University, Hangzhou, Zhejiang, China
Fei Wu
Fudan University, Shanghai, China
Lei Liu
South China Normal University, Guangzhou, China
Tianyong Hao
University of Pittsburgh, Pittsburgh, PA, USA
Yanshan Wang
The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Haitian Wang
Medical Informatics Center of Peking University, Beijing, China
Jianbo Lei
Takeda Co. Ltd., Shanghai, China
Zuofeng Li
West China Hospital, Chengdu, China
Hui Zong

Cite this paper

Liang, T., Yuan, S., Zhou, P., Fu, H., Wu, H. (2023). Domain Robust Pipeline for Medical Causal Entity and Relation Extraction Task. In: Tang, B., et al. Health Information Processing. Evaluation Track Papers. CHIP 2022. Communications in Computer and Information Science, vol 1773. Springer, Singapore. https://doi.org/10.1007/978-981-99-4826-0_6

DOI: https://doi.org/10.1007/978-981-99-4826-0_6
Published: 22 July 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4825-3
Online ISBN: 978-981-99-4826-0
eBook Packages: Computer Science, Computer Science (R0)
