
Extracting Decision Trees from Medical Texts: An Overview of the Text2DT Track in CHIP2022

  • Conference paper

Health Information Processing. Evaluation Track Papers (CHIP 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1773)

Abstract

This paper presents an overview of the Text2DT shared task\(^{1}\) held as part of the CHIP 2022 shared tasks. The task addresses the challenging problem of automatically extracting medical decision trees from unstructured medical texts such as medical guidelines and textbooks. Many teams from both industry and academia participated, and the top teams achieved strong test results. This paper describes the task, the dataset, the evaluation metrics, and the top-performing systems, and summarizes the techniques and results of the various approaches explored by the participating teams.\(^{1}\)(http://cips-chip.org.cn/2022/eval3)
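To make the extraction target concrete, the following is a minimal sketch of a medical decision tree (MDT) data structure in Python. It assumes a hypothetical node layout in which each node holds extracted (subject, relation, object) triples combined by a logical connective, with yes/no branches for conditions; the field names and triple format are our own illustrative assumptions, not the shared task's official schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed triple format for this sketch: (subject, relation, object)
Triple = Tuple[str, str, str]

@dataclass
class MDTNode:
    """One node of a (binary) medical decision tree."""
    role: str                                # "condition" or "decision"
    triples: List[Triple] = field(default_factory=list)
    logical_rel: str = "null"                # how triples combine: "and" / "or" / "null"
    yes_branch: Optional["MDTNode"] = None   # followed when the condition holds
    no_branch: Optional["MDTNode"] = None    # followed otherwise

def depth(node: Optional["MDTNode"]) -> int:
    """Height of the decision tree rooted at `node`."""
    if node is None:
        return 0
    return 1 + max(depth(node.yes_branch), depth(node.no_branch))

# Tiny example tree: one condition node with two decision leaves.
leaf_yes = MDTNode("decision", [("patient", "use", "insulin")])
leaf_no = MDTNode("decision")
root = MDTNode("condition", [("patient", "has", "type-2 diabetes")],
               "null", leaf_yes, leaf_no)
```

A system for this task would output such a tree given a guideline passage; the sketch above only fixes a plausible target representation for discussion.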


Notes

  1. https://huggingface.co/hfl/chinese-roberta-wwm-ext.

  2. https://github.com/ZhuiyiTechnology/roformer-v2.

  3. https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/ernie-health.



Acknowledgements

This work was supported by NSFC grants (Nos. 61972155 and 62136002), the National Key R&D Program of China (No. 2021YFC3340700), and the Shanghai Trusted Industry Internet Software Collaborative Innovation Center.

Author information

Corresponding author: Xiaoling Wang.

A Manual Evaluation of Annotated MDTs

The details of our manual evaluation of the annotated medical decision trees (MDTs) are as follows:

  1. We observed the subjects' performance on medical decision problems of similar difficulty when given either medical texts or MDTs. Specifically, each subject answered three sets of medical decision questions; for each set, we provided the texts or decision trees containing the medical knowledge needed to answer the questions. We recorded their accuracy and the time they spent answering. Each set of questions was randomly selected from the question pool and was guaranteed to be of similar difficulty.

  2. We invited subjects to rate the medical texts and MDTs in terms of readability, completeness, and helpfulness. Specifically, we randomly selected five medical texts and the MDTs expressing the same knowledge. We asked subjects to score each from 0 to 3 on whether it was clear and easy to understand (readability), comprehensive and detailed (completeness), and helpful for understanding or studying medical knowledge (helpfulness).

Table 6. Results of the manual evaluation of annotated MDTs. The first group of results is for subjects without a medical background; the second is for medical practitioners. A is the average accuracy on the medical decision questions; T is the average time in seconds spent answering them; R, C, and H are the average readability, completeness, and helpfulness scores.

The results of the manual evaluation are shown in Table 6. We can draw the following conclusions:

For subjects without a medical background, the medical decision tree helped them make more correct decisions in less time than the medical text did, and it received the highest scores for readability, completeness, and helpfulness. In principle, the completeness of the medical text should exceed that of the medical decision tree; however, because of the text's poor readability, subjects may not have fully absorbed the knowledge it contained.

For medical practitioners, the medical decision tree group achieved the same accuracy on the medical decision questions as the medical text group, but in less time. The medical decision trees received the highest readability and helpfulness scores, with slightly lower completeness scores than the medical texts. These results demonstrate that medical decision trees can help people make treatment decisions faster and more accurately, and that they model medical decision knowledge clearly and intuitively, helping readers better understand it.
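The aggregation behind the A, T, R, C, and H columns of Table 6 can be sketched as follows. This is an illustrative reconstruction, not the authors' evaluation code; the per-subject record layout (`correct`, `seconds`, and the three 0-3 rating fields) is our own assumption.

```python
from statistics import mean

def summarize(records):
    """Aggregate per-subject evaluation records into Table-6-style averages.

    Each record is a dict with:
      'correct'      -- 1 if the decision question was answered correctly, else 0
      'seconds'      -- time spent answering the question
      'readability', 'completeness', 'helpfulness' -- ratings on a 0-3 scale
    """
    return {
        "A": mean(r["correct"] for r in records),       # average accuracy
        "T": mean(r["seconds"] for r in records),       # average answer time (s)
        "R": mean(r["readability"] for r in records),
        "C": mean(r["completeness"] for r in records),
        "H": mean(r["helpfulness"] for r in records),
    }

# Example: two hypothetical subject records for one group.
records = [
    {"correct": 1, "seconds": 30, "readability": 3, "completeness": 2, "helpfulness": 3},
    {"correct": 0, "seconds": 50, "readability": 2, "completeness": 2, "helpfulness": 2},
]
summary = summarize(records)
```

One `summarize` call per group (texts vs. MDTs, lay subjects vs. practitioners) would yield the four rows of Table 6.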

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Zhu, W. et al. (2023). Extracting Decision Trees from Medical Texts: An Overview of the Text2DT Track in CHIP2022. In: Tang, B., et al. Health Information Processing. Evaluation Track Papers. CHIP 2022. Communications in Computer and Information Science, vol 1773. Springer, Singapore. https://doi.org/10.1007/978-981-99-4826-0_9


  • DOI: https://doi.org/10.1007/978-981-99-4826-0_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4825-3

  • Online ISBN: 978-981-99-4826-0

  • eBook Packages: Computer Science, Computer Science (R0)
