Comparing NLP Systems to Extract Entities of Eligibility Criteria in Dietary Supplements Clinical Trials Using NLP-ADAPT

Bompelli, Anusha; Silverman, Greg; Finzel, Raymond; Vasilakes, Jake; Knoll, Benjamin; Pakhomov, Serguei; Zhang, Rui

doi:10.1007/978-3-030-59137-3_7

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12299)

Included in the following conference series:

International Conference on Artificial Intelligence in Medicine

Abstract

Natural Language Processing (NLP) techniques have been used extensively to extract concepts from unstructured clinical trial eligibility criteria. Recruiting patients whose Electronic Health Record information matches clinical trial eligibility criteria can potentially facilitate and accelerate the recruitment process. A significant obstacle, however, is identifying an efficient Named Entity Recognition (NER) system to parse the eligibility criteria. In this study, we used NLP-ADAPT (Artifact Discovery and Preparation Toolkit) to compare existing biomedical NLP systems (BioMedICUS, CLAMP, cTAKES, and MetaMap) and their Boolean ensemble in identifying entities in the eligibility criteria of 150 randomly selected Dietary Supplement (DS) clinical trials. We created a custom mapping of the gold standard annotated entities to UMLS semantic types to align with the annotations from each system. All systems in NLP-ADAPT used their default pipelines to extract entities based on our custom mappings. The systems performed reasonably well in extracting UMLS concepts belonging to the semantic groups Disorders and Chemicals and Drugs. Among the individual systems, cTAKES performed best for the Chemicals and Drugs and Disorders semantic groups, and BioMedICUS performed best for Procedures, Living Beings, Concepts and Ideas, and Devices. The Boolean ensemble, however, outperformed the individual systems. This study sets a baseline that can potentially be improved with modifications to the NLP-ADAPT pipeline.
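To make the idea of a Boolean ensemble concrete, the following is a minimal sketch rather than the NLP-ADAPT implementation. It assumes that each system's output has already been reduced to a set of character-offset spans for one semantic group, and that a prediction counts as correct only on an exact span match; the abstract does not state the matching criterion, so this is an assumption. The spans, the `prf` helper, and the particular Boolean expressions are hypothetical and chosen only for illustration; they are not results from the study.

```python
from typing import Dict, Set, Tuple

Span = Tuple[int, int]  # (begin, end) character offsets; exact match is assumed


def prf(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted spans against gold spans."""
    tp = len(predicted & gold)  # true positives: spans found in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy annotations for a single semantic group (not study data).
gold: Set[Span] = {(0, 8), (15, 24), (30, 41)}
systems: Dict[str, Set[Span]] = {
    "biomedicus": {(0, 8), (30, 41), (50, 55)},
    "ctakes": {(0, 8), (15, 24)},
    "metamap": {(15, 24), (30, 41), (60, 66)},
}

# A Boolean ensemble combines system outputs with OR (set union)
# and AND (set intersection).
ensembles: Dict[str, Set[Span]] = {
    "biomedicus | ctakes | metamap": (
        systems["biomedicus"] | systems["ctakes"] | systems["metamap"]
    ),
    "(biomedicus & metamap) | ctakes": (
        (systems["biomedicus"] & systems["metamap"]) | systems["ctakes"]
    ),
}

for name, spans in {**systems, **ensembles}.items():
    p, r, f = prf(spans, gold)
    print(f"{name:35s} P={p:.2f} R={r:.2f} F1={f:.2f}")
```

In this contrived example, the union raises recall at the cost of precision, while intersecting two systems before taking the union with a third filters out spans that only one of the two proposed. Which expression works best is an empirical question that depends on the data and the semantic group.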

A. Bompelli and G. Silverman—Equal-contribution first authors

Notes

1. https://z.umn.edu/annotation_guidelines

Acknowledgements

This work was partially supported by the NIH’s National Center for Complementary and Integrative Health and the Office of Dietary Supplements under grant number R01AT009457 (Zhang), and by the National Center for Advancing Translational Sciences under grant numbers UL1TR002494 and U01TR002062.

Author information

Authors and Affiliations

Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
Anusha Bompelli, Jake Vasilakes, Benjamin Knoll & Rui Zhang
Department of Surgery, University of Minnesota, Minneapolis, MN, USA
Greg Silverman
Department of Pharmaceutical Care and Health Systems, University of Minnesota, Minneapolis, MN, USA
Raymond Finzel, Jake Vasilakes, Serguei Pakhomov & Rui Zhang

Corresponding author

Correspondence to Rui Zhang.

Editor information

Editors and Affiliations

School of Nursing, University of Minnesota, Minneapolis, MN, USA
Martin Michalowski
Ben-Gurion University of the Negev, Tonawanda, NY, USA
Robert Moskovitch

About this paper

Cite this paper

Bompelli, A. et al. (2020). Comparing NLP Systems to Extract Entities of Eligibility Criteria in Dietary Supplements Clinical Trials Using NLP-ADAPT. In: Michalowski, M., Moskovitch, R. (eds) Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science, vol 12299. Springer, Cham. https://doi.org/10.1007/978-3-030-59137-3_7

DOI: https://doi.org/10.1007/978-3-030-59137-3_7
Published: 26 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59136-6
Online ISBN: 978-3-030-59137-3
eBook Packages: Computer Science, Computer Science (R0)
