Generating Positive Psychosis Symptom Keywords from Electronic Health Records

Viani, Natalia; Patel, Rashmi; Stewart, Robert; Velupillai, Sumithra

doi:10.1007/978-3-030-21642-9_38

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11526)

Included in the following conference series:

Conference on Artificial Intelligence in Medicine in Europe

Abstract

The development of Natural Language Processing (NLP) solutions for information extraction from electronic health records (EHRs) has grown in recent years, as most clinically relevant information in EHRs is documented only in free text. One of the core tasks for any NLP system is to extract clinically relevant concepts such as symptoms. This information can then be used for more complex problems such as determining symptom onset, which requires temporal information. In the mental health domain, comprehensive vocabularies for specific disorders are scarce, and rarely contain keywords that reflect real-world terminology use. We explore the use of embedding techniques to automatically generate lexical variants of psychosis symptoms and compile them into vocabularies that can be used in complex downstream NLP tasks. We study the impact of the underlying text material on generating useful lexical entries, experimenting with different corpora and with unigram/bigram models. We also propose a method to automatically compute thresholds for choosing the most relevant terms. Our main contribution is a systematic study of unsupervised vocabulary generation using different corpora for an understudied clinical use case. The resulting lexicons are publicly available.
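As a rough illustration of the idea in the abstract, the sketch below ranks candidate terms by cosine similarity to a seed symptom term and then picks a similarity cutoff automatically. Everything in it is an assumption made for illustration: the three-dimensional toy vectors are invented, the function names are hypothetical, and the "largest gap" cutoff is a generic heuristic standing in for the authors' threshold method, which the abstract does not describe. In practice the vectors would come from an embedding model trained on clinical text.

```python
import math

# Invented toy vectors, for illustration only. Real vectors would come
# from an embedding model trained on a clinical corpus.
TOY_VECTORS = {
    "hallucination": [0.90, 0.10, 0.05],
    "hallucinations": [0.88, 0.12, 0.06],
    "halucinations": [0.85, 0.15, 0.08],  # misspelling seen in free text
    "voices": [0.80, 0.25, 0.10],
    "appointment": [0.10, 0.90, 0.20],
    "medication": [0.15, 0.30, 0.90],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranked_neighbours(seed, vectors):
    """Return (term, similarity) pairs for all non-seed terms, best first."""
    seed_vec = vectors[seed]
    scored = [
        (term, cosine(seed_vec, vec))
        for term, vec in vectors.items()
        if term != seed
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def largest_gap_threshold(ranked):
    """Hypothetical heuristic: cut at the midpoint of the largest drop
    between consecutive similarities in the ranked list."""
    sims = [sim for _, sim in ranked]
    if len(sims) < 2:
        raise ValueError("need at least two candidates to find a gap")
    gaps = [sims[i] - sims[i + 1] for i in range(len(sims) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    return (sims[cut] + sims[cut + 1]) / 2


def expand(seed, vectors):
    """Return the candidate terms whose similarity clears the cutoff."""
    ranked = ranked_neighbours(seed, vectors)
    threshold = largest_gap_threshold(ranked)
    return [term for term, sim in ranked if sim >= threshold]


if __name__ == "__main__":
    print(expand("hallucination", TOY_VECTORS))
```

On these toy vectors the seed's plural, its misspelling, and "voices" cluster well above the unrelated terms, so the cutoff falls in the wide gap between the two groups. A bigram model would be handled the same way, with multi-word phrases as additional keys in the vector table.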

RS, RP and SV are part-funded by the NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. RP has received support from a Medical Research Council (MRC) Health Data Research UK Fellowship (MR/S003118/1) and a Starter Grant for Clinical Lecturers (SGL015/1020) supported by the Academy of Medical Sciences, The Wellcome Trust, MRC, British Heart Foundation, Arthritis Research UK, the Royal College of Physicians and Diabetes UK. NV and SV have received support from the Swedish Research Council (2015-00359), Marie Skłodowska-Curie Actions, Cofund, Project INCA 600398.

Notes

1. Ethical approval for secondary analysis: Oxford REC C, reference 18/SC/0372.
2. From: https://pypi.org/project/gensim/. Implementation details (preprocessing, parameters) available at: https://github.com/medesto/psychosis-symptom-keywords.

Author information

Authors and Affiliations

IoPPN, King’s College London, London, UK
Natalia Viani, Rashmi Patel, Robert Stewart & Sumithra Velupillai
South London and Maudsley NHS Foundation Trust, London, UK
Rashmi Patel & Robert Stewart

Corresponding author

Correspondence to Natalia Viani.

Editor information

Editors and Affiliations

Universitat Rovira i Virgili, Tarragona, Spain
David Riaño
Poznan University of Technology, Poznan, Poland
Szymon Wilk
VU Amsterdam, Amsterdam, The Netherlands
Annette ten Teije

About this paper

Cite this paper

Viani, N., Patel, R., Stewart, R., Velupillai, S. (2019). Generating Positive Psychosis Symptom Keywords from Electronic Health Records. In: Riaño, D., Wilk, S., ten Teije, A. (eds) Artificial Intelligence in Medicine. AIME 2019. Lecture Notes in Computer Science, vol 11526. Springer, Cham. https://doi.org/10.1007/978-3-030-21642-9_38

DOI: https://doi.org/10.1007/978-3-030-21642-9_38
Published: 30 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21641-2
Online ISBN: 978-3-030-21642-9
eBook Packages: Computer Science, Computer Science (R0)
