Medical Entity Recognition and Negation Extraction: Assessment of NegEx on Health Records in Spanish
This work focuses on biomedical text mining, and specifically on improving negation detection for biomedical entities in Electronic Health Records (EHRs), where detecting non-negated entities is as important as identifying negated ones. For instance, treating a negated entity as factual can lead to diagnostic errors in decision support systems.
Negated entity recognition comprises two tasks: (1) entity recognition and (2) classification of each entity as negated or not. Both rule-based and machine-learning techniques have been used in the literature to identify negations. This paper presents an adaptation of the rule-based system NegEx, which uses exact matching for the aforementioned tasks.
Our contribution consists in assessing these two tasks and exploring alternatives for each of them, so that negation detection improves when the entity recognition step detects more entities correctly.
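The two-task pipeline described above can be illustrated with a minimal NegEx-style sketch. It is not the adapted system evaluated in this work: the lexicon, the Spanish trigger and termination terms, and the five-token scope window are all hypothetical values chosen for illustration. Entities are found by exact matching against the lexicon, and an entity is marked as negated when it starts inside the scope of a preceding negation trigger.

```python
import re

# Hypothetical mini-lexicon and trigger lists, for illustration only;
# they are NOT the resources used in the paper.
LEXICON = {"fiebre", "dolor torácico", "tos"}
PRE_NEG_TRIGGERS = {"no", "sin", "niega"}
TERMINATORS = {"pero", "aunque"}
WINDOW = 5  # assumed scope length, in tokens, after a trigger


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def find_entities(tokens):
    """Task 1: exact-match lexicon lookup, preferring the longest match."""
    max_len = max(len(term.split()) for term in LEXICON)
    spans = []
    i = 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            if i + n <= len(tokens) and " ".join(tokens[i:i + n]) in LEXICON:
                spans.append((i, i + n, " ".join(tokens[i:i + n])))
                i += n
                break
        else:
            i += 1  # no lexicon term starts at this token
    return spans


def negated_positions(tokens):
    """Token indexes that fall inside the scope of a negation trigger."""
    negated = set()
    for i, tok in enumerate(tokens):
        if tok in PRE_NEG_TRIGGERS:
            for j in range(i + 1, min(i + 1 + WINDOW, len(tokens))):
                if tokens[j] in TERMINATORS:
                    break  # a termination term closes the scope
                negated.add(j)
    return negated


def detect(text):
    """Task 2: label each recognized entity as negated (True) or not."""
    tokens = tokenize(text)
    negated = negated_positions(tokens)
    return [(term, start in negated) for start, _end, term in find_entities(tokens)]


print(detect("Paciente sin fiebre pero con tos."))
print(detect("No refiere dolor torácico."))
```

In this sketch an entity that the lexicon lookup misses is never passed to the negation step, which is one way to see why negation detection depends on the quality of entity recognition.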
The evaluation was carried out on a real set of 75 EHRs written in Spanish, obtaining an F-measure of 76.2 for entity recognition and 73.8 for negation detection.
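For readers unfamiliar with the metric, the following sketch shows how an F-measure is commonly computed from gold and predicted annotations. It assumes the balanced F1 variant and exact matching of annotations, neither of which is stated in this abstract; the function name and the sample annotations are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Precision, recall and balanced F-measure (F1) over two annotation sets."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations as (sentence id, entity) pairs.
gold = {(0, "fiebre"), (1, "tos"), (2, "disnea")}
predicted = {(0, "fiebre"), (1, "tos"), (3, "náuseas")}
p, r, f1 = precision_recall_f1(gold, predicted)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

Scores such as those reported above are this kind of value expressed as a percentage.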
Keywords: Negation detection, Electronic health records, Text mining, Spanish
The authors would like to thank the personnel of Pharmacy and Pharmacovigilance services of the Galdakao-Usansolo Hospital. This work was partially funded by the Spanish Ministry of Science and Innovation (EXTRECM: TIN2013-46616-C2-1-R, TADEEP: TIN2015-70214-P) and the Basque Government (DETEAMI: Ministry of Health 2014111003, Predoctoral Grant: PRE 2015 1 0211).