Abstract
Automatic spelling correction is one of the most important problems in natural language processing. It is harder in medical corpora because of the intrinsic particularities of these texts, which include specific terminology, abbreviations, acronyms, and frequent writing errors. In this article we present a systematic review of the literature on automatic spelling detection and correction in the medical domain. Although many works address automatic detection and correction, no review has examined the correction process in the medical domain in depth. We therefore synthesize the existing information on this research topic and the types of studies carried out to date. We present the main techniques and resources, followed by the limitations and specific challenges. The results reflect the importance of compiling an exhaustive dictionary. They also show the common use of distance algorithms based on orthographic and phonetic similarity, together with statistical techniques. The performance improvement of recent years is especially notable and is attributable to context-based methods such as language models and neural embeddings.
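To make the dictionary-plus-distance approach mentioned in the abstract concrete, the following is a minimal sketch of edit-distance candidate ranking against a lexicon. It is an illustration only and not the method of any reviewed study: the function names `osa_distance` and `suggest`, the `max_distance` threshold, and the five-word `LEXICON` are all assumptions made for this example. The distance used is the optimal string alignment variant of Damerau-Levenshtein, which counts an adjacent transposition as a single edit.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions and adjacent
    transpositions, each with cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "te" <-> "et".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def suggest(word: str, lexicon: set[str], max_distance: int = 2) -> list[str]:
    """Return lexicon entries within max_distance of word, closest first.

    A word already in the lexicon is treated as correctly spelled.
    """
    word = word.lower()
    if word in lexicon:
        return [word]
    scored = sorted((osa_distance(word, term), term) for term in lexicon)
    return [term for dist, term in scored if dist <= max_distance]


# Hypothetical toy lexicon; a real system would load a medical dictionary.
LEXICON = {"diabetes", "diagnosis", "dialysis", "hypertension", "ibuprofen"}

if __name__ == "__main__":
    print(suggest("diabtees", LEXICON))
```

A ranking of this kind uses no context: it cannot choose between two equally distant candidates, and it cannot detect a misspelling that happens to be a valid word. Those are the cases where the context-based methods named in the abstract would be expected to help.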
Notes
1. Official website: https://www.mendeley.com/.
Acknowledgments
This research was funded by the Spanish National Research Agency (AEI) and the European Regional Development Fund (FEDER/ERDF) through project KBS4FIA (TIN2016-76323-R). This research is also funded by the Ministry of Education of Spain through the National Program for University Teacher Training (FPU/Ayudas para la formación de profesorado universitario).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
López-Hernández, J., Almela, Á., Valencia-García, R. (2019). Automatic Spelling Detection and Correction in the Medical Domain: A Systematic Literature Review. In: Valencia-García, R., Alcaraz-Mármol, G., Del Cioppo-Morstadt, J., Vera-Lucio, N., Bucaram-Leverone, M. (eds) Technologies and Innovation. CITI 2019. Communications in Computer and Information Science, vol 1124. Springer, Cham. https://doi.org/10.1007/978-3-030-34989-9_8
DOI: https://doi.org/10.1007/978-3-030-34989-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34988-2
Online ISBN: 978-3-030-34989-9
eBook Packages: Computer Science, Computer Science (R0)