Automatic Spelling Detection and Correction in the Medical Domain: A Systematic Literature Review

  • Jésica López-Hernández
  • Ángela Almela
  • Rafael Valencia-García
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1124)

Abstract

Automatic spelling correction is one of the most important problems in natural language processing. Its difficulty increases in medical corpora because of the intrinsic particularities of these texts, which include specific terminology, abbreviations, acronyms and writing errors. In this article we present a systematic review of the literature on automatic spelling detection and correction in the medical domain. Many works address automatic detection and correction, but no review has examined the correction process in the medical domain in depth. We therefore aim to synthesize the existing information on this research topic and the types of studies carried out to date. We present the main techniques and resources, together with the limitations and specific challenges. The results reflect the importance of compiling an exhaustive dictionary. They also show the common use of spelling (edit) distance and phonetic similarity algorithms, as well as statistical techniques. The improvement in performance in recent years is especially notable and is attributable to context-based methods such as language models and neural embeddings.

Keywords

Literature review, Automatic spelling detection, Automatic spelling correction, Medical domain, Spelling errors, Misspellings

Notes

Acknowledgments

This research was funded by the Spanish National Research Agency (AEI) and the European Regional Development Fund (FEDER/ERDF) through project KBS4FIA (TIN2016-76323-R). This research is also funded by the Ministry of Education of Spain through the National Program for University Teacher Training (FPU/Ayudas para la formación de profesorado universitario).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jésica López-Hernández (1)
  • Ángela Almela (2)
  • Rafael Valencia-García (1)
  1. Facultad de Informática, Universidad de Murcia, Murcia, Spain
  2. Facultad de Letras, Universidad de Murcia, Murcia, Spain
