Automatic Spelling Detection and Correction in the Medical Domain: A Systematic Literature Review

  • Conference paper
Technologies and Innovation (CITI 2019)

Abstract

Automatic spelling correction is one of the most important problems in natural language processing. Its difficulty increases in medical corpora because of the intrinsic particularities of these texts, which include specific terminology, abbreviations, acronyms and writing errors. In this article we present a systematic review of the literature on automatic spelling detection and correction for the medical domain. Many studies address automatic detection and correction, but no review examines in depth the process of automatic correction in the medical domain. We therefore aim to synthesize the existing information on this research topic and the types of studies carried out to date. We present the main techniques and resources, followed by the limitations and specific challenges. The results reflect the importance of compiling an exhaustive dictionary. They also show the widespread use of orthographic (edit) distance and phonetic similarity algorithms, as well as statistical techniques. Performance has improved notably in recent years, owing to the use of context-based methods such as language models or neural embeddings.
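To make the dictionary-plus-distance approach mentioned above concrete, the following is a minimal Python sketch, not the method of any reviewed study. It ranks dictionary entries by optimal string alignment distance (a restricted form of Damerau-Levenshtein distance that counts insertions, deletions, substitutions and adjacent transpositions). The names `osa_distance`, `suggest` and `LEXICON`, the five-term lexicon and the distance threshold of 2 are all assumptions chosen for illustration.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Counts insertions, deletions, substitutions and transpositions
    of two adjacent characters, each with cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "te" <-> "et".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Hypothetical toy lexicon; a real system would draw on a large
# medical vocabulary, as the abstract stresses.
LEXICON = {"diabetes", "diabetic", "dialysis", "hypertension", "hypotension"}


def suggest(word: str, lexicon: set[str], max_distance: int = 2) -> list[str]:
    """Return lexicon terms within max_distance of word, closest first."""
    word = word.lower()
    if word in lexicon:
        return [word]  # known word: treated as correctly spelled
    scored = sorted((osa_distance(word, term), term) for term in lexicon)
    return [term for dist, term in scored if dist <= max_distance]


print(suggest("diabtees", LEXICON))
print(suggest("hypetension", LEXICON))
```

In this sketch, "hypetension" lies at distance 1 from both "hypertension" and "hypotension", so distance alone cannot choose between them. Ties of this kind are one reason context-based methods are useful on top of distance-based candidate generation.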

Notes

  1. Official website: https://www.mendeley.com/.

Acknowledgments

This research was funded by the Spanish National Research Agency (AEI) and the European Regional Development Fund (FEDER/ERDF) through project KBS4FIA (TIN2016-76323-R). This research is also funded by the Ministry of Education of Spain through the National Program for University Teacher Training (FPU/Ayudas para la formación de profesorado universitario).

Author information

Corresponding author

Correspondence to Jésica López-Hernández.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

López-Hernández, J., Almela, Á., Valencia-García, R. (2019). Automatic Spelling Detection and Correction in the Medical Domain: A Systematic Literature Review. In: Valencia-García, R., Alcaraz-Mármol, G., Del Cioppo-Morstadt, J., Vera-Lucio, N., Bucaram-Leverone, M. (eds) Technologies and Innovation. CITI 2019. Communications in Computer and Information Science, vol 1124. Springer, Cham. https://doi.org/10.1007/978-3-030-34989-9_8

  • DOI: https://doi.org/10.1007/978-3-030-34989-9_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34988-2

  • Online ISBN: 978-3-030-34989-9

  • eBook Packages: Computer Science, Computer Science (R0)
