
Context-Aware Correction of Spelling Errors in Hungarian Medical Documents

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7978)

Abstract

In this paper, we present a method for the automated correction of spelling errors in Hungarian clinical records. We model spelling correction as a translation task in which the source language is the erroneous text and the target language is the corrected text, and we use a statistical machine translation (SMT) decoder to perform the correction. Since no orthographically correct, proofread text is available from this domain, we cannot train the system on such a corpus; instead, a system that generates and ranks spelling correction candidates is used to create the translation models. In addition, a language model is used to model lexical context. We show that our system outperforms the first-candidate accuracy of the baseline ranking system.
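
The idea described in the abstract, ranked correction candidates acting as a translation model and a language model supplying lexical context, can be illustrated with a minimal sketch. This is not the authors' system: the abstract states that an SMT decoder is used, whereas the sketch substitutes a small Viterbi search over a bigram model. The candidate lists, scores, bigram probabilities, and the names `CANDIDATES`, `BIGRAMS`, `first_candidate`, and `decode` are all hypothetical, and an English toy sentence stands in for Hungarian clinical text.

```python
import math

# Hypothetical candidate lists with scores, standing in for the output of a
# candidate generation and ranking system (the "translation model").
CANDIDATES = {
    "sufers": [("suffers", 0.9), ("surfers", 0.1)],
    "form": [("form", 0.6), ("from", 0.4)],
}

# Hypothetical bigram probabilities (the "language model").
BIGRAMS = {
    ("<s>", "patient"): 0.1,
    ("patient", "suffers"): 0.3,
    ("suffers", "from"): 0.5,
    ("from", "asthma"): 0.2,
}
UNSEEN = 1e-4  # assumed floor probability for bigrams missing from the table


def bigram_logprob(prev, word):
    """Log probability of `word` following `prev` under the toy bigram model."""
    return math.log(BIGRAMS.get((prev, word), UNSEEN))


def options_for(token):
    """Candidates for a token; tokens without candidates map to themselves."""
    return CANDIDATES.get(token, [(token, 1.0)])


def first_candidate(tokens):
    """Baseline: take the top-ranked candidate for each token, ignoring context."""
    return [max(options_for(t), key=lambda c: c[1])[0] for t in tokens]


def decode(tokens, lm_weight=1.0):
    """Viterbi search combining candidate scores with bigram context."""
    # best maps the last chosen word to (log score, path ending in that word).
    best = {"<s>": (0.0, [])}
    for token in tokens:
        new_best = {}
        for cand, prob in options_for(token):
            new_best[cand] = max(
                (
                    (
                        score + math.log(prob) + lm_weight * bigram_logprob(prev, cand),
                        path + [cand],
                    )
                    for prev, (score, path) in best.items()
                ),
                key=lambda item: item[0],
            )
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


sentence = ["patient", "sufers", "form", "asthma"]
print(first_candidate(sentence))  # context-free baseline
print(decode(sentence))           # context-aware correction
```

With these toy numbers the baseline keeps "form", because it is the top-ranked candidate in isolation, while the context-aware search prefers "from", because the bigram "suffers from" outweighs the lower candidate score. This mirrors, in miniature, the comparison against first-candidate accuracy that the abstract reports.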

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Siklósi, B., Novák, A., Prószéky, G. (2013). Context-Aware Correction of Spelling Errors in Hungarian Medical Documents. In: Dediu, AH., Martín-Vide, C., Mitkov, R., Truthe, B. (eds) Statistical Language and Speech Processing. SLSP 2013. Lecture Notes in Computer Science, vol 7978. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39593-2_22

  • DOI: https://doi.org/10.1007/978-3-642-39593-2_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39592-5

  • Online ISBN: 978-3-642-39593-2

  • eBook Packages: Computer Science, Computer Science (R0)
