Abstract
We present a method for the automated correction of spelling errors in Hungarian clinical records. We model spelling correction as a translation task in which the source language is the erroneous text and the target language is the corrected text, and we use a statistical machine translation (SMT) decoder to perform the correction. Since no orthographically correct, proofread text from this domain is available, we cannot train the system on such a corpus; instead, a spelling correction generation and ranking system is used to create the translation models. In addition, a language model is used to model lexical context. We show that our system outperforms the first-candidate accuracy of the baseline ranking system.
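The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: a hand-written Viterbi search stands in for the SMT decoder, an edit-distance lookup stands in for the candidate generation and ranking system, and an add-one smoothed bigram model stands in for the language model. The English toy lexicon, the counts, the scoring formula, and all function names (`candidates`, `lm_logprob`, `correct`) are assumptions made up for illustration.

```python
import math

# Hypothetical toy data; none of these words or counts come from the paper.
LEXICON = {"the": 50, "patient": 20, "has": 30, "fever": 10, "fewer": 40, "no": 25}
BIGRAMS = {("the", "patient"): 15, ("patient", "has"): 12,
           ("has", "fever"): 8, ("has", "no"): 7, ("no", "fever"): 6}

def edit_distance(a, b):
    """Plain Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def candidates(word, max_dist=1):
    """Map each nearby lexicon word to a log-score (stand-in for the ranking system)."""
    scored = {}
    for cand, freq in LEXICON.items():
        dist = edit_distance(word, cand)
        if dist <= max_dist:
            # Assumed scoring: penalise edits, mildly reward frequent words.
            scored[cand] = -2.0 * dist + 0.1 * math.log(freq)
    if not scored:
        scored[word] = 0.0  # no candidate found: keep the word unchanged
    return scored

def lm_logprob(prev, word):
    """Add-one smoothed bigram log-probability (stand-in for the language model)."""
    count = BIGRAMS.get((prev, word), 0)
    prev_total = sum(c for (p, _), c in BIGRAMS.items() if p == prev)
    return math.log((count + 1) / (prev_total + len(LEXICON)))

def correct(sentence, lm_weight=1.0):
    """Viterbi search for the best-scoring sequence of corrections."""
    best = {"<s>": (0.0, [])}  # last chosen word -> (score, path so far)
    for word in sentence.split():
        step = {}
        for cand, tm_score in candidates(word).items():
            step[cand] = max(
                ((score + tm_score + lm_weight * lm_logprob(prev, cand), path + [cand])
                 for prev, (score, path) in best.items()),
                key=lambda item: item[0],
            )
        best = step
    return " ".join(max(best.values(), key=lambda item: item[0])[1])

print(correct("the patient has feer"))                 # the patient has fever
print(correct("the patient has feer", lm_weight=0.0))  # the patient has fewer
```

In this toy setup both "fever" and "fewer" are one edit away from "feer", and the candidate scores alone prefer the more frequent "fewer". Adding the bigram score for the preceding word "has" switches the choice to "fever", which is the kind of lexical context effect the abstract attributes to the language model.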
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Siklósi, B., Novák, A., Prószéky, G. (2013). Context-Aware Correction of Spelling Errors in Hungarian Medical Documents. In: Dediu, AH., Martín-Vide, C., Mitkov, R., Truthe, B. (eds) Statistical Language and Speech Processing. SLSP 2013. Lecture Notes in Computer Science, vol 7978. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39593-2_22
DOI: https://doi.org/10.1007/978-3-642-39593-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39592-5
Online ISBN: 978-3-642-39593-2
eBook Packages: Computer Science, Computer Science (R0)