Abstract
This paper presents a program for the automatic spelling correction of texts from a highly specific domain, applied here to mammography reports. We describe the types of errors encountered and present a correction program based on Levenshtein distance and bigram probabilities.
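To make the two ingredients named in the abstract concrete, here is a minimal sketch of how Levenshtein distance and bigram probabilities might be combined to pick a correction. It is not the authors' program: the toy English corpus, the function names (`levenshtein`, `train_bigrams`, `bigram_prob`, `correct`), the add-one smoothing, the `max_dist` threshold, and the choice to rank by distance first and bigram probability second are all assumptions made for illustration.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def train_bigrams(sentences):
    """Count unigrams and bigrams; "<s>" marks the start of a sentence."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + list(sentence)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev_word, word, unigrams, bigrams, vocab_size):
    """P(word | prev_word) with add-one smoothing (an assumed choice)."""
    return (bigrams[(prev_word, word)] + 1) / (unigrams[prev_word] + vocab_size)


def correct(word, prev_word, vocab, unigrams, bigrams, max_dist=2):
    """Return the closest in-vocabulary word, using bigrams to break ties."""
    if word in vocab:
        return word
    scored = [(levenshtein(word, v), v) for v in vocab]
    scored = [(d, v) for d, v in scored if d <= max_dist]
    if not scored:
        return word  # no plausible candidate: leave the token unchanged
    size = len(vocab)
    best = min(
        scored,
        key=lambda dv: (dv[0],
                        -bigram_prob(prev_word, dv[1], unigrams, bigrams, size),
                        dv[1]),
    )
    return best[1]


# Hypothetical toy corpus standing in for correctly spelled domain text.
corpus = [
    ["no", "focal", "lesion"],
    ["focal", "lesion", "present"],
    ["with", "local", "pain"],
]
unigrams, bigrams = train_bigrams(corpus)
vocab = set(unigrams) - {"<s>"}

# "vocal" is one edit away from both "focal" and "local";
# the preceding word decides which candidate is chosen.
after_no = correct("vocal", "no", vocab, unigrams, bigrams)
after_with = correct("vocal", "with", vocab, unigrams, bigrams)
```

In this sketch the misspelling "vocal" is equally distant from "focal" and "local", so the bigram model settles the choice: after "no" the result is "focal", after "with" it is "local". A narrow domain helps here because a small vocabulary keeps the candidate set short and the bigram counts relatively dense.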
© 2006 Springer
About this paper
Cite this paper
Mykowiecka, A., Marciniak, M. (2006). Domain–Driven Automatic Spelling Correction for Mammography Reports. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 35. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33521-8_56
DOI: https://doi.org/10.1007/3-540-33521-8_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33520-7
Online ISBN: 978-3-540-33521-4
eBook Packages: Engineering, Engineering (R0)