Abstract
The widespread use of abbreviations in modern texts poses interesting challenges and opportunities for NLP. In addition to being dynamic in nature, abbreviations are far more polysemous than regular words. Technologies that exhibit some level of language understanding may therefore be adversely affected by the presence of abbreviations. This paper addresses two related problems: (1) expansion of abbreviations given a context, and (2) translation of sentences that contain abbreviations. First, an efficient retrieval-based method for English abbreviation expansion is presented. Then, a hybrid system selects among simple abbreviation-translation methods. On a test set of sentences that contain abbreviations, the hybrid system improves on the baseline MT system by 1.48 BLEU points.
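To make the idea of in-context expansion concrete, the following is a minimal sketch of one possible retrieval-style approach: each candidate expansion is represented by some example context text, and the candidate whose context is most similar to the input sentence is chosen. This is an illustration under stated assumptions, not the method described in the paper. The inventory `INVENTORY`, the helper names, and the bag-of-words cosine scoring are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical toy inventory (an assumption for illustration only):
# abbreviation -> {candidate expansion: example context text}.
INVENTORY = {
    "ACL": {
        "Association for Computational Linguistics":
            "conference paper parsing translation language proceedings",
        "anterior cruciate ligament":
            "knee injury surgery athlete ligament tear",
    },
}


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand(abbreviation, sentence):
    """Return the candidate expansion whose example context best matches
    the sentence, or None if the abbreviation is not in the inventory."""
    candidates = INVENTORY.get(abbreviation, {})
    if not candidates:
        return None
    query = bag_of_words(sentence)
    return max(
        candidates,
        key=lambda exp: cosine(query, bag_of_words(candidates[exp])),
    )


if __name__ == "__main__":
    print(expand("ACL", "He tore his ACL in a knee injury"))
    print(expand("ACL", "The paper appeared in the ACL proceedings"))
```

A realistic system would presumably index a large corpus of expansions with a search engine rather than a hand-written dictionary; the sketch only shows how surrounding context can disambiguate a polysemous abbreviation.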
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ammar, W., Darwish, K., El Kahki, A., Hafez, K. (2011). ICE-TEA: In-Context Expansion and Translation of English Abbreviations. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6609. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19437-5_4
DOI: https://doi.org/10.1007/978-3-642-19437-5_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19436-8
Online ISBN: 978-3-642-19437-5
eBook Packages: Computer Science, Computer Science (R0)