Abstract
Translation systems that automatically extract transfer mappings (rules or examples) from bilingual corpora have been hampered by the difficulty of achieving accurate alignment and acquiring high-quality mappings. We describe an algorithm that uses a best-first strategy and a small alignment grammar to significantly improve the quality of the mappings extracted. For each mapping, frequencies are computed and sufficient context is retained to distinguish competing mappings during translation. Variants of the algorithm are run against a corpus containing 200K sentence pairs and evaluated based on the quality of the resulting translations.
Copyright information
© 2003 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Menezes, A., Richardson, S.D. (2003). A Best-First Alignment Algorithm for Automatic Extraction of Transfer Mappings from Bilingual Corpora. In: Carl, M., Way, A. (eds) Recent Advances in Example-Based Machine Translation. Text, Speech and Language Technology, vol 21. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0181-6_15
DOI: https://doi.org/10.1007/978-94-010-0181-6_15
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-1401-7
Online ISBN: 978-94-010-0181-6
eBook Packages: Springer Book Archive