A Best-First Alignment Algorithm for Automatic Extraction of Transfer Mappings from Bilingual Corpora

Part of the book series: Text, Speech and Language Technology (TLTB, volume 21)

Abstract

Translation systems that automatically extract transfer mappings (rules or examples) from bilingual corpora have been hampered by the difficulty of achieving accurate alignment and acquiring high-quality mappings. We describe an algorithm that uses a best-first strategy and a small alignment grammar to significantly improve the quality of the mappings extracted. For each mapping, frequencies are computed and sufficient context is retained to distinguish competing mappings during translation. Variants of the algorithm are run against a corpus containing 200K sentence pairs and evaluated based on the quality of the resulting translations.

Copyright information

© 2003 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Menezes, A., Richardson, S.D. (2003). A Best-First Alignment Algorithm for Automatic Extraction of Transfer Mappings from Bilingual Corpora. In: Carl, M., Way, A. (eds) Recent Advances in Example-Based Machine Translation. Text, Speech and Language Technology, vol 21. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0181-6_15

  • DOI: https://doi.org/10.1007/978-94-010-0181-6_15

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-1401-7

  • Online ISBN: 978-94-010-0181-6

  • eBook Packages: Springer Book Archive
