Language Resources and Evaluation, Volume 39, Issue 4, pp 267–285

Guidelines for Word Alignment Evaluation and Manual Alignment

  • Patrik Lambert
  • Adrià De Gispert
  • Rafael Banchs
  • José B. Mariño


This paper provides guidelines for building a word alignment evaluation scheme. The notion of word alignment quality depends on the application; here we review the standard scoring metrics for full-text alignment and explain how to use them more effectively. We discuss strategies for building a reference corpus and show that the ratio between ambiguous and unambiguous links in the reference strongly affects the scores measured with these metrics. In particular, depending on the value of this ratio, the metrics can favour automatically computed alignments with either higher precision or higher recall. Finally, we suggest a strategy for building a reference corpus suited to applications in which recall plays a significant role, such as machine translation. We also describe the manually aligned corpus we built for the Spanish-English European Parliament corpus. This corpus is freely available.
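As a rough illustration of the effect described above, the sketch below computes precision, recall and alignment error rate (AER) using the commonly used definitions based on sure links S and possible links P. It assumes that the abstract's "unambiguous" links correspond to S and its "ambiguous" links to P, with S treated as a subset of P. The function name `alignment_scores` and all link sets are hypothetical toy data, not taken from the paper or its corpus.

```python
from typing import Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)


def alignment_scores(hyp: Set[Link], sure: Set[Link], possible: Set[Link]):
    """Return (precision, recall, AER) for a hypothesis alignment.

    Assumed definitions (S = sure links, P = possible links, A = hypothesis):
      precision = |A & P| / |A|
      recall    = |A & S| / |S|
      AER       = 1 - (|A & S| + |A & P|) / (|A| + |S|)
    """
    if not hyp or not sure:
        raise ValueError("hypothesis and sure link sets must be non-empty")
    possible = possible | sure  # every sure link also counts as possible
    a_and_s = len(hyp & sure)
    a_and_p = len(hyp & possible)
    precision = a_and_p / len(hyp)
    recall = a_and_s / len(sure)
    aer = 1.0 - (a_and_s + a_and_p) / (len(hyp) + len(sure))
    return precision, recall, aer


# Two toy references over the same six links (hypothetical data):
# one marks most links as sure, the other marks most links as possible.
sure_heavy_S = {(0, 0), (1, 1), (2, 2), (3, 3)}
sure_heavy_P = {(1, 2), (2, 1)}
sure_light_S = {(0, 0), (1, 1)}
sure_light_P = {(2, 2), (3, 3), (1, 2), (2, 1)}

# Two toy hypotheses: a sparse, precise one and a dense one with one wrong link.
high_precision = {(0, 0), (1, 1)}
high_recall = {(0, 0), (1, 1), (2, 2), (3, 3), (1, 2), (0, 3)}

for ref_name, s, p in [("sure-heavy", sure_heavy_S, sure_heavy_P),
                       ("sure-light", sure_light_S, sure_light_P)]:
    for hyp_name, hyp in [("high-precision", high_precision),
                          ("high-recall", high_recall)]:
        prec, rec, aer = alignment_scores(hyp, s, p)
        print(f"{ref_name:10s} {hyp_name:14s} "
              f"P={prec:.3f} R={rec:.3f} AER={aer:.3f}")
```

With these toy sets, the reference containing mostly sure links gives the dense alignment the lower AER (0.1 against about 0.333), while the reference containing mostly possible links gives the sparse alignment the lower AER (0 against 0.125). The same pair of alignments is thus ranked differently depending only on how the reference divides its links.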


alignment error rate, bilingual, evaluation, gold standard, manual alignment, parallel corpus, precision, recall, word alignment





Copyright information

© Springer 2006

Authors and Affiliations

  • Patrik Lambert (1)
  • Adrià De Gispert (1)
  • Rafael Banchs (1)
  • José B. Mariño (1)

  1. TALP Research Centre, Barcelona, Spain
