Abstract
This chapter describes the ARCADE project, which evaluates parallel text alignment systems. The project comprises two tracks, devoted to alignment at the sentence level and the word level respectively, and is planned for a four-year period. At the time of this report, twelve systems have participated in the sentence track and five in the word track. Substantial progress has been made on the evaluation methodology, metrics and protocols, and a large reference corpus has been produced. The results show that sentence-level alignment is quite satisfactory (over 98.5% accuracy on “normal” texts), although it degrades sharply for texts that do not match perfectly at the structural level (e.g., missing fragments or differences in order). State-of-the-art word alignment systems still have considerable room for improvement, since they reach only ca. 75% accuracy on the “translation spotting” task on which they were evaluated.
Copyright information
© 2000 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Véronis, J., Langlais, P. (2000). Evaluation of parallel text alignment systems. In: Véronis, J. (eds) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_19
DOI: https://doi.org/10.1007/978-94-017-2535-4_19
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5555-2
Online ISBN: 978-94-017-2535-4
eBook Packages: Springer Book Archive