Abstract
Classical methods for parallel text alignment operate at a single level (e.g. sentences) at which two or more versions of a text are synchronised. This can be a problem for particularly long documents: in the absence of any other linguistic information, an alignment error at one point in the text may propagate for some time without any chance of recovery. In this chapter we consider how multilingual parallel alignment can exploit the fact that more and more texts are now highly structured by means of markup languages such as SGML. In particular, we describe recent efforts in multi-level alignment, presenting the main advances as well as some of the difficulties that remain, particularly when the text and its translation use different encoding schemes or different encoding practices for the same scheme.
Copyright information
© 2000 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Romary, L., Bonhomme, P. (2000). Parallel alignment of structured documents. In: Véronis, J. (eds) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_10
DOI: https://doi.org/10.1007/978-94-017-2535-4_10
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5555-2
Online ISBN: 978-94-017-2535-4
eBook Packages: Springer Book Archive