Computers and the Humanities, Volume 38, Issue 3, pp 223–251

Bitext Generation Through Rich Markup

  • Arantza Casillas
  • Raquel Martínez

DOI: 10.1007/s10579-004-0233-2

Cite this article as:
Casillas, A. & Martínez, R. Computers and the Humanities (2004) 38: 223. doi:10.1007/s10579-004-0233-2


This paper reports on a method for exploiting a bitext as the primary linguistic information source for the design of a generation environment for specialized bilingual documentation. The paper discusses the Text Encoding Initiative (TEI) proposals for specialized corpus tagging, text segmentation, the alignment of translation units and their allocation into translation memories, Document Type Definition (DTD) abstraction from tagged texts, and DTD deployment for bilingual text generation. The parallel corpus used for experimentation has two main features:

Keywords: alignment; bilingual document generation; bitext; parallel corpus; segmentation; SGML; TEI; translation memories

Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Arantza Casillas (1)
  • Raquel Martínez (1)
  1. Departamento de Electricidad y Electrónica, Facultad de Ciencia y Tecnología, UPV-EHU, Spain