  • George Tambouratzis
  • Marina Vassiliou
  • Sokratis Sofianopoulos
Part of the SpringerBriefs in Statistics book series (BRIEFSSTATIST)


This chapter introduces the general design characteristics of PRESEMT and describes in detail all required resources and pre-processing steps, such as corpora processing and model creation.



Copyright information

© The Author(s) 2017

Authors and Affiliations

  • George Tambouratzis (1), corresponding author
  • Marina Vassiliou (1)
  • Sokratis Sofianopoulos (1)
  1. Institute for Language and Speech Processing, Athens, Greece
