Conference of the Association for Machine Translation in the Americas

AMTA 2002: Machine Translation: From Research to Real Users pp 165-176

Semi-automatic Compilation of Bilingual Lexicon Entries from Cross-Lingually Relevant News Articles on WWW News Sites

  • Takehito Utsuro
  • Takashi Horiuchi
  • Yasunobu Chiba
  • Takeshi Hamamoto
Conference paper

DOI: 10.1007/3-540-45820-4_17

Volume 2499 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Utsuro T., Horiuchi T., Chiba Y., Hamamoto T. (2002) Semi-automatic Compilation of Bilingual Lexicon Entries from Cross-Lingually Relevant News Articles on WWW News Sites. In: Richardson S.D. (eds) Machine Translation: From Research to Real Users. AMTA 2002. Lecture Notes in Computer Science, vol 2499. Springer, Berlin, Heidelberg

Abstract

To overcome the resource scarcity bottleneck in corpus-based translation knowledge acquisition research, this paper takes the approach of semi-automatically acquiring domain-specific translation knowledge from collections of bilingual news articles on WWW news sites. It presents the results of applying standard co-occurrence-frequency-based techniques for estimating bilingual term correspondences from parallel corpora to relevant article pairs automatically collected from WWW news sites. The experimental evaluation results are very encouraging and show that many useful bilingual term correspondences can be discovered efficiently, with little human intervention, from relevant article pairs on WWW news sites.
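
The abstract does not say which co-occurrence statistic is used, so the following is only an illustrative sketch of one common choice, the Dice coefficient, computed over cross-lingually relevant article pairs. The function name `dice_scores`, the `min_cooc` threshold, and the romanized toy terms are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def dice_scores(article_pairs, min_cooc=2):
    """Rank (source term, target term) candidates by Dice coefficient.

    article_pairs: list of (source_terms, target_terms) tuples, one per
    relevant article pair; each element is an iterable of term strings.
    min_cooc: hypothetical cutoff that drops pairs seen together too rarely.
    """
    src_freq, tgt_freq, cooc = Counter(), Counter(), Counter()
    for src_terms, tgt_terms in article_pairs:
        # Count each term at most once per article (document frequency).
        src_set, tgt_set = set(src_terms), set(tgt_terms)
        src_freq.update(src_set)
        tgt_freq.update(tgt_set)
        # Every source/target combination in one article pair co-occurs once.
        cooc.update(product(src_set, tgt_set))

    scores = {}
    for (s, t), c in cooc.items():
        if c >= min_cooc:
            # Dice = 2 * joint frequency / (sum of the marginal frequencies)
            scores[(s, t)] = 2 * c / (src_freq[s] + tgt_freq[t])
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data (invented): romanized Japanese terms paired with English terms.
toy_pairs = [
    (["shusho", "keizai"], ["prime minister", "economy"]),
    (["shusho", "senkyo"], ["prime minister", "election"]),
    (["keizai", "senkyo"], ["economy", "election"]),
]
ranked = dice_scores(toy_pairs)
```

In a semi-automatic setting of the kind the abstract describes, a ranked list such as `ranked` would be handed to a human reviewer, who accepts or rejects the top candidates rather than compiling lexicon entries from scratch.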

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Takehito Utsuro (1)
  • Takashi Horiuchi (1)
  • Yasunobu Chiba (1)
  • Takeshi Hamamoto (1)
  1. Department of Information and Computer Sciences, Toyohashi University of Technology, Toyohashi, Japan