Practical Translation Pattern Acquisition from Combined Language Resources

  • Mihoko Kitamura
  • Yuji Matsumoto
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3248)


Automatic extraction of translation patterns from parallel corpora is an efficient way to build translation dictionaries, and various approaches have therefore been proposed. This paper presents a practical method that greedily extracts translation patterns based on the co-occurrence of English and Japanese word sequences. The method can also be combined effectively with manual confirmation and with linguistic resources such as chunking information and translation dictionaries. Using these additional resources yields results of higher precision and broader coverage regardless of the number of documents.
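The greedy, co-occurrence-based extraction described above can be illustrated with a minimal sketch. This is not the authors' implementation: the Dice coefficient as the co-occurrence score, the function names (ngrams, dice, extract_patterns), the parameters (max_n, min_count, threshold), and the romanized toy corpus are all assumptions made for illustration. Chunking information, dictionary lookup, and manual confirmation are omitted.

```python
from collections import Counter
from itertools import product


def ngrams(tokens, max_n):
    """Return the set of contiguous word sequences of length 1..max_n."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def dice(pair_count, src_count, tgt_count):
    """Dice coefficient; an assumed scoring function, not taken from the paper."""
    return 2.0 * pair_count / (src_count + tgt_count)


def extract_patterns(corpus, max_n=2, min_count=2, threshold=0.8):
    """Greedily select (source, target) word-sequence pairs by co-occurrence.

    corpus is a list of (source_tokens, target_tokens) sentence pairs.
    Returns a list of (source_sequence, target_sequence, score) tuples.
    """
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        s_grams, t_grams = ngrams(src, max_n), ngrams(tgt, max_n)
        src_freq.update(s_grams)   # sentence-level frequency of each sequence
        tgt_freq.update(t_grams)
        pair_freq.update(product(s_grams, t_grams))  # sentence-level co-occurrence

    scored = []
    for (s, t), count in pair_freq.items():
        if count < min_count:
            continue
        score = dice(count, src_freq[s], tgt_freq[t])
        if score >= threshold:
            scored.append((score, s, t))

    # Best score first; ties prefer longer sequences, then alphabetical order.
    scored.sort(key=lambda x: (-x[0], -(len(x[1]) + len(x[2])), x[1], x[2]))

    # Greedy step: once a sequence is paired, it is not paired again.
    patterns, used_src, used_tgt = [], set(), set()
    for score, s, t in scored:
        if s in used_src or t in used_tgt:
            continue
        patterns.append((s, t, score))
        used_src.add(s)
        used_tgt.add(t)
    return patterns


# Hypothetical toy corpus with romanized Japanese tokens.
corpus = [
    (["the", "seller", "shall", "deliver"], ["urinushi", "wa", "hikiwatasu"]),
    (["the", "buyer", "shall", "pay"], ["kainushi", "wa", "shiharau"]),
    (["the", "seller", "shall", "pay"], ["urinushi", "wa", "shiharau"]),
]

for source, target, score in extract_patterns(corpus, max_n=1, threshold=0.9):
    print(" ".join(source), "<->", " ".join(target), round(score, 2))
```

In this toy run, content words such as "seller" and "pay" pair cleanly with their counterparts, while function words that always co-occur tie with one another and are resolved arbitrarily by the sort order. A sketch like this has no way to break such ties on its own.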





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Mihoko Kitamura (1, 2)
  • Yuji Matsumoto (2)
  1. Graduate School of Information Science, Nara Institute of Science and Technology, Nara, Japan
  2. Corporate Research & Development Center, Oki Electric Industry Co., Ltd, Osaka, Japan
