Automatic Generation of Bilingual Dictionaries Using Intermediary Languages and Comparable Corpora

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6008)

Abstract

This paper outlines a strategy for building new bilingual dictionaries from existing resources. The method consists of two main tasks. First, a new set of bilingual correspondences is generated from two available bilingual dictionaries. Second, the generated correspondences are validated using a bilingual lexicon automatically extracted from non-parallel, comparable corpora. The entries of the derived dictionary are of very high quality, similar to that of hand-crafted dictionaries. We report a case study in which a new, non-noisy English-Galician dictionary with about 12,000 correct bilingual correspondences was automatically generated.
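The two tasks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `pivot_candidates` and `validate`, the choice of Spanish as the intermediary language, the toy word lists, and the use of plain set intersection as the validation test are all assumptions made for illustration.

```python
from collections import defaultdict


def pivot_candidates(src_to_pivot, pivot_to_tgt):
    """Compose two bilingual dictionaries through their shared pivot language."""
    candidates = defaultdict(set)
    for src_word, pivot_words in src_to_pivot.items():
        for pivot_word in pivot_words:
            # Every target translation of the pivot word becomes a candidate.
            for tgt_word in pivot_to_tgt.get(pivot_word, ()):
                candidates[src_word].add(tgt_word)
    return candidates


def validate(candidates, corpus_lexicon):
    """Keep only candidate pairs that the corpus-derived lexicon also proposes.

    Assumption: validation is modelled as simple set intersection.
    """
    validated = {}
    for src_word, tgt_words in candidates.items():
        kept = tgt_words & corpus_lexicon.get(src_word, set())
        if kept:
            validated[src_word] = kept
    return validated


# Toy data, invented for illustration (English -> Spanish -> Galician).
en_es = {"bank": {"banco"}, "house": {"casa"}}
es_gl = {"banco": {"banco", "banqueta"}, "casa": {"casa"}}
# Hypothetical lexicon extracted from comparable corpora.
corpus_lexicon = {"bank": {"banco"}, "house": {"casa", "fogar"}}

candidates = pivot_candidates(en_es, es_gl)
dictionary = validate(candidates, corpus_lexicon)
```

In this toy run the ambiguous pivot word yields a spurious candidate for "bank", and the validation step discards it because the corpus-derived lexicon does not support it. This is the kind of noise that the second task is meant to filter out.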

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gamallo Otero, P., Pichel Campos, J.R. (2010). Automatic Generation of Bilingual Dictionaries Using Intermediary Languages and Comparable Corpora. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_40

  • DOI: https://doi.org/10.1007/978-3-642-12116-6_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12115-9

  • Online ISBN: 978-3-642-12116-6

  • eBook Packages: Computer Science, Computer Science (R0)
