Method to Build a Bilingual Lexicon for Speech-to-Speech Translation Systems

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7182)

Abstract

Noun dropping and mistranslation occasionally occur in Machine Translation (MT) output. These errors can cause communication problems between system users. Some MT architectures can incorporate bilingual noun lexica, which can improve the translation quality of sentences that include nouns. In this paper, we propose an automatic method that enables a monolingual user to add new words to the lexicon. In the experiments, we compare the proposed method to three other methods. According to the experimental results, the proposed method gives the best performance in terms of both Character Error Rate (CER) and Word Error Rate (WER). The improvement over using only a transliteration system is very large: about 13 points in CER and 32 points in WER.
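
The abstract reports results in CER and WER without defining them. The sketch below uses the conventional edit-distance definition of both metrics, which is an assumption here because the abstract does not state how the paper computes them. The function names (`edit_distance`, `error_rate`, `cer`, `wer`) and the romanized example strings are hypothetical and chosen only for illustration. Scores are scaled to percentages so that a difference between two systems reads in "points".

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between token sequences."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by reference length, as a percentage."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: tokens are single characters (assumed definition)."""
    return error_rate(list(ref), list(hyp))


def wer(ref: str, hyp: str) -> float:
    """Word Error Rate: tokens are whitespace-separated words (assumed definition)."""
    return error_rate(ref.split(), hyp.split())


if __name__ == "__main__":
    # Hypothetical lexicon entry: one wrong character in a two-word noun.
    print(cer("kyoto", "kioto"))
    print(wer("kyoto station", "kioto station"))
```

Under this definition a single wrong character makes the whole word count as an error, so WER penalizes near-misses much more heavily than CER, which is consistent with the WER gap in the abstract being larger than the CER gap.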

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yasuda, K., Finch, A., Sumita, E. (2012). Method to Build a Bilingual Lexicon for Speech-to-Speech Translation Systems. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_11

  • DOI: https://doi.org/10.1007/978-3-642-28601-8_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28600-1

  • Online ISBN: 978-3-642-28601-8

  • eBook Packages: Computer Science, Computer Science (R0)
