Improving Word Alignment Through Morphological Analysis

  • Conference paper
Integrated Uncertainty in Knowledge Modelling and Decision Making (IUKM 2015)

Abstract

Word alignment plays a critical role in statistical machine translation systems. The widely used IBM alignment models operate only on the surface forms of words and ignore their linguistic features. This limitation often causes data sparseness problems. We therefore present an extension that integrates morphological analysis into the traditional IBM models. Experiments on English-Vietnamese tasks show that the new model improves both word alignment quality and final translation performance.
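
To make the sparseness argument concrete, the following is a minimal sketch of IBM Model 1 trained with EM, run once on surface forms and once on lemmatised English. It is not the extension proposed in the paper. The toy corpus, the `LEMMAS` table (a hand-written stand-in for a real morphological analyser), and the names `train_ibm1` and `lemmatize` are all assumptions made for illustration.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=1):
    """EM for IBM Model 1. Returns t[(f, e)], an estimate of P(f | e).

    pairs is a list of (f_tokens, e_tokens); a NULL token is prepended to
    every e sentence so that f words may remain unaligned.
    """
    f_vocab = {f for f_sent, _ in pairs for f in f_sent}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialisation
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of the link (f, e)
        total = defaultdict(float)  # expected count of e over all links
        for f_sent, e_sent in pairs:
            e_null = ["NULL"] + list(e_sent)
            for f in f_sent:
                # E-step: posterior over which e generated this f
                norm = sum(t[(f, e)] for e in e_null)
                for e in e_null:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: renormalise the expected counts for each e
        t = defaultdict(float)
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


# Toy Vietnamese-English corpus (an assumption, not data from the paper).
corpus = [
    (["một", "cuốn", "sách"], ["one", "book"]),
    (["hai", "cuốn", "sách"], ["two", "books"]),
    (["một", "ngôi", "nhà"], ["one", "house"]),
    (["hai", "ngôi", "nhà"], ["two", "houses"]),
]

# Hypothetical lemma table standing in for a morphological analyser.
LEMMAS = {"books": "book", "houses": "house"}


def lemmatize(tokens):
    """Map each token to its lemma, leaving unknown tokens unchanged."""
    return [LEMMAS.get(w, w) for w in tokens]


lemma_corpus = [(vi, lemmatize(en)) for vi, en in corpus]

surface_types = {w for _, en in corpus for w in en}
lemma_types = {w for _, en in lemma_corpus for w in en}

surface_t = train_ibm1(corpus, iterations=1)
lemma_t = train_ibm1(lemma_corpus, iterations=1)

print("English types:", len(surface_types), "surface vs", len(lemma_types), "lemmas")
for name, t in (("surface", surface_t), ("lemma", lemma_t)):
    print(name, "t(sách|book) =", t[("sách", "book")],
          "t(một|book) =", t[("một", "book")])
```

On surface forms, "book" and "books" are separate parameters, each observed in a single sentence, so one EM iteration cannot yet separate "sách" from "một" as a translation of "book". Once both forms share the lemma "book", the evidence from the two sentences is pooled and "sách" already receives more probability mass than "một".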

Author information

Correspondence to Vuong Van Bui.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Van Bui, V., Tran, T.T., Nguyen, N.B.T., Pham, T.D., Le, A.N., Le, C.A. (2015). Improving Word Alignment Through Morphological Analysis. In: Huynh, VN., Inuiguchi, M., Demoeux, T. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2015. Lecture Notes in Computer Science, vol 9376. Springer, Cham. https://doi.org/10.1007/978-3-319-25135-6_30

  • DOI: https://doi.org/10.1007/978-3-319-25135-6_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25134-9

  • Online ISBN: 978-3-319-25135-6

  • eBook Packages: Computer Science (R0)
