Using Alignment Templates to Infer Shallow-Transfer Machine Translation Rules

  • Felipe Sánchez-Martínez
  • Hermann Ney
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4139)

Abstract

Building a rule-based machine translation system requires considerable human effort to code the transfer rules that translate source-language sentences into grammatically correct target-language sentences. In this paper we describe how to adapt the alignment templates used in statistical machine translation to the rule-based machine translation framework. The alignment templates are converted into structural transfer rules that a shallow-transfer machine translation engine uses to produce grammatically correct translations. The experimental results show a considerable improvement in translation quality over word-for-word translation (when no transfer rules are used), and the translation quality is close to that achieved with hand-coded transfer rules. The method presented is entirely unsupervised and needs only a parallel corpus, two morphological analysers, and two part-of-speech taggers, such as those used by the machine translation system into which the inferred transfer rules are integrated.
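
As a rough illustration of the idea summarised above, the following Python sketch shows how an alignment template might act as a structural transfer rule. It is not the authors' implementation. It assumes the usual description of an alignment template as a triple (source class sequence, target class sequence, word alignment), with lexical categories standing in for word classes. The class name, the function name, the category labels, the toy lexicon, and the Spanish-to-English example are all hypothetical and chosen only for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignmentTemplate:
    """Hypothetical container: class sequences plus a word alignment."""
    source_classes: tuple  # lexical categories expected on the source side
    target_classes: tuple  # lexical categories produced on the target side
    alignment: tuple       # pairs (source_index, target_index)


def apply_template(template, source_words, translate_word):
    """Apply the template to (lemma, category) pairs.

    Returns the reordered target (lemma, category) pairs, or None when the
    source categories do not match the template.
    """
    categories = tuple(category for _, category in source_words)
    if categories != template.source_classes:
        return None
    target = [None] * len(template.target_classes)
    for source_index, target_index in template.alignment:
        lemma, _ = source_words[source_index]
        target[target_index] = (
            translate_word(lemma),
            template.target_classes[target_index],
        )
    return target


# Toy data (assumption): noun + adjective on the source side becomes
# adjective + noun on the target side.
TOY_LEXICON = {"gato": "cat", "negro": "black"}
TEMPLATE = AlignmentTemplate(
    source_classes=("noun", "adj"),
    target_classes=("adj", "noun"),
    alignment=((0, 1), (1, 0)),
)

if __name__ == "__main__":
    phrase = [("gato", "noun"), ("negro", "adj")]
    print(apply_template(TEMPLATE, phrase, TOY_LEXICON.get))
```

A word-for-word baseline would keep the source order (noun, then adjective). The template instead reorders the translated words according to its alignment, which is the kind of structural change a transfer rule encodes.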

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Felipe Sánchez-Martínez (1)
  • Hermann Ney (1)
  1. Lehrstuhl für Informatik VI, Computer Science Department, RWTH Aachen University, Aachen, Germany
