Incorporating target language semantic roles into a string-to-tree translation model

Su, Chao; Guo, Yu-hang; Huang, He-yan; Shi, Shu-min; Feng, Chong

doi:10.1631/FITEE.1601349

Published: 15 December 2017

Volume 18, pages 1534–1542, (2017)

Frontiers of Information Technology & Electronic Engineering

Chao Su (ORCID: 0000-0001-6771-329X)^1,3,
Yu-hang Guo^1,
He-yan Huang (ORCID: 0000-0002-0320-7520)^1,2,
Shu-min Shi^1,2 &
Chong Feng^1,2

Abstract

The string-to-tree model is one of the most successful syntax-based statistical machine translation (SMT) models. It models the grammaticality of the output via target-side syntax. However, it does not use any semantic information and tends to produce translations containing semantic role confusions and erroneous chunk sequences. In this paper, we propose two methods that use semantic roles to improve the performance of the string-to-tree translation model: (1) adding role labels to the syntax tree; (2) constructing a semantic role tree and then incorporating the syntax information into it. We then perform string-to-tree machine translation using the newly generated trees. Our methods enable the system to train and choose better translation rules using semantic information. Our experiments show significant improvements over the state-of-the-art string-to-tree translation system on both spoken and news corpora, and the two proposed methods surpass the phrase-based system on large-scale training data.
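
As a rough illustration of the first method (adding role labels to the syntax tree), the sketch below appends a PropBank-style role tag to the label of every constituent whose word span matches a labeled argument. The tree encoding, the `annotate` and `to_brackets` helpers, the `-` separator, and the example sentence are all assumptions made for illustration; the abstract does not specify the labeling scheme actually used in the paper.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical encoding: a tree is either a word (str) or a (label, children) pair.
Tree = Union[str, Tuple[str, list]]
Span = Tuple[int, int]  # half-open word interval [start, end)


def annotate(tree: Tree, roles: Dict[Span, str], start: int = 0) -> Tuple[Tree, int]:
    """Return a copy of `tree` with role tags appended to matching labels,
    plus the index just past the last word covered by `tree`."""
    if isinstance(tree, str):  # a leaf covers exactly one word
        return tree, start + 1
    label, children = tree
    new_children: List[Tree] = []
    pos = start
    for child in children:
        new_child, pos = annotate(child, roles, pos)
        new_children.append(new_child)
    # Every constituent whose span equals an argument span is tagged,
    # so a unary chain over the same words would be tagged at each level.
    role = roles.get((start, pos))
    if role is not None:
        label = f"{label}-{role}"
    return (label, new_children), pos


def to_brackets(tree: Tree) -> str:
    """Render a tree in bracketed (Penn Treebank-like) notation."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    return "(" + label + " " + " ".join(to_brackets(c) for c in children) + ")"


# Assumed example: "The boy bought a book" with hand-written role spans.
syntax_tree: Tree = ("S", [
    ("NP", [("DT", ["The"]), ("NN", ["boy"])]),
    ("VP", [("VBD", ["bought"]),
            ("NP", [("DT", ["a"]), ("NN", ["book"])])]),
])
roles = {(0, 2): "A0", (2, 3): "V", (3, 5): "A1"}

annotated, _ = annotate(syntax_tree, roles)
print(to_brackets(annotated))
# (S (NP-A0 (DT The) (NN boy)) (VP (VBD-V bought) (NP-A1 (DT a) (NN book))))
```

Under this assumed scheme, a rule extracted from the annotated tree would carry labels such as `NP-A0` rather than plain `NP`, which is one way semantic information could influence which translation rules are learned and selected.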

Author information

Authors and Affiliations

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
Chao Su, Yu-hang Guo, He-yan Huang, Shu-min Shi & Chong Feng
Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing, 100081, China
He-yan Huang, Shu-min Shi & Chong Feng
Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing, 100048, China
Chao Su

Corresponding authors

Correspondence to Chao Su or He-yan Huang.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 61132009, 61201352, 61502035, and 61201351), the National Basic Research Program (973) of China (No. 2013CB329303), and the Beijing Advanced Innovation Center for Imaging Technology (No. BAICIT-2016007).

About this article

Cite this article

Su, C., Guo, Yh., Huang, Hy. et al. Incorporating target language semantic roles into a string-to-tree translation model. Frontiers Inf Technol Electronic Eng 18, 1534–1542 (2017). https://doi.org/10.1631/FITEE.1601349

Received: 18 June 2016
Accepted: 30 November 2016
Published: 15 December 2017
Issue Date: October 2017
DOI: https://doi.org/10.1631/FITEE.1601349

Keywords

CLC number

TP391
