Machine Translation

, Volume 25, Issue 4, pp 291–315 | Cite as

A \({\mathcal{O}(|G|n^6)}\) time extension of inversion transduction grammars

Article

Abstract

Range concatenation grammars are viewed as a hierarchy of synchronous grammars. It is shown how inversion transduction grammars (ITGs) and extensions thereof, including synchronous tree-adjoining grammars, are captured by the hierarchy, and the expressivity and linguistic relevance of subclasses of the hierarchy are discussed. A \({\mathcal{O}(|G|n^6)}\) time extension of ITGs is proposed. The extension translates cross-serial dependencies into nested ones and handles complex kinds of discontinuous translation units and so-called inside-out alignments. In fact, our \({\mathcal{O}(|G|n^6)}\) time extension generates all possible alignments. It is shown that this additional expressivity comes at the cost of probabilistic parsing.

Keywords

Range concatenation grammar Generative capacity Computational complexity Inversion transduction grammar Synchronous tree-adjoining grammar 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Aho A, Ullman J (1972) The theory of parsing, translation and compiling. Prentice-Hall, LondonGoogle Scholar
  2. Barton E, Berwick R, Ristad E (1987) Computational complexity and natural language. MIT Press, CambridgeGoogle Scholar
  3. Boullier P (1998) Proposal for a natural language processing syntactic backbone. Technical report. INRIA, Le ChesnayGoogle Scholar
  4. Boullier P (1999) On TAG and multicomponent TAG parsing. In: Proceedings of the 6th Conférence annuelle sur le Traitement Automatique des Langues Naturelles. Cargèse, Corse, pp 75–84Google Scholar
  5. Boullier P (2000) A cubic time extension of context-free grammars. Grammars 3(2–3): 111–131MathSciNetMATHCrossRefGoogle Scholar
  6. Brown P, Della Pietra V, Della Pietra S, Mercer R (1993) The mathematics of statistical machine translation: parameter estimation. Comput Linguist 19(2): 263–311Google Scholar
  7. Casacuberta F, de la Higuera C (2000) Computational complexity of problems on probabilistic grammars and transducers. In: Grammatical inference: algorithms and applications, 5th international colloquium, Lisbon, Portugal, pp 15–24Google Scholar
  8. Chi Z (1999) Statistical properties of probabilistic context-free grammars. Comput Linguist 25(1): 131–160MathSciNetGoogle Scholar
  9. Chi Z, Geman S (1998) Estimation of probabilistic context-free grammars. Comput Linguist 24(2): 299–305MathSciNetGoogle Scholar
  10. Chiang D (2007) Hierarchical phrase-based translation. Comput Linguist 33(2): 201–228CrossRefGoogle Scholar
  11. Culik K (1966) Well translatable languages and Algol-like languages. In: Steel T (ed) Formal languages and description languages. North Holland Press, Amsterdam, pp 76–85Google Scholar
  12. Dreyer M, Hall K, Khudanpur S (2007) Comparing reordering constraints for SMT using efficient BLEU oracle computation. In: Proceedings of SSST, NAACL-HLT 2007/AMTA workshop on syntax and structure in statistical translation, New York, NY, pp 103–110Google Scholar
  13. Eisner J (2003) Learning non-isomorphic tree mappings for machine translation. In: 41st annual meeting of the association for computational linguistics, Sapporo, Japan, pp 205–208Google Scholar
  14. Gazdar G (1988) Applicability of indexed grammars to natural languages. In: Reyle U, Rohrer C (eds) Natural language parsing and linguistic theories. Reidel, Dordrecht, pp 69–94CrossRefGoogle Scholar
  15. Goutte C, Yamada K, Gaussier E (2004) Aligning words using matrix factorisation. In: Proceedings of the 42nd annual meeting of the association for computational linguistics, ACL-04, Barcelona, Spain, pp 502–509Google Scholar
  16. Graça J, Pardal J, Coheur L, Caseiro D (2008) Building a golden collection of parallel multi-language word alignments. In: Proceedings of the 6th international conference on language resources and evaluation (LREC’08), Marrakech, Morocco, pp 986–993Google Scholar
  17. Guan Y (1992) Klammergrammatiken, Netzgrammatiken und Interpretationen von Netzen. Ph.D thesis, Universität des Saarlands, Saarbrücken, GermanyGoogle Scholar
  18. Harbusch K, Poller P (1996) Structural translation with synchronous tree-adjoining grammars in Verbmobil. Technical report Verbmobil 184. Universität Koblenz-Landau/DFKI GmbH, KoblenzGoogle Scholar
  19. Huang S, Zhou B (2009) An EM Algorithm for SCFG in Formal Syntax-based Translation. In: Proceedings of the IEEE international conference on acoustic, speech, and signal processing (ICASSP’09), Taiwan, China, pp 4813–4816Google Scholar
  20. Kallmeyer L (2005) Tree-local multicomponent tree-adjoining grammars with shared nodes. Comput Linguist 31(2): 187–225MathSciNetCrossRefGoogle Scholar
  21. Kato Y, Seki H, Kasami T (2006) RNA pseudoknotted structure prediction using stochastic multiple context-free grammar. IPSJ Digit Cour 2: 655–664CrossRefGoogle Scholar
  22. Maier W, Søgaard A (2008) Treebanks and mild context-sensitivity. In: Proceedings of the 13th conference on formal grammar, Hamburg, Germany, pp 61–76Google Scholar
  23. Melamed D, Satta G, Wellington B (2004) Generalized multitext grammars. In: 42nd annual meeting of the association for computational linguistics, Barcelona, Spain, pp 661–668Google Scholar
  24. Nesson R, Shieber S (2007) Simpler TAG semantics through synchronization. In: Proceedings of the 11th conference on formal grammar. CSLI Publications, Stanford, pp 129–142Google Scholar
  25. Padó S, Lapata M (2006) Optimal constituent alignment with edge covers for semantic projection. In: Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the association for computational linguistics, COLING · ACL 2006, Sydney, Australia, pp 1161–1168Google Scholar
  26. Rambow O, Satta G (1994) A two-dimensional hierarchy for parallel rewriting systems. Technical report. University of Philadelphia, PhiladelphiaGoogle Scholar
  27. Saers M, Nivre J, Wu D (2009) Learning stochastic bracketing inversion transduction grammars with a cubic time biparsing algorithm. In: 11th international conference on parsing technology (IWPT 2009), Paris, France, pp 29–32Google Scholar
  28. Satta G, Peserico E (2005) Some computational complexity results for synchronous context-free grammars. In: Proceedings of human language technology conference and conference for empirical methods in natural language processing 2005, Vancouver, Canada, pp 803–810Google Scholar
  29. Seki H, Matsumura T, Fujii M, Kasami T (1991) On multiple context-free grammars. Theor Comput Sci 88(2): 191–229MathSciNetMATHCrossRefGoogle Scholar
  30. Shieber S (2007) Probabilistic synchronous tree-adjoining grammars for machine translation. In: Proceedings of the North American chapter of the association for computational linguistics 2007, workshop on syntax and structure in statistical translation, Rochester, New York, pp 88–95Google Scholar
  31. Shieber S, Schabes Y (1990) Synchronous tree-adjoining grammars. In: COLING-90, papers presented to the 13th international conference on computational linguistics on the occasion of the 25th anniversary of COLING and the 350th anniversary of Helsinki University, Helsinki, Finland, pp 253–258Google Scholar
  32. Simard M, Cancedda N, Cavestro B, Dymetman M, Gaussier E, Goutte C, Yamada K, Langlais P, Mauser A (2005) Translating with non-contiguous phrases. In: HLT/EMNLP 2005: proceedings of the human language technology conference and conference on empirical methods in natural language processing, Vancouver, British Columbia, Canada, pp 755–762Google Scholar
  33. Søgaard A (2010) Can inversion transduction grammars generate hand alignments? In: Proceedings of the 14th annual conference of the european association for machine translation, St. Raphael, FranceGoogle Scholar
  34. Søgaard A, Kuhn J (2009a) Empirical lower bounds on alignment error rates in syntax-based machine translation. In: Proceedings of the third workshop on syntax and structure in statistical translation (SSST-3) at NAACL HLT 2009, Boulder, CO, pp 19–27Google Scholar
  35. Søgaard A, Kuhn J (2009b) Using a maximum entropy-based tagger to improve a very fast vine parser. In: 11th international conference on parsing technology (IWPT 2009), Paris, France, pp 206–209Google Scholar
  36. Søgaard A, Wu D (2009) Empirical lower bounds on alignment error rates for the full class of inversion transduction grammars. In: 11th international conference on parsing technology (IWPT 2009), Paris, France, pp 33–36Google Scholar
  37. Sudkamp T (2005) Languages and machines, 3rd edn. Pearson, BostonGoogle Scholar
  38. Vijay-Shanker K, Weir D (1994) The equivalence of four extensions of context-free grammars. Math Syst Theory 27: 511–546MathSciNetMATHCrossRefGoogle Scholar
  39. Wedekind J (1997) Approaches to unification in grammar. In: Blackburn P, de Rijke M (eds) Specifying syntactic structure. CSLI Publications, Stanford, pp 245–280Google Scholar
  40. Wellington B, Waxmonsky S, Melamed D (2006) Empirical lower bounds on the complexity of translational equivalence. In: Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the association for computational linguistics, COLING · ACL 2006, Sydney, Australia, pp 977–984Google Scholar
  41. Wong F, Dong M, Hu D (2006) Machine translation using constraint-based synchronous grammar. Tsinghua Sci Technol 11(3): 295–306MathSciNetMATHCrossRefGoogle Scholar
  42. Wong YW, Mooney R (2007) Learning synchronous grammars for semantic parsing with lambda calculus. In: Proceedings of the 45th annual meeting of the association for computational linguistics, Prague, Czech Republic, pp 960–967Google Scholar
  43. Wu D (1995) Trainable coarse bilingual grammars for parallel text bracketing. In: Third annual workshop on very large corpora, Cambridge, MA, pp 69–81Google Scholar
  44. Wu D (1997) Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Comput Linguist 23(3): 377–403Google Scholar
  45. Zens R, Ney H (2003) A comparative study on reordering constraints in statistical machine translation. In: 41st annual meeting on association for computational linguistics, Sapporo, Japan, pp 144–151Google Scholar
  46. Zhang H, Gildea D, Chiang D (2008) Extracting synchronous grammar rules from word-level alignments in linear time. In: Coling 2008, proceedings of the 22nd international conference on computational linguistics, Manchester, England, pp 1081–1088Google Scholar
  47. Zhang H, Huang L, Gildea D, Knight K (2006) Synchronous binarization for machine translation. In: Proceedings of the joint human language technology conference and the North American chapter of the association for computational linguistics 2006, New York, NY, pp 256–263Google Scholar

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. 1.Center for Language TechnologyUniversity of CopenhagenCopenhagenDenmark

Personalised recommendations