Wide-Coverage Parsing, Semantics, and Morphology

  • Ruket Çakıcı
  • Mark Steedman
  • Cem Bozşahin
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

Wide-coverage parsing poses three demands: broad coverage over preferably free text, depth in semantic representation for purposes such as inference in question answering, and computational efficiency. We show for Turkish that these goals are not inherently contradictory when we assign categories to sub-lexical elements in the lexicon. The presumed computational burden of processing such lexicons does not arise when we work with automata-constrained formalisms that are trainable on word-meaning correspondences at the level of predicate-argument structures for any string, which is characteristic of radically lexicalizable grammars. This is helpful in morphologically simpler languages too, where word-based parsing has been shown to benefit from sub-lexical training.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Middle East Technical University, Ankara, Turkey
  2. University of Edinburgh, Edinburgh, UK
