Recursion in Grammar and Performance

  • Edward P. Stabler
Part of the Studies in Theoretical Psycholinguistics book series (SITP, volume 43)


The deeper, more recursive structures posited by linguists reflect important insights into similarities among linguistic constituents and operations, insights that would seem to be lost in computational models that posit simpler, flatter structures. We show how this apparent conflict can be resolved with a substantive theory of how linguistic computations are implemented. We begin with a review of standard notions of recursive depth and some basic ideas about how computations can be implemented. We then articulate a consensus position about linguistic structure, one that disentangles what is computed from how it is computed. This leads to a unifying perspective on what it is to represent and manipulate structure, covering large classes of parsing models that compute exactly the consensus structures in such a way that the depth of the linguistic analysis does not correspond to processing depth. From this perspective, we argue that adequate performance models must, unsurprisingly, be more superficial than adequate linguistic models, and that the two are reconciled by the theory of implementation. In a sense that will be made precise, the recursive depth of a structural analysis does not correspond in any simple way to the depth of the computation of that structure in linguistic performance.
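The central point, that the recursive depth of a structure need not translate into processing depth, can be illustrated with a small sketch. This example is illustrative only and not from the chapter: a right-branching structure of unbounded depth is built iteratively, so the parser's own call stack stays constant while the tree it produces grows arbitrarily deep. The function names are hypothetical.

```python
def parse_right_branching(words):
    """Build a right-branching tree ('S', word, subtree) iteratively.

    A naive recursive-descent parser would recurse once per word; here
    an explicit loop builds the same nested structure from the right,
    so no call-stack growth tracks the tree's recursive depth.
    """
    tree = None
    for w in reversed(words):
        tree = ("S", w, tree)  # each step nests the tree one level deeper
    return tree

def depth(tree):
    """Count nesting levels by walking down the rightmost branch."""
    d = 0
    while tree is not None:
        tree = tree[2]
        d += 1
    return d

sentence = "the cat saw the dog near the barn".split()
t = parse_right_branching(sentence)
assert depth(t) == len(sentence)  # tree depth grows with input length
```

The loop uses constant stack space regardless of input length, while the analysis it computes is as deep as the sentence is long: a toy instance of the dissociation between structural depth and processing depth that the chapter makes precise.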


Keywords: Computational linguistics · Parsing · Recursive depth · Processing depth



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Edward P. Stabler
    1. Department of Linguistics, UCLA, Los Angeles, USA
