Biology & Philosophy, Volume 27, Issue 5, pp 615–637

Grammar as a developmental phenomenon

Department of Philosophy, University of Louisville

DOI: 10.1007/s10539-012-9324-4

Cite this article as:
Dove, G. Biol Philos (2012) 27: 615. doi:10.1007/s10539-012-9324-4


More and more researchers are examining grammar acquisition from theoretical perspectives that treat it as an emergent phenomenon. In this essay, I argue that a robustly developmental perspective provides a potential explanation for some of the well-known crosslinguistic features of early child language: the process of acquisition is shaped in part by the developmental constraints embodied in von Baer’s law of development. An established model of development, the Developmental Lock, captures and elucidates the probabilistic generalizations at the heart of von Baer’s law. When this model is applied to the acquisition of grammar, it predicts that grammatical achievements that are more generatively entrenched will emerge earlier in development and will be more developmentally resilient than those that are less generatively entrenched. I show that the first prediction is supported by a wealth of psycholinguistic evidence involving typically developing children and that the second prediction is supported by numerous studies involving both children who receive deficient linguistic input and children who experience various language impairments. The success of this model demonstrates the analytic potential of a developmental approach to the study of language acquisition.


Keywords: Development · Generative entrenchment · Grammar · Innateness · Language acquisition · Syntax


Within the past decade, a broad shift in how researchers approach the study of language acquisition has occurred. Prior to this shift, the majority of researchers adopted a nativist theoretical perspective that posited a substantial body of innate grammatical knowledge (e.g. Chomsky 1980, 1986, 1995; Cowie 2010). This perspective typically held that acquisition is constrained by a Universal Grammar (often captured in terms of a small set of principles and optional parameters). Against this background, the apparent differences between early child language and adult language were often explained away in terms of performance factors such as task demands (Crain and Thornton 1998), computational limitations (Bloom 1990; Grodzinsky and Reinhart 1993), or perceptual properties (Gerken et al. 1990). More recently, an increasing number of researchers have begun to adopt alternative theoretical perspectives that view the process of acquisition as more responsive to environmental influences than previously thought. For example, some have adopted the perspective of statistical learning (e.g. Behme and Deacon 2008; Saffran 2003; Saffran et al. 1996), which holds that children acquire language through the detection of statistical regularities in the linguistic input. Others examine the acquisition of grammar from a usage-based perspective (e.g. Croft 2007; Diessel 2004; Tomasello 2003, 2006), which holds that meaningful constructions are acquired individually. On this view, grammatical structures emerge gradually through use.

Bavin (2009) identifies this broad shift as a movement towards emergentism (MacWhinney 1999). Emergentism does not amount to an outright rejection of nativism, but it does seek to explain language acquisition as the consequence of a coalition of diverse developmental factors. In keeping with this, it tends to place greater emphasis on the role played by domain-general mechanisms in the acquisition of grammatical knowledge. With this shift in theoretical perspective has come a shift in research priorities. Currently, greater emphasis is placed on uncovering the complex sources of change and difference. Although it is not always acknowledged, this shift also creates a theoretical puzzle: How should we account for those aspects of language acquisition that motivated nativism in the first place? A remarkable fact about natural languages is that, despite their diverse cultural, socioeconomic and historical circumstances, they exhibit a fundamental pattern of structural similarity. In addition, young children regularly acquire a sophisticated grammatical competence with remarkable speed and accuracy. Against the background of nativism, these properties were central explananda of psycholinguistics. They were seen as the result of domain-specific innate constraints. In this essay, I argue that adopting a robust developmental perspective suggests another possibility: that language acquisition is shaped by developmental constraints or biases.

The focus of this essay is constructive and not critical. My primary aim is to explore the potential explanatory benefits of viewing grammar as a developmental phenomenon. Right from the start, this perspective generates a number of nontrivial predictions: grammatical knowledge should change in complexity over time due to largely irreversible alterations in organization (Michel and Moore 1995); it should be acquired through the dynamic interaction of the child with elements of her physical and social environment (Oyama 1985); it should emerge from earlier conditions and not be directed toward later conditions (Michel and Moore 1995); and it should unfold epigenetically (Gottlieb 1991; Schneirla 1966). All in all, a developmental approach suggests that the causal sources of grammar acquisition are multi-factorial. By its lights, a central focus of psycholinguistic theory should be the complex and varied circumstances of change and constancy within the life cycle.

At first blush, one might think that a perspective that emphasizes the dynamic nature of development and the multiplicity of the potential sources of change would not have the theoretical resources to explain the constrained plasticity of natural language acquisition identified by nativists. Fortunately, this initial impression is mistaken. A general architectural feature of developmental systems is that certain traits (entities, processes, structures, etc.) play an important role in the generation of other traits. This means that not all traits are created equal: some have more cumulative downstream dependencies than others. Roughly put, more is riding on the success of some developmental achievements than on the success of others, and the failure to achieve them is more likely to have far-reaching maladaptive effects. A consequence of this fact is that, within evolved developmental systems (whether biological, cognitive, or cultural), features acquired in earlier stages tend to be more entrenched than features acquired in later ones (Arthur 2002; Wimsatt 2007a).

A core thesis of this essay is that the acquisition of grammar fits with these generalizations. My argument proceeds as follows: After discussing how the pattern of greater evolutionary conservation for features acquired earlier in the life cycle is captured by von Baer’s law of development, I outline a pre-existing model of development, the Developmental Lock (Wimsatt 1986), which is designed to elucidate this pattern. When applied to the acquisition of grammar, this model makes two clear predictions: (1) more generatively entrenched grammatical achievements will emerge earlier in development than less generatively entrenched ones and (2) more generatively entrenched achievements should be more developmentally resilient (i.e. buffered against perturbation) than less generatively entrenched ones. In the following sections, I provide empirical support for these predictions; I show that evidence from typical language development fits with the first prediction and that evidence from language development under strained or unusual circumstances fits with the second prediction.

Grammar and the developmental lock

The purpose of this section is to provide a conceptual framework for thinking about the acquisition of grammar from a developmental perspective. I begin by identifying how von Baer’s law provides a new way to think about the emergence of grammatical competence and then explain how the Developmental Lock sharpens this idea. Von Baer was an early critic of Haeckel’s biogenetic law that ontogeny recapitulates phylogeny (Gould 1977). Despite his rejection of this law, von Baer recognized that an organism’s ontogeny tends to reflect its phylogeny in certain ways. He proposed a four-part law of development to capture the embryological patterns that emerge in related organisms. The law can be stated as follows (from Ospovat 1981, p. 122):

(1) The more general characters of a large group of organisms appear earlier in their embryos than special characters; (2) from the most general forms, the less general are developed, until finally the most special arises; (3) the embryo of a given animal form, instead of passing through other forms, rather becomes separated from them; and (4) fundamentally, therefore, the embryo of a higher form is never identical to any other form, but only its embryo.

Von Baer’s great insight was that these de facto constraints limit the manner in which complexity can emerge in organisms. Although it is common to refer to them as part of a “law,” most theorists treat them as probabilistic properties of development (Gould 1977). Von Baer’s law is often subsumed under the dictum: Differentiation proceeds from the taxonomically general to the particular. An important caveat is that this dictum best captures an observed pattern found in closely related ontogenetic trajectories (Arthur 2002).

Von Baer’s law provides a new way to think about the constrained flexibility of grammar acquisition. Perhaps children begin the task of acquiring language with a grammatical potential that becomes differentiated through a process of constructive interaction with the external linguistic environment. Separate dialects, and even idiolects, can be viewed as the products of related developmental trajectories.

According to our dictum then, earlier stages of grammar should be more taxonomically general than later ones. We need to be careful here, though. An overly strong reading of this generalization would exclude the possibility that grammatical rules or principles are the result of induction and would thus be incompatible with the very evidence that supports the move to emergentism. Fortunately, a more moderate reading is available: the developmental constraints embodied in this law can be seen as causally connected to learning mechanisms. More specifically, we can grant that learning plays an important role in the acquisition of particular grammatical rules or principles while maintaining that general developmental constraints influence which rules or principles are learned at a given point of development. To give away my punch line, I propose that the course of grammar acquisition is shaped by the fact that what is learned early in development has deeper and wider influences than what is learned later in development.

Although von Baer’s law provides a way of identifying a form of dynamical foundationalism, it rests on the admittedly impressionistic qualitative notion of taxonomic generality. What we need is some way of capturing the relevant generalizations with greater precision. Fortunately, a model of development, the Developmental Lock, has been designed to show how the differential entrenchment of developmental achievements can lead to the sort of evolutionary conservatism captured by von Baer’s law (Wimsatt 2007a). This model has proved useful in a number of biological domains (Glassmann and Wimsatt 1984; Rasmussen 1987; Wimsatt and Schank 1988, 2004). The core idea behind it—that some traits are more generatively entrenched than others—has also been applied to cultural evolution (Wimsatt 2007b; Wimsatt and Griesemer 2007).

The model is inspired by an influential argument highlighting the computational advantage of hierarchical functional decomposition (Simon 1996). Simon contrasts the task of cracking two nearly identical safes, each of which has 10 dials. The first safe works normally and only opens when all of the dials are set correctly, while the other inadvertently clicks whenever an individual dial is set correctly. Simon’s analogy is intended to show how the ability to break a complex task down into autonomous sub-tasks can dramatically reduce the original task’s overall difficulty. It can be captured by means of two versions of the cylindrical combination lock represented in Fig. 1.
Fig. 1

A cylindrical combination lock (adapted with permission from Wimsatt 1986)

The complex version of this lock is a normal combination lock. Here, the correct ten-digit solution is only discoverable as a complete solution. Because someone trying to open this lock must consider 10¹⁰ possibilities, the expected number of trials to solve it is 10¹⁰/2 = 5 × 10⁹. The simple version of this lock is similar to the complex one, except that a click is heard when each wheel is turned to its correct position. This allows for independent partial solutions. Someone trying to open this lock can find the correct setting for each wheel without having to worry about the settings of the other wheels. The expected number of trials to solve this lock is 10 × (10/2) = 5 × 10¹. The computational difference between these two locks is clear: solving the simple version is easier than solving the complex one by eight orders of magnitude.
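The two estimates can be checked with a few lines of arithmetic. The sketch below simply encodes the text’s convention that guessing one value among n possibilities takes n/2 trials on average; the function names are mine, not the article’s.

```python
# Expected trials for the two versions of the cylindrical lock, using the
# convention from the text that finding one value among n takes n/2 trials
# on average.

def complex_lock_trials(wheels=10, positions=10):
    # Only the complete ten-digit combination opens the lock, so all
    # positions**wheels possibilities must be searched.
    return positions ** wheels / 2

def simple_lock_trials(wheels=10, positions=10):
    # Each wheel clicks independently, so the wheels can be solved one
    # at a time.
    return wheels * (positions / 2)

print(complex_lock_trials())  # 5000000000.0, i.e. 5 x 10^9
print(simple_lock_trials())   # 50.0, i.e. 5 x 10^1
```

The ratio between the two, 10⁸, is the computational payoff of decomposing the task into independent sub-tasks.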

The Developmental Lock builds on these two locks. As with the simple version of the cylindrical lock, it clicks when a wheel is in its correct position. However, the solution to a particular wheel is determined by the settings of all the wheels to its left (whether or not these wheels are set correctly). This directional context-dependency is indicated in Fig. 2.
Fig. 2

The Developmental Lock (adapted with permission from Wimsatt 1986)

The lines to the right of each wheel indicate the number of downstream wheels whose solution depends on the setting of that wheel. A consequence of the structure of this lock is that a change in the setting of a particular wheel resets the solutions to all the wheels that lie to its right.

It may help to imagine a 3 × 3 version of the Developmental Lock (one with 3 wheels and only 1, 2, and 3 as possible solutions to each wheel). The solution to the left-most wheel of this lock would not be determined by the setting of any other wheel. For example, it might be 3. The solution to the second wheel from the left would be determined by the setting of the first wheel. It might be 2 if the first wheel is set to 1, 3 if the first wheel is set to 2, and 1 if the first wheel is set to 3. The solution of the third wheel from the left would be determined by the settings on the first two wheels. Thus, it might be 3 when the first and second wheels are set to 1 and 2, respectively, or when they are set to 3 and 1, but it might be 2 when they are set to 2 and 1.
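This toy lock can be written down directly as a set of lookup tables. Only the combinations mentioned in the text are filled in; the tables are otherwise illustrative.

```python
# The 3 x 3 Developmental Lock from the text, with each wheel's solution
# encoded as a lookup table keyed by the settings of the wheels to its left.
wheel1_solution = 3                                   # fixed: no wheels to its left
wheel2_solution = {1: 2, 2: 3, 3: 1}                  # keyed by wheel 1's setting
wheel3_solution = {(1, 2): 3, (3, 1): 3, (2, 1): 2}   # keyed by (wheel 1, wheel 2)

# Resetting an upstream wheel changes the solutions downstream: with wheels
# 1 and 2 set to (1, 2), wheel 3's solution is 3; with (2, 1), it is 2.
print(wheel3_solution[(1, 2)], wheel3_solution[(2, 1)])
```

The directional context-dependency is visible in the keys: each table consults only wheels to the left, never to the right.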

The wheels of the Developmental Lock can be taken to represent the developmental stages of an organism, with time proceeding from left-to-right. The algorithm that relates the local solution of an individual wheel to the settings of the wheels to its left is intended to capture the dependence of later developmental stages on earlier ones. Local solutions to the lock are meant to represent adaptive traits at that particular developmental stage (given what has already transpired). Randomly resetting a wheel is meant to be analogous to the introduction of a perturbation at a particular developmental stage.

The visual representation of the model is misleading in two ways: first, by associating developmental stages with individual wheels, it gives the false impression that each stage must be a coherent whole rather than a heterogeneous collection of phenotypic features, and, second, by providing no representation of the role played by environmental influences, it gives the false impression that the relevant developmental dependencies are purely internal. A more perspicuous—and also more complex—version of the model would indicate the dynamic role played by the environment in the generation of characteristic developmental dependencies (for an example of such a model see Wimsatt and Schank 2004).

If worked from left-to-right, the Developmental Lock is fairly easy to solve, but, if worked from right-to-left, it is extremely difficult to solve. In order to see why, consider each direction in turn. Start first from the left-most wheel. Since this wheel has no wheels to its left, its local solution can be determined directly (in an estimated 5 trials). According to the algorithm, the local solution to the next wheel is now dependent on the setting of this first wheel. Proceeding in this direction, one can manipulate each wheel until it clicks and be assured of eventually finding the lock’s global solution. Working the lock in the opposite direction is not as effective. Starting with the right-most wheel, one should still find its local solution (given the current settings of the wheels to its left) in approximately 5 trials. There is only a 1 in 10⁹ chance, though, that this local solution represents a part of the lock’s global solution. When one adjusts the penultimate wheel to its locally correct position, there is a 9 in 10 chance that it will scramble the local solution of the right-most wheel and only a 1 in 10⁸ chance that it will represent a unified solution with respect to the 8 wheels to its left. As one moves from right to left, one makes no cumulative progress towards the global solution.
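The left-to-right asymmetry can be simulated. The dependency rule below is a stand-in of my own (any deterministic function of the left context would do); what matters is that a left-to-right pass finds the global solution in at most wheels × positions guesses, whereas no such bound exists working right-to-left.

```python
# A 10-wheel Developmental Lock with a hypothetical dependency rule: the
# correct setting of wheel i is a deterministic function of the settings
# of wheels 0..i-1.
WHEELS, POSITIONS = 10, 10

def local_solution(i, settings):
    # Illustrative rule, not from the article: any function of the
    # left context exhibits the same directional behavior.
    return (7 * i + 3 * sum(settings[:i])) % POSITIONS

def solve_left_to_right():
    settings = [0] * WHEELS
    trials = 0
    for i in range(WHEELS):
        for guess in range(POSITIONS):  # at most 10 guesses per wheel
            trials += 1
            if guess == local_solution(i, settings):
                settings[i] = guess     # this wheel "clicks"; move right
                break
    return settings, trials

settings, trials = solve_left_to_right()
# Every wheel clicks given the wheels to its left, after at most 100 trials.
assert all(settings[i] == local_solution(i, settings) for i in range(WHEELS))
print(trials)
```

Because each wheel, once set, is never revisited, progress is cumulative; worked right-to-left, every adjustment upstream would invalidate the work already done downstream.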

The model must be expanded because there is often more than one adaptive developmental pathway in a given system. Wimsatt (1986) proposes that we add more numbers to each wheel and posit more than one local solution. For explanatory convenience, he makes the unrealistic assumption that each wheel has the same number of possibilities and local solutions. The chance that a setting is correct for a particular wheel can then be represented as k (where 0 < k < 1). This means that the probability that a random resetting of a wheel that is m wheels from the right represents a part of a global solution is kᵐ. This feature of the model helps capture the tendency for evolution to be inherently conservative with regard to the functional structures of early developmental stages. Wimsatt explains (1986, p. 37; italics in the original):

Since k is between 0 and 1, then kⁱ < kʲ if and only if i > j. Thus the proportion of mutations which are adaptive declines exponentially at earlier stages (i.e. with greater values of m) in development, so we would expect that evolution should be increasingly conservative at earlier stages of development because features which are expressed earlier in development [1] have a higher probability of being required for features which will appear later, and [2] will on average have a larger number of “downstream” features dependent on them.

Because early achievements tend to have more consequences than later ones, any disruption of their acquisition tends to have a greater impact on the developing organism.
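Wimsatt’s point can be put numerically. With an illustrative value of k (the value is mine; k = 0.1 corresponds to a 10-position wheel with a single local solution), the chance that a random change m wheels from the right still lies on a global solution falls off exponentially:

```python
# Probability that randomly resetting a wheel m wheels from the right is
# still part of a global solution: k**m, where k (0 < k < 1) is the
# per-wheel chance that a random setting is locally correct.
k = 0.1  # illustrative value, not fixed by the model
for m in (1, 2, 5, 10):
    print(m, k ** m)
# The earlier the developmental stage, the larger m and the smaller k**m,
# so perturbations of early stages are exponentially less likely to be
# tolerated.
```

This is the quantitative core of the prediction that evolution conserves early stages: the tolerable mutation space shrinks by a factor of 1/k for every additional downstream dependency.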

The Developmental Lock is built on the presumption that developmental achievements differ in the degree to which they are generatively entrenched (Wimsatt 1986, 1999, 2003). Generative entrenchment is a measure of a feature’s characteristic downstream effects. An important benefit of this quantitative property is that it has a natural association with the qualitative property of taxonomic generality, because the more taxonomically general a trait is, the more likely it is to have multiple downstream dependencies. The notion of generative entrenchment, however, is broader than that of taxonomic generality. For example, a pleiotropic gene that contributes significantly to the development of several distinct phenotypes may not have a clearly specifiable overall function but nevertheless be deeply entrenched.

Generatively entrenched features characteristically exhibit temporal priority and developmental resilience (Rasmussen 1987). Deeply generatively entrenched features tend to emerge before those that are less generatively entrenched because early emergence facilitates greater developmental dependencies. Because the failure to acquire deeply generatively entrenched features often has cataclysmic results, natural selection tends to favor developmental pathways in which they are buffered (by genetic or epigenetic means) against environmental variation. It is important to keep in mind that the predictions derived from the Developmental Lock are probabilistic. Admittedly, there are exceptions. Genetic mutations, for example, can be neutral with regard to an organism’s fitness. Vestigial features may also emerge early in development and yet have limited functional consequences. These unusual cases, however, are exceptions that prove the rule. The empirical insight behind the model is that deeply entrenched features tend to emerge earlier and be more robust than less entrenched ones.

Generative entrenchment and grammar

Let us take stock of what we have achieved so far. We began with the insight that viewing grammar as a developmental phenomenon leads to the prediction that the acquisition process should be shaped by the constraints embodied in von Baer’s law. This suggests that the constructions, principles, or rules at work in early child language should exhibit greater taxonomic generality than those at work in later child language. We then introduced a model of development, the Developmental Lock, which captures significant aspects of von Baer’s law. This model replaces the vague qualitative notion of taxonomic generality with the more precise quantitative notion of generative entrenchment and predicts that deeply generatively entrenched grammatical achievements should emerge earlier and be more developmentally resilient than those that are less generatively entrenched.

In order to assess these predictions, we need to identify some of the fundamental elements of grammar. I am using the term broadly as it is used within contemporary linguistics to encompass both the syntactic and morphological properties of a language. Roughly put, syntax concerns how words are combined to form phrases and sentences, and morphology concerns word formation. Morphemes are the smallest meaningful units in a language and can be either bound or free. Bound morphemes (such as the past tense morpheme –ed in English) must occur with other morphemes within a single word, and free morphemes (such as the auxiliary verb may in English) can occur on their own.

On a more general level, languages are made up of thematic and functional elements. Thematic elements are linguistic items that tend to have semantic content, such as nouns, verbs, adjectives, and prepositions; functional elements are grammatical functors that tend to have little or no semantic content, such as auxiliaries, case-markers, complementizers, and tense-markers. The exact status of this broad but useful distinction is not settled. On the one hand, it seems reasonable to treat it as a matter of degree. After all, both thematic and functional elements have grammatical properties, and some functional elements have more concrete semantic content than others. On the other hand, there is reason to treat it as categorical. For example, some evidence from cognitive neuroscience and neuropsychology suggests that these elements are processed by different neuroanatomical systems (e.g. Diaz and McCarthy 2009; Friedmann and Grodzinsky 1997; King and Kutas 1995; Neville et al. 1992). However this issue is resolved, it remains the case that there are independent reasons to draw some version of the thematic/functional distinction.

The next two sections survey the empirical evidence concerning the acquisition of grammar under normal and strained circumstances, respectively. First, I show that some aspects of grammar regularly emerge earlier in development than others. In particular, the grammatical properties of the phrases headed by thematic elements (such as noun phrases [NP], verb phrases [VP], prepositional phrases [PP], and adjectival phrases [AP]) tend to emerge before those of the phrases headed by functional elements (such as complementizer phrases [CP], agreement phrases [AgrP], and tense phrases [TP]; Pollock 1989). Second, I show that those aspects of thematic grammar that are acquired earlier tend to be more resilient in the face of environmental and/or genetic perturbation than those that are acquired later.

Prediction I: Temporal priority

Ascertaining the developmental timing of grammatical achievements is not a simple matter. Because speech production imposes cognitive demands beyond those imposed by speech comprehension (by requiring further motor planning and greater use of working memory), researchers generally acknowledge that grammatical competence is likely to be evident earlier in comprehension than in production. Despite this, the majority of research on the acquisition of grammar has involved naturalistic and experimental studies of productive speech. The reason for this focus on production is largely practical: it is simply easier to identify when a child has gained the ability to produce utterances that conform to a grammatical principle or rule than to identify when she has gained the ability to use that principle or rule in comprehension. Within the past two decades, though, new comprehension-based paradigms have emerged. Research employing these paradigms has shown that children are sensitive to grammatical properties much earlier than is evident in their speech productions (Lust 2006). In the end, understanding the course of acquisition requires research into both production and comprehension.

The evidence

Psycholinguists often speak of three developmental stages distinguished by differences in the mean length of utterance or MLU (identifying a one-word stage, a two-word stage, and a multi-word stage). However, talk of stages may rely too heavily on the myth of the average child. Evidence suggests that there can be significant individual variation in the timing of grammatical achievements (Bates et al. 1988, 1995). In addition, evidence also suggests that exogenous factors such as the structure of the target language (Demuth 1989; Slobin 1985) and the statistical properties of child-directed speech (Shatz et al. 1989) may influence the timing of grammatical achievements. Despite these complexities, it remains a fact that children learning a particular language regularly acquire some productive grammatical abilities before others. Although there is variation with respect to the precise age at which individuals acquire particular grammatical structures in productive speech, the order in which they acquire them remains fairly consistent (Brown and Fraser 1963; Brown and Bellugi 1964; Brown and Hanlon 1970; Brown 1973).

A child typically begins uttering her first words at 12 months. By 18 months or so, she is likely to be in the midst of a “word spurt” in which there is a noticeable increase in her use of new words (Nelson 1973). Around the same time, she is likely to begin stringing words together in short phrases (Bates et al. 1988). The utterances that she produces at this stage have been described as telegraphic (Brown and Bellugi 1964) because they consist primarily of thematic elements and typically lack functional elements (Bates et al. 1995). When bound morphemes are acquired early in morphologically complex languages, they tend to have thematic content and are often the equivalents of English pronouns or prepositions (Peters 1995).

Young children have command of some important syntactic properties from the time that they begin to produce their first multi-word utterances. For example, they rarely make errors with regard to word order (Brown 1973; Braine and Bowerman 1976; Pinker 1984). Errors that one would expect if early grammars were organized around purely semantic principles are not attested. For instance, nouns that refer to events are not generally miscategorized as verbs (Maratsos 1982). In addition, children at this stage are able to use syntactic cues in the acquisition of word meanings (Gleitman 1990). Examining a large naturalistic corpus of the early multi-word utterances of several English-speaking children, Radford (1990) finds a number of positive instances of phrases headed by syntactic categories that typically have thematic content. Some examples are (ages given in months):
(1) Hayley draw boat. Want Lisa. Turn page. (Hayley, 20)

(2) Nice book. Good girl. (Paula, 18) Want cup tea. (Jenny, 21) Blue ball wool. (Jonathan, 23)

(3) Paula play with ball. Go in there. Hat on baby. (Paula, 18)

(4) Sausage bit hot. (Jem, 23) Very very naughty. (Chrissie, 23)


The first group of utterances seems to demonstrate an ability to form verb phrases, the second noun phrases, the third prepositional phrases, and the fourth adjectival phrases (Radford 1990, 1995).

The common omission of functional elements in early multi-word speech has led some researchers to propose that children at this point in development lack an awareness of their grammatical role. An influential proposal offered by Radford (1990, 1995) holds that children go through a stage in which the highest syntactic structure is VP (see Rizzi 1994 for a more flexible variant of this approach). According to this proposal, as the genetically encoded language faculty matures, the grammatical principles that enable children to handle functional elements come on-line.

Although Radford’s maturational hypothesis has initial promise, it faces at least three significant problems. The first is that functional elements are not completely absent from early multi-word utterances (Pine and Lieven 1993). For instance, young children appropriately use some wh-question words (words like who, what, where, how, why, and when). In order to assess the importance of this fact, a little background is needed. One of the reasons to posit phrases headed by functional elements is that they provide an explanation for long-distance dependencies that exist between different structural positions in a sentence. In English there are a couple of ways to ask questions containing wh-words. One is to simply replace a content word with a wh-word (Jill is eating what?). A second involves a fronted wh-word (What is Jill eating?). This second technique is particularly interesting because it can be used to create a dependency between a constituent position within an embedded clause and one that is superordinate to the main clause. Consider the question, What did Jimmy think that Beth would encourage Jill to eat? Here again, the query involves something eaten by Jill. The wh-word is thus intended to refer to the patient of the action expressed by the verb in the lowest clause. Traditionally, generative grammarians have explained these questions by means of a pair of transformations: one that inverts the subject and the auxiliary verb (known as subject-auxiliary inversion) and one that moves the wh-word from its lower clause position, where it receives its thematic role, to its fronted and superordinate surface position (known as wh-movement). Both transformations are generally thought to involve phrases headed by functional elements. Because transformations involve a mapping between two or more phrase structures, there is a fairly straightforward sense in which these long-distance dependencies represent a significant level of grammatical complexity.

Radford (1990) argues that the presence of wh-words in early multi-word speech does not conclusively show that children at this point in development have acquired the grammatical rules or principles associated with wh-questions. He maintains that we need to distinguish between multi-word utterances that are unanalyzed wholes or, at least, partially analyzed constructions and those that involve an understanding of grammatical rules or principles. Many studies suggest that early utterances are formulaic in character. The following utterances from Tomasello’s diary study of his two-year-old daughter are representative (adapted by Diessel 2004 from Tomasello 1992):
(5)  That’s Daddy.              More corn.        Block get-it.
     That’s Weezer.             More that.        Bottle get-it.
     That’s my chair.           More cookie.      Phone get-it.
     That’s him.                More mail.        Mama get-it.
     That’s a paper too.        More popsicle.    Towel get-it.
     That’s Mark’s book.        More jump.        Dog get-it.
     That’s too little for me.  More Pete water.  Books get-it.
In keeping with this observed tendency, evidence suggests that early questions containing wh-words are based on a small number of formulas (Clancy 1989; Dabrowska 2000). For instance, Fletcher (1985) reports that almost all the early wh-questions of an intensively studied child consisted of one of three formulas (How do…, What are…, and Where is…). Further support for the claim that early wh-questions are formulaic is provided by Radford’s (1990) finding that children who have mastered English subject-verb agreement in other contexts fail to exhibit it in their wh-questions (e.g. What color is these?, What’s those?, and Where’s my hankies?).

While the formulaic character of early wh-questions enjoys empirical support, the attempt to use this fact to defend a maturational account of the acquisition of functional elements is problematic. First of all, the extended transition from formulaic use to full competence fits more naturally with a gradualist account of development than with a saltationist one. A defender of maturation might respond to this by suggesting that the process of maturation is itself gradual. This is insufficient, however, because research on the acquisition of complex constructions such as complement and relative clauses suggests that this pattern—early formulaic use followed by ever greater generalization—is a common one (Diessel 2004). The ubiquity of this pattern undermines any appeal to a specialized maturational process operating at a specific point in early development.

The second significant problem faced by Radford’s maturational hypothesis is that evidence suggests that the timing of acquisition can be manipulated by the richness of the linguistic input. For instance, children learning Sesotho (a language in which passive sentences are more prevalent in adult speech than they are in English) acquire passives much earlier than children learning English (Demuth 1989). In general, there is a degree of crosslinguistic variation in the timing of the emergence of functional elements in early child speech (Lust 2006). Some experimental studies also suggest that the richness of the linguistic input available to a child can affect the timing of grammatical achievements. For instance, Vasilyeva et al. (2006) experimentally manipulated the linguistic input received by preschoolers. Their study involved 10 sessions in which four-year-old children were told stories containing either a high proportion of active sentences or a high proportion of passive sentences. Children who heard the stories containing a high proportion of passive sentences produced more passive constructions with fewer mistakes on a neutral production task. Differentially enriched input has also been shown to affect the timing of the acquisition of auxiliaries (Shatz et al. 1989).

The third significant problem faced by this maturational hypothesis is that, with the advent and increased use of newer comprehension-based research paradigms, evidence has been mounting that young children have a greater awareness of functional elements than the production data would seem to indicate. Some early comprehension studies were suggestive. For example, Shipley et al. (1969) found that young ‘telegraphic-speakers’ (18–33 months) responded better to commands containing functional elements than commands lacking them (e.g. throw me the ball vs. throw ball). More recently, using a preferential-looking paradigm, Golinkoff et al. (2001) found that infants (18–21 months) were sensitive to the grammatical presence of the suffix –ing on verbs. Using a different measure (head turn preference), Santelmann and Jusczyk (1998) tested whether 15-month-old or 18-month-old infants were sensitive to the grammatical dependency in English between the auxiliary is and the progressive morpheme –ing. They found that 18-month-olds, but not 15-month-olds, listened significantly longer to the passages with the well-formed English dependency. Evidence also shows that 19-month-olds, but not 16-month-olds, are sensitive to the presence of the present tense verbal inflection –s (Soderstrom et al. 2002). Soderstrom et al. (2007) found that 16-month-old infants preferred grammatical sentences over ungrammatical ones when ungrammaticality was cued by both misplaced inflection and word-order violations. Most of these studies point to an early sensitivity to the presence of functional items but fail to show that infants have an understanding of their grammatical role. Some recent studies, though, provide an initial indication of such understanding. For instance, Kedar et al. (2006) provide evidence that the presence of the determiner the helped 18-month-old infants orient faster to a visual target than they did when they heard sentences in which the determiner was dropped, replaced with a nonsense function word, or replaced with an alternate function word.

Clearly this evidence fits poorly with the idea that children who omit functional elements in their speech completely lack an awareness of them. Moreover, functional elements have a number of properties that make them well-suited to play an important role in the acquisition of grammar (Kedar et al. 2006). First, despite the fact that they are often less phonologically salient than thematic elements, they occur with much greater frequency in speech than thematic elements do. Second, functional elements regularly occur with specific syntactic types. Their presence may thus help children discover the syntactic class of thematic elements. Finally, functional elements generally occur in fixed positions in phrases (e.g., in English, determiners always occur at the beginning of a noun phrase). This may help children segment continuous speech into appropriate constituents. Taking these considerations into account, it seems reasonable to hypothesize that functional elements help the child to categorize thematic elements and to form sentence skeletons (Lust 2006).

While the comprehension data suggest that infants are sensitive to the presence of functional elements much earlier than the production data would seem to indicate, they do not show that infants have a complete understanding of the grammatical processes associated with agreement, tense, and subordinate clauses. After all, the comprehension evidence primarily concerns sensitivity. Reviewing the extant comprehension data from infants learning English, Soderstrom et al. (2007) provide a timeline of the development of grammatical knowledge. On this timeline, some of the earliest achievements are sensitivity to the word order of determiners and nouns (10 months) and the ability to use frequent frames to group words into categories (12 months). Sensitivity to auxiliary/inflection dependencies and the presence of the third-person singular present tense inflection –s does not emerge until later (18 months). This is around the same time that research indicates that infants are able to use word order to determine thematic roles in simple noun–verb–noun sentences (Hirsh-Pasek and Golinkoff 1996). Use of verb inflections in speech production does not typically begin until around 24 months.

Evaluating the model

When we survey the production and comprehension evidence as a whole, several conclusions are justified. First, the evidence suggests that the acquisition process is gradual and cumulative. Indeed, one of the interesting contributions of the comprehension evidence is that it extends the period during which grammar emerges. Second, the process appears to be restrained to a significant degree. In production, errors of omission are much more common than errors of commission, and, in comprehension, sensitivity precedes full understanding. Third, the precise timing of some grammatical achievements can be influenced by local factors such as the structural properties of the particular language being acquired. For instance, richer and more regular inflectional systems tend to be acquired earlier than more impoverished and irregular ones (Deen 2009). Finally, in the face of this crosslinguistic variation, there is a marked tendency for children to acquire knowledge of a bare bones phrase structure grammar built around thematic syntactic categories before acquiring knowledge of a more complex grammar that fully incorporates functional elements.

How does this match up with the prediction of the model? At a macro-level, it matches very well. In general, the grammatical processes associated with thematic elements appear to be fully functional earlier in development than those associated with functional elements. This fits with the proposal that the former are more generatively entrenched than the latter. At a micro-level, though, the model seems less successful. The production evidence suggests that children acquire grammatical knowledge through a restrained process of generalization from knowledge of specific constructions. While the evidence from comprehension is more equivocal, it seems reasonable to hypothesize that the pattern is similar. If so, how is this to be squared with the notion of generative entrenchment?

One solution is to see this micro-level pattern as the result of the learning mechanisms involved in acquiring language. Positing that some aspects of development should be explained in terms of generative entrenchment should not be seen as excluding the possibility of learning. Instead, learning and generative entrenchment should be seen as countervailing developmental forces. This should not be surprising because generative entrenchment is proffered as a developmental analog of innateness (Wimsatt 1999). Empiricists traditionally recognize that learning requires some innate knowledge. If we replace the construct of innate knowledge with that of deep generative entrenchment, the idea would be that learning requires that some achievements be more generatively entrenched than others. From this perspective, learning and generative entrenchment are two sides of the same developmental coin.

An example from cultural evolution might help show how the circumstances of entrenchment can be distinct from the proximal circumstances of emergence. Consider the QWERTY keyboard found on most computers. Various functional, historical, and sociological factors went into the selection of this configuration. It seems unlikely that this particular configuration is ergonomically optimal. Nevertheless, once it became the standard, it achieved a high degree of generative entrenchment. To a large extent, the reason that it persists has less to do with the circumstances of its original emergence and more to do with everything that depends on it.

The Developmental Lock builds on the recognition that it is an architectural feature of developmental systems that traits will exhibit differential generative entrenchment. This means that particular contingencies will have greater consequences than others. History matters in development. Given this fact, selective forces operating on repeated developmental processes will tend to favor the early emergence of highly entrenched traits. This can occur even with a trait that is acquired through learning because selective forces can shape the learning process. With respect to grammar, the perceptual salience of particular linguistic elements, the statistical properties of the input, the nature of interpersonal dynamics, and the presence of biologically determined biases are just some of the manifold ways in which the early emergence of some learning achievements might be secured.
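The asymmetry at the heart of the Developmental Lock can be illustrated with a small Monte Carlo sketch. This is my own illustration rather than Wimsatt’s formal treatment, and the wheel counts, the blind-search procedure, and the assumption of exactly one good setting per wheel are simplifying assumptions: when each wheel’s correct setting depends on the settings to its left, perturbing an early wheel forces a re-search of every downstream wheel, so early wheels are costlier to change and, in that sense, more deeply entrenched.

```python
import random

def repair_cost(n_wheels, positions, perturbed, trials=3000):
    """Estimate the expected number of random guesses needed to
    re-solve all wheels downstream of a perturbed wheel.  Each
    downstream wheel is assumed to have exactly one good setting
    (conditional on the wheels to its left), found by blind search."""
    total = 0
    for _ in range(trials):
        # every wheel to the right of the perturbed one must be re-solved
        for _ in range(perturbed + 1, n_wheels):
            guesses = 1
            while random.randrange(positions) != 0:  # 0 = the good setting
                guesses += 1
            total += guesses
    return total / trials

# Perturbing the first wheel (deeply entrenched) is far costlier to
# repair than perturbing a late wheel (shallowly entrenched).
early = repair_cost(10, 10, 0)
late = repair_cost(10, 10, 8)
```

With ten wheels of ten positions each, a perturbation of the first wheel requires re-solving nine downstream wheels (on the order of ninety expected guesses), while a perturbation of the ninth requires re-solving only one (on the order of ten). This difference in repair cost is one way to make the probabilistic content of generative entrenchment concrete.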

A deflationary concern

One might worry that the appeal to generative entrenchment is unnecessary because researchers have already identified a number of potential proximal causes of the relatively delayed acquisition of the grammatical processes associated with functional elements. Examples include the relative lack of phonological salience of functional elements (Peters 1995); the computational complexity of the grammatical processes themselves (Diessel 2004; Jakubowicz and Nash 2001); the structure of maternal speech or “motherese” (Hoff 2006; Hoff-Ginsberg 1985); and the degree to which these grammatical processes are orthogonal to the communicative interests of young children. Let us call this the alternative-hypotheses objection. At its core is the idea that the success of any of these proposals would eliminate the need for an appeal to generative entrenchment. This objection misses the mark because it sets up a false opposition between proximal explanations and an explanation in terms of generative entrenchment. In other words, the objection fails because these are not mutually exclusive types of explanations.

The alternative-hypotheses objection is built on the assumption that an explanation based on generative entrenchment is in competition with explanations involving proximal causes. Recall that the generative entrenchment of a phenotype is defined relative to its effects and not to its causes. Generative entrenchment is a measure of developmental importance, and the fact that deeply entrenched traits tend to emerge earlier in development has to do with the manner in which selective forces shape development. A consequence of this is that, within an evolutionary-developmental conceptual framework, generative entrenchment can be part of an ultimate explanation for the early emergence of a trait. For this reason, the identification of proximal causes does not exclude the relevance of generative entrenchment to explaining the emergence of that trait. This point cannot be emphasized enough: an appeal to generative entrenchment does not exclude careful investigation of proximal causes. Indeed, it predicts that such causal factors will exist while remaining somewhat neutral with respect to what they might be.

Those who formulate the objection tend to see the fact that there are numerous potential causes for the crosslinguistic delay in the acquisition of the grammatical processes involving functional elements as a strong point in their favor. A reasonable interpretation of this multiplicity, however, is that the proper explanation for the crosslinguistic delay is likely to be multi-causal. After all, each of the proposals outlined above enjoys some empirical support. Furthermore, languages are likely to vary with respect to the relevant properties (such as the phonological salience of their functional elements). The Developmental Lock predicts that, even in the face of this variation, more generatively entrenched grammatical properties should emerge before less generatively entrenched ones. It is not clear that this generalization can be captured at the level of specific proximal causes. In sum, the appeal to generative entrenchment (which focuses on a trait’s effects) complements, rather than conflicts with, research into proximal causation.

Prediction II: Developmental resilience

The Developmental Lock predicts that, all things being equal, more generatively entrenched aspects of grammar should be more developmentally resilient than those that are less generatively entrenched. Because generative entrenchment is a measure of developmental centrality, the failure to achieve deeply generatively entrenched traits is likely to be maladaptive. For this reason, there should be selective pressure to buffer these traits against environmental and even genetic variation. Wimsatt (1999, p. 142) explains, “Features (whether contingent or not) that accumulate many downstream dependencies become deep necessities, increasingly and ultimately irreplaceably important in the development of individual organisms.” What I hope to show is that there is reason to think that certain aspects of grammar have become just such deep necessities.
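Because generative entrenchment is defined over a trait’s downstream effects, it can be pictured as a dependency count over a directed graph. The sketch below is illustrative only: the trait names and the dependency structure are invented assumptions, not claims drawn from the psycholinguistic literature.

```python
def entrenchment(depends_on_me):
    """Generative entrenchment as a downstream-dependency count.
    `depends_on_me` maps each trait to the traits that directly
    depend on it; a trait's score is the number of its direct
    and indirect (transitive) dependents."""
    def dependents(trait, seen):
        for d in depends_on_me.get(trait, []):
            if d not in seen:
                seen.add(d)
                dependents(d, seen)
        return seen
    return {t: len(dependents(t, set())) for t in depends_on_me}

# A toy (hypothetical) dependency structure for grammar acquisition:
deps = {
    "phrase_structure": ["agreement", "wh_movement"],
    "agreement": ["tense_marking"],
    "wh_movement": [],
    "tense_marking": [],
}
scores = entrenchment(deps)
```

On this toy structure, basic phrase structure has the most downstream dependents, so the model predicts that it should emerge earliest and, since its failure would disrupt everything built on it, be the most developmentally resilient; this is the sense in which deeply entrenched traits become “deep necessities.”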

Whenever one compares atypical developmental trajectories to typical ones, the degree to which they share processes remains an open question. For this reason, we cannot immediately assume that what we discover about the former carries over to the latter. Moreover, the claim that deeply entrenched traits should be developmentally resilient does not require a clear and direct relationship between atypical and typical development. Indeed, it begins with the assumption that the relationship between the two is likely to be complex and then predicts that a general pattern will emerge out of this complexity.

Below, I review some of the evidence that supports the prediction of developmental resilience. Psycholinguists have investigated language acquisition under various forms of strained circumstances. These can be roughly divided into two classes: the first involves neurologically intact children who have received deficient input and the second involves children whose cognitive capacities are impaired in some way.

The evidence

An obvious way to test developmental resilience would be to run deprivation experiments in which age-matched children received different levels of exposure (or lack of exposure) to language. This would clearly be unethical and, thankfully, has not been pursued. Research into the acquisition of sign language, though, has uncovered instances of children who have received deficient or delayed input for various external reasons. This research reveals that certain aspects of grammar are more resilient than others.

With sign language, it is often possible to document the age of first exposure and the nature of the input available. Many deaf children are born to non-signing hearing parents and are not exposed to a full-blown sign language until they attend a specialized school. Age of initial exposure to a sign language has been shown to be a statistically relevant factor in grammatical achievement. Late learners of American Sign Language exhibit a discrepancy between their facility with handling word order (an indication of the ability to handle the basic properties of phrase structure grammar) and their difficulties with handling more complex morphosyntactic processes (Newport 1988, 1991, 1994). Verb agreement seems to be particularly challenging for late learners (Lillo-Martin 2009). For instance, Emmorey et al. (1995) found that late-learners of ASL experienced difficulties with on-line processing of verb agreement but did not experience the same difficulties with aspect morphology (which marks semantic distinctions).

Studies of the grammatical abilities of children with neuro-developmental disorders that affect language acquisition provide another means of evaluating developmental resilience. Below, I discuss evidence from studies of children with Down syndrome, Specific Language Impairment (SLI), and Autism Spectrum Disorders (ASD) and show that a similar developmental pattern arises in each group.

Individuals with Down syndrome (Trisomy 21) tend to have more difficulties with language than with other areas of cognition (Chapman and Hesketh 2000; Fowler et al. 1994; Miller 2004). A large body of evidence suggests that morphosyntax is more impaired than other areas of grammar (for reviews see Rosenberg and Abbeduto 1993; Rice et al. 2005). For instance, a number of studies report that morphosyntactic abilities are relatively more delayed than other grammatical abilities in children and young adults with Down syndrome (e.g. Kernan and Sabsay 1996; Vicari et al. 2000). Other studies indicate that their morphosyntactic abilities are not merely delayed but also qualitatively different than those of typically developing individuals (Fabbretti et al. 1997; Perovic 2001; Ring and Clahsen 2005; Schaner-Wolles 2004). In general, research supports Rosenberg and Abbeduto’s (1993, pp. 91–92) conclusion that “the vast majority of persons with Down syndrome suffer from a specific deficit that interferes more with the acquisition of grammatical morphemes than with the mastery of basic sentence structures.”

Specific Language Impairment is a heterogeneous disorder that is diagnosed when an otherwise typically developing child experiences particular difficulties with language. This diagnosis is accomplished in part through the exclusion of other potential sources of difficulty such as a loss of sensory capacity, the presence of a psychiatric or neurological condition, or deficient input (Bishop 1997; Leonard 1998). When children with SLI are compared with typically developing children with the same mean length of utterance (MLU), they exhibit particular deficiencies in morphosyntax (Steckol and Leonard 1979). Children with SLI have been shown to omit finite verb morphology (He jump out the swing) and perform poorly on grammaticality judgments involving finiteness for a longer time than typically developing children (Rice 2003; Wexler 2003). They have also been shown to produce more errors with wh-movement (Who did Marge see someone?) than language-matched controls (van der Lely 2005; van der Lely and Battell 2003).

The morphological properties of the target language can affect the timing and success of the acquisition of morphosyntactic knowledge by children with SLI. For example, although German-speaking children with SLI have trouble with agreement and tense inflections, their correct use of these elements exceeds the correct use of agreement and tense inflections by English-speaking children with SLI (Roberts and Leonard 1997). Children with SLI who are learning Spanish and Italian—two languages with rich inflectional systems—exhibit even less difficulty with tense and agreement inflection than do children with SLI who are learning German (Leonard 2009). Although their acquisition of these elements is delayed, they do not exhibit any significant differences in their use of these elements when they are matched with younger MLU controls. This parity with MLU controls does not hold for all functional elements, though. For instance, these children exhibit a relative weakness with respect to function words.

A subgroup of higher-functioning, verbal children with an ASD have trouble with grammar acquisition and end up achieving a grammatical competence similar to that of individuals with SLI (Gernsbacher et al. 2005; Kjelgaard and Tager-Flusberg 2001; Tager-Flusberg 2004, 2007). For instance, English-speaking children within this subgroup have been shown to have trouble with tense-marking elicitation tasks when compared to other high-functioning ASD children who do not have language impairment (Roberts et al. 2004). It should be noted, though, that there is a degree of diagnostic overlap in the social and communicative deficits of ASD and SLI (Leyfer et al. 2008). This overlap raises the possibility of shared etiologic factors between these disorders.

Evaluating the model

These are just some of the various strained circumstances that have been studied. Although a full account of each is beyond the scope of this essay, they are clearly heterogeneous. Some of these circumstances involve neurotypical children who have received deficient or delayed input while others are the result of complex neuro-developmental disorders. The important point is that, despite this manifest variation, the same overall distributional pattern emerges: relative success with the fundamental aspects of phrase structure grammar and difficulties with the complexities of morphosyntax. This pattern has not gone unnoticed by researchers. As mentioned above, several researchers have identified similarities in the grammatical difficulties experienced by individuals with SLI and those experienced by some individuals on the autism spectrum. Several researchers have also remarked that there are important similarities in the morphosyntactic difficulties experienced by individuals with Down syndrome and SLI (e.g. Laws and Bishop 2003; Rice et al. 2005).

Some recent evidence involving typically developing children also supports the notion that some aspects of grammar are more developmentally resilient than others. Vasilyeva et al. (2008) examined the effect of socioeconomic background on the acquisition of syntax. In a longitudinal study covering the period of development in which what the authors refer to as basic and complex syntactic structures emerge, they found that socioeconomic background affected the degree to which complex syntax was mastered. There were no significant differences between the groups with respect to the mastery of basic syntax.

The very fact that this pattern occurs under such diverse circumstances supports the claim of developmental resilience. It shows that certain aspects of grammar are more robust in the face of diverse developmental challenges (either internal or external) than others. This fits with the hypothesis that these aspects are generatively entrenched. While it does not enable us to directly infer that any particular process involved in normal development occurs in the various circumstances considered above, it does seem incompatible with the negative hypothesis that no processes are shared.

Conclusion
The acquisition of grammar has traditionally been viewed through the prism of the innate/acquired distinction. Recently within the philosophy of biology, a number of people have questioned the relevance, expedience, and theoretical foundations of the innate/acquired distinction (Griffiths and Knight 1998; Oyama 1985). These innate/acquired skeptics emphasize the multiplicity of sources of change in developmental systems and argue that the complex and dynamic nature of development undermines any attempt to explain biological structures or processes as primarily the result of either internal or external forces (Griffiths and Gray 1994; Oyama et al. 2001). Their approach represents an explicit effort, not to resolve the innate/acquired distinction, but to sidestep it. A key intuition behind developmental systems theory is that development and evolution should not be bifurcated but, instead, treated as dynamically integrated.

In this essay, I began with the observation that there has been a similar shift in psycholinguistics away from the innate/acquired distinction and toward a developmental point of view that sees grammar as an emergent phenomenon. I then argued that there are good reasons to think that the acquisition of grammar is shaped by developmental constraints. A large body of psycholinguistic evidence suggests that children tend to acquire more generatively entrenched grammatical processes before less generatively entrenched grammatical processes. Evidence also suggests that the grammatical processes acquired early tend to be more developmentally resilient than those acquired later. Although these generalizations are admittedly broad and preliminary, they suggest that a developmental explanation of the constrained plasticity of language acquisition may be possible.

Footnotes
1. In order to make the analogy to Simon locks clear and to keep the relevant probability calculations simple, the Developmental Lock is presented with the assumption that there is a solution for every series of numbers on the wheels to the left of the particular wheel under consideration. A more realistic assumption would be that many series have no solution (indicating a developmental failure).


2. This is not to say that there have to be well-defined stages. The same point could be made with a continuous model of developmental change. It is just simpler to talk about a discrete combination lock.


3. This does not exclude the possibility that functional elements play a role in early child language. As I discuss in the section entitled “The evidence”, research suggests that children may use the presence of some functional elements to help them segment utterances, categorize nouns and verbs, and recognize sentence skeletons.


4. Not all theories of grammar posit transformations. Some examples of non-transformational grammatical theories are Autolexical Syntax (Sadock 1991), Head-Driven Phrase Structure Grammar (Pollard and Sag 1994), and Lexical-Functional Grammar (Bresnan 2001). Even within these theories, though, there is a clear sense in which the long-distance dependency between a “moved” wh-word and its canonical thematic position (variously described as a mapping, sharing, or linking) is the result of a complex grammatical process.

Acknowledgments

I would like to thank two anonymous reviewers, Bill Bechtel, Cara Cashon, Fiona Cowie, Jesse Prinz, Kim Sterelny, and William Wimsatt for comments on the manuscript during various stages of its development.

Copyright information

© Springer Science+Business Media B.V. 2012