Some syntactic positions can be targeted by some movement types, but not by others. The classical example of this phenomenon is hyperraising, whereby \(\overline{\textrm{A}}\)-movement can leave a finite clause (1), but A-movement cannot (2). The traditional analysis of hyperraising involves a conspiracy of two constraints: (i) movement out of a finite clause must proceed through the intermediate [Spec, CP] position (Chomsky 1973, 1977, 1981, 1986) and (ii) a ban on “improper movement,” according to which \(\overline{\textrm{A}}\)-movement but not A-movement may proceed from [Spec, CP] (Chomsky 1973, 1981; May 1979).

figure a
figure b

A growing body of work has shown that these movement asymmetries are not limited to the binary distinction between A-movement and \(\overline{\textrm{A}}\)-movement (e.g. Williams 1974, 2003, 2013; Müller and Sternefeld 1993, 1996; Abels 2007, 2009, 2012a,b; Neeleman and van de Koot 2010; Müller 2014a,b; Keine 2016, 2019, 2020). Thus, there needs to be a more general theory of “improper movement” that restricts what movement types are available to what positions.

One particularly general and therefore interesting account of these asymmetries stems from Williams (1974, 2003, 2013) and van Riemsdijk and Williams (1981); I will refer to it as the Williams Cycle (WC).Footnote 1 The core analytical intuition behind the WC is that one and the same node is a barrier to some movement types, but not to others, and that this distinction correlates with the structural height of the landing site in the functional sequence. In Williams (2003), the WC is formulated as the Generalized Ban on Improper Movement, given in (3) (to be discussed in greater detail in Sect. 5.1).

figure c

The WC accounts for the ban on hyperraising (2) as a prohibition on moving from inside a CP to [Spec, TP]. According to the WC, CP is a barrier for movement to TP, but not for movement to CP, as schematized in (4), because C is higher than T in the functional sequence.

figure d
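The barrierhood asymmetry just described can be sketched computationally. The following is a minimal illustration, not Williams's formulation in (3): it assumes a toy functional sequence V < v < T < C and a simplified reading of the WC on which movement out of an embedded clause may land only in a position at least as high as the top of that clause. The names `FSEQ`, `height`, and `movement_allowed` are hypothetical.

```python
# Toy functional sequence, lowest to highest. The inventory and ordering
# are an illustrative assumption, not the paper's full functional sequence.
FSEQ = ["V", "v", "T", "C"]


def height(head):
    """Position of a head in the (assumed) functional sequence."""
    return FSEQ.index(head)


def movement_allowed(embedded_top, landing_head):
    """Simplified Williams Cycle check (an assumption, not (3) verbatim).

    embedded_top: the highest head of the embedded clause being exited.
    landing_head: the matrix head whose specifier is the landing site.
    Movement is allowed only if the landing head is at least as high in
    the functional sequence as the top of the embedded clause.
    """
    return height(landing_head) >= height(embedded_top)


# Hyperraising: out of a CP into matrix [Spec, TP].
hyperraising = movement_allowed("C", "T")
# Long wh-movement: out of a CP into matrix [Spec, CP].
long_wh = movement_allowed("C", "C")
# Ordinary raising: out of a TP-sized clause into matrix [Spec, TP].
raising = movement_allowed("T", "T")
```

On these assumptions, the check rules out hyperraising while permitting long wh-movement and raising out of a TP-sized clause, which is the asymmetry schematized in (4).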

This account extends beyond hyperraising to the other kinds of movement asymmetries that have been documented in the literature. In addition to the work by Williams (1974, 2003, 2013) and van Riemsdijk and Williams (1981), various versions of the WC have been developed by Abels (2007, 2009), Müller (2014a,b), and Keine (2016, 2019, 2020), amongst others.

While the WC has traditionally been proposed on the basis of movement, Keine (2016, 2019, 2020) argues that analogous restrictions also govern agreement. This generalizing of the WC raises the question of whether other syntactic dependencies are also subject to the WC. This paper investigates the locality of case assignment and argues that it too is constrained by the WC.Footnote 2 Therefore, in line with movement and agreement, there is improper case.

The paper is couched in terms of dependent-case theory (DCT) (Marantz 1991; Bittner and Hale 1996; McFadden 2004; Baker 2015). The reason for this choice is that the paper draws heavily on Finnish, which I will argue requires the notion of dependent case (Poole 2015; also Maling 1993; Anttila and Kim 2011, 2017). However, the main arguments in this paper equally apply to functional-head case theory (FHCT) (e.g. Chomsky 2000, 2001; Legate 2008); see fn. 13 and 32 in particular for discussion.

The motivation for improper case, i.e. that case assignment is subject to the WC, comes from two puzzles that the previous literature has not investigated in depth. The first puzzle involves the interaction between dependent case and movement, namely that some movement may feed dependent-case assignment, but other movement crucially must not do so. The second puzzle is crossclausal case assignment in Finnish, where a subject, but not an object, may license dependent case on another DP across a nonfinite clause boundary. I show that both puzzles crucially do not fall under the purview of standard syntactic locality constraints, e.g. phases. I argue instead that both problems receive a unified analysis if case assignment is subject to the WC, which I formulate for case as the Ban on Improper Case in (5).

figure e

According to the Ban on Improper Case, the heights of two DPs relative to one another in the functional sequence dictate whether they can establish a dependent-case relationship. For movement, (5) means that the height of a movement’s landing site determines the range of positions from which another DP can license a dependent-case relationship with that moved DP. For clausal embedding, (5) means that the size of an embedded clause dictates which DPs in higher clauses can establish a dependent-case relationship across that clause boundary.

The Ban on Improper Case brings the locality of case into line with movement and agreement, in that the WC applies to all three. The question that follows then is how to uniformly derive WC effects in all three of these empirical domains. Crucially, improper case does not follow from recent proposals that analyze WC effects as the result of a constraint on Agree or Merge (Abels 2007, 2009; Müller 2014a,b; Keine 2016, 2019, 2020), because dependent-case assignment does not seem to involve either one of these operations. I argue that a unified analysis of WC effects for case, movement, and agreement becomes available if we adopt Williams’s (2003) analysis of clausal embedding. Williams proposes that a ZP can only be embedded in a clause that has itself been built up to ZP, which he calls the Level Embedding Conjecture. The crucial consequence of this proposal is that a root XP containing an embedded YP, where Y is higher than X in the functional sequence (Y ≻ X), never exists in the course of a derivation (6).

figure f

Any movement, agreement, or case assignment between matrix XP and embedded YP that would violate the WC is in turn impossible because the relevant structure where X and [Spec, XP] would have access to YP, under the strict cycle, is simply not created by the grammar, as schematized in (6a). Matrix Y and [Spec, YP], on the other hand, are able to access embedded YP because matrix YP is the root node at the point when embedded YP is embedded, as schematized in (6b); this access generalizes to projections higher than Y in the functional sequence. Because this constraint follows from the way that syntactic structures are built, the key consequence of this account is that all syntactic dependencies are subject to the WC, regardless of whether they share the same operational core or not.
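The structure-building logic behind this claim can be sketched as follows. This is a rough illustration under stated assumptions, not Williams's (2003) own formalization: the functional sequence is the toy sequence V < v < T < C, the matrix clause is built bottom-up one projection at a time, and an embedded clause of size YP enters the structure exactly when the matrix clause has itself been built up to YP. The function name `matrix_heads_with_access` is hypothetical.

```python
# Toy functional sequence, lowest to highest (an illustrative assumption).
FSEQ = ["V", "v", "T", "C"]


def matrix_heads_with_access(embedded_size):
    """Matrix heads that ever share a structure with the embedded clause.

    embedded_size: the top head of the embedded clause (e.g. "C" for a CP).
    Under the assumed reading of Level Embedding, the embedded clause is
    absent while the matrix clause is smaller than the embedded clause,
    so lower matrix heads are never root alongside it.
    """
    accessible = []
    embedded_present = False
    for head in FSEQ:  # build the matrix clause bottom-up
        if head == embedded_size:
            embedded_present = True  # embedding happens at this level
        if embedded_present:
            accessible.append(head)  # this head is root with YP present
    return accessible


cp_access = matrix_heads_with_access("C")  # embedded CP
tp_access = matrix_heads_with_access("T")  # embedded TP
```

In this sketch, an embedded CP is accessible only to matrix C, whereas an embedded TP is accessible to matrix T and C; no separate constraint on movement, agreement, or case needs to be stated.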

The argumentation proceeds as follows: Sect. 2 briefly overviews the paper’s assumptions about DCT. In Sect. 3 and 4, I present two locality puzzles for dependent-case assignment: the interaction of case and movement, and Finnish crossclausal case assignment. To account for these two seemingly disparate locality problems, in Sect. 5, I propose that dependent-case assignment is subject to the Ban on Improper Case. In Sect. 6, I then argue that Williams Cycle effects for case, movement, and agreement can be uniformly analyzed in terms of clausal embedding. Section 7 concludes by discussing several purported exceptions to the Williams Cycle and further ramifications of the paper’s proposals.

Background on dependent case

In DCT, the calculus of case follows the algorithm in (7) (Marantz 1991; Bittner and Hale 1996; McFadden 2004; Baker 2015; and its predecessor Yip et al. 1987).

figure g

“Ergative” and “accusative” (in their canonical textbook definitions) are collapsed into the unified notion of dependent case. Whenever two DPs presently unvalued for case stand in a c-command relationship in the same local domain, one of the DPs is assigned dependent case, though which one depends on the language’s parameterization. When the c-commanding DP is assigned dependent case, this corresponds to what would traditionally be called “ergative.” When the c-commanded DP is assigned dependent case, this corresponds to what would traditionally be called “accusative.” I will refer to this process as establishing a dependent-case relationship, and to the higher DP in the pair, i.e. the one that initiates the relationship, as the licensor.

For the sake of concreteness, I adopt the syntactic implementation of DCT from Preminger (2011, 2014) throughout the paper: (i) DPs enter the derivation with an unvalued case feature, [case: □], which can be valued as either dep (for dependent case) or a particular lexical case.Footnote 3 (ii) Lexical cases are assigned locally by (typically, lexical) heads, e.g. P0 and V0, to their sibling upon first-merge (8).Footnote 4

figure h

(iii) Dependent case is assigned whenever two DPs with unvalued case features ([case: □]) stand in a c-command relationship (9).Footnote 5 The language’s parameterization determines whether it is the higher or lower DP in the pair that gets assigned dependent case; throughout the paper, I indicate the assignee with an underline. Concerning the timing of dependent case, in Sect. 3.2, I will argue that dependent-case relationships are established as early as possible, which I take to be upon merging the licensor into the c-commanding position. The morphological exponence of dependent case, e.g. as “accusative” and “ergative,” is determined at PF.

figure i

(iv) If [case: □] is still unvalued when Spellout occurs, it is realized as unmarked case at PF (10). Thus, unmarked case (≈ “nominative” and “absolutive”) is the absence of any otherwise assigned case (see Kornfilt and Preminger 2015).

figure j

These detailed mechanics will be abstracted over when not relevant to the discussion at hand. One advantage of this syntactic case calculus is that the structure consisting of a head and the DP that it c-selects is necessarily built before any larger structure containing that DP and another DP in a c-command relationship. Therefore, the precedence relations in (7) fall out intrinsically based on how structure is built, and do not need to be stipulated, as, e.g., the original implementation in Marantz (1991) does.
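The case calculus in (i)–(iv) can be illustrated with a short sketch. It makes several simplifying assumptions that are not part of the paper: the DPs are given as a flat list in bottom-up merge order, so that each later DP c-commands all earlier ones within a single local domain; lexical case is represented by a preset value rather than by an assigning head; and each newly merged DP pairs with the closest unvalued DP below it. The names `DP` and `assign_case` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DP:
    name: str
    # None models [case: □]; a preset string models lexical case that was
    # already assigned by a head upon first-merge (step ii).
    case: Optional[str] = None


def assign_case(dps_bottom_up, alignment="accusative"):
    """Sketch of the dependent-case calculus over one local domain.

    dps_bottom_up: list of DPs in merge order (lowest first); a later DP
    is assumed to c-command every earlier one.
    alignment: "accusative" values the lower DP of a pair as dep,
    "ergative" values the higher DP.
    """
    merged = []
    for dp in dps_bottom_up:
        # Step (iii): upon merging, look for the closest lower DP that is
        # still unvalued for case.
        partner = next((d for d in reversed(merged) if d.case is None), None)
        if dp.case is None and partner is not None:
            target = partner if alignment == "accusative" else dp
            target.case = "dep"
        merged.append(dp)
    # Step (iv): anything still unvalued at Spellout is unmarked case.
    return {dp.name: (dp.case or "unmarked") for dp in dps_bottom_up}


accusative_pattern = assign_case([DP("object"), DP("subject")])
ergative_pattern = assign_case([DP("object"), DP("subject")], "ergative")
quirky_subject = assign_case([DP("object"), DP("subject", "lexical")])
```

Because dependent case is evaluated at each merge step, lexical case necessarily precedes dependent case in this sketch, and unmarked case is simply what remains at the end, mirroring the point that the precedence relations need not be stipulated.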

Before proceeding, there are two important points about DCT worth emphasizing: First, dependent-case assignment is in addition to case assignment by designated heads (which in the terms here, falls under lexical case).Footnote 6 As such, the expressive power of DCT is a proper superset of the expressive power of FHCT (Preminger 2017, to appear). The crucial question then is whether the additional expressive power of DCT is warranted, i.e. whether there are case patterns that call for the notion of dependent case. Several such patterns have been identified in the literature (see e.g. Marantz 1991; Baker and Vinokurova 2010; Baker 2014, 2015; Levin and Preminger 2015; Baker and Bobaljik 2017; Jenks and Sande 2017; Yuan 2018, 2020), and in this paper, I argue that the distribution of Finnish accusative case is another such pattern. Second, the descriptive label given to a case should not be taken to entail a particular case-assignment mechanism. That is, just because a given case in a given language is pretheoretically called “accusative” or “ergative” does not mean that it is necessarily dependent case—and likewise for “nominative” or “absolutive” and unmarked case (hence the scare quotes).

Movement and case

This section shows that some movement can lead to dependent-case assignment (A-movement), but other movement must not do so (\(\overline{\textrm{A}}\)-movement). This dichotomy will be shown not to follow from standard conceptions of locality, e.g. phases, and thus it presents a challenge for DCT.

Some movement can feed dependent case

The DCT literature has identified a number of case patterns as involving movement feeding dependent-case assignment. Let us consider three representative examples.

The first example involves object shift: a dependent-case relationship between the subject and the object is allowed only if the object has raised out of VP (Bittner and Hale 1996; Baker 2015:125–130; Woolford 2015). This pattern is illustrated in (11) with Niuean, where the case markings correlate with both the specificity of the object and the clausal word order (Massam 2000, 2001). If the object is nonspecific, the subject is nominative, and the clause has VOS word order (11a). If the object is specific, the subject is ergative and the clause has VSO order (11b).Footnote 7

figure k

Massam (2000, 2001) analyzes this alternation in terms of object shift and VP fronting. The VOS order is produced by the object remaining in its base position and thus fronting along with the VP (12a). By contrast, the VSO order is produced by the object raising out of the VP prior to VP fronting (12b). The additional correlation with ergative case is then captured by assuming, in pretheoretical terms, that ergative case requires that the object leave VP.

figure l

In DCT terms, VP blocks dependent-case assignment in Niuean (see Sect. 7.1). Thus, only when the object raises out of VP can a dependent-case relationship between the subject and the object be established. Other languages exhibiting object shift feeding dependent case include Dyirbal, Eastern Ostyak, Ika, Inuit, Nez Perce, Sakha, and Tagalog (see Baker and Vinokurova 2010; Baker 2015:125–130; Woolford 2015; and references therein).

The second example is Shipibo applicatives of unaccusatives (Baker 2014). Baker shows that in Shipibo, an unaccusative subject is ordinarily nominative (13a), but adding an applicative argument causes the subject to become ergative (13b). He argues that ergative in Shipibo is dependent case and that in (13b), the subject is ergative because the applicative provides the additional DP needed for dependent case.

figure m

However, if the unaccusative subject (i.e. the theme) is base-generated inside VP and the applicative is base-generated above VP—both standard assumptions—then it should be the applicative that gets dependent ergative. That is, upon merging into the structure, the applicative should establish a dependent-case relationship with the theme; in the pair, the applicative would be the higher DP and hence should be the one assigned dependent case (= ergative). Baker (2014) handles this problem by proposing that the applicative is encased inside a null PP, so that (i) it does not c-command the theme and (ii) it is ineligible for movement to subject position, [Spec, TP]. Due to the latter, the theme is able to raise over the applicative to [Spec, TP]. From [Spec, TP], the theme c-commands the applicative and establishes a dependent-case relationship with it; in this version of the pair though, the theme is the higher DP and thus gets assigned dependent case. This analysis is schematized below in (14).

figure n

Assuming that Baker’s (2014) analysis is on the right track, Shipibo applicatives of unaccusatives are an instance of movement feeding dependent-case assignment.Footnote 8

The third example is the English raising predicate strike as (Marantz 1991), which is traditionally analyzed as the matrix subject starting out as the embedded subject and raising into matrix-subject position, as schematized in (15). Taking accusative on objects in English to be dependent case, the licensor of dependent case on the internal argument of strike must be the subject after it has raised to matrix [Spec, TP], because there is no other possible licensor.

figure o

Note that something needs to be said about why the internal argument of strike does not license dependent case on the subject before it has raised; I will return to this point in Sect. 5.4.

In sum, taken together, these three examples crucially show that there are instances of movement that feed dependent-case assignment.

Some movement must not feed dependent case

While the previous section showed that some movement may feed dependent-case assignment, there is also other movement that must not feed dependent-case assignment. I will illustrate this problem using wh-movement, though it holds generally for nonlocal movement, and I will use English for ease of illustration.Footnote 9 Let us take the problem in two parts.

The first part of the problem is that dependent case cannot be assigned based on the surface structure alone. For example, the structure in (16a) with wh-movement must be mapped to the string in (16b) and cannot be mapped to (16c). Descriptively, dependent case needs to be calculated before wh-movement has occurred.

figure p

One potential solution that can be immediately set aside is to assume that case is assigned at PF and that wh-movement ‘reconstructs’ for case at PF. This solution would face the problem that unlike canonical reconstruction at LF, this hypothetical PF reconstruction would have to be for case alone and not uniformly for all PF processes, in particular not for linearization. As such, it would be nothing more than a restatement of the empirical generalization that wh-movement does not affect case. Rather, I propose that dependent-case assignment is interspersed with structure building, so that dependent case is assigned as early as possible in the derivation (see also Baker and Vinokurova 2010:604; Preminger 2011, 2014). I call this principle Earliness (17) (in the spirit of Pesetsky 1989; Pesetsky and Torrego 2001).

figure q

Earliness crucially forces dependent-case assignment to happen prior to wh-movement. The derivation of (16) under this analysis is illustrated in (18): First, a dependent-case relationship is established between she and who immediately upon first-merge of she into the structure (18a). Second, wh-movement happens later in the derivation, after dependent case has been assigned (18b). Third, at PF, [case: dep] is realized as “accusative” and [case: □] as “nominative” (18c).

figure r
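The three derivational steps just described can be sketched in a few lines. This is an illustration under simplifying assumptions, not an implementation of (17): only two DPs are tracked, an accusative alignment is assumed, and c-command is passed in by hand as a (higher, lower) pair. The names `try_dependent_case` and `spell_out` are hypothetical. The sketch also contrasts the Earliness derivation with a hypothetical derivation that computes case only on the surface structure.

```python
def try_dependent_case(higher, lower, cases):
    """Accusative alignment: value the lower DP as dep.

    A dependent-case relationship is established only if both DPs are
    presently unvalued (None models [case: □]). Returns True on success.
    """
    if cases[higher] is None and cases[lower] is None:
        cases[lower] = "dep"
        return True
    return False


def spell_out(cases):
    """PF realization: dep is 'accusative', still-unvalued is 'nominative'."""
    return {dp: ("accusative" if c == "dep" else "nominative")
            for dp, c in cases.items()}


# Derivation with Earliness: case is computed when 'she' is first merged.
early = {"she": None, "who": None}
at_first_merge = try_dependent_case("she", "who", early)   # she > who
after_wh_movement = try_dependent_case("who", "she", early)  # who > she
early_pf = spell_out(early)

# Hypothetical alternative: case computed on the surface structure only,
# where the moved 'who' c-commands 'she'.
late = {"she": None, "who": None}
try_dependent_case("who", "she", late)
late_pf = spell_out(late)
```

With Earliness, the relationship is established at first-merge of she, and the later attempt from the moved position fails because who is already valued. The surface-only derivation instead makes she accusative and who nominative, which is the unattested mapping.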

The formulation of Earliness in (17) also reiterates the restriction that two DPs can enter into a dependent-case relationship only if they both presently have unvalued case features ([case: □]); this is part of the case calculus laid out in Sect. 2. This restriction prevents DPs with dependent case from reparticipating in dependent-case assignment, either after having themselves moved or with DPs that have moved above them. For example, in (19), this restriction prevents the moved wh-element that has itself been assigned dependent case from turning around and licensing dependent case on the subject from the higher position to which it has wh-moved.

figure s

While Earliness is a necessary component to solving the problem imposed by wh-movement, it is not sufficient. This brings us to the second part of the problem: dependent-case assignment in the context of successive-cyclic movement. When a wh-element that is itself unvalued for case—and should surface with unmarked case at PF—moves successive cyclically, it passes through intermediate [Spec, CP] positions from where it should in principle affect the calculus of dependent case, but does not. (For the moment, let us set aside the possibility of successive-cyclic movement through [Spec, vP] until Sect. 7.2.) Consider (20), where who undergoes successive cyclic wh-movement to matrix [Spec, CP] and must surface with unmarked case.

figure t

Sentences like (20) present two complications for DCT. For convenience, I will discuss these complications in terms of an accusative alignment, where the lower DP in a dependent-case pair is assigned dependent case, but the problem extends to an ergative alignment too, where the higher DP in the pair is assigned dependent case. The first problem is that a wh-element does not have its own case altered from its intermediate landing sites.Footnote 10,Footnote 11 From these intermediate positions, there may very well be another DP unvalued for case that c-commands the wh-element. All else equal, the wh-element should be assigned dependent case in such configurations—but crucially, it is not. Descriptively, the moving wh-element cannot have the case overwritten that would have been assigned to it if it had not moved. In DCT terms, the wh-element cannot be the lower DP in a dependent-case pair when it is in an intermediate landing site, as schematized in (21). As such, I will refer to this problem as the Lower-DP Problem.

figure u

The second problem is that a wh-element does not alter the case of other DPs from its intermediate or final landing sites. From these positions, the wh-element may very well c-command another DP unvalued for case, and thus it should, all else equal, be able to license dependent case on it—but it cannot do so. In other words, the moving wh-element cannot be the higher DP in a dependent-case pair (modulo from its base-generated position with, e.g., an object). As such, I will refer to this problem as the Higher-DP Problem (22).

figure v

The standard dependent-case calculus does not offer an explanation for why successive-cyclic movement does not affect case in these two ways (for a discussion of Baker 2015, see Sect. 5.5). Crucially, in light of the data in Sect. 3.1, it would not suffice to simply stipulate that movement does not affect case assignment. Thus, a more nuanced account is called for.

Neither of the problems that successive-cyclic movement raises for DCT falls under the purview of phases (or their predecessor, subjacency). First, because phase edges remain accessible at the next-higher phase, per the Phase Impenetrability Condition (Chomsky 2000, 2001), the locality enforced by phases permits precisely the configurations that give rise to the Lower-DP Problem, as schematized in (23). In other words, a DP unvalued for case may c-command the edge of the lower phase, thereby satisfying the criteria for establishing a dependent-case relationship with a DP in that edge position, if it too is unvalued for case.Footnote 12

figure w

Second, because movement to [Spec, CP] takes place before phasal Spellout, such movement should, all else equal, be able to affect the case of elements in the CP-phase domain.Footnote 13 Otherwise, establishing any relation between the phase edge and the phase complement would be impossible, and such relations are minimally necessary for movement dependencies. Thus, the Higher-DP Problem is also not solved under the locality afforded by phases. Note that I am not claiming that these considerations provide evidence against phases; rather, the point is that they do not follow from phase theory itself.

In sum, successive-cyclic movement leads to the generalization that some movement crucially must not feed dependent-case assignment.

Section summary

Some, but not all, movement affects dependent-case assignment. Assigning dependent case as early as possible, Earliness (17), already filters out many of the undesirable interactions between dependent case and movement. For example, in a simple transitive clause, moving an object over a subject will have no effect on dependent case because a dependent-case relationship will have already been established between the two DPs upon first-merge of the subject. This was shown for wh-movement in Sect. 3.2, but it holds for A-movement as well, in particular for A-scrambling. The question then is about all the interactions that Earliness does not capture. These interactions include some instances of A-movement feeding dependent-case assignment, but no instances of \(\overline{\textrm{A}}\)-movement doing so. They also include some instances of \(\overline{\textrm{A}}\)-movement that cannot affect dependent case, but which are not ruled out by Earliness, e.g. with successive cyclicity. This state of affairs is summarized with the generalization in (24).Footnote 14

figure x

To the best of my knowledge, (24) appears to be without a clear exception, though there are three phenomena that warrant discussion. Two of these phenomena, I discuss later: Koryak in Sect. 5.4 and Sakha in Sect. 7.3. The third phenomenon is Hungarian long focus movement (which includes wh-movement): when an embedded nominative element is long-focused, it can surface as accusative if the matrix predicate is transitive, i.e. could itself assign accusative, as shown in (25) (e.g. Massam 1985; É. Kiss 1987; Gervain 2009; den Dikken 2009, to appear; Jánosi 2013; Jánosi et al. 2014). This pattern is (mostly) general: a long-focused element can bear the case associated with the embedded clause (25a) or the matrix clause (25b) (Jánosi 2013).Footnote 15

figure y

Massam (1985) analyzes this pattern as case being assigned to the long-focused element in the intermediate [Spec, CP] position that it \(\overline{\textrm{A}}\)-moves through; if true, it would be an exception to (24) (assuming that accusative in Hungarian is dependent case).Footnote 16,Footnote 17 However, in the literature on Hungarian, it has been argued for reasons entirely independent of considerations like (24) that this alternation actually stems from two distinct structures (den Dikken 2009, to appear; Jánosi 2013; Jánosi et al. 2014). When the long-focused element bears embedded case, as in (25a), there is genuine long movement: the element is base-generated in the embedded clause, is assigned case locally, and \(\overline{\textrm{A}}\)-moves into the matrix-focus position. By contrast, when the long-focused element bears matrix case, as in (25b), it is base-generated in the matrix clause, is assigned case locally, locally \(\overline{\textrm{A}}\)-moves to the matrix focus position, and is indirectly linked to the embedded gap via resumption. Discussing the arguments in favor of this analysis would take us too far afield; the reader is referred to Jánosi (2013). Crucially for the purposes of this paper though, under this independently motivated analysis of Hungarian long focus, it is not an exception to (24), as there is no \(\overline{\textrm{A}}\)-movement feeding dependent-case assignment.

Explaining (24) requires a way of teasing apart movement types. In minimalist syntax, because there is only a single primitive movement operation (i.e. Merge), there is no principled way to distinguish A-movement and \(\overline{\textrm{A}}\)-movement. A goal of this paper is thus to derive the locality constraint in (24) without reference to separate primitives for A-movement and \(\overline{\textrm{A}}\)-movement. In the next section, I show a pattern from Finnish crossclausal case assignment that also does not follow from any binary notion of locality, e.g. phases. Despite not involving movement, this pattern will be shown to parallel movement configurations that are accounted for under the Williams Cycle. I will argue that adopting the Williams Cycle as a constraint on dependent-case assignment, in the form of the Ban on Improper Case, provides a unified account of both crossclausal case assignment in Finnish and the Movement-Case Generalization in (24).

Finnish crossclausal case assignment

This section shows that in Finnish, dependent case may be licensed across a nonfinite clause boundary, but only by a subject and not by an object (or an adjunct). As with movement, this dichotomy will be shown not to fall under the purview of standard conceptions of locality, e.g. phases. Section 4.1 begins with some background on Finnish structural case and arguments that accusative in Finnish is dependent case. Section 4.2 then discusses the crucial case patterns in embedded nonfinite clauses.

Background on Finnish case

Finnish has three structural cases: nominative, accusative, and partitive.Footnote 18 For the sake of simplicity, I set aside partitive case and focus on the distribution of nominative and accusative (for a more comprehensive analysis, see Poole 2015). In a simple transitive clause, the external argument is nominative and the internal argument is accusative (26). To simplify the exposition, let us refer to the external argument as the “subject” and the internal argument as the “object”. Whenever the subject is absent, e.g. in a passive (27a) or in an imperative (27b), or the subject bears lexical case (i.e. a quirky subject) (27c), the object is nominative.Footnote 19,Footnote 20

figure aa
figure ab

The case patterns exemplified in (26) and (27) receive a straightforward explanation under DCT. In (26), the subject licenses dependent case (= accusative) on the object; then, because there is no other DP that c-commands the subject, the subject remains unvalued for case throughout the derivation and is realized as having unmarked case (= nominative) at PF. In (27a) and (27b), there is no other DP that c-commands the object; as such, no dependent-case relationship is established, and the object is realized as having unmarked case. In (27c), although there is another DP that c-commands the object, it bears lexical genitive case. Recall from Sect. 2 that only DPs unvalued for case factor into the calculus of dependent case. Lexically case-marked DPs are thus invisible to dependent-case assignment because their case will already have been assigned locally. Accordingly, because no other DP with unvalued case c-commands the object in (27c), the object remains unvalued for case and is realized as having unmarked case at PF. This analysis is summarized in (28).Footnote 21

figure ac

The data in (26) and (27) could alternatively be analyzed in FHCT: the variants of v0 in (27) would lack the ability to assign accusative case, so that T0 could assign nominative case to the object (e.g. Vainikka and Brattico 2014, though the identity of the heads differs on their account). Such an analysis would amount to a standard implementation of Burzio’s Generalization. Evidence that such an FHCT analysis is insufficient comes from adjuncts. In Finnish, there is a special class of adjuncts that are structurally case-marked, akin to subjects and objects (Tuomikoski 1978; Maling 1993). These adjuncts include durational adjuncts (for an hour), spatial-measure adjuncts (a kilometer), and multiplicative adjuncts (two times). In DCT terminology, these adjuncts factor into the calculus of dependent case (i.e. they can license and be assigned dependent case), and they are realized with unmarked case if their case remains unvalued in the derivation. To illustrate, in an intransitive clause with one of these adjuncts, the subject is nominative and the adjunct is accusative (29a). When the intransitive predicate is passivized (as some kind of impersonal passive), the adjunct becomes nominative (29b), the same case alternation that is observed for objects in passives (27a).Footnote 22

figure ad

With a transitive predicate, where the object does not bear lexical case, structurally case-marked adjuncts are always accusative (30). (Note that the DP in a dependent-case relationship that is not assigned dependent case still has an unvalued case feature and thus is eligible to enter into another dependent-case relationship.)

figure ae

Following Larson (1988) and Pesetsky (1995), among others, I will assume that the vP is right-branching, where adjuncts are c-commanded by the object (see also Csirmaz 2005:90–98, who also argues for such an analysis for Finnish). This is schematized in (31), where the possible dependent-case relationships are indicated. Accordingly, an object that is not assigned lexical case by the verb will invariably license dependent case on an adjunct, thereby accounting for the pattern in (30).Footnote 23

figure af

Crucially, clauses like (29b), where the adjunct is nominative, may contain multiple structurally case-marked adjuncts. In such configurations, the DCT analysis and the FHCT analysis make different predictions. The DCT analysis predicts that the highest adjunct is nominative and all the other adjuncts are accusative. The FHCT analysis, on the other hand, predicts that all of the adjuncts are nominative, because the functional head responsible for assigning accusative case is absent in clauses where the subject is absent; this is what accounted for the data in (27) under an FHCT analysis. The data bear out the prediction of the DCT analysis. This is shown in (32) with two structurally case-marked adjuncts and the verb luottaa ‘trust,’ which assigns lexical illative case to its object, thereby removing it from the calculus of dependent case. When the subject is present, both of the adjuncts are accusative (32a). When the subject is absent, here in a passive, the higher adjunct is nominative and the lower adjunct is accusative (32b).Footnote 24 Finally, when the first adjunct is dropped, the only remaining adjunct becomes nominative (32c).

figure ag
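The competing predictions for (32) can be made concrete with a minimal sketch. It is only an illustration under simplifying assumptions: the DPs are listed in c-command order (highest first), a DP without lexical case counts as a potential licensor, and the names `DP`, `dct_cases`, and `fhct_cases` are hypothetical labels introduced here, not part of either theory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DP:
    name: str
    lexical_case: Optional[str] = None  # e.g. "ILL", assigned by luottaa 'trust'


def dct_cases(dps: List[DP]) -> List[str]:
    """Dependent-case sketch; `dps` is ordered by c-command, highest first.

    A DP without lexical case is accusative if some higher DP without
    lexical case c-commands it; otherwise it surfaces as nominative.
    """
    cases = []
    seen_structural = False  # has a higher DP without lexical case been seen?
    for dp in dps:
        if dp.lexical_case is not None:
            cases.append(dp.lexical_case)  # removed from the calculus
            continue
        cases.append("ACC" if seen_structural else "NOM")
        seen_structural = True
    return cases


def fhct_cases(dps: List[DP], has_subject: bool) -> List[str]:
    """FHCT sketch (simplified): accusative requires a functional head
    that is present only when the clause has a subject."""
    cases = []
    for dp in dps:
        if dp.lexical_case is not None:
            cases.append(dp.lexical_case)
        elif dp.name == "subject":
            cases.append("NOM")
        else:
            cases.append("ACC" if has_subject else "NOM")
    return cases


subject = DP("subject")
obj = DP("object", lexical_case="ILL")
adj1, adj2 = DP("adjunct1"), DP("adjunct2")

active = [subject, obj, adj1, adj2]   # cf. (32a)
passive = [obj, adj1, adj2]           # cf. (32b)

print(dct_cases(active))                        # ['NOM', 'ILL', 'ACC', 'ACC']
print(dct_cases(passive))                       # ['ILL', 'NOM', 'ACC']
print(fhct_cases(passive, has_subject=False))   # ['ILL', 'NOM', 'NOM']
```

Only the dependent-case sketch returns nominative on the higher adjunct and accusative on the lower one in the subjectless clause, which is the pattern reported for (32b).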

The pattern in (32) follows in the DCT analysis without further ado. For example, in (32b), the first adjunct licenses dependent case on the second adjunct; then, because no relevant DP c-commands the first adjunct, it remains unvalued for case in the derivation and is realized with unmarked case at PF. The FHCT analysis, on the other hand, would need to make additional stipulations to account for (32), in particular to deal with (32b), in which accusative would have to be assigned in a passive, where the functional head responsible for accusative would not occur (similarly in (30b)). As far as I am aware, there is no FHCT analysis of Finnish case that extends to the case pattern in (32).Footnote 25 I take the fact that this adjunct pattern is entirely regular and productive in Finnish to indicate that Finnish requires the notion of dependent case in order to capture the distribution of accusative (Poole 2015; see also Maling 1993, Anttila and Kim 2011, 2017). I will thus adopt such an account in what follows. Against this backdrop, let us now consider case assignment in nonfinite clauses.

Case in nonfinite clauses

Finnish has a number of nonfinite constructions (Vainikka 1989, 1995; Toivonen 1995; Koskinen 1998; also Hakulinen et al. 2004:Sect. 490). The nonfinite construction of interest in this paper is the ma-infinitive (traditionally called the “third” infinitive). Ma-infinitives are interesting because, when they function as clausal complements, case assignment within the nonfinite clause interacts with the makeup of the clause of the embedding verb (e.g. Vainikka 1989). That is, the matrix (= embedding) and embedded clauses constitute a single coextensive domain for the purposes of dependent-case assignment.

The ma-infinitive requires the verb to bear an inner locative case marker (inessive, elative, or illative) after the infinitival morpheme -mA (33).Footnote 26 The case marker matches what a DP would bear in that same position, with the same “directional” meaning (33). In this sense, the verb in a ma-infinitive is nominal-like, but unlike a genuine nominal, it cannot be modified by nominal modifiers, only verbal modifiers (34).

figure ah
figure ai

When the matrix clause has an ordinary nominative subject, the embedded object is marked with accusative (35a). Then, when the matrix subject is absent or bears lexical case, the embedded object becomes nominative (35b). In this section, of the constructions that remove the subject from the dependent-case calculus, I only show imperatives, but all of the data can be replicated for passives and quirky-subject constructions.

figure aj

This mirrors the pattern from monoclausal sentences in Sect. 4.1. In (35a), the embedded object is c-commanded by another DP unvalued for case, i.e. the matrix subject, and thus is assigned dependent case (= accusative). In (35b), there is no other DP unvalued for case that c-commands the embedded object and thus it surfaces with unmarked case (= nominative) at PF.

Accordingly, the pattern in (35) can be accounted for under DCT by considering (i) the CP to be the relevant domain for dependent case and (ii) ma-infinitives to be projections smaller than CP, so that the domain over which dependent case is calculated includes both the matrix and embedded clauses. Following Koskinen (1998), I assume that ma-infinitives are TPs.Footnote 27

Example (35b) also reveals that PRO is either absent from these constructions or inert for the purposes of dependent-case assignment. Otherwise, there would be no principled way to explain why the embedded object’s case is contingent on the presence of an argument in the matrix clause. Another DP like PRO inside the embedded clause that c-commands the object and is unvalued for case (for some portion of the derivation) would invariably license dependent case on the object, thereby negating any effect that the matrix clause could ever have. While either analysis (i.e. no PRO or inert PRO) would in principle account for the case pattern in (35b), I will adopt the first analysis that ma-infinitives lack a PRO.Footnote 28 This choice is largely for the sake of simplicity, but there are two arguments in its favor. First, if PRO can only occur in CPs, as Landau (2000) argues, this absence would follow from ma-infinitives being smaller than CPs. Second, this analysis also allows for a uniform treatment of PRO crosslinguistically as a dependent-case licensor, rather than parametrizing its ability to license dependent case on a language-by-language basis. On this analysis, then, PRO has no effect on dependent case in ma-infinitives because it is not there.

The crucial pattern emerges when the embedding predicate has its own object. Some of these predicates include pakottaa ‘force,’ pyytää ‘ask,’ and kieltää ‘deny’ (see Vainikka 1989:330). As shown in (36), when the matrix subject is present, the matrix subject is nominative, the matrix object is accusative, and the embedded object is accusative; this is the pattern expected, given what we have seen so far.

figure ak

Under DCT, this pattern could in principle be modelled in one of two ways: (i) a covariance derivation, where the matrix subject licenses dependent case on both objects (37), or (ii) a daisy-chain derivation (38), where the matrix object licenses dependent case on the embedded object and then the matrix subject licenses dependent case on the matrix object.

figure al
figure am

However, in the absence of a matrix subject, both the matrix object and the embedded object surface with nominative case, as shown in (39). This rules out the daisy-chain derivation for ma-infinitives in (38). Rather, the case of the matrix and embedded objects covaries with the presence of the matrix subject, as predicted by the analysis in (37).

figure an
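The way (39) decides between the two derivations can be spelled out in a small sketch. The function names `covariance` and `daisy_chain` and the dictionary encoding are assumptions made here for illustration; the observed patterns are those reported in the text for (36) and (39).

```python
def covariance(has_subject: bool) -> dict:
    """(37): only the matrix subject licenses dependent case, on both objects."""
    case = "ACC" if has_subject else "NOM"
    return {"matrix_object": case, "embedded_object": case}


def daisy_chain(has_subject: bool) -> dict:
    """(38): the matrix object licenses dependent case on the embedded object;
    the matrix subject, if present, then licenses it on the matrix object."""
    return {
        "matrix_object": "ACC" if has_subject else "NOM",
        "embedded_object": "ACC",  # licensed by the matrix object regardless
    }


# Case patterns reported in the text.
observed_36 = {"matrix_object": "ACC", "embedded_object": "ACC"}  # subject present
observed_39 = {"matrix_object": "NOM", "embedded_object": "NOM"}  # subject absent

for name, derive in [("covariance", covariance), ("daisy chain", daisy_chain)]:
    fits = derive(True) == observed_36 and derive(False) == observed_39
    print(name, "fits both (36) and (39):", fits)
```

Both derivations match (36), so only the subjectless configuration in (39) tells them apart.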

Binding reveals that the matrix object nevertheless c-commands the embedded object. Finnish third-person possessive suffixes are subject to Condition A, as illustrated in (40a). Crucially, a third-person possessive suffix on the embedded object can be bound by the matrix object (in addition to the matrix subject), as shown in (40b). This shows that the matrix object does indeed c-command the embedded object. All else equal, the matrix object should then license dependent case on the embedded object. The fact that it does not thus needs to be explained.Footnote 29

figure ao

What (39) and (40) reveal is that a matrix subject, but not a matrix object, can license dependent case across an embedded TP boundary into a ma-infinitive, as schematized in (41).

figure ap

Structurally case-marked adjuncts in the matrix clause are also unable to license dependent case across an embedded TP, and thus they pattern with matrix objects. This is shown in (42a), where the multiplicative adjunct has matrix scope and still both objects must be nominative. (42a) additionally shows that the matrix object has the ability to license dependent case, as it does so on the adjunct, making its inability to do so on the embedded object all the more striking. When the adjunct has embedded scope, the embedded object licenses dependent case on the adjunct in an ordinary local configuration (42b).Footnote 30

figure aq

The overarching pattern to emerge from Finnish ma-infinitives is summarized in (43).Footnote 31,Footnote 32

figure ar

One might wonder why the Finnish pattern in (43) is not found in languages like English. There are two reasons. First, in languages like English, control infinitives are CPs (Landau 2000), and CPs are domains for dependent-case assignment. Second, as control infinitives in languages like English contain PRO, PRO will always locally license dependent case on the object. Thus, in languages like English, case assignment in control infinitives is always determined locally; it is never contingent on elements in the matrix clause.Footnote 33 Finnish ma-infinitives (and ta-infinitives), on the other hand, are smaller than CP and contain no PRO (or, alternatively, they contain a PRO inert for dependent case), which causes the embedded DPs to interact for dependent-case assignment with matrix DPs. (35) is the crucial datapoint showing this property. The prediction then is that in languages with a pattern like (35), the same generalization from Finnish in (43) should emerge. I leave exploring this prediction to future research.

Crucially, the Finnish Case Generalization in (43) does not involve movement, which will prove important in the next two sections. Like the Movement-Case Generalization from Sect. 3, it also does not fall under the purview of standard notions of locality, e.g. phases, where a domain is either opaque to all operations or transparent to all operations. Under these standard, binary notions of locality, it is unexpected for a domain (here, a TP) to be penetrable by a DP in one position (matrix-subject position), but not another position (matrix-object position, which is arguably more local than the matrix subject). As such, the Finnish Case Generalization must be the result of some other kind of locality, namely one that is nonbinary. In the next section, I will argue that this nonbinary notion of locality is the Williams Cycle.

Improper case

In this section, I propose that dependent-case assignment is constrained by the Ban on Improper Case in (44). This constraint rules out dependent-case assignment configurations like (45).

figure as
figure at

The Ban on Improper Case is a constraint in the spirit of the Williams Cycle (WC) (Williams 1974, 2003, 2013; van Riemsdijk and Williams 1981), which in its original form is only a constraint on movement dependencies. In Sect. 6, I will propose that the WC be generalized to encompass case, movement, and agreement and then take up how to derive this generalized WC.

I begin in Sect. 5.1 by introducing the WC in its instantiation for movement, known as the Generalized Ban on Improper Movement. Sect. 5.2 proposes the Ban on Improper Case, an extension of the WC particularized to case. In Sects. 5.3 and 5.4, I then apply the proposal to the Finnish Case Generalization and the Movement-Case Generalization respectively. Sect. 5.5 briefly discusses the treatment of case and movement in Baker (2015).

The Williams Cycle

The Williams Cycle (WC) is a size-based locality constraint on (movement) dependencies spanning two clauses, going back to Williams (1974) and van Riemsdijk and Williams (1981). The basic idea behind the WC is that movement from a specific domain in an embedded clause may target only the same kind of domain or a higher domain in the matrix clause. In Williams (2003), the WC is formulated as the Generalized Ban on Improper Movement (GBOIM) in (46), where domains are defined in terms of the functional sequence (fseq).Footnote 34 I will notate X being higher in fseq than Y as \(\textrm{X} \succ \textrm{Y}\) (and X not being higher than Y as \(\textrm{X} \nsucc \textrm{Y}\)), and, for concreteness, I will assume the simple functional sequence in (48).

figure au
figure av
figure aw

As its name suggests, the GBOIM is intended to subsume the traditional ban on improper movement (Chomsky 1973, 1981; May 1979). Thus, to illustrate the GBOIM, let us consider how it handles the classical instance of improper movement, namely the ungrammaticality of hyperraising: A-movement out of a finite clause. While \(\overline{\textrm{A}}\)-movement may leave a finite clause (49a), A-movement may not (49b).Footnote 35 This contrast does not extend to nonfinite TP clauses, which allow both \(\overline{\textrm{A}}\)-movement (50a) and A-movement (50b) out of them. (For the sake of simplicity, I set aside nonfinite CP clauses, which pattern like finite clauses for hyperraising.)

figure ax
figure ay

According to the GBOIM, the relative heights of the launching and landing sites determine whether extraction is possible. Because finite clauses are CPs, movement out of a finite clause can land no lower than [Spec, CP] in the next higher clause, as schematized in (51).

figure az

As depicted in (51), CP is a barrier for movement to [Spec, TP] because \(\textrm{C} \succ \textrm{T}\) in fseq, but CP is not a barrier for movement to [Spec, CP] because \(\textrm{C} \nsucc \textrm{C}\). Thus, \(\overline{\textrm{A}}\)-movement, but not A-movement, out of a finite clause is grammatical. On the other hand, because nonfinite clauses are TPs, movement out of a nonfinite clause may land in either [Spec, TP] or [Spec, CP] because \(\textrm{T} \nsucc \textrm{T}\) and \(\textrm{T} \nsucc \textrm{C}\) respectively. Thus, both A-movement and \(\overline{\textrm{A}}\)-movement are possible out of a nonfinite clause, unlike finite clauses, as schematized in (52).

figure ba
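The reasoning behind (51) and (52) can be restated as a small check. This is a sketch under stated assumptions rather than the official formulation in (46): the list `FSEQ` stands in for the functional sequence in (48), which is assumed here to order C above T above v above V, and the function names are hypothetical.

```python
FSEQ = ["V", "v", "T", "C"]  # lowest to highest; an assumed stand-in for (48)


def higher(x: str, y: str) -> bool:
    """True if head x is strictly higher than head y in fseq."""
    return FSEQ.index(x) > FSEQ.index(y)


def movement_allowed(clause_top: str, landing: str) -> bool:
    """Movement out of a clause whose topmost projection is `clause_top`P may
    land in [Spec, `landing`P] only if `clause_top` is not higher than
    `landing` in fseq."""
    return not higher(clause_top, landing)


for clause_top, landing in [("C", "T"), ("C", "C"), ("T", "T"), ("T", "C")]:
    verdict = movement_allowed(clause_top, landing)
    print(f"out of {clause_top}P into [Spec, {landing}P]: {verdict}")
```

On these assumptions, the only excluded combination is movement out of a CP into [Spec, TP], i.e. hyperraising.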

Under the GBOIM, size matters. A smaller clause is permeable to more movement types than a larger clause, because the maximal projection of a smaller clause will be lower in fseq than the maximal projection of a larger clause. Constraining movement in terms of clause size extends beyond the distinction between A-movement and \(\overline{\textrm{A}}\)-movement. Here are several examples (taken from Keine 2016): (i) Infinitival clauses are opaque to extraposition, but not regular A-movement and \(\overline{\textrm{A}}\)-movement (Ross 1967; Baltin 1978). (ii) Embedded questions are opaque to wh-movement, but not topicalization and relativization (Williams 2013). (iii) In Hindi-Urdu, finite clauses are opaque to A-scrambling, but not \(\overline{\textrm{A}}\)-scrambling (Mahajan 1990). In German, (iv) embedded V2 clauses are opaque for movement into a verb-final clause, but not movement into a V2 clause (Haider 1984); (v) finite clauses are opaque to scrambling and relativization, but not wh-movement or topicalization (Bierwisch 1963; Ross 1967; Bayer and Salzmann 2013; Müller 2014b); and (vi) incoherent infinitives are opaque to scrambling, but not wh-movement and relativization (Bech 1955/1957; Wurmbrand 2001). What these asymmetries share is a domain that is permeable to one movement type but not to another (what Keine terms selective opacity). The GBOIM derives these asymmetries as “generalized” improper movement configurations, i.e. in terms of clause size. For more discussion, see Williams (1974, 2003, 2013), Müller and Sternefeld (1993, 1996), Abels (2007, 2009, 2012a,b), Neeleman and van de Koot (2010), Müller (2014a,b), and Keine (2016, 2019, 2020).


There are crucial parallels between the locality problems from Sects. 3 and 4 and the kinds of movement configurations ruled out by the Generalized Ban on Improper Movement. To see these parallels, let us consider the two locality problems in turn.

With respect to the Movement-Case Generalization, recall the Lower-DP Problem, according to which a DP cannot be the lower DP in a dependent-case pair when in an intermediate landing site. (I will return to the Higher-DP Problem in Sect. 5.4.) This characterization can be recast in terms of the WC, viz. clause size and the functional sequence: a DPα in [Spec, CP] cannot enter into a dependent-case relationship with a DPβ in a higher clause (DPα being the lower in the pair) if DPβ is in [Spec, TP], [Spec, vP], or [Spec, VP], because \(\textrm{C} \succ \textrm{T}\), \(\textrm{C} \succ \textrm{v}\), and \(\textrm{C} \succ \textrm{V}\) in fseq. This is schematized in (53).

figure bb

Note that a dependent-case relationship between two [Spec, CP] positions also needs to be ruled out (54). This configuration does not fall under the characterization of the Lower-DP Problem, or under the Ban on Improper Case to be proposed below, because \(\textrm{C} \nsucc \textrm{C}\) in fseq.

figure bc

However, for the higher DP in (54) to be in [Spec, CP], it will have undergone \(\overline{\textrm{A}}\)-movement to that position. Thus, the impossibility of this particular configuration falls under the Higher-DP Problem (i.e. that an \(\overline{\textrm{A}}\)-moved element cannot be the higher DP in a dependent-case pair) and will follow from the analysis of the Higher-DP Problem in Sect. 5.4.

The same parallels apply to the Finnish Case Generalization, according to which a matrix subject can license dependent case across an embedded TP clause boundary, but a matrix object and a matrix adjunct cannot. In terms of the WC: a DP in [Spec, TP] can license dependent case on another DP across a TP, because \(\textrm{T} \nsucc \textrm{T}\), but a DP in a lower position such as [Spec, vP] or [Spec, VP] cannot do so, because \(\textrm{T} \succ \textrm{v}\) and \(\textrm{T} \succ \textrm{V}\). This is schematized in (55).

figure bd

These parallels in (53) and (55) are the motivation for extending the WC to dependent-case assignment. I propose that dependent-case assignment is subject to the Ban on Improper Case in (56), a direct extension of the WC to case.

figure be

The Ban on Improper Case states barrierhood for dependent-case assignment relative to the fseq-position of the higher DP in the dependent-case pair. For example, a DP in [Spec, TP] can license a dependent-case relationship with another DP past TP, vP, and VP, because none of these projections are higher than T in fseq (57). However, a DP in [Spec, TP] cannot license dependent case past CP, because \(\textrm{C} \succ \textrm{T}\), so that CP is a barrier to dependent-case licensing from TP (58). Note that CP’s barrierhood extends to all projections lower than T in fseq as well.

figure bf
figure bg
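The barrier calculation in (57) and (58) can be sketched as follows. The sketch paraphrases the prose above rather than the formulation in (56); the ranking `FSEQ_RANK` (C above T above v above V) and the names `is_barrier` and `can_license` are assumptions introduced for illustration, and `crossed` is a hypothetical encoding of the projections that separate the two DPs.

```python
from typing import List

# Assumed functional sequence, encoded as ranks (higher number = higher in fseq).
FSEQ_RANK = {"V": 0, "v": 1, "T": 2, "C": 3}


def is_barrier(projection: str, licensor_head: str) -> bool:
    """XP is a barrier for a licensor in [Spec, YP] if X is higher than Y."""
    return FSEQ_RANK[projection] > FSEQ_RANK[licensor_head]


def can_license(licensor_head: str, crossed: List[str]) -> bool:
    """The higher DP sits in [Spec, `licensor_head`P]; `crossed` lists the
    projections lying between it and the lower DP."""
    return not any(is_barrier(p, licensor_head) for p in crossed)


print(can_license("T", ["T", "v", "V"]))  # cf. (57)
print(can_license("T", ["C"]))            # cf. (58)
print(can_license("v", ["T"]))            # vP-internal licensor across a TP
```

Because barrierhood is stated relative to the licensor, the same TP is transparent for a DP in [Spec, TP] and opaque for a vP-internal DP.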

Notice that the Ban on Improper Case makes no reference to movement or clause types. It is more general than the empirical data that motivated it. The remainder of this section shows how the Ban on Improper Case applies to our two very different generalizations: the Finnish Case Generalization in Sect. 5.3 and the Movement-Case Generalization in Sect. 5.4.

Application to Finnish

The Finnish Case Generalization is repeated below in (59).

figure bh

Under the Ban on Improper Case, the matrix subject is able to license dependent case across the embedded TP boundary because the matrix subject is located in [Spec, TP] and \(\textrm{T} \nsucc \textrm{T}\) in fseq. Thus, it licenses dependent case on the matrix object (within the same clause) and on the embedded object (across the clause boundary). This is schematized in (60).Footnote 36

figure bi

The matrix object occupies a vP-internal position (the precise position is inconsequential, but somewhere below v). From its vP-internal position, the matrix object is unable to license dependent case across the embedded TP boundary because \(\textrm{T} \succ \textrm{v}\) in fseq, thereby making TP a barrier for dependent-case licensing from vP-internal DPs, in particular from DPs in [Spec, vP] and any position lower in fseq. The same barrierhood applies for matrix adjuncts as well, which are generated in vP-internal positions too. As such, in the absence of a matrix subject, the [case: □] features on the matrix and embedded objects both remain unvalued throughout the derivation and are realized as unmarked case at PF. This is schematized in (61).

figure bj
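The derivations in (60) and (61) can be simulated with a short sketch that combines dependent case with the Ban on Improper Case. Several simplifications are assumed: DPs are listed in c-command order, vP-internal DPs are encoded with the position "V", no barrier intervenes within a single clause, and the only clause boundary is the embedded TP. The class `DP` and the functions `licenses` and `assign` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List

FSEQ = ["V", "v", "T", "C"]   # assumed fseq, lowest to highest
EMBEDDED_SIZE = "T"           # ma-infinitives are taken to be TPs


@dataclass
class DP:
    name: str
    head: str               # fseq position hosting the DP ("T" = [Spec, TP])
    clause: str             # "matrix" or "embedded"
    lexical: bool = False   # True if the DP bears lexical case


def licenses(hi: DP, lo: DP) -> bool:
    """Can `hi` (which c-commands `lo`) license dependent case on `lo`?"""
    if hi.lexical or lo.lexical:
        return False
    if hi.clause == lo.clause:
        return True  # simplification: no barrier inside a single clause
    # Crossing the embedded TP: it is a barrier if T is higher in fseq
    # than the position of the would-be licensor.
    return FSEQ.index(EMBEDDED_SIZE) <= FSEQ.index(hi.head)


def assign(dps: List[DP]) -> Dict[str, str]:
    """`dps` is ordered by c-command, highest first (hypothetical encoding)."""
    out = {}
    for i, lo in enumerate(dps):
        if lo.lexical:
            out[lo.name] = "LEX"
        elif any(licenses(hi, lo) for hi in dps[:i]):
            out[lo.name] = "ACC"   # dependent case
        else:
            out[lo.name] = "NOM"   # unmarked case at PF
    return out


subj = DP("matrix subject", "T", "matrix")
m_obj = DP("matrix object", "V", "matrix")
m_adj = DP("matrix adjunct", "V", "matrix")
e_obj = DP("embedded object", "V", "embedded")

print(assign([subj, m_obj, e_obj]))    # subject present, cf. (60)
print(assign([m_obj, e_obj]))          # subject absent, cf. (61)
print(assign([m_obj, m_adj, e_obj]))   # matrix adjunct added, cf. (42a)
```

In the last configuration the matrix object licenses accusative on the adjunct but not on the embedded object, matching the description of (42a).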

Under this analysis, there is nothing special about case in ma-infinitives. The same general case mechanism, namely dependent case, applies everywhere in the language as syntactic structure is built up, following Sect. 2—but this mechanism is constrained by the Ban on Improper Case.

Previous analyses of ma-infinitives are all broadly based on the idea that when the matrix subject is absent or bears lexical case, i.e. the environments in Finnish with nominative objects, the ability to assign accusative case is gone altogether (Vainikka 1989; Nelson 1998; Vainikka and Brattico 2014).Footnote 37 However, we saw in Sect. 4.2 that structurally case-marked adjuncts are still accusative in configurations like (61); the relevant datapoint is repeated in (62).

figure bk

If the ability to assign accusative case is absent in configurations like (61), as previous analyses assume, then there would be no source of accusative case for the adjunct in (62). However, (62) follows without further ado on the DCT analysis developed in this paper: the matrix object licenses dependent case on the adjunct, but the matrix and embedded objects cannot enter into a dependent-case relationship without violating the Ban on Improper Case.

Application to movement

The Movement-Case Generalization is repeated below in (63).

figure bl

Let us begin with \(\overline{\textrm{A}}\)-movement. Recall that the locality problem with \(\overline{\textrm{A}}\)-movement is that an \(\overline{\textrm{A}}\)-moved element cannot enter into dependent-case relationships from its intermediate and final landing sites. Thus, we must consider (i) when an \(\overline{\textrm{A}}\)-moved element is the lower DP in a potential dependent-case pair (see (21)) and (ii) when it is the higher one (see (22)). These are the Lower-DP Problem and the Higher-DP Problem respectively. For the sake of clarity, I will label the higher and lower DPs in a dependent-case pair as DPα and DPβ, respectively, unless the DP in question is an \(\overline{\textrm{A}}\)-moved element, for which I will reserve the label DPμ. It should be emphasized that this labeling is for expository purposes only, and the Ban on Improper Case does not (need to) take into account whether the relevant DPs have undergone movement.

According to the Ban on Improper Case, a DPα in [Spec, TP], [Spec, vP], or [Spec, VP] cannot enter into a dependent-case relationship with a DPμ in embedded [Spec, CP] because these projections are all lower than C in fseq. That is, CP is a barrier for dependent-case licensing from TP and all projections lower in fseq (64). This barrierhood accounts for why an \(\overline{\textrm{A}}\)-moved element may not have its case altered at its intermediate and final landing sites, i.e. the Lower-DP Problem.

figure bm

The Ban on Improper Case, however, does not prohibit a DPμ in [Spec, CP] from establishing a dependent-case relationship with a DPβ lower in the same clause, i.e. the Higher-DP Problem, as C is higher than these projections in fseq. I propose that \(\overline{\textrm{A}}\)-moved DPs cannot themselves license dependent case because they are encased in a QP, i.e. Q-particle Phrase (in the sense of Cable 2007, 2010).Footnote 38 Since only DPs may establish a dependent-case relationship, a DP inside a QP cannot be the higher DP in a dependent-case pair: it does not c-command out of the QP and hence never c-commands other DPs in the clause (65a). On the other hand, a DP inside a QP can be the lower DP in the pair because other DPs can still c-command into the QP (65b).

figure bn

However, a DP that undergoes \(\overline{\textrm{A}}\)-movement should still be able to enter into dependent-case relationships from the A-positions that it occupied prior to \(\overline{\textrm{A}}\)-movement, which (65a) does not permit. To solve this problem, I adopt Safir’s (2019) independently motivated proposal that the QP is countercyclically merged onto the DP immediately before it \(\overline{\textrm{A}}\)-moves (see also Rezac 2003; Stanton 2016).Footnote 39 To illustrate how this applies to dependent case, consider the derivation of a simple wh-subject question in (66): (i) the subject is base-merged in [Spec, vP], from where it licenses dependent case on the object (66a); (ii) the subject A-moves to [Spec, TP] (66b); (iii) the QP is then merged on top of the subject (66c); and finally (iv) the QP moves to [Spec, CP] (66c). The \(\overline{\textrm{A}}\)-moving DP will always be encased in the QP before it reaches any intermediate or final [Spec, CP] landing sites, thereby preventing it from licensing dependent case on other DPs from those derived positions, i.e. the Higher-DP Problem.

figure bo
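The timing of the derivation in (66) can be traced step by step. The following is a minimal sketch, assuming only that a DP stops being a possible licensor once the QP is merged on top of it; the class `WhSubject` and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WhSubject:
    position: str = "Spec,vP"
    in_qp: bool = False
    log: List[str] = field(default_factory=list)

    def can_license(self) -> bool:
        # A DP encased in a QP does not c-command out of the QP.
        return not self.in_qp

    def step(self, label: str, position: Optional[str] = None,
             merge_qp: bool = False) -> None:
        """Apply one derivational step and record the licensing status."""
        if position is not None:
            self.position = position
        if merge_qp:
            self.in_qp = True
        self.log.append(f"{label}: {self.position}, licensor={self.can_license()}")


subj = WhSubject()
subj.step("(i) base-merge")                      # licenses case on the object
subj.step("(ii) A-move", position="Spec,TP")
subj.step("(iii) merge QP", merge_qp=True)
subj.step("(iv) A-bar-move", position="Spec,CP")

for line in subj.log:
    print(line)
```

The subject is a possible licensor at steps (i) and (ii) only, so dependent case on the object is licensed before the QP is added, and nothing is licensed from [Spec, CP].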

Note that the addition of the QP layer does not handle the Lower-DP Problem, because other DPs can nonetheless c-command a DP encased in a QP. This problem still requires the Ban on Improper Case, as was schematized above in (64). This point, however, raises an alternative analysis where QPs are themselves opaque to case assignment, so that once they are formed on a DP, that DP no longer interacts with case assignment. Such an analysis faces the dilemma that the opacity for case assignment would have to come from its own source and not apply to other dependencies, because c-command into a QP for other dependencies, e.g. binding, is indeed possible. For example, consider (67), in which an anaphor in a moved wh-element—on the analysis here, a QP—can be bound from its landing site (Barss 1986; Lebeaux 1988).

figure bp

Under this analysis, the behavior of \(\overline{\textrm{A}}\)-movement with respect to dependent case follows from two components: a QP-shell and the Ban on Improper Case. The former handles the Higher-DP Problem, and the latter handles the Lower-DP Problem.

Because the QP-shell and the Ban on Improper Case are independent from each other, this account predicts that if the ‘higher/lower’ symmetry breaks down, it should crucially do so in one direction. Namely, if an \(\overline{\textrm{A}}\)-moving DP is not encased in a QP-shell, then it should be able to be the higher DP in a dependent-case pair in its intermediate and final landing sites, but not the lower DP. In other words, it should exhibit the behavior of the Lower-DP Problem, but not the Higher-DP Problem. Empirically, this would be a movement type that is just like wh-movement—targeting a position high in fseq, like [Spec, CP]—except it does not involve a QP-shell.Footnote 40 This prediction is schematized in (68).

figure bq
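The independence of the two components can be made explicit in a sketch. The predicate below simply conjoins the QP-shell condition with the barrier condition; the fseq ordering (C above T above v above V), the function name `can_pair`, and the encoding of crossed projections are assumptions made for illustration.

```python
from typing import List

FSEQ = ["V", "v", "T", "C"]  # assumed fseq, lowest to highest


def can_pair(higher_head: str, crossed: List[str], higher_in_qp: bool) -> bool:
    """Can the higher DP, in [Spec, `higher_head`P], license dependent case
    on a lower DP separated from it by the projections in `crossed`?"""
    if higher_in_qp:
        return False  # component 1: a DP inside a QP does not c-command out
    # Component 2: no crossed projection may be higher in fseq than the
    # position of the higher DP (Ban on Improper Case).
    return not any(FSEQ.index(p) > FSEQ.index(higher_head) for p in crossed)


# Lower-DP Problem, cf. (64): matrix DP in [Spec, TP], moved DP in embedded [Spec, CP].
print(can_pair("T", ["C"], higher_in_qp=False))
# Higher-DP Problem, cf. (65a): moved DP in [Spec, CP], encased in a QP.
print(can_pair("C", ["T"], higher_in_qp=True))
# Prediction, cf. (68): the same movement without a QP-shell.
print(can_pair("C", ["T"], higher_in_qp=False))
```

Removing the QP-shell flips only the second configuration; the first stays excluded because it is ruled out by the Ban on Improper Case alone.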

This prediction appears to be borne out in Koryak, as described by Abramovitz (2020). Abramovitz shows that in Koryak, (i) ergative is dependent case and (ii) long wh-movement of an embedded nominative DP results in dependent ergative case on the matrix subject, as shown in (69). Note that for readability, I do not gloss the verbal morphology in the Koryak data.

figure br

(69) does not reveal where the wh-element enters into the dependent-case relationship with the matrix subject: matrix [Spec, CP] (the higher position) or embedded [Spec, CP] (the lower position). For (68) to be true, it must be the higher position. Crucially, embedded questions show that it is indeed the higher position.Footnote 41 In embedded questions, moving a nominative wh-element to embedded [Spec, CP] does not trigger dependent ergative on the matrix subject, as shown in (70). This is precisely what (68) predicts. By contrast, if (69) involved the wh-element establishing the dependent-case relationship from the lower position, then we would expect the matrix subject to be ergative in (70) as well, contrary to fact.

figure bs

Therefore, Koryak seems to confirm the prediction in (68) of the two-component analysis being proposed here. Whether this pattern can be found more widely, I leave for future research. Note that this analysis of Koryak requires the unorthodox assumption that ergative is dependent case assigned to the lower DP in a dependent-case pair, rather than the higher DP. Exploring this point is outside the scope of this paper, but see Yuan (2018, 2020) for independent arguments that “ergative” (i.e. dependent case on transitive subjects) can be assigned downwards.Footnote 42

Turning now to A-movement, recall the three examples of movement feeding dependent case from Sect. 3.1: (some instances of) object shift, Shipibo applicatives of unaccusatives, and the English raising predicate strike as. All three of these examples obey the Ban on Improper Case. For object shift, the subject’s position is higher in fseq than the raised object’s position—presumably [Spec, TP] and [Spec, vP] respectively—as schematized in (71). Similarly for Shipibo applicatives, the raised theme’s position is higher in fseq than the applicative’s position (72). Generally, if a DP moves clause-internally (i.e. within the same extended projection), it will be able to establish dependent-case relationships with other DPs in that same clause, because the higher DP in the pair will always be higher in fseq than the lower DP.

figure bt
figure bu

For English strike as, the movement crosses a clause boundary (the movement itself obeying the GBOIM; see Sect. 5.1), but the position in which the moved DP lands is higher in fseq than the position of the internal argument of strike. Therefore, according to the Ban on Improper Case, the two DPs can establish a dependent-case relationship, as schematized in (73).

figure bv

The Ban on Improper Case also explains why the internal argument of strike does not license dependent case on the subject in its embedded position before it moves: the vP-internal position of the internal argument is lower in fseq than T. Therefore, TP is a barrier to licensing a dependent-case relationship with the embedded subject.

The discussion thus far has focused on dependent case, but it should be noted that this analysis does not preclude lexical case from being assigned to a moved position. First, the Ban on Improper Case is not formulated to encompass lexical-case assignment. Second, because lexical case is assigned in a siblinghood relation, it would fall outside the purview of the Ban on Improper Case regardless. Assuming Bare Phrase Structure, where what projects is the head itself (Chomsky 1995a), it is then possible for a DP to move to a specifier position and be assigned a lexical case under siblinghood, in what would traditionally be a specifier–head relation (à la Rezac 2003) (see also fn. 5). To illustrate, consider dative–accusative constructions in Faroese (74), which are historically related to the more familiar Icelandic dative–nominative constructions.

figure bw

These constructions can be analyzed as follows: (i) the subject is base-merged in [Spec, vP], from where it licenses dependent case (= accusative) on the object (75a); (ii) the subject moves to a higher projection in the clause, e.g. Exp\(^{0}\) (75b); and (iii) the head of this projection assigns the subject lexical dative case (75c). The difference between Faroese and Icelandic is that in Icelandic, the subject is assigned dative case in its base-generated position, thus bleeding dependent-case assignment and yielding nominative objects.Footnote 43

figure bx

There are several other examples that, to my knowledge, might instantiate this kind of derivation with movement to a lexical-case position: ergative subjects in what Woolford (2015) terms Active Ergative languages (where ergative case is associated with external arguments), the “marked nominative” construction in Dinka (van Urk 2015), and differential object marking in Hindi-Urdu (Bhatt and Anagnostopoulou 1996). There are likely many other such instances, but these exemplify when such a derivation might be reasonably invoked.

In sum, the Ban on Improper Case accounts for the interactions between movement and dependent case: roughly, A-movement, but not \(\overline{\textrm{A}}\)-movement, may feed dependent-case assignment. Importantly, the analysis does not invoke separate operational primitives for A-movement and \(\overline{\textrm{A}}\)-movement. Rather, the analysis derives from the positions targeted by different movement types. Moreover, if Safir (2019) is correct that the QP-shells in \(\overline{\textrm{A}}\)-movement can be derived from independent factors, then the analysis presented here captures the A/\(\overline{\textrm{A}}\)-distinction in this (narrow) domain purely as an epiphenomenon. This thinking is in line with minimalist syntax, where all structure building is the result of the operation Merge. The foundations of the analysis were also independently motivated by the Finnish Case Problem, which crucially does not involve movement.

Remarks on Baker (2015)

Because Baker’s (2015) system is the most comprehensive dependent-case system to date, it is instructive to consider how it fares on the data considered in this paper. First, Baker does not investigate anything comparable to Finnish ma-infinitives, and hence nothing in his system handles the Finnish Case Generalization. Second, his treatment of the A-movement examples from Sect. 3.1 is more or less in line with what I propose for them in Sect. 5.4. Therefore, let us set aside these two issues and focus on the Higher-DP Problem and the Lower-DP Problem, where a comparison is more fruitful.

Regarding the interaction of case and movement, Baker proposes that (i) dependent case is assigned at phasal Spellout and that (ii) it is calculated within the phase complement, crucially excluding the phase edge.Footnote 44 Consider the wh-question in (76) at the point when the CP phase is spelled out. As the phase complement of C, TP is the domain of dependent case. The higher copy of who in [Spec, CP] is not within the phase complement and hence does not factor into the dependent-case calculus. Thus, the only dependent-case relationship established at the CP phase is between she and the lower copy of who. The higher copy of who in [Spec, CP] is spelled out in the next phase (or by whatever procedure spells out the edge of the highest phase).

figure by

This analysis accounts for the Higher-DP Problem: in its intermediate and final landing sites, where it could be the higher DP of a dependent-case pair, an \(\overline{\textrm{A}}\)-moved element is not included in the calculus of dependent case for that phase.
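The logic of this phase-based calculus can be sketched in a few lines of Python. This is only an illustrative simplification, not Baker's own formalization: the function name `dependent_case_pairs` and the list representation of DPs are assumptions made for the example, and the sketch ignores further conditions on dependent case (such as whether a DP already bears case).

```python
# Hypothetical sketch: names and data layout are illustrative, not Baker's.

def dependent_case_pairs(complement_dps):
    """Return (higher, lower) DP pairs inside a phase complement.

    `complement_dps` lists the DPs in the phase complement from structurally
    highest to lowest. DPs at the phase edge are simply not passed in, which
    models their exclusion from the calculus at this Spellout.
    """
    pairs = []
    for i, higher in enumerate(complement_dps):
        for lower in complement_dps[i + 1:]:
            pairs.append((higher, lower))
    return pairs


# The wh-question in (76) at Spellout of the CP phase.
phase_edge = ["who (higher copy)"]              # [Spec, CP]: outside the domain
phase_complement = ["she", "who (lower copy)"]  # TP: the dependent-case domain

print(dependent_case_pairs(phase_complement))
# [('she', 'who (lower copy)')]
```

Because the higher copy of who is never part of the input, it cannot be the higher member of any pair at this phase, which is the effect described above.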

However, there are two drawbacks. First, this treatment of the Higher-DP Problem does not extend to the Koryak data, where an \(\overline{\textrm{A}}\)-moved element in [Spec, CP] does in fact establish a dependent-case relationship with a DP lower in that clause. Baker’s analysis categorically rules out such configurations, which appears to be too strong. Second, it leaves the Lower-DP Problem unresolved. As laid out in Sect. 3.2, from embedded [Spec, CP] positions, a DP unvalued for case should be eligible to be the lower DP in a dependent-case pair, but crucially it is not. There is nothing in Baker’s analysis to rule out such pairs. In general, appealing to the locality afforded by phases cannot account for the Lower-DP Problem (see (23)).

Therefore, Baker’s (2015) analysis of the interaction between case and movement does not extend to the full range of facts considered in this paper. However, this should not be construed as an argument against Baker’s overall dependent-case system, since it is otherwise compatible with the Ban on Improper Case.

Deriving the Williams Cycle

While the Ban on Improper Case derives the range of facts presented in this paper, the fact that analogous restrictions have been observed for movement (e.g. Williams 1974, 2003, 2013; Müller and Sternefeld 1993, 1996; Abels 2007, 2009, 2012a,b; Neeleman and van de Koot 2010; Müller 2014a,b) and agreement (Keine 2016, 2019, 2020) strongly suggests that these “WC effects” have a unified source. Here, there are two interconnected issues: (i) how to formulate the WC so as to encompass case, movement, and agreement and (ii) how to derive the WC in the grammar.

The existing analyses of WC effects, other than Williams’s own, analyze the WC as the result of a constraint on either Merge (Abels 2007, 2009; Müller 2014a,b) or Agree (Keine 2016, 2019, 2020). Examining the specific details of these proposals would take us too far afield. What is important for present purposes is that they are operation-specific. For these proposals to extend to the case facts presented here, i.e. the Ban on Improper Case, dependent-case assignment would need to involve Agree (or, in principle, Merge). However, a dependent-case relation does not resemble an Agree-relation, in that it does not involve any obvious valuation (or checking) of syntactic features between the two DPs. Put differently, it is not clear how the two DPs valuing features on each other (in either direction) would result in one of them being assigned dependent case. Thus, if case, movement, and agreement all exhibit WC effects and (dependent) case assignment does not involve Agree, then WC effects must be the result of a more general (non-operation-specific) constraint in the grammar, as Keine (2020:332) also acknowledges.

The line of thinking that I advance in this section is that WC effects can be uniformly derived in an operation-general way if they are analyzed as the result of how clausal embedding works, as Williams (2003) originally proposed. The challenge for this approach is that it enforces a very strict locality. While this strict locality appears to be appropriate for case (the focus of this paper), it has been argued that it is empirically too restrictive for movement (Abels 2007, 2009); I will return to this issue in Sect. 7.3.

Broadly construed, the WC is the notion that one and the same node can be a barrier to some dependencies, but not to other dependencies. I propose adopting the particularly strong formulation of the WC in (77).

figure bz

The formulation in (77) is operation-general; it does not reference specific operations and thereby covers case, movement, and agreement alike. It also encodes the strict locality of the Generalized Ban on Improper Movement (see Sect. 5.1) and thus is commensurate with the formulation in Williams (2003, 2013). Accordingly, the Ban on Improper Case is a subcase of (77).

An additional upshot of the formulation in (77) is that it is compatible with the various syntactic implementations of dependent-case assignment: a distinct syntactic operation (as assumed in this paper; see Preminger 2011, 2014), parasitic on cyclic linearization (Baker 2015), or binding relations (Pesetsky 2011). In other words, the WC does not require a particular analysis of dependent case, only that it occurs in the syntax (see Sect. 7.1).

To derive the WC as formulated in (77), I suggest returning to Williams’s (2003, 2013) own analysis of the WC, which derives the WC from the syntax of embedding (for an overview of this theory, see Hornstein and Nevins 2005). The core idea of the analysis is that embedding is constrained by the Level Embedding Conjecture (LEC) in (78).Footnote 45

figure ca

The basic idea behind the LEC is that clauses are built up in parallel. Embedding may take place at any point, but once a clause has been embedded, it no longer increases in size.Footnote 46 The different points in the derivation at which embedding may take place correspond to the functional sequence. Williams calls this notion the derivational clock (or ‘F-clock’). To illustrate, consider that-clause embedding in (79) (ignoring the vP). First, both clauses are built up to the VP-level; here, the embedding verb think merges with a placeholder for a CP-clause.Footnote 47 Second, both clauses are built up to the TP-level (79b). Third, both clauses are built up to the CP-level (79c). Last, embedding occurs: the embedded CP is substituted for the placeholder in the matrix clause (79c).

figure cb

Under the LEC, embedding is a substitution operation (though Williams does not explicitly call it such), analogous to Chomsky’s (1955, 1957) theory of generalized transformations (for a reprise, see also Chomsky 1995b:173-174) and to substitution in Tree Adjoining Grammar (Joshi et al. 1975; Kroch and Joshi 1985).
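The parallel build-up and the substitution step can be made concrete with a small simulation. The following is a minimal sketch under simplifying assumptions, not Williams's formalization: the names `FSEQ`, `derive`, and `root_contains` are hypothetical, the functional sequence is reduced to V, T, and C (ignoring vP, as in (79)), and each derivational stage is recorded simply as the current matrix root, the current size of the embedded clause, and whether substitution has taken place.

```python
# Hypothetical sketch of derivations under the Level Embedding Conjecture.
FSEQ = ["V", "T", "C"]  # functional sequence, lowest to highest (vP ignored)


def derive(embed_level):
    """Build a matrix and an embedded clause in parallel.

    Returns a list of stages (matrix_root, embedded_size, is_embedded).
    The embedded clause grows in lockstep with the matrix clause until it
    is substituted for its placeholder; after that it stops growing.
    """
    stages = []
    embedded_size = None
    is_embedded = False
    for level in FSEQ:
        if not is_embedded:
            embedded_size = level  # both clauses reach this level together
        stages.append((level + "P", embedded_size + "P", is_embedded))
        if level == embed_level and not is_embedded:
            is_embedded = True  # substitution for the placeholder
            stages.append((level + "P", embedded_size + "P", is_embedded))
    return stages


def root_contains(stages, matrix_root, embedded_size):
    """True if some stage has this root with this clause already embedded."""
    return (matrix_root, embedded_size, True) in stages


that_clause = derive("C")
print(that_clause)
print(root_contains(that_clause, "TP", "CP"))  # False
print(root_contains(that_clause, "CP", "CP"))  # True
```

For that-clause embedding, the stages run through VP, TP, and CP for both clauses before substitution, so the record contains no stage in which a matrix root TP already contains the embedded CP.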

On this proposal, the WC follows from the strict cycle. Let us take the strict cycle to be the result of the Strict Cycle Condition, as defined in (80) (the formulation is taken from Müller 2017; see Chomsky 1973, 1995b, 2001, 2008), which precludes syntactic operations from solely applying within embedded domains. Embedding itself must be considered to be admissible under (80).Footnote 48

figure cc
figure cd

Consider now the standard case of improper movement, namely hyperraising: the prohibition on A-movement out of a finite clause. Under the LEC, at no point in the derivation is there a root TP that contains the embedded CP (82). Consequently, there is no means for an element in the embedded CP to move to [Spec, TP] while TP is the root node. The only point in the derivation at which the embedded CP is embedded in the matrix clause is when both clauses are built up to the CP-level. At that point in the derivation, movement to [Spec, TP] would violate the strict cycle.

figure ce

To generalize, under the LEC, a root XP containing an embedded YP, where Y≻X in fseq, never exists in the course of a derivation (83a). A YP is only embedded once the embedding clause has itself been built up to the YP-level (83b).

figure cf

No operation that is triggered in XP, whether it be movement, agreement, or case, can look into a YP (where Y≻X) because the relevant structure where X and [Spec, XP] would have access to YP within the strict cycle is simply not created by the grammar. This is illustrated in (84) for dependent-case assignment.

figure cg
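The generalization amounts to a single comparison on the functional sequence. The sketch below is a hedged illustration, not part of the proposal itself: `FSEQ`, `outranks`, and `is_barrier` are hypothetical names, and the simplified sequence V, v, T, C is an assumption made for the example. An embedded YP counts as a barrier to an operation triggered in XP exactly when Y≻X, whatever the operation is.

```python
# Hypothetical sketch of the operation-general barrier check.
FSEQ = ["V", "v", "T", "C"]  # assumed simplified fseq, lowest to highest


def outranks(y, x):
    """True if Y≻X, i.e. Y is strictly higher than X in fseq."""
    return FSEQ.index(y) > FSEQ.index(x)


def is_barrier(embedded, trigger):
    """An embedded YP is a barrier to an operation triggered in XP iff Y≻X.

    The check does not mention the operation, so it applies alike to
    movement, agreement, and dependent-case assignment.
    """
    return outranks(embedded, trigger)


print(is_barrier("C", "T"))  # True: CP blocks operations triggered in TP
print(is_barrier("C", "C"))  # False: CP is transparent to operations in CP
print(is_barrier("T", "T"))  # False: TP is transparent to operations in TP
```

The first call corresponds to hyperraising: an operation triggered in the matrix TP cannot reach into an embedded CP, whereas an operation triggered in the matrix CP can.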

As such, all syntactic dependencies are subject to the WC, regardless of whether or not they share the same operational core. All of the WC effects are thus uniformly derived from the timing of embedding.

Before concluding this section, it is worth briefly considering what counts as a ‘syntactic dependency’ in the context of the WC and the LEC. Recall from Sect. 4.2 that in Finnish ma-infinitives, the matrix and embedded objects cannot establish a dependent-case relationship with each other (see (39)), but they can establish a binding relationship (see (40)). The former is predicted by the LEC, but the latter is not; the asymmetry between the two thus needs to be explained. I take a ‘syntactic dependency’ to be a dependency that exists in the narrow syntax, thereby being interspersed with structure building. By this definition, case, movement, and agreement are all syntactic dependencies. However, I contend that LF relations are not syntactic in this relevant, narrow sense. Because the LEC only constrains the narrow syntax, LF relations do not exhibit WC effects. The justification for this claim is that under the LEC, semantic interpretation can only proceed after all embedding has taken place, because the embedded clause’s denotation is needed to compute the VP’s denotation, and so forth. Thus, it is independently the case that LF must be computed on the basis of the final output of the narrow syntax. Crucially, the core of a binding dependency, i.e. the λ-operator and the variable that it binds, is a relation that only needs to hold at LF (Lebeaux 2009).Footnote 49 Returning to Finnish, the asymmetry between dependent case and binding in ma-infinitives, then, arises because the dependent-case relationship would need to be established in the narrow syntax, which is not possible because of the LEC, while the binding relationship is established at LF and hence is unaffected by the LEC. Verification of this hypothesis would require examining other LF relations, such as scope and focus, the problem being that many (perhaps most) of these relations are clause-bounded and thus are not the kinds of dependencies that the LEC would affect. I leave pursuing this topic to future research.Footnote 50


This paper has argued that case assignment is subject to the Ban on Improper Case in (85). This constraint is an extension of the Williams Cycle (WC) particularized to case. The motivation for improper case came from two disparate empirical domains: the interaction between case and movement and crossclausal case assignment in Finnish. Both of these locality problems were shown not to fall under the purview of standard notions of locality, e.g. phases, but rather they follow from the Ban on Improper Case.

figure ch

It was then shown that Williams’s (2003) analysis of the WC in terms of clausal embedding uniformly captures WC effects for case, movement, and agreement. Thus, it crucially derives the Ban on Improper Case.

The remainder of this paper is devoted to discussing some of the issues that emerge from the Ban on Improper Case. Sections 7.1 and 7.2 discuss two broader ramifications: the timing of case assignment and the WC’s relation to phases, respectively. Finally, the issue of potential counterexamples to the strict locality enforced by the (strong) WC is taken up in Sect. 7.3.

Timing of case assignment

An overarching question in the case literature is at what point in the derivation case assignment happens. In Marantz’s (1991) original implementation of DCT, case assignment is situated at PF, i.e. in the postsyntactic morphological component. This line of thinking prevailed in the early work on DCT (e.g. McFadden 2004; Bobaljik 2008; an exception being Bittner and Hale 1996). It was also often considered a key difference between DCT and the more standard FHCT, which situates case assignment in the narrow syntax (see e.g. Legate 2008). However, more recent work on DCT has argued that, even under DCT, case assignment must be in the narrow syntax and not at PF (Baker and Vinokurova 2010; Preminger 2011, 2014, to appear; Baker 2015).

The Ban on Improper Case lends further support to the argument that case assignment must be in the narrow syntax. First, as movement is subject to the WC and movement occurs in the narrow syntax, the WC itself must be due to a constraint in the syntax in order for it to restrict movement in this way. It stands to reason then that anything else that is subject to the WC must be in the syntax as well. As such, because case assignment is subject to the WC, it too must be in the syntax. Second, the information required for the WC in the first place is fundamentally syntactic in nature, and replicating it at PF just so that it could apply to case assignment would be redundant. Third, it was argued in Sect. 3.2 that dependent-case assignment is interspersed with structure building, which would not follow if case assignment were at PF.

When in the narrow-syntactic derivation does case assignment happen then? In Sect. 3.2, I proposed that dependent case is assigned as early as possible, a principle that I called Earliness. This notion of earliness also trivially extends to lexical case under DCT, where it is always assigned locally (see Sect. 2). Earliness is the strongest hypothesis about the timing of case assignment, and it is fully consistent with the data presented in this paper. Another possibility, though, is that case assignment is delayed until phasal Spellout (as in Baker 2015). The effects of such an analysis largely depend on what the phases are. If vP is a phase, such an analysis is in principle possible, as long as case assignment precedes any movement to the phase edge, in order to avoid the problems solved by Earliness. However, such an analysis would have to grapple with the fact that vP does not seem to erect a locality domain for case in the same way as CP does; see the next section. If only CP is a phase, though, then delaying case assignment until phasal Spellout will require saying something special about A-scrambling, which would be phase-internal (e.g. to [Spec, TP]) on such an analysis, but does not affect case assignment. Fully exploring these issues is beyond the scope of this paper, so I leave them for future research. However, it should be pointed out that whether XP is a domain at which some operation applies, and whether XP is a locality domain that blocks that same operation are in principle distinct questions, even if they are typically conflated in phase theory.


The WC and phases—the more standard notion of locality—are not mutually exclusive. They may coexist as independent constraints on syntactic operations. For instance, the WC does not force successive-cyclic movement through [Spec, CP]; this is still a consequence of phases.

It is standardly assumed that CP and vP are phases, and consequently that successive-cyclic movement targets [Spec, CP] and [Spec, vP] (Chomsky 2000, 2001, 2008). Throughout this paper though, I have tacitly assumed that only CP is a phase, because in Finnish, a dependent-case relationship can span an arbitrary number of intervening vPs, as illustrated in (86).

figure ci

(86) shows, minimally, that dependent-case assignment is not subject to the Phase Impenetrability Condition (PIC) at the vP-level. There are two potential conservative explanations for this status, both of which are compatible with the Ban on Improper Case. The first is that dependent-case assignment is simply not subject to the PIC, as Bošković (2007) has argued about Agree. The second is that the vP-phase does not intervene in the same way for dependent-case assignment as the CP-phase does, as Baker (2015) proposes with his ‘soft’- and ‘hard’-phase distinction.

There is also the more radical explanation that vP is not a phase. vP-phasehood, in fact, conflicts with the WC more generally. First, according to the WC, movement from [Spec, CP] to [Spec, vP] is barred because C≻v in fseq. Second, if such movement were permitted, it would obscure crucial distinctions needed to account for generalized improper movement. For example, consider hyperraising: at the point at which movement to [Spec, TP] occurs, the moving DP would be in [Spec, vP], so it would be necessary to backtrack into the previous phase to see whether it moved out of a CP or a TP. If the movement to [Spec, TP] proceeds directly from the CP/TP (see Sect. 5.1), then such backtracking is unnecessary. For more discussion of this particular problem, see Müller (2014a,b). Based on (i) these kinds of considerations involving the WC and (ii) long-distance agreement configurations parallel to (86), Keine (2016, 2017, 2020) argues that vP should not be considered a phase (see also Keine and Zeijlstra 2020), which would also solve the problems that the WC poses for vP-phasehood.

Potential exceptions to the Williams Cycle

As shown in Sect. 6, the Level Embedding Conjecture (LEC) successfully derives the strong formulation of the WC, repeated in (87), thereby providing a uniform analysis of WC effects for case, movement, and agreement. I will refer to the formulation in (87) as the ‘strong’ WC.

figure cj

Abels (2007, 2009), however, has argued that the strong WC is empirically too restrictive because it rules out several purported movement dependencies, such as subject-to-object raising in ECM infinitives and movement over complementizers (to be discussed below). This criticism extends to the LEC, since it derives the strong WC. The recent, operation-specific analyses of WC effects have taken these purported exceptions at face value and gone on to develop analyses that derive weaker versions of the WC. As discussed in Sect. 6, the dilemma with these analyses is that they do not extend to dependent case. Let us focus on Keine’s (2016, 2019, 2020) Agree-based analysis, setting aside the Merge-based analyses of Abels (2007, 2009) and Müller (2014a,b). Dependent-case relations do not resemble canonical Agree-relations, and thus it is not immediately evident that dependent-case assignment involves Agree. We are therefore at an impasse between two options: (i) develop a fully Agree-based implementation of DCT, thereby allowing us to (in principle) extend Keine’s analysis to case, or (ii) revisit and reanalyze the purported exceptions to the strong WC, thereby allowing us to maintain the LEC.

I argue that the purported exceptions to the strong WC should be revisited and reanalyzed. My argument against the first option is twofold. First, no Agree-based implementations of DCT have been proposed in the literature. Thus, given the current state of the art in DCT, it is not presently possible to directly extend Keine’s analysis to case. Second, Keine’s analysis handles the exceptions largely through a stipulation. Space limitations prevent a detailed discussion, but in a nutshell, under his analysis, some Agree-probes are not subject to the WC (in his terms, they do not have a ‘horizon’). In light of these two points, I contend that it is not at all certain that abandoning the strong WC (and by extension the LEC) is warranted based on a set of limited exceptions, especially given the importance of the strong WC’s operation-generality. At the very least, the introduction of improper case into the empirical landscape warrants subjecting the purported exceptions to closer scrutiny.

Fully reconciling this issue is beyond the scope of this paper, but there are—I believe—promising directions towards reanalyzing the exceptions. In what follows, I briefly discuss each exception, the first two of which are from Abels (2007), and sketch how they might be reanalyzed in ways compatible with the LEC.


In ECM infinitives, it is commonly assumed that the embedded subject moves from inside the embedded TP to a vP-internal position in the matrix clause, as schematized in (88) (Postal 1974). However, according to the strong WC, TP should be a barrier for such movement because T≻v in fseq. As such, ECM infinitives appear to pose a challenge for the strong WC. Note that under the WC, the matrix subject can establish a dependent-case relationship with matrix [Spec, vP] or embedded [Spec, TP], so the actual case in ECM is unproblematic.

figure ck

The classical evidence cited in favor of this analysis is that matrix adverbs and particles may intervene between the embedded subject and the embedded predicate, e.g. with all her heart in (88). However, recent work by Neeleman and Payne (2020) has reevaluated this argument. On the basis of scope-freezing effects and adverb order, they argue that an ECM infinitive does not actually involve moving the embedded subject, as in (88), but rather extraposing part of the embedded clause rightwards, as in (89). If Neeleman and Payne’s analysis is on the right track, then ECM infinitives do not pose a problem for the strong WC after all.

figure cl

Movement over complementizers

In some languages, movement that lands below a complementizer is then able to cross that complementizer to move to a higher clause. To illustrate, consider English topicalization. In an embedded clause, topicalization lands in a position below the complementizer (90a), from which it could be concluded that C≻Top in fseq (where TopP represents whatever position topicalization targets). Topicalization can, however, cross an embedded finite clause boundary, moving over a complementizer (90b). If topicalization targets TopP and C≻Top, the strong WC incorrectly predicts that CP should be a barrier for movement to TopP, thereby prohibiting topicalization over a complementizer.

figure cm

This class of exceptions would disappear if complementizers in these languages are analyzed as edge markers that uniformly appear at the clause boundary rather than as real C heads, along the lines of Manetta’s (2006, 2011) proposal for Hindi-Urdu ki. The particular implementation of this idea is largely inconsequential for present purposes, but for concreteness, let us assume that these complementizers are elements that merge at the edge of a clause, but do not project, so that the category of the clause remains unchanged. Under such an analysis, a moved element appearing to the right of a complementizer, like in (90a), would not entail that the complementizer corresponds to a projection higher than the landing site of movement and therefore would not constitute a violation of the strong WC if that movement can also cross the complementizer.


Several languages have been claimed to allow hyperraising. This phenomenon has been most thoroughly investigated for Bantu languages (e.g. Carstens 2011; Diercks 2012; Carstens and Diercks 2013; Halpert 2015, 2019), so I will center the discussion around them. A representative example of Bantu hyperraising from Lubukusu is given in (91).

figure cn

Because the WC expressly prohibits hyperraising, if (91) is indeed hyperraising, it is problematic for the WC. However, Carstens and Diercks observe a crucial interaction between hyperraising and complementizers, which suggests that this picture is too simplistic. They report on three Bantu languages: Digo, Lubukusu, and Lusaamia. Digo and Lusaamia crucially do not allow hyperraising over complementizers. On the other hand, some Lubukusu speakers allow hyperraising over complementizers, but only the complementizer mbo and not the agreeing complementizer -li. They analyze this pattern as follows: (i) CPs are generally barriers to hyperraising because they are phases; (ii) finite clauses without complementizers are TPs in Bantu, not CPs; and (iii) mbo in Lubukusu is special in that it is not a phase head, thereby projecting a nonphasal CP that is not a barrier to hyperraising.

Under the WC, TP is not a barrier for movement to [Spec, TP], since T⊁T in fseq, irrespective of whether the TP is considered finite or nonfinite. Therefore, on Carstens and Diercks’s analysis, hyperraising out of complementizer-less clauses is in fact compatible with the strong WC. This leaves mbo-clauses in Lubukusu. Rather than analyzing mbo as a special nonphasal complementizer, I suggest that mbo be analyzed as an edge marker, along the lines discussed above. Like their complementizer-less counterparts, mbo-clauses would then be TPs, and thus A-movement out of them would not violate the WC. Similar considerations can, I believe, be applied to the other purported cases of hyperraising, such as Brazilian Portuguese (Nunes 2008), Greek (Alexiadou and Anagnostopoulou 2002), and Zulu (Halpert 2015, 2019).

Long-distance agreement

There are several languages that have been reported to allow agreement between a matrix verb and a DP at the edge of an embedded finite clause, e.g. Innu-aimûn (Branigan and MacKenzie 2002), Passamaquoddy (Bruening 2001), and Tsez (Polinsky and Potsdam 2001). This is problematic for the WC because CP should be a barrier to a φ-probe on T0, because C≻T in fseq. However, these instances of long-distance agreement can be reanalyzed in a way compatible with the strong WC: (i) the embedded DP (i.e. the agreement controller) moves to embedded [Spec, CP], (ii) the DP’s features percolate up to CP via Spec-Head agreement, and (iii) matrix T0 agrees with the CP. This analysis is similar in spirit to Koopman’s (2006) analysis of Tsez long-distance agreement, in that there is no direct crossclausal agreement. In terms of the LEC, matrix T0 would agree with the CP before the full CP has been embedded (= substituted in); thus, upon embedding the CP, the CP’s features must be shared along (or match) its existing Agree-relations. Similar analyses can, I believe, be extended to wh-agreement in Chamorro and Palauan (e.g. Chung 1982, 1994; Chung and Georgopoulos 1988) and to crossclausal object agreement in Nez Perce (see Deal 2017, who analyses it in terms of covert movement, however).

Sakha accusative subjects

In Sakha, an embedded subject can be assigned dependent case (= accusative) iff the matrix clause has another DP (Baker and Vinokurova 2010). Baker and Vinokurova analyze this pattern in terms of raising: the embedded subject is eligible to move to embedded [Spec, CP], where it may then enter into dependent-case relationships with DPs in the matrix clause. This analysis is problematic for the strong WC (and the Ban on Improper Case) because it involves a DP in [Spec, CP] being the lower DP in a dependent-case pair, which should be impossible (see Sect. 5.4).

figure co

I argue that so-called accusative subjects in Sakha are actually proleptic arguments: they are base-generated as an argument of the matrix clause and are indirectly linked to an embedded gap via resumption (in the spirit of den Dikken 2017, 2018). As an argument of the matrix clause, it participates in the dependent-case calculus in the matrix clause, and thus is sensitive to the DPs there. Baker and Vinokurova themselves consider a prolepsis analysis of accusative subjects. They claim that while some instances of accusative subjects are indeed prolepsis, there are at least some instances that are not. To support this claim, they show that an accusative subject can be an NPI that would only be licensed in the embedded clause. They argue that this constitutes evidence against adopting a prolepsis analysis across the board for accusative subjects. However, den Dikken (2017, 2018) explicitly argues that prolepsis does in fact allow NPI licensing. (Technically for him, such constructions are complex predicates with no crossclausal syntactic dependencies, the same analysis that den Dikken 2009 gives for Hungarian long focus movement; see Sect. 3.3.) Thus, the NPI facts are in fact compatible with the prolepsis analysis. Evidence ruling out prolepsis would have to come from other reconstruction data, which are not available in the literature. Crucially, under a prolepsis analysis, Sakha accusative subjects are not problematic for the strong WC, as no crossclausal syntactic dependencies are involved.

As the above discussion shows, it is not at all clear that these phenomena constitute evidence against the strong WC and the LEC. At the very least, in light of improper case, they deserve more attention and careful scrutiny. If the reanalyses sketched above can be sustained, then the LEC can be maintained in its full strength.