Improper case

This paper argues that case assignment is impossible in configurations that parallel generalized improper-movement configurations. Thus, like improper movement, there is “improper case.” The empirical motivation comes from (i) the interaction between case and movement and (ii) crossclausal case assignment in Finnish. I propose that improper case is ruled out by the Ban on Improper Case: a DP in [Spec, XP] cannot establish a dependent-case relationship with a lower DP across YP if Y is higher than X in the functional sequence. I show that this constraint falls under a strong version of the Williams Cycle (Williams 1974, 2003, 2013; van Riemsdijk and Williams 1981) and is derived under Williams’s (2003, 2013) analysis of embedding.


Introduction
Some syntactic positions can be targeted by some movement types, but not by others. The classical example of this phenomenon is hyperraising, whereby Ā-movement can leave a finite clause (1), but A-movement cannot (2). The traditional analysis of hyperraising involves a conspiracy of two constraints: (i) movement out of a finite clause must proceed through the intermediate [Spec, CP] position (Chomsky 1973, 1977, 1981, 1986) and (ii) a ban on "improper movement," according to which Ā-movement but not A-movement may proceed from [Spec, CP] (Chomsky 1973, 1981; May 1979).

E. Poole, University of California, Los Angeles, USA
A growing body of work has shown that these movement asymmetries are not limited to the binary distinction between A-movement and Ā-movement (e.g. Williams 1974, 2003, 2013; Sternefeld 1993, 1996; Abels 2007, 2009, 2012a; Neeleman and van de Koot 2010; Müller 2014a,b; Keine 2016, 2019, 2020). Thus, there needs to be a more general theory of "improper movement" that restricts what movement types are available to what positions. One particularly general and therefore interesting account of these asymmetries stems from Williams (1974, 2003, 2013) and van Riemsdijk and Williams (1981); I will refer to it as the Williams Cycle (WC).¹ The core analytical intuition behind the WC is that one and the same node is a barrier to some movement types, but not to others, and that this distinction correlates with the structural height of the landing site in the functional sequence. In Williams (2003), the WC is formulated as the Generalized Ban on Improper Movement, given in (3) (to be discussed in greater detail in Sect. 5.1).

(3) GENERALIZED BAN ON IMPROPER MOVEMENT (GBOIM)
Movement to [Spec, XP] cannot proceed from [Spec, YP] or across YP, where Y is higher than X in the functional sequence. [based on Williams 2003]
The WC accounts for the ban on hyperraising (2) as a prohibition on moving from inside a CP to [Spec, TP]. According to the WC, CP is a barrier for movement to TP, but not for movement to CP, as schematized in (4), because C is higher than T in the functional sequence.
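The GBOIM in (3) can be read as a simple check over a functional sequence. The following is a minimal sketch under stated assumptions, not an implementation from the paper: the four-head sequence `FSEQ` and the names `is_higher` and `movement_allowed` are hypothetical, and the text itself only commits to C being higher than T.

```python
from typing import Optional, Sequence

# Assumed functional sequence, lowest to highest (illustrative inventory only).
FSEQ = ("V", "v", "T", "C")


def is_higher(y: str, x: str) -> bool:
    """True if head y is higher than head x in the assumed functional sequence."""
    return FSEQ.index(y) > FSEQ.index(x)


def movement_allowed(landing: str,
                     crossed: Sequence[str] = (),
                     launch_spec: Optional[str] = None) -> bool:
    """GBOIM (3): movement to [Spec, landing] is blocked if it proceeds from
    [Spec, YP] or across YP for some Y that is higher than the landing head."""
    offenders = list(crossed)
    if launch_spec is not None:
        offenders.append(launch_spec)
    return not any(is_higher(y, landing) for y in offenders)


# Hyperraising: movement to [Spec, TP] across CP is ruled out.
hyperraising = movement_allowed("T", crossed=["C"])
# Long movement to [Spec, CP] across CP is permitted.
long_wh = movement_allowed("C", crossed=["C"])
```

On this sketch, CP is a barrier relative to a landing site in TP but not relative to one in CP, which is the asymmetry that the WC is meant to capture.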
This account extends beyond hyperraising to the other kinds of movement asymmetries that have been documented in the literature. In addition to the work by Williams (1974, 2003, 2013) and van Riemsdijk and Williams (1981), various versions of the WC have been developed by Abels (2007, 2009), Müller (2014a,b), and Keine (2016, 2019, 2020), amongst others.
While the WC has traditionally been proposed on the basis of movement, Keine (2016, 2019, 2020) argues that analogous restrictions also govern agreement. This generalizing of the WC raises the question of whether other syntactic dependencies are also subject to the WC. This paper investigates the locality of case assignment and argues that it too is constrained by the WC.² Therefore, in line with movement and agreement, there is improper case.
The paper is couched in terms of dependent-case theory (DCT) (Marantz 1991; Bittner and Hale 1996; McFadden 2004; Baker 2015). The reason for this choice is that the paper draws heavily on Finnish, which I will argue requires the notion of dependent case (Poole 2015; also Maling 1993; Kim 2011, 2017). However, the main arguments in this paper equally apply to functional-head case theory (FHCT) (e.g. Chomsky 2000, 2001; Legate 2008); see fn. 13 and 32 in particular for discussion.
The motivation for improper case, i.e. that case assignment is subject to the WC, comes from two puzzles that the previous literature has not investigated in depth. The first puzzle involves the interaction between dependent case and movement, namely that some movement may feed dependent-case assignment, but other movement crucially must not do so. The second puzzle is crossclausal case assignment in Finnish, where a subject, but not an object, may license dependent case on another DP across a nonfinite clause boundary. I show that both puzzles crucially do not fall under the purview of standard syntactic locality constraints, e.g. phases. I argue instead that both problems receive a unified analysis if case assignment is subject to the WC, which I formulate for case as the Ban on Improper Case in (5).

(5) BAN ON IMPROPER CASE
A DP in [Spec, XP] cannot establish a dependent-case relationship with a lower DP across YP, where Y is higher than X in the functional sequence.
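To make the effect of (5) on clausal embedding concrete, the sketch below lists, for a licensor in a given specifier, which sizes of intervening clause it could license dependent case across. It is only an illustration: the functional sequence `FSEQ` and the function names are assumptions introduced here, and nothing in it is specific to any language.

```python
from typing import List, Sequence

# Assumed functional sequence, lowest to highest (illustrative inventory only).
FSEQ = ("V", "v", "T", "C")


def can_license(licensor_spec: str, crossed: Sequence[str]) -> bool:
    """Ban on Improper Case (5): a DP in [Spec, licensor_spec] cannot establish
    a dependent-case relationship across YP if Y is higher than licensor_spec."""
    x = FSEQ.index(licensor_spec)
    return all(FSEQ.index(y) <= x for y in crossed)


def transparent_clause_sizes(licensor_spec: str) -> List[str]:
    """Sizes of embedded constituent that a DP in [Spec, licensor_spec]
    can license dependent case across."""
    return [size for size in FSEQ if can_license(licensor_spec, [size])]


for spec in FSEQ:
    print(spec, transparent_clause_sizes(spec))
```

Under these assumptions, a DP in [Spec, TP] can reach across an embedded TP but not across a CP, whereas a DP in [Spec, vP] cannot reach across either. The higher the licensor sits, the larger the embedded clause it can see into.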
According to the Ban on Improper Case, the heights of two DPs relative to one another in the functional sequence dictate whether they can establish a dependent-case relationship. For movement, (5) means that the height of a movement's landing site determines the range of positions from which another DP can license a dependent-case relationship with that moved DP. For clausal embedding, (5) means that the size of an embedded clause dictates which DPs in higher clauses can establish a dependent-case relationship across that clause boundary.
The Ban on Improper Case brings the locality of case into line with movement and agreement, in that the WC applies to all three. The question that follows then is how to uniformly derive WC effects in all three of these empirical domains. Crucially, improper case does not follow from recent proposals that analyze WC effects as the result of a constraint on AGREE or MERGE (Abels 2007, 2009; Müller 2014a,b; Keine 2016, 2019, 2020), because dependent-case assignment does not seem to involve either one of these operations. I argue that a unified analysis of WC effects for case, movement, and agreement becomes available if we adopt Williams's (2003) analysis of clausal embedding. Williams proposes that a ZP can only be embedded in a clause that has itself been built up to ZP, which he calls the Level Embedding Conjecture. The crucial consequence of this proposal is that a root XP containing an embedded YP, where Y is higher than X in the functional sequence (Y ≻ X), never exists in the course of a derivation (6).
(6) a. [...], where Y ≻ X and XP is the root node
b. [...], where Y ≻ X and YP is the root node
Any movement, agreement, or case assignment between matrix XP and embedded YP that would violate the WC is in turn impossible because the relevant structure where X and [Spec, XP] would have access to YP, under the strict cycle, is simply not created by the grammar, as schematized in (6a). Matrix Y and [Spec, YP], on the other hand, are able to access embedded YP because matrix YP is the root node at the point when embedded YP is embedded, as schematized in (6b); this access generalizes to projections higher than Y in the functional sequence. Because this constraint follows from the way that syntactic structures are built, the key consequence of this account is that all syntactic dependencies are subject to the WC, regardless of whether they share the same operational core or not.
The argumentation proceeds as follows: Sect. 2 briefly overviews the paper's assumptions about DCT. In Sect. 3 and 4, I present two locality puzzles for dependent-case assignment: the interaction of case and movement, and Finnish crossclausal case assignment. To account for these two seemingly disparate locality problems, in Sect. 5, I propose that dependent-case assignment is subject to the Ban on Improper Case. In Sect. 6, I then argue that Williams Cycle effects for case, movement, and agreement can be uniformly analyzed in terms of clausal embedding. Section 7 concludes by discussing several purported exceptions to the Williams Cycle and further ramifications of the paper's proposals.

Background on dependent case
In DCT, the calculus of case follows the algorithm in (7) (Marantz 1991;Bittner and Hale 1996;McFadden 2004;Baker 2015;and its predecessor Yip et al. 1987).
(7) Case calculus in dependent-case theory
1. Assign idiosyncratic lexical and inherent cases.
2. Assign dependent case.
3. If a DP was not assigned case in the previous two steps, then assign it unmarked case (≈ "nominative" and "absolutive").³
"Ergative" and "accusative," in their canonical textbook definitions, are collapsed into the unified notion of dependent case. Whenever two DPs presently unvalued for case stand in a c-command relationship in the same local domain, one of the DPs is assigned dependent case, though which one depends on the language's parameterization. When the c-commanding DP is assigned dependent case, this corresponds to what would traditionally be called "ergative." When the c-commanded DP is assigned dependent case, this corresponds to what would traditionally be called "accusative." I will refer to this process as establishing a dependent-case relationship, and to the higher DP in the pair, i.e. the one that initiates the relationship, as the licensor.
For the sake of concreteness, I adopt the syntactic implementation of DCT from Preminger (2011, 2014) throughout the paper:
(i) DPs enter the derivation with an unvalued case feature, [CASE: ], which can be valued as either DEP (for dependent case) or a particular lexical case.⁴
(ii) Lexical cases are assigned locally by (typically, lexical) heads, e.g. P⁰ and V⁰, to their sibling upon first-merge (8).⁵
(8) [PP/VP/XP P⁰/V⁰/X⁰ DP[CASE: LEX]], where LEX = the relevant lexical case
(iii) Dependent case is assigned whenever two DPs with unvalued case features ([CASE: ]) stand in a c-command relationship (9).⁶ The language's parameterization determines whether it is the higher or lower DP in the pair that gets assigned dependent case; throughout the paper, I indicate the assignee with an underline. Concerning the timing of dependent case, in Sect. 3.2, I will argue that dependent-case relationships are established as early as possible, which I take to be upon merging the licensor into the c-commanding position. The morphological exponence of dependent case, e.g. as "accusative" and "ergative," is determined at PF.
(iv) If [CASE: ] is still unvalued when Spellout occurs, it is realized as unmarked case at PF (10). Thus, unmarked case (≈ "nominative" and "absolutive") is the absence of any otherwise assigned case (see Kornfilt and Preminger 2015).
(10) [CASE: ] ↔ UNMARKED CASE
These detailed mechanics will be abstracted over when not relevant to the discussion at hand. One advantage of this syntactic case calculus is that the structure consisting of a head and the DP that it c-selects is necessarily built before any larger structure containing that DP and another DP in a c-command relationship. Therefore, the precedence relations in (7) fall out intrinsically based on how structure is built, and do not need to be stipulated, as, e.g., the original implementation in Marantz (1991) does.
Before proceeding, there are two important points about DCT worth emphasizing: First, dependent-case assignment is in addition to case assignment by designated heads (which in the terms here, falls under lexical case).⁷ As such, the expressive power of DCT is a proper superset of the expressive power of FHCT (Preminger 2017, to appear). The crucial question then is whether the additional expressive power of DCT is warranted, i.e. whether there are case patterns that call for the notion of dependent case. Several such patterns have been identified in the literature (see e.g. Marantz 1991; Baker and Vinokurova 2010; Baker 2014, 2015; Levin and Preminger 2015; Baker and Bobaljik 2017; Jenks and Sande 2017; Yuan 2018, 2020), and in this paper, I argue that the distribution of Finnish accusative case is another such pattern. Second, the descriptive label given to a case should not be taken to entail a particular case-assignment mechanism. That is, just because a given case in a given language is pretheoretically called "accusative" or "ergative" does not mean that it is necessarily dependent case, and likewise for "nominative" or "absolutive" and unmarked case (hence the scare quotes).
4 For the sake of simplicity, I collapse the distinction between lexical and inherent case (Woolford 2006).
5 Following Bare Phrase Structure, where what projects is the head itself (Chomsky 1995a), lexical case can also be assigned in a specifier-head relation as siblinghood agreement (à la Rezac 2003).
6 Baker (2015) proposes several enhancements to dependent-case assignment, including reciprocal relationships (where both DPs are assigned dependent case; see also Deal 2015), null relationships (where neither DP gets dependent case), and 'keying' dependent-case rules to particular domains. These enhancements are compatible with the proposals in this paper, but are not directly relevant to the matters at hand, so I have set them aside.
7 Traditionally, in DCT, lexical-case assignment (i.e. case assignment by heads) is considered to be very local. Under Preminger's (2011, 2014) system, which I am adopting here, it is restricted to siblinghood. However, Baker (2015) proposes that lexical case (though he does not call it such) can in fact be assigned under closest c-command (see also Baker and Vinokurova 2010; Preminger 2017, to appear). Because lexical case does not factor into the arguments in this paper, I will assume the traditional, more restrictive view of lexical case.
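The three-step calculus in (7) can be sketched as a small procedure. This is a simplified illustration under explicit assumptions, not the paper's implementation: c-command within one local domain is modeled as list order (earlier items c-command later ones), lexical case is supplied by hand as a dictionary, and the names `DP` and `assign_cases` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DP:
    name: str
    case: Optional[str] = None  # None models the unvalued feature [CASE: ]


def assign_cases(dps: List[DP],
                 lexical: Dict[str, str],
                 dependent_on: str = "lower") -> Dict[str, str]:
    """dps is ordered from highest to lowest within one local domain.
    dependent_on="lower" gives an accusative alignment, "higher" an ergative one."""
    # Step 1: lexical (and inherent) case.
    for dp in dps:
        if dp.name in lexical:
            dp.case = lexical[dp.name]
    # Step 2: dependent case for pairs of DPs that are both still unvalued.
    for i, higher in enumerate(dps):
        for lower in dps[i + 1:]:
            if higher.case is None and lower.case is None:
                target = lower if dependent_on == "lower" else higher
                target.case = "DEP"
    # Step 3: whatever is still unvalued is realized as unmarked case.
    return {dp.name: dp.case or "UNMARKED" for dp in dps}


transitive = assign_cases([DP("subject"), DP("object")], lexical={})
ergative = assign_cases([DP("subject"), DP("object")], lexical={}, dependent_on="higher")
quirky = assign_cases([DP("subject"), DP("object")], lexical={"subject": "GEN"})
```

In the `quirky` call, the lexically case-marked subject is already valued when step 2 applies, so it cannot license dependent case and the object surfaces with unmarked case.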

Movement and case
This section shows that some movement can lead to dependent-case assignment (A-movement), but other movement must not do so (Ā-movement). This dichotomy will be shown not to follow from standard conceptions of locality, e.g. phases, and thus it presents a challenge for DCT.

Some movement can feed dependent case
The DCT literature has identified a number of case patterns as involving movement feeding dependent-case assignment. Let us consider three representative examples.
The first example involves object shift: a dependent-case relationship between the subject and the object is allowed only if the object has raised out of VP (Bittner and Hale 1996; Baker 2015:125-130; Woolford 2015). This pattern is illustrated in (11) with Niuean, where the case markings correlate with both the specificity of the object and the clausal word order (Massam 2000, 2001). If the object is nonspecific, the subject is nominative, and the clause has VOS word order (11a). If the object is specific, the subject is ergative and the clause has VSO order (11b).
(11) b. [...] 'Sione drank the coffee' [Massam 2000:98]
Massam (2000, 2001) analyzes this alternation in terms of object shift and VP fronting. The VOS order is produced by the object remaining in its base position and thus fronting along with the VP (12a). By contrast, the VSO order is produced by the object raising out of the VP prior to VP fronting (12b). The additional correlation with ergative case is then captured by assuming, in pretheoretical terms, that ergative case requires that the object leave VP.
(12) Massam's (2000, 2001) analysis of Niuean word order
a. VOS derivation [...]
In DCT terms, VP blocks dependent-case assignment in Niuean (see Sect. 7.1). Thus, only when the object raises out of VP can a dependent-case relationship between the subject and the object be established. Other languages exhibiting object shift feeding dependent case include Dyirbal, Eastern Ostyak, Ika, Inuit, Nez Perce, Sakha, and Tagalog (see Baker and Vinokurova 2010; Baker 2015:125-130; Woolford 2015; and references therein).
The second example is Shipibo applicatives of unaccusatives (Baker 2014). Baker shows that in Shipibo, an unaccusative subject is ordinarily nominative (13a), but adding an applicative argument causes the subject to become ergative (13b). He argues that ergative in Shipibo is dependent case and that in (13b), the subject is ergative because the applicative provides the additional DP needed for dependent case.
(13) a. Kokoti-ra joshin-ke
fruit-PCL ripen-PFV
'The fruit ripened'
b. Bimi-n-ra Rosa joshin-xon-ke
fruit-ERG-PCL Rosa ripen-APPL-PFV
'The fruit ripened for Rosa' [Baker 2014:345-346]
However, if the unaccusative subject (i.e. the theme) is base-generated inside VP and the applicative is base-generated above VP (both standard assumptions), then it should be the applicative that gets dependent ergative. That is, upon merging into the structure, the applicative should establish a dependent-case relationship with the theme; in the pair, the applicative would be the higher DP and hence should be the one assigned dependent case (= ergative). Baker (2014) handles this problem by proposing that the applicative is encased inside a null PP, so that (i) it does not c-command the theme and (ii) it is ineligible for movement to subject position, [Spec, TP]. Due to the latter, the theme is able to raise over the applicative to [Spec, TP]. From [Spec, TP], the theme c-commands the applicative and establishes a dependent-case relationship with it; in this version of the pair though, the theme is the higher DP and thus gets assigned dependent case. This analysis is schematized below in (14).
(14) Baker's (2014) analysis of Shipibo applicatives of unaccusatives [...]
Assuming that Baker's (2014) analysis is on the right track, Shipibo applicatives of unaccusatives are an instance of movement feeding dependent-case assignment.⁹
The third example is the English raising predicate strike as (Marantz 1991), which is traditionally analyzed as the matrix subject starting out as the embedded subject and raising into matrix-subject position, as schematized in (15). Taking accusative on objects in English to be dependent case, the licensor of dependent case on the internal argument of strike must be the subject after it has raised to matrix [Spec, TP], because there is no other possible licensor.
Note that something needs to be said about why the internal argument of strike does not license dependent case on the subject before it has raised; I will return to this point in Sect. 5.4.
In sum, taken together, these three examples crucially show that there are instances of movement that feed dependent-case assignment.

Some movement must not feed dependent case
While the previous section showed that some movement may feed dependent-case assignment, there is also other movement that must not feed dependent-case assignment. I will illustrate this problem using wh-movement, though it holds generally for nonlocal movement, and I will use English for ease of illustration.¹⁰ Let us take the problem in two parts.
The first part of the problem is that dependent case cannot be assigned based on the surface structure alone. For example, the structure in (16a) with wh-movement must be mapped to the string in (16b) and cannot be mapped to (16c). Descriptively, dependent case needs to be calculated before wh-movement has occurred.
(16) a. Who did she see who?
b. Who(m)_DEP did she_NOM see?
c. *Who_NOM did her_DEP see?
One potential solution that can be immediately set aside is to assume that case is assigned at PF and that wh-movement 'reconstructs' for case at PF. This solution would face the problem that unlike canonical reconstruction at LF, this hypothetical PF reconstruction would have to be for case alone and not uniformly for all PF processes, in particular not for linearization. As such, it would be nothing more than a restatement of the empirical generalization that wh-movement does not affect case.
Rather, I propose that dependent-case assignment is interspersed with structure building, so that dependent case is assigned as early as possible in the derivation (see also Baker and Vinokurova 2010:604; Preminger 2011, 2014). I call this principle Earliness (17) (in the spirit of Pesetsky 1989; Pesetsky and Torrego 2001).

(17) EARLINESS
Upon (re)merging α into the structure, if α c-commands β and both α and β have unvalued case features, establish a dependent-case relationship between α and β.
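Earliness (17) can be illustrated with a toy derivation of (16). The sketch below makes several simplifying assumptions that are not part of the paper: only DPs are tracked, the structure is a list whose first element is the highest DP (so every element already in the list is c-commanded by a newly merged one), the alignment is accusative, and the names `DP` and `merge` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)  # compare DPs by identity, so a moved DP is the same object
class DP:
    name: str
    case: Optional[str] = None  # None models the unvalued feature [CASE: ]


def merge(structure: List[DP], alpha: DP) -> None:
    """(Re)merge alpha at the top of the structure and apply Earliness (17)."""
    if alpha in structure:
        structure.remove(alpha)  # remerge: track only the highest occurrence
    for beta in structure:       # alpha c-commands everything built so far
        if alpha.case is None and beta.case is None:
            beta.case = "DEP"    # accusative alignment: the lower DP is the assignee
    structure.insert(0, alpha)


who, she = DP("who"), DP("she")
clause: List[DP] = []
merge(clause, who)  # first-merge of the object: nothing to pair with yet
merge(clause, she)  # first-merge of the subject: who is assigned DEP
merge(clause, who)  # wh-movement: who is already valued, so she is unaffected
```

After the last step, `who.case` is `"DEP"` and `she.case` is still unvalued, which corresponds to (16b) rather than (16c): the relationship is fixed at first-merge of the subject, and the later remerge of the valued wh-element establishes nothing new.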
Earliness crucially forces dependent-case assignment to happen prior to wh-movement. The derivation of (16) under this analysis is illustrated in (18): First, a dependent-case relationship is established between she and who immediately upon first-merge of she into the structure (18a). Second, wh-movement happens later in the derivation, after dependent case has been assigned (18b).
The formulation of Earliness in (17) also reiterates the restriction that two DPs can enter into a dependent-case relationship only if they both presently have unvalued case features ([CASE: ]); this is part of the case calculus laid out in Sect. 2. This restriction prevents DPs with dependent case from reparticipating in dependent-case assignment, either after having themselves moved or with DPs that have moved above them. For example, in (19), this restriction prevents the moved wh-element that has itself been assigned dependent case from turning around and licensing dependent case on the subject from the higher position to which it has wh-moved.
(19) who[CASE: DEP] did she[CASE: ] see who[CASE: DEP]? ✗
While Earliness is a necessary component to solving the problem posed by wh-movement, it is not sufficient. This brings us to the second part of the problem: dependent-case assignment in the context of successive-cyclic movement. When a wh-element that is itself unvalued for case (and should surface with unmarked case at PF) moves successive cyclically, it passes through intermediate [Spec, CP] positions, as in (20).
Sentences like (20) present two complications for DCT. For convenience, I will discuss these complications in terms of an accusative alignment, where the lower DP in a dependent-case pair is assigned dependent case, but the problem extends to an ergative alignment too, where the higher DP in the pair is assigned dependent case. The first problem is that a wh-element does not have its own case altered from its intermediate landing sites.¹¹,¹² From these intermediate positions, there may very well be another DP unvalued for case that c-commands the wh-element. All else equal, the wh-element should be assigned dependent case in such configurations, but crucially, it is not. Descriptively, the moving wh-element cannot have the case overwritten that would have been assigned to it if it had not moved. In DCT terms, the wh-element cannot be the lower DP in a dependent-case pair when it is in an intermediate landing site, as schematized in (21). As such, I will refer to this problem as the Lower-DP Problem.

The second problem is that a wh-element does not alter the case of other DPs from its intermediate or final landing sites. From these positions, the wh-element may very well c-command another DP unvalued for case, and thus it should, all else equal, be able to license dependent case on it, but it cannot do so. In other words, the moving wh-element cannot be the higher DP in a dependent-case pair (except from its base-generated position, e.g. with an object). As such, I will refer to this problem as the Higher-DP Problem (22).
11 A wh-element also cannot enter into a dependent-case relationship in its final landing site in an embedded question, but this instantiates the same relevant configuration as an intermediate landing site.
12 This is not to imply that there cannot be a dedicated lexical case for Ā-moved elements, e.g. as is found in Dinka (van Urk 2015). Because the assignment of such case is not contingent on the presence of another DP, it does not qualify as dependent case and thus falls outside the purview of the present discussion. However, for discussion of movement and lexical case, see Sect. 5.4.

The standard dependent-case calculus does not offer an explanation for why successive-cyclic movement does not affect case in these two ways (for a discussion of Baker 2015, see Sect. 5.5). Crucially, in light of the data in Sect. 3.1, it would not suffice to simply stipulate that movement does not affect case assignment. Thus, a more nuanced account is called for.
Neither of the problems that successive-cyclic movement raises for DCT falls under the purview of phases (or its predecessor, subjacency). First, because phase edges remain accessible at the next highest phase, per the Phase Impenetrability Condition (Chomsky 2000, 2001), the locality enforced by phases permits precisely the configurations that give rise to the Lower-DP Problem, as schematized in (23). In other words, a DP unvalued for case may c-command the edge of the lower phase, thereby satisfying the criteria for establishing a dependent-case relationship with a DP in that edge position, if it too is unvalued for case.¹³ Second, because movement to [Spec, CP] takes place before phasal Spellout, such movement should, all else equal, be able to affect the case of elements in the CP-phase domain.¹⁴ Otherwise, establishing any relation between the phase edge and the phase complement would be impossible, and such relations are minimally necessary for movement dependencies. Thus, the Higher-DP Problem is also not solved under the locality afforded by phases. Note that I am not claiming that these considerations provide evidence against phases; rather, the point is that they do not follow from phase theory itself.
In sum, successive-cyclic movement leads to the generalization that some movement crucially must not feed dependent-case assignment.

Section summary
Some, but not all, movement affects dependent-case assignment. Assigning dependent case as early as possible, Earliness (17), already filters out many of the undesirable interactions between dependent case and movement. For example, in a simple transitive clause, moving an object over a subject will have no effect on dependent case because a dependent-case relationship will have already been established between the two DPs upon first-merge of the subject. This was shown for wh-movement in Sect. 3.2, but it holds for A-movement as well, in particular for A-scrambling. The question then is about all the interactions that Earliness does not capture. These interactions include some instances of A-movement feeding dependent-case assignment, but no instances of Ā-movement doing so. They also include some instances of Ā-movement that cannot affect dependent case, but which are not ruled out by Earliness, e.g. with successive cyclicity. This state of affairs is summarized with the generalization in (24).¹⁵
13 The same problem is faced by FHCT if caseless DPs are permitted to move through a phase edge or if one assumes that nominative is unvalued case or that there is case stacking. A v⁰ head could then assign accusative case to a DP in a phase-edge position, e.g. an intermediate [Spec, CP] position, thereby making the same incorrect prediction that DCT does in (21).
14 Baker (2015), however, proposes an analysis that stipulates that material at the phase edge cannot affect the case of elements inside the phase domain; see Sect. 5.5 for discussion.

(24) MOVEMENT-CASE GENERALIZATION
A-movement can feed dependent case, but Ā-movement cannot.
To the best of my knowledge, (24) [...] case).¹⁷,¹⁸ However, in the literature on Hungarian, it has been argued for reasons entirely independent of considerations like (24) that this alternation actually stems from two distinct structures (den Dikken 2009, to appear; Jánosi 2013; Jánosi et al. 2014). When the long-focused element bears embedded case, as in (25a), there is genuine long movement: the element is base-generated in the embedded clause, is assigned case locally, and Ā-moves into the matrix-focus position. By contrast, when the long-focused element bears matrix case, as in (25b), it is base-generated in the matrix clause, is assigned case locally, locally Ā-moves to the matrix focus position, and is indirectly linked to the embedded gap via resumption. Discussing the arguments in favor of this analysis would take us too far afield; the reader is referred to Jánosi (2013). Crucially for the purposes of this paper though, under this independently motivated analysis of Hungarian long focus, it is not an exception to (24), as there is no Ā-movement feeding dependent-case assignment.
Explaining (24) requires a way of teasing apart movement types. In minimalist syntax, because there is only a single primitive movement operation (i.e. MERGE), there is no principled way to distinguish A-movement and Ā-movement. A goal of this paper is thus to derive the locality constraint in (24) without reference to separate primitives for A-movement and Ā-movement. In the next section, I show a pattern from Finnish crossclausal case assignment that also does not follow from any binary notion of locality, e.g. phases. Despite not involving movement, this pattern will be shown to parallel movement configurations that are accounted for under the Williams Cycle. I will argue that adopting the Williams Cycle as a constraint on dependent-case assignment, in the form of the Ban on Improper Case, provides a unified account of both crossclausal case assignment in Finnish and the Movement-Case Generalization in (24).

Finnish crossclausal case assignment
This section shows that in Finnish, dependent case may be licensed across a nonfinite clause boundary, but only by a subject and not by an object (or an adjunct). As with movement, this dichotomy will be shown not to fall under the purview of standard conceptions of locality, e.g. phases. Section 4.1 begins with some background on Finnish structural case and arguments that accusative in Finnish is dependent case. Section 4.2 then discusses the crucial case patterns in embedded nonfinite clauses.
17 Kayne (1981, 1984) and Rizzi (1982) propose a similar derivation for French and Italian ECM, which is possible only if the embedded subject undergoes Ā-movement. On their analyses, the embedded subject is licensed in [Spec, CP] by the matrix predicate. Though they call this relation "Case," no actual case is involved. See Koopman and Sportiche (2014) for a recent reanalysis of these facts in French that does not involve crossclausal movement.
18 Kayne (1984) also proposes a similar derivation for the use of whom in sentences like (i.a) (acceptable for some English speakers), where whom corresponds to a subject position and thus should be nominative who. On his analysis (in FHCT), the higher predicate assigns accusative to who(m) in an intermediate [Spec, CP] position. However, Lasnik and Sobin (2000) argue that this is a mischaracterization of the data. In particular, whom can still appear if the higher predicate is passive and hence would be unable to assign accusative (irrespective of FHCT or DCT), as shown in (i.b). See Lasnik and Sobin (2000) for an alternative analysis of whom.

Background on Finnish case
Finnish has three structural cases: nominative, accusative, and partitive.¹⁹ For the sake of simplicity, I set aside partitive case and focus on the distribution of nominative and accusative (for a more comprehensive analysis, see Poole 2015). In a simple transitive clause, the external argument is nominative and the internal argument is accusative (26). To simplify the exposition, let us refer to the external argument as the "subject" and the internal argument as the "object." Whenever the subject is absent, e.g. in a passive (27a) or in an imperative (27b), or the subject bears lexical case (i.e. a quirky subject) (27c), the object is nominative.²⁰,²¹

19 The status of accusative case in Finnish is somewhat contentious. Under traditional analyses, accusative comprises three forms: one homophonous with genitive, one homophonous with nominative, and one distinct form for human pronouns. Kiparsky (2001) argues that only the form for human pronouns is a genuine accusative case (see also e.g. Penttilä 1963; Timberlake 1975; Milsark 1985; Taraldsen 1986; Mitchell 1991; Maling 1993; Toivainen 1993; Vainikka 1993; Nelson 1998). This paper assumes a simplified picture: the genitive-homophonous accusatives are referred to as "accusative," the nominative-homophonous accusatives are referred to as "nominative," and the pronouns are set aside. This is in line with what Kiparsky (2001) argues, but with the terminology shifted to parallel the standard nominative-accusative pattern. This choice has no bearing on the claims made in this paper; the dependent case in Finnish is marked with -n regardless of whether one calls that form "accusative" or "genitive."
20 Some notes on the Finnish data: Unless indicated otherwise, Finnish judgments are due to my informants. Glossing conventions have been unified across sources. To simplify examples, I do not gloss verbal agreement and have removed any instances of pro-drop (pro behaves just like an overt DP for the purposes of case assignment). The Finnish case patterns in this paper are all invariant (modulo that some of the objects can be partitive), e.g. if a DP is nominative in an example, it must be nominative in that position and cannot be accusative.
21 For imperatives, this is only true if the verb is in first or second person. See Nelson (1998:95-97) and Kiparsky (2001) for arguments that these imperatives do not have syntactically active subjects.

c.

Minu-n   täytyy   osta-a       kirja
I-GEN    need     buy-INF/TA   book.NOM
'I have to buy the/a book'

The case patterns exemplified in (26) and (27) receive a straightforward explanation under DCT. In (26), the subject licenses dependent case (= accusative) on the object; then, because there is no other DP that c-commands the subject, the subject remains unvalued for case throughout the derivation and is realized as having unmarked case (= nominative) at PF. In (27a) and (27b), there is no other DP that c-commands the object; as such, no dependent-case relationship is established, and the object is realized as having unmarked case. In (27c), although there is another DP that c-commands the object, it bears lexical genitive case. Recall from Sect. 2 that only DPs unvalued for case factor into the calculus of dependent case. Lexically case-marked DPs are thus invisible to dependent-case assignment because their case will already have been assigned locally. Accordingly, because no other DP with unvalued case c-commands the object in (27c), the object remains unvalued for case and is realized as having unmarked case at PF. This analysis is summarized in (28).

The data in (26) and (27) could alternatively be analyzed in FHCT: the variants of v⁰ in (27) would lack the ability to assign accusative case, so that T⁰ could assign nominative case to the object (e.g. Vainikka and Brattico 2014, though the identity of the heads differs on their account). Such an analysis would amount to a standard implementation of Burzio's Generalization. Evidence that such an FHCT analysis is insufficient comes from adjuncts. In Finnish, there is a special class of adjuncts that are structurally case-marked, akin to subjects and objects (Tuomikoski 1978; Maling 1993). These adjuncts include durational adjuncts (for an hour), spatial-measure adjuncts (a kilometer), and multiplicative adjuncts (two times). In DCT terminology, these adjuncts factor into the calculus of dependent case (i.e. they can license and be assigned dependent case), and they are realized with unmarked case if their case remains unvalued in the derivation. To illustrate, in an intransitive clause with one of these adjuncts, the subject is nominative and the adjunct is accusative (29a). When the intransitive predicate is passivized (as some kind of impersonal passive), the adjunct becomes nominative (29b), the same case alternation that is observed for objects in passives (27a).²³

22 The notion that DPs compete for nominative case also underlies the analyses of Finnish case in Maling (1993), Kim (2011, 2017), and Poole (2015), though the implementations differ considerably as a result of using different frameworks. The core insights about Finnish and improper case in this paper could in principle be expressed using any of these analyses.
23 Further evidence that this class of adjuncts is structurally case-marked comes from the fact that they must be partitive when under the scope of negation, like subjects and objects (e.g. Heinämäki 1984; Kiparsky 2001).

[Kiparsky 2001:323]

With a transitive predicate, where the object does not bear lexical case, structurally case-marked adjuncts are always accusative (30). (Note that the DP in a dependent-case relationship that is not assigned dependent case still has an unvalued case feature and thus is eligible to enter into another dependent-case relationship.)

'Liisa has to remember the trip for a year' [Maling 1993:57]

Following Larson (1988) and Pesetsky (1995), among others, I will assume that the vP is right-branching, where adjuncts are c-commanded by the object (see also Csirmaz 2005:90-98, who also argues for such an analysis for Finnish). This is schematized in (31), where the possible dependent-case relationships are indicated. Accordingly, an object that is not assigned lexical case by the verb will invariably license dependent case on an adjunct, thereby accounting for the pattern in (30).²⁴

24 Given the vP-structure in (31), more needs to be said about how V⁰ assigns lexical case to the object. Under Bare Phrase Structure, a head X reprojects and is a sibling with its specifier, so the locality conditions of lexical-case assignment laid out in Sect. 2 are still satisfied with the structure in (31). Nevertheless, there needs to be something preventing V⁰ from assigning the lexical case to the adjunct instead of the object. As our concern in this paper is dependent case, I leave this problem for future research.

Crucially, clauses like (29b), where the adjunct is nominative, may contain multiple structurally case-marked adjuncts. In such configurations, the DCT analysis and the FHCT analysis make different predictions. The DCT analysis predicts that the highest adjunct is nominative and all the other adjuncts are accusative. The FHCT analysis, on the other hand, predicts that all of the adjuncts are nominative, because the functional head responsible for assigning accusative case is absent in clauses where the subject is absent; this is what accounted for the data in (27) under an FHCT analysis. The data bear out the prediction of the DCT analysis. This is shown in (32) with two structurally case-marked adjuncts and the verb luottaa 'trust,' which assigns lexical illative case to its object, thereby removing it from the calculus of dependent case. When the subject is present, both of the adjuncts are accusative (32a). When the subject is absent, here in a passive, the higher adjunct is nominative and the lower adjunct is accusative (32b).²⁵ Finally, when the first adjunct is dropped, the only remaining adjunct becomes nominative (32c).

[Maling 1993:59]

The pattern in (32) follows in the DCT analysis without further ado.
For example, in (32b), the first adjunct licenses dependent case on the second adjunct; then, because no relevant DP c-commands the first adjunct, it remains unvalued for case in the derivation and is realized with unmarked case at PF. The FHCT analysis, on the other hand, would need to make additional stipulations to account for (32), in particular to deal with (32b), in which accusative would have to be assigned in a passive, where the functional head responsible for accusative would not occur (similarly in (30b)). As far as I am aware, there is no FHCT analysis of Finnish case that extends to the case pattern in (32).²⁶ I take the fact that this adjunct pattern is entirely regular and productive in Finnish to indicate that Finnish requires the notion of dependent case in order to capture the distribution of accusative (Poole 2015; see also Maling 1993; Anttila and Kim 2011). I will thus adopt such an account in what follows. Against this backdrop, let us now consider case assignment in nonfinite clauses.
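The dependent-case calculus described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions that are not part of the analysis itself: the clause is monoclausal, the DPs are given as a list ordered from highest to lowest in c-command (following the right-branching vP in (31)), and the names `DP` and `assign_case` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DP:
    name: str
    lexical_case: Optional[str] = None  # e.g. "GEN" for a quirky subject


def assign_case(dps: List[DP]) -> Dict[str, str]:
    """Assign case to DPs listed from highest to lowest in c-command order."""
    cases: Dict[str, str] = {}
    licensor_seen = False  # has a higher DP with unvalued case been seen?
    for dp in dps:
        if dp.lexical_case is not None:
            # Lexically case-marked DPs are invisible to the calculus.
            cases[dp.name] = dp.lexical_case
        elif licensor_seen:
            # C-commanded by a DP with unvalued case: dependent case.
            cases[dp.name] = "ACC"
        else:
            # Case stays unvalued and is realized as unmarked case at PF.
            cases[dp.name] = "NOM"
            licensor_seen = True
    return cases


# Configurations modelled on (26), (27c), and (32b).
transitive = assign_case([DP("subject"), DP("object")])
quirky = assign_case([DP("subject", "GEN"), DP("object")])
passive_two_adjuncts = assign_case(
    [DP("object", "ILL"), DP("adjunct1"), DP("adjunct2")]
)
```

On this sketch, the passive with two adjuncts comes out with a nominative higher adjunct and an accusative lower adjunct, which is the pattern that distinguishes the DCT analysis from the FHCT alternative.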

Case in nonfinite clauses
Finnish has a number of nonfinite constructions (Vainikka 1989, 1995; Toivonen 1995; Koskinen 1998; also Hakulinen et al. 2004:Sect. 490). The nonfinite construction of interest in this paper is the MA-infinitive (traditionally called the "third" infinitive). MA-infinitives are interesting because, when they function as clausal complements, case assignment within the nonfinite clause interacts with the makeup of the clause of the embedding verb (e.g. Vainikka 1989). That is, the matrix (= embedding) and embedded clauses constitute a single coextensive domain for the purposes of dependent-case assignment. The MA-infinitive requires the verb to bear an inner locative case marker (inessive, elative, or illative) after the infinitival morpheme -mA (33).²⁷ The case marker matches what a DP would bear in that same position, with the same "directional" meaning (33). In this sense, the verb in a MA-infinitive is nominal-like, but unlike a genuine nominal, it cannot be modified by nominal modifiers, only verbal modifiers (34).

[based on Koskinen 1998:325]

… must be assigned in every finite clause, essentially the Inverse Case Filter of Bošković (1997, 2002). However, her analysis does not explain the possibility of impersonal passives of intransitive predicates, which have no arguments that could receive nominative case, e.g. Tanssittiin 'There was danced.' Her account also does not extend to nonfinite clauses; see Sect. 5.3 for discussion. Space limitations unfortunately prevent giving an exposé of these alternative accounts.
27 MA-infinitives can also occur with the essive case marker, but not when they function as clausal complements.

When the matrix clause has an ordinary nominative subject, the embedded object is marked with accusative (35a). Then, when the matrix subject is absent or bears lexical case, the embedded object becomes nominative (35b). In this section, of the constructions that remove the subject from the dependent-case calculus, I only show imperatives, but all of the data can be replicated for passives and quirky-subject constructions. This resembles the pattern from monoclausal sentences in Sect. 4.1. In (35a), the embedded object is c-commanded by another DP unvalued for case, i.e. the matrix subject, and thus is assigned dependent case (= accusative). In (35b), there is no other DP unvalued for case that c-commands the embedded object and thus it surfaces with unmarked case (= nominative) at PF. Accordingly, the pattern in (35) can be accounted for under DCT by considering (i) the CP to be the relevant domain for dependent case and (ii) MA-infinitives to be projections smaller than CP, so that the domain over which dependent case is calculated includes both the matrix and embedded clauses. Following Koskinen (1998), I assume that MA-infinitives are TPs.²⁸

(35b) also reveals that PRO is either absent from these constructions or inert for the purposes of dependent-case assignment. Otherwise, there would be no principled way to explain why the embedded object's case is contingent on the presence of an argument in the matrix clause. Another DP like PRO inside the embedded clause that c-commands the object and is unvalued for case (for some portion of the derivation) would invariably license dependent case on the object, thereby negating any effect that the matrix clause could ever have. While either analysis (i.e. no PRO or inert PRO) would in principle account for the case pattern in (35b), I will adopt the first analysis that MA-infinitives lack a PRO.²⁹ This choice is largely for the sake of simplicity, but there are two arguments in its favor. First, if PRO can only occur in CPs, as Landau (2000) argues, this absence would follow from MA-infinitives being smaller than CPs. Second, this analysis also allows for a uniform treatment of PRO crosslinguistically as a dependent-case licensor, rather than parametrizing its ability to license dependent case on a language-by-language basis. On this analysis, then, PRO has no effect on dependent case in MA-infinitives because it is not there.

28 There is reason to believe that -mA corresponds to a v⁰ head: (i) it cannot cooccur with verbal inflection, such as passivization, and (ii) -mA is the morpheme used to form agentive participles, which is in line with the argument-structure role of v⁰. Thus, MA-infinitives are at least as big as vPs. Though I follow Koskinen (1998) in assuming that they are TPs, the analysis in Sect. 5.3 is compatible with MA-infinitives being vPs as well; see fn. 37. Additionally, I assume that the case morphology that appears on the verb is assigned directly to the nonfinite clause, with no intervening nominal projections (in line with Vainikka 1995). While relatively inconsequential, this assumption is based on the fact that MA-infinitives do not allow nominal modification (34) and cannot occur with possessive suffixes, the latter of which is a hallmark of nominals in Finnish and is possible with other nonfinite clause types.
29 A reviewer raises the question of how the subject position gets saturated in MA-infinitives if they do not contain a PRO. Standardly, in analyses of the semantics of control, the subject is taken to be unsaturated and the embedded control clause to denote a property of individuals (e.g. Montague 1973; Bach 1979; Dowty 1985; Chierchia 1989). On Chierchia's (1989) analysis, this is produced by a λ-binder abstracting over PRO (for a recent instantiation of this analysis, see Pearson 2013). The absence of PRO produces the same semantic object, namely a property of individuals.
The crucial pattern emerges when the embedding predicate has its own object. Some of these predicates include pakottaa 'force,' pyytää 'ask,' and kieltää 'deny' (see Vainikka 1989:330). As shown in (36), when the matrix subject is present, the matrix subject is nominative, the matrix object is accusative, and the embedded object is accusative; this is the pattern expected, given what we have seen so far.

'Maija asked Jukka to read the book' [Vainikka 1989:267]

Under DCT, this pattern could in principle be modelled in one of two ways: (i) a covariance derivation, where the matrix subject licenses dependent case on both objects (37), or (ii) a daisy-chain derivation (38), where the matrix object licenses dependent case on the embedded object and then the matrix subject licenses dependent case on the matrix object.
However, in the absence of a matrix subject, both the matrix object and the embedded object surface with nominative case, as shown in (39). This rules out the daisy-chain derivation for MA-infinitives in (38). Rather, the case of the matrix and embedded objects covaries with the presence of the matrix subject, as predicted by the analysis in (37).

'Ask Jukka to read the book!' [Vainikka 1989:268]

Binding reveals that the matrix object nevertheless c-commands the embedded object. Finnish third-person possessive suffixes are subject to Condition A, as illustrated in (40a). Crucially, a third-person possessive suffix on the embedded object can be bound by the matrix object (in addition to the matrix subject), as shown in (40b). This shows that the matrix object does indeed c-command the embedded object. All else equal, the matrix object should then license dependent case on the embedded object. The fact that it does not thus needs to be explained.³⁰

(40) Matrix object c-commands the embedded object
a. Poika₁    myi    marsu-nsa₁/*₂
   boy.NOM   sold   guinea.pig.ACC-3.POSS
   'The boy₁ sold his₁/*₂ guinea pig' [Nelson 1998:187]
b. 'Maija₁ asked Pekka₂ to bring her/his₁,₂,*₃ record' [Vainikka 1989:270]

What (39) and (40) reveal is that a matrix subject, but not a matrix object, can license dependent case across an embedded TP boundary into a MA-infinitive, as schematized in (41).
Structurally case-marked adjuncts in the matrix clause are also unable to license dependent case across an embedded TP, and thus they pattern with matrix objects. This is shown in (42a), where the multiplicative adjunct has matrix scope and still both objects must be nominative. (42a) additionally shows that the matrix object has the ability to license dependent case, as it does so on the adjunct, making its inability to do so on the embedded object all the more striking. When the adjunct has embedded scope, the embedded object licenses dependent case on the adjunct in an ordinary local configuration (42b).³¹

30 A reviewer raises the possibility of analyzing this asymmetry in terms of extraposition: the MA-infinitive extraposes before dependent-case assignment and then reconstructs for binding at LF. There are several arguments against such an analysis. First, the extraposition would be string-vacuous, so there is no independent evidence for extraposition. Second, unlike canonical extraposition, it would have to be obligatory. Third, it would require delaying dependent-case assignment; in Sect. 3.2, I argued that dependent-case relationships are established as soon as the licensor is merged into the structure (see also Sect. 7.1). Fourth, MA-infinitives are transparent for extraction (Toivonen 1995; Huhmarniemi 2012). This suggests that even if MA-infinitives were to string-vacuously extrapose, they would independently need to be accessible to syntactic operations prior to extraposition, and it would be unclear why this would preclude case assignment.
31 The adjunct in (42a) also has an embedded reading, which is presumably derived from (42b)

'Ask Jukka to read the book for the third time!' [Maling 1993:66]

The overarching pattern to emerge from Finnish MA-infinitives is summarized in (43).³²,³³

(43) FINNISH CASE GENERALIZATION
In Finnish, a matrix subject can license dependent case across an embedded TP boundary, but a matrix object and a matrix adjunct cannot.
One might wonder why the Finnish pattern in (43) is not found in languages like English. There are two reasons. First, in languages like English, control infinitives are CPs (Landau 2000), and CPs are domains for dependent-case assignment. Second, as control infinitives in languages like English contain PRO, PRO will always locally license dependent case on the object. Thus, in languages like English, case assignment in control infinitives is always determined locally; it is never contingent on elements in the matrix clause.³⁴ Finnish MA-infinitives (and TA-infinitives), on the other hand, are smaller than CP and contain no PRO (or, alternatively, they contain a PRO inert for dependent case), which causes the embedded DPs to interact for dependent-case assignment with matrix DPs. (35) is the crucial datapoint showing this property. The prediction then is that in languages with a pattern like (35), the same generalization from Finnish in (43) should emerge. I leave exploring this prediction to future research.
Crucially, the Finnish Case Generalization in (43) does not involve movement, which will prove important in the next two sections. Like the Movement-Case Generalization from Sect. 3, it also does not fall under the purview of standard notions of locality, e.g. phases, where a domain is either opaque to all operations or transparent to all operations. Under these standard, binary notions of locality, it is unexpected for a domain (here, a TP) to be penetrable by a DP in one position (matrix-subject position), but not another position (matrix-object position, which is arguably more local than the matrix subject). As such, the Finnish Case Generalization must be the result of some other kind of locality, namely one that is nonbinary. In the next section, I will argue that this nonbinary notion of locality is the Williams Cycle.

32 Something like the Finnish Case Generalization in (43) would presumably need to hold under an FHCT analysis as well, because whatever conditions assigning accusative case to the embedded object can only be triggered by a matrix subject. This is notwithstanding the problem that structurally case-marked adjuncts pose for an FHCT analysis in the first place; see Sect. 4.1.
33 (43) also captures the other canonical nonfinite clause type in Finnish, namely TA-infinitives (traditionally called the "first infinitive"). TA-infinitives behave identically to MA-infinitives for the purposes of case assignment. However, the predicates that embed TA-infinitives never have their own objects. Thus, while TA-infinitives exhibit the same basic pattern as (35), the more complex pattern involving matrix objects in (36) and (39) happens not to arise for them.
34 For the same reason, we do not expect to find the Finnish pattern in ECM constructions, assuming that ECM involves movement of the embedded subject (though see Sect. 7.3). In ECM constructions, the embedded subject will locally license dependent case on the embedded object, so that the embedded object's case is never contingent on elements in the matrix clause.

Improper case
In this section, I propose that dependent-case assignment is constrained by the Ban on Improper Case in (44). This constraint rules out dependent-case assignment configurations like (45).

(44) BAN ON IMPROPER CASE
A DP in [Spec, XP] cannot establish a dependent-case relationship with a lower DP across YP, where Y is higher than X in the functional sequence.
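(45) [Schematic: a dependent-case relationship between a DP in [Spec, XP] and a lower DP across YP is ruled out (✗), where Y ≻ X]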
The Ban on Improper Case is a constraint in the spirit of the Williams Cycle (WC) (Williams 1974, 2003, 2013; van Riemsdijk and Williams 1981), which in its original form is only a constraint on movement dependencies. In Sect. 6, I will propose that the WC be generalized to encompass case, movement, and agreement and then take up how to derive this generalized WC. I begin in Sect. 5.1 by introducing the WC in its instantiation for movement, known as the Generalized Ban on Improper Movement. Sect. 5.2 proposes the Ban on Improper Case, an extension of the WC particularized to case. In Sects. 5.3 and 5.4, I then apply the proposal to the Finnish Case Generalization and the Movement-Case Generalization respectively. Sect. 5.5 briefly discusses the treatment of case and movement in Baker (2015).

The Williams Cycle
The Williams Cycle (WC) is a size-based locality constraint on (movement) dependencies spanning two clauses, going back to Williams (1974) and van Riemsdijk and Williams (1981). The basic idea behind the WC is that movement from a specific domain in an embedded clause may only land in the same kind of domain or a higher domain in the matrix clause. In Williams (2003), the WC is formulated as the Generalized Ban on Improper Movement (GBOIM) in (46), where domains are defined in terms of the functional sequence (fseq).³⁵ I will notate X being higher in fseq than Y as X ≻ Y, and, for concreteness, I will assume the simple functional sequence in (48).

(46) GENERALIZED BAN ON IMPROPER MOVEMENT (GBOIM)
Movement to [Spec, XP] cannot proceed from [Spec, YP] or across YP, where Y is higher than X in the functional sequence.
[based on Williams 2003]
(47) A dependency relating α and β occurs ACROSS XP iff XP dominates β but not α.
(48) fseq = C ≻ T ≻ v ≻ V

As its name suggests, the GBOIM is intended to subsume the traditional ban on improper movement (Chomsky 1973, 1981; May 1979). Thus, to illustrate the GBOIM, let us consider how it handles the classical instance of improper movement, namely the ungrammaticality of hyperraising: A-movement out of a finite clause. While Ā-movement may leave a finite clause (49a), A-movement may not (49b).³⁶ This contrast does not extend to nonfinite TP clauses, which allow both A-movement and Ā-movement out of them (50). (For the sake of simplicity, I set aside nonfinite CP clauses, which pattern like finite clauses for hyperraising.)

According to the GBOIM, the relative heights of the launching and landing sites determine whether extraction is possible. Because finite clauses are CPs, movement out of a finite clause can land no lower than [Spec, CP] in the next higher clause, as schematized in (51).

35 The formulation of the GBOIM given in Williams (2003:72) does not represent the full generality of what Williams's analysis of the GBOIM actually derives (see Sect. 6). All else being equal, that formulation allows movement across projections higher in the functional sequence than the launching site of movement because it is stated only in terms of the landing site. I have reformulated the GBOIM in (46) to avoid this problem.
36 In (49), I do not depict movement through [Spec, CP], but this would not change the movement derivations that are ruled out by the GBOIM. Note though that the traditional ban on improper movement (Chomsky 1973, 1981; May 1979)

(51) Movement from CP cannot land lower than CP
As depicted in (51), CP is a barrier for movement to [Spec, TP] because C ≻ T in fseq, but CP is not a barrier for movement to [Spec, CP] because C ⊁ C. Thus, Ā-movement, but not A-movement, out of a finite clause is grammatical. On the other hand, because nonfinite clauses are TPs, movement out of a nonfinite clause may land in either [Spec, TP] or [Spec, CP] because T ⊁ T and T ⊁ C respectively. Thus, both A-movement and Ā-movement are possible out of a nonfinite clause, unlike finite clauses, as schematized in (52).

(52) Movement from TP cannot land lower than TP
Under the GBOIM, size matters. A smaller clause is permeable to more movement types than a larger clause, because the maximal projection of a smaller clause will be lower in fseq than the maximal projection of a larger clause. Constraining movement in terms of clause size extends beyond the distinction between A-movement and Ā-movement. Here are several examples (taken from Keine 2016): (i) Infinitival clauses are opaque to extraposition, but not regular A-movement and Ā-movement (Ross 1967; Baltin 1978). (ii) Embedded questions are opaque to wh-movement, but not topicalization and relativization (Williams 2013). (iii) In Hindi-Urdu, finite clauses are opaque to A-scrambling, but not Ā-scrambling (Mahajan 1990). In German, (iv) embedded V2 clauses are opaque for movement into a verb-final clause, but not movement into a V2 clause (Haider 1984); (v) finite clauses are opaque to scrambling and relativization, but not wh-movement or topicalization (Bierwisch 1963; Ross 1967; Bayer and Salzmann 2013; Müller 2014b); and (vi) incoherent infinitives are opaque to scrambling, but not wh-movement and relativization (Bech 1955/1957; Wurmbrand 2001). What these asymmetries share is a domain that is permeable to one movement type, but not to another (what Keine terms selective opacity). The GBOIM derives these asymmetries as "generalized" improper movement configurations, i.e. in terms of clause size. For more discussion, see Williams (1974, 2003, 2013), Sternefeld (1993, 1996), Abels (2007, 2009, 2012a), Neeleman and van de Koot (2010), Müller (2014a,b), and Keine (2016, 2019, 2020).
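The size-based logic of the GBOIM can be made concrete with a small Python sketch. It assumes the simple functional sequence in (48) and reduces each movement step to two labels, the projection hosting the landing site and the highest projection crossed; the names `FSEQ`, `is_higher`, and `movement_allowed` are hypothetical and serve only this illustration.

```python
FSEQ = ["V", "v", "T", "C"]  # lowest to highest, i.e. C ≻ T ≻ v ≻ V as in (48)


def is_higher(x: str, y: str) -> bool:
    """True iff x is strictly higher than y in fseq (x ≻ y)."""
    return FSEQ.index(x) > FSEQ.index(y)


def movement_allowed(landing: str, crossed: str) -> bool:
    """GBOIM check for movement to [Spec, landing-P] across crossed-P.

    The step is ruled out only if the crossed projection is higher in
    fseq than the projection hosting the landing site.
    """
    return not is_higher(crossed, landing)


# Finite clauses are CPs; nonfinite clauses are TPs.
finite_to_spec_tp = movement_allowed(landing="T", crossed="C")     # hyperraising
finite_to_spec_cp = movement_allowed(landing="C", crossed="C")
nonfinite_to_spec_tp = movement_allowed(landing="T", crossed="T")  # raising
nonfinite_to_spec_cp = movement_allowed(landing="C", crossed="T")
```

Only the first of the four configurations is excluded, matching the contrast between finite and nonfinite clauses discussed above. The other selective-opacity effects listed would require a richer functional sequence than the four-member one assumed here.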

Proposal
There are crucial parallels between the locality problems from Sects. 3 and 4 and the kinds of movement configurations ruled out by the Generalized Ban on Improper Movement. To see these parallels, let us consider the two locality problems in turn.
With respect to the Movement-Case Generalization, recall the Lower-DP Problem, according to which a DP cannot be the lower DP in a dependent-case pair when in an intermediate landing site. (I will return to the Higher-DP Problem in Sect. 5.4.) This characterization can be recast in terms of the WC, viz. clause size and the functional sequence: a DP α in [Spec, CP] cannot enter into a dependent-case relationship with a DP β in a higher clause (DP α being the lower in the pair) if DP β is in [Spec, TP], [Spec, vP], or [Spec, VP], because C ≻ T, C ≻ v, and C ≻ V in fseq. This is schematized in (53).

(53) Lower-DP Problem
Note that a dependent-case relationship between two [Spec, CP] positions also needs to be ruled out (54). This configuration does not fall under the characterization of the Lower-DP Problem (or under the Ban on Improper Case, to be proposed below) because C ⊁ C in fseq. However, for the higher DP in (54) to be in [Spec, CP], it will have undergone Ā-movement to that position. Thus, the impossibility of this particular configuration falls under the Higher-DP Problem (i.e. that an Ā-moved element cannot be the higher DP in a dependent-case pair) and will follow from the analysis of the Higher-DP Problem in Sect. 5.4.

The same parallels apply to the Finnish Case Generalization, according to which a matrix subject can license dependent case across an embedded TP clause boundary, but a matrix object and a matrix adjunct cannot. In terms of the WC: a DP in [Spec, TP] can license dependent case on another DP across a TP, because T ⊁ T, but a DP in a lower position such as [Spec, vP] or [Spec, VP] cannot do so, because T ≻ v and T ≻ V. This is schematized in (55).

(55) Finnish-Case Generalization
These parallels in (53) and (55) are the motivation for extending the WC to dependent-case assignment. I propose that dependent-case assignment is subject to the Ban on Improper Case in (56), a direct extension of the WC to case.

(56) BAN ON IMPROPER CASE
A DP in [Spec, XP] cannot establish a dependent-case relationship with a lower DP across YP, where Y is higher than X in the functional sequence.
The Ban on Improper Case states barrierhood for dependent-case assignment relative to the fseq-position of the higher DP in the dependent-case pair. For example, a DP in [Spec, TP] can license a dependent-case relationship with another DP past TP, vP, and VP, because none of these projections is higher than T in fseq (57).
Notice that the Ban on Improper Case makes no reference to movement or clause types. It is more general than the empirical data that motivated it. The remainder of this section shows how the Ban on Improper Case applies to our two very different generalizations: the Finnish Case Generalization in Sect. 5.3 and the Movement-Case Generalization in Sect. 5.4.
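Because the Ban on Improper Case is stated purely in terms of fseq, the set of barriers for a given licensor position can be computed mechanically. The following Python sketch does so under the assumption of the simple functional sequence in (48); the names `FSEQ`, `is_higher`, and `barriers_for` are hypothetical labels for this illustration only.

```python
FSEQ = ["V", "v", "T", "C"]  # lowest to highest: C ≻ T ≻ v ≻ V


def is_higher(y: str, x: str) -> bool:
    """True iff y is strictly higher than x in fseq (y ≻ x)."""
    return FSEQ.index(y) > FSEQ.index(x)


def barriers_for(licensor_spec: str) -> list:
    """Projections YP across which a DP in [Spec, licensor_spec-P] cannot
    establish a dependent-case relationship with a lower DP."""
    return [y for y in FSEQ if is_higher(y, licensor_spec)]


barriers_from_spec_tp = barriers_for("T")
barriers_from_spec_vp = barriers_for("v")
barriers_from_spec_cp = barriers_for("C")
```

On this sketch, only CP is a barrier for a licensor in [Spec, TP], whereas both TP and CP are barriers for a licensor in [Spec, vP], and nothing is a barrier for a licensor in [Spec, CP].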

Application to Finnish
The Finnish Case Generalization is repeated below in (59).

(59) FINNISH CASE GENERALIZATION
In Finnish, a matrix subject can license dependent case across an embedded TP boundary, but a matrix object and a matrix adjunct cannot.
Under the Ban on Improper Case, the matrix subject is able to license dependent case across the embedded TP boundary because the matrix subject is located in [Spec, TP] and T ⊁ T in fseq. Thus, it licenses dependent case on the matrix object (within the same clause) and on the embedded object (across the clause boundary). This is schematized in (60).

The matrix object occupies a vP-internal position; the precise position is inconsequential, but it is somewhere below v. From its vP-internal position, the matrix object is unable to license dependent case across the embedded TP boundary because T ≻ v in fseq, thereby making TP a barrier for dependent-case licensing from vP-internal DPs, in particular from DPs in [Spec, vP] and any position lower in fseq. The same barrierhood applies for matrix adjuncts as well, which are generated in vP-internal positions too. As such, in the absence of a matrix subject, the [CASE: ] features on the matrix and embedded objects both remain unvalued throughout the derivation and are realized as unmarked case at PF. This is schematized in (61).³⁷

37 An assumption of this analysis is that the subject undergoes A-movement to [Spec, TP], from where it is then able to penetrate the embedded TP to license dependent case. However, if we were to analyze MA-infinitives as being vPs, rather than TPs (contra Koskinen 1998), then the matrix subject would be able to penetrate the embedded vP from its base-generated position in [Spec, vP]. Note that the matrix object and adjuncts would not be able to penetrate an embedded vP because they are in positions below vP.

Under this analysis, there is nothing special about case in MA-infinitives. The same general case mechanism, namely dependent case, applies everywhere in the language as syntactic structure is built up, following Sect. 2, but this mechanism is constrained by the Ban on Improper Case.
Previous analyses of MA-infinitives are all broadly based on the idea that when the matrix subject is absent or bears lexical case, i.e. the environments in Finnish with nominative objects, the ability to assign accusative case is gone altogether (Vainikka 1989; Nelson 1998; Vainikka and Brattico 2014).³⁸ However, we saw in Sect. 4.2 that structurally case-marked adjuncts are still accusative in configurations like (61); the relevant datapoint is repeated in (62).

'Ask Jukka for the third time to read the book!' [Maling 1993:69]

If the ability to assign accusative case is absent in configurations like (61), as previous analyses assume, then there would be no source of accusative case for the adjunct in (62). However, (62) follows without further ado on the DCT analysis developed in this paper: the matrix object licenses dependent case on the adjunct, but the matrix and embedded objects cannot enter into a dependent-case relationship without violating the Ban on Improper Case.
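The interaction of dependent case with the Ban on Improper Case in MA-infinitives can be sketched in Python as follows. The sketch makes several simplifying assumptions that go beyond the text: all DPs have unvalued case, DPs are listed from highest to lowest in c-command order, matrix objects and adjuncts are labelled as sitting in [Spec, VP] as a stand-in for "some vP-internal position," and the embedded clause is a TP. The names `DP`, `is_higher`, and `assign_case` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

FSEQ = ["V", "v", "T", "C"]  # lowest to highest: C ≻ T ≻ v ≻ V


def is_higher(y: str, x: str) -> bool:
    """True iff y is strictly higher than x in fseq (y ≻ x)."""
    return FSEQ.index(y) > FSEQ.index(x)


@dataclass
class DP:
    name: str
    spec_of: str            # projection hosting the DP, e.g. "T" or "V"
    embedded: bool = False  # is the DP inside the MA-infinitive?


def assign_case(dps: List[DP], embedded_size: str = "T") -> Dict[str, str]:
    """Dependent case constrained by the Ban on Improper Case."""
    cases: Dict[str, str] = {}
    for i, low in enumerate(dps):
        licensed = False
        for high in dps[:i]:
            crosses_boundary = low.embedded and not high.embedded
            if crosses_boundary and is_higher(embedded_size, high.spec_of):
                continue  # improper case: the embedded clause is a barrier
            licensed = True
            break
        cases[low.name] = "ACC" if licensed else "NOM"
    return cases


subj = DP("matrix_subject", "T")
obj = DP("matrix_object", "V")
adj = DP("matrix_adjunct", "V")
emb = DP("embedded_object", "V", embedded=True)

with_subject = assign_case([subj, obj, emb])        # cf. (36)
without_subject = assign_case([obj, emb])           # cf. (39)
with_matrix_adjunct = assign_case([obj, adj, emb])  # cf. (62)
```

In the last configuration, the matrix adjunct comes out accusative while both objects are nominative, which is the datapoint that an analysis without any source of accusative in subjectless clauses does not capture.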

Application to movement
The Movement-Case Generalization is repeated below in (63).

(63) MOVEMENT-CASE GENERALIZATION
A-movement can feed dependent case, but Ā-movement cannot.
Let us begin with Ā-movement. Recall that the locality problem with Ā-movement is that an Ā-moved element cannot enter into dependent-case relationships from its intermediate and final landing sites. Thus, we must consider (i) when an Ā-moved element is the lower DP in a potential dependent-case pair (see (21)) and (ii) when it is the higher one (see (22)). These are the Lower-DP Problem and the Higher-DP Problem respectively. For the sake of clarity, I will label the higher and lower DPs in a dependent-case pair as DPα and DPβ, respectively, unless the DP in question is an Ā-moved element, for which I will reserve the label DPμ. It should be emphasized that this labeling is for expository purposes only, and the Ban on Improper Case does not (need to) take into account whether the relevant DPs have undergone movement.
According to the Ban on Improper Case, a DPα in [Spec, TP], [Spec, vP], or [Spec, VP] cannot enter into a dependent-case relationship with a DPμ in embedded [Spec, CP] because these projections are all lower than C in fseq. That is, CP is a barrier for dependent-case licensing from TP and all projections lower in fseq (64). This barrierhood accounts for why an Ā-moved element may not have its case altered at its intermediate and final landing sites, i.e. the Lower-DP Problem.
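The barrierhood described above can be pictured with a small sketch. This is only an illustration under simplifying assumptions: fseq is reduced to the four heads V, v, T, and C, and the names `FSEQ`, `is_higher`, and `can_license` are hypothetical labels introduced here, not part of the analysis itself.

```python
# Assumed functional sequence (fseq), ordered from lowest to highest head.
FSEQ = ["V", "v", "T", "C"]


def is_higher(y: str, x: str) -> bool:
    """Return True if head y is strictly higher than head x in FSEQ."""
    return FSEQ.index(y) > FSEQ.index(x)


def can_license(licenser_head: str, crossed_heads: list) -> bool:
    """Toy check of the Ban on Improper Case.

    licenser_head: the X of the [Spec, XP] hosting the higher DP.
    crossed_heads: heads of the phrases (YPs) separating it from the lower DP.
    Licensing fails if any crossed Y is higher than X in FSEQ.
    """
    return not any(is_higher(y, licenser_head) for y in crossed_heads)


# Matrix subject in [Spec, TP] across an embedded TP: T is not higher than T.
print(can_license("T", ["T"]))  # True
# Matrix object in a vP-internal position across an embedded TP.
print(can_license("v", ["T"]))  # False
# DP in [Spec, TP] and a DP in embedded [Spec, CP]: CP is a barrier.
print(can_license("T", ["C"]))  # False
```

The first two calls correspond to the Finnish MA-infinitive configurations, and the third to the Lower-DP Problem.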
The Ban on Improper Case, however, does not prohibit a DPμ in [Spec, CP] from establishing a dependent-case relationship with a DPβ lower in the same clause, i.e. the Higher-DP Problem, as C is higher than these projections in fseq. I propose that the reason why Ā-moved DPs cannot themselves license dependent case is that they are encased in a QP, i.e. a Q-particle Phrase (in the sense of Cable 2007, 2010). 39 Because only DPs may establish a dependent-case relationship, a DP inside a QP cannot be the higher DP in a dependent-case pair because it does not c-command out of the QP and hence never c-commands other DPs in the clause (65a). On the other hand, a DP inside a QP can be the lower DP in the pair because other DPs can still c-command into the QP (65b).
However, a DP that undergoes Ā-movement should still be able to enter into dependent-case relationships from the A-positions that it occupied prior to Ā-movement, which (65a) does not permit. To solve this problem, I adopt Safir's (2019) independently-motivated proposal that the QP is countercyclically merged onto the DP immediately before it Ā-moves (see also Rezac 2003; Stanton 2016). 40

39 While Cable's (2007, 2010) QP-system is designed primarily for wh-movement, I follow Cable in assuming that a QP-analysis extends to other kinds of Ā-movement as well; see Cable (2007:369-375).

40 A few notes are in order: First, Safir (2019) does not assume that the shell insulating an Ā-moved element is always a QP, as I do here, although nothing critical in this paper rests on that assumption. Second, the reader is referred to Safir (2019) for discussion of how other facets of the A/Ā-distinction can be captured under a QP-shell analysis. Third, non-countercyclic implementations of the QP-shell analysis are conceivable. What I have in mind is the multidominance analysis of QP-movement in Johnson (2012) and Poole (2017), where the DP merges with its base position and with the Q-particle, the resulting QP then being merged in the landing site of movement; this is effectively sidewards movement of the DP into the QP.
To illustrate how this applies to dependent case, consider the derivation of a simple wh-subject question in (66): (i) the subject is base-merged in [Spec, vP], from where it licenses dependent case on the object (66a); (ii) the subject A-moves to [Spec, TP] (66b); (iii) the QP is then merged on top of the subject (66c); and finally (iv) the QP moves to [Spec, CP] (66d). The Ā-moving DP will always be encased in the QP before it reaches any intermediate or final [Spec, CP] landing sites, thereby preventing it from licensing dependent case on other DPs from those derived positions, i.e. the Higher-DP Problem.
(66) a. Step 1: Merge the subject, assign dependent case
b. Step 2: A-move the subject
c. Step 3: Build the QP on the DP
d. Step 4: Move the QP

Note that the addition of the QP layer does not handle the Lower-DP Problem, because other DPs can nonetheless c-command a DP encased in a QP. This problem still requires the Ban on Improper Case, as was schematized above in (64). This point, however, raises an alternative analysis where QPs are themselves opaque to case assignment, so that once they are formed on a DP, that DP no longer interacts with case assignment. Such an analysis faces the dilemma that the opacity for case assignment would have to come from its own source and not apply to other dependencies, because c-command into a QP for other dependencies, e.g. binding, is indeed possible. For example, consider (67), in which an anaphor in a moved wh-element (on the analysis here, a QP) can be bound from its landing site (Barss 1986; Lebeaux 1988).

Under this analysis, the behavior of Ā-movement with respect to dependent case follows from two components: a QP-shell and the Ban on Improper Case. The former handles the Higher-DP Problem, and the latter handles the Lower-DP Problem. Because the QP-shell and the Ban on Improper Case are independent from each other, this account predicts that if the 'higher/lower' symmetry breaks down, it should crucially do so in one direction. Namely, if an Ā-moving DP is not encased in a QP-shell, then it should be able to be the higher DP in a dependent-case pair in its intermediate and final landing sites, but not the lower DP. In other words, it should exhibit the behavior of the Lower-DP Problem, but not the Higher-DP Problem. Empirically, this would be a movement type that is just like wh-movement, targeting a position high in fseq, like [Spec, CP], except that it does not involve a QP. This prediction is schematized in (68).
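The independence of the two components can be made concrete with a toy sketch. It assumes a four-head fseq and represents a moved DP only by its landing site and by whether it carries a QP-shell; the class `MovedDP` and both function names are hypothetical and serve only to illustrate the prediction.

```python
from dataclasses import dataclass

# Assumed functional sequence (fseq), ordered from lowest to highest head.
FSEQ = ["V", "v", "T", "C"]


@dataclass
class MovedDP:
    landing_site: str    # head whose specifier the moved DP occupies, e.g. "C"
    has_qp_shell: bool   # True if the DP is encased in a QP


def can_be_higher_dp(dp: MovedDP) -> bool:
    """Component 1 (QP-shell): a DP inside a QP does not c-command out of it,
    so it cannot license dependent case on a clausemate DP."""
    return not dp.has_qp_shell


def can_be_lower_dp(dp: MovedDP, licenser_site: str) -> bool:
    """Component 2 (Ban on Improper Case): a licenser in [Spec, XP] of the
    higher clause cannot reach a DP in embedded [Spec, YP] if Y is higher
    than X in FSEQ."""
    return FSEQ.index(dp.landing_site) <= FSEQ.index(licenser_site)


qp_wh = MovedDP(landing_site="C", has_qp_shell=True)      # wh-movement with a QP
bare_wh = MovedDP(landing_site="C", has_qp_shell=False)   # movement without a QP

print(can_be_higher_dp(qp_wh), can_be_lower_dp(qp_wh, "T"))      # False False
print(can_be_higher_dp(bare_wh), can_be_lower_dp(bare_wh, "T"))  # True False
```

Removing the QP-shell flips only the first value: the moved DP can now be the higher DP, while remaining unavailable as the lower DP.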
This prediction appears to be borne out in Koryak, as described by Abramovitz (2020). Abramovitz shows that in Koryak, (i) ergative is dependent case and (ii) long wh-movement of an embedded nominative DP results in dependent ergative case on the matrix subject, as shown in (69). Note that for readability, I do not gloss the verbal morphology in the Koryak data.

For (68) to be true, the position from which the wh-element establishes the dependent-case relationship must be the higher position. Crucially, embedded questions show that it is indeed the higher position. 42 In embedded questions, moving a nominative wh-element to embedded [Spec, CP] does not trigger dependent ergative on the matrix subject, as shown in (70). This is precisely what (68) predicts. By contrast, if (69) involved the wh-element establishing the dependent-case relationship from the lower position, then we would expect the matrix subject to be ergative in (70) as well, contrary to fact.

41 Under Cable's (2007, 2010) QP-system, such a movement type should also lack pied-piping. I do not know whether Koryak wh-movement, which I introduce shortly below as an instance of such a movement type, allows pied-piping or not, but none of the examples in Abramovitz (2020) involve pied-piping.

42 Abramovitz (2020) proposes that the dependent-case relationship with the matrix subject is established when the moving wh-element is in matrix [Spec, vP]. He places the matrix subject in [Spec, TP], and so the directionality of dependent case appears to be upwards, as is standard for "ergative." However, once we recognize that the subject's base position is the inner specifier of vP, then on Abramovitz's analysis, the directionality of dependent case would in fact need to be downwards, as in the analysis presented in the main text, because the wh-element would move through the outer specifier position of vP, which is above the matrix subject.
Thus, as far as I can tell, my (re)analysis of the Koryak data is not substantively different from Abramovitz's analysis, though technically a different claim.

(70) 'I am wondering what gifts I should give Hewngyto for his birthday' [Abramovitz 2020:28]

Therefore, Koryak seems to confirm the prediction in (68) of the two-component analysis being proposed here. Whether this pattern can be found more widely, I leave for future research. Note that this analysis of Koryak requires the unorthodox assumption that ergative is dependent case assigned to the lower DP in a dependent-case pair, rather than the higher DP. Exploring this point is outside the scope of this paper, but see Yuan (2018, 2020) for independent arguments that "ergative" (i.e. dependent case on transitive subjects) can be assigned downwards. 43

Turning now to A-movement, recall the three examples of movement feeding dependent case from Sect. 3.1: (some instances of) object shift, Shipibo applicatives of unaccusatives, and the English raising predicate strike as. All three of these examples obey the Ban on Improper Case. For object shift, the subject's position is higher in fseq than the raised object's position (presumably [Spec, TP] and [Spec, vP] respectively), as schematized in (71). Similarly for Shipibo applicatives, the raised theme's position is higher in fseq than the applicative's position (72). Generally, if a DP moves clause-internally (i.e. within the same extended projection), it will be able to establish dependent-case relationships with other DPs in that same clause, because the higher DP in the pair will always be higher in fseq than the lower DP. For English strike as, the movement crosses a clause boundary (the movement itself obeying the GBOIM; see Sect. 5.1), but the position in which the moved DP lands is higher in fseq than the position of the internal argument of strike.
Therefore, according to the Ban on Improper Case, the two DPs can establish a dependent-case relationship, as schematized in (73).

(73)
English 'strike as'

43 Yuan (2018, 2020) argues that in Inuit, the object must raise over the subject in order to license dependent ergative case on the subject. Abramovitz (2020:27-30) in fact provides data that can be taken as evidence in favor of extending this kind of analysis to Koryak. He shows that nominative objects raise to the edge of vP, which could plausibly be analyzed as a position above the subject's base position, so that dependent ergative is assigned downwards. Strictly speaking, such an analysis would require relaxing Earliness; see Sect. 7.1 for discussion.
The Ban on Improper Case also explains why the internal argument of strike does not license dependent case on the subject in its embedded position before it moves: the vP-internal position of the internal argument is lower in fseq than T. Therefore, TP is a barrier to licensing a dependent-case relationship with the embedded subject.
The discussion thus far has focused on dependent case, but it should be noted that this analysis does not preclude lexical case from being assigned to a moved position. First, the Ban on Improper Case is not formulated to encompass lexical-case assignment. But, because lexical case is assigned in a siblinghood relation, it is out of the purview of the Ban on Improper Case regardless. Assuming Bare Phrase Structure, where what projects is the head itself (Chomsky 1995a), it is then possible for a DP to move to a specifier position and be assigned a lexical case under siblinghood, in what would traditionally be a specifier-head relation (à la Rezac 2003) (see also fn. 5). To illustrate, consider dative-accusative constructions in Faroese (74), which are historically related to the more familiar Icelandic dative-nominative constructions [Thráinsson et al. 2004:255].

These constructions can be analyzed as follows: (i) the subject is base-merged in [Spec, vP], from where it licenses dependent case (= accusative) on the object (75a); (ii) the subject moves to a higher projection in the clause, e.g. Exp⁰ (75b); and (iii) the head of this projection assigns the subject lexical dative case (75c). The difference between Faroese and Icelandic is that in Icelandic, the subject is assigned dative case in its base-generated position, thus bleeding dependent-case assignment and yielding nominative objects. 44

(75) a. Step 1

(2015) terms Active Ergative languages (where ergative case is associated with external arguments), the "marked nominative" construction in Dinka (van Urk 2015), and differential object marking in Hindi-Urdu (Bhatt and Anagnostopoulou 1996). There are likely many other such instances, but these exemplify when such a derivation might be reasonably invoked.
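The ordering contrast between Faroese and Icelandic described above can be sketched as a toy derivation. The sketch assumes, as is standard in DCT, that only a DP still unvalued for case licenses dependent case and that unvalued case is realized as unmarked (nominative); the function `derive` and its parameter are hypothetical names used for illustration only.

```python
def derive(dative_assigned_at_base: bool) -> dict:
    """Toy derivation of a dative-subject clause.

    dative_assigned_at_base=True models the Icelandic-type timing;
    False models the Faroese-type timing, where dative is assigned
    only after the subject has moved to a higher projection.
    """
    subject_case = None  # None stands for an unvalued case feature
    object_case = None

    # Icelandic-type: lexical dative is assigned in the base position.
    if dative_assigned_at_base:
        subject_case = "DAT"

    # Dependent case: licensed only by a subject that is still unvalued.
    if subject_case is None:
        object_case = "ACC"

    # Faroese-type: the subject moves, and the higher head assigns dative.
    if subject_case is None:
        subject_case = "DAT"

    # Any case feature left unvalued is realized as unmarked case.
    if object_case is None:
        object_case = "NOM"

    return {"subject": subject_case, "object": object_case}


print(derive(dative_assigned_at_base=False))  # Faroese-type: DAT subject, ACC object
print(derive(dative_assigned_at_base=True))   # Icelandic-type: DAT subject, NOM object
```

The only difference between the two calls is the point at which lexical dative is assigned, which is what determines whether dependent-case assignment is bled.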
In sum, the Ban on Improper Case accounts for the interactions between movement and dependent case: roughly, A-movement, but not Ā-movement, may feed dependent-case assignment. Importantly, the analysis does not invoke separate operational primitives for A-movement and Ā-movement. Rather, the analysis derives from the positions targeted by different movement types. Moreover, if Safir (2019) is correct that the QP-shells in Ā-movement can be derived from independent factors, then the analysis presented here captures the A/Ā-distinction in this (narrow) domain purely as an epiphenomenon. This thinking is in line with minimalist syntax, where all structure building is the result of the operation MERGE. The foundations of the analysis were also independently motivated from the Finnish Case Problem, which crucially does not involve movement.

Remarks on Baker (2015)
Because Baker's (2015) system is the most comprehensive dependent-case system to date, it is instructive to consider how it fares on the data considered in this paper. First, Baker does not investigate anything comparable to Finnish MA-infinitives, and hence nothing in his system handles the Finnish Case Generalization. Second, his treatment of the A-movement examples from Sect. 3.1 is more or less in line with what I propose for them in Sect. 5.4. Therefore, let us set aside these two issues and focus on the Higher-DP Problem and the Lower-DP Problem, where a comparison is more fruitful.
Regarding the interaction of case and movement, Baker proposes that (i) dependent case is assigned at phasal Spellout and that (ii) it is calculated within the phase complement, crucially excluding the phase edge. 45 Consider the wh-question in (76) at the point when the CP phase is spelled out. As the phase complement of C, TP is the domain of dependent case. The higher copy of who in [Spec, CP] is not within the phase complement and hence does not factor into the dependent-case calculus. Thus, the only dependent-case relationship established at the CP phase is between she and the lower copy of who. The higher copy of who in [Spec, CP] is spelled out in the next phase (or by whatever procedure spells out the edge of the highest phase).
This analysis accounts for the Higher-DP Problem: in its intermediate and final landing sites, where it could be the higher DP of a dependent-case pair, an Ā-moved element is not included in the calculus of dependent case for that phase.
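The phase-based calculus just described can be pictured with a minimal sketch. It is not Baker's formalization: the dictionary layout, the function name `dependent_case_pairs`, and the string labels for the two copies of who are assumptions made here for illustration.

```python
def dependent_case_pairs(phase: dict) -> list:
    """Return the (higher, lower) DP pairs considered at Spellout of a phase.

    Only DPs inside the phase complement are considered; DPs at the phase
    edge are ignored. The complement list is assumed to be ordered from
    the highest DP to the lowest by c-command.
    """
    complement = phase["complement"]
    return [
        (higher, lower)
        for i, higher in enumerate(complement)
        for lower in complement[i + 1:]
    ]


# Toy encoding of the CP phase of a wh-question: the higher copy of 'who'
# sits at the phase edge; the TP complement contains 'she' and the lower copy.
cp_phase = {
    "edge": ["who (higher copy)"],
    "complement": ["she", "who (lower copy)"],
}

print(dependent_case_pairs(cp_phase))
# [('she', 'who (lower copy)')]
```

Because the edge is never consulted, the higher copy cannot act as the higher DP of any pair at this phase.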
However, there are two drawbacks. First, this treatment of the Higher-DP Problem does not extend to the Koryak data, where an Ā-moved element in [Spec, CP] does in fact establish a dependent-case relationship with a DP lower in that clause. Baker's analysis categorically rules out such configurations, which appears to be too strong. Second, it leaves the Lower-DP Problem unresolved. As laid out in Sect. 3.2, from embedded [Spec, CP] positions, a DP unvalued for case should be eligible to be the lower DP in a dependent-case pair, but crucially it is not. There is nothing in Baker's analysis to rule out such pairs. In general, appealing to the locality afforded by phases cannot account for the Lower-DP Problem (see (23)).
Therefore, Baker's (2015) analysis of the interaction between case and movement does not extend to the full range of facts considered in this paper. However, this should not be construed as an argument against Baker's overall dependent-case system, since it is otherwise compatible with the Ban on Improper Case.

Deriving the Williams Cycle
While the Ban on Improper Case derives the range of facts presented in this paper, the fact that analogous restrictions have been observed for movement (e.g. Williams 1974, 2003, 2013; Sternefeld 1993, 1996; Abels 2007, 2009, 2012a; Neeleman and van de Koot 2010; Müller 2014a,b) and agreement (Keine 2016, 2019, 2020) strongly suggests that these "WC effects" have a unified source. Here, there are two interconnected issues: (i) how to formulate the WC so as to encompass case, movement, and agreement and (ii) how to derive the WC in the grammar.
The existing analyses of WC effects (other than Williams's own) analyze the WC as the result of a constraint on either MERGE (Abels 2007, 2009; Müller 2014a,b) or AGREE (Keine 2016, 2019, 2020). Examining the specific details of these proposals would take us too far afield. What is important for present purposes is that they are operation-specific. For these proposals to extend to the case facts presented here, i.e. the Ban on Improper Case, dependent-case assignment would need to involve AGREE (or, in principle, MERGE). However, a dependent-case relation does not resemble an AGREE-relation, in that it does not involve any obvious valuation (or checking) of syntactic features between the two DPs. Put differently, it is not clear how the two DPs valuing features on each other (in either direction) would result in one of them being assigned dependent case. Thus, if case, movement, and agreement all exhibit WC effects and (dependent) case assignment does not involve AGREE, then WC effects must be the result of a more general (non-operation-specific) constraint in the grammar (as Keine 2020:332 also acknowledges).
The line of thinking that I advance in this section is that WC effects can be uniformly derived in an operation-general way if they are analyzed as the result of how clausal embedding works, as Williams (2003) originally proposed. The challenge for this approach is that it enforces a very strict locality. While this strict locality appears to be appropriate for case (the focus of this paper), it has been argued that it is empirically too restrictive for movement (Abels 2007, 2009); I will return to this issue in Sect. 7.3.
Broadly construed, the WC is the notion that one and the same node can be a barrier to some dependencies, but not to other dependencies. I propose adopting the particularly strong formulation of the WC in (77).

(77) WILLIAMS CYCLE (strong version)
Within the current XP, a syntactic operation may not target an element across YP, where Y is higher than X in the functional sequence.
The formulation in (77) is operation-general; it does not reference specific operations and thereby covers case, movement, and agreement alike. It also encodes the strict locality of the Generalized Ban on Improper Movement (see Sect. 5.1) and thus is commensurate with the formulation in Williams (2003, 2013). Accordingly, the Ban on Improper Case is a subcase of (77). An additional upshot of the formulation in (77) is that it is compatible with the various syntactic implementations of dependent-case assignment: a distinct syntactic operation (as assumed in this paper; see Preminger 2011, 2014), parasitic on cyclic linearization (Baker 2015), or binding relations (Pesetsky 2011). In other words, the WC does not require a particular analysis of dependent case, only that it occurs in the syntax (see Sect. 7.1).
To derive the WC as formulated in (77), I suggest returning to Williams's (2003, 2013) own analysis of the WC, which derives the WC from the syntax of embedding (for an overview of this theory, see Hornstein and Nevins 2005). The core idea of the analysis is that embedding is constrained by the Level Embedding Conjecture (LEC) in (78). 46

(78) LEVEL EMBEDDING CONJECTURE (LEC)
An XP can only be embedded in a structure that is also built up to an XP.
The basic idea behind the LEC is that clauses are built up in parallel. Embedding may take place at any point, but once a clause has been embedded, it no longer increases in size. 47 The different points in the derivation at which embedding may take place correspond to the functional sequence. Williams calls this notion the derivational clock (or 'F-clock'). To illustrate, consider that-clause embedding in (79).

Under the LEC, embedding is a substitution operation (though Williams does not explicitly call it such), analogous to Chomsky's (1955, 1957) theory of generalized transformations (for a reprise, see also Chomsky 1995b:173-174) and to substitution in Tree Adjoining Grammar (Joshi et al. 1975; Kroch and Joshi 1985). On this proposal, the WC follows from the strict cycle. Let us take the strict cycle to be the result of the Strict Cycle Condition, as defined in (80) (the formulation is taken from Müller 2017; see Chomsky 1973, 1995b, 2001, 2008), which precludes syntactic operations from solely applying within embedded domains. Embedding itself must be considered to be admissible under (80). 49

(80) STRICT CYCLE CONDITION
Within the current XP α, a syntactic operation may not exclusively target an item in the domain of another XP β if β is in the domain of α.
(81) DOMAIN
The domain of a head X is the set of nodes dominated by XP that are distinct from and do not contain X.

where Y is higher than X in fseq and YP is the root node

No operation that is triggered in XP (whether it be movement, agreement, or case) can look into a YP (where Y is higher than X in fseq) because the relevant structure where X and [Spec, XP] would have access to YP within the strict cycle is simply not created by the grammar. This is illustrated in (84) for dependent-case assignment.
(84) Applied to dependent-case assignment

As such, all syntactic dependencies are subject to the WC, regardless of whether or not they share the same operational core. All of the WC effects are thus uniformly derived from the timing of embedding.

Before concluding this section, it is worth briefly considering what counts as a 'syntactic dependency' in the context of the WC and the LEC. Recall from Sect. 4.2 that in Finnish MA-infinitives, the matrix and embedded objects cannot establish a dependent-case relationship with each other (see (39)), but they can establish a binding relationship (see (40)). The former is predicted by the LEC, but the latter is not; the asymmetry between the two thus needs to be explained. I take a 'syntactic dependency' to be a dependency that exists in the narrow syntax, thereby being interspersed with structure building. By this definition, case, movement, and agreement are all syntactic dependencies. However, I contend that LF relations are not syntactic in this relevant, narrow sense. Because the LEC only constrains the narrow syntax, LF relations therefore do not exhibit WC effects. The justification for this claim is that under the LEC, semantic interpretation can only proceed after all embedding has taken place, because the embedded clause's denotation is needed to compute the VP's denotation, and so forth. Thus, it is independently the case that LF must be computed on the basis of the final output of the narrow syntax. Crucially, the core of a binding dependency, i.e. the λ-operator and the variable that it binds, is a relation that only needs to hold at LF (Lebeaux 2009). 50 Returning to Finnish, the asymmetry between dependent case and binding in MA-infinitives, then, arises because the dependent-case relationship would need to be established in the narrow syntax, which is not possible because of the LEC, while the binding relationship is established at LF and hence is unaffected by the LEC.
Verification of this hypothesis would require examining other LF relations, such as scope and focus, the problem being that many (perhaps most) of these relations are clause-bounded and thus are not the kinds of dependencies that the LEC would affect. I leave pursuing this topic to future research. 51
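The timing logic behind the LEC, as discussed in this section, can be sketched as a simple clock. The sketch assumes a four-head fseq and models only one matrix clause and one embedded clause; the function name `derivational_clock` is a hypothetical label, and the code is a simplified picture rather than Williams's own formalization.

```python
# Assumed functional sequence (fseq), ordered from lowest to highest head.
FSEQ = ["V", "v", "T", "C"]


def derivational_clock(embedded_size: str):
    """Step through the clock for a matrix clause with one embedded clause.

    Both clauses are built up in parallel, one level of FSEQ per tick.
    The embedded clause is substituted into the matrix clause at the tick
    matching its own size, and it stops growing from then on.
    Yields (matrix_level, embedded_clause_is_accessible) for each tick.
    """
    embedded = False
    for level in FSEQ:
        if level == embedded_size:
            embedded = True  # embedding happens when the matrix reaches this level
        yield level, embedded


# An embedded TP is not yet part of the matrix clause at the V and v levels,
# so nothing triggered within the matrix vP can target into it.
print(list(derivational_clock("T")))
# [('V', False), ('v', False), ('T', True), ('C', True)]
```

On this picture, a matrix object in a vP-internal position never has an embedded TP within its reach during its own cycle, whereas a matrix subject in [Spec, TP] does.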

Discussion
This paper has argued that case assignment is subject to the Ban on Improper Case in (85). This constraint is an extension of the Williams Cycle (WC) particularized to case. The motivation for improper case came from two disparate empirical domains: the interaction between case and movement and crossclausal case assignment in Finnish. Both of these locality problems were shown not to fall under the purview of standard notions of locality, e.g. phases, but rather they follow from the Ban on Improper Case.

(85) BAN ON IMPROPER CASE
A DP in [Spec, XP] cannot establish a dependent-case relationship with a lower DP across YP, where Y is higher than X in the functional sequence.
It was then shown that Williams's (2003) analysis of the WC in terms of clausal embedding uniformly captures WC effects for case, movement, and agreement. Thus, it crucially derives the Ban on Improper Case. The remainder of this paper is devoted to discussing some of the issues that emerge from the Ban on Improper Case. Sections 7.1 and 7.2 discuss two broader ramifications: the timing of case assignment and the WC's relation to phases, respectively. Finally, the issue of potential counterexamples to the strict locality enforced by the (strong) WC is taken up in Sect. 7.3.

Timing of case assignment
An overarching question in the case literature is at what point in the derivation case assignment happens. In Marantz's (1991) original implementation of DCT, case assignment is situated at PF, i.e. in the postsyntactic morphological component. This line of thinking prevailed in the early work on DCT (e.g. McFadden 2004; Bobaljik 2008; an exception being Bittner and Hale 1996). It was also often considered a key difference between DCT and the more standard FHCT, which situates case assignment in the narrow syntax (see e.g. Legate 2008). However, more recent work on DCT has argued that, even under DCT, case assignment must be in the narrow syntax and not at PF (Baker and Vinokurova 2010; Preminger 2011, 2014; Baker 2015).
The Ban on Improper Case lends further support to the argument that case assignment must be in the narrow syntax. First, as movement is subject to the WC and movement occurs in the narrow syntax, the WC itself must be due to a constraint in the syntax in order for it to restrict movement so. It stands to reason then that anything else that is subject to the WC must be in the syntax as well. As such, because case assignment is subject to the WC, it too must be in the syntax. Second, the information required for the WC in the first place is fundamentally syntactic in nature, and replicating it at PF just so that it could apply to case assignment would be redundant. Third, it was argued in Sect. 3.2 that dependent-case assignment is interspersed with structure building, which would not follow if case assignment were at PF.
When in the narrow-syntactic derivation does case assignment happen then? In Sect. 3.2, I proposed that dependent case is assigned as early as possible, a principle that I called Earliness. This notion of earliness also trivially extends to lexical case under DCT, where it is always assigned locally (see Sect. 2). Earliness is the strongest hypothesis about the timing of case assignment, and it is fully consistent with the data presented in this paper. Another possibility, though, is that case assignment is delayed until phasal Spellout (as in Baker 2015). The effects of such an analysis largely depend on what the phases are. If vP is a phase, such an analysis is in principle possible, as long as case assignment precedes any movement to the phase edge, in order to avoid the problems solved by Earliness. However, such an analysis would have to grapple with the fact that vP does not seem to erect a locality domain for case in the same way as CP does; see the next section. If only CP is a phase, though, then delaying case assignment until phasal Spellout will require saying something special about A-scrambling, which would be phase-internal (e.g. to [Spec, TP]) on such an analysis, but does not affect case assignment. Fully exploring these issues is beyond the scope of this paper, so I leave them for future research. However, it should be pointed out that whether XP is a domain at which some operation applies, and whether XP is a locality domain that blocks that same operation are in principle distinct questions, even if they are typically conflated in phase theory.

Phases
The WC and phases (the more standard notion of locality) are not mutually exclusive. They may coexist as independent constraints on syntactic operations. For instance, the WC does not force successive-cyclic movement through [Spec, CP]; this is still a consequence of phases.
It is standardly assumed that CP and vP are phases, and consequently that successive-cyclic movement targets [Spec, CP] and [Spec, vP] (Chomsky 2000, 2001, 2008). Throughout this paper though, I have tacitly assumed that only CP is a phase, because in Finnish, a dependent-case relationship can span an arbitrary number of intervening vPs, as illustrated in (86) [Nelson 1998:238]. Minimally, (86) shows that dependent-case assignment is not subject to the Phase Impenetrability Condition (PIC) at the vP-level. There are two potential conservative explanations for this status, both of which are compatible with the Ban on Improper Case. The first is that dependent-case assignment is simply not subject to the PIC, as Bošković (2007) has argued about AGREE. The second is that the vP-phase does not intervene in the same way for dependent-case assignment as the CP-phase does, as Baker (2015) proposes with his 'soft'- and 'hard'-phase distinction.
There is also the more radical explanation that vP is not a phase. vP-phasehood, in fact, conflicts with the WC more generally. First, according to the WC, movement from [Spec, CP] to [Spec, vP] is barred because C is higher than v in fseq. Second, if such movement were permitted, it would obscure crucial distinctions needed to account for generalized improper movement. For example, consider hyperraising: at the point at which movement to [Spec, TP] occurs, the moving DP would be in [Spec, vP], so it would be necessary to backtrack into the previous phase to see whether it moved out of a CP or a TP. If the movement to [Spec, TP] proceeds directly from the CP/TP (see Sect. 5.1), then such backtracking is unnecessary. For more discussion of this particular problem, see Müller (2014a,b). Based on (i) these kinds of considerations involving the WC and (ii) long-distance agreement configurations parallel to (86), Keine (2016, 2017, 2020) argues that vP should not be considered a phase (see also Keine and Zeijlstra 2020), which would also solve the problems that the WC poses for vP-phasehood.

Potential exceptions to the Williams Cycle
As shown in Sect. 6, the Level Embedding Conjecture (LEC) successfully derives the strong formulation of the WC, repeated in (87), thereby providing a uniform analysis of WC effects for case, movement, and agreement. I will refer to the formulation in (87) as the 'strong' WC.

(87) WILLIAMS CYCLE (strong version)
Within the current XP, a syntactic operation may not target an element across YP, where Y is higher than X in the functional sequence. Abels (2007Abels ( , 2009, however, has argued that the strong WC is empirically too restrictive because it rules out several purported movement dependencies, such as subjectto-object raising in ECM infinitives and movement over complementizers (to be discussed below). This criticism extends to the LEC, since it derives the strong WC. The recent, operation-specific analyses of WC effects have taken these purported exceptions at face value and gone on to develop analyses that derive weaker versions of the WC. As discussed in Sect. 6, the dilemma with these analyses is that they do not extend to dependent case. Let us focus on Keine's (2016Keine's ( , 2019Keine's ( , 2020 AGREE-based analysis, setting aside the MERGE-based analyses of Abels (2007Abels ( , 2009 and Müller (2014a,b). Dependent-case relations do not resemble canonical AGREE-relations, and thus it is not immediately evident that dependent-case assignment involves AGREE.
We are therefore at an impasse between two options: (i) develop a fully AGREE-based implementation of DCT, thereby allowing us to (in principle) extend Keine's analysis to case, or (ii) revisit and reanalyze the purported exceptions to the strong WC, thereby allowing us to maintain the LEC. I argue that the purported exceptions to the strong WC should be revisited and reanalyzed. My argument against the first option is twofold. First, no AGREE-based implementations of DCT have been proposed in the literature. Thus, given the current state of the art in DCT, it is not presently possible to directly extend Keine's analysis to case. Second, Keine's analysis handles the exceptions largely through a stipulation. Space limitations prevent a detailed discussion, but in a nutshell, under his analysis, some AGREE-probes are not subject to the WC (in his terms, they do not have a 'horizon'). In light of these two points, I contend that it is not at all certain that abandoning the strong WC (and by extension the LEC) is warranted based on a set of limited exceptions, especially given the importance of the strong WC's operation-generality. At the very least, the introduction of improper case into the empirical landscape warrants subjecting the purported exceptions to closer scrutiny.
Fully reconciling this issue is beyond the scope of this paper, but there are, I believe, promising directions towards reanalyzing the exceptions. In what follows, I briefly discuss each exception, the first two of which are from Abels (2007), and sketch how they might be reanalyzed in ways compatible with the LEC.

ECM
In ECM infinitives, it is commonly assumed that the embedded subject moves from inside the embedded TP to a vP-internal position in the matrix clause, as schematized in (88) (Postal 1974). However, according to the strong WC, TP should be a barrier for such movement because T is higher than v in fseq. As such, ECM infinitives appear to pose a challenge for the strong WC. Note that under the WC, the matrix subject can establish a dependent-case relationship with matrix [Spec, vP] or embedded [Spec, TP], so the actual case in ECM is unproblematic.
The classical evidence cited in favor of this analysis is that matrix adverbs and particles may intervene between the embedded subject and the embedded predicate, e.g. with all her heart in (88). However, recent work by Neeleman and Payne (2020) has reevaluated this argument. On the basis of scope-freezing effects and adverb order, they argue that an ECM infinitive does not actually involve moving the embedded subject, as in (88), but rather extraposing part of the embedded clause rightwards, as in (89). If Neeleman and Payne's analysis is on the right track, then ECM infinitives do not pose a problem for the strong WC after all.

Movement over complementizers
In some languages, movement that lands below a complementizer is then able to cross that complementizer to move to a higher clause. To illustrate, consider English topicalization. In an embedded clause, topicalization lands in a position below the complementizer (90a), from which it could be concluded that C is higher than Top in fseq (where TopP represents whatever position topicalization targets). Topicalization can, however, cross an embedded finite clause boundary, moving over a complementizer (90b). If topicalization targets TopP and C is higher than Top, the strong WC incorrectly predicts that CP should be a barrier for movement to TopP, thereby prohibiting topicalization over a complementizer.
This class of exceptions would disappear if complementizers in these languages are analyzed as edge markers that uniformly appear at the clause boundary rather than as real C heads, along the lines of Manetta's (2006, 2011) proposal for Hindi-Urdu ki. The particular implementation of this idea is largely inconsequential for present purposes, but for concreteness, let us assume that these complementizers are elements that merge at the edge of a clause, but do not project, so that the category of the clause remains unchanged. Under such an analysis, a moved element appearing to the right of a complementizer, as in (90a), would not entail that the complementizer corresponds to a projection higher than the landing site of movement. It would therefore not constitute a violation of the strong WC if that movement can also cross the complementizer.

Hyperraising
Several languages have been claimed to allow hyperraising. This phenomenon has been most thoroughly investigated for Bantu languages (e.g. Carstens 2011; Diercks 2012; Carstens and Diercks 2013; Halpert 2015, 2019), so I will center the discussion around them. A representative example of Bantu hyperraising from Lubukusu is given in (91).
'The people seem like they fell' [Carstens and Diercks 2013:100]
Because the WC expressly prohibits hyperraising, if (91) is indeed hyperraising, it is problematic for the WC. However, Carstens and Diercks observe a crucial interaction between hyperraising and complementizers, which suggests that this picture is too simplistic. They report on three Bantu languages: Digo, Lubukusu, and Lusaamia. Digo and Lusaamia crucially do not allow hyperraising over complementizers. On the other hand, some Lubukusu speakers allow hyperraising over complementizers, but only the complementizer mbo and not the agreeing complementizer -li. They analyze this pattern as follows: (i) CPs are generally barriers to hyperraising because they are phases; (ii) finite clauses without complementizers are TPs in Bantu, not CPs; and (iii) mbo in Lubukusu is special in that it is not a phase head, thereby projecting a nonphasal CP that is not a barrier to hyperraising. Under the WC, TP is not a barrier for movement to [Spec, TP], since T is not higher than T in fseq, irrespective of whether the TP is considered finite or nonfinite. Therefore, on Carstens and Diercks's analysis, hyperraising out of complementizer-less clauses is in fact compatible with the strong WC. This leaves mbo-clauses in Lubukusu. Rather than analyzing mbo as a special nonphasal complementizer, I suggest that mbo be analyzed as an edge marker, along the lines discussed above. Like their complementizer-less counterparts, mbo-clauses would then be TPs, and thus A-movement out of them would not violate the WC.
Similar considerations can, I believe, be applied to the other purported cases of hyperraising, such as Brazilian Portuguese (Nunes 2008), Greek (Alexiadou and Anagnostopoulou 2002), and Zulu (Halpert 2015, 2019).

Long-distance agreement
There are several languages that have been reported to allow agreement between a matrix verb and a DP at the edge of an embedded finite clause, e.g. Innu-aimûn (Branigan and MacKenzie 2002), Passamaquoddy (Bruening 2001), and Tsez (Polinsky and Potsdam 2001). This is problematic for the WC because CP should be a barrier to a ϕ-probe on T⁰, since C is higher than T in fseq. However, these instances of long-distance agreement can be reanalyzed in a way compatible with the strong WC: (i) the embedded DP (i.e. the agreement controller) moves to embedded [Spec, CP], (ii) the DP's features percolate up to CP via Spec-Head agreement, and (iii) matrix T⁰ agrees with the CP. This analysis is similar in spirit to Koopman's (2006) analysis of Tsez long-distance agreement, in that there is no direct crossclausal agreement. In terms of the LEC, matrix T⁰ would agree with the CP before the full CP has been embedded (= substituted in); thus, upon embedding the CP, the CP's features must be shared along (or match) its existing AGREE-relations. Similar analyses can, I believe, be extended to wh-agreement in Chamorro and Palauan (e.g. Chung 1982, 1994; Chung and Georgopoulos 1988) and to crossclausal object agreement in Nez Perce (see Deal 2017, who analyses it in terms of covert movement, however).

Sakha accusative subjects
In Sakha, an embedded subject can be assigned dependent case (= accusative) iff the matrix clause has another DP (Baker and Vinokurova 2010). Baker and Vinokurova analyze this pattern in terms of raising: the embedded subject is eligible to move to embedded [Spec, CP], where it may then enter into dependent-case relationships with DPs in the matrix clause. This analysis is problematic for the strong WC (and the Ban on Improper Case) because it involves a DP in [Spec, CP] being the lower DP in a dependent-case pair, which should be impossible (see Sect. 5.4).
erem-mit-im hope-PAST-1SG.SA 'I hoped that you would win today' [Baker and Vinokurova 2010:615]
I argue that so-called accusative subjects in Sakha are actually proleptic arguments: they are base-generated as arguments of the matrix clause and are indirectly linked to an embedded gap via resumption (in the spirit of den Dikken 2017, 2018). As arguments of the matrix clause, they participate in the dependent-case calculus in the matrix clause, and thus are sensitive to the DPs there. Baker and Vinokurova themselves consider a prolepsis analysis of accusative subjects. They claim that while some instances of accusative subjects are indeed prolepsis, there are at least some instances that are not. To support this claim, they show that an accusative subject can be an NPI that would only be licensed in the embedded clause. They argue that this constitutes evidence against adopting a prolepsis analysis across the board for accusative subjects. However, den Dikken (2017, 2018) explicitly argues that prolepsis does in fact allow NPI licensing. (Technically for him, such constructions are complex predicates with no crossclausal syntactic dependencies, the same analysis that den Dikken 2009 gives for Hungarian long focus movement; see Sect. 3.3.) Thus, the NPI facts are compatible with the prolepsis analysis.
Evidence ruling out prolepsis would have to come from other reconstruction data, which are not available in the literature. Crucially, under a prolepsis analysis, Sakha accusative subjects are not problematic for the strong WC, as no crossclausal syntactic dependencies are involved.
As the above discussion shows, it is not at all clear that these phenomena constitute evidence against the strong WC and the LEC. At the very least, in light of improper case, they deserve more attention and careful scrutiny. If the reanalyses sketched above can be sustained, then the LEC can be maintained in its full strength.
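For concreteness, the barrier condition in (87) can be restated as a small executable check. The Python sketch below is purely illustrative and is not part of the analysis itself: the list `FSEQ`, its simplified ordering (V, v, T, C, lowest first), and the function names `is_higher` and `is_barrier` are assumptions introduced here for exposition. It simply recomputes the predictions of the strong WC for the configurations discussed in this section.

```python
# Hypothetical, simplified functional sequence (fseq), listed lowest head first.
# The inventory and ordering are assumptions made for illustration only.
FSEQ = ["V", "v", "T", "C"]


def is_higher(y, x, fseq=FSEQ):
    """Return True if head y is strictly higher than head x in fseq."""
    return fseq.index(y) > fseq.index(x)


def is_barrier(crossed, current, fseq=FSEQ):
    """Strong WC, following (87): within the current XP, an operation may not
    target an element across YP if Y is higher than X in fseq.
    `crossed` is Y (the head of the phrase being crossed);
    `current` is X (the head of the phrase hosting the operation)."""
    return is_higher(crossed, current, fseq)


# Configurations from this section, encoded as (crossed Y, current X).
cases = {
    "movement to [Spec, vP] out of CP": ("C", "v"),
    "ECM-style movement to matrix vP out of TP": ("T", "v"),
    "movement to [Spec, TP] out of TP": ("T", "T"),
    "phi-probe on T across CP": ("C", "T"),
}

for label, (crossed, current) in cases.items():
    verdict = "blocked" if is_barrier(crossed, current) else "allowed"
    print(f"{label}: {verdict}")
```

On this encoding, only movement to [Spec, TP] out of a TP comes out as allowed, which is why the reanalyses sketched above aim to show either that the crossed clause is smaller than it appears or that no crossclausal dependency is involved.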