Surface Velar Palatalization in Polish

This article investigates a palatalization process called Surface Velar Palatalization that turns /k g/ into [kʲ gʲ] before the front vowel e. What would appear to be a trivial rule, k g → kʲ gʲ / __ ɛ, turns out to be a highly complex process. The complexity is caused by several independent factors. First, Surface Velar Palatalization, k g → kʲ gʲ, competes with Phonemic Velar Palatalization, k g → ʧ ʤ. Second, some but not all changes are restricted to derived environments. Third, some suffixes appear to be exceptions to one type of Palatalization but not to the other type. Fourth, /x/ behaves in an ambivalent way by undergoing one but not the other type of Palatalization. Fifth, Palatalization constraints interacting with segment inventory constraints yield different results in virtually the same contexts. I argue that the complexity of Surface Velar Palatalization motivates derivational levels in Optimality Theory. Further, the condition of derived environments is expressed as a constraint that is ranked differently at different levels of evaluation. A historical analysis of Surface Velar Palatalization tells the story of how the process came into being and operated for centuries in an unrestricted way. It subsequently became restricted to derived environments, which led to pronunciation reversals of the historical Duke of York type: gɛ → gʲɛ → gɛ.

This article investigates a palatalization process in Polish called Surface Velar Palatalization, which turns /k g/ into [kʲ gʲ] before the front vowel e. Aside from some cursory remarks in Gussmann (1980) and Rubach (1984), Surface Velar Palatalization has not been discussed in the generative literature to date, so the material is new. What would appear to be a trivial rule, k g → kʲ gʲ / __ ɛ, turns out to be a highly complex but fully regular process. Accounting for Surface Velar Palatalization is therefore a challenge and a test of adequacy for phonological theory.
On the theoretical side, this paper is a contribution to Stratal Optimality Theory (Stratal OT, henceforth) in three ways. First, it provides a new argument for the distinction of levels or strata stemming from the hitherto unexplored role played by segment inventories. Second, it investigates derived environments in Palatalization and postulates that they are best captured as an OT constraint that can be ranked differently at different levels of derivation. Third, a historical study of Surface Velar Palatalization contributes to an understanding of the life cycle of a process that Stratal OT is designed to model.
This article is organized as follows. Section 1 introduces the relevant background facts of Polish phonology (Sect. 1.1) and the assumptions of Stratal OT (Sect. 1.2). Section 2 discusses Phonemic Velar Palatalization, while Sect. 3 provides an OT analysis of Surface Velar Palatalization and related processes, making the point about segment inventory constraints and derived environments. Section 4 looks at the historical development of Surface Velar Palatalization from Old Polish, through Middle Polish, to Modern Polish. Section 5 summarizes the rankings of the constraints and their interaction. Section 6 concludes with a summary of the results. The Appendix extends the analysis to coronal and labial inputs and to i as the trigger of Palatalization.

Background
This section prepares the ground for an analysis of Velar Palatalization. I begin with the presentation of descriptive facts of Polish phonology in the fragment that is relevant for this article. Subsequently, I introduce the assumptions of Stratal OT and the constraint apparatus for an analysis of Palatalization.

Descriptive background
Polish has a rich system of consonants. One reason for this richness is that the distinction between 'hard' and 'soft' consonants cuts without exception across the whole system. According to Wierzchowska (1963:9-11 and 1971:149), hard consonants are pronounced with a tongue body configuration for the back vowel [a] while soft consonants have a tongue body position characteristic for front vowels. Consequently, in terms of features, hard consonants are characterized as [+back] while soft consonants are [-back]. In (1), I look at a fragment of the phonetic inventory that includes coronals and dorsals. For compactness, I list only voiceless obstruents, noting that [x] has a voiced counterpart only as a result of Voice Assimilation. I assume the Halle-Sagey model of Feature Geometry (Halle 1992; Sagey 1986), in which the features [±anterior] and [±strident] are dependents of the CORONAL node, so they are not applicable to dorsals.
The feature theory shown in (1) fails to distinguish between palatalized postalveolar [ʃʲ ʧʲ] and prepalatal [ɕ tɕ]. This issue has been debated in the phonetic literature, notably by Dogil (1990), Halle and Stevens (1997) and Żygis and Hamann (2003). Wierzchowska (1971) and Dogil (1990) note that [ʃ ʒ ʧ ʤ ʃʲ ʒʲ ʧʲ ʤʲ] are pronounced with lip protrusion, which gives them a characteristic hushing quality that distinguishes them from [ɕ ʑ tɕ dʑ]. Sidestepping the exact nature of the relevant phonetic property, I will use the following segment inventory constraints:
In contrast to the consonantal system, the vocalic system of Polish is simple. It includes the high vowels [i ɨ u],³ the mid [ɛ ɔ] and the low [a].⁴ The only complication is that Polish has yers, the renowned Slavic vowels, which exhibit an alternation between e [ɛ] and zero, as in bez 'lilac' (nom.sg.) - bz+y (nom.pl.). Analyzing yers is a perennial problem of Polish phonology. Rubach's (2016) study of the yers has been carried out in the framework of Stratal OT, so it connects with the analysis pursued here in a seamless way. Rubach (2016) turns around the classic analysis of yers (Gussmann 1980; Rubach 1984) and argues that, first, Yer Deletion precedes Yer Vocalization and, second, Yer Deletion is context-sensitive while Yer Vocalization is context-free. Yer Deletion takes place in a CV context, meaning before a single consonant followed by a full vowel.⁵ Yers that have not been deleted 'vocalize' context-freely and become the regular vowel [ɛ]. This is illustrated in (4), where I look at the derivation of bz+y 'lilac' (nom.pl.) and bez (nom.sg.). The yer is transcribed as the capital letter E.
(4) a. Yer Deletion //bEz+ɨ// → [bzɨ]
b. Yer Vocalization //bEz// → [bɛs]
³ For the status of [ɨ] as the underlying segment, see Rydzewski (2017).
⁴ It is unclear if the nasal vowels spelled ę and ą should be analyzed as deriving from strings of an oral vowel and a nasal consonant or whether they should be regarded as underlying segments. See Rubach (1984).
⁵ The term 'full vowel' refers to any vowel that is linked to a mora.
In (4a), the yer is followed by a consonant and a vowel, and hence deletes. This context does not occur in (4b), so Yer Deletion is mute and, consequently, the yer vocalizes. Rubach (2016)  The conclusion is that the yer E and the regular vowel ɛ must be distinct in terms of their underlying representation. There is voluminous literature on how this distinction should be made. The analysis in Rubach (2016) builds on the idea of floating segments. Specifically, the yer E differs from the regular vowel ɛ by lacking a mora. Yer Vocalization is therefore a process that inserts a mora, making the yer, transcribed //E//, indistinguishable from the regular vowel ɛ.
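The two yer processes described above can be paraphrased procedurally. The following is a minimal sketch, not the OT analysis itself: it assumes a toy string encoding in which capital E is the moraless yer, every other vowel symbol is a full vowel, and "+" is a morpheme boundary. The function names are hypothetical, and Final Devoicing (which gives [bɛs] rather than [bɛz]) is deliberately left out.

```python
# Toy encoding (an assumption of this sketch): "E" is a yer, the symbols in
# FULL_VOWELS are moraic vowels, "+" is a morpheme boundary.
FULL_VOWELS = set("aɛiɨɔu")
YER = "E"


def is_consonant(seg):
    """Anything that is neither a full vowel nor a yer counts as a consonant."""
    return seg not in FULL_VOWELS and seg != YER


def yer_deletion(form):
    """Context-sensitive step: delete a yer before one consonant plus a full vowel."""
    segs = [s for s in form if s != "+"]  # boundaries play no role here
    kept = []
    for i, seg in enumerate(segs):
        in_cv_context = (
            seg == YER
            and i + 2 < len(segs)
            and is_consonant(segs[i + 1])
            and segs[i + 2] in FULL_VOWELS
        )
        if not in_cv_context:
            kept.append(seg)
    return "".join(kept)


def yer_vocalization(form):
    """Context-free step: every surviving yer receives a mora and becomes ɛ."""
    return form.replace(YER, "ɛ")


def derive(underlying):
    """Yer Deletion precedes Yer Vocalization."""
    return yer_vocalization(yer_deletion(underlying))


for ur in ("bEz+ɨ", "bEz"):
    print(ur, "->", derive(ur))
```

Because deletion is checked first, vocalization never needs a context: it simply applies to whatever yers are left, which mirrors the ordering argument summarized in the text.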

Theoretical background: Derivational Optimality Theory
Derivational OT (Rubach 1997, 2011, 2016) is a version of Stratal OT (Kiparsky 1997, 2000, 2015; Bermúdez-Otero 1999, 2018). It is different from Stratal OT in one respect only: the assumption is that the grammar by default has four levels or strata. Stratal OT recognizes three levels/strata: the stem level, the word level and the postlexical level. Derivational OT adds a fourth level: the clitic level that is placed between the word level and the postlexical level. The stem level encompasses the root and level 1 affixes. The word level enlarges the domain of analysis by adding level 2 affixes to the structures derived at the stem level. The determination which affixes are level 1 and which are level 2 is a language-specific matter. Similarly, languages may differ in their understanding of what constitutes a clitic structure. For example, Rubach (2016) argues that prefixes in Polish have the status of clitics, hence prefix plus word structures are analyzed at level 3. The postlexical level, level 4, covers the domain of the utterance, analyzing processes that apply across word boundaries. The input to level 1 is the underlying representation, the input to level 2 is the optimal output from level 1, the input to level 3 is the winner from level 2, and the winner from level 3 is the input to level 4. In effect then, the architecture of Derivational OT is cyclic because the derivation proceeds from smaller domains to progressively larger domains: stem → word → clitic phrase → utterance. There is an obvious and actually intended similarity between Derivational OT and Lexical Phonology (Kiparsky 1982; Booij and Rubach 1987). As in Lexical Phonology, at each cycle constraints can look at the structure derived in the previous cycle. However, unlike in Lexical Phonology, there is no prohibition against changing the representations derived in an earlier cycle. Constraints are the same at all levels but their ranking may be different. The principle of reranking minimalism (Rubach 2000b) makes sure that reranking of the constraints occurs only if required by compelling analytical need.
In sum, the grammar is understood as a system of four levels that are connected serially. Each level constitutes an OT 'miniphonology,' which means that it has its own inputs and constraint ranking. The Standard OT's principle of strict parallelism (simultaneous evaluation, no derivational steps) holds inside a level but not across levels since, as just explained, levels are ordered serially.
From the point of view of OT, Slavic Palatalization is driven by markedness constraints that are individualized with regard to the trigger. 9  (Bilodid 1969). I argue in this paper that in a single language, Polish, Palatalization may have different triggers at different levels of derivation. Specifically, PAL-e is active at levels 1 and 2 but inert at levels 3 and 4 while PAL-Glide and PAL-i are active at all levels. Chen (1973) argues that there is an entailment relation between Palatalization rules specified for particular environments, whereby Palatalization before a low vowel entails Palatalization before a mid vowel and Palatalization before a mid vowel entails Palatalization before a high vowel.
c. IDENT-V[-back]: [-back] on the vowel in the input must be preserved on a correspondent of that vowel in the output.
d. IDENT-V[+back]: [+back] on the vowel in the input must be preserved on a correspondent of that vowel in the output.
Palatalization as a strategy of conflict resolution is grounded in phonetics, both articulatory and acoustic. Kochetov (2016) points out that fronting and raising of the tongue body is in conflict with gestures that articulators need to execute to produce consonants with various places and manners of articulation. He further argues that the sequence of a consonant plus a front vowel or a glide is both acoustically and perceptually problematic "as front vowels tend to obscure phonetic cues to place of articulation and induce affrication, ultimately leading to perceptual confusion" (Kochetov 2016:4, see also Ohala 1978; Kawasaki 1982; Guion 1996).
This article analyzes Palatalization of velar consonants in Polish, with a focus on the surface k g → k j g j Palatalization. To keep the presentation within manageable bounds, I look at the operation of PAL-e and ignore  This operation manifests itself in two ways that appear to be contradictory.
There are two kinds of Palatalization that I dub Phonemic Velar Palatalization (8a) and Surface Velar Palatalization (8b). The former makes profound changes by turning velars into strident coronals, k g → ʧ ʤ. The latter executes a minor alteration turning velars into prevelars, k g → kʲ gʲ.
In the case of PAL-e, the repair of the violation in [Cɛ] is implemented as Palatalization, schematically, //Cɛ// → [Cʲɛ], rather than as Vowel Retraction. The reason is that Vowel Retraction acting on /ɛ/ as the input would derive schwa, ɛ → ə / C[+back] __, just as we have i → ɨ / C[+back] __ in the case of PAL-i (see above and Appendix). This action is blocked because schwa does not exist in Polish.
To conclude, PAL-e manifests itself in Polish as Palatalization, not as Vowel Retraction, a generalization that is expressed by the ranking of IDENT-V[-back] higher than PAL-e and IDENT-C[+back]. In what follows, I will not consider Vowel Retraction candidates such as [kə] from the input /kɛ/.
then, as noted, there are two repairs: Palatalization, Ci → Cʲi, and Vowel Retraction, Ci → Cɨ. If the input is /Cʲɨ/, then, there are two other repairs: Depalatalization, Cʲɨ → Cɨ, and Vowel Fronting, Cʲɨ → Cʲi. All of these repairs are attested, albeit not all in a single language. See Rubach (2000a, 2007 and 2017) for discussion.
¹¹ In response to a reviewer's question, PAL-i is discussed in the Appendix.

Phonemic Velar Palatalization
As noted in (8), velars palatalize in two different ways that I call Phonemic Velar Palatalization (k g x → ʧ ʤ ʃ) and Surface Velar Palatalization (k g → kʲ gʲ). This section looks at the former type of Palatalization in an effort to disentangle the two types of changes. HARD is a well-known generalization in Slavic languages (Rubach 2003 (Avanesov 1968). The co-existence of PAL-e and HARD in a single language creates an analytical problem for Standard OT. PAL-e requires agreement in [-back] between the consonant and [ɛ] while HARD bans [-back] stridents. The contradiction is solved by assuming that PAL-e and HARD operate on different levels of derivation, an analysis that is afforded by Derivational OT. Specifically, PAL-e, but not HARD, is active at level 1, so PAL-e is ranked high while HARD is bottom-ranked. At level 2, HARD is reranked above PAL-e, and, consequently, [ʧʲɛ ʤʲɛ ʃʲɛ ʒʲɛ] must yield to [ʧɛ ʤɛ ʃɛ ʒɛ], even though these outputs violate PAL-e.
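The serial connection between levels and the reranking of PAL-e and HARD can be sketched as follows. This is an illustration under loud simplifications, not the paper's tableaux: the toy GEN (`toy_gen`) only toggles palatalization on hushing stridents, the k → ʧ step is taken for granted, faithfulness constraints are omitted, and HARD is implemented according to my reading of the text as a ban on palatalized hushing stridents. All function names are hypothetical.

```python
VOWELS = set("aɛiɨɔu")
HUSHING = set("ʧʤʃʒ")


def segments(form):
    """Split a transcription into segments, attaching ʲ to the preceding consonant."""
    segs = []
    for ch in form:
        if ch == "ʲ" and segs:
            segs[-1] += ch
        else:
            segs.append(ch)
    return segs


def pal_e(form):
    """PAL-e: one violation per non-palatalized consonant directly before ɛ."""
    segs = segments(form)
    return sum(
        1
        for c, v in zip(segs, segs[1:])
        if v == "ɛ" and c[0] not in VOWELS and not c.endswith("ʲ")
    )


def hard(form):
    """HARD (as read here): one violation per palatalized hushing strident."""
    return sum(1 for s in segments(form) if s[0] in HUSHING and s.endswith("ʲ"))


def toy_gen(form):
    """Toy GEN: the form with hard hushing stridents and the form with soft ones."""
    plain = form.replace("ʲ", "")
    soft = "".join(ch + "ʲ" if ch in HUSHING else ch for ch in plain)
    return [plain, soft]


def evaluate(candidates, ranking):
    """Strict domination: compare violation profiles lexicographically."""
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))


LEVELS = [
    ("level 1", [pal_e, hard]),  # PAL-e ranked above HARD
    ("level 2", [hard, pal_e]),  # HARD reranked above PAL-e
]


def derive(form):
    """The winner of each level is the input to the next level."""
    history = []
    for name, ranking in LEVELS:
        form = evaluate(toy_gen(form), ranking)
        history.append((name, form))
    return history


print(derive("rɨʧɛtɕ"))
```

The constraint set is identical at both levels; only the order of the list changes, which is the sense in which each level is its own 'miniphonology'.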
Given the input //k+ɛ//, as in rycz+e+ć [rɨʧ+ɛ+tɕ] 'to scream', a verb derived from ryk 'scream', the analysis must make sure that //kɛ gɛ xɛ// change into /ʧʲ ʤʲ ʃʲ/, and not into some other segments. In particular, it is necessary to exclude /kʲɛ gʲɛ xʲɛ/, which satisfy PAL-e by sharing [-back]. The desired effect is achieved by ranking as undominated the segment inventory constraints in (12).
(12) a. *kʲ: Don't be kʲ.
b. *gʲ: Don't be gʲ.
c. *xʲ: Don't be xʲ.
Since, given the ranking, /kʲɛ gʲɛ xʲɛ/ are not viable winners in the evaluation driven by PAL-e, the question is which segments the inputs //kɛ gɛ xɛ// will change into. The default would be /tʲɛ dʲɛ sʲɛ/ as coronals are less marked than labials and anteriors are better than posteriors. However, the facts of Polish show otherwise. In particular, palatalized coronals are preferably posterior (Rubach 2003), a generalization that is expressed as the following segment inventory constraint. That is, palatalized coronals must be [-anter]. This is exactly what we find in Slovak, where [tʲ] is not just palatalized but also [-anter], which is marked as a minus underneath the t. However, posterior t is not the desired output of Phonemic Velar Palatalization. We need to make sure that the outputs are strident consonants, //k g x// → /ʧʲ ʤʲ ʃʲ/. This is effected by Stridency (Rubach 2007).
The derivation //k g x// → /ʧʲ ʤʲ ʃʲ/ at level 1 is almost ready. The final step is to ensure that posterior stridents are the hushing stridents [ʧʲ ʤʲ ʃʲ] and not the hissing stridents [tɕ dʑ ɕ]. The desired effect is achieved by ranking the segment inventory constraint *[tɕ dʑ ɕ] over *[ʧʲ ʤʲ ʃʲ], which tips the evaluation towards the hushing stridents. Lastly, turning dorsals into coronals violates IDENT-Dor.
(15) IDENT-Dor The node DORSAL on the input segment must be preserved on a correspondent of that segment in the output.
The foregoing discussion is summarized in all essential points by looking at the evaluation of rycz+e+ć //rɨk+ɛ+tɕ// → [rɨʧ+ɛ+tɕ] 'to scream', a verb derived from ryk 'scream'. I look at the relevant fragment of the word. The derivation of brac+ie 'brother' (voc.sg.) continues as follows.
Finally, the operation of PAL-e is restricted to derived environments (DE, henceforth) in the sense that the consonant and the /ɛ/ must span a morpheme boundary. This condition is fulfilled by the examples in (9) and (10) but not by those in (22) below. Morpheme-internal //Cɛ// remains unaffected and surfaces as [Cɛ] in violation of PAL-e. This generalization extends to all types of consonants, not just to dorsals. It is unclear how the DE restriction can be built into an OT analysis and I will not pursue this issue here. One way would be to make reference to a morpheme boundary in the statement of a given PAL constraint. This approach is problematic in three ways. First, Rubach (1984) has shown that Palatalization in all contexts, not just before e, may carry a DE restriction. Translated into the OT framework, this observation would mean that DE needs to be written into three separate constraints: PAL-e, PAL-i and PAL-Glide. Second, the founding assumption of OT is that constraints are universal, so there is no sense in which PAL-e is a Polish constraint. For example, PAL-e belongs just as much to Russian phonology as it does to Polish phonology. Writing DE into the statement of PAL-e would make the prediction that the DE restriction holds for Russian, as it does for Polish. The prediction is wrong because PAL-e freely applies morpheme-internally in Russian, as the following closely minimal contrasts show. I conclude that the putative DE-PAL-e would not be able to deliver the correct results in Russian. The third reason against building DE into particular PAL constraints stems from the observation that the same constraint in the same language can act as a DE generalization at one level but not at another level. This is what happens in the case of PAL-Glide. It is limited to DE at level 1 but not at level 2, where it applies morpheme-internally. I discuss this issue in Sect. 3.
The problems with the DE condition are eliminated if DE is divorced from particular PAL constraints and is stated as a constraint in its own right.
(24) DE-PAL: A [-back] consonant and a front vowel/glide must span a morpheme boundary.
DE-PAL is subject to language-specific ranking, like any other constraint. Similarly, like other constraints, it can be reranked between levels. In Polish, the restriction of PAL-e to derived environments means that the ranking is DE-PAL ≫ PAL-e. In Russian, the ranking is reversed, PAL-e ≫ DE-PAL, meaning that morpheme-internal palatalization is valued more highly than obedience to DE-PAL. The role of DE-PAL in Polish is illustrated by looking at the word seks 'sex' occurring in the loc.sg., whose suffix is //ɛ//. The question is what protects morpheme-internal sequences of a soft consonant and a front vowel from being eliminated in order to satisfy DE-PAL. These are the morphemes that have underlying soft consonants that come historically from the time when PAL-e was not constrained by DE and applied across the board (Rubach 1984). The answer is that the potential adverse action of DE-PAL is thwarted by IDENT-C[-back] that outranks DE-PAL. Thus, there is no danger that, for example, underlying //ɕɛrp//, sierp 'sickle', can lose its palatalization because the consonant and [ɛ] are in the same morpheme and hence violate DE-PAL.
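The effect of ranking DE-PAL above or below PAL-e can be sketched with a small evaluator. This is a toy under stated assumptions, not the paper's tableau: candidates are listed by hand, only palatalization of s is varied, the level 1 soft coronal is written sʲ, and the reverse ranking is applied to the same toy candidates purely to show the logic the text attributes to Russian (it is not a Russian form). The helper names are hypothetical.

```python
VOWELS = set("aɛiɨɔu")
FRONT = {"ɛ", "i", "j"}        # front vowels and the front glide
PREPALATALS = {"ɕ", "ʑ"}       # treated as soft, like Cʲ


def segments(form):
    """Split a transcription into segments, attaching ʲ to the preceding consonant."""
    segs = []
    for ch in form:
        if ch == "ʲ" and segs:
            segs[-1] += ch
        else:
            segs.append(ch)
    return segs


def is_consonant(seg):
    return seg != "+" and seg[0] not in VOWELS


def is_soft(seg):
    return seg.endswith("ʲ") or seg in PREPALATALS


def pal_e(form):
    """PAL-e: a hard consonant directly before ɛ (boundaries are ignored)."""
    segs = [s for s in segments(form) if s != "+"]
    return sum(1 for c, v in zip(segs, segs[1:])
               if v == "ɛ" and is_consonant(c) and not is_soft(c))


def de_pal(form):
    """DE-PAL: a soft consonant and a front vowel/glide with no '+' between them."""
    segs = segments(form)  # '+' is kept, so C+V across a boundary is not adjacent
    return sum(1 for c, v in zip(segs, segs[1:])
               if v in FRONT and is_consonant(c) and is_soft(c))


def ident_c_minus_back(underlying):
    """IDENT-C[-back]: an underlying soft consonant must stay soft."""
    src = segments(underlying)

    def constraint(form):
        return sum(1 for i, o in zip(src, segments(form))
                   if is_soft(i) and not is_soft(o))
    return constraint


def evaluate(candidates, ranking):
    """Strict domination: compare violation profiles lexicographically."""
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))


seks = ["sɛks+ɛ", "sɛksʲ+ɛ", "sʲɛks+ɛ", "sʲɛksʲ+ɛ"]
print(evaluate(seks, [de_pal, pal_e]))   # DE-PAL ≫ PAL-e
print(evaluate(seks, [pal_e, de_pal]))   # PAL-e ≫ DE-PAL

sierp = ["ɕɛrp", "sɛrp"]
print(evaluate(sierp, [ident_c_minus_back("ɕɛrp"), de_pal, pal_e]))
```

Keeping the boundary symbol inside the candidate string is the design choice that lets a single adjacency check express 'must span a morpheme boundary' without writing DE into PAL-e itself.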
The interaction between DE-PAL and PAL-Glide mentioned above creates no difficulty (see the next section). At level 1, DE-PAL ≫ PAL-Glide ensures that the candidate respecting DE is the winner. At level 2, the constraints are reranked, so PAL-Glide has jurisdiction not only at morpheme boundaries but also morpheme-internally. As will be shown in Sect. 4, DE-PAL makes sense not only for present-day synchronic analysis but also for diachronic analysis because it makes it possible to view historical change as constraint reranking.
A  (Shevelov 1979). The innovation can be readily accounted for if DE-PAL outranks IDENT-C[-back] and PAL-i is ranked above IDENT-V[-back]. (28) Recall that PAL-i can be satisfied either by Palatalization, Ci → Cʲi, or by Vowel Retraction, Ci → Cɨ, as in either case the consonant and the vowel agree in [±back]. DE-PAL is violated by (28a) because a palatalized consonant and a front vowel are not separated by a morpheme boundary. The loc.sg. ending in Ukrainian is //i//, so the underlying representation of the loc.sg. form is //zʲim+i//. The word illustrates the two effects of DE-PAL: Vowel Retraction morpheme-internally, i → ɨ, and Palatalization at a morpheme boundary, The winner [zɨmʲ+i] is exactly the attested surface representation in Ukrainian. I conclude that DE-PAL as a constraint is supported not only by the absence of Palatalization morpheme-internally, as in Polish (26), but also by Depalatalization, as in Ukrainian (28)-(29).
In sum, DE effects in Palatalization are captured by DE-PAL, a new constraint. Evidence for level distinction and hence for Derivational OT is drawn from the ranking paradoxes displayed by the segment inventory constraints: *tɕ, *ʧʲ, POSTER and STRID.

Surface Velar Palatalization
The interaction between DE-PAL and PAL-e accounts for the absence of palatalization in (30), where the consonant and /ɛ/ occur inside one morpheme, as in ser [sɛr] 'cheese'.
The problem is that the DE restriction cannot account for the absence of palatalization in the instr.sg. forms in (31) below. The examples are the same words as in (10), but this time we look not only at the voc.sg. but also at the instr.sg.
²² An objection can be raised that the assignment of affixes to different levels is arbitrary. This is true and it reflects different historical sources that merged in the evolution of the language. The point is that Derivational OT has the resources to give a formal account of differences in the behavior of various affixes.
A reviewer points out that Standard OT could account for the behavior of level 2 suffixes by resorting to indexed constraints (Pater 2008). This is true, for example, an Output-Output faithfulness constraint OO-IDENT-Dor could be co-indexed with the instr.sg. suffix //Em//. The drawback of this solution is that it opens the way to analyses in which potentially every affix could have a phonology of its own, which hugely weakens the restrictiveness of the theory. Derivational OT assigns the problematic suffixes to a level and they cannot choose to which constraints they wish to be available. I conclude that level assignment is a more restrictive mechanism than Standard OT's indexed constraints and OO-faithfulness.
The question regarding the absence of palatalization returns at level 2, but the action of PAL-e is now severely curtailed. The curtailment is effected by the high ranking of the segment inventory constraints: *SOFT-Lab, banning palatalized labials, *SOFT-Coron, prohibiting [-back] coronals, and *xʲ, banning the palatalized velar fricative.
(34) a. Level 1: *kʲ, *gʲ, *xʲ ≫ PAL-e ≫ *SOFT-Lab, *SOFT-Coron
b. Level 2: *SOFT-Lab, *SOFT-Coron, *xʲ ≫ PAL-e ≫ *kʲ, *gʲ
At level 1, it is more important to obey PAL-e than the segment inventory constraints against soft labials and coronals (34a), so we have Palatalization of labials and coronals, exemplified in (10), such as chłop 'man' - chłop+ie [pʲ+ɛ] (voc.sg.) and brat 'brother' - brac+ie [bratɕ+ɛ] (voc.sg.). At level 2, greater value is placed on obeying the segment inventory constraints in (34b) than obeying PAL-e, so Palatalization of labials, coronals and /x/ before /ɛ/ is blocked, as required by the data in (31)-(33). At level 2, velar stops palatalize in a surface manner, k g → kʲ gʲ, because the option of changing /k g/ to [ʧ ʤ] is closed by reranking IDENT-Dor to an undominated position. In (35), I look at the evaluation of krok+iem 'step' (instr.sg.) at level 2. Recall that //ɛm// is a level 2 suffix, so it was not available on level 1.
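The two rankings in (34) can be tried out on hand-listed candidates. The sketch below is an approximation under explicit assumptions: constraints grouped by a comma in (34) are flattened into an arbitrary total order, IDENT-Dor is placed among the undominated constraints at level 2 as the text states, it is simply omitted at level 1, and the candidate sets and helper names are hypothetical.

```python
VOWELS = set("aɛiɨɔu")
LABIALS = set("pbfvm")
CORONALS = set("tdsznrl")
VELARS = set("kgx")


def segments(form):
    """Split a transcription into segments, attaching ʲ to the preceding consonant."""
    segs = []
    for ch in form:
        if ch == "ʲ" and segs:
            segs[-1] += ch
        else:
            segs.append(ch)
    return segs


def pal_e(form):
    """PAL-e: a hard consonant directly before ɛ (boundaries are ignored)."""
    segs = [s for s in segments(form) if s != "+"]
    return sum(1 for c, v in zip(segs, segs[1:])
               if v == "ɛ" and c[0] not in VOWELS and not c.endswith("ʲ"))


def star(segment):
    """Segment inventory constraint *X: one violation per occurrence of X."""
    return lambda form: segments(form).count(segment)


def star_soft(place):
    """*SOFT-Lab / *SOFT-Coron: one violation per palatalized segment of that place."""
    return lambda form: sum(1 for s in segments(form)
                            if s[0] in place and s.endswith("ʲ"))


def ident_dor(underlying):
    """IDENT-Dor: an input velar must correspond to an output velar."""
    src = segments(underlying)
    return lambda form: sum(1 for i, o in zip(src, segments(form))
                            if i[0] in VELARS and o[0] not in VELARS)


def level1(underlying):
    return [star("kʲ"), star("gʲ"), star("xʲ"), pal_e,
            star_soft(LABIALS), star_soft(CORONALS)]


def level2(underlying):
    return [star_soft(LABIALS), star_soft(CORONALS), star("xʲ"),
            ident_dor(underlying), pal_e, star("kʲ"), star("gʲ")]


def evaluate(candidates, ranking):
    """Strict domination: compare violation profiles lexicographically."""
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))


print(evaluate(["xwɔp+ɛ", "xwɔpʲ+ɛ"], level1("xwɔp+ɛ")))                 # chłop+ie
print(evaluate(["lɔs+ɛ", "lɔsʲ+ɛ"], level1("lɔs+ɛ")))                    # los+ie
print(evaluate(["krɔk+ɛm", "krɔkʲ+ɛm", "krɔʧ+ɛm"], level2("krɔk+ɛm")))   # krok+iem
print(evaluate(["pas+ɛm", "pasʲ+ɛm"], level2("pas+ɛm")))                 # pas+em
```

The same constraint functions serve both levels; the opposite outcomes for coronals at level 1 and level 2 come only from where PAL-e sits relative to the inventory constraints.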
The blocking effect of the reranked constraints in (34b) is illustrated by pas+em [pasɛm] 'belt' (instr.sg.).
²² The distinction between level 1 and level 2 affixes has a time-honored tradition in generative phonology. The idea is attributed to Siegel (1974). It played an important role in Lexical Phonology and was imported into OT by Benua (1997). The novelty of the analysis is that this distinction is applied to Polish, a solution that has never been proposed before.
There is no danger that the high-ranked *SOFT-Coron can undo the work done by PAL-e at level 1, as in, for example, los 'lot' (nom.sg.) - los+ie [lɔɕ+ɛ] (loc.sg.), which leaves level 1 with the representation /lɔsʲ+ɛ/. The integrity of the palatalized /sʲ/ or [ɕ] is guarded by IDENT-C[-back] that is undominated at both level 1 and level 2.
The relationship between DE-PAL and PAL-e is copied unchanged from level 1 to level 2, which means that the ranking is DE-PAL ≫ PAL-e. The effect of the ranking is that the actual jurisdiction of PAL-e is limited to derived environments. This is exactly correct, as morpheme-internally, we find [kɛ gɛ] rather than [kʲɛ gʲɛ]. The consequence of limiting PAL-e to derived environments, DE-PAL ≫ PAL-e, is that morpheme-internal [kʲ gʲ] are not derivable from //k g// any more and hence must be underlying segments, as the following sample of examples illustrates.
(40)
kiedy [kʲɛdɨ] = //kʲɛdɨ// 'when'
zgiełk [zgʲɛwk] = //zgʲɛwk// 'turmoil'
kierat [kʲɛrat] = //kʲɛrat// 'treadmill'
giemz+a [gʲɛmz+a] = //gʲɛmz+a// 'chamois'
kier [kʲɛr] = //kʲɛr// 'hearts'
giermek [gʲɛrmɛk] = //gʲɛrmEk// 'page'
Similarly as in (26) and (37)
As explained in Sect. 1.1, yers, transcribed //E//, are floating melodic segments at the underlying level, so Yer Vocalization is a process of mora insertion. Prior to mora insertion, the yer is a melodic segment that cannot erect a syllable itself precisely because it lacks a mora. This is exactly the structure that characterizes glides. In autosegmental theory, the glide [j], for example, is the vocalic melodic segment [i] that lacks a mora. To conclude, the yer is structurally identical to a glide and, consequently, must be treated as a glide. The difference between //j// and //E// is that //j// is a high front glide while //E// is a mid front glide. The conclusion that matters for the analysis in this article is that Palatalization caused by yers falls under the jurisdiction of PAL-Glide, and not PAL-e. The data in (42)  Since, as already said, yers are formally glides, the responsibility for inducing Palatalization rests with PAL-Glide, not with PAL-e. This is fortunate because Palatalization before the yer //E// and Palatalization before the regular vowel //ɛ// exhibit different behaviors. The latter, that is, PAL-e, applies in derived environments and is blocked morpheme-internally: recall the data in (33) and (38) (45), so /E/ survives in the optimal candidate at level 2. The winner from level 2, /kʲEp/, is the input to level 3, at which yers vocalize, that is, obtain a mora and hence become nondistinct from the regular vowel [ɛ]: /kʲEp/ → [kʲɛp]. The gen.sg. form kp+a 'fool', from underlying //kEp+a//, loses its yer at level 2 because E is followed by a consonant (here p) and a vowel (here a). Deleting the yer violates MAX-Seg that militates against deletion.
The analysis is correct as [kpa] is the attested surface form. It is worth underscoring that there is an advantage to doing Surface Velar Palatalization at level 2 rather than at level 1. The reason is that yers are deleted at level 2, which allows us to uphold the generalization that /k g/ are palatalized before yers but only before those yers that ultimately vocalize and are turned into [ɛ] in the surface representation. If Surface Velar Palatalization were done at level 1, hence prior to Yer Deletion, it would be necessary to postulate a Depalatalization constraint to depalatalize the stops at level 2. That is, //kEp+a// → /kʲEp+a/ at level 1 and /kʲEp+a/ → /kʲp+a/ → [kpa] at level 2. The Depalatalization step and the Depalatalization constraint are not necessary if Surface Velar Palatalization is executed at level 2, as proposed in this paper.
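The argument about where Surface Velar Palatalization is located can be paraphrased procedurally. The sketch below is not an OT evaluation: inside a level the text assumes simultaneous evaluation, whereas here the outcome of level 2 is imitated by applying deletion and then palatalization. It assumes the toy encoding in which capital E is a yer, and the function names (`proposed`, `counterfactual`) are hypothetical labels for the two scenarios compared in the text.

```python
import re

FULL_VOWELS = set("aɛiɨɔu")
YER = "E"


def yer_deletion(form):
    """Delete a yer that stands before one consonant plus a full vowel."""
    segs = [s for s in form if s != "+"]
    kept = []
    for i, seg in enumerate(segs):
        deletes = (
            seg == YER
            and i + 2 < len(segs)
            and segs[i + 1] not in FULL_VOWELS and segs[i + 1] != YER
            and segs[i + 2] in FULL_VOWELS
        )
        if not deletes:
            kept.append(seg)
    return "".join(kept)


def palatalize_before_yer(form):
    """k g -> kʲ gʲ directly before a yer (the PAL-Glide effect at level 2)."""
    return re.sub(r"([kg])(?=E)", r"\1ʲ", form)


def vocalize(form):
    """Level 3: surviving yers receive a mora and surface as ɛ."""
    return form.replace(YER, "ɛ")


def proposed(underlying):
    """Palatalization at level 2, i.e. only before yers that survive deletion."""
    level2 = palatalize_before_yer(yer_deletion(underlying))
    return vocalize(level2)


def counterfactual(underlying):
    """Palatalization already at level 1, with no Depalatalization at level 2."""
    level1 = palatalize_before_yer(underlying.replace("+", ""))
    return vocalize(yer_deletion(level1))


for ur in ("kEp", "kEp+a", "ʦukEr"):
    print(ur, proposed(ur), counterfactual(ur))
```

For //kEp+a// the second scenario strands a palatalized stop before a consonant, which is the situation that would call for the extra Depalatalization constraint.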
PAL-Glide operates not only at level 2 but also at level 1, as the following examples show. The suffix -ek is a diminutive morpheme. The diminutive suffix -ek contains a yer because e alternates with zero in (47). The representation is therefore //Ek// and the palatalization effects are the action of PAL-Glide, not of PAL-e. The point is that the changes driven by PAL-Glide and the associated constraints are different at level 1 and at level 2.
(48) Level 1: k g x → ʧ ʤ ʃ / __ E, as shown in (47)
Level 2: k g → kʲ gʲ / __ E, as shown in (42)
Further, the operation of PAL-Glide at level 1 leads to opacity in the surface representation. We see this in the gen.sg. forms in (49), for example, bocz+k+a, the gen.sg. of bocz+ek 'side' (dimin.). It should also be noted that PAL-Glide is constrained by DE-PAL at level 1, that is, its jurisdiction is limited to derived environments, as in bocz+ek, //bɔk+Ek// → /bɔʧʲ+Ek/. It is imperative that morpheme-internal structure is not within the reach of PAL-Glide because morphemes such as kiep //kEp// 'fool' and cukier //ʦukEr// 'sugar' must escape the //k// → /ʧʲ/ Palatalization and must emerge unscathed from level 1: /kEp/, not */ʧʲEp/ and /ʦukEr/, not */ʦuʧʲEr/. At level 2, on the other hand, the objective is reversed: PAL-Glide must be able to look into morphemes: /kEp/ → /kʲEp/ and /ʦukEr/ → /ʦukʲEr/.
The activity of PAL-e and PAL-Glide is different at different levels. PAL-e loses force at both the clitic phrase level (level 3) and the postlexical level (level 4). That is, the outputs are selected as optimal even if they violate PAL-e. This is a very different situation from that found at level 2: PAL-Glide applies to all consonants in an unconstrained way. The generalization is captured by reranking PAL-Glide to an undominated position. To conclude, as (53) shows, a large number of segment inventory constraints determining admissible inventories are ranked differently in the word domain (level 2) and in the phrase domain (levels 3 and 4). The differences in ranking cannot be derived from Standard OT's opacity theories since these theories are not equipped to handle such differences. I conclude that the segment inventory constraints in (53) constitute an argument for levels and hence for Derivational OT.
In the following section, I look at PAL-e from a historical perspective, focusing on k g → kʲ gʲ, the effects of Surface Velar Palatalization.

A historical perspective
The absence of Surface Velar Palatalization morpheme-internally raises the question of whether the process might have been limited to derived environments, which would explain its absence morpheme-internally. Limitation to derived environments is commonplace in Slavic phonology (see Rubach 1984). The answer is negative: Surface Velar Palatalization did not apply in derived environments. This is the conclusion of Stieber (1973:67-68), who evaluated scribal practices in Old Polish. The generalization is clear: [kʲ gʲ] did not exist as phonetic segments in Old Polish.²⁹ The absence of [kʲ gʲ] means that the segment inventory constraints *kʲ, *gʲ were undominated and blocked Palatalization from all PAL constraints. According to Stieber (1973) and Długosz-Kurbaczowa and Dubisz (2006) In terms of the constraint system, the change between Old Polish and Middle Polish is a matter of reranking of the segment inventory constraints *kʲ, *gʲ.
(58) Old Polish: *k j *g j ≫ PAL-e
Middle Polish: PAL-e ≫ *k j *g j

The word kiedy 'when' serves as an example illustrating the changes.

29 This is generally true, not only for the [k j g j ] that would have been derived in the context of e. The other potential source of phonetic [k j g j ] would be PAL-i, but until the 16th c. [k g] could not be followed by [i]; for example, today's gibk+i [g j ipk j i] 'bending' was gybk+y [g1pk1]; see Stieber (1973) for discussion.

30 The examples here and in (57)

shows that PAL-e applies productively not only at morpheme boundaries, as in krok+iem 'step' (instr. sg.) and rog+iem 'horn' (instr. sg.) but also inside morphemes. To achieve this result, PAL-e must outrank DE-PAL, so that the derived environment restriction has no force. 32

The situation just described changes in the middle of the 20th century. First, new borrowings are not assimilated by palatalizing k, g before e. Second, we witness pronunciation reversals on a huge scale. The words in (66), for example, legend+a 'legend', illustrate Duke-of-York derivations (Pullum 1976), but in a diachronic rather than synchronic dimension. Sławski (1952) states that legenda is a 16th c. borrowing of the mediaeval Latin legenda 'story of saints'. At the point of borrowing, the ge in legenda was pronounced [gE]. In 1903 exactly this word is cited by Kryński (1903)

Pronunciation reversals do not constitute a problem for a theory that distinguishes between underlying and surface representations. Looking at legenda as an example, the Latin legenda enters Polish with //gE// because [gE] is the surface representation in the 16th c., [lEgEnd+a] = //lEgEnd+a//. The pronunciation [lEg j Enda] is an effect of Surface Velar Palatalization, a rule that develops in the 16th/17th c. Since the derivation g → g j is fully predictable from Surface Velar Palatalization, the underlying representation is underspecified for Palatalization, //lEgEnd+a//. In the 20th c. Surface Velar Palatalization changes status and becomes restricted to derived environments. Consequently, it can no longer affect morpheme-internal //gE// and hence legenda starts surfacing as [lEgEnda], with hard [gE]. Some words such as giermek [g j ErmEk] 'page' and kiedy [k j Ed1] 'when' restructured their underlying representations from //gE// and //kE// to //g j E// and //k j E//, respectively, because there were no alternations between [k j g j ] and [k g]. The restructuring process was lexically diffused and proceeded in an item-by-item manner, so, for example, the restructuring occurred in giermek //g j ErmEk// 'page' but not in legenda //lEgEnd+a// 'legend'. When, in the 20th c.,

34 It is difficult to establish at which point exactly the [g j E] pronunciation started occurring. What we know as a fact is that in 1903 [g j E] was attested. Most probably, the [g j E] pronunciation existed already in the 16th/17th c. because this is the time when Surface Velar Palatalization entered Polish as a rule.
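The three historical stages can be illustrated with a minimal sketch of OT evaluation by strict domination. Everything in it is a simplification introduced for illustration: the constraint functions, the two-candidate sets and the names `Candidate`, `RANKINGS` and `winner` are hypothetical, and DE-PAL is modeled as penalizing palatalization that does not span a morpheme boundary. The position of DE-PAL in the Old Polish and Middle Polish rankings is an assumption; the text only fixes *k j *g j ≫ PAL-e for Old Polish and PAL-e above both *k j *g j and DE-PAL for Middle Polish.

```python
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    label: str     # "hard" = velar retained as [k g], "soft" = palatalized velar
    soft: bool     # True if the velar is palatalized in this candidate
    derived: bool  # True if velar + e spans a morpheme boundary


Constraint = Callable[[Candidate], int]


def pal_e(c: Candidate) -> int:
    """PAL-e (simplified): a hard velar before e incurs one violation."""
    return 0 if c.soft else 1


def star_kj_gj(c: Candidate) -> int:
    """Inventory constraints *kj, *gj: a palatalized velar incurs one violation."""
    return 1 if c.soft else 0


def de_pal(c: Candidate) -> int:
    """DE-PAL (assumed formulation): palatalization outside a derived
    environment incurs one violation."""
    return 1 if c.soft and not c.derived else 0


# Rankings from highest to lowest. Only the relations named in the lead-in
# are taken from the text; the remaining placements are assumptions.
RANKINGS = {
    "Old Polish": [star_kj_gj, pal_e, de_pal],
    "Middle Polish": [pal_e, star_kj_gj, de_pal],
    "Modern Polish": [de_pal, pal_e, star_kj_gj],
}


def winner(derived: bool, ranking: List[Constraint]) -> str:
    """Return the label of the optimal candidate under strict domination."""
    candidates = [
        Candidate("hard", False, derived),
        Candidate("soft", True, derived),
    ]
    # Comparing violation profiles lexicographically implements strict
    # domination: a higher-ranked constraint always decides first.
    best = min(candidates, key=lambda c: tuple(con(c) for con in ranking))
    return best.label


for period, ranking in RANKINGS.items():
    print(period,
          "| morpheme-internal (legenda):", winner(False, ranking),
          "| derived (rog+iem):", winner(True, ranking))
```

Under these assumptions the morpheme-internal velar of legenda comes out hard, then soft, then hard again across the three stages, while the derived velar of rog+iem stays soft once PAL-e outranks the inventory constraints.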
These pronunciation reversals highlight a new theoretical point. Namely, they show that reversals are possible in the absence of alternations. Kiparsky's (1973) conclusion that reversals may occur only in instances of alternation is therefore too restrictive. 35

The story of ke, ge is relevant for the understanding of the life cycle of a process (Baudouin de Courtenay 1894; Hyman 1976; Bermúdez-Otero 1999; Kiparsky 2013). The idea of the life cycle is that a phonological process starts in phonetics and, when phonologized, begins to climb up the strata of Derivational/Stratal OT, beginning with the postlexical stratum (Bermúdez-Otero 2013). At late stages in the life cycle, the process is morphologized, stops being productive and finally expires. The story of ke, ge makes two additions to this understanding of the life cycle. First, a process may begin at level 2 rather than at level 4. 36 Second, evolution of a process goes through the stage at which it develops a DE restriction. At that stage, the process is still extremely regular and exceptionless, as exemplified by Surface Velar Palatalization, but its jurisdiction has narrowed down to derived environments.
A further point that the story of ke, ge brings out is the characterization of Palatalization in terms of PAL constraints. The DE restriction has affected PAL-e but not PAL-Glide and PAL-i (see the Appendix below), which continue to apply morpheme-internally as in //kEp// → /k j Ep/ 'fool' (and /k j Ep/ → [k j Ep] at level 3), //dialekt// → [d j jalEkt] 'dialect'. This difference in the behavior of PAL-e and PAL-Glide (as well as PAL-i) shows that PAL-e is an independent generalization that should not be collapsed with the other PAL constraints.
Finally, the development of //k j g j // as underlying segments is an example of phonologization in the sense of Kiparsky (2013). The observation is that secondary split is effected here not through the destruction of the environment but through the change of status of the process from across-the-board-application to derived environments.

Summary of interactions
This section summarizes the interaction of the constraints and compares their ranking at different levels. Before summarizing the ranking, I illustrate schematically the changes at a given level by looking at the alteration of segments in four relevant contexts: before //E// and before //E//, across morpheme boundaries and inside morphemes. I look at velars (represented by k), coronals (represented by t) and labials (represented by p).

35 Kiparsky (1968, 1973) has argued that the loss of Final Devoicing in Yiddish resulted in the pronunciation of the morpheme weg 'away' as [vEk]

36 Evidence for postlexical processes is notoriously difficult to find in historical phonology, but some insight can be gleaned from the inspection of modern languages. As far as I know, no Slavic language has Palatalization before e across word boundaries. This is true also for Russian, which in general has robust Palatalization triggered by e in derived and non-derived environments.

(68)
Level 1
input: k+E t+E p+E k+E t+E p+E kE tE pE kE tE pE
output: Ù j +E t j +E p j +E Ù j +E t j +E p j +E kE tE pE kE tE pE

Phonemic Velar Palatalization, k → Ù j , is executed by ranking IDENT-Dor low and *k j , *g j , *x j high, so that velars have to change into coronals. The output must be a [-anter] affricate, which is enforced by POSTER (13) and STRID (14). The details of the ranking are as follows.

, *SOFT-Coron, *SOFT-Lab guarantees that Palatalization of hard consonants rather than Retraction of front vowels is the response to PAL constraints at level 1. PAL-e and PAL-Glide palatalize not only velars but also coronals and labials, so PAL-e, PAL-Glide ≫ *SOFT-Coron, *SOFT-Lab. Palatalization is limited to derived environments, hence DE-PAL ≫ PAL-e, PAL-Glide. Velars //k g x// palatalize to soft postalveolars /Ù j Ã j S j / rather than to /k j g j x j /, so *k j *g j *x j outrank *SOFT-Coron, HARD and IDENT-Dor. Dentals //t d s z// go to /t j d j s j z j / at level 1, so IDENT[+anter], IDENT[strid] dominate POSTER and STRID. The outputs of Phonemic Velar Palatalization are the postalveolars /Ù j Ã j S j / rather than the prepalatals /tC dý C ý/, which means that the constraint against the latter is ranked above the constraint against the former: *tC ≫ *Ù j .

Level 2 segmental changes do not include palatalized labials because [p j +E] derived at level 1 is the final output. Morpheme-internal [pE] is also the attested surface form. The inputs with yers, for instance, /p j +E/ either delete the yer (when /E/ is followed by a consonant and a full vowel) or leave the /E/ intact, so that it can vocalize at level 3: E → E.

(70)
Level 2
input: Ù j +E t j +E Ù j +E t j +E k+E 37 kE tE kE tE
output: Ù+E tC+E Ù+E tC+E k j +E kE tE k j E tE

Level 2 witnesses Hardening, Ù j → Ù, so HARD is reranked to an undominated position. Velars palatalize in a surface manner, k → k j , before level 2 suffixes, that is, in DE contexts. The yer /E/ effects k → k j in non-DE environments, that is, morpheme-internally. The absence of k → Ù j is accounted for by reranking IDENT-Dor to an undominated position. Finally, palatalized coronals are spelled out as prepalatals, t j → tC. The details are as follows.

, *k j *g j , *Ù j *tC.
As noted, level 2 exhibits some dramatic changes. First, Phonemic Velar Palatalization is closed as an option, so IDENT-Dor is reranked to an undominated position. PAL-e and PAL-Glide palatalize velar stops in a surface manner, /k g/ → [k j g j ], but the velar fricative /x/ does not palatalize at all, so *k j *g j , but not *x j , are bottom-ranked at level 2. HARD, inducing /Ù j Ã j S j / → [Ù Ã S], is obeyed without exception, so HARD is reranked to an undominated position, crucially, above IDENT-C[-back]. IDENT-C[-back] yields to HARD but not to *SOFT-Coron, *SOFT-Lab and, therefore, it continues to outrank these constraints. PAL-Glide is no longer limited to derived environments and applies morpheme-internally, as in /kEp/ → /k j Ep/ 'fool'. Consequently, PAL-Glide is reranked above DE-PAL. To induce /k g/ → [k j g j ], PAL-Glide must outrank *k j *g j . PAL-e and PAL-Glide affect velars, /k g/ → [k j g j ], but not coronals or labials, so *SOFT-Coron, *SOFT-Lab ≫ PAL-e, PAL-Glide. Since input soft coronals and labials do not lose their [-back] property, IDENT-C[-back] must continue outranking *SOFT-Coron, *SOFT-Lab. The enhancement spell-out operation /t j d j s j z j / → [tC dý C ý] is effected at level 2, hence the grip of IDENT[+anter] and IDENT[-strid] must be released by reranking POSTER and STRID above these faithfulness constraints. The spell-out produces prepalatals [tC dý C ý] rather than postalveolars [Ù j Ã j S j Z j ], so the level 1 ranking *tC ≫ *Ù j is reversed at level 2: *Ù j ≫ *tC.

Level 3 segmental changes are all allophonic and occur before the glide [j] 40 since the yer (the glide /E/) now vocalizes as a full vowel [E], so it is no longer a glide.
(72) Level 3
input: Ù+E tC+E p j +E kj tj pj
output: Ù+E tC+E p j +E k j j t j j p j j

The details of the ranking are as follows.

dominated and surface-true because the glide causing Palatalization surfaces overtly in the phonetic representation. Third, Palatalization is of the secondary articulation type, C → C j , and, unlike at levels 1 and 2, does not lead to the change of place or manner of articulation. Fourth, even HARD, which is exceptionless and undominated at level 2, now yields to PAL-Glide: /Ù Ã S/ occurring before [j] end up as soft [Ù j Ã j S j ]. The reason is that [j], which is an [i] at the melodic tier, is protected by the undominated IDENT-V[-back], so the disagreement in [±back] in inputs such as /Ù j/, as in zobacz je 'see them', is resolved in favor of Palatalization at the expense of violating HARD: /Ù j/ → [Ù j j]. The segment inventory constraints prohibiting soft consonants (here: *SOFT-Coron, *SOFT-Lab, *k j *g j *x j ) are bottom-ranked, so Palatalization derived by PAL-Glide prevails. DE-PAL plays no role, so it is bottom-ranked.
For the fragment of Polish phonology discussed in this article, levels 3 (clitic level) and 4 (postlexical sentence level) have the same ranking, so (73) is true for both of these levels.
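The ranking arguments stated in prose for levels 1 and 2 can be collected into pairwise dominance relations and checked mechanically. The sketch below is an illustration, not the analysis itself: the names `LEVEL1`, `LEVEL2`, `pairs`, `linearize` and `reversals` are hypothetical, only relations spelled out in the running text are encoded, and transitive consequences are not computed. IDENT[strid] is left out because the text labels it differently at the two levels (IDENT[strid] vs. IDENT[-strid]).

```python
from graphlib import TopologicalSorter
from itertools import product

# Each entry: (dominating constraints, dominated constraints), taken from
# the prose summary of the level. Constraint names use the ASCII
# transliteration of the source text.
LEVEL1 = [
    (["DE-PAL"], ["PAL-e", "PAL-Glide"]),
    (["PAL-e", "PAL-Glide"], ["*SOFT-Coron", "*SOFT-Lab"]),
    (["*kj", "*gj", "*xj"], ["*SOFT-Coron", "HARD", "IDENT-Dor"]),
    (["IDENT[+anter]"], ["POSTER", "STRID"]),
    (["*tC"], ["*Ùj"]),
]
LEVEL2 = [
    (["HARD"], ["IDENT-C[-back]"]),
    (["IDENT-C[-back]"], ["*SOFT-Coron", "*SOFT-Lab"]),
    (["PAL-Glide"], ["DE-PAL", "*kj", "*gj"]),
    (["*SOFT-Coron", "*SOFT-Lab"], ["PAL-e", "PAL-Glide"]),
    (["POSTER", "STRID"], ["IDENT[+anter]"]),
    (["*Ùj"], ["*tC"]),
]


def pairs(arguments):
    """Expand grouped ranking arguments into a set of (higher, lower) pairs."""
    return {(hi, lo) for his, los in arguments for hi, lo in product(his, los)}


def linearize(arguments):
    """Return one total order consistent with the stated dominance pairs.

    graphlib raises CycleError if the stated pairs contradict each other.
    """
    sorter = TopologicalSorter()
    for hi, lo in sorted(pairs(arguments)):
        sorter.add(lo, hi)  # hi is a predecessor of lo, so hi comes first
    return list(sorter.static_order())


def reversals(earlier, later):
    """Pairs ranked one way at the earlier level and the opposite way later."""
    later_pairs = pairs(later)
    return sorted((hi, lo) for hi, lo in pairs(earlier) if (lo, hi) in later_pairs)


print("Level 1:", " >> ".join(linearize(LEVEL1)))
print("Level 2:", " >> ".join(linearize(LEVEL2)))
for hi, lo in reversals(LEVEL1, LEVEL2):
    print(f"reversed at level 2: {hi} >> {lo}")
```

The printed linear orders are only one of several orders compatible with each partial ranking. The list of reversals is the part that bears on the argument: it makes explicit which directly stated level 1 relations are inverted at level 2.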
The winner /sOÙ j +Ek/ enters level 2, at which it picks up the instr.sg. suffix -em //Em//, so the input to level 2 is /sOÙ j +Ek+Em/. Recall from Sect. 3 that level 2 has an active Yer Deletion constraint that deletes yers if they are followed by a single consonant and a full vowel, such as /E/. 41 The /E/ creates a context for Palatalization. The relevant constraint is PAL-e.

Conclusion
Palatalization of velars in Modern Polish takes on two different guises: k g x → Ù Ã S and k g → k j g j . In both cases the drivers are PAL-e and PAL-Glide. The effects, [Ù Ã] and [k j g j ], respectively, are incompatible and cannot be derived in a strictly parallel manner postulated by Standard OT. They can, however, be accommodated by Derivational/Stratal OT that incorporates derivational levels. The distinction of levels is motivated further by segment inventory constraints that lead to ranking paradoxes on a massive scale in Standard OT. A diachronic look at Surface Velar Palatalization shows the life cycle of the process. The analysis makes two points: first, a phonological process may enter the language by starting at level 2 rather than at level 4 and, second, a typical stage in the evolution of the process is the stage at which we see a restriction to derived environments. These are best modeled by postulating a separate constraint, DE-PAL, that is satisfied by structures spanning a morpheme boundary. In such an analysis, it is unproblematic that the derived environment generalization may behave differently in different domains, that is, at different levels. DE-PAL makes correct predictions not only for an analysis of Polish, which exhibits Palatalization, but also for an analysis of Ukrainian, which exhibits Vowel Retraction, the reverse of Palatalization.
Derivational levels or strata show certain general characteristics. Levels 1 and 2, both of which are lexical (stem domain and word domain, respectively) are the locus of morphophonemic generalizations. Level 1 and level 2 are dramatically different, which is reflected in the number of constraint rerankings between these two levels.
As we move to level 3, the reranking of constraints diminishes significantly and, in the fragment of Polish phonology discussed here, it drops to zero at the transition between the postlexical levels: 42 the clitic level and the sentence level. The postlexical levels exhibit Palatalization of the surface (allophonic) type: consonants become [-back] but their original manner and place of articulation are retained, so k → k j is a typical postlexical process while k → Ù, with its change of place (velar → posterior coronal) and manner of articulation (stop → affricate) is a typical lexical process. When, in addition to the postlexical levels, k → k j occurs also at a lexical level, as it does in Polish (level 2), it is a transparent process in the sense that the trigger of Palatalization is present in the surface representation, as in cukier [ţuk j Er] 'sugar'. The k → Ù process is not restricted in this way, so we see [Ù] in both the transparent alternation in bok 'side'-bocz+ek [bOÙEk] (dimin., nom.sg.) and in the opaque alternation in bok -bocz+ka [bOÙka] (dimin., gen.sg.).
To conclude, the differences between levels are so substantial that it is fair to say that each level constitutes a separate grammar with its own inputs, evaluations, outputs, constraint ranking and inventories of admissible segments. The claim of Derivational OT is that there are four such grammars and that they are linked serially. An analysis of Surface Velar Palatalization has tested the derivational model and reaffirmed that it is adequate.
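The serial linkage of the four grammars can be pictured with a deliberately crude sketch in which each level is reduced to a lookup table of the schematic input-output pairs from (68), (70) and (72) for derived contexts. This stands in for full constraint evaluation and is not a substitute for it; `LEVEL_MAPS` and `derive` are hypothetical names, and the forms use the ASCII transliteration of the text.

```python
from typing import Dict, List

# One mapping per level; a form absent from a mapping passes through unchanged.
LEVEL_MAPS: List[Dict[str, str]] = [
    # Level 1: Palatalization in derived environments, cf. (68).
    {"k+E": "Ùj+E", "t+E": "tj+E", "p+E": "pj+E"},
    # Level 2: Hardening, prepalatal spell-out, and surface palatalization
    # of a velar that first meets its suffix at this level, cf. (70).
    {"Ùj+E": "Ù+E", "tj+E": "tC+E", "k+E": "kj+E"},
    # Level 3: these inputs are unchanged, cf. (72).
    {},
    # Level 4: same ranking as level 3 for this fragment.
    {},
]


def derive(form: str, first_level: int = 1) -> List[str]:
    """Pass a form through the levels in order, starting at first_level.

    The output of each level is the input to the next one. Returns the
    list of representations, beginning with the input itself.
    """
    stages = [form]
    for mapping in LEVEL_MAPS[first_level - 1:]:
        stages.append(mapping.get(stages[-1], stages[-1]))
    return stages


print(derive("k+E"))                 # velar + level 1 suffix
print(derive("t+E"))                 # coronal + level 1 suffix
print(derive("k+E", first_level=2))  # velar + level 2 suffix
```

The `first_level` parameter reflects the point that material may enter the derivation late: a velar followed by a level 2 suffix never sees the level 1 grammar and so surfaces with surface rather than phonemic palatalization.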
The velar palatalization example in (78a) appears to be problematic in both of the two theoretically relevant ways: the //k// does not palatalize to [Ù] in hak+i, even though it is followed by i and, second, the reverse: the //k// palatalizes to [Ù] ' -buc+ik [butC+ik]. The underlying representation of the suffix is therefore //ik//, with //i// changing into [1] after the velar has palatalized to /Ù j / and hardened to [Ù]. The details are as follows.
At level 1, underlying //k// of //xak+ik// goes to /Ù j / before i, exactly as it does in (16) in the context of e, rycz+e+ć 'scream': //r1k+E+tC// → / r1Ù j +E+tC/, but the driver is not PAL-e but PAL-i. Recall that level 1 is a Palatalization level, so IDENT-V[-back] is undominated. PAL-i, like the other PAL constraints, is restricted to derived environments by DE-PAL (not shown in (79)  The optimal candidate [butCik] is the attested surface form.
In the case of labial stems, such as sklep 'store' -sklep+ik (dimin.) in (78b), the level 1 grammar selects [sklEp j +ik] as the optimal output. The level 2 derivation does not change anything. The loss of palatalization, p j → p, is made impossible by IDENT-C[-back] being ranked above *SOFT-Lab, so the faithful [sklEp j +ik] emerges as the winner. The result is correct.
Returning to the data in (78), the nom.pl. hak+i 'hooks' appears to be problematic because we see [i], a trigger of //k// → /Ù j /, and yet the /k/ is retained, albeit palatalized in a surface manner, k → k j [xak j i]. The key to the analysis lies with the data in (78b), such as but+y [but+1] 'shoes'. We see that the nom.pl. ending surfaces as [1] and the occurrence of [i] is limited to the context of velar stops (Rubach 1984

In the class of coronals, the differences between the effects of level 1 and level 2 are far greater than in the class of velars. In the latter, the difference between levels is a matter of what type of palatalizing changes occur: k → Ù j at level 1 vs. k → k j at level 2. In the case of coronals, Palatalization, for example, t → t j , occurs at level 1. At level 2, the effects of Palatalization are enhanced by affrication and change of the place of articulation from dental to prepalatal: t j → tC, as shown in the evaluation of brac+ie 'brother' (voc.sg.) in (21). More importantly, the level 2 grammar switches gears from Palatalization to Vowel Retraction (Retraction, henceforth), so from Ci → C j i at level 1 to Ci → C1 at level 2. 47 By force majeure, Retraction cannot occur across morpheme boundaries because this is the context in which consonants undergo Palatalization at level 1, so //t+i// → /t j +i/. As a consequence of Palatalization, the input to level 2 is /t j +i/, which thwarts Retraction because Retraction is triggered by hard coronals. To conclude, Retraction operates morpheme-internally and hence we see no alternations. 48

47 Recall from the introductory section that both Palatalization and Vowel Retraction satisfy PAL-i.

48 Actually, I know of one case of alternation. It is the word ekspedy+cj+a [d1] 'expedition' - ekspedi+owa+ć [d j j] 'send promptly'. The [j] in the verb is part of the root because the suffix is -owa+ć, as

Evidence for Retraction comes from assimilation of borrowings (Rubach 1984).
The data in (85) below show that Retraction, i → 1, occurs after coronals (85a) but not after labials or velars (85b).
The fact that PAL-i is implemented as Retraction at level 2 has no adverse consequences for the derivation of prepalatals, as in but 'shoe' -buc+ik (dimin.). Recall that buc+ik leaves level 1 as /but j +ik/. The derivation continues at level 2.
(88) Level 2 /but j +ik/ → [butCik]
Constraints: PAL-i, IDENT-C[-back], *SOFT-Coron, IDENT-V[-back], POSTER, STRID
a. but j ik: * *! *
b. butik: *!
☞ c. butCik: *
d. but1k: *! *

As the data in (85) show, Retraction as a response to PAL-i at level 2 is limited to coronals (85a). Labials and velars undergo Palatalization (85b), as they do at level 1, but with the notable difference that PAL-i is not limited to derived environments: PAL-i ≫ DE-PAL at level 2. Palatalization is found with both native words (89a) and loanwords (89b). Palatalization, the desired effect for labials and velars, pi → p j i and ki → k j i, is obtained at level 2 if IDENT-V[-back] outranks IDENT-C[+back] and *SOFT-Lab as well as the constraints against palatalized velars (*k j , *g j , *x j ). In (90), I look at the palatalization of pisk //pisk// 'scream'. As noted earlier, nothing happens at level 1 because pi does not span a morpheme boundary, so the output from level 1 is the faithful /pisk/.

The ranking of the constraints at levels 3 and 4 returns in many ways to the ranking at level 1. PAL-i manifests itself as Palatalization, C → C j , rather than as Retraction, i → 1. Palatalization is not accompanied by enhancement via POSTER and STRID, so /t d s z/ turn into [t j d j s j z j ] and not into [tC dý C ý]. HARD has no force as PAL-i produces soft [Ù j S j ] that do not harden to [Ù S]. The evaluation in (94) looks at the prefix plus stem structure in the verb z+ignorować 'ignore' (perfective). Since z is a prefix, it becomes first available for evaluation at level 3, that is, it never goes through levels 1 and 2. In (94), I look at the /z+i/ portion of z+ignorować.

To conclude, the analysis in Sect. 3 is readily extended to include PAL-i as the driver and coronals as well as labials as the inputs. The theoretical conclusions in Sect. 6 are strengthened by these extensions.
PAL-i demonstrates that levels 1 and 2 differ dramatically in the types of operation that are conducted and the segment inventory constraints that play an active role in selecting the correct output. Level 1 is a Palatalization level, Ci → C j i, while level 2 is a Retraction level, Ci → C1. At level 1, PAL-i is restricted to derived environments while at level 2 it applies freely inside morphemes. The admissible Palatalization outputs for coronals are /t j d j s j z j / at level 1. At level 2, /t j d j s j z j / are inadmissible, so they are changed into [tC dý C ý]. At levels 3 and 4, the admissible soft coronals are the same as at level 1: [t j d j s j z j ]. The Velar Palatalization process at level 1 turns //k g x// into /Ù j Ã j S j Z j / in the context of /i/, so the process is transparent. The change becomes opaque at level 2, where HARD, a segment inventory constraint, enforces the loss of Palatalization: /Ù j Ã j S j Z j / → [Ù Ã S Z], which in turn entails the change of the vowel from front to back: Retraction, /i . Level 3 inputs with prefixes revert to the strategy for accommodating PAL-i as Palatalization, as in z+ignorować [z j i] 'ignore', but, unlike in uraz+i+ć there is no follow-up by enhancing /z j / to [ý] at the next level.
A general conclusion that emerges from the investigation of PAL-i and from the extension of inputs from velars alone to coronals and labials concurs with the conclusion in Sect. 6: disparate effects of PAL-i require different grammars with different constraint ranking and different inventories. Derivational OT is exactly the paradigm that can accommodate these requirements.