Factors of selection, standard universals, and the standardisation of German relativisers

This contribution explores the concept of selection as an integral part of Haugen’s standardisation model from a theoretical as well as an empirical angle. It focuses on different types of factors of selection and how they are relevant to the study of selection processes both on the level of individual variants and whole varieties. The question of why standard languages appear to differ systematically from vernaculars and at the same time exhibit remarkable resemblances among each other is addressed, and characteristic features of standard languages are traced to general conditions of standardisation processes. A case study on the standardisation of German relativisers illustrates how different factors of selection combine in the dynamics of linguistic structure, variation, attitudes, and codification. It also shows how general tendencies of selection can lead to similar structures across standard languages, while it becomes clear that register variation and the historical development and changing evaluation of stylistic varieties can be crucial in order to explain the selection or de-selection of linguistic forms.

vernacular as model and constructing a new standard by way of combining existing or reconstructed forms to conceive a new norm. In Haugen (1982: 271), he characterizes selection as "a form of policy planning, which in this case establishes that a given linguistic form, be it a single item or a whole language, shall enjoy (or not enjoy) a given status in a society". He thus refers to two types of selection: the selection of a variety on the one hand, 1 and the selection of variants on the other hand. Haugen calls these two types of selection "rival hypotheses concerning selection: (a) that an SL (standard language) is (or should be) based on a single dialect, that is, someone's vernacular; (b) that an SL is (or should be) a composite of dialects (the nature of the composition being left unspecified). We may call the first the unitary thesis and the second the compositional thesis" (Haugen 1982: 564). Deumert and Vandenbussche (2003: 4) choose a less antithetical terminology in distinguishing between two types, rather than rival theses, of selection: "monocentric" selection (i.e. on the basis of an existing variety) and "polycentric" selection (i.e. as the eclectic combination of individual linguistic variants). 2 Van der Wal's (2007) analogous distinction between "macro-" and "micro-selection", which will be used in this paper, even seems to suggest that both types of selection should be combined in a unified model of selection.
In Haugen's (1966: 932) discussion of selection, he refers to varieties spoken by socially or geographically distributed vernaculars, i.e. diastratic and diatopic varieties which are associated with "groups of people", and describes arising conflicts as tensions between these groups. The diaphasic dimension, and therefore intraindividual and stylistic variation, is conspicuously absent here, perhaps because the relevance of register differences seems trivial. Selection on the macro-level is usually understood in the geographical or social sense; the relevance of formal or written registers, however, is pointed out frequently (e.g. Milroy and Milroy 2012: 30).

Between micro-selection and macro-selection
Any standardisation history can be characterised in terms of whether the standard is a result of micro- or macro-selection. In their introduction to the volume Germanic Standardizations: Past to Present, Deumert and Vandenbussche (2003: 4-5) note that "polycentric selection seems to be rather more common [than monocentric selection] in language history. Most standard languages are composite varieties which have developed over time, and which include features from several dialects." Arguably, however, some standardisation histories are based on a combination of both processes, i.e. on selection on both the micro- and the macro-level. Even though Haugen (1972) presents the "unitary thesis" and the "compositional thesis" of selection as rivals, he concedes "that both theses are true – and false – in varying degrees according to time and place" (Haugen 1972: 564). Van der Wal (2007), for instance, shows that Dutch was standardised mainly through a process of macro-selection in […]

Selection, codification, and selection criteria
Haugen (1966: 932-933) points out that the construction of a new standard from existing variants has "[t]o some extent […] happened naturally in the rise of the traditional norms". He refers to the fact that micro-selection can take place in an unguided manner, and possibly below the threshold of consciousness of the language users; we might call this 'informal' selection as opposed to 'formal' selection through codification 3 or speak of implicit selection 'from below' in such cases. 4 These are processes of levelling (Stewart 1968: 534; Mattheier 2003: 214) involving the reduction of variability (Milroy and Milroy 2012) which take place in the emergence of written norms but do not necessarily result in fully-fledged standardisation. Only with ensuing codification do more conscious and deliberate efforts of selecting variants take place, in an explicit selection 'from above' which culminates in ideologies of language correctness.
In other words: codification involves selection, but selection does not necessarily involve codification.
In this sense, we may understand codification as explicit (micro-)selection. Codification usually implies some form of selection, at least where it does not merely reproduce existing codices. Codification often tells us not only which forms were selected but also why they were selected. In this sense, codification adds a layer to the mechanisms of selection: it informs us about actual or purported criteria that govern the process of selection. The motives that agents of codification give may be in line with reasons that we linguists would identify, but often linguists will dismiss them as rationalisations of decisions made on the basis of ideologies and determine other reasons instead. 5 There is a difference between (alleged) selection criteria, which are adduced by grammarians and other codifiers, and (actual) factors of selection, which can be determined through linguistic scrutiny. 6 As outlined above, selection criteria are understood to be reasons that are reported explicitly as motivations for selection in codices. As Davies and Langer (2006: 33) put it, "[t]hose involved in the selection process […] have always tried to and still try to justify their judgments in various ways." These justifications can be categorised in various ways, e.g. under the headings of function, analogy, (good) usage (usually that of a prestigious social group; Davies and Langer 2006: 28), morals, and aesthetics. Since selection criteria are not central to this contribution, I refer the reader to their discussion in Davies and Langer (2006: 33-42).


Factors of selection
In contrast to selection criteria, which are largely restricted to codification contexts and may or may not reflect the actual motives for selection, factors of selection help us describe the mechanisms of selection from a linguistic viewpoint irrespective of whether selection took place implicitly from below or explicitly from above. Prescriptive statements in codification may themselves be a factor (e.g. Auer 2009a; Anderwald 2016; Havinga 2018). Glaser (2003: 60-67), who discusses the selection of variants in the case of German from an areal perspective, maintains that the problem of identifying the causal factors of selection is unresolved (2003: 60-61). Besch (1967, 2003) has formulated five 'explanatory principles' which help to account for which variants prevailed in German. They will not be discussed in detail here (but see Footnotes 9, 10, and 12 below); for a critical discussion see Mattheier (1981), who calls into question their explanatory power, while Langer (2001: 44-45) finds them insufficient to explain the de-selection of periphrastic tun in German.
Besch formulated his five principles in the context of variant selection in the history of German, and at least one of them ("Landschaftskombinatorik" 7), which holds that variants that appear in a particular combination of dialect areas have higher chances of being selected, is restricted to the German context because it relies on specific regions in their specific historical context. In contrast, his other principles ("Geltungsareal", "Geltungsgrad", "strukturelle Disponiertheit", and "Geltungshöhe") are at least in principle universally applicable (Mattheier 1981: 281). This distinction between universal factors, which are relevant in any standardisation process, and historical factors, which affect selection in the standardisation of a particular language, is a crucial one because it concerns the generalisability of factors of selection. 8 For instance, the effect that a more widely used variant is more likely to be selected as a standard variant can be considered a universal factor. If, on the other hand, a particular region has proven to be more likely to lend variants to the emerging standard in a specific context due to historical reasons, this can be considered a historical factor.
A further distinction derives from the nature of the factors in question and their ontological relation to the respective linguistic forms. Factors of selection can, on the one hand, result from linguistic properties of the selection candidates, i.e. their phonological, morphological, syntactic etc. form or structure (potentially in relation to other forms). Such factors can be called structural factors of selection. For example, a linguistic form might have higher chances of being selected if it is morphologically more transparent or more regular, or if it is more clearly distinguishable from other forms. 9 Factors can, however, also tend to favour some variants over others because of their distribution in space, society, or across registers; we can call these factors distributional factors. The examples for universal and historical factors above are both examples for distributional factors. In addition, the tendencies that variants used by the higher classes or variants that belong to formal registers are selected can be classified as distributional factors. 10 Because registers and social strata, and to some degree also the particular regions, are often linked to attitudes towards these domains, the prestige they are perceived to have has a direct impact on the valorisation of linguistic variants and varieties used in them, which in turn is a factor in selection.
Because of this, we must characterise factors as to whether, or to what degree, they are attitudinal. Attitudinal factors are an expression of language ideologies and can reflect covert or overt attitudes. They concern the evaluation of linguistic forms as appealing or unappealing, as better or worse, or, especially in a standardisation context, as correct or incorrect. 11 Just as positive attitudes can lead to a favourable assessment of a linguistic form in selection, negative attitudes can result in the de-selection of variants or entire varieties. This effect was particularly strong in the context of linguistic purism (e.g. Langer and Davies 2005) and can find expression in the stigmatisation of linguistic variants (Langer 2001), which leads to their de-selection (Davies and Langer 2006). Since, according to Silverstein (1979: 193), "ideologies about language, or linguistic ideologies, are any sets of beliefs about language articulated by the users as a rationalization or justification of perceived language structure and use", attitudinal factors can be analysed with regard to language users' rationalisations or justifications of their attitudes. Such 'adduced reasons' can be either structural (i.e. a variant is preferred because its linguistic properties are viewed as better than others) or distributional (i.e. a variant is preferred because the region/class/register in which it is used is viewed as more prestigious than others). 12


Universal factors of selection and standard universals
If there are universal factors of selection, these will lead to specific linguistic characteristics that are typical of standard languages; conversely, features typically found in standard languages are likely the result of universal factors of selection. It seems fairly safe to say that there are no linguistic features which are shared by all standard languages; in this sense, there are no standard universals in the strict sense. In an account of attempts to list defining features of standard languages, Stein (1994: 3) concludes that "it is […] clear that an attempt at such a definition must be, not categorical, but prototypical, so that it can accommodate the specific varietal and historical conditions of each individual case". This may seem like a partial departure from the distinction between universal and historical features just introduced, but the key notion here is that standard universals need not be found in every standard language; they only need to be found typically in a standard language to be considered (relatively) universal, such that 'standardness' is a fuzzy concept.
The notion of standard universals (or 'standardversals'; De Vogelaer and Seiler 2012: 14; Seiler 2019) can be understood as a complement to the concept of "vernacular universals" postulated by Chambers (2004) (see also Filppula et al. 2009a), i.e. nonstandard features that are found in vernaculars around the world and which "occur not only in working-class and rural vernaculars but also in child language, pidgins, creoles and interlanguage varieties" (Chambers 2004: 128). "Socially, the vernacular universals appear to fall into well-defined patterns in the acrolect-basilect hierarchy, but functionally there appear to be several disparate principles at work (from motor economy to cognitive overload)" (Chambers 2004: 130). Using the label vernacular for this concept raises the question of whether it has a non-vernacular counterpart, i.e. features that are found in standard languages anywhere in the world and which can be derived functionally from other principles than the ones envisaged for vernacular universals by Chambers. Vernacular universals are classified by Chambers as "primitive features" (a choice of term which may be disputed due to its connotations), which expresses that such features can be seen as largely untouched by language ideology or language policy. While the vernacular is "primitive", the standard is "learned" (Chambers 2004: 139). The key idea here is that while vernacular universals are found in nonstandard varieties anywhere in the world because they emerge as a result of basic principles of language processing, standard universals can be found in standards around the world because they are the result of universal principles that govern variant selection in standardisation.
The very notion of vernacular universals testifies, to an extent, to standard language ideologies: the fact that we need a term for vernacular universals shows that non-vernacular (i.e. standard) features are, implicitly, still seen as the unmarked case, where the opposite seems more plausible (Filppula et al. 2009b: 5-8). In this sense, Trudgill (2009: 307) argues regarding multiple negation that "it is difficult to argue for it as a vernacular universal when in fact it is confined to the vernaculars simply because it has been lost in Standard English – because of a linguistic change that took place in (pre-)Standard English." This means that vernacular universals are actually 'normal', unmarked language features, while standard universals are 'abnormal', typologically marked and in need of explanation (De Vogelaer and Seiler 2012: 14; Seiler 2019).
This calls for a shift of focus from the "search for vernacular universals" (Trudgill 2009: 304), which, according to Trudgill (2009: 305), "has ultimately been in vain", to the search for standard universals (bearing in mind that 'universal' is a prototypical, i.e. fuzzy, notion). According to Milroy and Milroy (2012: 6), "we do not, in fact, know whether standard languages can be conclusively shown to have no purely linguistic characteristics that differentiate them from non-standard forms of language (the matter has not really been investigated)". 13 If it is true, however, that there are peculiarities which are typically found in standard languages (see Weiß's (2004: 649) idea of "the linguistically exceptional nature of standard languages"), we need to understand what features these are, and why they are typical of standard languages. Such standard universals are necessarily the result of selection, and their (relative) universality derives from factors of selection which are universal across standardisation histories.
The notion of Standard Average European (SAE), coined by Whorf (1941) and itself a fuzzy category, is intended to describe the finding that several linguistic features (among them definite and indefinite articles, relative clauses with relative pronouns, have-perfect, etc.) are found in most European languages but rarely elsewhere (hence Dahl's (1990) notion of SAE "as an exotic language"). This circumstance has been interpreted as a sprachbund by Haspelmath (2001), who discusses a number of typologically marked features (Haspelmath 2001: 1492) which are shared by several European languages. More recently, however, it has been pointed out that the term Standard Average European acquires a new meaning when viewed in the light of standardisation:
The findings are usually explained as the result of structural convergence due to geographical adjacency: Standard Average European forms a sprachbund, thus it is an areal phenomenon. However, the European sprachbund is defined on the basis of features of the codified standard varieties in the first place. The term Standard Average European thus obtains another meaning which has not been intended in its original formulation by Whorf (1941) and his successors. If it turns out that Standard Average European features are significantly less pervasive in vernaculars as compared to codified standard languages, linguists will have to propose a different explanation for its historical emergence. It is then not a genuinely areal, but rather a sociolinguistic, or stylistic phenomenon, rooted in common strategies of codification or even in (ideologically motivated) metalinguistic beliefs rather than in geographical adjacency. (De Vogelaer and Seiler 2012: 14)
Such "common strategies of codification" can well be understood as (at least in the European context) universal factors of selection in that certain features are favoured over others in the shaping of a standard language. De Vogelaer and Seiler describe this as a "sociolinguistic, or stylistic phenomenon", which relates to distributional characteristics of the variants in question and their indexical value. They also identify the potential role of language ideology and metalinguistic beliefs, which leads us to posit factors of selection that are universal, distributional, and to an extent attitudinal. This raises the question of whether there can also be universal (attitudinal or non-attitudinal) structural factors, and how they interact with universal distributional factors.
There appear to be linguistic characteristics which are typical of standard languages by mere frequency, e.g. a high degree of nominalisation or hypotaxis (e.g. Stein 1994: 3), which may to an extent be due to their use in written language. A number of "systematic differences between dialects and standard languages" are discussed by van Marle (1997). Generally, there seems to be a tendency for standard languages to incorporate features which allow for flexible syntactic integration and semantic compaction. At the same time, standard languages seem to prefer (combinations of) linguistic variants which are functionally unambiguous, especially in the written mode, where there is limited scope for disambiguation through contextualisation or deictic operations (e.g. the functional reorganisation and differentiation of increasingly specialised conjunctions towards written standard German; Sonderegger 1979: 290; Betten 1987: 87). On the whole, many of the developments during selection seem to be aimed at optimising the linguistic code's functioning in the written medium, which can be seen as differing fundamentally from language processing in oral speech (Auer 2009b).
Overall, during selection the future standard variety becomes more and more written-like; a process which Koch and Oesterreicher (1994) have called "Verschriftlichung" and which is linked to the standard language's association with the written mode, especially in times when textualisation of society takes place. Verschriftlichung thus involves Givón's (1979) notion of syntactization:
Spoken language […] stands in close proximity to the pragmatic mode [Givón 1979]. However, it also exhibits a tendency towards detachment from the situation and 'syntactization'. Written language uses these tendencies and develops them further in the direction of the syntactic mode with its rich and dense verbalization […]. At the same time, more economical and less integrated strategies of verbalization contrasting with these communicative conditions remain unused, although they are available in the linguistic system. (Koch and Oesterreicher 2012: 453)
In Stein's (1994: 5-6) words, "syntax becomes an organizational necessity to regulate expressive traffic above all in the written language, where the absence of nonlinguistic, gestural and situational context information necessitates the support of the conventionalized and socially controlled organization principles (Bartsch 1987: 15) to ensure its functioning" (see also Eisenberg's (1995) distinction between symbol grammar and context grammar). In this sense, the selection of features in standardisation is to be seen in the context of a harnessing of (above all syntactic) features that prepare the standard variety for its functioning in the written medium (also Stein 1997: 45).
On the attitudinal level, too, there seem to be factors which are universal in standardisations. Grondelaers and Kristiansen (2013: 16) "suspect that the construction of a standard language ideology (SLI) anywhere at any time requires the development of a 'good vs. bad' hierarchisation of varieties and a common 'knowledge' and acceptance of which variety is the 'best language'". Other hierarchies, such as between logical and illogical 14 or pure and impure, may be universal within the European context but are historically linked to the rationalism of the Enlightenment and the idea of the nation state. Stein (1994: 5) points out that "a constituent part of this ideology is the focus on written language as the paragon of language generally. This is related to the fact that most of the high functions […] of language are carried out in the written medium."

Case study: relativisers in German 15
Among the features that Haspelmath (2001) characterises as typically SAE are relative pronouns; with De Vogelaer and Seiler (2012), we may read 'SAE' here in the literal sense, as not only typically 'European' but also typically 'standard':
The type of relative clause found in languages such as German, French or Russian seems to be unique to Standard Average European languages. It is characterized by the following four features: The relative clause is postnominal, there is an inflecting relative pronoun, this pronoun introduces the relative clause, and the relative pronoun functions as a resumptive, i.e. it signals the head's role within the relative clause […]. Furthermore, in most SAE languages the relative pronoun is based on an interrogative pronoun (this is true of all Romance, all Slavic and some Germanic languages, Modern Greek, as well as Hungarian and Georgian). (Haspelmath 2001: 1494)
Relative pronouns, thus, are typologically marked; they seem "to be a remarkable areal typological feature of European languages, especially the standard written languages" (Comrie 1998: 61). Relative particles, 16 on the other hand, are also features of several European standard languages alongside pronouns and "are also attested widely elsewhere in the world" (Haspelmath 2001: 1495). If relative pronouns have become standard variants, even though they are typologically marked and relative particles were also available, the question of why European standard languages typically have relative pronouns is a question of selection: Why were relative pronouns selected, usually at the expense of their uninflected particle alternatives, to become part of the standard? In this case study, I will focus on the standardisation of relativisers in German, which involved both relative pronouns and particles.


Present and past of German relativisers
Today, the most usual relativiser, which is also one of two standard variants, is the relative pronoun der/die/das (d-), which is phonologically and orthographically identical with the definite article in most inflected forms. It is attested as early as Old High German (750-1050) but is "relatively rare in modern dialects" (Fleischer 2004: 232). A second standard variant is welcher/welche/welches (welch-), which is used rarely (in less than 1% of cases according to Sommerfeldt 1983: 162-163); it is stylistically marked as formal and unwieldy (Duden 2011). It is based on the interrogative adjective welch- (but is used today as a nominal rather than an adnominal relativiser). It is sometimes recommended as an alternative to d- where d- would stand next to a homonymous form of the definite article.
Of these two relative pronouns, only d- is found in dialects, and it is, despite its relative rarity, the only one 'which is firmly rooted in spoken language' ("die fest in der gesprochenen Sprache verankert ist"; Brooks 2006: 122). Welch- is attested for some Yiddish and Westphalian varieties, but Fleischer (2005: 176) assumes that this is 'a rather artificial type' ("einen eher künstlichen Typ") due to contact with standard German. Von Polenz (1999: 356) characterises it as an educated written variant unfamiliar in the dialects and in colloquial language. There are a number of relative particles in the dialects which can be used on their own or in combination with d-, including wo and was, and zero relatives are also found in some dialects (Fleischer 2004, 2005). Historically, 17 d- has been in use continuously since Old High German. In this first period, it was often accompanied by the relative particles de or dar; de could also function as a relative particle on its own. D- was the most frequent variant until it met with increasing competition from welch- at the end of the sixteenth century (Brooks 2006: 135).
Welch- is thought to have originated in Dutch in the thirteenth and fourteenth centuries (Reichmann and Wegera 1993: 446) and spread first to Low German and later, in the fifteenth century, to High German (Brooks 2006: 123). 18 By the sixteenth century, it is associated with educated speech and academic writing (Reichmann and Wegera 1993: 446; Ebert 1986: 161; Brooks 2006: 123); its usage rises sharply toward the end of the sixteenth century (Brooks 2006: 135). According to Dal (2014: 241), welch- had almost supplanted d- in written German in the nineteenth century before being suppressed normatively (Ágel 2000: 1883) and declining in the nineteenth and twentieth centuries (von Polenz 1999: 5).
Relative particles, unknown in standard German, occur not only in modern dialects but are also attested historically, e.g. da, und and so. The latter was highly frequent during the Early Modern period; it is the result of a reanalysis of comparative constructions with the relative adverbial so. It was still rare in Middle High German (Paul 2007: 405) and is found primarily in administrative texts of the fourteenth and fifteenth centuries before spreading to other genres in the sixteenth century (Reichmann and Wegera 1993: 447). It even appears in Dutch private correspondence in a case of language contact in the seventeenth century (van der Wal 2018). It reached its widest distribution in the sixteenth and seventeenth centuries (Dal 2014: 244) before falling into disuse in the eighteenth century (Brooks 2006: 135). It is considered a characteristic of Kanzleisprache ('chancery language') (Brooks 2006: 122-123; 132-133), a style of writing which originated in chanceries and was imitated widely as a supraregional prestige variety (Schwitalla 2002).

Codification
Before turning to the analysis of usage data, I will briefly outline the codification of relativisers in German by focusing on the three most influential German grammarians: Justus Georg Schottelius (1612-1676), Johann Christoph Gottsched (1700-1766), and Johann Christoph Adelung (1732-1806). All three of them refer to the three variants d-, welch-, and so in their works.
In Schottelius's (1663: 534) two major works, 19 there is no explicit valorisation of any of the three forms. The only variant whose relative usage is described explicitly is welch- (1663: 730-731), but relative usage of d- and so is clearly implied (1663: 700-701). Even though Schottelius does not make any statements about which variant he thinks is preferable, he mentions welch- and so as alternatives for d- where it would stand next to a homonymous form of the definite article in order to avoid repetition (1663: 700-701). 20 Similarly, so is treated as an alternative for both d- and welch- (1663: 543, 735). We can therefore cautiously hypothesise that Schottelius's works may reflect a hierarchy d- > welch- > so, where d- is the least marked form and so the most highly marked form. Of all three variants, only so is treated in a subtly evaluative way:
Dieses So kan man zuweilen schiklich und wol anwenden/ wenn nemlich die Rede durch welcher/welche/welches/ oder der/die/das/ hart und unvernehmlich werden wil. (Schottelius 1663: 735)
'This so can sometimes be used decently and well, namely when the language would become hard and unpleasant by using welcher, welche, welches or der, die, das.' (my translation)
The implication here is clearly that so is usually improper and cannot be used in a decent way, unless to avoid using too many instances of d- and welch-. The fact that Schottelius stresses the cases where so can be used agreeably rather than the other way around shows that attitudes against so in this usage must have existed at the time, to an extent that Schottelius took them for granted.
Gottsched (1748: 237) states that he views only welch- as a 'proper' relative pronoun. This assessment may be influenced by Latin and Romance relativisers, where relative pronouns are etymologically interrogative. He concedes, though, that d- can be counted among relative pronouns as well because it can be used interchangeably with welch- (1748: 238). He calls upon the model of 'good writers' ("gute Schriftsteller"), who alternate between welch- and d- depending on the requirements of 'euphony' ("Wohlklang"). In particular, he recommends welch- as a means of avoiding repetition when using d-. Gottsched goes on to say that so, which he views, like Schottelius, as an alternative to welch- or d- (1748: 366), is also used 'very frequently' ("sehr haͤufig") and poses the question of 'what should be thought of this?' ("was davon zu halten sey?", 1748: 238). Again, this is a passage that tells us that there must have been some underlying, implicit yet pervasive attitudes toward this form. Gottsched concludes that there is nothing wrong with so in principle; but since so is a highly frequent word, with a large range of different functions, it should be avoided 'as far as possible' ("so viel man kann", 1748: 239) and used only where the gender of the antecedent is unclear or where it refers to several antecedents with mixed genders. Overall, Gottsched's hierarchy of relativisers appears as welch- > d- > so, in which d- and welch- have swapped places compared to Schottelius, with a more explicit condemnation of so.
Like Gottsched, Adelung (1782: 711) gives an explicit endorsement of welch-, calling it 'the most complete relativiser' ("das vollständigste Relativum"), and also provides a stylistic assessment in saying that it is used most frequently 'in solemn speech' ("in der feyerlichen Rede"), whereas d- is used in 'the private/familiar way of writing' ("der vertraulichen Schreibart"). Like d-, which according to Adelung (1782: 711) is used 'in ordinary life' ("im gemeinen Leben"), so is 'passable' ("gangbar") in 'ordinary ways of speaking and writing' ("den gemeinen Schreib- und Spracharten"; 1782: 713). This is not only a clear placement in terms of register, but also again a formulation which informs us ex negativo that so was not 'passable' in other domains. Adelung (1782: 713) expresses this sentiment more explicitly: in 'more decent' ("anständigern") ways of speaking/writing, it is usually avoided as far as possible, except to avoid repetition of the other relativisers. Adelung thus also has a clear hierarchy, which is the same as Gottsched's hierarchy, but this time with a clearer stylistic allocation to registers.
In the works of these three grammarians, we find a development of increasing explicitness as regards the acceptability of the individual forms and their stylistic assessment. The variant so becomes more overtly stigmatised (in the sense of Davies and Langer 2006: 75-76) and welch- receives an increasingly evaluative endorsement. 21 The criteria for both the condemnation of so and the endorsement of welch- are implicit in the case of Schottelius, structural in the case of Gottsched, and stylistic in the case of Adelung.

Data and analysis
In order to assess the usage of the three variants diachronically, I carried out a combined corpus search in the DWDS reference corpora (www.dwds.de) comprising texts of the Deutsches Textarchiv (DTA, www.deutschestextarchiv.de) and the Digitales Wörterbuch der Deutschen Sprache (DWDS) from 1600 to 1999. This corpus currently comprises 3822 works from the four domains of Fiction, Functional Texts, Academic Writing, and Newspapers, but is not balanced with respect to region.
The search for the three variants proved difficult because the corpus is not tagged for relative particles (so is tagged as an adverb). D- and welch-, however, are tagged as relative pronouns. The relative particle so therefore had to be identified through its distribution. As German relative clauses are verb-final, word order was taken into account in order to capture relative clauses opened by so. A search for cases where so appears after a comma or virgule (in order to target clause-initial cases) and is not followed by a verb (in order to exclude non-subordinate clauses) yields a high proportion of relative clauses but is far from being specific enough: the results still contain many cases of adverbial clauses beginning with so bald or so wohl (in which so modifies an adverb, bald or wohl; both examples would today appear as one word). A further problem is that conditional clauses with so ('if') cannot be ruled out this way. I therefore decided to restrict the search to cases where so is followed by a preposition: this theoretically still includes some conditional clauses, but due to word order regularities the probability is much lower. The results, therefore, only represent a fraction of all relative usages of so, but the risk of false positives is considerably smaller. 22 In order to achieve optimal comparability, d- and welch- were searched with the same restrictions that were used for so, albeit using the tag for relative pronouns. A search for d- and welch- using only distributional criteria for test purposes yielded practically identical results.
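The distributional restriction described above can be illustrated with a minimal sketch on plain text. It is not the DWDS query that was actually used: the function name, the regular expression, and in particular the short preposition list are assumptions made purely for illustration, and a real search would rely on the corpus's part-of-speech tags rather than a word list.

```python
import re

# Hypothetical and deliberately incomplete list of prepositions
# (including one historical spelling); an assumption for illustration only.
PREPOSITIONS = {
    "an", "auf", "aus", "bei", "bey", "durch", "für", "gegen", "in",
    "mit", "nach", "über", "um", "unter", "von", "vor", "zu",
}

# "so" directly after a comma or virgule (clause-initial position),
# followed by exactly one further word that is inspected below.
CANDIDATE = re.compile(r"[,/]\s*(so)\s+(\w+)", re.IGNORECASE)


def find_relative_so_candidates(text):
    """Return (offset of 'so', following preposition) for each candidate.

    Only cases where 'so' is followed by a preposition are kept, which
    excludes sequences such as 'so bald' or 'so wohl' at the cost of
    missing many genuine relative clauses.
    """
    hits = []
    for match in CANDIDATE.finditer(text):
        next_word = match.group(2).lower()
        if next_word in PREPOSITIONS:
            hits.append((match.start(1), next_word))
    return hits
```

Applied to a constructed sentence such as "Die Bücher, so in der Kammer liegen, sind alt", the sketch keeps the clause opened by so in, whereas "Er kam, so bald er konnte" is discarded because bald is not a preposition. As in the search described above, conditional so followed by a preposition would still slip through.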
Our time window begins at a time when all three variants have rather stable relative frequencies for a century (Figure 1). The situation begins to change dramatically in the first half of the eighteenth century, when so starts to drop until it is virtually non-existent by 1780. Subsequently, d- and welch- compete for dominance with more than one turning point, until d- wins out at the beginning of the twentieth century and welch- becomes a niche form. Its percentages close to zero especially after 1960 are all the more remarkable as it is to this day a standard variant, which points to a discrepancy between implicit and explicit selection: implicitly, welch- was practically de-selected in the nineteenth century, but explicit selection (i.e. codification) has not followed suit (yet).

In order to be able to assess register differences and the role of stylistic factors, we will now consider the results separately by domain (Figures 2, 3, 4 and 5).

Discussion
It is tempting to look for explanations for these movements in prescriptive comments. It has been suggested, for instance, that Gottsched's (1748) recommendation to avoid so has led to its demise (Jakob 1999: 36). The decrease of so, however, started much earlier (Brooks 2006: 135), so Gottsched might at best have contributed to it. It appears that the distaste for so which is palpable in Schottelius's (1641; 1663) comments was real and soon manifested itself in a lasting drop. If Gottsched's comments did have an effect, we must look for it in the decades following the publication of his Grundlegung. The time between 1750 and 1800 saw an extraordinary rise in the use of d-, which could be seen as contradicting Gottsched's more explicit endorsement of welch-, but his invocation of the example of 'good writers' might have led to a tendency towards imitating literary style: the percentages around 1800 in Functional Texts and Academic Writing are very close to those in Fiction. This trend was, however, reversed after 1800, leading to a new rise of welch-, which peaked c. 1870. It does not seem far-fetched to look for an explanation for this in Adelung's (1782) stylistic appraisal of welch-, given that his Umständliches Lehrgebäude was highly influential. Grammars continued Gottsched and Adelung's tradition of listing welch- first and d- second (which is criticised by Wustmann 1891: 147) at least until the late nineteenth century, e.g. in later editions of Heyse (1859), which may also reflect an aspiration to make the unfamiliar welch- palatable to pupils (Schieb 1994: 377). 23 The first grammars to place d- before welch- were those that explicitly referred to Grimm's (1819-1837) historical-comparative Deutsche Grammatik as a model, which describes d-type relativisers as characteristic of German and Germanic languages (Grimm 1819-1837), such as Rinne (1836: 184) or Kehrein (1852: 95). 24 The latter points out 'that der, die, das is our oldest relative pronoun' ("daß der, die, das unser aͤltestes Relativpronomen ist"; 1852: 99). The first school grammars that listed d- before welch- appear to be from the 1860s (e.g. Jänicke 1863: 65; Koch 1868: 59), the decade when welch- came closest to overtaking d- in usage and the tide was starting to turn. Towards the end of the nineteenth century, welch- became a controversial subject of language criticism (von Polenz 1999: 356). The final and decisive turn toward d- has been attributed to its normative and stylistic condemnation (Ágel 2000: 1883; Dal 2014: 241). The trend reversal around 1870, however, seems to precede the first commentators to explicitly condemn welch- as part of a more general polemical onslaught on 'paper style/language' from c. 1890 onwards (e.g. Schroeder 1889; Wustmann 1891), 25 which makes it more likely that they picked up on (and probably boosted) a new trend that was already underway rather than causing it.
Rather than attributing the loss of so to the grammarians, Ágel (2010) argues that it was due to structural reasons: as an uninflected and 'aggregative' particle, so conflicted with the ideal of grammatical clarity in written standard language. This can be seen as an example of the tendency of (European) standard languages to favour relative pronouns over relative particles in order to increase syntactic transparency, which is a candidate for a standard universal, at least in the European context. A contributing aspect, according to Ágel, was the restricted functional range of so: it could only replace relative pronouns in subject or object functions, 26 which led to a structural 'skew' that, overall, favoured the more flexible relative pronouns. This, too, could be seen in the context of the (perceived) lack of structural 'adequacy' of a linguistic item for a written standard language. Ágel also mentions the role of contemporary negative attitudes towards so.
A further, crucial factor appears to lie in the stylistic 'affiliation' of so: it was clearly a feature of Kanzleisprache, and with the declining prestige of the chanceries in the eighteenth century (Brooks 2006: 133), this stylistic variety and the features associated with it fell out of 'fashion' as well. With Kanzleisprache becoming obsolete in the eighteenth century (Lange 2008: 188-189), a number of stylistic features suffered the same fate (Brooks 2006: 134), which exemplifies the gradual range between micro- and macro-selection which we have dubbed 'meso-selection'. With Kanzleisprache being avoided for attitudinal reasons (Brooks 2001; Schwitalla 2002), an entire stylistic complex of variation was implicitly but effectively de-selected. This is all the more remarkable since Kanzleisprache had ticked most of the boxes which would have seemed to make it an ideal candidate to be selected for the future standard: it was a high-prestige, supraregional, formal, written variety. As its prestige diminished, however, the rest of these factors appear to have become less relevant, and the stylistic tastes of the Enlightenment fostered different linguistic preferences (Brooks 2001, 2006), which might be linked to the rise of the 'educated' variant welch- and the more neutral d- as written language became 'less written' and closer to spoken German after the heyday of Kanzleisprache (Brooks 2006: 134).
For the de-selection of so, several intricately interwoven factors seem to have been at play: there was a prescriptivist negative attitude towards so, based largely on aesthetic judgements, which was partly reflected, partly fuelled by the grammarians; connected with this attitude was a structural factor, put forward in the grammars, which concerned the functional overload of homonymous forms; a further structural factor, apparently not reflected by the grammarians, made so functionally ill-suited as a syntactically transparent and versatile relativiser in a written norm; and finally, the stylistic distribution of so condemned it to go down with the ship as the attitudes toward the stylistic variety it was associated with changed and it was de-selected.
Like so, welch- is a polarised variant (Pickl forthcoming a) associated with writing with little foothold in spoken language. It lacked, however, the stylistic stigma of Kanzleisprache and was a structurally fit candidate in the ongoing selection. The rise of d- may have been related to its use in literary language (and thus distributional and attitudinal), perhaps sparked by Gottsched's comments. The idea that welch- was a more proper relative pronoun than d-, however, expressed by Gottsched and Adelung, is a structurally conditioned attitude which may have led to its rise in subsequent years. It was complemented by a tendency to favour the more formal registers in prescription (Milroy and Milroy 2012: 30), which is palpable in Adelung's stylistic comments. The availability of a relative pronoun, however, which was closer to spoken language, d-, proved a threat to welch-, a written, 'artificial' (Fleischer 2005) variant that was never rooted in spoken language (Brooks 2006). Moreover, Gottsched and Adelung's idea that welch-, being an interrogative and hence 'Latin-type' relative pronoun, was a more rightful relative pronoun than d- might have lost its persuasiveness after the end of German classicism, also given the influence of Grimm's historically oriented grammar. The final blow for welch- came in the twentieth century, when its association with 'paper language' made it stand out as unwieldy (the fact that it is longer may have contributed to this impression) and the stylistically neutral and more 'natural' 27 d- took the lead for good and practically obliterated its rival.
The genre differences show a clear common pattern in Functional and Academic Writing, and to some extent in Newspapers, which Fiction was practically unaffected by. This demonstrates the relevance of the stylistic evaluation of the forms in question: d- was clearly perceived as an aesthetically agreeable variant, while the increasing avoidance of welch- in Fiction shows how it became more and more marked. 28 The 'oscillation' in the other domains between welch- and d- shows how written usage wavered between a norm rather detached from spoken usage, characterised by distinctly written forms, and a norm that was more firmly grounded in spoken language and stylistically more neutral (cf. Footnote 24).
27 Even though d- seems to have emerged naturally and has always been a spoken variant, as a relative pronoun it is typologically marked, and many German dialects prefer relative particles instead (Fleischer 2004, 2005).
28 The gradual trend away from a 'written' feature is reminiscent of the gradual convergence of spoken and written German between 1400 and 2000 put forward by Weiß (2005).

Conclusion
Which variants or varieties are selected in standardisation is not a question of identifying a single cause, but a trade-off between several factors. These factors can be universal across standardisation histories or historically contingent, and they can stand in relation either to the linguistic form or distribution of the candidates in question. Also, attitudes based on either form or distribution have an often decisive effect on which candidate prevails. Register variation plays a crucial role in selection alongside geographical and social variation, especially in terms of written and formal language. Future studies may attempt to establish relative weights of importance between these (types of) factors. Because some factors appear to be (relatively) general, standard languages tend to develop similar linguistic structures, which were called 'standard universals' in analogy to Chambers' 'vernacular universals'. One of these features, at least in the European context, is the existence of relative pronouns instead of or in addition to relative particles. This can be traced to a universal factor of grammatical explicitness and unambiguity.
The case study focused on the standardisation of relativisers in German, which is an ideal example of the application of several of the concepts laid out in the first half of the paper. Its outcome was the result of a combination of several factors concerning the linguistic structure, the distribution across registers, and the evaluation of the three selection candidates by the speech community and agents of codification. Because so and welch- were de-selected as parts of larger stylistic complexes, this example also shows that register variation beyond the categories of formal and written has to be taken into account alongside social and geographical variation. At the same time, it exemplifies how universal factors of selection can lead to similar structures in several standard languages.