1 Introduction

Wenzhounese is a subgroup of the Wu dialect (Zhengzhang and Zheng 2015), mainly spoken in Wenzhou City, Zhejiang Province, China. Its canonical word order is SVO, but in the presence of an ostensible object marker de, the relative order of V and O is reversed, as shown in (1).Footnote 1 The (un)acceptability of the Wenzhounese sentences in this paper has been checked with four native speakers of Wenzhounese (excluding the author, who is also a native speaker of Wenzhounese).

figure a

(1a) exemplifies the VO order, where the verb tsei ‘lend’ precedes the object soutɕi ‘phone’. In (1b), soutɕi precedes tsei, resulting in the word order de + OV. Linearly, the de-construction takes the form “Subject de Object Verb”.

Essentially, de is the counterpart of Mandarin bǎ, a morpheme arguably expressing the affectedness of the object (cf. Li 2006) and resulting in the linear order “Subject bǎ Object Verb” similar to (1b). Additionally, de and bǎ share similar prosodic and semantic constraints (Feng 2019; Li 2006; Liu 1997; Wang 2017). The prosodic constraint requires that the verb after de/bǎ must not be monosyllabic (2), and the semantic constraint rules out certain verb types from the de/bǎ-construction (3).Footnote 2

figure b
figure c

In (2), the verb ‘publish’ has monosyllabic and disyllabic variants, i.e. fopi and fo in Wenzhounese versus fābiǎo and fā in Mandarin. Although all of them can be followed by vantɕɛ/wénzhāng ‘article’ in a canonical VO order, only the disyllabic ones are legitimate in (2). That is, verbs’ syllabicity may affect the acceptability of the de/bǎ-construction. On the other hand, (3) shows that stative verbs such as ɕiɕy ‘like’, t\(^{\textrm{h}}\)ɛji ‘dislike’, and xiāngxìn ‘believe’ are not allowed in the de/bǎ-construction at all (see Liu 1997 for details).

Section 2 provides a detailed comparison between Wenzhounese de and Mandarin bǎ. Since there is no formal analysis of Wenzhounese de, this paper fills this research gap within the framework of lexical functional grammar (LFG)Footnote 3: Sect. 3 reviews previous LFG approaches to Mandarin bǎ, and Sect. 4 offers an LFG account for the de-construction. Section 5 concludes.

2 Comparing de and bǎ

This section elaborates on the similarities and disparities between de and bǎ, including the topicalization strategies (Sect. 2.1) and topic properties (Sect. 2.2) of post-bǎ/de NPs, the distribution of the universal quantifier (Sect. 2.3), the double-de construction (Sect. 2.4), the voice of post-bǎ/de verbs (Sect. 2.5), and the possibilities of admitting retained subjects/objects (Sect. 2.6).

2.1 Topicalization

As a topic-prominent or discourse-configurational language (Kiss 1995; Li and Thompson 1976; Xu 2015), Mandarin often marks the information-structural topic by having it occupy the sentence-initial position. This notwithstanding, it is generally impossible to topicalize the post-bǎ NP (McCawley 1992, p. 225). (4a) is a typical bǎ-sentence, where bǎ precedes the object NP zhè-xiē píngguǒ ‘these apples’. As we can see in (4b), preposing the post-bǎ NP to the clause-initial topic position results in ungrammaticality, unless we use a resumptive pronoun after bǎ (4c).

figure d

In contrast to Mandarin, the post-de NP in Wenzhounese can be topicalized, using either the gap strategy (5a) or resumption (5b).Footnote 4

figure e

In (5b), the resumptive pronoun does not agree in number with its antecedent, while its Mandarin counterpart in (4c) shows optional number agreement. Note that number agreement is required in Wenzhounese when the object in an SVO sentence is topicalized (6), so the question is: why is it absent in the de-construction?

figure f

The answer may be that there are two types of resumptive pronouns involved. Those with number agreement are ordinary pronouns, but those without are processor resumptives, which are not required by the grammar but “inserted where a gap would lead to ungrammaticality or processing difficulty” (Asudeh 2012, p. 41). For example, the underlined pronouns in (7) are processor resumptives, without which the sentences’ acceptability will degrade.

figure g

The insertion of processor resumptives is post-grammatical, i.e. they are “produced through incremental construction of locally well-formed structures” in the processing model (Asudeh 2012, p. 298). See also Engdahl (1982) for Swedish examples.

Similar to the pronouns in (7), the resumptive gi in the de-construction (glossed as gi hereinafter) helps alleviate island constraint violations. The data below show that extracting the post-de object from within a complex NP island impairs the sentences’ acceptability.Footnote 5

figure h

The variance of acceptability suggests that information from non-syntactic module(s) plays a role in the interpretation of these sentences. Crucially, when the gap is replaced by gi, the sentences become more acceptable, which is similar to (7) where the processor resumptives in English facilitate processing.

figure i

In short, in Mandarin, topicalizing the post-bǎ NP must be accompanied by a resumptive pronoun. In Wenzhounese, either a gap or a resumptive pronoun gi is allowed. This gi seems to be a processor resumptive, an analysis for which I present further support in Sect. 2.3.

2.2 Topic properties

We have seen in the previous section that the post-de NP can be topicalized. This section shows that even setting aside topicalization, the in-situ NP manifests topic properties.

Listing the following properties of topic in Mandarin Chinese, Tsao (1987) demonstrates that the post-bǎ NP exhibits almost all these properties, except for (10f), which is difficult to test in the bǎ-construction (p. 11).Footnote 6

figure j

Footnote 7

The post-de NP in Wenzhounese shares most of these topic properties.Footnote 8 For example, it can head a topic chain (11) and it can be followed by a topic marker, i.e. ne in (12).

figure l
figure m

Moreover, the post-de NP can establish a whole-part or possessor-possessee relation with a following NP, just as sentence-initial topics may do (Tsao 1987, pp. 18–20).

figure n

Summarizing, the post-bǎ/de NPs exhibit topic properties outlined by Tsao (1987), suggesting that they should be analyzed as a topic at some level of representation.

2.3 Distribution of quantifiers

In Wenzhounese, the distribution of the quantifier is sensitive to whether the post-de NP is a resumptive pronoun or a regular (pro)nominal. To begin with, consider the universal quantifier dōu ‘all’ in Mandarin. In the bǎ-construction, dōu must follow the post-bǎ NP to take scope over it (cf. Huang et al. 2009, p. 180ff). In (14), therefore, dōu cannot precede bǎ. The quantifier at issue is underlined.

figure o

When the post-bǎ position is filled by a resumptive pronoun, the quantifier still occupies the same position, as (15) shows. Note that the quantifier cannot immediately follow the topic.

figure p

The universal quantifier in Wenzhounese patterns with its Mandarin counterpart when there is no topicalization. That is, when the post-de NP stays in-situ, the quantifier must follow this NP, and thus also follows de.

figure q

However, when the post-de NP is topicalized and leaves a gap, the quantifier preferably precedes de (17a).Footnote 9 Crucially, when gi occupies the post-de position (17b), the quantifier also needs to precede de, in opposition to the non-resumptive pronoun in (16). This contrasts with the bǎ-construction in Mandarin, where the quantifier uniformly follows the post-bǎ NP.

figure r

This gap-like property of gi is confined to the de-construction, as a comparison with topicalization in SVO sentences shows. When using the gap strategy, we can insert a quantifier between the subject and the predicate (18a). The placement of the quantifier is similar to (17), i.e. immediately after the subject. The quantifier cannot be used when there is no topicalization (18b) or when there is an “ordinary” resumptive pronoun with number agreement. This differs from de-sentences like (17b) that contain gi, as they allow the quantifier to intervene between the subject and de.

figure s

In short, the resumptive pronoun in (18) shows number agreement and blocks quantifier float, suggesting that it plays an active role in the grammar. gi in (17) does not show number agreement and triggers quantifier float, just like a gap. This can be explained if the post-de gi is a processor resumptive; that is, gi is not licensed by competence but by performance, so it does not interact with the grammar (Asudeh 2011).

Finally, there are also phonological differences between gi and “true” resumptives. The latter, as in (18), may surface with their underlying tone, e.g., [gi\(^{31}\)] ‘3sg’ and [gi\(^{31}\)-le\(^0\)] ‘3pl’. The processor resumptive gi, as in (17), obligatorily undergoes tonal neutralization (indicated by “0”) and optionally undergoes lenition for some speakers: [gi\(^{0}\)] \(\rightarrow \) [ji\(^0\)]. In other words, the processor resumptive is prosodically weaker than the true resumptive.

2.4 The double-de construction

Another disparity between bǎ and de is that a de-sentence can have two des (which I will call the double-de construction), whereas a bǎ-sentence can only have one bǎ.

figure t
figure u

In (19a) and (20), there are two occurrences of de, the second of which can be followed by either a gap or gi. By contrast, a parallel double-bǎ construction such as (19b) is ungrammatical in Mandarin.

An important restriction on the double-de construction is that the second de can only be followed by a gap or gi, but not a full NP (21a). The first post-de NP usually does not topicalize so as to avoid two consecutive tokens of de (21b), but topicalization is possible when there is an intervening verb (21c). These generalizations are summarized in (22).

figure v
figure w

Thus, there is an asymmetry in the double-de construction: while the first de can be followed by a full NP, the second de cannot.

2.5 Unmarked passives

In Mandarin, passivization can be marked by the morpheme bèi, as in (23a). Tan (1991, Ch.3) uses tests like reflexive binding, adjunct control, and imperatives to show that (23b) is also a passive sentence, albeit morphologically unmarked.

figure x

Bender (2000, Sect. 7.2) argues that in the core cases of the bǎ-construction (i.e. those without “retained objects”, to be discussed in Sect. 2.6), the embedded verb is also an unmarked passive. The questions are (i) whether unmarked passives are attested in Wenzhounese and (ii) whether the post-de verb can be an unmarked passive.

It turns out that unmarked passives are widely used in Wenzhounese complex predicates. As Pan (1997) observes, resultative-verb compounds (RVCs) in Wenzhounese must be preceded by their objects, despite the canonical SVO word order.

figure y

In what follows, I provide evidence that the ostensible SOV order is actually TSV (“T” stands for Topic), in which the RVC is an unmarked passive. First, the sentence-initial NP in the ostensible SOV sentence (25a) cannot bind the subject-oriented reflexive, while the one in the de-construction (25b) can.Footnote 10 This suggests that lɛ-tɕɛ ‘old Zhang’ bears the subject function in the de-sentence (25b) but does not in the ostensible SOV sentence (25a).

figure aa

Second, subject-oriented adverbs like dede-na ‘deliberately’ are incompatible with ostensible SOV sentences (26a) but compatible with de-sentences (26b).

figure ab

These disparities can be accounted for if the RVCs are unmarked passives that only subcategorize for a patient subject. The sentence-initial NPs in these sentences, e.g., lɛ-tɕɛ ‘old Zhang’ in (25a) and (26a), are topics whose semantic role is agent. In short, the ostensible SOV order is actually TSV, with a verb in the unmarked passive.

The same reasoning can be applied to verbs suffixed by the perfective marker -ɦɔ or the multifunctional aspect marker -tɕ\(^{{h}}\)i (e.g., tso-tɕ\(^{{h}}\)i ‘install-up’ and tɕɔ-tɕ\(^{{h}}\)i ‘wear-up’).Footnote 11 While the corresponding simplex verbs (i.e. monomorphemic verbs without the aspectual marker) are atelic and require the SVO order, the V-asp verbs pattern with RVCs in terms of telicity, word order, reflexive binding, and adverbial modification. In other words, these complex verbs are also unmarked passives.

Tan (1991, Sect. 3.5.1) has shown that in Mandarin, only telic verbs can passivize. In Wenzhounese, RVCs and V-asp verbs are also telic, and thus they form a natural class for (unmarked) passivization. Crucially, all these unmarked passive verbs can enter the de-construction. Given that passivization is a lexical rule (Bresnan 1982b) and the de-construction is a syntactic construction, the syntax cannot alter the voice of a verb as per the principle of lexical integrity (Bresnan and Mchombo 1995). Consequently, a passive verb must remain passivized in the de-construction.

To sum up, this section has argued that RVCs and V-asp forms in Wenzhounese are unmarked passives, which explains why these complex verbs cannot take a post-verbal object in an otherwise SVO language. By contrast, their Mandarin counterparts can still bear the active voice and take post-verbal objects, although it is usually their passivized version that enters the bǎ-construction (Bender 2000).

2.6 Retained objects and retained subjects

Post-bǎ verbs can be unmarked passives, but are not necessarily so. Some of them can take an additional object (Hsueh 1989). In (27), for instance, bǎ is followed by the NP júzi ‘orange’, and the verb ‘peel’ is followed by another NP ‘skin’. Such data are problematic for analyses that treat bǎ as an object marker (e.g., Peyraube and Wiebusch 2021) because an ad hoc rule is required to increase the valence of ‘peel’.

figure ac

The object of the post-bǎ verb, in addition to the one that bǎ allegedly object-marks, is known as the retained object (Bender 2000, Sect. 3.2). As Tsao (1987, p. 19) observes, the post-bǎ NP and the retained object are either in a possessor-possessee relation or a whole-part relation. We have seen in (13) that such retained objects are also allowed in the de-construction.

Moreover, the post-de verb may take a subject, alongside the post-de NP. What I shall refer to as “retained subject”, to the best of my knowledge, is not reported in studies of the bǎ-construction. In (28), the first line indicates the underlying tone values of seiku bei ‘watermelon peel’. The underlying tones may change in connected speech, a process known as tone sandhi (see, e.g., Bao 2011; Chen 2000; Zhengzhang 2008). For (28), the surface tones can be realized in two ways. The first possibility is for seiku bei to undergo trisyllabic tone sandhi (Chen 2000, p. 480), which is a lexical phonological rule operative on compounds. This is shown in (28a), where the brackets delimit the tone sandhi domain. As such, bei is part of the compound seiku bei, which as a whole serves as the post-de NP. Another possibility (28b) is for seiku to undergo disyllabic tone sandhi (also a lexical rule), to the exclusion of bei. In this case, there is a noticeable pause after seiku, indicating a prosodic boundary and suggesting that seiku and bei belong to different syntactic phrases if there is a general alignment between syntactic and prosodic constituents (Selkirk 1986, 2011).

figure ad

Further support for the syntactic phrase boundary comes from questionability and topicalization. For the question in (29a), only the reading in (28a) is legitimate; but if the question is (29b), only (28b) is a possible answer.

figure ae

In terms of topicalization, we can see in (30) that seiku can be topicalized, which is impossible if seiku is a possessor of bei within the same NP because the presence of the possessive clitic gi incurs ungrammaticality.

figure af

Another instance of the retained subject is k\(^{{h}}\)o duɔ ‘air conditioner’ in (31). It is a subject because the verb tso-tɕ\(^{{h}}\)i ‘install-up’ is passivized (Sect. 2.5) and cannot take an object. Nor does this verb subcategorize for the argument kosa tei ‘inside the classroom’, which therefore must be introduced by de. This argument contains a nominal kosa ‘classroom’ and a localizer tei ‘inside’, but it is still nominal in nature (Li 2019; Nie and Liu 2021).Footnote 12

figure ag

Note that Mandarin also admits a [nominal + localizer] NP after bǎ, as in (32).Footnote 13 The difference is that tián-mǎn ‘fill-full’ is an active verb whose post-verbal NP is its object.

figure ah

The main point is that both the bǎ- and the de-constructions can include a retained object, but only the latter can include a retained subject. This is a consequence of unmarked passivization of telic verbs in Wenzhounese (Sect. 2.5), due to which the object of a transitive verb becomes the subject of the corresponding passivized verb. The existence of retained subjects indicates a structural difference between the post-bǎ and post-de verbs: the former heads a VP but the latter an IP (Sect. 4.1). Also, the grammatical function associated with this IP must not be an open function with a missing subject (xcomp in LFG) because of the existence of retained subjects (Sect. 3.2).

2.7 Summary

Section 2 has shown that both Wenzhounese de and Mandarin bǎ allow retained objects and often take passivized verbs that are morphologically unmarked. Also, the post-de/bǎ NPs in both languages have properties typical of topics. Despite these similarities, they differ in the following respects:

figure ai

I have argued above that gi is a processor resumptive that is licensed by the post-grammatical processing model, hence its lack of number agreement (33b). Note also that (33e) can be readily explained by the extensive use of unmarked passives in Wenzhounese. Still, a full analysis of de needs to capture the remaining differences. I offer such an analysis in Sect. 4. Before that, it is necessary to review previous work on Mandarin bǎ, which may be relevant to the analysis of de.

3 Previous analyses of bǎ

There is only non-formal, descriptive work on Wenzhounese de, e.g., Chen (2010), Lin (2019, Sect. 3.2), and Zhengzhang (2008, pp. 245–246). I will therefore focus on Mandarin bǎ, whose synchronic, diachronic, and formal properties have been extensively studied (see, e.g., Huang et al. (2009) and Sun (2015) for an overview). This notwithstanding, not even bǎ’s part-of-speech is settled. There are claims that it is a preposition (Her 1990; McCawley 1992), a verb (Hashimoto 1971; Bender 2000), or a functional category (Sybesma 1999; Huang et al. 2009; Li 2006). Most of this work (including recent papers such as Shu 2018; Sun 2018; Wang 2017; Zhao 2021) is couched in a Chomskyan approach to syntax, and may be incompatible with the non-derivational, strictly modular framework of LFG. For example, the claim that bǎ projects a functional category, e.g., baP or VoiceP, is generally not accepted in LFG, because functional categories “are only warranted when a particular functional feature is associated with a structural position” (Bögel et al. 2018, p. 110).

My main focus in the following is LFG approaches to the bǎ-construction, specifically Her (1990), Bender (2000), and Her (2009).Footnote 14 Nevertheless, the points that I make will be as theory-neutral as possible, so they should easily extend to other frameworks. In Sect. 3.1, I introduce the core concepts of LFG. Section 3.2 discusses previous analyses of bǎ and their compatibility with Wenzhounese de.

3.1 Introducing lexical functional grammar

As the name suggests, LFG is a lexical theory, where “regularities across classes of lexical items are part of the organization of a richly structured lexicon” (Dalrymple et al. 2019, p. 3). The theory is functional because grammatical functions like subject and object are theory primitives. For details of LFG formalisms and their implementation, the reader may find useful Falk (2001), Bresnan et al. (2016), Börjars et al. (2019), and Dalrymple et al. (2019). Shorter introductions include Carnie (2013, Ch.16), Müller (2020, Ch.7), and Sells (2013).

LFG is a modular theory that posits various domain-specific linguistic components.Footnote 15 One such component is f(unctional)-structure. F-structure is a syntactic structure that encodes grammatical functions (GFs) such as subject and object, along with grammatical features like number, tense, and gender. Relevant for this paper are the following GFs: subj(ect), obj(ect), comp(lement), xcomp(lement), and adj(unct). Among them, xcomp and comp are clausal functions. The former is an open function lacking an internal subject, while the latter is a closed function containing an internal subject (Dalrymple et al. 2019, p. 29). To illustrate, (34b) is the f-structure for (34a).

figure aj

We can see in (34b) that f-structure is an unordered attribute-value matrix, so it disregards linear order and constituency. These pieces of information are encoded in a distinct but related syntactic module called c(onstituent)-structure. It is organized as per X-bar theory, but a principle of Economy (Bresnan et al. 2016, p. 90) prunes out empty nodes and non-branching intermediate nodes. For example, (35) is the c-structure for (34a).

figure ak
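The division of labour between the two structures can be made concrete with a minimal sketch. It is not the paper's example (34)/(35): the English stand-in sentence "Kim sees Lee", the attribute names, and the helper `terminals` are all assumptions chosen for illustration. An f-structure is modelled as an unordered attribute-value mapping, and a c-structure as an ordered tree.

```python
# Hypothetical stand-in for an f-structure: an unordered attribute-value matrix.
# The sentence "Kim sees Lee" and all attribute names are illustrative only.
f_structure = {
    "PRED": "see<SUBJ,OBJ>",
    "TENSE": "present",
    "SUBJ": {"PRED": "Kim", "NUM": "sg"},
    "OBJ": {"PRED": "Lee", "NUM": "sg"},
}

# The same information listed in a different order is the same f-structure.
f_structure_reordered = {
    "OBJ": {"NUM": "sg", "PRED": "Lee"},
    "SUBJ": {"NUM": "sg", "PRED": "Kim"},
    "TENSE": "present",
    "PRED": "see<SUBJ,OBJ>",
}

# Hypothetical stand-in for a c-structure: (label, children), where order matters.
c_structure = ("IP", [("NP", ["Kim"]), ("VP", [("V", ["sees"]), ("NP", ["Lee"])])])


def terminals(node):
    """Read the words off a c-structure tree from left to right."""
    if isinstance(node, str):
        return [node]
    _label, children = node
    words = []
    for child in children:
        words.extend(terminals(child))
    return words


same_fstructure = f_structure == f_structure_reordered  # order is irrelevant
word_order = terminals(c_structure)                      # order is preserved
```

Linear order and constituency are recoverable only from the tree, while the dictionary records nothing but functions and features.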

C-structures are licensed by annotated phrase structure rules, which specify how c-structural nodes are mapped to f-structures. For example, in Mandarin and Wenzhounese, but not necessarily other languages, the specifier of IP is associated with subj, while the complement of V is associated with obj. We can express such associations by (36):

figure al

The arrows in (36) are metavariables (Falk 2001, p. 69): \(\uparrow \) refers to the f-structure of the mother node and \(\downarrow \) to the f-structure of the current node. In (36a), \(\uparrow \) = \(\downarrow \) under I\(^\prime \) means that the mother node of I\(^\prime \), i.e. IP, maps to the same f-structure as I\(^\prime \) does. The annotation under NP says that NP’s f-structure is the subj of IP’s f-structure. Similarly, (\(\uparrow \) obj) = \(\downarrow \) in (36b) says “my mother’s object is me.”
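How the two annotation types build an f-structure from a tree can be sketched as follows. This is a toy under stated assumptions, not an implementation of any LFG parser: the class `Node`, the function `build_fstructure`, the string "head" as an encoding of ↑ = ↓, and the English example are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    # "head" encodes the annotation (up = down); a function name such as
    # "SUBJ" encodes (up SUBJ) = down. This encoding is an assumption.
    annotation: str = "head"
    children: list = field(default_factory=list)
    lexical: dict = field(default_factory=dict)  # features contributed by a word


def build_fstructure(node, f=None):
    """Map a c-structure node to the f-structure that its annotations describe."""
    if f is None:
        f = {}
    f.update(node.lexical)
    for child in node.children:
        if child.annotation == "head":
            # up = down: the child shares its mother's f-structure
            build_fstructure(child, f)
        else:
            # (up GF) = down: the child's f-structure is the value of GF
            f[child.annotation] = build_fstructure(child, {})
    return f


# Hypothetical tree: the specifier of IP is SUBJ, the complement of V is OBJ.
tree = Node("IP", children=[
    Node("NP", "SUBJ", lexical={"PRED": "Kim"}),
    Node("I'", children=[
        Node("VP", children=[
            Node("V", lexical={"PRED": "see<SUBJ,OBJ>"}),
            Node("NP", "OBJ", lexical={"PRED": "Lee"}),
        ]),
    ]),
])

result = build_fstructure(tree)
```

IP, I', VP, and V all carry the head annotation, so they contribute to one and the same f-structure, whereas each annotated NP opens a new one.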

These annotated phrase structure rules relate c-structure to f-structure, but they give only partial information about f-structure and are designed to work in concert with lexical entries.

figure am

(37) contains partial lexical entries for \(^{{h}}\)i and seiku, which list their syntactic categories and f(unctional)-descriptions.Footnote 16 For example, (37a) specifies that \(^{{h}}\)i is a two-place verb, contributing to f-structure the value of a pred(icate) with a valency of two (two arguments appear in the bracketed argument list).

In short, LFG posits two structures to represent different aspects of syntactic information. F-structure records grammatical functions and features, while c-structure delineates linear order and constituency. Although these structures are independent, they are mutually constrained by lexical information and annotated phrase structure rules. We are now equipped with enough formal details to evaluate LFG analyses of the bǎ-construction.

3.2 The bǎ-construction in LFG

In this section, I compare the proposals of Her (1990, 2009) and Bender (2000). Although these are LFG analyses, most of my points will be general and not specific to LFG. I will show that none of these accounts can readily capture the de-construction in Wenzhounese, given the differences between de and bǎ summarized in (33).

To begin with, Her (1990) proposes that bǎ is a preposition, taking the post-bǎ NP as its complement. This PP is subcategorized for by the “main verb” (i.e. the post-bǎ verb). There are three problems with his analysis. First, most verbs that can enter the bǎ-construction have a non-bǎ counterpart, as exemplified in (38). If the alleged PP is an argument of the main verb, then these verbs need distinct lexical entries for bǎ- and non-bǎ-sentences. This is not economical, given that such verbs are abundant in Mandarin.

figure an

Second, Her’s analysis predicts that the matrix subject must be selected by the main verb. However, there are cases, both in Mandarin and Wenzhounese, where the main verb does not subcategorize for the matrix subject. In both examples below, the RVC meaning ‘fall asleep as a result of watching’ is an intransitive verb subcategorizing for an experiencer subject, but the matrix subject ‘this movie’ is incompatible with this theta role. Therefore, the matrix subject’s subjecthood and theta role should come from bǎ/de (Bender 2000, p. 110).

figure ao

Finally, the claim that bǎ is a preposition is not well justified. Ostensible supporting evidence comes from McCawley (1992), who lists five tests to distinguish prepositions from verbs and concludes that bǎ is a preposition. However, Bender (2000, p. 137) demonstrates that these tests actually “support an analysis of bǎ as a verb”. Of particular interest is McCawley’s (1992, p. 220) observation that “objects of Vs usually can undergo extraction or deletion, while objects of Ps are less free in allowing extraction or deletion.”

figure ap

We can see above that the object of Vs can be topicalized (40a), while the object of the preposition ‘from’ cannot (40b–c). The post-bǎ NP cannot be topicalized either (40d), suggesting that bǎ is a preposition. However, extracting the post-de NP is allowed in Wenzhounese (see (5)), but extraction from within a PP is nevertheless prohibited (41):

figure aq

The contrast between (5) and (41b) suggests that Wenzhounese de is not a preposition. It also provides indirect evidence that the ban on extraction from the post-bǎ position may result from factors other than bǎ being a preposition. One such factor can be found in Bender (2000), who offers an independent reason for the ban on extracting the post-bǎ NP.

In Bender’s (2000) view, bǎ is a verb. Observing that the post-bǎ NP manifests topic properties (cf. Sect. 2.2), she proposes the following lexical entry for bǎ (p. 127):

figure ar

According to (42), bǎ is a three-place verb, subcategorizing for a subject, an object, and a finite clause (i.e. the closed function comp). The object corresponds to the post-bǎ NP, which simultaneously serves as the topic of the finite clause. The c- and f-structures for (43), given in (44), illustrate Bender’s analysis.

figure as
figure at

In (44b), (\(\uparrow \) obj) = (\(\uparrow \) comp topic) ensures that obj also functions as topic in the embedded clause (i.e. comp). This equating of two functions gives rise to structure sharing, notated by a line connecting topic and the f-structure that is the value of obj. This formalism allows “the two functions to have the same f-structure as their value” (Falk 2001, p. 126).
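Structure sharing can be pictured as two attributes pointing at one and the same object, rather than at two copies. The sketch below assumes that reading; the PRED values, the feature DEF, and the English glosses are hypothetical and are not taken from (42) to (44).

```python
# One f-structure for the post-ba NP (contents are illustrative only).
post_ba_np = {"PRED": "apple", "NUM": "pl"}

# Hypothetical matrix f-structure: OBJ and COMP TOPIC are the SAME object,
# mirroring the equation (up OBJ) = (up COMP TOPIC).
matrix = {
    "PRED": "ba<SUBJ,OBJ,COMP>",
    "SUBJ": {"PRED": "pro"},
    "OBJ": post_ba_np,
    "COMP": {"PRED": "eat<SUBJ>", "TOPIC": post_ba_np},
}

# Token identity, not mere equality of content.
shared = matrix["OBJ"] is matrix["COMP"]["TOPIC"]

# Information added through one path is visible through the other.
matrix["OBJ"]["DEF"] = True
visible_via_topic = matrix["COMP"]["TOPIC"].get("DEF")

# A copy with equal content would NOT count as structure sharing.
copy_of_np = dict(post_ba_np)
is_copy_shared = copy_of_np is post_ba_np
```

The line drawn in an f-structure diagram corresponds to the identity check here: there is one value with two access paths.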

By definition, the finite clause, i.e. comp in (44b), should have its own subject, but in (43) and (44a) there is no overt subject within this clause. Consequently, we need to posit a covert subject for it to satisfy the Completeness Condition.

figure au

This covert subject, as we can see in (44b), is an abstract pronoun (pro) whose referent is determined by the context. In LFG terms, this pro is anaphorically controlled by a contextually available NP (Dalrymple et al. 2019, Ch.15; Falk 2001, Sect. 5.2), as indicated by the index attribute in (44b).Footnote 17
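The role of the covert subject can be illustrated with a small completeness check. The sketch assumes the usual formulation of the condition (every function listed in a PRED's argument list must be present in the local f-structure) and a string encoding of PRED values such as "peel<SUBJ,OBJ>"; the function names and the example clause are hypothetical.

```python
import re


def governed_functions(pred):
    """Extract the argument list from a PRED string, e.g. 'peel<SUBJ,OBJ>'."""
    match = re.search(r"<(.*)>", pred)
    if not match:
        return []  # e.g. 'pro' or a plain noun: no governed functions
    return [gf.strip() for gf in match.group(1).split(",") if gf.strip()]


def is_complete(f):
    """True if every governed function is present, here and in all sub-structures."""
    for gf in governed_functions(f.get("PRED", "")):
        if gf not in f:
            return False
    return all(is_complete(v) for v in f.values() if isinstance(v, dict))


# Hypothetical embedded clause with no overt subject: incomplete as it stands.
comp = {"PRED": "peel<SUBJ,OBJ>", "OBJ": {"PRED": "skin"}}
before = is_complete(comp)

# Supplying a pronominal subject, (up COMP SUBJ PRED) = 'pro', repairs it.
comp["SUBJ"] = {"PRED": "pro"}
after = is_complete(comp)
```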

Bender’s (2000) analysis has several advantages over Her’s (1990). On the one hand, the matrix subject is selected by bǎ, not the main verb, so it explains why sometimes the matrix subject is not related to the main verb (see (39)). On the other, it captures the post-bǎ NP’s topic properties (Sect. 2.2), which Her (1990) ignores. Furthermore, Bender (2000, p. 137) cites Huang’s (1992) observation that “object controllers (overt objects which control another function) cannot be structural [i.e. clause-initial] topics.” This may be why the post-bǎ NP cannot be fronted, as it controls comp topic in (42).

However, Her (2009, pp. 451–453) points out that Bender’s analysis leaves comp subj unaccounted for. In other words, the anaphoric control relation between comp subj and another function is not specified. Therefore, Her proposes to make the following revision:

figure av

Here, \(\hat{\theta }\) represents the logical subject and \(\gamma \) is one of the correspondence functions in LFG’s parallel architecture. \(\gamma \)(\(\hat{\theta }\)) maps the logical subject to its associated GF. Thus, the f-description (\(\uparrow \) subj) = (\(\uparrow \) xcomp \(\gamma \)(\(\hat{\theta }\))) links bǎ’s subject with the logical subject of the first embedded clause. Her (2009, p. 447) also mentions in a footnote that replacing comp with xcomp is “more in line with the LFG conventions”, presumably because the embedded clause is assumed to be an open function lacking a subject (Sect. 3.1).

To see how (46) works, let us first look at (44b), which has a retained object ‘skin’ in the embedded clause (comp in (44b) but xcomp as per (46)). The predicate of this embedded clause is ‘peel’, whose logical subject is its syntactic subject. Therefore, (46) successfully links the matrix subject ‘s/he’ to the embedded subject. On the other hand, bǎ-sentences containing unmarked passives (Sect. 2.5) receive a different treatment:

figure aw

In the f-structure, obj simultaneously serves as xcomp topic, as required by the second line of the f-description in (46). The third line of the f-description is not realized syntactically, because the logical subject of chāi ‘demolish’, i.e. Lisi, is demoted as a result of passivization. Although Her (2009) does not specify how xcomp subj is functionally controlled, it can be accounted for via the default lexical rule of functional control: \((\uparrow \textsc {obj})=(\uparrow \textsc {xcomp\,subj})\) (Bresnan 1982a, p. 378).

A problem with Her’s (2009) proposal concerns xcomp, whose subj is said to be “functionally or anaphorically controlled” (p. 451). This is incorrect because xcomp, by definition, is the functionally controlled clause (Bresnan 1982a, p. 376; Dalrymple et al. 2019, Sect. 15.1). This is not just an issue of terminology, as empirical evidence shows the xcomp assumption for the bǎ/de-construction is mistaken.

First, an xcomp analysis predicts that split antecedents are impossible in bǎ/de-sentences (Bresnan 1982a, p. 396). However, the following examples show that ‘old Zhang’ and ‘old Huang’ can be split antecedents for the embedded subject.

figure ax

Second, the de-construction allows a retained subject (Sect. 2.6), e.g., bei ‘peel’ in (49), which would have been ruled out were the subject functionally identical to another phrase.

figure ay
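The retained-subject argument can be restated as a unification clash. The sketch below is a deliberate simplification: `unify` handles only flat feature sets, it treats PRED as an ordinary feature (in LFG proper, PRED values are uniquely instantiated, so even two identical PREDs would clash), and the glosses are stand-ins for seiku ‘watermelon’ and bei ‘peel’.

```python
def unify(a, b):
    """Unify two flat feature sets; return None if any attribute clashes."""
    merged = dict(a)
    for attribute, value in b.items():
        if attribute in merged and merged[attribute] != value:
            return None
        merged[attribute] = value
    return merged


post_de_np = {"PRED": "watermelon"}       # stand-in for seiku
retained_subject = {"PRED": "peel"}       # stand-in for bei

# Functional control (XCOMP): the controller and the embedded SUBJ must be one
# f-structure, so an overt embedded subject with its own PRED cannot be merged in.
functional_control = unify(post_de_np, retained_subject)

# Anaphoric control (COMP): the embedded SUBJ is a separate f-structure, so an
# overt retained subject is unproblematic.
anaphoric_control = {"OBJ": post_de_np, "COMP": {"SUBJ": retained_subject}}
```

On this view the xcomp analysis fails for the same reason that the unification returns nothing, while the comp analysis leaves room for an independent subject.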

Based on the discussion above, I propose the following lexical entry for bǎ:

figure az

The first two lines of the lexical entry are in line with Bender (2000): the verbal complement of bǎ is a comp, whose topic is identified with bǎ’s obj via structure sharing. The third line supplies a pronominal subject for the comp to meet the Completeness Condition (45). The antecedent of this pro is captured by the two templates (Dalrymple et al. 2004) defined in the fourth line. The templates serve to reformulate Her’s (2009) insight that the referent of comp subj is predictable. The passive template is for bǎ-sentences with a passive main verb and the active template is for bǎ-sentences with an active main verb, including those with a retained object and those that involve long-distance dependencies (see below). These templates are defined in (51):

figure ba

Both templates contain an if-then clause, as indicated by the double right arrow.Footnote 18 Informally, the passive template states that “if my complement clause is in the passive voice, then my object is coindexed with the complement clause’s subject.” The first two lines of the active template, which are sufficient to analyze retained objects, state that “if my complement clause is in the active voice, then my subject is coindexed with the complement clause’s subject.” The third line is optional, as indicated by the outermost brackets. It uses a local name %eobj (Kaplan and Maxwell III 1996, pp. 90–91), mnemonically for “embedded object”, to abbreviate the constraint in the fourth line, (\(\uparrow \) comp comp\(^+\) obj).Footnote 19 To put it in prose, this equation says “I can, but don’t have to, introduce a pronominal object at some level of embedding, which must be at least two levels of embedding, and this embedded object is coreferential with my object.”
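The informal paraphrases above can be turned into a small procedural sketch. It is an approximation under several assumptions: the attribute names VOICE and INDEX, the function `apply_ba_entry`, and the example f-structures are hypothetical, and the functional uncertainty comp comp\(^+\) is replaced by an explicit depth parameter that the caller supplies.

```python
def apply_ba_entry(f, embedded_depth=None):
    """Sketch of the conditional part of the entry: supply COMP SUBJ and coindex it.

    embedded_depth is the number of COMP steps down to the clause that receives
    the optional pronominal object; it must be at least 2 (COMP COMP+).
    """
    comp = f["COMP"]
    comp.setdefault("SUBJ", {"PRED": "pro"})  # pronominal subject for completeness
    voice = comp.get("VOICE")
    if voice == "passive":
        # passive template: my object is coindexed with the complement's subject
        comp["SUBJ"]["INDEX"] = f["OBJ"]["INDEX"]
    elif voice == "active":
        # active template: my subject is coindexed with the complement's subject
        comp["SUBJ"]["INDEX"] = f["SUBJ"]["INDEX"]
        if embedded_depth is not None:
            if embedded_depth < 2:
                raise ValueError("COMP COMP+ needs at least two levels of embedding")
            target = f
            for _ in range(embedded_depth):
                target = target["COMP"]
            # optional line: an embedded pronominal object coindexed with my object
            target["OBJ"] = {"PRED": "pro", "INDEX": f["OBJ"]["INDEX"]}
    return f


# Hypothetical passive case.
passive_case = apply_ba_entry({
    "SUBJ": {"PRED": "Zhangsan", "INDEX": "i"},
    "OBJ": {"PRED": "house", "INDEX": "j"},
    "COMP": {"PRED": "demolish<SUBJ>", "VOICE": "passive"},
})

# Hypothetical long-distance case: 'attempt' embeds 'resolve'.
long_distance_case = apply_ba_entry({
    "SUBJ": {"PRED": "Zhangsan", "INDEX": "i"},
    "OBJ": {"PRED": "problem", "INDEX": "j"},
    "COMP": {
        "PRED": "attempt<SUBJ,COMP>",
        "VOICE": "active",
        "COMP": {"PRED": "resolve<SUBJ,OBJ>", "SUBJ": {"PRED": "pro"}},
    },
}, embedded_depth=2)
```

The sketch treats the templates as instructions, whereas in LFG they are declarative constraints on well-formed f-structures; it is meant only to make the conditional logic easy to inspect.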

To illustrate my proposal, consider the sentence and f-structure given in (52).Footnote 20

figure bb

The post-bǎ verb shèfǎ ‘attempt’ is active. The lexical entry for shèfǎ includes ‘attempt <subj,comp>’, i.e. it selects a subject and a finite clause as its arguments. As a subject control verb, shèfǎ requires the embedded subject (the subject of jiějué ‘resolve’) to be coreferential with its local subject, but it says nothing about its embedded object, which is implicit in this case. Intuitively, this implicit object is coreferential with the post-bǎ NP, zhè gè wèntí ‘this problem’, so their relationship is long-distance.

In the f-structure in (52b), the matrix subject is indexed with i, and the matrix object with j (the structure sharing between obj and comp topic is omitted for simplicity). Since shèfǎ ‘attempt’ is an active verb, the active template in (50) is called for, which requires the matrix subject and the embedded subject to have the same index. Consequently, both subj and comp subj in (52b) are indexed with i, i.e. they are coreferential. Within this comp, shèfǎ ‘attempt’ introduces another level of embedding, so the optional equations in (51) are invoked: they assign the value ‘pro’ to the object of jiějué ‘resolve’ to satisfy the Completeness Condition (45) and coindex this object with the post-bǎ NP.

Summarizing, this section has reviewed three LFG approaches to the bǎ-construction and their implications for the de-construction. The conclusions are (i) bǎ is a three-place verb, subcategorizing for a subject, an object, and a finite clause (Bender 2000); (ii) bǎ’s object is multi-functional as it also serves as the topic of the finite clause; and (iii) the subject of the finite clause is not functionally identical to bǎ’s subject or object, but may be coreferential with one of them in a predictable way (Her 2009). These observations are formalized in the lexical entry in (50).

Recall that the multi-functionality of the post-bǎ NP, i.e. simultaneously being obj and comp topic, is the reason why this NP cannot be topicalized (Bender 2000). This means that the same analysis cannot apply to both the bǎ-construction and the de-construction: Sect. 2.2 has presented evidence that the post-de NP has topic properties, so this NP is plausibly multi-functional as well (being both obj and comp topic). However, Sect. 2.1 has shown that the post-de NP can topicalize using the gap strategy, which is unexpected if the multi-functionality prohibits topicalization. This problem, as well as the disparities summarized in Sect. 2.7, will be addressed in the next section.

4 A formal analysis of de

Bender (2000) contends that bǎ’s object serves as the topic of the clausal complement (comp topic). The structural relation between the object and comp topic prohibits the object from taking the clause-initial topic position because, as Huang (1992) has shown, long-distance dependencies involving Mandarin object controllers are disallowed. If Bender’s analysis is on the right track for Wenzhounese de as well, it implies that de’s object should not be equated with comp topic, because this object can be topicalized. By doing so, however, a crucial generalization is lost; that is, de’s object does have topic properties (Sect. 2.2). Therefore, a formal analysis of de needs to address this dilemma as well as capture the differences summarized in (33), repeated below as (53).

figure bc

To capture these observations, I propose two lexical entries for de. De\(_1\) is similar to but not the same as Mandarin bǎ, and it prohibits extraction of the post-de NP. By contrast, de-sentences involving a post-de gap or a resumptive gi are exclusively related to de\(_2\), whose object does not have topic properties per se and can therefore be topicalized.

In the bǎ-construction in Mandarin, the topic, if any, anaphorically binds the resumptive pronoun, while in Wenzhounese, the topic functionally binds the gap. The resumptive gi, as argued in Sect. 2, is a processor resumptive inserted after grammatical computation, so it need not be specified by the grammar (Asudeh 2012). This explains why gi does not inflect for number (53b) and why it is gap-like with regard to quantifier float (53c). Section 4.3 will argue that quantifier float is essentially a locality effect. For the double-de construction (53d), I will argue that the second de must be de\(_2\). Mandarin does not have a de\(_2\) equivalent, hence the absence of a double-bǎ construction. Finally, if the clausal complement of de corresponds to IP in c-structure, it can license the retained subject (53e). By contrast, the clausal complement of bǎ needs to correspond to VP to avoid overgenerating a retained subject in Mandarin. The full analyses are presented below.

4.1 De \(_1\)

I postulate de\(_1\) for de-sentences in which there is neither topicalization nor gi, so it is similar to Mandarin bǎ. The only difference lies in the third line of (54).

figure bd

Just as bǎ does in (50), de\(_1\) subcategorizes for a subject, an object, and a finite clause. The second line in (54) equates obj with comp topic, which explains the topic properties of this obj (Sect. 2.2) and prohibits it from undergoing topicalization. When the subject of the comp is implicit, the f-description (\(\uparrow \) comp subj pred) = ‘pro’ provides a covert subject to satisfy the Completeness Condition (45). This is where de\(_1\) differs from bǎ: this equation is optional (indicated by the outermost brackets; see Dalrymple et al. 2019, Sect. 5.2.4 for optionality) for de\(_1\) but obligatory for bǎ. Optionality is necessary for de\(_1\) because a retained subject (Sect. 2.6) blocks the application of this equation, for the sake of Coherence.

figure be
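The contrast between an obligatory and an optional ‘pro’ equation can be made concrete with a small sketch. It is a simplification under stated assumptions: the function name and the labels "obligatory" and "optional" are hypothetical, and the conflict between an overt subject and an obligatory ‘pro’ is modelled simply as a failed combination rather than as a full implementation of the well-formedness conditions.

```python
def comp_subject(overt_pred, pro_equation):
    """Return the COMP SUBJ f-structure, or None if no solution exists.

    overt_pred   -- PRED of a retained subject, or None if the subject is implicit
    pro_equation -- "obligatory" (as assumed for Mandarin ba)
                    or "optional" (as assumed for de1)
    """
    if overt_pred is None:
        # Implicit subject: the equation (COMP SUBJ PRED) = 'pro' supplies
        # a covert pronoun, which satisfies Completeness.
        return {"PRED": "pro"}
    if pro_equation == "obligatory":
        # The equation cannot be skipped, so 'pro' and the overt PRED
        # compete for the same subject: no well-formed f-structure.
        return None
    # Optional equation: simply not chosen when a retained subject is present.
    return {"PRED": overt_pred}

implicit = comp_subject(None, "optional")
retained_de1 = comp_subject("peel", "optional")
retained_ba = comp_subject("peel", "obligatory")
```

On these assumptions, a retained subject such as ‘peel’ is compatible only with the optional version of the equation, which is the property that distinguishes de\(_1\) from bǎ.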

The phrase structure rules relevant for de\(_1\) are listed in (56). Note that the subject NP in (56a) is optional because Wenzhounese is a discourse pro-drop language.

figure bf

My proposal successfully generates the c- and f-structures for simple de-sentences, de-sentences with a retained subject or a retained object, and those which involve long-distance dependencies. First, (57) gives a simple de-sentence.

figure bg
figure bh

In (58a), V’ immediately dominates V, NP, and IP\(_2\), the latter two corresponding to f-structural obj and comp, respectively. This flat structure suffices for the present purpose, as (i) there is no solid evidence that [V NP] or [NP IP\(_2\)] form a constituent (cf. Feng 2019; Huang et al. 2009), and (ii) none of my analysis hinges on the flat structure.

In (58b), the matrix object simultaneously functions as comp topic, as required by the second line of de\(_1\)’s lexical entry in (54).Footnote 21 The embedded subject (comp subj) is a covert pronoun, supplied by the third line of (54). Since the embedded verb is an unmarked passive (Sect. 2.5), it invokes the passive template of de’s lexical entry, so both de’s object and comp subj are indexed as j in the f-structure.

The second example (59) has a retained subject. This motivates the IP analysis of de’s clausal complement, as subjects are placed in Spec,IP in Wenzhounese. By contrast, a VP analysis is sufficient for bǎ’s clausal complement (Bender 2000) because the bǎ-construction does not license a retained subject.

figure bi

In (60a), bei ‘peel’ occupies Spec,IP and maps to the embedded subject (comp subj) in (60b). This overt comp subj blocks the application of (\(\uparrow \) comp subj pred) = ‘pro’ in (54) to satisfy Coherence (55), which is possible because the equation is optional. The whole-part relationship between seiku ‘watermelon’ and bei ‘peel’ is not represented in (60b) for ease of exposition, but it could be easily incorporated by positing a poss(essor) attribute for comp subj and having it be functionally bound by comp topic.

figure bj

The third example (61) contains a retained object. The c-structure is unremarkable for the analysis, so only its f-structure is given in (62).

figure bk
figure bl

Here, the embedded verb \(^{{h}}\)i ‘eat’ is active, and it takes an object ʔepø ‘half’. The active template requires the matrix subject to anaphorically control the embedded subject, so subj and comp subj share the index i.

Finally, the example in (63) involves a long-distance dependency between de’s object and the object of tak\(^{{h}}\)e ‘open’.

figure bm
figure bn

In the f-structure in (64), all the subject functions are indexed as i. This anaphoric control relation is established as follows. First, the post-de verb, ɕɛ ‘think’, does not have an overt subject, so the optional equation in (54) needs to apply to provide a covert subject for ‘think’. Second, because ‘think’ is an active verb, it calls for the active template in de\(_1\)’s lexical entry. According to the template’s definition in (51), the subject of de is coindexed with the subject of ‘think’. Third, this verb (more precisely, the idiomatic expression ɕɛ bɔfɔ ‘think of a way’) is a subject-control verb, so its lexical entry carries the information to coindex its subject with the subject of its clausal complement. In this case, comp subj and comp comp subj both bear the index i.

On the other hand, de’s object anaphorically controls the most embedded object. Their coindexation is also achieved by the active template in (51). The last two lines of the template explicitly refer to comp comp\(^+\) obj, which is the object of tak\(^{{h}}\)e ‘open’ in this case. The equation (%eobj pred) = ‘pro’ supplies a predicate value to this covert object, while the equation (\(\uparrow \) obj index) = (%eobj index) mandates anaphoric control between the object of de and the object of ‘open’.

In short, the lexical entry for de\(_1\) (54) is similar to that for bǎ (50), and the only difference is that bǎ obligatorily supplies an implicit subject for its embedded clause whereas de\(_1\) does so optionally. This is motivated by the retained subject in de-sentences, which excludes an implicit subject due to the Coherence Condition. The proposed lexical entry and phrase structure rules successfully capture de-sentences of varying complexity (retained subjects, retained objects, and long-distance dependencies). Nevertheless, the data involving topicalization are still missing from the analysis. I will discuss these in the next section.

4.2 De \(_2\)

To address the topicalization paradox, I contend that there are two different elements de in Wenzhounese. De\(_1\), discussed in the previous section, is the one whose object has topic properties and therefore cannot be topicalized. By contrast, de\(_2\)’s object does not have topic properties. Instead, its object is essentially a gap that must be identified with a topic. The two lexical entries for de, though distinct, are also related, which can be formalized as a lexical redundancy rule (Bresnan 1982b; Jackendoff and Audring 2020). There may also be a diachronic relation between de\(_1\) and de\(_2\), which I leave for future research.

As we shall see, proposing two lexical entries is not just an ad hoc way to dodge the topicalization paradox; rather, it has empirical advantages in accounting for the distribution of the quantifier and the double-de construction.

The lexical entry for de\(_2\) is given in (65a). The first three lines are already familiar to us: they state that de\(_2\) is a three-place verb and it optionally provides a pronominal subject to its clausal complement. It also uses the templates defined in (51) to resolve the anaphoric control relation between de\(_2\)’s local GFs and more embedded GFs.

figure bo

The fourth line begins with \(\phi ^{-1}\), a correspondence relation that maps an f-structure to a c-structural node (Dalrymple et al. 2019, Ch.4). Informally, it says “my object does not correspond to any c-structural node.” Note that this does not contradict the presence of the processor resumptive gi, since gi is licensed not by the grammar, but by the post-grammatical processing model.

The fifth line contains an operation on f-structure called inside-out functional uncertainty, notated (comp* \(\uparrow \)).Footnote 22 Informally, the equation (\(\uparrow \) obj) = ((comp* \(\uparrow \)) topic) can be thought of as a feature path exiting from the f-structure immediately containing obj to superordinate f-structures, with the only feature along this path being comp. This feature path must terminate at a topic attribute in a local or superordinate f-structure. Take (66) for example: the starting point is the f-structure labelled c, from which there is a path to b along the comp attribute in b. The path does not lead to a because there is no comp in a. Given that there is a topic in b, the feature path successfully terminates, so it identifies (b topic) with (c obj).

figure bp
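The path-based intuition behind the inside-out expression can be sketched as a search over nested dictionaries. This is a minimal illustration under several assumptions: the function names are hypothetical, topic is treated as a single f-structure rather than a set, the attribute OTHER linking a to b is a placeholder for whatever non-comp attribute relates them, and the search returns the first admissible host rather than enumerating all solutions.

```python
def all_fstructures(root, seen=None):
    """Yield every f-structure reachable from root, each one only once."""
    seen = set() if seen is None else seen
    if id(root) in seen:
        return
    seen.add(id(root))
    yield root
    for value in root.values():
        if isinstance(value, dict):
            yield from all_fstructures(value, seen)

def comp_star_reaches(g, target):
    """True if following zero or more COMP attributes from g leads to target."""
    while isinstance(g, dict):
        if g is target:
            return True
        g = g.get("COMP")
    return False

def resolve_gap(root, f):
    """Sketch of (f OBJ) = ((COMP* f) TOPIC): share f's OBJ with a TOPIC."""
    for g in all_fstructures(root):
        if comp_star_reaches(g, f) and "TOPIC" in g:
            f["OBJ"] = g["TOPIC"]  # structure sharing: the same object
            return g
    return None

# Toy counterpart of the a/b/c configuration described in the text.
topic = {"PRED": "phone"}
c = {"PRED": "de2"}
b = {"TOPIC": topic, "COMP": c}
a = {"OTHER": b}  # a has no COMP, so no COMP path leads from a to c

host = resolve_gap(a, c)
```

Here the search ends at b, and the object of c becomes the very same dictionary as the topic of b, which mirrors the identification of (b topic) with (c obj).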

The phrase structure rules relevant for constructions that include de\(_2\) are listed in (67). The rule in (67a) is a general rule for topicalization. Topics are adjoined to IP, following Bresnan et al. (2016). Since (at least theoretically) there could be more than one topic, the relevant rule includes a set membership symbol rather than an equals sign: \(\downarrow \) \(\in \) (\(\uparrow \) topic) instead of (\(\uparrow \) topic) = \(\downarrow \). This means that the value of topic is a set rather than a single f-structure. The topic phrase is also annotated with (\(\uparrow \) comp* gf) = \(\downarrow \),Footnote 23 so it will bear some grammatical function (gf) within the clause, as required by the Extended Coherence Condition in (68).

figure bq
figure br

I will illustrate my proposal with (69), in which the post-de object is a gap. Note that the grammar that generates (69) is the same as the grammar that generates a corresponding sentence with the resumptive gi, because the insertion of gi is manipulated by performance, not competence.

figure bs
figure bt

In (70a), V’ does not dominate a locally realized NP, in conformity with the f-description \(\phi ^{-1}\)(\(\uparrow \) obj) = \(\emptyset \) in de\(_2\)’s lexical entry (65). In (70b), topic is equated with obj, as indicated by the curved line. This structure sharing relation can be established either by the general rule for topicalization (67a) or the inside-out expression in (65), but this does not mean that the latter is redundant, as we shall see in the analysis of the double-de construction.

The analyses of retained subjects, retained objects, and long-distance dependencies are similar to those of de\(_1\), the difference being that de\(_2\)’s object does not correspond to a c-structure node and must be equated with a topic. The remaining sections will demonstrate how my proposals above account for the distribution of the quantifier and the double-de construction.

4.3 Quantifier float and locality

Much work on the Mandarin universal quantifier has been done in the Government and Binding framework (see Cheng 1995; Chiu 1993, and references therein). Cheng (1995) observes that the universal quantifier dōu needs to be local to its NP, which is xuéshēng ‘student’ in (71a), or the trace of xuéshēng in (71b).

figure bu

As for the universal quantifier in Wenzhounese, Sect. 2.3 has shown that it is usually adjacent to the post-de NP (72a), but when the post-de NP is topicalized, the quantifier tends to raise to the higher clause where the topic is (72b).

figure bv

(72a) is similar to (71a), in which the quantifiers are local to their NP. The disfavoured but still grammatical example in (72c) is similar to (71b), where the quantifiers are local to the trace of the NP. However, (72b) contrasts with the Mandarin examples in that the quantifier is local to the topicalized NP instead of its trace.

Therefore, it seems that locality is also relevant to quantifier float in the de-construction, although it differs from the locality condition for Mandarin dōu. The exact formulation of this locality condition is beyond the scope of this paper, due to the following pending issues. First, given that the quantifier is adjoined to the verbal domain in c-structure, what annotations are required to map it to the nominal domain in f-structure to model its scope? A typical annotation like (\(\downarrow \) adj) \(\in \) \(\uparrow \) would wrongly predict that the quantifier modifies the verb. Second, an embedded quantifier as in (72c) is still grammatical. How do we capture this optionality and gradient grammaticality?

Despite these pending issues, the generalization we can make from the data above still corroborates the proposed analyses for de. Specifically, (71) suggests that the locality condition in Mandarin holds between the quantifier and the complement of the verb, be it in-situ or ex-situ (Cheng 1995). In LFG terms, the quantifier is local to obj at f-structure. In the de-construction (72), the locality condition holds between the quantifier and the discourse function topic, instead of obj.Footnote 24 In (72a), the quantifier is embedded in the clause (comp) headed by tɛ-k\(^{{h}}\)ɔ ‘dump-away’, where there is a topic bound by lɔdʑa ‘trash’, so the locality condition is met. In (72b), the matrix verb is de\(_2\), so its object is not equated with a topic according to (65). Consequently, the quantifier floats to the matrix clause to satisfy the locality condition with the matrix topic.

In brief, assuming that the universal quantifier needs to be local to the de-sentence topic, its distribution naturally follows from the lexical entries for de. De\(_1\)’s complement clause contains a topic, so the quantifier is placed within this clause. De\(_2\)’s complement clause does not contain a topic, so the quantifier is placed in the matrix clause where there is a clause-initial topic. The processor resumptive gi does not block quantifier float because it is not part of the grammatical computation (Asudeh 2012).
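The descriptive generalization can be illustrated with a toy sketch, bearing in mind that the exact locality condition is left open above. The function name, the dictionary encoding, and the PRED labels are assumptions made only for illustration, and the sketch models just the preferred quantifier position, not the disfavoured embedded option in (72c).

```python
def topic_host(f, path=()):
    """Return the COMP path from f to the nearest clause containing a TOPIC."""
    if "TOPIC" in f:
        return path
    if isinstance(f.get("COMP"), dict):
        return topic_host(f["COMP"], path + ("COMP",))
    return None

trash = {"PRED": "trash"}

# de1: the matrix OBJ is shared with the TOPIC of the complement clause.
de1_sentence = {"PRED": "de1", "OBJ": trash,
                "COMP": {"PRED": "dump-away", "TOPIC": trash}}

# de2: the OBJ is shared with a clause-initial TOPIC in the matrix clause.
de2_sentence = {"PRED": "de2", "TOPIC": trash, "OBJ": trash,
                "COMP": {"PRED": "dump-away"}}

de1_position = topic_host(de1_sentence)
de2_position = topic_host(de2_sentence)
```

With de\(_1\) the topic is found one comp down, so the quantifier is expected inside the complement clause; with de\(_2\) the empty path indicates the matrix clause.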

4.4 The double-de construction

Section 2.4 discussed the double-de construction in Wenzhounese, which has no equivalent in Mandarin. This construction is constrained in the sense that the second de must be followed by a gap or the processor resumptive gi, as summarized in (73).

figure bw

The generalization for this constraint directly follows from the distinct lexical entries for de: while both de\(_1\) and de\(_2\) can be the first de in the double-de construction,Footnote 25 only de\(_2\) can be the second de. If we assume that Mandarin only has one lexical entry for bǎ (which is similar to de\(_1\)), the absence of a double-bǎ construction is unsurprising: there is no de\(_2\) counterpart to license a second bǎ.

A further question is why the second de must be de\(_2\). Recall that the phrase structure rules in (56) and (67) readily allow the generation of the double-de construction, but they do not impose an order between de\(_1\) and de\(_2\). The explanation lies in the f-structure because the (un)grammaticality of the double-de construction relates to the arguments of de, not its constituency. Consider first a legitimate double-de sentence in (74).

figure bx

The matrix predicate is de\(_1\), whose clausal complement is headed by de\(_2\). Assuming that both instances of de are active verbs,Footnote 26 the active template defined in (51) will require the matrix subj to anaphorically control comp subj, as indicated by their coindexation. The line connecting matrix obj and comp topic indicates the structure sharing relation stipulated in de\(_1\)’s lexical entry (54). By contrast, de\(_2\)’s lexical entry (65) requires its object to be equated with a topic at a local or higher f-structure. Such a topic is locally available in (74b). Also note that this topic is not a clause-initial topic licensed by the phrase structure rule (67a), which justifies the necessity of the inside-out f-description in de\(_2\)’s lexical entry (65). Finally, because tak\(^{\textrm{h}}\)e ‘open’ is an unmarked passive, the passive template in de\(_2\)’s lexical entry coindexes de\(_2\)’s object, i.e. comp obj in (74b), with comp comp subj.

An ungrammatical double-de sentence is given in (75a), in which both tokens of de take a full NP object.

figure by

In (75b), the matrix obj also functions as comp topic. Usually, this comp topic needs to anaphorically bind an implicit pronoun, a retained object, or a retained subject (Sect. 4.1), or it needs to share its structure with de\(_2\)’s obj, as in (74b). However, in (75b), no such relation can be established. This is because the value of a pred attribute is a semantic form that is unique: each instance of use of the word safe gives rise to a uniquely instantiated occurrence of the semantic form ‘safe’ (Dalrymple et al. 2019, p. 45). Therefore, the two occurrences of ‘safe’ in (75b) must bear distinct indexes, which rules out the possibility of anaphoric control. As such, the relationship between comp topic and its local f-structure is unresolved, resulting in the ungrammaticality.
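The role of uniquely instantiated semantic forms can be sketched as follows. This is an illustration only: the class and function names are hypothetical, and identity of Python objects stands in for identity of instantiated semantic forms.

```python
import itertools

_instances = itertools.count(1)

class SemForm:
    """A semantic form; every use of a word creates a fresh instance."""
    def __init__(self, name):
        self.name = name
        self.instance = next(_instances)

    def __repr__(self):
        return f"'{self.name}'_{self.instance}"

def can_share(f1, f2):
    """Two f-structures can be identified only if their PRED values are
    the very same instantiated semantic form."""
    return f1["PRED"] is f2["PRED"]

# Two full NPs: two separate uses of the word 'safe'.
first_safe = {"PRED": SemForm("safe")}
second_safe = {"PRED": SemForm("safe")}
two_full_nps = can_share(first_safe, second_safe)

# A gap after the second de: its OBJ is the same f-structure as COMP TOPIC.
gap_object = first_safe
topic_and_gap = can_share(first_safe, gap_object)
```

The two occurrences of ‘safe’ have the same name but different instances, so they cannot be identified, whereas a gap that is structure-shared with the topic poses no such problem.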

5 Conclusion

In this paper, I have investigated the syntax of the de-construction in Wenzhounese. Section 2 compared de with its Mandarin counterpart and revealed six differences as summarized in (53). In Sect. 3, I introduced LFG and critically reviewed three LFG approaches to the bǎ-construction. It was shown that bǎ and de should be analyzed as verbs with their own argument structures. Moreover, given the differences between bǎ and de, none of these analyses was directly applicable to the de-construction. Therefore, Sect. 4 proposed the following lexical entries for de and demonstrated that they could explain all the differences outlined in (53).

figure bz
figure ca

Overall, this paper presents novel data from Wenzhounese, which not only are worthy of investigation in their own right, but may also shed new light on the study of bǎ. My study also contributes typologically and thematically to the research topics of Generative Grammar in general and LFG in particular.