1 Introduction

Complex predicates have attracted much attention in the literature of generative syntax because of the mismatch between semantic biclausality and surface morpho-phonological monoclausality that they exhibit. That is, a complex predicate is typically a predicate consisting of multiple parts that stand in some semantic relation to one another (most typically that of semantic embedding), which, despite the complex semantic relation expressed, behaves as one word in surface realization, as exemplified by the morphological causative construction in Japanese in (1):

  1. (1)
    figure a

Here, semantically, the verb root yom ‘read’ is an argument of the causative suffix -ase, but on surface-oriented criteria (such as the inseparability of the stem and the accentuation patterns), the causative suffix is clearly a bound morpheme that forms a single word together with the verb root.

In the literature, there are two major types of approaches for resolving this syntax-semantics mismatch. One of these approaches employs the notion of argument sharing (or fusion) of some kind. A formal mechanism implementing this idea is most explicitly worked out by Hinrichs and Nakazawa (1990, 1994) in the framework of HPSG, where they define the operation of argument composition, by which the (unsaturated) arguments of the embedded verb are inherited by the argument structure (or subcategorization frame) of the higher verb via coindexation of feature value specifications. [Footnote 1] Technically, this is achieved by specifying in the lexical entry for the higher verb that it inherits the arguments of the lower verb (i.e., its verbal complement), as in the following (simplified) lexical entry for the higher verb hazime-ta ‘began’ of a syntactic compound verb, where angle brackets indicate a list-type object and ⊕ denotes the append operation:

  1. (2)

    hazime-ta:

    figure b

Via the coindexation of (part of) the argument structures of the lower and higher verbs, the arguments of the lower verb are ‘inherited’ by the higher verb and then syntactically realized in a monoclausal structure headed by the whole complex predicate, as in the analysis in (4) for (3):

  1. (3)
    figure c
  1. (4)

Here, the higher verb hazime-ta, by virtue of its subcategorization properties as specified in (2), selects the lower verb and, at the same time, the arguments recorded in the lower verb’s own ARG-ST list; the tags in (4) mark these shared values.
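The append-based inheritance described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Verb` class, the `compose_arguments` function, the category labels, and the ordering of the composed list are all hypothetical simplifications, not the feature geometry of the entry in (2), and the lower verb yomi is chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Verb:
    form: str
    # Pending (unsaturated) arguments, listed subject first.
    arg_st: List[str] = field(default_factory=list)


def compose_arguments(higher_form: str, lower: Verb) -> Verb:
    """Return a higher verb whose ARG-ST is the lower verb's ARG-ST
    followed by the lower verb itself (an assumed ordering)."""
    # Python list concatenation plays the role of the append operation.
    inherited = list(lower.arg_st)
    return Verb(form=higher_form, arg_st=inherited + [f"V[{lower.form}]"])


# Hypothetical transitive lower verb; the labels are illustrative only.
yomi = Verb("yomi", ["NP[nom]", "NP[acc]"])
hazime_ta = compose_arguments("hazime-ta", yomi)
print(hazime_ta.arg_st)
```

The point of the sketch is that the higher verb ends up with the lower verb's arguments on its own list, so a single head can realize all of them in one clause.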

This argument-sharing approach has turned out to be particularly influential in non-derivational frameworks such as HPSG and LFG, where, due to the ‘monostratal’ architecture, the more classical, derivational solution described below is in principle unavailable (unless one adopts a radically different architecture of grammar within it, as in the linearization-based HPSG approach; cf. below). The argument composition approach has been applied to a wide range of languages (cf. Abeillé and Godard 1994; Monachesi 1999; Rentier 1994; Przepiórkowski and Kupść 1997; Manning et al. 1999; Chung 1995).

The other approach, which is more standard in derivational variants of generative grammar, is to resolve the mismatch in terms of derivational mapping. In this kind of analysis, a fully biclausal representation is posited at a level of syntax closer to semantic interpretation (i.e., deep structure in classical TG and LF in later variants), which is then derivationally related to a monoclausal structure at a level closer to the actual surface morpho-phonological realization of linguistic expressions (i.e., surface structure, or PF). Most typically, such mapping from deep biclausal representation to surface monoclausal representation is mediated by a movement operation called verb raising (Aissen 1974), which moves the embedded verb from its original position in the embedded clause to a landing site in the matrix clause where it attaches to the higher verb to form a morphological cluster with it (see, e.g., Kuroda 1965; Shibatani 1978; Rizzi 1982 for this kind of approach; in classical TG, this verb raising is often followed by an operation called ‘tree pruning’, which eliminates the S-node of the embedded clause so that the surface constituent structure is completely flattened out). A monostratal analog of this is proposed by some authors (e.g., Reape 1994; Gunji 1999) in a very different theoretical setup, known as linearization-based HPSG (Reape 1994; Kathol 1995), which countenances an architecture of grammar that sharply distinguishes surface morpho-phonological realization of linguistic expressions (reflected in word order) from valence-driven combinatorics (which transparently represents predicate-argument structure). In this linearization-based approach, complex predicates have fully biclausal structures in the combinatoric component, which are related to monoclausal structures in the morpho-phonological component, in a way essentially analogous to a derivational account.
The following tree, which shows the analysis for (3), illustrates the workings of this linearization-based ‘verb-raising’ approach. Unlike in standard HPSG, the tree in (5) does not directly correspond to the surface constituent structure of the sentence; it should rather be thought of as an analog of an analysis tree in Montague Grammar in that it essentially records the history of semantic combinatorics:

  1. (5)

Note that in this tree a fully biclausal structure is present in which a verbal projection headed by the lower verb is embedded under the higher verb. In this linearization-based setup, the surface morpho-phonological forms of linguistic expressions are specified explicitly at each node via the DOM feature (where DOM stands for ‘word order domain’). Details aside, what is important here is that this setup enables introducing operations that are more complex than simple string concatenation when computing the DOM value of a larger linguistic expression from those of its parts. Crucially, at the node where the higher verb combines with the embedded clause headed by the lower verb, the DOM value of the larger expression is given by first forming a morpho-phonological cluster involving (the phonologies of) the lower and higher verbs via the cluster forming operator +, and then concatenating this verb cluster with the rest of the sentence. This cluster forming operation in the morpho-phonological representation has the effect of prosodically masking the biclausal structure reflecting the valence-driven combinatorics (which in turn directly feeds into semantics), and the mismatch between syntax and semantics is thereby resolved.
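The DOM computation just described (cluster the two verbs first, then concatenate the cluster with the rest) can be made concrete with a small Python sketch. Everything here is an assumption made for illustration: the function names `cluster` and `mother_dom`, the representation of a DOM value as a flat list of strings, the hyphen standing in for the operator +, and the sample words (including the lower verb hiki and the subject Ken-ga) are not taken from the analysis in (5).

```python
from typing import List


def cluster(lower_verb: str, higher_verb: str) -> str:
    """Assumed stand-in for the cluster-forming operator +."""
    return f"{lower_verb}-{higher_verb}"


def mother_dom(embedded_dom: List[str], higher_verb: str) -> List[str]:
    """DOM of the node where the higher verb combines with the embedded clause."""
    if not embedded_dom:
        raise ValueError("embedded clause has an empty DOM list")
    # Assumes the embedded clause is verb-final, as in Japanese.
    *non_verbal, lower_verb = embedded_dom
    # Step 1: cluster the two verbs; step 2: concatenate with the rest.
    return non_verbal + [cluster(lower_verb, higher_verb)]


# Illustrative embedded clause: an object followed by a hypothetical lower verb.
embedded = ["piano-o", "hiki"]
vp = mother_dom(embedded, "hazime-ta")
sentence = ["Ken-ga"] + vp
print(" ".join(sentence))
```

In this toy version the embedded clause remains a separate input to `mother_dom`, which corresponds to the biclausal combinatorics, while the returned list contains only one verbal unit, which corresponds to the monoclausal surface form.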

The difference between the argument-sharing approach and the verb-raising approach essentially lies in the way in which the syntax-semantics mismatch is resolved. In the argument-sharing approach, this is done at the level of argument structure. In a sense, this involves some complication at the syntax-semantics interface, but it has the advantage of doing away with abstract biclausal syntactic representations. By contrast, the verb-raising approach admits a fully biclausal representation at some abstract level of syntax, thereby simplifying the mapping from (‘deep’) syntax to semantics. Both approaches get the basic properties of complex predicates right. But then, the question naturally arises as to which of these approaches turns out to be empirically more successful in view of a wider range of facts (or if neither turns out to be entirely successful).

This paper takes up this question and provides a new answer to it. Specifically, I review empirical data involving two types of complex predicates in Japanese, comprising both well-known problems from the literature and a set of novel observations, and argue for a synthesis of the verb-raising approach and the argument-sharing approach that becomes possible in a logical setup of syntax based on categorial grammar. By doing so, the paper aims to make both an empirical and a theoretical contribution to the literature on complex predicates (and on issues related to the syntax-semantics interface more generally). Empirically, the novel set of data that I discuss below poses challenges to both types of previous approaches. The theoretical contribution lies in demonstrating that these problematic data, as well as the classical observations from the literature, receive straightforward solutions in the proposed synthesis of the previous two approaches in the categorial grammar-based setup. To elaborate on the general guiding intuition somewhat, the idea behind this synthesis can be stated in intuitive and pre-theoretical terms as follows: argument sharing is a ‘logical consequence’ of (an extended, non-derivational sense of) verb raising, since, if the lower verb forms a morpho-phonological cluster with the higher verb, it cannot license its own arguments by itself, which forces it to have them inherited by the higher verb. Though simple and conceptually natural, this idea has not been implemented in any variant of the verb-raising or argument-sharing approach to complex predicates in the literature. As will become clear in the discussion below, the reason for this is essentially that this synthesis requires an architecture of grammar that views natural language syntax as a logical system in which reasoning about valence-driven combinatorics and reasoning about surface morpho-phonological forms (each governed by separate principles) inform each other in a certain way.
Such a perspective becomes available only by integrating ideas developed (so far) in separate strands of research in the tradition of categorial grammar. I demonstrate in the ensuing sections that a previously unattained unified analysis of complex predicates becomes possible in a framework that embodies this theoretical innovation and that this unified analysis successfully solves both the old and new empirical problems. This, I believe, establishes the point that theoretical advances in the literature of categorial grammar have important implications for major issues in theoretical linguistics.

2 Data

In this section, I review the empirical properties of the two kinds of complex predicates in Japanese that I take up in this paper. One of them is the class of ‘syntactic’ compound verbs, which constitute a productively formed species of compound verbs with evidently biclausal semantic structures (where the meaning of the lower verb is embedded under the meaning of the higher verb) but clearly monoclausal surface constituent structures with tight morpho-phonological clustering of higher and lower verbs. Syntactic compound verbs are so called because they contrast with ‘lexical’ compound verbs, for which the combination of the two verbs is idiosyncratic and the meaning of the whole compound verb is not compositionally determined from the meanings of its parts. Since I will not deal with lexical compound verbs in this paper, I sometimes call syntactic compound verbs simply ‘compound verbs’ in what follows. The other type of complex predicate taken up below is the so-called -te form complex predicate (cf., e.g., McCawley and Momoi 1986; Sells 1990), which, for the intermediate status that it exhibits with respect to the morphological word-hood of the verbal complex, has long been recognized as a problem in the literature of Japanese syntax. In the following discussion, I focus on data pertaining to the relative advantages and disadvantages of the verb-raising approach and the argument-sharing approach from the literature. For a more general overview of the properties of complex predicates in Japanese, see Kageyama (1993) and Matsumoto (1996) and references cited therein.

Before getting to the details, there is one important point of clarification. My proposal aims to capture the relevant properties of Japanese complex predicates essentially via a single parameter of obligatoriness of morpho-phonological verb clustering, classifying all complex predicates (except for lexical compound verbs) into two broad classes of compound verbs and the -te form complex predicate. This is a very strong and potentially controversial claim (but note that it is the null hypothesis that should be preferred over alternatives invoking additional and more complex assumptions unless there is convincing evidence to reject it), and one might take issue with it, since various facts have been adduced in the literature for more elaborate syntactic analyses (of various kinds).

For the class of compound verbs, the most prominent alleged evidence of this kind comes from the fact that a certain subset of them do not exhibit the typical semantic biclausality effects. I argue below (Sect. 3.2) that the relevant facts can be accounted for via a single, lexically assigned feature, doing away with the structural distinctions posited in previous accounts. Similarly, facts pertaining to passivization have been taken by some (e.g., Kageyama 1993; Nishigauchi 1993; Matsumoto 1996) to correlate with syntactic complexity. I take it that these facts receive an independent semantic explanation (see Sect. 2.2). [Footnote 2]

For the -te form complex predicate, again, there does not seem to be any strong reason to reject the unitary treatment I propose below. [Footnote 3] For example, a classical observation that the desiderative hosii imposes different semantic restrictions on the embedded subject in different case markings receives a straightforward lexicalist treatment in my analysis (footnote 12) preserving the insight of Shibatani’s (1978) and McCawley and Momoi’s (1986) original syntactic solutions. I thus take it that none of the facts discussed in the literature undermine the broad classification of Japanese complex predicates into two classes I assume below.

The presentation of data in this section is structured as follows. I start with data pertaining to surface word order and semantic biclausality that show that complex predicates involve some kind of morpho-phonological clustering of higher and lower verbs (Sect. 2.1). As might be expected, these data lend themselves more straightforwardly to the verb-raising approach (especially the linearization-based variant of it, which separates surface morpho-phonology from the combinatoric component of grammar) than to the argument-sharing approach. I then present data involving argument alternation phenomena that provide evidence for the ‘merged’ argument structures of the kind assumed in the argument-sharing approach (Sect. 2.2). In contrast to the data in Sect. 2.1, these data provide significant problems for the verb-raising approach. Finally, in Sect. 2.3, I present a (mostly) novel set of data involving cases in which complex predicates interact with phenomena pertaining to flexible syntactic/semantic composition. As discussed in detail below, these data pose significant challenges to both of the two types of approaches. The difficulty essentially lies in the fact that these phenomena involve both the ‘verb-raising’ aspect and the ‘argument-sharing’ aspect of complex predicates, which makes it difficult to accommodate them in either of these (seemingly mutually exclusive) approaches. These data, then, provide particularly strong motivation for the synthesis of the two approaches that I propose in the next section.

2.1 Morpho-phonological verb clustering and semantic biclausality

At the descriptive level, both (syntactic) compound verbs and the -te form complex predicate involve clusters of verbs (the boldfaced part in (6)) at the end of the sentence. (The -te form complex predicate actually exhibits a more complex pattern with respect to the clustering of the two verbs. I will return to this issue below.) In both constructions, the lower verb (V1) is semantically an argument of the higher verb (V2):

  1. (6)
    1. a.
      figure g
    2. b.
      figure h

The difference between the two constructions lies in the verbal morphology on the embedded verb: in compound verbs, V1 appears in the so-called renyoo-kei form, which is a non-finite verbal inflection; in the -te form complex predicate, an additional morpheme -te (with allomorph -de) attaches to this renyoo-kei form to syntactically mark the embedded verb. Some examples of syntactic compound verbs and the -te form complex predicate are given in (7):

  1. (7)
    1. a.

      syntactic compound verb

      hasiri-hazimeru ‘run-begin’, yomi-oeru ‘read-finish’, nomi-sugiru ‘drink-overdo’, tabe-sokoneru ‘eat-fail’

    2. b.

      -te form complex predicate

      tabe-te morau ‘have somebody eat (for one’s own benefit)’, yon-de miru ‘try reading’, kat-te hosii ‘want to buy’, kai-te simau ‘finish writing’

In terms of the basic clause structure, both types of complex predicates behave like one single predicate that heads a monoclausal sentence. Scrambling patterns provide evidence for this. Japanese allows for clause-internal scrambling fairly freely and the order among arguments and adjuncts within a single clause is relatively flexible. In both compound verbs and in the -te form complex predicate, arguments and adjuncts of V1 can freely scramble with arguments and adjuncts of V2, suggesting that they are clause-mates. The relevant examples are given in (8). In (8a), an argument of V1, piano-o, is scrambled over an adverb hui-ni, which modifies V2. Similarly, in (8b), the embedded argument piano-o is scrambled over the matrix dative argument Ken-ni:

  1. (8)
    1. a.
      figure i
    2. b.
      figure j

Furthermore, in both of these constructions, the cluster of V1 and V2 cannot be separated by arguments or adjuncts. Thus, the following examples, where an adverb (hui-ni) and a matrix argument (Ken-ni) split the verb cluster, are completely ungrammatical:

  1. (9)
    1. a.
      figure k
    2. b.
      figure l

The interclausal scrambling facts and the inseparability of V1 and V2 by arguments and adjuncts both suggest that V1 and V2 form some kind of morpho-phonological cluster in both compound verbs and in the -te form complex predicate. There is, however, a set of data suggesting that the morpho-phonological clustering of V1 and V2 is tighter in compound verbs than in the -te form complex predicate. The relevant data come from examples involving embedded VP coordination, focus particle insertion and verb duplication.

First, as shown in (10a), coordination of the ‘embedded VP’ (i.e., a string of words consisting of V1 and its argument(s)) is impossible in compound verbs. This is as expected, assuming that V1 and V2 form a tight morpho-phonological unit in this construction. That is, the string of words coordinated in (10a) is not a constituent and hence cannot be coordinated. Given this, it is somewhat surprising that such ‘embedded VP coordination’ is possible in the -te form complex predicate, as shown in (10b). This suggests that, in the -te form complex predicate, the morpho-phonological clustering of the higher and lower verbs is not as tight as with compound verbs:

  1. (10)
    1. a.
      figure m
    2. b.
      figure n

The facts about focus particle insertion also reflect the same pattern. Focus particles are generally known to be unable to appear inside a word boundary in Japanese (cf., e.g., Kageyama 1993). In particular, as shown in (11a), they cannot split the sequence of V1 and V2 in compound verbs. However, (11b) shows that the -te form complex predicate behaves differently in allowing focus particles to appear between V1 and V2:

  1. (11)
    1. a.
      figure o
    2. b.
      figure p

Finally, the patterns of verb duplication are similar. Generally, targets of duplication are restricted to full-fledged words (again, see, e.g., Kageyama 1993 for some discussion). Indeed, as shown in (12a), duplication of V2 alone is not possible with compound verbs. This is not surprising given the tight morpho-phonological bond between V1 and V2 in this construction. What is surprising is that the -te form complex predicate (12b) once again behaves differently in this respect, allowing for duplication of V2 alone:

  1. (12)
    1. a.
      figure q
    2. b.
      figure r

The word order patterns of compound verbs are relatively straightforward for both the argument-sharing and the verb-raising approaches, assuming some sort of morpho-phonological clustering of V1 and V2 (or lexical formation of compound verbs). However, the intermediate degree of tightness of the morpho-phonological bond between V1 and V2 in the -te form complex predicate reviewed above is problematic for both types of approaches. In fact, this issue has long been recognized as a problem in the literature of Japanese syntax (see, e.g., Shibatani 1978; McCawley and Momoi 1986; Sells 1990; Kageyama 1993; Matsumoto 1996 for some discussion of the problem and proposed solutions). As discussed at some length in Kubota (2008), existing and conceivable solutions for this problem in standard variants of generative grammar (both derivational and non-derivational) can be classified into two types: those that treat the -te form complex predicate in line with compound verbs (within lexicalist approaches to syntax, this involves positing the compound verb composed of V1 and V2 as a single word in the lexicon) and those that treat it in line with syntactic VP complementation. However, neither of these approaches is completely successful. The former fails to capture naturally the differences between compound verbs and the -te form complex predicate reviewed above, whereas the latter suffers from the exact opposite problem. As seen at the beginning of this section, like compound verbs but unlike the full-fledged VP complementation construction in Japanese, the -te form complex predicate does not allow arguments and adverbs to split the verb cluster composed of the higher and lower verbs. However, in an analysis that assimilates the -te form complex predicate to VP complementation, there is no natural way of accounting for this similarity between the -te form complex predicate and compound verbs.

Despite the morpho-phonological monoclausality that we have just seen, the two constructions behave as if they had biclausal structures in terms of phenomena pertaining to semantic interpretation. Here, I review two such phenomena, specifically, adverb scope and quantifier scope interpretations, and illustrate how the above two approaches handle them. [Footnote 4]

Syntactic compound verbs, except for a subclass consisting of a few items which systematically reject such ambiguity (cf. Kageyama 1993; Matsumoto 1996; Yumoto 2002 for this subclass of compound verbs; I will comment on this class of verbs in the analysis section briefly), generally allow for scope ambiguity for adverbs and quantifiers, where the relevant ambiguity is whether the scope-taking element takes scope immediately above V1 (and below V2) or scopes over the whole complex predicate. The data involving adverb scope ambiguity are given in (13):

  1. (13)
    1. a.
      figure s
    2. b.
      figure t

Example (13a) can mean either that the act of reporting the accident itself is ill-intended or that the excessive degree of repetition of the act of reporting is ill-intended. Example (13b) is similarly ambiguous between two readings: John might have watched the TV program once and failed to watch it the second time, or he might have failed twice to watch it.

The quantifier-like expression NP-dake induces scope ambiguity similar to the adverb case above. The examples are given in (14): [Footnote 5]

  1. (14)
    1. a.
      figure u
    2. b.
      figure v

Example (14a) could mean either that Naomi had intended to wake up Ken alone (but ended up waking up others as well), or that Ken was the only one she failed to wake up. The former reading entails that Naomi woke up Ken, while the latter reading entails that she did not. Example (14b) is similarly ambiguous between two readings: Naomi might have limited her diet to yogurt only for an excessively long period, or she might have overeaten just yogurt (while keeping a normal diet with respect to other kinds of food) for an excessively long period.

The -te form complex predicate allows for similar scope ambiguity for adverbs and quantifiers, as shown by (15) and (16):

  1. (15)
    1. a.
      figure w
    2. b.
      figure x
  1. (16)
    1. a.
      figure y
    2. b.
      figure z

Example (15a) could mean either that reading the book was not an easy undertaking but Ken managed to do it (for Mari) or that Mari forced Ken (who was reluctant) to read the book (for her benefit). Similarly, (16a) could mean either that taking that medicine alone (without taking any other medicine) was what Taro asked Hanako to do or that that medicine was the only one that Taro asked her to take. (In the former reading, for Hanako to take other medicine in addition would be against Taro’s benefit, but in the latter reading, it is not necessarily so.)

In the verb-raising approach, these scope ambiguity data are straightforward. For the adverb scope ambiguity, the adverb could either appear in the embedded clause or the matrix clause in the combinatoric structure. These two distinct combinatoric structures then yield the same surface representation with the morpho-phonological clustering of V1 and V2. Quantifier scope ambiguity can be accounted for similarly with quantifier raising in an LF-based account or with Cooper storage in linearization-based HPSG (with both the embedded and matrix clauses as potential sites for quantifier retrieval for the latter).
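The claim that two distinct combinatoric structures collapse into one surface string can be checked with a toy model in Python. This is a sketch under assumptions: the `Clause` class, the `linearize` and `surface` functions, and the abstract labels SUBJ, ADV, OBJ, V1 and V2 are all hypothetical, and the linearization rule (collect non-verbal words in order, then put the verb cluster last) is a deliberate simplification of what a linearization-based grammar would specify.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Clause:
    verb: str
    # Dependents are plain words or an embedded clause.
    dependents: Tuple[Union[str, "Clause"], ...]


def linearize(clause: Clause) -> Tuple[List[str], List[str]]:
    """Return (non-verbal words in order, verbs from lowest to highest)."""
    words: List[str] = []
    verbs: List[str] = [clause.verb]
    for dep in clause.dependents:
        if isinstance(dep, Clause):
            inner_words, inner_verbs = linearize(dep)
            words.extend(inner_words)
            verbs = inner_verbs + verbs  # lower verbs precede higher ones
        else:
            words.append(dep)
    return words, verbs


def surface(clause: Clause) -> str:
    words, verbs = linearize(clause)
    return " ".join(words + ["-".join(verbs)])


# Adverb inside the embedded clause: scope below V2.
narrow = Clause("V2", ("SUBJ", Clause("V1", ("ADV", "OBJ"))))
# Adverb in the matrix clause: scope over the whole complex predicate.
wide = Clause("V2", ("SUBJ", "ADV", Clause("V1", ("OBJ",))))

print(surface(narrow))
print(surface(wide))
```

The two `Clause` objects differ, so they can feed different interpretations, yet `surface` maps both to the same string, which is the sense in which the ambiguity is invisible on the surface.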

By contrast, these semantic biclausality effects pose a significant challenge for the argument-sharing approach. The problem essentially is that since the argument-sharing approach directly associates a merged argument structure with a monoclausal phrase-structural analysis transparently reflecting the surface syntax, there is no level of (syntactic) representation at which the semantic biclausality of complex predicates is encoded. The proposal by Manning et al. (1999) is representative of the kind of solution proposed for this problem in the argument-sharing approach. Specifically, Manning et al. invoke additional mechanisms for inducing biclausality effects for each of the phenomena for which semantic biclausality is observed (adverbs, quantifiers and binding of reflexives), where the additional mechanisms invoked for the different phenomena are totally unrelated to each other. Such a solution seems almost inevitable in a strictly lexicalist approach to complex predicates, but it misses an important generalization that the phenomena all reflect a unitary notion of semantic biclausality of complex predicates.

2.2 Argument-sharing effects in passivization and desiderativization

We have seen above that the mismatch between surface monoclausality and ‘deep’ biclausality exhibited by complex predicates receives a more straightforward analysis in the verb-raising-type approach. However, there are also sets of data that show that treating complex predicate formation purely at the level of surface morpho-phonology is insufficient. The relevant data come from argument alternation patterns in passivization and desiderativization. As discussed below, these data receive straightforward solutions in the argument-sharing approach, whereas they pose a significant problem for the verb-raising approach.

Both (at least a subset of) compound verbs and the -te form complex predicate allow for passivization of the whole complex predicate where an embedded direct object gets promoted to the matrix subject, a pattern similar to what is known as ‘long passive’ in German (Höhle 1978). Examples are given in (17) and (18):

  1. (17)
    figure aa
  1. (18)
    1. a.
      figure ab
    2. b.
      figure ac

In (17) and (18), the passive morpheme -(r)are attaches to V2 of the complex predicate. The argument that gets promoted to the subject, however, is not originally an argument of this higher verb but is the direct object of the embedded verb. The fact that passivization of this pattern is possible with these complex predicates thus suggests that V1 and V2 form a single predicate, to which a merged argument structure is assigned along the lines of the argument-sharing approach.

It is important to make sure that examples like (17) and (18) are really cases of long-distance passivization. [Footnote 6] In fact, Matsumoto (1996) claims that passivization of the whole compound verb is possible only if the compound verb has a monoclausal f-structure (a level of grammar where passivization applies in his LFG analysis), which, if true, would suggest an alternative analysis of data like (17) and (18) where the whole compound verbs are simply listed in the lexicon as complex words with the ‘merged’ argument structures lexically ‘preconfigured’ for them (rather than being derived via some general mechanism such as argument composition). [Footnote 7] Passivization can then be formulated as a purely lexical operation of valence reduction. However, the assumption that passivizable compound verbs always have monoclausal f-structures (in LFG terms) is inconsistent with data such as (19) and (20), which show that adverbs and quantifiers can take narrow scope (these are among the criteria for f-structural biclausality in Matsumoto’s 1996 account) with at least some of the compound verbs that undergo long-distance passivization. (The relevant narrow-scope reading for (20a) is one in which LI continued to be the single target of submission over some extended period of time. Similarly, (20b) has a reading where the secret was supposed to be known to all but the mother, but it was inadvertently disclosed to her as well.)

  1. (19)
    1. a.
      figure ad
    2. b.
      figure ae
  1. (20)
    1. a.
      figure af
    2. b.
      figure ag

Another problem with Matsumoto’s approach is that he takes passivization to be a strictly lexical process. Such an analysis may be justified for compound verbs like those in (17) (for which the whole compound verbs consisting of V1 and V2 are listed in the lexicon as single words in lexicalist approaches like LFG and HPSG). However, this analysis is arguably problematic for the -te form complex predicate, for which there is evidence (i.e., the patterns of embedded VP coordination, focus particle insertion and verb duplication from the previous subsection) that V1 and V2 are independent words (or morphemes) that are combined with each other in the syntax. As shown in (18), at least some members of the -te form complex predicate allow for long-distance passivization where the argument promoted to the matrix subject is originally an argument of V1, but such embedded arguments are invisible on the argument structure of V2 within the lexicon even on the argument-sharing approach.

For the reasons discussed above, I assume that Japanese has genuine instances of long-distance passivization where an originally embedded object gets promoted to the matrix subject as a consequence of a general process of argument sharing (however it is to be implemented). But note that the assumption that the mechanism of long-distance passivization is generally available does not immediately entail that passivization of this kind is possible for all types of complex predicates. It is cross-linguistically observed that the distribution of long-distance passives is highly restricted and that there is much speaker variation in the acceptability of specific examples. In particular, in German, long-distance passivization is restricted to a certain subset of subject control verbs (Pollard 1994; Hinrichs and Nakazawa 1998). Similar restrictions seem to apply in Japanese as well. Note that all the verbs that allow for long-distance passivization above are subject control verbs (or have at least a control verb use if they are ambiguous between raising and control uses). Furthermore, unambiguously raising-type compound verbs such as V-sugiru ‘over-V’ and V-kakeru ‘begin to V’ are known not to undergo this type of passivization, as exemplified by the ungrammaticality of sentences like (21):

  1. (21)
    figure ah

Passivization is not a purely syntactic phenomenon and is sensitive to semantic constraints on thematic roles. In particular, the unacceptability of examples like (21) is arguably due to semantic incompatibility between the unaccusative higher verbs (which do not themselves subcategorize for agentive subjects) and the semantics of passivization (which requires the original subject to be agentive). I thus assume that passivization is generally available for complex predicates, but that additional semantic factors might give rise to anomalous interpretations, accounting for the unacceptability of sentences like (21).Footnote 8

The desiderative predicate construction optionally involves a case-marking alternation similar to that found in passivization. As shown by the following examples, when the desiderative suffix -tai attaches to a verb, the object of the verb, which normally appears in the accusative case, can optionally be realized in the nominative case (Kuno 1973; Sugioka 1984):

  1. (22)
    figure ai

In both compound verbs and the -te form complex predicate, as long as the semantics of the predicate is compatible with that of desiderativization (which requires the matrix subject to be sentient), both the accusative and nominative markings are possible for the embedded object, as shown by the following examples:Footnote 9

  1. (23)
    figure aj
  1. (24)
    1. a.
      figure ak
    2. b.
      figure al

Just as in passivization, in the nominative-marking version here, the argument that receives nominative marking is originally an argument of the lower verb. This again suggests that the embedded object is somehow visible on the argument structure for the whole complex predicate (or of the higher verb) to which the desiderative suffix attaches.

In the argument-sharing approach, the above patterns of passivization and desiderativization are straightforward. As can be seen in the tree in (4), in this type of approach, if the embedded verb is a transitive verb, the whole compound verb has an argument structure identical to that of a lexically transitive verb, via the inheritance of the embedded object. Thus, by assuming the usual analysis of passivization as a valence reduction operation affecting the local argument structure, sentences like those in (17) and (18) can be straightforwardly accounted for. The analysis of the desiderative predicate -tai goes essentially in the same way, the only difference being that desiderativization involves a change in case marking on the object without valence reduction.

On the other hand, in the verb-raising approach, it is not clear how these long-distance passivization and desiderativization facts are accounted for. Since the verb-raising approach does not involve argument sharing, the object of V1 does not appear on the argument structure of V2. But then, assuming (as is standard) that passivization is a local phenomenon that does not have access to the argument structure of the embedded predicate, the embedded object should not be available for promotion to the matrix subject. The only solution to this problem seems to be to additionally assume a mechanism of argument sharing within the verb-raising approach, as suggested, for example, by Gunji (1999) in his analysis of passivized causatives.

2.3 Nonconstituent coordination, nonconstituent clefting and scopal interaction with symmetrical predicates

Finally, there are cases that present challenges to both types of previous approaches to complex predicates. The relevant data come from cases in which complex predicate formation interacts with nonconstituent coordination, nonconstituent clefting and the scope interpretation of symmetrical predicates such as onazi ‘same’. The patterns involving nonconstituent coordination have been noted by Nakau (1973) and McCawley and Momoi (1986), though neither of these authors provides a detailed analysis of the relevant data. The patterns involving nonconstituent clefting and symmetrical predicates are novel to this paper. As I discuss below, the class of data considered here is important since it cannot be accommodated easily in either the verb-raising approach or the argument-sharing approach, even with some reasonable extensions that are introduced within each approach for the purpose of overcoming the limited empirical coverage.

The first case involves an interaction between complex predicates and nonconstituent coordination (NCC). As shown in (25a) and (26a), both in compound verbs and in the -te form complex predicate, arguments of V1 and V2 can together form argument clusters to be coordinated in NCC. By contrast, the unacceptability of (25b) and (26b) shows that in both kinds of complex predicates, argument clusters in NCC cannot be formed by splitting the sequence of V1 and V2:

  1. (25)
    1. a.
      figure am
    2. b.
      figure an
  1. (26)
    1. a.
      figure ao
    2. b.
      figure ap

Intuitively, the patterns here make sense since the arguments of V1 and V2 are co-arguments (in some derived sense) and the reason that they are co-arguments is that V1 and V2 together behave as an inseparable verb cluster in complex predicates. The challenge, however, is to derive the patterns in (25) and (26) by making explicit assumptions about the structures of the two kinds of complex predicates and about NCC.

In the argument-sharing approach, an analysis of NCC along the lines of Mouret (2006) interacts nicely with the lexical analysis of complex predicates where the whole complex predicate is listed in the lexicon as one word. In Mouret’s analysis, the key mechanism that licenses NCC in a phrase structure-based setup is a lexical rule that takes a verb lexical entry as input and produces as output a derived verb that subcategorizes for an argument cluster constituent (which is itself licensed by a special constructional schema) composed of the arguments of the original verb. In order to account for the interaction of NCC and complex predicates exemplified by (25a) and (26a), the crucial assumption is that argument composition takes place in the lexicon. While this assumption is plausible for compound verbs, it is problematic for the -te form complex predicate. As reviewed in the previous section, in the -te form complex predicate, V1 and V2 behave as independent units in terms of certain phenomena pertaining to surface constituency such as embedded VP coordination, focus particle insertion and verb duplication, and, for this reason, a lexical analysis is untenable for it. But then it is not clear how examples like (26a) are licensed. Also problematic is the unacceptability of examples like (26b). If V1 and V2 are put together in syntax rather than in the lexicon for the -te form complex predicate, why is it not possible, under the Mouret-type analysis, to form an argument cluster consisting of the embedded VP and the matrix dative argument (both being arguments of V2) to license (26b)? Furthermore, phrase structure-based analyses of NCC of the kind proposed by Mouret suffer from the problem that no explicit syntax-semantics interface is worked out for them. NCC exhibits scopal interactions with complex predicates in examples like those in (27) below. In order to assign the right truth conditions for such sentences, some mechanism has to be devised to semantically interpret the coordinated nonconstituents under the scope of the higher verb. However, given the lack of necessary detail, it is not clear how such an analysis can be formulated in the argument-sharing approach combined with the Mouret-type analysis of NCC.

The verb-raising approach suffers from a somewhat different problem. I focus on the linearization-based variant of the verb-raising approach here since this variant is the most promising and explicit proposal among various verb-raising approaches to capturing the relevant patterns: the clear separation of the surface morpho-phonological component from the combinatoric component is ideal for a precise formal characterization of verb clustering in complex predicates, and, moreover, in the literature of linearization-based HPSG, there is an explicit and detailed analysis of NCC in terms of surface-oriented deletion advocated by authors such as Beavers and Sag (2004) and Ito and Chaves (2008). In fact, a deletion-based analysis of NCC along the lines proposed by these authors interacts straightforwardly with the surface-oriented verb-raising analysis of complex predicates of the kind proposed by Reape (1994) to license sentences like (25a) and (26a). On this analysis, (25a) is generated by having V1 and V2 form a verb cluster in the morpho-phonological representation in both conjuncts and then deleting the verb cluster from the first conjunct via identity in form with the verb cluster in the second conjunct. The unacceptability of (25b) (and potentially (26b) as well) falls out by assuming that the deletion process licensing NCC cannot intrude on the boundaries of verb clusters.

This linearization-based analysis of complex predicates and NCC, however, has difficulty getting the semantics right in more complex cases. Examples like the following involving disjunction of argument clusters induce interpretations in which the disjunction takes scope below the higher verb (V2) of the complex predicate:

  1. (27)
    1. a.
      figure aq
    2. b.
      figure ar

Example (27a) means that your desire can be satisfied either by showing this book to John or by showing that book to Bill (want > ∨). It does not have the stronger reading associated with sentences like (28) below, which involve coordination of full-fledged VPs and say that your desire (which is unknown to me) is one of the following two: to show this book to John or to show that book to Bill (∨ > want). Example (27b) similarly has only the disjunction-narrow-scope reading. The deletion-based analysis of NCC in complex predicates along the lines sketched above incorrectly predicts that (27a) is semantically equivalent to (28), since, on this analysis, the latter is the non-elided source of the former.

  1. (28)
    figure as
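The two readings contrasted here can be written out schematically as follows. This is only an illustrative rendering: the constants want and you and the proposition letters p and q are notational assumptions introduced for this sketch, not the representations used in the analysis proposed later.

```latex
% p: "this book is shown to John";  q: "that book is shown to Bill"
\begin{align*}
\text{(27a)}\quad & \mathbf{want}(\mathbf{you},\; p \vee q)
  && (\mathbf{want} > \vee)\\
\text{(28)}\quad  & \mathbf{want}(\mathbf{you},\; p) \vee \mathbf{want}(\mathbf{you},\; q)
  && (\vee > \mathbf{want})
\end{align*}
```

On the deletion-based analysis, (27a) is derived from (28) by surface deletion, so it is expected to receive the second formula rather than the first.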

While it might not be entirely impossible to derive the disjunction-narrow-scope reading for sentences like those in (27) by developing some flexible theory of the syntax-semantics interface consistent with the general setup of the linearization-based approach, such an analysis currently does not exist. Moreover, even if such an extension could somehow be worked out, that would not solve the problem here completely, since it would still be incorrectly predicted that the most naturally available reading for (27a) is the disjunction-wide-scope reading equivalent to the non-elided source (28).Footnote 10

The interaction between NCC and complex predicates observed above is accounted for straightforwardly in the categorial grammar-based framework that I propose in the next section. The key difference between the previous approaches and the present proposal is that the latter allows for a flexible interaction between the morpho-phonological component and the combinatoric component of syntax so that the degree of flexibility allowed in the combinatoric reasoning pertaining to NCC is properly constrained by the degree of flexibility involved in logical deduction reflecting the (surface) morpho-phonological constituency of complex predicates. As I show in detail in the next section, this interaction between the two components of grammar is what crucially enables a simple account of the interaction between the two empirical phenomena.

A similar interaction with complex predicates is observed in yet another nonconstituent phenomenon, one involving the cleft construction. As discussed, for example, by Koizumi (1995) and Takano (2002), the Japanese cleft construction allows for strings of words that are not normally thought of as constituents to appear in the focus position, as exemplified by (29):

  1. (29)
    figure at

I call this phenomenon ‘nonconstituent clefting’. Nonconstituent clefting interacts with compound verbs and the -te form complex predicate in a way that is parallel to the pattern of NCC. As shown by the data in (30) and (31), in both compound verbs and the -te form complex predicate, putting an argument cluster composed of arguments of V1 and V2 in the focus position is grammatical, whereas splitting the sequence of V1 and V2 by nonconstituent clefting (with V1 in the focus position and V2 in the non-focus position) results in ungrammaticality:

  1. (30)
    1. a.
      figure au
    2. b.
      figure av
  1. (31)
    1. a.
      figure aw
    2. b.
      figure ax

These patterns of nonconstituent clefting pose even more challenging problems for the previous verb-raising and argument-sharing approaches. The problem for the argument-sharing approach is that, within the lexicalist, non-transformational syntactic frameworks such as HPSG and LFG in which the argument-sharing approach can be most naturally implemented, there is currently no explicit analysis of nonconstituent clefting even for the simplest cases like (29). It is of course conceivable that some analysis could be formulated, perhaps by building on Mouret’s (2006) constructional approach to nonconstituent licensing originally proposed for NCC. But aside from the fact that such an analysis would in effect simply stipulate the possible forms of strings that can be clefted (which itself makes it an unattractive option), this kind of approach would most likely inherit the same kinds of problems for the analysis of NCC in complex predicates in the argument-sharing approach that I pointed out above.

The verb-raising approach fares no better here. As with the argument-sharing approach, no explicit analysis of nonconstituent clefting that can be combined with an explicit analysis of verb raising in the linearization-based setup currently exists. Note especially that a deletion-based approach to nonconstituent licensing proposed by previous authors for NCC does not extend to the case of nonconstituent clefting. Unlike in NCC, nonconstituent clefting does not involve two matching clauses one of which could be taken as the target of the relevant deletion process in the other. Moreover, even if some analysis could be formulated, the fact that the -te form complex predicate behaves in a parallel way with compound verbs is likely to remain problematic. Recall from above that with certain phenomena pertaining to surface constituency such as focus particle insertion and verb duplication, V1 and V2 behave as independent units in the -te form complex predicate. Given this, there does not seem to be any non-stipulative way of accounting for the fact that the sequence of V1 and V2 cannot be separated in nonconstituent clefting.

In the categorial grammar analysis that I propose in the next section, NCC and nonconstituent clefting are treated by means of exactly the same general mechanism that is responsible for making the surface constituent structure flexible in certain limited environments. As we will see there, in this approach, the parallel between NCC and nonconstituent clefting as they interact with the two types of complex predicates falls out immediately from an interaction of independently motivated analyses of the relevant phenomena.

Finally, certain members of compound verbs and the -te form complex predicate interact with symmetrical predicates such as onazi ‘same’ and nita(-yoona) ‘similar’ to induce the so-called internal readings that are not available in simplex sentences. The relevant data, which so far as I am aware have never been discussed in the literature, are given in (32) and (33). These examples contrast with (34), where onazi and nita(-yoona) appear in a monoclausal sentence that does not involve a complex predicate:

  1. (32)
    1. a.
      figure ay
    2. b.
      figure az
  1. (33)
    1. a.
      figure ba
    2. b.
      figure bb
  1. (34)
    figure bc

Example (34) has only the external, anaphoric reading for onazi and nita(-yoona), which is felicitous only in contexts in which some particular book is already salient and the sentence asserts the identity (or, in the case of nita(-yoona), similarity in some relevant respect) between that book and the book that John read.Footnote 11 The sentences in (32) and (33) do have this external reading, but they also have a reading (which is the more natural reading for these sentences) that does not require a context in which any particular book is already salient. This is similar to the reading of sentences like John and Mary read the same book, understood in the sense of asserting the identity between the book that John read and the one that Mary read, without referring to (and thus requiring the existence of) any discourse-salient book. The relevant reading for (32a) (with onazi) asserts that there is some unique book that John kept reading for some extended period. The other sentences in (32) and (33) also have this internal, non-anaphoric reading, which asserts the existence of some unique book (or work) for some continuous (or repeated) set of events.

Intuitively, the reason that the sentences in (32) and (33) induce the internal reading lacking in the simplex sentence (34) is that in all of these sentences, the matrix predicate denotes some complex event consisting of multiple subevents. The V2 makuru in (32b) asserts multiple occurrences of the same type of event denoted by the embedded verb, and iku and kuru in (33) (in the relevant meanings) are similar in meaning to tuzukeru in (32a) in that they both assert the existence of some continuous event that unfolds over an extended temporal interval and thus can be conceived of as consisting of subparts. The internal reading arises by having onazi interpreted as asserting the identity of the book involved in each of these subevents (just as the function of same is to assert the identity of the book involved for each subpart of the sum consisting of John and Mary in the relevant reading of John and Mary read the same book). In semi-theoretical terms, what we need to do to derive the relevant reading, then, is to let the NP onazi hon-o ‘the same book’ containing the symmetrical predicate bind the variable occupying the object argument position of V1 (i.e., the embedded verb) while at the same time distributing over the event argument of V2 (i.e., the higher verb), so that the proper interpretation can be obtained where the meaning of onazi has access to both the nominal expression that provides the descriptive content for the unique entity involved and the plural entity (in this case the multiplicity of events) whose subparts are each asserted to hold some constant relation to that unique entity.

Since the analysis of the internal reading of symmetrical predicates is itself a complex issue, it is somewhat difficult to evaluate the implications of the above data for the argument-sharing and verb-raising approaches to complex predicates. However, there are certain problems that one can reasonably foresee for each approach. The problem with the argument-sharing approach is essentially that the mechanism of argument sharing merely identifies syntactic argument slots of the embedded verb and the whole complex predicate. Thus, the NP onazi hon-o containing the symmetrical predicate is semantically interpreted as occupying the object position of the embedded verb. But then it is not clear how this NP can have access to the event variable of the matrix predicate to distribute over. One might attempt to overcome this problem by devising some extension of Cooper storage (which is the standard mechanism for treating scope-related phenomena in the non-derivational setup of HPSG), but it is not clear whether (an extension of) Cooper storage can handle cases in which a single scope-taking expression simultaneously has access to two variables, one in the matrix clause and the other in the embedded clause. Given the lack of a formally rigorous and explicit theory of syntax-semantics interface in the current HPSG literature, the task of formulating a detailed analysis of a highly complex semantic problem like this seems particularly challenging.

The verb-raising approach would most likely encounter similar problems. In the verb-raising approach, the scopal property of the NP containing the symmetrical predicate needs to be represented in the combinatoric structure (or LF, in movement-based approaches), but currently there does not exist any formally explicit mechanism which can apply to such NPs to yield the complex scope-taking behavior they display, wherein some expression appears in the matrix clause in the surface string but semantically binds a trace in the embedded clause while at the same time having access to the matrix event variable. At the very least, standard mechanisms for scope taking such as QR and quantifying-in do not seem to be sufficient. Moreover, even if some more complex mechanism for the treatment of symmetrical predicates is formulated, it is far from clear whether such a mechanism can be made to properly interact with the morpho-phonological clustering treatment of complex predicates in the verb-raising approach to assign the right internal readings to sentences like (32) and (33).

It turns out that the categorial grammar framework that I propose below has exactly the right kind of flexible syntax-semantics interface for treating a phenomenon like this. As I show in detail in the next section, an independently motivated analysis of symmetrical predicates which essentially builds on Barker’s (2007) innovative approach in terms of parasitic scope interacts straightforwardly with the analysis of complex predicates proposed in this paper to yield an explicit and precise account of the internal readings of examples like (32) and (33).

3 Unifying the two approaches to complex predicates in categorial grammar

We have seen above that neither the verb-raising approach nor the argument-sharing approach to complex predicates is completely successful. Note especially the empirical challenge posed by the novel set of data, showing that the interactions between complex predicates and other phenomena pertaining to flexible syntactic/semantic composition are difficult to accommodate in either approach. There is also a theoretical inadequacy in these previous approaches in that verb-raising and argument sharing are two distinct, unrelated mechanisms typically associated with different kinds of theoretical setup. But intuitively, there is a deeper connection between the two: in complex predicates, the arguments of V1 are inherited to V2 since V1 does not have the ability to license its arguments by itself once it forms a cluster with V2 to be realized as one word in surface syntax. Such a connection is lost if they are simply posited as unrelated theoretical devices.

This means that we need a third type of approach which unifies the essential insights of the two previous approaches in some coherent manner. It turns out that precisely such a synthesis becomes possible once we adopt a logical perspective on grammar where syntax is viewed as a system that mediates reasoning about form and reasoning about meaning composition. In what follows, I demonstrate this by formulating explicit analyses of the two kinds of complex predicates reviewed above within a variant of categorial grammar (CG) called Multi-Modal Categorial Grammar with Structured Phonology. The proposed framework builds on two strands of research in the literature of CG, namely, the perspective originally due to Lambek (1958) and most explicitly embodied in Type-Logical Grammar (Morrill 1994; Moortgat 1997) which views natural language syntax as a system of logical reasoning, and the tradition stemming from the work of Montague (1973) and further developed by Dowty (1982, 1996b) (building also on the idea of Curry 1961) which separates valence-driven combinatorics and surface morpho-phonology as distinct components of grammar. As I show below, the synthesis of these two approaches from the theoretical literature of CG is precisely what enables the synthesis of the two analytic ideas in the empirical domain, giving rise to the conceptually simplest and empirically most successful treatment of complex predicates.

The discussion below is structured as follows. I first present the combinatoric component (Sect. 3.1) and the morpho-phonological component (Sect. 3.2) of the proposed framework separately, illustrating how they respectively model the previous argument-sharing and verb-raising approaches to complex predicates. I then show how these two components can be put together within the present framework (Sect. 3.3), enabling a synthesis of the two approaches which accounts for the whole range of empirical data reviewed in the previous section straightforwardly. For technically sophisticated readers, I note at the outset that the presentation below glosses over some technical details since the focus of the present paper is on demonstrating the empirical advantages of (a particular kind of) CG-based syntactic theory. For the technical details of the proposed framework, the reader is referred to Kubota (2010) and Kubota and Pollard (2010). For comparisons with alternative variants of CG, see Sect. 4.2.

3.1 The combinatoric component

Categorial grammar is a strictly lexicalist theory in that the syntactic categories of words in the lexicon transparently reflect their combinatorial properties. Complex syntactic categories are built from atomic categories recursively as in (35), with the binary connectives of forward slash (/) and backward slash (∖):

  1. (35)
    1. a.

      atomic category

      N, NP and S are categories.

    2. b.

      complex category

      If A and B are categories, then so are A/B and B∖A.

    3. c.

      Nothing else is a category.
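The recursive definition in (35) can be mirrored directly as a small data type. The following is a minimal sketch in Python, not part of the original proposal: the class names Atom, Fwd and Bwd and the helper is_category are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str                      # clause (35a): "N", "NP" or "S"

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Fwd:                         # A/B: result A, argument B
    result: "Cat"
    arg: "Cat"

    def __str__(self) -> str:
        return f"({self.result}/{self.arg})"


@dataclass(frozen=True)
class Bwd:                         # B∖A: argument B, result A
    arg: "Cat"
    result: "Cat"

    def __str__(self) -> str:
        return f"({self.arg}\\{self.result})"


Cat = Union[Atom, Fwd, Bwd]
ATOMS = {"N", "NP", "S"}


def is_category(x) -> bool:
    """Check membership in the set of categories defined by (35a-c)."""
    if isinstance(x, Atom):                       # (35a)
        return x.name in ATOMS
    if isinstance(x, (Fwd, Bwd)):                 # (35b)
        return is_category(x.arg) and is_category(x.result)
    return False                                  # (35c)


NP, S = Atom("NP"), Atom("S")
print(Fwd(S, NP), Bwd(NP, S))
```

Because the classes are frozen dataclasses, two categories built in the same way compare as equal, which is all that is needed to check whether a functor's argument category matches a given expression.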

We say that expressions with complex syntactic categories like S/NP and NP∖S are functors (or functions), which take arguments to return results. The distinction between the forward and backward slashes indicates the direction in which the functor looks for its argument, as illustrated in the following sample lexicon for Japanese:

  1. (36)

    taroo-ga; t; NP n

    hanako-o; h; NP a

    hasit-ta; run; NP n ∖S

    mi-ta; see; NP a ∖NP n ∖S

    yukkuri; slowly; (NP n ∖S)/(NP n ∖S)

Here and in what follows, linguistic expressions are written as tuples 〈φ;σ;γ〉 of phonological representation, semantic interpretation and syntactic category. I assume a small number of syntactic features for atomic syntactic categories, notated as subscripts to category labels as in NP n and NP a (which designate nominative NP and accusative NP, respectively). The intransitive verb hasit-ta ‘ran’ is assigned the category NP n ∖S involving the backward slash, since it combines with a nominative NP to its left to become an S. The transitive verb mi-ta ‘saw’ has category NP a ∖NP n ∖S since it first combines with an accusative NP (again, to its left) and then behaves like an intransitive verb. The outermost connective for the adverb yukkuri ‘slowly’ is the forward slash, and this means that it combines with a VP (i.e., NP n ∖S) to its right to return a VP. Such a VP then combines with a nominative NP (to its left) to return an S, completing the proof of sentential status for strings like Taroo-ga yukkuri hasit-ta ‘Taro ran slowly.’ After introducing the formal proof rules in (37), I provide a detailed proof for this sentence in which all the steps involved are explicitly shown in (38) below.

The Slash Elimination rules (37), formulated here in the so-called labeled deduction format of natural deduction (cf., e.g., Oehrle 1994 and Morrill 1994), are responsible for putting together functors with the arguments that they subcategorize for:

  1. (37)
    1. a.

      Forward Slash Elimination

      figure bd
    2. b.

      Backward Slash Elimination

      figure be

Intuitively, what (37a) says is that applying a functor expression with phonology a and syntactic category A/B to its argument b (with syntactic category B) yields an expression with syntactic category A and phonology ab (i.e., concatenation of a and b; note that a and b appear in this order in the output phonology since the forward slash is involved here). This is function application, and, correspondingly, the semantics is that of function application. Rule (37b) is a counterpart of (37a) with the backward slash (note in particular that the order of a and b in the output phonology is reversed from (37a)).

The following derivation illustrates the way in which the Slash Elimination rules are used in an actual linguistic analysis:

  1. (38)
    figure bf

In CG, a derivation like this should be thought of as a proof showing that some string of words is a well-formed sentence given the lexicon (thought of as a set of axioms) and the rules of grammar (thought of as inference rules). The derivation in (38) should thus be read as a proof that the string Taroo-ga yukkuri hasit-ta is a well-formed sentence of Japanese denoting the proposition that Taro ran slowly.
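The proof in (38) can be replayed mechanically. The sketch below, in Python, is an illustration under stated assumptions rather than an implementation of the framework: the names Sign, fwd_elim and bwd_elim are hypothetical, phonology is modeled as a plain space-separated string, and meanings are modeled as Python functions that build formula strings.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Atom:
    name: str          # e.g. "NP_n" (nominative NP), "S"


@dataclass(frozen=True)
class Fwd:             # A/B: seeks a B to its right, returns A
    result: Any
    arg: Any


@dataclass(frozen=True)
class Bwd:             # B∖A: seeks a B to its left, returns A
    arg: Any
    result: Any


@dataclass(frozen=True)
class Sign:            # the triple of phonology, semantics and category
    phon: str
    sem: Any
    cat: Any


def fwd_elim(functor: Sign, argument: Sign) -> Sign:
    """(37a): from a : A/B and b : B, derive a b : A."""
    if not (isinstance(functor.cat, Fwd) and functor.cat.arg == argument.cat):
        raise TypeError("Forward Slash Elimination does not apply")
    return Sign(f"{functor.phon} {argument.phon}",
                functor.sem(argument.sem), functor.cat.result)


def bwd_elim(argument: Sign, functor: Sign) -> Sign:
    """(37b): from b : B and a : B∖A, derive b a : A."""
    if not (isinstance(functor.cat, Bwd) and functor.cat.arg == argument.cat):
        raise TypeError("Backward Slash Elimination does not apply")
    return Sign(f"{argument.phon} {functor.phon}",
                functor.sem(argument.sem), functor.cat.result)


# Part of the sample lexicon in (36); the string-building meanings are assumed.
NPn, S = Atom("NP_n"), Atom("S")
VP = Bwd(NPn, S)
taroo = Sign("taroo-ga", "t", NPn)
hasitta = Sign("hasit-ta", lambda x: f"run({x})", VP)
yukkuri = Sign("yukkuri", lambda P: (lambda x: f"slowly({P(x)})"), Fwd(VP, VP))

vp = fwd_elim(yukkuri, hasitta)    # yukkuri hasit-ta : NP_n∖S
s = bwd_elim(taroo, vp)            # taroo-ga yukkuri hasit-ta : S
print(s.phon, "|", s.sem, "|", s.cat)
```

Each call corresponds to one inference step, and a category mismatch (for example, trying to use taroo-ga as a functor) is rejected with an error instead of yielding a sign.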

The Slash Elimination rules introduced above roughly correspond to ordinary subcategorization cancelation rules in other syntactic theories. The logical conception of syntax characterizing CG is really brought out by the following Slash Introduction rules, which do not have any direct counterpart in other theories. In CG, which takes the grammar of natural language to be a kind of logic, the forward and backward slashes are to be thought of as connectives of implication, and the Slash Elimination rules introduced above are rules of implication elimination (or modus ponens: B, B → A ⊢ A). The Slash Introduction rules are then rules of implication introduction (or hypothetical reasoning), where the form of the reasoning involves drawing the conclusion A → B given a proof of B by hypothetically assuming A. (I use upright Greek letters for variables of phonological entities: φ and ψ (type str for strings); σ (types str → str and str → str → str).)

  1. (39)
    1. a.

      Forward Slash Introduction

      figure bg
    2. b.

      Backward Slash Introduction

      figure bh

Since these rules are more abstract and their conceptual motivations are harder to grasp initially, I first illustrate their workings with a concrete example involving complex predicates and then come back to the technical details. In the context of linguistic analysis, hypothetical reasoning is essentially a tool that enables a certain kind of ‘reanalysis’ of the combinatorial possibilities of linguistic expressions. And this kind of reanalysis is exactly what underlies the analytic intuition of the argument-sharing approach. That is, the idea behind the argument-sharing approach is that verb clusters like yomi-hazime-ta or yon-de morat-ta are ‘reanalyzed’ verbs that simultaneously subcategorize for arguments (originally) of both V1 and V2. In the present CG setup, this just means that V1 and V2 can be composed into a single verb with the help of hypothetical reasoning as in the derivation in (41) with the lexical entries for V1 and V2 in (40) (VP abbreviates NP n ∖S here).Footnote 12

  1. (40)

    hii-te; play; NP a ∖VP

    morat-ta; benef; VP∖NP d ∖VP

  1. (41)
    figure bk

The formal proof here can be verbalized informally as follows: by hypothetically assuming an accusative NP to the left of a sequence of V1 and V2, we can conclude that we have an expression of category NP d ∖VP (by two applications of Backward Slash Elimination); but then, what we already have (without the hypothesized accusative NP), that is, the sequence of V1 and V2 hii-te morat-ta is of category NP a ∖NP d ∖VP (a verb that subcategorizes for both the embedded accusative argument and the matrix dative and nominative arguments at the same time), since it is something that combines with NP a to its left to become NP d ∖VP (by Backward Slash Introduction). This derivation illustrates that in a system like the present one, which makes use of hypothetical reasoning generally, argument composition falls out as a theorem in the deductive system that constitutes the combinatoric component of syntax, in precisely the same way that transitivity of implication (A → B, B → C ⊢ A → C) is a theorem in the proof system of propositional logic. The reasoning steps in the two cases are exactly parallel, and, in fact, the use of hypothetical reasoning is not limited to the analysis of complex predicates but is much more general. As will become clear below, it plays a central role in the analyses of the two nonconstituent phenomena (i.e., nonconstituent coordination and nonconstituent clefting) in assigning flexible constituent structures to certain ‘nonconstituent’ strings. The generality of the present approach wherein any substring of a sentence can in principle be analyzed as a constituent naturally raises the question of how to prevent overgeneration. This issue is addressed by elaborating the morpho-phonological component of the present framework in the next subsection.
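The reasoning behind (41) can be sketched in code as follows. This is a simplified illustration in Python under explicit assumptions: the helper names Sign, bwd_elim and bwd_intro are hypothetical, phonology is modeled as a tuple of words, the meanings assigned to hii-te and morat-ta are string-building stand-ins, and the nominal arguments john-ga, mary-ni and piano-o are supplied only to complete the example.

```python
import itertools
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Bwd:                          # B∖A: argument B to the left, result A
    arg: Any
    result: Any


@dataclass(frozen=True)
class Sign:
    phon: Tuple[str, ...]
    sem: Any
    cat: Any


_fresh = itertools.count()          # supplies distinct hypothesis labels


def bwd_elim(argument: Sign, functor: Sign) -> Sign:
    """Backward Slash Elimination: b : B and a : B∖A give b a : A."""
    if not (isinstance(functor.cat, Bwd) and functor.cat.arg == argument.cat):
        raise TypeError("Backward Slash Elimination does not apply")
    return Sign(argument.phon + functor.phon,
                functor.sem(argument.sem), functor.cat.result)


def bwd_intro(hyp_cat: Any, derive: Callable[[Sign], Sign]) -> Sign:
    """Backward Slash Introduction: if assuming a hypothesis of category A
    yields B with the hypothesis on the left edge, conclude A∖B.
    The semantics is lambda abstraction over the hypothesis."""
    var = f"[hyp{next(_fresh)}]"
    probe = derive(Sign((var,), "x", hyp_cat))     # run the proof once
    if probe.phon[0] != var or var in probe.phon[1:]:
        raise ValueError("hypothesis is not on the left edge")
    sem = lambda x: derive(Sign((var,), x, hyp_cat)).sem
    return Sign(probe.phon[1:], sem, Bwd(hyp_cat, probe.cat))


NPn, NPa, NPd, S = Atom("NP_n"), Atom("NP_a"), Atom("NP_d"), Atom("S")
VP = Bwd(NPn, S)

# Lexical entries as in (40); the meanings are illustrative stand-ins.
hiite = Sign(("hii-te",), lambda o: lambda s: f"play({s}, {o})", Bwd(NPa, VP))
moratta = Sign(("morat-ta",),
               lambda P: lambda z: lambda y: f"benef({y}, {P(z)})",
               Bwd(VP, Bwd(NPd, VP)))


def proof(hyp: Sign) -> Sign:
    vp = bwd_elim(hyp, hiite)           # [hyp] hii-te : VP
    return bwd_elim(vp, moratta)        # [hyp] hii-te morat-ta : NP_d∖VP


cluster = bwd_intro(NPa, proof)         # hii-te morat-ta : NP_a∖NP_d∖VP

# The derived verb now takes all three arguments itself.
step = bwd_elim(Sign(("piano-o",), "piano", NPa), cluster)
step = bwd_elim(Sign(("mary-ni",), "m", NPd), step)
sentence = bwd_elim(Sign(("john-ga",), "j", NPn), step)
print(" ".join(sentence.phon), "|", sentence.sem)
```

The function proof plays the role of the subproof under the hypothesis, and bwd_intro discharges the hypothesis. The composed verb is obtained from the two lexical entries alone, without a dedicated argument-composition rule.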

It should also be noted here that the present approach follows the standard assumption in lexicalist approaches in assuming that case marking is lexically specified in the argument structures of the predicates. This means that the arguments that are inherited to the higher verb preserve the case assignments by the embedded verb, as illustrated above in (41). As discussed below, case alternation effects such as the nominative-accusative alternation in potential and desiderative predicates are handled lexically: the variants in which the ‘raised’ argument does not preserve the original case assignment by the embedded predicate are treated by positing alternative lexical entries for the higher verb that lexically encode the non-canonical case assignments.

With the reanalysis of the complex predicate as a derived verb illustrated above, the passivization and desiderativization facts from Sect. 2.2 receive a straightforward account. The long-distance passive sentence (18a), repeated here as (42), is derived as in (44) with the lexical entry for the passive morpheme in (43):

  1. (42)
    figure bl
  1. (43)

    rare; λPλx.∃yP(x)(y); (NP a ∖NP n ∖S)∖(NP n ∖S)

  1. (44)
    figure bm

As in the previous example, by hypothetically assuming an accusative object to the left of V1 and then later withdrawing it after V1 and V2 are put together, we can assign a transitive verb category NP a ∖VP to the whole complex predicate. Once this reanalysis is done, the raised embedded object is further promoted to the matrix subject position via the lexical specification of the passive morpheme. The analysis of desiderativization proceeds in essentially the same way, with the only difference being that the desiderative suffix changes the case marking on the object from accusative to nominative, without the valence reduction effect associated with passivization.
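The semantic contribution of the passive morpheme in (43) can be illustrated with a small Python sketch. The toy domain, the relation began_to_read and the individual names below are hypothetical and serve only to make the existential quantification in λPλx.∃y.P(x)(y) concrete; the derived transitive verb is modeled extensionally as a curried relation over that domain.

```python
def rare(P, domain):
    """λPλx.∃y.P(x)(y), with ∃ evaluated over a finite toy domain."""
    return lambda x: any(P(x)(y) for y in domain)

# Hypothetical toy model: pairs (object, subject) of a derived transitive verb.
domain = {"ken", "mari", "hon"}
began_to_read = {("hon", "ken")}

# The reanalyzed complex predicate, taking its object first and subject second.
derived_tv = lambda x: lambda y: (x, y) in began_to_read

passive = rare(derived_tv, domain)
print(passive("hon"), passive("mari"))
```

The passive predicate holds of an individual just in case someone in the domain stands in the derived transitive relation to it, so the embedded object ends up as the sole remaining argument.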

Returning to the technical details of the Slash Introduction rules of the present system, in the rules in (39), the bracketed expression designates the hypothetically assumed expression. The index n attached to the closing bracket is for keeping track of which hypothesis is withdrawn at which step in the proof. Thus, Forward Slash Introduction (39a) essentially says that we can conclude that the string of words b alone is of category B/A, given the proof that b concatenated with a hypothetically assumed φ (with syntactic category A) to its right is of category B (note the parallel to the rule of implication introduction in standard propositional logic). Backward Slash Introduction (39b) is a directional counterpart of Forward Slash Introduction, where the conclusion A∖B is drawn based on the proof of B with hypothesis A, whose phonology appears on the left edge, instead of the right edge. The semantics for the Slash Introduction rules is lambda abstraction: the variable x for the semantics of the hypothesized expression A is bound by the lambda operator at the step where the hypothesis is withdrawn.

It is important to note here that, even though there is a close connection (noted by Miller and Sag 1997; Kathol 1998 and Monachesi 1998) between the mechanism of argument composition (in its original implementation in HPSG) and the way it is simulated in the present system, there is also an important difference between the two in that the former is an operation that is available only for a limited set of lexical items in certain languages, whereas the latter follows from a completely general inference system in an architecture of grammar that views natural language syntax as a kind of logic. At this stage, this generality is both a strength and a (potential) weakness. The additional flexibility introduced with Slash Introduction, if not properly constrained, easily leads to unwanted overgeneration. This point should become clear from the fact that nothing in the logical reconceptualization of argument sharing above explicitly forces the clustering of V1 and V2. To illustrate this point, let us assume something like the following as a language-specific rule for scrambling. (This is for an illustrative purpose only and (45) will later be replaced by a more general mechanism for scrambling formulated as a rule in the morpho-phonological component introduced below.)

  1. (45)

    Japanese scrambling

    figure bn

The ‘…’ here abbreviates an arbitrary sequence of argument categories. Thus, (45) flips the order of two adjacent arguments of a verb, and any permutation possibility of co-arguments of a verb can be derived by successive application of this rule. Note also that by positing (45) as a syntactic rule, interclausal scrambling in complex predicates can be accounted for as well. Applying (45) immediately after the last step of (41) yields the category NP d ∖NP a ∖VP for the derived complex verb, which then licenses the interclausal scrambling word order in (8b).

However, the scrambling rule in (45) is too general and it overgenerates. For example, applying (45) to the lexical entry of morat-ta in (40) produces the category NP d ∖VP∖VP, which incorrectly licenses examples like (9b) from the previous section where an argument NP splits the sequence of V1 and V2. It might appear that this problem could be fixed by assuming that the variables X and Y in (45) cannot be instantiated as VP, but such a solution is questionable since genuine VP complementation constructions in Japanese do allow for arguments and adjuncts to appear between higher and lower verbs. Thus, the relevant constraint to be imposed on (45) would have to specifically exclude verbal projections headed by the embedded verb in complex predicates, but such a solution is evidently ad hoc. Furthermore, just like the (phrase structure-based) argument composition approach, an analysis that treats complex predicate formation purely at the level of the combinatoric component cannot capture in any natural way the different degrees of morpho-phonological bond between V1 and V2 that compound verbs and the -te form complex predicate exhibit. What is lacking in the present system is a way of stating generalizations about surface morpho-phonological forms of linguistic expressions. In the next subsection, I introduce structured phonology, a component of grammar that is specifically designed to capture such generalizations.
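The overgeneration problem can be made concrete with a short Python sketch. It assumes, following the prose description of (45), that the rule swaps two adjacent argument categories of a functor; categories are represented as tuples of argument categories listed from left to right before a fixed result VP, and the function names are hypothetical.

```python
def swap_adjacent(args):
    """All argument lists obtained by one application of the flip in (45)."""
    return {args[:i] + (args[i + 1], args[i]) + args[i + 2:]
            for i in range(len(args) - 1)}

def scramble_closure(args):
    """Argument lists reachable by successive application of the flip."""
    seen, todo = {args}, [args]
    while todo:
        for nxt in swap_adjacent(todo.pop()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def show(args, result="VP"):
    return "∖".join(args + (result,))

# The derived complex verb from (41): both argument orders become available.
print(sorted(show(a) for a in scramble_closure(("NP_a", "NP_d"))))

# Nothing stops the rule from applying to the lexical entry of morat-ta,
# which yields the unwanted category with NP d between V1's VP and V2.
print(sorted(show(a) for a in scramble_closure(("VP", "NP_d"))))
```

Because the rule is stated over categories alone, it cannot tell the verbal complement of morat-ta apart from a nominal co-argument, which is exactly the source of the problem discussed above.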

3.2 Multi-modal structured phonology

In the previous subsection, I simply assumed that the phonological representations of linguistic expressions are (unstructured) strings of words. In order to capture surface morpho-phonological generalizations such as the different degrees of tightness of bond between V1 and V2 found in the two types of complex predicates, we need to enrich the phonological representations of linguistic expressions and make them more ‘structured’, reflecting morpho-phonological constituency in a more nuanced way. In CG, there is a strand of research that incorporates this idea, whose conceptual underpinnings were first laid out explicitly in Dowty (1996b) (originally written in 1989). The technical apparatus for implementing Dowty’s proposal, namely, the notion of multi-modality, goes back to Oehrle and Zhang (1989) and Moortgat and Morrill (1991) and was later refined in variants of Multi-Modal Type-Logical Grammar (Moortgat and Oehrle 1994; Morrill 1994; Moortgat 1997; Bernardi 2002). In this kind of setup, generalizations pertaining to surface morpho-phonological constituency are dealt with in a component, called here structured phonology, that is distinct from the component that deals with valence-driven combinatorics.Footnote 13 I generally build on this research tradition and assume that ‘phonological’ representations that are visible to syntax are ‘multi-modal’, in the sense that they involve different ‘modes of composition’ modeling different morpho-phonological properties that different linguistic phenomena exhibit. For ontological clarity (for which I do not have space here to provide full justification), I distinguish between abstract and concrete phonologies among structured phonologies of linguistic expressions. The former is what the combinatoric component of syntax has access to, and it should be thought of as an abstract representation of all the possible pronunciations that a given linguistic expression can be instantiated to. 
The latter is modeled as strings of words (or, morphemes), and straightforwardly represents the actual pronunciations of linguistic expressions. The technical setup of the component of structured phonology and its conceptual motivations can best be understood by looking at concrete examples. Thus, in what follows I introduce the necessary extensions to the present fragment by first working through an analysis of clause-internal scrambling in Japanese, and then develop it further by moving on to more complex cases involving the two complex predicate constructions.

In order to derive multiple word order possibilities of co-arguments of a single verb, I assume that verbs subcategorize for arguments in the scrambling mode. The mode that is involved in putting together a functor with its argument is specified in the syntactic category of the functor via a subscript on the relevant slash. Thus, the transitive verb mi-ta ‘saw’ is given the following lexical specification, which indicates that it combines with its two arguments in the scrambling mode:

  1. (46)

    mi-ta; see;

Since slashes now specify the modes of composition involved in putting together functor and argument, we need to revise the Slash Elimination rules accordingly.Footnote 14

  1. (47)
    1. a.

      Forward Slash Elimination

      figure bo
    2. b.

      Backward Slash Elimination

      figure bp

Here, the index i ensures that the mode of composition specified on the slash is matched with the mode of composition involved in putting together the phonologies of the functor and the argument.

This revision enables us to assign abstract phonologies to linguistic expressions in syntactic derivations, as in (48):

  1. (48)
    figure bq

Unlike before, the abstract phonology on the last line has a hierarchical structure (indicated by the brackets). Note that abstract phonologies ‘straight out of’ the combinatoric component like this one transparently reflect the combinatorial order.

In order to substantiate the idea that the abstract phonology obtained at the last step in the above derivation represents the total set of possible pronunciations of the sentence, we posit the following two deducibility relations among abstract and concrete phonologies in the component of structured phonology (where A ≤ B should be read as ‘ B is deducible from A’):

  1. (49)

    Scrambling

  1. (50)

    Pronunciation

    A ∘_i B ≤ A ∘ B

Conceptually, the ≤ relation should be thought of as representing an order of abstract ‘degrees of pronounceability’, where elements higher in the order are closer to an actual pronunciation of the linguistic expression (cf. below for more discussion of this point). The rule in (49) regulates the property of the scrambling mode, which models clause internal scrambling in Japanese. This rule says that elements combined in the scrambling mode are permutable with each other, except for the rightmost one (which corresponds to the head verb and thus has to stay in situ).Footnote 15 The rule in (50) converts an abstract phonology to an actually pronounceable string.

With these deducibility relations, we can derive the result in (51), which says that a concrete phonology with the OSV order Hanako-o Taroo-ga mi-ta is deducible from (i.e., is a possible instantiation of) the abstract phonology derived in (48):

  1. (51)
    figure br
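The interplay of Scrambling and Pronunciation can be sketched computationally as follows. The sketch makes two assumptions that go beyond what is stated in prose above: the abstract phonology of (48) is taken to be Taroo-ga combined with the result of combining Hanako-o and mi-ta, both in the scrambling mode, and Scrambling is taken to have the form A ∘ (B ∘ C) ≤ B ∘ (A ∘ C) with all composition in the scrambling mode. Terms are nested tuples (mode, left, right), and the mode label "scr" and the function names are hypothetical.

```python
SCR = "scr"  # hypothetical label for the scrambling mode

def one_step(t):
    """Yield every term obtained by one application of Scrambling inside t."""
    if isinstance(t, str):
        return
    mode, left, right = t
    # Assumed form of Scrambling: A ∘scr (B ∘scr C) ≤ B ∘scr (A ∘scr C).
    # The rightmost element (the head verb) never moves.
    if mode == SCR and isinstance(right, tuple) and right[0] == SCR:
        _, b, c = right
        yield (SCR, b, (SCR, left, c))
    for l2 in one_step(left):
        yield (mode, l2, right)
    for r2 in one_step(right):
        yield (mode, left, r2)

def deducible(t):
    """All abstract phonologies deducible from t by Scrambling."""
    seen, todo = {t}, [t]
    while todo:
        for nxt in one_step(todo.pop()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def pronounce(t):
    """Pronunciation: flatten a structured phonology into a string of words."""
    return t if isinstance(t, str) else pronounce(t[1]) + " " + pronounce(t[2])

abstract = (SCR, "Taroo-ga", (SCR, "Hanako-o", "mi-ta"))
orders = {pronounce(t) for t in deducible(abstract)}
print(sorted(orders))
```

Under these assumptions the abstract phonology is instantiated by exactly the SOV and OSV strings, with the verb fixed in final position.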

For simple examples like clause-internal scrambling above, it suffices to think of the combinatoric component (with the Slash Elimination and Slash Introduction rules) and the component of structured phonology as constituting totally independent components of grammar where the output of the former is fed to the latter as an input. However, more complex linguistic phenomena (to be discussed below) require closer interactions between the two components. For this reason, it is necessary to posit the following P-interface rule, which makes it possible to refer to the deducibility relations among abstract and concrete phonologies in the component of structured phonology in the middle of syntactic derivations in the combinatoric component. (In fact, as will become clear below, the way in which the two components interact with one another via the P-interface rule is what enables the synthesis of the verb-raising approach and the argument-sharing approach in the present system.)

  1. (52)

    P-interface rule

    figure bs

This rule can be thought of as a channel between the combinatoric component and the morpho-phonological component: it states that, at any point in the syntactic derivation, the structured phonology of the linguistic expression can be replaced with one that is deducible (in the technical sense introduced above) from the original structured phonology, with the syntactic category and semantics unchanged. Technically, the P-interface rule (52) needs to be explicitly posited since the deducibility relations among structured phonologies like (49) and (50) do not by themselves sanction an actual inference in the proof system of the combinatoric component. Conceptually, the ≤ relation represents the abstract ‘degrees of pronounceability’ and the rules like (49) and (50) specify specific ways of going from more abstract (and ‘less pronounceable’) phonologies to more concrete (and ‘more pronounceable’) ones. In view of this, the P-interface rule can be thought of as a mechanism that allows one to infer a more ‘pronounceable’ phonology for subparts of a sentence as they are put together, rather than doing this ‘phonological reasoning’ all at once after everything is put together in the combinatoric component.

With the P-interface rule (52), the derivation for the scrambling sentence can now be rewritten as follows (with the last step supported by the deducibility relation in (51)):

  1. (53)
    figure bt

With the assumptions introduced above, we can now model the verb-raising approach to complex predicates in the present system in terms of morpho-phonological clustering of V1 and V2 in structured phonology. The lexical entries for the relevant items are given in (54).

  1. (54)

    hazime-ta; begin;

    morat-ta; benef;

In both compound verbs and the -te form complex predicate, V2 subcategorizes for a VP headed by V1. The crucial difference between the two constructions is that they combine with this embedded VP in different modes of composition, the clustering left-associative mode (for compound verbs) and the non-clustering left-associative mode (for the -te form complex predicate). (I call these modes simply the ‘clustering mode’ and the ‘non-clustering mode’ below.) The difference between the two modes is that morpho-phonological clustering of V1 and V2 is obligatory for the clustering mode, while it is optional for the non-clustering mode. As will become clear below, this distinction captures the difference in the tightness of bond between V1 and V2 in the two constructions as manifested in the phenomena reviewed in Sect. 2.1 (i.e., embedded VP coordination, focus particle insertion and verb duplication), where the sequence of V1 and V2 can be split in the -te form complex predicate but not in compound verbs with respect to these three phenomena. These modes are both left associative, which means that the following structure-changing rule is applicable to them:

  1. (55)

    Left Association

       ()

The following derivation of the interclausal scrambling example with the -te form complex predicate illustrates how the left-associative property of these two modes models verb clustering and the resultant ‘liberation’ of embedded arguments (and adjuncts) to matrix clauses in the morpho-phonological representation. (Here and in what follows, I omit semantics from derivations whenever the details of compositional semantics are irrelevant. Also, technically, the successive applications of the P-interface rule at the end of (56) and elsewhere can be collapsed into one step, but the intermediate steps are shown here and below for expository ease.)

  1. (56)
    figure bu

Note crucially here that the second to last step in this derivation is licensed by Left Association (55). This restructuring of morpho-phonological constituency in effect makes the embedded argument piano-o and the matrix argument Ken-ni clause-mates, and then the two can be scrambled via the Scrambling rule (49) at the final inference step.
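The effect of Left Association on word order can likewise be sketched in Python. Since only the prose description is relied on here, the forms of both rules are assumptions: Left Association is taken to restructure (A ∘j B) ∘i C into A ∘j (B ∘i C) when i is a left-associative mode, and Scrambling is taken to have the form A ∘ (B ∘ C) ≤ B ∘ (A ∘ C) in the scrambling mode. The matrix subject is omitted, and the mode labels and function names are hypothetical.

```python
SCR, NCL = "scr", "ncl"   # hypothetical labels: scrambling, non-clustering
LEFT_ASSOC = {NCL}        # the clustering mode would also belong here

def one_step(t):
    """Yield every term obtained by one rule application anywhere inside t."""
    if isinstance(t, str):
        return
    mode, left, right = t
    # Assumed Scrambling: A ∘scr (B ∘scr C) ≤ B ∘scr (A ∘scr C)
    if mode == SCR and isinstance(right, tuple) and right[0] == SCR:
        yield (SCR, right[1], (SCR, left, right[2]))
    # Assumed Left Association: (A ∘j B) ∘i C ≤ A ∘j (B ∘i C)
    if mode in LEFT_ASSOC and isinstance(left, tuple):
        j, a, b = left
        yield (j, a, (mode, b, right))
    for l2 in one_step(left):
        yield (mode, l2, right)
    for r2 in one_step(right):
        yield (mode, left, r2)

def deducible(t):
    seen, todo = {t}, [t]
    while todo:
        for nxt in one_step(todo.pop()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def pronounce(t):
    # Both modes used in this sketch are directly pronounceable.
    return t if isinstance(t, str) else pronounce(t[1]) + " " + pronounce(t[2])

# Ken-ni combined with the projection of morat-ta, which embeds piano-o hii-te.
abstract = (SCR, "Ken-ni", (NCL, (SCR, "piano-o", "hii-te"), "morat-ta"))
orders = {pronounce(t) for t in deducible(abstract)}
print(sorted(orders))
```

In this sketch piano-o can precede Ken-ni only after hii-te and morat-ta have been clustered, and no derivable string separates the two verbs, since the non-clustering mode does not feed Scrambling.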

The impossibility of splitting the sequence of V1 and V2 with arguments or adjuncts in the two complex predicate constructions also falls out naturally in the proposed analysis. What is crucial here is that in both constructions, a left-associative mode is involved in putting together (a projection of) V1 and V2. Unlike the scrambling mode that is used for combining verbs with their nominal arguments, neither the clustering mode nor the non-clustering mode undergoes the Scrambling rule (49). Specifically, in order to derive the word order in (57b), something like (58) needs to be a valid deducibility relation in structured phonology, but since the Scrambling rule (49) is not applicable to the non-clustering mode, it is not. Thus, sentences like those in (57) are correctly ruled out:

  1. (57)
    1. a.
      figure bv
    2. b.
      figure bw
  1. (58)
    figure bx

The difference between the clustering mode and the non-clustering mode lies in the fact that the former cannot be directly converted to the pronunciation mode via the Pronunciation rule (50). For the clustering mode, there is instead a special rule that converts it to the pronounceable mode in a certain restricted environment:

  1. (59)

    Cluster Pronunciation

    A ∘_i B ≤ A ∘ B    (where i is the clustering mode and both A and B are of type cl)

The subtype cl (of type str) introduced here distinguishes strings of words consisting of verb phonologies alone from other (concrete or abstract) phonologies (which are of the more general type str). The idea is that the subtype cl singles out a certain subset of phonological terms belonging to the general type str. Technically, the type cl is defined by the recursive condition in (60) together with the assumption that all verb phonologies like hiki, yomi, hazime-ta, etc., are listed in the lexicon as belonging to the type cl:

  1. (60)

    If a and b are terms of type cl, then so is a ∘ b.

From this it follows that an expression like yomi ∘ hazime-ta is of type cl, but hon-o ∘ yon-da is not, since hon-o, being the phonology of an NP, is not of type cl.Footnote 16
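The recursive definition of type cl and the side condition on Cluster Pronunciation can be sketched as follows. The lexicon listing, the mode labels "pron" and "cl", and the function names are hypothetical; terms are nested tuples (mode, left, right) with word phonologies as leaves.

```python
PRON, CL = "pron", "cl"   # hypothetical labels: pronunciation, clustering

# Verb phonologies listed in the lexicon as being of type cl (toy listing).
VERB_PHONOLOGIES = {"hiki", "yomi", "hazime-ta", "yon-da"}

def is_cl(t):
    """(60): verb phonologies are of type cl, and so is a ∘ b if a and b are."""
    if isinstance(t, str):
        return t in VERB_PHONOLOGIES
    mode, left, right = t
    return mode == PRON and is_cl(left) and is_cl(right)

def cluster_pronunciation(t):
    """(59): convert the clustering mode to the pronunciation mode,
    but only if both daughters are of type cl; otherwise return None."""
    if isinstance(t, tuple) and t[0] == CL and is_cl(t[1]) and is_cl(t[2]):
        return (PRON, t[1], t[2])
    return None

print(is_cl((PRON, "yomi", "hazime-ta")))        # a verb cluster
print(is_cl((PRON, "hon-o", "yon-da")))          # contains an NP phonology
print(cluster_pronunciation((CL, "yomi", "hazime-ta")))
print(cluster_pronunciation((CL, (PRON, "hon-o", "yomi"), "hazime-ta")))
```

The last call returns None because the left daughter contains an NP phonology, which is the situation in which restructuring must first bring V1 and V2 together.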

From this, it follows that sentences involving compound verbs (but not the -te-form complex predicate) yield pronounceable phonologies only if verb clusters are formed so that the Cluster Pronunciation rule (59) can apply, as in the derivation (62) for (61):

  1. (61)
    figure ca
  1. (62)
    figure cb

The last step in this derivation is supported by the following deducibility relation in structured phonology:

  1. (63)
    figure cc

With these assumptions, the accounts of the three phenomena in which compound verbs and the -te-form complex predicate contrast with one another in terms of the degree of tightness of morpho-phonological bond between V1 and V2 are now straightforward. The following ill-formed vs. well-formed derivations for the embedded VP coordination examples in (64) and (65) illustrate this point. I assume that the conjunction matawa combines with the left and right conjuncts directly in the pronunciation mode. For notational convenience, I omit a subscript for the pronunciation mode and thus the syntactic category of the conjunction is written as (X∖X)/X:

  1. (64)
    figure cd
  1. (65)
    figure ce

Unlike (65), (64) does not yield a pronounceable phonology. This is so because, in order to convert the abstract phonology into a concrete pronounceable one that does not contain the clustering mode, the phonology of the whole sentence needs to be restructured so that a local structure is created that satisfies the applicability condition of Cluster Pronunciation (59), but this is impossible in (64). The embedded VP involves a coordinate structure (which is not of type cl, since it contains non-verb phonologies), and therefore the phonology at the bottom of (64) does not undergo Cluster Pronunciation (59). This means that the embedded verb in the second conjunct somehow needs to be moved out of the conjunct to form a verb cluster with V1 alone. But this possibility is blocked too, since, crucially, a coordinate structure is not left associative (with the conjunction combining with the two conjuncts in the pronunciation mode, which does not satisfy Left Association (55)), and hence the necessary restructuring cannot be carried out. Thus, the whole expression remains unpronounceable and the example is correctly ruled out. By contrast, with the -te form complex predicate, the non-clustering mode can be converted to the pronounceable mode without V1 forming a cluster with V2, and therefore the derivation goes through as in (65).

The patterns of focus particle insertion and verb duplication can be accounted for similarly. I assume that focus particles like sae are given an adverb-like syntactic category X∖X and the verb duplication trigger koto-wa is assigned a conjunction-like category X∖(X/X) (both combining with their arguments directly in the pronunciation mode just like the conjunction above). The focus particle attaches to V1 and the duplication trigger koto-wa attaches to V2 in the relevant examples, but, crucially, in either case, the presence of the focus particle or the duplication structure prevents V1 and V2 from being directly combined with each other to form a morpho-phonological cluster (essentially for the same reason as in the VP coordination example above), thereby failing to satisfy the applicability condition for the Cluster Pronunciation rule (59) in the case of compound verbs. With the -te form complex predicate, no such problem arises since verb clustering is not obligatory, so the sentences are correctly licensed. I omit the relevant derivations here since the accounts are essentially parallel to the case of VP coordination above.

Finally, like verb-raising approaches in the previous literature, facts regarding biclausal semantic interpretation of scope-taking elements are straightforward in the present approach. To derive the ‘sublexical scope’ of adverbs (where they scope above V1 but below V2), we just need to license an adverb in the embedded VP in the combinatoric structure, which then scrambles out of this embedded VP due to verb clustering (obligatorily in compound verbs and optionally in the -te form complex predicate). The derivation for the narrow-scope reading of (66) (= (13a)) in (67) illustrates how this works:

  1. (66)
    figure cf
  1. (67)
    figure cg

The account of quantifier scope is essentially parallel. The two scoping possibilities for quantifiers are predicted on the proposed account since there are two scoping domains for quantifiers in the syntactic derivation (which reflects the combinatoric structure): the embedded VP and the matrix clause. See Kubota (2010) for an analysis which works out the relevant details explicitly.

The empirical pattern of semantic biclausality of complex predicates is actually somewhat more complicated than what the above illustration might suggest. It is well known in the literature on complex predicates in Japanese that syntactic compound verbs are divided into two classes: those that allow for biclausality effects and those that do not (see, for example, Kageyama 1993; Matsumoto 1996). The two classes show consistent behaviors with respect to different tests for biclausality including the scope of adverbs and quantifiers discussed in the previous section. For example, with the type of compound verbs that Kageyama (1993) categorizes as the V′-type, which includes V-wasureru ‘forget to V’ and V-naosu ‘re-V’, scope ambiguity is not observed with either adverbs or quantifiers:

  1. (68)
    1. a.
      figure ch
    2. b.
      figure ci
  1. (69)
    1. a.
      figure cj
    2. b.
      figure ck

In all of the previous accounts I know of, the difference between the set of complex predicates that exhibit biclausality effects and those that do not (which I respectively call ‘biclausal compound verbs’ and ‘monoclausal compound verbs’ below) is accounted for in terms of some kind of stipulation or other as to their syntactic properties (see, for example, Kageyama’s 1993 and Matsumoto’s 1996 approaches where the difference is attributed to the differences in the complexities of syntactic projection in the former and of the f-structural representation in LFG in the latter). I suggest here a solution similar in spirit to these previous accounts, but one which is simpler in that it just involves one additional syntactic feature. Specifically, I propose to block the narrow-scope readings of adverbs and quantifiers in monoclausal compound verbs by means of a lexically specified binary feature ±lex on the category S that regulates the syntactic environments in which scopal expressions such as adverbs and quantifiers can take scope. With this binary feature, we can rule out the unwanted narrow-scope readings for adverbs and quantifiers by making the following two assumptions:

  • V1 of monoclausal compound verbs (and nothing else) is rooted in S+lex .

  • Quantifiers and adverbs are lexically specified so as not to be able to scope over categories rooted in S+lex .

Example (70) shows a failed derivation for the adverb narrow-scope reading for the monoclausal compound verb V-wasureru in (68a):

  1. (70)
    figure cl

In order to take a narrow scope, the adverb needs to originate in the embedded VP. However, the problem here is that the syntactic category of the lower VP (VP+lex , which abbreviates NP n ∖S+lex ) does not match the specification on the adverb (which requires the embedded VP to be rooted in S−lex ). Thus, the derivation in (70) fails. And since generating the adverb within the lower VP is the only way to make it take a narrow scope, the narrow-scope reading is blocked. The lack of quantifier scope ambiguity is accounted for similarly. For the quantifier to take a narrow scope, it has to occur within the embedded VP in the derivation, but that possibility is blocked for compound verbs like V-wasureru due to the conflict in the value of the lex feature.Footnote 17
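The feature-matching logic behind this account can be sketched in a few lines of Python. The class VP and the two functions are hypothetical simplifications: a VP is reduced to the ±lex value of the S it is rooted in, and an adverb is assumed to attach only to a VP rooted in S−lex.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VP:
    """VP abbreviates NP n ∖S; `lex` is the ±lex value of the S it is rooted in."""
    lex: bool

def adverb_can_attach(vp: VP) -> bool:
    # Adverbs are lexically specified not to scope over S+lex.
    return not vp.lex

def adverb_scope_readings(embedded: VP, matrix: VP = VP(lex=False)):
    """Scope readings licensed by attaching the adverb high or low."""
    readings = []
    if adverb_can_attach(matrix):
        readings.append("wide")
    if adverb_can_attach(embedded):
        readings.append("narrow")
    return readings

# Monoclausal compound verb such as V-wasureru: V1 is rooted in S+lex.
print(adverb_scope_readings(VP(lex=True)))
# Biclausal compound verb: the embedded VP is rooted in S−lex.
print(adverb_scope_readings(VP(lex=False)))
```

Only the wide-scope reading survives for the monoclausal class, while both readings are available when the embedded VP is rooted in S−lex.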

The intuition behind the +lex specification for monoclausal compound verbs is that the syntactic projection of the embedded VP is ‘defective’ for them, in the sense that it does not constitute a full-fledged clausal domain for scopal elements. In this sense, it is reminiscent of Kageyama’s (1993) proposal in which these verbs directly take V′ projections rather than full-fledged VP projections as the complement verbal projection. But the account here is simpler and more explicit than Kageyama’s in distinguishing the relevant properties of the two types of compound verbs purely in terms of their lexical properties, which is fully in line with the underlying empirical observation that the relevant distinction is lexical in nature.

3.3 Putting the two components together

The system developed in the previous subsection models the verb-raising approach to complex predicates in a formally rigorous way and successfully accounts for the similarities and differences between compound verbs and the -te-form complex predicate reflected in the basic word order patterns. As it is, however, we do not yet have a fully general account of complex predicates since it does not model the effect of argument sharing, i.e., the merging of the subcategorization frames lexically associated with separate verbs into a single subcategorization specification of the complex predicate, as attested by the evidence presented in Sect. 2.2. Specifically, the Slash Introduction rules that were introduced in Sect. 3.1 need to be put back into the present system that is extended with the structured phonology component. This is almost straightforward except that we need to make a small revision to the rule so that it is consistent with the extended architecture that has the modality distinctions and in which (abstract) phonological representations of linguistic expressions are structured, rather than being simple strings of words.

The revised Slash Introduction rules are given in (71).Footnote 18

  1. (71)
    1. a.

      Forward Slash Introduction

      figure co
    2. b.

      Backward Slash Introduction

      figure cp

Just like the (revised) Slash Elimination rules, the coindexation by the variable i guarantees that the right modality is inherited from the input expression to the output expression. Since Slash Introduction is a rule that creates a new functor by means of hypothetical reasoning, the coindexation is between the modality by which the phonology of the hypothesis is combined with the rest of the phonology of the input and the modality by which the newly created functor in the output looks for its argument (which is indicated as a subscript on the slash of the output syntactic category). It should be noted that, given the conceptual understanding of the Slash Introduction rules that they are essentially rules for manipulating the combinatorial properties of linguistic expressions relative to each other, the application of the rules in (71) should not destroy the hierarchical structure and linear order encoded in the phonology of the input expression. In other words, combining an expression of category A with an expression of category B/ i A via Slash Elimination right after the application of Slash Introduction (71a) should yield an expression that is exactly identical in form to the input of (71a). This is ensured by assuming that the phonology of the hypothesized expression appears at the right or left edge of the input expression not only linearly but also hierarchically. (This is implicit in the notation in (71); it is guaranteed by assuming that the metavariable b in the phonology of the input expression is itself a well-formed structured phonology.)

Unlike the fragment from the previous subsection, the system at this point can model the effects of argument sharing essentially in the same way as in the fragment from Sect. 3.1. However, the relevant inference is now constrained to be available only in restricted environments where the morpho-phonological configuration of the relevant linguistic expressions satisfies the applicability condition of the revised Slash Introduction rules. This is illustrated in the following partial derivation that assigns a derived ditransitive verb-like category to the verbal complex involving the -te form complex predicate:

  1. (72)
    figure cq

Here, via the clustering of V1 and V2 with Left Association (55), the accusative NP hypothesized within the embedded VP is pushed to the left edge of the phonology of the whole expression. This makes it possible to withdraw this hypothesized NP with Backward Slash Introduction (71b) so that a ‘reanalyzed’ ditransitive verb-like category is assigned to the whole complex predicate.
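The way restructuring feeds Slash Introduction can be sketched as follows. The sketch assumes that Left Association restructures (A ∘j B) ∘i C into A ∘j (B ∘i C) for a left-associative mode i, and that Backward Slash Introduction may withdraw a hypothesis only when it is the left daughter of the root of the structured phonology. The mode labels, the marker for the hypothesis and the function names are hypothetical.

```python
HYP = "[φ]"                    # phonology of the hypothesized accusative NP
LEFT_ASSOC = {"cl", "ncl"}     # hypothetical labels for the two modes

def left_associate(t):
    """Assumed Left Association at the root: (A ∘j B) ∘i C ≤ A ∘j (B ∘i C)."""
    mode, left, right = t
    if mode in LEFT_ASSOC and isinstance(left, tuple):
        j, a, b = left
        return (j, a, (mode, b, right))
    return None

def backward_intro(phon, cat, hyp_cat):
    """Withdraw the hypothesis if it sits at the left edge hierarchically.

    Returns the remaining phonology and the category hyp_cat ∖i cat, encoded
    as (hyp_cat, i, cat); returns None if the rule is not applicable."""
    if not (isinstance(phon, tuple) and phon[1] == HYP):
        return None
    mode, _, rest = phon
    return rest, (hyp_cat, mode, cat)

# Straight out of the combinatoric component: the hypothesis is buried in VP.
before = ("ncl", ("scr", HYP, "hii-te"), "morat-ta")
print(backward_intro(before, "NP_d∖VP", "NP_a"))   # not applicable

# After clustering V1 and V2, the hypothesis is at the left edge.
after = left_associate(before)
print(backward_intro(after, "NP_d∖VP", "NP_a"))
```

The mode in which the hypothesis was combined with the rest of the phonology is carried over to the slash of the derived category, as required by the coindexation in (71).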

Thus, we see here that, just as in the simpler system introduced in Sect. 3.1, the effect of argument composition falls out as a theorem due to the availability of hypothetical reasoning. However, the present system is more constrained than the simplified system in Sect. 3.1 in that the availability of hypothetical reasoning crucially depends on the fact that restructuring of the abstract phonology is possible so that V1 and V2 form a morpho-phonological unit. This morpho-phonological reanalysis makes it possible to withdraw a hypothetically assumed embedded argument after V1 and V2 are put together so that the desired combinatorial reanalysis of the complex predicate as a single derived verb goes through. It is in this sense that the effect of ‘argument sharing’ falls out as a consequence of ‘verb raising’ in the present approach.Footnote 19 As should already be clear, this result depends on the architecture wherein reasoning in the combinatoric component is constrained and informed by reasoning about surface morpho-phonological forms of linguistic expressions in the component of structured phonology, with the P-interface rule serving as a channel between the two components. The hierarchical structures of phonological representations of linguistic expressions before the application of the P-interface rule reflect the combinatorial order of the elements involved. Such phonological representations can then be restructured via the P-interface rule, in accordance with the lexically encoded modality specifications reflecting different degrees of flexibility in morpho-phonological constituency. And, crucially, this morpho-phonological reasoning can sometimes feed into further combinatorial reasoning via the (revised) Slash Introduction rules, with the restructured morpho-phonological representation enabling a previously unavailable application of Slash Introduction. 
As I show below, the restricted availability of this combinatorial reanalysis in the present system plays a crucial role in accounting for the contrast between grammatical and ungrammatical examples of NCC and nonconstituent clefting as these phenomena interact with complex predicates.

Empirically, with this modeling of argument sharing, the analyses of passivization and desiderativization that remained problematic for the fragment from the previous subsection now become straightforward. Since the analyses of these phenomena are essentially the same as in the fragment from Sect. 3.1, I omit derivations here.

The more significant consequence of the synthesis of the verb-raising approach and the argument-sharing approach attained at this point is that it enables principled solutions for the three hitherto unattended sets of data from the previous section that pose problems for both of the previous approaches. What is common in these phenomena (especially the first two) is that both the ‘verb-raising’ properties and the ‘argument-sharing’ properties of complex predicates become simultaneously relevant in capturing the empirical patterns observed, and, precisely for that reason, they reveal the limitations of the previous two types of approaches that reflect only one of these aspects of complex predicates. What crucially distinguishes the present approach from these previous alternatives is that it captures the deeper, ‘logical’ connection between the ‘verb-raising’ and ‘argument-sharing’ properties of complex predicates as a natural consequence of the theoretical architecture adopted. In this setup, the seemingly complex patterns found in the interactions between complex predicates and these phenomena (which are themselves recalcitrant problems in the syntactic literature) fall out straightforwardly as a consequence of interactions of independently motivated analyses of the respective syntactic patterns.

We start with the case of NCC. Example (74) shows the derivation for (73) (= (26a)), an NCC sentence in which arguments of V1 and V2 together form an argument cluster to be coordinated (here DTV is an abbreviation for ).

  1. (73)
    figure cr
  1. (74)
    figure cs

There are two key components in this derivation. The first is the part where a reanalyzed ditransitive verb-like category is assigned to the cluster of V1 and V2 (marked as in the above derivation), which is identical to the previous derivation (72). The second key component is the part where an argument cluster constituent is formed involving the matrix dative NP and the embedded accusative NP (the steps down to on the left-hand side in (74)). Here, a ditransitive verb is first hypothesized to the right of the two NPs. Once a VP is formed with the two NPs as arguments, the phonology of this hypothesized verb is pushed to the right periphery (), so that the hypothesis can be withdrawn with Slash Introduction (). This assigns the category to the argument cluster, which is looking for a ditransitive verb to its right to become a VP. (The restructuring of the phonology at step actually involves a rule that has not yet been introduced; I will come back to this point immediately below.) Finally, two such argument clusters are coordinated to return a larger expression of the same category () and the coordinated argument cluster is then combined with the verb cluster that it is looking for as its argument to form a VP ().

To license the PI step marked as above, we need the following Right Association rule:

  1. (75)

    Right Association

This, together with Left Association (55) (which is also applicable to the scrambling mode), in effect flattens out the hierarchical constituent structure involving a verb and its arguments and adjuncts within a clause. That is, as long as all the elements are combined in the scrambling mode , hierarchical constituent structure is completely irrelevant since any structure can be re-bracketed to any other one by applying (55) and (75) successively. As will become clear below, the flexibility of constituency in basic clause structure introduced here is crucial to the analysis of nonconstituent clefting as well.

The present analysis also correctly predicts the ungrammaticality of examples like (76) (= (26b)), where V1 is split from V2 in NCC. Here, what blocks the unwanted overgeneration is the limited degree of morpho-phonological flexibility encoded in the two left-associative modes reflecting the ‘verb-raising’ property of complex predicates. For (76) to be derived, a string composed of the matrix dative argument and the embedded VP would need to be assigned a category that is looking for the matrix verb to return a VP. However, this is impossible given the limited flexibility of the non-clustering mode for the -te-form complex predicate. Example (77) shows a failed derivation for (76):

  1. (76)
    figure ct
  1. (77)
    figure cu

In (77), the matrix verb is hypothesized so that the coordinated string would be assigned the desired syntactic category. For this derivation to go through, the phonology of the expression derived at the last step in (77) would have to be restructured with the P-interface rule so that the phonology of the hypothesized verb is pushed to the right edge (i.e., not just linearly at the right periphery but also hierarchically being the immediate rightmost daughter of the whole structured phonology) to satisfy the applicability condition of Slash Introduction. But this restructuring is not allowed since the non-clustering mode is not right associative, a property limited to the scrambling mode in the present fragment, as per (75). Thus, the ungrammaticality of sentences such as (76) is predicted as a direct consequence of the limited degree of morpho-phonological flexibility of the non-clustering mode involved in the -te-form complex predicate.

It should be obvious that the same pattern of grammaticality obtains for cases involving compound verbs, given that verb clustering (which enables argument structure reanalysis) is possible (in fact obligatory) with them (thus licensing (25a)), and that the clustering mode, like the non-clustering mode, is not right associative (thus ruling out (25b)).
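
The contrast between a mode that has both association rules and a mode that lacks Right Association can be illustrated with a small sketch. This is an informal Python model and not part of the fragment: structured phonologies are represented as nested pairs, a single mode is assumed throughout each structure, and the rewrite directions are inferred from the prose (Left Association regroups (A ∘ B) ∘ C as A ∘ (B ∘ C), Right Association does the reverse). The names `left_assoc`, `right_assoc`, `step`, and `closure` are hypothetical.

```python
def left_assoc(t):
    """(A . B) . C  ->  A . (B . C) at the root, or None if not applicable."""
    if isinstance(t, tuple) and isinstance(t[0], tuple):
        (a, b), c = t
        return (a, (b, c))
    return None


def right_assoc(t):
    """A . (B . C)  ->  (A . B) . C at the root, or None if not applicable."""
    if isinstance(t, tuple) and isinstance(t[1], tuple):
        a, (b, c) = t
        return ((a, b), c)
    return None


def step(t, rules):
    """All structures obtained by applying one rule at one node of t."""
    out = set()
    for rule in rules:
        result = rule(t)
        if result is not None:
            out.add(result)
    if isinstance(t, tuple):
        left, right = t
        out |= {(new_left, right) for new_left in step(left, rules)}
        out |= {(left, new_right) for new_right in step(right, rules)}
    return out


def closure(t, rules):
    """Every structure reachable from t by repeated rule application."""
    seen, todo = {t}, [t]
    while todo:
        for nxt in step(todo.pop(), rules):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


# Assumption: the scrambling mode has both rules; a mode that is
# not right associative is modeled with Left Association only.
SCRAMBLING = [left_assoc, right_assoc]
NOT_RIGHT_ASSOCIATIVE = [left_assoc]

# "V" stands for a hypothesized verb that must end up as the
# immediate rightmost daughter before it can be withdrawn.
start = ("A", ("B", "V"))
target = (("A", "B"), "V")

print(target in closure(start, SCRAMBLING))
print(target in closure(start, NOT_RIGHT_ASSOCIATIVE))
print(len(closure(("a", ("b", ("c", "d"))), SCRAMBLING)))
```

With both rules, every binary bracketing of a given string is reachable from every other, which is the sense in which constituent structure is flattened. With Left Association alone, the hypothesized element in `start` can never be moved to the right edge, mirroring the failure of (77).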

The scope facts involving disjunction of nonconstituent strings in examples like (78) (= (27a)) are also unproblematic in the present approach. The derivation for (78) is given in (79) (here, as above, DTV is an abbreviation for and ⊔ designates the polymorphic disjunction operator in the generalized conjunction analysis of coordination (Partee and Rooth 1983)):

  1. (78)
    figure cv
  1. (79)
    figure cw

Here, argument cluster constituents are first coordinated via the usual process of NCC and the whole coordinate structure is combined with V1 and V2 successively. Since the whole coordinate structure appears inside an argument of the higher predicate -tai ‘want’ in the combinatoric structure, the desired disjunction-narrow-scope reading is obtained.

The patterns involving nonconstituent clefting receive a straightforward treatment parallel to the case of NCC. Here, I adopt the analysis of nonconstituent clefting proposed by Kubota and Smith (2006) in the framework of CCG by making some minor adjustments. The key assumption, together with general flexibility of constituency available in CG, that enables a uniform analysis of ordinary single constituent clefting and nonconstituent clefting in Kubota and Smith’s analysis is the lexical entry for the copula. Specifically, the copula is assigned a polymorphic syntactic category so that it combines with a higher-order expression (derived via hypothetical reasoning) that is looking for some material to the right to become a sentence () and reverses the order in which it seeks its argument ():

  1. (80)
    figure cx
  1. (81)
    figure cy

In derivation (81) for (80), an accusative NP appears in the precopular focus position. With hypothetical reasoning, this NP is assigned a higher-order syntactic category (). By combining with the copula, this type-raised NP looks for its argument to its left (). The topicalized expression (i.e., the non-focus material) is a sentence missing an accusative NP and can thus be derived as (). Since the syntactic category of this non-focus part matches the category of the argument that the focus part is looking for, the two can be combined via function application, completing the derivation for the whole sentence ().

Since a single NP is clefted in (81), the (raised) syntactic category of the precopular focus expression is just that of a type-raised NP (of the generalized quantifier type). However, since the lexical entry for the copula is polymorphic, with the variable X ranging over possibly complex syntactic categories, this is not the only possibility. In fact, this polymorphism is what allows argument cluster nonconstituents to appear in the focus position. This is illustrated in derivation (83) for (82) (= (31a)), a sentence in which arguments of V1 and V2 of the -te form complex predicate together appear in the focus position of a cleft sentence:

  1. (82)
    figure cz
  1. (83)
    figure da

The argument cluster Ken-ni piano-o is assigned the syntactic category , via essentially the same process of hypothetical reasoning as in the analysis of NCC above (). Likewise, the sequence of V1 and V2 hii-te morat-ta is assigned a ditransitive verb-like category via hypothetical reasoning as above (). As shown in the derivation in (83), by combining with the copula, the argument cluster (which is already in a functor category) reverses the order in which it looks for its argument (i.e., , a sentence missing accusative and dative NPs) (). Since this is exactly the category of the expression in the main clause (i.e., the non-focused part) where the derived ditransitive verb combines (non-hypothetically) only with the subject (steps on the left-hand side down to ), the two parts can be put together via Slash Elimination to complete the derivation (). Note crucially here that the exact same mechanism that reanalyzes nonconstituent strings as constituents is involved in this analysis of nonconstituent clefting as in the analysis of NCC above. We have already seen above (cf. derivation (77) for example (76)) that V1 and V2 in the -te form complex predicate cannot be separated from each other in forming such ‘nonconstituent’ constituents due to the limited flexibility of the non-clustering left-associative mode involved in this construction. Specifically, the proof for the NCC example in (77) fails since the restructuring that is necessary for carrying out the relevant hypothetical reasoning fails due to the non-right-associativity of the non-clustering left-associative mode . The ungrammaticality of its nonconstituent clefting counterpart (31b) falls out for exactly the same reason. The data involving compound verbs in (30) receive essentially the same account, as should be clear from the way in which the parallel data involving NCC above are accounted for.

Finally, the present CG framework enables a simple treatment of the previously unnoticed internal readings of symmetrical predicates as they interact with certain complex predicates. Example (84) (= (32a) from Sect. 2.3) is representative of this pattern:

  1. (84)
    figure db

In this type of example, the symmetrical predicate depends on the higher verb (tuzuke ‘continue’ in (84)) of the complex predicate in inducing its internal reading (unavailable in simplex sentences not involving a complex predicate) which essentially asserts the uniqueness of the book involved in the continuous reading event that unfolds over a certain period of time.

We need to introduce some modest extensions to the present system to handle the scope-taking behaviors of symmetrical predicates. The first is a general mechanism for scope taking. Here, I adopt an approach that is originally due to Oehrle (1994) which exploits the separation of the combinatoric and morpho-phonological components characterizing the present framework. The key idea is to posit bindable variables in the phonological component as well as in the semantic component so that the mismatch between syntax and semantics exhibited by scope-taking expressions like quantifiers can be explicitly handled. Technically, this can be done by introducing a new kind of slash, which I call the vertical slash, together with its Introduction and Elimination rules defined as follows:

  1. (85)
    1. a.
      figure dc
    2. b.
      figure dd

Unlike the Introduction rules for the forward and backward slashes, the variable φ for the phonology of the hypothesized expression is not thrown away but is bound by the lambda operator in the Vertical Slash Introduction rule (85a). This creates functional phonological expressions that take (abstract representations of) strings as inputs and produce (abstract representations of) strings as outputs. Linguistic expressions with such functional phonologies are applied to their arguments by the Vertical Slash Elimination rule (85b).

The following example illustrates how the inverse scope reading for the sentence Someone loves everyone is licensed. (Following Oehrle 1994, I assume that quantifiers are lexically specified in the syntactic category \(\mathrm{S}|(\mathrm{S}|\mathrm{NP})\), with ordinary generalized quantifier meanings and functional phonologies of type \((\mathrm{str} \rightarrow \mathrm{str}) \rightarrow \mathrm{str}\).)

  1. (86)
    figure de

Here, after hypothesizing the subject and the object NPs, the subject quantifier someone first takes scope with the Vertical Slash Introduction rule binding the variable in the subject position and the Vertical Slash Elimination rule applying the quantifier meaning and phonology to the lambda-abstracted meaning and phonology of the sentence. The lexical entry for the quantifier essentially says that it semantically scopes over the whole expression that it takes as its argument, but phonologically, it embeds its phonology in the variable position bound by the λ-operator in its argument. This has the effect of mediating the mismatch between surface string and semantics, much in the same way as Montague’s quantifying-in rule. The derivation is completed by applying the same steps for the object quantifier. Since the derivational history in the combinatoric component transparently corresponds to the semantic scope of the quantifier, the inverse scope reading results from this derivation. By reversing the order of the last two quantifier scoping steps (i.e., the two pairs of Vertical Slash Introduction and Elimination steps), the surface scope reading is obtained. From this, the close parallel between the treatment of quantifier scope via the vertical slash in the present fragment and Montague’s original syncategorematic rule of quantifying-in should be apparent. Oehrle’s innovation lies in modeling the effect of ‘quantifying in’ in a more explicit and logically transparent manner wherein it falls out as a consequence of a modest extension (i.e., the introduction of a new, order-insensitive slash tied to λ-binding in phonology) to the general setup of deductive logic of natural language syntax.
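
The way a quantifier pairs a functional phonology with a generalized quantifier meaning can be sketched informally as follows. This is a toy Python model under stated assumptions, not Oehrle's formal system: the three-individual domain, the `LOVES` relation, and the names `Quantifier`, `someone`, and `everyone` are all hypothetical, and λ-binding by Vertical Slash Introduction is modeled simply as Python lambda abstraction over the phonological and semantic variables in parallel.

```python
from typing import Callable, NamedTuple

# Hypothetical toy model: each person loves exactly one other person.
PEOPLE = {"a", "b", "c"}
LOVES = {("a", "b"), ("b", "c"), ("c", "a")}


def loves(x: str, y: str) -> bool:
    return (x, y) in LOVES


class Quantifier(NamedTuple):
    # Phonology of type (str -> str) -> str: plug own string into the gap.
    phon: Callable[[Callable[[str], str]], str]
    # Ordinary generalized quantifier meaning.
    sem: Callable[[Callable[[str], bool]], bool]


someone = Quantifier(
    phon=lambda sigma: sigma("someone"),
    sem=lambda pred: any(pred(x) for x in PEOPLE),
)
everyone = Quantifier(
    phon=lambda sigma: sigma("everyone"),
    sem=lambda pred: all(pred(x) for x in PEOPLE),
)

# Inverse scope: the subject quantifier is scoped first (inner step),
# then the object quantifier (outer step).
inverse_phon = everyone.phon(
    lambda phi_o: someone.phon(lambda phi_s: f"{phi_s} loves {phi_o}")
)
inverse_sem = everyone.sem(lambda y: someone.sem(lambda x: loves(x, y)))

# Surface scope: the order of the two scoping steps is reversed.
surface_phon = someone.phon(
    lambda phi_s: everyone.phon(lambda phi_o: f"{phi_s} loves {phi_o}")
)
surface_sem = someone.sem(lambda x: everyone.sem(lambda y: loves(x, y)))

print(inverse_phon, inverse_sem)
print(surface_phon, surface_sem)
```

Both derivations yield the same string, since each quantifier inserts its phonology into the position of the variable it binds, while the truth values differ in this model because the order of the scoping steps determines semantic scope.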

The second extension needed is the treatment of implicit event variables. For this purpose, I posit the distinction between S and \(\mathrm {S^{e}}\) for the sentential category. \(\mathrm {S^{e}}\) is the category for sentences in which the implicit event variable is not saturated (thus, it is semantically of type vt, with v the type of event variables) and S (of semantic type t, as before) is for sentences with the event argument saturated.Footnote 20

With the above extensions in place, we can now account for the internal readings of symmetrical predicates in complex predicates in (87) and (88) (= (32) and (33) from Sect. 2.3):

  1. (87)
    1. a.
      figure df
    2. b.
      figure dg
  1. (88)
    1. a.
      figure dh
    2. b.
      figure di

The lexical entries for the relevant items can be stated as in (89):Footnote 21 , Footnote 22

  1. (89)

    λσ.σ(onazi hon-o); λPλe[∃y.book(y)∧[∀e′≤e.P(y)(e′)]]; \(\mathrm{S^{e}}|(\mathrm{S^{e}}|\mathrm{NP}_{a})\)

    yomi; λxλyλe[read(x)(y)(e)];

    tuzuke-ta; λPλxλe[P(x)(e)∧¬atom(e)∧cont(e)];

    makut-ta; λPλxλe[P(x)(e)∧¬atom(e)∧|{e′:e′<e}|≥C];

As can be seen here, both the embedded verb and the matrix verb have their own event argument slots. The embedded verb yomi ‘read’ returns an event predicate (of type vt) after the subject and object NP arguments are saturated. The higher verb of the compound verb V-tuzuke (‘continue V-ing’) takes such event predicates as arguments and imposes on them the restriction that they are predicated of events that are non-atomic and temporally continuous. V-makuru (‘V repeatedly and excessively many times’) imposes the restriction that the number of subevents of the relevant event is no smaller than a threshold provided by the free variable C, whose exact value is determined contextually. The key assumption in this lexicon is the lexical specification for the NP onazi hon-o ‘the same book’ containing the symmetrical predicate onazi. As explained below, this treatment essentially builds on the analysis of symmetrical predicates by Barker (2007) in terms of the notion of parasitic scope. As reflected in the syntactic category \(\mathrm{S^{e}}|(\mathrm{S^{e}}|\mathrm{NP}_{a})\), the NP containing onazi binds an individual variable for an accusative NP argument and passes up an event argument slot. As will become clear in the derivation below, by having this NP take scope between V1 and V2, a proper link can be established between the event arguments of the two verbs to induce the internal readings of sentences like (87a). Specifically, the NP onazi hon-o takes the denotation of the embedded VP (which is an event predicate with the object position abstracted over) and creates out of it a predicate that holds true of an event sum just in case each atomic subpart of that event sum satisfies the event description provided by the embedded verb yomi ‘read’, with the object of each of these smaller reading events being held constant as a unique book.

The following derivation for (87a) illustrates how the right meaning is assigned to the whole sentence compositionally with the lexical entries in (89). (Here, I follow a standard assumption in event semantics that existential closure of the event variable is implemented by means of a syntactic rule, shown in the step marked as ECL in the following derivation.)

  1. (90)
    figure dj

In (90), via hypothetical reasoning with the vertical slash, the embedded clause is assigned the category \(\mathrm {S^{e}} | \mathrm {NP}_{a} \). The NP onazi hon-o takes this as an argument and creates out of it a predicate that holds true of event sums just in case each subevent of that event sum is a reading event involving a unique book. Phonologically, this NP inserts its string in the embedded accusative object position (which is later raised to the matrix clause in the morpho-phonological representation via the usual clustering of V1 and V2 obligatory in the compound verb construction). After this, the hypothetically assumed embedded subject is withdrawn by Backward Slash Introduction so that the embedded clause satisfies the subcategorization requirement of the matrix verb (which is looking for a VP () to return a VP) and the matrix verb and the subject NP are combined via function application, which is followed by the existential closure of the event variable and the PI step taking care of the morpho-phonological verb clustering to complete the derivation. The translation for the whole sentence is unpacked in (91):

  1. (91)

    ∃e.keep(λy.same(book)(λx.read(x)(y)))(j)(e)

    =∃e.keep(λyλe[∃z.book(z)∧[∀e′≤e.read(z)(y)(e′)]])(j)(e)

    =∃e.[∃z.book(z)∧[∀e′≤e.read(z)(j)(e′)]]∧¬atom(e)∧cont(e)

This says that there is a temporally continuous non-atomic event such that all of its subparts are reading events involving John as the reader and a unique book as the thing being read. This correctly captures the internal reading for (87a).
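
The truth conditions in the last line of (91) can be checked against a small model. The following Python sketch is only an illustration under assumptions that go beyond the text: events are modeled as non-empty sets of atomic events, the part-of relation ≤ as the subset relation, read as holding of a sum just in case it holds of each atom, and cont as adjacency of integer time stamps. The model `ATOMS` and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical model: atomic event -> (reader, book, time stamp).
ATOMS = {
    "e1": ("john", "b1", 1),
    "e2": ("john", "b1", 2),
    "e3": ("john", "b2", 3),
}
BOOKS = {"b1", "b2"}


def parts(e):
    """All non-empty sub-sums e' <= e of an event sum e."""
    items = sorted(e)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def read(z, y, e):
    """read(z)(y)(e): every atom of e is a reading of book z by y."""
    return all(ATOMS[a][0] == y and ATOMS[a][1] == z for a in e)


def atom(e):
    return len(e) == 1


def cont(e):
    """Assumed notion of continuity: time stamps with no gaps."""
    times = sorted(ATOMS[a][2] for a in e)
    return all(later - earlier == 1 for earlier, later in zip(times, times[1:]))


def same_book_continue(y, e):
    """Body of the last line of (91), with the event e supplied explicitly."""
    one_book = any(all(read(z, y, sub) for sub in parts(e)) for z in BOOKS)
    return one_book and not atom(e) and cont(e)


def internal_reading(y):
    """Existential closure over all event sums in the model."""
    return any(same_book_continue(y, e) for e in parts(frozenset(ATOMS)))


print(same_book_continue("john", frozenset({"e1", "e2"})))
print(same_book_continue("john", frozenset({"e1", "e2", "e3"})))
print(internal_reading("john"))
```

In this model the sum of `e1` and `e2` verifies the formula, whereas the sum of all three atoms does not, because no single book is read in every subpart.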

Though there are some differences in the details of implementation due to the fact that we are here dealing with implicit event variables rather than overt plural expressions as the entity that onazi distributes over, the analysis of the internal reading of symmetrical predicates here is essentially parallel to that of Barker (2007) by means of parasitic scope. Just as in Barker’s analysis, the interpretation of symmetrical predicates is dependent on two linguistic expressions: the nominal expression (hon ‘book’) that provides the descriptive content for the unique entity involved and a plural entity that is composed of parts such that each of these parts holds a certain relationship to that unique entity. The only difference between the cases that are treated in Barker (2007) and the cases involving complex predicates here is that, in the latter, the plural entity is a sum of events (rather than a sum of individuals) and that this event sum is represented by an implicit variable in the syntax. But this difference does not pose any obstacle in the present approach that is equipped with a flexible syntax-semantics interface. Note especially that the way in which the combinatoric component and the morpho-phonological component are clearly separated yet interact with one another is what enables this straightforward treatment of the interaction between symmetrical predicates and complex predicates. The internal readings of sentences like (32) and (33) essentially arise from the availability of the intermediate scoping position for the symmetrical predicate in the combinatoric component (where it can establish the proper relationship between the event arguments of the two verbs), but at the morpho-phonological level, the NP containing the symmetrical predicate behaves just like an ordinary raised embedded object, due to its own phonological specification and the ‘verb-raising’ property of complex predicates.

4 Conclusion

The argument-sharing approach highlights the properties of complex predicates viewed, as it were, ‘from outside’: it assigns to these predicates just the right argument structures if they were to be treated as single unitary predicates as a whole. The verb-raising approach, by contrast, highlights their properties viewed ‘from inside’, dissecting the hidden biclausal structures that complex predicates possess under the monoclausal guise. In this paper, through a thorough examination of empirical data involving two types of complex predicates in Japanese, I argued that both of these approaches have something to offer for the analysis of complex predicates, but that neither alone can provide a complete analysis. Then I formulated an analysis in a variant of categorial grammar which integrates the insights of the two approaches from a logical perspective. Specifically, the formal calculus that I have developed allows for a systematic interaction between the combinatoric component and the surface morpho-phonological component. An interesting property of such a system is that the effect of ‘argument sharing’ falls out as a theorem from the lexically specified ‘verb-raising’ (or verb-clustering) property of complex predicates. This synthesis of the two previous approaches is both conceptually appealing and empirically satisfying. Conceptually, it enables us to see a hitherto unnoticed deeper connection between the argument-sharing approach and verb-raising approach. Empirically, it results in an analysis that has a wider coverage of data than either approach. In this concluding section, I would like to briefly discuss the empirical and theoretical implications of the analysis that I have proposed above and of the framework of categorial grammar in which it was couched.

4.1 Complex predicates in other languages

The approach that I have argued for in this paper seems to enable interesting alternative analyses of various kinds of challenges that complex predicates across different languages pose for other theories. For verb-final languages like Korean and Germanic languages, where similar kinds of verb-clustering effects are observed, the analysis of the present paper can be extended relatively straightforwardly. Korean is especially interesting in this connection in that, along with the type that exhibits a tight morpho-phonological clustering of the higher and lower predicates, it has a type of complex predicate where the clustering of V1 and V2 seems still looser than with the -te form complex predicate in Japanese. (See Chung 1995, 1998 for relevant data and discussion. See also the discussion of restructuring predicates in Romance below.) Such a cross-linguistic difference can be naturally captured by assuming that different languages are equipped with different sets of modes of composition in the morpho-phonological domain. Note that since Dowty (1996b), it is generally assumed in approaches that separate the so-called ‘tectogrammar’ and ‘phenogrammar’ that the latter is the domain of cross-linguistic variation.

Dutch verbal complexes raise a somewhat different issue. In the literature, two types of approaches have been proposed for the analysis of cross-serial dependencies in Dutch. On one approach (see Moortgat and Oehrle 1994; Dowty 1996a; Muskens 2007), the surface word order exhibiting the cross-serial dependency is captured by somewhat elaborate interactions between several modes of composition posited in the morpho-phonological domain, which regulate the clustering of verbs and reordering of the component verbs within the verb cluster. The other approach provides a more direct solution for the mismatch between the surface word order and the predicate-argument structure by means of the notion of wrapping (Pollard 1984; Ojeda 1988; Morrill 2000). In the latter type of analysis, an infinitival VP keeps an ‘infixation’ point immediately to the left of the verb cluster, and the phonological form of the governing verb is inserted at this infixation point when the verb takes the infinitival VP as its complement. Since the present CG framework has both a multi-modal morpho-phonological component and the mechanism of λ-binding for it (for modeling discontinuity of the wrapping type), both types of analyses are implementable. The latter approach seems to be conceptually simpler, while the former approach seems to be empirically more adequate in modeling the argument-sharing effects straightforwardly. Comparing the two types of analyses in a framework like the present one, which allows for a systematic interaction between the morpho-phonological component and the combinatoric component, would be interesting, as it might uncover some hitherto unnoticed connection between the two types of analysis.

Finally, the so-called ‘restructuring’ verbs (Rizzi 1982) in Romance languages such as French and Italian require some comment, since they turn out to provide further motivation for the kind of approach to complex predicates that I have proposed in this paper and the architecture of grammar in which it is implemented.Footnote 23 Romance restructuring verbs exhibit a seemingly puzzling case where apparently conflicting evidence is presented for argument sharing and verb (non-)clustering. This is illustrated by the following Italian example (Roberts 1997):

  1. (92)
    figure dk

In this example, the clitic climbing (of li, which is semantically an argument of the lower verb but which attaches to the higher verb in the surface string) indicates that some sort of argument sharing obtains between the two verbs. However, the positioning of the adverb tutti ‘all’ between the two verbs suggests that the two verbs do not form any morpho-phonological cluster. How can one handle a case like this? An approach that predicts a one-to-one correlation between morpho-phonological verb clustering and argument sharing fails to predict such an empirical pattern. It is nonetheless intriguing that the two effects coincide in many instances of complex predicates, and it would be more desirable if we were able to treat the case of Romance restructuring as a related case. As I discuss below, it turns out that the overall pattern of Romance restructuring predicates finds a natural place in the spectrum of possible types of complex predicates that the present approach predicts.

Since the relevant empirical patterns are essentially the same, in what follows I focus on the case of French tense auxiliaries, for which a detailed and lucid discussion of a complex set of facts is available in Abeillé and Godard (1994, 2002). Abeillé and Godard provide a range of evidence for a flat-structure analysis of tense auxiliaries in French, where the auxiliary, the governed verb and all of its dependents are realized as sisters in a totally flat VP structure. The verb cluster analysis is rejected on the basis of adverb placement data analogous to the Italian example in (92), whereas a VP embedding analysis is rejected on the basis of the fact that the construction systematically fails canonical constituency tests for such embedded VPs, including deletion (93), fronting (94) and cleft (95). For such tests, tense auxiliaries in French systematically contrast with control verbs for which the corresponding sentences are all grammatical:

  1. (93)
    figure dl
  2. (94)
    figure dm
  3. (95)
    figure dn

Abeillé and Godard take these data to support their flat VP structure analysis. This argument is very convincing within the assumptions of phrase structure-based approaches, but the data in (93)–(95), together with the fact that adverbs can intervene between V1 and V2, suggest a somewhat different analysis within a setup like the present one that is equipped with a fine-grained notion of morpho-phonological constituency. Note that in all of the ungrammatical examples in (93)–(95), V1 is displaced from V2 by some syntactic operation. It then seems plausible to assume that the ungrammaticality of these examples reflects some sort of adjacency requirement between V1 and V2, which is somewhat looser than morpho-phonological clustering of the sort found in the Japanese-type complex predicates (in accommodating adverb placement between V1 and V2), reflecting the property of some special morpho-phonological mode of composition. On this view, it is natural to assume that this adjacency requirement also triggers optional restructuring of constituency at the level of abstract phonology that is responsible for the argument-sharing effects found in phenomena such as clitic climbing. But this restructuring differs from verb clustering in typical complex predicates in that it does not make the restructured V1–V2 unit a morpho-phonological cluster.

Thus, the case of Italian and French highlights a significant difference between the present proposal and alternative analyses in phrase structure-based approaches. Unlike in phrase structure-based approaches, the degree of tightness of the morpho-phonological bond between V1 and V2 is not a binary distinction between lexical vs. syntactic composition, but is expected to form a continuum in the present approach. Since flexibility in surface word order is exactly the place where such cross-linguistic variation is expected, the existence of the pattern exhibited by French and Italian restructuring predicates is in fact something that is naturally expected on the present approach.Footnote 24 In view of the existing detailed accounts in this empirical domain in the HPSG literature, such as Abeillé and Godard (1994, 2002) for French, Monachesi (1999) for Italian, and Monachesi (2005) for a wider range of Romance languages, a full comparison needs to await working out the details of the analysis sketched above. However, given the difficulty that such constructions pose for phrase structure-based and transformational approaches (which is well documented in the works cited above), the analytic possibility that the present approach opens up by abandoning the notion of phrase-structural constituency seems very attractive.Footnote 25

4.2 Comparison with other approaches

Since the present approach inherits the features of both ‘argument-sharing’ and ‘verb-raising’ approaches, one might wonder in what respects it is similar to and different from them. Below I will comment on these points and also on the relationship between the present approach and related previous approaches in CG, focusing especially on how the former improves upon the latter.

The most important difference between the present approach and the argument-sharing approach of the kind represented by the argument composition analysis in HPSG is that the effect of argument sharing follows from a more general property of complex predicates in the former, while it is simply stipulated in the latter. This is not merely a matter of theoretical elegance since the argument composition approach suffers quite seriously from the fact that it cannot naturally account for the semantic biclausality effects pervasively observed in complex predicates with respect to phenomena such as adverb scope, quantifier scope and binding as discussed above. While various elaborate extensions have been proposed in the literature (Manning et al. 1999 is a representative proposal of this kind; see Kubota 2007 for a detailed discussion of the problematic empirical consequences of this type of solution), they all accommodate such biclausality effects by means of a host of additional mechanisms that are unrelated to each other. This sharply contrasts with the proposed analysis in which such facts immediately fall out from the mechanism that mediates the mismatch between ‘underlying’ biclausality and ‘surface’ monoclausality.

Such biclausality effects are more naturally captured in the verb-raising-type analysis in linearization-based HPSG. However, the main difficulty with the linearization-based approach is that the liberation of embedded arguments to the higher clausal domain is done via the ‘compaction’ operation, which forms a tight morpho-phonological cluster involving V1 and V2. Such an approach is inherently unsuited to accounting for the intermediate degree of bond in morpho-phonological clustering in the Japanese -te form complex predicate. Further difficulty is likely to arise when one attempts to extend this type of approach to a wider range of phenomena including the cases of Korean complex predicates and Romance restructuring verbs discussed above. Moreover, verb-raising approaches in general do not have anything to say about the argument alternation effects found with passive and desiderative predicates.

The framework of Dowty (1996b), though very programmatic and informal, is an important precursor of the present approach, and can be thought of as a refinement of the HPSG linearization-based approach. The basic word order facts of the Japanese -te form complex predicate may be formulated in this setup. However, Dowty’s framework crucially lacks the interaction between the combinatoric component and the morpho-phonological component. For this reason, it cannot deal with cases where complex predicates interact with phenomena such as coordination and cleft that affect the combinatoric properties of the expressions involved.

The present approach builds most directly on and resembles most closely earlier variants of Multi-Modal Type-Logical Grammar (MMTLG) (Moortgat and Oehrle 1994; Moortgat 1997; Oehrle 2011). It improves on these earlier variants mainly by attaining greater conceptual clarity. In previous variants of MMTLG, the notion of multi-modality is modeled by means of the resource management system which governs the behaviors of different types of logic that are simultaneously posited within one calculus, and the linguistic motivations for such a highly abstract and technical concept have never been clarified.Footnote 26 The present approach departs from these previous proposals and models the morpho-phonological component as a separate system with its own deductive inference rules. This results in greater conceptual clarity in that the role of the morpho-phonological component within the grammar is clearer and is more in line with Dowty’s original conception. The details of this morpho-phonological component (especially the mapping from syntax to actually pronounceable strings of words) are also worked out more explicitly than in previous variants of TLG, attaining greater formal explicitness as well.Footnote 27

The present approach maximally exploits the gained ontological clarity for achieving greater empirical adequacy. Note in particular that the systematic predictions available for ruling out overgeneration (which has been taken by some to be a weakness of the ‘much too general’ setup of TLG) in cases in which complex predicates interact with other linguistic phenomena (especially those affecting the combinatoric properties) are due to the novel architecture of the present framework in which the combinatoric and morpho-phonological components are distinct yet interact with one another. Given that this theoretical innovation is precisely what enables a straightforward analysis of the complex set of data considered in this paper, I conclude that the logic-based perspective on natural language syntax embodied in the present work offers a new insight into the widely acknowledged yet previously unresolved challenge that complex predicates pose for grammatical theories.