1 Introduction

It is, of course, not news that the way language organizes the world may tell us something about how the mind does so. Nor is it news that perhaps the best window into how language organizes the world is how language works: what words mean and how grammars manipulate those meanings. This is the project that Emmon Bach memorably dubbed ‘natural language metaphysics’ (Bach 1986, 1989) or ‘natural language ontology’. Importantly, it’s a project that’s worthwhile even if (perhaps especially if) it should fail to coincide with metaphysics proper, because our theory of natural language metaphysics is a repository of linguistic analysis. If we’re doing it right, its structure explains the structure of language. The structure of the world is another matter entirely, best left to others.

There is an important trade-off in this domain, however, that I’d like to use to frame this paper. Structures in natural language ontology can serve to explain linguistic phenomena, and when they do, they may lighten the explanatory burden on other components of linguistic theory, including the syntax and semantics. Conversely, introducing complexity in the syntax or semantics can make possible a simpler ontology.

It may help to sketch an example of what I have in mind. It’s entirely independent of the one I’ll focus on primarily in this paper. It concerns polar antonyms of adjectives, such as tall and short. No one would defend the view that they are unrelated, of course, so the only question is where to install a theory of their difference. One possibility is ontological. Height is measured in abstract representations of measurement, degrees, which include things like ‘6 feet’. The set of degrees that measure height (as opposed to e.g. weight) tells us the dimension along which a given measurement exists, but it doesn’t actually tell us whether we’re measuring how tall someone is or how short. It tells us that the dimension is spatial extent, but not whether the scale is tallness or shortness. To know that, we must know at least one more thing: the ordering imposed on those degrees. ‘6 feet’ is a greater degree of tallness than ‘5 feet’, but a lesser degree of shortness. On such a theory, advocated in Kennedy (1997, 2001) and elsewhere, the key to the relation between tall and short is that they measure on scales that impose opposite orderings on degrees along the same dimension. Their denotations therefore need not reflect any direct connection between the adjectives beyond specifying which scale they use, because the connection is between the scales, not between adjectives themselves.

The alternative is to suppose that the relation between tall and short is a matter of grammar, not (primarily) ontology, and that they use precisely the same scale after all. One might suppose, with Heim (2006, 2008) and Büring (2007), that short involves a special kind of negation, present in the syntactic tree but not normally pronounced as a separate morpheme. Short, on this view, is actually a way of pronouncing ‘little tall’ or ‘untall’. There are a variety of arguments to be made for this more complicated syntax, and with it in place, the ontology needn’t provide an independent analysis of the connection between the two antonyms because the richer syntax already does.
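The contrast between the two routes can be made concrete with a toy model. The following Python sketch is only an illustration under simplifying assumptions: the names `Scale`, `ranks_higher`, and `little` are hypothetical, the height values are invented, and treating the covert negation as simple reversal of a comparative relation is a deliberate simplification, not the semantics actually proposed by Heim or Büring.

```python
import operator
from dataclasses import dataclass
from typing import Callable

# Hypothetical toy model: heights in feet (illustrative values only).
HEIGHT = {"Floyd": 6.0, "Clyde": 5.0}


@dataclass(frozen=True)
class Scale:
    dimension: str                           # e.g. "height"
    exceeds: Callable[[float, float], bool]  # ordering imposed on degrees


# Ontological route: two scales on one dimension with opposite orderings.
TALLNESS = Scale("height", operator.gt)
SHORTNESS = Scale("height", operator.lt)


def ranks_higher(scale: Scale, x: str, y: str) -> bool:
    """True if x's degree outranks y's degree on the given scale."""
    return scale.exceeds(HEIGHT[x], HEIGHT[y])


# Grammatical route: a single scale, plus a covert operator inside 'short'.
def taller(x: str, y: str) -> bool:
    return ranks_higher(TALLNESS, x, y)


def little(relation: Callable[[str, str], bool]) -> Callable[[str, str], bool]:
    """Crude stand-in for the covert negation: reverses the relation."""
    return lambda x, y: relation(y, x)


shorter = little(taller)  # 'short' treated as a way of pronouncing 'little tall'

# Both routes agree on whether Clyde is shorter than Floyd.
print(ranks_higher(SHORTNESS, "Clyde", "Floyd"), shorter("Clyde", "Floyd"))
```

In the first route the antonymy lives in the data (`TALLNESS` versus `SHORTNESS`); in the second it lives in the composition (`little(taller)`), and only one scale is needed.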

It’s not the case, of course, that any analysis of any arbitrary phenomenon can be said to be primarily grammatical or ontological. In the context of this volume, it’s especially worth noting that an approach that involves decomposition into features might occupy an intermediate position with respect to this distinction: the decomposition is in some respects like decomposing short into ‘little tall’, but of course the decomposition needn’t be implemented directly in the syntax in this way, and there are interesting discussions to be had about the relationship between decomposing word meanings and decomposing the underlying concepts themselves. The former seems still a robustly grammatical enterprise; the latter considerably more an ontological one.

All that said, at some point there’s a danger of putting more weight on this distinction than it can bear. Its purpose here is chiefly just to situate another empirical puzzle for which a balance has to be struck between grammatical and ontological explanation: adjectives like average. The first thing to notice is the curious effect they often have on the referent of the nominal in which they occur:

figure a

This sentence, Carlson and Pelletier (2002) point out, is doubly mysterious. What sort of entity is ‘the average American’? Certainly, on its most natural reading, it doesn’t refer to some particular American who is especially typical of Americans. Second, what sort of entity is ‘2.3 children’? If the average American referred to a particular American—say, one named Steve—it would suggest, alarmingly, that Steve has only a fraction of one of his children. That’s not what the sentence means, at least ordinarily. Nor, indeed, is it possible to straightforwardly disentangle the strangeness of the first nominal from the strangeness of the second. Even if we avoid the reading under which (1) involves direct reference to Steve, it still fails to communicate that it is typical for Americans to have fractional children.

On its face, it would seem that to avoid such morally outlandish outcomes, we must embrace a metaphysically outlandish one. We must accept that there are such things as ‘average Americans’ in the model underlying the semantics, and indeed perhaps in some extended sense such things as ‘2.3 children’. I don’t think we should dismiss this possibility too readily. For one thing, as Bach would remind us, our judgment in these matters must be guided by language, not a priori notions about what sorts of objects populate the actual world. That’s the difference between natural language metaphysics and metaphysics proper. Indeed, this metaphysical direction is precisely the one in which Carlson and Pelletier head. For this reason, Hornstein (1984) was ultimately mistaken in saying that ‘no one wishes to claim that there are objects that are average men in any meaningful sense’. Yet, he argued, nominals like the average American act no different from more referentially pedestrian ones. He concluded that this was an argument against the enterprise of formal semantics itself.

My aim here will be more modest. Kennedy and Stanley (2009) observed that sentences such as (1) can be analyzed as a special case of a more general phenomenon: readings of adjectives in which the adjective is interpreted as though it were an adverbial. This requires a more complex syntax, but that more complex syntax is a low price to pay for the metaphysical benefit. It frees us from having to posit any spookily abstract and therefore implausible entities in the ontology. I’ll argue, building on Morzycki (2016b), that these adverbial readings are in fact part of a considerably more general pattern of readings available to a far wider range of adjectives than generally recognized. I’ll argue that these readings actually fall into three classes, and that this leads us to an analysis distinct in important respects from Kennedy and Stanley but that, as they argued, places the explanatory burden on the syntax and compositional semantics rather than the ontology.

In Sects. 2 and 3, following largely the argument in Morzycki (2016b), I’ll present the case that what I’ll call nonlocal readings of adjectives (following Schwarz 2006 et seq.) are far more general than is typically recognized, and that they fall into three distinct classes. In Sect. 4, I’ll review some ways these problems have been approached in the past, highlighting the interplay between grammatical and ontological explanation. In Sect. 5, I’ll propose a strategy for approaching these facts that I hope may eventually scale up to the larger empirical picture and that has components of both kinds of explanation. In particular, I’ll combine elements of syntactic assumptions that have widely been made with a new ingredient in the compositional semantics: the idea that adjectives with external readings have determiner-like meanings, and as a consequence have the complex grammar associated with determiners. I’ll sketch this idea in general terms for average in particular, relating it to Gehrke and McNally (2010, 2015)’s crucial insight that adjectives like occasional involve reference to kinds. Finally, in the concluding section, I’ll very briefly return to the larger issues with which we began: the analytical balance between structure in the syntax and semantics and structure in the ontology.

2 Nonlocal Readings of Adjectives

2.1 On ‘Occasional’

Let’s begin with the classic example of a nonlocal reading of an adjective, which is occasional (Bolinger 1967; Stump 1981; Larson 1999; Zimmermann 2003; Schäfer 2007; Gehrke and McNally 2010, 2015; DeVries 2010). It’s the best-studied such case, and this will serve as a useful background against which to consider average. The standard sentence is (2):

figure b

It has what’s called an internal and an external reading. The internal reading is interesting in a number of respects, but from our current perspective, it’s the external reading that is most immediately relevant. On this reading, the adjective makes a semantic contribution that is, to all appearances, completely divorced from the nominal in which it finds itself. The sailors that strolled by are sailors simpliciter. There is no question about the frequency of their sailing. But the situation is more puzzling still. On the external reading, the sentence means more or less the same thing as (3), where the definite determiner replaces the indefinite:

figure c

Yet the meaning is essentially the same (but see Gehrke and McNally 2015 for detailed discussion). Indeed, some adjectives of this class (odd and rare) have the external reading only with the. [Footnote 1] Setting aside a subtle change of flavor, the external reading also occurs with your and in the bare plural:

figure d

So there are three mysteries so far: an ambiguity, unexpectedly wide scope, and unexpected interpretations of the determiner.

There are more still. Another is that, on the external reading, the adjective must occupy the leftmost position in the structure of the nominal:

figure e

Indeed, the range of determiners with which occasional is possible on the external reading is relatively limited:

figure f

Yet another idiosyncrasy of the external reading is that it renders the adjective unable to coordinate with ordinary adjectives:

figure g

Another still: on this reading, the adjective becomes incompatible with degree words such as very or the comparative [Footnote 2]:

figure h
figure i

2.2 Returning to ‘Average’

Having noted the crucial features of occasional, let’s return to average with them in mind. First, there was ambiguity. As Carlson and Pelletier (2002) and Kennedy and Stanley (2009), among others, noted, there is an ambiguity with average too:

figure j

For the internal reading to be available without counterpragmatically ghastly background assumptions, we must change our earlier sentence to 2 children. On this reading, the claim is that there is an American somewhere that is typical and that he has two children. There is another reading of average that also occurs in (11) (Sebastian Löbner, p.c.), which is also internal, or in any case fails to be external:

figure k

The external reading is the one with which we are now familiar from occasional. It’s worth noting that it paraphrases naturally with an adverbial, on average, which is analogous to how occasional morphed into occasionally.

Here we encounter a set of properties that elegantly mirror those of occasional. There are unexpected interpretations of the determiner. Switching to the definite determiner leaves us, on the external reading, with apparently the same interpretation, and your is not much different:

figure l

Again, on the external reading, other determiners don’t seem to work:

figure m

And again, on the external reading average has to be leftmost among the adjectives in its nominal:

figure n

It is also unable to coordinate with other adjectives on the external reading:

figure o

It is incompatible with degree modifiers on this reading:

figure p

So, once again, the same mysterious patterns manifested themselves as with occasional. At a minimum, this supports the connection between the two that Kennedy and Stanley (2009) posited—perhaps indeed more robustly than they intended. But the pattern is more widespread still.

2.3 Wrong

Before considering the bigger picture, it will be necessary to lay out a few more examples of the general phenomenon. A version of the now-familiar pattern emerges once again with wrong (Haïk 1985; Schmitt 2000; Schwarz 2006, 2019). It too has an internal/external ambiguity, though perceiving it is slightly trickier. Suppose Floyd is a spy who is required to provide his interlocutor with false information and deprive her of true information. If he succeeds in this, (17) is true on the internal reading, on which the information provided was incorrect:

figure q

On the external reading, (17) is false, because Floyd answered as he is supposed to. On the other hand, if Floyd slips up at some point and accidentally answers a question truthfully, the situation is flipped: (17) is still true, but only on the external reading: he provided information that he isn’t supposed to provide, namely, true information. Something similar happens in (18):

figure r

Again, the internal reading in (18) is more easily discerned with some context. Consider a dystopian game show in which participants are executed for answering a quiz question incorrectly. Floyd is the executioner. If he killed the contestant that answered incorrectly, (18a) is true only on the internal reading. (‘Clyde was wrong, so I killed him,’ he might explain.) If Floyd accidentally killed a contestant that provided the correct answer, (18b) would be true only on the external reading.

There is again an odd fact about the interpretation of the determiner: the is interpreted as an indefinite. In (17), there need not have been only one wrong answer, and in (18), there need not have been only one person who must not be killed. The picture is slightly different, though. Your is impossible here except on its usual possessive reading, which is beside the point:

figure s

Strangely, it’s not just that the definite determiner is interpreted as an indefinite; it’s also the principal way to say this. The indefinite would be unusual on the external reading:

figure t

It’s not actually fully clear what reading these receive. For me, an external reading is possible, but only when there is a desire to communicate that there are multiple answers that shouldn’t be given and people that shouldn’t be killed.

Apart from that quirk, again we encounter restrictions on the choice of determiner on the external reading:

figure u

As before, inherently quantificational determiners fail.

The requirement that the nonlocal adjective be structurally higher than other adjectives again emerges:

figure v

So does the ban on coordination:

figure w

And so does the ban on degree modification:

figure x

So a rather large class of adjectives that includes wrong, average, typical, occasional and a number of its synonyms seems to manifest quite a number of common properties.

2.4 ‘Whole’ and ‘Entire’

The parallels continue with whole and entire, though there will be an important twist. As before, there is an ambiguity (Moltmann 1997, 2005; Morzycki 2002), which I’ll assume is a special case of the internal/external ambiguity:

figure y
figure z

The internal reading is actually the unusual one in these cases, and may take a moment to perceive. It’s what could be expressed more or less unambiguously with complete—indeed, I suspect that it’s precisely the existence of this unambiguous alternative that accounts (on broadly Gricean grounds) for the unnaturalness of the internal reading.

As before, there are restrictions on the determiner, but they take a different form. First, a, the, and your retain their usual meanings, and don’t become interchangeable. Second, strong quantifiers are still incompatible with the external reading, but weak ones are perfectly compatible with it (I will now indulge in the habit of marking sentences with a symbol when they are impossible on the external reading) [Footnote 3]:

figure aa

The other, now increasingly familiar restrictions reemerge in their customary form. The external reading is only possible when the nonlocal adjective occurs high:

figure ab

It’s incompatible with coordination:

figure ac

And it’s incompatible with degree modification:

figure ad

2.5 Epistemic Adjectives

Abusch and Rooth (1997) observed a proposition-modifying interpretation of what they called ‘epistemic adjectives’ that now won’t come as a shock. These adjectives include unknown, undisclosed, unspecified, and unexpected. They can receive a wide-scope reading:

figure ae

The external reading systematically supports concealed-question paraphrases. For many years in the early 2000s, (32) was a kind of running joke in American political discourse, and it’s actually very hard to make sense of its internal reading:

figure af

The external reading is that Dick Cheney is hiding at a location and it has not been disclosed, for his safety, what location that is. On its internal reading, perhaps it would have to be the very fact that it is a location that is not disclosed.

At this stage, we will encounter the same empirical refrain, and the reader can presumably sing along. On the external reading, there are again restrictions on the determiner. Although the and a seem to behave normally, strong inherently quantificational determiners remain impossible:

figure ag

As with whole, weak determiners are compatible with external readings.

The restrictions on the structural position of the adjective in the DP remain the same. The external reading is, as we have come to expect, possible only when the adjective is high [Footnote 4]:

figure ah

The external reading is unavailable when the adjective occurs in a coordinate structure:

figure ai

It’s incompatible with degree modification:

figure aj

2.6 Same and Different

Other adjectives fall under broadly the same rubric. Among the best-studied of these are same and different (Nunberg 1984; Heim 1985; Carlson 1987; Keenan 1992; Moltmann 1992; Beck 2000; Lasersohn 2000; Majewski 2002; Alrenga 2006, 2007a, b; Barker 2007; Brasoveanu 2011). The facts in this domain are complicated in ways that muddy the waters considerably, and the terminology is different and confusing, but for our purposes the important point is that there is an ambiguity.

The main terminological confound is that the internal reading involves an anaphoric dependency on preceding discourse. This is in an important sense ‘external’, but it is not external in the relevant sense of seeming to require the adjective to have access to the semantic content of the clause outside the nominal itself. This is clearer when considering the readings:

figure ak
figure al

I won’t rehearse the full song-and-dance yet again, in part because it presents, in this instance, complications that go considerably beyond the scope of this paper. Suffice it to say that on the external reading, same and different impose restrictions on the determiner with which they combine:

figure am

On this reading same and different are subject to the now familiar structural position requirement:

figure an

2.7 Modal Superlatives: ‘Possible’ and Its Kin

There is another important class of nonlocal readings of adjectives, which I will mostly set aside. These involve possible, conceivable, and the like (‘modal superlatives’; Bolinger 1967; Larson 2000; Schwarz 2005; Cinque 2010; Romero 2013; Leffel 2014):

figure ao

There are important distinctions between these cases and the ones we’ve examined so far, but for the moment I will note only the similarity: again, there is an ambiguity between an internal and external reading.

2.8 Miscellaneous Obscurities and Novelties

Without further discussion, I’ll note a few examples of nonlocal readings that are either obscure or, to my knowledge, novel:

figure ap

One shouldn’t read too much into these without careful examination, of course, but they collectively suggest that more external readings lurk just over our analytical horizon.

3 Three Classes of Nonlocal Readings

This paper is not a linguistic curio cabinet. We’ve established, I hope, that there are patterns in this domain. That’s not to say that there aren’t genuine mysteries here. It’s just that the phenomena at issue are mysterious in parallel ways. The next stage is to systematize the patterns more robustly so we can move toward an analysis.

There are, I will argue, three distinct classes of nonlocal adjectives. The first class I will set aside here. It includes the aforementioned ‘modal superlatives’ like possible. They differ from the others most strikingly in which determiners are involved in the external readings. In these cases, universal quantifiers license the external reading rather than inhibit it:

figure aq

Superlatives and only also license it:

figure ar

Analyses for these cases can be framed around ellipsis, along the lines first proposed in Larson (2000), with structures like (49):

figure as

There is a satisfying account built from standard assumptions about superlatives in Romero (2013).

It will be the other two classes that will be of interest here. These are what I’ll call the weak-quantifier class, which includes whole and unknown and which permits external readings with weak quantifiers, and what I’ll call the no-quantifier class, which includes occasional and average and permits external readings only with non-quantificational determiners. Of course, describing various particular determiners as ‘non-quantificational’ is already a bit tendentious—though for the moment, I mean this only descriptively, in the sense of Heim (1982), Kamp (1981), and DRT more generally—so more needs to be said for explicitness.

It goes beyond the scope of this paper to advocate a particular theory of how determiner quantification works in general. All we require is some general conceptual machinery to characterize particular classes. I’ll refer to every and most DPs as strong and inherently quantificational; definite descriptions and other DPs that arguably directly refer as strong but not inherently quantificational; and all others as weak.

Setting the ellipsis class aside, all nonlocal readings observe a generalization:

figure at

This has been observed for specific lexical-semantic families of adjectives, but the important point is that it seems to be true of all of them.

As we’ve seen, a few nonlocal adjectives—occasional, average, and wrong—are even more constrained in that they are incompatible with any determiner apart from (some combination of) the, a, bare plurals, and generic your. Stating it more officially:

figure au

These generalizations are the crucial element in the taxonomy, so it may help to summarize things in a table:

figure av
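The taxonomy can also be restated as a small lookup, which may make its predictions easier to check against the data in Sect. 2. This Python sketch is only a rough restatement under stated assumptions: the function name `external_reading_predicted` and the toy lexicon are hypothetical, the modal-superlative class is left out because it is set aside here, and the sketch ignores the fact that individual adjectives permit only some combination of the non-quantificational determiners.

```python
# Determiners the text calls strong and inherently quantificational.
STRONG_QUANTIFICATIONAL = {"every", "most"}

# Determiners available to the no-quantifier class (a simplification:
# individual adjectives allow only some combination of these).
NON_QUANTIFICATIONAL = {"the", "a", "bare plural", "generic your"}

# Hypothetical toy lexicon, limited to adjectives the text assigns a class.
ADJECTIVE_CLASS = {
    "occasional": "no-quantifier",
    "average": "no-quantifier",
    "whole": "weak-quantifier",
    "unknown": "weak-quantifier",
}


def external_reading_predicted(adjective: str, determiner: str) -> bool:
    """Return whether the taxonomy predicts an external reading."""
    # Both classes resist strong, inherently quantificational determiners.
    if determiner in STRONG_QUANTIFICATIONAL:
        return False
    # The no-quantifier class is restricted further.
    if ADJECTIVE_CLASS[adjective] == "no-quantifier":
        return determiner in NON_QUANTIFICATIONAL
    # The weak-quantifier class tolerates everything else.
    return True


print(external_reading_predicted("whole", "three"))       # weak determiner
print(external_reading_predicted("occasional", "three"))  # weak determiner
```

The two calls at the end differ only in the adjective, which is exactly the split between the weak-quantifier and no-quantifier classes.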

Of course, the challenge now is to explain these generalizations. That’s a tall order, inasmuch as it requires a synthesis of a vast array of adjectives and (collectively) a vast literature and set of analytical approaches. This won’t happen in any single paper. Nevertheless, having framed the challenge in this way, we are in a better position to assess what an explanation might look like.

4 Some Background

4.1 Incorporation

First, we must dispense with a straw man. One might imagine that external readings of adjectives are brought about simply by moving the adjective from its base position to an adverbial position, where it is interpreted as an adverb. The idea is a natural one, and I’ll argue that in a certain sense it’s not entirely wrong—but formulated in this crude way, it’s unenlightening. Why should this movement happen? Why would an adjective have an adverb meaning? How does this help us understand the interaction of the adjective with the determiner?

More enlightening alternatives are available. There are many analyses on the market of individual instances of the larger problem of nonlocal readings, but they aren’t straightforwardly generalizable to the full range of facts. There is one idea, though, that constitutes an excellent starting point. It’s Larson (1999)’s proposal (further developed in Zimmermann 2000, 2003) that, in the occasional construction, the adjective moves from its base position to incorporate into the determiner in a process of ‘complex quantifier formation’ [Footnote 5]:

figure aw

This movement creates a single quantificational determiner, an+occasional. It is then possible to provide this determiner with a denotation, listed in the lexicon just like that of any other. The advantage of that is that it’s straightforward to capture various idiosyncrasies. If we need to stipulate that for occasional and average, the denotations of the, a, and your should be identical but for wrong they shouldn’t be, we can reflect it directly. Indeed, we should expect such idiosyncrasies, inasmuch as the lexicon is, after all, a repository of the idiosyncratic.

What’s less comfortable is that we have to stipulate not just that an+occasional, the+occasional, and your+occasional all have identical denotations, but also to make precisely the same stipulation independently for a+sporadic, the+sporadic, and your+sporadic, and indeed for other combinations of a, the, and your with adjectives of this class (though see Zimmermann 2003 for some inroads on this). [Footnote 6]

Some analysis is necessary of why these readings fail to occur with determiners other than a, the, and your. On this approach, the analysis would simply be to refrain from stipulating any complex determiners that lack one of these as a component. The result would be essentially an accidental lexical gap, a mere accident of the development of language.

This approach helps in one way right off the bat. Quantificational determiners have access to the VP by perfectly ordinary means: Quantifier Raising (May 1977, 1985; Heim and Kratzer 1998). A generalized quantifier (the type of expression a quantified nominal denotes) takes a VP as its argument. The basic architecture of a quantified sentence is as in (54):

figure ax

The determiner every here has ‘access’ to the VP in the sense that its denotation asks for a predicate, Q in (55), that it can subsequently manipulate. The manipulation of VP meanings is the signature property of adverbials, of course, so on the incorporation view, what makes it seem like occasional has an ‘adverbial’ external reading is that it incorporates into a quantificational determiner and therefore has access to a VP meaning. Its access to clausal material external to the DP is a side-effect of the access to the VP it has by actually being, in a deeper sense, a determiner.

If an adjective is part of a quantificational determiner meaning, it will gain access to the VP as a matter of course.
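The sense in which a determiner has ‘access’ to the VP can be illustrated with a textbook-style generalized quantifier. The Python sketch below is a minimal illustration, not the denotation given in the examples above: the domain, the predicates `sailor` and `strolled_by`, and their extensions are all invented for the purpose.

```python
# Hypothetical toy model; individuals and extensions are illustrative only.
DOMAIN = {"ann", "bo", "cy"}
SAILORS = {"ann", "bo"}
STROLLERS = {"ann", "bo", "cy"}


def sailor(x):
    return x in SAILORS


def strolled_by(x):
    return x in STROLLERS


def every(P):
    """Determiner meaning: takes a restrictor P, returns a generalized quantifier."""
    def quantifier(Q):
        # Q is the VP meaning; the determiner's denotation manipulates it.
        return all(Q(x) for x in DOMAIN if P(x))
    return quantifier


def some(P):
    """A second determiner with the same two-step shape, for comparison."""
    return lambda Q: any(Q(x) for x in DOMAIN if P(x))


every_sailor = every(sailor)          # the quantified nominal
print(every_sailor(strolled_by))      # the nominal applied to the VP meaning
```

Anything that becomes part of the determiner's meaning sits inside `quantifier` and can therefore see `Q`, which is the point of the incorporation view.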

Thus this approach accounts for the adverbial scope of occasional and its kin, for the idiosyncratic interpretations of determiners in this construction, and (by stipulation) for restrictions on the determiner. It also accounts for the restriction on coordination: any adjective in a coordinate structure would be unable to move out of it without violating the Coordinate Structure Island. In general, movement out of one conjunct in a coordinate structure is not possible:

figure ay

That’s precisely the sort of movement that, on this view, would be required in (57) to achieve the impossible external reading:

figure az

The obligatory high position of the adjective is explained as well—any adjectives above it would block its path to the determiner.

The incompatibility of external readings with degree modification would also be expected, because only a bare adjective, and not a phrasal constituent, can undergo head-to-head movement, the kind required here. Occasional on its own is the head of an AP, but very occasional is not. This approach may even shed light on Zimmermann (2003)’s observation that external readings are often absent where Quantifier Raising is blocked. This analysis can be extended to average, wrong, perhaps same, and maybe others.

Nevertheless, one might have some qualms. The movement required would seem to violate the Head Movement Constraint (Travis 1984), which would normally prevent a head from moving outside of an adjoined phrase (the AP, in this case) as in (53).

More worrying, perhaps: why are a, the, and your alone the determiners that have been targeted for complex quantifier formation? Could it in principle have been any other combination? And why is it that the denotations of these complex determiner-adjective combinations aren’t unpredictable? If they’re specified in the lexicon, one might imagine virtually arbitrary variation, but the generalizations we would like to explain aren’t arbitrary. Whatever the answers to these questions, more would have to be said to make weak-determiner-compatible adjectives such as whole, unspecified, and different fit in.

4.2 Structure Versus Ontology: The First Step

Framing the current project as a trade-off between structure and ontology, at least with respect to average, is, as I’ve said, not novel. What I propose here is a variation on a theme from Kennedy and Stanley (2009). They observe the connection between average and occasional, and that this connection affords an analytical opportunity. For them, average incorporates into the determiner, just like occasional does for Larson (1999). The actual combinatorics required to achieve the necessary readings are complicated in ways that can be set aside, but they require a non-standard scope-taking mechanism that Barker (2007) dubbed Parasitic Scope, though appeals to it without the brand name can be found in Sauerland (1998) and earlier. The structure they propose is this:

figure ba

The variable n here ranges over real numbers, or what number terms like 2.3 denote. The denotation is built up using the complex determiner th’average as in (59):

figure bb

The denotation of th’average applies to three arguments. The first is a relation between numbers and individuals that have that number of children. The second is a property indicating what population is being averaged over, in this case, Americans. The final one is a real number indicating the computed average.

The details of implementation won’t be crucial here, but they involve computing a mean on the basis of the maximal number of children each individual has [Footnote 7], and |P| should be interpreted as the number of individuals that have the property P:

figure bc
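The arithmetic behind this denotation can be sketched in a few lines. The following Python fragment is only an approximation of the idea described in the text, not Kennedy and Stanley's formulation: the function name `th_average`, the data, the finite range of candidate numbers, and the ‘at least n’ construal of the relation are all assumptions made for illustration. Exact fractions are used so that the comparison with the computed mean is not affected by floating-point rounding.

```python
from fractions import Fraction

# Hypothetical data: children per individual (illustrative values only).
CHILDREN = {"a1": 0, "a2": 1, "a3": 1, "a4": 2, "a5": 2,
            "a6": 2, "a7": 3, "a8": 3, "a9": 4, "a10": 5, "b1": 7}
AMERICANS = {f"a{i}" for i in range(1, 11)}
CANDIDATES = range(0, 21)  # assumed finite search space for counts


def has_children(n):
    """Relation between a number n and individuals with at least n children."""
    return lambda x: CHILDREN[x] >= n


def american(x):
    return x in AMERICANS


def th_average(relation, population, n):
    """True if n is the mean, over the population, of each member's maximal number."""
    members = [x for x in CHILDREN if population(x)]
    if not members:
        raise ValueError("average undefined for an empty population")
    # For each member, take the maximal k such that the relation holds of k and x.
    total = sum(max(k for k in CANDIDATES if relation(k)(x)) for x in members)
    # len(members) plays the role of |P|.
    return Fraction(total, len(members)) == n


print(th_average(has_children, american, Fraction(23, 10)))
```

The three arguments mirror the three described above: a relation between numbers and individuals, a property fixing the population, and the number claimed to be the average. No individual in the model has 2.3 children, and nothing in the model corresponds to ‘the average American’.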

The most important point, for current purposes, is that on this view average DPs don’t refer to anything metaphysically exotic because they don’t refer to anything at all. Rather, they have an exotically high semantic type, which, coupled with incorporation of an adjective into a determiner and an unusual scope-taking operation, adds up to a semantics that yields the right reading. For them, the right reading is strictly adverbial. It’s the reading that can be paraphrased ‘on average, Americans have 2.3 children’. It’s worth noting, though, that this analysis has many of the same costs as the basic incorporation analysis, including having to stipulate the equivalence of the, a, and your on this reading.

4.3 The Kind Analysis of ‘Occasional’

The balance between the compositional semantics and the ontology is tilted in precisely the other direction in Gehrke and McNally (2010, 2015), building on Schäfer (2007). The distinctive property of occasional nominals, for them, is not in their grammar but rather in their referential properties—and it is therefore there that we should locate an analysis. So they seek a simpler syntax-semantics and a richer ontology.

It would require navigating quite a bit off my intended course to do justice to their proposal, but at its heart is an idea on which I will build: kind reference. The observation is that the occasional sailor involves reference to realizations of sailor-kinds. Very approximately, the truth conditions of the now-familiar sailor sentence can be rendered as in (61):

figure bd

The major advantage to this strategy is that it doesn’t require the compositional backflips that the incorporation analysis—and especially the Kennedy and Stanley (2009) variant for occasional—requires. Indeed, because there is no movement at all, it doesn’t violate the Head Movement Constraint. It also provides insight into why a, the, and your should be the determiners that uniquely have a special status in this construction. This is precisely the class of determiners that have a special status with respect to kinds and genericity:

figure be

To the extent that this approach is successful, it requires no special stipulations about the denotations of determiners. And because of that, it helps explain why determiner interpretations don’t vary freely. No special stipulations are necessary to explain why your+occasional or the unattested *every+occasional don’t just happen by chance to mean something they don’t actually mean.

The main shortcoming of this approach, from the current perspective, is that it’s not clear how to make it scale up. On its own, it seems convincing that kind-reference is going to be a crucial ingredient in the analysis of external readings. But it’s not clear to me how to make it the principal ingredient in a fully general theory.

5 The Modular Strategy

5.1 Determiner-Like Adjectives

The aim of this paper is not to present a general theory of nonlocal readings, but taking a confident step in that direction requires a theory of how they arise that is modular: that is, one that relies on multiple interacting parts to arrive at an explanation. Such a theory makes it possible to activate or deactivate certain of these components to explain variation among subclasses of adjectives and—most directly at issue here—to explain the biggest split among nonlocal readings, the one between adjectives that give rise to Broader Quantifier Resistance and those that don’t. (This sets aside, of course, the possible ellipsis class.)

One satisfying aspect of the incorporation analysis sketched above is that it reflects that nonlocal adjectives aren’t prototypically adjective-like, even on a purely descriptive level. They don’t pass standard diagnostics for adjectives, such as the ability to occur in comparatives, with degree modifiers, or in the complement position of seem. They don’t conjoin with adjectives. Nor do they occur in the same positions as adjectives generally; rather, they are obligatorily high.

This might suggest incorporation or another form of syntactic differentiation, but all these properties also follow from simply assuming that nonlocal adjectives have an unusual semantic type. In the spirit of the incorporation approach, I’ll assume these adjectives have precisely the same type of denotation as quantificational determiners, namely \(\langle \langle e,t\rangle ,\langle \langle e,t\rangle ,t\rangle \rangle \). Switching back to average American, the picture would be as in (63):

figure bf

This has as a consequence that the node above the adjective, the NP average American, would denote a generalized quantifier. Following standard assumptions (see Heim and Kratzer 1998 for a review), it would therefore have to quantifier-raise and adjoin to the clause to avoid a type clash. I’ll leave aside what happens higher in the clause for the moment to focus on the DP. The trace this movement leaves behind would standardly denote an individual. To make these LFs slightly easier to read later in the paper, I’ll write it as a variable rather than a trace:

figure bg

But this is hardly any help at all. It just gives rise to a different type clash: the NP would now denote an individual, but the is of type \(\langle \langle e,t\rangle ,e\rangle \) and expects a property.

There is a natural solution. It’s to adopt the standard be type shift (Partee 1987), which shifts an individual to the property of being that individual:

figure bh

Applied to Floyd, for example, this shift would yield the property of being Floyd:

figure bi

Partee used it for copular constructions, and it has subsequently proven useful elsewhere. In this case, this resolves the type clash by providing the with the property-denoting argument it seeks in (64):

figure bj

But as it turns out, at the next node up, this shift will achieve for us something more.

5.2 Determiners That Work

One of the things we would like to explain is why the, a, and your seem to work robustly with a number of nonlocal adjectives, and why distinctions in their interpretations seem to be neutralized in the presence of frequency adjectives and average/typical. That result follows from the type shift alone. There is one and only one individual that has the property of being Floyd, and it is Floyd. For this reason, the person who is Floyd and Floyd mean the same thing. So too, here the would combine with the property the shifted trace denotes to yield the unique individual that is identical to the one the unshifted trace denotes:

figure bk

This is precisely the same individual as the one denoted by the trace alone. The effect is as though the were absent entirely, as though the nonlocal adjective and its NP sister had occurred in subject position on their own.
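The vacuity of the in this configuration can be checked on a small finite model. The following Python sketch is an illustration under stated assumptions, not part of the analysis itself: the domain is invented, and `be_shift` and `the` are hypothetical names for the type shift and a uniqueness-presupposing definite article.

```python
# Hypothetical toy domain of individuals.
DOMAIN = {"floyd", "clyde", "greta"}


def be_shift(x):
    """Map an individual to the property of being that individual."""
    return lambda y: y == x


def the(prop):
    """Return the unique individual in DOMAIN that satisfies prop."""
    witnesses = [y for y in DOMAIN if prop(y)]
    if len(witnesses) != 1:
        raise ValueError("uniqueness presupposition of 'the' is not met")
    return witnesses[0]
```

For any individual in the domain, applying `the` to the shifted individual returns that same individual, so the determiner contributes nothing beyond what the trace already supplies.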

The semantically-bleached variant of your that occurs in e.g. your average American mostly amounts to a version of the with a slight whiff of genericity about it, which would leave us in more or less the same place (see Gehrke and McNally 2010, 2015 for more).

As for a, the right result follows from a simple equivalence. To say that there’s a person x such that x is wearing a hat and x is Floyd is just to say that Floyd is wearing a hat. The same equivalence manifests itself in (70). The standard denotation of the indefinite article in (70a) when combined with the shifted trace denotation, as in (70b), yields an expression that asks for a predicate Q and says that some individual identical to \(x_1\) satisfies Q:

figure bl

To say that there is an individual identical to \(x_1\) of which the predicate Q holds is simply to say that Q holds of \(x_1\):

figure bm
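The same equivalence can be verified mechanically on a toy model. In this Python sketch the domain, the set of hat wearers, and the names `be_shift`, `a`, and `wears_hat` are all hypothetical; the indefinite is given its usual existential meaning over a finite domain.

```python
# Hypothetical toy model.
DOMAIN = {"floyd", "clyde", "greta"}
HAT_WEARERS = {"floyd", "greta"}


def be_shift(x):
    """Map an individual to the property of being that individual."""
    return lambda y: y == x


def a(prop):
    """Indefinite article: takes P, then Q, and says that some
    individual in DOMAIN satisfies both."""
    return lambda pred: any(prop(y) and pred(y) for y in DOMAIN)


def wears_hat(y):
    return y in HAT_WEARERS
```

Here `a(be_shift(x))(wears_hat)` and `wears_hat(x)` agree for every `x` in the domain, which is the sense in which the indefinite is truth-conditionally inert when its restrictor is a shifted trace.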

The result, again, is truth-conditionally identical to what would have happened had the determiner been absent entirely.

To articulate this a little bit further, let’s adopt the toy denotation for average in (72a). This applies to the denotation of the modified NP, and predicates the VP meaning of the kind that corresponds to the NP meaning, using Chierchia (1998)’s \(^\cap \) property-to-kind type shift (Footnote 8):

figure bn

This probably isn’t adequate on its own as a theory of average, and much of Kennedy and Stanley (2009) may have to be layered on top of it. A few more words on this follow in Sect. 6.1 below. But it suffices to sketch the compositional machinery. Thus the updated tree would look like this (I’ve ornamented the tree with a superscript k to reflect that the trace of average American denotes a kind):

figure bo

The result of the computation is just what we need:

figure bp

So the upshot is a semantics that requires that Americans generally have 2.3 children.

The crucial component to notice here is not the semantics of average, though, so much as the way the combination of the type shift, compositional assumptions, and kind-reference have achieved the effect of ensuring that precisely the determiners that systematically license external readings yield the right result.

5.3 Determiners That Don’t Work

What of determiners that don’t work? Again, the nature of the movement and resulting type shift helps the situation—or rather, undermines it in the right way.

Strong determiners like every and most presuppose that their domain has more than one member. If there is only one person in the corner, for example, (75) gives rise to failure of presupposition:

figure bq

I’ve spelled it out explicitly in the denotation of every in (76) (the colon indicates the presupposition; |P|, as before, indicates the cardinality of individuals that satisfy P):

figure br

In (77), every combines with the property the shifted trace denotes, the property of being identical to \(x_1^k\):

figure bs
figure bt

But (78) is a singleton property—there is only one individual that is identical to \(x_1^k\). It therefore violates the presupposition every imposes on its first argument.
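A small Python sketch can make the presupposition failure concrete. It assumes a finite toy domain and models the presupposition of every as the requirement that its restrictor hold of more than one entity; the exception class, the helper `is_defined`, and the entity names are hypothetical.

```python
class PresuppositionFailure(Exception):
    """Raised when a determiner's presupposition is not met."""


# Hypothetical toy domain; the "k_" entities stand in for kinds.
DOMAIN = {"k_american", "k_sailor", "floyd"}


def be_shift(x):
    """Map an entity to the property of being that entity."""
    return lambda y: y == x


def every(prop):
    """'every' with the presupposition that |P| > 1."""
    restrictor = {y for y in DOMAIN if prop(y)}
    if len(restrictor) <= 1:
        raise PresuppositionFailure("restrictor must have more than one member")
    return lambda pred: all(pred(y) for y in restrictor)


def is_defined(determiner, prop):
    """True iff combining determiner with prop meets its presupposition."""
    try:
        determiner(prop)
    except PresuppositionFailure:
        return False
    return True
```

Because a shifted trace always denotes a singleton property, `is_defined(every, be_shift(x))` is false for any `x`, while an ordinary restrictor with two or more members is unaffected.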

This presupposition is not a peculiarity of every, but rather a property of strong quantificational determiners in general. Thus most would work similarly. Because movement below the DP level, in the framework proposed here, systematically gives rise to such singleton properties, it systematically precludes combining with strong quantifiers.

We have thus derived one of the two generalizations articulated earlier: the Strong Quantifier Resistance Generalization. All external readings observed it, so if this mechanism is crucial to deriving external readings, this explains it. Weak quantificational determiners do not have this presupposition, so they don’t in general block external readings.

But what of the Broader Quantifier Resistance Generalization, the one only some adjectives observed? Some adjectives—like our test cases, average and occasional—do block the external reading with weak quantifiers too. But despite the absence of the fatal presupposition, these fail in another respect. The denotation of three is as in (79), a property of individuals that have a cardinality of 3:

figure bu

When this combines with the shifted trace, it will combine intersectively with its denotation to yield (80):

figure bv

This is a property satisfied by a plurality with three elements that is identical to the kind \(x_1^k\). That means, naturally, that the kind \(x_1^k\) has to be a plurality of three elements. But kinds aren’t pluralities, and they don’t have cardinalities. This is pretty straightforward metaphysically, but again, linguistic evidence makes it clear. As Chierchia (1998) demonstrates especially convincingly, across languages kinds are essentially a kind of mass term. Cheese, for example, denotes a kind in English, and *three cheese is ungrammatical.
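The clash between cardinality and kinds can also be sketched on a toy model. The Python below is an illustration only, and it builds in the assumption at issue: pluralities are modeled as sets of atoms, whereas a kind is modeled as an atomic object with nothing to count. The class `Kind`, the atoms, and the function names are hypothetical.

```python
from itertools import combinations


class Kind:
    """An atomic kind; by assumption it has no members to count."""

    def __init__(self, name):
        self.name = name


# Hypothetical toy ontology: four atoms, their pluralities, and one kind.
ATOMS = ["a1", "a2", "a3", "a4"]
AMERICAN_KIND = Kind("American")
PLURALITIES = [
    frozenset(c) for r in range(1, len(ATOMS) + 1) for c in combinations(ATOMS, r)
]
ENTITIES = PLURALITIES + [AMERICAN_KIND]


def three(x):
    """Property of pluralities with exactly three atomic parts."""
    return isinstance(x, frozenset) and len(x) == 3


def three_and_shifted_trace(kind):
    """Intersective combination: three-membered and identical to the kind."""
    return lambda x: three(x) and x == kind
```

Plenty of entities satisfy `three` on its own, but nothing in `ENTITIES` satisfies `three_and_shifted_trace(AMERICAN_KIND)`, since the only entity identical to the kind is not a plurality.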

So in this case, the problem that rules out weak quantifiers has to do with kinds, and it will be only nonlocal adjectives that leave behind kind-denoting traces that will be subject to this additional restriction. Occasional is also incompatible with weak quantifiers, and, as Gehrke and McNally (2010, 2015) demonstrate, its semantics also relies crucially on kinds. Nonlocal adjectives with no kind overtones such as whole or wrong or unspecified should therefore avoid running afoul of this difficulty and be compatible with weak quantifiers even on their external readings. And indeed they are. More on both of these points follows in the subsequent two sections.

5.4 A Word About ‘Occasional’

Occasional and its kin aren’t the focus here, but a brief word about how they might work in this framework is appropriate. The approach to which I’m most sympathetic would be to simply combine the insights of two competing classes of approaches. Kinds must occupy a central place, for the reasons discussed above. But quantification can play a central role too. In particular, there is no reason not to adopt Zimmermann (2000)’s quantifier OCCASIONAL, which quantifies jointly over individuals and events, though here it will be crucial that it be kinds and events (with as the type of events):

figure bw

This denotation would trigger movement to a position just below where the event argument is closed, and yield sentence denotations like (82):

figure bx

This seems a reasonable happy medium between the two approaches.

5.5 The Weak Quantifier Class

There remains to discuss the class of external readings that are compatible with weak quantifiers. For those, though, in one sense there is little to be said. What ensured incompatibility with weak quantifiers above was the role of kinds. Adjectives whose semantics makes no special reference to kinds don’t give rise to the problem of computing the cardinality of a kind.

To illustrate, the denotation of unknown could be characterized as in (83), where I’ve used \(?x\phi \) to abbreviate the embedded question ‘which x is such that \(\phi \)?’ (Footnote 9):

figure by

What unknown hotel does is a little complicated. First, it requires that there exist a hotel that satisfies the property formed by raising the whole quantified NP. Second, it requires that it not be known which individuals satisfy this property.

It will help to see how this works in action. The tree for (84a) arrived at by raising would be as in (84b):

figure bz

This assumes a null existential determiner in the head of the nominal, and that, standardly, it undergoes quantifier raising. The denotation of (84) would be as in (85):

figure ca

This is a property that holds of any three-membered plural individual such that Solange stayed at its members (Footnote 10).

What unknown hotels adds to this is that this plurality is required to consist of hotels, and that it not be known which hotels precisely these are. The computation for the full sentence is in (86):

figure cb

The result, correctly, is that there must be three hotels at which Solange stayed, and it must not be known which hotels these are.
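The two requirements can be illustrated with a rough Python sketch. It goes beyond anything stated here in one respect, which should be flagged: "not known which" is modeled, purely as an assumption, by letting the answer to the embedded question vary across a set of epistemic alternatives. The hotel names, the worlds, and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy model. Each world is the set of hotels Solange stayed at.
HOTELS = {"ritz", "plaza", "savoy", "drake"}
ACTUAL = frozenset({"ritz", "plaza", "savoy"})
ALTERNATIVES = [
    frozenset({"ritz", "plaza", "savoy"}),
    frozenset({"ritz", "plaza", "drake"}),
]


def three_stayed_at(world):
    """Three-membered pluralities of hotels, all of whose members
    Solange stayed at in the given world."""
    return {
        frozenset(c) for c in combinations(sorted(HOTELS), 3) if set(c) <= world
    }


def unknown_three_hotels(actual, alternatives):
    """Existence in the actual world plus ignorance about which hotels:
    the answer is not constant across the epistemic alternatives."""
    exists = bool(three_stayed_at(actual))
    answers = {frozenset(three_stayed_at(w)) for w in alternatives}
    return exists and len(answers) > 1
```

With the two alternatives above the sentence comes out true; if the only alternative is the actual world, the identity of the hotels counts as known and the sentence comes out false. Nothing in the computation objects to the cardinality three, since the pluralities being counted are ordinary sets of hotels.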

The crucial element in all this, though, is that there is nothing about unknown that prevents cardinalities from being computed, and so nothing that resists, in this instance, three, and more broadly any of its kin.

5.6 Summary

The result, then, is that there is no need for incorporation. The external scope facts follow from quantifier raising. The interpretation of determiners is standard. Restrictions on determiners follow from independent considerations. The general resistance of nonlocal adjectives to strong quantifiers follows from the compositional circumstances of their movement, which invoke a type shift with which they are incompatible. The resistance of certain nonlocal adjectives to weak quantifiers follows from independent facts about the lexical semantics of the adjective—specifically, having a kind-based semantics. Other restrictions, like the lack of coordination with ordinary adjectives and absence of degree modifiers, follow from the quantifier type of these expressions.

This means it was not necessary to stipulate which determiners support incorporation and which don’t, or what interpretations result for every combination. Nor was it necessary to stipulate why the, a, and your wind up making the same semantic contribution, or to do so repeatedly for each frequency adjective. It also wasn’t necessary to stipulate anything about the interaction of quantificational force with external readings. This is possible in part precisely because what I have offered here is only a sketch. The devil, as always, is in the details. But I hope this illustrates an analytical approach to these facts that might hope to scale up to the broader analytical picture I sought to draw.

6 Taking Stock

6.1 Could Things Be so Simple?

One issue remains strikingly unresolved. I’ve characterized the denotation for average I’ve provided above as a toy denotation. I’ve said, perhaps a bit defensively, that things couldn’t possibly be so simple. Surely, it couldn’t suffice to say that the average American means, essentially, the same thing as the kind-denoting nominal Americans, and (87a) and (87b) mean more or less the same thing:

figure cc

But the truth is, I think the simple toy version of the facts may be onto a deeper grammatical intuition than the more complicated one.

To be sure, we have the option of layering on components of the Kennedy and Stanley approach here, introducing elements of their machinery on top of the bits I propose to achieve their desired adverbial reading. There is a danger of redundancy, though. And the more one does that, the farther one gets from the connection to kind-reference, for which Gehrke and McNally provide ample evidence.

The defense of the naïve theory proceeds in several steps. The first is empirical. Suppose we adopt a theory that involves computing a mean. On such a view, (88a) and (88b) would both be predicted to be true, and, therefore, quite probably (88c) too:

figure cd

Yet they are all false, or in any case false outside of exceptionally odd statistical contexts. Any theory that revolves primarily around calculation of means would fail to predict this. But in a theory that relies on kind reference, it’s expected. On such a view, it’s the 2.3 children case that’s puzzling.

That, I think, is precisely where we should want to be puzzled—that is the case that we should treat as exotic rather than as the core example. Most languages through most of human history had no reason to refer to fractions. Moreover, the semantics of fractions is independently puzzling. They are problematic completely independent of their role in average sentences. It makes sense, then, that the theory of average shouldn’t be itself founded on this independent mystery.

That said, nothing in the general conception of external readings proposed here rests above all on any particular assumptions about kinds. Perhaps other notions could do the necessary work without putting us on thin ice with respect to sentences containing fractions. Indeed, I consider one such possibility in Sect. 6.2 below. The only crucial role kinds play here is to rule out computing cardinalities, which in turn is crucial to distinguishing the weak-quantifier class from the no-quantifier class. That’s not nothing, but there may be other means of accomplishing this. Nevertheless, it’s worth recognizing that there are several converging lines of evidence that point to kinds or some form of genericity in these sentences: initial intuitions about what average sentences mean, the judgments in (88), the role kinds play in distinguishing classes of external readings, and their place in correctly predicting which determiners have which readings. One might still be able to explain 2.3 children by simply adopting, with Kennedy and Stanley, an extraordinarily high type, but it seems right that special stipulations should be required there and not elsewhere.

It’s worth pointing out, though, that one could also follow in the spirit of Carlson and Pelletier (2002) and appeal to fictive entities in place of some form of kind. This analytical avenue may actually be more available on this approach. Kennedy and Stanley argue against the fictive entity approach in part on the grounds that it doesn’t explain the limited inventory of determiners possible with average. Those facts, however, can be explained independently here. But again, if the relevant notion of fictive entities can emerge with an appropriate kind flavor, that seems preferable on independent grounds to the alternative.

None of that directly addresses what the semantics of 2.3 children should be. My suspicion is that an ultimately satisfying answer requires not just a theory of nonlocal readings of adjectives, but a better theory of mathematical language, and in particular of what I elsewhere call ‘semantic viruses’ (Morzycki 2017), in the spirit of Sobin’s (1997) syntactic viruses (see also Lasnik and Sobin 2000; Schütze 1999). I argue there that some expressions associated with educated, often highly self-conscious language may use special semantic mechanisms not otherwise available in the semantics. Making this distinction may help us distinguish which operations and what grammatical phenomena truly are exotic and may call for some brute-force high-type complexity, and where we should seek simplicity, even occasionally in the face of apparent counterevidence.

6.2 Kinds and Concepts

Sebastian Löbner (p.c.) suggests that a number of the restrictions on external readings of average and occasional may involve characterizing more precisely the concept types they give rise to. Average American on the relevant reading isn’t a sortal concept—one that supports counting and is neither uniquely referential nor relational. That accounts for its incompatibility with strong quantifiers (#every average American), and perhaps for its incompatibility with stacked or conjoined adjectives (#an average (and) irritable American).

This mode of explanation in some respects has the same shape as a kind-based one, or indeed as one organized around fictive entities. They all seek to derive the properties of the expression from the ontological status of the extension of the nominal. Both kinds and the relevant non-sortal concepts are uncountable. It doesn’t seem too far-fetched to claim that fictive entities might not be countable either, though that’s less obvious. Insomniacs are sometimes advised to count sheep in order to fall asleep, yet under normal circumstances the livestock in one’s bedroom are entirely fictive. Likewise, the resistance to quantification that I earlier attributed to a failure of presupposition could be attributed to countability as well. As I expressed it in (76), the presupposition involved determining the cardinality of individuals that satisfy the property expressed by the nominal argument. In this implementation, that is not undefined. Even though this property has in its domain kinds, it denotes a property that holds of precisely one kind. Therefore it is countable. This follows from how the movement and type shift interact. One might imagine, though, an alternative analysis where the inherent countability of the noun is crucial. In order for the analysis of adjective stacking and conjunction to go through, however, one really would have to have the NP average American denote this concept kind quite low in the tree, before any type shifts have taken place. On this view, then, the crucial difference could be viewed as being in how high in the structure of the nominal kinds are invoked. But there are good reasons to think properties of kinds are to be found deep in the nominal extended projection, very near the noun (Zamparelli 1995 among others). So this fact too may be insufficient to distinguish these two approaches on a deep level, setting aside particular analytical choices I’ve made here.

The adjective order facts, however, might be of use. Most evidence for a layer in the nominal projection that is concerned with kinds rather than objects suggests that it is the lower of the two. So-called Bolinger contrasts (Bolinger 1967; see Morzycki 2016a and Leffel 2014 for extensive discussion) such as the one in (89) show that adjectives lower in the nominal ascribe inherent or individual-level properties, and higher ones ascribe contingent or stage-level properties:

figure ce

On its only possible reading, (89a) refers to stars that are visible in principle but invisible at the moment, perhaps because of clouds. But (89b) is contradictory, because it refers to stars that are invisible in principle but visible at the moment. A broadly similar fact, in the spirit of Larson (1998, 2000):

figure cf
figure cg

Larson marshals such facts to argue for a generic quantifier in the nominal projection. But be it about kinds or not, the domain of genericity in the nominal is low. Yet as we’ve seen, adjectives associated with external readings are exclusively high. A reminder:

figure ch
figure ci

If kinds or non-sortal properties were at issue lower in the nominal, this effect would be expected to be either reversed or absent entirely.

One appeal of such an approach, in either of these incarnations, would be that the quantificational facts and the facts about conjunction and stacking could be brought under the same rubric. As it stands, the latter derive from the quantificational type of the NP. A major disadvantage is that they wouldn’t readily extend to the rather large class of adjectives compatible with weak quantifiers. Nor, in the absence of a scope-taking mechanism, would they permit the adjective access to the VP denotation. Yet this access is precisely what seems to be required for e.g. epistemic adjectives such as unknown, as shown in Sect. 5.5.

7 Final Remarks

To close, a few words about the commonly expressed intuition that nonlocal readings are a grammatical oddity. These adjectives are indeed odd, but in a precise and interesting sense. They are odd in the way that platypuses and lungfish are odd: they are—perhaps metaphorically, or perhaps more than metaphorically—transitional forms in an evolutionary progression, unusual because they combine features of two distinct categories that we normally regard as mutually exclusive. Over succeeding generations of speakers, certain adjectives may emerge from the swampy depths of the inner NP to which they are usually confined, and tentatively make their way onto the dry land of the determiner domain. They can’t be expected to make this leap in a single stride, so we can observe them in the midst of their evolutionary journey and thereby discover more about both their evolutionary origin and their destination. Like platypuses and lungfish, they are important and analytically revealing not despite their strangeness, but because of it.

Substantively, the proposal was that nonlocal adjectives have quantificational determiner denotations, trigger raising of the NP in which they occur, stranding the determiner, and sometimes require properties of kinds as their arguments. This isn’t a general theory of all nonlocal readings, naturally. That would be far too ambitious for any single paper. But it has the shape of a general theory, and my hope is that further research will be able to fill in the gaps in a similar spirit.

From the broader cognitive perspective, though, one of the larger lessons is the balance between the explanatory burden on the ontology and on the structural machinery. For average, for example, one might have gone in the direction of recognizing ‘average Americans’ as actual, if very abstract, objects in the model, ‘fictive persons’. For occasional, I followed Gehrke and McNally (2010, 2015) in placing a great deal of explanatory weight on the notion of kinds, if perhaps not quite so much weight as they have.

On the other hand, structural components played a crucial role. For average, one could go so far as Kennedy and Stanley (2009) do, and invoke quite high-powered syntactic and semantic machinery to twist the tree into the shape we require. For occasional, Larson (1999), Zimmermann (2000) and others provide a path that also requires quite a bit of syntactic machinery.

It is misguided, I think, to ask where we wind up in each respect: how much compositional structure do we need, how much metaphysics, and what the right balance is. Rather, we should recognize that there may be some explanatory trade-offs, but that inevitably, we will need a bit of both modes of explanation—and it is up to language to tell us how much we need of either.