Counterfactuality and past

Many languages have past-and-counterfactuality markers such as English simple past. There have been various attempts to find a common definition for both uses, but I will argue in this paper that they all have problems with (a) ruling out unacceptable interpretations, or (b) accounting for the contrary-to-fact implicature of counterfactual conditionals, or (c) predicting the observed cross-linguistic variation, or a combination thereof. By combining insights from two basic lines of reasoning, I will propose a simple and transparent approach that solves all the observed problems and offers a new understanding of the concept of counterfactuality.


Introduction
It has long been observed that, across a large number of unrelated languages, markers of the past also refer to counterfactual contexts. This relation is also easily observable in English:
(1) a. Erica sat down and drank a glass of water.
b. If Erica drank more water (in the present/future), she would be healthier.
c. If Erica had drunk a glass of water (in the relevant past), she would not be this dehydrated.
In Sect. 2, I will present the main facts from English and other languages that this article is concerned with. As I will discuss in Sect. 3, the puzzling correlation between past and counterfactuality has attracted a great deal of research in typology, cognitive linguistics and formal semantics. I will discuss in particular two lines of reasoning: the remoteness-based approach, represented by Iatridou (2000), in which English simple past (ESP) and related markers express a distance from the actual present; and the back-shifting approach, as in Ippolito (2013), in which ESP shifts the perspective to the past, which also allows quantification over otherwise historically inaccessible worlds. I will argue that the two lines of reasoning exhibit largely complementary sets of problems: Iatridou (2000) provides a compositionally simple and transparent approach that explains the contrary-to-fact implicature of counterfactual sentences, but fails to account for the observed distribution and various interpretations of ESP. Ippolito (2013) accounts for the attested readings of ESP and rules out the unattested ones, but relies on complex assumptions about the syntax-semantics interface and does not directly provide an explanation of the implicatures of counterfactual sentences. Both approaches fail to predict the cross-linguistic variation we observe.

Kilu von Prince, kilu.von.prince@hu-berlin.de. Sprach- und literaturwissenschaftliche Fakultät, Institut für deutsche Sprache und Linguistik, Allgemeine Sprachwissenschaft, Humboldt-Universität zu Berlin, Dorotheenstraße 24, Raum 3.311, Berlin, Germany
Readers who are primarily interested in my proposal rather than the problem statement may jump directly to Sect. 4, where I will argue that a combination of insights from Iatridou (2000) and Ippolito (2013) can solve all the observed problems. From Iatridou (2000), I will take the idea of exclusive quantification over counterfactual worlds. Since Iatridou (2000) operates within a parallel-worlds framework that allows for only a binary distinction between the actual and non-actual (or counterfactual) worlds, exclusive quantification over non-actual worlds leads to an overgeneration of readings. But a modified version of the branching-time framework used by Ippolito (2013) allows for a three-way distinction between actual, possible and counterfactual indices. Exclusive quantification over counterfactual indices in such a tripartite structure allows for compositionally transparent, lexically precise definitions of TAM (tense, aspect, mood) markers such as ESP and correctly predicts the cross-linguistic variation we find. This three-way distinction of modal domains into the actual, the possible and the counterfactual is the main theoretical innovation of my approach. I will therefore refer to it as 3D modality, short for three domains of modality.
I will then discuss the truth conditions of counterfactual conditionals that derive from my assumptions and argue that they take a middle ground between two traditional extremes: While some authors have defended the position that conditionals do not have truth conditions at all, there is widespread agreement among linguists that counterfactual conditionals have vague truth conditions that can, in principle, be tested in the actual world. What follows from my assumptions is that counterfactuals do have vague truth conditions, which, however, can never be made true or false by observations in the actual world.
In Sect. 6, I proceed to show that the contrary-to-fact implicatures of counterfactual conditionals can be easily derived from my previous assumptions in combination with some basic considerations of pragmatic fitness of utterances relative to a Question Under Discussion. I will show that the 3D-modality approach correctly predicts some of the environments in which the implicature does not arise, including Anderson conditionals.
Sections 7.1 and 7.2 are not essential to the understanding of my proposal, but add some background and perspective. Section 7.1 briefly retraces the history of applying branching time to counterfactual conditionals and reflects on probable reasons why the particular proposal made here has not been considered before. In Sect. 7.2, I discuss the implications of 3D modality for the concept of counterfactuality and the classification of specific utterances, including polite questions containing would, future-oriented conditionals with would and indicative conditionals with contrary-to-fact implicatures.

The main empirical observations
The main correlation between past and counterfactuality in ESP, which has already been illustrated by the examples in (1), goes back at least to Jespersen (1931) and has been discussed many times since.
Less attention is typically paid to the meanings ESP cannot express. A clear definition of what I mean by counterfactuality will be given in the following sections. For our current purposes, I will consider all conditionals as counterfactual that contain would in the apodosis. The following examples illustrate the range of observations I will discuss. ESP can refer to the actual past:
(2) If Laura took the train this morning, she will arrive at 3 pm.
ESP can also refer to the future in conditionals with would in the apodosis, which I take to mean that it can refer to the counterfactual future:
(3) If Laura took the train tomorrow, she would arrive at 3 pm.
ESP cannot refer to the future in a conditional with will in the apodosis. I take this to mean that it cannot refer to the possible future.¹
(4) If Laura #took/ takes the train tomorrow, she will arrive at 3 pm.
ESP cannot refer to the past in a conditional with would in the apodosis. I take this to mean that it cannot refer to the counterfactual past:
(5) If Laura #took/ had taken the train yesterday, she would have arrived at 3 pm.
To refer to the counterfactual past, it is necessary to use the past perfect (see example (1-c)); at the same time, the English past perfect (EPP) can also be used with reference to the counterfactual future. This was first discussed by Iatridou (2000) and is most closely associated with the work of Ogihara (2000). I will explore it in more detail in Sect. 3.

(6)
Martha arrived in Paris yesterday. If she had arrived there TOMORROW, she would have missed the Fête de la Musique.
As I will argue in more detail in Sect. 3, previous approaches to past-and-counterfactuality markers suffer from a potential overgeneration of interpretations by not ruling out a reference to possible futures and the counterfactual past, and, in some cases, to the actual present. One might suspect that pragmatic principles of relevance and paradigmatic contrasts are responsible for those restrictions, but: (1) if so, no one has spelled out this option yet; and (2) the fact that past-and-counterfactuality markers in other languages do not have the same restrictions makes such a position much harder to maintain.

The Oceanic language Daakaka shows what a marker may look like that actually encodes a reference to anything but the actual present. The "distal" TAM clitic t can refer to the actual past, the counterfactual past and present, the possible future and the counterfactual future, depending on the environment (von Prince 2018). The Daakaka distal marker is used to express discontinuous past, similar to the English simple past in combination with stative predicates (Altshuler and […]). […] von Prince et al. (2018) have found that in future counterfactual conditionals, the potential marker is preferred in the apodosis. The distal marker can still occur in the protasis of the conditional. The following example is from a storyboard-based elicitation, in which one speaker asks the other one if he will play volleyball the next day. He says that he will not because he hurt his hand, and he goes on to say: […]

The Daakaka distal marker thus behaves like ESP with respect to criteria (1a) and (1b), but unlike ESP with respect to criteria (2a) and (2b), in the list of criteria given towards the end of this section.
Similar facts have been reported for other expressions cross-linguistically, including the TAM marker kua in Faka'uvea (Moyse-Faurie 2002), the transitional aspect in Cèmuhî (Rivierre 1980) and the TAM marker tō in Mwotlap (François 2003). Except for the Daakaka distal marker, however, none of these expressions has been investigated in sufficient detail to allow for a definitive comparison.
These observations only serve to show that the restrictions we find for ESP are in need of an explanation, because they do not hold for past-and-counterfactual markers in other languages. The problem has also been stated concisely by Schulz (2007):

[…] English is not the only language showing non-temporal uses of its past tense marker. It is rather a phenomenon that can be observed in languages from quite different families. But while there is a certain similarity between the contexts in which these languages employ this marker, there are also language specific differences. In order to account for the general meaning of the simple past in English a proponent of the past-as-unreal [i. e. remoteness-based] hypothesis has to give a description of this semantic property that singles out those and only those uses made of ESP. This is clearly something notions like "distance from reality" and "non-actuality" etc. cannot achieve. (Schulz 2007: 178)

The solution by Schulz (2007) is to give up on finding a single definition of ESP that accounts both for its actual past and counterfactual references and treat it as an item that is ambiguous between two different meanings.²

² In the words of the author: We assume that the morphological category of the simple past is ambiguous and expresses two different syntactic feature combinations: either it asks for the past tense operator PAST or for the mood operator SUBJ. If the simple past is interpreted as mood feature, then the verb also carries a [-pres] feature. Hence, the subjunctive obligatory combines with the present tense. A similar ambiguity is also proposed for the syntactic perfect. The auxiliary have is either interpreted as the perfect operator or selects for the counterfactual mood.
In this article, I pursue the goal of finding a definition that does account for both uses, while simultaneously excluding non-attested readings.
Another fact that any theory of counterfactual conditionals has to account for is their very counterfactuality. In brief, the pragmatically most salient feature of counterfactual clauses is the inference that their prejacent is not true in the actual world:
(13) If Martha had watered the flowers, they would have survived.
⇝ Martha didn't water the flowers, they did not survive.
The following, widely cited example comes from Anderson (1951):
(14) If Jones had taken arsenic, he would have shown just exactly those symptoms which he does in fact show.
Regardless of examples such as (14), in most situations, counterfactual conditionals are infelicitous if their prejacent is known or very likely to be true (compare e. g. Starr 2014).
(15) Tracy ran the marathon. #If Tracy had run, Sharlene would have run too.
Any approach to past-and-counterfactuality markers should be able to derive these felicity conditions and the contrary-to-fact implicature. Finally, an ideal approach to the semantics of ESP would allow for a straightforward derivation of the meaning of a sentence from the definitions of its lexemes and basic compositional principles. The following list summarizes the observations that a theory of past-and-counterfactuality marking should ideally account for:
1. ESP can express: (a) reference to the actual past (1-a); (b) reference to the counterfactual future (1-b);
2. ESP cannot express: (a) reference to the possible future (4); (b) reference to the counterfactual past (5);
3. EPP can express (among other things): (a) reference to the counterfactual past (1-c); (b) reference to the counterfactual future (6);
4. Counterfactual conditionals come with the implicature that their prejacent is not true in the actual world (13) and are infelicitous in contexts where this implicature is in conflict with the common ground (15).
5. Past-and-counterfactuality markers differ cross-linguistically in whether they can also refer to domains such as the counterfactual past and possible future.
6. Sentence meanings should derive compositionally and transparently from basic definitions and observable structures.

Footnote 2 continued: In the second case it does not carry a tense feature like the simple past. The counterfactual mood is only realized if some other past tense marking in the sentence asks for the subjunctive mood. (Schulz 2007: 205)
In the following section, I will argue that previous approaches to the relation between past and counterfactuality face problems with various subsets of the above goals.
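As a compact restatement of observations 1, 2 and 5, the reported readings can be written down as a small table and compared mechanically. The following Python sketch is only an illustration: the dictionary layout and the names DOMAINS, ESP, DAAKAKA_DISTAL and differences are assumptions made here, not part of any proposal discussed in this article, and the truth values merely transcribe what this section reports for ESP and for the Daakaka distal marker.

```python
# Illustrative only: the four temporal-modal domains named in this section.
DOMAINS = (
    "actual past",
    "counterfactual past",
    "possible future",
    "counterfactual future",
)

# Readings as reported in this section (True = attested, False = unavailable).
# Observation 1 and 2: English simple past.
ESP = {
    "actual past": True,             # (1-a)
    "counterfactual past": False,    # (5)
    "possible future": False,        # (4)
    "counterfactual future": True,   # (1-b)
}

# The Daakaka distal marker, as described in the text (von Prince 2018).
DAAKAKA_DISTAL = {
    "actual past": True,
    "counterfactual past": True,
    "possible future": True,
    "counterfactual future": True,
}

def differences(marker_a, marker_b):
    """Return the domains on which two markers differ (hypothetical helper)."""
    return [d for d in DOMAINS if marker_a[d] != marker_b[d]]

# The domains in which the two markers come apart (observation 5).
print(differences(ESP, DAAKAKA_DISTAL))
```

The two markers differ exactly in the domains named in criteria (2a) and (2b), which is the variation that a unified definition of ESP has to leave room for.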

The previous discourse on the connection between past and counterfactuality
The broad and varied literature on past-and-counterfactual markers can roughly be sorted into two main approaches:
1. Expressions that encode both past and counterfactuality essentially express remoteness from the actual present (remoteness approach).
2. In counterfactual contexts, the past marker causes a perspective shift to the past, from which hypotheses about the future can be entertained (back-shifting approach).³
In this section, I will retrace the development of both and illustrate each with one representative example.
In trying to explain this relation, most of the earlier accounts converge on some version of the remoteness approach. As mentioned above, the main idea behind this approach is that past and counterfactuality share a semantic core of distance from the actual present. In this section, I will retrace the development of this line of reasoning and show how it overgenerates potential readings of ESP.
The remoteness approach was intuited early by Joos (1964), Steele (1975) and Langacker (1978), and spelled out in detail in Fleischman (1989): essentially, this approach suggests, both the past and counterfactuality are removed from the actual present. Fleischman (1989) proposes that the counterfactual interpretations of past markers are metaphorical extensions of their temporal meanings (see also Isard 1974; Lyons 1977) and claims that the basic metaphor that links tense and modality is distance. Under this approach, however, it is not clear why future events and counterfactual past events should be covered by the same form in some languages but not in others. This overgeneration of potential interpretations has been noted and criticized early on by Givón (1994: 317).

Iatridou (2000) picks up the essential intuition by Steele (1975) and Fleischman (1989) and proposes to overcome the vagueness of previous proposals by formalizing a definition of ESP that covers both its modal and its temporal uses in the form of the Exclusion Feature. The Exclusion Feature is defined in terms of a variable x that can range either over times or over worlds. It determines that an utterance may refer to the same world as the world of utterance, but in this case, it cannot refer to the time of utterance. Or it can refer to the time of utterance, but in this case, it cannot refer to the world of utterance.
While my proposal is very close in spirit and deeply indebted to Iatridou (2000), it is also meant to overcome some of the problems it faces. I will discuss how Iatridou (2000) relates to the following five observations from above:
(1b) ESP can express reference to the counterfactual future;
(2a) ESP cannot express reference to the possible future;
(2b) ESP cannot express reference to the counterfactual past;
(5) Past-and-counterfactuality markers differ cross-linguistically in whether they can also refer to domains such as the counterfactual past and possible future;
(3b) EPP can express reference to the counterfactual future.

On the question of future reference, Iatridou (2000) states:

I will follow Palmer (1986), Vlach (1993), Kamp and Reyle (1993), and many others in treating tense as only past or present and woll as modal. It follows, then, that [the topic time excluding the utterance time] means that the topic time is in the past with respect to the utterance time. (Iatridou 2000: 246)

At first glance, it seems that this statement is successful in ruling out a reference of ESP to the possible future. On second thought, however, the situation appears more complicated. The following two stipulations are apparently expressed by the quoted passage:
(16) ESP can only effect a shift in worlds or times, but not both simultaneously.
(17) Future indices are not included in the world of reference.
The following additional assumption appears to be quite unavoidable:
(18) […]
The combination of (16) and (17) succeeds in ruling out reference to possible futures, in accordance with (2a). But if one accepts (18), then the combination of these three hypotheses would also rule out a reference of ESP to counterfactual futures and therefore contradict our very basic observation (1b). For Iatridou (2000) to be compatible with all the observations discussed here, one would have to give up hypothesis (18). While this is generally a logical possibility, it is not a very intuitive one and would need scrupulous exploration. Moreover, it is not clear under the assumptions by Iatridou (2000) how we would accommodate the cross-linguistic variation we find. The fact that ESP cannot refer to possible futures is not a general property of past-and-counterfactuality markers cross-linguistically, and it is not clear to me how this observation relates to the statement quoted above.
Later motivations for abandoning parts of the proposal by Iatridou (2000) come from observations about counterfactuals with EPP and future reference as in (6), repeated below:
(6) Martha arrived in Paris yesterday. If she had arrived there TOMORROW, she would have missed the Fête de la Musique.
According to Iatridou (2000), a counterfactual clause with a past perfect tense in the protasis has two layers of past as in If Martha had arrived earlier, she would have met Laura; only one of those layers can be interpreted as referencing a non-actual world. The second layer is then necessarily taken to encode temporal distance from the present, resulting in a past reference. Therefore, counterfactuals with a past perfect tense in the protasis should always refer to the counterfactual past. Iatridou (2000: 252, footnote 26) states this as a puzzle that has to remain unsolved under her initial proposal. It has later been taken up by Ogihara (2000), Ippolito (2003), Arregui (2007), Ippolito (2013) and others.

Later work in the remoteness-based tradition includes Nevins (2002), Schlenker (2004), Karawani and Zeijlstra (2013) and Schulz (2014). These works are, however, not primarily concerned with deriving the distributional and interpretational restrictions we find for ESP. Before closing this section, I would like to point out that, despite the problems discussed above, Iatridou (2000) successfully addresses and derives the contrary-to-fact implicature of counterfactual conditionals. We will see in the coming sections that this is not the case for some later approaches.

The back-shifting approaches
Much of the subsequent work on ESP has moved away from a remoteness-based approach and toward a back-shifting approach. Dudman (1983) and Dudman (1984) are often credited as the first accounts of this line of reasoning. The central idea is that in combination with would in the apodosis, a simple past marker causes a backward shift to a point in the past from which we can quantify forward over possible developments, including those that are no longer accessible from the present perspective. This idea is illustrated by Fig. 1. It is independent from the choice between a parallel-worlds framework (Romero 2014) and a branching-time framework (Ippolito 2013).

Fig. 1 In back-shifting approaches, the past tense morphology is thought to push one's perspective back in time so that developments that are no longer possible become historically accessible. Left: parallel worlds; right: branching time

Romero (2014: 48) has brought forward a point of criticism that generally applies to this line of reasoning:

According to the temporal remoteness [back-shifting] line, past tense morphology uniformly expresses temporal precedence, but this morphology may be interpreted outside the syntactic structure where it is found, i. e., outside the if-clause in our case; it is this mismatch between surface position and interpretation site that deceivingly gives the impression that the additional tense layer is fake (Dudman 1983, Arregui 2009, Grønn and von Stechow 2009; see also Ippolito 2003).
In other words, this line of reasoning relies on complex assumptions about the syntax-semantics interface and cannot derive the intended meaning from the surface structure. The main goal of Romero (2014) is to find a plausible solution to this problem, while maintaining the basic assumption about temporal back-shifting.
In addition to the apparent mismatch between form and meaning that is basic to back-shifting approaches, they also share the essential challenge faced by the remoteness-based accounts: They are either too loose or too restrictive to account for the full range of attested references of ESP and related markers from other languages.
One back-shifting approach that is very close in spirit to my proposal and also quite similar to it in its reliance on branching time is represented by Ippolito (2003, 2006, 2013). I will in particular take a closer look at Ippolito (2013) for the remainder of this section.
The approach by Ippolito (2013) is crucially motivated by the observation by Iatridou (2000) that counterfactuals with a past perfect in the antecedent can refer to the (counterfactual) future, as illustrated above in (6), which remains an unsolved puzzle under the approach of Iatridou (2000). The first one to pick up this puzzle was Ogihara (2000). Ippolito (2013) goes against Ogihara (2000) in asserting that this observation cannot be accounted for purely in terms of a contrastive focus on temporal adverbials. Ippolito (2013) does take into account the overgeneration of readings that earlier approaches suffer from and that had previously been pointed out by Schulz (2007).
One potential problem that Ippolito (2013) addresses explicitly, in contrast, for example, to Romero (2014), is the missing counterfactual past reading for counterfactual conditionals with ESP in the protasis. Ippolito (2013) states that the past form in the protasis of the conditional is already used to shift back the time of historical accessibility. It cannot simultaneously determine the time during which the relevant event takes place. Why the past feature is spelled out on the main verb of the protasis remains an open question in this scenario. Also, this account does not sit too well with the observation that, in some languages, a single past marker can apparently do both: shift back the point of accessibility and locate the time of the event described in the protasis in the past. So the explanation by Ippolito (2013) rests on idiosyncratic and language-specific assumptions about ESP. The same could be said about the solution that I offer myself, although in 3D modality, the relation between ESP and similar items from other languages that do not have the same restriction would be more straightforward to define.⁴

Ippolito (2013) also manages to exclude the use of counterfactual ESP and EPP in the protasis with will in the apodosis, by stipulating that will is just the spell-out of an abstract underlying form woll when in the scope of a present tense, whereas woll is spelled out as would when in the scope of a past tense (going back to Abusch 1988; also assumed by Iatridou 2000).
Ippolito (2013) does not provide a clear explanation for why counterfactual conditionals are often not felicitous in situations where indicative conditionals can be used. Consider (19):
(19) a. I'm quite sure that Amaya took the train.
b. If she took / did take the train, she will arrive at 3 pm.
c. #If she had taken the train, she would arrive at 3 pm.⁵
Ippolito (2013) accounts for why counterfactuals are felicitous in situations where indicative conditionals fail. And she offers an explanation for why EPP counterfactual conditionals are good in situations where ESP counterfactual conditionals fail. But she does not predict, or explain, the infelicity of counterfactuals in situations such as (15). In contrast to Iatridou (2000), in Ippolito (2013) counterfactual conditionals are quantifications over both actual / possible and counterfactual indices; it is therefore not clear how the contrary-to-fact interpretation is derived. The closely related approach in Ippolito (2003) relies on Maximize Presupposition to derive the felicity conditions of counterfactuals, but Leahy (2018) points out two problems with this solution: Firstly, it cannot generate the contrary-to-fact implicature as new information; secondly, as earlier pointed out by Leahy and Romero (2010), "Ippolito's derivation seems not to enable the conclusion that the antecedent is false, but that the antecedent suffers presupposition failure." (Leahy 2018: 9)

Finally, the criticism by Romero (2014) against the general intransparency of back-shifting approaches also applies to Ippolito (2013), who freely admits that her proposal rests on complex assumptions about the syntax-semantics interface and does not fully resolve all mismatches.

⁴ To wit, compare the definition of ESP that I will suggest further on with a hypothetical past marker that behaves like ESP except that it also includes the counterfactual past: […] By contrast, saying that ESP can only shift back either the time of historical accessibility or the event time, but not both, while PAST$_1$ can do both, appears hard to formalize under the proposal by Ippolito (2013). A reviewer points out that the missing interpretation of ESP could instead be derived by its paradigmatic contrast with EPP. In my view, the assumption of a blocking effect should be motivated by the observation that under specific circumstances, the missing interpretation is still available. But as far as I can tell, ESP can never refer to the counterfactual past. Of course, it still remains a logical possibility.

⁵ Note that the relevant conditional here is the EPP conditional rather than the ESP version If she took the train, she would arrive at 3 pm, because we assume that the hypothetical train-taking event is located in the past, and counterfactual ESP cannot refer to the past.

Summary
In this section, I have discussed previous approaches to the connection between past and counterfactuality and the meaning of ESP. I have then assessed two concrete proposals with respect to how well they can handle the observations in Sect. 2.
We have seen that Iatridou (2000) is a compositionally transparent, straightforward approach that accounts for both the observed reference to the actual past, and to the counterfactual present and future. By quantifying exclusively over counterfactual worlds, it also provides an explanation for the contrary-to-fact implicature. But it is not clear that it solves the problem of overgenerating unattested references of ESP to the counterfactual past and the possible future; and it does not address the reference of EPP to the counterfactual future.
On the other side of the spectrum, Ippolito (2013) successfully rules out unacceptable uses of ESP and EPP. However, this approach requires highly involved assumptions about the syntax-semantics interface, is not easily compatible with the cross-linguistic variation in past-and-counterfactuality markers, and it does not fully predict the implicature that the prejacent of a counterfactual should be false in the actual world. Table 1 summarizes these differences between the two approaches with reference to the goals set in Sect. 2.
There are a number of other proposals that attempt a unified approach to the actual-past and counterfactual-present/-future uses of ESP, such as Grønn and von Stechow (2009), Karawani and Zeijlstra (2013), Karawani (2014) and Bjorkman (2015), to which I cannot do full justice in this paper. As far as I can assess, however, they all fall somewhere on the spectrum between these two positions. My work is particularly indebted to Condoravdi (2002), which incorporates elements from the remoteness-based approaches as well as the back-shifting approaches, although it is not primarily concerned with ESP. I recommend Schulz (2007: 169ff.) for a detailed discussion of Condoravdi (2002) and other proposals, where some of the same problems are diagnosed systematically. The proposal I will introduce in the coming sections is closer to the remoteness-based approaches of Iatridou (2000) and others than to the back-shifting approaches in that it will derive the various interpretations of ESP via its definition rather than through syntactic movement.
Before concluding this section, I should comment on the role of aspect in expressing counterfactuality. Aspect has long been known to be deeply involved with modality (compare e. g. Dowty 1977 and references therein). A large body of literature addresses the interaction of the perfective / imperfective distinction and counterfactuality. This interaction appears to be more important for some languages such as Greek (Iatridou 2000) and Romance (Hacquard 2006, 2009) than for others such as Russian (Grønn 2013). But for English, too, this distinction has been argued to play a crucial role in the expression of counterfactuality, most prominently by Arregui (2005, 2007, 2009). Two central observations to this body of work are that, firstly, would-conditionals without EPP in the protasis are much worse in a context such as (20) […]

Arregui proposes that the relevant difference between the two cases is aspectual. Ippolito (2013) argues that the difference is that in the case of (20), the presuppositions that are necessary for the prejacent of the conditional are not true, while in (21), the prejacent itself is negated. The initial account by Ogihara (2000) suggests that the relevant difference is in the focus on a temporal adverbial in (20). My impression is that the only clear-cut cases where EPP is required to refer to a counterfactual future involve both some event of dying and focus on a temporal adverbial, so I find it hard to take a definitive stand in the debate on empirical grounds. I do not, however, share the central assumption by Arregui that the ESP version of (20) is not counterfactual. And my approach is compatible with the proposal by Ippolito (2013) that an EPP counterfactual is needed when the presupposition of its prejacent is false in the actual world.
In Romance, Greek and some other languages, the perfectivity distinction plays a much more obvious role in counterfactuals than it does in English. My proposal does not contradict those findings. It just suggests that different languages might have developed different means of accessing the counterfactual. English uses past tense, but other languages might require imperfective aspect in combination with past tense or other means. While a comprehensive review of cross-linguistic strategies is beyond the scope of this paper, I will comment briefly on the apparently widespread combination of imperfective aspect and past tense. I conceive of perfective expressions as treating indices as atomic and zero-dimensional, and describing events as atomic entities. Imperfective aspect, by contrast, treats indices as intervals; in effect, imperfective aspect creates a two-dimensional smudge from an index, which then covers both the modal and the temporal dimension. So, in my mind, imperfective aspect can grant access to non-actual worlds by smudging indices. This intuition is inspired heavily by Dowty (1977), who has spelled this out in some detail. In those languages where the reference of past tense expressions does not extend to counterfactual branches, the only way to access the counterfactual domain may be to combine past tense with imperfective aspect. I will not be able to argue exhaustively for this position here. This short excursion is just meant to illustrate that the 3D-modality approach can in principle be extended to other phenomena and languages.

Branching time
Like Ippolito (2013), many linguists have used a branching-time framework to formalize the relation between tense and modality (e.g. Condoravdi 2002; Kaufmann 2005b; Arregui 2009; Laca 2012). In this section I will introduce the main ideas and explain how giving up one of the original assumptions by Thomason (1970) will allow us to come up with a definition of ESP that combines the strengths of Iatridou (2000) with those of Ippolito (2013).
The original motivation behind the branching-time framework, as envisioned by Meredith and Prior (1956) and Prior (1957, 1967) and spelled out by Thomason (1970, 1984), is a philosophical one. It is meant to account for puzzling intuitions about historical necessity. Going back to ancient Greek thinkers such as Aristotle and Diodorus Cronus, the notion of historical necessity addresses the asymmetry between statements about the past and statements about the future. In brief, statements about the future have a certain chance of being true or false. By contrast, true statements about the past are true by necessity, according to Thomason (1970, 1984) and others. This asymmetry is captured by a branching-time framework. The formal definition for this framework is taken from Thomason (1984). I recommend Rumberg (2016) for an overview of branching time in modal and temporal logic.
Definition 1 A branching-time frame $U$ is a pair $\langle I, < \rangle$, where
1. $I$ is a non-empty set of indices $i$;
2. $<$ is an ordering on $I$ such that if $i_1 < i$ and $i_2 < i$, then either $i_1 = i_2$, or $i_1 < i_2$, or $i_2 < i_1$.
All indices have a common predecessor. A branch through any i ∈ I is a maximal linearly ordered subset of I containing i.
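The conditions of Definition 1 can be checked mechanically on a small finite frame. The following is a minimal sketch, assuming a hypothetical toy tree (the index names and the `PARENT` encoding are illustrative and not taken from any figure of this paper) in which each index is stored with its immediate predecessor. It tests clause 2 and enumerates the branches through an index as root-to-leaf paths.

```python
from itertools import combinations

# Hypothetical toy frame (an assumption for illustration): each index maps to
# its immediate predecessor; the root has none.
PARENT = {"r": None, "a": "r", "y": "r", "c": "a", "x": "a", "f1": "c", "f2": "c"}

def precedes(i, j, parent=PARENT):
    """True if i is a proper predecessor of j, i.e. i < j."""
    k = parent[j]
    while k is not None:
        if k == i:
            return True
        k = parent[k]
    return False

def satisfies_clause_2(indices, lt):
    """Clause 2 of Definition 1: any two distinct predecessors of an index are ordered."""
    for i in indices:
        preds = [j for j in indices if lt(j, i)]
        if any(not (lt(a, b) or lt(b, a)) for a, b in combinations(preds, 2)):
            return False
    return True

def branches_through(i, parent=PARENT):
    """Branches containing i; in a finite tree these are the root-to-leaf paths."""
    leaves = [n for n in parent if n not in parent.values()]
    result = []
    for leaf in leaves:
        path, k = [], leaf
        while k is not None:
            path.append(k)
            k = parent[k]
        if i in path:
            result.append(path[::-1])
    return result

# A "diamond" in which s has two unordered predecessors violates clause 2.
DIAMOND = {("p", "s"), ("q", "s")}
print(satisfies_clause_2(list(PARENT), precedes))
print(satisfies_clause_2(["p", "q", "s"], lambda a, b: (a, b) in DIAMOND))
print(branches_through("c"))
```

The diamond-shaped order fails the check because two incomparable indices share a successor, which is exactly the configuration that a backward-linear frame rules out.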
This partial ordering relation creates a tree structure as shown in Fig. 2. It beautifully captures the intuition about historical necessity: looking forward, there may always be more than one possible continuation. But looking backward, there is only one line of developments that leads to where we are now. Thomason (1970, 1984) relies on logical constants such as the necessity operator instead of explicitly quantifying over indices with expressions such as ∀i : φ(i).ψ(i).
Therefore, to formalize the notion of historical necessity, he introduces the additional assumption that quantification over worlds is always restricted to those branches that are identical up to the present moment. This is an assumption I do not make. In the setup I propose here, actuality can be seen as a kind of necessity in that it can be formalized as a universal quantification, namely one that is restricted to the actual past and present. It is, however, not the only kind of necessity that can be modeled in a branching-time framework. Universal quantification over both the actual and counterfactual worlds is also possible. So is universal quantification over only counterfactual worlds.
I should add that the notion of historical necessity as such can still be implemented. It is of course still possible to model the asymmetry between the openness of the future and the necessity of hindsight with the proposed system: Looking forward, there are potentially many continuations of the present and we cannot single out one "real" future. But looking back, we can still uniquely identify one sequence of indices that precedes our present as our actual past. Quantification over branches can still be explicitly restricted to those branches that pass through the actual present.
Giving up the quantificational restriction opens up a new semantic space that differs crucially from all previous accounts in that it allows for a tripartite distinction between temporal-modal domains. In a parallel-worlds approach, there is only a binary distinction between the actual world and non-actual worlds. In a Thomason-style branching-time approach, there is only a binary distinction between actuality and future possibilities. But in an approach to branching time that does not assume the same restrictions, there is a three-way distinction between the actual (past and present), the counterfactual (past, present and future) and the possible (future). To show this difference between traditional and unrestricted branching time more clearly, consider the following case. If we assume with Thomason (1970, 1984) that quantification is restricted to branches that are identical up to the actual present, then, if $i_2$ is the actual present, we can only quantify over $b_3$, $b_4$. It is also possible to quantify over all six branches $b_1, \ldots, b_6$, if one shifts the perspective backwards to $i_1$ as in the back-shifting approaches that have been discussed above. However, it is not possible to quantify exclusively over $b_1$, $b_2$, $b_5$, $b_6$, because from $i_2$ they are not accessible at all, and from the perspective of $i_1$ the precedence relation cannot distinguish them from $b_3$ and $b_4$ (see footnote 7). By giving up this restriction, we can distinguish between and exclusively quantify over three modal domains (see footnote 8):
1. $i_c$ and predecessors of $i_c$ (the actual);
2. successors of $i_c$ (the possible);
3. indices that are neither successors nor predecessors of nor identical with $i_c$ (the counterfactual).
In contrast to previous setups, this more fine-grained temporal-modal space allows for the more precise lexical definitions that we need to avoid the overgeneration of interpretations for ESP, and to account for the cross-linguistic variation, all the while maintaining the intuition by Iatridou (2000) about exclusive quantification over counterfactual worlds.
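The three-way partition relies on the predecessor relation alone, which a few lines of code can make explicit. This is a sketch under the assumption of a hypothetical toy tree with `c` as the actual present $i_c$; the index names are illustrative only.

```python
# Hypothetical toy frame (an assumption for illustration): index -> immediate predecessor.
PARENT = {"r": None, "a": "r", "y": "r", "c": "a", "x": "a", "f1": "c", "f2": "c"}
I_C = "c"  # the index taken to be the actual present

def precedes(i, j):
    """True if i is a proper predecessor of j."""
    k = PARENT[j]
    while k is not None:
        if k == i:
            return True
        k = PARENT[k]
    return False

def domain(i, i_c=I_C):
    """Sort an index into one of the three modal domains relative to i_c."""
    if i == i_c or precedes(i, i_c):
        return "actual"          # i_c and its predecessors
    if precedes(i_c, i):
        return "possible"        # successors of i_c
    return "counterfactual"      # neither predecessor, successor, nor i_c itself

for index in PARENT:
    print(index, domain(index))
```

No time values are consulted here, so an index such as `y`, which branches off before the actual present, lands in the counterfactual domain regardless of when it is located.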
I also assume that indices from different branches can be sorted into groups of indices that qualify as simultaneous and that for any given pair of indices, it is possible to specify a temporal order between them. This means that only those branching-time structures that allow for a linear ordering of indices are candidates for the structure I assume (see also Schulz 2007 for similar concerns and Visser 2017 for a technical exploration of the problem).

Definition 2
1. Every index $i$ has a time value $t(i)$.
2. There is a strict linear order on time values, such that for every pair $t(i)$, $t(i')$ either $t(i) < t(i')$, or $t(i') < t(i)$, or $t(i) = t(i')$.

For all

7 I would like to stress here that, by quantifying exclusively over counterfactual indices, we do not imply anything about the actual world. We only say about counterfactual branches that they have a certain property X; we do not say, however, that only counterfactual branches have property X. If we only assert about counterfactual branches that they have a property X, we leave it open whether the actual world also has property X or not.

8 By saying these are three modal domains, I mean that their distinction is afforded by the predecessor relation alone, without recourse to an additional temporal order.
In the following section, I will propose concrete definitions for some expressions of English.

Definitions
I will show here how the assumptions in the previous section can be used for precise and simple definitions of English TAM expressions. I adopt the common assumption from tense semantics that the reference time of a sentence is represented as a temporal pronoun. TAM features place a presupposition on this temporal pronoun, as suggested by Partee (1973), Heim (1994), Abusch (1997) and Kratzer (1998) and beautifully modeled in a recent paper by Bochnak (2016). Let us start with the definition of ESP.
This will be abbreviated as: $\lambda p\,\lambda i : i \in I_{esp}\,.\,p(i)$. In words: ESP takes a proposition and an index argument, and asserts that the proposition is true for that index, under the condition that this index is (a) relevant and (b) either a predecessor of the actual present $i_c$, or later than or simultaneous with $i_c$ without being a successor of or identical with $i_c$. This definition accounts for the exclusion of ESP from reference to the possible future, the actual present and the counterfactual past, simply by lexical definition. Since we can account for these restrictions on a lexical level rather than an architectural level, in contrast to Ippolito (2013) and others, the cross-linguistic variation that we actually find with languages like Daakaka is fully expected.
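The verbal paraphrase of the ESP domain can be turned into a small executable check. The sketch below is an illustration only: the tree, the time values in `TIME`, and the choice to model the presupposition as a raised error are all assumptions made here, not part of the definition itself.

```python
# Hypothetical toy frame and time values (assumptions for illustration).
PARENT = {"r": None, "a": "r", "y": "r", "c": "a", "x": "a",
          "y1": "y", "f1": "c", "x1": "x"}
TIME = {"r": 0, "a": 1, "y": 1, "c": 2, "x": 2, "y1": 2, "f1": 3, "x1": 3}
I_C = "c"  # actual present

def precedes(i, j):
    """True if i is a proper predecessor of j."""
    k = PARENT[j]
    while k is not None:
        if k == i:
            return True
        k = PARENT[k]
    return False

def in_esp_domain(i, relevant=lambda i: True):
    """(a) relevant and (b) actual past, or counterfactual present/future."""
    actual_past = precedes(i, I_C)
    cf_nonpast = TIME[i] >= TIME[I_C] and i != I_C and not precedes(I_C, i)
    return relevant(i) and (actual_past or cf_nonpast)

def esp(p):
    """Sketch of: lambda p, lambda i : i in I_esp . p(i)."""
    def at(i):
        if not in_esp_domain(i):
            # Presupposition failure, modeled here as an error (an assumption).
            raise ValueError(f"index {i!r} is outside the ESP domain")
        return p(i)
    return at

print(sorted(i for i in PARENT if in_esp_domain(i)))
```

On this toy frame the domain excludes the actual present `c`, the possible future `f1` and the counterfactual past `y`, matching the three exclusions named in the text.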
Note that the definitions for English TAM expressions all include a variable of relevance for indices $R_I$ and sometimes branches $R_B$. This variable has a number of functions, including ensuring the well-known non-monotonicity of counterfactuals. I assume with Stanley and Gendler Szabó (2000) that the domain of quantification is always restricted to contextually relevant items. I believe that $R_B$, $R_I$ are determined dynamically and also include a measure of similarity to the actual world: worlds that differ from ours arbitrarily are not considered relevant. Thus, consider a scenario in which two speakers are locked in a room at the top of a high building and are contemplating ways to escape. One speaker may then say, felicitously:
(23) If we jumped out of the window, we would die from the fall.
In this scenario, we understand that they do not consider all logically possible worlds, including those in which gravity is extremely weak, or in which guardian angels are bound to pluck them out of the air. By contrast, imagine the speakers are contemplating what they would do if they had superpowers such as flying. In this case, the utterance of (23) would seem weird, because we would evaluate the sentence relative to the counterfactual worlds already under consideration, which include superpowers.
At this point I would like to address the concern of one reviewer about the compatibility of this framework with traditional approaches to modal flavors and ordering sources. It is generally easily possible to intersect the domain of quantification over indices with those indices that are epistemically or otherwise accessible and to order branches or indices according to the number of propositions that are compatible with a given set of rules, wishes or similar. In this respect, the framework proposed here is fully commensurate with most traditional approaches to modal semantics (see footnote 9).
Turning to the meaning of further expressions of English, I stipulate that the definition of would is as follows:
When you compare this definition of would with the definition of ESP above, you will find that it is almost identical, except that (a) would cannot refer to the actual past; and (b) would contains a universal quantifier over branches. This last property ensures that would is excluded in the protasis of a counterfactual clause. As we will see shortly, if requires a proposition of type $\langle s, t \rangle$ as its first argument, and a proposition of type $t$ as its second argument. Since would yields type $t$, it is not eligible for the protasis of a conditional clause. The only TAM element of English that can then step in to refer to counterfactual indices is ESP (see footnote 11). The range of both expressions is illustrated in Fig. 4.
We will see below how these assumptions allow us to understand why the counterfactual meaning of ESP is only available in combination with certain expressions like would or wish.
9 Instead of writing, for example, {b|b ∈ R, ∃i ∈ b.φ(i)} for the protasis of a conditional, we might refer to {b| f ≈B 0 (φ)(b)} (compare derivations below). I assume that intersecting $R_B$ with the result of f ≈B 0 (φ)(b) would generally yield a subset of the latter. The details of the implementation would, of course, depend on the intended goals and the assumptions of the corresponding framework. The part of my proposal that interestingly differs from others here consists in how φ is spelled out, not in how $R_B$, $R_I$ are spelled out.

10 I assume that the variable of relevance $R_B$ results from an intersection of contextually relevant branches with the temporal-modal domain of the expression it occurs in, to the extent that this is necessary to avoid vacuously false statements.

11 One option to derive the difference between varieties of English that allow would in the protasis of a conditional and those that do not would be to assume two different entries of would that differ in their semantic type. As we will see below, I assume that if takes an expression of type $\langle s, t \rangle$ as its first argument, and an expression of type $t$ as its second. An $\langle s, t \rangle$ version of would is given below: The corresponding $t$-type version would be derived by existential closure, which I assume, in the context of indices, comes with a default universal quantification over branches.

For the perfect, I follow the approach of Reichenbach (1947), Klein (1994) and others, because it is the easiest to integrate into the framework developed here. A Reichenbachian definition of perfect is given below:
This definition of the perfect aspect ensures that the event time is prior to the reference time.
Whether a Reichenbachian approach to perfect can successfully derive all its attested interpretations, especially in the context of English present perfect, is a matter of current debate (Grønn and von Stechow, to appear). Other approaches view the perfect as indicating that the result-state of an event holds at present (Kamp and Reyle 1983), or suggest that the perfect creates an extended now, such that the present moment is made into an interval that includes prior moments (Dowty 1979). It is not my intention to decide between these different approaches. The only effect of the perfect that is relevant for the discussion at hand is its potential to specify that an event has taken place prior to the reference time (or prior to the end of the reference interval). This is implied by all three lines of approaches to the perfect. The definition in (25) does not require any additional assumptions on my part and is therefore the most trivial to integrate into this framework. In contrast to Iatridou (2000) and Ippolito (2013), I therefore do not treat the past perfect as instantiating two layers of past tense, but as a transparent combination of perfect, which is here treated as a relative tense, and past tense.
The final ingredient that we need before we can demonstrate a derivation of the meaning of a counterfactual conditional is English if.
Apart from the assumption that if is semantically vacuous (e.g. Kratzer 1991 and others), there are two basic intuitions about its meaning. One intuition has been explored, among others, by von Fintel (1997, 1999a, 2001) and von Fintel and Iatridou (2002). In the terms of the proposed framework, this intuition says that if takes two sets of branches and asserts that one set of branches is a subset of another set of branches:
(26) The meaning of if (first version):

φ(i)}, that is, the set of those contextually relevant branches that contain an index for which φ is true.
Another intuition is that the antecedent of a conditional clause is a topic. Haiman (1978) was the first to note that conditionals are marked like topics in a number of typologically unrelated languages (also compare Iatridou 2013: 134-137). Biscuit conditionals such as If you're hungry, there's biscuits in the pantry have been fruitfully analyzed as involving a topical if-clause; Hinterwimmer et al. (2008) argue that the same analysis can also be applied to indicative conditionals more generally. In my approach, a topic-version of if has to have a different setup from the definition in (26). Crucially, it is a function that takes only one argument of type $\langle s, t \rangle$ and one argument of type $t$ rather than two arguments of type $\langle s, t \rangle$. Furthermore, the topical if is an information-structural function. I will define it using the conventions of structured propositions, where $\langle \alpha, \beta \rangle$ is an ordered set such that $\alpha$ is the topic and $\beta$ is the comment of an utterance (Krifka 2001).
In (29), the frame topic large fish explicitly restricts the scope of the following proposition: the speaker does not commit to catfish being their favorite thing in the world, but only to preferring catfish over other big fish. This meaning can be represented as in (31):

fish(x)}, ∀y : favorite(I)(y).catfish(y)
For conditional clauses, I suggest the following logical form:
(32) The meaning of if (second and final version):
Here, $q$ is a proposition of type $t$ as in $\forall b \in R_B .\, \exists i .\, q(i)$ or $\exists b \in R_B .\, \exists i .\, q(i)$. Read: within the set of relevant branches such that $p$ is true, all or some branches contain an index such that $q$ is true. This definition is truth-conditionally identical to (26). The two definitions only differ in how if combines with the rest of the clause. I choose the second version here, because only this one allows me to make sure would is excluded from the protasis of a conditional in standard varieties of English. This approach is also better equipped to handle modal auxiliaries such as might in the apodosis, where, in the simplest scenario, the universal quantifier of would is replaced by an existential one. Note also that if does not do a lot of work here. It makes the relation between two clauses specific, but the topic-comment relation it spells out is one that can very frequently be found between juxtaposed clauses. It might therefore not be too surprising that the same meaning can also be expressed without if as in Had Laura taken the train, she would have arrived on time. This would seem to dovetail nicely with the approach by Iatridou and Embick (1994) on inverse conditionals. In some languages, including Mandarin Chinese, no specific complementizer or word order is needed to express a conditional clause (Comrie 1986). This, too, is not unexpected under the assumption that the job of if is a fairly light one.

Derivations
With these definitions in place, we can proceed to derive the meaning of a counterfactual conditional. The syntactic representation is given in Fig. 5. The syntactic labels are merely meant for better orientation and do not constitute a commitment to a particular set of assumptions about syntactic structures. My only commitment is to the structural relations between nodes. In each step, meanings combine via Functional Application as defined in Heim and Kratzer (1998). Let us apply these definitions and derivations to a concrete example.
(33) (A heavy rainstorm is sweeping through the city.) If Margo went outside (now/ in the near future), she would get soaked.
According to my assumptions so far, this sentence is true if all the relevant branches containing a counterfactual present or future index where Margo goes outside also contain a counterfactual present or future index where she gets soaked. The toy model in Fig. 6 shows a scenario in which the sentence would be true: all φ branches are also ψ branches. Remember that it is part of the apodosis ψ that the indices we are talking about are counterfactual. Therefore, there can be no ψ indices that are successors of the actual present $i_c$ (assuming that ψ includes the specification that it is a property of counterfactual indices). The assumptions I have made so far account for the observations stated in Sect. 2: they explain why ESP can refer to the actual past and to the counterfactual present and future, and why it cannot express reference to the actual or possible present, to the possible future or to the counterfactual past. I will say more about the contrary-to-fact implicature below in this section and in Sect. 6.

Fig. 6 A toy model for a counterfactual clause such as (33); big circles: indices where the protasis is true, φ(i); big solid dots: indices where the apodosis is true, ψ(i)
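The truth conditions stated for (33) can be evaluated on a small model. The following sketch rests on several assumptions that are not part of the text: a hypothetical tree with `c` as the actual present, hand-picked valuations for the protasis and apodosis, and a single function `conditional` that folds the contribution of if (restricting the relevant branches) together with that of would (universal quantification) or might (existential quantification).

```python
# Hypothetical toy model (an assumption for illustration).
PARENT = {"r": None, "a": "r", "c": "a", "f1": "c",
          "x": "a", "xa": "x", "xb": "x", "y": "r", "y1": "y"}
TIME = {"r": 0, "a": 1, "y": 1, "c": 2, "x": 2, "y1": 2, "f1": 3, "xa": 3, "xb": 3}
I_C = "c"  # actual present

def precedes(i, j):
    """True if i is a proper predecessor of j."""
    k = PARENT[j]
    while k is not None:
        if k == i:
            return True
        k = PARENT[k]
    return False

def branches():
    """All branches of the finite tree, as root-to-leaf paths."""
    leaves = [n for n in PARENT if n not in PARENT.values()]
    paths = []
    for leaf in leaves:
        path, k = [], leaf
        while k is not None:
            path.append(k)
            k = PARENT[k]
        paths.append(path[::-1])
    return paths

def cf_nonpast(i):
    """Counterfactual present or future index."""
    return TIME[i] >= TIME[I_C] and i != I_C and not precedes(I_C, i)

def conditional(p, q, quantifier=all, relevant=None):
    """Among relevant branches with a counterfactual p-index, all/some have a q-index."""
    r_b = branches() if relevant is None else relevant
    topic = [b for b in r_b if any(cf_nonpast(i) and p(i) for i in b)]
    # Note: with quantifier=all, an empty topic makes the statement vacuously true.
    return quantifier(any(cf_nonpast(i) and q(i) for i in b) for b in topic)

goes_outside = lambda i: i == "x"
soaked_on_every_branch = lambda i: i in {"xa", "xb"}
soaked_on_one_branch = lambda i: i == "xa"

print(conditional(goes_outside, soaked_on_every_branch))                 # would
print(conditional(goes_outside, soaked_on_one_branch))                   # would
print(conditional(goes_outside, soaked_on_one_branch, quantifier=any))   # might
```

The second and third calls differ only in the quantifier, which mirrors the suggestion that might replaces the universal quantifier of would with an existential one. Nothing in the evaluation inspects the actual branch through `c`, so the outcome is independent of what holds there.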
Note that there are several instances of the variable $R_B$ in the tree above. It is legitimate to ask whether $R_B$ is determined by both parts of the conditional separately and then somehow combined, or whether only one set of relevant branches is determined for the entire conditional sentence. Intuitively, I would assume that $R_B$ is determined only once per sentence and fed into the derivation by whatever mechanism one prefers for quantifier-domain restrictors in general. The example is analogous to a case such as Among the students who consistently did their homework, everyone got a high score. Here, it is clear that both the students and everyone will probably not refer to all the students on the planet. Depending on the context, the speaker will only be talking about the students in her latest semantics class, for example. It seems intuitive that the scope of both the students and everyone is subject to the same discourse-level restriction. Accordingly, I suggest that $R_B$, too, is determined for the entire sentence.
Before concluding this section, I will present the derivation of a counterfactual clause with EPP in the protasis and highlight the way in which it contrasts with counterfactuals that only have a simple past form in the protasis.
As stated above, EPP ensures that the event index is a predecessor of the reference index. In a counterfactual conditional, the reference index is in the counterfactual present or future. A predecessor of a counterfactual future index may itself be in the actual past. So let me sketch very briefly why conditional sentences with would have do not refer to the actual past. The entire sentence either has to be about only actual or possible indices, or only about counterfactual ones. In expressing an indicative conditional about the past, would have competes with ESP. And since ESP is the morphologically and compositionally simplest way to express a reference to the actual past, this interpretation is not available for would have.
The perfect aspect thus opens up the domain of past counterfactual indices, so we can talk about what would have happened in the past under specific circumstances. But what about Ogihara cases? We saw above in example (6) that EPP can express a reference to the future as well as to the past. This is one of the problems Iatridou (2000) has stated for her own account. It is easy to see, however, that the definitions and assumptions made so far are fully compatible with Ogihara cases. The truth conditions of an EPP counterfactual merely state that there is some index $i$ in the counterfactual future, prior to which there is another index $i'$ at which the event in question takes place. An index $i'$ that is prior to a future index $i$ may itself still be in the future. It does not have to be in the past. Figure 7 shows the derivation of a counterfactual conditional with EPP.
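The point about Ogihara cases can be illustrated with a relative-tense perfect over a toy frame. This is a sketch only: the function `perfect` is an assumed rendering of the verbal description (the event index is a predecessor of the reference index), and the tree, time values and event placement are hypothetical.

```python
# Hypothetical toy frame with one counterfactual branch a -> x -> x1 -> x2.
PARENT = {"r": None, "a": "r", "c": "a", "f1": "c", "x": "a", "x1": "x", "x2": "x1"}
TIME = {"r": 0, "a": 1, "c": 2, "x": 2, "f1": 3, "x1": 3, "x2": 4}
I_C = "c"  # actual present

def precedes(i, j):
    """True if i is a proper predecessor of j."""
    k = PARENT[j]
    while k is not None:
        if k == i:
            return True
        k = PARENT[k]
    return False

def perfect(p):
    """True at reference index i if p holds at some proper predecessor of i."""
    return lambda i: any(p(j) for j in PARENT if precedes(j, i))

event = lambda i: i == "x1"   # hypothetical event index
reference = "x2"              # counterfactual future reference index

print(perfect(event)(reference))   # the event precedes the reference index
print(TIME["x1"] > TIME[I_C])      # yet it is later than the actual present
```

Because the perfect only relates the event index to the reference index, the event may sit at a time later than the actual present, which is the configuration needed for future-oriented EPP counterfactuals.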
To conclude, I have introduced basic assumptions and definitions in this section and demonstrated how they allow us to derive the meaning of a counterfactual conditional clause without any covert morphology, semantically empty elements or any complex movements between the overt syntactic form and the logical form. We have seen in this section that the assumptions made so far correctly account for the range of meanings we actually find for ESP and EPP, and that they also correctly exclude the uses that are ungrammatical (compare Sect. 3.3).
In the following two sections, I will explore further implications for the truth and felicity of counterfactual clauses.

Truth conditions of counterfactuals
The truth conditions of counterfactual clauses have been a hotly debated topic for many decades. There are two extreme positions that comprise the spectrum of opinions. One position asserts that conditionals, counterfactual or not, do not have truth values at all. Thus, von Fintel (2011) quotes Adams (1965), Gibbard (1981) and Edgington (1986) as prominent representatives of this stance. As von Fintel (2011) notes further, this position has had no noticeable impact on the linguistic side of the debate. Most linguists share the intuition brought forward by Lewis (1973) that conditionals, counterfactual or indicative, have definite truth conditions that can, at least sometimes, be tested in the actual world.
As Lewis (1981) puts it in the opening paragraph:
Consider the counterfactual conditional "If I were to look in my pocket for a penny, I would find one". Is it true? That depends on the factual background against which it is evaluated. Perhaps I have a penny in my pocket. […] So in this case the counterfactual is true. (Lewis 1981: 217)
Of course, probably everyone also agrees that for most counterfactual conditionals, the matter of their truth is usually not as straightforward as it seems in the above case. The following classical example is attributed to Quine by Lewis (1973).
Taken by itself, each assertion appears reasonable enough, even though we will hardly find a scenario believable in which Caesar uses both the atom bomb and catapults in the same war. This observation speaks to the deep-seated vagueness of counterfactuals and their general defeasibility. Approaches to counterfactual conditionals in the Kratzer-Lewis tradition therefore operate with the notion of similarity: worlds are ranked according to how similar they are to the actual world; different conditionals may activate different similarity rankings against which they are evaluated. In sum, the view of truth conditions in the Kratzer-Lewis tradition and beyond is that (1) there are definite truth conditions that can sometimes be tested in the actual world, but (2) they are vague and context-dependent. Between those two extremes of the spectrum (no truth conditions vs. vague truth conditions that can sometimes be tested in the actual world), my approach takes a middle ground. My assumptions so far predict that counterfactual conditionals do have vague truth conditions, but that these can never be tested exhaustively in the actual world. Because counterfactual statements are statements about counterfactual indices, no actual index can make them true or false.
In other words, a counterfactual conditional can be true even if (the prejacent of) its protasis is true and (the prejacent of) its apodosis is false in the actual world. And it can be false even if both are true in the actual world. Applied to Lewis' penny, the clause If I were to look in my pocket, I would find a penny is not necessarily false if my pocket is empty. In this case, it is just either false, or entirely irrelevant. This means that, if the speaker utters the penny conditional, and the addressee checks her pockets and finds them empty, the speaker then has to either admit that she lied; or she has to qualify her statement, by saying, for example Sorry, I meant if I had magical pockets that always contain pennies, THEN, if you were to look, you'd find a penny. In most contexts, the listener would have no way of guessing the part about magical pockets. Thus, the speaker may be able to deny a blatant lie, but then the conditional utterance would still come across as highly misleading and uncooperative.
Since my position here is not entirely trivial, I will go into more detail about this point here. I take it that in (36), B's utterance is a valid objection to A's statement.

(36)
A and B talk about Laura's arrival yesterday. They discuss whether the best option, given that Laura had to arrive at 2:30, would have been the 10 am train, the 12 o'clock flight or the bus at 9:30 am. A: If Laura had taken the train, she would have arrived at 2 pm. B: That's not true. Laura did take the train, but she arrived only at 3 pm.
My claim is that B's objection is pragmatically valid, but not a direct counterargument against the truth of the counterfactual conditional. Instead, it contradicts a very strong pragmatic implicature. This implicature is that the relation between propositions that we claim to hold in counterfactual worlds should also hold in the actual world; other counterfactual worlds should be considered irrelevant and therefore be excluded from the domain of quantification. In other words, the counterfactual conditional implicates the indicative conditional. I would like to briefly defend the idea that an objection of the form that's not true can in fact be a contradiction to an implicature only, rather than the original statement, by considering the following two example conversations:
(37) A: If Laura had taken today's 8 o'clock train from Frankfurt, she would have arrived in Berlin at 2 pm. B: That's not true. MARTHA took that exact train and she arrived only at 3 pm.
(38) A: If you had taken melatonin before your flight to Boston last week, you would not have been jet-lagged. B: That's not true. I took some melatonin before flying to New York last year, but I still had a terrible jet lag.
In both cases, we may feel that B has made a valid argument against A's claim, despite the fact that it is very clear that B's statement does not refute directly the truth of A's statement: In (37) A didn't make any claim about Martha's time of arrival, only about Laura's. So A would of course be justified to respond to B saying I didn't say anything about Martha, so how can you say I'm wrong? but this would pragmatically only be licensed if A could plausibly motivate a claim that two people can take the exact same train and still arrive at the same station at different times. Otherwise, the assumption that Laura should arrive at the same time as Martha is enough to make B's utterance a valid counterargument to A's claim. A similar case can be made for (38).
I suggest that what happens in (36) is analogous to what happens in (37) and (38): B actually only objects to a strong implicature of A's statement, but we accept this objection as a valid contradiction to A's statement as long as A cannot plausibly motivate why the implicature is not valid. Now, these observations about the defeasibility of counterfactuals are by no means new and should not be too controversial. They can be handled by a variety of approaches, including Kratzerian situation semantics (Kratzer 2015). The Kratzerian situation-semantics approach theoretically differs in its truth conditions from the 3D-modality approach, in that a counterfactual is definitely false if the antecedent is true and the consequent is false in the actual world. But since it affords speakers great flexibility in choosing the set of worlds they quantify over, it makes the same empirical predictions about acceptable linguistic behavior as I do.

Felicity conditions
Any account of counterfactual conditionals has to address their contrary-to-fact implicature, including those cases where it fails to occur. I will start this section with some basic observations about the felicity of counterfactual conditionals. There is a wide consensus that both indicative and counterfactual conditionals are odd in contexts in which the prejacent of the protasis is known to be true.

(39)
A asks when Laura will arrive. B knows for a fact that Laura has taken the train. B: #If she took the train, she will be here by noon. B: #If she had taken the train, she would be here by noon.
Moreover, indicative conditionals are also bad in environments where the prejacent of the protasis is known to be false. But in this environment, counterfactuals are particularly good.
(40) Laura didn't take the train. a. #If she took the train, she will be here by noon. b. If she had taken the train, she would be here by noon.
The most detailed discussions of the felicity conditions of counterfactual conditionals concern the contrast between indicatives and counterfactuals illustrated in (40). The main line of investigation follows the intuition by Stalnaker (1975) that counterfactual, but not indicative conditionals, require the revision of the context set of worlds, that is, the set of worlds that is compatible with what we know in the actual world. Representative studies in this tradition are Asher and McCready (2007) and Starr (2014). The proposal by Ippolito (2013) aims at deriving the revised set of worlds through the back-shifting process triggered by past morphology. In this section, I want to sketch out how the above two observations follow from my previous assumptions in combination with some general considerations about principles of conversation, before turning to the contrary-to-fact implicature and Anderson conditionals. In contrast to the studies cited above, I do not assume a process of revisions in the context set of worlds. I suggest that, in most contexts, the Question Under Discussion (QUD, see Groenendijk and Roelofsen 2009) is about actual indices or future possibilities rather than counterfactual developments. In other words, most of the time we want to know what actually happened rather than what would have happened under certain circumstances. Therefore, in most contexts, by uttering a counterfactual conditional, we violate the maxim of relation by not really answering the QUD. This violation creates inferences. I assume that in most cases, we use conditional sentences to assert a positive correlation between two propositions p and q (compare DeRose and Grandy 1999). If both p and q are true, we can simply say p is true and q is true (because of p), and in most contexts, this is the most informative and relevant information we can give. If we do not know whether p is true, we may say if p is true then q is true. 
But if we are fairly certain that p is not true, then the only option left is to talk about counterfactual indices by saying if p were true, then q would be true: I assume with many others (including the seminal tradition of Kratzer 1991), that an indicative conditional is trivially true if the protasis is false in the actual world. So when we believe the protasis to be false in the actual world, putting it into an indicative conditional would be uncooperative and infelicitous in most situations.
In a context where the QUD is concerned with what actually happened, the counterfactual conditional is thus the least informative way to assert a positive correlation between two propositions. The inference is then that the other two, more informative, options are not available. In most situations the most plausible reason is that p cannot be asserted because we do not believe it to be true, and that the indicative conditional would be vacuous. We thus derive the implicature that the prejacent of the protasis of a counterfactual conditional be false in the actual world-the very fact that has led to the term counterfactual. We may summarize this argument as follows: When the QUD is about actual indices, the following ranking reflects the preferred type of sentence: unconditional assertion > indicative conditional > counterfactual conditional.
I therefore see a counterfactual clause in most contexts as an answer to a different question from the QUD, but one that is still close enough to the actual question to be deemed relevant. This is similar but not identical to the reasoning by Iatridou (2000), who sees a counterfactual utterance as a partial answer to a question, rather than as an answer to a different question. Iatridou (2000: 247) discusses the following conversation: (42) A: What do you think about Peter and Ian? B: Well, I like Ian.
The implicature is that B cannot simply assert the same degree of fondness for Peter as for Ian. Iatridou (2000) states that this implicature is of the same nature as the counterfactual implicature. The set of assumptions I make also ensures that the implicature of falsity in the actual world is context-dependent. For example, there are contexts where the QUD is about counterfactual indices. In such contexts, no implicature arises:
Furthermore, there may be situations in which the QUD is about actual indices, but an unconditional assertion is not possible because of epistemic uncertainty, and an indicative conditional would be vacuously true because we know its apodosis to be true (rather than the protasis to be false). In this scenario, too, we do not expect a counterfactual implicature. And that is exactly what happens in an Anderson conditional. The locus classicus for showing that falsity in the actual world is a cancelable implicature, from Anderson (1951: 37), was introduced in Sect. 2 and is repeated below: (14) If Jones had taken arsenic, he would have shown just exactly those symptoms which he does in fact show.
If this was uttered by a doctor trying to diagnose Jones' cause of death, we would infer that arsenic poisoning is in fact a likely option. Without giving a complete analysis of this case, I would like to outline briefly how I think about it: Again, we imagine a context for (14) in which the QUD is roughly What is the cause of Jones' death?-a question about actual indices. Talking about counterfactual indices instead is a violation of the maxim of relation. This creates inferences-the immediate inference that is created is that, for some reason, both the corresponding indicative conditional and the corresponding unconditional assertions are not felicitous in this context. One possible reason for that, as we have seen before, is that the protasis is not true in the actual world.
However, in this scenario, there is a different explanation. The unconditional assertion-Jones took arsenic, that's why he shows the symptoms we observe-is presumably not available, because the doctor lacks the degree of confidence that would be necessary for this strong commitment. In situations of epistemic uncertainty, an indicative conditional is often a good choice. But consider the indicative conditional If Jones took arsenic, he shows exactly those symptoms which he shows. Following standard approaches to indicative conditionals, this assertion would be vacuously true. Of course, Jones shows the symptoms he shows, regardless of the cause. And this is how the counterfactual clause is licensed in this situation. Like in other scenarios, an unconditional proposition cannot be asserted and the indicative conditional would be vacuously true-but in this special case, it is vacuous because we know that the apodosis is true in the actual world, rather than that the protasis is false, thereby leading to a different interpretation. This reasoning closely follows the proposal by von Fintel (1999b).
At this point, I would like to briefly discuss Mackay (2015), who points out that Anderson conditionals are problematic at least for Iatridou (2000) and for Schulz (2014) because of the following problem: According to both approaches, counterfactual clauses exclude not only the actual world from their domain of quantification, but also worlds that are epistemically indistinguishable from the actual world. When we utter a counterfactual conditional, we speak only about those worlds that differ from ours in ways we would notice. But under this assumption, a sentence such as If Jones had taken arsenic, everything would be exactly as it is, cannot be true, because in those counterfactual worlds we are quantifying over, not everything can be as it is in the actual world.
I do not share the assumption, which is quite central to the entire Kratzer-Lewis tradition, that we cannot single out the actual world. It is true that, were we presented with a set of worlds that are epistemically indistinguishable, we would not be able to identify which of those worlds is ours. But this is not the only way in which we can identify something. We can identify objects in terms of what we know about them. But we can also identify them in terms of our relation to them. We can always point to where we are and refer to it as here, even if we do not know anything more about the place we inhabit. Likewise, we can always point to the actual world as the world we currently experience, even though it may be indistinguishable to us from an infinite number of different worlds. In other words, what we do when we exchange information is not trying to narrow down which of the epistemically accessible worlds is ours. Instead, we point to the world we inhabit and ask what it is like. The difference will be too subtle for most purposes to be of significance. But with respect to some issues, there are profound consequences. The problem of Mackay (2015) is one of them. In sum: I believe that when we quantify over counterfactual worlds, we can include those that differ only imperceptibly from ours. So Jones can have the exact same symptoms in a counterfactual world that we notice in the actual one.
Concluding this section, I have suggested that the contrary-to-fact implicature of counterfactual clauses in most contexts derives from a mismatch with the QUD and therefore a violation of the maxim of relation. I suggest that, under a QUD that is about actual indices, counterfactual conditionals compete with indicative conditionals and unconditional assertions. So when a counterfactual conditional violates the maxim of relation, listeners have to figure out why the other two structures are unavailable, and depending on the situation, different explanations may be available. This approach correctly predicts that counterfactual clauses are licensed by a variety of contexts and that only some of them lead to the implicature that the prejacent of the conditional protasis be false in the actual world.
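The competition summarized above can be illustrated with a small sketch. It is not part of the formal proposal: the three-valued `Belief` type, the function names and the treatment of vacuity (protasis believed false, or apodosis already known to be true) are simplifying assumptions introduced here for illustration only.

```python
from enum import Enum


class Belief(Enum):
    """Speaker's attitude towards a proposition (hypothetical labels)."""
    TRUE = "believed true"
    FALSE = "believed false"
    UNKNOWN = "uncertain"


def preferred_form(p: Belief, q: Belief) -> str:
    """Pick the highest-ranked available form under a QUD about actual indices.

    Ranking assumed from the text:
    unconditional assertion > indicative conditional > counterfactual conditional.
    """
    # 1. Unconditional assertion: requires commitment to both p and q.
    if p is Belief.TRUE and q is Belief.TRUE:
        return "unconditional assertion"
    # 2. Indicative conditional: unavailable when it would be vacuous, i.e. when
    #    the protasis is believed false or the apodosis is already known to be true.
    vacuous = p is Belief.FALSE or q is Belief.TRUE
    if not vacuous:
        return "indicative conditional"
    # 3. Only the counterfactual conditional is left.
    return "counterfactual conditional"


def inference(p: Belief, q: Belief) -> str:
    """What a listener may infer when a counterfactual conditional is used."""
    if preferred_form(p, q) != "counterfactual conditional":
        return "no inference (a higher-ranked form is available)"
    if p is Belief.FALSE:
        return "contrary-to-fact: protasis false in the actual world"
    return "no contrary-to-fact implicature (Anderson-type case)"
```

Under these assumptions, the train scenario in (40) corresponds to `inference(Belief.FALSE, Belief.FALSE)`, which yields the contrary-to-fact reading, while the arsenic scenario in (14) corresponds to `inference(Belief.UNKNOWN, Belief.TRUE)`, where the counterfactual is licensed without that implicature.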

Perspectives
At this point, I am done with the main goals of this paper: I have stated the problems I wanted to tackle, proposed a set of assumptions and shown how they solve my problems. You may now wonder why something as seemingly obvious should not have been previously proposed and discussed. Unfortunately, a full reconstruction of the history of modal and temporal logic in the light of this question goes far beyond the constraints of this paper. But I will, in the following section, trace the application of branching time to counterfactuality in work other than Ippolito (2003, 2006, 2013) for some historical context. In Sect. 7.2, I will offer a few reflections on the implications of conceptualizing counterfactuality as a property of indices, rather than as a property of untensed propositions.

Looking back: branching time and counterfactuality
In Sect. 3, I have reviewed the literature on the connection between counterfactuality and past. I have therein not included a small body of literature that does not address this connection, but does apply a branching-time framework to counterfactual conditionals. In this section, I would like to take a look at this discourse and briefly discuss how my work relates to it.
Crucially, my suggestion to lift the restriction on quantification in Thomason (1970, 1984) has not been made before. I will give a brief outline of previous approaches to get a better sense of why this is. The three main attempts to get a better handle on counterfactuals with the help of branching time that I am aware of all come from the tradition of modal logic. They are: 1. Thomason and Gupta (1980); 2. Tedeschi (1981), building on a manuscript later published as Cresswell (1985); 3. and Placek and Müller (2007).
All three articles are concerned with narrowing down truth conditions for counterfactuals: Thomason and Gupta (1980) reflect on the usefulness of branching time in defining similarity between worlds. Tedeschi (1981) ponders the relative scope of modal-temporal operators and argues that, among the following formalizations, (44-a) should be the correct logical form of a counterfactual conditional:
Placek and Müller (2007) start with the observation that a unified analysis of all counterfactual clauses apparently has to remain quite vague. They propose to give up a unified and vague analysis in favor of a split analysis that makes it possible to define rigorous truth conditions for at least a subclass of counterfactuals, which they call historical counterfactuals. Historical counterfactuals are characterized by the fact that their antecedent is true in some historical alternative to the actual world. There was a distinct point in time such that histories split into those where the antecedent is true and those where it is not true. For illustration, consider the following pair of sentences: (45) If this coin had shown heads, I would have won my bet.
(46) If this were a ruby, it would be red.
Example (45) is a historical counterfactual; (46) is not, because there is no moment in the past such that histories (or worlds) split into those where the object of interest is suddenly a ruby and those where it is not. The main intuition is that historical counterfactuals have clear and rigorous truth conditions. Thus, in a scenario where A bets on heads, B tosses a coin and it comes up tails, the counterfactual in (45) should simply evaluate as true, without any degree of vagueness or ambiguity.
In sum, applying branching time to counterfactual conditionals has mostly been considered as a tool to narrow down truth conditions, rather than to find the most parsimonious and compositionally most transparent definition of TAM expressions. Giving up the restriction on quantification introduced by Thomason (1984) only helps with the latter, but is actually detrimental to the former: I do not assume any logical constants and therefore do not provide any validities for my framework. While it is theoretically possible to recast my assumptions using logical constants instead of explicitly restricted quantifiers, I do not think it would be a very fruitful exercise. Moreover, the way I envision the branching-time frame, it does little to help narrow down the notion of similarity. I assume that it is possible to jump from the actual present directly to a development that might branch off from a slightly earlier moment, but where magic is suddenly possible, or kangaroos do not have tails, or something that is an emerald in the actual world is a ruby. The tree of developments does not represent a quantum-mechanical state-space, but the world and its alternatives as we imagine them. It might still be possible to model the difference between historical counterfactuals such as (45) and other conditionals such as (46) if one restricts the domain of quantification to completely realistic branches, that is, those branches where our laws of nature and social conventions are identical.
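The distinction between historical counterfactuals such as (45) and other counterfactuals such as (46) can be made concrete with a toy model. The following is only a sketch of the intuition attributed to Placek and Müller (2007) above, not their formal system: histories are represented as tuples of moment labels, two histories count as alternatives of one another if they share an initial segment, and the moment labels and proposition strings are invented for illustration.

```python
from typing import Dict, FrozenSet, Tuple

History = Tuple[str, ...]  # a history is a sequence of moment labels


def shared_past(h1: History, h2: History) -> int:
    """Number of initial moments that two histories have in common."""
    n = 0
    for a, b in zip(h1, h2):
        if a != b:
            break
        n += 1
    return n


def is_historical(actual: History,
                  facts: Dict[History, FrozenSet[str]],
                  antecedent: str) -> bool:
    """True if some history branching off from the actual one verifies the antecedent."""
    return any(
        h != actual and shared_past(actual, h) > 0 and antecedent in props
        for h, props in facts.items()
    )


# Toy model for (45): histories share the bet and the toss, then split.
actual = ("bet", "toss", "tails")
alt = ("bet", "toss", "heads")
facts = {
    actual: frozenset({"coin shows tails"}),
    alt: frozenset({"coin shows heads"}),
}
```

In this toy model, `is_historical(actual, facts, "coin shows heads")` holds because the two histories split after the toss, whereas `is_historical(actual, facts, "this is a ruby")` does not, since no history that shares a past with the actual one makes that antecedent true.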

Looking forward: rethinking counterfactuality
The discourse on counterfactual clauses has been riddled with confusion about the relation between linguistic form and meaning. Edgington (2007: 131f.) gives a lucid overview of the debate. So does von Fintel (2012), who writes:
Conditionals of the first kind are usually called "indicative" conditionals, while conditionals of the second kind are called "subjunctive" or "counterfactual" conditionals. The "indicative" vs. "subjunctive" terminology suggests that the distinction is based in grammatical mood, while the term "counterfactual" suggests that the second kind deals with a contrary-to-fact assumption. Neither terminology is entirely accurate. (von Fintel 2012: 466)
Accordingly, there is widespread disagreement about which clauses in fact qualify as counterfactual. In this section, I will outline how my approach answers some of the most contested questions of classification. These are: 1. Are there future counterfactuals? 2. Are questions such as Would you like some tea? counterfactual? 3. Are when / if hell freezes over-conditionals counterfactual?

Future counterfactuals
Everyone agrees that If Laura had taken the train, she would have been on time is a counterfactual conditional. But opinions differ on whether (47) also counts as counterfactual. (47) If Laura took the train, she would be on time.
Sentences like these are similar to counterfactual conditionals of the past in that they often imply that we do not expect the protasis to come true. Compare: (48) ?If Laura took the train, and I'm quite sure she will, she would be on time.
However, some authors are uncomfortable with describing them as counterfactual conditionals because they do not exactly imply that the protasis be false in the actual world, since there is no such thing as "the actual future" (compare also Karawani 2014: 4). Iatridou (2000: 135) refers to conditionals such as (47) as future-less-vivid (FLV) conditionals, and concludes that they should be treated on a par with past and present counterfactuals.
The definitions I have given so far lead to the same conclusion as Iatridou (2000): the expression would quantifies exclusively over counterfactual indices.
Recall from Sect. 4.1 that the future is split into two domains: One set of future developments is a continuation of the actual present. The other set of future developments is not accessible from the actual present; these developments are continuations of prior actual indices.
The sentence in (47) is a counterfactual sentence because it is a sentence about counterfactual (future) indices. These can be defined as follows: We also have a solution for the conundrum cited above: There is no actual future. But there is a counterfactual future-these are indices that are temporally later than the actual present but not successors of it. The fact that we often consider the prejacent of a future counterfactual conditional to be unlikely to come true follows again from our expectation that most QUDs about the future are about what will happen, not what would happen. In those contexts, the counterfactual conditional competes with the indicative conditional. Choosing it over the indicative creates inferences-in many contexts, the implicature is one of unexpectedness.
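The notion of a counterfactual future index can be illustrated with a minimal sketch. It assumes a tree of indices in which each index has at most one immediate predecessor and in which depth in the tree stands in for time; the index names, the labels "actual", "possible" and "counterfactual", and the helper functions are all hypothetical and do not reproduce the formal definitions of Sect. 4.1.

```python
from typing import Dict, List, Optional

# Each index maps to its immediate predecessor; the root has none.
parent: Dict[str, Optional[str]] = {
    "i0": None,   # prior actual index
    "i1": "i0",   # the actual present
    "i2": "i1",   # continuation of the actual present
    "c1": "i0",   # branches off from a prior actual index
    "c2": "c1",   # later than the actual present, but not a successor of it
}
ACTUAL_PRESENT = "i1"


def predecessors(i: str) -> List[str]:
    """All indices on the path from i back to the root, excluding i itself."""
    out: List[str] = []
    p = parent[i]
    while p is not None:
        out.append(p)
        p = parent[p]
    return out


def time(i: str) -> int:
    """Assumption: temporal position equals depth in the tree."""
    return len(predecessors(i))


def classify(i: str, now: str = ACTUAL_PRESENT) -> str:
    """Three-way split of indices relative to the actual present."""
    if i == now or i in predecessors(now):
        return "actual"
    if now in predecessors(i):
        return "possible"
    return "counterfactual"


def is_counterfactual_future(i: str, now: str = ACTUAL_PRESENT) -> bool:
    """Temporally later than the actual present, but not a successor of it."""
    return classify(i, now) == "counterfactual" and time(i) > time(now)
```

In this toy tree, `i2` is a possible future, `c1` is a counterfactual index simultaneous with the actual present, and `c2` is a counterfactual future index of the kind that, on the account given here, (47) is about.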

Counterfactual questions
Not much of the literature concerns itself with "counterfactual morphology" in questions. Kim (2016) has remarked on the puzzling asymmetry between assertions and questions as illustrated in (50) and (51): (50) You could pass me the salt.
(51) Could you pass me the salt?
The assertion in (50) suggests that the addressee is not very likely to pass the salt. But the corresponding question in no way suggests the same thing-quite on the contrary, by uttering it, the speaker communicates an expectation that the addressee will in fact pass the salt. As I have outlined above in Sect. 6, the implicature of a counterfactual conditional that a proposition be false in the actual world comes from a mismatch with the QUD. The listener has to figure out why the corresponding indicative and unconditional assertion were not available instead. One plausible explanation in many situations is that the protasis of the counterfactual is (likely to be) false in the actual world. For some questions, the same calculations and inferences may arise as well. A counterfactual question may be used in a context where the QUD is about actual indices, to narrow down possible answers. For example, let us assume we are trying to find out when Laura arrived. We know that she considered using the 9 o'clock train but ended up traveling by car. We may then ask: (52) If she had taken the 9 o'clock train, when would she have arrived?
Someone who just enters the room will infer from this question that we do not think Laura took the train. However, in polite questions such as (51) and (53), corresponding inferences do not arise: (53) Would you open the window, please?
According to my definitions, (53) is a counterfactual question. It is a question about counterfactual indices: In the relevant counterfactual future indices, do you open the window? Again, in most situations we will be more concerned with what will happen next than with what would happen next. So the listener once again has to figure out why the speaker did not use will instead of would. In a situation where the question does in fact constitute a polite request, though, we may suspect that the reference to counterfactual indices is meant to give us a painless way out of a commitment. In effect, this is a question we can truthfully answer positively, even if we are not in a position to follow the request: (54) I would (gladly), but the windows here cannot be opened.

Contrary-to-fact indicatives
Ippolito (2013: 2) specifies that she uses the term counterfactuals only with reference to subjunctive conditionals whose antecedents are false. She thereby explicitly excludes indicative conditionals whose antecedents are known to be false, as in (55) Even so, Ippolito (2013) does describe conditionals such as (55) as counterfactual.
According to the definition of counterfactuality proposed here, (55) is not a counterfactual conditional, despite its contrary-to-fact implicature. Here is how I think about it: If we both agree that I am not the Easter Bunny, the only way this utterance can be true is to say that the protasis is false. In a situation where the protasis has already been suggested to be true by someone else, violating the constraint against vacuously true statements can be a creative way to refuse this suggestion. Like a counterfactual conditional, a sentence such as (55) implicates that its protasis is false by violating a communicative principle. However, the way this happens is different: (55)-type sentences are vacuous; by contrast, counterfactual conditionals do, in many contexts, not directly address the QUD. The conditional in (55) is not about counterfactual indices. It is therefore not a counterfactual conditional.
On the other hand, examples like the arsenic example in (14) are not categorized as counterfactual by Ippolito (2013), because they do not come with the implicature that their protasis be false in the actual world. By contrast, my definitions imply that they are counterfactual conditionals-again, because they are about counterfactual indices.
In sum, if we understand counterfactuality as a property of indices-and of propositions about counterfactual indices-we can classify utterances regardless of the variable circumstances of their utterance context and specific interpretation.

Conclusion
The task I have set myself in this article was to find a definition of ESP that would allow us to arrive at all the interpretations it can actually get while preventing the derivation of unavailable interpretations. I have first stated the main observations that describe the scope of the investigated phenomena and presented examples of past-and-counterfactuality markers from other languages that stress that accounting for unattested readings of ESP is not trivial.
I have then outlined the history of approaches to past-and-counterfactuality markers and identified two major lines of investigation-remoteness-based and back-shifting. Among the former, I have singled out the seminal work by Iatridou (2000) and have shown that, while it is very straightforward, compositionally transparent and explanatory with regard to the contrary-to-fact implicature of counterfactual conditionals, it does not fully predict the available range of distributions and interpretations.
Among the back-shifting approaches, I have discussed Ippolito (2013) as a representative contestant. Ippolito (2013) does a good job in covering attested and unattested interpretations and distributions, but does not predict the observed cross-linguistic variation and may not suffice to explain the contrary-to-fact implicature. It also relies on complex assumptions about the syntax-semantics interface.
I have proposed to solve these problems by combining the exclusive quantification over counterfactual worlds from Iatridou (2000) with the ideas of Ippolito (2013) about the role of branching time, resulting in a tripartite modal-temporal structure. I have discussed the predicted truth conditions of this approach and shown how the felicity conditions and implicatures can be derived from my assumptions. I have then given an outline of the history of approaches to branching time and counterfactuality and argued that my approach has never before been discussed, because without my focus on linguistic parsimony, compositional transparency and cross-linguistic variation, the advantages are not immediately obvious. Thus, I have argued that in languages such as Daakaka, the "distal" past marker is, as in English, used both for the actual past and for counterfactual contexts, but, unlike in English, it can also be used with reference to the counterfactual past. This illustrates that the inability of ESP to refer to the counterfactual past, along with its other restrictions, is not trivial and needs an explanation that can accommodate the observed cross-linguistic variation.
Finally, I have discussed the new understanding of counterfactuality that arises from the theory I have proposed here. I believe that my assumptions have much more far-reaching consequences than can be explored here and am looking forward to discussing them in the future.