Temporal Relations at the Sentence and Text Genre Level: The Role of Linguistic Cueing and Non-linguistic Biases—An Annotation Study of a Bilingual Corpus

Grisot, Cristina; Blochowiak, Joanna

doi:10.1007/s41701-021-00104-5

Original Paper
Open access
Published: 30 April 2021

Volume 5, pages 379–419, (2021)

Corpus Pragmatics

Abstract

This study investigates the role of non-linguistic biases in the obligatory (verb tenses) and optional (discourse connectives) linguistic marking for inferring temporal relations at the sentence and the text genre levels. Specifically, we formulated and tested several assumptions: (1) the linguistic cueing assumption (verb tenses inform language users about the temporal relation), (2) the implicitness assumption (highly expected relations need not be overtly marked), (3) the specialized connective assumption (specialized connectives are more efficient than underspecified ones), (4) the text genre assumption (language users’ expectations of temporal relations are linked to the text genre), and (5) the text status assumption (information in translated texts tends to be more explicit than in original texts). We carried out an annotation study of a bilingual corpus (French–English) belonging to two different text genres: literary and journalistic. Our results challenge the implicitness and the text status assumptions while confirming the linguistic cueing and the text genre assumptions. So, we put forth an alternative view, according to which language users have equal expectations about all three types of temporal relations and are oriented to one relation or the other by linguistic cueing (obligatory and optional marking) as well as text genre.

Introduction

In this study, we investigate the role of obligatory and optional linguistic cueing, i.e. verbal tenses and discourse connectives, as well as of non-linguistic biases, i.e. text genre and translation status, for expressing temporal relations. While these factors have been investigated individually, as we will discuss below, they have never been investigated together, nor from the perspective of their role at both the sentence level and the wider level of text genre. This is our ambition in the current study, in which we focus on two tensed languages: a Germanic language (English) and a Romance language (French).

In languages such as English or French, verb tenses are obligatory marking elements and may cue language users towards inferring a chronological (i.e. time advances from situation 1 to situation 2), synchronous (i.e. time stagnates between situations 1 and 2) or backwards (i.e. time moves backwards from situation 1 to situation 2) temporal relation (Asher & Lascarides, 2003; Kamp & Rohrer, 1983). By contrast, temporal connectives (specialized connectives) are optional markers of temporal relations. They specifically instruct language users which temporal relation to infer, an instruction which may override the temporal relation signalled by verb tenses. Temporal relations may also be marked by the underspecified connective and, or even left implicit. At the same time, prior work has suggested that there are other, non-linguistic biases, which drive comprehenders’ inferences of temporal relations in the absence of overt linguistic marking (Croft, 1990; Dowty, 1979, 1986; Hopper, 1979; Murray, 1997; Segal et al. 1991). For example, a chronological relation would be the default temporal interpretation of a series of two sentences, generally in the Simple Past, while the synchronous and backwards interpretations arise in specific linguistic conditions, such as imperfective marking on the verb or the presence of a synchronous specialized connective (at the same time).

Furthermore, scholars working in the fields of discourse analysis (Adam, 1992) and the cognitive approach to discourse relations (Kintsch, 1998; Sagi, 2006; Sanders, 1997) argued that most linguistic and discourse phenomena—and thus temporal relations—display significant differences depending on the text genre (for example, narrative literary texts vs. descriptive or argumentative journalistic texts). Moreover, translation studies in corpus linguistics have also pointed out the issue of translation universals (Baker, 1993), referring to features of translated texts which appear due to the translation process itself. Of interest for our study is the explicitation feature, which refers to the fact that information is more frequently expressed explicitly in translated texts than in original texts.

In this context, our research questions are the following. First, are specific verb tenses and/or the marking status of a relation (implicit, marked with underspecified connectives or with specialized connectives) significant predictors of one type of temporal relation or another? Second, are chronological relations more frequently expressed implicitly than overtly marked with a discourse connective and, conversely, are synchronous and backwards relations more frequently expressed by overt marking than left implicit? Third, does the role of verb tenses and discourse connectives depend on text genre and on the text status (original vs. translated)? To answer these questions, we carried out a bilingual annotation corpus study to examine whether the effects of obligatory and optional linguistic marking of temporal relations differ from one language to another. Below, we describe previous findings regarding the role of obligatory marking (verb tenses) and optional marking (discourse connectives) in confirming or reversing the temporal interpretation triggered by non-linguistic biases (“Overview of the Current State of the Research” section). Then, we introduce and report our bilingual (English and French) annotation corpus study (“A Bilingual Corpus Annotation Study” section). We discuss the results and conclude in “A Bilingual Corpus Annotation Study” and “General Discussion”.

Overview of the Current State of the Research

The Role of English and French Verb Tenses

Prior work in discourse semantics (Asher & Lascarides, 2003; Kamp, 1979; Kamp & Reyle, 1993; Kamp & Rohrer, 1983; Lascarides & Asher, 1993) has argued that the specification of temporal relations between situations described by adjacent sentences is a function of the verb tenses of the sentences. For example, two sentences expressed in the simple past (English Simple Past and French Passé Simple) are interpreted chronologically (the situation described in the second sentence follows and does not overlap with the first one), as in (1). On the other hand, when a sentence is in the imperfective (English Past Progressive and French Imparfait), it is interpreted as describing a situation temporally synchronous with that of the previous sentence, as in (2). Furthermore, a sentence in the pluperfect (English Pluperfect and French Plus-que-parfait) is interpreted as expressing a past event that has occurred before another past event, as in (3).

(1)	Mary went to the supermarket. Someone called her.
(2)	Mary went to the supermarket. Someone was calling her.
(3)	Mary went to the supermarket. Someone had called her.
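
The default pattern illustrated by examples (1)–(3) can be summarized as a simple lookup. The following Python sketch is only an illustration and is not part of the study: the names `DEFAULT_RELATION` and `cued_relation` and the category labels are hypothetical, and the mapping encodes only the defaults described above, which connectives and context may override.

```python
# Illustrative only: default temporal relation cued by the tense/aspect
# category of the second sentence, following examples (1)-(3).
# The category labels are hypothetical shorthand, not the authors' annotation scheme.
DEFAULT_RELATION = {
    "simple_past": "chronological",   # English Simple Past / French Passé Simple
    "imperfective": "synchronous",    # English Past Progressive / French Imparfait
    "pluperfect": "backwards",        # English Pluperfect / French Plus-que-parfait
}


def cued_relation(tense_category: str) -> str:
    """Return the default temporal relation cued by a tense category."""
    try:
        return DEFAULT_RELATION[tense_category]
    except KeyError:
        raise ValueError(f"unknown tense category: {tense_category!r}")
```

For instance, `cued_relation("pluperfect")` returns `"backwards"`, matching the reading of example (3).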

Such contributions of verb tenses to the inferring of temporal relations are due, on the one hand, to their temporal content (i.e. tense) and, on the other hand, to their aspectual content (i.e. grammatical aspect). We will refer to the role of tense and grammatical aspect as the linguistic cueing assumption, according to which the presence of these linguistic cues guides language users’ expectations about certain types of temporal relations. In what follows, we present the roles of tense and grammatical aspect in more detail.

The role of tense in determining the temporal relation between two situations from contiguous sentences is linked to the presence or absence of referential autonomy of certain verbal tenses (Hinrichs, 1986; Klein, 1994; Partee, 1973, 1984; Nerbonne, 1986; Reichenbach, 1947; cf. discussion in Grisot, 2018). Specifically, the simple past (English Simple Past and French Passé Simple) is referentially autonomous, which means that each situation expressed with this verb tense introduces a new reference point (R) in the discourse, as in (1). In contrast, the past imperfective (Past Progressive and Imparfait) is referentially non-autonomous or anaphoric (Ducrot, 1979; Kamp & Rohrer, 1983; Kleiber, 2003; Molendijk, 1990; Tasmowski-De Ryck, 1985), which means that it cannot introduce a new R and depends on an R already existing in the context, as in (2). As such, a situation expressed with the Past Progressive/Imparfait is interpreted as simultaneous with another situation (in the Simple Past, for example) because the former is temporally anchored (via the R point) to the latter. Nevertheless, empirical studies have found important differences between the English Simple Past and the French Passé Simple with respect to their role in signalling temporal relations. For example, Grisot (2017) showed in a corpus annotation study that the English Simple Past signals chronological or backwards relations (narrative usage in Grisot’s terminology) in 59% of the cases, whereas it signals synchronous relations (non-narrative usage in Grisot’s terminology) in 41% of the cases. In contrast, the French Passé Simple was annotated as having a narrative usage much more frequently, that is, in 92% of the cases. Grisot (2017) also studied the role of the French Imparfait in signalling temporal relations and found that this verb tense signalled synchronous relations in 77.5% of the cases. No data is available for the English Past Progressive.

The pluperfect (Pluperfect and Plus-que-parfait) represents a slightly different situation than that of the simple past and the past imperfective, as its meaning already includes the temporal relation between two situations. For Reichenbach (1947), the pluperfect expresses that the point of event E precedes the point of reference R which precedes the point of speech S. The reference point may be provided by a past time temporal adverbial or by another past time event (expressed with the simple past or the past progressive), as in (3). This is why the situation had called in example (3) is understood as taking place before the situation went to the supermarket, and the temporal relation inferred between sentence 1 and sentence 2 is backwards.
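
Reichenbach’s configuration for the pluperfect can be written as an ordering constraint over three time points. The sketch below is a minimal illustration under stated assumptions: the class name `Reichenbach`, the method `is_pluperfect` and the numeric time values are hypothetical and serve only to make the ordering E < R < S concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reichenbach:
    """Positions of the event (E), reference (R) and speech (S) points on a timeline."""
    E: float
    R: float
    S: float

    def is_pluperfect(self) -> bool:
        # Pluperfect configuration: E precedes R, which precedes S.
        return self.E < self.R < self.S


# Example (3): "had called" (E) precedes "went to the supermarket" (R),
# which precedes the moment of speech (S). The numbers are arbitrary.
had_called = Reichenbach(E=1.0, R=2.0, S=3.0)
```

Here `had_called.is_pluperfect()` holds, whereas a configuration in which E and R coincide does not satisfy the constraint.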

For some scholars, it is the grammatical aspect of verb tenses, i.e. perfective or imperfective (Comrie, 1976), which plays a role for specifying the temporal relation between two situations. According to Comrie (1976), the perfective aspect displays a situation as a single whole, without internal structure, and with highlighted boundaries. The imperfective aspect gives access to the internal structure of a situation, or to a moment other than the initial and the final boundaries. The perfect focuses on the relevance of the resultative state of a past situation. Through their meanings, perfective (e.g. the Simple Past, the Passé Simple and the Passé Composé in some of its uses) and imperfective (e.g. the Past Progressive and the Present Progressive, the Imparfait and the Présent) verb tenses affect the mental representations of situations which hearers or readers build during the comprehension process (Johnson-Laird, 1983; Kintsch, 1998; for a review see Zwaan & Radvansky, 1998). Specifically, the perfective instructs comprehenders to build mental representations of completed situations, whereas the imperfective instructs comprehenders to build mental representations of ongoing situations. Consequently, two situations mentally viewed as completed are understood as having taken place in a chronological way, whereas two situations mentally viewed as ongoing are understood as having taken place synchronously.

Numerous experimental studies have tested whether this is the effect of the instructional meaning of the perfective and imperfective in the inference of temporal relations. Up to now, the findings from English are inconclusive. While some scholars (e.g. Feller et al., 2019; Magliano & Schleich, 2000) have found that the imperfective necessarily favours synchronous interpretations of the situations described by two adjacent sentences, others (e.g. Madden & Zwaan, 2003; Author1, under review) have found that the imperfective triggers synchronous and chronological interpretations with equal frequency. For Madden and Zwaan (2003: 670), comprehenders may infer that the following situation unfolds simultaneously or chronologically with respect to the current situation, depending on how they represent the internal structure of a situation expressed in the imperfective.

As for French, Grisot and Blochowiak (2019) investigate the role of French verb tenses and of the presence and absence of temporal connectives, separately and in interaction, as processing instructions for chronological relations, by means of a self-paced reading experiment and an offline evaluation experiment. They test how two past time verb tenses (the Passé Simple and the Passé Composé) and the presence and absence of two specialized connectives (puis ‘then’ and ensuite ‘then’) influence the processing and acceptability of short narrative contexts. Following theoretical descriptions of the meanings of these verb tenses, the Passé Simple is thought to encode an instruction about inferring a chronological relation between the situations described in the short narrative context, whereas the Passé Composé is supposed to be neutral with respect to the temporal relation inferred (Kamp & Rohrer, 1983; Moeschler, 2000, 2002; de Saussure, 2003). The ensuing prediction in terms of processing time was that the short narratives expressed with the Passé Simple would be read faster than those expressed with the Passé Composé. The results of the self-paced reading experiment did not show significant processing differences between the two verb tenses, whereas the offline evaluation experiment revealed comprehenders’ preference for the Passé Composé. These findings showed that the Passé Composé is as efficient as the Passé Simple for inferring chronological relations. The authors interpreted this result in terms of the aspectual component that these two verb tenses share (they are both perfective verb tenses), arguing that it is grammatical aspect which instructs hearers about the inference of chronological relations. Furthermore, speakers consciously preferred the Passé Composé over the Passé Simple because the former is much more frequent and has, in contemporary French, nearly replaced the latter.

In sum, existing experimental and empirical studies focusing on individual verb tenses have shown the following: (1) both the French Passé Simple and Passé Composé trigger chronological relations more frequently than synchronous ones (Grisot, 2017; Grisot & Blochowiak, 2019), (2) the French Imparfait signals synchronous relations more frequently than chronological ones (Grisot, 2017), (3) the English Simple Past was found either to signal chronological and synchronous relations equally frequently (Grisot, 2017) or to trigger chronological relations more frequently than synchronous ones (Feller et al. 2019; Madden & Zwaan, 2003; Magliano & Schleich, 2000; Author1, under review), and (4) the English Past Progressive was found either to trigger chronological and synchronous relations equally frequently (Madden & Zwaan, 2003; Author1, under review) or to trigger synchronous relations more frequently than chronological ones (Feller et al. 2019; Magliano & Schleich, 2000). These results are insufficient to understand the role of verb tenses for temporal relations because they provide neither a monolingual nor a cross-linguistic comprehensive picture of all verb tenses and because they seem to be contradictory.

Implicit and Overtly Marked Temporal Relations

Nevertheless, the above-mentioned patterns linked to the contribution of verb tenses to temporal relations do not seem to be exclusive. Sometimes, the temporal interpretation triggered by the verb tense is altered. For example, it may be modified by the use of specialized temporal connectives, as in (4), in which the second clause describes an event that happened before the event from the first clause. In (5), the use of the adverbial two minutes later changes the synchronous interpretation (triggered by the imperfective) into a chronological interpretation. Also, the temporal relation may be marked by the conjunction and, as in (6), which may be understood as either chronological or synchronous. Equally, despite the use of the pluperfect in (7), the temporal relation holding between the two situations changes from backwards, as in (3), to chronological or even ambiguous, depending on how and is interpreted.

(4)	Mary went to the supermarket after someone called her.
(5)	Mary went to the supermarket. Two minutes later, someone was calling her.
(6)	Mary went to the supermarket and she picked the kids up from school.
(7)	Mary had gone to the supermarket and someone had called her.

These examples demonstrate that temporal relations may be either left implicit (and only inferred via verbal tenses), as in (1)–(3), or overtly marked by some lexical item, as in (4)–(7). Unlike verb tenses, overt marking is optional and may be performed using specialized connectives (such as then for chronological, after for backwards and at the same time for synchronous relations), the underspecified connective and, and other types of expressions such as noun phrases (two minutes later or the next summer). Even if most of the attention given to the marking of coherence relations has focused on discourse connectives, coherence relations may also be marked by other types of signals (such as the prepositional phrase in the event that from example (8), which signals a condition relation) (see the Signalling Corpus released by Das & Taboada, 2013, 2018).

(8)	This notice must not be removed from the software, and in the event that the software is divided, it should be attached to every part. (Das & Taboada, 2013: 256)

Problematically, the linguistic elements functioning as signals do not directly signal specific coherence relations and they are categorized or grouped with great difficulty (but see Hoek’s (2018) proposal to categorise segment-internal elements as signals of coherence). In contrast, our knowledge about discourse connectives, whether they are specialized or underspecified, is much more stable. Experimental studies show that when a discourse relation is overtly marked using a specialized connective, the processing and the integration of the information from the following segment are faster than when the relation is left implicit (e.g. Britton et al. 1982; Deaton & Gernsbacher, 1997; Haberland, 1982; for causal relations). For temporal relations, Grisot and Blochowiak’s (2019) reading data showed that the specialized connectives puis and ensuite facilitated the processing of chronological relations, by comparison with contexts where the relation is left implicit. Their acceptability task experiment also revealed that comprehenders preferred chronological relations to be overtly marked rather than left implicit. Other studies investigated the role of the underspecified connective and in signalling various discourse relations, among which are both synchronous and chronological relations (see for instance, Blakemore, 1987; Carston, 1993, 1998, 2002; Levinson, 2000; Luscher & Moeschler, 1990). It is argued that the semantics of the connective and is minimal, as it encodes a concept of logical conjunction (Blakemore, 1987; Blakemore & Carston, 1999; Wilson & Sperber, 1998). This minimal semantic meaning can have cumulative or distributive readings (de Saussure & Sthioul, 2002) or it can refer to additivity (Spooren, 1997; Zeevat & Jasinskaja, 2007). In sum, scholars agree that the meaning of and is underspecified and that it becomes more specific in context, depending on the type of eventualities described.
Given the lack of specific meaning of some connectives, such as and, Sanders (2005) advances the specialized connective assumption, according to which the more specific the linguistic marker, the greater the facilitating effect on the segment following the marker. Thus, specialized connectives are more efficient than underspecified ones.
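
The three marking statuses discussed above (implicit, underspecified, specialized) can be made concrete with a naive string-matching sketch. This is an assumption-laden illustration and not the annotation procedure of the study: the function name `marking_status` is hypothetical, the connective lists contain only the English items named in the text, and simple word matching will misclassify cases such as and coordinating two noun phrases, or adverbials like two minutes later.

```python
import re

# Connectives named in the text; the lists are deliberately minimal.
SPECIALIZED = {
    "then": "chronological",
    "after": "backwards",
    "at the same time": "synchronous",
}
UNDERSPECIFIED = {"and"}


def marking_status(segment: str) -> str:
    """Classify the marking of a relation as specialized, underspecified or implicit."""
    text = segment.lower()
    # Specialized connectives take priority over the underspecified "and".
    for connective in SPECIALIZED:
        if re.search(rf"\b{re.escape(connective)}\b", text):
            return "specialized"
    for connective in UNDERSPECIFIED:
        if re.search(rf"\b{re.escape(connective)}\b", text):
            return "underspecified"
    return "implicit"
```

Applied to examples (1), (4) and (6), this sketch yields implicit, specialized and underspecified, respectively.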

Among the three possible temporal relations (chronological, backwards and synchronous), prior work has argued that chronological relations have a special status: they represent the default case, whereas the other two types are non-default relations. This means that comprehenders interpret by default two contiguous sentences as conveying that the situations expressed unfolded chronologically (Croft, 1990; Dowty, 1979, 1986; Grice, 1975; Hopper, 1979; Murray, 1997; Segal et al. 1991). This proposal has been linked to compliance during the comprehension process with one or more principles. Among these principles, some are said to be cognitive biases, such as the continuity hypothesis (Murray, 1997; Segal et al. 1991); some are conversational principles, such as the iconicity hypothesis (Hopper, 1979) or the iconicity of sequence (Croft, 1990) and the maxim of order (Grice, 1975; cf. also Levinson, 2000); others are semantic principles, such as Dowty’s Temporal Discourse Interpretation Principle (TDIP; Dowty, 1979, 1986). The continuity hypothesis predicts that, when reading a text, readers are cognitively biased to interpret a situation described by a sentence as being chronologically or causally related to the previous one. The iconicity of sequence principle sets out that comprehenders should interpret the linear ordering of main and subordinate clauses as mirroring the sequential order in which the described events occurred. In Gricean pragmatics (Grice, 1975), the conversational maxim of order states that cooperative speakers should narrate things in the (sequential) order in which they happened. Finally, Dowty’s TDIP posits that, for a sequence of sentences, (1) each following sentence should be interpreted to be temporally consistent with a definite temporal adverbial, if there is one, and (2) as temporally subsequent to the previous sentence otherwise. 
All these proposals share the core idea that a sequence of clauses describing a series of events should be interpreted as preserving the iconic order, that is, the chronological order in which the events happened in the world. In short, the chronological interpretation of a series of events is the default one.
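
Dowty’s two-clause principle, as paraphrased above, reads like a small decision procedure. The following sketch is a simplified illustration only: the function name `tdip_reference_time`, the numeric timeline and the `step` parameter are assumptions introduced here, and the code does not claim to capture Dowty’s full formulation.

```python
from typing import Optional


def tdip_reference_time(previous_time: float,
                        adverbial_time: Optional[float] = None,
                        step: float = 1.0) -> float:
    """Return the reference time of the next sentence under a simplified TDIP."""
    if adverbial_time is not None:
        # Clause (1): a definite temporal adverbial fixes the time.
        return adverbial_time
    # Clause (2): otherwise the sentence is temporally subsequent to the previous one.
    # `step` is an arbitrary positive increment (an assumption of this sketch).
    return previous_time + step
```

Without an adverbial, each sentence is placed after the previous one, which corresponds to the default chronological reading.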

Other scholars have argued that, by default, coherence relations presenting a forward movement of time (i.e. chronological) are highly expected by comprehenders; their overt marking is thus not needed, as it would make the linguistic content more informative than necessary (Asr & Demberg, 2012, 2015). This proposal may be explained by Grice’s maxim of quantity, Horn’s (1984) R and Q principles and Levy & Jaeger’s Uniform Information Density hypothesis (UID) (Levy & Jaeger, 2007). According to Grice, speakers are expected to make their contribution as informative as required for comprehenders to grasp the intended message, but no more informative than necessary. In Horn’s pragmatic model, speakers should make their contributions sufficient (the Q principle) and necessary (the R principle). In other words, the Q principle reduces the hearer’s effort to interpret an utterance while the R principle prevents the speaker from producing unnecessary linguistic content. The UID further specifies Grice’s maxim and Horn’s principles with respect to the optimal way in which a speaker should communicate information: the speaker should distribute the information evenly across a text or utterance, thereby reducing or omitting redundant optional markers. As put by Hoek et al. (2018: 278), “a coherence relation should therefore be sufficiently marked so that the hearer will be able to construct the appropriate relations, but not overly or unnecessarily marked so as to limit the speaker’s effort”.
As Asr and Demberg (2012: 2671) put it: “At the level of inter-sentential relations, this would mean the presence of explicit connectives is necessary when the relation is unexpected, but that a connective may be implicit if the relation is predictable.” Whereas Grice’s maxim, Horn’s principles and the UID hypothesis initially refer to production, Asr and Demberg (2012, 2015) related production tendencies to comprehension biases and showed that corpus-based patterns provide support for the expectation hypothesis (Langacker, 2000) (cf. also Grisot & Blochowiak, 2019).

Asr and Demberg (2012: 2671) then formulated the implicitness assumption, according to which the implicitness of the discourse connective is “a sign of expectation of the discourse relation: if readers have a default preference to infer a specific relation in the text, this type of relation should appear without explicit markers.” Asr and Demberg (2012) provide evidence for the implicitness assumption from their study of the frequency of implicit and explicitly marked coherence relations in the Penn Discourse Treebank (PDTB; Prasad et al., 2008). They find a pattern in the distribution of explicit and implicit relations: relations such as Conjunction, Contrast, Concession, Synchrony, Asynchrony and Condition are most often overtly marked, whereas relations such as Cause, Instantiation and Restatement are most often left implicit. Furthermore, Hoek et al. (2018) show, by means of a parallel corpus study, that the implicit versus overt marking status of coherence relations depends not only on the relation itself but also on the cognitive complexity associated with each type of coherence relation (based on the Cognitive Complexity of Coherence Relations framework developed by Sanders et al., 1992, and later work; see Footnote 1), among other factors. For example, they find that coherence relations with basic order are more often implicit than relations with non-basic order, and that conditional relations are less often implicit than causal or additive ones (Hoek et al., 2018: 127).

In this particular study, Hoek et al. (2018) do not distinguish between coherence relations that may be temporally characterized and those that may not (e.g. in some cases, additive relations may have a temporal interpretation, which can be either synchronous or chronological). Nevertheless, this specification is provided in Asr and Demberg (2012). For them, asynchronous temporal relations (i.e. chronological and backwards) and, to a point, synchronous relations “describe discontinuous events” (page 2678), where the continuous/discontinuous status refers to Murray’s (1997) continuity hypothesis. Compared to all the other coherence relations in the PDTB, they find that relations “classified as discontinuous are exactly the ones with lower [than the average] values of implicitness.” (page 2678). In other words, compared to all other discourse relations, those which may be characterized with one of the three types of temporal configurations are more frequently overtly marked than left implicit. Furthermore, Asr and Demberg (2012) compare the rates of explicit marking versus implicit status of coherence relations presenting chronological and backwards movements of time. They find that “the forward relation [i.e. chronological] is associated to a higher degree of implicitness” (page 2679). So, in the PDTB corpus, coherence relations presenting a chronological order of events are significantly more frequently left implicit than overtly marked, whereas relations presenting a backwards order display the reverse distribution. What is of interest for the current paper is the implicitness assumption, which has been proposed for coherence relations in the PDTB framework and which we will apply to (chronological, synchronous and backwards) temporal relations.
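
One simple way to operationalize a relation’s degree of implicitness is the share of its occurrences that carry no connective. The sketch below makes that idea concrete; it is an assumption that this ratio matches the exact measure used by Asr and Demberg (2012), the function name `implicitness` is hypothetical, and the counts are invented toy values, not figures from the PDTB.

```python
def implicitness(implicit: int, explicit: int) -> float:
    """Share of a relation's occurrences that carry no connective."""
    total = implicit + explicit
    if total == 0:
        raise ValueError("relation has no occurrences")
    return implicit / total


# Toy counts invented for illustration; they are NOT figures from the PDTB.
toy_counts = {
    "chronological": {"implicit": 60, "explicit": 40},
    "backwards": {"implicit": 20, "explicit": 80},
}
scores = {relation: implicitness(**counts) for relation, counts in toy_counts.items()}
```

Under the implicitness assumption, a highly expected relation should receive a higher score than a less expected one; with the toy values above, the chronological relation scores higher than the backwards one by construction.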

Beyond Sentence-Level: The Role of Text Genre and Text Status

Finally, two other factors that play a role in enhancing comprehenders’ expectations about the discourse to follow are text genre (Kintsch, 1998; Sanders, 1997; Sagi, 2006; Author1, under review) and text status, that is, whether the text analysed is an original text or a translated text (Baker, 1993; Colleague & Author1, 2020).

Regarding text genre, it has been argued that the current distribution of discourse relations in a text (specific to each text genre) influences the readers’ expectation about what relations will appear more frequently later on in the text. Specifically, Sanders (1997) was the first to argue that factors other than the content of the discourse might influence the specifics of the relation inferred by the comprehender. For instance, he constructed an experiment to examine participants’ sensitivity to the source of coherence (i.e. semantic, for content relations, or pragmatic, for epistemic and speech act relations) in two text genres: argumentative and descriptive. He presented expert discourse analysts with discourses that included “chameleon” causal relations, whose source of coherence could be either semantic or pragmatic. While both sources of coherence were found to be equally frequent in argumentative texts, most participants agreed that in descriptive texts the source of coherence was most likely semantic. In addition, Sanders (1997) provides an analysis of these discourse genres and shows that the interpretations made by the participants in his study mirrored the distributions of relation types in the argumentative and descriptive genres, respectively. One possible explanation of these results may be that discourse comprehension is directly affected by the perceived genre of the discourse (e.g. the textual schemas in Kintsch’s (1998) terms). As put by Sagi (2006), this account implies that people differentiate among discourse genres and that such discourses are clearly identifiable types. Problematically, this is not always the case, since there are situations in which several different genres could be applicable to the same text. Furthermore, in some cases a novel genre may be encountered, or even expected. In other words, readers have expectations about what discourse relations should be found in a certain text genre.
For example, Author1 (under review) investigated how people infer temporal relations (chronological vs. synchronous) and assessed experimentally the role of coreference patterns (i.e. whether two situations are performed by the same agent or by different agents) in two fine-grained distinctions of text types: transfer-of-possession contexts (study 1) as in (9) and in short narratives (study 2), as in (10).

(9)	John handed a book to Bob. He [wanted him to read it].
(10)	Mary walked along the street. She gave her mother a call.

In study 1, participants had to provide continuations after having read the pronoun he; these continuations were annotated after the study, by two independent annotators, with respect to the temporal relation holding between the situation expressed in the first sentence and that expressed in the second sentence. In study 2, participants’ task was to decide whether there was a chronological or a synchronous temporal relation between the situations expressed. The results revealed that the temporal relation instantiated in cases of coreference differs between transfer-of-possession contexts and short narratives. In the former, coreference triggered more continuations about the source of the transfer-of-possession verb, revealing the participants’ expectation to have more information about the source. These cases were annotated most frequently as synchronous relations. In the latter, coreference triggered more chronological relations, revealing the participants’ expectation to learn more about the sequence of the protagonist’s actions. In sum, Author1’s experimental study provides evidence that text genre raises different expectations about what discourse relations will appear more frequently later in the text. We will call this the text genre assumption.

With respect to text status, translated texts may have specific features which originate in the translation process itself (cf. Gellerstam’s, 1996 translationese and Baker’s, 1993 hypothesis of translation universals). The advantages of translation corpora are that they provide objective linguistic data (Dyvik, 1998) and that they are intended to express the same meaning and have the same discourse functions in the languages considered (Johansson, 1998), hence inherently provide the terms of comparison. More specifically, one of the benefits of using translation corpora to investigate the issue of temporal relations and their implicitness is that we keep constant the content of the texts in the two languages we analyse, and thus eliminate the risk that the conceptual content expressed influences our findings for the two languages. The disadvantage of translation corpora is their very nature: the translation process can create a bias affecting the target text, such as the explicitation feature, that is, a higher rate of explicit information in translated texts than in original texts. Nevertheless, as recent research in translation studies has shown (House, 2018), quantitative evidence supporting the thesis of translation universals has yet to be found. For example, in Colleague & Author1 (2020), we investigated the translation of texts between French, English and Mandarin Chinese with respect to temporal reference (i.e. inflected verbs for tense and aspect in English and French versus non-inflected verbs in Mandarin), and did not find evidence confirming the existence of the two translation universals tested: explicitation and normalization (that is, the tendency to conform to patterns and practices which are typical of the TL, or even to exaggerate them).
In the current study, to control for a potential translation bias, in each of the two languages investigated we compare the results we find when the texts were originally written in that language to the results we find when the texts were translated into that language. We will call this the text status assumption.

Hypotheses

In the previous sections, we introduced five assumptions: the linguistic cueing assumption, the implicitness assumption, the specialized connective assumption, the text genre assumption and the text status assumption. Table 1 provides the variables we test in our annotation study, the hypotheses we formulate on the basis of the above-mentioned assumptions issued from the state of the research, and their ensuing predictions for our empirical corpus study.

Table 1 Hypotheses and predictions

The first variable tested is the obligatory marking via verb tenses (linked to the linguistic cueing assumption), and we focus on the distinctions between referentially autonomous versus non-autonomous verb tenses on the one hand, and perfective versus imperfective verb tenses on the other hand. The main hypothesis is that verb tenses inform language users in a systematic and obligatory way with respect to the temporal relation inferred. Following both theoretical linguistic and psycholinguistic studies, we expect referentially autonomous and perfective verb tenses to signal chronological relations, and the pluperfect to signal backwards relations. Cross-linguistic differences are expected with respect to the role of the simple past: the English Simple Past might behave according to this pattern (due to its perfective aspect) or signal chronological and synchronous relations with equal frequency (as found by Grisot, 2017). In contrast, the French Passé Simple should signal almost exclusively chronological relations. Referentially non-autonomous and imperfective verb tenses are expected to signal synchronous relations. While the French Imparfait is expected to behave according to this pattern (as found by Grisot, 2017), the English Past Progressive might either follow it as well (as found by Feller et al. 2019; Magliano & Schleich, 2000) or allow chronological and synchronous relations equally frequently (as found by Madden & Zwaan, 2003; Author1, under review).

The second variable tested is optional overt marking status, in particular the implicit versus overtly marked distinction. We consider two approaches which generate opposite predictions. First, we consider the implicitness assumption, according to which implicitness is a sign that the discourse relation is expected, and chronological relations are highly expected relations. On this view, we expect to find that chronological relations are more frequently left implicit, whereas synchronous and backwards relations are most frequently overtly marked. Second, we consider Grisot and Blochowiak’s (2019) pragmatic approach, according to which chronological relations overtly marked with specialized connectives are processed faster and preferred by comprehenders over implicit relations. Thus, Grisot & Blochowiak’s results contradict the implicitness assumption. On this view, we expect to find that chronological relations are more frequently overtly marked with specialized connectives than left implicit.

The third variable tested is a finer-grained view of optional overt marking status, i.e. the distinction between specialized and underspecified overt markers. We follow the specialized connective assumption, according to which specialized connectives are more efficient than underspecified ones. In pragmatics (Horn, 1984; Sperber & Wilson, 1986; Wilson & Sperber, 2012), the notion of efficiency is linked to speakers’ and hearers’ attempts to reduce cognitive effort during utterance interpretation (cf. Horn’s, 1984 R and Q principles, and Sperber & Wilson’s notion of relevance). In this context, we expect chronological, synchronous and backwards relations to be more frequently overtly marked with specialized connectives than with underspecified connectives, because the former type of connective allows language users to satisfy their search for relevance.

The fourth variable tested is the text genre. We consider the hypothesis that text genre raises language users’ expectations of discourse relations to follow. In particular, literary texts, which are frequently narrative texts, increase the expectation of inferring chronological relations more frequently than synchronous or backwards relations. Since chronological relations become more predictable due to the text genre, they need not be overtly marked. As such, we expect more chronological relations in literary texts than in journalistic texts (by definition); we also expect that chronological relations are less frequently overtly marked than left implicit in literary texts.

The fifth variable tested is the original versus translated status of the text. Following Baker’s (1993) hypothesis of translation universals, and more specifically the explicitation feature, we expect to find more overtly marked temporal relations (chronological, synchronous and backwards) in translated texts than in the original texts.

A Bilingual Corpus Annotation Study

Method

Annotation scheme. We used a three-category annotation scheme of types of relations holding between two situations: chronological, backwards, synchronous. Annotators could also use a fourth case, the ambiguous label, where they were not able to identify one of the first three cases uniquely. We calculated the inter-annotator agreement rate using percentages (for a three-way classification, chance-level agreement is 33.33%), and disagreements were not resolved.^{Footnote 2} Further analysis was carried out only on the set of data for which annotators agreed.
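
The agreement computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `agreement_summary` and the toy label lists are hypothetical and are not taken from the study's data or tooling. Following the treatment reported for the English data, an ambiguous label is never counted as agreement.

```python
from collections import Counter

RELATIONS = ("chronological", "backwards", "synchronous")

def agreement_summary(labels_a, labels_b):
    """Return (agreement rate, agreed items) for two annotators' labels."""
    if not labels_a or len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same non-empty item set")
    agreed = [
        (index, a)
        for index, (a, b) in enumerate(zip(labels_a, labels_b))
        # An "ambiguous" label is treated as disagreement, even if both used it.
        if a == b and a in RELATIONS
    ]
    return len(agreed) / len(labels_a), agreed

# Hypothetical toy labels, not corpus data.
annotator_1 = ["chronological", "synchronous", "backwards", "ambiguous", "chronological"]
annotator_2 = ["chronological", "chronological", "backwards", "ambiguous", "chronological"]

rate, agreed = agreement_summary(annotator_1, annotator_2)
chance = 1 / len(RELATIONS)  # chance level for a three-way classification
counts = Counter(label for _, label in agreed)

print(f"agreement: {rate:.0%} (chance level: {chance:.2%})")
print(dict(counts))
```

Only the items returned in `agreed` would enter any further analysis, which mirrors the restriction to the agreement set.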

Procedure. For each language, two annotators were presented with annotation guidelines, underwent a training phase, and then annotated the data independently. The annotation guidelines provided explanation and examples for each of the five possible cases of the classification scheme.

Annotators. Two native speakers of English (Study 1) from the University of xxx and two other native speakers of French (Study 2) from the University of xxx annotated the data and received financial compensation for their work. They were of similar background with respect to their age and level of education (linguistics undergraduates).

Data analyses. English and French data were prepared for analysis and analysed in a similar manner. First, English and French sets of agreement data were coded with three surface features: the verb tense of segment 1 (VTS1) and segment 2 (VTS2); the presence or absence of an overt marker of the temporal relation (Marking); and whether the connective used was a specialized connective or one of the underspecified connectives “and” (English) and “et” (French).

Two analyses were carried out on the English and French agreement sets of data. In the first analysis, we carried out quantitative descriptive analyses to explore which verb tenses (from S1 and S2) and overt markers are most frequently used to express chronological, synchronous and backwards temporal relations when the temporal relation is not left implicit. In the second analysis, we fitted generalized mixed models in order to see which factors, i.e. obligatory marking (verb tense), optional marking (implicit, underspecified and specialized markers), register (literary and journalistic) and language status (original and translated), played statistically significant roles in predicting chronological relations. The logistic mixed-effects models were fitted using the R software (R Development Core Team, 2010, version 3.1.2). Models were fitted using the glmer() function of the lme4 package of R, and model comparisons were assessed using the anova() function, which calculates the chi-square value of the log-likelihood to evaluate the difference between models, following Baayen’s (2008) procedure. To fit the models, we coded the data to define a binary dependent variable and several independent variables, i.e. random and fixed factors, as shown in Table 2.
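
The comparison of nested models via the chi-square value of the log-likelihood can be illustrated with a small likelihood-ratio sketch. It is written in Python rather than R and does not fit any mixed model: the helper names `chi2_sf` and `likelihood_ratio_test` and the two log-likelihood values are hypothetical, chosen only to show the arithmetic behind this kind of model comparison.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    x = max(stat, 0.0) / 2.0
    # Closed-form starting points: df = 2 (a = 1) and df = 1 (a = 0.5).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    # Recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(loglik_simple, loglik_complex, extra_params):
    """Compare two nested models from their maximised log-likelihoods."""
    stat = 2.0 * (loglik_complex - loglik_simple)
    return stat, chi2_sf(stat, extra_params)

# Hypothetical log-likelihoods: a random-effects-only model versus the same
# model with one added fixed factor that costs two extra parameters.
stat, p_value = likelihood_ratio_test(-350.0, -347.0, extra_params=2)
print(f"chi-square({2}) = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value would indicate that the added factor improves the fit beyond what the extra parameters alone would be expected to achieve.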

Table 2 Dependent and independent variables

Following Johnson (2008) and Field (2014), we built the models by going from the simplest model to the model of interest. Into the simplest model, consisting of only the random structure, we incorporated the various fixed factors one after the other—the marking status, the VT from S1 and from S2, the corpus and language status, as well as their possible interactions—and compared them using the anova() function.

Material

In this study, we used translation corpora between French and English belonging to two stylistic registers: literary and journalistic. In order to answer the research questions formulated in the “Introduction” section, we studied the following sets of data. The data from the literary register consists of 125 randomly selected excerpts from two literary texts: 60 were from “A Christmas Carol” by Charles Dickens, originally written in English and professionally translated into French; 65 were from “Le comte de Monte Cristo” by A. Dumas, originally written in French and professionally translated into English. The literary texts were accessed from the Project Gutenberg website (https://www.gutenberg.org/). The data from the journalistic register consists of 147 randomly selected excerpts from articles published in “Le monde diplomatique”, available on http://cabal.rezo.net: 77 were from articles originally written in English and professionally translated into French; 70 were originally written in French and professionally translated into English.

Regarding the English data, 125 literary and 147 journalistic corpus excerpts (i.e. chunks of text with an average length of 30–40 words) were used. In order to prepare the excerpts for the annotation task, we identified for each excerpt pairs of verbs expressing two situations which were temporally related. A total of 502 pairs of situations from the literary texts and 329 pairs of situations from the journalistic texts were annotated using the three-category annotation scheme. We observed full agreement at a rate much higher than chance: 78% (649 items). Disagreements were not resolved, and we considered for further analysis the 649 cases of agreement (224 cases in which English was the original language, and 425 in which English was the translated language), wherein there were 354 chronological relations, 236 synchronous relations and 59 backwards relations. The two ambiguous cases were considered cases of disagreement.

Regarding the French data, 129 literary and 147 journalistic corpus excerpts were used in this second annotation study. As for the English set of data, we identified for each excerpt pairs of verbs expressing two situations which were temporally related. A total of 514 pairs of situations from the literary texts and 345 pairs of situations from the journalistic texts were annotated using the same three-category annotation scheme. We observed full agreement at a rate much higher than chance, at 69% (597 items). Disagreements were not resolved, and we considered for further analysis the 597 cases of agreement (283 cases in which French was the original language and 314 in which French was the translated language), wherein there were 260 chronological relations, 286 synchronous relations and 51 backwards relations.
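
The agreement figures reported for the two annotation studies can be recomputed from the counts given in the text. The sketch below is a simple arithmetic check; the dictionary layout and variable names are assumptions made for illustration, while the numbers are those reported above.

```python
# Counts as reported in the text for each annotation study.
studies = {
    "English": {
        "pairs": {"literary": 502, "journalistic": 329},
        "agreed": 649,
        "status": {"original": 224, "translated": 425},
        "relations": {"chronological": 354, "synchronous": 236, "backwards": 59},
    },
    "French": {
        "pairs": {"literary": 514, "journalistic": 345},
        "agreed": 597,
        "status": {"original": 283, "translated": 314},
        "relations": {"chronological": 260, "synchronous": 286, "backwards": 51},
    },
}

rates = {}
for language, study in studies.items():
    annotated = sum(study["pairs"].values())
    # Both breakdowns of the agreement set should add up to the same total.
    assert sum(study["status"].values()) == study["agreed"]
    assert sum(study["relations"].values()) == study["agreed"]
    rates[language] = study["agreed"] / annotated
    print(f"{language}: {study['agreed']} of {annotated} pairs agreed "
          f"({rates[language]:.1%})")
```

Rounded to whole percentages, the computed rates correspond to the agreement levels reported for the English and French data.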

Results

English Data

Descriptive Quantitative Analyses

On the set of 649 agreement data points, we carried out a quantitative exploratory analysis to show how verb tenses combine and co-occur with overt markers to express these chronological, synchronous and backwards temporal relations.

The Role of VTs in Expressing Chronological, Synchronous and Backwards Relations

Figure 1 shows the distribution of different verb tense combinations (from segment 1 and segment 2) with respect to chronological, synchronous and backwards relations. Recall that verb tenses, due to their referential autonomy or non-autonomy and to their perfective/imperfective nature, are expected to strongly influence the temporal relation inferred: for example, the progressive -ing is expected to signal synchronous relations more frequently than chronological or backwards relations, whereas the pluperfect is expected to signal backwards relations more frequently than chronological or synchronous relations. Note that here we focus not on the role of a verb tense taken individually but on its combination with another verb tense.

As Fig. 1 illustrates, all verb tense combinations may generally be used to express all three types of temporal relations: chronological, synchronous and backwards relations.

First, the SP-SP combination is the most frequent verb tense combination (406 occurrences among the 649 verb pairs analysed), and most frequently expresses chronological relations (65%) (as in (11)), followed by synchronous relations (30%) (as in (12)) and backwards relations (5%) (as in (13)). As expected, when the SP is followed by a PP, a backwards relation was annotated in 77% of cases (as in (14)) and a chronological relation in 23% (as in (15)); a synchronous relation was never annotated (0%). When the SP is followed by the -ing morpheme, annotators annotated almost equally frequently a chronological relation (50%) (as in (16)) or a synchronous relation (46%) (as in (17)); a backwards relation was annotated in only 4% of the cases (as in (18)). A Pearson’s Chi-Square test performed on the data consisting of the SP in S1 (first left-hand panel of Fig. 1) shows that this observed distribution is significantly different from a chance expected distribution (χ²(4) = 151.27, p < .0001).

(11)	The clerk in the tank involuntarily applauded: becoming immediately sensible of the impropriety, he poked the fire. [Literary corpus]
(12)	Scrooge never painted out Old Marley's name. There it stood, years afterwards, above the warehouse door: Scrooge and Marley. The firm was known as Scrooge and Marley. [Literary corpus]
(13)	Up Scrooge went, not caring a button for that: darkness is cheap, and Scrooge liked it. But before he shut his heavy door, he walked through his rooms to see that all was right. [Literary corpus]
(14)	This prospect of fresh festivity redoubled the hilarity of the guests to such a degree, that the elder Dantès, who, at the commencement of the repast, had commented upon the silence that prevailed. [Literary corpus]
(15)	Scrooge lay in this state until the chimes had gone three quarters more. [Literary corpus]
(16)	When the clock struck eleven, this domestic ball broke up. Mr. and Mrs. Fezziwig took their stations, one on either side the door, and shaking hands with every person individually as he or she went out, wished him or her a Merry Christmas. [Literary corpus]
(17)	Up Scrooge went, not caring a button for that: darkness is cheap, and Scrooge liked it. [Literary corpus]
(18)	The honorable, the king’s attorney, is informed by a friend of the throne and religion, that one Edmond Dantès, mate of the ship Pharaon, arrived this morning from Smyrna, after having touched at Naples and Porto-Ferrajo. [Literary corpus]

Second, the combination of verbs presenting the -ing morpheme (i.e. the progressive or the gerund) most frequently expresses synchronous relations (54%) (as in (19)), followed by chronological (33%) (as in (20)) and backwards relations (13%) (as in (21)). When the -ing morpheme is followed by an SP, the two situations are more frequently interpreted as expressing a chronological relation (62%) (as in (22)), less frequently a synchronous relation (38%) (as in (23)), and never a backwards relation (0%). A Pearson’s Chi-Square test showed that this distribution (second left-hand panel of Fig. 1) is significantly different from a chance expected distribution (χ²(2) = 7.48, p < .02).

(19)	Facing head-on the dramatic collapse in exchange rates and the breathtakingly high rate of inflation, thousands of them—used to bending with the wind—are already packing their bags of their own accord. Some are returning to their home countries despite the endemic poverty and political risks : the Thai kingdom’s undesirables include Shan and Karen refugees, members of ethnic minorities persecuted by the Burmese military junta. [Journalistic corpus]
(20)	“Ah,” exclaimed the young girl, blushing with delight, and fairly leaping in excess of love, “you see he has not forgotten me, for here he is!” And rushing towards the door, she opened it, saying, “Here, Edmond, here I am!” [Literary corpus]
(21)	While French television’s "Les Guignols" on Canal+ were lampooning Jamie Shea’s press conferences (for two night the puppet Shea was making excuses for a missile attack on a bus), the British and Americans stayed po-faced to the end. [Journalistic corpus]
(22)	But how much greater was his horror, when the phantom taking off the bandage round its head, as if it were too warm to wear in-doors, its lower jaw dropped down upon its breast! [Literary corpus]
(23)	Rising to be life president of the De Beers mining conglomerate ("diamonds are for ever"), Barney Barnato died in mysterious circumstances at the age of 42. [Journalistic corpus]

Third, as we expected, when the PP is followed by an SP, a chronological (70%) (as in (24)) or a synchronous (30%) relation was annotated (as in (25)), and never a backwards relation (0%). When the PP verb combines with another PP verb, language users usually infer a chronological relation (45%) (as in (26)), followed by a synchronous relation (30%) (as in (27)), then a backwards relation (25%) (as in (28)). A Pearson’s Chi-Square test showed that this distribution (third left-hand panel of Fig. 1) is significantly different from a chance expected distribution (χ²(2) = 6.86, p < .03).

(24)	But he put his hand upon the key he had relinquished, turned it sturdily. [Literary corpus]
(25)	But they and their spirit voices faded together; and the night became as it had been when he walked home. [Literary corpus]
(26)	The ship drew on and had safely passed the strait, which some volcanic shock has made between the Calasareigne and Jaros islands; had doubled Pomègue. [Literary corpus]
(27)	The city had entirely vanished. Not a vestige of it was to be seen. The darkness and the mist had vanished with it. [Literary corpus]
(28)	Within hours the BBC had promised the Irishwoman air time to explain what she had actually seen. [Journalistic Corpus]

Fourth, when two present time verb tenses (Simple Present or Present Perfect) co-occur, annotators most frequently annotated a synchronous relation (78% when a PresPerf is used and 86% when a Spres is used) (as in (29)), and less frequently a chronological (11% for the PresPerf and 14% for the Spres) (as in (30)) or backwards relation (11% for the PresPerf and 0% for the Spres) (as in (31)). A Pearson’s Chi-Square test showed that these distributional differences (fourth left-hand panel of Fig. 1) between the Spres and PresPerf are not statistically significant (χ²(2) = 4.1, p = .12), indicating that the two verb tenses have similar roles in expressing backwards, chronological and synchronous relations.

(29)	This Sunday, the church is full and the doors are open. [Journalistic Corpus]
(30)	In 2 years, the committee has dealt with some 50 cases, most of them women. They leave Asia, Africa or the Middle East for a promised Eldorado in France. When they arrive, they are forced to become skivvies.
(31)	It is always an event at Marseilles for a ship to come into port, especially when this ship, like the Pharaon, has been built, rigged, and laden at the old Phocee docks, and belongs to an owner of the city. [Literary corpus]

The Marking of Chronological, Synchronous and Backwards Relations

Figure 2 shows the distribution of chronological, synchronous and backwards relations according to their percentages of implicit cases, of overt marking with the underspecified connective “and”, of overt marking with specialized connectives (for synchronous, chronological and backwards relations), and of other linguistic markers such as NPs or PPs.

First, backwards relations are most frequently expressed implicitly (71%), and are overtly marked in 29% of cases (with a specialized connective in 14% of cases, and with other types of linguistic marker in 15% of cases). As already shown in Fig. 1, the high percentage of implicit cases might be due to the use of the PP in S2, preceded by an SP or an -ing form. Second, chronological relations are less frequently expressed implicitly (40%) than they are overtly marked (a total of 60%), distributed as follows: 30% with the underspecified and; 19% with a specialized connective; and 11% with other types of overt marker, such as NPs and PPs. Third, synchronous relations are also most frequently expressed implicitly (62%) rather than overtly marked (38%), distributed as follows: 14% with the underspecified and; 10% with a specialized connective; and 14% with other types of overt marker, such as NPs and PPs. A Pearson’s Chi-Square test performed on the data for Fig. 2 shows that this observed distribution is significantly different from a chance expected distribution (χ²(6) = 56.96, p < .0001).
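
As a rough illustration of how a Pearson chi-square statistic is obtained from such a cross-tabulation, the sketch below rebuilds approximate cell counts from the percentages above and the reported totals per relation type (59, 354 and 236). This rests on two assumptions: the counts are reconstructed by rounding and are not the original data, and the test is taken to be a test of independence between relation type and marking status. The resulting statistic should therefore be read as an approximation and need not equal the reported value.

```python
# Agreement-set totals per relation type, as reported for the English data.
totals = {"backwards": 59, "chronological": 354, "synchronous": 236}

# Reported percentages per relation type, in the order:
# implicit, underspecified "and", specialized connective, other marker.
percentages = {
    "backwards": (71, 0, 14, 15),
    "chronological": (40, 30, 19, 11),
    "synchronous": (62, 14, 10, 14),
}

# Approximate counts (an assumption: rounded from percentages, not raw data).
table = [
    [round(pct * totals[relation] / 100) for pct in percentages[relation]]
    for relation in totals
]

def chi_square_independence(observed):
    """Return (Pearson chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

stat, df = chi_square_independence(table)
print(f"chi-square({df}) = {stat:.2f}")
```

With three relation types and four marking categories, the test has (3 − 1) × (4 − 1) = 6 degrees of freedom, which matches the degrees of freedom reported for Fig. 2.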

From the perspective of the use of the underspecified versus specialized connectives to express temporal relations, our data indicate a higher frequency of the underspecified connective and for overtly marking chronological (30%) and synchronous (14%) relations than of specialized chronological (19%) and synchronous (10%) connectives. In contrast, no case was found in which and is used to express backwards relations. Furthermore, among all cases where and is used, 76% correspond to a chronological relation, 24% to a synchronous relation and 0% to a backwards relation.

Statistical Modelling

Data

After removing the cases of backwards relations (59 occurrences) and the cases in which overt marking was performed using types of linguistic marker other than underspecified and specialized connectives (41 occurrences), the final set of data on which we fitted the mixed models has 519 observations.