Diachronic Variation of Temporal Expressions in Scientific Writing Through the Lens of Relative Entropy

Open Access
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10713)


The abundance of temporal information in documents has led to an increased interest in processing such information in the NLP community by considering temporal expressions. Besides domain adaptation, acquiring knowledge on the variation of temporal expressions over time is relevant for improving automatic processing. So far, frequency-based accounts dominate in the investigation of specific temporal expressions. We present an approach to investigate diachronic changes of temporal expressions based on relative entropy – with the advantage of using conditioned probabilities rather than mere frequency. While we focus on scientific writing, our approach is generalizable to other domains and interesting not only in the field of NLP, but also in the humanities.

1 Introduction

Many types of textual documents are rich in temporal information. Temporal expressions are a specific type of such information and occur in a wide variety of documents. Thus, in recent years, there has been a growing interest in temporal tagging within the NLP community. While variation of temporal expressions according to different domains has become a well-established research area (Mazur and Dale 2010; Strötgen and Gertz 2012; Lee et al. 2014; Strötgen and Gertz 2016; Tabassum et al. 2016), variation of temporal expressions according to time within a domain has received less attention so far (see footnote 1). Knowing how temporal expressions might have changed over time within a domain is interesting not only in the field of NLP, e.g., for adaptation of temporal taggers to different time periods, but also in humanities studies in the fields of historical linguistics, sociolinguistics, and the like.

In this paper, we focus on temporal expressions in the scientific domain and study their diachronic development over a time frame of approx. 350 years (from the 1650s to the 2000s). While here we take an exploratory historical perspective, our findings have implications for improving temporal tagging, especially for recall.

Temporal expressions are related to situation-dependent reference (see notably, Biber et al. (1999)’s work), i.e., linguistic reference to a particular aspect of the text-external temporal context of an event (cf. Atkinson (1999, p. 120); Biber and Finegan (1989, p. 492)). While according to Biber et al.’s work, scientific writing has moved towards expressing less situation-dependent reference, to the best of our knowledge, there is no evidence of how this change has been manifested linguistically and whether the types of temporal expressions used in scientific writing have changed over time. To investigate this in more detail, we pose the following questions:
  • Do the types of temporal expressions vary diachronically in scientific writing, and if so how is this manifested linguistically?

  • What are typical temporal expressions of specific time periods and do these change over time?

  • Are different types of temporal expressions, e.g., duration expressions and date expressions referring to points in time, equally affected by a potential change over time?

To process temporal information in scientific research articles, we use HeidelTime (Strötgen and Gertz 2010), a domain-sensitive tagger to extract and normalize temporal expressions according to the TimeML standard (Pustejovsky 2005) for temporal annotation (see Sect. 4). To detect typical temporal expressions of specific time periods, we use relative entropy, more precisely Kullback-Leibler Divergence (KLD) (Dagan et al. 1999; Lafferty and Zhai 2001). By KLD we measure how typical a temporal expression is for a time period vs. another time period (see Sect. 5). The methodology has been adopted from Fankhauser et al. (2016) and successfully used in Degaetano-Ortlieb and Teich (2016) to detect typical linguistic features in scientific writing, Degaetano-Ortlieb and Teich (2017) to detect typical features of research article sections, and Degaetano-Ortlieb (2017) to observe typical features of social variables.

In the analysis, we inspect general diachronic tendencies based on relative frequency and use relative entropy to investigate more fine-grained changes in the use of temporal expressions over time in scientific writing (see Sect. 6). On a more abstract level, we observe that the use of temporal information in scientific writing reflects the paradigm change from observational to experimental science (cf. Fankhauser et al. (2016); Gleick (2010)) and moves further to descriptions of previous work (e.g., in the last decades) in contemporary scientific writing.

2 Related Work

Temporal information has often been employed to improve information retrieval (IR) approaches (see Campos et al. (2014) and Kanhabua et al. (2015) for an overview). A prerequisite to exploit temporal information is temporal tagging, i.e., the identification, extraction, normalization, and annotation of temporal expressions based on an annotation standard such as the temporal markup language TimeML (Pustejovsky 2005). While temporal tagging was for a long time tailored towards processing news texts, domain-sensitive approaches have been developed in recent years, as it has been shown that temporal information varies significantly across domains (Mazur and Dale 2009; Strötgen and Gertz 2016). Domain-sensitive temporal taggers are UWTime (Lee et al. 2014) and HeidelTime (Strötgen and Gertz 2012). We choose HeidelTime as it has been reported to be much faster than UWTime (Agarwal and Strötgen 2017).

Recently, there has also been an increasing interest in temporal information in the field of digital humanities. An early approach to operationalize time in narratology has been applied by Meister (2005). Strötgen et al. (2014) show how temporal taggers can be extended for temporal expressions referring to historical dates in the AncientTimes corpus. Fischer and Strötgen (2015) apply temporal taggers to analyze date accumulations in large literary corpora. An analysis of temporal expressions and whether they refer to the future or the past has also been performed on English and Japanese Twitter data (Jatowt et al. 2015).

The diachronic aspect of temporal information in scientific writing has mainly been investigated by considering temporal adverbs in the context of register studies. Biber and Finegan (1989) and Atkinson (1999), for example, have shown a decrease of temporal adverbs in scientific writing in terms of relative frequencies. Fischer and Strötgen (2015) also studied temporal expressions in a diachronic corpus, but considered only temporal expressions with explicit day and month information.

We use temporal tagging tailored to identifying temporal information in scientific writing to obtain a more comprehensive picture of possible diachronic changes. Moreover, besides considering changes in terms of relative frequency, we look at typical temporal expressions and patterns of temporal expressions of specific time periods.

3 Data

As a dataset, we use texts of scientific writing ranging from 1665 to 2007. The first time periods (1665 up to 1869) are covered by the Royal Society Corpus (Kermes et al. 2016a), built from the Proceedings and Transactions of the Royal Society of London – the first periodical of scientific writing – covering several topics within biological sciences, general science, and mathematics. For the later time periods (1966 to 2007), we also use scientific research articles from various disciplines (e.g., biology, linguistics, computer science) taken from the SciTex corpus (Degaetano-Ortlieb et al. 2013; Teich et al. 2013). For comparative purposes, we divide the corpus into fifty-year time periods. Table 1 shows the time periods, their coverage and the sub-corpus sizes in number of tokens and documents.

The corpus has been pre-processed in terms of OCR correction, normalization, tokenization, lemmatization, sentence segmentation, and part-of-speech tagging (cf. Kermes et al. (2016b)).
Table 1. Corpus details.

4 Processing Temporal Information

4.1 Temporal Expressions

Key characteristics. Temporal expressions have three important key characteristics (cf. Alonso et al. (2011); Strötgen and Gertz (2016)). First, they can be normalized, i.e., expressions referring to the same semantics can be normalized to the same value. For example, March 11, 2017 and the 2nd Saturday in March of this year point to the same point in time, even though both expressions are realized in different ways. Second, temporal expressions are well-defined, i.e., given two points in time X and Y, the relationship between these two points can always be determined, e.g., as X is before Y (cf. Allen (1983)). Third, they can be organized hierarchically on a granularity scale (from coarser to finer granularities and vice versa such as day, month or year). Relevant in our analysis are normalization and granularity. Normalized values are used to compare temporal expressions across time periods instead of considering only the single lexical realizations. In terms of granularity, we consider granularity scales to determine diachronic changes of temporal expressions.

Types. According to the temporal markup language TimeML (cf. Pustejovsky (2005)), there are four types of temporal expressions (cf. also Strötgen and Gertz (2016)):
  • Date expressions refer to a point in time of a granularity equal to or coarser than ‘day’ (e.g., March 11, 2017, March 2017 or 2017).

  • Time expressions refer to a point in time of any granularity smaller than ‘day’ (e.g., Saturday morning or 10:30 am).

  • Duration expressions refer to the length of a time interval and can be of different granularity (e.g., two hours, three weeks, four years).

  • Set expressions refer to the periodical aspect of an event, describing a set of times/dates (e.g., every Saturday) or a frequency within a time interval (e.g., twice a day).

In the analysis, we consider all these four types showing how their use has changed diachronically in scientific writing.

4.2 Temporal Tagging

For temporal tagging we use HeidelTime (Strötgen and Gertz 2010), a domain-sensitive temporal tagger. HeidelTime supports normalization strategies for four domains: news, narrative, colloquial, and autonomous. Although HeidelTime has been applied to process scientific documents using the autonomous domain, these scientific documents have been very specific, relatively short (biomedical abstracts) with many so-called autonomous expressions (i.e., expressions not referring to real points in time, but to references in a local time frame).

In contrast, our corpus is quite heterogeneous, containing letters and reports in the earlier time periods and full articles in the later time periods. Thus, we expect that most of the documents are written in such a way that the correct normalization of relative temporal expressions can be reached by using the document creation time as reference time. This makes the documents similar to news-style documents according to HeidelTime’s domain definitions. Thus, we apply HeidelTime with its news domain setting. Note, however, that in our analysis we use only normalized values of Duration and Set expressions, which are normalized to the length and granularity of an expression but not to an exact point in time. Thus, our findings still hold if some of the occurring temporal expressions are not normalized correctly to a point in time.

HeidelTime uses TIMEX3 tags, which are based on TimeML (Pustejovsky et al. 2010), the most widely used annotation standard for temporal expressions. In the following, we briefly explain the value attribute of TIMEX3 annotations of Duration and Set expressions, as we consider their normalized values for a deeper analysis of the occurring temporal expressions. The value attribute of Duration and Set expressions contains information about the length of the duration that is mentioned, starting with P (or PT in the case of time-level durations) followed by a number and an abbreviation of the granularity (e.g., years: Y, months: M, weeks: W, days: D; hours: H, minutes: M). In addition, fuzzy expressions are referred to by X instead of precise numbers, e.g., several weeks is normalized to PXW, monthly is normalized to XXXX-XX and annually to XXXX.
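To make the value format concrete, the following minimal sketch splits simple Duration values such as P1D, PT10M or PXW into a count, a unit and an approximate length in seconds. The function names, the S unit for seconds (following ISO 8601) and the fixed lengths assumed for months and years are illustrative choices and are not part of HeidelTime; values that do not follow the plain P/PT pattern, such as XXXX-XX, are deliberately rejected.

```python
import re

# Seconds per unit. Month (30 days) and year (365 days) lengths are rough
# conventions assumed for this sketch.
DATE_UNITS = {"Y": 365 * 86400, "M": 30 * 86400, "W": 7 * 86400, "D": 86400}
# After the PT prefix, M means minutes; S (seconds) follows ISO 8601.
TIME_UNITS = {"H": 3600, "M": 60, "S": 1}

# P, optional T, a number or X (fuzzy), and a single granularity letter.
_DURATION = re.compile(r"^P(T?)(\d+|X)([YMWDHS])$")


def parse_duration(value):
    """Return (count, unit, seconds_per_unit); count is None for fuzzy values."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unsupported duration value: {value!r}")
    time_level, number, unit = match.groups()
    units = TIME_UNITS if time_level else DATE_UNITS
    if unit not in units:
        raise ValueError(f"unit {unit!r} does not fit the prefix in {value!r}")
    count = None if number == "X" else int(number)
    return count, unit, units[unit]


def duration_seconds(value):
    """Approximate length in seconds, or None when the count is fuzzy (X)."""
    count, _, seconds_per_unit = parse_duration(value)
    return None if count is None else count * seconds_per_unit
```

The PT prefix is what disambiguates M: P11M is read as eleven months, whereas PT10M is read as ten minutes.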

4.3 Extraction Quality

For meaningful analysis and substantiated conclusions about temporal expressions in our diachronic corpus, the extraction (and normalization) quality of the temporal tagger should be reliable. Although HeidelTime has been extensively evaluated before on a variety of corpora (see footnote 2), our corpus is quite different from standard temporal tagging corpora as it contains scientific documents from multiple scientific fields published across several centuries. Creating a proper gold standard with manual annotations covering all scientific fields across all time periods would not be feasible in an appropriate time frame. Instead, to assess temporal tagging quality on our corpus, we determine the correctness of the expressions tagged by the temporal tagger.

For this, we use precision, i.e., we randomly sample 250 instances for each time period and manually validate whether the automatically annotated temporal expressions are correctly extracted (see footnote 3). Here, we consider correctly extracted instances (right) and wrongly extracted instances (wrong). The latter are either cases of ambiguity (e.g., spring as ‘season’ or ‘water spring’, or current meaning ‘now’ or ‘electric current’) or temporal expressions wrongly assigned to numbers occurring in the text. In addition, we distinguish correctly assigned but not relevant instances (other), which are due to noise in the data itself. These are, e.g., temporal expressions assigned to reference sections (especially in the 1950–2000 periods) or used within tables (mostly in the earlier time periods).

Table 2 presents precision information and the number of instances per assigned category of right, other, and wrong. We consider the other instances to be correct in terms of precision of extraction. Across periods, precision ranges from 0.89 to 0.96.
Table 2. Precision across time periods.

5 Typicality of Temporal Expressions

To obtain temporal expressions typical of a time period, we use relative entropy, also known as Kullback-Leibler Divergence (KLD) (Kullback and Leibler 1951) – a well-known measure of (dis)similarity between probability distributions used in NLP, speech processing, and information retrieval. In comparison to relative frequency, i.e., the unconditioned probability of, e.g., a word over all words in a corpus, relative entropy is based on conditioned probability.

In information-theoretic parlance, relative entropy measures the average number of additional bits per feature (here: temporal expressions) needed to encode a feature of a distribution A (e.g., the 1650 time period) by using an encoding optimized for a distribution B (e.g., the 1700 time period). The more additional bits needed, the more distant A and B are. This is formalized as:
$$\begin{aligned} D(A||B) = \sum _{i} p(feature_i|A)\log _2 \frac{p(feature_i|A)}{p(feature_i|B)} \end{aligned}$$
where \(p(feature_i|A)\) is the probability of a feature (i.e., a temporal expression) in a time period A, and \(p(feature_i|B)\) the probability of that feature in a time period B. The term \(\log _2\frac{p(feature_i|A)}{p(feature_i|B)}\) is the difference between the log probabilities under both distributions (\(\log _2p(feature_i|A)-\log _2p(feature_i|B)\)), giving the number of additional bits. These are then weighted with the probability \(p(feature_i|A)\) so that the sum over all \(feature_i\) gives the average number of additional bits per feature, i.e., the relative entropy.
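As a minimal sketch of this formula, the code below computes the per-feature contributions and their sum from raw counts. The function names and the additive smoothing parameter alpha are assumptions made here for illustration, needed so that a feature unseen in B does not lead to a division by zero; the text above does not specify how such cases are handled.

```python
import math


def distribution(counts, vocabulary, alpha=1.0):
    """Additively smoothed p(feature | period) over a shared vocabulary.

    alpha is a hypothetical smoothing constant, not taken from the paper.
    """
    total = sum(counts.get(f, 0) for f in vocabulary) + alpha * len(vocabulary)
    return {f: (counts.get(f, 0) + alpha) / total for f in vocabulary}


def pointwise_kld(p_a, p_b):
    """Contribution of each feature to D(A||B), in bits."""
    return {f: p_a[f] * math.log2(p_a[f] / p_b[f]) for f in p_a}


def kld(p_a, p_b):
    """Relative entropy D(A||B): the sum of all per-feature contributions."""
    return sum(pointwise_kld(p_a, p_b).values())
```

Ranking features by their individual contribution from pointwise_kld is one way to read off which expressions are most distinctive for period A in a given comparison.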

In terms of typicality, the more bits are used to encode a feature, the more typical that feature is for a given time period vs. another time period. Thus, in a comparison of two time periods (e.g., 1650 vs. 1700), the higher the KLD value of a feature for one time period (e.g., 1650), the more typical that feature is for that given time period. In addition, we test for significance of a feature by an unpaired Welch’s t-test. Thus, features considered typical are distinctive according to KLD and show a p-value below a given threshold (e.g., 0.05).

To compare typical features across several time periods, the highest-ranking features of each comparison are considered. For example, for 1650 we obtain six feature sets typical of 1650 as we have six comparisons of 1650 with each of the other six time periods (i.e., a feature set for features typical of 1650 vs. 1700, of 1650 vs. 1750, etc.). If features are shared across feature sets and are high ranking (e.g., in the top 5), these features are considered to be typical of 1650. In other words, these are features ranking high in terms of KLD, significant in terms of p-value, and typical of a time period across all/most comparisons with other time periods. As in our case we consider seven time periods, features are considered typical if they rank high for one time period in 4 to 6 feature sets (i.e., typical in more than half of the comparisons).
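The selection across comparisons can be sketched as follows. The function name and the input layout are hypothetical: each compared period maps to the target period's features, already sorted by descending KLD contribution and already filtered by the significance test. The defaults top_n=5 and min_comparisons=4 mirror the values mentioned above.

```python
from collections import Counter


def typical_features(ranked_by_comparison, top_n=5, min_comparisons=4):
    """Features in the top_n of at least min_comparisons pairwise comparisons.

    ranked_by_comparison maps each compared period to a list of features of
    the target period, ordered from highest to lowest KLD contribution.
    """
    hits = Counter()
    for ranked in ranked_by_comparison.values():
        # A set guards against counting a feature twice within one comparison.
        hits.update(set(ranked[:top_n]))
    return sorted(f for f, n in hits.items() if n >= min_comparisons)
```

With seven time periods there are six comparisons per period, so a threshold of four keeps only features that are typical in more than half of them.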

6 Analysis

In the following, we analyze diachronic tendencies of temporal expressions from the period of 1650 to 2000 in terms of (1) relative frequency (i.e., unconditioned probabilities), and (2) typicality (i.e., conditioned probabilities of expressions in one vs. the other time periods as described in Sect. 5).

We show how the notion of typicality based on relative entropy leads to valuable insights on the change of temporal expressions in scientific writing w.r.t. more and less frequent expressions.

6.1 Frequency-Based Diachronic Tendencies

Comparing temporal types across fifty-year time periods in terms of frequency (see Fig. 1 showing log of frequency per million (pM)), Date is the most frequent type, followed by Duration. Set and Time expressions are less frequent. In addition, while Date remains relatively stable over time, expressions of Duration, Set and Time drop considerably from 1850 onwards, becoming relatively rare.

Fig. 1. Diachronic tendencies of temporal expression types in scientific writing.

6.2 Diachronic Tendencies of ‘Typical’ Temporal Expressions

Inspecting diachronic change through the lens of relative entropy (as described in Sect. 5) allows us to consider temporal expressions typical of one time period when compared to the other time periods. We study each type of temporal expression and carefully select the base of comparison.

Date. Considering Date expressions, instead of comparing single dates (which mostly occur only once in the corpus, such as June 3, 1769), we take a level of abstraction and consider part-of-speech (POS) sequences of annotated Date expressions to better inspect the types of changes that might have occurred over time. For each Date expression, we extract POS sequences and use relative entropy to detect typical POS sequences of temporal expressions for each time period.
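The abstraction from lexical realizations to POS sequences can be sketched as below. The data layout is an assumption made for illustration: each sentence is a list of (token, POS tag) pairs, and each annotated expression is given as a (start, end) token span with an exclusive end. Punctuation tokens are assumed to carry their own tag.

```python
from collections import Counter


def pos_sequence(tagged_tokens, start, end):
    """POS sequence of the tokens in the span [start, end)."""
    return " ".join(pos for _, pos in tagged_tokens[start:end])


def pos_sequence_counts(sentences):
    """Count POS sequences over (tagged_tokens, spans) pairs of one period."""
    counts = Counter()
    for tagged_tokens, spans in sentences:
        for start, end in spans:
            counts[pos_sequence(tagged_tokens, start, end)] += 1
    return counts
```

Counts of this kind, collected per time period, can then serve as the feature counts from which the period-wise probability distributions are estimated.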
Table 3. Typical POS sequences of Date.

Example realizations of the typical POS sequences: in the Spring; in Winter; in Summer; March 8; the 6th of March; June 3, 1769; April 19; 2 June; the Spring; June 18, 1784; in 1858; current work; mid seventeenth century; the last decades; the 1990s; late seventeenth century.

CD: cardinal number, DT: determiner, IN: preposition, JJ: adjective, NN: sing. common noun, NNS: pl. common noun, NP: sing. proper noun, RB: adverb

Fig. 2. Specificity (black) and interval (gray) of typical Date expressions.

Table 3 shows POS sequences typical of one time period vs. 4 to 6 other time periods (see column comp. and footnote 4). For example, for 1650 the POS sequences Determiner-Noun (DT-NN) and Proper Noun (NP) are quite typical; in terms of lexical realizations, both are temporal expressions referring to seasons (see Example 1). Both POS sequences are typical of 1650 vs. 1750 to 2000 (i.e., 5 comparisons). If we consider the POS sequences that are typical across time periods and their lexical realizations, there seems to be a development in terms of specificity and interval (see Fig. 2).

To capture the notion of specificity, we consider how many pieces of temporal information are given by a POS sequence to make a temporal expression most specific, with a scale from 1 to 4, where 1 is least specific (e.g., NP denoting seasons such as Winter, as we do not know of which year etc.) and 4 is most specific (e.g., NP CD, CD such as June 3, 1769, which gives us an exact date; see footnote 5). For comparison across time periods, Fig. 2 shows the average of the specificity count over all typical POS sequences of a time period (black line). For the interval of typical Date expressions, the number of days the expressions refer to is used (see footnote 6; shown in log in Fig. 2, gray line). The more specific an expression becomes, the smaller the interval it refers to and vice versa.
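The specificity scale can be written down as a small scoring function. This is a sketch of the scale as described in the text and in footnote 5 (one point for the mere occurrence of a temporal expression, plus one point each for day, month and year information); the function names and the boolean flags are hypothetical and would have to be derived from the POS sequence and its realization.

```python
def specificity(has_day=False, has_month=False, has_year=False):
    """Score from 1 (e.g., Winter) to 4 (e.g., June 3, 1769)."""
    return 1 + int(has_day) + int(has_month) + int(has_year)


def mean_specificity(expressions):
    """Average specificity over the typical expressions of one time period.

    expressions is a non-empty list of dicts with the keyword arguments
    accepted by specificity().
    """
    scores = [specificity(**flags) for flags in expressions]
    return sum(scores) / len(scores)
```

Under this scoring, a year such as 1858 receives 2 points and a month-day expression such as March 8 receives 3 points, in line with the examples given in footnote 5.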

Figure 2 also shows how temporal expressions move from relatively unspecific (e.g., in the Spring in 1650) to very specific (June 18, 1784 in 1800) and back to unspecific expressions (e.g., the last decades in 2000). The interval moves instead from a wider to a smaller span and back to a wider span in 2000. Investigating the contexts in which these expressions arise gives further insights. While in the early time periods, mentions of seasons are typical, from 1800 to 1850, expressions giving an exact date, year or month are typical. These expressions are used to present exact dates of observations made by a researcher at several points in time, especially in the field of astronomy (see Example 2). From 1950 onwards, typical Date expressions become less explicit, relating to broader (e.g., the 1970s in Example 3) and less specific (e.g., current literature in Example 3) temporal reference. These expressions are used, e.g., in the context of previous work descriptions in introduction sections of research papers.

Example 1

  • In Winter it will need longer infusion, than in the Spring or Autumn. \(\mathrm {(1650)}\)

  • The difference between these two plants is this; the papaver corniculatum dies to the root in the winter, and sprouts again from its root in the spring; \(\mathrm {(1750)}\)

Example 2

  • March 4, 1783. With a 7-feet reflector, I viewed the nebula near the 5th Serpentis, discovered by Mr. MESSIER, in 1764. \(\mathrm {(1750)}\)

Example 3

  • In the 1970s, Rabin [38] and Solovay and Strassen [44] developed fast probabilistic algorithms for testing primality and other problems. \(\mathrm {(2000)}\)

  • There is a significant confusion in the current literature on “cellular” or “tessellation arrays” concerning the concept of a “Garden-of-Eden configuration”. \(\mathrm {(1950)}\)

Time. To investigate typical Time expressions, similarly to Date expressions, we consider their POS sequences (see Table 4).

It can be seen that only the intermediate time periods show typical expressions (1750 to 1850). In terms of granularity, in the period of 1750, expressions are less granular pointing to broader sections of a day (e.g., morning, evening) mostly used to describe observations made (see Example 4). In the 1850 period, expressions point to specific hours of a day (e.g., 9 A.M.) mostly in descriptions of experiments.
Table 4. Typical POS sequences of Time.

Example realizations of the typical POS sequences: Sunday morning; next morning; 10 A.M.; 7 A.M.; the evening of the 28th of August; about 8 A.M.

CD: cardinal number, DT: determiner, IN: preposition, JJ: adjective, NN: sing. common noun, NP: sing. proper noun

Example 4

  • Monday morning she appeared well, her pulse was calm, and she had no particular pain. \(\mathrm {(1750)}\)

  • There being usually but one assistant, it was impossible to observe during the whole twenty-four hours; the hours of observation selected were therefore from 3 A.M. to 9 P.M. inclusive. \(\mathrm {(1850)}\)

Duration. For Duration expressions, we consider their TIMEX3 value, as it directly encodes normalized information on the duration length and granularity of temporal expressions. Figure 3 shows typical TIMEX3 values (e.g., P1D for expressions such as one day) of specific time periods (see footnote 7). The y-axis shows the duration length in seconds on a log scale. In general, duration length decreases from 1750 to 1850 (with expressions of seconds and hours, which are more granular) and increases in 1950 and 2000 (with expressions of decades, which are less granular).

Fig. 3. Diachronic tendencies of typical Duration expressions.

We then again consider the contextual environments of these typical expressions. In the earlier time periods (1650 and 1700), day and year expressions are typical, mostly relating to observations or experiment descriptions (see Example 5).

Example 5

  • After the eleven Months, the Owner having a mind to try, how the Animal would do upon Italian Earth, it died three days after it had changed the Earth. \(\mathrm {(1650)}\)

  • [...] the Opium, being cut into very thin slices, [...] is to be put into, and well mixed with, the liquor, (first made luke-warm) and fermented with a moderate Heat for eight or ten Days, [...]. \(\mathrm {(1650)}\)

From the period of 1750 to 1950, duration length is relatively low with expressions of seconds, minutes and hours being typical of these time periods. These expressions are mainly related to observations in the 1750 period and experiment descriptions from 1800 to 1950 (see Example 6).

Example 6

  • June 4, the weather continued much the same, and about 9h 30 in the evening, we had a shock of an earthquake, which lasted about four seconds, and alarmed all the inhabitants of the island. \(\mathrm {(1750)}\)

  • [...] the glass produced by this fusion was in about twelve hours dissolved, by boiling it in a proper quantity of muriatic acid. \(\mathrm {(1800)}\)

  • In a few hours a mass of fawn-coloured crystals was deposited; \(\mathrm {(1850)}\)

  • The patient is then switched to the re-breathing system containing 133 Xenon at 5 mCi/1 for a period of one minute, and then returned to room air for a period of ten minutes. \(\mathrm {(1950)}\)

Fig. 4. Diachronic tendencies of typical Set expressions in scientific writing.

In 1950, besides weeks and minutes, which relate to experiment descriptions (see Example 7), expressions of decades are typical. The latter is also true for 2000. In both periods, expressions relating to decades refer to previous work (see Example 7).

Thus, Duration shifts from being used for purposes of observational to experimental science and finally to previous work references in the latest time periods.

Example 7

  • For each speaker, performance was observed across numerous repetitions of the vocabulary set within a single session, as well as across a 2-week time period. \(\mathrm {(1950)}\)

  • It constitutes the usual drift-diffusion transport equation that has been successfully used in device modeling for the last two decades. \(\mathrm {(1950)}\)

  • Provably correct and efficient algorithms for learning DNF from random examples would be a powerful tool for the design of learning systems, and over the past two decades many researchers have sought such algorithms. \(\mathrm {(2000)}\)

Set. For Set expressions, their TIMEX3 value is again considered. Figure 4 shows typical expressions with the times per year of a Set expression on the y-axis in log (see footnote 8), mirroring also less granular (annually) and more granular (every day) expressions. As we have seen from Fig. 1, Set expressions are relatively rare in scientific writing and strongly decrease over time (see Sect. 6.1). This is also reflected in the few temporal expressions typical of each time period in Fig. 4. In terms of granularity, there is a shift from day to month expressions (see Example 8). Interestingly, for the latter, there has been a move from a noun phrase expression (every/each month) to an adverb expression (monthly). While in the intermediate periods (1800 and 1850) both expressions are typical, in 1950 only monthly is typical. In 1750 to 1850, every/each month expressions relate to observations done on a monthly basis of which the mean or average is drawn, and the same applies to monthly used with mean as a term (see Examples 8 and 9). In 1950, instead, monthly is solely used as an adverb. Thus, there is a replacement of longer noun phrase expressions (every/each month) by the shorter adverb expression monthly.

Example 8

  • Besides this, you may there see, that every day the Sun sensibly passes one degree from West to East, [...]. \(\mathrm {(1650)}\)

  • In order to determine the annual variations of the barometer, I have taken the mean of the observations in each month, [...]. \(\mathrm {(1800)}\)

Example 9

  • The mean was then taken in every month of every lunar hour (attending to the signs), and the monthly means were collected into yearly means. \(\mathrm {(1850)}\)

  • A disk resident file of all current recipient numbers is created monthly from the eligibility tape file supplied by Medical Services Administration. \(\mathrm {(1950)}\)

7 Discussion and Conclusion

We have presented an approach to investigate diachronic change in the usage of temporal expressions. First, we use temporal tagging to obtain a more comprehensive coverage of possible temporal expressions, rather than investigating specific expressions only, as was the case in previous work. Evaluation of the tagging results showed high precision (0.89 to 0.96) across time periods.

Second, we use relative entropy to detect typical temporal expressions of specific time periods. A clear advantage over frequency-based accounts is that, with relative entropy, frequent as well as rare phenomena can be investigated in terms of their ‘typicality’ according to a variable (here: temporal expressions typical of specific time periods). Apart from gaining knowledge on diachronic changes specific to different types of temporal expressions, we also capture more abstract and more fine-grained shifts. On a more abstract level, while our findings confirm the paradigm shift from the more observational to the more experimental character of scientific writing (cf. Fankhauser et al. (2016); Gleick (2010)) for Date and Duration expressions, we also show the tendency towards previous work descriptions for these two temporal types in contemporary scientific writing. On a more fine-grained level, for Set (a rarely used temporal type especially towards more contemporary time periods), there is a linguistic shift from longer noun-phrase to shorter adverb expressions.

These findings are not only interesting in historical linguistic terms, but are also relevant for improving the adaptation of temporal taggers to different time periods. Especially for recall, gold-standard annotations are needed. Since creating them is a resource- and time-consuming task, our approach can help in gaining insights on the use of typical temporal expressions in specific contexts across periods. These contexts can then be further exploited in terms of possible temporal expression occurrences to achieve better recall. In addition, temporal expressions might change in terms of linguistic realization, as with the Set type in our case. Accounting for shifts in linguistic realization will also improve recall. While this is true for diachronic variation, the approach also generalizes to domain-specific variation. In future work, we plan to work in this direction, further elaborating our methodology for diachronic and domain variation.


  1. Note that here domain is defined as a group of documents sharing the same characteristics for the task of temporal tagging, cf. Strötgen and Gertz (2016).

  2. E.g., on news articles as in the TempEval competitions (Verhagen et al. 2010; UzZaman et al. 2013) and on Wikipedia articles contained in the WikiWars corpus (Mazur and Dale 2010).

  3. We chose to use a fixed number of instances per period rather than a fixed number of documents, due to possible sparsity of temporal expressions within documents. This also allows us to validate the same number of instances across time periods, rather than varying numbers of instances across time periods.

  4. Note that the examples in Table 3 show the most frequent realizations for relatively generic expressions such as seasons (e.g., in the Spring) or examples taken randomly from the corpus for specific dates (e.g., June 3, 1769), as they occur only once.

  5. On this scale, 1 denotes the mere occurrence of a temporal expression (e.g., NP such as Winter), 4 denotes the mere expression plus the inclusion of day, month and year (e.g., NP CD, CD such as June 3, 1769, which is a temporal expression + day + month + year resulting in 4 points in total), and 2 and 3 denote an occurrence of a temporal expression plus either one or two of day, month and year (e.g., CD such as 1858 for 2, i.e., temporal expression + year, and NP CD such as March 8 for 3, i.e., temporal expression + month + day).

  6. We use ‘day’ because Date expressions refer to a point in time of the granularity ‘day’ or coarser, i.e., ‘day’ is the smallest unit.

  7. Note also that typical expressions can either be relatively explicit (e.g., P1D for 24 h) or fuzzy (indicated by an X in the TIMEX3 value, e.g., few hours for the value PTXH).

  8. For example, every day corresponds to 365 times a year, while annually corresponds to once a year.



This work is partially funded by Deutsche Forschungsgemeinschaft (DFG) under grant SFB 1102: Information Density and Linguistic Encoding (www.sfb1102.uni-saarland.de). We are also indebted to Stefan Fischer for his contributions to corpus processing and Elke Teich for valuable improvement suggestions. Also, we thank the anonymous reviewers for their valuable comments.


  1. Agarwal, P., Strötgen, J.: Tiwiki: searching Wikipedia with temporal constraints. In: Proceedings of the 26th International Conference on World Wide Web (WWW 2017), Companion Volume, pp. 1595–1600 (2017)
  2. Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832–843 (1983)
  3. Alonso, O., Strötgen, J., Baeza-Yates, R., Gertz, M.: Temporal information retrieval: challenges and opportunities. In: Proceedings of the 1st International Temporal Web Analytics Workshop (TWAW 2011), pp. 1–8 (2011)
  4. Atkinson, D.: Scientific Discourse in Sociohistorical Context: The Philosophical Transactions of the Royal Society of London, 1675–1975. Routledge, New York (1999)
  5. Biber, D., Finegan, E.: Drift and the evolution of English style: a history of three genres. Language 65(3), 487–517 (1989)
  6. Biber, D., Johansson, S., Leech, G., Conrad, S., Finegan, E.: Longman Grammar of Spoken and Written English. Longman, Harlow (1999)
  7. Campos, R., Dias, G., Jorge, A.M., Jatowt, A.: Survey of temporal information retrieval and related applications. ACM Comput. Surv. 47(2), 15:1–15:41 (2014)
  8. Dagan, I., Lee, L., Pereira, F.: Similarity-based models of word cooccurrence probabilities. Mach. Learn. 34(1–3), 43–69 (1999)
  9. Degaetano-Ortlieb, S.: Variation in language use across social variables: a data-driven approach. In: Proceedings of the Corpus and Language Variation in English Research Conference (CLAVIER 2017) (2017)
  10. Degaetano-Ortlieb, S., Teich, E.: Information-based modeling of diachronic linguistic change: from typicality to productivity. In: Reiter, N., Alex, B., Zervanou, K.A. (eds.) Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2016), pp. 165–173. ACL (2016)
  11. Degaetano-Ortlieb, S., Teich, E.: Modeling intra-textual variation with entropy and surprisal: topical vs. stylistic patterns. In: Alex, B., Degaetano-Ortlieb, S., Feldman, A., Kazantseva, A., Reiter, N., Szpakowicz, S. (eds.) Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH and CLfL 2017), pp. 68–77. ACL (2017)
  12. Degaetano-Ortlieb, S., Kermes, H., Lapshinova-Koltunski, E., Teich, E.: SciTex - a diachronic corpus for analyzing the development of scientific registers. In: Bennett, P., Durrell, M., Scheible, S., Whitt, R.J. (eds.) New Methods in Historical Corpus Linguistics. Corpus Linguistics and Interdisciplinary Perspectives on Language - CLIP, vol. 3, pp. 93–104. Narr, Tübingen (2013)
  13. Fankhauser, P., Knappen, J., Teich, E.: Topical diversification over time in the Royal Society Corpus. In: Proceedings of Digital Humanities (DH 2016) (2016)
  14. Fischer, F., Strötgen, J.: When does (German) literature take place? On the analysis of temporal expressions in large corpora. In: Proceedings of Digital Humanities (DH 2015) (2015)
  15. Gleick, J.: At the beginning: more things in heaven and earth. In: Bryson, B. (ed.) Seeing Further. The Story of Science and The Royal Society, pp. 17–36. Harper Press, New York (2010)
  16. Jatowt, A., Antoine, É., Kawai, Y., Akiyama, T.: Mapping temporal horizons: analysis of collective future and past related attention in Twitter. In: Proceedings of the 24th International Conference on World Wide Web (WWW 2015), pp. 484–494 (2015)
  17. Kanhabua, N., Blanco, R., Nørvåg, K., et al.: Temporal information retrieval. Found. Trends Inf. Retr. 9(2), 91–208 (2015)
  18. Kermes, H., Degaetano-Ortlieb, S., Khamis, A., Knappen, J., Teich, E.: The Royal Society Corpus: from uncharted data to corpus. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pp. 1928–1931. ELRA (2016a). ISBN 978-2-9517408-9-1
  19. Kermes, H., Knappen, J., Khamis, A., Degaetano-Ortlieb, S., Teich, E.: The Royal Society Corpus: towards a high-quality corpus for studying diachronic variation in scientific writing. In: Proceedings of Digital Humanities (DH 2016) (2016b)
  20. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
  21. Lafferty, J., Zhai, C.: Document language models, query models, and risk minimization for information retrieval. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001), pp. 111–119. ACM (2001). https://doi.org/10.1145/383952.383970. ISBN 1-58113-331-6
  22. Lee, K., Artzi, Y., Dodge, J., Zettlemoyer, L.: Context-dependent semantic parsing for time expressions. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), pp. 1437–1447. ACL (2014)
  23. Mazur, P., Dale, R.: The DANTE temporal expression tagger. In: Vetulani, Z., Uszkoreit, H. (eds.) LTC 2007. LNCS (LNAI), vol. 5603, pp. 245–257. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04235-5_21
  24. Mazur, P., Dale, R.: WikiWars: a new corpus for research on temporal expressions. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), pp. 913–922. ACL (2010)
  25. Meister, J.C.: Tagging time in Prolog: the temporality effect project. Lit. Linguist. Comput. 20, 107–124 (2005)
  26. Pustejovsky, J.: Temporal and event information in natural language text. Lang. Res. Eval. 39(2–3), 123–164 (2005)
  27. Pustejovsky, J., Lee, K., Bunt, H., Romary, L.: ISO-TimeML: an international standard for semantic annotation. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pp. 394–397. ELRA (2010)
  28. Strötgen, J., Gertz, M.: HeidelTime: high quality rule-based extraction and normalization of temporal expressions. In: Proceedings of the 5th International Workshop on Semantic Evaluation (SemEval 2010), pp. 321–324. ACL (2010)
  29. Strötgen, J., Gertz, M.: Temporal tagging on different domains: challenges, strategies, and gold standards. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 3746–3753. ELRA (2012)
  30. Strötgen, J., Gertz, M.: Domain-sensitive Temporal Tagging. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers, San Rafael (2016)
  31. Strötgen, J., Bögel, T., Zell, J., Armiti, A., Van Canh, T., Gertz, M.: Extending HeidelTime for temporal expressions referring to historic dates. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 2390–2397. ELRA (2014)
  32. Tabassum, J., Ritter, A., Xu, W.: TweeTime: a minimally supervised method for recognizing and normalizing time expressions in Twitter. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), pp. 307–318. ACL (2016). https://aclweb.org/anthology/D16-1030
  33. Teich, E., Degaetano-Ortlieb, S., Kermes, H., Lapshinova-Koltunski, E.: Scientific registers and disciplinary diversification: a comparable corpus approach. In: Proceedings of 6th Workshop on Building and Using Comparable Corpora (BUCC 2013), pp. 59–68. ACL (2013)
  34. UzZaman, N., Llorens, H., Derczynski, L., Allen, J.F., Verhagen, M., Pustejovsky, J.: SemEval-2013 task 1: TempEval-3: evaluating time expressions, events, and temporal relations. In: Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013), pp. 1–9. ACL (2013)
  35. Verhagen, M., Saurí, R., Caselli, T., Pustejovsky, J.: SemEval-2010 task 13: TempEval-2. In: Proceedings of the 5th International Workshop on Semantic Evaluation (SemEval 2010), pp. 57–62. ACL (2010)

Copyright information

© The Author(s) 2018

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Saarland University, Saarbrücken, Germany
  2. Max Planck Institute for Informatics, Saarbrücken, Germany
