Relations in the representation of word meaning
Dedre Gentner and her colleagues (Asmuth & Gentner, 2005, 2017; Gentner, 1982; Gentner & France, 1988) proposed that, whereas many nouns denote specific entities (e.g., dog, lion, man), the meaning of verbs is inherently relational. For example, the notion of buying can only be exemplified using an entity such as a woman, who is performing the action on a different entity, such as a computer. That is, the verb buy can only be used in reference to other entities, frequently denoted using nouns. The denoted action can therefore be thought of as identifying a relationship between the entities involved. More generally, verbs denote relations between entities. Relatedly, Gentner (2006) argued that concrete nouns are easier to learn because they are inherently individuated and more easily separable from the environment. In contrast, the relational nature of verbs makes their meaning more dependent on the context in which they appear. Likewise, Gentner and France (1988) proposed that this contextual sensitivity explains why participants in their studies preferred to adjust the meanings of verbs rather than those of concrete nouns when paraphrasing sentences such as “The lizard worshipped.” Asmuth and Gentner (2005, 2017) demonstrated that such results can be seen not only when contrasting nouns and verbs, but also when contrasting concrete nouns (such as lion) with nouns that denote relational meanings (such as threat).
Interestingly, this hypothesis has implications for the structure of language and for our expectations regarding the uses of nouns and verbs more generally. In particular, such adjustments of meaning are an essential aspect of metaphor. The hypothesis that verbs are more relational than nouns therefore predicts that it is easier to use verbs metaphorically than it is to use concrete nouns. Moreover, linguistic theories of semantic change have long argued that metaphorical use is one of the primary avenues through which the meanings of words are changed and extended (Traugott & Dasher, 2001).
Consequently, we can hypothesize that a word that is more contextually sensitive should appear in a greater variety of contexts and, more importantly, change its meaning more over time. However, such changes take place over long periods of time and are likely to be infrequent, which makes them difficult to observe and measure in lab studies. In contrast, we have access to a variety of textual sources that were created over long periods of time. Statistical methods can thus be applied to test the hypothesis that particular classes of words, such as verbs, vary more across these texts than do words that are arguably less relational, such as concrete nouns. We can then examine the variability of context within a period, as well as how contexts change across periods.
Measuring semantic change in a diachronic corpus
Semantic change has traditionally been measured on a word-by-word basis. Researchers identify a word whose meaning they are interested in tracing, collect the contexts in which it appears over a period of time (often centuries), and record its use in each case. The hypothesis of semantic change can then be tested by examining trends in the word's uses over time. One famous example is the rise of periphrastic do, traced by Ellegård (1953). The word do used to have a specific verb meaning in Old and Middle English: it denoted a causative relation (e.g., “did him gyuen up,” the Peterborough Chronicle, ca. 1154). In modern English, do is more frequently used as a grammatical function word (e.g., “Do you like it?”).
Although do started out as a verb with a meaning that was quite relational, its meaning was still less context-sensitive in Middle English than it is in Modern English. By measuring the variety of contexts in which do appears, Sagi, Kaufmann, and Clark (2012) demonstrated this shift using corpus statistics, with results that correspond to Ellegård’s hand-coded measures. The measure they used is essentially a measure of the variability of the contexts in which the word appears within each period. The contextual variability in the uses of do exhibits a marked increase between the 15th and 16th centuries.
However, not all changes in meaning result in broadening, as was the case for periphrastic do. In many cases, change is limited to the addition of a handful of new uses or the loss of an old one. It might therefore be useful to also examine change in use as a shift in contexts, rather than simply an increase in their variability. One possible source of such shifts, through metaphoric extension, is the effect of conceptual framing, such as the framing of terror as an act of war instead of as a crime following the events of September 11, 2001 (see Lakoff, 2009; for a related computational method, see Sagi, Diermeier, & Kaufmann, 2013).
The similarity measure used in LSA and other methods of corpus statistics provides one possible method for tracing these changes. In particular, the more similar the uses of a word in one period are to its uses in another, the less likely it is that semantic change has occurred. Conversely, if a word has undergone a shift in its meaning or in how it is used, we might expect it to appear in a different set of contexts in the new period than it did in the old. For example, the word computer used to mean a person who computes; this meaning was largely replaced by its current use, referring to a class of machines. Therefore, the vectors representing the new contexts should be farther from the vectors representing the old contexts than they would be for a word whose meaning did not change (or changed to a lesser degree). By examining these measures across a large number of words, we can use statistics to identify trends in semantic change.
It is important to distinguish between two distinct sources of variability in contexts of use over time. In the first case, words with broader meanings (periphrastic do presents an extreme case of such words), are likely to be used in a wide variety of ways. Consequently, their uses will vary greatly within a time period, as well as across time periods. Semantic change presents a second source of variability of contexts over time. In this case, a drift, or change, in the meaning of a word, such as computer, or the addition of new meanings, result in a change in the contexts that a word is used in over time. Importantly, without semantic change, the broadness of application of a word remains constant over time, and therefore can be expected to be largely constant regardless of the time span involved. In contrast, drifts in meaning can be expected to accumulate over time and therefore show an increase in contextual variability as the time span examined increases. That is, whereas the effect of broadness of meaning on contextual variability does not depend on the length of time between uses, semantic change should show an increase in variability for longer time periods. For example, variability in contexts due to the broadness of applicability of a word should be the same whether measured over 25 years or 50 years, whereas semantic change would be expected to result in higher variability when measured over 50 years than when measured over 25 years.
Finally, an additional source of variability in the context of use of words over time comes from changes in the use of other words. That is, a shift in which the word man appears frequently with the word silly at one time period, but more frequently with blessed in another, might occur not because the meaning of man has changed, but because of pejoration in the meaning of the word silly, whose uses are then replaced by blessed. For the purposes of the present study, since each word was analyzed in isolation by comparing its uses in one period to its later uses, these changes will be treated as statistical noise, under the assumption that they are uniformly distributed across the corpus and do not vary by grammatical category.
To identify changes in word meaning in modern English, a corpus of 19th-century texts was collected from Project Gutenberg (www.gutenberg.org; Lebert, 2011), using the bulk of the English-language literary works available through the project’s website. This resulted in a corpus of 4,034 separate documents, consisting of over 240 million words. The Project Gutenberg preamble was removed from the books prior to analysis. Infomap (2007; Takayama, Flournoy, Kaufmann, & Peters, 1998) was used to generate a semantic space based on this corpus, using its default settings (the 20,000 most frequent content words, a co-occurrence window of ±15 words, and a 100-dimension space) and its default stop list.
Dating texts from the 19th century is difficult because publication dates are often not readily available. Moreover, when considering language change, the publication date might not be the relevant date to use, because the manuscript might have been written years earlier. The analysis was therefore based on the birth dates of the authors, which were easily obtained and are relevant from a linguistic perspective, since much of language learning occurs within the first few years of life. For the purposes of the analysis below, texts were also aggregated into 25-year periods. Consequently, the texts from 3,490 books written by authors born in the 19th century were used in the present analysis (1800–1824, 887 books; 1825–1849, 1,020 books; 1850–1874, 1,243 books; 1875–1899, 340 books).
Nouns and verbs
The nouns and verbs used in the study came from two sources. First, high-frequency nouns and verbs were collected from the 500 most frequent words in the corpus. This procedure selects words that are in frequent use and have a relatively stable meaning. The grammatical categories of the high-frequency words were determined on the basis of the MRC2 database (Wilson, 1988). Nouns that were only rarely used as verbs were counted as nouns, and vice versa. Out of the 500 words, 168 nouns and 95 verbs met this selection criterion (roughly 52.6% of the high-frequency words examined). Although this list includes more nouns than verbs, this is to be expected when examining high-frequency words. Nevertheless, the nouns and verbs are relatively equally interspersed among the list of the 500 most frequent words in the corpus. Specifically, the mean frequency of the nouns was 46,486 (SD = 2,884.78), and the mean frequency of the verbs was 43,077 (SD = 3,289.40). The two categories did not significantly differ in frequency, t(258) = 0.744, p = .46.
Second, the study employed a list of frequency-matched relational nouns, entity nouns, and verbs obtained from Dedre Gentner, which was based on the lists used in previous studies (primarily from Asmuth & Gentner, 2005). This list comprised 70 entity nouns (e.g., emotion, fruit), 81 relational nouns (e.g., game, marriage), and 76 frequency-matched verbs (e.g., buy, explain).
Calculating context vectors
This study was designed to compare the variability of different word classes across uses and time. The analysis was based on the precomputed semantic space generated from the corpus of 19th-century texts described above. This space provides vector representations for 20,000 words; however, these vectors are computed as aggregates over the entire corpus.
We can employ vector arithmetic to compute the vectors representing the use of a word, such as man, in a particular subset of a corpus (such as a particular book, an author, or a time period). This is done by aggregating the contextual representations of the word and essentially averaging them together. Specifically, we can calculate the context vector of each occurrence of the target word by summing the vectors of the words that appear in its context and normalizing the resulting vector to unit length. Following the convention used in Infomap, the contexts used here comprised the 15 words that preceded the target word and the 15 words that followed it, for a total of 30 words. After computing a context vector for each appearance of the target word (e.g., man) in the selected subset, we can average the resulting context vectors together using the same vector addition and normalization process. The resulting vector represents the centroid of the vectors it aggregates and is functionally equivalent to the mean in a scalar context.
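As a concrete illustration, this aggregation amounts to a few lines of vector arithmetic. The sketch below is not Infomap's API: it assumes the word vectors from the semantic space are available as a dictionary mapping words to NumPy arrays, and all function and variable names are illustrative.

```python
import numpy as np

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def context_vector(tokens, position, word_vectors, window=15):
    """Sum the vectors of the words within +/- `window` tokens of the
    occurrence at `position`, then normalize the sum to unit length."""
    dim = len(next(iter(word_vectors.values())))
    total = np.zeros(dim)
    lo, hi = max(0, position - window), min(len(tokens), position + window + 1)
    for i in range(lo, hi):
        if i != position and tokens[i] in word_vectors:
            total += word_vectors[tokens[i]]
    return normalize(total)

def centroid(vectors):
    """Average a set of unit-length context vectors by summing and
    renormalizing; the result is the centroid of the set."""
    return normalize(np.sum(vectors, axis=0))
```

Aggregating the context vectors of every occurrence of a word within, say, one 25-year period with `centroid` then yields the vector representing its use in that period.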
Measuring vector similarity
We can gauge the similarity of two vector representations by examining the angle between them: similar vectors will point in similar directions and will have a small angle, whereas differing vectors will point in different directions and will therefore exhibit a larger angle. For vectors of unit length, the cosine of the angle is equivalent to the Pearson correlation between the components of each vector, which is the basic measure of similarity used in this article.
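As a minimal sketch (function name illustrative), the cosine of the angle between two context vectors can be computed as:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; for vectors already
    normalized to unit length this reduces to their dot product."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```

Identical directions yield a similarity of 1, orthogonal directions a similarity of 0.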
Computing contextual variability
The variability associated with an aggregate vector (such as a vector that represents a word in a subset of the corpus, as described above) can be conceptualized as the variability of the vectors constituting it. We can therefore measure such variability by examining the similarity of each constituent vector to the centroid, much as the variance of individual data points is measured in relation to their mean. Importantly, since the correlation measure is higher for context vectors that are closer to the centroid (with a maximal value of 1 for vectors identical to the centroid), higher numbers indicate less variability. To aid interpretation, this correlation will be referred to below as a measure of uniformity. Finally, its distribution is asymmetric, with maximal uniformity at one end and maximal variability at the other. As a result, there is no need to take the absolute value of the differences or to square them in order to get an accurate estimate of uniformity for comparisons.
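Under these definitions, uniformity is simply the mean similarity of the individual context vectors to their centroid. A minimal sketch, assuming the context vectors are unit-length NumPy arrays (names are illustrative, not taken from the study's code):

```python
import numpy as np

def uniformity(context_vectors):
    """Mean similarity of each unit-length context vector to the centroid
    of the set.  A value of 1 means all contexts are identical; lower
    values indicate greater contextual variability."""
    c = np.sum(context_vectors, axis=0)
    c = c / np.linalg.norm(c)
    # For unit-length vectors, the dot product equals the cosine similarity.
    return float(np.mean([np.dot(v, c) for v in context_vectors]))
```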
Nouns and verbs
The first analysis compared the variability in context and the change in word use over time of high-frequency nouns and verbs. For each word, the vectors representing the contexts in which the word appeared were examined in groups of texts spanning 25-year periods, based on the birth dates of the authors. Two hypotheses were tested: first, that verbs are used in more varied textual contexts than nouns, and second, that the contexts in which verbs appear change more rapidly over time than do the contexts in which nouns appear. To test the second hypothesis, changes in the mean textual contexts were compared over two time scales, 25 and 50 years. Importantly, observing higher variability over periods of 50 years than over periods of 25 years would demonstrate that change in use accumulates over time. Since the number of authors whose works are out of copyright (and can therefore be provided by Project Gutenberg) drops sharply at the beginning of the 20th century, the starting periods in this analysis were limited to authors born from 1800 to 1825 and from 1825 to 1850. The means and standard deviations of the correlations between context vectors across time can be found in Fig. 2.
Variability within a time period was measured by averaging the correlation of the contexts of each term to the centroid representing it for that time period. That is, the average vector of all of the contexts was first calculated for a particular word (e.g., man), and then the correlation of each context vector to this centroid was computed. The resulting measure indicates the uniformity of contexts: if all contexts were identical, the average of these correlations would be 1, and the more variability there was in the contexts, the lower the average correlation of the individual vectors to the centroid would be. This measure of uniformity was then averaged across all the 25-year time periods to calculate the overall uniformity of context for each word. The overall uniformity of use of nouns and verbs could then be compared using a one-way ANOVA. As predicted, nouns (M = .433, SD = .037) were more uniform than verbs (M = .399, SD = .033), F(1, 258) = 55.32, MSE = 0.0013, p < .001, ηp2 = .18.
A two-way ANOVA was used to analyze change over time. In this analysis, the basic dependent measure was the correlation between a word’s centroids in different periods. That is, the centroid of each word from one time period (e.g., 1800–1825) was correlated with its centroid in a second time period (e.g., 1825–1850 for a 25-year span, or 1850–1875 for a 50-year span). Grammatical category (noun vs. verb) and length of time elapsed (25 vs. 50 years) were the independent variables, and the similarity of meaning was the dependent variable (measured as the correlation between the centroids of a word for two time periods that began either 25 or 50 years apart). Whereas grammatical category was a between-subjects variable (since individual words served as subjects in this study), the elapsed length of time was a within-subjects variable.
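The between-period comparison can be sketched as follows, using the dot product between unit-length centroids as a stand-in for the correlation measure described above (the period keys and function name are illustrative):

```python
import numpy as np

def change_over_span(centroids_by_period, span):
    """Mean similarity of a word's unit-length centroids for periods
    `span` steps apart (with 25-year bins, span=1 compares periods
    25 years apart, and span=2 periods 50 years apart).  Lower
    similarity indicates more change in the word's contexts of use."""
    periods = sorted(centroids_by_period)
    sims = [float(np.dot(centroids_by_period[p], centroids_by_period[q]))
            for p, q in zip(periods, periods[span:])]
    return float(np.mean(sims))
```

If change in use accumulates over time, the value for span=2 should be lower than for span=1, which corresponds to the main effect of time tested here.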
As predicted, a significant main effect of grammatical category emerged, in which the meaning of nouns was more similar over time than was the meaning of verbs, F(1, 258) = 50.59, MSE = 0.00032, p < .001, ηp2 = .16. Unsurprisingly, there was also a main effect of time, in which the correlation between the centroids was lower for the 50-year than for the 25-year periods, F(1, 258) = 2,523.75, MSE = 0.000037, p < .001, ηp2 = .91. More importantly, the predicted interaction was also observed—verbs showed more change in their centroids over time than did nouns, F(1, 258) = 37.28, MSE = 0.000037, p < .001, ηp2 = .13.
It is important to consider that grammatical classes differ not only in relationality, but also in qualities such as concreteness and familiarity. In particular, nouns tend to denote more concrete entities than verbs do. To test whether concreteness and familiarity accounted for the differences observed, all of the words in the high-frequency study that had MRC2 concreteness and familiarity ratings were collected, and a median split was used to identify low- and high-rating words. Concreteness significantly correlated with context similarity for both the 25-year span (r = .165, p < .05) and the 50-year span (r = .266, p < .01). Similarly, familiarity correlated with context similarity for the 25-year span (r = .195, p < .01) and the 50-year span (r = .211, p < .01). The analysis above was repeated on this reduced set of words (135 nouns and 60 verbs), with concreteness and familiarity as covariates, and replicated the effects above. Importantly, the interaction observed earlier was still significant, even after controlling for the effects of concreteness and familiarity, F(1, 191) = 5.29, MSE = 0.000033, p < .05, ηp2 = .03. Nevertheless, the effect size was reduced, suggesting that the effects of grammatical class might be partially, but not completely, explained as differences in concreteness and familiarity between the two classes.
Entity nouns and relational nouns
Next, we turn to a comparison of entity nouns and relational nouns. As was mentioned earlier, if the likelihood of semantic change is higher for relational words, we should expect the higher rate of change to be evident not only for verbs, but for other relational words, such as relational nouns. The means and standard deviations of the correlations between context vectors across time for the entity nouns, relational nouns, and frequency-matched verbs used in the analysis can be found in Fig. 3.
As before, the average uniformity of use was first computed for each word. A one-way ANOVA was used to test whether relational nouns and verbs showed more variability in use than entity nouns. As predicted, entity nouns (M = .36, SD = .056) exhibited more contextual uniformity than either relational nouns (M = .33, SD = .043) or verbs (M = .29, SD = .048), F(2, 224) = 45.10, MSE = .002, p < .001, ηp2 = .29. Tukey’s HSD test showed that all three classes of words differed from each other in their uniformity: relational nouns were more variable than entity nouns, and verbs exhibited less uniformity than either class of nouns.
For analyzing change in context over time, the same overall procedure was followed that had been used previously. As before, a small but significant main effect of grammatical category emerged, F(2, 224) = 6.33, MSE = .001, p < .01, ηp2 = .053. The difference between the centroids also increased over time, F(2, 224) = 1,004.21, MSE = .0001, p < .001, ηp2 = .818. More importantly, the expected interaction was observed, in which this increase over time was greater for relational nouns and for verbs than for entity nouns, F(2, 224) = 11.08, MSE = .0001, p < .01, ηp2 = .09.
Because this effect might have been driven primarily by the change in verbs, a planned analysis was also conducted that did not include the verbs. This analysis resulted in a similar pattern, with entity nouns showing less overall evidence of change in their centroid than relational nouns, F(2, 149) = 4.80, MSE = .0008, p < .05, ηp2 = .031. The rate of change also increased over time, F(2, 149) = 760.66, MSE = .00001, p < .001, ηp2 = .836. Most importantly, the observed interaction, in which relational nouns showed an increased rate of change over time as compared to entity nouns, was also preserved, F(2, 149) = 6.98, MSE = .00001, p < .01, ηp2 = .045.
In this study, patterns of language change were compared for English nouns and verbs. The analysis revealed that nouns showed less contextual variability within each time period than verbs. Likewise, the centroids representing nouns changed more slowly over time than those representing verbs, and entity nouns changed more slowly than relational nouns. These results are in line with theories arguing that verbs and relational nouns are represented using relations, whereas entity nouns are represented as direct denotations.
These results also demonstrate the utility and efficacy of corpus statistics as a tool for observing large-scale trends in language use. Whereas in the lab we observe and record the behavior of an individual or a small number of individuals at a time, focusing on the details of their behavior, corpora provide us with an overview of the behavior of large groups of people. Converging evidence from both methodologies is likely to provide researchers with more confidence in the validity and reliability of their results.