1 Introduction

Authorship attribution is the task of determining the identity, or demographic characteristics, of an author from the material they wrote. The problem of attributing a specific text to a specific author, or distinguishing between several texts and their authors, is pertinent to historical documentation (Holmes, 1998; Binongo, 2003; Zhao & Zobel, 2007; Tyrkkö, 2013), criminal justice and forensics, plagiarism detection, and more (van Halteren, 2004).

The earliest approaches to authorship attribution relied mainly on stylistic features such as humor, sentence complexity, word choice and descriptiveness (Zhao & Zobel, 2007). Scholars of literature spent a great deal of time on the “style” of a particular author, and were able to detect, through the tools of literary criticism or simple instinct, the signature of an author’s work.

However, in order to attain truly reliable results, authorship attribution has to utilize statistical and computational approaches. Computers can count and compare features across far more text, and far more consistently, than a human reader can. In such a circumstance, the problem of quantifying features for authorship identification, and of deciding which features to use, becomes paramount. What does it mean to operationalize style? How do we transfer the human ability to detect differences into a reliable, statistically robust computer program?

This problem of which features to choose, the problem of stylometry, is a long-standing one. From the outset, the features selected for stylometric analysis had to be ‘salient, structural, frequent, easily quantifiable, and relatively immune from conscious control’ (Bailey in Holmes, 1998). Robust stylometric approaches rely on the fact that a work, whether it be an email or a classical novel, is a series of countable words and morphemes, some of which are extremely prevalent, such as the n-grams /the/ and /ing/.

Modern approaches to authorship attribution additionally rely on a large variety of stylometric features, grouped into stylistic features (word length, phrase length, distribution of function words, digit frequency, number of short words, etc.) and content features (POS tagging, vocabulary choice, lexical features, character n-grams, word n-grams, etc.) (see, e.g., Tanguy et al., 2011; Savoy, 2012; Wu et al., 2021; Zhao & Zobel, 2007; Sundararajan & Woodard, 2018). A few approaches venture beyond the lexical and look at syntactic features (Wu et al., 2021; Varela et al., 2010) by building complex Multi-Channel Self-Attention networks, and some approaches dedicate special attention to specific parts of speech and lexical entries such as verbs (Varela et al., 2010), function words (Segarra et al., 2015), thematic vocabulary (Savoy, 2012) and other linguistic features (Tanguy et al., 2011). Most work in authorship attribution and stylometry, however, still focuses on the lexical features mentioned above, in large part due to their robustness and reliability.

This paper looks at two stylometric parameters that have, so far, received relatively little attention: adverbs and adjectives. While counts of adjectives and adverbs have been utilized as part of general n-gram approaches, little has been done to examine their properties and to utilize these specific properties in authorship attribution. To that end, we have looked at two corpora: the Schler Blog Corpus, a collection of English-language blogs, and the Project Gutenberg Corpus, with a specific focus on the novels of Herman Melville, Jane Austen, and G. K. Chesterton.

2 Adverbs

Adverbs in English can appear in a variety of positions in a clause. In the example case below, each sentence is at least borderline acceptable and has equivalent semantic content.

  1. Passionately, she kissed her husband.

  2. She passionately kissed her husband.

  3. ? She kissed passionately her husband.

  4. She kissed her husband passionately.

Given that there is no discernibly significant difference between the meanings of (1)-(4), a speaker or writer has the liberty to choose where they will place an adverb. From this liberty, one could hypothetically develop personal preferences for one position over another. These preferences may subsequently carry authorial information and have the potential for stylometric applications.

We hypothesize that adverb placement analysis could be used for the purposes of authorship attribution. By comparing average percentages of adverb positions of a known document to an anonymous document, we should be able to accurately posit the author of the anonymous document.

2.1 Constraints on placement

Of course, we are not suggesting that a given author will use adverbs in a single position in a sentence and nowhere else. For one, it is not the case that every adverb is grammatically pronounceable in any position in a sentence. As illustrated in Table 1 below, some adverbs (particularly adverbs of manner, such as beautifully, quietly, aggressively, carefully, and so on) can occupy a wide variety of positions in a sentence. Other adverbs, like nearly, are only pronounceable immediately before or after the verb phrase, while adverbs like daily are pronounceable only at the end of the sentence.

Table 1 Legal and illegal adverb placements in English

This variation in placeability may stem from the adverb’s argued status as a wastebin taxon (Carter and McCarthy, 2017). Scholars from many different schools of thought have investigated the scope, semantics, and syntactic constraints of adverbs (Ernst, 2002; Cinque, 1999). To the best of our knowledge, however, there is not yet a comprehensive taxonomy of adverbs based on their placeability in the surface representation of a sentence. For the purposes of this article, we will bear in mind that some adverbs have a wider range of possible placements than others (Table 2).

Table 2 Examples of each category described in the list above

2.2 Methodology

We began with a corpus of 25 English-language blogs taken from the Schler Blog Corpus (Schler et al., 2006), which were then processed to extract the first and last 100 sentences of each. Each sentence was part-of-speech tagged using the Python NLTK package (Steven et al., 2009); tokens identified as adverbs were sorted into one of six categories based on their position relative to the verbs and other non-adverbs in the sentence. Where the first verb in a sentence is α and the last verb in the sentence is β:

  1. an adverb that is before the first non-adverb is sentence-initial

  2. an adverb that is after the first non-adverb but before α is preverbal

  3. an adverb that is between α and β is interverbal

  4. an adverb that is between β and the last non-adverb in the sentence is postverbal

  5. an adverb that is after the last non-adverb is sentence-final

  6. an adverb that is not truthfully described by any of the above is other
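The six categories above can be sketched in Python as a function over one already-tagged sentence. This is an illustrative reading of the rules, not the code used in the study: the function name `classify_adverbs`, the use of Penn Treebank tags (`RB*` for adverbs, `VB*` for verbs), and the decision to ignore punctuation tokens are all assumptions.

```python
ADVERB_TAGS = {"RB", "RBR", "RBS"}  # Penn Treebank adverb tags (assumed tagset)

def classify_adverbs(tagged):
    """Label each adverb in one (word, tag) sentence with a position category."""
    # Assumption: punctuation (tags not starting with a letter) is ignored,
    # so a trailing full stop does not count as the "last non-adverb".
    tokens = [(w, t) for w, t in tagged if t[:1].isalpha()]
    non_adv = [i for i, (_, t) in enumerate(tokens) if t not in ADVERB_TAGS]
    verbs = [i for i, (_, t) in enumerate(tokens) if t.startswith("VB")]
    labels = []
    for i, (word, tag) in enumerate(tokens):
        if tag not in ADVERB_TAGS:
            continue
        if not non_adv:
            cat = "other"             # sentence consists of adverbs only
        elif i < non_adv[0]:
            cat = "sentence-initial"
        elif i > non_adv[-1]:
            cat = "sentence-final"
        elif not verbs:
            cat = "other"             # no verb to anchor alpha and beta
        elif i < verbs[0]:
            cat = "preverbal"         # before alpha, the first verb
        elif i < verbs[-1]:
            cat = "interverbal"       # between alpha and beta
        else:
            cat = "postverbal"        # after beta, before the last non-adverb
        labels.append((word, cat))
    return labels

# Hand-tagged version of "She passionately kissed her husband."
sentence = [("She", "PRP"), ("passionately", "RB"), ("kissed", "VBD"),
            ("her", "PRP$"), ("husband", "NN"), (".", ".")]
print(classify_adverbs(sentence))
```

In practice the input would come from a tagger such as NLTK's; the hand-tagged sentence here only keeps the sketch self-contained.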

The initial samples were paired and used as training data to attribute the final samples. This resulted in a series of 600 triangle tests, with one true author and one distractor author. Five different experimental conditions were used to make the determination: raw adverbial counts, adverbial counts vectorized by type, total data, and the first two metrics again with normalized adverbial counts.
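The test count follows from the pairing: each of the 25 authors is the true author against each of the 24 others as distractor, giving 25 × 24 = 600 tests, of which half would be correct by random guessing. The sketch below illustrates that scheme under stated assumptions: the text does not say how samples were compared, so the L1 (city-block) distance between six-element count vectors, and the names `attribute` and `run_tests`, are hypothetical.

```python
from itertools import permutations

def attribute(unknown, candidates):
    """Return the candidate whose count vector is closest to `unknown`.

    Assumption: closeness is L1 distance between adverb-position counts.
    """
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))
    return min(candidates, key=lambda name: l1(unknown, candidates[name]))

def run_tests(initial, final):
    """Run every (true author, distractor) test; return (correct, total)."""
    correct = total = 0
    for true, distractor in permutations(initial, 2):
        guess = attribute(
            final[true],
            {true: initial[true], distractor: initial[distractor]},
        )
        correct += guess == true
        total += 1
    return correct, total

# Toy data: 25 hypothetical authors with made-up six-category count vectors.
initial = {a: [a, 2 * a, 1, 0, 3, 0] for a in range(25)}
final = {a: [a + 1, 2 * a, 1, 0, 3, 0] for a in range(25)}
correct, total = run_tests(initial, final)
print(total, total / 2)  # number of tests and the chance baseline
```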

2.3 Results

In this experiment, 300 correct attributions were expected by chance. As can be seen in Fig. 1, every experimental condition performed above that value. Specifically, we were able to attribute authorship with an average accuracy of 9.7% greater than chance across all conditions. At maximum, we attributed authorship with an accuracy 12.8% greater than chance.

Fig. 1 Correct attributions by experimental condition

However, this method of analysis seems to be most effective when raw adverb counts are used. Using normalized counts (i.e., the number of adverbs in each category is divided by the total number of adverbs) loses a noticeable amount of information and is less accurate. This suggests that comparing the frequency of adverbs is useful for stylometry as well. Nonetheless, the data suggests that individual variations in adverbial placement are distinctive. By extension, this variable has stylometric potential and is useful for authorship attribution.
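A small illustration of why normalization can discard information: two authors who distribute adverbs across positions in the same proportions, but who differ in how many adverbs they use, become identical once counts are divided by their totals. The count vectors below are invented for illustration and are not data from the study; the category order is assumed to be initial, preverbal, interverbal, postverbal, final, other.

```python
from fractions import Fraction

def normalize(counts):
    """Divide each category count by the total number of adverbs."""
    total = sum(counts)
    if total == 0:
        return [Fraction(0)] * len(counts)
    return [Fraction(c, total) for c in counts]

# Hypothetical authors: same proportions, very different adverb frequency.
sparse_author = [2, 4, 1, 2, 1, 0]
heavy_author = [6, 12, 3, 6, 3, 0]

print(sparse_author == heavy_author)                        # raw counts differ
print(normalize(sparse_author) == normalize(heavy_author))  # proportions match
```

Raw counts keep the overall adverb frequency as a signal; normalized counts keep only the shape of the distribution.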

2.4 A D-structure approach?

We based our analysis on adverbs as they appeared relative to verbs and non-adverbs in a given sentence. In other words, we used surface representations as opposed to deep structures (henceforth D-structure(s)) that represent the underlying syntax and semantics of an expression (Chomsky, 1971). Analyzing the placement of adverbs in this hypothesized D-structure may reveal a similar pattern with regard to authorial information. For example, one author may adjoin adverbs to a verb phrase (as in (1)) more frequently than they do to an inflectional phrase (as in (2)). Alternatively, they may place adverbs to the right of a phrase (as in (3)) more often than they do to the left of a phrase (as in (4)).

  1. Alice tenderly hugs Laina.

  2. Tenderly, Alice hugs Laina.

  3. I have work today actually.

  4. Actually, I have work today.

While not technically impossible, this approach was avoided for several reasons. For one, there are scenarios where an adverb-bearing sentence is ambiguous between two D-structures. In (3) above, it is unclear whether ‘actually’ adjoins to the verb phrase or the inflectional phrase. Furthermore, formal analyses of both individual adverbs and the category as a whole are remarkably complex, if not subjects of debate. As summarized by Ernst (2002), “nobody seems to know exactly what to do with adverbs.” Cinque (1999), for example, suggests that the adjoin function could be done away with entirely under the schema for adverbs they posited. Conflicting theories aside, it seems unrealistic to generate syntax trees for thousands of sentences in a corpus that may contain typographical errors or irregular structures not recognized by an automatic sentence parser. While a D-structure approach to this question is worth revisiting for a future study, we limited our analysis to surface representations.

3 Adjectives

When a noun is preceded by more than one adjective in English, the adjectives have a canonical, internal order. It’s possible to talk about the violent big green monster from outer space, but it would be extremely odd to talk about the green big violent monster from outer space. The Cambridge English Dictionary (Cambridge Dictionary Online, 2021) defines the order of adjectives in the English language as:

Opinion > Size > Physical Quality > Shape > Age > Color > Origin > Material > Type > Purpose

Each adjective is classified into one of the aforementioned categories, as shown in Table 3. Note, however, that polysemic adjectives can be classified into more than one category. For instance, the adjective great can denote size, as in ‘the great white whale’; quality, as in ‘the great person’; or number, as in ‘the great majority’.

Table 3 Examples of adjective categories

Notably, while variations in classification exist (origin, material and type are often lumped together under the broad category of proper adjective), the order of adjectives is not generally disputed. Though syntactic arguments for adjective order have been made (Rosato, 2013), it seems that the actual constraints on the order of adjectives as seen here are semantic and pragmatic: adjectives located closer to the noun are seen as more essentially tied to the noun, or as more necessary (Rosato, 2013; Wulff, 2003).

We examined two potential factors that may differentiate between authors for the purpose of authorship attribution. The first was variability in the canonical order of adjectives – whether it would be possible to find exceptions or deviations correlating with demographic characteristics or period of writing. The second was adjective distribution and proportion of different categories of adjectives in a work.

3.1 Methodology

We began by analyzing the large corpus of classical works available on Project Gutenberg, and tagging all adjectives with the NLTK part-of-speech tagger. The number of adjectives in the works can be seen in Table 4 below:

Table 4 Number of tagged adjectives in Project Gutenberg corpus, by author and work

We then used the NLTK POS-tagger to tag pairs of the form [adjective-adjective] noun. Where a noun was preceded by more than two adjectives, the pairs were tagged separately, so that adjacent pairs overlapped by one adjective. The adjective pairs were then analyzed manually in G. K. Chesterton’s Father Brown, Jane Austen’s Emma and Herman Melville’s Moby Dick. Word frequencies were then extracted in order to allow us to analyze adjectives by category, and several sample categories were chosen for analysis.
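The pair extraction can be sketched as a scan over a tagged sentence for runs of adjectives that end in a noun. This is an illustration rather than the authors' code: the function name `adjective_pairs` and the use of Penn Treebank tags (`JJ*` for adjectives, `NN*` for nouns) are assumptions.

```python
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}  # Penn Treebank adjective tags (assumed)

def adjective_pairs(tagged):
    """Return (adjective, adjective, noun) triples from one tagged sentence.

    A run of three or more adjectives yields overlapping pairs, e.g.
    A1 A2 A3 N gives (A1, A2, N) and (A2, A3, N).
    """
    pairs = []
    run = []  # adjectives seen since the last non-adjective token
    for word, tag in tagged:
        if tag in ADJECTIVE_TAGS:
            run.append(word)
            continue
        if tag.startswith("NN") and len(run) >= 2:
            pairs.extend((a, b, word) for a, b in zip(run, run[1:]))
        run = []  # any non-adjective token ends the current run
    return pairs

# Hand-tagged phrase; in practice the tags would come from a POS tagger.
phrase = [("the", "DT"), ("violent", "JJ"), ("big", "JJ"),
          ("green", "JJ"), ("monster", "NN")]
print(adjective_pairs(phrase))
```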

Additionally, we calculated word frequencies for the corpus, and manually assembled them into the adjective categories listed above for the works of Jane Austen and Herman Melville. Relative distribution of adjectives was calculated for several of the works.
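A relative distribution of this kind can be computed by mapping each tagged adjective to a category and dividing the category counts by the total. The sketch below assumes a hand-built lexicon from adjective to category; the name `category_shares` and the tiny lexicon are hypothetical stand-ins for the manual classification described above.

```python
from collections import Counter

def category_shares(adjectives, lexicon):
    """Return each category's share of all adjective tokens.

    Adjectives missing from the lexicon are counted under "Other".
    """
    counts = Counter(lexicon.get(a.lower(), "Other") for a in adjectives)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}

# Hypothetical mini-lexicon and token list, for illustration only.
lexicon = {"good": "Opinion", "little": "Size", "old": "Age", "white": "Color"}
tokens = ["good", "little", "old", "Good", "strange", "white", "good", "old"]
for category, share in sorted(category_shares(tokens, lexicon).items()):
    print(f"{category}: {share:.1%}")
```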

3.2 Results

The first notable result was that manual analysis of adjective pairs did not show divergence in adjective order. All adjective pairs and triads adhered to the general order of opinion > size > age > color. This held true across author genders (male and female), across genres (Austen’s romance, Chesterton’s mystery and Melville’s nautical work) and across period (Emma, published in 1815; Moby Dick, published in 1851; Father Brown, published in 1910-1936).

Frequency analysis and grouping shows differing distributions of categories of adjectives by author. Whereas the Opinion adjectival category is the most numerous for all works, the percent share it has of the total number of adjectives differs markedly. This is true for all other categories of adjectives, as well (Table 5).

Table 5 Counted number of adjectives in Jane Austen’s Emma

As shown in Fig. 2, opinion adjectives constitute approximately 48% of all adjectives used in Emma, whereas color adjectives constitute less than 1%. The work of Herman Melville, on the other hand, shows a very different adjectival distribution, as shown in Fig. 3 below.

Fig. 2 Distribution of Adjectives by category in Jane Austen’s Emma

Fig. 3 Distribution of Adjectives by category in Herman Melville’s Moby-Dick

In Melville’s work, Opinion adjectives constitute 28% of total adjectives, whereas age (3%), size (8%) and color (3%) constitute a significantly greater share of adjectives compared to Austen. Adjectives marked as Other were either ambiguous, unique, or belonging to the category Number and therefore potentially constituting classifiers (Table 6).

Table 6 Counted number of adjectives in Herman Melville’s Moby Dick

4 Discussion

Our results point to the fact that looking in detail at specific parts of speech, such as adverbs and adjectives, yields fruitful approaches to authorship attribution. Authors differ in their choice of adverb placement, their choice of adjective categories, and their choice of adjective and adverb frequency, just as they differ in their choices of nouns and verbs.

It is important, however, to note that not all features of these parts of speech are created equal; which features we look at will matter. For instance, all authors examined used adjective order in the same way; as a canonical feature of the language, it is not much more variable than the English-language requirement that sentences contain a subject and a verb.

In the case of adjectives, the variability seen in the results when examining adjective categories and word choice stems from two obvious sources. The first is that adjectives are determined by the choice of subject and by descriptive, pragmatic necessity. So, in a book about the sea and a great white whale, we will see a preponderance of adjectives of color, and in a work concerning families and love such as Austen’s Emma, we will see many adjectives pertaining to individual quality and age. The second determiner of the adjectives used in a work is, of course, individual preference. The most frequently used adjective in Jane Austen’s Emma is ‘little’ (347 instances out of 10873 total adjectives, a ratio of roughly 1:31), whereas the most frequent adjective in Herman Melville’s Moby Dick is ‘old’ (429 instances out of a total of 18398 adjectives, roughly 1:43). Melville also used the adjective ‘little’, but only in 239 instances out of 18398 (roughly 1:77).
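The ratios follow directly from the counts quoted in this paragraph. As a quick arithmetic check, the sketch below divides each total by the corresponding adjective count; the helper name `one_in` is hypothetical.

```python
def one_in(total_adjectives, instances):
    """Return N such that the adjective accounts for about 1 in N adjectives."""
    return total_adjectives / instances

# Counts as quoted in the text.
emma_little = one_in(10873, 347)
moby_old = one_in(18398, 429)
moby_little = one_in(18398, 239)

print(round(emma_little, 1))  # 31.3
print(round(moby_old, 1))     # 42.9
print(round(moby_little, 1))  # 77.0
```

On these figures, ‘little’ is roughly two and a half times as frequent, relative to all adjectives, in Emma as in Moby Dick.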

As we originally hypothesized, the data we gathered suggests that analyzing adverb placement is useful for the purpose of authorship attribution. To some extent, authors vary in where they tend to place an adverb in a sentence. However, this particular technique has the potential for refinement. It would be interesting to repeat this study excluding the most frequent placement category across authors. If there is some “marked” position to which writers of the dataset default, would excluding this category in the analysis improve accuracy? Alternatively, as natural language processing technology improves, an approach similar to the one mentioned in 2.4 may become feasible. It is possible that analyzing an adverb’s position in a syntax tree will improve the rate of correct attributions in our model. Furthermore, the study is significantly limited by the size of the examined corpus. Although examination of a larger corpus is beyond the scope of the present study, it may prove a fruitful avenue for further research. Regardless, the results of the current study are encouraging and warrant further investigation.

It is possible that the pattern we observed for adverbs holds for other parts of speech that have flexible placeability. Certain prepositional phrases, for example, can occur before or after a clause and mean essentially the same thing (consider ‘on Tuesday I walked the dog’ and ‘I walked the dog on Tuesday’). Analogous phenomena in other languages, such as floating quantifiers in Japanese (Fukushima, 1991), may also be subject to an author’s preferences. If this is the case, they could also be analyzed for stylometric purposes.

The results of the current study dovetail neatly with the general, accepted approaches to stylometry. Adverbs and adjectives can be seen as special cases of word n-grams and POS-tagging, two approaches that have proven reliable in the past (Houvardas and Stamatatos, 2006; Koppel et al., 2009), as well as of vocabulary selection (Savoy, 2012).

5 Conclusion

Great success has been observed using previously established methods for authorship attribution. However, some counters to these techniques, principally obfuscation (Mahmood et al., 2019), lie on the horizon. Emerging problems such as these present a difficult challenge to the field, as the established techniques are not designed to account for them. To handle this inevitability, we need as many tools as possible at our disposal. While an analysis of adverbs or adjectives is not the figurative silver bullet that will solve all problems, the results of this study are encouraging. As the field grows, linguistic approaches to stylometry are necessary and must continue to be explored.