We started the collocational analysis of the potential apology IFIDs in question by establishing a collocational profile for each of them individually. That is to say that we used WebCorpLSE to extract the top 100 collocates for each IFID at span 4 (i.e. a window of four words to the left of the IFID and four words to the right) to gain further insight into their general collocational behaviour. Table 2 lists the top 25 collocates of the IFID apologise as an example.
Table 2 Collocates of the IFID apologise (span 4)
Table 2 is sorted by z-score, a measure of statistical significance which takes into account the frequency of the node (the IFID) and of each collocate in relation to corpus size. For instance, profusely is a relatively rare word (frequency 348) which, nevertheless, co-occurs with apologise 105 times. In other words, over 30% of instances of profusely in our corpus appear within four words to the left or right of apologise. This collocational pair is given a high z-score as a result. In fact, all but three of the 105 co-occurrences of apologise and profusely are at span 1 (i.e. ‘profusely apologise’ or ‘apologise profusely’), and the only other words that collocate significantly with profusely are inflections of thank, bleed and sweat. We also see evidence in Table 2 of other semi-fixed phrases (‘apologise in advance’, ‘apologise for any inconvenience/confusion’, ‘apologise publicly’/‘publicly apologise’, etc.), which add further detail to the collocational tendencies of the verb apologise. Building collocational profiles such as the one illustrated in Table 2 for each of the eight potential apology IFIDs separately allowed us to engage in the next step of our collocational analysis, the study of shared collocates.
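To make the logic of the z-score more concrete, the following is a minimal sketch in Python of one widely used formulation of the collocation z-score, in which the observed number of co-occurrences is compared with the number expected if the collocate were spread evenly across the corpus. This is an assumption for illustration: the exact formula implemented in WebCorpLSE is not given here and may differ. The co-occurrence count (105) and collocate frequency (348) are taken from the text, whereas the node frequency and corpus size are hypothetical placeholders, as is the function name collocation_z_score.

```python
import math


def collocation_z_score(cooccurrences, node_freq, collocate_freq, corpus_size, span=4):
    """z-score for a node/collocate pair under one common formulation.

    The expected count assumes the collocate is distributed evenly over the
    corpus; the window covers `span` words on each side of the node.
    """
    window_size = 2 * span
    # Probability of meeting the collocate at any position other than the node
    p = collocate_freq / (corpus_size - node_freq)
    expected = p * node_freq * window_size
    return (cooccurrences - expected) / math.sqrt(expected * (1 - p))


# Figures reported in the text for 'profusely' as a collocate of 'apologise'
cooccurrences = 105
collocate_freq = 348
share_near_node = cooccurrences / collocate_freq  # proportion of 'profusely' near the node

# HYPOTHETICAL values: neither figure is reported in this section
node_freq = 20_000
corpus_size = 600_000_000

z = collocation_z_score(cooccurrences, node_freq, collocate_freq, corpus_size)
print(f"share of 'profusely' within span 4 of 'apologise': {share_near_node:.1%}")
print(f"z-score under the assumed figures: {z:.1f}")
```

Under this formulation, a rare collocate with many co-occurrences receives a high score because its expected count is very small, whereas a highly frequent word with the same number of co-occurrences would receive a much lower one.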
Shared Collocates
Shared collocates are, as the name implies, collocates which are shared by several lexemes. To arrive at a list of shared collocates, we used the collocational profiles produced for each of the eight potential IFIDs: taking each form in turn, we compared its top 100 collocates with the top 100 collocates of all the other IFIDs combined. As a result, we were able to uncover those collocates which are shared among the eight forms and gain further insight into their overlapping functions and meanings.
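The comparison described above can be sketched in a few lines of Python. The sketch assumes that each collocational profile is available as a list of collocates; the function name shared_collocates and the toy profiles are hypothetical and far smaller than the top 100 lists used in the analysis. Rather than comparing each form with the combined list of the others in turn, the sketch records which IFIDs contain each collocate and keeps those found in at least two profiles, which should yield the same set.

```python
from collections import defaultdict


def shared_collocates(profiles):
    """Map each collocate to the sorted list of IFIDs whose profile contains it,
    keeping only collocates that occur in at least two profiles."""
    owners = defaultdict(set)
    for ifid, collocates in profiles.items():
        for collocate in collocates:
            owners[collocate].add(ifid)
    return {c: sorted(ifids) for c, ifids in owners.items() if len(ifids) >= 2}


# HYPOTHETICAL toy profiles; real profiles would hold the top 100 collocates each
profiles = {
    "apologise": ["profusely", "delay", "lack", "i"],
    "sorry": ["i", "delay", "hugs"],
    "excuse": ["spelling", "lame", "i"],
    "afraid": ["heights", "i"],
}

shared = shared_collocates(profiles)

# List the most widely shared collocates first, one row per collocate
for collocate, ifids in sorted(shared.items(), key=lambda kv: (-len(kv[1]), kv[0])):
    print(f"{collocate:10s} {len(ifids)}  {', '.join(ifids)}")
```

Each printed row corresponds to one collocate together with the forms that share it, so the output can be read as a small collocate-by-IFID matrix.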
Figure 1 presents the results of our shared collocate analysis. Each column in Fig. 1 represents one of the eight (potential) apology IFIDs, each row represents a collocate, and shaded boxes indicate where IFIDs share collocates. For example, the first collocate, the first-person pronoun I, is shared by seven of the eight IFIDs (all except apology).
It is worth noting here that the fact that we use a span of four words in our collocational analysis lessens the impact of grammatical restrictions on the results. Thus, pardon is allowed to collocate with I, largely in the phrase ‘I beg your pardon’. Given that the IFID apology also includes the plural, there is no grammatical reason why I should not collocate significantly with this IFID at span 4 (‘I offer my (sincere) apologies’ would be counted at span 4, for example).
This second step in our collocational analysis, the study of shared collocates as represented in Fig. 1, allows us to make several preliminary observations. Fig. 1 shows that some of the eight potential apology IFIDs exhibit greater collocational similarity than others, as can be seen by looking down the columns to determine how many of the collocates each IFID shares (see for instance the gaps for afraid and regret towards the top). This provides a good first indication of their function: when a form appears with one of the shared collocates listed, it is likely to serve an apology function and to act as an apology IFID. This is particularly the case for those collocates offering a clear indication of the reasons people apologise in the BBC sub-corpus, with a lack, delay or absence of something being particularly prominent (each a shared collocate of five or six of the apology IFIDs). Taking the IFID apologise as an example, there are 196 instances in our corpus of the words lack, delay or absence appearing within four words of this IFID. Table 3 gives ten examples extracted randomly using the ‘filter’ option in WebCorpLSE.
Table 3 Examples of lack, delay and absence as span 4 collocates of the IFID apologise
In the vast majority of examples in Table 3, the writer is apologising for his or her poor blogging etiquette, be it a lack of updates, delay in posting or general absence. Most of these examples are performative speech acts, as indicated by the use of first-person pronouns (cf. Taavitsainen and Jucker 2008: 22). The only exceptions can be found in concordance lines 1 and 8, which refer to or report on other people apologising.
Moreover, we see a set of shared collocates in Fig. 1 where writers in the blog corpus appear to be apologising for the poor quality of their written expression: spelling, english, typos (and both poor and quality are themselves also shared collocates of three or four IFIDs each). This partly reflects the mixed linguistic backgrounds of blog writers and commenters, who, given the diverse demographics of online users, are not necessarily native speakers of English, as example 7 in Table 4 illustrates explicitly. Table 4 contains a random selection of the 70 occasions in the BBC sub-corpus where spelling, English or typos appear as a span 4 collocate of the IFID excuse. Note that only one of the examples is a nominal use of excuse (line 9).
Table 4 Examples of spelling, english and typos as span 4 collocates of the IFID excuse
We have indicated at the end of each concordance line in Table 4 whether the example appears in a blog post or in a reader comment on a post (information which appears in the WebCorpLSE interface when the user clicks on a particular concordance line). It is clear that the majority of examples here—all but three—appear in comments. In fact, 77% of the total matches for this query appear in comments, in contrast to the apologise + (lack, delay, absence) query in Table 3 where only 17% of matches appeared in comments. Thus, the analysis of shared collocates has also revealed a medium-specific distribution of different types of apologies in blog posts and comments. While bloggers tend to apologise for infrequent updates or delays in posts, in comments their readers refer to the (often poor) standard of their language skills when apologising. This was also reflected in our previous analysis of the form oops (Lutzky and Kehoe 2017), where we found that commenters often apologise in a second comment and explicitly refer back to some infelicity on the level of content or language use that occurred in their first comment.
The examples given in this section have thus demonstrated that the analysis of shared collocates provides us with an overview of apologies in blogs, which would not have been possible if we had focussed on a single form only, and uncovers certain medium-specific types of apologies. Furthermore, Fig. 1 shows that there are gaps in the columns for some of the potential IFIDs, such as afraid and regret. While apologise, apology, excuse and forgive share many collocates with each other and with the other apology IFIDs, afraid and regret demonstrate less overlap. We therefore turn to the study of their unique collocates in the next section in order to examine the specific meanings of potential apology IFIDs in more depth and, after discussing some of the collocates they share, to find out what distinguishes them from each other.
Unique Collocates
While the study of shared collocates has pointed us in the direction of attestations showing the illocutionary force of an apology (such as the collocates indicating the reasons for an apology), it is the study of unique collocates that either suggests where a potential apology IFID may have non-apology uses or provides additional details as to the apologetic uses of a form, thereby allowing us to see how potential apology IFIDs differ from each other. Figure 2 shows the top 20 unique collocates of each potential apology IFID at span 4, sorted by strength of collocation with that form (z-score). These are words which collocate strongly with a particular IFID but do not appear amongst the top 100 collocates of any of the other seven IFIDs studied.
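The extraction of unique collocates can be sketched in the same way. The following Python example assumes that each profile maps collocates to their z-scores; the function name unique_collocates and all scores shown are hypothetical and serve only to illustrate the set difference and the ranking by collocational strength.

```python
def unique_collocates(profiles, target, top_n=20):
    """Collocates of `target` that are absent from every other profile,
    strongest first.

    `profiles` maps each IFID to a {collocate: z_score} dict holding its
    top collocates.
    """
    others = set()
    for ifid, collocates in profiles.items():
        if ifid != target:
            others.update(collocates)
    unique = {c: z for c, z in profiles[target].items() if c not in others}
    ranked = sorted(unique.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# HYPOTHETICAL z-scores, for illustration only
profiles = {
    "afraid": {"heights": 95.0, "spiders": 80.0, "i": 40.0, "deathly": 120.0},
    "sorry": {"i": 60.0, "hugs": 30.0},
    "regret": {"sadness": 25.0, "i": 15.0},
}

print(unique_collocates(profiles, "afraid"))
```

In this toy case, i is dropped from the list for afraid because it also appears in the other profiles, while the remaining collocates are ranked by their (invented) z-scores.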
Of the eight forms given above, four have a strong link with the speech act of apologising and are prototypically associated with it: apology, apologise, excuse and sorry. As Fig. 2 shows, the noun apology and the verb apologise behave very similarly in our data. The unique collocates of these IFIDs include several adjectives and adverbs, many of which express that the illocutionary force is genuine: heartfelt, deepest, grovelling, profuse(ly), sincere(-ly, -st), humbl(e/y), unreservedly and publicly. The unique collocates of apologise also offer a further indication of some of the common reasons for apologising in our corpus: (asking) questions, mistake, fault, tone. The following is an example of the last of these, taken from a comment in which the author is apologising for a previous comment he made two hours earlier (before the author of the original post has had the chance to respond):
I just reread my own comment. That was some rant, wasn’t it! I do apologize for the tone, if not the content.
There is also a set of unique collocates relating to a specific kind of apology (issued, public, formal), as the examples below illustrate:
What is really interesting about Lazare’s book is how he examines successful and unsuccessful public apologies (often those of politicians or celebrities).
I know it is very difficult for my son to do this, and his dad doesn’t expect a formal apology, it is enough that we can forget about it and get on with life.
An apology was issued about the quality of the beverages, but not about the effects of too much caffeine and being stuck in a TV studio.
In contrast with apology/apologise and their strong collocates sincere(ly) and heartfelt, several of the unique collocates of excuse are words indicating the opposite: lame, pathetic, flimsy, weak (though excuse does also collocate with good, valid and legitimate). This IFID functions as both a noun and a verb, although nominal uses dominate among the unique collocates.
The unique collocates of sorry are perhaps the most striking of all in Fig. 2 as they clearly point to the more colloquial uses of this form compared to the ones discussed above. There are several terms of endearment, contractions and other informal, speech-like features which seem to be expressing sympathy, often used by a reader leaving a comment and referring back to something mentioned in the post. These include two spellings each of oops and hon (short for ‘honey’), as well as aww, hugs, (a)bout and sucks. In fact, when studying oops and its spelling variants in more detail, we found that it is attested with apologetic uses too and can be regarded as an apology IFID in blogs, both when co-occurring with other IFIDs and when attested on its own (Lutzky and Kehoe 2017).
The remaining four forms—regret, pardon, forgive and afraid—take up the peripheral positions in Fig. 2, as their use as apology IFIDs is also more peripheral compared to the four prototypical forms discussed above. While we cannot discuss each of them in detail, we will briefly mention forgive, pardon and regret below, before focussing on afraid in our demonstration of how unique collocates allow us to separate pragmatic from non-pragmatic uses of the form.
We noted in our initial analysis (Table 1) that the relative frequency of forgive is higher in the BBC sub-corpus than in Deutschmann’s BNC data, accounting for 6.1% of all eight forms taken together as opposed to 0.5% in the BNC sub-corpus. Figure 2 gives an indication of the reasons for this discrepancy, with the vast majority of the top 20 unique collocates of forgive in our corpus arising from its use in religious contexts, more specifically in the case of trespasses in the Lord’s Prayer. These are contexts far less likely to occur in Deutschmann’s spoken corpus. As we explained in the “IFID Selection” section, the relative frequency of pardon is much lower in our written corpus than in Deutschmann’s spoken one, likely as a result of the absence of ‘hearing offences’. Nevertheless, pardon does have specific uses in our corpus—both nominal and verbal—which are reflected in its unique collocates: beg (your) pardon, pardon the expression, pardon my French, receive/given a Presidential pardon.
Regret is an example where the unique collocates help to draw out multiple, often subtly different, uses of the same word. Although regret can be used in direct apologies (‘I regret to inform you’, ‘it is with regret that…’, to regret one’s actions towards another person), it is also possible to regret actions or decisions which affect only oneself. In addition, there is the wider use of regret to refer to sadness or sorrow (both unique collocates of this form).
Turning to afraid, we see evidence of non-apologetic uses in the form of things that people are commonly afraid—or frightened—of: heights, spiders, losing, (the) dark, bugs, failure and (perhaps less commonly) clowns. We also see the modifier deathly, which collocates significantly with this meaning of afraid but is unlikely to appear in an apology context. Our suggestion here is that the extraction of unique collocates such as these could be a useful means of automating the pragmatic analysis of large corpora. Taking our example of the potential apology IFID afraid, if we find one or more of the words heights, spiders, losing, dark, bugs, failure, clowns or deathly within four words to its left or right, this gives a strong indication that the particular instance we are considering is not actually part of an apology. Table 5 gives a random selection of the 439 examples of the form afraid that would be excluded were this filter to be applied.
Table 5 Examples of the form afraid excluded by collocate filtering
These examples demonstrate that the collocates selected are good indicators of non-apology uses of afraid, with deathly as perhaps the best indicator. Line 10 actually contains two of the collocates (with heights included in the span 4 window as a result of the missing space between everything and the). The only borderline case in Table 5 is line 8, where afraid collocates with spiders but it is actually the word terrified that is used to convey fear. The writer is, in effect, using the word afraid with an apologetic function to introduce the confession that they are scared of spiders.
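The collocate filter proposed above could be automated along the following lines. This is a minimal Python sketch under stated assumptions: the tokenisation is deliberately simple, the function names tokenize, is_fear_use and fear_flags are hypothetical, and the example sentences are invented rather than taken from the corpus. Only the list of collocates and the span of four words come from the discussion above.

```python
import re

# Unique collocates of 'afraid' that signal the fear meaning (from the text)
FEAR_COLLOCATES = {
    "heights", "spiders", "losing", "dark", "bugs", "failure", "clowns", "deathly",
}


def tokenize(text):
    """Lower-case word tokens; an assumed, deliberately simple tokenisation."""
    return re.findall(r"[a-z']+", text.lower())


def is_fear_use(tokens, node_index, span=4):
    """True if a fear collocate occurs within `span` tokens of the node."""
    left = tokens[max(0, node_index - span):node_index]
    right = tokens[node_index + 1:node_index + 1 + span]
    return any(tok in FEAR_COLLOCATES for tok in left + right)


def fear_flags(text, node="afraid", span=4):
    """For each occurrence of `node`, report whether the filter would exclude it."""
    tokens = tokenize(text)
    return [is_fear_use(tokens, i, span) for i, tok in enumerate(tokens) if tok == node]


# Invented example sentences, not corpus data
print(fear_flags("My brother has always been afraid of spiders and heights."))
print(fear_flags("I'm afraid I cannot make it to the meeting tomorrow."))
```

A filter of this kind works on surface co-occurrence only. It would therefore also exclude a borderline instance such as line 8 in Table 5, and its results depend on how the text is tokenised, as the missing space in line 10 shows.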
In addition to these concepts that people are afraid of, the preposition of is itself a strong unique collocate of afraid, as illustrated by several of the examples in Table 5. In fact, the total number of occurrences of afraid of is 3587. Together with the constructions afraid to (3189) and afraid for (122), these amount to almost half of all attestations of afraid and can be separated as non-apologetic uses of afraid (see also Owen 1983: 88–89). Furthermore, according to the OED (s.v. afraid, adj. and n.), one can narrow down the (performative) apologetic uses of afraid to the constructions I am/I’m afraid + dependent clause or parenthetical attestations. The total number of I am/I’m afraid occurrences in our corpus is 4873. This count is case-insensitive and includes the following spelling variations: I’m afraid (3670), I am afraid (969), i’m afraid (125), i am afraid (46), Im afraid (32), im afraid (17), I’M AFRAID (3), I’m Afraid (3), I’M afraid (2), I AM AFRAID (2), I AM afraid (2), I am AFRAID (1), and I’m AFRAID (1).
Combining the two filters, i.e. narrowing down the search to variations of I am afraid (4873) and excluding attestations collocating with one of the prepositions of, to or for in the first position to the right of afraid (870), leaves a total of 4003 attestations. Compared to the initial output of 14,270 attestations, the salience of afraid as an apology IFID is much higher in this set, which comprises examples like those randomly extracted in Table 6.
Table 6 Examples of afraid remaining after collocate filtering
This random sample reflects the fact that afraid, once the explicitly fear-based examples are excluded, tends to occur in comments on blog posts. Of the 4003 examples remaining after filtering, 2685 (67%) are in comments and 1318 (33%) are in posts. Thus, when appearing with an apologetic function, afraid shows a distribution similar to excuse discussed above (see “Shared Collocates” section), being used primarily in the comment section of blogs, where it forms part of the interaction between different commenters and the author of the blog post and is often accompanied by a specific form of address (e.g. Sarah, Lizzy, babes).
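The two-step filter applied to afraid (restricting the search to variations of I am/I’m afraid and dropping instances directly followed by of, to or for) can be approximated with a regular expression. The Python sketch below is an illustration under stated assumptions rather than the query actually run in WebCorpLSE: the pattern, the function name apology_candidates and the example sentences are hypothetical, and an intervening punctuation mark is assumed to block the preposition check. The counts at the end are those reported in the text.

```python
import re

# "I am afraid", "I'm afraid" or "Im afraid", in any capitalisation and with
# either a straight or a curly apostrophe
I_AM_AFRAID = re.compile(r"\b(I(?:\s+am|['’]?m)\s+afraid)\b", re.IGNORECASE)
NEXT_WORD = re.compile(r"\s+(\w+)")
EXCLUDED_NEXT = {"of", "to", "for"}


def apology_candidates(text):
    """Return 'I am afraid' matches that are not directly followed by of/to/for."""
    results = []
    for match in I_AM_AFRAID.finditer(text):
        following = NEXT_WORD.match(text, match.end())
        next_word = following.group(1).lower() if following else ""
        if next_word not in EXCLUDED_NEXT:
            results.append(match.group(1))
    return results


# Invented example sentences, not corpus data
print(apology_candidates("I'm afraid I can't help with that."))
print(apology_candidates("I am afraid of the dark."))

# Counts reported in the text
remaining = 4873 - 870            # attestations left after both filters
comment_share = 2685 / remaining  # proportion of remaining examples in comments
print(remaining, f"{comment_share:.0%}")
```

Matching case-insensitively and allowing for a missing apostrophe reflects the spelling variations listed above, which would otherwise have to be queried one by one.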
Overall, our discussion of unique collocates has shown that some of them provide us with further information about the specific apologetic uses of a form, such as the differences in the adjectives and adverbs collocating with the forms apologise, apology and excuse. Additionally, unique collocates allow insights into medium-specific uses of IFIDs, as in the case of pardon used in blogs. On the other hand, we have seen how some unique collocates provide a clear indication as to the non-apologetic uses of a form (see e.g. forgive or afraid), allowing us to distinguish apology from non-apology attestations. In combination with other pieces of information (e.g. I am as a signal of performative apologetic uses of afraid), they therefore facilitate the exclusion of unwanted hits and contribute to an improvement in the precision of search output.