Corpus Linguistics Methods in the Study of (Meta)Argumentation

As more and more sophisticated software is created to allow the mining of arguments from natural language texts, this paper sets out to examine the suitability of the well-established and readily available methods of corpus linguistics to the study of argumentation. After brief introductions to corpus linguistics and the concept of meta-argument, I describe three pilot-studies into the use of the terms Straw man, Ad hominem, and Slippery slope, made using the open access News on the Web corpus. The presence of each of these phrases on internet news sites was investigated and assessed for correspondence to the norms of use by argumentation theorists. All three pilot-studies revealed interesting facts about the usage of the terms by non-specialists, and led to numerous examples of the types of arguments mentioned. This suggests such corpora may be of use in two different ways: firstly, the wider project of improving public debate and educating the populace in the skills of critical thinking can only be helped by a better understanding of the current state of knowledge of the technical terms and concepts of argumentation. Secondly, theorists could obtain a more accurate picture of how arguments are used, by whom, and to what reception, allowing claims on such matters to be evidence, rather than intuition, based.


Introduction
The aim of this paper is to evaluate the use of the methods of corpus linguistics in certain aspects of the study of argumentation. In order to achieve that, the paper contains descriptions of three mini- or pilot-studies. This adds a layer of complexity to the structure of the paper itself: the methods, results and conclusions from each of those pilot-studies are detailed below; however, the method of the overall paper is to employ those pilots, its results are their conclusions, and the final analysis of the value of the methodology is discussed in the concluding Sects. 4 and 5. It is important to bear in mind that the pilot-studies themselves, while producing some interesting results, are too small a foundation upon which to base any wider conclusions as to their subjects: at best they open avenues for further research. As such, they are not described in the rigorous detail which a full-scale corpus investigation would require in order to be accepted. Taken together, however, the conclusions of those pilots should allow a preliminary answer to be made to the following research questions:
1. Is the methodology of corpus linguistics appropriate to the study of (meta-)argumentation?
2. What possibilities does such a methodology provide?
3. How might the results of such studies be employed?
Before introducing the studies themselves, some elaboration is provided in the subsections below.

Corpus Linguistics
Only the briefest description of the methods of corpus linguistics and the reasons why they have become so important in the study of language can be provided here, and those familiar with the topic may wish to move on at once. Rather than any kind of potted history, I shall concentrate on certain features of the technique which are of relevance to this study and to argumentation research in general. It should also be noted that while corpus linguists are involved in both the construction and the analysis of corpora, only the latter is discussed here for reasons of space, despite the undoubted importance of corpus building.
To begin, a simple definition of what corpus-based techniques are for: 'Corpus linguistics attempts to gain linguistic knowledge through the analysis of collections of samples of naturally-occurring texts and transcribed recordings' (Wallis and Nelson 2001: 305). Simple as it may be, there are two important points of interest to be noted from this description. Firstly, the idea that knowledge is to be gained through analysing samples shows that corpus linguistics aligns itself with the paradigms of natural science. This was difficult to achieve while observational linguistics was so restricted and laborious. Naturally, linguists had always been able to construct corpora and work on them; however, both the construction and the analysis were extremely time-consuming, and access to 'real-life' language, unedited and unpublished, was difficult to obtain. The use of computers from the 1960s onwards solved the first problem, and the phenomenon of mass self-publishing on the internet has gone a long way towards solving the second. By way of comparison, the British National Corpus was compiled between 1991 and 1994, and contains 100 million words; the new iWeb corpus, hosted by Brigham Young University, boasts around 14,000 million words; and the News on the Web (NOW) corpus, employed in the pilot studies described below, currently has 6640 million words, and is constantly updated, increasing in size by around 100 million words a month (Davies 2013). All of these are readily available to search online.
In recent years the use of the techniques of corpus linguistics has spread to a wide range of fields, for instance pedagogy (Flowerdew 2009), literature (Biber 2011), and the anxieties of translators (Vieira 2020). There are also debates within the field over the relationship between corpus linguistics and related language studies, and on methodological principles. Arppe et al., for instance, discuss the best methodologies for cognitive corpus linguistics amid concerns that 'corpus linguistics is yet to be fully accepted as a fundamental method in Cognitive Linguistics' (2010: 21) and cite a number of reasons why cognitive linguists remain unconvinced. Barlow (2011) has discussed the relationship with theoretical linguistics, as has Gries (2010), with the latter noting at the outset that: 'The relation between corpus linguistics and linguistic theory has traditionally been somewhat problematic' (2010: 327). Meanwhile, the use of statistics in corpus-based studies, thoroughly described in Biber and Jones (2009), is still the subject of debate (Koplenig 2019).
Such questions notwithstanding, the impact of corpus linguistics across a range of disciplines in a relatively short space of time has been impressive. Corpora, of course, cannot answer every question in linguistics and they have some fairly basic drawbacks. For example, the absence of an item from the database doesn't mean that the use of a structure is unacceptable. Indeed, the presence of an item doesn't necessarily make it acceptable either. For researchers of English there is the added difficulty of the many varieties of the language which are in common use on the internet, and the large number of foreign learners of the language who may produce calques and other 'inauthentic' constructions. For researchers of other languages, there may be a paucity of materials, particularly in the conveniently searchable form of the major English internet databases (see, for example, Saad and Ashour 2010 on corpora of Arabic). As with any experimental technique, the researcher must carefully assess whether the technique is appropriate to answering the question at hand.
The second point of note from the above definition is, perhaps, implied by the first, but is worth focussing on for a moment: that the samples analysed are 'naturally-occurring'. Argumentation theorists often create their own examples, which is perfectly sufficient to illustrate the possibilities of argument construction, but means that the transfer of theoretical considerations to the world of practical argument may be doubted. Corpus techniques might both provide easier access to actually used examples of argument forms, and, at the same time, encourage research into those forms of which there are real examples. Argument forms which are currently the subject of study are not necessarily those which are in widespread use and causing actual disruption to good communication and co-operation.
There is, then, no reason for linguists to continue to rely on their own intuitions about actual language usage, or for scholars of argument to make guesses as to what kinds of arguments are used in real-world debate. Not all linguistics, however, is concerned with actual usage, and the same may be said of argumentation: how much of what is written in that field actually depends on assumptions or intuitions about real-world argument? Clearly, much of it doesn't; however, I would argue that argumentation theorists do pepper their work with unsubstantiated claims about how often, for what purposes, and by whom certain argument structures are employed.
One obvious example is fallacy theory, where one of the criteria for an argument form to be considered a fallacy is for it to be reasonably common, and another is that it should appear convincing (see Hansen 2002): despite these being widely-accepted parts of the definition, studies showing that certain forms actually are common, or that they have actually convinced anyone, are not in any great abundance in the literature. It might also be argued that, at root, all argumentation study is concerned with the practice of argument, in which case the study of that practice, on a massive scale, ought to be a priority for the field. The question then arises as to how that study can be conducted, and this paper sets out to answer it in part.
Depending on one's approach, argumentation can be seen as a sub-field of philosophy or as a type of discourse analysis, among other things, and is, I suggest, at its best when it combines these elements to advantage. Corpora have been important in the growing movement of experimental philosophy for some time; for example, Reuter (2011) used search engines to study the distinctions made between the appearance and the reality of pain, and corpora are being constructed specifically for philosophical investigations (Betti et al. 2017). In his discussion of the use of corpora in philosophy, Roland Bluhm notes that the two main research strategies in corpus linguistics are 'corpus-driven' and 'corpus-based' research; he goes on: 'Both strategies are mirrored in the different functions that corpus analysis may serve in philosophy. They can be used in an explorative manner to facilitate philosophical research for which linguistic phenomena are somehow pertinent; but they can also be used more strictly to gather evidence to support or undermine philosophical claims' (2016: 106). This is equally true for argumentation.
In a review of 30 major corpus studies in critical discourse analysis, Kamasa (2015) describes the common approach employed thus: 'The basic notion of discourse is most commonly operationalized as a set of patterns emerging from collocations, concordances or keywords [… the method is] to automatically generate a list of lexical or textual items which are then manually analyzed' (2015: 215-217). An example of this type of study is Alan Partington's book The Linguistics of Political Argument (2003). Partington worked with a corpus of White House press briefings running to some 250,000 words. He first read the whole corpus and made notes, then used software to search for patterns, then analysed and discussed the rhetorical features which were unearthed. Partington notes that, at that time at least, corpus linguistics had 'had relatively little to say in describing features of discourse, particularly of interaction, that is, the rhetorical aspects of texts' (2003: 4). His work is an example of how automated corpus techniques and traditional textual analysis can work together to expose and explain rhetorical features, and, thus, leads us towards a methodology which can be of use in argumentation. First, one interesting point which Partington makes on corpora in general: 'Checking intuitions against naturally occurring instances of language also frequently serves as a springboard for new intuitions, and can open up new and previously unexpected avenues of thought' (2003: 6). This is an idea to which we shall return.
One area of rhetorical analysis which has seen extensive use of corpora, and has brought them into the argumentation mainstream, is the study of pre-election debates, notably in the work of William Benoit. Benoit uses a four-step method of coding the assembled debate material: first 'the message must be unitized into themes' (2013: 28), these are then classified by function, by topic, and by sub-form of policy. This type of coding is not automated and requires considerable care in the annotation process. More recently such work has been taken on by members of ARG-Tech (of which more below) in reference to US presidential elections (Visser et al. 2018, 2019a) and on populist rhetoric by a team in Zurich (Blassnig et al. 2019). There remains a question as to whether rhetorical and dialectical features of corpora can be studied in less laborious ways.
Of particular relevance to the pilots described below, Goodwin and Cortes (2010) authored a fascinating paper investigating the use of spatial metaphor in what they call 'metadiscursive commentary' on argumentation. They turned to corpus techniques since 'as theorists, as long as we rely only on our own linguistic intuitions we are unlikely to fully confront the potential otherness of practitioners' proto-theories; we are unlikely to be surprised' (2010: 164). The element of surprise, the discovery of something which one had never suspected was there, is perhaps the main gain for both philosophers and linguists in the use of corpora. Yet, at the same time, this is not an unguided trawl through the depths: Goodwin and Cortes had studied argumentation theory and noted the use of spatial metaphors in five different authors; their aim was to 'take the spatial metaphors used by argumentation theorists as the target expressions, and compare them with their use in practitioners' discourse' (2010: 163), a technique similar to that which will be suggested in this paper. Their paper concludes by noting that: 'Recognizing the gap between practitioners' and theorists' conceptions, however, should allow us (among other things) to better understand some of the challenges of teaching argumentation' (2010: 173). Sadly, Goodwin, an argumentation theorist (among other things), has not published any more studies using corpora, and Cortes, a linguist, has not returned to considerations of argument. Neither, it seems, has anyone else taken up the mantle, as far as straightforward corpus search methods are concerned.
Instead, interest and attention have flowed into argument mining, the goal of which is 'to build a technology that can identify the premises, conclusion, and the argumentation scheme of each argument that is found in a text' (Walton and Gordon 2018: 6). This is an area of argumentation theory which is currently proving very productive, but is reliant on annotation procedures which apply a pre-conceived idea of argument structure. For example, Peldszus and Stede (2013) write at length on the theory of argument structure which they apply in argument mining before describing any work with a corpus. They base their approach on Freeman (2011) and his rejection of certain elements of Toulmin's (1958) theory, since it 'makes an explanatory valuable connection between the focus on arguments as process, such as found in the study of dialectical dialogues in philosophy or in rhetorics, […] and arguments as product, such as found in the study of persuasive text in radio or newspaper commentaries, in scientific writing or even advertising' (2013: 5). Their focus in examining texts is not normative, nor is it educational, as they make clear: 'Our analysis is intended as a general, descriptive basis for corpus-linguistic studies and ultimately for computational applications such as argument mining' (2013: 14). This is worth noting because it illustrates again the two-way flow between the study of a corpus and the background theory: the theory dictates, to a large degree, how the corpus is used, and the results from its study can then be used in various ways, some purely descriptive and others which will go on to inform the theory itself and ideas about what arguments should be like.
A good deal of the best work on argument mining has come out of the ARG-Tech lab at the University of Dundee. As well as the studies mentioned above, others have been based on the Hansard record of UK parliamentary debate (Duthie et al. 2016) and transcripts of dialogue from the BBC's Moral Maze programme (Budzynska et al. 2014). Unlike Peldszus and Stede's, this work has generally been in the theoretical tradition of argumentation schemes; it is also focussed on the production of practical tools for argument annotation and analysis. Both themes are exemplified in Lawrence et al.'s (2019) 'An Online Annotation Assistant for Argument Schemes'.
The different approaches of Peldszus and Stede and the Dundee-based team, as well as others too numerous to mention, also illustrate what Lawrence and Reed consider one of the main problems in the area: 'the lack of a standardized methodology for annotation' (2019: 786). This would require both a standardized conception of argument structure and a standardized technique for carrying out annotation. In their review of the field, Lawrence and Reed list a large number of studies which employ various annotation methods, and a wide range of corpora, which they divide into data sets considered 'as "fully" structured argument data', those which are 'semi-structured', and the large corpora, such as Wikipedia, which are rich in argument, but constitute 'unstructured' data (2019: 784). Although they see a good deal of progress and predict rapid development, they conclude that: 'Argument mining remains profoundly challenging' (2019: 807).
The use of corpora investigated in this article is of a simpler nature. The basic corpus keyword search may be a less sophisticated tool, but it is one that is already fully functional. Obviously, the search itself is not the whole story, and further coding of the results is necessary, but it is certainly worth seeing whether this form of corpus study can make a contribution to argumentation study. As with linguistics, so in argumentation, there are many questions which such techniques cannot be expected to answer: the purpose of this paper is to investigate those which perhaps they can.
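To give a rough idea of what a basic keyword search produces, the following minimal Python sketch builds keyword-in-context (concordance) lines for a search phrase. It is only an illustration of the general technique, not of the interface of any particular online corpus; the function name `kwic`, the `width` parameter and the sample sentences are all invented for the purpose.

```python
import re

def kwic(texts, phrase, width=40):
    """Return keyword-in-context tuples for every hit of `phrase` in `texts`.

    Matching is case-insensitive and accepts spaces or hyphens between the
    words of the phrase, so 'straw man' also finds 'Straw-man'.
    """
    words = [re.escape(w) for w in phrase.split()]
    pattern = re.compile(r"\b" + r"[\s-]+".join(words) + r"\b", re.IGNORECASE)
    lines = []
    for doc_id, text in enumerate(texts):
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            lines.append((doc_id, left, m.group(0), right))
    return lines

# Invented sample sentences, not drawn from any real corpus.
SAMPLE = [
    "The minister's reply was a straw man, since nobody had proposed a ban.",
    "Critics called it a slippery slope argument and moved on.",
    "That Straw-man version of the policy is easy to knock down.",
]

hits = kwic(SAMPLE, "straw man")
for doc_id, left, keyword, right in hits:
    print(f"{doc_id}: {left:>40} [{keyword}] {right}")
```

Each returned tuple still has to be read and judged by a human analyst; the search only narrows down where to look.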

Meta-argumentation
A number of papers have been written discussing meta-argument, but it seems that the term can be understood in several different ways, so some discussion of how it is being used here is necessary. For Daniel Cohen, for example, meta-arguments are arguments to which we appeal in order to justify accepting or rejecting other arguments, a conception which, he notes, may lead to an infinite regress: 'if in order to accept an argument, I need to accept this other, meta-argument, then there would have to be a meta-meta-argument for accepting the first meta-argument' (2001: 80). He does not, however, think that this is always the case. Certainly, we can say that his meta-arguments are more general arguments which support or undermine other arguments. This is a slightly different approach to that of Finocchiaro, who identifies three varieties of meta-argument in his volume (2013) dedicated to the topic.
Finocchiaro's treatment is long and detailed, but Blair, in his review of the book, sums up the first two thus: 'Argument analysis is the reasoned interpretation or evaluation (or both) of an argument, […]. Self-reflective reasoning and argumentation is argument analysis of an argument one constructs oneself' (Blair 2014: 224). To these is added Argumentation theory: 'Logical theory and argumentation theory are or ought to be instances of meta-argumentation' (Finocchiaro 2013: 15). These cover quite a scope, but it isn't immediately obvious into which category Cohen's meta-argument would most comfortably fit.
Krabbe discusses metadialogues rather than meta-argument, but these too are 'dialogue about a dialogue' (Krabbe 2003: 83). He suggests that there are three problems which arise from the concept. First, the demarcation problem: how can we tell a metadialogue apart from a 'ground level' one? Second, the problem of infinite regress: why not a dialogue about the rules of metadialogue, and a further dialogue about that dialogue, and so on? Third, there is an equity problem, since some 'retreats into metadialogue seem quite reasonable' while others are merely 'nit-picking or completely unwarranted charges' (2003: 83). Krabbe's conception grows from Hamblin's (1970) distinction between 'topic points' and 'points of order', where the former are found in the ground level dialogue and the latter seek to comment upon it. He goes on to point out that any discussion which takes place at what is called the opening stage in pragma-dialectics (see van Eemeren and Grootendorst 2004), where the ground-rules for an argumentation are agreed upon, will be a metadialogue. This is important to note, since it makes it clear that metadialogue is an important part of general argumentation, not a theoretical backwater of no relevance to real world arguers, and, as John Woods says, 'metadialogues can be embedded in ground level dialogues. An original dialogue can balloon into a metadialogue' (Woods 2007: 217).
Krabbe proposes a set of rules to govern metadialogue, but freely admits that they do not solve all the problems he listed at the beginning of his paper. Of most interest here is the question of demarcation. The rationale behind looking for examples of meta-argumentation using corpora was that such instances were thought more likely to contain the technical terminology of argument study: terminology which would be far easier to search for than 'raw', ground level argument which might be expressed in an infinite number of ways. This division, however, relies upon the notion that there are arguments and then there are comments upon arguments, which, though part of the same process, are actually something quite different. This is questioned by Blair, who suggests that on Finocchiaro's definitions of argument and meta-argument 'defending a claim against argued-for objections is meta-argument' (2014: 237), since raising an objection is to argue against the justification for the original argument, and any discussion of it becomes an argument about an argument, even though one is only defending one's original position. In another review, Dale Jacquette, through a long discussion, also questions the clarity of the division, noting 'we can hardly proceed to develop a theory concerning the properties and principles of meta-argumentation if we cannot properly distinguish meta-arguments from non-meta-arguments' (2014: 226).
Even if this distinction is tenable, and I have doubts about that, both are clearly very different from the practice of discussing arguments in general, or the entire discourse type of argumentation, so we might be better advised to withhold the term meta-argumentation from the description of actual arguments and employ it only about the theorising on argumentation and argument types in general. Experience with the corpus, however, soon showed this also to be a difficult separation to maintain. There are many examples of the pattern where an individual argument is discarded because, for instance, 'that's a slippery slope argument', which is both a comment on the individual argument and on the type in general. Indeed, so prevalent is the practice that I am tempted to coin a new fallacy, the Calling-fallacy fallacy, in which an antagonist claims to have defeated the protagonist of an argument by merely categorising it as an example of a known fallacy form.
In spite of these issues, the one clear advantage in focussing on meta-argument when using corpora which was mentioned above remains: the use of what Strawson called 'the logician's second order vocabulary' (1952: 15), in which Finocchiaro includes the terminology of fallacies, noting that as a result 'the best place to begin with in the study of fallacies, or at least a crucial phenomenon to examine, is allegations that fallacies are being committed' (1987: 264). Most of the examples turned up in the corpus are of such allegations. In this paper, therefore, despite its use in the title, no firm definition of meta-argument is adhered to, and no attempt is made in the analysis below to distinguish uses of the key terms in argument and in meta-argument. Still, since the language analysed employs technical terms of argumentation theory it can be referred to as meta-argumentational language, and it can be asserted that each instance is in some fashion a response to and comment upon another previously made argument, fitting Finocchiaro's simplest formulation where he defines 'a meta-argument as an argument about one or more arguments' (2007: 253).

Search Terms: Straw Man, Ad Hominem, Slippery Slope
The selection of the three terms with which to perform the corpus analysis was, to a degree, arbitrary. This, however, highlights the necessity of employing both intuitions and empirical data in the use of corpora, and indeed in all scientific enquiry: one must first have an idea of what experiment to conduct, and that involves the imagination and creativity which are often an unconsidered element of research. All three of the search terms were chosen because I had noticed their reasonably frequent use in non-specialist contexts. A search for 'post hoc, ergo propter hoc', for example, would probably not reveal much about the meta-argumentation of non-specialists, but that is only an intuition, and could easily be checked.
That the terms are all names of fallacies is a point worth dwelling on for a moment. Although the method of corpus searching is a purely descriptive one, by looking for phrases likely to be used pejoratively about arguments, as negative normative evaluations in meta-argument, it becomes possible to investigate 'good' and 'bad' reasoning, or at least perceptions thereof. That claim does, however, appear to presuppose a certain conception of fallacy, which is why the caveat concerning perceptions is important. What might be termed the traditional approach to fallacies, characterised by a list of well-known names of common argument errors, is not one I particularly endorse myself (Hinton forthcoming), and has been the subject of much criticism from Hamblin (1970) forward, not least by the authors of the pragma-dialectical approach, who endorse a conception of fallacy as a violation of the rules of critical discussion (van Eemeren and Grootendorst 2004). Still, this much acknowledged, the obvious terms to be searched for in a pilot study are those best known to the general public: if the technique is seen to produce valuable information, then theorists must seek more sophisticated ways to compare raw public discourse with theories of argument error.
For each of the terms the first question was to determine whether the uses made by laymen corresponded to the theoretical norm. Those norms are laid out below. Secondly, for each term there were individual research questions, designed, in keeping with the aims of the overall study, to discover what kinds of question corpora might be able to help to answer.
The term 'straw man argument' is well-known, and its definition largely uncontroversial. In pragma-dialectics it is a violation of rule 3: 'An attack on a standpoint must relate to the standpoint that has really been advanced by the protagonist'. This rule may be broken 'if a fictitious standpoint is imputed to the opponent, or if his standpoint has been distorted' (van Eemeren and Grootendorst 2015: 560). No attempt is made in this study to separate these two possibilities, but future research into the distinction between distorted and entirely fabricated Straw Men might make use of the same techniques. Walton (1996) also provides a thorough theoretical analysis of the fallacy and its relation to others, including ad hominem, and defines it thus: 'The straw man fallacy is committed where the proponent in a critical discussion misrepresents the position of the respondent with a simulated position, in order to appear to refute the respondent by carrying out a refutation of the simulated position' (1996: 126). Others have similar descriptions: 'the misrepresentation (deliberate or accidental) of a person's position, a subsequent attack on the misrepresentation, and the conclusion that the person's position has been refuted' (Tindale 2007: 12), and 'the misrepresentation of someone's position in order to easily refute that position' (Visser et al. 2018: 947). This last is provided on the basis of a review of a number of works including Walton (1996) and Lewiński and Oswald (2013) and will serve as a working definition of the argumentation research community norm of 'straw man'.
As well as assessing how closely actual uses of the term stayed to this norm, particular attention was paid to those uses which didn't: is there an understanding of the metaphor 'straw man' which goes beyond its technical use in argumentation theory, but can tell us something about the root of its use to describe a certain fallacy, and perhaps, therefore, about the fallacy itself?
Ad hominem is a more difficult term to pin down to one standard definition since several versions are recognised within the field. John Locke's 'to press a man with Consequences drawn from his own Principles, or Concessions' (1975: 686), is not the same as either the abusive form or the modern ad hominem defined by Tindale as 'an attack against the person delivering the argument rather than the position argued' (2007: 12). Nonetheless, I have taken Tindale's definition to represent the norm in contemporary argumentation theory for the purposes of this paper. In their elaboration of Rule 1 violations, van Eemeren and Grootendorst cite three types of 'personal attack on opponent' which constitute an ad hominem: the abusive, the circumstantial and the 'tu quoque' (2015: 559). They also note that an ad hominem may be delivered with a degree of subtlety, maintaining plausible deniability for the attacker. It may be the case that this 'indirectness goes so far as to invoke an emphatic denial that it is the intention to put pressure on the opposing party or to launch a personal attack on him. The threat or attack is presented as information with which the listener may do what he will' (2015: 568).
The questions addressed in this pilot-study, then, are obvious: does the use of the term adhere to this norm? And, what variety of ad hominem was being referred to, that is to say, what was the manner in which some element of the person was used against his argument? In addition, the attitude of the speaker, positive or negative, towards the argument so-labelled was recorded. Attitude was considered to be positive when an argument was named and accepted, and negative when it was so-named and rejected. Examples are given in the relevant sections.
Slippery slope arguments (SSAs) are a somewhat different case as theorists have argued considerably amongst themselves over how they are actually constituted. Walton's (1992, 2015) definitions do not seem to fit all the requirements of other scholars for a true slippery slope (e.g. Van der Burg 1991; Rizzo and Whitman 2003). There is a clear split between those scholars who feel that a slippery slope differs from other arguments from consequences by having a logical component to it, and those who regard such arguments as only one type of slippery slope; and their definitions are generally too long to be easily cited (see Hinton 2018, for a full discussion). Van Eemeren and Grootendorst have the relatively concise: 'Rejecting a course of action because it is supposed to lead us from bad to worse, whereas it is not necessary for the alleged consequences to occur at all' (2015: 564), as an example of an inappropriately applied argumentation scheme, which portrays the fallacy as an unacceptable move in the critical discussion, but does not delve into the structure of the argument or consider cases where the consequences are indeed necessary.
A very general norm for the definition of the slippery slope argument, with which I'm not entirely satisfied myself, would be: An argument that taking a first step will set off a process by which an undesirable situation will inevitably be reached.
On top of the theoretical debate over the nature of that process, there is also an attitude expressed in some quarters that slippery slope arguments are the exclusive domain of the reactionary right and that that is in itself reason enough to dismiss them. According to Philip Devine: 'Many critics have dismissed the SSA as the last resort of the traditionalists bereft of better arguments' (2018: 376). This would appear to be the kind of issue, involving empirical claims about real-world use, which a corpus study is particularly well-placed to address.
This study, then, sought to investigate whether uses fit with the scholarly norm, which type of slope they were referring to, and in what areas of debate they were being used. As with ad hominem above, the attitude of the person using the phrase was also noted.

Method
The three studies were all conducted using the NOW (News on the Web) corpus (Davies 2013), one of several corpora hosted by Brigham Young University, Utah. This particular corpus gathers data from news websites across the English-speaking world. It is free to access and easy to use, requiring no technical knowledge, and now features 6,000,000,000 words. A simple key word search returns information on the date, country, and publication in which the instance of the word appears. The word is shown within the sentence it featured in, and this context can be expanded to paragraph length. Should wider context be necessary, there is a link to the original website. In all cases in this study, the paragraph-length context was checked, and in cases of doubt, the full article consulted.
The NOW corpus was chosen for a number of reasons. Firstly, it is constantly updated and reflects the most recent examples of language use available. Secondly, it has an excellent spread of English from around the world: this study features examples from Singapore, India, Malaysia and Zimbabwe, as well as Canada, Australia, the USA and the UK. Thirdly, although the corpus features a wide range of news sites, from the Washington Post and the Guardian, to local community news portals, it can be expected that the majority of those contributing are journalists, and, as such, have some education and writing ability. It is also true that many examples are to be found in opinion pieces, rather than purely factual news items. All of these features were considered advantages for this particular study, and the choice of corpus should always be made carefully with such factors in mind. These pilots focus on contemporary English, have no geographical limitations, and have a special interest in the way that opinion formers and professional writers use terms from argumentation theory, as their use is likely to spread quickly among their readers.
The results of the search are presented in chronological order, newest first. In each case the 50 most recent, unique instances of the search term were taken for analysis. This quantity, whilst somewhat arbitrary, was considered enough to show the value of the methodology, while still being manageable. A full study would, obviously, require far more data, so there seemed little point in increasing the number of examples studied unless the increase were manifold. At the same time, a quantity of fewer than 50 would hardly have allowed for the drawing of even illustrative conclusions. The corpus search engine does not turn up multiple instances of the same use in the same place; however, there are cases of the same article featuring on multiple websites, of the same author using the term repeatedly with the same reference, and of multiple quotations. An example of the last of these was the frequent repetition of a Canadian government official claiming that his government would not respond to comments from President Trump with ad hominem arguments. This statement was cited in many Canadian online newspapers. In all of these cases, repetitions were discarded. This meant that as many as 70 or 80 instances had to be analysed in order to reach the target of 50 unique examples.
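The selection procedure described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the search hits have been copied by hand into simple records; the Hit class, its fields and the normalise function are hypothetical and are not part of the NOW interface. In the study itself, repetitions were identified by reading each example, and matching on normalised text would catch only the most literal of them, such as a syndicated article or a repeated quotation.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Hit:
    """One search hit; a hypothetical record, not a NOW export format."""
    published: date
    source: str
    context: str  # paragraph-length context around the search term


def normalise(text: str) -> str:
    """Lower-case and collapse punctuation so literal repeats compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def most_recent_unique(hits, target=50):
    """Return the `target` most recent unique hits and the number examined."""
    seen = set()
    sample = []
    examined = 0
    # Newest first, mirroring the order in which results are presented.
    for hit in sorted(hits, key=lambda h: h.published, reverse=True):
        examined += 1
        key = normalise(hit.context)
        if key in seen:
            continue  # a repetition: discard it and keep reading
        seen.add(key)
        sample.append(hit)
        if len(sample) == target:
            break
    return sample, examined
```

The second return value records how many hits had to be read in order to reach the target, which corresponds to the 70 or 80 instances mentioned above.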
Each example was examined by only one person, the author, a native speaker of English and argumentation scholar. Obviously, in larger scale studies, both a greater number of examples and a greater number of assessors would be used, which would allow the calculation of inter-annotator agreement. I reiterate that this paper is intended to report on the possibilities of investigation via corpora, rather than to make claims about the terms studied. Some elements of the analysis were of a subjective nature, others less so. Deciding whether the use of a word fitted with the norms given in the section above was a question of individual judgement; recording the topic of debate in which slippery slopes were referred to, was not.
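As an indication of what the calculation of inter-annotator agreement might involve in a larger study, the following sketch computes Cohen's kappa for two assessors. The choice of kappa is an assumption made here for illustration, since no particular measure was used in these pilots, and the function name and labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

A value of 1 indicates complete agreement and a value of 0 indicates agreement no better than chance. The more subjective judgements, such as whether a use fits the norm, are those for which such a figure would be most informative.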

Results and Discussion
The results of each pilot-study are given in separate sub-sections, and a discussion of their significance follows in Sect. 4. The conclusions to the overall study are presented below in Sect. 5.

Straw Man
The search was conducted on the 21st of August 2018 and returned a total of 1008 hits. Of the 50 uses of the phrase 'straw man' analysed in the study, 36 were examples of meta-argumentation and in the other 14 instances the phrase was used as a metaphorical description of something else, not an argument. Table 1 shows the approximate meanings.
There are two broad themes here, weakness and false appearance. What is interesting is the way in which these are carried over into the argumentational uses of 'straw man'. Perhaps surprisingly, only 13 of the 36 instances of 'straw man' in meta-argumentation complied with the norm for the term. The alternative uses are recorded in Table 2.
In this small sample, then, straw man is used with equal frequency to mean any fallacious argument and in compliance with the technical norm. There are clear relations, however, with the meanings listed in Table 1. This suggests that the metaphor of the straw man has a life of its own outside of argumentation theory and its use in the phrase 'straw man argument' is only one of many possible meanings. It should be noted that in two cases in Table 2, marked with inverted commas, the definition was actually provided in the text, perhaps because the authors knew there were alternative definitions in use.

Ad Hominem
The search was conducted on the 24th of August 2018 and returned 2161 hits. Of the 50 analysed results, 2 instances were of the phrase being used as a name, and another 24 instances were not in the context of argumentation, leaving 24 examples of meta-argument, of which 19 were considered to be in accordance with the norm. In 4 of those cases which did not fit the norm, the meaning was statements which are pure insults. Since those 'arguments' themselves are not always given, it is difficult to classify these cases: are they, in fact, dealing with what would normally be classified as arguments; is it possible for an argument to be an insult at the same time? One other case described a type of ad hominem which was called the straw man fallacy. While in some cases an instance of a straw man argument might be classed as an ad hominem, it is not normal practice to do this. Of the 19 cases which did fit the norm, 5 were clearly of the abusive ad hominem variety; the other 14 appeared to relate to the general norm, but were not specific as to the form of the argument. This proved to be a major difficulty in the classification of argument types (see slippery slopes below), as those using the epithet 'ad hominem' rarely set out the argument they were using it to refer to. It was, therefore, impossible for the researcher to be sure how they were using it, although further exploration of the context might have revealed more. There was, however, no evidence at all of the phrase being used in the Lockean sense described above.
One noticeable feature about the use of 'ad hominem' was found in the words collocating with it. Table 3 shows what immediately followed the phrase in the 48 instances excluding its use as a name.
As the table makes clear, the phrase 'ad hominem attack' has gained currency, and, in this sample at least, is far more in use than the more neutral 'ad hominem argument'. This fact has important repercussions for the teaching of argumentation theory and, more widely, critical thinking. There is no reason why arguments described as ad hominem by theorists should always be considered as attacks on the person being referred to. Technically, all such arguments are, indirectly, attacks on another argument which has provoked them, using an aspect of the arguer, rather than the reasoning, to gain their force. This subtlety does not appear to be much considered in the general use of the phrase.
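A count of the words immediately following the phrase, of the kind summarised in Table 3, could be produced along the following lines. This is a sketch only: the function name is hypothetical, the contexts are assumed to be available as plain strings, and the counts in this study were made by reading the examples rather than by running code.

```python
import re
from collections import Counter


def right_collocates(contexts, phrase="ad hominem"):
    """Count the word immediately following `phrase` in each context."""
    # The separator excludes '.', '!' and '?', so a word that begins a
    # new sentence is not counted as a collocate.
    pattern = re.compile(re.escape(phrase) + r"[^\w.!?]+(\w+)", re.IGNORECASE)
    counts = Counter()
    for text in contexts:
        for match in pattern.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts
```

Calling most_common() on the result lists the collocates in descending order of frequency, which would make a preference for 'attack' over 'argument' immediately visible in a larger sample.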
Despite the overwhelming negativity directed towards 'ad hominems', there were two instances of positive, or at least neutral, attitudes on display. In one, a politician described the legitimate part of the election campaign in which the character and record of candidates is examined as ad hominem scrutiny, and in a second, the writer described his own comments as 'ad hominem', explaining that such a tactic was all that was left open to him since his opponent had advanced no coherent argument of his own.
Example 2
There will also be the others who'll take me on to say I'm just another 'ad hominem' hack who's playing the man and not the ball. But that's my whole point. Kallie Kriel didn't bring a ball for us to attempt to play with […] In such circumstances the only reasonable thing to do is to tackle the man, and maybe give him a kick in the balls too, for good measure (Cilliers 2018).

Slippery Slope Argument
The search for 'slippery slope argument' was conducted on the 29th of August 2018, and returned 5423 results. The word 'argument' was added in this study as previous research by the author had found a large number of uses of 'slippery slope' as a metaphor for a difficult and deteriorating position. As a result, all 50 of the analysed uses were cases of meta-argumentation. The problems concerning classification referred to in the sub-section above made the assessment of the use of the term 'slippery slope argument' very difficult.
The key difference between the types of argument known as slippery slope in the research literature is the mechanism by which they are made slippery, that is to say, the reason why the first step leads to the next and the next and so on. In the examples from the corpus, this process is almost never mentioned, much less explained. Even with careful examination of the context of use, it is impossible to identify the type of argument being labelled 'slippery slope' since those arguments are not always repeated by the person commenting on them. Indeed, examples where the name is referring to the writer's own argument suggest that those making slippery slope arguments don't consider or explain the mechanics of the argument anyway. Examples I have used in my own work on the subject (Hinton 2018) have required a good deal of reconstruction and generous interpretation. No division, therefore, was made between logical and physical slopes of consequences in this study.
Two interesting sets of results were recorded, however. Firstly, although attitudes towards slippery slopes were in the majority negative, a good number of users took a different view: 10 instances treated the argument so-named positively, and a further 14 in a neutral fashion, generally treating it as a serious argument even if not necessarily a persuasive one. This is interesting in the light of Walton's comment that, 'as one looks through the literature on slippery slope arguments, it is difficult or even impossible to find a single example of one that meets all the requirements for being a reasonable argument' (Walton 2015: 284). That would appear to leave 26 negative instances. In fact, there were 27, as one writer referred to his opponent's slippery slope negatively, but then offered one of his own, so has been counted in both columns.
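The arithmetic here can be made explicit by treating each instance as carrying a set of attitude labels rather than a single one. The sketch below assumes that the double-counted instance falls in the positive and negative columns; the data are reconstructed from the totals reported above, not taken from the annotation records, and the function name is hypothetical.

```python
from collections import Counter


def tally_attitudes(annotations):
    """Count attitude labels where an instance may carry more than one."""
    counts = Counter()
    for labels in annotations:
        counts.update(labels)
    return counts


# Reconstructed from the reported totals (an assumption, not the raw data):
# one instance is both positive and negative.
annotations = (
    [{"positive"}] * 9
    + [{"neutral"}] * 14
    + [{"negative"}] * 26
    + [{"positive", "negative"}]
)
counts = tally_attitudes(annotations)
```

On this reading there are 50 instances but 51 labels, which is why the three columns sum to one more than the size of the sample.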
Here is an example of arguments being referred to as slippery slopes, but also being considered good arguments on the basis that they have been borne out by the facts.

Example 3
The slippery slope argument may be put in two categories, both of which are cause for concern. First there is the "practical" slippery slope, which involves abuse of the law. As the Nathaniel Centre has pointed out, there is robust evidence from Belgium and the Netherlands that the law is being routinely violated; large numbers of cases are not reported and significant numbers of people are euthanised without giving their consent, as is required by the law. Secondly, there is a "logical" slippery slope, which is the expansion of the persons for whom and the situations in which assisted suicide or euthanasia is permitted. This has been the pattern in Europe (Otto 2018).
Another interesting example shows a slippery slope argument being rejected apparently because of the lack of a causal link ('no natural progression') from one step to another, which keeps open the possibility that the slippery slope argument form itself may be reasonable, where such a progression does exist. In such cases the attitude was considered negative, even though the argument type is not explicitly rejected, only this instance.
Example 4
Aside from such choices not yet being medically possible, the slippery slope argument may falter because there's no natural progression between approving non-medical sex selection and approving being able to select other characteristics. Sex selection is a discrete choice around which a definite boundary can be drawn (Olver 2018).
Secondly, the assumption that slippery slope arguments, or at least accusations of their use, were especially prevalent within certain areas of debate was shown to be largely correct. Of the 50 uses of the term analysed, 11 were concerned with one-off topics, ranging from sushi to Brexit. The other 39 are recorded in Table 4.
While the numbers are small, of course, the majority of uses do seem to be found in debates over ethical, rights-based issues; and, given that such arguments by their nature are warnings against taking a new step, it is fair to say that they are being employed by those with a more conservative approach to these particular subjects. What significance might be attached to that finding is beyond the scope of this study.

Discussion
In this section I make reference to particular areas of argumentation theory research which could benefit from the use of the techniques of corpus linguistics. First, and most obviously, in the field of computational argumentation, software which is used to analyse argumentation patterns must first be trained on annotated corpora of relevant texts, and this is the object of much of the work discussed above in Sect. 1.1. Equally, the development of artificially intelligent systems which can construct arguments and express them in ways which are familiar and intelligible to humans requires machine learning from corpora of human argumentative discourse. These, however, are specialist fields and the details of how corpora can be utilised by computational argument developers are beyond the scope of this paper.
In terms of more traditional concerns, it seems clear that corpora can be of tremendous assistance in comparing theoretical constructs with the reality of practice. This is true of the study of rhetorical devices, where a corpus of texts can reveal how and when such devices are used, and also of theories which describe argumentative discourse. The model of the critical discussion in pragma-dialectics, for instance, is not meant as a strict reflection of real-world discourse, but still, it can both inform the way that we analyse texts and be informed by what is found therein: we may ask whether the four stages of the discussion are made explicit in actual arguments, when, and how. A model of ideal argumentation does not collapse because real arguers are not ideal, and yet the power of the model depends on its ability to reflect, to some degree, the elements which are present in our experience of the phenomenon. Example 2 above provides evidence of how theorists' ideas of norms of argument conduct may clash with reality: the writer claims that an apparently unacceptable argument move is justified by the circumstances in this case.
There are various claims about how arguments work and the features they have which can be supported or doubted with the help of evidence from a corpus. For example, in his work on emotions in argument, Macagno (2014) notes that: 'Sometimes emotive words are needed in order to summarize in a condensed argument a more complex type of reasoning' (Macagno 2014: 118). Macagno bases his analysis on examples drawn from political speeches and other materials produced during the Italian general election of 2013, but cannot really be said to have used corpus linguistics techniques as no details are given of the collection of texts and there is no description of a systematic analysis, rather, examples are chosen to illustrate points the author wishes to make. This is not to criticise Macagno's paper, which contains a great deal which is of interest, but it is a case in which the application of a rigorous corpus-based study could provide much relevant information as to how often emotive terms are 'condensed arguments', in what contexts such uses occur, and so on. The possibilities for this type of investigation are practically endless, and the more one investigates a corpus, the more one is likely to be inspired to initiate further research.
This paper has looked particularly at how techniques of corpus linguistics can be used in the study of meta-argument, and, as far as keyword searches are concerned, I see this as an area in which corpora can be used to radically improve understanding and debate. Since meta-argument may often, though not always, involve the use of the meta-vocabulary of argumentation theory, examples can be discovered easily by concentrating on particular lexical items. The pilot studies presented here used well-known fallacy names, but the same approach could be applied to terms such as 'premise' and 'conclusion', or indeed 'false premise' and 'wrong conclusion'. Large scale studies looking at the use of these terms would provide a wealth of information to theorists about the way in which the non-specialist public comment on the argumentation they hear around them (the types of evidence they expect, the elements which they find important) and the degree to which they understand the underlying structure of arguments and the normative rules of 'good' argument practice. This is equally true no matter what one thinks of the theoretical value of the terms themselves.
Examples 3 and 4 both show a high level of sophistication in their discussion of slippery slopes: Example 3 draws a distinction between 'practical' and 'logical' slippery slopes, and Example 4 throws doubt on whether a particular instance is driven by a 'natural progression'. In cases such as this, where there is significant disagreement amongst theorists as to how slippery slopes are to be defined, observation of how they are discussed on the ground, so to speak, might also ultimately inform the theory.
An examination of meta-argumentation amongst real-world arguers is a vital element in the design and development of programmes and courses aiming to improve the level of public debate and to increase the critical thinking skills of those involved. Only by knowing how the public argues, and how it comments upon the argument it hears put forth, can educators and activists know what needs to be taught; and only through monitoring the use of the terms employed in such training can we be sure that they have been understood and that the concepts they reflect have taken hold.
Example 1, and the results in general for 'straw man', show that terms can be slippery and their use may not be strictly in line with that of theorists. Once a term is part of language it will lead a life of its own: there is no doubt that the underlying concept being used in Example 1 is in line with the norm (a 'fake policy conjured up' is a fictitious standpoint being attributed to the speaker), and yet the linguistic application of the term is idiosyncratic in that it refers to the invention of that standpoint, not to any argument against it.
In sum, it can be stated that these techniques enabling scholars to deal with large quantities of naturally produced discourse could allow argumentation theorists to move away from a reliance on intuitions, as linguists have done, and develop a more symbiotic relationship with the argument actually occurring all around them.

Conclusions
The first research question asked if the methodology of corpus linguistics is appropriate to the study of argumentation. Since each of the pilot-studies produced a good deal of relevant data, and revealed interesting, and sometimes surprising, uses of terms common in argumentation theory, there can be little doubt that this methodology could be effective in answering certain questions concerning general use and understanding of concepts in the study of argument.
The second aim was to uncover the possibilities offered by the use of corpora. While conducting the study, it became clear that not only could the corpus search direct a researcher to examples of technical terms being used in meta-argumentation, but also, by reading further in the work highlighted or following links provided therein, the original argumentation could often, though not always, be uncovered. In this way, the corpus search could help to find examples of particular argument forms, while at the same time showing the reaction to them. The 'connectedness' of materials published on the internet means that a single-word search can prove a pathway to a large volume of related and relevant material, all of it real-world argumentation. It is also worth noting that data relating to changes over time and differences between countries and regions can be generated.
Having stated these possibilities for research, it is, perhaps, almost unnecessary to raise the final research question: to what uses can the results of such studies be put? The only real limit here is the imagination of researchers; however, I suggest two main areas. Firstly, the wider project of improving public debate and educating the populace in the skills of critical thinking can only be helped by a better understanding of the current state of knowledge of the technical terms and concepts of argumentation. This is in line with the conclusions of Goodwin and Cortes (2010) outlined above. Secondly, theorists could obtain a more accurate picture of how arguments are used, by whom, and to what reception, allowing claims on such matters to be evidence-, rather than intuition-, based.
Finally, while it is hoped that this work may inspire further studies employing corpora, it should be remembered that a full-scale study into the use of any term would require large data sets, multiple assessors, clear assessment guidelines, and thorough statistical analysis, to be of any significance: the corpus is an amazing tool, but it only does part of the work.

Conflict of interest
The author declares that he has no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.