Abstract
This volume is the third in a series of books published by Palgrave Macmillan which focus on establishing guidelines for the creation and digitization of language corpora that are unconventional in some respect (see Beal et al. 2007a, b). Volume 3 is dedicated to the issue of public engagement and questions of how linguists can and should make their corpora accessible for a broader range of uses and to a wider audience. Although in this regard the road to building a corpus is often paved with good intentions, as Rickford (1993: 130) observes, these are frequently overtaken by ‘the less escapable commitments’ of teaching and further research. While this may be understandable, it is ‘not a picture, when we step back and view it, with which we can be proud’, since it means that ‘[m]ost of us fall short of paying our debts to the communities whose data have helped to build and advance our careers’ (Rickford 1993: 130). The importance of taking public engagement initiatives more seriously has generated considerable recent scholarly debate (especially amongst researchers in the arts, humanities and social sciences) as the so-called ‘impact agenda’ has taken hold particularly, though not exclusively, in UK higher education institutions (Lawson and Sayers 2016; Martin 2011; Samuel and Derrick 2015). A key objective of this volume is to examine the evidence for the view that despite the new requirements by funding bodies (and ultimately governments) that corpora should have a dual purpose as data that is deployable for engagement as well as research, twenty-first-century corpus linguists who do just that are not following conventional practices within their discipline. A second goal is to demonstrate how the issues that purportedly stand in the way of developing what one might term ‘impactful corpora’ can be circumvented (as our contributors have done) with a little ingenuity and motivation. 
Another objective is to sketch what we consider to be best practices in creating corpora for public engagement by offering guidance on optimal methods by which such data (audio, text and still/moving images) can be created, digitized and subsequently exploited for public engagement projects.
The term ‘unconventional’ here relates to the distinction first articulated in Beal et al. (2007a, b) between large-scale standardized or conventional corpora like the International Corpus of English or COBUILD and smaller more specialized databases. These are often not devised at the outset as corpora strictly speaking since they initially arise from sociolinguistically oriented projects, but such resources can indeed be used as such providing they are ‘tamed’ in particular ways (Beal et al. 2007a: 1). See also D’Arcy (2011: 54–6) and Kendall (2011: 362–3).
Notes
- 1. The introduction by Lawson and Sayers (2016) to their Routledge volume, which explores the possibilities for combining sociolinguistic research with the impact agenda, offers an excellent historical overview of how this ideology developed and its implications for scholarship from the 1980s to the present day.
- 2.
- 3. This can be gathered from the published guidelines: ‘an effect on, change or benefit to the economy, society, culture, public policy or services, health, the environment or quality of life, beyond academia’ (Research Excellence Framework 2011: 26).
- 4. See also: http://www.rcuk.ac.uk/research/openaccess.
- 5. See: http://www.w3.org.
- 6. In the North American sense of non-fee-paying, state-funded educational settings.
References
Books and Articles
Allen, Will, Joan C. Beal, Karen P. Corrigan, Hermann L. Moisl, and Warren Maguire. 2007. The Newcastle Electronic Corpus of Tyneside English. In Creating and Digitizing Language Corpora: Vol. 2, Diachronic Databases, eds. Joan C. Beal, Karen P. Corrigan, and Hermann L. Moisl, 16–48. Basingstoke: Palgrave Macmillan.
Bauer, Laurie. 2004. Inferring variation and change from public corpora. In The Handbook of Language Variation and Change, 1st edn, eds. J.K. Chambers, and Natalie Schilling, 97–114. Malden: Blackwell.
Beal, Joan C., Karen P. Corrigan, and Hermann L. Moisl, eds. 2007a. Creating and Digitizing Language Corpora: Vol. 1, Synchronic Databases. Basingstoke: Palgrave Macmillan.
———, eds. 2007b. Creating and Digitizing Language Corpora: Vol. 2, Diachronic Databases. Basingstoke: Palgrave Macmillan.
Beal, Joan C., and Karen P. Corrigan. 2013. Working with unconventional existing data resources. In Data Collection in Sociolinguistics: Methods and Applications, eds. Becky Childs, Christine Mallinson, and Gerard van Herk, 213–216. London: Routledge.
Beal, Joan C., Karen P. Corrigan, Adam J. Mearns, and Hermann L. Moisl. 2014. The Diachronic Electronic Corpus of Tyneside English: annotation and dissemination practices. In The Oxford Handbook of Corpus Phonology, eds. Jacques Durand, Ulrike Gut, and Gjert Kristoffersen, 517–533. Oxford: Oxford University Press.
Cameron, Deborah, Elizabeth Frazer, Penelope Harvey, Ben Rampton, and Kay Richardson. 1997. Ethics, advocacy and empowerment in researching language. In Sociolinguistics, eds. Nikolas Coupland, and Adam Jaworski, 145–162. Houndmills: Macmillan. (Originally published in Language and Communication 13(2): 81–94 in 1993.)
Childs, Becky, Gerard van Herk, and Jennifer Thorburn. 2011. Safe harbour: ethics and accessibility in sociolinguistic corpus building. Corpus Linguistics and Linguistic Theory 7(1): 163–180.
Choudrie, Jyoti, Susan Grey, and Nicholas Tsitsianis. 2010. Evaluating the digital divide: the Silver Surfer’s perspective. Electronic Government, An International Journal 7(2): 148–167.
Corrigan, Karen P., Adam J. Mearns, and Hermann L. Moisl. 2013. Data-mining the DECTE Corpus: phonological and morphological variability in Tyneside English. In Cross-Linguistic and Language-Internal Variation in Text and Speech, eds. Benedikt Szmrecsanyi, and Bernhard Wälchli, 113–149. Berlin: Walter de Gruyter.
D’Arcy, Alexandra. 2011. Corpora: capturing language in use. In Analysing Variation in English, eds. Warren Maguire, and April McMahon, 49–71. Cambridge: Cambridge University Press.
Day, Timothy. 2001. The National Sound Archive: the first fifty years. In Aural History: Essays on Recorded Sound, ed. Andy Linehan, 41–64. London: The British Library.
Durand, Jacques, Ulrike Gut, and Gjert Kristoffersen, eds. 2014. The Oxford Handbook of Corpus Phonology. Oxford: Oxford University Press.
Kendall, Tyler. 2007. The Sociolinguistic Archive and Analysis Project: empowering the sociolinguistic archive. Penn Working Papers in Linguistics 13(2): 15–26.
———. 2008. On the history and future of sociolinguistic data. Language and Linguistics Compass 2(2): 332–351.
———. 2011. Corpora from a sociolinguistic perspective. Revista Brasileira de Linguística Aplicada 11(2): 361–389.
Kretzschmar, William A., Jean Anderson, Joan C. Beal, Karen P. Corrigan, Lisa-Lena Opas-Hänninen, and Bartek Plichta. 2006. Collaboration on corpora for regional and social analysis. Journal of English Linguistics 34(3): 172–205.
Labov, William. 1982. Objectivity and commitment in linguistic science. Language in Society 11: 165–201.
Lawson, Robert, and Dave Sayers. 2016a. Introduction. In Sociolinguistic Research: Application and Impact, eds. Robert Lawson, and Dave Sayers, 1–6. London: Routledge.
Lawson, Robert, and Dave Sayers. 2016b. Where we’re going, we don’t need roads: the past, present, and future of impact. In Sociolinguistic Research: Application and Impact, eds. Robert Lawson, and Dave Sayers, 7–22. London: Routledge.
Martin, Ben R. 2011. The Research Excellence Framework and the ‘impact agenda’: are we creating a Frankenstein monster? Research Evaluation 20(3): 247–254.
Norris, Pippa. 2001. Digital Divide: Civic Engagement, Information Poverty and the Internet in Democratic Societies. New York: Cambridge University Press.
Perks, Robert P. 2011. Messiah with a microphone? Oral historians, technologies and sound archives. In The Oxford Handbook of Oral History, ed. Donald A. Ritchie, 315–332. Oxford: Oxford University Press.
Reaser, Jeffrey, and Carolyn Temple Adger. 2007. Developing language awareness materials for non-linguists: lessons learned from the Do You Speak American? project. Language and Linguistics Compass 1(3): 155–167.
Rickford, John. 1993. Comments on ‘ethics, advocacy and empowerment’. Language and Communication 13(2): 129–131.
Robertson, Beth M. 2011. The archival imperative: can oral history survive the funding crisis in archival institutions? In The Oxford Handbook of Oral History, ed. Donald A. Ritchie, 393–408. Oxford: Oxford University Press.
Rowlands, Ian, David Nicholas, Peter Williams, Paul Huntington, Maggie Fieldhouse, Barrie Gunter, Richard Withey, Hamid R. Jamali, Tom Dobrowolski, and Carol Tenopir. 2008. The Google Generation: the information behaviour of the researcher of the future. Aslib Proceedings 60(4): 290–310.
Samuel, Gabrielle N., and Gemma E. Derrick. 2015. Societal impact evaluation: exploring evaluator perceptions of the characterization of impact under the REF2014. Research Evaluation 24: 229–241.
Smith, Abby, David Allen, and Karen Allen. 2004. Survey of the State of Audio Collections in Academic Libraries. Washington, DC: Council on Library and Information Resources.
Wolfram, Walt. 1993. Ethical considerations in language awareness programs. Issues in Applied Linguistics 4: 225–255.
———. 2012. In the profession: connecting with the public. Journal of English Linguistics 40(1): 111–117.
———. 2013. Community, commitment and responsibility. In The Handbook of Language Variation and Change, 2nd edn, eds. J.K. Chambers, and Natalie Schilling, 555–576. Malden: Wiley/Blackwell.
———. 2016. Public sociolinguistic education in the United States: a proactive, comprehensive program. In Sociolinguistic Research: Application and Impact, eds. Robert Lawson, and Dave Sayers, 87–108. London: Routledge.
Wolfram, Walt, Jeffrey Reaser, and Charlotte Vaughan. 2008. Operationalizing linguistic gratuity: from principle to practice. Language and Linguistics Compass 2(6): 1109–1134.
Websites and Online Resources
RCUK Policy on Open Access. http://www.rcuk.ac.uk/research/openaccess (accessed 24 June 2015).
REF: Research Excellence Framework. http://www.ref.ac.uk/pubs/2011-02 (accessed 24 June 2015).
W3C: World Wide Web Consortium. http://www.w3.org (accessed 13 July 2015).
Copyright information
© 2016 The Author(s)
About this chapter
Cite this chapter
Corrigan, K.P., Mearns, A. (2016). Taming Digital Texts, Voices and Images for the Wild: Models and Methods for Handling Unconventional Corpora to Engage the Public. In: Corrigan, K., Mearns, A. (eds) Creating and Digitizing Language Corpora. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-38645-8_1
DOI: https://doi.org/10.1057/978-1-137-38645-8_1
Publisher Name: Palgrave Macmillan, London
Print ISBN: 978-1-137-38644-1
Online ISBN: 978-1-137-38645-8
eBook Packages: Social Sciences (R0)