Taming Digital Texts, Voices and Images for the Wild: Models and Methods for Handling Unconventional Corpora to Engage the Public

  • Karen P. Corrigan
  • Adam Mearns


This volume is the third in a series of books published by Palgrave Macmillan which focus on establishing guidelines for the creation and digitization of language corpora that are unconventional in some respect (see Beal et al. 2007a, b). Volume 3 is dedicated to the issue of public engagement and to questions of how linguists can and should make their corpora accessible for a broader range of uses and to a wider audience. Although the road to building a corpus is often paved with good intentions in this regard, Rickford (1993: 130) observes that these are frequently overtaken by ‘the less escapable commitments’ of teaching and further research. While this may be understandable, it is ‘not a picture, when we step back and view it, with which we can be proud’, since it means that ‘[m]ost of us fall short of paying our debts to the communities whose data have helped to build and advance our careers’ (Rickford 1993: 130). The importance of taking public engagement initiatives more seriously has generated considerable recent scholarly debate (especially amongst researchers in the arts, humanities and social sciences) as the so-called ‘impact agenda’ has taken hold, particularly, though not exclusively, in UK higher education institutions (Lawson and Sayers 2016; Martin 2011; Samuel and Derrick 2015). A key objective of this volume is to examine the evidence for the view that, despite new requirements from funding bodies (and ultimately governments) that corpora should serve a dual purpose as data deployable for engagement as well as research, twenty-first-century corpus linguists who do just that are not following conventional practices within their discipline. A second goal is to demonstrate how the issues that purportedly stand in the way of developing what one might term ‘impactful corpora’ can be circumvented (as our contributors have done) with a little ingenuity and motivation.
Another objective is to sketch what we consider to be best practices in creating corpora for public engagement by offering guidance on optimal methods by which such data (audio, text and still/moving images) can be created, digitized and subsequently exploited for public engagement projects.



Books and Articles

  1. Allen, Will, Joan C. Beal, Karen P. Corrigan, Hermann L. Moisl, and Warren Maguire. 2007. The Newcastle Electronic Corpus of Tyneside English. In Creating and Digitizing Language Corpora: Vol. 2, Diachronic Databases, eds. Joan C. Beal, Karen P. Corrigan, and Hermann L. Moisl, 16–48. Basingstoke: Palgrave Macmillan.
  2. Bauer, Laurie. 2004. Inferring variation and change from public corpora. In The Handbook of Language Variation and Change, 1st edn, eds. J.K. Chambers, and Natalie Schilling, 97–114. Malden: Blackwell.
  3. Beal, Joan C., Karen P. Corrigan, and Hermann L. Moisl, eds. 2007a. Creating and Digitizing Language Corpora: Vol. 1, Synchronic Databases. Basingstoke: Palgrave Macmillan.
  4. ———, eds. 2007b. Creating and Digitizing Language Corpora: Vol. 2, Diachronic Databases. Basingstoke: Palgrave Macmillan.
  5. Beal, Joan C., and Karen P. Corrigan. 2013. Working with unconventional existing data resources. In Data Collection in Sociolinguistics: Methods and Applications, eds. Becky Childs, Christine Mallinson, and Gerard van Herk, 213–216. London: Routledge.
  6. Beal, Joan C., Karen P. Corrigan, Adam J. Mearns, and Hermann L. Moisl. 2014. The Diachronic Electronic Corpus of Tyneside English: annotation and dissemination practices. In The Oxford Handbook of Corpus Phonology, eds. Jacques Durand, Ulrike Gut, and Gjert Kristoffersen, 517–533. Oxford: Oxford University Press.
  7. Cameron, Deborah, Elizabeth Frazer, Penelope Harvey, Ben Rampton, and Kay Richardson. 1997. Ethics, advocacy and empowerment in researching language. In Sociolinguistics, eds. Nikolas Coupland, and Adam Jaworski, 145–162. Houndmills: Macmillan. (Originally published in Language and Communication 13(2): 81–94 in 1993.)
  8. Childs, Becky, Gerard van Herk, and Jennifer Thorburn. 2011. Safe harbour: ethics and accessibility in sociolinguistic corpus building. Corpus Linguistics and Linguistic Theory 7(1): 163–180.
  9. Choudrie, Jyoti, Susan Grey, and Nicholas Tsitsianis. 2010. Evaluating the digital divide: the Silver Surfer’s perspective. Electronic Government, An International Journal 7(2): 148–167.
  10. Corrigan, Karen P., Adam J. Mearns, and Hermann L. Moisl. 2013. Data-mining the DECTE Corpus: phonological and morphological variability in Tyneside English. In Cross-Linguistic and Language-Internal Variation in Text and Speech, eds. Benedikt Szmrecsanyi, and Bernhard Wälchli, 113–149. Berlin: Walter de Gruyter.
  11. D’Arcy, Alexandra. 2011. Corpora: capturing language in use. In Analysing Variation in English, eds. Warren Maguire, and April McMahon, 49–71. Cambridge: Cambridge University Press.
  12. Day, Timothy. 2001. The National Sound Archive: the first fifty years. In Aural History: Essays on Recorded Sound, ed. Andy Linehan, 41–64. London: The British Library.
  13. Durand, Jacques, Ulrike Gut, and Gjert Kristoffersen, eds. 2014. The Oxford Handbook of Corpus Phonology. Oxford: Oxford University Press.
  14. Kendall, Tyler. 2007. The Sociolinguistic Archive and Analysis Project: empowering the sociolinguistic archive. Penn Working Papers in Linguistics 13(2): 15–26.
  15. ———. 2008. On the history and future of sociolinguistic data. Language and Linguistics Compass 2(2): 332–351.
  16. ———. 2011. Corpora from a sociolinguistic perspective. Revista Brasileira de Linguística Aplicada 11(2): 361–389.
  17. Kretzschmar, William A., Jean Anderson, Joan C. Beal, Karen P. Corrigan, Lisa-Lena Opas-Hänninen, and Bartek Plichta. 2006. Collaboration on corpora for regional and social analysis. Journal of English Linguistics 34(3): 172–205.
  18. Labov, William. 1982. Objectivity and commitment in linguistic science. Language in Society 11: 165–201.
  19. Lawson, Robert, and Dave Sayers. 2016a. Introduction. In Sociolinguistic Research: Application and Impact, eds. Robert Lawson, and Dave Sayers, 1–6. London: Routledge.
  20. Lawson, Robert, and Dave Sayers. 2016b. Where we’re going, we don’t need roads: the past, present, and future of impact. In Sociolinguistic Research: Application and Impact, eds. Robert Lawson, and Dave Sayers, 7–22. London: Routledge.
  21. Martin, Ben R. 2011. The Research Excellence Framework and the ‘impact agenda’: are we creating a Frankenstein monster? Research Evaluation 20(3): 247–254.
  22. Norris, Pippa. 2001. Digital Divide: Civic Engagement, Information Poverty and the Internet in Democratic Societies. New York: Cambridge University Press.
  23. Perks, Robert P. 2011. Messiah with a microphone? Oral historians, technologies and sound archives. In The Oxford Handbook of Oral History, ed. Donald A. Ritchie, 315–332. Oxford: Oxford University Press.
  24. Reaser, Jeffrey, and Carolyn Temple Adger. 2007. Developing language awareness materials for non-linguists: lessons learned from the Do You Speak American? project. Language and Linguistics Compass 1(3): 155–167.
  25. Rickford, John. 1993. Comments on ‘ethics, advocacy and empowerment’. Language and Communication 13(2): 129–131.
  26. Robertson, Beth M. 2011. The archival imperative: can oral history survive the funding crisis in archival institutions? In The Oxford Handbook of Oral History, ed. Donald A. Ritchie, 393–408. Oxford: Oxford University Press.
  27. Rowlands, Ian, David Nicholas, Peter Williams, Paul Huntington, Maggie Fieldhouse, Barrie Gunter, Richard Withey, Hamid R. Jamali, Tom Dobrowolski, and Carol Tenopir. 2008. The Google Generation: the information behaviour of the researcher of the future. Aslib Proceedings 60(4): 290–310.
  28. Samuel, Gabrielle N., and Gemma E. Derrick. 2015. Societal impact evaluation: exploring evaluator perceptions of the characterization of impact under the REF2014. Research Evaluation 24: 229–241.
  29. Smith, Abby, David Allen, and Karen Allen. 2004. Survey of the State of Audio Collections in Academic Libraries. Washington, DC: Council on Library and Information Resources.
  30. Wolfram, Walt. 1993. Ethical considerations in language awareness programs. Issues in Applied Linguistics 4: 225–255.
  31. ———. 2012. In the profession: connecting with the public. Journal of English Linguistics 40(1): 111–117.
  32. ———. 2013. Community, commitment and responsibility. In The Handbook of Language Variation and Change, 2nd edn, eds. J.K. Chambers, and Natalie Schilling, 555–576. Malden: Wiley/Blackwell.
  33. ———. 2016. Public sociolinguistic education in the United States: a proactive, comprehensive program. In Sociolinguistic Research: Application and Impact, eds. Robert Lawson, and Dave Sayers, 87–108. London: Routledge.
  34. Wolfram, Walt, Jeffrey Reaser, and Charlotte Vaughan. 2008. Operationalizing linguistic gratuity: from principle to practice. Language and Linguistics Compass 2(6): 1109–1134.

Websites and Online Resources

  1. RCUK Policy on Open Access. (accessed 24 June 2015).
  2. REF: Research Excellence Framework. (accessed 24 June 2015).
  3. W3C: World Wide Web Consortium. (accessed 13 July 2015).

Copyright information

© The Author(s) 2016

Authors and Affiliations

  • Karen P. Corrigan (1)
  • Adam Mearns (1)
  1. Newcastle University, Newcastle upon Tyne, UK
