Exploring Spoken English Learner Language Using Corpora

  • Eric Friginal
  • Joseph J. Lee
  • Brittany Polat
  • Audrey Roberson


As second language (L2) corpus studies expand into their third decade, innovations in computational technology and corpus creation have facilitated unprecedented access to authentic language in the classroom, including language produced by non-native speakers (NNSs) of English. Applied linguists including Douglas Biber, Ken Hyland, John Swales, Rod Ellis, Susan Conrad, Eli Hinkel, and Sylviane Granger, to name only a few, have used corpora to study NNS writing extensively across written contexts (e.g., school essays, standardized or proficiency tests, and laboratory or research reports), in both journal articles and books. Despite these impressive contributions, gaps remain in our knowledge of spoken English L2 registers, even those that are quite important for NNSs to master. Classroom learner speech and face-to-face NNS interviews, for example, have been researched both qualitatively and quantitatively, primarily through assessments of learner performance. However, extensive corpus-based analyses of these registers are still relatively few. Given that these oral skills are essential in high-stakes situations, such as admission to graduate programs, job interviews in English-speaking settings, or proficiency tests like the TOEFL (Test of English as a Foreign Language) or IELTS (International English Language Testing System), it is worthwhile to investigate oral learner language systematically, especially with corpora as part of the research methodology.



Copyright information

© The Author(s) 2017

Authors and Affiliations

  • Eric Friginal: Applied Linguistics and ESL, Georgia State University, Atlanta, USA
  • Joseph J. Lee: Ohio University, Athens, USA
  • Brittany Polat: Georgia State University, Atlanta, USA
  • Audrey Roberson: Hobart and William Smith Colleges, Geneva, USA
