Language corpora have many uses in language study, including for learners and other users of foreign languages in an approach that has come to be known as data-driven learning (DDL). DDL boils down to the learner’s ability to find answers to their questions by using software to access large collections of authentic texts relevant to their needs, as opposed to asking teachers or consulting ready-made reference materials. As such, not only do corpora have the potential to answer many language questions, but the consultation itself is also likely to lead to improved language awareness and noticing. This chapter discusses the nature of corpora and their relevance in language learning, outlines the processes involved in DDL, and looks at the history and research development in the field from its beginnings to the present day, taking into account its limitations and gaps in our current knowledge with an eye to the future.
- Data-driven learning
- Corpus-based language learning
© 2017 Springer International Publishing AG
Cite this entry
Boulton, A. (2017). Data-Driven Learning and Language Pedagogy. In: Thorne, S., May, S. (eds) Language, Education and Technology. Encyclopedia of Language and Education. Springer, Cham. https://doi.org/10.1007/978-3-319-02237-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02236-9
Online ISBN: 978-3-319-02237-6