Thematic Cluster Analysis of the L2 Experience Interview Corpus



This chapter addresses two research questions. First, what semantic clusters are identifiable in the L2 Experience Interview Corpus (Polat 2013a)? Second, what can these clusters tell us about the participants’ L2 learning experience? Taking the transcribed sentences (called “elementary contexts”) as the basic unit of analysis, the T-Lab software identifies keywords in the corpus, performs a correspondence analysis, and then conducts a cluster analysis based on parameters entered by the user (see Lancia 2016 for a complete description of T-Lab operations). In this case, a three-cluster solution was selected as the most explanatory model, accounting for 37.73% of shared variance, with p = 0.027. This means that T-Lab (Lancia 2004) recognized three distinct groups of words that tend to co-occur with one another (internal homogeneity) and tend not to co-occur with words in the other two clusters (external heterogeneity). The three thematic clusters, which we have named Classroom, Communication, and Studying, are discussed in turn below.
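
The idea of grouping words by co-occurrence across elementary contexts can be illustrated with a minimal sketch. It is not T-Lab's procedure: it skips the correspondence analysis step and swaps in cosine similarity with average-linkage agglomerative clustering, using only the Python standard library. The sentences, the keyword list, and all function names are invented for illustration and do not come from the corpus.

```python
from itertools import combinations

# Toy "elementary contexts": invented sentences, NOT taken from the corpus.
contexts = [
    "the teacher gave the class a grammar test",
    "our teacher explained grammar in class",
    "the class took a test with the teacher",
    "i talk with friends and speak to people",
    "friends talk and people speak to me",
    "i speak with people when friends talk",
    "i read the book and memorize vocabulary",
    "i memorize vocabulary from the book",
    "i read and memorize every book",
]

# Hypothetical keyword list (T-Lab selects its own keywords).
keywords = [
    "teacher", "class", "grammar", "test",
    "talk", "friends", "speak", "people",
    "read", "book", "memorize", "vocabulary",
]


def occurrence_profile(word):
    """Binary vector: 1 if the keyword occurs in a context, else 0."""
    return [1 if word in ctx.split() else 0 for ctx in contexts]


def cosine(u, v):
    """Cosine similarity between two occurrence profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


profiles = {w: occurrence_profile(w) for w in keywords}


def average_similarity(c1, c2):
    """Mean pairwise cosine similarity between two keyword clusters."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(cosine(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def cluster_keywords(n_clusters):
    """Merge the two most similar clusters until n_clusters remain."""
    clusters = [[w] for w in keywords]
    while len(clusters) > n_clusters:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_similarity(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


clusters = cluster_keywords(3)
for cluster in clusters:
    print(sorted(cluster))
```

In this toy data, words within a group share contexts (internal homogeneity) and never share a context with words from another group (external heterogeneity), so the three-cluster cut recovers the groups cleanly. Real interview data would be far noisier, and the number of clusters would have to be justified, as it is in the chapter.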


Keywords: Thematic Clusters; External Heterogeneity; Semantic Clustering; Internal Homogeneity; Representative Lemmas


  1. Lancia, F. (Ed.). (2004). Strumenti per l’analisi dei testi [Tools for textual analysis]. Rome: Franco Angeli.
  2. Lancia, F. (2016). T-Lab online user manual.
  3. Polat, B. (2013a). The L2 experience interview corpus. Atlanta: Georgia State University.

Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. Applied Linguistics and ESL, Georgia State University, Atlanta, USA
  2. Ohio University, Athens, USA
  3. Georgia State University, Atlanta, USA
  4. Hobart and William Smith Colleges, Geneva, USA
