Abstract
In the field of Learner Corpus Research, Gries and Deshors (Corpora 9(1):109–136, 2014) developed a two-step regression procedure (MuPDAR) to determine how and why choices made by non-native speakers differ from those made by native speakers, and to do so more comprehensively than traditional learner corpus research allows. In this chapter, we extend and test their proposal to determine whether it can also be applied to pragmatic and grammatical phenomena (subject realization/omission in Japanese) and whether it can help study categorical differences between learner and native-speaker choices. In doing so, we also show that the more advanced method of mixed-effects modeling can be fruitfully integrated into the MuPDAR method. The results of our study show that Japanese native speakers' choices of subject realization are affected by discourse-functional factors such as givenness and contrast of referents and that, while learners are able to handle extreme values of givenness and marked cases of contrast, they still struggle (more) with intermediate degrees of givenness and unmarked/non-contrastive referents. We conclude by discussing the role of MuPDAR in Learner Corpus Research in general and its advantages over traditional corpus analysis in that field, and over error analysis in particular.
Notes
1. We are disregarding here the large body of multifactorial work done by Crossley, Jarvis, and collaborators (cf. in particular Jarvis & Crossley 2012) because much of that work focuses on detecting the L1 of a writer rather than, as here, understanding any one particular lexical or grammatical choice in detail.
2. An additional problem may involve the fact that the authors used a linear regression on data that might violate the assumptions of such regressions. However, we were unable to infer from the paper what the dependent variable was – possibly a frequency of ser/estar + adjective per file? – so the above has to remain speculation for now.
3. We thank Nobutaka Takara and Mikuni Okamoto for their help in transcribing the corpus data.
4. Given the complexity of the statistical methods involved, this section is necessarily rather technical, and space constraints do not permit exhaustive definitions and discussion of all the statistical terms. We therefore refer the reader to Baayen (2008: Ch. 7), Crawley (2013: Ch. 9, 19), Faraway (2006: Ch. 8–10), and Zuur et al. (2009: Ch. 5).
5. Strictly speaking, if one does a MuPDAR analysis in which R1 is used only for prediction, then one does not have to apply Occam's razor all that rigorously to eliminate non-significant/collinear predictors because, within MuPDAR, the point of R1 is not to interpret its coefficients.
References
Aijmer, K. (2005). Modality in advanced Swedish learners’ written interlanguage. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition, and foreign language teaching (pp. 55–76). Amsterdam/Philadelphia: John Benjamins.
Altenberg, B. (2005). Using bilingual corpus evidence in learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition, and foreign language teaching (pp. 37–54). Amsterdam/Philadelphia: John Benjamins.
Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412.
Bates, D., Maechler, M., & Bolker, B. (2013). lme4. Linear mixed-effects models using S4 classes. http://lme4.r-forge.r-project.org/
Clancy, P. M. (1980). Referential choice in English and Japanese narrative discourse. In W. Chafe (Ed.), The pear stories: Cognitive, cultural, and linguistic aspects of narrative production (pp. 127–202). Norwood: Ablex.
Collentine, J., & Asención-Delaney, Y. (2010). A corpus-based analysis of the discourse functions of ser/estar + adjective in three levels of Spanish as FL learners. Language Learning, 60(2), 409–445.
Cosme, C. (2008). Participle clauses in learner English: The role of transfer. In G. Gilquin, S. Papp, & M. B. Díez-Bedmar (Eds.), Linking up contrastive and learner corpus research (pp. 177–200). Amsterdam/Atlanta: Rodopi.
Crawley, M. J. (2013). The R book (2nd ed.). Chichester: Wiley.
Doğruöz, A. S., & Gries, S. T. (2012). Spread of on-going changes in an immigrant language: Turkish in the Netherlands. Review of Cognitive Linguistics, 10(2), 401–426.
Du Bois, J. W. (2006). Representing discourse. Ms., University of California, Santa Barbara.
Faraway, J. J. (2006). Extending the linear model with R: Generalized linear, mixed-effects and non-parametric regression models. Boca Raton: Chapman & Hall/CRC.
Granger, S. (1996). From CA to CIA and back: An integrated approach to computerized bilingual and learner corpora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages in contrast. Text-based cross-linguistic studies (pp. 37–51). Lund: Lund University Press.
Granger, S. (2002). A bird’s eye view of learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition and foreign language teaching (pp. 3–33). Amsterdam/Philadelphia: John Benjamins.
Granger, S. (2004). Computer learner corpus research: Current status and future prospects. In U. Connor & T. Upton (Eds.), Applied corpus linguistics: A multidimensional perspective (pp. 123–145). Amsterdam: Rodopi.
Gries, S. T., & Deshors, S. C. (2014). Using regressions to explore deviations between corpus data and a standard/target: Two suggestions. Corpora, 9(1), 109–136.
Gries, S. T., & Wulff, S. (2009). Psycholinguistic and corpus linguistic evidence for L2 constructions. Annual Review of Cognitive Linguistics, 7, 163–186.
Gries, S. T., & Wulff, S. (2013). The genitive alternation in Chinese and German ESL learners: Towards a multifactorial notion of context in learner corpus research. International Journal of Corpus Linguistics, 18(3), 327–356.
Harrell, F. E., Jr. (2001). Regression modeling strategies with applications to linear models, logistic regression, and survival analysis. Berlin/New York: Springer.
Hasselgård, H., & Johansson, S. (2012). Learner corpora and contrastive interlanguage analysis. In F. Meunier, S. De Cock, G. Gilquin, & M. Paquot (Eds.), A taste for corpora: In honour of Sylviane Granger (pp. 33–61). Amsterdam/Philadelphia: John Benjamins.
Hinds, J. (1982). Ellipsis in Japanese. Edmonton: Linguistic Research, Inc.
Hundt, M., & Vogel, K. (2011). Overuse of the progressive in ESL and learner Englishes – Fact or fiction? In J. Mukherjee & M. Hundt (Eds.), Exploring second-language varieties of English and learner Englishes: Bridging a paradigm gap (pp. 145–165). Amsterdam/Philadelphia: John Benjamins.
Hypermedia Corpus of Spoken Japanese. (2010). http://www.env.kitakyu-u.ac.jp/corpus/docs/index.html. Accessed Fall, 2010.
Iwasaki, S. (2002). Japanese. Amsterdam/Philadelphia: John Benjamins.
Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(4), 434–446.
Jarvis, S., & Crossley, S. A. (Eds.). (2012). Approaching language transfer through text classification: Explorations in the detection-based approach. Bristol: Multilingual Matters.
Krzeszowski, T. (1990). Contrasting languages: The scope of contrastive linguistics. Berlin/New York: Mouton de Gruyter.
Kuno, S. (1973). The structure of the Japanese language. Cambridge, MA: MIT Press.
Learner’s Language Corpus of Japanese. (2013). http://cblle.tufs.ac.jp/llc/ja/. Accessed Spring, 2013.
Miglio, V. G., Gries, S. T., Harris, M. J., Wheeler, E. M., & Santana-Paixão, R. (2013). Spanish lo(s)-le(s) clitic alternations in psych verbs: A multifactorial corpus-based analysis. Somerville: Cascadilla Press.
Neff van Aertselaer, J. A., & Bunce, C. (2012). The use of small corpora for tracing the development of academic literacies. In F. Meunier, S. De Cock, G. Gilquin, & M. Paquot (Eds.), A taste for corpora: In honour of Sylviane Granger (pp. 63–83). Amsterdam/Philadelphia: John Benjamins.
Ono, T., & Thompson, S. A. (1997). Deconstructing ‘Zero Anaphora’ in Japanese. Proceedings of the Annual Meeting of the Berkeley Linguistics Society, 23, 481–491.
Pery-Woodley, M.-P. (1990). Contrasting discourses: Contrastive analysis and a discourse approach to writing. Language Teaching, 23(3), 143–151.
Rogatcheva, S. (2012). Perfect problems: A corpus-based comparison of the perfect in Bulgarian and German EFL writing. In S. Hoffmann, P. Rayson, & G. Leech (Eds.), English corpus linguistics: Looking back, moving forward (pp. 149–163). Amsterdam: Rodopi.
Schütze, C. T. (1996). The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Chicago: The University of Chicago Press.
Shibatani, M. (1985). Passives and related constructions: A prototype analysis. Language, 61(4), 821–848.
Takagi, T. (2002). Contextual resources for inferring unexpressed referents in Japanese conversations. Pragmatics, 12(2), 153–182.
Tono, Y. (2004). Multiple comparisons of IL, L1 and TL corpora: The case of L2 acquisition of verb subcategorization patterns by Japanese learners of English. In G. Aston, S. Bernardini, & D. Stewart (Eds.), Corpora and language learners (pp. 45–66). Amsterdam/Philadelphia: John Benjamins.
Zuur, A. F., Ieno, E. N., Walker, N., & Saveliev, A. A. (2009). Mixed effects models and extensions in ecology with R. Berlin/New York: Springer.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Gries, S.T., Adelman, A.S. (2014). Subject Realization in Japanese Conversation by Native and Non-native Speakers: Exemplifying a New Paradigm for Learner Corpus Research. In: Romero-Trillo, J. (eds) Yearbook of Corpus Linguistics and Pragmatics 2014. Yearbook of Corpus Linguistics and Pragmatics, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-319-06007-1_3
DOI: https://doi.org/10.1007/978-3-319-06007-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06006-4
Online ISBN: 978-3-319-06007-1
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)