Subject Realization in Japanese Conversation by Native and Non-native Speakers: Exemplifying a New Paradigm for Learner Corpus Research

Yearbook of Corpus Linguistics and Pragmatics 2014

Part of the book series: Yearbook of Corpus Linguistics and Pragmatics (YCLP, volume 2)

Abstract

In the field of Learner Corpus Research, Gries and Deshors (Corpora 9(1):109–136, 2014) developed a two-step regression procedure (MuPDAR) to determine how and why choices made by non-native speakers differ from those made by native speakers more comprehensively than traditional learner corpus research allows for. In this chapter, we will extend and test their proposal to determine whether it can also be applied to pragmatic and grammatical phenomena (subject realization/omission in Japanese), and whether it can help study categorical differences between learner and native-speaker choices; we do so by also showing that the more advanced method of mixed-effects modeling can be very fruitfully integrated into the proposed MuPDAR method. The results of our study show that Japanese native speakers’ choices of subject realization are affected by discourse-functional factors such as givenness and contrast of referents and that, while learners are able to handle extreme values of givenness and marked cases of contrast, they still struggle (more) with intermediate degrees of givenness and unmarked/non-contrastive referents. We conclude by discussing the role of MuPDAR in Learner Corpus Research in general and its advantages over traditional corpus analysis in that field and error analysis in particular.
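The two-step logic summarized above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the chapter: the data are invented toy values, the predictors are reduced to a numeric `givenness` score and a binary `contrast` flag, and a plain logistic regression fitted by gradient descent stands in for the mixed-effects models the chapter uses. The helper names `fit_logistic` and `predict_proba` are hypothetical. The first regression (R1) is fitted on native-speaker data only, then applied to the learner data to predict what a native speaker would have chosen; the second regression (R2) models whether each learner choice matches that prediction.

```python
import math

def predict_proba(w, x):
    """Probability of a realized subject under weights w (intercept first)."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Plain logistic regression via batch gradient descent (no random effects)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

# Invented toy data (assumption): x = [givenness in 0..1, contrast 0/1],
# y = 1 if the subject is realized, 0 if it is omitted.
grid = [[g / 10, c] for g in range(11) for c in (0, 1)]
ns_X = grid
ns_y = [1 if (c == 1 or g < 0.5) else 0 for g, c in grid]
learner_X = grid
learner_y = [1 if (c == 1 or g < 0.7) else 0 for g, c in grid]

# Step 1: fit R1 on the native-speaker data only.
r1 = fit_logistic(ns_X, ns_y)

# Step 2: apply R1 to the learner contexts to predict the native-like choice.
ns_pred = [1 if predict_proba(r1, x) >= 0.5 else 0 for x in learner_X]

# Step 3: code each learner choice as native-like (1) or not (0).
nativelike = [int(a == b) for a, b in zip(learner_y, ns_pred)]

# Step 4: fit R2 to explore which contexts go with non-native-like choices.
r2 = fit_logistic(learner_X, nativelike)

print("R1 weights (intercept, givenness, contrast):", r1)
print("Share of native-like learner choices:", sum(nativelike) / len(nativelike))
print("R2 weights (intercept, givenness, contrast):", r2)
```

In this sketch only the R2 weights are meant to be interpreted; R1 serves purely as a prediction device.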

Notes

  1. We are disregarding here the large body of multifactorial work done by Crossley, Jarvis, and collaborators (cf. in particular Jarvis & Crossley 2012) because much of that work focuses on detecting the L1 of a writer rather than, as here, understanding any one particular lexical or grammatical choice in detail.

  2. An additional problem may involve the fact that the authors used a linear regression on data that might violate the assumptions of such regressions. However, we were unable to infer from the paper what the dependent variable was (possibly a frequency of ser/estar + adjective per file?), so the above has to remain speculation for now.

  3. We thank Nobutaka Takara and Mikuni Okamoto for their help in transcribing the corpus data.

  4. Because of the complexity of the statistical methods involved, this section is necessarily rather technical, and space constraints do not permit exhaustive definitions and discussion of all the statistical terms. We therefore refer the reader to Baayen (2008: Ch. 7), Crawley (2013: Ch. 9, 19), Faraway (2006: Ch. 8–10), and Zuur et al. (2009: Ch. 5).

  5. Strictly speaking, if R1 in a MuPDAR analysis is used only for prediction, one need not apply Occam's razor rigorously to eliminate non-significant or collinear predictors, because within MuPDAR the point of R1 is not to interpret its coefficients.

References

  • Aijmer, K. (2005). Modality in advanced Swedish learners’ written interlanguage. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition, and foreign language teaching (pp. 55–76). Amsterdam/Philadelphia: John Benjamins.

  • Altenberg, B. (2005). Using bilingual corpus evidence in learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition, and foreign language teaching (pp. 37–54). Amsterdam/Philadelphia: John Benjamins.

  • Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.

  • Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412.

  • Bates, D., Maechler, M., & Bolker, B. (2013). lme4. Linear mixed-effects models using S4 classes. http://lme4.r-forge.r-project.org/

  • Clancy, P. M. (1980). Referential choice in English and Japanese narrative discourse. In W. Chafe (Ed.), The pear stories: Cognitive, cultural, and linguistic aspects of narrative production (pp. 127–202). Norwood: Ablex.

  • Collentine, J., & Asención-Delaney, Y. (2010). A corpus-based analysis of the discourse functions of ser/estar + adjective in three levels of Spanish as FL learners. Language Learning, 60(2), 409–445.

  • Cosme, C. (2008). Participle clauses in learner English: The role of transfer. In G. Gilquin, S. Papp, & M. B. Díez-Bedmar (Eds.), Linking up contrastive and learner corpus research (pp. 177–200). Amsterdam/Atlanta: Rodopi.

  • Crawley, M. J. (2013). The R book (2nd ed.). Chichester: Wiley.

  • Doğruöz, A. S., & Gries, S. T. (2012). Spread of on-going changes in an immigrant language: Turkish in the Netherlands. Review of Cognitive Linguistics, 10(2), 401–426.

  • Du Bois, J. W. (2006). Representing discourse. Ms., University of California, Santa Barbara.

  • Faraway, J. J. (2006). Extending the linear model with R: Generalized linear, mixed-effects and non-parametric regression models. Boca Raton: Chapman & Hall/CRC.

  • Granger, S. (1996). From CA to CIA and back: An integrated approach to computerized bilingual and learner corpora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages in contrast. Text-based cross-linguistic studies (pp. 37–51). Lund: Lund University Press.

  • Granger, S. (2002). A bird’s eye view of learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition and foreign language teaching (pp. 3–33). Amsterdam/Philadelphia: John Benjamins.

  • Granger, S. (2004). Computer learner corpus research: Current status and future prospects. In U. Connor & T. Upton (Eds.), Applied corpus linguistics: A multidimensional perspective (pp. 123–145). Amsterdam: Rodopi.

  • Gries, S. T., & Deshors, S. C. (2014). Using regressions to explore deviations between corpus data and a standard/target: Two suggestions. Corpora, 9(1), 109–136.

  • Gries, S. T., & Wulff, S. (2009). Psycholinguistic and corpus linguistic evidence for L2 constructions. Annual Review of Cognitive Linguistics, 7, 163–186.

  • Gries, S. T., & Wulff, S. (2013). The genitive alternation in Chinese and German ESL learners: Towards a multifactorial notion of context in learner corpus research. International Journal of Corpus Linguistics, 18(3), 327–356.

  • Harrell, F. E., Jr. (2001). Regression modeling strategies with applications to linear models, logistic regression, and survival analysis. Berlin/New York: Springer.

  • Hasselgård, H., & Johansson, S. (2012). Learner corpora and contrastive interlanguage analysis. In F. Meunier, S. De Cock, G. Gilquin, & M. Paquot (Eds.), A taste for corpora: In honour of Sylviane Granger (pp. 33–61). Amsterdam/Philadelphia: John Benjamins.

  • Hinds, J. (1982). Ellipsis in Japanese. Edmonton: Linguistic Research, Inc.

  • Hundt, M., & Vogel, K. (2011). Overuse of the progressive in ESL and learner Englishes – Fact or fiction? In J. Mukherjee & M. Hundt (Eds.), Exploring second-language varieties of English and learner Englishes: Bridging a paradigm gap (pp. 145–165). Amsterdam/Philadelphia: John Benjamins.

  • Hypermedia Corpus of Spoken Japanese. (2010). http://www.env.kitakyu-u.ac.jp/corpus/docs/index.html. Accessed Fall, 2010.

  • Iwasaki, S. (2002). Japanese. Amsterdam/Philadelphia: John Benjamins.

  • Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(4), 434–446.

  • Jarvis, S., & Crossley, S. A. (Eds.). (2012). Approaching language transfer through text classification: Explorations in the detection-based approach. Bristol: Multilingual Matters.

  • Krzeszowski, T. (1990). Contrasting languages: The scope of contrastive linguistics. Berlin/New York: Mouton de Gruyter.

  • Kuno, S. (1973). The structure of the Japanese language. Cambridge, MA: MIT Press.

  • Learner’s Language Corpus of Japanese. (2013). http://cblle.tufs.ac.jp/llc/ja/. Accessed Spring, 2013.

  • Miglio, V. G., Gries, S. T., Harris, M. J., Wheeler, E. M., & Santana-Paixão, R. (2013). Spanish lo(s)-le(s) clitic alternations in psych verbs: A multifactorial corpus-based analysis. Somerville: Cascadilla Press.

  • Neff van Aertselaer, J. A., & Bunce, C. (2012). The use of small corpora for tracing the development of academic literacies. In F. Meunier, S. De Cock, G. Gilquin, & M. Paquot (Eds.), A taste for corpora: In honour of Sylviane Granger (pp. 63–83). Amsterdam/Philadelphia: John Benjamins.

  • Ono, T., & Thompson, S. A. (1997). Deconstructing ‘Zero Anaphora’ in Japanese. Proceedings of the Annual Meeting of the Berkeley Linguistics Society, 23, 481–491.

  • Pery-Woodley, M.-P. (1990). Contrasting discourses: Contrastive analysis and a discourse approach to writing. Language Teaching, 23(3), 143–151.

  • Rogatcheva, S. (2012). Perfect problems: A corpus-based comparison of the perfect in Bulgarian and German EFL writing. In S. Hoffmann, P. Rayson, & G. Leech (Eds.), English corpus linguistics: Looking back, moving forward (pp. 149–163). Amsterdam: Rodopi.

  • Schütze, C. T. (1996). The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Chicago: The University of Chicago Press.

  • Shibatani, M. (1985). Passives and related constructions: A prototype analysis. Language, 61(4), 821–848.

  • Takagi, T. (2002). Contextual resources for inferring unexpressed referents in Japanese conversations. Pragmatics, 12(2), 153–182.

  • Tono, Y. (2004). Multiple comparisons of IL, L1 and TL corpora: The case of L2 acquisition of verb subcategorization patterns by Japanese learners of English. In G. Aston, S. Bernardini, & D. Stewart (Eds.), Corpora and language learners (pp. 45–66). Amsterdam/Philadelphia: John Benjamins.

  • Zuur, A. F., Ieno, E. N., Walker, N., & Saveliev, A. A. (2009). Mixed effects models and extensions in ecology with R. Berlin/New York: Springer.

Author information

Corresponding author

Correspondence to Stefan Th. Gries.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Gries, S.T., Adelman, A.S. (2014). Subject Realization in Japanese Conversation by Native and Non-native Speakers: Exemplifying a New Paradigm for Learner Corpus Research. In: Romero-Trillo, J. (eds) Yearbook of Corpus Linguistics and Pragmatics 2014. Yearbook of Corpus Linguistics and Pragmatics, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-319-06007-1_3
