The Corpus Approach to the Teaching and Learning of Chinese as an L1 and an L2 in Retrospect

  • Chapter
Computational and Corpus Approaches to Chinese Language Learning

Part of the book series: Chinese Language Learning Sciences (CLLS)

Abstract

The use of corpora in the teaching and learning of Chinese has a history of nearly a century. Pedagogically oriented Chinese corpus studies were established on a solid methodological footing before computers were available. The creation of concordances and character/word lists, coupled with quantitative analyses of sentence patterns, has offered valuable insights into Chinese textbook compilation and syllabus design. Such corpus findings have shown which lexical items and grammatical patterns should be taught, and in what order vocabulary and grammar points should be presented. Over the last few decades, the surge of interest in Chinese interlanguage corpora has been largely motivated by China’s growing global influence. The lexico-grammatical performance of learners of Chinese as a second language (CSL), in both spoken and written production, has been systematically investigated. Both corpus-based L1 and L2 Chinese studies have been fairly successful in describing the Chinese (inter)language, but there is still much room for pedagogical implementation, that is, for transforming the research into classroom-friendly teaching materials.

Notes

  1. ‘B.C.’ here is Nelson Francis’ play on words, meaning ‘Before Computer’ rather than its literal sense ‘Before Christ’.

  2. Isaac Pitman’s word frequency list was published in 1843 in The Phonotypic Journal, but the research was done in 1838 (Pitman 1843: 161).

  3. Pingmin was defined by the leaders of the Mass Education Movement as ‘illiterates’ (Tao and Zhu 1923: 43).

  4. 陶知行 Zhixing Tao was an early alias of the better-known Chinese educationalist 陶行知 Xingzhi Tao. The change of name reflects the transformation of his philosophy. Influenced by the Confucian scholar 王阳明 Yangming Wang (1472–1529), Tao (1891–1946) took the name Zhixing (meaning knowledge-action) in the 1910s and Xingzhi (meaning action-knowledge) in 1934 (Tao 1934: 286–287). Both names, Zhixing and Xingzhi, showed Tao’s identification with Yangming Wang’s theory of 知行合一 zhi xing he yi ‘unity of knowledge and action’. In adopting Xingzhi, Tao seemed to prioritise xing (action) over zhi (knowledge), suggesting that knowledge is derived from empirical engagement (Boorman 1970: 243–244; Browning and Bunge 2009: 388).

  5. 赵淑华 Shuhua Zhao was the lead scholar and director of the Sentence Pattern Research Group at Beijing Language Institute.

  6. The term ‘input corpus’ is used by some learner corpus linguists to mean the collection of language to which learners are exposed, such as teachers’ talk in class and the written texts that learners encounter in their studies. In this article, input corpora mainly refer to written texts, textbooks in particular.

  7. The corpus is freely available at https://www.sketchengine.co.uk/guangwai-lancaster-chinese-learner-corpus/.

  8. http://ncl.xmu.edu.cn/shj/Default.aspx. Chinese textbooks for native Chinese students were also collected alongside the CSL textbook data.

References

  • Ao, H. (1929a). Yutiwen yingyong zihui yanjiu baogao: Chen Heqin shi Yutiwen yingyong zihui zhi xu [A study of characters used in vernacular Chinese: Extending Chen’s character list]. Jiaoyu Zazhi [Journal of Education], 21(2), 77–101.

  • Ao, H. (1929b). Yutiwen yingyong zihui yanjiu baogao (Xu): Chen Heqin shi Yutiwen yingyong zihui zhi xu [A study of characters used in vernacular Chinese: Extending Chen’s character list (continued)]. Jiaoyu Zazhi [Journal of Education], 21(3), 97–113.

  • Bei, G., & Zhang, X. (1988). Hanzi pindu tongji [Frequency calculation of Chinese characters]. Beijing: Publishing House of Electronics Industry.

  • Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge: Cambridge University Press.

  • Biber, D., Conrad, S., & Leech, G. (2002). Longman student grammar of spoken and written English. London: Longman.

  • Boorman, H. (1970). Biographical dictionary of republican China (Vol. 3). New York: Columbia University Press.

  • Browning, D., & Bunge, M. (Eds.). (2009). Children and childhood in world religions: Primary sources and texts. New Brunswick: Rutgers University Press.

  • Chen, H. (1922). Yutiwen yingyong zihui [Characters used in vernacular Chinese]. Xin Jiaoyu [New Education], 5(5), 987–995.

  • Chen, H. (1928). Yutiwen yingyong zihui [Characters used in vernacular Chinese]. Shanghai: The Commercial Press.

  • Chen, X. (1996). Hanyu zhongjieyu yuliaoku xitong jieshao [Introducing the Chinese interlanguage corpus system]. In Proceedings of the 5th International Chinese Language Teaching Conference (pp. 450–458). Beijing.

  • China State Language Commission and China State Bureau of Standards. (1992). Xiandai hanyu zipin tongji biao [A frequency list of modern Chinese characters]. Beijing: Language and Culture Press.

  • Chu, C. (2004). ChineseTA (1.0). San Jose, CA: Stanford University and Silicon Valley Language Technologies, LLC.

  • Chu, C., & Chen, X. (1993). Jianli hanyu zhongjieyu yuliaoku xitong de jiben shexiang [The initial considerations of creating a Chinese interlanguage corpus system]. Shijie Hanyu Jiaoxue [Chinese Teaching in the World], 7(3), 199–205.

  • Conrad, S., & Biber, D. (2009). Real grammar: A corpus-based approach to English. New York: Pearson.

  • Cui, X., & Zhang, B. (2011). Quanqiu hanyu xuexizhe yuliaoku jianshe fangan [A proposal for the building of the International Learner Corpus of Chinese]. Yuyan Wenzi Yingyong [Applied Linguistics], 19(2), 100–108.

  • Feng, Z. (2006). Evolution and present situation of corpus research in China. International Journal of Corpus Linguistics, 11(2), 173–207.

  • Francis, N. (1992). Language corpora B.C. In J. Svartvik (Ed.), Directions in corpus linguistics (pp. 17–32). Berlin: Mouton de Gruyter.

  • Freeman, J. (1843). On grammalogues: To the editor of the Phonotypic Journal. The Phonotypic Journal, 2(24), 170–171.

  • Granger, S. (2015). Contrastive interlanguage analysis: A reappraisal. International Journal of Learner Corpus Research, 1(1), 7–24.

  • Hu, X., & Xu, X. (2010). Mianxiang zhongwen dianhua jiaoxue de hanguo liuxuesheng hanyu zhongjieyu yuliaoku de kaifa yu jianshe [The development of a computer-assisted Chinese language teaching oriented Korean students’ interlanguage Chinese corpus]. In Proceedings of the Seventh International Conference on Computer-assisted Chinese Language Teaching.

  • Huang, C., Chen, K., & Chang, L. (1996). Segmentation standard for Chinese natural language processing. In Proceedings of the 1996 International Conference on Computational Linguistics. Copenhagen, Denmark.

  • Huang, W. (1962–1963). An annotated, partial list of the publications of William Hung. Harvard Journal of Asiatic Studies, 24, 7–16.

  • Hung, W. (1931). Shuoyuan yinde [Index to Shuo Yuan]. Peiping: Harvard-Yenching Institute Sinological Index Series, Peking University Library.

  • Hung, W. (1932). Yinde shuo [On indexing]. Peiping: Harvard-Yenching Institute Sinological Index Series, Peking University Library.

  • Hunston, S. (2002). Corpora in applied linguistics. Cambridge: Cambridge University Press.

  • Johansson, S. (2011). A multilingual outlook of corpora studies. In V. Viana, S. Zyngier, & G. Barnbrook (Eds.), Perspectives on corpus linguistics (pp. 115–129). Amsterdam: John Benjamins.

  • Johns, T. (1991). From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning. ELR Journal, 4, 27–45.

  • Kaeding, F. (1897). Häufigkeitswörterbuch der Deutschen Sprache. Berlin: Self-publication.

  • Kilgarriff, A., Keng, N., & Smith, S. (2015). Learning Chinese with the Sketch Engine. In B. Zou, M. Hoey, & S. Smith (Eds.), Corpus linguistics in Chinese contexts (pp. 63–73). Basingstoke: Palgrave Macmillan.

  • Lewis, M. (1993). The lexical approach: The state of ELT and a way forward. Hove: LTP.

  • Liu, D. (1926). Pingjiao zonghui ‘gaibian qianzi ke’ jianzi gongzuo de jingguo [The process of vocabulary selection for the Revised Edition of 1000 Foundation Characters by the National Association of Mass Education Movement]. Jiaoyu Zazhi [Journal of Education], 18(12), 1–14.

  • Liu, E. (1973). Frequency dictionary of Chinese words. The Hague: Mouton.

  • Liu, Y., Liang, N., Wang, D., Zhang, S., Yang, T., Jie, C., et al. (1990). Xiandai hanyu changyong ci cipin cidian [A dictionary of frequency of modern Chinese words]. Beijing: Astronautic Publishing House.

  • McCarthy, M., & O’Dell, F. (2008). Academic vocabulary in use. Cambridge: Cambridge University Press.

  • McCarthy, M., McCarten, J., & Sandiford, H. (2004). Touchstone (Student’s Book 1). Cambridge: Cambridge University Press.

  • McCarthy, M., & McCarten, J. (2012). Viewpoint (Level 1 Student’s Book). Cambridge: Cambridge University Press.

  • McEnery, T., & Xiao, R. (2016). Corpus-based study of Chinese. In S. Chan (Ed.), The Routledge encyclopedia of the Chinese language (pp. 438–451). London: Routledge.

  • National Association of Mass Education Movement. (1922a). Nongmin qianzi ke [1000 Chinese foundation characters for peasants]. Shanghai: The Commercial Press.

  • National Association of Mass Education Movement. (1922b). Shibing qianzi ke [1000 Chinese foundation characters for servicemen]. Shanghai: The Commercial Press.

  • National Association of Mass Education Movement. (1928). Shimin qianzi ke [Textbook of one thousand characters for townspeople]. Shanghai: The Commercial Press.

  • Nesselhauf, N. (2004). Learner corpora and their potential for language teaching. In J. Sinclair (Ed.), How to use corpora in language teaching (pp. 125–152). Amsterdam: John Benjamins.

  • Pitman, I. (1843). List of words from which grammalogues may be selected. The Phonotypic Journal, 2(23), 161–163.

  • Sentence Pattern Research Group at Beijing Language Institute. (1989a). Xiandai hanyu jiben juxing [Basic sentence patterns of modern Chinese]. Shijie Hanyu Jiaoxue [Chinese Teaching in the World], 3(1), 26–35.

  • Sentence Pattern Research Group at Beijing Language Institute. (1989b). Xiandai hanyu jiben juxing (Xu yi) [Basic sentence patterns of modern Chinese (Continued I)]. Shijie Hanyu Jiaoxue [Chinese Teaching in the World], 3(3), 144–148.

  • Sentence Pattern Research Group at Beijing Language Institute. (1989c). Xiandai hanyu jiben juxing (Xu er) [Basic sentence patterns of modern Chinese (Continued II)]. Shijie Hanyu Jiaoxue [Chinese Teaching in the World], 3(4), 211–219.

  • Sentence Pattern Research Group at Beijing Language Institute. (1990). Xiandai hanyu jiben juxing (Xu san) [Basic sentence patterns of modern Chinese (Continued III)]. Shijie Hanyu Jiaoxue [Chinese Teaching in the World], 4(1), 27–33.

  • Sentence Pattern Research Group at Beijing Language Institute. (1991). Xiandai hanyu jiben juxing (Xu si) [Basic sentence patterns of modern Chinese (Continued IV)]. Shijie Hanyu Jiaoxue [Chinese Teaching in the World], 5(1), 23–29.

  • Sinclair, J. (1987). Collins COBUILD English dictionary. London: Collins.

  • Su, X. (2010). Jiaocai yuyan tongji yanjiu de duoweidu gongneng [The multi-dimensional function of the statistical research on textbook language]. In Proceedings of the Innovation of International Chinese Teaching Theories and Models Conference. Xiamen.

  • Sugiura, M. (2002). Collocational knowledge of L2 learners of English: A case study of Japanese learners. In T. Saito, J. Nakamura, & S. Yamazaki (Eds.), English corpus linguistics in Japan (pp. 303–323). Amsterdam: Rodopi.

  • Tao, X. (1934). Xing zhi xing [Action knowledge action]. Shenghuo Jiaoyu [Life Education], 1(11), 286–287.

  • Tao, Z., & Zhu, J. (1923). Pingmin qianzi ke [Early Chinese lessons for illiterates]. Shanghai: The Commercial Press.

  • Teng, S., Hong, Y., Chang, W., & Lu, C. (2007). Huayuwen xuexizhe hanzi pianwu shuju ziliaoku jianli ji pianwu leixing fenxi [The construction of Chinese learners’ character writing error database and the analysis of error types]. In Proceedings of 2007 National Linguistics Conference (pp. 313–325). Tainan: National Cheng Kung University.

  • Thorndike, E. (1921). The teacher’s word book. New York: Teachers College, Columbia University.

  • Thorndike, E. (1931). A teacher’s word book of the twenty thousand words found most frequently and widely in general reading for children and young people. New York: Teachers College, Columbia University.

  • Thorndike, E., & Lorge, I. (1944). The teacher’s word book of 30,000 words. New York: Teachers College, Columbia University.

  • Tribble, C., & Jones, G. (1990). Concordances in the classroom: A resource book for teachers. London: Longman.

  • Tsai, T. (1922). Laojielao [A synthetic study of Lao Tzu’s Tao-Te-Ching in Chinese]. Beijing: Self-publication.

  • Tsang, W., & Yeung, Y. (2012). The development of the Mandarin Interlanguage Corpus (MIC): A preliminary report on a small-scale learner database. JALT Journal, 34(2), 187–208.

  • Tsou, B., Lin, H., Chan, T., Hu, J., Chew, C., & Tse, J. (1997). A synchronous Chinese language corpus from different speech communities: Construction and application. International Journal of Computational Linguistics and Chinese Language Processing, 2(1), 91–104.

  • Voegelin, C. (1959). The notion of arbitrariness in structural statement and restatement I: Eliciting. International Journal of American Linguistics, 25(4), 207–220.

  • Wang, M., Malmasi, S., & Huang, M. (2015). The Jinan Chinese Learner Corpus. In Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 118–123). Denver, CO: The Association for Computational Linguistics.

  • Wei, S., Zhao, P., Yang, X., & Chen, L. (2008). Daxing zhongguo xiaoxuesheng zuowen yuliaoku de shengcheng [The construction of a large-scale Chinese pupils’ written language corpus]. Modern Educational Technology, 18(12), 45–48.

  • Willis, D. (1990a). The lexical syllabus: A new approach to language teaching. London: Collins ELT.

  • Willis, J. (1990b). Collins COBUILD English course (First lessons, Student’s edition). London: Collins ELT.

  • Xiao, R., Rayson, P., & McEnery, T. (2009). A frequency dictionary of Mandarin Chinese: Core vocabulary for learners. London: Routledge.

  • Xiao, X., & Zhou, W. (2014). Hanyu zhongjieyu yuliaoku biaozhu de quanmianxing ji leibie wenti [The exhaustiveness and taxonomy of Chinese interlanguage corpus annotation]. Shijie Hanyu Jiaoxue [Chinese Teaching in the World], 28(3), 368–377.

  • Xu, J. (2015). Corpus-based Chinese studies: A historical review from the 1920s to the present. Chinese Language and Discourse, 6(2), 218–244.

  • Yen, J. (1922). Pingmin jiaoyu xin yundong [A new movement of mass education]. Xin Jiaoyu [New Education], 5(5), 1007–1026.

  • Yen, J., & Fu, D. (1922). Pingmin qianzi ke ‘Foundation characters’ (Books 1–3). Dingxian: National Association of Mass Education Movement.

  • Yen, J., & Fu, D. (1924). Foundation characters (2nd revised edition). Shanghai: National Committee of Y.M.C.A. of China.

  • Zhang, B. (2003). HSK (Hanyu Shuiping Kaoshi) dongtai zuowen yuliaoku jianjie [Introducing Chinese proficiency test dynamic essay corpus]. Ceshi Yanjiu [Assessment Research], 1(4), 37–38.

  • Zhang, P. (1999). Guanyu yugan yu liutongdu de sikao [On language sense and degree of circulation]. Yuyan Jiaoxue yu Yanjiu [Language Teaching and Linguistic Studies], 21(2), 83–96.

  • Zhang, R. (2017). Hanyu zhongjieyu yuliaoku zhong de hanzi pianwu chuli yanjiu [The character errors in Chinese interlanguage corpora]. Yuliaoku Yuyanxue [Corpus Linguistics], 3(2), 50–59.

  • Zhao, S., Liu, S., & Hu, X. (1995). Beijing Yuyan Xueyuan xiandai hanyu jingdu jiaocai zhu kewen juxing tongji baogao [The BLCU report of the sentence patterns of the main texts of Modern Chinese Intensive Reading]. Yuyan Jiaoxue yu Yanjiu [Language Teaching and Linguistic Studies], 17(2), 11–26.

  • Zhou, X., Bo, W., Wang, L., & Li, Y. (2017). Guoji hanyu jiaocai yuliaoku de jianshe yu yingyong [The construction and application of international Chinese textbook corpus]. Yuyan Wenzi Yingyong [Applied Linguistics], 25(1), 125–135.

Acknowledgements

The research was supported in part by the key project of the National Research Centre for Foreign Language Education (MOE Key Research Institute of Humanities and Social Sciences at Universities) (Ref No.: 17JJD740003) at Beijing Foreign Studies University. The author gratefully acknowledges the funding of the Fulbright Visiting Scholar grant during the writing of the article. The author would also like to thank the referees and editors for their constructive comments, Professor Chengzhi Chu for providing helpful resources on ChineseTA, and Professor Eniko Csomay and Dr. Lu Lu for proofreading the manuscript.

Author information

Corresponding author

Correspondence to Jiajin Xu.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Xu, J. (2019). The Corpus Approach to the Teaching and Learning of Chinese as an L1 and an L2 in Retrospect. In: Lu, X., Chen, B. (eds) Computational and Corpus Approaches to Chinese Language Learning. Chinese Language Learning Sciences. Springer, Singapore. https://doi.org/10.1007/978-981-13-3570-9_3

  • DOI: https://doi.org/10.1007/978-981-13-3570-9_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-3569-3

  • Online ISBN: 978-981-13-3570-9

  • eBook Packages: Education, Education (R0)
