Corpus-Based Statistics of Pre-Qin Chinese

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7717)


Pre-Qin Chinese plays a key role in the history of the Chinese language. However, for lack of an annotated corpus, the overall picture of Pre-Qin Chinese vocabulary remains unclear. This paper introduces a corpus of 25 Pre-Qin classical texts with manual word segmentation and part-of-speech tagging. Character and word frequencies are then calculated from the corpus. Character entropy, the syllable counts of words, and words with multiple parts of speech are also analyzed statistically.
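As a rough illustration of the kind of statistics the abstract mentions, the sketch below counts character frequencies and computes a zeroth-order Shannon entropy, H = -Σ p(c) log2 p(c), over those frequencies. The abstract does not state how the paper defines character entropy, so this definition is an assumption, as are the function names `char_frequencies` and `char_entropy` and the toy sample string, which is not taken from the paper's corpus.

```python
import math
from collections import Counter


def char_frequencies(text):
    """Count each non-whitespace character in the text (hypothetical helper)."""
    return Counter(ch for ch in text if not ch.isspace())


def char_entropy(text):
    """Zeroth-order Shannon entropy in bits per character.

    Assumes entropy is computed from relative character frequencies;
    the paper may use a different definition.
    """
    counts = char_frequencies(text)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Toy sample for illustration only, not drawn from the paper's corpus.
    sample = "學而時習之不亦說乎"
    print(char_frequencies(sample).most_common(3))
    print(char_entropy(sample))
```

Word frequencies could be obtained the same way by passing a list of segmented words to `Counter` instead of iterating over characters, provided the segmentation is already available.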


Chinese information processing; Pre-Qin Chinese; lexical statistics; multiple part-of-speech word





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Research Center of Language and Informatics, Nanjing Normal University, Nanjing, China
  2. State Key Lab for Novel Software Technology, Nanjing University, Nanjing, China
