Study on the English Corresponding Unit of Chinese Clause
This paper annotates and statistically analyzes the English units that correspond to Chinese clauses in Chinese-to-English translation. First, following the segmentation of the Chinese source text into clauses, we segment the English target text into corresponding units (clauses) to obtain a Chinese-to-English clause-aligned parallel corpus. Next, we annotate the grammatical properties of the English corresponding clauses in the corpus. Finally, a statistical analysis of the annotated corpus shows how these grammatical properties are distributed: there are more clauses (1631, 74.41%) than sentences (561, 25.59%); there are more major clauses (1719, 78.42%) than subordinate clauses (473, 21.58%); among subordinate clauses, there are more adverbial clauses (392, 82.88%) than attributive clauses (81, 17.12%) and more non-defining clauses (358, 75.69%) than restrictive relative clauses (115, 24.31%); and there are more simple clauses (1142, 52.1%) than coordinate clauses (1050, 47.9%).
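The percentages reported above follow directly from the counts: each top-level contrast sums to 2192 corresponding units, and each contrast within subordinate clauses sums to 473. A minimal sketch that recomputes them is given below. The function name `shares` and the dictionary labels are illustrative assumptions, not part of the original annotation scheme; only the counts come from the abstract.

```python
def shares(counts):
    """Return each category's percentage of the group total, rounded to two decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Counts as reported in the abstract; the labels are illustrative.
distributions = {
    "unit type": {"clause": 1631, "sentence": 561},
    "clause rank": {"major": 1719, "subordinate": 473},
    "subordinate function": {"adverbial": 392, "attributive": 81},
    "subordinate restrictiveness": {"non-defining": 358, "restrictive": 115},
    "clause structure": {"simple": 1142, "coordinate": 1050},
}

for name, counts in distributions.items():
    # Print the group total alongside the recomputed percentages.
    print(name, sum(counts.values()), shares(counts))
```

Grouping the counts by contrast keeps each denominator explicit, which makes it easy to see that the two subordinate-clause contrasts are computed over the 473 subordinate clauses rather than over the whole corpus.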
Keywords: Clauses, Parallel corpus, Clause-based, Clause alignment, Discourse-based translation, Chinese-to-English translation
This work was supported by the Program of Humanities and Social Sciences of the Ministry of Education (13YJC740022, 15YJC740021), the Major Projects of Basic Research of Philosophy and Sociology in Colleges, Henan (2015-JCZD-022), the China Postdoctoral Fund (2013M540594), the National Natural Science Foundation of China (61273320, 61502149, 61402119), the China Scholarship Council (201508090048), and the Programs to Improve Competitiveness, Russia (02.A03.21.0006).