Recursive Part-of-Speech Tagging Using Word Structures
This research takes advantage of word structures and produces a good estimate of part-of-speech tags of Chinese compound words before they are fed into a tagger. The approach relies on a set of features from Chinese morphemes as well as a set of collocation markers which provide hints on the syntactic categories of compound words. A recursive inferential mechanism is devised to alleviate the riffle effect from changes made at its neighbors during tagging. The approach is justified with a compound words database with more than 53,500 words. Experimental results with 500,000 words show the approach outperforms its counterparts.
KeywordsPart-of-speech tagging Tree-based classifier Chinese morphemes
Unable to display preview. Download preview PDF.
- 1.Chung, Y.-S., Chen, K.-J.: Analysis of Chinese morphemes and its application to sense and part-of-speech prediction for Chinese compounds. In: Proceedings of the Joint Conference of 23rd International Conference on the Computer Processing of Oriental Languages (2010)Google Scholar
- 6.Lin, D., Zhou, S., Qin, L., Zhou, M.: Identifying synonyms among distributionally similar words. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence, pp. 1492–1493 (2003)Google Scholar
- 7.Liu, Y., Yu, S., Zhu, X.: Construction of the contemporary Chinese compound words database and its application. In: Zhang, P. (ed.) The Contemporary Educational Techniques and Teaching Chinese as a Foreign Language, pp. 273–278. Guangxi Normal University Press (2000)Google Scholar
- 8.Ng, H.T., Low, J.K.: Chinese part-of-speech tagging: One-at-a-time or all-at-once? Word-based or character-based? In: Proceedings of EMNLP, Barcelona, Spain (2004)Google Scholar
- 9.Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann (1993)Google Scholar
- 10.Tseng, H., Chen, K.-J.: Design of Chinese morphological analyzer. In: Proceedings of the First SIGHAN Workshops on Chinese Language Processing (2002)Google Scholar