Chapter

Computational Linguistics and Intelligent Text Processing

Volume 5449 of the series Lecture Notes in Computer Science, pp. 170-182

Combining Language Modeling and Discriminative Classification for Word Segmentation

  • Dekang Lin, affiliated with Google, Inc.

Abstract

Generative language modeling and discriminative classification are the two main techniques for Chinese word segmentation. Most previous methods have adopted only one of them. We present a hybrid model that combines the disambiguation power of language modeling with the ability of discriminative classifiers to deal with out-of-vocabulary words. We show that the combined model achieves a 9% error reduction over the discriminative classifier alone.

Keywords

Segmentation, Maximum Entropy, Language Model