Chi-Square Classifier for Document Categorization
- Cite this paper as:
- Alexandrov M., Gelbukh A., Lozovoi G. (2001) Chi-Square Classifier for Document Categorization. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2001. Lecture Notes in Computer Science, vol 2004. Springer, Berlin, Heidelberg
This paper considers the problem of document categorization. The set of domains and the keywords specific to each domain are assumed to be selected beforehand as initial data. We apply a well-known statistical hypothesis test, treating the images of documents and domains as normalized vectors. Compared with existing methods, this approach makes it possible to take into account the random nature of the initial data. The classifier is developed as part of the Document Investigator software package.
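The abstract does not spell out the procedure, so the following is only a minimal sketch of how a chi-square comparison between a document and domain keyword profiles might look. It assumes Pearson's goodness-of-fit statistic and assigns the domain with the smallest value. The domain names, keywords, frequencies, and function names are all hypothetical and are not taken from the paper or from Document Investigator.

```python
from collections import Counter

# Hypothetical domain profiles (an assumption, not data from the paper):
# relative keyword frequencies over a shared vocabulary, each summing to 1.
DOMAIN_PROFILES = {
    "physics": {"energy": 0.5, "particle": 0.3, "market": 0.1, "price": 0.1},
    "economics": {"energy": 0.1, "particle": 0.1, "market": 0.4, "price": 0.4},
}


def keyword_counts(text, vocabulary):
    """Count how often each vocabulary keyword occurs in the text."""
    words = Counter(text.lower().split())
    return {kw: words[kw] for kw in vocabulary}


def chi_square_statistic(counts, profile):
    """Pearson statistic: sum of (observed - expected)^2 / expected."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("document contains none of the keywords")
    return sum(
        (counts[kw] - total * p) ** 2 / (total * p)
        for kw, p in profile.items()
    )


def classify(text, profiles=DOMAIN_PROFILES):
    """Return the best-fitting domain and the statistic for every domain."""
    vocabulary = next(iter(profiles.values())).keys()
    counts = keyword_counts(text, vocabulary)
    scores = {d: chi_square_statistic(counts, p) for d, p in profiles.items()}
    return min(scores, key=scores.get), scores
```

In a fuller treatment the statistic would be compared against a critical value of the chi-square distribution, so that a document fitting no domain well could be rejected. The sketch omits that step and only ranks the domains.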