Control of Variables in Reducts - kNN Classification with Confidence
A reduct in rough set theory is a minimal subset of features that has almost the same discerning power as the full feature set. Reducts are therefore related to the classification classes. Here, we propose using multiple reducts followed by k-nearest neighbor classification with confidence to classify documents with higher accuracy. Improving the classification accuracy requires several reducts, so control of the variables (attributes) is important. To select better reducts, we develop a greedy algorithm based on the selection of useful attributes. The proposed methods are verified to be effective on benchmark data from the Reuters 21578 data set.
Keywords: Classification Accuracy, Greedy Algorithm, Confidence Computation, High Classification Accuracy, Train Data
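The abstract names three ingredients: a reduct, a greedy selection of attributes, and k-nearest neighbor classification with a confidence value. The sketch below illustrates these ideas on a toy table and is not the authors' algorithm. It assumes a Johnson-style greedy heuristic over discernible pairs, Hamming distance on binary term features, and confidence defined as the share of the k votes won by the predicted class. The data and the names `greedy_reduct`, `knn_with_confidence`, `ROWS`, and `LABELS` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy decision table (hypothetical): each row is a document described by four
# binary term-presence attributes; LABELS holds the class of each row.
ROWS = [(1, 0, 1, 0), (1, 1, 0, 0), (0, 1, 0, 1), (1, 0, 0, 1), (0, 0, 1, 1)]
LABELS = ["grain", "grain", "trade", "trade", "grain"]


def greedy_reduct(rows, labels):
    """Johnson-style heuristic: repeatedly pick the attribute that discerns
    the most still-undiscerned pairs of differently labeled rows."""
    n_attrs = len(rows[0])
    # For every pair with different labels, record the attributes that differ.
    pending = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] != labels[j]:
            diff = {a for a in range(n_attrs) if rows[i][a] != rows[j][a]}
            if diff:
                pending.append(diff)
    chosen = []
    while pending:
        counts = Counter(a for diff in pending for a in diff)
        # Highest count wins; ties go to the lowest attribute index.
        best = min(counts, key=lambda a: (-counts[a], a))
        chosen.append(best)
        pending = [diff for diff in pending if best not in diff]
    return sorted(chosen)


def knn_with_confidence(rows, labels, query, attrs, k=3):
    """Vote among the k nearest rows, using Hamming distance restricted to
    attrs; the confidence is the winning class's share of the k votes."""
    def dist(row):
        return sum(row[a] != query[a] for a in attrs)

    order = sorted(range(len(rows)), key=lambda i: (dist(rows[i]), i))
    votes = Counter(labels[i] for i in order[:k])
    label, count = votes.most_common(1)[0]
    return label, count / k


reduct = greedy_reduct(ROWS, LABELS)
print(reduct)
print(knn_with_confidence(ROWS, LABELS, (1, 1, 1, 0), reduct))
```

A greedy heuristic of this kind returns a small discerning attribute set, but not necessarily a minimal one. With several reducts, one plausible use of the confidence value is to keep the prediction from the reduct whose vote is most decisive.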
- 3. Skowron, A., Rauszer, C.: The Discernibility Matrices and Functions in Information Systems. In: Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory, pp. 331–362. Kluwer Academic Publishers, Dordrecht (1992)
- 4. Skowron, A., Polkowski, L.: Decision Algorithms: A Survey of Rough Set Theoretic Methods. Fundamenta Informaticae 30(3-4), 345–358 (1997)
- 5. Reuters 21578, http://www.daviddlewis.com/resources/testcollections/
- 6. Bao, Y., Aoyama, S., Du, X., Yamada, K., Ishii, N.: A Rough Set-Based Hybrid Method to Text Categorization. In: Proc. 2nd International Conference on Web Information Systems Engineering, pp. 254–261. IEEE Computer Society, Los Alamitos (2001)