Control of Variables in Reducts - kNN Classification with Confidence

  • Naohiro Ishii
  • Yuichi Morioka
  • Yongguang Bao
  • Hidekazu Tanaka
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6884)


A reduct in rough set theory is a minimal subset of features that has almost the same discernibility power as the entire feature set, and there are relations between reducts and the classification classes. Here, we propose combining multiple reducts with a k-nearest neighbor classifier with confidence to classify documents with higher accuracy. To improve classification accuracy, several reducts are needed, so the control of which variables serve as attributes is important. To select better reducts for classification, a greedy algorithm based on the selection of useful attributes is developed here. The proposed methods are verified to be effective for classification on benchmark datasets drawn from the Reuters-21578 collection.
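The scheme described in the abstract can be illustrated with a short sketch: each reduct restricts kNN distance computation to its own attribute subset, the vote share of the winning class serves as a confidence value, and a greedy loop adds reducts only while they improve a scoring function. This is an illustrative sketch under assumptions, not the authors' implementation: the squared-distance measure, the vote-share confidence, and the `score` callback are all stand-ins for details given in the full paper.

```python
from collections import Counter

def knn_predict(train, query, reduct, k=3):
    """Classify `query` by majority vote among the k nearest training
    points, comparing only the attribute indices listed in `reduct`.
    Returns (label, confidence), where confidence is the vote share."""
    def dist(x):
        return sum((x[i] - query[i]) ** 2 for i in reduct)
    neighbors = sorted(train, key=lambda item: dist(item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    label, count = votes.most_common(1)[0]
    return label, count / k

def classify_with_reducts(train, query, reducts, k=3):
    """Combine several reducts: each reduct's kNN votes with its
    confidence; the class with the largest summed confidence wins."""
    scores = Counter()
    for reduct in reducts:
        label, conf = knn_predict(train, query, reduct, k)
        scores[label] += conf
    return scores.most_common(1)[0][0]

def greedy_select(candidates, score):
    """Greedily add the candidate reduct that most improves the score
    of the current ensemble; stop when no candidate improves it."""
    chosen, best = [], float('-inf')
    while candidates:
        top_score, top = max((score(chosen + [c]), c) for c in candidates)
        if top_score <= best:
            break
        chosen.append(top)
        best = top_score
        candidates = [c for c in candidates if c != top]
    return chosen
```

In practice `score` would be held-out classification accuracy of `classify_with_reducts` on a validation split, so the greedy loop keeps only reducts that raise ensemble accuracy.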


Keywords: Classification Accuracy · Greedy Algorithm · Confidence Computation · High Classification Accuracy · Train Data





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Naohiro Ishii (1)
  • Yuichi Morioka (1)
  • Yongguang Bao (2)
  • Hidekazu Tanaka (3)
  1. Aichi Institute of Technology, Toyota, Japan
  2. Aichi Information System, Kariya, Japan
  3. Daido University, Nagoya, Japan
