Abstract
Feature selection plays an important role in text categorization. Term frequency and document frequency are commonly used measures in feature selection methods for text categorization. Term frequency provides document-level information about a word, while document frequency captures dataset-level information about it. We introduce a game-theoretic rough set based method for combining these measures in an effective and meaningful way. The method treats the measures as players in a game in which each player makes a three-way decision when selecting features. The three-way decisions for features are inspired by the three-way decisions used for classifying objects in rough sets. The decisions selected by the players, together with their respective measures, are used to find a cooperative solution, as in game-theoretic rough sets. A demonstrative example suggests that this method may be more efficient for feature selection in text categorization.
Keywords
- Feature selection
- game-theoretic rough sets
- text categorization
- three-way decisions
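The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch rather than the chapter's actual formulation. It assumes two players (term frequency and document frequency), three actions per player (accept, defer, reject), and a made-up payoff function with an agreement bonus; the names `preference`, `payoffs`, `pure_nash`, `decide`, and the `bonus` parameter are all hypothetical. It shows one way per-feature decisions could be obtained from a pure-strategy equilibrium of a small two-player game.

```python
from collections import Counter
from itertools import product

ACTIONS = ("accept", "defer", "reject")


def measures(docs):
    """Normalized (term frequency, document frequency) per word.

    `docs` is a non-empty list of token lists. The normalization used
    here is an assumption made for illustration.
    """
    tf = Counter(w for d in docs for w in d)
    df = Counter(w for d in docs for w in set(d))
    max_tf = max(tf.values())
    return {w: (tf[w] / max_tf, df[w] / len(docs)) for w in tf}


def preference(score, action):
    """Assumed utility of an action for a player whose score is in [0, 1]."""
    if action == "accept":
        return score
    if action == "reject":
        return 1.0 - score
    return 1.0 - abs(2.0 * score - 1.0)  # deferring is best near 0.5


def payoffs(tf_score, df_score, bonus=0.25):
    """Hypothetical payoff table: own preference plus a bonus for agreeing."""
    table = {}
    for a, b in product(ACTIONS, ACTIONS):
        agree = bonus if a == b else 0.0
        table[(a, b)] = (preference(tf_score, a) + agree,
                         preference(df_score, b) + agree)
    return table


def pure_nash(table):
    """Action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for (a, b), (u_a, u_b) in table.items():
        tf_ok = all(table[(x, b)][0] <= u_a for x in ACTIONS)
        df_ok = all(table[(a, y)][1] <= u_b for y in ACTIONS)
        if tf_ok and df_ok:
            equilibria.append((a, b))
    return equilibria


def decide(tf_score, df_score):
    """Three-way decision for one feature from the game's equilibrium.

    If several equilibria exist, the one with the highest total payoff
    is used; if the players disagree, the feature is deferred.
    """
    table = payoffs(tf_score, df_score)
    a, b = max(pure_nash(table), key=lambda pair: sum(table[pair]))
    return a if a == b else "defer"


def select(docs):
    """Map each word in the corpus to accept, defer, or reject."""
    return {w: decide(tf, df) for w, (tf, df) in measures(docs).items()}
```

For example, `select([["game", "rough", "set"], ["game", "theory"], ["game", "rough"], ["text"]])` assigns a decision to each of the five distinct words. Because each payoff is a private preference plus a shared agreement bonus, the game in this sketch always has at least one pure-strategy equilibrium, which is why `decide` does not handle an empty equilibrium list.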
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Azam, N., Yao, J. (2013). Game-Theoretic Rough Sets for Feature Selection. In: Skowron, A., Suraj, Z. (eds) Rough Sets and Intelligent Systems - Professor Zdzisław Pawlak in Memoriam. Intelligent Systems Reference Library, vol 43. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30341-8_5
DOI: https://doi.org/10.1007/978-3-642-30341-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30340-1
Online ISBN: 978-3-642-30341-8
eBook Packages: Engineering, Engineering (R0)