PAKDD 2011: Advances in Knowledge Discovery and Data Mining, pp. 13-25
A Game Theoretic Approach for Feature Clustering and Its Application to Feature Selection
Abstract
In this paper, we develop a game theoretic approach for clustering features in a learning problem. Feature clustering can serve as an important preprocessing step in many problems, such as feature selection and dimensionality reduction. In this approach, we view features as rational players of a coalitional game in which they form coalitions (or clusters) among themselves in order to maximize their individual payoffs. We show how the Nash Stable Partition (NSP), a well-known concept in coalitional game theory, provides a natural way of clustering features. With this approach, one can obtain desirable properties of the clusters by choosing appropriate payoff functions. For a small number of features, the NSP based clustering can be found by solving an integer linear program (ILP). For a large number of features, however, the ILP based approach does not scale well, and hence we propose a hierarchical approach. A key result that we prove, the equivalence between a k-size NSP of a coalitional game and a minimum k-cut of an appropriately constructed graph, proves useful for large scale problems. We use the feature selection problem (in a classification setting) as a running example to illustrate our approach, and we conduct experiments to illustrate its efficacy.
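To make the notion of a Nash Stable Partition concrete, the following is a minimal sketch, not the paper's ILP or hierarchical method. It assumes an additively separable payoff (a feature's payoff is the sum of pairwise weights to the other members of its coalition), which the abstract does not specify. The weight matrix `w` and the function names are hypothetical, chosen only for illustration. A partition is treated as Nash stable when no single feature can strictly increase its payoff by moving to another coalition or by going alone; the search is brute force and only feasible for a handful of features.

```python
from itertools import product

def payoff(i, coalition, w):
    """Assumed additively separable payoff: sum of weights to coalition mates."""
    return sum(w[i][j] for j in coalition if j != i)

def is_nash_stable(partition, w):
    """Return True if no player strictly gains from a unilateral move."""
    for idx, coalition in enumerate(partition):
        for i in coalition:
            current = payoff(i, coalition, w)
            # Option 1: leave and form a singleton (payoff 0 in this model).
            alternatives = [0.0] if len(coalition) > 1 else []
            # Option 2: join any other existing coalition.
            for jdx, other in enumerate(partition):
                if jdx != idx:
                    alternatives.append(payoff(i, set(other) | {i}, w))
            if any(a > current for a in alternatives):
                return False
    return True

def nash_stable_partitions(w):
    """Enumerate all partitions of the players and keep the stable ones."""
    n = len(w)
    seen = set()
    stable = []
    for labels in product(range(n), repeat=n):
        blocks = {}
        for i, lab in enumerate(labels):
            blocks.setdefault(lab, set()).add(i)
        key = frozenset(frozenset(b) for b in blocks.values())
        if key in seen:  # the same partition under a different labelling
            continue
        seen.add(key)
        partition = [set(b) for b in blocks.values()]
        if is_nash_stable(partition, w):
            stable.append(sorted(sorted(b) for b in partition))
    return stable

# Hypothetical weights for four features: 0 and 1 attract each other,
# 2 and 3 attract each other, and all cross pairs repel.
w = [
    [0, 1, -1, -1],
    [1, 0, -1, -1],
    [-1, -1, 0, 1],
    [-1, -1, 1, 0],
]

print(nash_stable_partitions(w))
```

With these illustrative weights, the only stable partition groups features 0 and 1 together and features 2 and 3 together. Different payoff functions would generally yield different stable partitions, which is the lever the abstract refers to when it mentions choosing appropriate payoff functions.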
Keywords
Feature Selection; Integer Linear Program; Spectral Cluster; Coalition Structure; Feature Cluster