Localized Graph-Based Feature Selection for Clustering

Zhang, Zhihong; Hancock, Edwin R.

doi:10.1007/978-3-642-31295-3_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7324)

Included in the following conference series:

International Conference Image Analysis and Recognition

Abstract

In many data analysis tasks, one is confronted with very high-dimensional data. Feature selection is essentially a combinatorial optimization problem and is computationally expensive. To make it tractable, traditional feature selection methods frequently assume either that the features influence the class variable independently or that they do so only through pairwise feature interactions. They also attempt to select a single feature subset common to all the clusters present in the data, and in doing so neglect the fact that different features may have different discriminating power for different classes. To tackle these problems, we propose a localized graph-based feature selection algorithm consisting of three steps: i) based on the label information, we construct a graph for each class of the dataset, in which each node corresponds to a feature and each edge has a weight corresponding to the mutual information (MI) between the features it connects; ii) we perform dominant set clustering on the graphs to select a highly coherent set of features; iii) we further refine the selected features using a new measure called multidimensional interaction information (MII). The advantage of MII is that it goes beyond pairwise interaction and considers third- or higher-order feature interactions. Using dominant set clustering as a preprocessing step extracts the most informative features in the leading dominant set, and in doing so limits the search space for higher-order interactions. We use a variational Bayesian EM (VBEM) algorithm to learn a Gaussian mixture model on the selected feature subset for clustering. Experimental results demonstrate the effectiveness of our localized feature selection method on a number of standard datasets.
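Steps i) and ii) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-width binning, the plug-in MI estimator, the replicator-dynamics iteration used to extract a dominant set (the standard approach of Pavan and Pelillo), the cutoff values, and all function names are assumptions made for illustration. Step iii) (MII refinement) and the VBEM clustering stage are not covered, since the abstract does not define them in enough detail.

```python
import math
import random
from collections import Counter

def discretize(values, bins=4):
    """Equal-width binning of one feature column (an assumed, simple choice)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant column
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y):
    """Plug-in estimate of MI (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def feature_graph(rows, bins=4):
    """Step i: nodes are features, edge weights are pairwise MI (zero diagonal)."""
    cols = [discretize(list(col), bins) for col in zip(*rows)]
    d = len(cols)
    W = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            W[i][j] = W[j][i] = mutual_information(cols[i], cols[j])
    return W

def dominant_set(W, iters=1000, tol=1e-9, cutoff=1e-4):
    """Step ii: replicator dynamics x_i <- x_i (Wx)_i / (x^T W x).

    Starts from the barycenter of the simplex; the features whose weight
    stays above `cutoff` are taken as the dominant set.
    """
    d = len(W)
    x = [1.0 / d] * d
    for _ in range(iters):
        Wx = [sum(W[i][j] * x[j] for j in range(d)) for i in range(d)]
        total = sum(x[i] * Wx[i] for i in range(d))
        if total <= 0:
            break
        new = [x[i] * Wx[i] / total for i in range(d)]
        converged = max(abs(a - b) for a, b in zip(new, x)) < tol
        x = new
        if converged:
            break
    return [i for i in range(d) if x[i] > cutoff], x

def sample(coupled, n=300, d=5):
    """Synthetic class: features in `coupled` share a latent signal, the rest are noise."""
    out = []
    for _ in range(n):
        z = random.gauss(0, 1)
        out.append([z + random.gauss(0, 0.1) if j in coupled else random.gauss(0, 1)
                    for j in range(d)])
    return out

# One graph per class, as in the localized scheme described in the abstract.
random.seed(0)
data = {"a": sample({0, 1}), "b": sample({2, 3})}
selected = {label: dominant_set(feature_graph(rows))[0]
            for label, rows in data.items()}
print(selected)
```

In this synthetic setup each class has its own pair of mutually informative features, so the per-class graphs yield different feature subsets, which is the point of selecting features locally rather than once for all classes.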

Author information

Authors and Affiliations

Department of Computer Science, University of York, York, YO10 5GH, UK
Zhihong Zhang & Edwin R. Hancock

Editor information

Editors and Affiliations

Faculty of Engineering, Institute of Biomedical Engineering, University of Porto, Rua Dr. Roberto Frias, 4200-465, Porto, Portugal
Aurélio Campilho
Department of Electrical and Computer Engineering, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
Mohamed Kamel

About this paper

Cite this paper

Zhang, Z., Hancock, E.R. (2012). Localized Graph-Based Feature Selection for Clustering. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2012. Lecture Notes in Computer Science, vol 7324. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31295-3_1

DOI: https://doi.org/10.1007/978-3-642-31295-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31294-6
Online ISBN: 978-3-642-31295-3
eBook Packages: Computer Science (R0)