CIARP 2013: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, pp. 246–253
Feature Space Reduction for Graph-Based Image Classification
Abstract
Feature selection is an essential preprocessing step for classifiers trained on high-dimensional data. In pattern recognition, it improves classification performance by reducing the feature space while preserving the discriminative power of the original representation. Image classification using frequent approximate subgraph mining (FASM) is a case where the benefits of feature selection are especially needed, because representing images through frequent approximate subgraphs (FASs) leads to very high-dimensional feature spaces. In this paper, we explore the use of feature selection algorithms to reduce the FAS-based representation of an image collection. Our results show a dimensionality reduction of over 50% of the original features while achieving classification results similar to those obtained with the full feature set.
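The idea in the abstract can be illustrated with a minimal sketch: each image is encoded as a binary occurrence vector over candidate FAS features, and a simple filter criterion ranks features so that only the most discriminative ones are kept. The scoring function, data, and the 50% reduction shown here are illustrative assumptions, not the selection method used in the paper.

```python
# Minimal sketch of filter-based feature selection over a FAS-style
# representation: each image is a binary vector whose j-th entry says
# whether the j-th frequent approximate subgraph occurs in that image.
# The relevance criterion below (per-class frequency difference) is an
# illustrative stand-in for the filters evaluated in the paper.

def relevance_scores(vectors, labels):
    """Score each feature by the absolute difference of its occurrence
    frequency between the two classes (a simple filter criterion)."""
    n_features = len(vectors[0])
    scores = []
    for j in range(n_features):
        freq = {0: 0.0, 1: 0.0}
        count = {0: 0, 1: 0}
        for v, y in zip(vectors, labels):
            freq[y] += v[j]
            count[y] += 1
        scores.append(abs(freq[0] / count[0] - freq[1] / count[1]))
    return scores

def select_top_k(vectors, labels, k):
    """Keep the k highest-scoring features, reducing dimensionality."""
    scores = relevance_scores(vectors, labels)
    keep = sorted(range(len(scores)), key=lambda j: -scores[j])[:k]
    keep.sort()
    reduced = [[v[j] for j in keep] for v in vectors]
    return reduced, keep

# Toy collection: 4 images, 6 candidate FAS features, 2 classes.
X = [[1, 0, 1, 1, 0, 1],
     [1, 0, 1, 0, 0, 1],
     [0, 1, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0]]
y = [0, 0, 1, 1]

# Keep 3 of 6 features: a reduction of 50% of the original features.
X_red, kept = select_top_k(X, y, k=3)
```

In this toy run, feature 3 receives score 0 (it occurs equally often in both classes) and is discarded, while perfectly class-correlated features survive, so the reduced vectors still separate the two classes.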
Keywords
Approximate graph mining, approximate graph matching, feature selection, graph-based classification