Genetic Programming Based on Granular Computing for Classification with High-Dimensional Data
Abstract
Classification becomes more challenging when the data suffer from the curse of dimensionality, and datasets with thousands of features have become increasingly common. Many classification algorithms need a feature selection step to cope with such data. Genetic programming (GP) has shown success in classification tasks and does not require a separate feature selection step, because it has a built-in capability to select informative features automatically. However, GP-based methods are often computationally intensive when they are required to achieve good classification accuracy. Drawing on perspectives from granular computing (GrC), this paper proposes a new approach to linking features hierarchically for GP-based classification. Experiments on seven high-dimensional datasets show that, compared with baseline methods, the proposed algorithm reduces training time and improves classification accuracy.
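To make the "built-in feature selection" point concrete, the following minimal sketch shows how a GP individual, viewed as an expression tree, implicitly selects features: only the features that appear as leaves influence the output. The nested-tuple encoding, the function set, the zero threshold, and the names `evaluate`, `used_features`, and `classify` are assumptions made for illustration; this is not the algorithm proposed in the paper, and it does not model the hierarchical linking of features.

```python
import operator

# Hypothetical encoding of a GP individual: a nested tuple (op, left, right).
# Terminals are either an int (index of a feature) or a float (a constant).
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, x):
    """Recursively evaluate the expression tree on one instance x."""
    if isinstance(tree, int):      # feature terminal
        return x[tree]
    if isinstance(tree, float):    # constant terminal
        return tree
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))


def used_features(tree):
    """Collect the feature indices that appear as leaves of the tree."""
    if isinstance(tree, int):
        return {tree}
    if isinstance(tree, float):
        return set()
    _, left, right = tree
    return used_features(left) | used_features(right)


def classify(tree, x, threshold=0.0):
    """Binary decision (an assumed convention): class 1 if output > threshold."""
    return 1 if evaluate(tree, x) > threshold else 0


# A small individual over a 10-feature instance: (x3 * x7) - (x1 + 0.5)
tree = ("-", ("*", 3, 7), ("+", 1, 0.5))
sample = [0.2, 1.0, 9.9, 2.0, 4.4, 0.0, 3.3, 1.5, 8.8, 6.1]

selected = sorted(used_features(tree))  # only 3 of the 10 features are used
label = classify(tree, sample)
```

In this toy individual only three of the ten features are referenced, so the evolved tree acts as its own feature subset; with thousands of features, the search over which leaves to use is also a large part of what makes GP training expensive.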
Keywords
High-dimensional data, Genetic programming, Granular computing, Classification
Notes
Acknowledgement
This work was supported in part by the Marsden Fund of the New Zealand Government under Contracts VUW1209, VUW1509 and VUW1615, Huawei Industry Fund E2880/3663, the Natural Science Foundation of Jiangsu, China, BK20161406, and the University Research Fund at Victoria University of Wellington 209862/3580 and 213150/3662.