Abstract
This paper proposes using Gene Expression Programming (GEP) to induce expression trees that subsequently serve as weak classifiers, and investigates two techniques for constructing ensemble classifiers from them. The working hypothesis is that, given a set of classifiers generated by gene expression programming and combined through some variant of boosting, one can construct an ensemble that effectively produces high-quality classification results. A detailed description of the proposed GEP implementation, which generates classifiers in the form of expression trees, is followed by an account of the AdaBoost and boosting algorithms used to construct the ensemble classifier. To validate the approach, a computational experiment involving several benchmark datasets was carried out. The results show that using GEP-induced expression trees as weak classifiers allows the construction of a high-quality ensemble classifier that outperforms, in terms of classification accuracy, many other recently published solutions.
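To make the boosting idea in the abstract concrete, the following is a minimal sketch of textbook discrete AdaBoost over a fixed pool of weak classifiers. It is not the authors' implementation: the names `stump`, `adaboost`, and `predict` are hypothetical, the one-feature threshold rules merely stand in for GEP-induced expression trees, and the toy dataset is invented for illustration.

```python
import math

def stump(feature, threshold, sign):
    """Stand-in weak classifier (assumption: replaces a GEP-induced tree).
    Returns `sign` when x[feature] > threshold, otherwise -sign."""
    def classify(x):
        return sign if x[feature] > threshold else -sign
    return classify

def adaboost(X, y, candidates, rounds):
    """Textbook discrete AdaBoost; labels in y must be -1 or +1.
    Returns a list of (alpha, classifier) pairs."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Select the candidate with the lowest weighted training error.
        best, best_err = None, float("inf")
        for h in candidates:
            err = sum(w for w, xi, yi in zip(weights, X, y) if h(xi) != yi)
            if err < best_err:
                best, best_err = h, err
        if best_err >= 0.5:
            break  # no candidate is better than chance on these weights
        err = max(best_err, 1e-12)  # guard against division by zero
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, best))
        # Raise the weight of misclassified examples, lower the rest.
        weights = [w * math.exp(-alpha * yi * best(xi))
                   for w, xi, yi in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the selected weak classifiers."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1

# Toy data (invented): no single threshold rule separates the classes.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
candidates = [stump(0, 1.5, -1), stump(0, 3.5, 1), stump(0, -0.5, 1)]
ensemble = adaboost(X, y, candidates, rounds=3)
predictions = [predict(ensemble, x) for x in X]
```

On this toy set each candidate rule alone misclassifies two of the six examples, while the weighted vote after three rounds classifies all six correctly. In the setting the abstract describes, the candidate pool would presumably be produced by GEP rather than enumerated by hand.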
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jȩdrzejowicz, J., Jȩdrzejowicz, P. (2008). GEP-Induced Expression Trees as Weak Classifiers. In: Perner, P. (ed.) Advances in Data Mining. Medical Applications, E-Commerce, Marketing, and Theoretical Aspects. ICDM 2008. Lecture Notes in Computer Science, vol 5077. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70720-2_10
DOI: https://doi.org/10.1007/978-3-540-70720-2_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70717-2
Online ISBN: 978-3-540-70720-2
eBook Packages: Computer Science, Computer Science (R0)