Evolving Coevolutionary Classifiers Under Large Attribute Spaces

Doucette, John; Lichodzijewski, Peter; Heywood, Malcolm

doi:10.1007/978-1-4419-1626-6_3


Part of the book series: Genetic and Evolutionary Computation (GEVO)


Abstract

Model building in the supervised learning domain potentially faces a dual learning problem: identifying both the parameters of the model and the subset of (domain) attributes necessary to support the model, that is, an embedded design as opposed to a wrapper- or filter-based one. Genetic Programming (GP) has always addressed this dual problem; however, further implicit assumptions are made that potentially increase the complexity of the resulting solutions. In this work we are specifically interested in the case of classification under very large attribute spaces. In such cases it might be expected that multiple independent or overlapping attribute subspaces support the mapping to class labels, whereas GP approaches to classification generally assume a single binary classifier per class, forcing the model to provide a solution in terms of a single attribute subspace and a single mapping to class labels. Supporting the more general goal requires identifying a ‘team’ of classifiers with non-overlapping behaviors, in which each classifier responds to a different subset of exemplars. Moreover, each team member might utilize a unique ‘subspace’ of attributes. This work investigates the utility of coevolutionary model building for classification problems with attribute vectors of 650 to 100,000 dimensions. The resulting team-based coevolutionary method, Symbiotic Bid-based (SBB) GP, is compared to the alternative embedded classifier approaches of C4.5 and Maximum Entropy Classification (MaxEnt). SBB solutions demonstrate up to an order of magnitude lower attribute count relative to C4.5 and up to two orders of magnitude lower attribute count than MaxEnt while retaining comparable or better classification performance. Moreover, no individual model participating within a team ever utilizes more than six attributes, adding a further level of simplicity to the resulting solutions.
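
The abstract does not spell out how a team produces a class label. The following is a minimal Python sketch, assuming the bid-based scheme suggested by the method's name: each team member reads only its own small attribute subset, computes a bid, and the highest bidder supplies the label. The `Learner` class, the linear bid (a stand-in for an evolved GP bid program), and all numeric values are illustrative assumptions, not the authors' implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Learner:
    """Hypothetical team member: bids from a few attributes, advocates one class."""
    attributes: tuple[int, ...]  # indices into the full attribute vector
    weights: tuple[float, ...]   # assumption: linear bid replaces an evolved GP program
    label: str                   # class label this learner advocates

    def bid(self, x: Sequence[float]) -> float:
        # Only the learner's own attribute subspace is ever read.
        return sum(w * x[i] for i, w in zip(self.attributes, self.weights))


def classify(team: Sequence[Learner], x: Sequence[float]) -> str:
    """The highest bidder wins and supplies its class label."""
    return max(team, key=lambda learner: learner.bid(x)).label


def attributes_used(team: Sequence[Learner]) -> set[int]:
    """Union of the attribute indices read by any team member."""
    return {i for learner in team for i in learner.attributes}


# Toy team over a 1,000-dimensional attribute vector (all values are made up).
team = [
    Learner(attributes=(3, 17), weights=(1.0, -0.5), label="A"),
    Learner(attributes=(17, 420), weights=(0.8, 0.8), label="B"),
    Learner(attributes=(999,), weights=(2.0,), label="A"),
]

x = [0.0] * 1000
x[17], x[420] = 1.0, 1.0

prediction = classify(team, x)         # the second learner bids highest here
used = sorted(attributes_used(team))   # attribute count of the whole team
```

In this toy setup the team as a whole reads four of the 1,000 attributes, and two learners may advocate the same class while responding to different exemplars, which is the kind of decomposition the abstract describes.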



Author information

Authors and Affiliations

Dalhousie University, 6050 University Av., Halifax, NS, B3H 1W5, Canada
John Doucette (Faculty of Computer Science), Peter Lichodzijewski (Faculty of Computer Science) & Malcolm Heywood (Faculty of Computer Science)


Editor information

Editors and Affiliations

Center for the Study of Complex Systems, University of Michigan, West Hall 323, Ann Arbor, 48109, U.S.A.
Rick Riolo
Computer Science &, Massachusetts Institute of Technology, Vassar St. 32, Cambridge, 02139, U.S.A.
Una-May O'Reilly
Solido Design Automation, Inc., Research Drive 102-116, Saskatoon, S7N 3R3, Canada
Trent McConaghy

About this chapter

Cite this chapter

Doucette, J., Lichodzijewski, P., Heywood, M. (2010). Evolving Coevolutionary Classifiers Under Large Attribute Spaces. In: Riolo, R., O'Reilly, UM., McConaghy, T. (eds) Genetic Programming Theory and Practice VII. Genetic and Evolutionary Computation. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-1626-6_3

DOI: https://doi.org/10.1007/978-1-4419-1626-6_3
Published: 20 October 2009
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-1653-2
Online ISBN: 978-1-4419-1626-6
eBook Packages: Computer Science, Computer Science (R0)
