Evolving Coevolutionary Classifiers Under Large Attribute Spaces

  • Chapter
Genetic Programming Theory and Practice VII

Abstract

Model building in the supervised learning domain potentially faces a dual learning problem: identifying both the parameters of the model and the subset of (domain) attributes necessary to support the model, thus using an embedded as opposed to a wrapper or filter based design. Genetic Programming (GP) has always addressed this dual problem; however, further implicit assumptions are made which potentially increase the complexity of the resulting solutions. In this work we are specifically interested in the case of classification under very large attribute spaces. In such cases it might be expected that multiple independent or overlapping attribute subspaces support the mapping to class labels, whereas GP approaches to classification generally assume a single binary classifier per class, forcing the model to provide a solution in terms of a single attribute subspace and a single mapping to class labels. Supporting the more general goal is considered to require identifying a ‘team’ of classifiers with non-overlapping behaviors, in which each classifier responds to a different subset of exemplars. Moreover, each team member might utilize a unique ‘subspace’ of attributes. This work investigates the utility of coevolutionary model building for classification problems with attribute vectors consisting of 650 to 100,000 dimensions. The resulting team-based coevolutionary method, Symbiotic Bid-based (SBB) GP, is compared to the alternative embedded classifier approaches of C4.5 and Maximum Entropy Classification (MaxEnt). SBB solutions demonstrate up to an order of magnitude lower attribute count than C4.5 and up to two orders of magnitude lower attribute count than MaxEnt while retaining comparable or better classification performance. Moreover, the individual models participating within a team never utilize more than six attributes each, adding a further level of simplicity to the resulting solutions.
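
The team idea in the abstract can be made concrete with a minimal sketch. It is not the authors' implementation. The `Learner` class, the hand-written bid functions, the attribute indices, and the rule that the highest bidder supplies the class label are all assumptions chosen for illustration, and no evolution is performed. The sketch only shows how a team can read a handful of attributes from a 650-dimensional exemplar while different members respond to different exemplars.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Set, Tuple


@dataclass(frozen=True)
class Learner:
    """Hypothetical team member: a bid function over a few attributes plus a fixed label."""
    attributes: Tuple[int, ...]              # indices of the attributes this member reads
    bid: Callable[[Sequence[float]], float]  # stand-in for an evolved bidding program
    label: str                               # class label suggested when this member wins


def classify(team: Sequence[Learner], exemplar: Sequence[float]) -> str:
    """Return the label of the member that bids highest on this exemplar (assumed rule)."""
    def bid_of(member: Learner) -> float:
        inputs = [exemplar[i] for i in member.attributes]
        return member.bid(inputs)
    return max(team, key=bid_of).label


def attributes_used(team: Sequence[Learner]) -> Set[int]:
    """Union of the attribute subspaces indexed by all team members."""
    return {i for member in team for i in member.attributes}


# Illustrative team: three members, each reading at most two of 650 attributes.
team = [
    Learner((3, 17), lambda x: x[0] - x[1], "A"),
    Learner((17, 402), lambda x: x[0] + x[1], "B"),
    Learner((649,), lambda x: 2.0 * x[0], "A"),
]

first = [0.0] * 650
first[3], first[17] = 5.0, 1.0      # the first member bids highest here

second = [0.0] * 650
second[17], second[402] = 1.0, 3.0  # the second member bids highest here

print(classify(team, first), classify(team, second), sorted(attributes_used(team)))
```

Here two members share the label "A" but index different attributes, which mirrors the abstract's point that several attribute subspaces may support the mapping to one class. The whole team touches four attributes out of 650.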

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Doucette, J., Lichodzijewski, P., Heywood, M. (2010). Evolving Coevolutionary Classifiers Under Large Attribute Spaces. In: Riolo, R., O'Reilly, UM., McConaghy, T. (eds) Genetic Programming Theory and Practice VII. Genetic and Evolutionary Computation. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-1626-6_3

  • DOI: https://doi.org/10.1007/978-1-4419-1626-6_3

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-1653-2

  • Online ISBN: 978-1-4419-1626-6

  • eBook Packages: Computer Science, Computer Science (R0)
