A multi-core parallelization strategy for statistical significance testing in learning classifier systems

Rudd, James; Moore, Jason H.; Urbanowicz, Ryan J.

doi:10.1007/s12065-013-0092-0

Research Paper
Published: 08 October 2013

Volume 6, pages 127–134 (2013)

Evolutionary Intelligence

James Rudd¹,
Jason H. Moore¹ &
Ryan J. Urbanowicz¹

Abstract

Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have appeared within the learning classifier system (LCS) literature only since 2012. Although still not widely used by the LCS research community, formal evaluations of statistical confidence are imperative for large and complex real-world applications such as genetic epidemiology, where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own, and the compounding requirements for generating permutation-based statistics may be a limiting factor for some researchers interested in applying LCS algorithms to real-world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study, we examine the benefits of externally parallelizing a series of independent LCS runs such that permutation testing with cross validation becomes more feasible to complete on a single multi-core workstation. We test our Python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that, as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear.
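
The strategy described in the abstract (many independent runs, one per permutation and cross-validation fold, distributed across CPU cores) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy XOR dataset, the lookup-table "learner" standing in for an LCS run, and the names `make_dataset`, `run_task`, `run_all`, and `permutation_p_value` are all assumptions made for illustration. It uses the standard-library `multiprocessing.Pool` and targets Python 3.

```python
import random
from multiprocessing import Pool, cpu_count


def make_dataset(n=200, seed=0):
    """Toy data (assumption): class is the XOR of two binary attributes."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a, b = rng.randint(0, 1), rng.randint(0, 1)
        rows.append(((a, b), a ^ b))
    return rows


def run_task(task):
    """One independent run: optionally permute labels, train, test one fold."""
    rows, perm_seed, fold, k = task
    labels = [y for _, y in rows]
    if perm_seed is not None:
        # Shuffling class labels breaks any attribute-class association.
        random.Random(perm_seed).shuffle(labels)
    data = [(x, y) for (x, _), y in zip(rows, labels)]
    train = [r for i, r in enumerate(data) if i % k != fold]
    test = [r for i, r in enumerate(data) if i % k == fold]
    # Stand-in for an LCS run (hypothetical): majority class per attribute pair.
    counts = {}
    for x, y in train:
        counts.setdefault(x, [0, 0])[y] += 1
    correct = 0
    for x, y in test:
        c = counts.get(x, [1, 0])
        correct += int((1 if c[1] > c[0] else 0) == y)
    return perm_seed, correct / len(test)


def run_all(rows, n_perm=19, k=5, processes=1):
    """Run the unpermuted data plus n_perm permutations, each with k-fold CV."""
    tasks = [(rows, p, f, k)
             for p in [None] + list(range(n_perm))
             for f in range(k)]
    if processes == 1:
        results = [run_task(t) for t in tasks]  # serial baseline
    else:
        with Pool(processes=processes) as pool:
            results = pool.map(run_task, tasks)  # runs are independent
    folds = {}
    for key, acc in results:
        folds.setdefault(key, []).append(acc)
    mean = {key: sum(v) / len(v) for key, v in folds.items()}
    observed = mean.pop(None)
    return observed, [mean[p] for p in range(n_perm)]


def permutation_p_value(observed, permuted):
    """Fraction of permuted results at least as good as the observed one."""
    hits = sum(1 for a in permuted if a >= observed)
    return (hits + 1) / (len(permuted) + 1)


if __name__ == "__main__":
    rows = make_dataset()
    # Keep the process count at or below the number of CPU cores.
    observed, permuted = run_all(rows, processes=cpu_count())
    print(observed, permutation_p_value(observed, permuted))
```

Because each task depends only on its own permutation seed and fold index, the runs share no state and can be mapped onto worker processes in any order. Timing `run_all` with `processes=1` against larger values gives a simple way to check speedup on a given workstation.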

Acknowledgments

This work was supported by NIH grants LM011360, LM009012 and LM010098.

Author information

Authors and Affiliations

Dartmouth College, 1 Medical Center Dr, Lebanon, NH, 03755, USA
James Rudd, Jason H. Moore & Ryan J. Urbanowicz

Corresponding author

Correspondence to Ryan J. Urbanowicz.

About this article

Cite this article

Rudd, J., Moore, J.H. & Urbanowicz, R.J. A multi-core parallelization strategy for statistical significance testing in learning classifier systems. Evol. Intel. 6, 127–134 (2013). https://doi.org/10.1007/s12065-013-0092-0

Received: 20 August 2013
Accepted: 12 September 2013
Published: 08 October 2013
Issue Date: November 2013
DOI: https://doi.org/10.1007/s12065-013-0092-0
