Automated discovery of test statistics using genetic programming
Developing new test statistics is laborious: it requires manually constructing and evaluating mathematical functions that satisfy several theoretical properties. Automating this process, which has not previously been done, would greatly accelerate the discovery of much-needed new test statistics. Automation is challenging because the discovery method must encode the desirable properties of a good test statistic and also needs an engine that can generate and explore candidate mathematical solutions in an intuitive representation. In this paper we describe a genetic programming-based system for the automated discovery of new test statistics. Specifically, our system discovered test statistics as powerful as the t test for comparing sample means from two distributions with equal variances.
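To make "as powerful as the t test" concrete, the following minimal sketch estimates the power of a candidate statistic by Monte Carlo simulation and compares it against the pooled-variance t statistic. The abstract does not say how candidates were scored, so everything here is an assumption made for illustration: the normal data model with unit variance, the simulated critical value, the sample sizes, and the names `pooled_t`, `mean_diff`, and `estimate_power`. It is not the authors' fitness function.

```python
import math
import random
import statistics


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance (equal-variance case)."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x)
           + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))


def mean_diff(x, y):
    """Hypothetical candidate of the kind a search might propose."""
    return statistics.mean(x) - statistics.mean(y)


def estimate_power(stat, shift, n=10, alpha=0.05, n_null=1000, n_alt=1000, seed=0):
    """Estimate the two-sided power of `stat` against a mean shift.

    Assumption: both groups are normal with unit variance. The critical
    value is taken from the simulated null distribution of |stat|, so
    candidates with unknown sampling distributions can be scored too.
    """
    rng = random.Random(seed)

    def draw(mu):
        return [rng.gauss(mu, 1.0) for _ in range(n)]

    # Null distribution: both samples come from the same population.
    null = sorted(abs(stat(draw(0.0), draw(0.0))) for _ in range(n_null))
    crit = null[min(n_null - 1, round((1 - alpha) * n_null))]

    # Alternative: the first sample's mean is shifted by `shift`.
    rejections = sum(abs(stat(draw(shift), draw(0.0))) > crit for _ in range(n_alt))
    return rejections / n_alt


if __name__ == "__main__":
    for name, stat in [("pooled t", pooled_t), ("mean difference", mean_diff)]:
        print(name, estimate_power(stat, shift=1.0))
```

Calibrating the rejection threshold by simulation rather than from a t table is a deliberate choice in this sketch: an evolved expression generally has no closed-form null distribution, so an empirical threshold puts every candidate on the same footing.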
Keywords: Genetic programming, Statistics, Optimization, t test
This work was supported by the National Institutes of Health (USA) Grants LM012601, AI116794, and DK112217. We would like to thank the reviewers for their thoughtful suggestions.