Machine Learning, Volume 52, Issue 1–2, pp 11–30

Relation Between Permutation-Test P Values and Classifier Error Estimates

  • Tailen Hsing
  • Sanju Attoor
  • Edward Dougherty


Gene-expression-based classifiers suffer from the small number of microarrays usually available for classifier design. Hence, one is confronted with the dual problem of designing a classifier and estimating its error with only a small sample. Permutation testing has been recommended to assess the dependency of a designed classifier on the specific data set. This involves randomly permuting the labels of the data points, estimating the error of the designed classifiers for each permutation, and then finding the p value of the error for the actual labeling relative to the population of errors for the random labelings. This paper addresses the issue of whether or not this p value is informative. It provides both analytic and simulation results to show that the permutation p value is, up to very small deviation, a function of the error estimate. Moreover, even though the p value is a monotonically increasing function of the error estimate, in the range of the error where the majority of the p values lie, the function is very slowly increasing, so that inversion is problematic. Hence, the conclusion is that the p value is less informative than the error estimate. This result demonstrates that random labeling does not provide any further insight into the accuracy of the classifier or the precision of the error estimate. We have no knowledge beyond the error estimate itself and the various distribution-free, classifier-specific bounds developed for this estimate.
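The permutation procedure described above (permute the labels, re-estimate the error for each permutation, then locate the actual error within the population of permuted errors) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid classifier, the leave-one-out error estimator, the add-one correction in the p value, the toy data, and all function names are choices made here for the example and do not come from the paper.

```python
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_error(points, labels):
    """Leave-one-out error estimate of a nearest-centroid classifier.

    Assumption for this sketch: binary labels 0/1 and a nearest-centroid
    rule; the paper itself is not tied to this classifier or estimator.
    """
    errors = 0
    for i, (x, y) in enumerate(zip(points, labels)):
        class0 = [p for j, (p, l) in enumerate(zip(points, labels))
                  if j != i and l == 0]
        class1 = [p for j, (p, l) in enumerate(zip(points, labels))
                  if j != i and l == 1]
        if not class0 or not class1:
            # Degenerate fold: only one class remains, so predict it.
            pred = 0 if class0 else 1
        else:
            d0 = sq_dist(x, centroid(class0))
            d1 = sq_dist(x, centroid(class1))
            pred = 0 if d0 <= d1 else 1
        errors += pred != y
    return errors / len(points)


def permutation_p_value(points, labels, n_perm=200, seed=0):
    """Return (error for the actual labeling, permutation p value)."""
    rng = random.Random(seed)
    observed = loo_error(points, labels)
    shuffled = list(labels)
    as_good = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random relabeling, class sizes preserved
        if loo_error(points, shuffled) <= observed:
            as_good += 1
    # Add-one correction (a common convention, assumed here) so that
    # the estimated p value is never exactly zero.
    return observed, (as_good + 1) / (n_perm + 1)


# Toy data (hypothetical): two well-separated groups of 10 points each.
points = [[0.0, 0.1 * i] for i in range(10)] + [[10.0, 0.1 * i] for i in range(10)]
labels = [0] * 10 + [1] * 10
err, p = permutation_p_value(points, labels)
print("error estimate:", err, "permutation p value:", p)
```

In this sketch the p value is computed as the fraction of permuted-label errors that do not exceed the actual error, so for a fixed population of permuted errors it is a non-decreasing step function of the error estimate. This is in line with the abstract's point that the p value is essentially determined by the error estimate.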

Keywords: classification, error estimation, genomics, microarrays, p value, pattern recognition



Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Tailen Hsing (1)
  • Sanju Attoor (2)
  • Edward Dougherty (2, 3)

  1. Department of Statistics, Texas A&M University, USA
  2. Department of Electrical Engineering, Texas A&M University, USA
  3. Department of Pathology, University of Texas M.D. Anderson Cancer Center, USA
