The Effect of Instance-Space Partition on Significance
Abstract
This paper demonstrates experimentally that concluding which induction algorithm is more accurate based on the results from a single partition of the instances into cross-validation folds may lead to statistically erroneous conclusions. Comparing two decision-tree induction algorithms and one naive-Bayes induction algorithm, we find situations in which one algorithm is judged more accurate at the p = 0.05 level under one partition of the training instances, yet the other algorithm is judged more accurate at the p = 0.05 level under an alternate partition. We recommend a new significance procedure that performs cross-validation using multiple instance-space partitions. Significance is determined by applying the paired Student t-test separately to the results from each cross-validation partition, averaging the resulting t-statistics, and converting this averaged value into a significance level.
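The recommended procedure can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes you already have per-fold accuracies for two learners from several independent cross-validation partitions, and it converts the averaged t-statistic to a significance decision by comparing against the tabled two-sided critical value (2.262 for 10-fold cross-validation, i.e. 9 degrees of freedom, at p = 0.05).

```python
import math
from statistics import mean, stdev

def paired_t(acc_a, acc_b):
    """Paired Student t-statistic over matched per-fold accuracies."""
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

def averaged_t(runs_a, runs_b):
    """Average the paired t-statistics, one per cross-validation partition."""
    return mean(paired_t(a, b) for a, b in zip(runs_a, runs_b))

# Hypothetical per-fold accuracies from two partitions of a 4-fold CV.
runs_a = [[0.81, 0.84, 0.79, 0.83], [0.82, 0.80, 0.85, 0.81]]
runs_b = [[0.78, 0.80, 0.77, 0.79], [0.79, 0.81, 0.80, 0.78]]

t_avg = averaged_t(runs_a, runs_b)
T_CRIT = 2.262  # two-sided critical value, df = 9, p = 0.05
significant = abs(t_avg) > T_CRIT
```

A fuller version would map the averaged t-statistic through the t-distribution's CDF to obtain an exact p-value; the fixed critical value is used here only to keep the sketch dependency-free.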