Weighted Classification Error Rate Estimator for the Euclidean Distance Classifier
Error-counting estimators are among the best-known and most widely used error estimation techniques. Perhaps the best-known subcategory of error-counting estimators is the family of k-fold cross-validation methods. Like most other error estimation techniques, cross-validation methods are biased. One way to correct this bias is to use a weighted average of the cross-validation and resubstitution estimators. In this paper we propose a new weighted error-counting classification error rate estimator designed specifically for the Euclidean distance classifier. Experiments with real-world and synthetic data sets show that the proposed weighted error rate estimator outperforms resubstitution, repeated 2-fold cross-validation, leave-one-out, the basic bootstrap and the D-method in terms of root-mean-square error.
Keywords: Error estimation, Classification, Resubstitution, Cross-validation, Bootstrap
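The general idea of combining cross-validation and resubstitution by a weighted average can be sketched as follows. This is only an illustration under stated assumptions, not the estimator proposed in the paper: the abstract does not give the weighting scheme, so the fixed weight `w` and all function names here (`fit_edc`, `predict_edc`, `resubstitution_error`, `repeated_kfold_error`, `weighted_error`) are hypothetical. The Euclidean distance classifier is taken in its usual form, assigning each sample to the class with the nearest mean vector.

```python
import numpy as np


def fit_edc(X, y):
    """Fit a Euclidean distance classifier: one mean vector per class."""
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, means


def predict_edc(model, X):
    """Assign each sample to the class whose mean is nearest."""
    classes, means = model
    # Squared Euclidean distance from every sample to every class mean.
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[d2.argmin(axis=1)]


def resubstitution_error(X, y):
    """Error rate on the same data that was used for training."""
    return float(np.mean(predict_edc(fit_edc(X, y), X) != y))


def repeated_kfold_error(X, y, k=2, repeats=10, seed=0):
    """Error rate averaged over `repeats` random k-fold partitions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = 0
    for _ in range(repeats):
        folds = np.array_split(rng.permutation(n), k)
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False
            model = fit_edc(X[train], y[train])
            errors += int(np.sum(predict_edc(model, X[test_idx]) != y[test_idx]))
    # Every sample is tested exactly once per repetition.
    return errors / (repeats * n)


def weighted_error(X, y, w=0.5, **cv_kwargs):
    """Weighted average of the two estimators.

    ASSUMPTION: a fixed, user-chosen weight `w` in [0, 1]; the paper's
    actual weights are not stated in the abstract.
    """
    cv = repeated_kfold_error(X, y, **cv_kwargs)
    resub = resubstitution_error(X, y)
    return w * cv + (1.0 - w) * resub
```

With `w=0` the function reduces to resubstitution, which tends to be optimistically biased, and with `w=1` it reduces to repeated k-fold cross-validation, which tends to be pessimistically biased; intermediate weights trade one bias against the other.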