Abstract
Error-counting estimators are among the best-known and most widely used error estimation techniques. Perhaps the best-known subcategory of error-counting estimators is the family of k-fold cross-validation methods. Like most other error estimation techniques, cross-validation methods are biased. One way to correct this bias is to use a weighted average of cross-validation and resubstitution estimators. In this paper we propose a new weighted error-counting classification error rate estimator designed specifically for the Euclidean distance classifier. Experiments with real-world and synthetic data sets show that the proposed weighted error rate estimator outperforms resubstitution, repeated 2-fold cross-validation, leave-one-out, basic bootstrap and the D-method in terms of root-mean-square error.
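To make the idea of a weighted error-counting estimator concrete, the following is a minimal sketch, not the estimator proposed in the paper. It assumes the Euclidean distance classifier is the nearest-class-mean rule, and it combines the resubstitution and k-fold cross-validation error counts with a fixed weight `w`. The function names and the default `w = 0.5` are hypothetical choices made for illustration; the abstract does not state how the paper selects its weights.

```python
import random

def class_means(X, y):
    """Per-class mean vectors: the only parameters of the nearest-mean rule."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(means, x):
    """Assign x to the class whose mean is nearest in squared Euclidean distance."""
    return min(means, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, means[c])))

def error_rate(means, X, y):
    """Fraction of points in (X, y) that the fitted rule misclassifies."""
    wrong = sum(predict(means, x) != label for x, label in zip(X, y))
    return wrong / len(X)

def resubstitution_error(X, y):
    """Train and test on the same sample (typically optimistically biased)."""
    return error_rate(class_means(X, y), X, y)

def kfold_cv_error(X, y, k=2, seed=0):
    """Error-counting k-fold cross-validation over one random partition."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    wrong = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        means = class_means([X[i] for i in train], [y[i] for i in train])
        wrong += sum(predict(means, X[i]) != y[i] for i in fold)
    return wrong / len(X)

def weighted_error(X, y, w=0.5, k=2, seed=0):
    """Convex combination of the two estimates.

    w is a hypothetical fixed weight used for illustration only;
    it is not the weighting derived in the paper.
    """
    resub = resubstitution_error(X, y)
    cv = kfold_cv_error(X, y, k, seed)
    return w * resub + (1.0 - w) * cv

# Toy usage on two well-separated clusters (illustrative data only).
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
estimate = weighted_error(X, y, w=0.5, k=2)
```

Because resubstitution tends to underestimate the true error and cross-validation tends to overestimate it, any weight between 0 and 1 yields an estimate between the two. How that weight should be chosen for the Euclidean distance classifier is the subject of the paper and is not reproduced here.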
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Gvardinskas, M. (2015). Weighted Classification Error Rate Estimator for the Euclidean Distance Classifier. In: Dregvaite, G., Damasevicius, R. (eds) Information and Software Technologies. ICIST 2015. Communications in Computer and Information Science, vol 538. Springer, Cham. https://doi.org/10.1007/978-3-319-24770-0_30
DOI: https://doi.org/10.1007/978-3-319-24770-0_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24769-4
Online ISBN: 978-3-319-24770-0
eBook Packages: Computer Science, Computer Science (R0)