Abstract
One typically expects classifiers to improve in performance with increasing training set size, or at least to attain their best performance when an infinite number of training samples is at one's disposal. We demonstrate, however, that there are classification problems on which particular classifiers attain their optimum performance at a finite training set size. Whether or not this phenomenon, which we term dipping, can be observed depends on the choice of classifier in relation to the underlying class distributions. We give some simple examples, for a few classifiers, that illustrate how the dipping phenomenon can occur. Additionally, we speculate about what is generally needed for dipping to emerge. What is clear is that this kind of learning curve behavior does not emerge by mere chance and that the pattern recognition practitioner ought to take note of it.
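To make the phenomenon concrete, the minimal sketch below estimates a learning curve by Monte Carlo for a least-squares linear classifier on a one-dimensional two-class problem. The mixture is our own illustrative construction, not one of the paper's examples: class −1 hides a small, far-away mode. With few training samples that mode is rarely observed, so the fitted threshold separates the main modes well; as the sample grows, the far mode is always present and drags the asymptotic least-squares solution to a much worse boundary, so the mean test error rises with n.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Draw n labeled points: class +1 is N(0, 0.1^2); class -1 is a
    mixture of N(-1, 0.1^2) (w.p. 0.9) and a rare far mode N(30, 0.1^2)."""
    y = rng.choice([-1.0, 1.0], size=n)
    x = np.where(
        y > 0,
        rng.normal(0.0, 0.1, n),
        np.where(rng.random(n) < 0.9,
                 rng.normal(-1.0, 0.1, n),
                 rng.normal(30.0, 0.1, n)),
    )
    return x, y

def fit_predict(x_tr, y_tr, x_te):
    """Least-squares linear classifier: fit y ~ a*x + b, classify by sign."""
    a, b = np.polyfit(x_tr, y_tr, 1)
    return np.sign(a * x_te + b)

x_te, y_te = sample(100_000)  # large test set approximates the true error
for n in [4, 8, 16, 32, 64, 128, 512, 2048]:
    errs = [np.mean(fit_predict(*sample(n), x_te) != y_te)
            for _ in range(200)]  # average the error over 200 training sets
    print(f"n={n:5d}  mean test error = {np.mean(errs):.3f}")
```

Under this construction the curve typically bottoms out at small n (roughly 0.1–0.15 error) and climbs toward an asymptotic error of about 0.45, since the population least-squares fit orients its threshold to accommodate the far mode at the expense of the main mass of class −1: a dipping learning curve.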
Keywords
- Learning Curve
- Linear Discriminant Analysis
- Decision Boundary
- Statistical Pattern Recognition
- Training Sample Size
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Loog, M., Duin, R.P.W. (2012). The Dipping Phenomenon. In: Gimel’farb, G., et al. Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2012. Lecture Notes in Computer Science, vol. 7626. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34166-3_34
DOI: https://doi.org/10.1007/978-3-642-34166-3_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34165-6
Online ISBN: 978-3-642-34166-3