Summary
This chapter characterizes data complexity within the framework of Vapnik's statistical learning theory and Karl Popper's philosophy of science, whose ideas can be readily interpreted in the context of learning inductive models from finite data. We approach data complexity in the setting of predictive learning, where this notion is directly tied to the flexibility of the set of possible models used to describe the available data; any characterization of data complexity is therefore related to model complexity control. Recent learning methods, such as support vector machines (SVMs, also known as kernel methods), introduced the concept of margin to control model complexity. This chapter characterizes data complexity for such margin-based methods. We give a general philosophical motivation for margin-based estimators by interpreting the margin as the degree of a model's falsifiability. This leads to a better understanding of two distinct approaches to controlling model complexity: margin-based, where complexity is controlled by the size of the margin (or an adaptive empirical loss function), and model-based, where complexity is controlled by the parameterization of admissible models. We describe SVM methods that combine margin-based and model-based complexity control, and demonstrate the effectiveness of the SVM strategy through empirical comparisons on synthetic data sets; these comparisons also clarify the difference between SVM methods and regularization methods. Finally, we introduce a new index of data complexity for margin-based classifiers, which measures the degree of separation between the two classes achieved by margin-based methods such as SVMs. Data sets with a high degree of separation (and hence good generalization) are characterized as simple, as opposed to complex data sets with heavily overlapping class distributions.
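The central mechanism described above, margin-based complexity control, can be sketched in a few lines. The following toy example is illustrative only and is not the chapter's actual experimental setup: a linear soft-margin SVM trained by hinge-loss subgradient descent on a small hand-made two-class data set, with a crude margin-violation count standing in for the chapter's data-complexity index. All function names, the parameter values, and the data set here are our own assumptions.

```python
def train_linear_svm(data, C=1.0, lr=0.01, epochs=200):
    """data: list of (x1, x2, y) tuples with labels y in {-1, +1}."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x1, x2, y in data:
            # Soft-margin objective: 0.5*||w||^2 + C * sum of hinge losses.
            if y * (w[0] * x1 + w[1] * x2 + b) < 1:
                # Point violates the margin: hinge subgradient plus shrinkage.
                w[0] += lr * (C * y * x1 - w[0])
                w[1] += lr * (C * y * x2 - w[1])
                b += lr * C * y
            else:
                # Point is outside the margin: only the shrinkage term acts.
                w[0] -= lr * w[0]
                w[1] -= lr * w[1]
    return w, b

def margin_violations(data, w, b):
    """Points inside the margin band: a crude separation indicator.
    Few violations suggest a 'simple' (well-separated) data set."""
    return sum(1 for x1, x2, y in data
               if y * (w[0] * x1 + w[1] * x2 + b) < 1)

# Two well-separated clusters: 'simple' data in the chapter's terminology.
simple = [(-2.0, -2.0, -1), (-2.5, -1.5, -1), (-1.5, -2.5, -1),
          (2.0, 2.0, 1), (2.5, 1.5, 1), (1.5, 2.5, 1)]
w, b = train_linear_svm(simple, C=10.0)
train_errors = sum(1 for x1, x2, y in simple
                   if y * (w[0] * x1 + w[1] * x2 + b) <= 0)
print(train_errors, margin_violations(simple, w, b))
```

With a large C the optimizer emphasizes fitting the training data (narrow margin, few violations); shrinking C widens the margin at the cost of more margin violations. That trade-off is the margin-based route to complexity control discussed in the chapter, as opposed to the model-based route of restricting the parameterization of admissible models.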
© 2006 Springer Verlag London Limited
Cite this chapter
Cherkassky, V., Ma, Y. (2006). Data Complexity, Margin-Based Learning, and Popper’s Philosophy of Inductive Learning. In: Basu, M., Ho, T.K. (eds) Data Complexity in Pattern Recognition. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84628-172-3_5
DOI: https://doi.org/10.1007/978-1-84628-172-3_5
Publisher Name: Springer, London
Print ISBN: 978-1-84628-171-6
Online ISBN: 978-1-84628-172-3