Support vector machines (SVMs) are among the most successful algorithms on small and medium-sized data sets, but on large-scale data sets their training and prediction become computationally infeasible. The author considers a spatially defined data chunking method for large-scale learning problems, leading to so-called localized SVMs, and carries out an in-depth mathematical analysis with theoretical guarantees, which in particular include classification rates. The statistical analysis relies on a new and simple partitioning-based technique and takes into account well-known margin conditions that describe the behavior of the data-generating distribution. It turns out that, under suitable sets of assumptions, the resulting rates outperform known rates of several other learning algorithms. From a practical point of view, the author shows that a common training and validation procedure achieves the theoretical rates adaptively, that is, without knowing the margin parameters in advance.
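The spatial chunking idea can be illustrated with a minimal sketch. It is not the author's implementation: the cubic grid, the function names (`cell_of`, `fit_local`, `predict_local`, `majority_vote`), and the toy data are all assumptions made for illustration. The local learner used here is a simple per-cell majority vote, which corresponds to the classical histogram rule. A localized SVM would instead fit an SVM on each cell's data and plug it in through the `fit_cell` argument.

```python
from collections import Counter, defaultdict
from math import floor


def cell_of(x, width):
    """Index of the axis-aligned cube of side `width` that contains x."""
    return tuple(floor(xi / width) for xi in x)


def fit_local(points, labels, width, fit_cell):
    """Split the data by cell and fit one local model per non-empty cell."""
    chunks = defaultdict(list)
    for x, y in zip(points, labels):
        chunks[cell_of(x, width)].append((x, y))
    return {cell: fit_cell(chunk) for cell, chunk in chunks.items()}


def majority_vote(chunk):
    """Stand-in local learner (assumption): predict the cell's most common label."""
    label = Counter(y for _, y in chunk).most_common(1)[0][0]
    return lambda x: label


def predict_local(models, x, width, default=1):
    """Route x to the model of its cell; use `default` if the cell had no data."""
    model = models.get(cell_of(x, width))
    return model(x) if model is not None else default


# Hypothetical toy data in the unit square with labels in {-1, +1}.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (0.8, 0.9), (0.7, 0.6), (0.9, 0.7)]
labels = [-1, -1, 1, 1, 1, -1]
models = fit_local(points, labels, width=0.5, fit_cell=majority_vote)
```

Each local model only ever sees the samples that fall into its own cell, so the per-model training cost depends on the cell's sample count rather than on the size of the full data set.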
- Introduction to Statistical Learning Theory
- Histogram Rule: Oracle Inequality and Learning Rates
- Localized SVMs: Oracle Inequalities and Learning Rates
Researchers, students, and practitioners in the fields of mathematics and computer science who focus on machine learning or statistical learning theory
Ingrid Karin Blaschzyk is a postdoctoral researcher in the Department of Mathematics at the University of Stuttgart, Germany.