How to Compare Different Loss Functions and Their Risks
- First Online:
- Cite this article as:
- Steinwart, I. Constr Approx (2007) 26: 225. doi:10.1007/s00365-006-0662-3
- 277 Downloads
Many learning problems are described by a risk functional which in turn is defined by a loss function, and a straightforward and widely known approach to learn such problems is to minimize a (modified) empirical version of this risk functional. However, in many cases this approach suffers from substantial problems such as computational requirements in classification or robustness concerns in regression. In order to resolve these issues many successful learning algorithms try to minimize a (modified) empirical risk of a surrogate loss function, instead. Of course, such a surrogate loss must be "reasonably related" to the original loss function since otherwise this approach cannot work well. For classification good surrogate loss functions have been recently identified, and the relationship between the excess classification risk and the excess risk of these surrogate loss functions has been exactly described. However, beyond the classification problem little is known on good surrogate loss functions up to now. In this work we establish a general theory that provides powerful tools for comparing excess risks of different loss functions. We then apply this theory to several learning problems including (cost-sensitive) classification, regression, density estimation, and density level detection.