Learning the set covering machine by bound minimization and margin-sparsity trade-off
- First Online:
- Cite this article as:
- Laviolette, F., Marchand, M., Shah, M. et al. Mach Learn (2010) 78: 175. doi:10.1007/s10994-009-5137-3
- 337 Downloads
We investigate classifiers in the sample compression framework that can be specified by two distinct sources of information: a compression set and a message string of additional information. In the compression setting, a reconstruction function specifies a classifier when given this information. We examine how an efficient redistribution of this reconstruction information can lead to more general classifiers. In particular, we derive risk bounds that can provide an explicit control over the sparsity of the classifier and the magnitude of its separating margin and a capability to perform a margin-sparsity trade-off in favor of better classifiers. We show how an application to the set covering machine algorithm results in novel learning strategies. We also show that these risk bounds are tighter than their traditional counterparts such as VC-dimension and Rademacher complexity-based bounds that explicitly take into account the hypothesis class complexity. Finally, we show how these bounds are able to guide the model selection for the set covering machine algorithm enabling it to learn by bound minimization.