The Group-Lasso: ℓ1,∞ Regularization versus ℓ1,2 Regularization
The ℓ1,∞ norm and the ℓ1,2 norm are well-known tools for joint regularization in Group-Lasso methods. While the ℓ1,2 version has been studied in detail, there are still open questions regarding the uniqueness of solutions and the efficiency of algorithms for the ℓ1,∞ variant. For the latter, we characterize the conditions for uniqueness of solutions, present a simple test for uniqueness, and derive a highly efficient active set algorithm that can deal with input dimensions in the millions. We compare both variants in the two most common application scenarios of the Group-Lasso: the first is to obtain sparsity at the level of groups in “standard” prediction problems; the second is multi-task learning, where the aim is to solve many learning problems in parallel that are coupled via the Group-Lasso constraint. We show that both versions perform quite similarly in “standard” applications. However, a very clear distinction between the variants occurs in multi-task settings, where the ℓ1,2 version consistently outperforms its ℓ1,∞ counterpart in terms of prediction accuracy.
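To make the two penalties concrete, the following minimal sketch evaluates both mixed norms on a small coefficient vector. It assumes the common convention that the outer ℓ1 part sums over groups while the inner part is the ℓ2 or ℓ∞ norm of each group's coefficients. The function name `mixed_norm`, the `groups` index lists, and the example values are hypothetical illustrations and are not taken from the paper.

```python
import math

def mixed_norm(beta, groups, inner="l2"):
    """Sum over groups of the inner norm of each group's coefficients.

    inner="l2"   gives the l1,2 penalty   (sum of Euclidean norms).
    inner="linf" gives the l1,inf penalty (sum of maximum absolute values).
    """
    total = 0.0
    for idx in groups:
        block = [beta[i] for i in idx]
        if inner == "l2":
            total += math.sqrt(sum(b * b for b in block))
        elif inner == "linf":
            total += max(abs(b) for b in block)
        else:
            raise ValueError("inner must be 'l2' or 'linf'")
    return total

# Hypothetical coefficients split into three groups; the second group is
# entirely zero, which is the group-level sparsity both penalties encourage.
beta = [3.0, 4.0, 0.0, 0.0, -2.0]
groups = [[0, 1], [2, 3], [4]]

l12 = mixed_norm(beta, groups, inner="l2")      # 5 + 0 + 2
l1inf = mixed_norm(beta, groups, inner="linf")  # 4 + 0 + 2
print(l12, l1inf)
```

In this sketch a group contributes nothing to either penalty only when all of its coefficients are zero. The two penalties differ in how they weigh the nonzero entries within an active group: the ℓ∞ inner norm depends only on the largest magnitude, while the ℓ2 inner norm depends on every entry.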