Abstract
The so-called ℓ0 pseudonorm, or cardinality function, counts the number of nonzero components of a vector. In this paper, we analyze the ℓ0 pseudonorm by means of so-called Capra (constant along primal rays) conjugacies, for which the underlying source norm and its dual norm are both orthant-strictly monotonic (a notion that we formally introduce and that encompasses the ℓp-norms, but for the extreme ones). We obtain three main results. First, we show that the ℓ0 pseudonorm is equal to its Capra-biconjugate, that is, is a Capra-convex function. Second, we deduce an unexpected consequence, that we call convex factorization: the ℓ0 pseudonorm coincides, on the unit sphere of the source norm, with a proper convex lower semicontinuous function. Third, we establish a variational formulation for the ℓ0 pseudonorm by means of generalized top-k dual norms and k-support dual norms (that we formally introduce).
References
Akian, M., Gaubert, S., Kolokoltsov, V.: Invertibility of functional Galois connections. Comptes Rendus Mathematique 335(11), 883–888 (2002)
Argyriou, A., Foygel, R., Srebro, N.: Sparse Prediction with the K-Support Norm. In: Proceedings of the 25th International Conference on Neural Information Processing Systems - vol. 1, NIPS’12, pp 1457–1465. Curran Associates Inc, USA (2012)
Bhatia, R.: Matrix Analysis. Springer, New York (1997)
Chancelier, J.-P., De Lara, M.: Constant along primal rays conjugacies and the ℓ0 pseudonorm. Optimization 0(0), 1–32 (2020)
Chancelier, J.-P., De Lara, M.: Hidden convexity in the ℓ0 pseudonorm. J. Convex Anal. 28(1), 203–236 (2021)
Fan, Z., Jeong, H., Sun, Y., Friedlander, M.P.: Atomic decomposition via polar alignment. Foundations and Trends in Optimization 3(4), 280–366 (2020)
Gries, D.: Characterization of certain classes of norms. Numer. Math. 10, 30–41 (1967)
Gries, D., Stoer, J.: Some results on fields of values of a matrix. SIAM J. Numer. Anal. 4(2), 283–300 (1967)
Hiriart-Urruty, J.-B., Le, H.: A variational approach of the rank function. TOP: An Official Journal of the Spanish Society of Statistics and Operations Research 21(2), 207–240 (2013)
Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms, vol. I. Springer, Berlin (1993)
Marques de Sá, E., Sodupe, M.-J.: Characterizations of orthant-monotonic norms. Linear Algebra Appl. 193, 1–9 (1993)
Martínez-Legaz, J.E.: Generalized convex duality and its economic applications. In: Hadjisavvas, N., Komlósi, S., Schaible, S. (eds.) Handbook of Generalized Convexity and Generalized Monotonicity. Nonconvex Optimization and Its Applications, vol. 76, pp 237–292. Springer (2005)
McDonald, A.M., Pontil, M., Stamos, D.: New perspectives on k-support and cluster norms. J. Mach. Learn. Res. 17(155), 1–38 (2016)
Mirsky, L.: Symmetric gauge functions and unitarily invariant norms. Q. J. Math. 11(1), 50–59 (1960)
Moreau, J.J.: Inf-convolution, sous-additivité, convexité des fonctions numériques. J. Math. Pures Appl. (9) 49, 109–154 (1970)
Nikolova, M.: Relationship between the optimal solutions of least squares regularized with l0-norm and constrained by k-sparsity. Appl. Comput. Harmon. Anal. 41(1), 237–265 (2016)
Obozinski, G., Bach, F.: A unified perspective on convex structured sparsity: Hierarchical, symmetric, submodular norms and beyond. Preprint (2016)
Rockafellar, R.T.: Conjugate Duality and Optimization. CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics, Philadelphia (1974)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Rubinov, A.: Abstract Convexity and Global Optimization. Nonconvex Optimization and Its Applications, vol. 44. Kluwer Academic Publishers, Dordrecht (2000)
Singer, I.: Abstract Convex Analysis. Canadian Mathematical Society Series of Monographs and Advanced Texts. Wiley, New York (1997)
Tono, K., Takeda, A., Gotoh, J.-Y.: Efficient DC algorithm for constrained sparse optimization. Preprint (2017)
Acknowledgements
We thank Guillaume Obozinski for discussions on first versions of this work, and Jean-Baptiste Hiriart-Urruty for his comments (and for proposing the term “convex factorization”). We are indebted to two Reviewers and to the Editor who, by their questions and comments, helped us improve the manuscript.
Appendices
Appendix A: Properties of relevant norms for the ℓ0 pseudonorm
We provide background on properties of norms that prove relevant for the ℓ0 pseudonorm. In Appendix A.1, we review notions related to dual norms. We establish properties of orthant-monotonic and orthant-strictly monotonic norms in Appendix A.2, and of coordinate-k and dual coordinate-k norms in Appendix A.3.
A.1 Dual Norm, ∥ | ⋅ ∥ |-Duality, Normal Cone
For any norm ∥ | ⋅ ∥ | on \(\mathbb {R}^{d}\), we recall that the following expression
$$ \|\!|\! y \!\|\!|_{\star} = \sup_{\|\!|\! x \!\|\!| \leq 1} \langle x , y \rangle , \quad \forall y \in \mathbb{R}^{d} , $$(27)
defines a norm on \(\mathbb {R}^{d}\), called the dual norm ∥ | ⋅ ∥ |⋆ (in [20, Section 15], this operation is widened to a polarity operation between closed gauges). By definition of the dual norm in (27), we have the inequality
$$ \langle x , y \rangle \leq \|\!|\! x \!\|\!| \times \|\!|\! y \!\|\!|_{\star} , \quad \forall x , y \in \mathbb{R}^{d} . $$(28a)
We are interested in the case where this inequality is an equality. One says that \( y \in \mathbb {R}^{d} \) is ∥ | ⋅ ∥ |-dual to \( x \in \mathbb {R}^{d} \), denoted by y ∥∥ | ⋅ ∥ |x, if equality holds in inequality (28a), that is,
$$ \langle x , y \rangle = \|\!|\! x \!\|\!| \times \|\!|\! y \!\|\!|_{\star} . $$(28b)
The terminology ∥ | ⋅ ∥ |-dual comes from [11, page 2] (see also the vocable of dual vector pair in [7, Equation (1.11)] and of dual vectors in [8, p. 283], whereas it is referred to as polar alignment in [6]). It will be convenient to express this notion of ∥ | ⋅ ∥ |-duality in terms of geometric objects of convex analysis. For this purpose, we recall that the normal cone NC(x) to the (nonempty) closed convex subset \({C} \subset \mathbb {R}^{d} \) at x ∈C is the closed convex cone defined by [10, p.136]
$$ N_{C}({x}) = \left\{ y \in \mathbb{R}^{d} \mid \langle y , x^{\prime} - x \rangle \leq 0 , \ \forall x^{\prime} \in C \right\} . $$
Now, an easy computation shows that the notion of ∥ | ⋅ ∥ |-duality can be rewritten in terms of the normal cone \( N_{\mathbb {B} } \) to the unit ball \( \mathbb {B} \) of the norm ∥ | ⋅ ∥ | as follows:
$$ y \text{ is } \|\!|\! \cdot \!\|\!| \text{-dual to } x \iff y \in N_{\mathbb{B}}\left( \frac{x}{\|\!|\! x \!\|\!|} \right) , \quad \forall x \in \mathbb{R}^{d}\backslash\{0\} . $$
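As a purely numerical illustration (our own sketch, not part of the paper's development), we take the ℓ1 norm as source norm, whose dual norm is the ℓ∞ norm; the fragment below checks the inequality (28a) and builds, from the signs of x, a vector that is ∥ | ⋅ ∥ |-dual to x:

```python
# Numerical sketch (ours): l1 source norm, l-infinity dual norm.
import numpy as np

def dual_norm_l1(y):
    # Dual of the l1 norm: sup_{||x||_1 <= 1} <x, y> = max_i |y_i|.
    return np.max(np.abs(y))

x = np.array([3.0, -1.0, 0.0, 2.0])
y = np.array([0.5, -0.25, 0.1, 0.4])

# Inequality (28a): <x, y> <= ||x||_1 * ||y||_inf.
lhs = float(x @ y)
rhs = np.sum(np.abs(x)) * dual_norm_l1(y)
assert lhs <= rhs + 1e-12

# A vector dual to x (equality in (28a)): here y_dual = sign(x),
# a choice specific to the l1/l-infinity pair.
y_dual = np.sign(x)
assert np.isclose(x @ y_dual, np.sum(np.abs(x)) * dual_norm_l1(y_dual))
```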
A.2 Properties of Orthant-Strictly Monotonic Norms
We provide useful properties of orthant-monotonic and orthant-strictly monotonic norms (see Definition 4). We recall that \( x_{K} \in {\mathcal {R}}_{K} \) denotes the vector that coincides with x on the components indexed by K and has vanishing components outside of K, and that the subspace \( {\mathcal {R}}_{K} \) of \( \mathbb {R}^{d} \) has been defined in (7).
Proposition 5
Let ∥ | ⋅ ∥ | be an orthant-monotonic norm on \(\mathbb {R}^{d}\). Then, the dual norm ∥ | ⋅ ∥ |⋆ is orthant-monotonic, and the norm ∥ | ⋅ ∥ | is increasing with the coordinate subspaces, in the sense that, for any \( x \in \mathbb {R}^{d} \) and any J ⊂K ⊂⟦1,d⟧, we have ∥ | xJ ∥ |≤∥ | xK ∥ |.
Proof
Let ∥ | ⋅ ∥ | be an orthant-monotonic norm on \(\mathbb {R}^{d}\). Then, by [7, Theorem 2.23], the dual norm ∥ | ⋅ ∥ |⋆ is also orthant-monotonic and, by [11, Proposition 2.4], we have that ∥ | u ∥ |≤∥ | u + v ∥ |, for any subset J ⊂⟦1,d⟧ and for any vectors \( u \in {\mathcal {R}}_{J} \) and \( v \in {\mathcal {R}}_{-J} \) (following notation from game theory, we have denoted by −J the complementary subset of J in ⟦1,d⟧, that is, J ∪(−J) = ⟦1,d⟧ and J ∩(−J) = ∅). We consider \( x \in \mathbb {R}^{d} \) and J ⊂K ⊂⟦1,d⟧. By setting \( u=x_{J} \in {\mathcal {R}}_{J} \) and v = xK −xJ, we get that \( v \in {\mathcal {R}}_{-J} \), hence that ∥ | xJ ∥ |= ∥ | u ∥ |≤∥ | u + v ∥ |= ∥ | xK ∥ |. □
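The monotonicity along coordinate subspaces in Proposition 5 can be checked numerically for the ℓp norms, which are all orthant-monotonic (a sketch of ours, not from the paper):

```python
# Numerical sketch (ours): for any lp norm, |||x_J||| <= |||x_K|||
# whenever J is a subset of K.
import numpy as np

x = np.array([1.0, -4.0, 2.0, 0.5])
J = [0, 2]          # J is a subset of K
K = [0, 1, 2]

def restrict(x, S):
    # x_S coincides with x on S and vanishes outside of S.
    out = np.zeros_like(x)
    out[S] = x[S]
    return out

for p in (1, 2, np.inf):
    assert np.linalg.norm(restrict(x, J), p) <= np.linalg.norm(restrict(x, K), p)
```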
Proposition 6
Let ∥ | ⋅ ∥ | be an orthant-strictly monotonic norm on \(\mathbb {R}^{d}\). Then
-
(a)
the norm ∥ | ⋅ ∥ | is strictly increasing with the coordinate subspaces in the sense that, for any \( x \in \mathbb {R}^{d} \) and any \( J \subsetneq K \subset {\llbracket 1,d \rrbracket } \), we have xJ≠xK ⇒∥ | xJ ∥ |< ∥ | xK ∥ |.
-
(b)
for any vector \( u \in \mathbb {R}^{d}\setminus \{0\} \), there exists a vector \( v \in \mathbb {R}^{d}\setminus \{0\} \) such that supp(v) = supp(u), that u ∘ v ≥ 0, and that v is ∥ | ⋅ ∥ |-dual to u, that is, 〈u,v〉= ∥ | u ∥ |×∥ | v ∥ |⋆.
Proof
-
(a)
Let \( x \in \mathbb {R}^{d} \) and \( J \subsetneq K \subset {\llbracket 1,d \rrbracket } \) be such that xJ≠xK. We will show that ∥ | xK ∥ |> ∥ | xJ ∥ |.
For this purpose, we set u = xJ and v = xK −xJ. Thus, we get that \( u \in {\mathcal {R}}_{J} \) and \( v \in {\mathcal {R}}_{-J}\setminus \{0\} \) (since \( J \subsetneq K \) and xJ≠xK), that is, u = uJ and v = v−J≠ 0. We are going to show that ∥ | u + v ∥ |> ∥ | u ∥ |. On the one hand, by definition of the module of a vector, we easily see that |w|= |wJ|+ |w−J|, for any vector \( w \in \mathbb {R}^{d} \). Thus, we have \( {|u+v|} = {| u_{J}+v_{J} |} + {| u_{-J}+v_{-J} |} = {| u_{J}+0 |} + {| 0+v_{-J} |} = {| u_{J} |} + {|v_{-J} |} > {| u_{J} |} ={| u |} \), since |v−J|> 0 (as v = v−J≠ 0) and u = uJ. On the other hand, we easily get that \( \left ({u+v}\right )~\circ ~u = \left ({ u_{J}~\circ ~u_{J} }\right ) + \left ({ v_{-J}~\circ ~u_{-J} }\right ) = \left ({ u_{J}~\circ ~u_{J} }\right ) \), because u−J = 0. Therefore, we get that \( \left ({u+v}\right )~\circ ~u = \left ({ u_{J}~\circ ~u_{J} }\right ) \geq 0 \).
From |u + v|> |u| and \( \left ({u+v}\right )~\circ ~u \geq 0 \), we deduce that ∥ | u + v ∥ |> ∥ | u ∥ | by Definition 4, as the norm ∥ | ⋅ ∥ | is orthant-strictly monotonic. Since u + v = xK and u = xJ, we conclude that ∥ | xK ∥ |> ∥ | xJ ∥ |.
-
(b)
Let \( u \in \mathbb {R}^{d}\setminus \{0\} \) be given and let us put K = supp(u)≠∅. As the norm ∥ | ⋅ ∥ | is orthant-strictly monotonic, it is orthant-monotonic; hence, by [11, Proposition 2.4], there exists a vector \( v \in \mathbb {R}^{d}\setminus \{0\} \) such that supp(v) ⊂supp(u), that u ∘ v ≥ 0 and that v is ∥ | ⋅ ∥ |-dual to u, as in (28b), that is, 〈u,v〉= ∥ | u ∥ |×∥ | v ∥ |⋆. Thus J = supp(v) ⊂K = supp(u). We will now show that \( J \subsetneq K \) is impossible, hence that J = K, thus proving that Item (b) holds true with the above vector v.
Writing that 〈u,v〉= ∥ | u ∥ |×∥ | v ∥ |⋆ (using that u = uK and v = vK = vJ), we obtain
$$ \langle u_{K} , v \rangle = \langle u , v \rangle = \langle u_{J} , v \rangle = \|\!|\! u \!\|\!| \times \|\!|\! v \!\|\!|_{\star} $$
by obvious properties of the scalar product 〈⋅,⋅〉. As a consequence, we get that \( \{u_{K},u_{J} \} \subset \arg \max \limits _{\|\!|\! x \!\|\!| \leq \|\!|\! u \!\|\!| } \langle {x, v}\rangle \), by definition (27) of ∥ | v ∥ |⋆, because ∥ | u ∥ |= ∥ | uK ∥ |≥∥ | uJ ∥ |, by Proposition 5 since J ⊂K and the norm ∥ | ⋅ ∥ | is orthant-monotonic. But any solution in \( \arg \max \limits _{\|\!|\! x \!\|\!| \leq \|\!|\! u \!\|\!| } \langle {x, v}\rangle \) belongs to the frontier of the ball of radius ∥ | u ∥ |, hence has exactly norm ∥ | u ∥ |. Thus, we deduce that ∥ | u ∥ |= ∥ | uK ∥ |= ∥ | uJ ∥ |. If we had \( J = \text {supp}({v}) \subsetneq K = \text {supp}({u}) \), we would have uJ≠uK, hence ∥ | uK ∥ |> ∥ | uJ ∥ | by Item (a); this would be in contradiction with ∥ | uK ∥ |= ∥ | uJ ∥ |. Therefore, J = supp(v) = K = supp(u).
This ends the proof. □
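Item (b) of Proposition 6 can be illustrated numerically for the orthant-strictly monotonic ℓ1 norm, where a suitable dual vector is simply the sign vector (our own sketch; the choice v = sign(u) is specific to this example and is not the paper's general construction):

```python
# Numerical sketch (ours): l1 source norm. For v = sign(u) we have
# supp(v) = supp(u), u o v >= 0, and <u, v> = ||u||_1 * ||v||_inf.
import numpy as np

u = np.array([0.0, -2.0, 5.0, 0.0, 1.5])
v = np.sign(u)                                 # supp(v) = supp(u)

assert np.array_equal(v != 0, u != 0)          # equal supports
assert np.all(u * v >= 0)                      # Hadamard product u o v >= 0
l1 = np.sum(np.abs(u))                         # |||u||| for the l1 source norm
linf = np.max(np.abs(v))                       # dual (l-infinity) norm of v
assert np.isclose(u @ v, l1 * linf)            # v is |||.|||-dual to u
```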
A.3 Properties of Coordinate-k and Dual Coordinate-k Norms, and of Generalized top-k and k-Support Dual Norms
We establish useful properties of coordinate-k and dual coordinate-k norms (Definition 2), and of generalized top-k and k-support dual norms (Definition 3).
Proposition 7
Let ∥ | ⋅ ∥ | be a source norm on \(\mathbb {R}^{d}\).
Coordinate-k norms are greater than k-support dual norms, that is,
whereas dual coordinate-k norms are lower than generalized top-k dual norms, that is,
If the source norm ∥ | ⋅ ∥ | is orthant-monotonic, then equalities hold true, that is,
Proof
It is known that, for any nonempty subset K ⊂⟦1,d⟧, we have the inequality ∥ | ⋅ ∥ |K,⋆ ≤∥ | ⋅ ∥ |⋆,K (see [11, Proposition 2.2]). From the definition (10) of the generalized top-k dual norm, and the definition (8) of the dual coordinate-k norm, we get that \( \|\!|\! y \!\|\!|^{\mathcal {R}}_{(k), \star } = \sup _{{|K|} \leq k} \|\!|\! y_{K} \!\|\!|_{K,\star } \leq \sup _{{|K|} \leq k} \|\!|\! y_{K} \!\|\!|_{\star ,K} = \|\!|\! y \!\|\!|^{\text {tn}}_{\star , (k)} \), hence we obtain (31b). By taking the dual norms, we get (31a).
The norms for which the equality ∥ | ⋅ ∥ |K,⋆ = ∥ | ⋅ ∥ |⋆,K holds true for all nonempty subsets K ⊂⟦1,d⟧, are the orthant-monotonic norms ([7, Characterization 2.26], [11, Theorem 3.2]). Therefore, if the norm ∥ | ⋅ ∥ | is orthant-monotonic, from the definition (10) of the generalized top-k dual norm, we get that the inequality (31b) becomes an equality. Then, the inequality (31a) also becomes an equality by taking the dual norm as in (27). Thus, we have obtained (32).
This ends the proof. □
Proposition 8
Let ∥ | ⋅ ∥ | be a source norm on \(\mathbb {R}^{d}\). Let \( y\in \mathbb {R}^{d} \) and l ∈⟦1,d⟧. If the dual norm ∥ | ⋅ ∥ |⋆ is orthant-strictly monotonic, we have that
Proof
We consider \( y\in \mathbb {R}^{d} \). We put L = supp(y) and we suppose that ℓ0(y) = |L|= l.
Since the norm ∥ | ⋅ ∥ |⋆ is orthant-strictly monotonic, it is orthant-monotonic and so is ∥ | ⋅ ∥ | by Proposition 5. By (32) in Proposition 7, we get that \( \|\!|\! \cdot \!\|\!|^{\mathcal {R}}_{(j)} = \|\!|\! \cdot \!\|\!|^{\star \text {sn}}_{\star , (j)} \) and \( \|\!|\! \cdot \!\|\!|^{\mathcal {R}}_{(j), \star } = \|\!|\! \cdot \!\|\!|^{\text {tn}}_{\star , (j)} \), for j ∈⟦1,d⟧ (with the convention that these are the null seminorms in the case j = 0). Therefore, we can translate all the results, obtained in [4], with coordinate-k and dual coordinate-k norms, into results regarding generalized top-k and k-support dual norms. As an application, by [4, Equation (18)], we get, from ℓ0(y) = l, that
We now prove (33) in two steps.
We first show that \( \|\!|\! y \!\|\!|^{\text {tn}}_{\star , (l)} = {\cdots } = \|\!|\! y \!\|\!|^{\text {tn}}_{\star , (d)} =\|\!|\! y \!\|\!|_{\star } \) (the right hand side of (33)). Since y = yL, by definition of the set L = supp(y), we have that \( \|\!|\! y \!\|\!|_{\star } = \|\!|\! y_{L} \!\|\!|_{\star } \leq \sup _{{|K|} \leq l} \|\!|\! y_{K} \!\|\!|_{\star } = \|\!|\!y\!\|\!|^{\text {tn}}_{\star , (l)} \) by the very definition (10) of the generalized top-l dual norm \( \|\!|\! \cdot \!\|\!|^{\text {tn}}_{\star , (l)} \). By (34), we conclude that \( \|\!|\!y\!\|\!|^{\text {tn}}_{\star , (l)} = {\cdots } = \|\!|\!y\!\|\!|^{\text {tn}}_{\star , (d)} =\|\!|\! y \!\|\!|_{\star } \).
Second, we show that \( \|\!|\!y\!\|\!|^{\text {tn}}_{\star , (1)} < {\cdots } < \|\!|\!y\!\|\!|^{\text {tn}}_{\star , (l-1)} < \|\!|\!y\!\|\!|^{\text {tn}}_{\star , (l)} \) (the left hand side of (33)). There is nothing to show for l = 0. Now, we consider l ≥ 1 and k ∈⟦0,l − 1⟧. Since y = yL, we have yK = yK∩L for any K ⊂⟦1,d⟧, so that the sup in the definition (10) of \( \|\!|\! y \!\|\!|^{\text {tn}}_{\star , (k)} \) is attained at some subset K ⊂L with |K|≤k. For such a K, the set L ∖K is nonempty (having cardinality |L|−|K|= l −|K|≥k + 1 −|K|≥ 1), hence yK≠yK∪{j} for at least one j ∈L ∖K (since L = supp(y)); as the norm ∥ | ⋅ ∥ |⋆ is orthant-strictly monotonic, Item (a) in Proposition 6 gives ∥ | yK ∥ |⋆ < ∥ | yK∪{j} ∥ |⋆, whereas ∥ | yK∪{j} ∥ |⋆ ≤ \( \|\!|\!y\!\|\!|^{\text {tn}}_{\star , (k+1)} \) by definition (10) of the generalized top-(k + 1) dual norm. Thus, for any k ∈⟦0,l − 1⟧, we have established that \( \|\!|\! y \!\|\!|^{\text {tn}}_{\star , (k)} < \|\!|\!y\!\|\!|^{\text {tn}}_{\star , (k+1)} \).
This ends the proof. □
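The pattern in (33) can be checked numerically (our own sketch, assuming the ℓ2 source norm, whose dual norm is the ℓ2 norm itself and is orthant-strictly monotonic); the generalized top-k dual norm is evaluated by brute force from its definition (10):

```python
# Numerical sketch (ours): for the l2 source norm, the generalized top-k
# dual norms increase strictly until k = l0(y), then stay constant.
import itertools
import numpy as np

def top_k_dual_norm(y, k):
    # Brute-force sup over all coordinate subsets K with |K| <= k,
    # following definition (10); the l2 norm is self-dual.
    d = len(y)
    best = 0.0
    for size in range(1, k + 1):
        for K in itertools.combinations(range(d), size):
            yK = np.zeros(d)
            yK[list(K)] = y[list(K)]
            best = max(best, np.linalg.norm(yK))
    return best

y = np.array([2.0, 0.0, -1.0, 0.0, 3.0])       # l0(y) = 3
l = int(np.count_nonzero(y))
seq = [top_k_dual_norm(y, k) for k in range(1, len(y) + 1)]
# Strict increase up to k = l ...
assert all(seq[k - 1] < seq[k] for k in range(1, l))
# ... then constant, equal to |||y|||_* = ||y||_2.
assert all(np.isclose(s, np.linalg.norm(y)) for s in seq[l - 1:])
```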
Appendix B: Proposition 9
We reproduce here [4, Proposition 4.5] in order to simplify the reading of the proof of Proposition 3.
Proposition 9
([4, Proposition 4.5]) Let ∥ | ⋅ ∥ | be a norm on \(\mathbb {R}^{d}\), with associated sequence \( \left \{\|\!|\! \cdot \!\|\!|^{\mathcal {R}}_{(j)}\right \}_{j\in {\llbracket 1,d \rrbracket }} \) of coordinate-k norms and sequence \( \left \{\|\!|\! \cdot \!\|\!|^{\mathcal {R}}_{(j), \star }\right \}_{j\in {\llbracket 1,d \rrbracket }} \) of dual coordinate-k norms, as in Definition 2, and with associated Capra coupling ¢ in (4).
-
1.
For any function \( \varphi : {\llbracket 0,d \rrbracket } \to {\overline {\mathbb R}} \), we have
$$ ({ \varphi \circ \ell_0})^{\cent{\cent^{\prime}}}({x}) = \left( \left( { \varphi \circ \ell_0}\right){\cent} \right)^{\star \prime} \left( { \frac{x}{\|\!|\! x \!\|\!|} }\right) , \quad \forall x \in \mathbb{R}^{d}\backslash\{0\} , $$(35a)
where the closed convex function \( \left (\left ({\varphi \circ \ell _{0}}\right )^{\cent } \right )^{\star \prime } \) has the following expression as a Fenchel conjugate
$$ \left( \left( { \varphi \circ \ell_0}\right){\cent} \right)^{\star \prime} = \left( { \sup\limits_{j \in{\llbracket 0,d \rrbracket}} \left[{ \|\!|\! \cdot \!\|\!|^{\mathcal{R}}_{(j), \star} -\varphi(j) }\right] }\right)^{\star \prime} , $$(35b)
and also has the following four expressions as a Fenchel biconjugate
$$ = \left( \inf_{j \in{\llbracket 0,d \rrbracket}} \left[ \delta_{\mathbb{B}^{\mathcal {R}}_{(j)} } \dotplus \varphi(j) \right] \right)^{\star\star \prime} , $$(35c)
hence the function \( \left (\left ({ \varphi \circ \ell _0}\right ){\cent } \right )^{\star \prime } \) is the largest closed convex function below the integer valued function \( \inf _{j \in {\llbracket 0,d \rrbracket }} \left [ \delta _{\mathbb {B}^{\mathcal {R}}_{(j)} } \dotplus \varphi (j) \right ] \), which maps \( x \in \mathbb {B}^{\mathcal {R}}_{(j)} \backslash \mathbb {B}^{\mathcal {R}}_{(j-1)} \) to \( \varphi (j) \) for j ∈⟦1,d⟧, maps \(x \in \mathbb {B}^{\mathcal {R}}_{(0)} = \{0\} \) to \( \varphi ({0}) \), and is infinite outside \( \mathbb {B}^{\mathcal {R}}_{(d)}= \mathbb {B} \); that is, with the convention that \( \mathbb {B}^{\mathcal {R}}_{(0)}=\{0\} \) and that \( \inf \emptyset = +\infty \)
$$ \begin{array}{@{}rcl@{}} &=& \left( x \mapsto \inf \left\{ \varphi(j) \mid x \in \mathbb{B}^{\mathcal {R}}_{(j)} , \ j \in {\llbracket 0,d \rrbracket} \right\} \right)^{\star\star \prime} , \end{array} $$(35d)
$$ \begin{array}{@{}rcl@{}} &=& \left( \inf_{j \in{\llbracket 0,d \rrbracket}} \left[\delta_{\mathbb{S}^{\mathcal {R}}_{(j)} } \dotplus \varphi(j)\right] \right)^{\star\star \prime} , \end{array} $$(35e)
hence the function \( \left (\left ({ \varphi \circ \ell _0}\right ){\cent } \right )^{\star \prime } \) is the largest closed convex function below the integer valued function \( \inf _{j \in {\llbracket 0,d \rrbracket }} \left [ \delta _{\mathbb {S}^{\mathcal {R}}_{(j)} } \dotplus \varphi (j)\right ] \), that is, with the convention that \( \mathbb {S}^{\mathcal {R}}_{(0)}=\{0\} \) and that \( \inf \emptyset = +\infty \)
$$ =\left( x \mapsto \inf \left\{ \varphi(j) \mid x \in \mathbb{S}^{\mathcal {R}}_{(j)} , \ j \in {\llbracket 0,d \rrbracket} \right\} \right)^{\star\star \prime} . $$(35f) -
2.
For any function \( \varphi : {\llbracket 0,d \rrbracket } \to \mathbb {R} \), that is, with finite values, the function \( \left (\left ({ \varphi \circ \ell _0}\right ){\cent } \right )^{\star \prime } \) is proper convex lsc and has the following variational expression (where Δd+ 1 denotes the simplex of \(\mathbb {R}^{d+1}\))
$$ \left( \left( { \varphi \circ \ell_0}\right){\cent} \right)^{\star \prime} ({x}) = \underset{\underset{x \in \sum\limits_{j=1 }^{d } \lambda_j \mathbb{B}^{\mathcal {R}}_{(j)}}{ \left( {\lambda_{0},\lambda_{1},\ldots,\lambda_{d}}\right) \in {\Delta}_{d+1}}}{\min} {\sum}_{j=0}^{d } \lambda_j \varphi(j) , \forall x \in \mathbb{R}^{d} . $$(35g) -
3.
For any function \( \varphi : {\llbracket 0,d \rrbracket } \to \mathbb {R}_{+} \), that is, with nonnegative finite values, and such that φ(0) = 0, the function \( \left (\left ({ \varphi \circ \ell _0}\right ){\cent } \right )^{\star \prime } \) is proper convex lsc and has the following two variational expressions (notice that, in (35g), the sum starts from j = 0, whereas in (35h) and in (35i), the sum starts from j = 1)
$$ \begin{array}{@{}rcl@{}} \left( \left( { \varphi \circ \ell_0}\right){\cent} \right)^{\star \prime} ({x}) &=& \underset{\underset{x \in \sum\limits_{j=1 }^{d } \lambda_j \mathbb{S}^{\mathcal {R}}_{(j)}}{ \left( {\lambda_{0},\lambda_{1},\ldots,\lambda_{d}}\right) \in {\Delta}_{d+1}} }{\min} \sum\limits_{j=1 }^{d } \lambda_j \varphi(j) , \ \forall x \in \mathbb{R}^{d} , \end{array} $$(35h)
$$ \begin{array}{@{}rcl@{}} &=& \underset{ \begin{array}{c} z^{(1)} \in \mathbb{R}^{d}, \ldots, z^{(d)} \in \mathbb{R}^{d}\\ \sum\limits_{j=1 }^{d } \|\!|\! z^{(j)} \!\|\!|^{\mathcal{R}}_{(j)} \leq 1\\ \sum\limits_{j=1 }^{d } z^{(j)} = x \end{array} }{\min} \sum\limits_{j=1 }^{d } \varphi(j) \|\!|\! z^{(j)} \!\|\!|^{\mathcal{R}}_{(j)} , \ \forall x \in \mathbb{R}^{d} , \end{array} $$(35i)
and the function \( ({ \varphi \circ \ell _0})^{\cent {\cent ^{\prime }}} \) has the following variational expression
$$ ({ \varphi \circ \ell_0})^{\cent{\cent^{\prime}}}({x}) = \frac{ 1 }{ \|\!|\! x \!\|\!| } \underset{ \begin{array}{c} z^{(1)} \in \mathbb{R}^{d}, \ldots, z^{(d)} \in \mathbb{R}^{d}\\ \sum\limits_{j=1 }^{d } \|\!|\! z^{(j)} \!\|\!|^{\mathcal{R}}_{(j)} \leq \|\!|\! x \!\|\!|\\ \sum\limits_{j=1 }^{d } z^{(j)} = x \end{array} }{\min} \sum\limits_{j=1 }^{d } \|\!|\! z^{(j)} \!\|\!|^{\mathcal{R}}_{(j)} \varphi(j) , \ \forall x \in \mathbb{R}^{d}\backslash\{0\} . $$(36)
Appendix C: Background on the Fenchel conjugacy on \(\mathbb {R}^{d}\)
We review concepts and notations related to the Fenchel conjugacy (we refer the reader to [18]). For any function \( h: \mathbb {R}^{d} \to {\overline {\mathbb R}} \), its epigraph is \(\left \{(w,t) \in \mathbb {R}^{d}\times \mathbb {R}\mid h(w) \leq t\right \}\) and its effective domain is dom\(h = \left \{w\in \mathbb {R}^{d} \mid h(w)<+\infty \right \} \). A function \( h : \mathbb {R}^{d} \to {\overline {\mathbb R}} \) is said to be convex if its epigraph is a convex set; proper if it never takes the value \(-\infty \) and if domh≠∅; lower semicontinuous (lsc) if its epigraph is closed; closed if it is either lsc and nowhere takes the value \(-\infty \), or is the constant function \(-\infty \) [18, p. 15]. The closed convex functions are the two constant functions \(-\infty \) and \(+\infty \), together with all proper convex lsc functions. In particular, any closed convex function that takes at least one finite value is necessarily proper convex lsc.
For any functions \( f : \mathbb {R}^{d} \to {\overline {\mathbb R}} \) and \( g : \mathbb {R}^{d} \to {\overline {\mathbb R}} \), we denote
$$ f^{\star}({y}) = \sup_{x \in \mathbb{R}^{d}} \left( \langle x , y \rangle - f({x}) \right) , \quad \forall y \in \mathbb{R}^{d} , $$(37a)
$$ g^{\star \prime}({x}) = \sup_{y \in \mathbb{R}^{d}} \left( \langle x , y \rangle - g({y}) \right) , \quad \forall x \in \mathbb{R}^{d} , $$(37b)
$$ f^{\star\star \prime}({x}) = \left( f^{\star} \right)^{\star \prime}({x}) , \quad \forall x \in \mathbb{R}^{d} . $$(37c)
In convex analysis, one does not use the notation \(^{\star \prime } \) in (37b) and \(^{\star \star \prime } \) in (37c), but simply ⋆ and ⋆⋆. We use \(^{\star \prime } \) and \(^{\star \star \prime } \) to be consistent with the notation (6b) for general conjugacies.
It is proved that the Fenchel conjugacy (indifferently f↦f⋆ or \( g \mapsto g^{\star \prime } \)) induces a one-to-one correspondence between the closed convex functions on \(\mathbb {R}^{d}\) and themselves [18, Theorem 5].
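As a numerical aside (ours, not the paper's), the Fenchel conjugate can be approximated on a grid; for the proper convex lsc function f(x) = x²/2 on ℝ, one recovers f⋆ = f:

```python
# Numerical sketch (ours): discrete Legendre-Fenchel transform on a grid.
# For f(x) = x^2 / 2, the conjugate f*(y) = sup_x [x*y - f(x)] equals y^2 / 2.
import numpy as np

grid = np.linspace(-10.0, 10.0, 4001)

def conjugate(f_vals, ys):
    # f*(y) approximated by a max over the grid points.
    return np.array([np.max(y * grid - f_vals) for y in ys])

f = 0.5 * grid ** 2
ys = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
f_star = conjugate(f, ys)
assert np.allclose(f_star, 0.5 * ys ** 2, atol=1e-3)
```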
In [20, p. 214-215] (see also the historical note in [19, p. 343]), the notions of (Moreau) subgradient and of (Rockafellar) subdifferential are defined for a convex function. Following the definition of the subdifferential of a function with respect to a duality in [1], we define the (Rockafellar-Moreau) subdifferential ∂f(x) of a function \( f : \mathbb {R}^{d} \to {\overline {\mathbb R}} \) at \( x \in \mathbb {R}^{d} \) by
$$ \partial f({x}) = \left\{ y \in \mathbb{R}^{d} \mid f^{\star}({y}) = \langle x , y \rangle - f({x}) \right\} . $$
When the function f is proper convex and x ∈domf, we recover the classic definition that
$$ y \in \partial f({x}) \iff f({x^{\prime}}) \geq f({x}) + \langle y , x^{\prime} - x \rangle , \quad \forall x^{\prime} \in \mathbb{R}^{d} . $$
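As a last numerical aside (ours), the classic subgradient inequality can be tested on a grid; for f = |⋅| on ℝ, one finds ∂f(0) = [−1,1]:

```python
# Numerical sketch (ours): check the subgradient inequality
# f(x') >= f(0) + y * x' on a grid, for f(x) = |x| at the point 0.
import numpy as np

grid = np.linspace(-5.0, 5.0, 1001)
f = np.abs(grid)

def is_subgradient_at_zero(y):
    # y is a subgradient of |.| at 0 iff |x'| >= y * x' for all x'.
    return bool(np.all(f >= y * grid - 1e-12))

assert is_subgradient_at_zero(0.7)      # inside [-1, 1]
assert not is_subgradient_at_zero(1.3)  # outside [-1, 1]
```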
Chancelier, JP., De Lara, M. Capra-Convexity, Convex Factorization and Variational Formulations for the ℓ0 Pseudonorm. Set-Valued Var. Anal 30, 597–619 (2022). https://doi.org/10.1007/s11228-021-00606-z
Keywords
- ℓ0 pseudonorm
- Orthant-strictly monotonic norm
- Fenchel-Moreau conjugacy
- Generalized k-support dual norm
- Sparse optimization