On the Determination of Lagrange Multipliers for a Weighted LASSO Problem Using Geometric and Convex Analysis Techniques

Compressed Sensing (CS) encompasses a broad array of theoretical and applied techniques for recovering signals, given partial knowledge of their coefficients, cf. Candès (C. R. Acad. Sci. Paris, Ser. I 346, 589–592 (2008)), Candès et al. (IEEE Trans. Inf. Theo. (2006)), Donoho (IEEE Trans. Inf. Theo. 52(4), (2006)), Donoho et al. (IEEE Trans. Inf. Theo. 52(1), (2006)). Its applications span various fields, including mathematics, physics, engineering, and several medical sciences, cf. Adcock and Hansen (Compressive Imaging: Structure, Sampling, Learning, p. 2021), Berk et al. (2019 13th International conference on Sampling Theory and Applications (SampTA) pp. 1-5. IEEE (2019)), Brady et al. (Opt. Express 17(15), 13040–13049 (2009)), Chan (Terahertz imaging with compressive sensing. Rice University, USA (2010)), Correa et al. (2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) pp. 7789–7793 (2014, May) IEEE), Gao et al. (Nature 516(7529), 74–77 (2014)), Liu and Kang (Opt. Express 18(21), 22010–22019 (2010)), McEwen and Wiaux (Mon. Notices Royal Astron. Soc. 413(2), 1318–1332 (2011)), Marim et al. (Opt. Lett. 35(6), 871–873 (2010)), Yu and Wang (Phys. Med. Biol. 54(9), 2791 (2009)). Motivated by our interest in the mathematics behind Magnetic Resonance Imaging (MRI) and CS, we employ convex analysis techniques to analytically determine equivalents of Lagrange multipliers for optimization problems with inequality constraints, specifically a weighted LASSO with voxel-wise weighting.
We investigate this problem under assumptions on the fidelity term $\left\Vert Ax-b\right\Vert_2^2$, either concerning the sign of its gradient or orthogonality-like conditions of its matrix. To be more precise, we either require the sign of each coordinate of $2(Ax-b)^TA$ to be fixed within a rectangular neighborhood of the origin, with the side lengths of the rectangle dependent on the constraints, or we assume $A^TA$ to be diagonal. The objective of this work is to explore the relationship between Lagrange multipliers and the constraints of a weighted variant of LASSO, specifically in the mentioned cases where this relationship can be computed explicitly. As they scale the regularization terms of the weighted LASSO, Lagrange multipliers serve as tuning parameters for the weighted LASSO, prompting the question of their potential effective use as tuning parameters in applications like MR image reconstruction and denoising. This work represents an initial step in this direction.
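To illustrate why the case of diagonal $A^TA$ is explicit, the following is a minimal sketch, not the method of this work. It assumes the penalised form $\min_x \Vert Ax-b\Vert_2^2 + \sum_i \lambda_i |x_i|$, in which the hypothetical weights `lam[i]` play the role of the multipliers. Under that assumption the problem decouples coordinate-wise and is solved by soft-thresholding. The function names are illustrative only.

```python
from typing import List


def soft_threshold(c: float, t: float) -> float:
    """Return sign(c) * max(|c| - t, 0)."""
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0


def weighted_lasso_diagonal(A: List[List[float]], b: List[float],
                            lam: List[float], tol: float = 1e-12) -> List[float]:
    """Minimise ||Ax - b||_2^2 + sum_i lam[i] * |x_i|, assuming A^T A is diagonal."""
    m, n = len(A), len(A[0])
    # Gram matrix A^T A; the closed form below is only valid if it is diagonal.
    gram = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
            for i in range(n)]
    for i in range(n):
        if gram[i][i] <= tol:
            raise ValueError("every column of A must be nonzero")
        for j in range(n):
            if i != j and abs(gram[i][j]) > tol:
                raise ValueError("A^T A is not diagonal")
    # c = A^T b; coordinate i minimises d_i x^2 - 2 c_i x + lam_i |x|.
    c = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return [soft_threshold(c[i], lam[i] / 2.0) / gram[i][i] for i in range(n)]


def objective(A: List[List[float]], b: List[float],
              lam: List[float], x: List[float]) -> float:
    """Evaluate ||Ax - b||_2^2 + sum_i lam[i] * |x_i|."""
    residual = [sum(A[k][i] * x[i] for i in range(len(x))) - b[k]
                for k in range(len(A))]
    return sum(r * r for r in residual) + sum(l * abs(v) for l, v in zip(lam, x))


A = [[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b = [2.0, 3.0, 1.0]
lam = [2.0, 8.0]
x_star = weighted_lasso_diagonal(A, b, lam)
```

In this toy instance the second weight is large enough to set the second coordinate to zero, which shows how the weights act as tuning parameters coordinate by coordinate.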


Introduction
Decomposing finite-energy signals as superpositions of fundamental functions (atoms), while preserving the energy content, is one of the main issues in signal analysis. When the atoms form a basis $B$, this decomposition is unique, but in applications it may be useful to have different representations of the same signal, and this is pursued using frames. In time-frequency analysis, where the time and frequency behaviours of signals are treated simultaneously, Gabor frames have become popular. For a function $g \in L^2(\mathbb{R}^d)$ and $(x,\xi) \in \mathbb{R}^{2d}$, the associated time-frequency shift is defined as $\pi(x,\xi)g(t) = e^{2\pi i \xi\cdot t} g(t-x)$, that is, the composition of the translation $T_x g(t) = g(t-x)$ and modulation $M_\xi g(t) = e^{2\pi i \xi\cdot t} g(t)$ operators, $x,\xi \in \mathbb{R}^d$, see Section 2 for details. A Gabor frame is a family $\mathcal{G}(g,\Lambda) = \{\pi(\lambda)g\}_{\lambda\in\Lambda}$, where $g \in L^2(\mathbb{R}^d)\setminus\{0\}$ and $\Lambda \subseteq \mathbb{R}^{2d}$ is a discrete set (often a lattice, i.e., a discrete subgroup of $\mathbb{R}^{2d}$), with the following energy-preservation property: there exist $A, B > 0$ such that
(1) $A\|f\|_2^2 \le \sum_{\lambda\in\Lambda} |\langle f, \pi(\lambda)g\rangle|^2 \le B\|f\|_2^2, \qquad f \in L^2(\mathbb{R}^d).$
If $\mathcal{G}(g,\Lambda)$ is a Gabor frame, then there exists a window $\gamma = \gamma(g)$ such that
(2) $f = \sum_{\lambda\in\Lambda} \langle f, \pi(\lambda)g\rangle\, \pi(\lambda)\gamma, \qquad f \in L^2(\mathbb{R}^d),$
with unconditional convergence in the norm of $L^2(\mathbb{R}^d)$. Stated differently, Gabor frames allow one to decompose signals into discrete superpositions of time-frequency shifts, which take over the role of building blocks for finite-energy signals. The coefficients in (2) define the short-time Fourier transform (STFT). Namely, if $g \in L^2(\mathbb{R}^d)\setminus\{0\}$ is fixed and $f \in L^2(\mathbb{R}^d)$, the STFT of $f$ with respect to the window $g$ is the function defined as:
(3) $V_g f(x,\xi) = \langle f, \pi(x,\xi)g\rangle = \int_{\mathbb{R}^d} f(t)\overline{g(t-x)}\, e^{-2\pi i \xi\cdot t}\,dt = \mathcal{F}_2 T_{ST}(f\otimes\bar g)(x,\xi),$
where $\mathcal{F}_2$ is the partial Fourier transform with respect to the second variable and $T_{ST}$ is a rescaling operator, see Section 2 below. The STFT is a joint time-frequency representation of $f$: it describes the local time-frequency behaviour of $f$ and it allows one to recover both $f$ and its Fourier transform via inversion formulae. However, in many contexts, the STFT is not the best time-frequency representation in terms of mathematical properties, and it is preferable to appeal to different distributions, such as the $\tau$-Wigner distributions [3], defined as
(4) $W_\tau(f,g)(x,\xi) = \int_{\mathbb{R}^d} f(x+\tau t)\overline{g(x-(1-\tau)t)}\, e^{-2\pi i \xi\cdot t}\,dt.$
As for the STFT, these distributions can be written as compositions of operators. Namely, $W_\tau(f,g) = \mathcal{F}_2 T_\tau(f\otimes\bar g)$, where $T_\tau$ is a suitable rescaling operator, see Section 2 in the sequel. However, this is not the only similarity that $\tau$-Wigner distributions share with the STFT. In fact, it was proved in [5] that the $\tau$-Wigner distributions ($\tau \neq 0, 1$) can be rewritten in terms of $L^2$ inner products as $W_\tau(f,g)(x,\xi) = \langle f, \pi_\tau(x,\xi)g\rangle$, where $\pi_\tau(x,\xi)$ is, up to rescaling and a chirp phase factor, the composition of a rescaled time-frequency shift and a unitary change of variables.
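As a short worked computation, one can see why the endpoint cases $\tau = 0$ and $\tau = 1$ behave differently from the others. The sketch assumes the standard definition $W_\tau(f,g)(x,\xi) = \int f(x+\tau t)\overline{g(x-(1-\tau)t)}e^{-2\pi i\xi\cdot t}\,dt$ and the normalisation $\hat g(\xi) = \int g(t)e^{-2\pi i \xi\cdot t}\,dt$, for Schwartz functions $f, g$:

```latex
\begin{align*}
W_0(f,g)(x,\xi)
  &= f(x)\int_{\mathbb{R}^d} \overline{g(x-t)}\, e^{-2\pi i \xi\cdot t}\,dt
   = f(x)\, e^{-2\pi i x\cdot\xi}\int_{\mathbb{R}^d} \overline{g(u)}\, e^{2\pi i \xi\cdot u}\,du
   && (u = x-t)\\
  &= e^{-2\pi i x\cdot\xi}\, f(x)\,\overline{\hat g(\xi)},\\[4pt]
W_1(f,g)(x,\xi)
  &= \overline{g(x)}\int_{\mathbb{R}^d} f(x+t)\, e^{-2\pi i \xi\cdot t}\,dt
   = e^{2\pi i x\cdot\xi}\,\overline{g(x)}\,\hat f(\xi)
   && (u = x+t).
\end{align*}
```

In both cases the distribution factors into a function of $x$ times a function of $\xi$, up to a phase, so one of the two signals is only seen through its Fourier transform.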
Another issue is the frame property. Gabor frames are enormously popular in applications such as imaging, radar, audio processing and quantum mechanics [16]. However, in some frameworks it is more useful to employ generalized versions of them, see for example [15]. In our context, let us first consider the atoms generated by the $\tau$-Wigner distributions, defined in (5), which yield expansions with a suitable $\gamma_\tau = \gamma_\tau(g,\tau) \in L^2(\mathbb{R}^d)$, providing a different way to represent signals in terms of the atoms $\pi_\tau(\lambda)g$. In many respects, $\tau$-Wigner distributions represent signals better than the STFT, but they are only the tip of the iceberg of a wide variety of time-frequency representations, called metaplectic Wigner distributions, introduced and studied in [4,5,6,8,9,13]. Among all the metaplectic Wigner distributions, the Wigner-decomposable representations are the immediate generalization of the STFT and $\tau$-Wigner distributions. In fact, they are defined, up to a chirp phase factor, in terms of the Fourier transform with respect to the second variable and a rescaling operator: $W_A(f,g) = \mathcal{F}_2 T_E(f\otimes\bar g)$, where $T_E$ is the rescaling operator associated to an invertible matrix $E$ and $A$ denotes a symplectic matrix which is related to $\mathcal{F}_2$ and $T_E$, see Section 4.
In this work, we derive the expression of the metaplectic atoms associated to Wigner-decomposable distributions and showcase the related frames. Moreover, we characterize modulation spaces for Wigner-decomposable distributions, extending the work [1,10].
Overview. Section 2 contains definitions and preliminaries. Section 3 introduces the main protagonists of this theory: the metaplectic Wigner distributions and related atoms, showing their fundamental properties. Section 4 is the core of this study: (totally) Wigner-decomposable distributions are defined and the related atoms are computed explicitly. Also, the inverse of a Wigner-decomposable atom is computed and relations among Wigner-decomposable and totally Wigner-decomposable atoms are highlighted. The last section is devoted to frame theory for (totally) Wigner-decomposable atoms. In particular, Theorem 5.2 shows the relation between these frames and the classical Gabor ones. Finally, modulation and Wiener amalgam spaces can be defined by means of Wigner-decomposable distributions and related frames, cf. Theorems 5.3 and 5.5.

Preliminaries and Notation
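Throughout this work, $xy = x\cdot y$ denotes the standard inner product on $\mathbb{R}^d$. For $0 < p \le \infty$, the $L^p$ (quasi-)norms are norms if $p \ge 1$ and quasi-norms if $0 < p < 1$. $\mathcal{S}(\mathbb{R}^d)$ denotes the space of Schwartz functions, whereas $\mathcal{S}'(\mathbb{R}^d)$ denotes its topological dual, the space of tempered distributions. We denote with $\langle f, g\rangle = \int_{\mathbb{R}^d} f(t)\overline{g(t)}\,dt$ the inner product on $L^2(\mathbb{R}^d)$ (antilinear in the second component). We use the same notation to denote the duality pairing of $\mathcal{S}'(\mathbb{R}^d)\times\mathcal{S}(\mathbb{R}^d)$. For $z = (x,\xi) \in \mathbb{R}^{2d}$, the time-frequency shift $\pi(z)$ is the operator $\pi(z) = M_\xi T_x$, where $M_\xi f(t) = e^{2\pi i \xi\cdot t} f(t)$ and $T_x f(t) = f(t-x)$ are the modulation and the translation operators, respectively.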
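For $0 < p, q \le \infty$, a weight $m$ on $\mathbb{R}^{2d}$ and $F = F(x,y) : \mathbb{R}^{2d} \to \mathbb{C}$ measurable, we denote by $\|F\|_{L^{p,q}_m} = \left(\int_{\mathbb{R}^d}\left(\int_{\mathbb{R}^d} |F(x,y)|^p m(x,y)^p\,dx\right)^{q/p} dy\right)^{1/q}$, with the obvious adjustments for $\max\{p,q\} = \infty$. If $f, g : \mathbb{R}^d \to \mathbb{C}$, their tensor product is denoted by $f\otimes g(x,y) = f(x)g(y)$. If $f, g \in \mathcal{S}'(\mathbb{R}^d)$, their tensor product is the unique tempered distribution $f\otimes g \in \mathcal{S}'(\mathbb{R}^{2d})$ characterized by its action on tensor products $\varphi\otimes\psi \in \mathcal{S}(\mathbb{R}^{2d})$ by: $\langle f\otimes g, \varphi\otimes\psi\rangle = \langle f,\varphi\rangle\langle g,\psi\rangle$.
2.1. Fourier transform. Let $f \in \mathcal{S}(\mathbb{R}^d)$. The Fourier transform of $f$, denoted by $\hat f$, is the function defined as $\hat f(\xi) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i \xi\cdot x}\,dx$. The Fourier transform of $f \in \mathcal{S}'(\mathbb{R}^d)$ is defined by duality as $\langle \hat f, \hat\varphi\rangle = \langle f, \varphi\rangle$, $\varphi \in \mathcal{S}(\mathbb{R}^d)$. The Fourier transform operator will be denoted with $\mathcal{F}$. It is a topological isomorphism of $\mathcal{S}(\mathbb{R}^d)$ and $\mathcal{S}'(\mathbb{R}^d)$ as well as a unitary operator on $L^2(\mathbb{R}^d)$. For $F = F(x,y) \in \mathcal{S}(\mathbb{R}^{2d})$, we set $\mathcal{F}_2 F(x,\xi) = \int_{\mathbb{R}^d} F(x,t) e^{-2\pi i \xi\cdot t}\,dt$, the partial Fourier transform with respect to the second variable. It is a topological isomorphism of $\mathcal{S}(\mathbb{R}^{2d})$ to itself that extends to a topological isomorphism of $\mathcal{S}'(\mathbb{R}^{2d})$. This extension is characterized by its action on tensor products $\varphi\otimes\psi \in \mathcal{S}(\mathbb{R}^{2d})$ by: $\langle \mathcal{F}_2 F, \varphi\otimes\hat\psi\rangle = \langle F, \varphi\otimes\psi\rangle$.

Modulation spaces
For the material recalled here we refer to [2,7,11,16,12]. Let $f \in \mathcal{S}'(\mathbb{R}^d)$ and fix $g \in \mathcal{S}(\mathbb{R}^d)\setminus\{0\}$. The short-time Fourier transform of $f$ with respect to $g$ is the function defined in (3). It is possible to recover a signal $f \in \mathcal{S}'(\mathbb{R}^d)$ as an integral superposition of time-frequency shifts: for fixed $g, \gamma \in \mathcal{S}(\mathbb{R}^d)$ such that $\langle g,\gamma\rangle \neq 0$,
(7) $f = \frac{1}{\langle\gamma, g\rangle}\int_{\mathbb{R}^{2d}} V_g f(x,\xi)\,\pi(x,\xi)\gamma\,dx\,d\xi,$
where the integral must be intended in the weak sense of vector-valued integration. Other time-frequency representations that we will consider are the (cross-)$\tau$-Wigner distributions, defined in (4). The cases $\tau = 0, 1$ are known as Rihaczek and conjugate-Rihaczek distributions.
For $0 < p, q \le \infty$, a weight $m$ and $g \in \mathcal{S}(\mathbb{R}^d)\setminus\{0\}$, the modulation space $M^{p,q}_m(\mathbb{R}^d)$ is the space of tempered distributions $f \in \mathcal{S}'(\mathbb{R}^d)$ such that $\|f\|_{M^{p,q}_m} = \|V_g f\|_{L^{p,q}_m} < \infty$. The functionals $\|\cdot\|_{M^{p,q}_m}$ are quasi-norms (norms if $p, q \ge 1$) and different choices of $g$ give rise to equivalent (quasi-)norms. These spaces enjoy the usual inclusion and duality properties. The Wiener amalgam space $W(\mathcal{F}L^p_{m_1}, L^q_{m_2})(\mathbb{R}^d)$ is defined in terms of the STFT as the space of tempered distributions $f \in \mathcal{S}'(\mathbb{R}^d)$ such that the (quasi-)norm $\|f\|_{W(\mathcal{F}L^p_{m_1}, L^q_{m_2})} = \left(\int_{\mathbb{R}^d}\left(\int_{\mathbb{R}^d} |V_g f(x,\xi)|^p m_1(\xi)^p\,d\xi\right)^{q/p} m_2(x)^q\,dx\right)^{1/q}$ is finite, with the obvious adjustments for $\max\{p,q\} = \infty$. Again, different choices of the window $g$ yield equivalent (quasi-)norms. This definition highlights that Wiener amalgam spaces are Fourier images of modulation spaces.
2.3. Gabor frames. Let $g \in L^2(\mathbb{R}^d)$ and $\Lambda \subseteq \mathbb{R}^{2d}$ be a discrete set. The Gabor system $\mathcal{G}(g,\Lambda)$ is defined as the set of the time-frequency shifts of $g$ parametrized by $\Lambda$. Namely, $\mathcal{G}(g,\Lambda) = \{\pi(\lambda)g\}_{\lambda\in\Lambda}$.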
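The Gabor system $\mathcal{G}(g,\Lambda)$ defines a Gabor frame of $L^2(\mathbb{R}^d)$ if there exist $A, B > 0$ such that (1) holds true. Observe that this is equivalent to $A\|f\|_2^2 \le \sum_{\lambda\in\Lambda} |V_g f(\lambda)|^2 \le B\|f\|_2^2$ for every $f \in L^2(\mathbb{R}^d)$. If $\mathcal{G}(g,\Lambda)$ is a Gabor frame, then $f = \sum_{\lambda\in\Lambda} \langle f, \pi(\lambda)g\rangle\,\pi(\lambda)\gamma$, where $\gamma \in L^2(\mathbb{R}^d)$ is the so-called dual window, which depends on $g$ and $\Lambda$, and the series converges unconditionally in the $L^2(\mathbb{R}^d)$-norm.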
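The frame inequality and the expansion through a dual window can be tested in a finite-dimensional toy model. The sketch below is an illustration under assumptions, not part of the theory above: it replaces $L^2(\mathbb{R}^d)$ by $\mathbb{C}^N$ with cyclic translations and takes $\Lambda$ to be the full grid $\mathbb{Z}_N\times\mathbb{Z}_N$. In that setting every nonzero window generates a tight frame with $A = B = N\|g\|^2$, and $\gamma = g/(N\|g\|^2)$ serves as a dual window. All names are hypothetical.

```python
import cmath
from typing import Dict, List, Tuple

Signal = List[complex]


def tf_shift(g: Signal, k: int, l: int) -> Signal:
    """Discrete time-frequency shift: (pi(k, l) g)[t] = e^{2 pi i l t / N} g[(t - k) mod N]."""
    n = len(g)
    return [cmath.exp(2j * cmath.pi * l * t / n) * g[(t - k) % n] for t in range(n)]


def inner(u: Signal, v: Signal) -> complex:
    """Inner product on C^N, antilinear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def gabor_coefficients(f: Signal, g: Signal) -> Dict[Tuple[int, int], complex]:
    """Coefficients <f, pi(k, l) g> over the full grid Z_N x Z_N."""
    n = len(f)
    return {(k, l): inner(f, tf_shift(g, k, l)) for k in range(n) for l in range(n)}


def reconstruct(coeffs: Dict[Tuple[int, int], complex], gamma: Signal) -> Signal:
    """Synthesis: sum over (k, l) of coeffs[(k, l)] * pi(k, l) gamma."""
    n = len(gamma)
    out = [0j] * n
    for (k, l), c in coeffs.items():
        atom = tf_shift(gamma, k, l)
        for t in range(n):
            out[t] += c * atom[t]
    return out


f = [1 + 0j, 2 - 1j, 0.5j, -1 + 0j, 3 + 0j]
g = [1 + 0j, 0.5 + 0j, 0j, 0j, 0.25j]
N = len(f)

coeffs = gabor_coefficients(f, g)
energy = sum(abs(c) ** 2 for c in coeffs.values())
bound = N * inner(g, g).real          # A = B = N * ||g||^2 on the full grid
dual = [value / bound for value in g]  # dual window of the tight frame
f_rec = reconstruct(coeffs, dual)
```

On sparser sets $\Lambda$ the two frame bounds generally differ and the dual window is no longer a multiple of $g$, so this sketch only covers the simplest situation.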
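2.4. Symplectic group and metaplectic operators. For details we refer to [14]. Let $I_{d\times d}$ and $0_{d\times d}$ denote the identity matrix and the matrix with all zero entries, respectively. Let us denote with $J$ the standard symplectic matrix: $J = \begin{pmatrix} 0_{d\times d} & I_{d\times d} \\ -I_{d\times d} & 0_{d\times d}\end{pmatrix}$. We denote by $Sp(d,\mathbb{R})$ the group of $2d\times 2d$ symplectic matrices, which is generated by the symplectic matrices $J$,
(8) $V_C = \begin{pmatrix} I_{d\times d} & 0_{d\times d} \\ C & I_{d\times d}\end{pmatrix}, \qquad D_E = \begin{pmatrix} E^{-1} & 0_{d\times d} \\ 0_{d\times d} & E^T\end{pmatrix},$
with $C \in \mathbb{R}^{d\times d}$ symmetric and $E \in GL(d,\mathbb{R})$. For $z = (x,\xi) \in \mathbb{R}^{2d}$ and $\tau \in \mathbb{R}$, we denote with $\rho(z;\tau) = e^{2\pi i \tau} e^{-\pi i \xi\cdot x}\pi(z)$. This is the Schrödinger representation of the Heisenberg group. Metaplectic operators are unitary operators on $L^2(\mathbb{R}^d)$ which satisfy an intertwining relationship with $\rho$. In the sequel we showcase the main examples of such operators.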
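(i) The multiplication operator by the chirp $\Phi_C$, that is, $f \mapsto \Phi_C\cdot f$, is a metaplectic operator and its projection is the symplectic matrix $V_C$ defined in (8).
(ii) The normalized rescaling operator $T_E f(t) = |\det(E)|^{1/2} f(Et)$, $E \in GL(d,\mathbb{R})$, is a metaplectic operator and $\pi^{Mp}(T_E) = D_E$, defined as in (8).
(iii) The partial Fourier transform $\mathcal{F}_2$, acting on $L^2(\mathbb{R}^{2d})$, is a metaplectic operator and its projection onto the symplectic group is the symplectic matrix $A_{FT2} \in Sp(2d,\mathbb{R})$ whose block decomposition is given in (10).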
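The symplectic condition $S^T J S = J$ can be checked numerically for the generators mentioned above. The sketch below is only an illustration. It assumes the block forms $V_C = \begin{pmatrix} I & 0 \\ C & I\end{pmatrix}$ and $D_E = \begin{pmatrix} E^{-1} & 0 \\ 0 & E^T\end{pmatrix}$, which are a common convention and may differ from the one intended in (8). The helper names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x: Matrix) -> Matrix:
    return [list(col) for col in zip(*x)]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def zeros(n: int) -> Matrix:
    return [[0.0] * n for _ in range(n)]


def block2(a11: Matrix, a12: Matrix, a21: Matrix, a22: Matrix) -> Matrix:
    """Assemble a 2x2 block matrix from four square blocks of equal size."""
    top = [r1 + r2 for r1, r2 in zip(a11, a12)]
    bottom = [r1 + r2 for r1, r2 in zip(a21, a22)]
    return top + bottom


def standard_j(d: int) -> Matrix:
    minus_identity = [[-v for v in row] for row in identity(d)]
    return block2(zeros(d), identity(d), minus_identity, zeros(d))


def is_symplectic(s: Matrix, tol: float = 1e-12) -> bool:
    """Check S^T J S = J for a 2d x 2d matrix S."""
    d = len(s) // 2
    j = standard_j(d)
    m = matmul(transpose(s), matmul(j, s))
    return all(abs(m[a][b] - j[a][b]) <= tol for a in range(2 * d) for b in range(2 * d))


def chirp_matrix(c: Matrix) -> Matrix:
    """Assumed form of V_C: lower block-triangular with C in the bottom-left block."""
    d = len(c)
    return block2(identity(d), zeros(d), c, identity(d))


def dilation_matrix(e: Matrix, e_inv: Matrix) -> Matrix:
    """Assumed form of D_E: block-diagonal with E^{-1} and E^T."""
    d = len(e)
    return block2(e_inv, zeros(d), zeros(d), transpose(e))


C_sym = [[1.0, 2.0], [2.0, -1.0]]
C_nonsym = [[0.0, 1.0], [0.0, 0.0]]
E = [[2.0, 1.0], [1.0, 1.0]]
E_inv = [[1.0, -1.0], [-1.0, 2.0]]  # inverse of E, since det(E) = 1
```

With these inputs, `is_symplectic(chirp_matrix(C_sym))` holds while `is_symplectic(chirp_matrix(C_nonsym))` does not, which reflects the requirement that $C$ be symmetric.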
We recall the following continuity properties [8].
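Proposition 3.3. Let $W_A$ be a metaplectic Wigner distribution. Then, (i) $W_A : L^2(\mathbb{R}^d)\times L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^{2d})$ is continuous and Moyal's identity holds: $\langle W_A(f_1,g_1), W_A(f_2,g_2)\rangle = \langle f_1,f_2\rangle\overline{\langle g_1,g_2\rangle}$.
An equivalent of the inversion formula (7) for the STFT holds for general metaplectic Wigner distributions, as it was proved in [5], as long as time-frequency shifts are replaced by metaplectic atoms.
Definition 3.4. Let $W_A$ be a metaplectic Wigner distribution and $z \in \mathbb{R}^{2d}$. The metaplectic atom $\pi_A(z) : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d)$ is the operator defined for all $f \in \mathcal{S}(\mathbb{R}^d)$ by its action on $\varphi \in \mathcal{S}(\mathbb{R}^d)$ as: $\langle \varphi, \pi_A(z)f\rangle = W_A(\varphi,f)(z)$. Their main properties were investigated in [5]; in particular, we recall the facts below.
Theorem 3.5. Let $W_A$ be a metaplectic Wigner distribution. Then, for every $f \in \mathcal{S}'(\mathbb{R}^d)$ and every $g, \gamma \in \mathcal{S}(\mathbb{R}^d)$ such that $\langle g,\gamma\rangle \neq 0$, we have: $f = \frac{1}{\langle\gamma,g\rangle}\int_{\mathbb{R}^{2d}} W_A(f,g)(z)\,\pi_A(z)\gamma\,dz$, where the integral must be intended in the weak sense of vector-valued integration.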
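Moyal's identity and the inversion formula can be checked numerically in the simplest instance, the STFT. The sketch below is a finite-dimensional analogue on $\mathbb{C}^N$ with cyclic shifts, which is an assumption of the illustration and not the continuous setting of the statements above. In this model both identities pick up a factor $N$ from the unnormalised discrete Fourier sums. Function names are hypothetical.

```python
import cmath
from typing import List

Signal = List[complex]


def inner(u: Signal, v: Signal) -> complex:
    """Inner product on C^N, antilinear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def stft(f: Signal, g: Signal) -> List[List[complex]]:
    """V[k][l] = sum_t f[t] * conj(g[(t - k) mod N]) * e^{-2 pi i l t / N}."""
    n = len(f)
    return [[sum(f[t] * g[(t - k) % n].conjugate() * cmath.exp(-2j * cmath.pi * l * t / n)
                 for t in range(n))
             for l in range(n)]
            for k in range(n)]


def synthesize(v: List[List[complex]], gamma: Signal) -> Signal:
    """Sum over (k, l) of V[k][l] * e^{2 pi i l t / N} * gamma[(t - k) mod N]."""
    n = len(gamma)
    return [sum(v[k][l] * cmath.exp(2j * cmath.pi * l * t / n) * gamma[(t - k) % n]
                for k in range(n) for l in range(n))
            for t in range(n)]


f1 = [1 + 2j, 0.5 + 0j, -1j, 2 + 0j]
f2 = [0j, 1 + 0j, 1 - 1j, -0.5 + 0j]
g1 = [1 + 0j, 1j, 0j, 0j]
g2 = [0.5 + 0j, 0j, 1 + 1j, 0j]
gamma = [1 + 0j, 0j, 0j, 1 + 0j]
N = len(f1)

# Discrete Moyal identity: <V_{g1} f1, V_{g2} f2> = N <f1, f2> conj(<g1, g2>).
v1, v2 = stft(f1, g1), stft(f2, g2)
moyal_lhs = sum(v1[k][l] * v2[k][l].conjugate() for k in range(N) for l in range(N))
moyal_rhs = N * inner(f1, f2) * inner(g1, g2).conjugate()

# Discrete inversion: f = (1 / (N <gamma, g>)) * sum V_g f(k, l) pi(k, l) gamma.
scale = N * inner(gamma, g1)
f1_rec = [value / scale for value in synthesize(v1, gamma)]
```

The reconstruction uses a synthesis window `gamma` different from the analysis window `g1`, and only requires that their inner product be nonzero.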
We call E A the submatrix: and observe that, if W A is the metaplectic Wigner distribution associated to Â, for every f, g ∈ L 2 (R d ) and every w ∈ R 2d .
Definition 3.6. Under the notation above, we say that $W_A$ is shift-invertible if $E_A$ is invertible.
Shift-invertible Wigner distributions are fundamental, as they characterize modulation spaces. Whence, they can be used to measure the local time-frequency content of signals (Theorem 3.7). It was shown in [4] that if $W_A$ is not shift-invertible or $E_A$ is not upper-triangular, Theorem 3.7 fails in general.

Wigner-decomposable atoms
Among all shift-invertible Wigner distributions, the Wigner-decomposable ones, introduced in [6], provide a direct generalization of the STFT, as they can be written as $\mathcal{F}_2 T_E$ for some $E \in GL(2d,\mathbb{R})$, up to a chirp function. In this section, we compute metaplectic atoms associated to Wigner-decomposable metaplectic Wigner distributions.
Definition 4.1. We say that a matrix $A \in Sp(2d,\mathbb{R})$ (or, equivalently, $W_A$) is Wigner-decomposable if $A = V_C A_{FT2} D_E$, with $V_C$ and $D_E$ as in (8), for some $C \in \mathbb{R}^{2d\times 2d}$ symmetric and $E \in GL(2d,\mathbb{R})$. In particular, if $C = 0_{2d\times 2d}$ then $V_C = I_{4d\times 4d}$ and we call $A = A_{FT2} D_E$ (or, equivalently, $W_{A_{FT2}D_E}$) totally Wigner-decomposable.
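As a worked example of the form $\mathcal{F}_2 T_E$, the STFT itself can be written this way. The computation below is a sketch under two assumptions: the normalised rescaling $T_E F(x,t) = |\det E|^{1/2} F(E(x,t))$, and the specific matrix $E_{ST}$ shown, which is chosen here for illustration.

```latex
\begin{align*}
E_{ST} &= \begin{pmatrix} 0_{d\times d} & I_{d\times d} \\ -I_{d\times d} & I_{d\times d} \end{pmatrix},
\qquad E_{ST}\begin{pmatrix} x \\ t \end{pmatrix} = \begin{pmatrix} t \\ t - x \end{pmatrix},
\qquad |\det E_{ST}| = 1,\\[4pt]
T_{E_{ST}}(f\otimes\bar g)(x,t) &= f(t)\,\overline{g(t-x)},\\[4pt]
\mathcal{F}_2 T_{E_{ST}}(f\otimes\bar g)(x,\xi)
  &= \int_{\mathbb{R}^d} f(t)\,\overline{g(t-x)}\, e^{-2\pi i \xi\cdot t}\,dt = V_g f(x,\xi).
\end{align*}
```

No chirp factor appears, so in the terminology of Definition 4.1 this corresponds to the case $C = 0$.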
Metaplectic atoms of Wigner-decomposable distributions can be expressed in terms of the matrices $C$ and $E$. From now on, we assume that the matrix $E$ enjoys the block decomposition
(15) $E = \begin{pmatrix} E_{11} & E_{12} \\ E_{21} & E_{22}\end{pmatrix}.$
In [6, Corollary 4.12], the authors prove that a Wigner-decomposable distribution $W_A$, with $A_{FT2}$ defined as in (10), is shift-invertible if and only if $E$ is right-regular. Moreover, it follows by [6, Remark 4.7] that if $E$ has block decomposition (15), then the matrix $E_A$ associated to $A$ can be written in terms of the blocks of $E$.
Corollary 4.5. Under the assumptions of Proposition 4.3, the inverse $\pi_A^{-1}$ can be explicitly computed in terms of $\Phi_{-C}$, the chirp function defined in (9) (with $-C$ instead of $C$), and of the metaplectic atom $\pi_A$ in (16).
Proof. The proof uses that $\widehat{V_C}$ is the multiplication operator by the chirp $\Phi_C$. The rest is a straightforward computation.
As a consequence, the computation of the inverse $\pi_B^{-1}$ is immediate:
Corollary 4.7. Under the assumptions of the previous proposition, $\pi_B^{-1}$ can be expressed through $\pi_A^{-1}$ computed in (17).

Wigner-decomposable Gabor frames
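In this section we focus on frame theory and exhibit new Gabor-type frames, which originate from Wigner-decomposable distributions.
Definition 5.1. Let $W_A$ be a metaplectic Wigner distribution, $g \in L^2(\mathbb{R}^d)\setminus\{0\}$ and $\Lambda \subseteq \mathbb{R}^{2d}$ be a discrete set. A metaplectic Gabor frame of $L^2(\mathbb{R}^d)$ is a family $\mathcal{G}_A(g,\Lambda) = \{\pi_A(\lambda)g\}_{\lambda\in\Lambda}$ that enjoys the following property: there exist $A, B > 0$ such that $A\|f\|_2^2 \le \sum_{\lambda\in\Lambda} |W_A(f,g)(\lambda)|^2 \le B\|f\|_2^2$ for every $f \in L^2(\mathbb{R}^d)$. The metaplectic Gabor frames related to the STFT are simply referred to as Gabor frames.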
Theorem 5.2. Let $E \in GL(2d,\mathbb{R})$ be right-regular and $W_A = \mathcal{F}_2 T_E$ be the associated totally Wigner-decomposable distribution. Let $g \in L^2(\mathbb{R}^d)\setminus\{0\}$ and $\Lambda \subseteq \mathbb{R}^{2d}$ be a discrete set. The following statements are equivalent: (i) $\mathcal{G}_A(g,\Lambda)$ is a metaplectic Gabor frame with bounds $A$ and $B$; (ii) the associated classical Gabor system, rescaled through a diagonal matrix $\mathcal{E}$, is a Gabor frame.
Proof. The claim follows from (16), applied to every $f \in L^2(\mathbb{R}^d)$. The matrix $\mathcal{E}$ is invertible because $E$ is invertible and right-regular, and $E_{11} - E_{12}E_{22}^{-1}E_{21}$ is the Schur complement of $E_{22}$ in $E$.
We mention the characterization of modulation spaces through shift-invertible Wigner-decomposable distributions.
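The invertibility argument in the proof rests on a standard block factorisation, recalled here as a sketch. It assumes that $E$ has the block decomposition $E = \begin{pmatrix} E_{11} & E_{12} \\ E_{21} & E_{22}\end{pmatrix}$ and that $E_{22}$ is invertible, as the appearance of $E_{22}^{-1}$ suggests. The symbol $S$ for the Schur complement is introduced only for this illustration.

```latex
\begin{align*}
S &:= E_{11} - E_{12}E_{22}^{-1}E_{21},\\[4pt]
\begin{pmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{pmatrix}
 &= \begin{pmatrix} I & E_{12}E_{22}^{-1} \\ 0 & I \end{pmatrix}
    \begin{pmatrix} S & 0 \\ 0 & E_{22} \end{pmatrix}
    \begin{pmatrix} I & 0 \\ E_{22}^{-1}E_{21} & I \end{pmatrix},\\[4pt]
\det E &= \det(S)\,\det(E_{22}).
\end{align*}
```

Hence, if $E$ and $E_{22}$ are both invertible, so is $S$, and any block-diagonal matrix built from $S$ and invertible blocks of $E$ is invertible as well.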
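(ii) If $1 \le p, q \le \infty$, the window $g$ can be chosen in the larger class $M^1_v(\mathbb{R}^d)$.
Proof. Let (13) be the block decomposition of $A$. The proof relies on the expression of $W_A(f,g)$ in terms of the STFT and the blocks of $A$.
We recall from Section 2.4 that a unitary operator $U$ on $L^2(\mathbb{R}^d)$ which satisfies the intertwining relationship with the Schrödinger representation, for a matrix $S \in Sp(d,\mathbb{R})$, is called a metaplectic operator. The symplectic matrix $S$ is the projection of $U$ onto $Sp(d,\mathbb{R})$. We write $S = \pi^{Mp}(U)$ and denote $U = \hat S$. Given $S \in Sp(d,\mathbb{R})$, the associated metaplectic operator $\hat S$ is unique, up to a sign. The group $\{\pm\hat S : S \in Sp(d,\mathbb{R})\}$ of metaplectic operators is denoted by $Mp(d,\mathbb{R})$. For $C \in \mathbb{R}^{d\times d}$, let us denote
(9) $\Phi_C(t) := e^{i\pi Ct\cdot t}.$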

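(10) $A_{FT2} = \begin{pmatrix} I_{d\times d} & 0_{d\times d} & 0_{d\times d} & 0_{d\times d} \\ 0_{d\times d} & 0_{d\times d} & 0_{d\times d} & I_{d\times d} \\ 0_{d\times d} & 0_{d\times d} & I_{d\times d} & 0_{d\times d} \\ 0_{d\times d} & -I_{d\times d} & 0_{d\times d} & 0_{d\times d} \end{pmatrix}.$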
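As a sanity check, the block matrix above can be tested for symplecticity in $Sp(2d,\mathbb{R})$, that is, against the $4d\times 4d$ standard symplectic matrix. The sketch below reads the sixteen blocks row by row as $(I,0,0,0)$, $(0,0,0,I)$, $(0,0,I,0)$, $(0,-I,0,0)$, which is the reading assumed here. The helper names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def identity(n: int) -> Matrix:
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def zeros(n: int) -> Matrix:
    return [[0] * n for _ in range(n)]


def negate(x: Matrix) -> Matrix:
    return [[-v for v in row] for row in x]


def assemble(block_rows: List[List[Matrix]]) -> Matrix:
    """Build a matrix from a grid of square blocks of equal size."""
    out: Matrix = []
    for block_row in block_rows:
        for i in range(len(block_row[0])):
            row: List[int] = []
            for blk in block_row:
                row.extend(blk[i])
            out.append(row)
    return out


def matmul(x: Matrix, y: Matrix) -> Matrix:
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x: Matrix) -> Matrix:
    return [list(col) for col in zip(*x)]


def standard_j(n: int) -> Matrix:
    """Standard symplectic matrix of size 2n x 2n."""
    return assemble([[zeros(n), identity(n)], [negate(identity(n)), zeros(n)]])


def a_ft2(d: int) -> Matrix:
    """Assumed block form of A_FT2, a 4d x 4d matrix."""
    i, z, m = identity(d), zeros(d), negate(identity(d))
    return assemble([[i, z, z, z], [z, z, z, i], [z, z, i, z], [z, m, z, z]])


def is_symplectic(s: Matrix) -> bool:
    """Check S^T J S = J exactly (integer entries)."""
    j = standard_j(len(s) // 2)
    return matmul(transpose(s), matmul(j, s)) == j


# For d = 1 the matrix maps (x, y, xi, eta) to (x, eta, xi, -y).
image = matmul(a_ft2(1), [[1], [2], [3], [4]])
```

The action on $(x, y, \xi, \eta)$ leaves the first pair of variables $(x,\xi)$ untouched and rotates $(y,\eta)$ as $J$ would, in line with a Fourier transform taken in the second variable only.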