1 Introduction

Many high-dimensional signals, such as natural images, can be represented in a low-dimensional space. A variety of methods have been proposed for obtaining low-dimensional representations of high-dimensional signals, and sparse representation is one such technique that has been studied extensively. The problem can be described as recovering an unknown high-dimensional signal x from a low-dimensional signal y containing a limited set of measurements:

$$ \boldsymbol{y}=\boldsymbol{Mx}+\boldsymbol{e} $$
(1)

where M ∈ ℜm × d is a known linear operator and e ∈ ℜm is additive noise bounded by \( \parallel \boldsymbol{e}{\parallel}_2^2\le {\varepsilon}^2 \). In the noiseless case, e is set to the zero vector. One such example is the compressed sensing problem, where M is a measurement matrix. For m < d, this is an ill-posed underdetermined problem and thus has an infinite number of solutions, so extra priors or constraints need to be imposed on the model in order to limit the range of possible solutions for x.
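As a minimal numerical sketch of the measurement model in (1), assuming an illustrative Gaussian measurement matrix and Gaussian noise (these particular choices are ours, not prescribed by the model):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 120, 60                                  # signal dimension and number of measurements
x = rng.standard_normal(d)                      # unknown high-dimensional signal
M = rng.standard_normal((m, d)) / np.sqrt(m)    # known linear operator (measurement matrix)
e = 0.01 * rng.standard_normal(m)               # additive noise with small energy
y = M @ x + e                                   # low-dimensional measurement vector, cf. (1)
```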

Sparsity is usually considered an effective prior in both the synthesis and the analysis models [1]. The synthesis model assumes that x can be represented as x = Dz, where D ∈ ℜd × n is an over-complete dictionary with n > d and z is the sparse representation coefficient vector. The original signal x can be recovered by solving the following optimization problem:

$$ \widehat{\boldsymbol{z}}=\underset{\boldsymbol{z}}{argmin}\parallel \boldsymbol{z}{\parallel}_0\kern1em \mathrm{s}.\mathrm{t}.\kern1em \parallel \boldsymbol{y}-\boldsymbol{M}\boldsymbol{D}\boldsymbol{z}{\parallel}_2\le \varepsilon $$
(2)

where ‖ ⋅ ‖0 is the l 0 pseudo-norm, which counts the number of non-zero elements in its argument vector.

Since solving (2) is a non-deterministic polynomial (NP)-hard problem [2], many approximation techniques have been proposed to recover x. Basis pursuit (BP) [3], which is based on l 1-minimization solved by linear programming (LP), is a well-known reconstruction algorithm. Another option for approximating (2) is to use a family of greedy-like algorithms, such as Orthogonal Matching Pursuit [4] or thresholding techniques [5–10].
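As an illustration of the greedy synthesis approach, a minimal Orthogonal Matching Pursuit sketch for problem (2) is given below, assuming the effective dictionary A = MD is available explicitly; this is a generic textbook variant rather than the specific implementation of [4].

```python
import numpy as np

def omp(A, y, k, tol=1e-6):
    """Greedily recover a k-sparse z such that y is approximately A z."""
    m, n = A.shape
    z = np.zeros(n)
    residual = y.copy()
    support = []
    for _ in range(k):
        # pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx not in support:
            support.append(idx)
        # least-squares fit of y on the currently selected atoms
        z_sup, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ z_sup
        if np.linalg.norm(residual) <= tol:
            break
    z[support] = z_sup
    return z
```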

In the analysis model, recovering x from the incomplete measurements is achieved by solving the following minimization problem [11]:

$$ \widehat{\boldsymbol{x}}=\underset{\boldsymbol{x}}{argmin}\parallel \boldsymbol{\Omega} \boldsymbol{x}{\parallel}_0\kern1em \mathrm{s}.\mathrm{t}.\kern1em \parallel \boldsymbol{y}-\boldsymbol{Mx}{\parallel}_2\le \varepsilon $$
(3)

where Ω ∈ ℜp × d is a fixed analysis operator, also referred to as the analysis dictionary. Typically, the dimensions satisfy m ≤ d ≤ p, n. In the analysis model, the cosparsity l (the term is used to distinguish it from the "sparsity" of the synthesis model) is defined as

$$ l=p-\parallel \boldsymbol{\Omega} \boldsymbol{x}{\parallel}_0. $$
(4)

The role of cosparsity in the analysis model is similar to the role of sparsity in the synthesis model. The level of sparsity in the synthesis model indicates the number of non-zeros in the representation vector z in (2), while in the analysis model, the cosparsity l is used to indicate the number of zeros in the vector Ω x, as defined in (4). In other words, the quantity l denotes the number of rows of Ω that are orthogonal to the signal.
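As a small numerical illustration of (4), the cosparsity simply counts the rows of Ω that annihilate the signal; the operator, signal, and tolerance below are illustrative only.

```python
import numpy as np

Omega = np.array([[1., -1., 0.],
                  [0., 1., -1.],
                  [1., 0., -1.],
                  [1., 1., 1.]])     # p = 4 rows, d = 3
x = np.array([2., 2., 2.])           # constant signal
p = Omega.shape[0]
l = p - np.count_nonzero(np.abs(Omega @ x) > 1e-12)
print(l)  # 3: the three difference rows are orthogonal to a constant signal
```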

Solving problem (3) is NP-complete, just as in the synthesis case; thus, approximation methods are required for reconstructing x. As before, l 1-relaxation [13, 14] can be used to replace the l 0 pseudo-norm, and the resulting problem can then be solved by linear programming. With l 1-relaxation, only a small number of measurements is required to achieve a high reconstruction rate. However, the computational complexity of this method may limit its practical use in large-scale applications. The restricted isometry property (RIP) is commonly used by both synthesis- and analysis-based algorithms to govern the recovery conditions of sparse or cosparse signals. The measurement matrix M satisfies the Ω-RIP with a constant δ l if δ l is the smallest constant that satisfies

$$ \left(1-{\delta}_l\right)\parallel \boldsymbol{v}{\parallel}_2^2\le \parallel \boldsymbol{M}\boldsymbol{v}{\parallel}_2^2\le \left(1+{\delta}_l\right)\parallel \boldsymbol{v}{\parallel}_2^2 $$
(5)

whenever Ωv has at least l zeros [12].

Another popular class of cosparse reconstruction algorithms is based on iterative greedy pursuit, such as Greedy Analysis Pursuit (GAP) [11, 15, 16]. Compared with l 1-relaxation, GAP has better reconstruction performance and, to some degree, a lower computational complexity. Analysis Iterative Hard Thresholding (AIHT) and Analysis Hard Thresholding Pursuit (AHTP) [12, 17, 18] have also been proposed; they incorporate the idea of backtracking, which enables wrong cosupports obtained in previous iterations to be pruned in the current iteration, and offer strong theoretical guarantees. Experiments show that both of them recover the signal faster than the GAP algorithm. Nevertheless, they require a relatively large number of measurements for exact reconstruction.

Recently, more sophisticated greedy algorithms have been developed, such as Analysis Subspace Pursuit (ASP) and Analysis Compressive Sampling Matching Pursuit (ACoSaMP) [12, 19]. They employ the backtracking strategy and offer strong theoretical guarantees. ASP and ACoSaMP with a candidate set size of 2l − p perform well in reconstructing the signal when l is close to d, but they require more measurements for exact reconstruction as the cosparsity level increases. ASP and ACoSaMP with a candidate set size of l provide a reconstruction quality comparable to that of the l 1-relaxation methods with a lower reconstruction complexity. Other recent methods include a Bayesian approach [20], where the model parameters are estimated by a Bayesian algorithm to reconstruct the signal under consideration.

Although all these greedy pursuit methods achieve signal reconstruction with high accuracy, they require the cosparsity l to be known a priori for signal recovery. However, l may not be available in many practical applications. For example, most natural image signals are only approximately cosparse when represented by an analysis operator such as a two-dimensional Fourier transform, and it is difficult to define a cosparsity that exactly matches the signal under consideration. An inaccurate cosparsity may degrade the performance of the signal reconstruction algorithm, as demonstrated in the next section.

In this paper, a new greedy algorithm named Cosparsity-based Stagewise Matching Pursuit (CSMP) is proposed for the case where l is unknown. By analyzing the projection of the signal under consideration onto the analysis operator, CSMP estimates the cosparsity stage by stage with a pre-set step size, without requiring prior knowledge of the cosparsity. The cosupport and the measurement residual are estimated alternately in the forward stage and fine-tuned in the backward stage, and the signal approximation is obtained at the end of the procedure. Our experiments show that the proposed algorithm has a reconstruction performance comparable to ASP and ACoSaMP, but without the knowledge of the cosparsity.

This paper is organized as follows. Section 2 presents the motivation of this work. The CSMP algorithm is detailed in Section 3, together with a theoretical analysis of the algorithm. The simulation results are given in Section 4, followed by concluding remarks and future work in Section 5.

2 Motivation

In analysis cosparse representation, it is important to establish the cosupport accurately for an exact signal reconstruction. A common approach is to adopt the “correlation” term defined as follows:

$$ {\alpha}_i=\left|{\boldsymbol{\Omega}}_i{\boldsymbol{M}}^T\boldsymbol{y}\right|,\kern1em i\in \left\{1,2,\cdots, p\right\} $$
(6)

where M T y resembles the original signal, Ω i denotes the i-th row of Ω, the superscript T denotes matrix transpose, and | ⋅ | takes the absolute value of its argument. Obviously, a larger α i implies a stronger correlation between the signal and Ω i . When the signal is orthogonal to Ω i and M = I, where I is an identity matrix, we get α i  = 0. The existing recovery methods based on the analysis model can be categorized into bottom-up and top-down methods. The bottom-up methods, such as GAP, prune one or more rows of the analysis operator corresponding to the entries with the largest correlation in each iteration. This could lead to unreliable reconstruction if rows of the cosupport are removed incorrectly. In contrast, the top-down methods, such as ACoSaMP and ASP [12], employ the backtracking technique to establish the cosupport. Although this cosupport refinement technique improves the performance of ACoSaMP and ASP significantly, it requires the cosparsity l to be known a priori for an exact recovery of the target signal. In practical applications, such information is often unavailable and must be pre-set in advance. If l is set to an inappropriate value, the performance of the algorithm for signal recovery, in terms of both accuracy and robustness, could be degraded significantly.
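As a sketch, the correlation in (6) can be computed directly from M, y, and Ω as defined in (1); the function name below is ours and does not appear in Algorithm 1.

```python
import numpy as np

def correlations(Omega, M, y):
    """alpha_i = |Omega_i M^T y| for i = 1, ..., p, cf. (6)."""
    x_proxy = M.T @ y            # M^T y resembles the original signal
    return np.abs(Omega @ x_proxy)
```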

To see this, we first perform an experiment with the ASP algorithm when l is not given accurately. Here, the analysis operator is a two-dimensional finite difference operator Ω ∈ ℜp × d, where p = 144 and d = 120; x is a Gaussian random signal of length d = 120 with a cosparsity of l = 90. In this experiment, the sampling rate δ, defined as δ = m / d [12], is chosen from the set {0.50, 0.54, 0.58, 0.62, 0.66, 0.70}. The cosparsity l est, which denotes the estimated cosparsity of the signal, is chosen from the set {110, 100, 90, 80, 70}. We draw a phase transition diagram [12] for this algorithm.

The vertical and horizontal axes of the diagram are l est and δ, respectively. For each pair of l est and δ, we repeat the experiment 50 times. In each trial, we check whether we have an exact reconstruction, in the sense that the energy of the difference between the reconstruction and the original signal is smaller than 10−6 [12]. Note that this threshold value follows the existing algorithms in the literature, such as GAP, ASP, ACoSaMP, AIHT, and AHTP, to allow fair comparisons in our experiments later. White cells in the diagram denote a completely exact reconstruction, and black cells denote a total failure in the exact reconstruction. Figure 1 shows how the probability of exact reconstruction changes with respect to the sampling rate δ and the cosparsity l est. We can see that the performance of ASP drops significantly if an inaccurate cosparsity l is used.
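The per-cell test can be sketched as follows; make_problem is a hypothetical problem generator and recover stands for whichever reconstruction algorithm is being evaluated, so both names are ours.

```python
import numpy as np

def exact_rate(recover, make_problem, trials=50, threshold=1e-6):
    """Fraction of trials whose squared reconstruction error is below the threshold."""
    successes = 0
    for _ in range(trials):
        x, M, y = make_problem()            # draw a signal and its measurements
        x_hat = recover(M, y)
        if np.linalg.norm(x_hat - x) ** 2 < threshold:
            successes += 1
    return successes / trials
```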

Fig. 1 The probability of exact reconstruction versus the cosparsity in ASP

3 Cosparsity-based Stagewise Matching Pursuit

3.1 Algorithm description

To address the above issue, we propose a novel greedy algorithm for blind cosparse reconstruction, in which the cosupport is refined iteratively and the cosparsity information is extracted automatically. The proposed CSMP algorithm, shown in Algorithm 1, is composed of two processes, namely, a forward and a backward process. The forward process estimates the cosparsity, constructs the cosupport starting from a cosupport containing all rows of Ω, and updates the measurement residual simultaneously. The procedure ends with a backward process that tries to add the rows of Ω with smaller correlations until the terminating condition is reached. The terminating condition of CSMP is controlled by a threshold, which ensures that the estimated cosparsity is fairly close to the actual one and that the target signal has been well reconstructed. The main steps of CSMP are summarized in Algorithm 1.

In Algorithm 1, cosupp(x) = {i: Ω i x = 0}; cosupp(x, l est) returns the index set of the l est smallest (in absolute value) elements of Ω x; 2l est − p is the size |Γ| of the candidate set Γ, and for a vector α, the function Min(α, 2l est − p) returns the 2l est − p indices corresponding to the 2l est − p smallest values of α; index(Ω x, q) returns the q indices of the (l est + 1)-th to (l est + q)-th smallest (in absolute value) rows of Ω x; Γ, Λ Δ, \( {\tilde{\varLambda}}^k \), and Λ k are subsets of {1, 2, ⋯, p}; and ⌈s/q⌉ denotes the smallest integer that is not smaller than s / q.
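A possible Python rendering of these set-valued helpers is sketched below; the names mirror the notation of Algorithm 1 but are our assumptions about an implementation, not code reproduced from the paper.

```python
import numpy as np

def cosupp(Omega, x, tol=1e-12):
    """cosupp(x) = {i : Omega_i x = 0}, up to a numerical tolerance."""
    return np.flatnonzero(np.abs(Omega @ x) <= tol)

def cosupp_l(Omega, x, l_est):
    """Indices of the l_est smallest (in absolute value) entries of Omega x."""
    return np.argsort(np.abs(Omega @ x))[:l_est]

def min_k(alpha, k):
    """Min(alpha, k): indices of the k smallest values of alpha."""
    return np.argsort(alpha)[:k]

def index(Omega, x, l_est, q):
    """q indices of the (l_est+1)-th to (l_est+q)-th smallest rows of Omega x."""
    return np.argsort(np.abs(Omega @ x))[l_est:l_est + q]
```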

The CSMP algorithm adopts a stagewise approach [6] to estimate the real cosparsity at each stage of the forward process, which only requires the step size s to be set at initialization. Here, l is defined as the real cosparsity of the original signal, and s should normally not be larger than d − l [11]. The initial cosupport Λ 0 has the maximum number of rows of Ω that enables CSMP to recover the signal. To avoid under-estimating the cosparsity, a safe choice is s = 1 if l is unknown. Nevertheless, there is a tradeoff between s and the recovery speed, as a smaller s requires more stages. We can see that GAP and ASP/ACoSaMP are special cases of the proposed algorithm, corresponding to CSMP with a step size of s = 1 and s = d − l, respectively.

Suppose CSMP uses a cosparsity of l est in the forward process. With this cosparsity, the candidate set is constructed by selecting the rows of Ω with the smallest correlations. Here, we explain the relation between greedy synthesis algorithms and their analysis counterparts. Consider two vectors v 1, v 2 ∈ ℜd, and let Λ 1 = cosupp(Ω v 1) and Λ 2 = cosupp(Ω v 2). Assuming that |Λ 1| ≥ (l est)1 and |Λ 2| ≥ (l est)2, it holds that |Λ 1 ∩ Λ 2| ≥ (l est)1 + (l est)2 − p. For the case |Λ 1| = |Λ 2| = l est, we have 2l est − p ≤ |Λ 1 ∩ Λ 2| ≤ l est. So 2l est − p is a reasonable size of the candidate set for CSMP, which corresponds to the candidate set size of 2k used by CoSaMP in the synthesis model. Denoting T 1 = supp(Ω v 1) and T 2 = supp(Ω v 2), it is clear that supp(Ω(v 1 + v 2)) ⊆ T 1 ∪ T 2. Noticing that supp(⋅) = cosupp(⋅)C, we get cosupp(Ω(v 1 + v 2)) ⊇ (T 1 ∪ T 2)C = T 1 C ∩ T 2 C = Λ 1 ∩ Λ 2, where the superscript C denotes the complementary set. This implies that the union of the supports in the synthesis case is analogous to the intersection of the cosupports in the analysis case.
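A quick numerical check of this counting argument, with random cosupports of size l est drawn from {1, 2, ⋯, p} (the sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
p, l_est = 144, 110
L1 = set(rng.choice(p, size=l_est, replace=False))
L2 = set(rng.choice(p, size=l_est, replace=False))
# |L1 ∩ L2| >= |L1| + |L2| - p = 2*l_est - p always holds
assert len(L1 & L2) >= 2 * l_est - p
```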

Now, with the candidate set, we begin to construct the cosupport and update the measurement residual. The rows of the analysis operator that correspond to the smallest l est components of the temporarily estimated signal (calculated in Step 4 in Algorithm 1) are used to form the cosupport. Then, we approximate the signal and update the measurement residual of the current iteration with this cosupport. The backtracking strategy [12, 19], which not only selects from the candidate set the rows that better match the current residual signal in each iteration but also excludes the other rows from the cosupport, provides the basis for constructing a more accurate cosupport and obtaining a smaller measurement residual. Here, \( \mathrm{index}\left(\boldsymbol{\Omega} {\widehat{\boldsymbol{x}}}_{\mathrm{temp}},q\right) \) is reserved for sharing in the backward process. An efficient mechanism is required for switching between stages, which repeats as each stage finishes until l est < l. A stage switch is triggered when the current measurement residual energy no longer decreases compared with that of the last iteration. From Step 9 in Algorithm 1, we can see that the algorithm runs for some iterations with the same cosparsity l est until it reaches the stage switching condition. The algorithm then proceeds with a cosparsity of l est − s in the next stage. The forward process does not stop until the measurement residual reaches a pre-set threshold, such as ‖y r ‖2/‖y‖2 ≤ 10−6 for the noiseless case or ‖y r ‖2 ≤ ε for the noisy case.
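A schematic sketch of the forward process is given below. It collapses the candidate-set construction and the backtracking refinement of Algorithm 1 into a single cosupport update, uses the relaxed objective introduced later in Section 3.2 in place of the constrained projection, and is only meant to illustrate the stage-switching logic, not to reproduce Algorithm 1 step by step.

```python
import numpy as np

def solve_on_cosupport(M, y, Omega, Lam, lam=0.001):
    """Minimize ||y - M x||_2^2 + lam * ||Omega_Lam x||_2^2 (relaxed projection)."""
    A = M.T @ M + lam * Omega[Lam].T @ Omega[Lam]
    return np.linalg.solve(A, M.T @ y)

def csmp_forward(M, y, Omega, s, tol=1e-6, max_iter=200):
    p, d = Omega.shape
    l_est = p                                        # initial cosupport: all rows of Omega
    x_hat = np.zeros(d)
    prev_res = np.inf
    for _ in range(max_iter):
        proxy = x_hat + M.T @ (y - M @ x_hat)            # signal proxy for the correlation step
        Lam = np.argsort(np.abs(Omega @ proxy))[:l_est]  # rows with the smallest correlations
        x_hat = solve_on_cosupport(M, y, Omega, Lam)
        res = np.linalg.norm(y - M @ x_hat)
        if res / np.linalg.norm(y) <= tol:               # forward terminating condition
            break
        if res >= prev_res:                              # residual stopped decreasing: switch stage
            l_est = max(l_est - s, 1)
            prev_res = np.inf
        else:
            prev_res = res
    return x_hat, Lam, l_est
```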

In the backward process, the algorithm tries to further increase the cosparsity by adding the less used rows into the cosupport. These q rows have been chosen in the last iteration of the forward process. The value of q can be chosen in terms of the value of s. As a rule of thumb, a small q is chosen when s is relatively small, and likewise, a larger q should be chosen if s is large. Typically, in our experiments, we choose q = 1 when s = 10, but we select q = 50 when s is in the order of thousands. With this strategy, we can obtain a more accurate cosupport for signal approximation. The iterations stop as soon as the measurement residual reaches the terminating threshold used in the forward process. The backward process needs to repeat at most ⌈s/q⌉ times, since this is enough for the cosparsity to change from l est to l est + s.
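A matching sketch of the backward process, reusing solve_on_cosupport from the sketch above, is shown below; again this is a simplified illustration rather than the exact pseudocode of Algorithm 1.

```python
import numpy as np

def csmp_backward(M, y, Omega, x_hat, l_est, s, q, tol=1e-6):
    """Add q less-used rows to the cosupport per iteration, at most ceil(s/q) times."""
    for _ in range(int(np.ceil(s / q))):
        order = np.argsort(np.abs(Omega @ x_hat))    # rows ordered by correlation with x_hat
        l_est += q
        Lam = order[:l_est]                          # enlarged cosupport
        x_hat = solve_on_cosupport(M, y, Omega, Lam)
        if np.linalg.norm(y - M @ x_hat) / np.linalg.norm(y) <= tol:
            break
    return x_hat
```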

3.2 Relaxed versions for high-dimensional problems

In CSMP, the constrained minimization problem

$$ \underset{\boldsymbol{x}}{min}\kern0.28em \parallel \boldsymbol{y}-\boldsymbol{Mx}{\parallel}_2^2\kern1.2em \mathrm{s}.\mathrm{t}.\kern1em {\boldsymbol{\Omega}}_{\varLambda}\boldsymbol{x}=\mathbf{0} $$
(7)

is hard to solve for high-dimensional signals, and we propose to replace it with the minimization of the following cost function in Steps 4, 6, 12, and 14 in Algorithm 1:

$$ \parallel \boldsymbol{y}-\boldsymbol{Mx}{\parallel}_2^2+\lambda \kern0.5em \parallel {\boldsymbol{\Omega}}_{\varLambda}\boldsymbol{x}{\parallel}_2^2 $$
(8)

where λ is a relaxation parameter and we choose λ = 0.001 as in [12] in our experiments. This results in a relaxed version of the algorithm, hereafter referred to as relaxed CSMP (RCSMP).
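For large d, (8) can be minimized matrix-free by applying conjugate gradients to its normal equations. The sketch below assumes M and Ω Λ are available as scipy LinearOperator objects with matvec and rmatvec defined; it is an illustrative solver, not necessarily the one used in the experiments reported later.

```python
from scipy.sparse.linalg import LinearOperator, cg

def relaxed_projection(M_op, Omega_Lam_op, y, d, lam=0.001):
    """Minimize ||y - M x||_2^2 + lam * ||Omega_Lam x||_2^2 via its normal equations."""
    def matvec(x):
        # (M^T M + lam * Omega_Lam^T Omega_Lam) x, computed without forming the matrices
        return M_op.rmatvec(M_op.matvec(x)) + lam * Omega_Lam_op.rmatvec(Omega_Lam_op.matvec(x))
    A = LinearOperator((d, d), matvec=matvec, dtype=float)
    x_hat, info = cg(A, M_op.rmatvec(y))
    return x_hat
```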

3.3 Theoretical performance analysis

This section presents a theoretical analysis of the behavior of CSMP for the cosparse model in both the noiseless and noisy cases. Because the proposed algorithm uses a backtracking strategy similar to that of ASP, the proofs are mainly based on the proof framework of ASP/ACoSaMP. The following theorems parallel those in [12], except for the unknown cosparsity and the different initial cosupports and measurement vectors.

To show the ability of the CSMP algorithm to recover cosparse signals exactly and stably, we define the variables involved in the signal reconstruction process as follows:

Definition 3.1 [12] Let \( {\boldsymbol{Q}}_{\varLambda }=\boldsymbol{I}-{\boldsymbol{\Omega}}_{\varLambda}^{+}{\boldsymbol{\Omega}}_{\varLambda } \) be the orthogonal projection onto the null space of Ω Λ, where Ω Λ is a sub-matrix of the analysis operator Ω ∈ ℜp × d and Λ is a subset of {1, 2, ⋯, p}. A cosupport Ŝ l , which implies a near-optimal projection, is defined as

$$ {\widehat{S}}_l\left(\boldsymbol{v}\right)=\underset{\varLambda \in {\mathrm{L}}_l}{argmin}\parallel \boldsymbol{v}-{\boldsymbol{Q}}_{\varLambda}\boldsymbol{v}{\parallel}_2^2 $$
(9)

where v ∈ ℜd and \( {\mathrm{L}}_l=\left\{\varLambda \subseteq \left\{1,2,\cdots, p\right\}:\left|\varLambda \right|\ge l\right\} \) is the set of l-cosparse cosupports.

Definition 3.2 (Problem p) [12] Consider a measurement vector y ∈ ℜm such that y = Mx + e, where x ∈ ℜd is l-cosparse, M ∈ ℜm × d is a measurement matrix, and e ∈ ℜm is a bounded additive noise. The largest singular value of M is σ M and its Ω-RIP constant is δ l . The analysis operator Ω ∈ ℜp × d is given and fixed. Define C l to be the ratio between the largest and the smallest non-zero eigenvalues of the sub-matrix composed of l rows of Ω. Assume that Ŝ l  = cosupp(Ω x,l). According to Definition 3.1, the cosupport of x is a near-optimal projection. Our task is to recover x from y. The recovery result is denoted by \( \widehat{\boldsymbol{x}} \).

We now give guarantees on the exact recovery and stability of CSMP for cosparse signals.

Theorem 3.3 (Convergence for cosparse recovery) Consider the problem p, and suppose that there exists γ > 0 such that

$$ \left(1+{C}_{2{l}_q-p}\right)\left(1-\left({\scriptscriptstyle \frac{C_{2{l}_q-p}}{\left(1+{\gamma}^2\right)}}-\left({C}_{2{l}_q-p}-1\right){\sigma}_{\boldsymbol{M}}^2\right)\right)<1, $$
(10)

then there exists \( \delta \left({C}_{2{l}_q-p},{\sigma}_{\boldsymbol{M}}^2,\gamma \right)>0, \) whenever \( {\delta}_{4{l}_q-3p}\le \delta \left({C}_{2{l}_q-p},{\sigma}_{\boldsymbol{M}}^2,\gamma \right), \) such that the k-th iteration of the algorithm satisfies

$$ \parallel \boldsymbol{x}-{\widehat{\boldsymbol{x}}}^k{\parallel}_2\le \frac{1+{\delta}_{2{l}_q-p}}{1-{\delta}_{2{l}_q-p}}{\rho}_1{\rho}_2\parallel \boldsymbol{x}-{\widehat{\boldsymbol{x}}}^{k-1}{\parallel}_2+\left(\frac{1+{\delta}_{2{l}_q-p}}{1-{\delta}_{2{l}_q-p}}\left({\eta}_1+{\rho}_1{\eta}_2\right)+\frac{2}{1-{\delta}_{2{l}_q-p}}\right)\parallel \boldsymbol{e}{\parallel}_2, $$
(11)

where

$$ \begin{array}{ll}{\eta}_1\triangleq \kern1em & \frac{\sqrt{{\scriptscriptstyle \frac{2+{C}_{l_q}}{1+{C}_{l_q}}}+2\sqrt{C_{l_q}}+{C}_{l_q}}\sqrt{1+{\delta}_{3{l}_q-2p}}}{1-{\delta}_{4{l}_q-3p}},\kern1em \\ {}{\eta}_2^2\triangleq \kern1em & \left(\frac{1+{\delta}_{3{l}_q-2p}}{\gamma \left(1+\alpha \right)}+\frac{\left(1+{\delta}_{2{l}_q-p}\right){C}_{2{l}_q-p}}{\gamma \left(1+\alpha \right)\left(1+\gamma \right)}+\frac{\left({C}_{2{l}_q-p}-1\right)\left(1+\gamma \right){\sigma}_{\boldsymbol{M}}^2}{\left(1+\alpha \right)\left(1+\gamma \right)\gamma}\right),\kern1em \\ {}{\rho}_1^2\triangleq \kern1em & \frac{1+2{\delta}_{4{l}_q-3p}\sqrt{C_{l_q}}+{C}_{l_q}}{1-{\delta}_{4{l}_q-3p}^2},\kern1em \\ {}{\rho}_2^2\triangleq \kern1em & 1-{\left(\sqrt{{\delta}_{4{l}_q-3p}}-\sqrt{\frac{{C}_{2{l}_q-p}}{{\left(1+\gamma \right)}^2}{\left(1-\sqrt{{\delta}_{2{l}_q-p}}\right)}^2-\left({C}_{2{l}_q-p}-1\right)\left(1+{\delta}_{2{l}_q-p}\right){\sigma}_{\boldsymbol{M}}^2}\right)}^2\kern1em \end{array} $$

and

$$ \alpha =\frac{\sqrt{\delta_{4{l}_q-3p}}}{\sqrt{{\scriptscriptstyle \frac{C_{2{l}_q-p}}{{\left(1+\gamma \right)}^2}}{\left(1-{\delta}_{2{l}_q-p}\right)}^2-\left({C}_{2{l}_q-p}-1\right)\left(1+{\delta}_{2{l}_q-p}\right){\sigma}_{\boldsymbol{M}}^2}-\sqrt{\delta_{4{l}_q-3p}}} $$
(12)

Moreover, when \( {\rho}_1^2{\rho}_2^2<1, \) the iteration converges. The constant γ gives a tradeoff between satisfying the theorem conditions and the noise level, and the conditions for the noiseless case are achieved when γ tends to zero.

Theorem 3.4 (Exact recovery for cosparse signals) Consider the problem p when ‖e‖ 2 = 0. Let l s  = d − s⌈(d − l)/s⌉ and l q  = l s  + q⌊(l − l s )/q⌋. If the measurement matrix M satisfies the Ω-RIP with parameter \( {\delta}_{4{l}_q-3p}\le \delta \left({C}_{2{l}_q-p},{\sigma}_{\boldsymbol{M}}^2,\gamma \right), \) where \( {C}_{2{l}_q-p} \) and γ are as in Theorem 3.3 and \( \delta \left({C}_{2{l}_q-p},{\sigma}_{\boldsymbol{M}}^2,\gamma \right) \) is a constant guaranteed to be greater than zero whenever (10) is satisfied, the CSMP algorithm guarantees an exact recovery of x from y via a finite number of iterations.

The proof is mainly based on the following lemma:

Lemma 3.5 If the measurement matrix M satisfies the Ω-RIP with the same conditions as in Theorem 3.4, then

  • The (⌈(d − l)/s⌉ + ⌊(l − l s )/q⌋)-th stage of the algorithm is equivalent to the ASP algorithm with estimated cosparsity l q , except that they have different initial cosupports and initial measurement vectors.

  • CSMP recovers the target signal exactly after completing the (⌈(d − l)/s⌉ + ⌊(l − l s )/q⌋)-th stage.

In the Appendix, Lemma 3.5 is proved in detail.

Lemma 3.5 shows that CSMP has a signal reconstruction process equivalent to that of ASP and can complete the exact recovery of the cosparse signals in a finite number of stages. To complete the proof, it is sufficient to show that the CSMP algorithm never gets stuck at any iteration of any stage, i.e., it takes a finite number of iterations over at most ⌈(d − l)/s⌉ + ⌊(l − l s )/q⌋ stages. At each stage, the cosupport (whose size is assumed to be l est) adds and discards some rows of Ω, and the number of rows is fixed and finite. Hence, there is a finite number of possible cosupports, at most \( \left(\begin{array}{c} p\\ {}{l}_{\mathrm{est}}\end{array}\right), \) where p is the number of rows of Ω. Thus, if CSMP took an infinite number of iterations in a stage, the construction of the cosupport would repeat after at most \( \left(\begin{array}{c} p\\ {}{l}_{\mathrm{est}}\end{array}\right) \) iterations; this is prevented by the requirement that the measurement residual strictly decreases within a stage. Hence, Theorem 3.4 follows.

Lemma 3.6 (Stability for cosparse recovery) Consider the problem p. If (10) holds and \( {\delta}_{4{l}_q-3p}\le \delta \left({C}_{2{l}_q-p},{\sigma}_{\boldsymbol{M}}^2,\gamma \right) \) , where γ is as in Theorem 3.3 and \( \delta \left({C}_{2{l}_q-p},{\sigma}_{\boldsymbol{M}}^2,\gamma \right) \) is a constant guaranteed to be greater than zero whenever \( \frac{\left({C}_{2{l}_q-p}^2-1\right){\sigma}_{\boldsymbol{M}}^2}{C_{2{l}_q-p}^2}<1 \) is satisfied, then for any

$$ k\ge \left|\frac{ \log \left(\parallel \boldsymbol{x}{\parallel}_2/\parallel \boldsymbol{e}{\parallel}_2\right)}{ \log \left(1/{\scriptscriptstyle \frac{1+{\delta}_{2{l}_q-p}}{1-{\delta}_{2{l}_q-p}}}{\rho}_1{\rho}_2\right)}\right|, $$

we have

$$ \parallel \widehat{\boldsymbol{x}}-\boldsymbol{x}{\parallel}_2\le \left(1+\frac{1-{\left({\scriptscriptstyle \frac{1+{\delta}_{2{l}_q-p}}{1-{\delta}_{2{l}_q-p}}}{\rho}_1{\rho}_2\right)}^k}{1-{\scriptscriptstyle \frac{1+{\delta}_{2{l}_q-p}}{1-{\delta}_{2{l}_q-p}}}{\rho}_1{\rho}_2}\cdot \left({\scriptscriptstyle \frac{1+{\delta}_{2{l}_q-p}}{1-{\delta}_{2{l}_q-p}}}\left({\eta}_1+{\rho}_1{\eta}_2\right)+\frac{2}{1-{\delta}_{2{l}_q-p}}\right)\right)\parallel \boldsymbol{e}{\parallel}_2 $$
(13)

implying that CSMP leads to a stable recovery. The constants η 1 , η 2 , ρ 1 , and ρ 2 are the same as in Theorem 3.3.

Similarly, the proof of Lemma 3.6 is based on Lemma 3.5 and the corresponding theorems of ASP algorithm in [12], and we omit the detailed proof here.

The above theorems give sufficient conditions for the exact recovery and stability of CSMP. They are slightly more restrictive than the corresponding results for the ASP algorithm because the true cosparsity level l is always larger than or equal to the estimated one l q . This may be regarded as an additional cost of not having precise knowledge of the cosparsity. On the other hand, the proofs also show that these sufficient conditions may not be optimal or tight, because they only consider the final stage and ignore the influence of the previous stages on the performance of the algorithm.

4 Experiments

In this section, we evaluate the performance of the proposed algorithm in comparison with several baseline algorithms. To this end, we repeat some of the experiments performed in [12] for both the noiseless case (e = 0) and the noisy case.

4.1 Phase transition diagrams for synthetic signals in the noiseless case

We show the performance of the proposed algorithm compared with six baseline methods, namely, AIHT, AHTP, ASP, ACoSaMP, l 1-relaxation, and GAP, using the same experiments as performed in [12] for the noiseless case. We begin with synthetic signals and test the performance of CSMP with s = 1, s = 5, and s = 10. The results of the proposed algorithm are compared with those of AIHT and AHTP with an adaptively changing step size, ASP and ACoSaMP with a = 1 and \( a={\scriptscriptstyle \frac{2l-p}{l}} \), l 1-relaxation, and GAP. We use a random matrix M, where each entry is drawn independently from a Gaussian distribution, and a random tight frame Ω with d = 120 and p = 144.

We draw a phase transition diagram [12] for each of the algorithms. In each phase transition diagram, 20 different possible values of m and 20 different values of l are tested. For each pair of m and l est, we repeat the experiment 50 times, with the value of l est selected according to the formula in [12]:

$$ l=d-\rho m $$
(14)

where ρ is the ratio between d − l and the number of measurements m (cf. (14)), shown on the vertical axis of the phase diagram. The sampling rate δ is defined as δ = m / d and shown on the horizontal axis. There are 400 cells in each phase transition diagram, and the gray level of each cell shows the exact reconstruction rate of the corresponding recovery algorithm. White cells in the diagram denote a completely exact reconstruction, and black cells denote a total failure in the exact reconstruction.

The reconstruction results of the proposed algorithm and the baseline algorithms are shown in Fig. 2.

Fig. 2 The phase transition diagrams for a CSMP with s = 1, b CSMP with s = 5, c CSMP with s = 10, d AIHT with an adaptive changing step size, e AHTP with an adaptive changing step size, f ASP with a = 1, g ASP with \( a={\scriptscriptstyle \frac{2l-p}{l}} \), h ACoSaMP with a = 1, i ACoSaMP with \( a={\scriptscriptstyle \frac{2l-p}{l}} \), j l 1-relaxation, and k GAP

In Fig. 2, experiments with s = 1, s = 5, and s = 10 are performed for CSMP. It can be seen that CSMP achieves better results than AIHT and AHTP with an adaptively changing step size, and than ASP and ACoSaMP with \( a={\scriptscriptstyle \frac{2l-p}{l}} \), when l est is far from d. In addition, the proposed algorithm with s = 5 and s = 10 provides performance comparable to ASP and ACoSaMP with a = 1 and to l 1-relaxation, even though the cosparsity is unknown. Although the exact recovery rates of GAP over all pairs of δ and ρ are higher than those of CSMP, the number of white cells in Fig. 2k is 59, which is less than the 67 in Fig. 2b and the 63 in Fig. 2c.

4.2 Reconstruction of high-dimensional images in the noiseless and noisy cases

We now test the methods on high-dimensional signals. We use RCSMP (the relaxed version of CSMP defined in Section 3.2) for the reconstruction of the Shepp-Logan phantom from a small number of measurements. RCSMP is computationally more efficient than CSMP for high-dimensional signals and is thus chosen for this experiment. The sampling operator M is a two-dimensional Fourier transform that measures only a certain number of radial lines of the Fourier transform. The cosparse operator is a two-dimensional finite difference operator Ω 2D-DIF with p = 130,560 rows, and the real cosparsity of the signal under this operator is l = 128,014. The original phantom image is presented in Fig. 3a. Using RCSMP with s = 4000 and q = 50, we obtain a perfect reconstruction using only 12 radial lines, just as RASP (relaxed Analysis Subspace Pursuit) does, i.e., only m = 3032 measurements out of d = 65,536, which is less than 4.63 % of the data in the original image. The algorithm requires fewer than 15 iterations, which is fewer than required by RASP to achieve this recovery percentage. The reconstruction results of CSMP are shown in Fig. 3c.
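For reference, a sparse construction of a two-dimensional first-order finite difference operator of the kind used here can be sketched as follows, assuming row-major vectorization of an n × n image; for n = 256 this gives p = 2 · 256 · 255 = 130,560 rows over d = 65,536 pixels, matching the dimensions quoted above.

```python
import numpy as np
import scipy.sparse as sp

def finite_difference_2d(n):
    """Stack horizontal and vertical first-order differences of an n-by-n image."""
    D1 = sp.diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1], shape=(n - 1, n))
    I = sp.identity(n)
    Dh = sp.kron(I, D1)      # horizontal differences: (n*(n-1)) x n^2
    Dv = sp.kron(D1, I)      # vertical differences:   (n*(n-1)) x n^2
    return sp.vstack([Dh, Dv]).tocsr()

Omega = finite_difference_2d(256)
print(Omega.shape)           # (130560, 65536)
```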

Fig. 3 a Shepp-Logan Phantom image. b 12 sampled radial lines. c CSMP with s = 4000 and q = 50 using 12 radial lines

We now turn to the noisy case. We perform reconstruction using RCSMP with s = 4000 and q = 50 from noisy measurements of the phantom along 22 radial lines (Fig. 4b) with a signal-to-noise ratio (SNR) of 20 dB. Figure 4c presents the noisy image, which is the result of applying the inverse Fourier transform to the measurements. The proposed method achieves a reconstruction with a peak SNR (PSNR) of 37.11 dB, shown in Fig. 4e. The image recovered using GAP is shown in Fig. 4f, with a slightly worse PSNR of 34.34 dB. Note that for the minimization process, we solve Steps 4, 6, 12, and 14 of Algorithm 1 with conjugate gradients in each iteration, take only the real part of the result, and crop the values of the resulting image to the range [0, 1] [12].

Fig. 4 a Shepp-Logan Phantom image. b 22 sampled radial lines. c Noisy image with a SNR of 20 dB. d Location of non-zero elements in the difference map. e Recovery image using CSMP with s = 4000 and q = 50 and only using 22 radial lines. f Recovery image using GAP and only using 22 radial lines

5 Conclusions

We have presented CSMP, a novel greedy pursuit algorithm for the cosparse analysis model. With the proposed algorithm, the cosparsity of the target signal is not required to be known a priori, which addresses a common limitation of the existing greedy pursuit algorithms. The underlying intuition of CSMP is to obtain the cosparsity estimate and the signal approximation in the forward process and to refine them in the backward process. Borrowing ideas from ASP, a theoretical study of the proposed algorithm has been carried out to give guarantees for stable recovery under the assumption of the Ω-RIP and the existence of an optimal or a near-optimal projection. Experiments have confirmed that the proposed algorithm gives competitive results for signal recovery compared with l 1-relaxation and ACoSaMP/ASP in the noiseless case and better results than GAP in the noisy case.