1 Introduction

The concept of compressed sensing (CS) was first introduced by Donoho [12], Candès, Romberg and Tao [8], and Candès and Tao [9]. The essential idea is to recover an original n-dimensional but sparse signal/image from linear measurements whose number is far smaller than n. Recently, a large number of researchers, including applied mathematicians, computer scientists and engineers, have turned their attention to this area owing to its wide applications in signal processing, communications, astronomy, biology, medicine, seismology and so on; see, e.g., the survey papers [1, 2, 19] and the monograph [14].

The fundamental problem in compressed sensing is to reconstruct a high-dimensional sparse signal from a remarkably small number of measurements. We aim to recover a sparse solution \(x\in\mathbb {R}^{n}\) of an underdetermined system of the form Φx=y, where \(y\in\mathbb{R}^{m}\) is the available measurement and \(\varPhi\in\mathbb{R}^{m\times n}\) is a known measurement matrix (with m≪n). The natural mathematical model is to minimize the number of nonzero components of x, i.e., to solve the following ℓ_0-norm optimization problem:

$$\min_{x\in\mathbb{R}^{n}}\Vert x\Vert _0\quad\text{s.t.}\quad\varPhi x=y, $$

(1)

where ‖x‖_0 is the ℓ_0-norm of the vector \(x\in\mathbb{R}^{n}\), i.e., the number of nonzero entries in x (this is not a true norm, as ∥⋅∥_0 is not positively homogeneous). A vector x with at most k nonzero entries, ‖x‖_0⩽k, is called k-sparse. However, (1) is combinatorial and computationally intractable, and one popular and powerful approach is to solve it via ℓ_1 minimization (its convex relaxation)

$$\min_{x\in\mathbb{R}^{n}}\Vert x\Vert _1\quad\text{s.t.}\quad\varPhi x=y. $$

(2)
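To make the relaxation concrete, (2) can be recast as a linear program and solved with off-the-shelf software. The following sketch is purely our illustration (the Gaussian matrix, dimensions and seed are arbitrary choices, not part of the paper's analysis):

```python
import numpy as np
from scipy.optimize import linprog

def l1_minimize(Phi, y):
    """Solve min ||x||_1 s.t. Phi x = y via the standard LP reformulation:
    variables z = (x, t), minimize sum(t) subject to -t <= x <= t and Phi x = y."""
    m, n = Phi.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])       # objective: sum of t
    A_ub = np.block([[np.eye(n), -np.eye(n)],           #  x - t <= 0
                     [-np.eye(n), -np.eye(n)]])         # -x - t <= 0
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([Phi, np.zeros((m, n))])           # Phi x = y
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]

# toy example: recover a 3-sparse vector from 40 Gaussian measurements
rng = np.random.default_rng(0)
n, m, k = 100, 40, 3
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_hat = l1_minimize(Phi, Phi @ x_true)
print(np.linalg.norm(x_hat - x_true))                   # small: recovery up to solver tolerance
```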

One of the most commonly used frameworks for sparse recovery via ℓ_1 minimization is the Restricted Isometry Property (RIP) introduced by Candès and Tao [9]. For an integer k∈{1,2,⋯,n}, the k-restricted isometry constant (RIC) δ_k of a matrix Φ is the smallest number in (0,1) such that

$$(1-\delta_k)\Vert x\Vert _2^2\leqslant\Vert \varPhi x\Vert _2^2\leqslant(1+\delta_k)\Vert x\Vert _2^2 $$

(3)

holds for all k-sparse vectors x. We say that Φ has the k-RIP if there is a k-RIC δ_k∈(0,1) such that the above inequalities hold. Furthermore, if for integers k_1,k_2,⋯,k_s there exist \(\delta_{k_{1}}, \delta _{k_{2}},\cdots,\delta_{k_{s}}\in(0,1)\) such that the corresponding inequalities hold, we say that Φ has the {k_1,k_2,⋯,k_s}-RIP. The restriction δ_k∈(0,1) is standard in the literature, see, e.g., [13, 14], and δ_k is monotone in k (see, e.g., [3, 4]), i.e.,

$$\delta_{k_1}\leqslant\delta_{k_2}\quad\text{whenever}\ k_1\leqslant k_2. $$

(4)

Thus, Φ having the {k_1,k_2,⋯,k_s}-RIP is the same as Φ having the max{k_1,k_2,⋯,k_s}-RIP. In addition, if k+k′⩽n, the (k,k′)-restricted orthogonality constant (ROC) θ_{k,k′} is the smallest number that satisfies

$$\bigl|\langle\varPhi x,\varPhi x'\rangle\bigr|\leqslant\theta_{k,k'}\Vert x\Vert _2\bigl\Vert x'\bigr\Vert _2 $$

(5)

for all k-sparse x and k′-sparse x′ with disjoint supports. Candès and Tao [9] established the following link between the RIC and the ROC:

$$\theta_{k,k'}\leqslant\delta_{k+k'}\leqslant\theta_{k,k'}+\max (\delta_k,\delta_{k'} ). $$

(6)

From the definition (3), one can observe that

$$\delta_k=\max_{|T|\leqslant k}\bigl\Vert \varPhi_T^{\mathrm{T}}\varPhi_T-I\bigr\Vert , $$

(7)

where ∥⋅∥ denotes the spectral norm of a matrix (see, e.g., [18]) and Φ_T is the submatrix formed by the columns of Φ indexed by T. Clearly, it is hard to compute RICs for a given matrix Φ, because it essentially requires that every subset of columns of Φ with a certain cardinality approximately behaves like an orthonormal system. Moreover, as shown by Zhang [20], for a nonsingular matrix (transformation) \(Q\in\mathbb{R}^{m\times m}\), the RIP constants of Φ and QΦ can be very different, even though Φx=y and QΦx=Qy have exactly the same solutions. However, a widely used technique for avoiding a direct check of the RIP condition is to generate the matrix randomly and to show that the resulting random matrix satisfies the RIP with high probability [17].
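As an illustration of (7), and of why a direct RIP check is impractical, the sketch below computes δ_k exactly by enumerating all supports of size k; the cost is combinatorial, so it is feasible only for toy sizes. The small Gaussian matrix is our own example:

```python
import numpy as np
from itertools import combinations

def restricted_isometry_constant(Phi, k):
    """Exact k-RIC via (7): the largest spectral norm of Phi_T^T Phi_T - I over all
    column subsets T with |T| = k (sufficient, by the monotonicity (4))."""
    n = Phi.shape[1]
    delta = 0.0
    for T in combinations(range(n), k):
        G = Phi[:, list(T)].T @ Phi[:, list(T)] - np.eye(k)
        delta = max(delta, np.linalg.norm(G, 2))        # spectral norm
    return delta

rng = np.random.default_rng(1)
m, n = 10, 16
Phi = rng.standard_normal((m, n)) / np.sqrt(m)          # toy i.i.d. Gaussian matrix
for k in (1, 2, 3):
    print(k, restricted_isometry_constant(Phi, k))      # delta_k grows with k, cf. (4)
```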

Although the RIP condition is difficult to check, it is of independent interest to study bounds on the RIC in CS, since ℓ_1-norm minimization recovers a sparse signal under various conditions on δ_k, δ_{2k} and θ_{k,k′}, such as δ_k+θ_{k,k}+θ_{k,2k}<1 in [9], δ_{2k}+θ_{k,2k}<1 in [10], and δ_{1.25k}+θ_{k,1.25k}<1 in [4].

Many previous results in compressed sensing are stated in terms of δ_{2k}, probably because a bound on δ_{2k} guarantees that k-sparse signals remain well separated in the measurement space. The first major result of this sort was established by Candès [6], namely that \(\delta_{2k}\leqslant\sqrt{2}-1\) is sufficient for k-sparse signal reconstruction. Recently, Cai and Zhang [5] obtained the sufficient condition δ_{2k}⩽1/2. To the best of our knowledge, the bound on δ_{2k} for sparse recovery has gradually been improved from \(\sqrt{2}-1 ({\approx}0.4142)\) to 0.5 in recent years. The details are listed in Table 1 below.

Table 1 Different bounds for RIC

The main contribution of the present paper is to give new bounds on the RIC in CS, stated in the following theorem. Here, for \(x\in\mathbb{R}^{n}\), the best k-sparse approximation \(x^{ (k )}\in\mathbb{R}^{n}\) is obtained from x by setting all but the k largest entries (in absolute value) to zero.
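For concreteness, this hard-thresholding operation can be written in a few lines; the function name below is our own illustration:

```python
import numpy as np

def best_k_sparse(x, k):
    """Best k-sparse approximation x^(k): keep the k largest-magnitude entries, zero the rest."""
    xk = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]     # indices of the k largest |x_i|
    xk[idx] = x[idx]
    return xk

print(best_k_sparse(np.array([0.1, -3.0, 0.0, 2.5, -0.2]), 2))   # keeps -3.0 and 2.5 only
```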

Theorem 1

Let x be a feasible solution to (1) and \(x^{(k)}\) be the best k-sparse approximation of x. If the following inequalities hold

$$\delta_{8ak}<1 $$

(8)

and

$$\delta_{k+ak}<\frac{ (12a-1 )-\sqrt{16a^2+24a+1}}{8a} $$

(9)

for a>3/8, then the solution \(\hat{x}\) to the ℓ_1 minimization problem (2) satisfies

$$\Vert \hat{x}-x\Vert _1\leqslant\frac{2 (1+C_0 )}{1-C_0}\bigl\Vert x-x^{(k)}\bigr\Vert _1 $$

(10)

for some positive constant C_0<1 given explicitly by (16). In particular, if x is k-sparse, the recovery is exact.

From Theorem 1, taking a=1,1.5,2 and 3, we obtain the conditions δ_{2k}<0.5746, δ_{2.5k}<0.7046, δ_{3k}<0.7731 and δ_{4k}<0.8445, each under the corresponding assumption δ_{8ak}<1. Comparing with Table 1, under the extra assumption δ_{8ak}<1, our conditions are all weaker than the ones known in the literature.
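These thresholds can be reproduced numerically from the quadratic inequality that appears in the proof of Lemma 4 below: for a>3/8, the admissible δ_{k+ak} is bounded by the smaller root of 4aδ²−(12a−1)δ+8a−3=0. The closed-form root used here is our own rearrangement of that quadratic:

```python
import numpy as np

def delta_bound(a):
    """Smaller root of 4a*d^2 - (12a-1)*d + 8a - 3 = 0, i.e. the largest admissible
    delta_{k+ak} under the quadratic inequality from the proof of Lemma 4 (a > 3/8)."""
    return ((12 * a - 1) - np.sqrt(16 * a ** 2 + 24 * a + 1)) / (8 * a)

for a in (1.0, 1.5, 2.0, 3.0):
    print(a, delta_bound(a))   # ~0.5746, 0.7047, 0.7731, 0.8445: the thresholds above, up to rounding
```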

Note that the k-RIP condition requires every subset of at most k columns of Φ to behave approximately like an orthonormal system. In the context of (large-scale) sparse optimization, one typically has k≪n. Recently, Candès and Recht [7] showed that a k-sparse vector in \(\mathbb{R}^{n}\) can be efficiently recovered from about 2k log n measurements with high probability, i.e., \(m=\mathcal{O}(2k\log n)\). In this regime, 8ak remains smaller than m for moderate values of a. Thus 8ak<m and 8ak≪n are reasonable, and our extra assumption is meaningful and valuable in large-scale sparse optimization.

The organization of this paper is as follows. In the next section, we establish some key inequalities. In Sect. 3, we prove our main result. In Sect. 4, we conclude this paper with some remarks.

2 Key Inequalities

In this section, we will give some inequalities, which play an important role in improving the RIC bound for sparse recovery in this paper.

We begin with the following interesting and important inequality, which connects the ℓ_0, ℓ_1, ℓ_2, ℓ_∞ and ℓ_{−∞} norms. Here, we define ‖x‖_{−∞}:=min_i{|x_i|}. (In fact, ‖⋅‖_{−∞} is not a norm, since the triangle inequality does not hold.) For convenience, we call (11) the Norm Inequality; it is essentially (6) in [3].

Proposition 1

(Norm Inequality)

For any \(x\in\mathbb {R}^{n}\) and x≠0,

$$\frac{\Vert x\Vert _1}{\sqrt{\Vert x\Vert _0}}\leqslant\Vert x\Vert _2\leqslant\frac{\Vert x\Vert _1}{\sqrt{\Vert x\Vert _0}}+\frac{\sqrt{\Vert x\Vert _0}}{4}\bigl(\Vert x\Vert _\infty-\Vert x\Vert _{-\infty}\bigr). $$

(11)

Furthermore, we obtain the following general inequality,

$$\frac{\Vert x\Vert _1}{\sqrt{\Vert x\Vert _0}}\leqslant\Vert x\Vert _2\leqslant\frac{\Vert x\Vert _1}{\sqrt{\Vert x\Vert _0}}+ \frac{\sqrt{\Vert x\Vert _0}}{4}\Vert x\Vert _\infty. $$
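A quick numerical sanity check of this general inequality (purely illustrative; dense random vectors are used, so ‖x‖_0 equals the length):

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(1000):
    x = rng.standard_normal(rng.integers(1, 20))     # dense random vector, so ||x||_0 = len(x)
    k = np.count_nonzero(x)
    lo = np.linalg.norm(x, 1) / np.sqrt(k)
    hi = lo + np.sqrt(k) / 4 * np.max(np.abs(x))
    assert lo - 1e-12 <= np.linalg.norm(x, 2) <= hi + 1e-12
print("general inequality verified on random vectors")
```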

Throughout the paper, let \(\hat{x}\) be a solution to the minimization problem (2), and let \(x\in\mathbb{R}^{n}\) be a feasible point, i.e., Φx=y. Clearly, \(\Vert \hat{x}\Vert _{1}\leqslant\Vert x\Vert _{1}\). Let \(x^{ (k )}\in\mathbb{R}^{n}\) be the best k-sparse approximation of x, as defined above. Without loss of generality, we assume that the support of x^{(k)} is T_0.

Denote \(h=\hat{x}-x\), and let h_T be the vector that agrees with h on an index set T and is zero elsewhere. We decompose h into a sum of vectors \(h_{T_{0}} ,h_{T_{1}} ,h_{T_{2}},\cdots{}\), where T_1 corresponds to the locations of the ak largest coefficients of \(h_{T_{0}^{C}}\) (\(T_{0}^{C}=T_{1}\cup T_{2}\cup\cdots\)), T_2 to the locations of the 4ak largest coefficients of \(h_{(T_{0}\cup T_{1})^{C}}\), T_3 to the locations of the next 4ak largest coefficients of \(h_{(T_{0}\cup T_{1})^{C}}\), and so on. That is,

$$h=h_{T_0}+h_{T_1}+h_{T_2}+\cdots. $$

(12)

Here, the sparsity of \(h_{T_{0}}\) is at most k, the sparsity of \(h_{T_{1}}\) is at most ak, and the sparsity of each \(h_{T_{j}}\) (j⩾2) is at most 4ak.

In order to get a new bound on RIC, for the above decomposition (12), we define

$$\rho:=\frac{\Vert h_{T_1}\Vert _1}{\sum_{j\geqslant1}\Vert h_{T_j}\Vert _1}. $$

(13)

Obviously, ρ∈[0,1] and

$$\sum_{j\geqslant2}\Vert h_{T_j}\Vert _1=(1-\rho)\sum_{j\geqslant1}\Vert h_{T_j}\Vert _1. $$
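The decomposition (12) and the ratio ρ are easy to compute explicitly. The sketch below is our illustration (arbitrary sizes, with T_0 chosen at random as a stand-in for supp(x^{(k)})), and it also confirms the identity displayed above:

```python
import numpy as np

def block_decompose(h, T0, k, a):
    """Index blocks as in (12): T0 is given (support of x^(k)); T1 holds the a*k largest
    |h_i| outside T0; T2, T3, ... hold successive groups of 4*a*k entries, in decreasing order."""
    rest = np.setdiff1d(np.arange(len(h)), T0)
    rest = rest[np.argsort(np.abs(h[rest]))[::-1]]    # outside T0, by decreasing magnitude
    blocks, start, size = [np.asarray(T0)], 0, int(a * k)
    while start < len(rest):
        blocks.append(rest[start:start + size])
        start, size = start + size, int(4 * a * k)
    return blocks

rng = np.random.default_rng(3)
n, k, a = 60, 4, 1
h = rng.standard_normal(n)
T0 = rng.choice(n, k, replace=False)                  # stand-in for supp(x^(k))
blocks = block_decompose(h, T0, k, a)
l1 = [np.abs(h[T]).sum() for T in blocks]
rho = l1[1] / sum(l1[1:])                             # rho as in (13) (our reconstruction)
print(np.isclose(sum(l1[2:]), (1 - rho) * sum(l1[1:])))   # identity stated above: True
```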

Applying the Norm Inequality, we can derive some inequalities for h that are very useful in the proof of our main result.

Lemma 1

Let \(h_{T_{0}}, h_{T_{1}}, h_{T_{2}}, \cdots\), and ρ be given by (12) and (13), respectively. Then

(14)

and

(15)

Proof

By the definitions of \(h_{T_{j}}(j=1,2,\cdots)\) and ρ, direct calculation yields

Thus, (14) holds. It remains to show (15). Applying the Norm Inequality (11), we obtain that

for j=2,3,⋯, where for the last \(h_{T_{j}}\), we set \(\Vert h_{T_{j+1}}\Vert _{\infty}:=0\). Adding up all the inequalities for j=2,3,⋯, we get that

The desired conclusion holds immediately. □

At the end of this section, we give two lemmas which relate the norms of \(\varPhi h_{T_{j}}\) and \(h_{T_{j}}\).

Lemma 2

Let \(h_{T_{0}}\) and \(h_{T_{1}}\) be given by (12). Then

$$\bigl\Vert \varPhi (h_{T_0}+h_{T_1} )\bigr\Vert _2^2 \geqslant\frac{1-\delta _{k+ak}}{k} \biggl(\Vert h_{T_0}\Vert _1^2+\frac{1}{a}\Vert h_{T_1}\Vert _1^2 \biggr). $$

Proof

From (3), we obtain

$$\bigl\Vert \varPhi (h_{T_0}+h_{T_1} )\bigr\Vert _2^2 \geqslant (1-\delta_{k+ak} )\Vert h_{T_0}+h_{T_1}\Vert _2^2. $$

Because the supports T_0 and T_1 are disjoint, the following equality holds

$$\Vert h_{T_0}+h_{T_1}\Vert _2^2=\Vert h_{T_0}\Vert _2^2+\Vert h_{T_1}\Vert _2^2. $$

Therefore

where the second inequality is derived from (11). □
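(The display omitted above can be recovered from the lower bound in the Norm Inequality (11): since \(h_{T_0}\) and \(h_{T_1}\) are at most k- and ak-sparse, respectively, a plausible form of the suppressed chain, reconstructed by us, is

$$\bigl\Vert \varPhi (h_{T_0}+h_{T_1} )\bigr\Vert _2^2\geqslant (1-\delta_{k+ak} ) \bigl(\Vert h_{T_0}\Vert _2^2+\Vert h_{T_1}\Vert _2^2 \bigr)\geqslant\frac{1-\delta_{k+ak}}{k} \biggl(\Vert h_{T_0}\Vert _1^2+\frac{1}{a}\Vert h_{T_1}\Vert _1^2 \biggr), $$

which is exactly the statement of Lemma 2.)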

Lemma 3

Let \(h_{T_{0}}, h_{T_{1}}, h_{T_{2}}, \cdots\), and ρ be given by (12) and (13), respectively. Then

$$\biggl\Vert \sum_{j\geqslant2}\varPhi h_{T_j}\biggr\Vert_2^2\leqslant\frac{4\rho(1-\rho)+\delta_{8ak}}{4ak} \biggl(\sum _{j\geqslant1}\Vert h_{T_j}\Vert _1\biggr)^2. $$

Proof

By direct calculations, we obtain that

where the first inequality holds by the triangle inequality, the second holds due to (3) and (6), the third is from (14) and (15); and the first equality holds from

$$\sum_{j\geqslant2}\Vert h_{T_j}\Vert _2^2+2\sum_{j>i\geqslant2}\Vert h_{T_j}\Vert _2\Vert h_{T_i}\Vert _2= \biggl(\sum_{j\geqslant2}\Vert h_{T_j}\Vert _2 \biggr)^2. $$

Hence, the desired result follows. □

3 Proof of the Main Result

In this section, we will prove our main result. For simplicity, we first define a quadratic function of the variable ρ,

$$f(\rho):=4\rho (1-\rho )+\delta_{8ak}-4 (1-\delta_{k+ak} )\rho^2. $$

Clearly, it is a strictly concave function. We can easily obtain the maximum value of f(ρ) by setting its derivative to zero, that is,

$$f\bigl(\rho^*\bigr)=\max_{0\leqslant\rho\leqslant1}f(\rho)=\frac{1+(2-\delta _{k+ak})\delta_{8ak}}{2-\delta_{k+ak}}>0, $$

where

$$\rho^*=\frac{1}{2 (2-\delta_{k+ak} )}. $$

Moreover, we denote that

$$C_0:=\sqrt{\frac{1+ (2-\delta_{k+ak} )\delta_{8ak}}{4a (2-\delta_{k+ak} ) (1-\delta_{k+ak} )}}. $$

(16)

Before proving our main result, we show that the RIP bound (9) is a sufficient condition for C_0<1.

Lemma 4

If (8) and (9) hold, then C_0<1.

Proof

From (9), it is easy to verify that

$$-4a\delta_{k+ak}^2+(12a-1)\delta_{k+ak}+3-8a<0, $$

which is equivalent to

$$\frac{3-\delta_{k+ak}}{4a(2-\delta_{k+ak})(1-\delta_{k+ak})}<1. $$

Since 0⩽δ_{8ak}⩽1, by (16) we have

Thus, if (9) holds, we ensure C_0<1. □
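The equivalence claimed in the proof above, between the quadratic inequality and the bound on the rational expression, can be checked symbolically; a minimal SymPy sketch (our illustration):

```python
import sympy as sp

a, d = sp.symbols('a d', positive=True)
# numerator of (3-d)/(4a(2-d)(1-d)) - 1, after clearing the denominator (positive for 0 < d < 1)
lhs = 3 - d - 4 * a * (2 - d) * (1 - d)
quad = -4 * a * d ** 2 + (12 * a - 1) * d + 3 - 8 * a
print(sp.expand(lhs - quad))   # 0: the two inequalities are equivalent whenever 0 < d < 1
```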

Now we begin to prove our main result.

Proof of Theorem 1

The proof proceeds in two steps, which is a common approach in the literature [4, 6].

The first step is to prove that

$$\Vert h_{T_0}\Vert _1\leqslant C_0\sum_{j\geqslant1}\Vert h_{T_j}\Vert _1. $$

(17)

The second step shows that \(\Vert \hat{x}-x\Vert _{1}\) is appropriately small.

For the first step, we note that Φh=0, which implies that

$$\bigl\Vert \varPhi(h_{T_0}+h_{T_1})\bigr\Vert _2^2= \biggl\Vert \sum_{j\geqslant2}\varPhi h_{T_j}\biggr\Vert_2^2. $$

From Lemmas 2 and 3, the following inequality holds

Then we get

where the first inequality is derived from (13). Combining with (16), we get (17).

For the second step, we have

where the first inequality holds from (12) in [6]. Then

$$\sum_{j\geqslant1}\Vert h_{T_j}\Vert _1\leqslant\frac{2}{1-C_0}\bigl\Vert x-x^{(k)}\bigr\Vert _1. $$

This together with (17) yields

This completes the proof of (10).

In particular, if x is k-sparse, then xx (k)=0, and hence \(x=\hat{x}\) from (10). □

4 Conclusion

In this paper, we have shown that, for a>3/8, conditions (8) and (9) yield several interesting RIC bounds for measurement matrices, such as the bounds on δ_{2k}, δ_{2.5k}, δ_{3k} and δ_{4k} given above. For an intuitive analysis, we plot the curve relating t(:=a+1) to the bound for δ_{tk}.

From Fig. 1, it is easy to see that the bounds for δ_{tk} increase quickly for 1.75⩽t⩽3 and exceed 0.9 when t⩾6. In addition, Davies and Gribonval [11] have given detailed counterexamples showing that the bound on δ_{2k} cannot exceed \(1/\sqrt{2}\approx 0.7071\). Since 0.5746<0.7071, we wonder whether there is a better way to improve the bound 0.5746 for δ_{2k} without the extra assumption δ_{8k}<1. Further research topics are therefore to remove the extra assumption δ_{8ak}<1 and to further reduce the gap between 0.5746 and 0.7071.

Fig. 1 The curve of bounds for δ_{tk}
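A curve of this shape can be regenerated from the quadratic in the proof of Lemma 4; the root formula and plotting range below are our own choices:

```python
import numpy as np
import matplotlib.pyplot as plt

def delta_bound(a):
    # smaller root of the quadratic 4a*d^2 - (12a-1)*d + 8a - 3 = 0 from the proof of Lemma 4
    return ((12 * a - 1) - np.sqrt(16 * a ** 2 + 24 * a + 1)) / (8 * a)

t = np.linspace(1.4, 7.0, 200)        # t = a + 1, so a = t - 1 > 3/8 throughout
plt.plot(t, delta_bound(t - 1))
plt.xlabel("t")
plt.ylabel(r"bound for $\delta_{tk}$")
plt.show()
```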