Background

The recent development of compressive sensing (CS) theory (Candès et al. 2006; Donoho 2006; Eldar and Kutyniok 2012) has drawn much attention in the signal processing community over the past few years. The basic principle of CS is that sparse or compressible signals can be recovered from far fewer measurements than are required by traditional data acquisition, which is limited by the Shannon–Nyquist sampling theorem. An attractive advantage of CS is that the encoder is signal independent and computationally inexpensive, at the cost of high complexity at the decoder. This is highly desirable in many applications where the data acquisition devices must be simple (e.g. inexpensive resource-deprived sensors) or where a long sampling process would harm the object being captured (e.g. X-ray imaging) (Eldar and Kutyniok 2012).

A signal \( x = \{ x_n\}_{n=1}^N\) of length N is said to be sparse in a basis space \(\varPsi \) = \( \{\psi _n \}_{1\leqslant n \leqslant N}\) if the transform coefficients \(\langle x, \psi _n \rangle , 1\le n \le N \) are mostly zero, or nearly sparse in the space \(\varPsi \) if a dominant portion of these N coefficients are either zero or very close to zero. The sparsity of x in \(\varPsi \) is quantified by the number of significant (nonzero) coefficients K. The signal can be perfectly recovered from \( M = O(K\log (N/K))\) observations with high probability. Current CS recovery algorithms exploit the prior knowledge that a natural image is sparse in some domain, such as the DCT, wavelet, or total variation (TV) domain (Becker et al. 2011; Bioucas-Dias and Figueiredo 2007; Candès et al. 2006; Li et al. 2009; Rudin et al. 1992). The TV model was first proposed by Rudin et al. (1992) for image denoising and has been successfully used for image restoration. Recently, some TV solvers have been incorporated into the CS framework (Becker et al. 2011; Bioucas-Dias and Figueiredo 2007; Li et al. 2009; Lustig et al. 2007). In particular, TV-based CS recovery methods achieve state-of-the-art results (Li et al. 2009). In this paper, we use a TV-based CS recovery approach, which adopts the finite difference as the sparsifying operator.

Given M measurements \( y = \varPhi x \), with \( \varPhi \) producing the random projections, standard CS recovers x from y using the following constrained optimization problem:

$$\begin{aligned} \min _x \Vert \varPsi ^\text{T} x\Vert _\text{p} \quad \text{s.t.} \quad y = \varPhi x \end{aligned}$$
(1)

where p is usually set to 1 or 0, which promotes a sparse solution for the vector \( \varPsi ^\text{T} x \). Here \(\Vert *\Vert _1 \) is the \( \ell _1 \) norm, i.e. the sum of the absolute values of all the elements of a vector, while \(\Vert *\Vert _0\) is the \( \ell _0 \) norm, which counts the nonzero entries of a vector. The \( \ell _1 \) minimization problem of (1) can be solved by linear programming (Candès and Tao 2005). Other recovery algorithms have also been proposed, including gradient projection for sparse reconstruction (Figueiredo et al. 2007), matching pursuit (Tropp and Gilbert 2007), and iterative thresholding methods (Daubechies et al. 2004).
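As an illustration of the iterative thresholding family mentioned above, the following is a minimal sketch rather than any of the cited implementations. It assumes \( \varPsi \) is the identity, replaces the equality constraint of (1) with a quadratic penalty weighted by a hypothetical parameter `lam`, and uses a Gaussian sensing matrix; the function names `soft_threshold` and `ista` are ours.

```python
import numpy as np

def soft_threshold(v, tau):
    """Shrink each entry of v toward zero by tau (proximal map of tau * l1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(Phi, y, lam=0.05, n_iter=200):
    """Minimise 0.5 * ||y - Phi x||_2^2 + lam * ||x||_1 by iterative soft thresholding."""
    # Step size 1/L, where L is the Lipschitz constant of the data-term gradient.
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    history = []
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
        history.append(0.5 * np.sum((y - Phi @ x) ** 2) + lam * np.sum(np.abs(x)))
    return x, history

# Synthetic K-sparse signal (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
N, M, K = 128, 48, 5
x_true = np.zeros(N)
x_true[rng.choice(N, K, replace=False)] = rng.normal(size=K)
Phi = rng.normal(size=(M, N)) / np.sqrt(M)
y = Phi @ x_true

x_hat, history = ista(Phi, y)
print("l1 norm of estimate:", np.sum(np.abs(x_hat)))
print("entries above 1e-3 :", np.count_nonzero(np.abs(x_hat) > 1e-3))
```

The penalised form is used here only because it leads to a short loop; the constrained problem (1) would need a different solver, such as the linear programming approach cited above.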

The recovery performance of CS depends significantly on the measurements. The dominant cost in the sensing (measurement) process is the product of the sensing matrix and the signal, which requires O(MN) operations in mainstream CS. The random projection is not only time consuming but also consumes a large amount of memory, particularly for large-scale data sensing. To sense signals under resource-limited conditions, we adopt a random element-wise operator that randomly samples M pixels of an image x with N pixels. This sensing strategy, which can be regarded as an approximation of the identity operator, drastically reduces the required computing resources. Random mask sensing can be useful in constructing 2D or 3D maps for military, environmental, and medical applications, among others.
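To make the cost gap concrete, the short sketch below counts operations for a \(256 \times 256\) image at a 35% sensing ratio. The two helper functions are hypothetical and only encode the O(MN) and O(M) counts stated above; constant factors and hardware details are ignored.

```python
def dense_sensing_cost(n_pixels, n_measurements):
    """Multiply-adds (and stored entries) for y = Phi x with a dense M x N matrix."""
    return n_measurements * n_pixels

def mask_sensing_cost(n_pixels, n_measurements):
    """Pixel reads for element-wise sampling; the 0/1 mask itself has N entries."""
    return n_measurements

side = 256
N = side * side        # number of pixels
M = int(0.35 * N)      # number of measurements at a 35% sensing ratio

print("dense projection:", dense_sensing_cost(N, M))
print("element-wise mask:", mask_sensing_cost(N, M))
print("ratio:", dense_sensing_cost(N, M) // mask_sensing_cost(N, M))
```

Under these assumptions the dense projection costs N times more than the mask, which is why the element-wise operator scales to large images.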

A natural property of CS is its ability to compactly encode the signal x without considering any specific features of x. Hence, adaptive measurement learning, which captures the most important characteristics of a signal, could improve the CS recovery performance to a great extent. Some model-based or adaptive recovery algorithms (Wu et al. 2012; Soni and Haupt 2012, 2011) have been proposed, which greatly improve the recovery performance over traditional signal-independent CS. For example, the model-guided adaptive recovery of compressive sensing (MARX) (Wu et al. 2012) improves the reconstruction quality of existing CS methods by 2–7 dB for some natural images. However, such CS recovery algorithms are time consuming, taking about 10 h for a \(512 \times 512\) image. This time-consuming recovery process is common to most methods in the literature and thus limits their use for large-scale data sensing. Consequently, block-based CS (Mun and Fowler 2009) and a fast CS framework (Do et al. 2012) have been introduced in recent years. The TV-based CS recovery algorithms are edge-preserving and much faster than other algorithms (Bioucas-Dias and Figueiredo 2007; Becker et al. 2011; Li et al. 2009; Lustig et al. 2007). Additionally, the structurally random matrix (SRM) is highly relevant for large-scale and real-time compressive sensing, as it offers fast computation and supports block-based processing (Mun and Fowler 2009; Do et al. 2012). However, these fast CS methods are not adaptive to signals. An adaptive sampling method (Yang et al. 2012) has been proposed, but its initial sampling stage is fixed. It is well known that edges are the critical and dominant information for natural images in computer vision, as they contain not only the local statistics but also the nonlocal ones of the images.
In this paper, a new framework of adaptive-random sampling and recovery (ASR) algorithm is proposed to improve the rate-distortion performance of the image acquisition system, while maintaining a low complexity of the encoder.

The decoder is essentially the same as the traditional CS recovery algorithm, except that a low-complexity sensing operator is incorporated. To the best of our knowledge, this is the first time that edges of recovered images have been exploited in adaptive image recovery, which is our main contribution. With completely random element-wise measurements, the measurements are independent and all spatial pixels have the same chance of being measured. In contrast, our adaptive sensing strategy gives pixels located around edges a greater chance of being sensed than pixels in smooth regions of the image.

In the new method, we sample m pixels in total, which serve as the m compressed measurements. For convenience, we also refer to our method as compressive (partial) sampling or sensing, although it is not the same as standard CS. The proposed framework is able to balance computational cost against reconstruction quality. In the new image acquisition-coding system, \(m_\text{r} (\ll N)\) random measurements \(y = \varPhi _\text{r} \circ f\) of a 2D image f (\(f\in R^{\sqrt{N} \times \sqrt{N}}\)) are acquired first, where \( \varPhi _\text{r}\) is a \(\sqrt{N} \times \sqrt{N}\) random matrix with 0/1 entries and \( \circ \) denotes the element-wise product. In other words, we randomly sense \(m_\text{r}\) image pixels as the measurements. These measurements are quantized and sequentially transmitted to the decoder. Second, we recover a coarse image \(\hat{f}_1\) from the \( m_\text {r} \) measurements by the traditional TV-based recovery algorithm. The edges of f are predicted from \(\hat{f}_1\), and the \(m_\text {a}\) pixels located on them are sensed as the adaptive measurements. Third, the edge measurements are combined with the \(m_\text {r}\) random ones into an updated set of \(m = m_\text {r} + m_\text {a}\) measurements, from which a refined image \(\hat{f}_2\) is recovered. Using this measurement-learning or measurement-updating concept, the double recovery procedure could improve the reconstruction performance significantly compared with the same number of completely random measurements and with other state-of-the-art CS methods.

The remainder of this paper is organized as follows. The next section introduces the sensing strategy of the new ASR framework, a hybrid adaptive-random sampling. The subsequent section presents the de-quantization process, in which the CS decoder solves an under-determined inverse problem of ASR, and is followed by simulation results. The final section concludes the paper.

Hybrid adaptive-random sensing for ASR

In this section, we introduce the hybrid adaptive-random sampling (HAS) protocol of the ASR framework. Figure 1 shows the schematic diagram for the protocol.

Fig. 1 The schematic diagram for the ASR framework

Here, we assume the image f(x, y) to be a function in the 2D Hilbert space \(L(R)\times L(R)\). The new sensing matrix can be constructed by the following four steps:

Step 1: we generate a random sensing pattern \( \varPhi _\text {r}\) by drawing from a 2D uniform distribution \(U(0,1)\times U(0,1)\) and applying binary thresholding. The completely random sampling, which acquires pixels at edges and in smooth regions uniformly, captures the image profile information and guarantees the RIP and incoherence condition. We recover a low-quality image \(f_\text {l} = \hat{f}_1\) from the \(m_\text {r}\) completely random measurements \(y_\text {r} = \varPhi _\text {r} \circ f\).
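A minimal sketch of Step 1 is given below. It assumes one plausible reading of the thresholding step: an independent U(0,1) value is drawn per pixel and the pixel is kept when the value falls below the target ratio. The function name `random_mask`, the 64 × 64 stand-in image and the 20% ratio are illustrative assumptions, not values from the experiments.

```python
import numpy as np

def random_mask(shape, ratio, rng):
    """0/1 pattern: draw U(0,1) per pixel and keep pixels whose draw is below `ratio`."""
    return (rng.uniform(0.0, 1.0, size=shape) < ratio).astype(np.uint8)

rng = np.random.default_rng(0)
f = rng.uniform(0, 255, size=(64, 64))   # stand-in for the image f
Phi_r = random_mask(f.shape, 0.2, rng)   # random sensing pattern
y_r = Phi_r * f                          # element-wise product; unsampled pixels are zero
m_r = int(Phi_r.sum())                   # number of random measurements

print("sampled pixels:", m_r, "of", f.size)
```

With this construction the number of sampled pixels is only approximately `ratio * N`; an exact count could be obtained by sampling pixel indices without replacement instead.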

Step 2: the low-quality image \(f_\text {l} = \hat{f}_1\) is used to predict the edge information of the high-quality image f to be recovered. Mathematically, we have

$$\begin{aligned} \varGamma (f)\approx \varGamma (f_\text {p})=\varGamma (I(f_\text {l})) \end{aligned}$$
(2)

where \(\varGamma (f)=1\) for the edge pixels of f and \(\varGamma (f)=0\) otherwise. \(f_\text {p}\) denotes the image predicted from the low-quality image \(f_\text {l}\). The prediction operation I, such as image denoising or deblurring, maps \(f_\text {l}\) to \(f_\text {p}\). Here, we simply use the low-quality image \(f_\text {l}\) as the predicted image \(f_\text {p}\). \(\varGamma \) denotes the edge detection operator, which can be implemented with the Sobel edge detector (Canny 1986) and binary thresholding. As a result, the real edges \(\varGamma (f)\) of the high-quality image can be approximated by the predicted edges \(\varGamma (f_\text {p})\).
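The edge operator \(\varGamma \) can be sketched as a Sobel gradient magnitude followed by a binary threshold. The version below is an illustrative NumPy implementation under our own assumptions (replicated borders, a hypothetical `threshold` parameter, and a synthetic step image standing in for \(f_\text {l}\)); it is not the code used for the experiments.

```python
import numpy as np

def sobel_edges(img, threshold):
    """Binary edge map: Sobel gradient magnitude followed by thresholding."""
    p = np.pad(img.astype(float), 1, mode="edge")   # replicate the border pixels
    # Horizontal response: weighted right column minus weighted left column.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical response: weighted bottom row minus weighted top row.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return (np.hypot(gx, gy) > threshold).astype(np.uint8)

f_l = np.zeros((8, 8))
f_l[:, 4:] = 1.0                      # vertical step edge between columns 3 and 4
gamma = sobel_edges(f_l, threshold=1.0)
print(gamma)
```

For this step image the detected edge is two pixels wide (columns 3 and 4), which already hints at why pixels near an edge, and not only exactly on it, are natural candidates for adaptive sampling.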

Step 3: owing to possible inaccuracy of the predicted edges, morphology operations can be used to generate an adaptive sampling pattern around the edges of f:

$$\begin{aligned} \hat{ \varPhi }_\text {a}=M^\text {p}(\varGamma (f_\text {p})) \end{aligned}$$
(3)

where \(\hat{ \varPhi }_\text {a}\) is the adaptive sampling pattern and \(M^\text {p}\) is the binary morphology operator applied to the edges of the predicted image \(f_\text {p}\). The morphology operator involves dilation \(M_\text {d}^\text {p}\) and closing \(M_\text {c}^\text {p}\) (dilation followed by erosion). Additionally, \(M_n^\text{p}\) denotes that no morphology operation is executed. In practice, we need not resample the edge pixels that have already been sampled in the completely random sensing of Step 1. Before the adaptive sampling, we therefore remove from the adaptive pattern the pixels already covered by the random sensing:

$$\begin{aligned} \varPhi _\text{a}= \hat{ \varPhi }_\text {a} \setminus ( \varPhi _\text {r} \cap \hat{ \varPhi }_\text {a}) \end{aligned}$$
(4)

where \( \varPhi _\text {a}\) is the practical adaptive sampling pattern (also referred to simply as the adaptive sampling), \(\setminus \) is the set difference operation, and \( \hat{ \varPhi }_\text {a} \setminus ( \varPhi _\text {r} \cap \hat{ \varPhi }_\text {a}) \) is the complement of \(( \varPhi _\text {r} \cap \hat{ \varPhi }_\text {a})\) in \(\hat{ \varPhi }_\text {a}\). The adaptive measurements are then \(y_\text {a} = \varPhi _\text {a} \circ f\).
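Equations (3) and (4) can be illustrated with a few lines of NumPy. The sketch assumes a 3 × 3 square structuring element for the dilation, which the text does not specify, and the helper names `dilate` and `adaptive_pattern` are ours.

```python
import numpy as np

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element (an assumed choice)."""
    p = np.pad(mask.astype(bool), 1, mode="constant")
    out = np.zeros(mask.shape, dtype=bool)
    h, w = mask.shape
    # OR together the nine shifted copies of the padded mask.
    for di in range(3):
        for dj in range(3):
            out |= p[di:di + h, dj:dj + w]
    return out

def adaptive_pattern(edges, phi_r):
    """Eq. (3)-(4): dilate the predicted edges, then drop pixels already in phi_r."""
    phi_a_hat = dilate(edges)
    return (phi_a_hat & ~phi_r.astype(bool)).astype(np.uint8)

edges = np.zeros((5, 5), dtype=np.uint8)
edges[2, 2] = 1                        # a single predicted edge pixel
phi_r = np.zeros((5, 5), dtype=np.uint8)
phi_r[1, 1] = 1                        # already sampled, inside the dilated edge
phi_r[0, 0] = 1                        # already sampled, outside the dilated edge
phi_a = adaptive_pattern(edges, phi_r)
print(phi_a)
```

In this toy case the dilated edge covers nine pixels and one of them was already sampled in Step 1, so eight adaptive measurements remain.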

Given the role of image edges in computer vision and image processing, we suppose that image pixels located at or near edges are more important than those located in smooth regions. Consequently, incorporating the adaptive sampling pattern into the sensing procedure is highly reasonable.

Step 4: we mix the random and adaptive sampling patterns via a union operation to obtain the new hybrid adaptive-random sampling pattern (a sensing matrix with 0/1 elements):

$$\begin{aligned} \varPhi _\text{m}= \varPhi _\text {r}\cup \varPhi _\text {a} \end{aligned}$$
(5)

where \( \varPhi _\text {r}\) is the random sampling pattern and \( \varPhi _\text {a}\) is the adaptive sampling pattern corresponding to the edge of \(f_\text {l}\). In practice, we directly use the completely random measurements (from Step 1) and the adaptive measurements as our new hybrid adaptive-random measurements:

$$\begin{aligned} y_\text{m} = y_\text {r} \cup y_\text {a} \end{aligned}$$
(6)

where \(y_\text {r}\) is the completely random measurement and \(y_\text {a}\) is the adaptive measurement corresponding to the edge of \(f_\text {p}\).

In other words, to save measurements we reuse (rather than resample) the random measurements of \( \varPhi _\text {r}\) obtained in Step 1. The edges predicted from the image recovered in Step 1 are likewise reused in the second recovery, which refines the recovered image.

To physically acquire the pixels corresponding to the HAS matrix \( \varPhi _\text{m}\), we may use integrated circuits to control the reset transistors (or switches) in a complementary metal-oxide-semiconductor (CMOS) camera. As a result, only a portion of the photodetectors and amplifiers (corresponding to \( \varPhi _\text {r}\) first and then \( \varPhi _\text {a} \)) is turned on. Compared with traditional image acquisition, the HAS saves electrical power and extends the lifetime of image sensors. More importantly, the HAS can be generalized to other data acquisition tasks in which the most important information about an object is adaptively extracted in advance via low-cost, online sampling and recovery.

For convenience, the sensing ratio (or sampling ratio) of the HAS matrix \(\eta _1\) is defined as the number of nonzero elements of \( \varPhi _\text{m}\) over the dimension of \( \varPhi _\text{m}\) (i.e. image size of f). The adaptive sampling ratio \(\eta _2\) is defined as the number of nonzero elements of \( \varPhi _\text {a} \) over that of \( \varPhi _\text{m}\).

$$\begin{aligned} \eta _1=\frac{\sum _{i,j} \varPhi _\text{m}(i,j)}{\text{Dim}( \varPhi _\text{m})},\quad \eta _2= \frac{\sum _{i,j} \varPhi _\text {a}(i,j)}{\sum _{i,j} \varPhi _\text{m}(i,j)} \end{aligned}$$
(7)

The sensing ratio \(\eta _1\) can be kept considerably small, which reduces the measurement cost. In addition, the adaptive sampling ratio \(\eta _2\) must not be too large if the RIP and incoherence condition are to be satisfied.
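The union in (5) and the two ratios in (7) reduce to simple counting on the binary patterns. The sketch below uses hand-made, disjoint patterns on a 10 × 10 grid purely for illustration; `sensing_ratios` is a hypothetical helper name.

```python
import numpy as np

def sensing_ratios(phi_r, phi_a):
    """Return (eta_1, eta_2) of Eq. (7) for the hybrid pattern phi_m = phi_r OR phi_a."""
    phi_m = np.logical_or(phi_r, phi_a)
    eta_1 = phi_m.sum() / phi_m.size                 # sampled pixels over image size
    eta_2 = np.count_nonzero(phi_a) / phi_m.sum()    # adaptive pixels over sampled pixels
    return eta_1, eta_2

phi_r = np.zeros((10, 10), dtype=np.uint8)
phi_r[:2, :] = 1     # 20 pixels from the random stage
phi_a = np.zeros((10, 10), dtype=np.uint8)
phi_a[5, :] = 1      # 10 pixels from the adaptive stage
eta_1, eta_2 = sensing_ratios(phi_r, phi_a)
print(eta_1, eta_2)
```

Here 30 of 100 pixels are sampled, so \(\eta _1 = 0.3\), and 10 of those 30 are adaptive, so \(\eta _2 = 1/3\).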

Recovery algorithm with TV regularizer

After the HAS matrix has been used to directly acquire a compressed image representation, the recovery algorithm plays a key role in reconstructing a high-quality image. In this section, we discuss the TV regularizer (Becker et al. 2011; Bioucas-Dias and Figueiredo 2007; Li et al. 2009; Lustig et al. 2007; Rudin et al. 1992), which is combined with the HAS strategy in our ASR framework.

For reconstructing the high-quality image f from the measurements (compressed image representation) g, the Lagrangian regularization problem should be solved, i.e.

$$\begin{aligned} \mathop {\min }\limits _f \left\{ \int {\left( {g - \varPhi _\text{m} \circ f} \right) }^2 \text {d}x\text {d}y + \alpha \int {\sqrt{{\left( \frac{\text {d}f}{\text {d}x}\right) ^2} + {\left( \frac{\text {d}f}{\text {d}y}\right) ^2}} \, \text {d}x\text {d}y} + \beta \int {\sqrt{{{\left( {Tf} \right) }^2}} \, \text {d}x\text {d}y}\right\} \end{aligned}$$
(8)

where \( \varPhi _\text{m}\) is the HAS operator, \(\alpha \) and \(\beta \) are Lagrangian multipliers. The second term is the TV regularizer; the third term relates to \(\ell _1\)-minimization with a sparsifying transform operator T. According to the variational principle, we have

$$\begin{aligned} \frac{\delta O(f)}{\delta f} &= -2{ \varPhi _\text{m}^{*}} \circ \left( {g - \varPhi _\text{m} \circ f} \right) - \alpha \frac{\text {d}}{{\text {d}x}}\left( {\frac{{\text {d}{f}/\text {d}x}}{{\sqrt{{{(\text {d}{f}/\text {d}x)}^2} + {{(\text {d}{f}/\text {d}y)}^2}} }}} \right) \nonumber \\ & \quad - \alpha \frac{\text {d}}{{\text {d}y}}\left( {\frac{{\text {d}{f}/\text {d}y}}{{\sqrt{{{(\text {d}{f}/\text {d}x)}^2} + {{(\text {d}{f}/\text {d}y)}^2}} }}} \right) +\beta {T^{*}}\left( \frac{{T{f}}}{{\sqrt{{{\left( {T{f}} \right) }^2}} }}\right) \end{aligned}$$
(9)

where O(f) is the objective functional given in (8); \(\varPhi _\text{m}^{*}\) and \(T^{*}\) are the adjoint operators of \(\varPhi _\text{m}\) and T, respectively. In this work, we set \(\beta \) to zero for fast and simple reconstruction. With the help of the nonlinear conjugate gradient method (Hager and Zhang 2006; Rudin et al. 1992) and (9), problem (8) can be solved.
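The following sketch shows how a discrete version of (8) with \(\beta = 0\) can be minimised. It deliberately swaps the nonlinear conjugate gradient method for plain fixed-step gradient descent to stay short, and it adds a small smoothing constant `eps` under the square root so that the TV term is differentiable; `eps`, `alpha`, the step-size rule and the synthetic square image are all assumptions of this sketch, not settings from the paper.

```python
import numpy as np

def grad(f):
    """Forward differences, set to zero at the far boundary."""
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[:, :-1] = f[:, 1:] - f[:, :-1]
    gy[:-1, :] = f[1:, :] - f[:-1, :]
    return gx, gy

def div(px, py):
    """Discrete divergence, the negative adjoint of `grad`."""
    d = np.zeros_like(px)
    d[:, 0] = px[:, 0]
    d[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    d[:, -1] = -px[:, -2]
    d[0, :] += py[0, :]
    d[1:-1, :] += py[1:-1, :] - py[:-2, :]
    d[-1, :] += -py[-2, :]
    return d

def objective(f, g, phi, alpha, eps):
    """Data fidelity plus smoothed TV (the beta term of the functional is dropped)."""
    gx, gy = grad(f)
    return np.sum((g - phi * f) ** 2) + alpha * np.sum(np.sqrt(gx**2 + gy**2 + eps))

def objective_gradient(f, g, phi, alpha, eps):
    """Gradient of `objective` with respect to f."""
    gx, gy = grad(f)
    mag = np.sqrt(gx**2 + gy**2 + eps)
    return -2.0 * phi * (g - phi * f) - alpha * div(gx / mag, gy / mag)

def tv_recover(g, phi, alpha=0.1, eps=1e-2, n_iter=300):
    """Fixed-step gradient descent; the step is 1/L for a Lipschitz bound L."""
    step = 1.0 / (2.0 + 8.0 * alpha / np.sqrt(eps))
    f = g.copy()
    for _ in range(n_iter):
        f = f - step * objective_gradient(f, g, phi, alpha, eps)
    return f

rng = np.random.default_rng(0)
truth = np.zeros((32, 32))
truth[8:24, 8:24] = 1.0                                      # piecewise-constant test image
phi = (rng.uniform(size=truth.shape) < 0.4).astype(float)   # 0/1 sampling pattern
g = phi * truth                                              # element-wise measurements
f_hat = tv_recover(g, phi)
print("objective at start:", objective(g, g, phi, 0.1, 1e-2))
print("objective at end  :", objective(f_hat, g, phi, 0.1, 1e-2))
```

Because the sampling pattern has 0/1 entries, it is its own adjoint under the element-wise product, which is why `phi` appears twice in the data-term gradient. A conjugate gradient solver would typically need far fewer iterations than this fixed-step loop.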

Empirical results and remarks

In this section, the numerical performance of the proposed ASR approach is evaluated. Without loss of generality, we assume \(\text {Dim}(\varPhi _\text{m})=\text {Dim}(f)=256\times 256\). The sensing ratio \(\eta _1\) and adaptive sampling ratio \(\eta _2\) defined in (7) can be tuned by modifying the binary thresholds in Steps 1 and 3 of “Hybrid adaptive-random sensing for ASR” section. For simplicity of notation, \(M_n\), \(M_\text {d}\) and \(M_\text {c}\) correspond to the edges of f with null morphology operation, dilation and closing (Step 3 of “Hybrid adaptive-random sensing for ASR” section). Similarly, \(M_n^\text {p}\), \(M_\text {d}^\text {p}\) and \(M_\text {c}^\text {p}\) correspond to the edges of \(f_\text {p}\) (Step 2 of “Hybrid adaptive-random sensing for ASR” section). Moreover, we use \(\varPhi _\text {r}\) and \(\varPhi _\text{m}\) as abbreviations for the sensing methods using the completely random matrix and the HAS matrix, respectively (Step 4 of “Hybrid adaptive-random sensing for ASR” section). For \( \varPhi _\text{m} + M_n \) and \( \varPhi _\text{m} + M_n^\text {p} \) the adaptive sensing ratio \(\eta _2\) is set to 0.2. For \( \varPhi _\text{m} + M_{\text {d,c}}^\text {p} \) and \( \varPhi _\text{m} + M_{\text {d,c}} \) the parameter \(\eta _2\) is set to 0.45.

We will demonstrate that incorporating edge information in the sensing procedure can markedly improve the recovery performance. First, the HAS performance for different edge extraction methods is investigated. Then, we compare the recovery performance of the HAS with that of completely random sensing and other CS recovery methods. Finally, we discuss the influence of \(\eta _2\) on the recovery performance.

Fig. 2 PSNR (in decibels) results for ASR with different edges and completely random sensing

Fig. 3 The original Lena image (a) and reconstructed versions with a sensing ratio (\(\eta _1\)) of 35% for the sensing methods (b) \( \varPhi _\text{m}+M_\text{c}\); (c) \( \varPhi _\text{m}+M_\text{d}\); (d) \( \varPhi _\text{m}+M_n\); (e) \( \varPhi _\text {r}\); (f) \( \varPhi _\text{m}+M_\text {c}^\text {p}\); (g) \( \varPhi _\text{m}+M_\text {d}^\text {p}\); (h) \( \varPhi _\text{m}+M_n^\text {p}\)

Fig. 4 Recovered Lena, House and Boat images with a sensing ratio of 40%. From left to right in each row: ASR, BCS, MARX, TVAL3 and SRM

For a fair comparison, m measurements are used to recover the image in all methods. In our ASR model, \(m = m_\text {r}+m_\text {a}\) measurements are used to recover the high-quality image. First, the low-quality image \(f_\text {l}\) is generated from the \(m_\text {r}\) measurements acquired at a low sensing ratio. The edges of f and \(f_\text {p}\) can then be extracted by the Sobel method (Step 2 of “Hybrid adaptive-random sensing for ASR” section).

To highlight the importance of edges for the recovery method and to verify the efficacy of ASR in this regard, we adopt two types of edges, \(\varGamma (f)\) and \(\varGamma (f_\text {p})\), to conduct a comparative study between ASR and completely random sensing. A general set of test images commonly found in the literature, e.g., Lena, Boat, Cameraman, Fruits, House and Peppers, was used in our comparative study.

Using these images, Fig. 2 shows the peak signal-to-noise ratio (PSNR) as a function of the sensing ratio \(\eta _1\). We observe that (1) the convergence of all the methods is comparable; (2) the performance of \(\varPhi _\text{m}\) sensing is much better than that of \(\varPhi _\text {r}\); and (3) the best PSNR is achieved by the \(\varPhi _\text{m}\) sensing involving the dilated edge information. This also suggests that the pixels around edges contain very important information about the image. Figure 3 shows the sensing performance for the Lena image. Comparing Fig. 3b–d with Fig. 3f–h, \( \varPhi _\text{m} + M_{\text {d,c}}\) shows better recovery results than \( \varPhi _\text{m} + M_{\text {d,c}}^\text {p}\), while \( \varPhi _\text{m} + M_n^\text {p} \) is comparable to \( \varPhi _\text{m} + M_n\). Nevertheless, the HAS matrix incorporating predicted edges through the \( \varPhi _\text{m} + M_{\text {d,c}}^\text {p}\) operations, especially \( \varPhi _\text{m} + M_\text {d}^\text {p}\), still achieves higher PSNR values (such as 30.14 dB at the sensing ratio \(\eta _1 =35\%\)) than the completely random sensing matrix (27.22 dB at the same \(\eta _1=35\%\)).
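For reference, PSNR can be computed as in the sketch below. It assumes 8-bit gray images with a peak value of 255, which the text does not state explicitly, and the arrays are toy inputs.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a reconstruction."""
    mse = np.mean((reference.astype(float) - estimate.astype(float)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

reference = np.full((4, 4), 100.0)
estimate = reference + 1.0           # every pixel off by one gray level
print(psnr(reference, estimate))
```

A uniform error of one gray level gives a mean squared error of 1 and therefore a PSNR of \(10\log _{10}(255^2)\), roughly 48.13 dB, which is a convenient sanity check for any implementation.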

Fig. 5 PSNR performance versus sampling rate of our method ASR (\( \varPhi _\text{m}+M_\text {d}^\text {p}\)) and other CS recovery methods

Fig. 6 Evaluating the optimum \(\eta _2\) for \( \varPhi _\text{m}+M_{\text {c,d}}^\text {p}\). The sensing ratio is \(\eta _1=40\%\) for both \( \varPhi _\text {r}\) and \( \varPhi _\text{m}+M_{\text {c,d}}^\text {p}\). Left column: PSNR versus \(\eta _2\) for the test images; right column: SSIM versus \(\eta _2\) for the test images

We also compare our method ASR (\( \varPhi _\text{m}+M_\text {d}^\text {p}\)) with state-of-the-art methods: MARX by Wu et al. (2012), TVAL3 by Li et al. (2009), SRM (WPFFT) by Do et al. (2012) and BCS (SPL-DDWT) by Mun and Fowler (2009). Figure 5 shows the PSNR performance of these CS recovery methods and ours. Although MARX performs best among all methods, ASR is clearly better than TVAL3, SRM and BCS. However, MARX has a very long run time: around 10 h for a \(512 \times 512\) image tested on MATLAB 7.1, which makes it unsuitable for large-scale signal processing. For an image of the same size and the same sensing rate, our method needs only ten seconds. In Table 1, we list the CPU time of these methods for \(256 \times 256\) gray test images with the sensing rate \(\eta _1=35\%\). All the methods are tested on a PC with a 3.30 GHz Intel i3 CPU and 8 GB of RAM. We can see that all methods except MARX take from several to about fifteen seconds, while MARX takes about 80 min under the same conditions. Figure 4 shows the visual quality of these methods for some test images with the sensing rate \(\eta _1 =40\%\). We can see that our method is comparable to or better than the other methods in both PSNR value and visual quality.

Table 1 The computing times (seconds) for \(256 \times 256\) images with sensing rate \(35\%\)

In Figs. 2, 3, 4 and 5 and Table 1, we use the same parameter \(\eta _2\) for all images, which is not the optimal value for each image. Better performance can be obtained by using the optimal adaptive sensing ratio \(\eta _2\). However, too much edge information in the HAS matrix will destroy the RIP and incoherence condition of the CS framework. This situation arises at extremely high adaptive sampling ratios \(\eta _2\), where the HAS breaks down. We evaluate the optimum value of \(\eta _2\) for the test images with \(\eta _1=40\%\), as illustrated in Fig. 6. Using morphology operations, better reconstruction results are achieved by the HAS even if the number of adaptive samples is comparable to that of random samples (with \(\eta _1 = 40\%\), the optimum \(\eta _2^\text {{opt}}\approx 45\% \pm 5\% \) for \( \varPhi _\text{m}+M_{\text {d,c}}^\text {p}\)). However, with \(\eta _1 = 40\%\), the optimum is \(\eta _2^\text {{opt}}\approx 15\% \pm 5\% \) for \( \varPhi _\text{m}+M_n^\text {p}\) in our numerous experiments. This demonstrates that the image pixels located “around” the edges contain more important information than the pixels located “exactly at” the edges without the dilation operation. Therefore, for image sensing, selecting pixels around the edges together with some randomly picked pixels achieves much better recovery performance.

Conclusions

A novel ASR protocol is proposed to acquire a compressed image representation in the spatial domain. By incorporating adaptive edge information that can be extracted from a low-ratio random sampling, the ASR measurements yield much better reconstruction results than completely random measurements and some state-of-the-art CS methods. The RIP and incoherence condition of the ASR sensing matrix can be satisfied by balancing the number of adaptive samples against the number of completely random samples. The hybrid sensing concept opens up a promising and unexplored way for low-cost data acquisition.