1 Introduction

Compressive sensing (CS) [1] theory provides a new way to sample and compress data. The basic idea of CS is that a high-dimensional signal is projected onto a measurement matrix to obtain a low-dimensional sensed sequence. Moreover, [2,3,4] prove that if the original signal contains only a small number of non-zero elements, it can be recovered from the sensed sequence. CS applications confirm that random measurement matrices are suitable for compressive sensing. However, these applications require considerable storage space to realize random matrices [3]. As a result, much work has been done to reduce the storage space and improve performance.

In [5, 6], an intermediate structure for the measurement matrix, called block compressed sensing (block CS), is proposed based on random sampling. With block CS, data sampling is conducted block by block through the same measurement matrix, which overcomes a difficulty of traditional CS, namely that random measurement ensembles are numerically unwieldy.

In [7, 8], Do et al. introduced a fast and efficient way to construct a measurement matrix, called the structurally random matrix (SRM), which improves the structure of an initial random measurement matrix using optimization techniques. Owing to its low storage requirements, the SRM is suited to large-scale, real-time CS applications.

To reduce the storage space required for CS, many deterministic measurement matrices have been designed [9,10,11,12,13,14]. These matrices satisfy the restricted isometry property (RIP) and successfully recover sparse signals.

In [15, 16], low-dimensional orthogonal basis vectors or matrices were used to construct high-dimensional measurement matrices via the Kronecker product. The proposed algorithm effectively reduces the storage space of the measurement matrix.

Low-rank and rank-one matrices are attractive because they need less storage space than general measurement matrices [17,18,19,20]. Indeed, if the measurement matrix is sparse, it takes less storage space and incurs a lower computational cost. Low-rank or rank-one matrices have been designed to sample and reconstruct the original signal, yielding high-quality reconstructions.

All of the above findings are directed at reducing the storage space of the measurement matrix for CS. However, with block CS, block artifacts occur in the reconstructed images, owing to the block-by-block sampling and the neglect of global sparsity. The SRM method, in turn, is more complicated and difficult to implement. Deterministic measurement matrices require little storage space and incur a lower computational cost, but their reconstruction accuracy is not as high as that of a random measurement matrix. To reconstruct the original signals, the Kronecker algorithm must still generate an M × N measurement matrix, which requires large-scale memory.

For the same purpose, we propose a random sampling scheme for CS. The aim is an algorithm that maintains the reconstruction performance of conventional compressive sensing while requiring less storage for the measurement matrix and less memory for reconstruction.

The proposed algorithm is based on the semi-tensor product (STP) [21, 22], a novel matrix product that extends the conventional matrix product to cases of unequal dimensions. Our algorithm generates a random matrix whose dimensions are smaller than M × N, where M is the length of the sampling vector and N is the length of the signal we want to reconstruct. We then use the iteratively re-weighted least squares (IRLS) algorithm to estimate the sparse coefficients. Experiments were carried out on sparse column signals and images, demonstrating that the proposed approach outperforms other algorithms in terms of storage space while maintaining a suitable peak signal-to-noise ratio (PSNR). The experimental results show that if the dimensions of the measurement matrix are reduced appropriately, there is almost no decline in the PSNR of the reconstruction, yet the storage space required by the measurement matrix can be reduced to a quarter (or even a 16th) of its original size.

The remainder of this paper is organized as follows: Section 2 introduces the preliminaries of the STP and the conventional CS algorithm. Section 3 describes the proposed STP approach to CS (STP-CS). Section 4 presents the experimental results and a discussion. Finally, Section 5 concludes the paper and discusses our plans for future research.

2 Related works

In this section, the conventional CS algorithm and some necessary preliminaries on the STP are briefly introduced. The STP of matrices was introduced by Cheng [21, 22].

2.1 Semi-tensor product

In [21, 22], the STP is presented as an extension of the conventional matrix product. For the conventional matrix product, matrices A and B can be multiplied only if Col(A) = Row(B). The STP, on the other hand, extends the conventional matrix product to cases of unequal dimensions. In [22], the STP is defined as follows:

Definition 1: Let A ∈ ℝ m × n and B ∈ ℝ p × q. If n is a factor of p, i.e., nt = p (denoted A ≺_t B), or p is a factor of n, i.e., n = pt (denoted A ≻_t B), then the (left) STP of A and B, denoted C = {C ij} = A ⋉ B, is defined as follows: C consists of m × q blocks, and each block is given by

$$ {C}^{ij}={A}^i\ltimes {B}_j,\quad i=1,\cdots,m,\quad j=1,\cdots,q, $$

where A i is the ith row of A, and B j is the jth column of B.

If we assume that A ≻_t B (or A ≺_t B), then A and B are split into blockwise forms as follows:

$$ A=\left[\begin{array}{ccc}{A}^{11}& \cdots & {A}^{1s}\\ \vdots & & \vdots \\ {A}^{r1}& \cdots & {A}^{rs}\end{array}\right],\quad B=\left[\begin{array}{ccc}{B}^{11}& \cdots & {B}^{1t}\\ \vdots & & \vdots \\ {B}^{s1}& \cdots & {B}^{st}\end{array}\right]. $$

If A ik ≻_t B kj for all i, j, k (correspondingly, A ik ≺_t B kj for all i, j, k), then

$$ A\ltimes B=\left[\begin{array}{ccc}{C}^{11}& \cdots & {C}^{1t}\\ \vdots & & \vdots \\ {C}^{r1}& \cdots & {C}^{rt}\end{array}\right], $$
(1)

where \( {C}^{ij}={\sum}_{k=1}^s{A}^{ik}\ltimes {B}^{kj} \), and the dimensions of A ⋉ B can be determined by deleting the largest common factor of the dimensions of the two factor matrices. If A ≻_t B, the dimensions of A ⋉ B are m × tq; likewise, if A ≺_t B, the dimensions of A ⋉ B are mt × q. To help the reader clearly understand the STP, we give the following numerical example:

$$ \mathrm{Let}\quad X=\left[\begin{array}{cccc}1 & 2 & -1 & 2\\ 0 & 1 & 2 & 3\\ 3 & 3 & 1 & 1\end{array}\right],\quad Y=\left[\begin{array}{cc}1 & 2\\ -1 & 3\end{array}\right].\quad \mathrm{Then} $$

$$ X\ltimes Y=\left[\begin{array}{cc}\left(1\;\;2\right)-\left(-1\;\;2\right) & 2\left(1\;\;2\right)+3\left(-1\;\;2\right)\\ \left(0\;\;1\right)-\left(2\;\;3\right) & 2\left(0\;\;1\right)+3\left(2\;\;3\right)\\ \left(3\;\;3\right)-\left(1\;\;1\right) & 2\left(3\;\;3\right)+3\left(1\;\;1\right)\end{array}\right]=\left[\begin{array}{cccc}2 & 0 & -1 & 10\\ -2 & -2 & 6 & 11\\ 2 & 2 & 9 & 9\end{array}\right]. $$

Comparing the conventional matrix product with the STP in this example, the differences between the two products are easy to see.
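For readers who wish to experiment with the STP numerically, it can be computed through the well-known Kronecker-product identities A ⋉ B = A(B ⊗ I_t) when n = pt, and A ⋉ B = (A ⊗ I_t)B when nt = p. The following minimal Python/NumPy sketch (the function name is ours, not from the paper) reproduces the numerical example above:

```python
import numpy as np

def stp(A, B):
    """Left semi-tensor product A ⋉ B (a minimal sketch).

    Uses the standard Kronecker-product identities:
      if cols(A) = t * rows(B):  A ⋉ B = A @ (B ⊗ I_t)
      if cols(A) * t = rows(B):  A ⋉ B = (A ⊗ I_t) @ B
    """
    n, p = A.shape[1], B.shape[0]
    if n % p == 0:                       # n = p * t
        t = n // p
        return A @ np.kron(B, np.eye(t))
    if p % n == 0:                       # n * t = p
        t = p // n
        return np.kron(A, np.eye(t)) @ B
    raise ValueError("dimensions must satisfy the STP factor condition")

# The numerical example from the text: X is 3 x 4, Y is 2 x 2, so t = 2.
X = np.array([[1, 2, -1, 2],
              [0, 1,  2, 3],
              [3, 3,  1, 1]], dtype=float)
Y = np.array([[1, 2],
              [-1, 3]], dtype=float)
print(stp(X, Y))
# [[ 2.  0. -1. 10.]
#  [-2. -2.  6. 11.]
#  [ 2.  2.  9.  9.]]
```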

If A ∈ ℝ m × n, B ∈ ℝ p × q, and n = p, then

$$ A\ltimes B= AB. $$

Consequently, when the conventional matrix product is extended to the STP, almost all of its properties are maintained. Two properties of the STP are given below.

Proposition 1: Suppose A = (a ij) ∈ ℝ p × qr, B = (b ij) ∈ ℝ r × s, and C = (c ij) ∈ ℝ qst × l; then

$$ {A}_{p\times qr}\ltimes {B}_{r\times s}\ltimes {C}_{qst\times l}={\left(A\ltimes B\right)}_{p\times qs}\ltimes {C}_{qst\times l}={\left(A\ltimes B\ltimes C\right)}_{pt\times l}. $$
(2)

Referring to Proposition 1, the dimensions of the STP of two matrices can be determined by removing the largest common factor of the dimensions of the two factor matrices. As shown in (2), in the first product r is deleted, and in the second qs is deleted.

Proposition 2: Suppose A = (a ij) ∈ ℝ m × tp, B = (b ij) ∈ ℝ p × p, and C = (c ij) ∈ ℝ p × p. The STP satisfies the associative law:

$$ \left(A\ltimes B\right)\ltimes C=A\ltimes \left(B\ltimes C\right). $$
(3)

In recent years, the STP has been exploited in a wide range of applications: structural analysis and control of Boolean networks in nonlinear system control [23, 24], the solution of Morgan's problem in biological systems [25], nonlinear feedback shift registers in linear systems [26, 27], etc. However, to the best of our knowledge, no applications of the STP in the field of CS have been publicly reported.

2.2 Conventional CS algorithm

The conventional CS algorithm can be described simply as follows. Assume x ∈ ℝ N × 1 can be represented sparsely in a known transform domain (e.g., Fourier or wavelet). Although x may already be sparse in its current domain (e.g., time or pixel), we assume throughout that x is sparse in a known transform domain. The sparsifying transform is denoted Ψ ∈ ℝ N × N. With the above notation, the signal x can be written as

$$ {x}_{N\times 1}={\varPsi}_{N\times N}{\theta}_{N\times 1}, $$
(4)

where ΨΨ T = Ψ T Ψ = I, I is the identity matrix, and θ ∈ ℝ N × 1 is a column vector of sparse coefficients with merely k ≪ N non-zero elements.

The vector θ is called exactly sparse because it has k non-zero elements and the rest are equal to zero. However, there may be cases where the coefficient vector θ includes only a few large components together with many small ones. In this case, x is called a compressible signal, and sparse approximation methods can be applied.

Then, the data acquisition process of the conventional CS algorithm is defined as follows:

$$ {y}_{M\times 1}={\varPhi}_{M\times N}{\varPsi}_{N\times N}{\theta}_{N\times 1}, $$
(5)

where Φ M × N ∈ ℝ M × N (M < N) is the measurement matrix and y M × 1 ∈ ℝ M × 1 is the measurement vector.

As shown in (5), the high-dimensional signal x is projected onto the matrix Φ M×N, and a low-dimensional sensed sequence y M×1 is obtained. Assume a signal is sampled using the above scheme and the measurements y are then transmitted. The crucial task (for the receiver) is to reconstruct the original samples x from the measurements y M×1 and the measurement matrix Φ M×N. The recovery problem is ill-posed, since M < N. However, several methods have been proposed to tackle this problem, such as the iteratively re-weighted least squares (IRLS) algorithm.

When the original signal is sampled and reconstructed, the measurement matrix plays a vital role in achieving a faithful reconstruction at reasonable complexity [4]. However, realizing the measurement matrices requires a large amount of storage space in CS applications. This is especially true of random measurement matrices, because they are computationally expensive and require considerable memory [3]. Therefore, reducing the storage space of the measurement matrix is essential to practical CS applications, especially in terms of the feasibility of embedded hardware implementations.

3 Proposed algorithm

To effectively reduce the storage space of a random measurement matrix, we propose an STP approach to the CS algorithm (STP-CS), which reduces the storage space to at least a quarter of its original size while maintaining the quality of the reconstructed signals or images.

Then, the STP-CS algorithm is defined as follows:

$$ {y}_{M\times 1}={\varPhi}_{M/t\times N/t}\ltimes {\varPsi}_{N\times N}\ltimes {\theta}_{N\times 1}, $$
(6)

where the dimensions of the measurement matrix Φ are (M/t) × (N/t) with t < M (M, N, t, M/t, and N/t are positive integers). For convenience, we denote the dimension-reduced matrix by Φ(t). Here, Ψ and θ are defined as in conventional CS.

We assume x N × 1 is a k-sparse signal of length N. Then, x can be sampled with Φ(t) as follows:

$$ {y}_{M\times 1}=\varPhi (t)\ltimes {x}_{N\times 1}, $$
(7)

Equation (7) can be expanded by multiplication, such that the following holds:

$$ \left(\begin{array}{c}{y}_1\\ {y}_2\\ \vdots \\ {y}_M\end{array}\right)=\left(\begin{array}{ccccc}{\varphi}_{1,1}& \cdots & {\varphi}_{1,j}& \cdots & {\varphi}_{1,N/t}\\ \vdots & & \vdots & & \vdots \\ {\varphi}_{i,1}& \cdots & {\varphi}_{i,j}& \cdots & {\varphi}_{i,N/t}\\ \vdots & & \vdots & & \vdots \\ {\varphi}_{M/t,1}& \cdots & {\varphi}_{M/t,j}& \cdots & {\varphi}_{M/t,N/t}\end{array}\right)\ltimes \left(\begin{array}{c}{x}_1\\ {x}_2\\ \vdots \\ {x}_N\end{array}\right), $$
(8)

where φ i ,  j  ∈ Φ(t) (i = 1,  2,  ⋯,  M/t,  j = 1,  2,  ⋯,  N/t).

According to the definitions of STP shown above, Eq. (8) can be expressed as:

$$ \left(\begin{array}{c}{y}_1\\ {y}_2\\ \vdots \\ {y}_M\end{array}\right)=\left(\begin{array}{c}{\eta}^{1,1}+\cdots +{\eta}^{1,j}+\cdots +{\eta}^{1,N/t}\\ \vdots \\ {\eta}^{i,1}+\cdots +{\eta}^{i,j}+\cdots +{\eta}^{i,N/t}\\ \vdots \\ {\eta}^{M/t,1}+\cdots +{\eta}^{M/t,j}+\cdots +{\eta}^{M/t,N/t}\end{array}\right), $$
(9)

where η i,j = φ i,j (x (j−1)t+1 ⋯ x jt)T, i = 1, 2, ⋯, M/t, j = 1, 2, ⋯, N/t. Each η i,j is a column vector of length t, and \( {\sum}_{j=1}^{N/t}{\eta}^{i,j} \) is also a column vector of length t.

If we assume t = 2, N = 10, and M = 6, where x 10×1 is the sparse signal, y 6×1 is the measurement vector, and Φ 3×5 is the random measurement matrix, then the acquisition process of STP-CS can be described as follows:

$$ \left(\begin{array}{c}{y}_1\\ {y}_2\\ \vdots \\ {y}_5\\ {y}_6\end{array}\right)=\left(\begin{array}{ccccc}{\varphi}_{1,1}& {\varphi}_{1,2}& {\varphi}_{1,3}& {\varphi}_{1,4}& {\varphi}_{1,5}\\ {\varphi}_{2,1}& {\varphi}_{2,2}& {\varphi}_{2,3}& {\varphi}_{2,4}& {\varphi}_{2,5}\\ {\varphi}_{3,1}& {\varphi}_{3,2}& {\varphi}_{3,3}& {\varphi}_{3,4}& {\varphi}_{3,5}\end{array}\right)\ltimes \left(\begin{array}{c}{x}_1\\ {x}_2\\ \vdots \\ {x}_9\\ {x}_{10}\end{array}\right) $$

$$ =\left(\begin{array}{c}{\varphi}_{1,1}\left(\begin{array}{c}{x}_1\\ {x}_2\end{array}\right)+{\varphi}_{1,2}\left(\begin{array}{c}{x}_3\\ {x}_4\end{array}\right)+{\varphi}_{1,3}\left(\begin{array}{c}{x}_5\\ {x}_6\end{array}\right)+{\varphi}_{1,4}\left(\begin{array}{c}{x}_7\\ {x}_8\end{array}\right)+{\varphi}_{1,5}\left(\begin{array}{c}{x}_9\\ {x}_{10}\end{array}\right)\\ {\varphi}_{2,1}\left(\begin{array}{c}{x}_1\\ {x}_2\end{array}\right)+{\varphi}_{2,2}\left(\begin{array}{c}{x}_3\\ {x}_4\end{array}\right)+{\varphi}_{2,3}\left(\begin{array}{c}{x}_5\\ {x}_6\end{array}\right)+{\varphi}_{2,4}\left(\begin{array}{c}{x}_7\\ {x}_8\end{array}\right)+{\varphi}_{2,5}\left(\begin{array}{c}{x}_9\\ {x}_{10}\end{array}\right)\\ {\varphi}_{3,1}\left(\begin{array}{c}{x}_1\\ {x}_2\end{array}\right)+{\varphi}_{3,2}\left(\begin{array}{c}{x}_3\\ {x}_4\end{array}\right)+{\varphi}_{3,3}\left(\begin{array}{c}{x}_5\\ {x}_6\end{array}\right)+{\varphi}_{3,4}\left(\begin{array}{c}{x}_7\\ {x}_8\end{array}\right)+{\varphi}_{3,5}\left(\begin{array}{c}{x}_9\\ {x}_{10}\end{array}\right)\end{array}\right) $$

$$ =\left(\begin{array}{c}{\varphi}_{1,1}{x}_1+{\varphi}_{1,2}{x}_3+{\varphi}_{1,3}{x}_5+{\varphi}_{1,4}{x}_7+{\varphi}_{1,5}{x}_9\\ {\varphi}_{1,1}{x}_2+{\varphi}_{1,2}{x}_4+{\varphi}_{1,3}{x}_6+{\varphi}_{1,4}{x}_8+{\varphi}_{1,5}{x}_{10}\\ {\varphi}_{2,1}{x}_1+{\varphi}_{2,2}{x}_3+{\varphi}_{2,3}{x}_5+{\varphi}_{2,4}{x}_7+{\varphi}_{2,5}{x}_9\\ {\varphi}_{2,1}{x}_2+{\varphi}_{2,2}{x}_4+{\varphi}_{2,3}{x}_6+{\varphi}_{2,4}{x}_8+{\varphi}_{2,5}{x}_{10}\\ {\varphi}_{3,1}{x}_1+{\varphi}_{3,2}{x}_3+{\varphi}_{3,3}{x}_5+{\varphi}_{3,4}{x}_7+{\varphi}_{3,5}{x}_9\\ {\varphi}_{3,1}{x}_2+{\varphi}_{3,2}{x}_4+{\varphi}_{3,3}{x}_6+{\varphi}_{3,4}{x}_8+{\varphi}_{3,5}{x}_{10}\end{array}\right). $$
(10)

From this example, we can see that the sparse signal x 10×1 is projected onto Φ 3×5, by which a low-dimensional sensed sequence y 6×1 is obtained. The measurements are simply linear combinations of the elements of x 10×1.
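As a concrete illustration of Eqs. (8)-(10), the acquisition step can be carried out without ever forming the full M × N matrix Φ(t) ⊗ I_t: reshaping x into groups of t consecutive entries and multiplying by Φ(t)T yields exactly the grouped measurements of Eq. (9). The following Python/NumPy sketch (the function name and the random test data are ours) illustrates this for the toy setting t = 2, N = 10, M = 6:

```python
import numpy as np

def stp_cs_sample(phi_t, x, t):
    """Acquire y = Φ(t) ⋉ x without forming Φ(t) ⊗ I_t explicitly.

    Reshaping x into a t x (N/t) matrix (consecutive entries per column)
    and right-multiplying by Φ(t)^T reproduces Eq. (9)/(10): each column
    of the result is one group of t adjacent measurements.
    """
    n = x.size
    X = x.reshape(n // t, t).T             # column j holds x_{(j-1)t+1 .. jt}
    Y = X @ phi_t.T                        # t x (M/t), one column per group
    return Y.T.reshape(-1)                 # stack the groups into y (length M)

# Toy setting from Eq. (10): t = 2, N = 10, M = 6, Φ(2) is 3 x 5.
rng = np.random.default_rng(0)
phi = rng.standard_normal((3, 5))
x = rng.standard_normal(10)
y = stp_cs_sample(phi, x, t=2)
# Same result via the explicit Kronecker form (Φ(t) ⊗ I_t) x:
assert np.allclose(y, np.kron(phi, np.eye(2)) @ x)
```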

If we assume t = 1, the measurement y 6×1 is obtained by Φ 6×10, which is the same as in conventional CS. Thus, Eq. (7) reduces to y M×1 = Φ M×N ⋅ x N×1.

It should be pointed out that, for the same sparse signal, the measurements obtained from different measurement matrices are different; that is, the y 6×1 obtained with Φ 3×5 differs from the y 6×1 obtained with Φ 6×10.

Let the measurement matrix Φ(t) be a random matrix of full row rank, i.e., Rank(Φ(t)) = M/t (t > 1), such as a Gaussian matrix with i.i.d. N(0, 1/(M/t)) entries. Given a sparsifying basis Ψ, as shown in Eq. (6), we obtain Θ t as follows:

$$ {\varTheta}^t={\varPhi}_{M/t\times N/t}\ltimes {\varPsi}_{N\times N}, $$
(11)

where Θ t is an M × N matrix.

The mutual coherence [3] of Θ t can be obtained as follows:

$$ \mu \left({\varTheta}^t\right)\triangleq \max_{\substack{i\ne j\\ 1\le i,j\le N}}\frac{\left|{\theta}_i^T{\theta}_j\right|}{{\left\Vert {\theta}_i\right\Vert}_2{\left\Vert {\theta}_j\right\Vert}_2}, $$
(12)

where θ i is the column vector of Θ t.

If \( \mu \left({\varTheta}^t\right)\in \left[1,\kern0.5em \sqrt{N}\right] \), then Φ(t) is incoherent with the basis Ψ with high probability, such that \( {\varTheta}_{M\times N}^t \) satisfies the RIP with high probability [1,2,3]. This guarantees that a k-sparse or compressible signal can be fully represented by the M measurements obtained with the dimension-reduced measurement matrix Φ(t). The approach does not change the linear nature of the CS acquisition process; it merely uses a smaller measurement matrix to obtain the measurements. It is clear that the STP approach is consistent with conventional CS for t > 1.
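Eq. (12) is straightforward to evaluate numerically. The sketch below (our own helper, which follows Eq. (12) as written, without the additional √N scaling that some other coherence definitions include) can be used to screen a generated Φ(t) before sampling:

```python
import numpy as np

def mutual_coherence(theta):
    """Mutual coherence of Eq. (12): the largest absolute normalized
    inner product between two distinct columns of Theta^t."""
    cols = theta / np.linalg.norm(theta, axis=0, keepdims=True)
    gram = np.abs(cols.T @ cols)
    np.fill_diagonal(gram, 0.0)      # exclude the i = j terms
    return gram.max()
```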

With Propositions 1 and 2, we can verify that the STP approach in Eq. (6) is compatible with the conventional CS in Eq. (5):

$$ \begin{array}{l}{\varPhi}_{M/t\times N/t}\ltimes \left({\varPsi}_{N\times N}\ltimes {\theta}_{N\times 1}\right)={\varPhi}_{M/t\times N/t}\ltimes {\left(\varPsi \ltimes \theta \right)}_{N\times 1}={\left(\varPhi \ltimes \varPsi \ltimes \theta \right)}_{M\times 1}={y}_{M\times 1},\\ {}\mathrm{or}\\ {}\left({\varPhi}_{M/t\times N/t}\ltimes {\varPsi}_{N\times N}\right)\ltimes {\theta}_{N\times 1}={\left(\varPhi \ltimes \varPsi \right)}_{M\times N}\ltimes {\theta}_{N\times 1}={\left(\varPhi \ltimes \varPsi \ltimes \theta \right)}_{M\times 1}={y}_{M\times 1}.\end{array} $$

When t = 2, the dimensions of Φ(t) are (M/2) × (N/2); when t = 4, they are (M/4) × (N/4); and so on. Thus, the storage space of Φ(t) is reduced quadratically in t. For example, consider a 1024 × 1024 image sampled column by column at a 50% sampling rate with double-precision floating-point data, giving 512 K measurements in total. The corresponding Gaussian random matrix requires 4096 KB when t = 1, 1024 KB when t = 2, and only 256 KB when t = 4.
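The storage figures above follow directly from counting the entries of Φ(t). A short sketch of the bookkeeping (assuming column-wise sampling with N = 1024, M = 512, and 8-byte double-precision entries, as in the example) is:

```python
def phi_storage_kb(M, N, t, bytes_per_entry=8):
    """Storage of Φ(t), which has (M/t) x (N/t) entries."""
    return (M // t) * (N // t) * bytes_per_entry / 1024

for t in (1, 2, 4):
    print(t, phi_storage_kb(512, 1024, t), "KB")
# t = 1 -> 4096 KB, t = 2 -> 1024 KB, t = 4 -> 256 KB
```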

Thus far, we have shown that the STP approach can be an effective way to reduce the storage space of the measurement matrix. A key question remains, however: how can the original signal be reconstructed from the STP-based acquisition process?

To reconstruct the original signal, we adopted the IRLS algorithm [28,29,30,31,32,33,34,35]. In [29], it was shown empirically that ℓq-minimization with 0 < q < 1 can succeed with fewer measurements than ℓ1-minimization. For a noisy k-sparse vector, ℓq-minimization with 0 < q < 1 is also more stable than ℓ1-minimization [31, 32]. In [33,34,35], an approximate ℓ0-norm minimization algorithm was proposed; it shows attractive convergence properties and is capable of very fast signal recovery, thereby reducing retrieval latency when handling high-dimensional signals.

According to the algorithm of IRLS, the solution to the original sparse signal x N × 1 can be obtained as follows:

$$ {x}_{N\times 1}^{\left(n+1\right)}={D}_n\ltimes {\varPhi}^{\mathrm{T}}(t)\ltimes {\left(\varPhi (t)\ltimes {D}_n\ltimes {\varPhi}^{\mathrm{T}}(t)\right)}^{-1}\ltimes {y}_{M\times 1}, $$
(13)

where \( {x}_{N\times 1}^{\left(n+1\right)} \) denotes the (n + 1)th iteration, D n is the N × N diagonal matrix, and the ith diagonal element is 1/w i (n) (i = 1,  2,  ⋯ ,  N).

For ℓq-norm (0 < q < 1) minimization, the weight w i (n) is defined as

$$ {w}_i^{(n)}={\left({\left({x}_i^{(n)}\right)}^2+{\varepsilon}_n^{1+q}\right)}^{\left(2-q\right)/q}, $$
(14)

and for approximate ℓ0-norm minimization, the weight is defined as

$$ {w}_i^{(n)}= \exp \left(-{\left({x}_i^{(n)}\right)}^2/2{\varepsilon}_n^2\right), $$
(15)

where ε n is a positive real number. During the iterations, it decreases as ε n+1 = ρ ε n (0 < ρ < 1).

When we derive a vector of measurements y M×1 with a random measurement matrix Φ(t), we initialize the algorithm with w (0) = (1, ⋯, 1) 1×N, x (0) = (1, ⋯, 1) 1×N, and ε 0 = 1. The k-sparse signal x is then reconstructed by iterating.
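A compact sketch of this recovery loop, using the approximate ℓ0 weights of Eq. (15), is given below. For readability it forms the effective M × N matrix Φ(t) ⊗ I_t explicitly, whereas Eq. (13) evaluates the same products via the STP; the clipping of the diagonal entries is our own numerical safeguard and is not part of the formulation above.

```python
import numpy as np

def irls_stp_cs(phi_t, y, t, rho=0.8, eps0=1.0, tol=1e-8, max_iter=200):
    """IRLS recovery sketch for STP-CS with approximate l0 weights (Eq. (15))."""
    A = np.kron(phi_t, np.eye(t))                  # effective M x N sensing matrix
    N = A.shape[1]
    x = np.ones(N)                                 # x^(0) = (1, ..., 1)
    eps = eps0                                     # eps_0 = 1
    for _ in range(max_iter):
        w = np.exp(-x**2 / (2.0 * eps**2))         # Eq. (15)
        d = 1.0 / np.maximum(w, 1e-12)             # D_n diagonal = 1/w_i (clipped)
        D = np.diag(d)
        x = D @ A.T @ np.linalg.solve(A @ D @ A.T, y)   # Eq. (13)
        eps *= rho                                 # eps_{n+1} = rho * eps_n
        if eps < tol:                              # stop once eps_n < 10^-8
            break
    return x

# Toy usage (our own test data): N = 256, M = 128, t = 2, k = 20.
rng = np.random.default_rng(1)
N, M, t, k = 256, 128, 2, 20
phi = rng.standard_normal((M // t, N // t)) / np.sqrt(M / t)
x_true = np.zeros(N)
x_true[rng.choice(N, size=k, replace=False)] = rng.standard_normal(k)
y = np.kron(phi, np.eye(t)) @ x_true
x_hat = irls_stp_cs(phi, y, t)
print(np.linalg.norm(x_true - x_hat) / np.linalg.norm(x_true))  # relative error
```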

Accordingly, in Section 4, we reconstruct the original sparse signal experimentally with ℓq-norm (0 < q < 1) minimization and approximate ℓ0-norm minimization, respectively.

4 Experimental results and discussion

In this section, we verify the performance of the proposed STP-CS. Our intent is to determine the tradeoffs between recovery performance and the reduction ratio of the measurement matrix. We also compared the performance of STP-CS with that of CS using ℓq-minimization (0 < q < 1) and approximate ℓ0-minimization. We begin the numerical experiments with N × 1 sparse column vectors and N × N gray-scale images. In our experiments, the dimensions of the measurement matrix Φ(t) are (M/t) × (N/t), where t can be 1, 2, 4, or larger, and Φ(t) has Gaussian i.i.d. N(0, 1/(M/t)) entries, which satisfy the RIP with high probability [2, 32]. In addition, as shown in Section 3, when t = 1 the dimensions of Φ(t) are not reduced, so this case can be treated as conventional CS, whereas for t = 2, 4, or higher, the dimensions of Φ(t) are reduced. We therefore performed the comparisons for different values of t. Our experiments and comparisons were implemented in Matlab R2010b on an Intel i7-4600 laptop with 8 GB of memory, running Windows 8.

4.1 Comparison with one-dimensional sparse signal vectors

To compare the performance with the matrices Φ(t), we measured the rate of convergence and the probability of an exact reconstruction for different sparsity values k and for different numbers of measurements.

First, we considered one-dimensional sparse vectors x N×1, where N is the length of the sparse vector.

With N = 256, M = 128, and k = 40, a 40-sparse vector is generated with random positions for the non-zeros. According to [1], to ensure the uniqueness of a sparse solution, the number of non-zero elements can be at most M/2. Therefore, we let k range up to this maximum (k = 1, 2, ⋯, M/2).

Given N = 256 and the measurement number M, we generated Gaussian random measurement matrices Φ(t) with t = 1, 2, and 4. The dimensions of the matrices Φ(t) are given in Table 1.

Table 1 Different dimensional measurement matrices Φ(t) for STP-CS (N = 256)

As shown in Table 1, when t = 1 the matrix Φ(1) is M × N, whereas when t = 2 or 4, the dimensions of Φ(2) and Φ(4) are (M/2) × (N/2) and (M/4) × (N/4), respectively. Hence, the storage space needed for the matrix Φ is reduced effectively. Meanwhile, there is also a significant reduction in the memory required for reconstruction.

With the sparse vector x and the Gaussian random measurement matrices Φ(t), we obtained measurement vectors of length M = 128 by (7). We then initialized the reconstruction process with ε 0 = 1; after iterating, once ε n < 10−8, the recovery process is considered complete and a solution \( \widehat{x} \) for the sparse vector is returned. We then calculate the relative error between \( \widehat{x} \) and x as follows:

$$ Error=\frac{{\left\Vert x-\widehat{x}\right\Vert}_2}{{\left\Vert x\right\Vert}_2}. $$
(16)

If the relative error is less than \( {10}^{-5} \), \( \widehat{x} \) is considered the correct solution, and the recovery is deemed successful. Otherwise, the recovery is considered to have failed.

If the solution is deemed correct, we increment the counter T correct by 1. To measure the probability of exact reconstruction, we performed 500 trials for each sparsity value k. The probability of exact reconstruction is then computed as follows:

$$ pr=\frac{T_{correct}}{T_{total}} $$
(17)

where T total denotes the total number of attempts; here, T total = 500.

It needs to be pointed out that the matrices Φ(t) were generated only once during the trials.
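The trial loop behind Eqs. (16) and (17) can be sketched as follows (the recover argument is a placeholder for the IRLS routine; the matrix Φ(t) is kept fixed across trials, as noted above, while a fresh k-sparse vector is drawn for each trial):

```python
import numpy as np

def exact_recovery_probability(recover, phi_t, t, N, k, trials=500, tol=1e-5):
    """Monte-Carlo estimate of pr = T_correct / T_total (Eq. (17))."""
    rng = np.random.default_rng(0)
    correct = 0
    for _ in range(trials):
        x = np.zeros(N)
        support = rng.choice(N, size=k, replace=False)   # random non-zero positions
        x[support] = rng.standard_normal(k)
        y = np.kron(phi_t, np.eye(t)) @ x                # sampling, Eq. (7)
        x_hat = recover(phi_t, y, t)
        if np.linalg.norm(x - x_hat) / np.linalg.norm(x) < tol:   # Eq. (16)
            correct += 1
    return correct / trials
```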

For different sparsity values k, the curves of the probability of exact reconstruction are shown in Fig. 1. The curve with t = 1 corresponds to the matrix Φ 128×256; the curves with t = 2 and 4 correspond to Φ 64×128 and Φ 32×64, respectively.

Fig. 1

Comparison of probabilities of exact reconstruction with different dimensions of measurement matrices (M = 128, N = 256). Frame a was obtained from ℓ0-norm minimization with ρ = 0.8. Frame b was obtained from ℓq-norm minimization with q = 0.8 and ρ = 0.8. In frames a and b, t = 1, 2, 4 mean that the dimensions of the measurement matrices are 128 × 256, 64 × 128, and 32 × 64, respectively

As shown in Fig. 1, when the sparsity value k is relatively small, namely k ≤ 20, the probability of exact reconstruction remains almost 100%, regardless of whether the dimensions of the measurement matrix are reduced. As k increases, the probability of exact reconstruction declines. Compared with the probability curve for t = 1, the curves for t = 2 and 4 decline more quickly; nevertheless, they maintain a high probability of exact reconstruction. It is clear that we can recover the original sparse vector with the reduced-dimension matrices. Furthermore, contrasting frames (a) and (b), we can see that when the sparsity value k approaches the limit of M/2, (a) still has a higher probability of reconstruction than (b).

During the comparisons, an issue caught our attention: why does the probability of exact reconstruction decline so quickly when we reduce the dimensions of the matrix Φ(t) (t > 1)?

As shown in Eqs. (9) and (10), when we reduce the dimensions of the matrix Φ(t), the number of distinct coefficients is also reduced; the number of coefficient rows in (9) is only M/t. Furthermore, the measurements y of length M can be divided into M/t groups, where each group consists of t adjacent measurements, such that (y 1, …, y t)T is the first group and (y (i − 1) × t + 1, …, y i × t)T is the ith group. For the ith group,

$$ \left(\begin{array}{c}{y}_{\left(i-1\right)\times t+1}\\ {y}_{\left(i-1\right)\times t+2}\\ \vdots \\ {y}_{i\times t}\end{array}\right)=\left(\begin{array}{c}{\varphi}_{i,1}{x}_1+\cdots +{\varphi}_{i,j}{x}_{\left(j-1\right)\times t+1}+\cdots +{\varphi}_{i,N/t}{x}_{\left(N/t-1\right)\times t+1}\\ {\varphi}_{i,1}{x}_2+\cdots +{\varphi}_{i,j}{x}_{\left(j-1\right)\times t+2}+\cdots +{\varphi}_{i,N/t}{x}_{\left(N/t-1\right)\times t+2}\\ \vdots \\ {\varphi}_{i,1}{x}_t+\cdots +{\varphi}_{i,j}{x}_{\left(j-1\right)\times t+t}+\cdots +{\varphi}_{i,N/t}{x}_{\left(N/t-1\right)\times t+t}\end{array}\right), $$
(18)

where 1 ≤ i ≤ (M/t), 1 ≤ j ≤ (N/t), and φ i,j is defined as in (8).

In (18), we see that the coefficients for the different measurements within the ith group are all the same, namely (φ i,1, …, φ i,N/t). As shown in (10), (y 1, y 2)T, (y 3, y 4)T, and (y 5, y 6)T are such groups.

If the original sparse vector x has a comparably large sparsity value k, more iterations are needed to derive the sparse solution. That is, the rate of convergence is relatively slow compared with conventional CS under the IRLS algorithm, which exacerbates the decline in the probability of exact reconstruction. When the sparsity value k is relatively small, fewer iterations are needed, and the rate of convergence remains fast despite the reduced dimensions of the measurement matrix. To verify this analysis, we compared the rates of convergence for varying sparsity values k and matrices Φ(t); the numerical results are shown in Fig. 2.

Fig. 2

Comparison of the rates of convergence for different dimensions of measurement matrices (M = 128, N = 256, ρ = 0.8). Frame a was obtained from ℓ0-norm minimization with k = 40. Frame b was obtained from ℓ0-norm minimization with k = 60. In frames a and b, t = 1, 2, 4 mean that the dimensions of the measurement matrices are 128 × 256, 64 × 128, and 32 × 64, respectively

The sparse vector used in this experiment had a fixed sparsity value, namely k = 40 or 60. In these comparisons, 500 attempts were made at generating the three different measurement matrices, whereas the sparse vector x was generated only once for a given k. The numerical results on the curves represent the mean over these 500 attempts.

As shown in Fig. 2, for k = 40, the rate of convergence was roughly the same for the different matrices. For k = 60, more iterations were needed to achieve a sparse solution as the value of t increased. This shows that if the original signal is sufficiently sparse, the rate of convergence remains fast despite the reduction in the dimensions of the measurement matrix.

To further compare the probability of exact reconstruction with the matrices Φ(t) for different numbers of measurements, we conducted a third experiment with N = 256 and k = 40. In this experiment, the conditions for completing the reconstruction process and for judging its success were the same as in the first experiment. There were 500 attempts at generating the measurement matrices Φ(t), whereas the sparse vector x was generated only once. The numerical results on the curves represent the mean over 500 attempts; these results are shown in Fig. 3, where the number of measurements M varies from 0 to N.

Fig. 3

Comparison of probabilities of exact reconstruction for different numbers of measurements with different dimensions of measurement matrices (N = 256, k = 40). Frame a was obtained from ℓ0-norm minimization with ρ = 0.8. Frame b was obtained from ℓq-norm minimization with q = 0.8 and ρ = 0.8. In frames a and b, t = 1, 2, 4 mean that the dimensions of the measurement matrices are M × N, (M/2) × (N/2), and (M/4) × (N/4), respectively

As shown in Fig. 3, for the same number of measurements, the probabilities of exact reconstruction differed little between the different measurement matrices. Hence, there was no need to increase the number of measurements to derive the solution when the dimensions of the measurement matrix were reduced.

From the comparisons on one-dimensional sparse signals, we can see that the STP approach can reconstruct a sparse signal with a random measurement matrix Φ(t) (t > 1). Moreover, the performance with a dimension-reduced Gaussian random measurement matrix Φ(t) is generally comparable to that of the random matrix without reduced dimensions. Furthermore, the performance with the dimension-reduced matrix Φ(t) depends on the sparsity of the signal x. In particular, if the original signal is sufficiently sparse, the performance with reduced dimensions is relatively good compared with that of the matrix without reduced dimensions; if the original signal is not sufficiently sparse, the performance declines. Therefore, there is a tradeoff between the reconstruction performance and the dimensionality of the measurement matrix.

4.2 Comparisons with two-dimensional signals

Here, to compare the performance with the matrices Φ(t) for two-dimensional signals, we measured the PSNR values of the reconstructed images.

In these comparisons, the signals were two-dimensional natural images. Signals and natural images must be sparse in a certain transform domain or dictionary in order to be reconstructed exactly within the CS framework. In our experiments, we employed the coefficients of the wavelet transform as two-dimensional compressible signals and projected the coefficients onto a Gaussian random measurement matrix. After deriving the measurements y with a matrix Φ(t), the coefficients were reconstructed by IRLS with approximate ℓ0-minimization. Three natural images of different sizes were used in our experiments: Lena (256 × 256), Peppers (256 × 256), and OT-Colon (512 × 512). OT-Colon is a DICOM gray-scale medical image and can be retrieved from [36].

In this experiment, we set the sampling ratio to 0.8215, 0.75, 0.5, and 0.4375. We then generated Gaussian random measurement matrices Φ(t) with t = 1, 2, 4, 8, and 16. The dimensions of the matrices generated for M/N = 0.5 are shown in Table 2.

Table 2 Different dimensional measurement matrices Φ(t) for images (M/N = 0.5)

As shown in Table 2, as the value of t increases, the storage space is reduced quadratically, such that the storage space of Φ(16) is 1/256th that of Φ(1).

In this simulation, 50 attempts were made at generating the measurement matrices Φ(t), and the wavelet-transform coefficients were generated only once. The reconstruction process uses the IRLS algorithm with ℓ0-minimization per our proposal, where ρ = 0.8. Visual reconstructions, randomly selected from the 50 attempts, are shown in Figs. 4 and 5.

Fig. 4

Comparison of the reconstructed images with different dimensions of measurement matrices (Lena, M = 128, N = 256, ℓ0-minimization with ρ = 0.8). Frame a is the original image; Frame b is the reconstructed image from the original in Frame a using the matrix Φ 128×256; Frame c is the reconstructed image using the matrix Φ 64×128; Frame d is the reconstructed image using the matrix Φ 32×64; Frame e is the reconstructed image using the matrix Φ 16×32; Frame f is the reconstructed image using the matrix Φ 8×16

Fig. 5

Comparison of the reconstructed images with different dimensions of measurement matrices (OT-Colon, M = 256, N = 512, ℓ0-minimization with ρ = 0.8). Frame a is the original image; Frame b is the reconstructed image from the original in Frame a using the matrix Φ 256×512; Frame c is the reconstructed image using the matrix Φ 128×256; Frame d is the reconstructed image using the matrix Φ 64×128; Frame e is the reconstructed image using the matrix Φ 32×64; Frame f is the reconstructed image using the matrix Φ 16×32

By comparing frames (b–f) in Figs. 4 and 5, we can see that the quality of the reconstructed images remained high. That is, the subjective visual quality of the reconstructed images barely declined, despite reducing the storage space needed for the matrices and the memory required for reconstruction by a factor of t 2, where t can be 2, 4, 8, or 16. We then used the peak signal-to-noise ratio (PSNR) to evaluate the quality of the reconstructed images. A total of 50 attempts were carried out, and the maximum (Max), minimum (Min), and mean PSNR values are listed in Table 3.
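For reference, PSNR is computed here with its standard definition, as in the following sketch; the peak value of 255 assumes an 8-bit image, and would have to be adjusted for the higher dynamic range of the DICOM image:

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Standard peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE)."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```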

Table 3 Comparison of the PSNR of reconstructions with different dimensions of Gaussian measurement matrices (ℓ0-minimization with ρ = 0.8)

Among the maximum values listed, there is almost no difference in the PSNR values, regardless of whether the dimensions of the measurement matrix were reduced by a factor of t 2, even 256 times. Moreover, some values for t > 1 were greater than those for t = 1. This indicates that the proposed algorithm can sample and reconstruct sparse signals with dimension-reduced measurement matrices while maintaining high quality. We can therefore confirm that the quality of the reconstructed image relies significantly on the particular random matrix generated, rather than on its dimensions. This means that if we generate a suitable random matrix (that is, one that satisfies the RIP and NSP), we can obtain a precise reconstruction even if the dimensions of the matrix are reduced.

Among the minimum values listed, some values for t > 1 were significantly lower than those for t = 1. For instance, when the sampling rate for Lena was 0.4375 and t = 16, the PSNR was only 8.2242 dB. By calculating the corresponding mutual coherence μ(Θ 16), we found that it was considerably greater than the others. This confirms that the random matrix Φ(t) (t > 1) we generate should satisfy the RIP and NSP appropriately; by ensuring this, we can improve the stability of the reconstruction quality.

Like Gaussian random matrices, Bernoulli, Hadamard, and Toeplitz matrices are also random. We therefore sought to verify the reconstruction performance with these matrices. Again, 50 attempts were made at generating random matrices with different dimensions, and the wavelet coefficients were generated only once for each natural image. The IRLS reconstruction algorithm was used with approximate ℓ0-minimization. The results are shown in Table 4. We used the 256 × 256 Lena and Peppers images.

Table 4 Tests on other random measurement matrices with different dimensions (M/N = 0.5, ℓ0-minimization with ρ = 0.8)

As demonstrated by the results in Table 4, the proposed STP approach is suitable for other random measurement matrices as well. It produces high-quality images with suitably reduced dimensions of the random matrix.

Furthermore, to compare the approach with other low-memory techniques, the PSNR of the reconstructed Lena image under different sampling ratios is shown in Fig. 6. The curves represent the mean over 50 attempts.

Fig. 6

Comparison of performance with other low-memory techniques (Lena, ℓ0-minimization with ρ = 0.8). The curves for t = 2, 4, 8, and 16 were obtained from ℓ0-minimization per our proposal, where ρ = 0.8; KCS was obtained with the Kronecker CS approach in [15]; LDPC was obtained with a deterministic measurement matrix in [12]

As shown in Fig. 6, the PSNR of the reconstructed Lena image at different sampling ratios was better than that of the other two low-memory techniques.

In these experiments, we focused on the performance of the reconstructed one- and two-dimensional signals, where the dimensions of the random measurement matrices were reduced by a factor of t 2 (t ≥ 1). As mentioned above, when t = 1, the dimensions of the random matrix are not reduced, which can be treated as conventional CS. When t > 1, e.g., t = 2, the dimensions of the random matrix are reduced by a factor of four, and for t = 8 they are reduced by a factor of 64. From the results, we can see that increasing the value of t is an effective way to reduce the storage space of the random measurement matrix and the memory required for reconstruction. The tradeoff between more precise reconstruction and constrained storage requirements depends on the specific application, and this can significantly influence the physical implementation of CS for images, especially on embedded systems and FPGAs, where storage is limited.

5 Conclusions

In this paper, a novel STP-CS approach was proposed. Our work aimed at reducing the storage space needed by conventional compressive sensing. We provided a theoretical analysis of the acquisition process of STP-CS and of the recovery algorithm based on IRLS. Furthermore, numerical experiments were conducted on one-dimensional sparse signals and two-dimensional compressible signals, where the two-dimensional signals were wavelet-transform coefficients. A comparison of the numerical results demonstrated the effectiveness of the STP approach. Moreover, while the proposed STP approach does not improve the quality of the reconstructed signal, it reduces the storage space of the measurement matrix and the memory requirements for sampling and reconstruction. With a suitable reduction of the dimensionality of the random measurement matrix (e.g., t = 2, 4, 8, or 16), we achieved a recovery performance similar to that obtained when t = 1, while the storage requirements were reduced by a factor of t 2.

Although the dimensions of the random matrix were reduced and the PSNR of the reconstructed signal declined somewhat, it is possible to improve the accuracy, provided that the generated random matrix satisfies the RIP and NSP appropriately. Moreover, the proposed algorithm is easy to implement, and no additional operations for sampling and reconstruction are necessary. This can significantly ease the physical implementation of CS.

However, more investigation is required to improve the recovery performance and optimize the sampling and reconstruction processes. Further work remains in constructing independent identically distributed random matrices with fewer dimensions. Moreover, we shall attempt to optimize the matrix based on QR decomposition, which could help to improve the incoherence between the measurement matrix and the sparse basis and thereby improve the reconstruction performance. This also motivates us to employ other measurement matrices, such as the structurally random matrix, the low-rank matrix, and the rank-one matrix, in order to reduce the required storage while maintaining or improving quality. A parallel framework [37, 38] will also be considered to reduce the time consumed during reconstruction.