1 Introduction

Recently, multiple-input multiple-output (MIMO) relay communication systems have attracted much research interest, as they provide significant improvements in both spectral efficiency and link reliability [1-17]. Many works have studied the optimal relay amplifying matrix for the source-relay-destination channel. In [2, 3], the optimal relay amplifying matrix maximizing the mutual information (MI) between the source and destination nodes was derived, assuming that the source covariance matrix is an identity matrix. In [4-6], the optimal relay amplifying matrix was designed to minimize the mean-squared error (MSE) of the signal waveform estimation at the destination.

Only a few works have studied the joint optimization of the source precoding matrix and the relay amplifying matrix for the source-relay-destination channel. In [7], the source and relay matrices were jointly designed to maximize the source-destination MI. In [8, 9], source and relay matrices were developed to jointly optimize a broad class of objective functions. The author of [10] investigated the joint source and relay optimization for two-way MIMO relay systems using the projected gradient (PG) approach. The source and relay optimization for multi-user MIMO relay systems with a single relay node has been investigated in [11-14].

All the works in [1-14] considered a single relay node at each hop. In general, the joint design of the source and relay precoding matrices for MIMO relay systems with multiple relay nodes is more challenging than that for single-relay systems. The authors of [15] developed the optimal relay amplifying matrices for systems with multiple relay nodes. A matrix-form conjugate gradient algorithm was proposed in [16] to optimize the source and relay matrices. In [17], the authors proposed a suboptimal source and relay matrices design for parallel MIMO relay systems by first relaxing the power constraint at each relay node to a sum power constraint at the output of the second-hop channel and then scaling the relay matrices to satisfy the individual relay power constraints.

In this paper, we propose a jointly optimal design of the source precoding matrix and the relay amplifying matrices for a two-hop non-regenerative MIMO relay network with multiple relay nodes using the PG approach. We show that the optimal relay amplifying matrices have a beamforming structure. This new result is not available in [16], and it generalizes the optimal source and relay matrices design from the single relay node case [8] to the multiple parallel relay nodes scenario. Exploiting the structure of the relay matrices, an iterative joint source and relay matrices optimization algorithm is developed to minimize the MSE of the signal waveform estimation. Different from [17], we develop the optimal source and relay matrices by directly considering the transmission power constraint at each relay node. Simulation results demonstrate the effectiveness of the proposed iterative joint source and relay design algorithm.

The rest of this paper is organized as follows. In Section 2, we introduce the model of a non-regenerative MIMO relay communication system with parallel relay nodes. The joint source and relay matrices design algorithm is developed in Section 3. In Section 4, we show some numerical simulations. Conclusions are drawn in Section 5.

2 System model

In this section, we introduce the model of a two-hop MIMO relay communication system consisting of one source node, K parallel relay nodes, and one destination node, as shown in Figure 1. We assume that the source and destination nodes have $N_s$ and $N_d$ antennas, respectively, and each relay node has $N_r$ antennas. The generalization to systems with different numbers of antennas at each relay node is straightforward. Due to its merit of simplicity, a linear non-regenerative strategy is applied at each relay node. The communication between the source and destination nodes is completed in two time slots. In the first time slot, the $N_b \times 1$ ($N_b \le N_s$) modulated source symbol vector $\mathbf{s}$ is linearly precoded as

$\mathbf{x} = \mathbf{B}\mathbf{s},$
(1)
Figure 1. Block diagram of a parallel MIMO relay communication system.

where $\mathbf{B}$ is an $N_s \times N_b$ source precoding matrix. We assume that the source signal vector satisfies $E[\mathbf{s}\mathbf{s}^H] = \mathbf{I}_{N_b}$, where $\mathbf{I}_n$ stands for an $n \times n$ identity matrix, $(\cdot)^H$ is the matrix (vector) Hermitian transpose, and $E[\cdot]$ denotes statistical expectation. The precoded vector $\mathbf{x}$ is transmitted to the K parallel relay nodes. The $N_r \times 1$ received signal vector at the i-th relay node can be written as

$\mathbf{y}_{r,i} = \mathbf{H}_{sr,i}\mathbf{x} + \mathbf{v}_{r,i}, \qquad i = 1, \ldots, K,$
(2)

where $\mathbf{H}_{sr,i}$ is the $N_r \times N_s$ MIMO channel matrix between the source node and the i-th relay node, and $\mathbf{v}_{r,i}$ is the additive Gaussian noise vector at the i-th relay node.

In the second time slot, the source node is silent, while each relay node transmits the linearly amplified signal vector to the destination node as

$\mathbf{x}_{r,i} = \mathbf{F}_i\mathbf{y}_{r,i}, \qquad i = 1, \ldots, K,$
(3)

where $\mathbf{F}_i$ is the $N_r \times N_r$ amplifying matrix at the i-th relay node. The received signal vector at the destination node can be written as

$\mathbf{y}_d = \sum_{i=1}^{K} \mathbf{H}_{rd,i}\mathbf{x}_{r,i} + \mathbf{v}_d,$
(4)

where $\mathbf{H}_{rd,i}$ is the $N_d \times N_r$ MIMO channel matrix between the i-th relay node and the destination node, and $\mathbf{v}_d$ is the additive Gaussian noise vector at the destination node.

Substituting (1) to (3) into (4), we have

$\mathbf{y}_d = \sum_{i=1}^{K}\left(\mathbf{H}_{rd,i}\mathbf{F}_i\mathbf{H}_{sr,i}\mathbf{B}\mathbf{s} + \mathbf{H}_{rd,i}\mathbf{F}_i\mathbf{v}_{r,i}\right) + \mathbf{v}_d = \mathbf{H}_{rd}\mathbf{F}\mathbf{H}_{sr}\mathbf{B}\mathbf{s} + \mathbf{H}_{rd}\mathbf{F}\mathbf{v}_r + \mathbf{v}_d \triangleq \tilde{\mathbf{H}}\mathbf{s} + \tilde{\mathbf{v}},$
(5)

where $\mathbf{H}_{sr} \triangleq \left[\mathbf{H}_{sr,1}^T, \mathbf{H}_{sr,2}^T, \ldots, \mathbf{H}_{sr,K}^T\right]^T$ is the $KN_r \times N_s$ channel matrix between the source node and all relay nodes, $\mathbf{H}_{rd} \triangleq \left[\mathbf{H}_{rd,1}, \mathbf{H}_{rd,2}, \ldots, \mathbf{H}_{rd,K}\right]$ is the $N_d \times KN_r$ channel matrix between all relay nodes and the destination node, $\mathbf{F} \triangleq \mathrm{bd}\left[\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_K\right]$ is the $KN_r \times KN_r$ block-diagonal equivalent relay matrix, $\mathbf{v}_r \triangleq \left[\mathbf{v}_{r,1}^T, \mathbf{v}_{r,2}^T, \ldots, \mathbf{v}_{r,K}^T\right]^T$ is obtained by stacking the noise vectors at all the relays, $\tilde{\mathbf{H}} \triangleq \mathbf{H}_{rd}\mathbf{F}\mathbf{H}_{sr}\mathbf{B}$ is the effective MIMO channel matrix of the source-relay-destination link, and $\tilde{\mathbf{v}} \triangleq \mathbf{H}_{rd}\mathbf{F}\mathbf{v}_r + \mathbf{v}_d$ is the equivalent noise vector. Here, $(\cdot)^T$ denotes the matrix (vector) transpose, and $\mathrm{bd}[\,\cdot\,]$ constructs a block-diagonal matrix. We assume that all noises are independent and identically distributed (i.i.d.) Gaussian with zero mean and unit variance. The transmission power consumed by each relay node in (3) can be expressed as

$E\left[\mathrm{tr}\left(\mathbf{x}_{r,i}\mathbf{x}_{r,i}^H\right)\right] = \mathrm{tr}\left(\mathbf{F}_i\left(\mathbf{H}_{sr,i}\mathbf{B}\mathbf{B}^H\mathbf{H}_{sr,i}^H + \mathbf{I}_{N_r}\right)\mathbf{F}_i^H\right), \qquad i = 1, \ldots, K,$
(6)

where $\mathrm{tr}(\cdot)$ stands for the matrix trace.
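To make the stacked notation concrete, the following NumPy sketch (dimensions and random channels are arbitrary illustrative values, not the simulation settings of Section 4) assembles $\mathbf{H}_{sr}$, $\mathbf{H}_{rd}$, and $\mathbf{F} = \mathrm{bd}[\mathbf{F}_1, \ldots, \mathbf{F}_K]$ and checks that the block form of $\tilde{\mathbf{H}}$ in (5) equals the per-relay sum:

```python
import numpy as np

# Illustrative example of the equivalent single-hop model in (5).
rng = np.random.default_rng(0)
Ns, Nr, Nd, Nb, K = 3, 3, 3, 2, 2

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

B = crandn(Ns, Nb)                                 # source precoder
H_sr = [crandn(Nr, Ns) for _ in range(K)]          # first-hop channels
H_rd = [crandn(Nd, Nr) for _ in range(K)]          # second-hop channels
F = [crandn(Nr, Nr) for _ in range(K)]             # relay amplifying matrices

# Stacked/block forms used in (5)
H_sr_stack = np.vstack(H_sr)                       # K*Nr x Ns
H_rd_stack = np.hstack(H_rd)                       # Nd x K*Nr
F_bd = np.zeros((K * Nr, K * Nr), dtype=complex)   # bd[F_1, ..., F_K]
for i in range(K):
    F_bd[i * Nr:(i + 1) * Nr, i * Nr:(i + 1) * Nr] = F[i]

H_tilde = H_rd_stack @ F_bd @ H_sr_stack @ B       # effective channel H~

# Sanity check: the block form matches the per-relay sum in (5)
H_sum = sum(H_rd[i] @ F[i] @ H_sr[i] @ B for i in range(K))
assert np.allclose(H_tilde, H_sum)
```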

Using a linear receiver, the estimated signal waveform vector at the destination node is given by $\hat{\mathbf{s}} = \mathbf{W}^H\mathbf{y}_d$, where $\mathbf{W}$ is an $N_d \times N_b$ weight matrix. The MSE of the signal waveform estimation is given by

$\mathrm{MSE} = \mathrm{tr}\left(E\left[(\hat{\mathbf{s}} - \mathbf{s})(\hat{\mathbf{s}} - \mathbf{s})^H\right]\right) = \mathrm{tr}\left(\left(\mathbf{W}^H\tilde{\mathbf{H}} - \mathbf{I}_{N_b}\right)\left(\mathbf{W}^H\tilde{\mathbf{H}} - \mathbf{I}_{N_b}\right)^H + \mathbf{W}^H\tilde{\mathbf{C}}\mathbf{W}\right),$
(7)

where $\tilde{\mathbf{C}}$ is the equivalent noise covariance matrix given by $\tilde{\mathbf{C}} = E\left[\tilde{\mathbf{v}}\tilde{\mathbf{v}}^H\right] = \mathbf{H}_{rd}\mathbf{F}\mathbf{F}^H\mathbf{H}_{rd}^H + \mathbf{I}_{N_d}$. The weight matrix $\mathbf{W}$ that minimizes (7) is the Wiener filter and can be written as

$\mathbf{W} = \left(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H + \tilde{\mathbf{C}}\right)^{-1}\tilde{\mathbf{H}},$
(8)

where $(\cdot)^{-1}$ denotes the matrix inversion. Substituting (8) back into (7), it can be seen that the MSE is a function of $\mathbf{F}$ and $\mathbf{B}$ and can be written as

$\mathrm{MSE} = \mathrm{tr}\left(\left[\mathbf{I}_{N_b} + \tilde{\mathbf{H}}^H\tilde{\mathbf{C}}^{-1}\tilde{\mathbf{H}}\right]^{-1}\right).$
(9)
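The equivalence of the MSE expressions (7) and (9) at the Wiener filter (8) can be verified numerically. In the sketch below, random matrices stand in for $\tilde{\mathbf{H}}$ and $\mathbf{H}_{rd}\mathbf{F}$ (all dimensions are illustrative):

```python
import numpy as np

# Numerical check that the Wiener filter (8) attains the compact MSE (9).
rng = np.random.default_rng(1)
Nd, Nb, KNr = 4, 2, 6

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

H_t = crandn(Nd, Nb)                     # stand-in for H~ (Nd x Nb)
G = crandn(Nd, KNr)                      # stand-in for H_rd F
C_t = G @ G.conj().T + np.eye(Nd)        # C~ = H_rd F F^H H_rd^H + I

# Wiener filter (8): W = (H~ H~^H + C~)^{-1} H~
W = np.linalg.solve(H_t @ H_t.conj().T + C_t, H_t)

# MSE via (7)
E = W.conj().T @ H_t - np.eye(Nb)
mse7 = np.trace(E @ E.conj().T + W.conj().T @ C_t @ W).real

# MSE via (9)
mse9 = np.trace(
    np.linalg.inv(np.eye(Nb) + H_t.conj().T @ np.linalg.inv(C_t) @ H_t)
).real
assert np.isclose(mse7, mse9)
```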

3 Joint source and relay matrix optimization

In this section, we address the joint source and relay matrix optimization problem for MIMO multi-relay systems with a linear minimum mean-squared error (MMSE) receiver at the destination node. In particular, we show that optimal relay matrices have a general beamforming structure. Based on (6) and (9), the joint source and relay matrices optimization problem can be formulated as

$\min_{\{\mathbf{F}_i\},\,\mathbf{B}} \ \mathrm{tr}\left(\left[\mathbf{I}_{N_b} + \tilde{\mathbf{H}}^H\tilde{\mathbf{C}}^{-1}\tilde{\mathbf{H}}\right]^{-1}\right)$
(10)
$\text{s.t.} \ \mathrm{tr}\left(\mathbf{B}\mathbf{B}^H\right) \le P_s,$
(11)
$\mathrm{tr}\left(\mathbf{F}_i\left(\mathbf{H}_{sr,i}\mathbf{B}\mathbf{B}^H\mathbf{H}_{sr,i}^H + \mathbf{I}_{N_r}\right)\mathbf{F}_i^H\right) \le P_{r,i}, \qquad i = 1, \ldots, K,$
(12)

where $\{\mathbf{F}_i\} \triangleq \{\mathbf{F}_i, i = 1, \ldots, K\}$, (11) is the transmit power constraint at the source node, and (12) is the power constraint at each relay node. Here, $P_s > 0$ and $P_{r,i} > 0$, $i = 1, \ldots, K$, are the corresponding power budgets. Obviously, to avoid any loss of transmission power in the relay system when a linear receiver is used, there should be $N_b \le \min(KN_r, N_d)$. The problem (10)-(12) is non-convex, and a globally optimal solution of $\mathbf{B}$ and $\{\mathbf{F}_i\}$ is difficult to obtain with a reasonable computational complexity. In this paper, we develop an iterative algorithm to optimize $\mathbf{B}$ and $\{\mathbf{F}_i\}$. First, we show the optimal structure of $\{\mathbf{F}_i\}$.

3.1 Optimal structure of relay amplifying matrices

For a given source matrix $\mathbf{B}$ satisfying (11), the relay matrices $\{\mathbf{F}_i\}$ are optimized by solving the following problem:

$\min_{\{\mathbf{F}_i\}} \ \mathrm{tr}\left(\left[\mathbf{I}_{N_b} + \tilde{\mathbf{H}}^H\tilde{\mathbf{C}}^{-1}\tilde{\mathbf{H}}\right]^{-1}\right)$
(13)
$\text{s.t.} \ \mathrm{tr}\left(\mathbf{F}_i\left(\mathbf{H}_{sr,i}\mathbf{B}\mathbf{B}^H\mathbf{H}_{sr,i}^H + \mathbf{I}_{N_r}\right)\mathbf{F}_i^H\right) \le P_{r,i}, \qquad i = 1, \ldots, K.$
(14)

Let us introduce the following singular value decompositions (SVDs):

$\mathbf{H}_{sr,i}\mathbf{B} = \mathbf{U}_{s,i}\boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H, \qquad \mathbf{H}_{rd,i} = \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{V}_{r,i}^H, \qquad i = 1, \ldots, K,$
(15)

where $\boldsymbol{\Lambda}_{s,i}$ and $\boldsymbol{\Lambda}_{r,i}$ are $R_{s,i} \times R_{s,i}$ and $R_{r,i} \times R_{r,i}$ diagonal matrices, respectively. Here, $R_{s,i} \triangleq \mathrm{rank}\left(\mathbf{H}_{sr,i}\mathbf{B}\right)$, $R_{r,i} \triangleq \mathrm{rank}\left(\mathbf{H}_{rd,i}\right)$, $i = 1, \ldots, K$, and $\mathrm{rank}(\cdot)$ denotes the rank of a matrix. Based on the definition of matrix rank, $R_{s,i} \le \min(N_r, N_b)$ and $R_{r,i} \le \min(N_r, N_d)$. The following theorem states the structure of the optimal $\{\mathbf{F}_i\}$.

Theorem 1.

Using the SVDs in (15), the optimal structure of $\mathbf{F}_i$ as the solution to the problem (13)-(14) is given by

$\mathbf{F}_i = \mathbf{V}_{r,i}\mathbf{A}_i\mathbf{U}_{s,i}^H, \qquad i = 1, \ldots, K,$
(16)

where $\mathbf{A}_i$ is an $R_{r,i} \times R_{s,i}$ matrix, $i = 1, \ldots, K$.

Proof

See Appendix 1.

The remaining task is to find the optimal $\mathbf{A}_i$, $i = 1, \ldots, K$. From (31) and (32) in Appendix 1, we can equivalently rewrite the optimization problem (13)-(14) as

$\min_{\{\mathbf{A}_i\}} \ \mathrm{tr}\left(\left[\mathbf{I}_{N_b} + \left(\sum_{i=1}^{K}\mathbf{V}_{s,i}\boldsymbol{\Lambda}_{s,i}\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H\right)\left(\sum_{i=1}^{K}\mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H + \mathbf{I}_{N_d}\right)^{-1}\left(\sum_{i=1}^{K}\mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H\right)\right]^{-1}\right)$
(17)
$\text{s.t.} \ \mathrm{tr}\left(\mathbf{A}_i\left(\boldsymbol{\Lambda}_{s,i}^2 + \mathbf{I}_{R_{s,i}}\right)\mathbf{A}_i^H\right) \le P_{r,i}, \qquad i = 1, \ldots, K.$
(18)

Both the problem (13)-(14) and the problem (17)-(18) have matrix optimization variables. However, in the former problem, the optimization variable $\mathbf{F}_i$ is an $N_r \times N_r$ matrix, while the dimension of $\mathbf{A}_i$ is $R_{r,i} \times R_{s,i}$, which may be smaller than that of $\mathbf{F}_i$. Thus, solving the problem (17)-(18) has a smaller computational complexity than solving the problem (13)-(14). In general, the problem (17)-(18) is non-convex, and a globally optimal solution is difficult to obtain with a reasonable computational complexity. Fortunately, we can resort to numerical methods, such as the projected gradient algorithm [18], to find (at least) a locally optimal solution of (17)-(18).

Theorem 2.

Let us define the objective function in (17) as $f(\mathbf{A}_i)$. Its gradient $\nabla f(\mathbf{A}_i)$ with respect to $\mathbf{A}_i$ can be calculated by using results on derivatives of matrices in [19] as

$\nabla f(\mathbf{A}_i) = 2\left[\mathbf{R}_i^H\mathbf{M}_i^H\left(\mathbf{E}_i\mathbf{S}_i^H + \mathbf{D}_i^H\right) - \mathbf{R}_i^H\mathbf{G}_i^{-H}\mathbf{E}_i\mathbf{S}_i^H\right], \qquad i = 1, \ldots, K,$
(19)

where $\mathbf{M}_i$, $\mathbf{R}_i$, $\mathbf{S}_i$, $\mathbf{D}_i$, $\mathbf{E}_i$, and $\mathbf{G}_i$ are defined in Appendix 2.

Proof

See Appendix 2.

In each iteration of the PG algorithm, we first obtain $\tilde{\mathbf{A}}_i = \mathbf{A}_i - s_n\nabla f(\mathbf{A}_i)$ by moving $\mathbf{A}_i$ one step in the negative gradient direction of $f(\mathbf{A}_i)$, where $s_n > 0$ is the step size. Since $\tilde{\mathbf{A}}_i$ might not satisfy the constraint (18), we need to project it onto the set given by (18). The projected matrix $\bar{\mathbf{A}}_i$ is obtained by minimizing the Frobenius norm of $\bar{\mathbf{A}}_i - \tilde{\mathbf{A}}_i$ (according to [18]) subject to (18), which can be formulated as the following optimization problem:

$\min_{\bar{\mathbf{A}}_i} \ \mathrm{tr}\left(\left(\bar{\mathbf{A}}_i - \tilde{\mathbf{A}}_i\right)\left(\bar{\mathbf{A}}_i - \tilde{\mathbf{A}}_i\right)^H\right)$
(20)
$\text{s.t.} \ \mathrm{tr}\left(\bar{\mathbf{A}}_i\left(\boldsymbol{\Lambda}_{s,i}^2 + \mathbf{I}_{R_{s,i}}\right)\bar{\mathbf{A}}_i^H\right) \le P_{r,i}.$
(21)

Obviously, if $\mathrm{tr}\left(\tilde{\mathbf{A}}_i\left(\boldsymbol{\Lambda}_{s,i}^2 + \mathbf{I}_{R_{s,i}}\right)\tilde{\mathbf{A}}_i^H\right) \le P_{r,i}$, then $\bar{\mathbf{A}}_i = \tilde{\mathbf{A}}_i$. Otherwise, the solution to the problem (20)-(21) can be obtained by the Lagrange multiplier method and is given by

$\bar{\mathbf{A}}_i = \tilde{\mathbf{A}}_i\left[(\lambda + 1)\mathbf{I}_{R_{s,i}} + \lambda\boldsymbol{\Lambda}_{s,i}^2\right]^{-1},$

where $\lambda > 0$ is the solution to the non-linear equation

$\mathrm{tr}\left(\tilde{\mathbf{A}}_i\left[(\lambda + 1)\mathbf{I}_{R_{s,i}} + \lambda\boldsymbol{\Lambda}_{s,i}^2\right]^{-1}\left(\boldsymbol{\Lambda}_{s,i}^2 + \mathbf{I}_{R_{s,i}}\right)\left[(\lambda + 1)\mathbf{I}_{R_{s,i}} + \lambda\boldsymbol{\Lambda}_{s,i}^2\right]^{-1}\tilde{\mathbf{A}}_i^H\right) = P_{r,i}.$
(22)

Equation (22) can be efficiently solved by the bisection method [18].
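A minimal sketch of the projection step (20)-(22) is given below, assuming a generic diagonal matrix standing in for $\boldsymbol{\Lambda}_{s,i}^2 + \mathbf{I}_{R_{s,i}}$ (all values are illustrative); bracketing the root and then bisecting is one straightforward way to solve (22):

```python
import numpy as np

# Projection of A~_i onto the power-constraint set (18) via bisection on (22).
rng = np.random.default_rng(2)
Rr, Rs, Pr = 3, 3, 1.0
A_t = rng.standard_normal((Rr, Rs))              # stand-in for A~_i
D = np.diag(rng.uniform(1.0, 3.0, Rs))           # stand-in for Lambda_s^2 + I

def power(lam):
    # tr{ A~ [(lam+1)I + lam*Lambda_s^2]^{-1} D [(lam+1)I + lam*Lambda_s^2]^{-1} A~^H }
    # Diagonal entries of (lam+1)I + lam*Lambda_s^2 are 1 + lam*d_j with d_j = diag(D).
    M = np.diag(1.0 / (1.0 + lam * np.diag(D)))
    Ab = A_t @ M
    return np.trace(Ab @ D @ Ab.T)

if power(0.0) <= Pr:
    A_bar = A_t                                  # already feasible: A_bar = A~
else:
    lo, hi = 0.0, 1.0
    while power(hi) > Pr:                        # bracket the root of power(lam) = Pr
        hi *= 2.0
    for _ in range(60):                          # bisection (power is monotone in lam)
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if power(mid) > Pr else (lo, mid)
    lam = 0.5 * (lo + hi)
    A_bar = A_t @ np.diag(1.0 / (1.0 + lam * np.diag(D)))

# The projected matrix satisfies the constraint (with equality if it was active)
assert np.trace(A_bar @ D @ A_bar.T) <= Pr + 1e-6
```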

The procedure of the PG algorithm is listed in Algorithm 1, where $(\cdot)^{(n)}$ denotes the variable at the n-th iteration, $\delta_n$ and $s_n$ are the step size parameters at the n-th iteration, $\|\cdot\|$ denotes the maximum absolute value among all elements of a matrix, and $\varepsilon$ is a positive constant close to 0. The step size parameters $\delta_n$ and $s_n$ are determined by the Armijo rule [18], i.e., $s_n = s$ is a constant through all iterations, while at the n-th iteration, $\delta_n$ is set to $\gamma^{m_n}$. Here, $m_n$ is the minimal non-negative integer that satisfies the inequality $f\left(\mathbf{A}_i^{(n+1)}\right) - f\left(\mathbf{A}_i^{(n)}\right) \le \alpha\gamma^{m_n}\,\mathrm{tr}\left(\nabla f\left(\mathbf{A}_i^{(n)}\right)^H\left(\bar{\mathbf{A}}_i^{(n)} - \mathbf{A}_i^{(n)}\right)\right)$, where $\alpha$ and $\gamma$ are constants. According to [18], $\alpha$ is usually chosen close to 0, for example, $\alpha \in [10^{-5}, 10^{-1}]$, while a proper choice of $\gamma$ is normally from 0.1 to 0.5.
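The Armijo backtracking test can be sketched as follows on a toy quadratic objective standing in for $f(\mathbf{A}_i)$ (the projection step is omitted here, so this shows only the step-size logic, not the full PG algorithm):

```python
import numpy as np

# One Armijo-rule backtracking step: shrink gamma^m until sufficient decrease holds.
alpha, gamma, s = 1e-4, 0.5, 1.0     # typical constants per the Armijo rule

def f(A):
    return np.sum(A * A)             # toy objective; its gradient is 2A

A = np.ones((2, 2))
grad = 2.0 * A
A_proj = A - s * grad                # candidate point (projection omitted in this toy)

m = 0
while True:
    delta = gamma ** m
    A_new = A + delta * (A_proj - A)
    # Sufficient-decrease test: f(A_new) - f(A) <= alpha*delta*tr(grad^H (A_proj - A))
    if f(A_new) - f(A) <= alpha * delta * np.trace(grad.T @ (A_proj - A)):
        break
    m += 1

assert f(A + gamma ** m * (A_proj - A)) < f(A)   # accepted step decreases f
```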

3.2 Optimal source precoding matrix

With fixed $\{\mathbf{F}_i\}$, the source precoding matrix $\mathbf{B}$ is optimized by solving the following problem:

$\min_{\mathbf{B}} \ \mathrm{tr}\left(\left[\mathbf{I}_{N_b} + \mathbf{B}^H\boldsymbol{\Psi}\mathbf{B}\right]^{-1}\right)$
(23)
$\text{s.t.} \ \mathrm{tr}\left(\mathbf{B}\mathbf{B}^H\right) \le P_s,$
(24)
$\mathrm{tr}\left(\mathbf{F}_i\mathbf{H}_{sr,i}\mathbf{B}\mathbf{B}^H\mathbf{H}_{sr,i}^H\mathbf{F}_i^H\right) \le \breve{P}_{r,i}, \qquad i = 1, \ldots, K,$
(25)

where $\boldsymbol{\Psi} \triangleq \mathbf{H}_{sr}^H\mathbf{F}^H\mathbf{H}_{rd}^H\left(\mathbf{H}_{rd}\mathbf{F}\mathbf{F}^H\mathbf{H}_{rd}^H + \mathbf{I}_{N_d}\right)^{-1}\mathbf{H}_{rd}\mathbf{F}\mathbf{H}_{sr}$ and $\breve{P}_{r,i} \triangleq P_{r,i} - \mathrm{tr}\left(\mathbf{F}_i\mathbf{F}_i^H\right)$, $i = 1, \ldots, K$. Let us introduce $\boldsymbol{\Omega} \triangleq \mathbf{B}\mathbf{B}^H$ and a positive semi-definite (PSD) matrix $\mathbf{X}$ with $\mathbf{X} \succeq \left[\mathbf{I}_{N_s} + \boldsymbol{\Psi}^{1/2}\boldsymbol{\Omega}\boldsymbol{\Psi}^{1/2}\right]^{-1}$, where $\mathbf{A} \succeq \mathbf{B}$ means that $\mathbf{A} - \mathbf{B}$ is a PSD matrix. By using the Schur complement [20], the problem (23)-(25) can be equivalently converted to the following problem:

$\min_{\mathbf{X},\,\boldsymbol{\Omega}} \ \mathrm{tr}\left(\mathbf{X}\right) - N_s + N_b$
(26)
$\text{s.t.} \ \begin{bmatrix}\mathbf{X} & \mathbf{I}_{N_s}\\ \mathbf{I}_{N_s} & \mathbf{I}_{N_s} + \boldsymbol{\Psi}^{1/2}\boldsymbol{\Omega}\boldsymbol{\Psi}^{1/2}\end{bmatrix} \succeq \mathbf{0},$
(27)
$\mathrm{tr}\left(\boldsymbol{\Omega}\right) \le P_s, \qquad \boldsymbol{\Omega} \succeq \mathbf{0},$
(28)
$\mathrm{tr}\left(\mathbf{F}_i\mathbf{H}_{sr,i}\boldsymbol{\Omega}\mathbf{H}_{sr,i}^H\mathbf{F}_i^H\right) \le \breve{P}_{r,i}, \qquad i = 1, \ldots, K.$
(29)

The problem (26)-(29) is a convex semi-definite programming (SDP) problem which can be efficiently solved by the interior point method [20]. Let us introduce the eigenvalue decomposition (EVD) $\boldsymbol{\Omega} = \mathbf{U}_{\Omega}\boldsymbol{\Lambda}_{\Omega}\mathbf{U}_{\Omega}^H$, where $\boldsymbol{\Lambda}_{\Omega}$ is an $R_{\Omega} \times R_{\Omega}$ eigenvalue matrix with $R_{\Omega} = \mathrm{rank}(\boldsymbol{\Omega})$. If $R_{\Omega} = N_b$, then from $\boldsymbol{\Omega} = \mathbf{B}\mathbf{B}^H$, we have $\mathbf{B} = \mathbf{U}_{\Omega}\boldsymbol{\Lambda}_{\Omega}^{1/2}$. If $R_{\Omega} > N_b$, the randomization technique [21] can be applied to obtain a possibly suboptimal solution of $\mathbf{B}$ with rank $N_b$. If $R_{\Omega} < N_b$, the system (channel) cannot support $N_b$ independent data streams, and thus a smaller $N_b$ should be chosen in the system design.
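The factorization step after solving the SDP can be sketched as follows; here $\boldsymbol{\Omega}$ is built from a random rank-$N_b$ matrix standing in for the SDP solution (the SDP solve itself is not shown):

```python
import numpy as np

# Recover B = U_Omega Lambda_Omega^{1/2} from Omega = B B^H when rank(Omega) = Nb.
rng = np.random.default_rng(3)
Ns, Nb = 4, 2
B_true = rng.standard_normal((Ns, Nb)) + 1j * rng.standard_normal((Ns, Nb))
Omega = B_true @ B_true.conj().T            # PSD, rank Nb (stand-in for SDP output)

w, U = np.linalg.eigh(Omega)                # EVD, eigenvalues in ascending order
idx = np.argsort(w)[::-1][:Nb]              # keep the Nb dominant eigenpairs
B = U[:, idx] @ np.diag(np.sqrt(np.maximum(w[idx], 0.0)))

# B is unique only up to a unitary rotation, but B B^H must reproduce Omega
assert np.allclose(B @ B.conj().T, Omega)
```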

Now, the original joint source and relay optimization problem (10)-(12) can be solved by an iterative algorithm as shown in Algorithm 2, where $(\cdot)^{(m)}$ denotes the variable at the m-th iteration. This algorithm is initialized at a random feasible $\mathbf{B}$ satisfying (11). At each iteration, we first update $\{\mathbf{F}_i\}$ with fixed $\mathbf{B}$ and then update $\mathbf{B}$ with fixed $\{\mathbf{F}_i\}$. Note that the conditional updates of each matrix may either decrease or maintain, but cannot increase, the objective function (10). Monotonic convergence of $\{\mathbf{F}_i\}$ and $\mathbf{B}$ towards (at least) a locally optimal solution follows directly from this observation. Note that in each iteration of this algorithm, we need to update the relay amplifying matrices according to the procedure listed in Algorithm 1 at a complexity order of $\mathcal{O}\left(K\left(N_d^3 + N_r^3 + N_b^3\right)\right)$ and update the source precoding matrix through solving the SDP problem (26)-(29) at a complexity cost that is at most $\mathcal{O}\left(\left(N_s^2 + K + 1\right)^{3.5}\right)$ using interior point methods [22]. Therefore, the per-iteration computational complexity order of the proposed algorithm is $\mathcal{O}\left(K\left(N_d^3 + N_r^3 + N_b^3\right) + \left(N_s^2 + K + 1\right)^{3.5}\right)$. The overall complexity of this algorithm depends on the number of iterations until convergence, which will be studied in the next section.

4 Simulations

In this section, we study the performance of the proposed jointly optimal source and relay matrices design for MIMO multi-relay systems with a linear MMSE receiver. All simulations are conducted in a flat Rayleigh fading environment where the entries of $\mathbf{H}_{sr}$ and $\mathbf{H}_{rd}$ have zero mean and variances $\sigma_s^2/N_s$ and $\sigma_r^2/(KN_r)$, respectively. For simplicity, we assume $P_{r,i} = P_r$, $i = 1, \ldots, K$. BPSK constellations are used to modulate the source symbols, and all noises are i.i.d. Gaussian with zero mean and unit variance. We define $\mathrm{SNR}_s = \sigma_s^2 P_s KN_r/N_s$ and $\mathrm{SNR}_r = \sigma_r^2 P_r N_d/(KN_r)$ as the signal-to-noise ratios (SNRs) of the source-relay link and the relay-destination link, respectively. We transmit $1000N_s$ randomly generated bits in each channel realization, and all simulation results are averaged over 200 channel realizations. In all simulations, we set $N_b = N_s = N_r = N_d = 3$, and the MMSE linear receiver in (8) is employed at the destination for symbol detection.

In the first example, a MIMO relay system with K=3 relay nodes is simulated. We compare the normalized MSE performance of the proposed joint source and relay optimization algorithm using the projected gradient approach (JSR-PG, Algorithm 2), the optimal relay-only algorithm using the projected gradient approach (ORO-PG, Algorithm 1) with $\mathbf{B} = \sqrt{P_s/N_s}\,\mathbf{I}_{N_s}$, and the naive amplify-and-forward (NAF) algorithm. Figure 2 shows the normalized MSE of all algorithms versus $\mathrm{SNR}_s$ for $\mathrm{SNR}_r = 20$ dB, while Figure 3 shows the normalized MSE of all algorithms versus $\mathrm{SNR}_r$ for $\mathrm{SNR}_s$ fixed at 20 dB. It can be seen from Figures 2 and 3 that the JSR-PG and ORO-PG algorithms outperform the NAF algorithm over the whole $\mathrm{SNR}_s$ and $\mathrm{SNR}_r$ range. Moreover, the proposed JSR-PG algorithm yields the lowest MSE among all three algorithms.

Figure 2. Example 1. Normalized MSE versus $\mathrm{SNR}_s$ with $K=3$, $\mathrm{SNR}_r = 20$ dB.

Figure 3. Example 1. Normalized MSE versus $\mathrm{SNR}_r$ with $K=3$, $\mathrm{SNR}_s = 20$ dB.

The number of iterations required for the JSR-PG algorithm to converge to $\varepsilon = 10^{-3}$ in a typical channel realization is listed in Table 1, where we set K=3 and $\mathrm{SNR}_r = 20$ dB. It can be seen that the JSR-PG algorithm converges within several iterations and is thus realizable with modern chip designs.

Table 1 Iterations required until convergence in the JSR-PG algorithm

In the second example, we compare the bit error rate (BER) performance of the proposed JSR-PG algorithm (Algorithm 2), the ORO-PG algorithm (Algorithm 1), the suboptimal source and relay matrices design in [17], the one-way relay version of the conjugate gradient-based source and relay algorithm in [16], and the NAF algorithm. Figure 4 displays the system BER versus $\mathrm{SNR}_s$ for a MIMO relay system with K=3 relay nodes and $\mathrm{SNR}_r$ fixed at 20 dB. It can be seen from Figure 4 that the proposed JSR-PG algorithm has a better BER performance than the existing algorithms over the whole $\mathrm{SNR}_s$ range.

Figure 4. Example 2. BER versus $\mathrm{SNR}_s$ with $K=3$, $\mathrm{SNR}_r = 20$ dB.

In the third example, we study the effect of the number of relay nodes on the system BER performance using the JSR-PG and ORO-PG algorithms. Figure 5 displays the system BER versus $\mathrm{SNR}_s$ with K=2, 3, and 5 for $\mathrm{SNR}_r$ fixed at 20 dB. It can be seen that at BER $= 10^{-2}$, both the ORO-PG and JSR-PG algorithms achieve approximately a 3-dB gain by increasing the number of relays from K=2 to K=5. It can also be seen that the performance gain of the JSR-PG algorithm over the ORO-PG algorithm increases with the number of relay nodes.

Figure 5. Example 3. BER versus $\mathrm{SNR}_s$ for different $K$, $\mathrm{SNR}_r = 20$ dB.

5 Conclusions

In this paper, we have derived the general structure of the optimal relay amplifying matrices for linear non-regenerative MIMO relay communication systems with multiple relay nodes using the projected gradient approach. The proposed source and relay matrices minimize the MSE of the signal waveform estimation. Simulation results demonstrate that the proposed algorithm improves the MSE and BER performance compared with existing techniques.

Appendices

Appendix 1

Proof of Theorem 1

Without loss of generality, F i can be written as

$\mathbf{F}_i = \left[\mathbf{V}_{r,i},\ \mathbf{V}_{r,i}^{\perp}\right]\begin{bmatrix}\mathbf{A}_i & \mathbf{X}_i\\ \mathbf{Y}_i & \mathbf{Z}_i\end{bmatrix}\left[\mathbf{U}_{s,i},\ \mathbf{U}_{s,i}^{\perp}\right]^H, \qquad i = 1, \ldots, K,$
(30)

where $\mathbf{V}_{r,i}^{\perp}\left(\mathbf{V}_{r,i}^{\perp}\right)^H = \mathbf{I}_{N_r} - \mathbf{V}_{r,i}\mathbf{V}_{r,i}^H$ and $\mathbf{U}_{s,i}^{\perp}\left(\mathbf{U}_{s,i}^{\perp}\right)^H = \mathbf{I}_{N_r} - \mathbf{U}_{s,i}\mathbf{U}_{s,i}^H$, such that $\bar{\mathbf{V}}_{r,i} \triangleq \left[\mathbf{V}_{r,i}, \mathbf{V}_{r,i}^{\perp}\right]$ and $\bar{\mathbf{U}}_{s,i} \triangleq \left[\mathbf{U}_{s,i}, \mathbf{U}_{s,i}^{\perp}\right]$ are $N_r \times N_r$ unitary matrices. The matrices $\mathbf{A}_i$, $\mathbf{X}_i$, $\mathbf{Y}_i$, $\mathbf{Z}_i$ are arbitrary, with dimensions $R_{r,i} \times R_{s,i}$, $R_{r,i} \times (N_r - R_{s,i})$, $(N_r - R_{r,i}) \times R_{s,i}$, and $(N_r - R_{r,i}) \times (N_r - R_{s,i})$, respectively. Substituting (15) and (30) back into (13), we obtain $\mathbf{H}_{rd,i}\mathbf{F}_i\mathbf{H}_{sr,i}\mathbf{B} = \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H$ and $\mathbf{H}_{rd,i}\mathbf{F}_i\mathbf{F}_i^H\mathbf{H}_{rd,i}^H = \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\left(\mathbf{A}_i\mathbf{A}_i^H + \mathbf{X}_i\mathbf{X}_i^H\right)\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H$. Thus, we can rewrite (13) as

$\mathrm{MSE} = \mathrm{tr}\left(\left[\mathbf{I}_{N_b} + \left(\sum_{i=1}^{K}\mathbf{V}_{s,i}\boldsymbol{\Lambda}_{s,i}\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H\right)\left(\sum_{i=1}^{K}\mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\left(\mathbf{A}_i\mathbf{A}_i^H + \mathbf{X}_i\mathbf{X}_i^H\right)\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H + \mathbf{I}_{N_d}\right)^{-1}\left(\sum_{i=1}^{K}\mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H\right)\right]^{-1}\right).$
(31)

It can be seen that (31) is minimized by $\mathbf{X}_i = \mathbf{0}_{R_{r,i} \times (N_r - R_{s,i})}$, $i = 1, \ldots, K$.

Substituting (15) and (30) back into the left-hand side of the transmission power constraint (14), we have

$\mathrm{tr}\left(\mathbf{F}_i\left(\mathbf{H}_{sr,i}\mathbf{B}\mathbf{B}^H\mathbf{H}_{sr,i}^H + \mathbf{I}_{N_r}\right)\mathbf{F}_i^H\right) = \mathrm{tr}\left(\mathbf{A}_i\left(\boldsymbol{\Lambda}_{s,i}^2 + \mathbf{I}_{R_{s,i}}\right)\mathbf{A}_i^H + \mathbf{Y}_i\left(\boldsymbol{\Lambda}_{s,i}^2 + \mathbf{I}_{R_{s,i}}\right)\mathbf{Y}_i^H + \mathbf{X}_i\mathbf{X}_i^H + \mathbf{Z}_i\mathbf{Z}_i^H\right), \qquad i = 1, \ldots, K.$
(32)

From (32), we find that $\mathbf{X}_i = \mathbf{0}_{R_{r,i} \times (N_r - R_{s,i})}$, $\mathbf{Y}_i = \mathbf{0}_{(N_r - R_{r,i}) \times R_{s,i}}$, and $\mathbf{Z}_i = \mathbf{0}_{(N_r - R_{r,i}) \times (N_r - R_{s,i})}$ minimize the power consumption at each relay node. Thus, we have $\mathbf{F}_i = \mathbf{V}_{r,i}\mathbf{A}_i\mathbf{U}_{s,i}^H$, $i = 1, \ldots, K$.

Appendix 2

Proof of Theorem 2

Let us define $\mathbf{Z}_i \triangleq \sum_{j=1, j\ne i}^{K}\mathbf{U}_{r,j}\boldsymbol{\Lambda}_{r,j}\mathbf{A}_j\boldsymbol{\Lambda}_{s,j}\mathbf{V}_{s,j}^H$ and $\mathbf{Y}_i \triangleq \sum_{j=1, j\ne i}^{K}\mathbf{U}_{r,j}\boldsymbol{\Lambda}_{r,j}\mathbf{A}_j\mathbf{A}_j^H\boldsymbol{\Lambda}_{r,j}\mathbf{U}_{r,j}^H + \mathbf{I}_{N_d}$. Then, $f(\mathbf{A}_i)$ can be written as

$f(\mathbf{A}_i) = \mathrm{tr}\left(\left[\mathbf{I}_{N_b} + \left(\mathbf{Z}_i^H + \mathbf{V}_{s,i}\boldsymbol{\Lambda}_{s,i}\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H\right)\left(\mathbf{Y}_i + \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H\right)^{-1}\left(\mathbf{Z}_i + \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H\right)\right]^{-1}\right).$
(33)

Applying the matrix identity $\left[\mathbf{I}_{N_b} + \mathbf{A}^H\mathbf{C}^{-1}\mathbf{A}\right]^{-1} = \mathbf{I}_{N_b} - \mathbf{A}^H\left(\mathbf{A}\mathbf{A}^H + \mathbf{C}\right)^{-1}\mathbf{A}$, (33) can be written as

$f(\mathbf{A}_i) = \mathrm{tr}\left(\mathbf{I}_{N_b} - \left(\mathbf{Z}_i^H + \mathbf{V}_{s,i}\boldsymbol{\Lambda}_{s,i}\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H\right)\left[\left(\mathbf{Z}_i + \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H\right)\left(\mathbf{Z}_i^H + \mathbf{V}_{s,i}\boldsymbol{\Lambda}_{s,i}\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H\right) + \mathbf{Y}_i + \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H\right]^{-1}\left(\mathbf{Z}_i + \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H\right)\right).$
(34)

Let us now define $\mathbf{E}_i \triangleq \mathbf{Z}_i + \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H$, $\mathbf{K}_i \triangleq \mathbf{Y}_i + \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}\mathbf{A}_i\mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H$, and $\mathbf{G}_i \triangleq \mathbf{E}_i\mathbf{E}_i^H + \mathbf{K}_i$. We can rewrite (34) as

$f(\mathbf{A}_i) = \mathrm{tr}\left(\mathbf{I}_{N_b} - \mathbf{E}_i^H\mathbf{G}_i^{-1}\mathbf{E}_i\right) = \mathrm{tr}\left(\mathbf{I}_{N_b}\right) - \mathrm{tr}\left(\mathbf{E}_i\mathbf{E}_i^H\mathbf{G}_i^{-1}\right).$
(35)

The derivative of f(A i ) with respect to A i is given by

∂f ( A i ) A i = - A i t r E i E i H G i - 1 = A i t r G i - 1 E i E i H G i - 1 Z i + U r , i Λ r , i A i Λ s , i V s , i H E i H + Y i + U r , i Λ r , i A i A i H Λ r , i U r , i H - A i t r E i H G i - 1 U r , i Λ r , i A i Λ s , i V s , i H .
(36)

Defining $\mathbf{M}_i \triangleq \mathbf{G}_i^{-1}\mathbf{E}_i\mathbf{E}_i^H\mathbf{G}_i^{-1}$, $\mathbf{R}_i \triangleq \mathbf{U}_{r,i}\boldsymbol{\Lambda}_{r,i}$, $\mathbf{S}_i \triangleq \boldsymbol{\Lambda}_{s,i}\mathbf{V}_{s,i}^H$, and $\mathbf{D}_i \triangleq \mathbf{A}_i^H\boldsymbol{\Lambda}_{r,i}\mathbf{U}_{r,i}^H$, we can rewrite (36) as

∂f ( A i ) A i = A i t r M i Z i + U r , i Λ r , i A i Λ s , i V s , i H E i H + M i Y i + U r , i Λ r , i A i D i - E i H G i - 1 U r , i Λ r , i T Λ s , i V s , i H T
(37)
= A i t r M i R i A i S i E i H + M i R i A i D i - E i H G i - 1 R i T S i T .
(38)

Finally, the gradient of f(A i ) is given by

$\nabla f(\mathbf{A}_i) = 2\left(\frac{\partial f(\mathbf{A}_i)}{\partial \mathbf{A}_i}\right)^* = 2\left[\left(\mathbf{M}_i\mathbf{R}_i\right)^T\left(\mathbf{S}_i\mathbf{E}_i^H\right)^T + \left(\mathbf{M}_i\mathbf{R}_i\right)^T\mathbf{D}_i^T - \left(\mathbf{E}_i^H\mathbf{G}_i^{-1}\mathbf{R}_i\right)^T\mathbf{S}_i^T\right]^* = 2\left[\mathbf{R}_i^H\mathbf{M}_i^H\left(\mathbf{E}_i\mathbf{S}_i^H + \mathbf{D}_i^H\right) - \mathbf{R}_i^H\mathbf{G}_i^{-H}\mathbf{E}_i\mathbf{S}_i^H\right],$
(39)

where $(\cdot)^*$ stands for the complex conjugate.