Abstract
This paper presents a new decoder algorithm for the double space–time transmit diversity (DSTTD) system. The decoder is based on the QRD-M algorithm, which performs a breadth-first search of the tree of possible solutions. The search is simplified by skipping unlikely candidates, and it is stopped when no promising candidates are left. Furthermore, the search is divided into three concurrent iterations, making possible a fast, parallel implementation in either hardware or software. After presenting an analysis of the capacity and diversity of DSTTD, we present performance results showing that the proposed decoder is capable of achieving near-maximum-likelihood performance. We also show that the proposed algorithm exhibits lower computational complexity than other existing maximum-likelihood detectors.
Introduction
It is known that, when transmitting and receiving over multiple antennas, the rich-scattering wireless channel has enormous capacity [1]. Furthermore, this capacity can be exploited to obtain increased data rates, or an increase in reliability; a tradeoff between these two properties can also be achieved [2].
Over the past two decades, techniques known as space–time codes have been developed to exploit these gains. Some space–time codes focus on spatial multiplexing gain; independent symbols are transmitted over different antennas at each channel use, increasing the data rate, but also the interference at the receiver. The Bell Labs layered space–time (V-BLAST) architecture [3] is an example. Space–time block codes (STBC), such as the Alamouti scheme [4] and other orthogonal designs [5], aim to provide transmitter diversity; crucially, this is achieved without requiring channel knowledge at the transmit side. General space–time frameworks have also been proposed, which enable the design of codes that provide a mix of spatial multiplexing and diversity gain. One example is linear dispersion codes [6], which use the maximization of the mutual information between transmitter and receiver as a design criterion.
Hybrid space–time codes present a simple way to achieve both spatial multiplexing and transmit diversity gain [7, 8, 9, 10]. These codes operate in layers, like V-BLAST; however, at least some of the layers consist of a set of antennas transmitting an STBC code. In this paper, we focus on the double space–time transmit diversity (DSTTD) scheme [11], which consists of two layers, each of which is an Alamouti STBC. This architecture provides an increase in data rate, as the two layers transmit in parallel, while still offering the transmit diversity advantage of each underlying Alamouti code. DSTTD has been adopted by the IEEE 802.11n [12] and 802.16e [13] wireless standards.
The practical feasibility of a space–time code depends on the complexity of the decoder. The maximum-likelihood (ML) decoder is optimum, but its complexity increases exponentially with the constellation size and the number of antennas. On the other hand, the DSTTD code is a linear dispersion code and, as such, can be decoded using the same low-complexity, ordered decision feedback algorithm developed for V-BLAST [10, 14]; however, its performance is quite suboptimal. A number of detectors have been proposed with lower complexity than ML but better performance than ordered decision feedback equalizers. Some of these algorithms were designed for V-BLAST [15, 16]. Others are general near-ML algorithms, such as the sphere decoder [17]. All of them can be easily adapted to DSTTD.
Recently, tree-search algorithms have been applied to STBC decoding [18]; these have enabled near-ML decoding performance with reduced complexity compared to the sphere decoder [8, 19, 20, 21, 22]. Good results have been obtained with decoders based on the M-algorithm combined with the QR decomposition of the channel matrix. The reduction in complexity is obtained by reducing the number of distances calculated [19] and by exploiting the structure of the channel matrix [23]. These algorithms have been successfully applied to different STBC architectures, including DSTTD [24, 25].
In this paper, we propose a new decoding algorithm for DSTTD that achieves near-ML performance and exhibits lower complexity than other known decoders. The new algorithm builds on ideas presented in the past. Like the decoders inspired by [23], we exploit the structure of the QR decomposition of the channel matrix to simplify the ML problem. We perform a tree search similar to that of [19] and [22], with an improved search order that allows the decoder to find the optimum solution in fewer iterations. Furthermore, the size of the search performed by the proposed decoder can be easily constrained to impose an upper limit on the complexity, in many cases with negligible impact on its error performance. Finally, we divide the candidate search into three independent searches that can be executed concurrently, which enables a fast hardware implementation.
The algorithm, as presented, is adapted to work exclusively in the detection of DSTTD. However, with a slight modification it can also detect a two-layer hybrid code where the second layer consists of a single spatially multiplexed antenna (for a total of three transmitter antennas) [26].
The paper is organized as follows. In Sect. 2, we present an analysis of the DSTTD space–time code and show that its capacity is just slightly below that of the underlying MIMO channel. In Sect. 3, we present an overview of existing decoding algorithms for DSTTD. In Sect. 4 we make a detailed presentation of the proposed algorithm. Simulation results, including error rates and complexity, as well as a comparison with other algorithms, are presented in Sect. 5. Finally, we present our conclusions in Sect. 6.
The double space–time transmit diversity linear dispersion code
A space–time block code (STBC) is a mapping from a vector of \(n_s\) information-bearing symbols \(s_i\), \(i=1,2,\ldots ,n_s\), to an \(n_t \times T\) space–time code matrix \(\mathbf {S}\) that specifies how symbols are spread over \(n_t\) antennas and \(T\) time intervals. The double space–time transmit diversity (DSTTD) linear space–time block code transmits \(n_s=4\) complex symbols over \(T=2\) symbol intervals and \(n_t=4\) transmit antennas [11]. The DSTTD space–time code matrix \(\mathbf {S}\) is given by
where the dispersion matrices \(A_1,\ldots ,A_4\) and \(B_1,\ldots ,B_4\) (of size \(n_t \times T\)) are defined as
Note that, since we assume that the transmitter has no channel knowledge, this code allocates the same average power to each transmitter antenna and each symbol. The DSTTD time-space mapping is summarized in Table 1.
We assume a rich-scattering, Rayleigh wireless channel with flat and slow fading, where the channel between transmitter antenna \(j\) and receiver antenna \(i\) can be modeled as a complex Gaussian gain \(h_{ij}\sim \mathcal {CN}(0,1)\) of zero mean and variance 0.5 per dimension. This gain remains constant for several symbol intervals, after which it changes to a new independent realization. The overall channel can be modeled as a random matrix \(\mathbf {H}\) of size \(n_r \times n_t\). The receiver is assumed to have perfect channel state information, obtained using techniques such as those described in [27].
Further assumptions are that all the antennas transmit information symbols from the same M-QAM constellation, that the receiver is perfectly synchronized to the transmitter, and that each receiver antenna is subject to additive white Gaussian noise of zero mean and power spectral density \(N_0/2\) per dimension. A block diagram of a DSTTD system is shown in Fig. 1.
Analysis of DSTTD: mutual information and diversity order
The DSTTD code is equivalent to two “stacked” Alamouti \(2\times 1\) codes. Each Alamouti code can be interpreted as a separate layer in a spatial multiplexing code; as such, it belongs to the category of hybrid space–time codes [10]. Hybrid codes are ad hoc combinations of layered space–time codes, which may potentially achieve maximum spatial multiplexing gain, and (quasi) orthogonal codes, which achieve maximum diversity gain. These codes are interesting because, under certain conditions, they offer larger diversity gain than spatial multiplexing codes, and larger transmission rates than orthogonal codes. In this sense, hybrid codes’ diversity gain and rate can be designed to lie on intermediate points of the diversity-multiplexing tradeoff curve described in [2]. At the same time, their structure allows for low-complexity decoding (see e.g. [26]).
In the rest of this section we describe some properties of the DSTTD code, with the aim of showing that it offers a diversity-multiplexing tradeoff that is close to optimal. Receiver algorithms are studied in subsequent sections.
Note that the DSTTD code has rate \(R=n_s/T=2\). It is not orthogonal, since \(\mathbf {S}\mathbf {S}^{\mathsf {H}}\ne \sum _n |s_n|^2\,\mathbf {I}\). In contrast, a \(2\times 1\) Alamouti code has rate \(R=1\) and is orthogonal.
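The rate and the lack of orthogonality can be checked numerically. The following is a minimal sketch that assumes one common sign convention for the two stacked Alamouti blocks (rows are antennas, columns are symbol intervals) and ignores any power normalization; the helper names `dsttd_code_matrix` and `gram_rows` are hypothetical and are introduced only for illustration.

```python
def dsttd_code_matrix(s):
    """Return a 4x2 code matrix for the symbols s = [s1, s2, s3, s4].

    Assumed convention: each pair of rows is an Alamouti block
    [[a, -conj(b)], [b, conj(a)]]; no power normalization is applied.
    """
    s1, s2, s3, s4 = s
    return [
        [s1, -s2.conjugate()],
        [s2, s1.conjugate()],
        [s3, -s4.conjugate()],
        [s4, s3.conjugate()],
    ]


def gram_rows(S):
    """Compute the 4x4 product S S^H."""
    return [[sum(a * b.conjugate() for a, b in zip(ri, rj)) for rj in S] for ri in S]


symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]    # four QPSK symbols
S = dsttd_code_matrix(symbols)
G = gram_rows(S)

rate = len(symbols) / len(S[0])                  # R = n_s / T
# Energy outside the main diagonal of S S^H; zero only for an orthogonal code.
off_diagonal_energy = sum(
    abs(G[i][j]) ** 2 for i in range(4) for j in range(4) if i != j
)
```

Within each layer the cross terms cancel (for example, `G[0][1]` is zero), but the cross terms between the two layers do not, which is why the product is not a scaled identity.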
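We now calculate the mutual information of DSTTD. The ergodic capacity of a \(4 \times 2\) MIMO system is given by
\[C = E_\mathbf {H}\left( \log _2 \det \left( \mathbf {I}_{n_r} + \frac{\rho }{n_t}\mathbf {H}\mathbf {H}^{\mathsf {H}} \right) \right) ,\]
where \(\rho \) is the average signal-to-noise ratio at each receiver antenna, \(E_\mathbf {H}(\cdot )\) is the expectation over \(\mathbf {H}\) and \(\mathbf {H}^{\mathsf {H}}\) is the Hermitian conjugate of \(\mathbf {H}\).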
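As an illustration, the ergodic capacity can be estimated by Monte Carlo simulation. The sketch below assumes the standard equal-power form \(E_\mathbf{H}(\log_2\det(\mathbf{I}+(\rho/n_t)\mathbf{H}\mathbf{H}^{\mathsf{H}}))\), with \(\rho\) denoting the average SNR per receiver antenna, and the channel model of this section with \(n_t=4\) and \(n_r=2\). The function name and its parameters are hypothetical. It estimates only the channel capacity, not the DSTTD mutual information.

```python
import math
import random


def ergodic_capacity_4x2(snr_db, trials=2000, seed=1):
    """Monte Carlo estimate of the ergodic capacity in bit/s/Hz.

    Assumptions: n_t = 4, n_r = 2, i.i.d. CN(0, 1) channel gains, and
    rho = average SNR per receive antenna.
    """
    n_t, n_r = 4, 2
    rho = 10.0 ** (snr_db / 10.0)
    sigma = math.sqrt(0.5)              # standard deviation per real dimension
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        H = [[complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
              for _ in range(n_t)] for _ in range(n_r)]
        # G = I + (rho / n_t) H H^H is a 2x2 Hermitian matrix.
        G = [[(1.0 if i == j else 0.0)
              + (rho / n_t) * sum(a * b.conjugate() for a, b in zip(H[i], H[j]))
              for j in range(n_r)] for i in range(n_r)]
        det = (G[0][0] * G[1][1] - G[0][1] * G[1][0]).real
        total += math.log2(det)
    return total / trials


capacity_5db = ergodic_capacity_4x2(5.0)
capacity_15db = ergodic_capacity_4x2(15.0)
```

Because the same seed is used for both calls, the two estimates are computed over the same channel realizations, so the estimate at 15 dB is necessarily larger than the one at 5 dB.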
The mutual information \(M(\mathbf {H})\) of DSTTD can be expressed in terms of the channel matrix \(\mathbf {H}\) and the linear dispersion matrices. Let
Then, the mutual information can be expressed as
where the factor 1/4 normalizes for two channel uses.
The capacity of a \(4\times 2\) channel is compared to the mutual information of DSTTD in Fig. 2. It can be seen that, for a signal-to-noise ratio of around 15 dB, the mutual information is around 1 dB below the capacity. The difference between them increases for larger SNR, but it can be concluded that DSTTD achieves a large fraction of the channel capacity.
The diversity gain of DSTTD can be calculated as follows. Let \(\mathbf {S}\) and \(\mathbf {W}\) be two different code matrices, and let \(\mathbf {D}=\mathbf {S}-\mathbf {W}\). The diversity gain of DSTTD is equal to the minimum rank of \(\mathbf {D}\), over all pairs of distinct code matrices, times the number of receiver antennas. The difference matrix \(\mathbf {D}\) is equal to
and its rank is equal to 2. The diversity gain of the code is then equal to 4. This is double that of the Alamouti \(2\times 1\) code.
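The rank claim can be verified on a concrete pair of code matrices. The sketch below assumes the same stacked-Alamouti sign convention as a plausible form of the code matrix (the helper names are hypothetical). It uses the fact that a \(4\times 2\) matrix \(\mathbf{D}\) has rank 2 exactly when the \(2\times 2\) Gram matrix \(\mathbf{D}^{\mathsf{H}}\mathbf{D}\) has a nonzero determinant.

```python
def dsttd_code_matrix(s):
    """4x2 code matrix under an assumed stacked-Alamouti sign convention."""
    s1, s2, s3, s4 = s
    return [
        [s1, -s2.conjugate()],
        [s2, s1.conjugate()],
        [s3, -s4.conjugate()],
        [s4, s3.conjugate()],
    ]


def gram_columns(D):
    """Compute the 2x2 product D^H D for a 4x2 matrix D."""
    cols = list(zip(*D))
    return [[sum(a.conjugate() * b for a, b in zip(ci, cj)) for cj in cols]
            for ci in cols]


def rank_of_difference(s, w, tol=1e-12):
    """Rank of D = S - W, obtained from the Gram matrix D^H D."""
    S, W = dsttd_code_matrix(s), dsttd_code_matrix(w)
    D = [[a - b for a, b in zip(rs, rw)] for rs, rw in zip(S, W)]
    G = gram_columns(D)
    det = (G[0][0] * G[1][1] - G[0][1] * G[1][0]).real
    if det > tol:
        return 2
    return 1 if (G[0][0] + G[1][1]).real > tol else 0


n_r = 2
s = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
w = [1 + 1j, 1 - 1j, -1 + 1j, 1 - 1j]     # differs from s only in the fourth symbol
rank = rank_of_difference(s, w)
diversity_gain = rank * n_r
```

Under this convention the two columns of \(\mathbf{D}\) are orthogonal and have equal norm, so \(\mathbf{D}^{\mathsf{H}}\mathbf{D}\) is a scaled identity and the rank is 2 even when the code matrices differ in a single symbol.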
System equation
The received signal can be represented as a matrix \(\mathbf {Y}\) of size \(n_r \times T\) given by
\[\mathbf {Y} = \mathbf {H}\mathbf {S} + \mathbf {N},\]
where \(\mathbf {N}\) is a matrix of noise samples. Following the conventional analysis for STBC, we can vectorize the expression for the received symbols as
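Note that the \(4\times 2\) DSTTD system is equivalent to a \(4 \times 4\) spatial multiplexing system where the channel matrix has a specific structure. By defining \(\mathbf {H}_a\) as the channel matrix in Eq. (6), \(\mathbf {s}=[s_1 \, s_2 \, s_3 \, s_4]^{\mathsf {T}}\), and \(\mathbf {n}\) as the corresponding vector of noise samples, we can write the system’s equation as
\[\mathbf {y} = \mathbf {H}_a\mathbf {s} + \mathbf {n}.\]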
The system equation with this redefined channel matrix plays an important role in the development of receiver algorithms described in the next section.
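The vectorization can be illustrated with a short numerical check. The sketch below assumes a stacked-Alamouti sign convention for the code matrix and a vectorization that conjugates the sample received in the second symbol interval; both are assumptions, since other equivalent conventions exist, and the helper names are hypothetical. Noise is omitted so that the two sides can be compared exactly.

```python
import math
import random


def dsttd_code_matrix(s):
    """4x2 code matrix under an assumed stacked-Alamouti sign convention."""
    s1, s2, s3, s4 = s
    return [[s1, -s2.conjugate()], [s2, s1.conjugate()],
            [s3, -s4.conjugate()], [s4, s3.conjugate()]]


def equivalent_channel(H):
    """Build the 4x4 matrix H_a from a 2x4 channel H (rows: receive antennas)."""
    Ha = []
    for h1, h2, h3, h4 in H:
        Ha.append([h1, h2, h3, h4])                       # first symbol interval
        Ha.append([h2.conjugate(), -h1.conjugate(),
                   h4.conjugate(), -h3.conjugate()])      # conjugated second interval
    return Ha


def vectorize(Y):
    """Stack [y_i(1), conj(y_i(2))] for each receive antenna i."""
    y = []
    for y_t1, y_t2 in Y:
        y.extend([y_t1, y_t2.conjugate()])
    return y


rng = random.Random(0)
sigma = math.sqrt(0.5)
H = [[complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(4)]
     for _ in range(2)]
s = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]

S = dsttd_code_matrix(s)
Y = [[sum(H[i][k] * S[k][t] for k in range(4)) for t in range(2)]
     for i in range(2)]                                   # noiseless Y = H S
y = vectorize(Y)

Ha = equivalent_channel(H)
y_equiv = [sum(Ha[r][c] * s[c] for c in range(4)) for r in range(4)]
max_error = max(abs(a - b) for a, b in zip(y, y_equiv))

# Columns 1 and 2 of H_a are orthogonal, which later yields zeros in R.
col12_inner = sum(Ha[r][0].conjugate() * Ha[r][1] for r in range(4))
```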
Decoding algorithms for DSTTD
In the previous section we showed that the DSTTD space–time code has the potential for large spatial multiplexing and diversity gains compared to the underlying Alamouti \(2\times 1\) layers. In this section, we present several decoding algorithms that are tailored to the DSTTD code. First, we present the optimal maximum-likelihood decoder. Then, we describe several suboptimal decoders that have low complexity, making them attractive for hardware or software implementation.
The algorithms presented below all follow three common steps: first, the system equations are rewritten to obtain a more convenient system representation; second, the QR decomposition of the channel matrix is calculated, and finally the actual decoding is performed. The structure of the matrix \(\mathbf {R}\) is exploited to reduce the number of operations performed.
Maximum likelihood decoder
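Assume that the vector \(\mathbf {y}\) is received over two symbol periods as described in Eq. (7). The maximum-likelihood detection of the transmitted symbol vector \(\mathbf {s}\) is given by
\[\hat{\mathbf {s}} = \arg \min _{\mathbf {s}\in \Omega ^4} \left\| \mathbf {y} - \mathbf {H}_a\mathbf {s} \right\| ^2,\]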
where \(\Omega \) is the signal constellation and \(\Omega ^4\) is the set of all possible transmitted vectors. We may exploit the structure of the channel matrix to simplify this problem. Let \(\mathbf {H}_a=\mathbf {QR}\) be the QR decomposition of \(\mathbf {H}_a\), where \(\mathbf {Q}\) is a unitary matrix and \(\mathbf {R}\) is upper triangular. Then, if we multiply the received vector \(\mathbf {y}\) by \(\mathbf {Q}^{\mathsf {H}}\), we obtain the modified vector
\[\mathbf {Q}^{\mathsf {H}}\mathbf {y} = \mathbf {R}\mathbf {s} + \mathbf {Q}^{\mathsf {H}}\mathbf {n},\]
where \(\mathbf {n}\) is the vector of noise samples.
The statistical properties of the noise do not change, because \(\mathbf {Q}\) is unitary. The structure of the channel matrix \(\mathbf {H}_a\) results in a matrix \(\mathbf {R}\) with the following structure [10]:
The ML detector can then be stated as:
where \(D_i^2\), \(i=1,2,3,4\), are given by:
In general, a brute-force approach to solving Eq. (11) requires the calculation and comparison of \(|\Omega |^4\) metrics. However, careful analysis of Eq. (12) reveals that only \(|\Omega |^2\) metric calculations are needed. The reason is that fixing the values of \(x_3\) and \(x_4\) allows estimation of \(x_1\) and \(x_2\) by hard decision; this means that iteration over other values of \(x_1\) and \(x_2\) is not necessary [19, 25]. The complete process required by the ML detector is summarized in Algorithm (1). In this algorithm, we denote the \(i\)th symbol in the constellation by \(\Omega (i)\), and \(\mathcal Q[\cdot ]\) denotes a hard decision on a symbol.
This complexity reduction is intuitively satisfactory; the orthogonality of each of the underlying layers results in a simplified detection problem. However, as could be expected, the fact that the code is not truly orthogonal results in off-diagonal elements in \(\mathbf {R}\) that do not allow the extreme decoding simplicity of orthogonal codes.
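The reduction from \(|\Omega|^4\) to \(|\Omega|^2\) metric evaluations can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of Algorithm (1): the equivalent channel uses an assumed Alamouti sign convention, the metrics are taken as \(D_i^2=|(\mathbf{Q}^{\mathsf{H}}\mathbf{y})_i-\sum_j R_{ij}x_j|^2\), the QR decomposition is a plain Gram–Schmidt routine, and all function names are hypothetical. A brute-force search is included only to confirm that both searches return the same vector.

```python
import itertools
import math
import random

QPSK = [complex(re, im) for re in (-1, 1) for im in (-1, 1)]


def equivalent_channel(H):
    """4x4 equivalent channel for a 2x4 channel H (assumed sign convention)."""
    Ha = []
    for h1, h2, h3, h4 in H:
        Ha.append([h1, h2, h3, h4])
        Ha.append([h2.conjugate(), -h1.conjugate(), h4.conjugate(), -h3.conjugate()])
    return Ha


def qr_gram_schmidt(A):
    """Classical Gram-Schmidt QR. Returns (Q as a list of columns, R)."""
    n = len(A)
    Q, R = [], [[0j] * n for _ in range(n)]
    for j in range(n):
        v = [A[r][j] for r in range(n)]
        w = list(v)
        for i, q in enumerate(Q):
            R[i][j] = sum(qc.conjugate() * vc for qc, vc in zip(q, v))
            w = [wc - R[i][j] * qc for wc, qc in zip(w, q)]
        R[j][j] = math.sqrt(sum(abs(wc) ** 2 for wc in w))
        Q.append([wc / R[j][j] for wc in w])
    return Q, R


def hard_decision(z, constellation):
    """Nearest constellation point to z."""
    return min(constellation, key=lambda c: abs(z - c))


def reduced_ml(R, y_rot, constellation):
    """Search |Omega|^2 pairs (x3, x4); x1 and x2 follow by hard decision."""
    best_dist, best_x = math.inf, None
    for x3, x4 in itertools.product(constellation, repeat=2):
        d3 = abs(y_rot[2] - R[2][2] * x3 - R[2][3] * x4) ** 2
        d4 = abs(y_rot[3] - R[3][3] * x4) ** 2
        # Residuals after cancelling x3 and x4; R[0][1] is numerically zero,
        # so x1 and x2 decouple and can be decided independently.
        z1 = y_rot[0] - R[0][2] * x3 - R[0][3] * x4
        z2 = y_rot[1] - R[1][2] * x3 - R[1][3] * x4
        x1 = hard_decision(z1 / R[0][0], constellation)
        x2 = hard_decision(z2 / R[1][1], constellation)
        dist = abs(z1 - R[0][0] * x1) ** 2 + abs(z2 - R[1][1] * x2) ** 2 + d3 + d4
        if dist < best_dist:
            best_dist, best_x = dist, [x1, x2, x3, x4]
    return best_x, best_dist


def brute_force_ml(Ha, y, constellation):
    """Reference search over all |Omega|^4 candidate vectors."""
    def dist(s):
        return sum(abs(y[r] - sum(Ha[r][c] * s[c] for c in range(4))) ** 2
                   for r in range(4))
    return list(min(itertools.product(constellation, repeat=4), key=dist))


rng = random.Random(3)


def gauss(std):
    return complex(rng.gauss(0, std), rng.gauss(0, std))


H = [[gauss(math.sqrt(0.5)) for _ in range(4)] for _ in range(2)]
Ha = equivalent_channel(H)
s_true = [rng.choice(QPSK) for _ in range(4)]
y = [sum(Ha[r][c] * s_true[c] for c in range(4)) + gauss(0.3) for r in range(4)]

Q, R = qr_gram_schmidt(Ha)
y_rot = [sum(qc.conjugate() * yc for qc, yc in zip(q, y)) for q in Q]   # Q^H y
x_reduced, dist_reduced = reduced_ml(R, y_rot, QPSK)
x_brute = brute_force_ml(Ha, y, QPSK)
```

For QPSK the reduced search evaluates 16 pairs instead of 256 vectors; the saving grows quadratically with the constellation size.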
OSIC detector using sorted QR decomposition
The ordered, successive interference cancellation (OSIC) detector, coupled with the sorted QR decomposition, results in a very low-complexity but suboptimal detector. The sorted QR decomposition calculates a triangular matrix \(\mathbf {R}\), a unitary matrix \(\mathbf {Q}\) and a permutation vector \(\mathbf {p}\), such that \(\mathbf {H}_p=\mathbf {QR}\), where \(\mathbf {H}_p\) is the channel matrix \(\mathbf {H}_a\) with its columns reordered according to \(\mathbf {p}\).
The reordering of the channel matrix results in a matrix \(\mathbf {R}\) whose rows are ordered from higher to lower signal-to-noise ratio [15]. Then, symbols are estimated in sequence, from lower stream to higher stream; in each layer, the interference from previously estimated symbols is subtracted. Assuming that all previous decisions are correct, the interference of previous symbols can be perfectly canceled at each step. For DSTTD, the OSIC detector calculates the symbol estimates \(\hat{\mathbf {x}} = \left[ \hat{x}_1, \hat{x}_2, \hat{x}_3, \hat{x}_4\right] ^{\mathsf {T}}\) as [10]:
Finally, the estimate \(\hat{\mathbf {s}}\) is obtained by reordering \(\hat{\mathbf {x}}\) according to the permutation vector \(\mathbf {p}\). A similar algorithm, also based on reordering the channel matrix according to the norm of its columns, is presented in [14].
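The cancellation and reordering steps can be sketched as follows. The example is a generic back-substitution with hard decisions on an upper-triangular \(\mathbf{R}\); it does not implement the sorted QR decomposition itself. The matrix entries, the permutation vector and the convention that column \(k\) of \(\mathbf{H}_p\) is column `p[k]` of \(\mathbf{H}_a\) are illustrative assumptions, and the function names are hypothetical.

```python
QPSK = [complex(re, im) for re in (-1, 1) for im in (-1, 1)]


def hard_decision(z, constellation):
    """Nearest constellation point to z."""
    return min(constellation, key=lambda c: abs(z - c))


def sic_detect(R, y_rot, constellation):
    """Detect from the last row of R upward, cancelling earlier decisions."""
    n = len(R)
    x_hat = [0j] * n
    for i in reversed(range(n)):
        interference = sum(R[i][j] * x_hat[j] for j in range(i + 1, n))
        x_hat[i] = hard_decision((y_rot[i] - interference) / R[i][i], constellation)
    return x_hat


def undo_permutation(x_hat, p):
    """Map estimates back to the original symbol order (assumed convention)."""
    s_hat = [0j] * len(x_hat)
    for k, col in enumerate(p):
        s_hat[col] = x_hat[k]
    return s_hat


# Made-up upper-triangular R with R12 = R34 = 0, and a made-up permutation.
r13, r14 = 0.4 - 0.2j, -0.3 + 0.5j
R = [
    [1.5, 0, r13, r14],
    [0, 1.5, -r14.conjugate(), r13.conjugate()],
    [0, 0, 1.1, 0],
    [0, 0, 0, 1.1],
]
p = [2, 3, 0, 1]
x_true = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
y_rot = [sum(R[i][j] * x_true[j] for j in range(4)) for i in range(4)]   # noiseless

x_hat = sic_detect(R, y_rot, QPSK)
s_hat = undo_permutation(x_hat, p)
```

In the noiseless case every decision is correct, so the interference is cancelled exactly. With noise, a wrong early decision propagates to the remaining symbols, which is the source of the performance loss of this detector.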
Near-ML detector based on improved OSIC algorithm
The OSIC detection algorithm for the DSTTD scheme has very low complexity, but sequential instead of joint detection means that, in many cases, the optimum symbol vector is discarded during the vector nulling process. In [19] an efficient scheme based on OSIC detection was proposed to improve upon its error-rate performance by finding better starting points for further searches. In particular, it defines the metric \(D_\theta ^2=f(D_1^2,D_2^2) +D_3^2+D_4^2\), and explores vectors that were discarded by the OSIC detector but may have, in fact, a better metric. The function \(f\) may be chosen among \(\text {max()}\), \(\text {min()}\) and a weighted average; each one has slightly different performance and complexity properties. In general, though, this detector achieves a significant reduction in complexity compared to optimal ML algorithms such as the sphere decoder, because in practice few additional candidate vectors are examined, but the search usually includes the optimum solution.
Decoders based on the M-algorithm
The ML solution to Eq. (11) may be expressed as a search in a tree. Symbol \(s_4\) sits at the root of the tree, and it branches to each possible value of \(s_3\), and so on successively to \(s_1\). Each branch is assigned a distance metric, and the symbols with smallest overall distance are selected as the optimum solution [28].
The M-algorithm is a breadth-first, sorted tree search algorithm that may be adapted for MIMO detection [16, 18]. The algorithm reduces the search complexity by storing only the best M branches at a time. Small values for M result in low complexity, but quite suboptimal performance; as M increases, the complexity also increases but the algorithm’s performance gets closer to the ML decoder.
This algorithm has been adapted to the DSTTD code [24]. The main idea is to choose the best M candidates in the estimation of the symbols \(\hat{s}_3\) and \(\hat{s}_4\); then, the search for \(\hat{s}_2\) and \(\hat{s}_1\) is limited to \(M^2\) candidates. This results in a marked reduction in complexity without a large sacrifice in optimality. The decoding process is presented in Algorithm (2).
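One possible reading of this scheme is sketched below; it is not a reproduction of Algorithm (2). The sketch assumes that \(R_{12}=R_{34}=0\), so that the metrics of the third and fourth symbols can be ranked independently, and it uses made-up values for \(\mathbf{R}\) and for the rotated received vector. The function names are hypothetical.

```python
import itertools
import math

QPSK = [complex(re, im) for re in (-1, 1) for im in (-1, 1)]


def hard_decision(z, constellation):
    """Nearest constellation point to z."""
    return min(constellation, key=lambda c: abs(z - c))


def qrm_dsttd(R, y_rot, constellation, M):
    """Keep the M best candidates for x3 and x4, then test the M^2 pairs."""
    def d3(x):
        return abs(y_rot[2] - R[2][2] * x) ** 2

    def d4(x):
        return abs(y_rot[3] - R[3][3] * x) ** 2

    x3_kept = sorted(constellation, key=d3)[:M]
    x4_kept = sorted(constellation, key=d4)[:M]
    best_dist, best_x = math.inf, None
    for x3, x4 in itertools.product(x3_kept, x4_kept):
        z1 = y_rot[0] - R[0][2] * x3 - R[0][3] * x4
        z2 = y_rot[1] - R[1][2] * x3 - R[1][3] * x4
        x1 = hard_decision(z1 / R[0][0], constellation)
        x2 = hard_decision(z2 / R[1][1], constellation)
        dist = (abs(z1 - R[0][0] * x1) ** 2 + abs(z2 - R[1][1] * x2) ** 2
                + d3(x3) + d4(x4))
        if dist < best_dist:
            best_dist, best_x = dist, [x1, x2, x3, x4]
    return best_x, best_dist


# Made-up R with R12 = R34 = 0 and a made-up disturbance on the received vector.
r13, r14 = 0.4 - 0.2j, -0.3 + 0.5j
R = [
    [1.5, 0, r13, r14],
    [0, 1.5, -r14.conjugate(), r13.conjugate()],
    [0, 0, 1.1, 0],
    [0, 0, 0, 1.1],
]
x_true = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
clean = [sum(R[i][j] * x_true[j] for j in range(4)) for i in range(4)]
noisy = [c + n for c, n in zip(clean, [0.3 - 0.2j, -0.1 + 0.4j, 0.9 + 0.2j, -0.2 - 0.8j])]

x_m1_clean, _ = qrm_dsttd(R, clean, QPSK, M=1)
_, dist_m1 = qrm_dsttd(R, noisy, QPSK, M=1)
_, dist_m4 = qrm_dsttd(R, noisy, QPSK, M=4)
```

Increasing M can only enlarge the set of examined pairs, so the best distance found never gets worse; with M equal to the constellation size the search covers every pair.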
The LC maximum likelihood detector
A quasi-orthogonal space–time block code (QSTBC) is one for which the code matrix product \(\mathbf {S}\mathbf {S}^{\mathsf {H}}\) has a small number of off-diagonal elements. In general, this results in a maximum-likelihood decoder with much more complexity than that of an OSTBC; however, in some cases, low-complexity decoders can be found. Consider the QSTBC code specified by the following code matrix:
Assuming \(n_r=1\), the QR decomposition of the equivalent channel matrix \(\mathbf {H}=\mathbf {Q}\mathbf {R}\) results in \(\mathbf {Q}\) equal to the identity matrix and
In [23], a very low-complexity decoder with near-ML performance was proposed. This decoder selects the estimated symbol vector \(\hat{\mathbf {s}}\) that satisfies
where \(D_T^2=\sum _{i=1}^4 D_i^2\) and the \(D_i^2\) are given by:
The decoding algorithm has two interesting complexity-reducing properties. One is that the detection of \(s_1\) and \(s_3\) can be done concurrently with that of \(s_2\) and \(s_4\), since these two detection steps are completely independent. The second is that the symbols \(\mathbf {x}\in \Omega \) are sorted in a specific way so that not all of them need to be tested using Eq. (16).
Note the similarity of the matrix \(\mathbf {R}\) in Eq. (15) with the corresponding DSTTD matrix in Eq. (10); likewise, compare the DSTTD decoding metrics in Eq. (12) and the LC-ML decoding procedure in Eq. (17). This suggests that the strategies presented in [23] might also be applicable to DSTTD decoding. This is explored in the next section.
Proposed near-ML decoding algorithm
In this section, we present a new decoding algorithm for DSTTD. The aim of the decoder is to find the optimum solution to the maximum likelihood equation (11), using the distances calculated in Eq. (12). Consider a breadth-first, tree search decoder that operates by executing the following steps:

1. Sort the elements of \(\Omega \) in order of increasing distance \(D_3^2\) and store them in vector \(\mathbf {x_3}\). Repeat for distance \(D_4^2\), storing the result in \(\mathbf {x_4}\).
2. Using \(\hat{x}_3=\mathbf {x_3}[1]\) and \(\hat{x}_4=\mathbf {x_4}[1]\), find symbols \(\hat{x}_1,\hat{x}_2\in \Omega \) that minimize \(D_1^2\) and \(D_2^2\). Store the current total distance \(D_T^2=\sum _i D_i^2\).
3. Iterate over all remaining elements of \(\mathbf {x_3}\) and \(\mathbf {x_4}\). For each pair \(\hat{x}_3\in \mathbf {x_3}\), \(\hat{x}_4\in \mathbf {x_4}\), find the pair \(\hat{x}_1\) and \(\hat{x}_2\) that minimizes \(D_T^2\).
4. Return the symbols that produce the smallest total distance.
Note that this procedure can be improved in several ways. First, the iteration in step 3 does not need to be over all symbols in \(\mathbf {x_3}\) and \(\mathbf {x_4}\). Since the symbols are sorted in order of increasing distance, the likelihood of a pair of symbols \(\hat{x}_3\), \(\hat{x}_4\) being in the optimal solution decreases as the algorithm progresses. This suggests that some pairs of symbols may be skipped, and that the search can be stopped early, according to some criterion, resulting in a significant reduction in complexity. The criterion should maximize the probability of the optimum solution being included in the search, while minimizing the number of symbol pairs examined.
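The four steps, together with one simple skip and stop rule, can be sketched as follows. The rule used here is an assumption chosen for illustration and is not the criterion of the proposed algorithm: a pair is skipped when \(D_3^2+D_4^2\) alone already reaches the best total distance found so far, which cannot discard the optimum because \(D_1^2\) and \(D_2^2\) are non-negative. The sketch also assumes \(R_{12}=R_{34}=0\), made-up values for \(\mathbf{R}\), and hypothetical function names.

```python
import math

QPSK = [complex(re, im) for re in (-1, 1) for im in (-1, 1)]


def hard_decision(z, constellation):
    """Nearest constellation point to z."""
    return min(constellation, key=lambda c: abs(z - c))


def decide_x1_x2(R, y_rot, x3, x4, constellation):
    """Given x3 and x4, pick x1 and x2 by hard decision (valid when R12 = 0)."""
    z1 = y_rot[0] - R[0][2] * x3 - R[0][3] * x4
    z2 = y_rot[1] - R[1][2] * x3 - R[1][3] * x4
    x1 = hard_decision(z1 / R[0][0], constellation)
    x2 = hard_decision(z2 / R[1][1], constellation)
    d12 = abs(z1 - R[0][0] * x1) ** 2 + abs(z2 - R[1][1] * x2) ** 2
    return x1, x2, d12


def sorted_search(R, y_rot, constellation):
    """Steps 1-4 with a lossless bound. Returns (x, distance, pairs examined)."""
    d3 = {x: abs(y_rot[2] - R[2][2] * x) ** 2 for x in constellation}
    d4 = {x: abs(y_rot[3] - R[3][3] * x) ** 2 for x in constellation}
    x3_sorted = sorted(constellation, key=d3.get)        # step 1
    x4_sorted = sorted(constellation, key=d4.get)
    best_dist, best_x, examined = math.inf, None, 0
    for x3 in x3_sorted:                                 # steps 2 and 3
        if d3[x3] + d4[x4_sorted[0]] >= best_dist:
            break      # no remaining pair can improve on the best distance
        for x4 in x4_sorted:
            if d3[x3] + d4[x4] >= best_dist:
                break  # later x4 candidates only increase D4
            examined += 1
            x1, x2, d12 = decide_x1_x2(R, y_rot, x3, x4, constellation)
            if d3[x3] + d4[x4] + d12 < best_dist:
                best_dist, best_x = d3[x3] + d4[x4] + d12, [x1, x2, x3, x4]
    return best_x, best_dist, examined                   # step 4


# Made-up R with R12 = R34 = 0 and a made-up disturbance on the received vector.
r13, r14 = 0.4 - 0.2j, -0.3 + 0.5j
R = [
    [1.5, 0, r13, r14],
    [0, 1.5, -r14.conjugate(), r13.conjugate()],
    [0, 0, 1.1, 0],
    [0, 0, 0, 1.1],
]
x_true = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
clean = [sum(R[i][j] * x_true[j] for j in range(4)) for i in range(4)]
noisy = [c + n for c, n in zip(clean, [0.3 - 0.2j, -0.1 + 0.4j, 0.9 + 0.2j, -0.2 - 0.8j])]

x_clean, dist_clean, examined_clean = sorted_search(R, clean, QPSK)
x_noisy, dist_noisy, examined_noisy = sorted_search(R, noisy, QPSK)
```

In the noiseless case the first pair already attains a distance of essentially zero, so only one pair is examined. This mirrors the behaviour expected at high SNR, where the initial estimate is usually the optimum.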
Note, as well, that the iteration in step 3 can be divided into a number of independent, concurrent iterations. This means that the decoder is amenable to fast implementations, either in hardware or in software, using multiple processors. While execution in a single processor requires roughly the same amount of memory as other proposed decoders, concurrent execution of the algorithm may involve a small memory usage penalty, because each process requires its own local metric storage.
The proposed algorithm builds on the decoders described in Sect. 3, which were first presented in [19, 23] and [24]. The decoder in [23] is designed for an ABBA code, which is similar to DSTTD but requires \(T=4\). In [19], a pool of candidate symbols is examined per iteration, whereas our proposal examines only one. In addition, [19] requires fine-tuning of a distance weighting function; no such adjustments are required in our proposal. Finally, the algorithm proposed in [24] has fixed complexity, and does not employ any heuristics for stopping the search early.
We present a more detailed description of the proposed algorithm in the next subsections; we also propose specific criteria for skipping symbols and for stopping the search. For clarity, we have divided the algorithm into three different stages: preprocessing, parallel candidate search, and postprocessing.
Preprocessing
It can be seen in Eq. (12) that the SNR of \(\hat{x}_3\) and \(\hat{x}_4\) depends on \(R_{33}\). Since a correct initial estimate of these two symbols reduces the search complexity, their SNR should be maximized. This is accomplished by reordering the columns of the channel matrix as described in Algorithm (3). Note that the column reordering does not affect the structure of matrix \(\mathbf {R}\).
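One way to choose the layer order is sketched below; it is a hypothetical rule and not necessarily the one used in Algorithm (3). It relies on two assumptions: that the only reordering considered is a swap of the two layers, and that \(\mathbf{R}\) satisfies \(R_{11}=R_{22}\) and \(R_{33}=R_{44}\). Under these assumptions \(|\det \mathbf{H}_a| = R_{11}^2R_{33}^2\) is unchanged by the swap and \(R_{11}\) equals the norm of the first column, so placing the layer with the smaller column norm first maximizes \(R_{33}\). The equivalent channel uses an assumed sign convention and the function names are hypothetical.

```python
import math
import random


def equivalent_channel(H):
    """4x4 equivalent channel for a 2x4 channel H (assumed sign convention)."""
    Ha = []
    for h1, h2, h3, h4 in H:
        Ha.append([h1, h2, h3, h4])
        Ha.append([h2.conjugate(), -h1.conjugate(), h4.conjugate(), -h3.conjugate()])
    return Ha


def r_diagonal(A):
    """Diagonal of R from a Gram-Schmidt QR decomposition of A."""
    Q, diag = [], []
    for j in range(len(A)):
        w = [A[r][j] for r in range(len(A))]
        for q in Q:
            proj = sum(qc.conjugate() * wc for qc, wc in zip(q, w))
            w = [wc - proj * qc for wc, qc in zip(w, q)]
        norm = math.sqrt(sum(abs(wc) ** 2 for wc in w))
        diag.append(norm)
        Q.append([wc / norm for wc in w])
    return diag


def reorder(Ha, order):
    """Return H_a with its columns taken in the given order."""
    return [[row[c] for c in order] for row in Ha]


def choose_layer_order(Ha):
    """Hypothetical rule: put the layer with the smaller column norm first."""
    norm_layer1 = sum(abs(row[0]) ** 2 for row in Ha)
    norm_layer2 = sum(abs(row[2]) ** 2 for row in Ha)
    reverse = norm_layer2 < norm_layer1
    order = [2, 3, 0, 1] if reverse else [0, 1, 2, 3]
    return order, reverse


rng = random.Random(7)
sigma = math.sqrt(0.5)
H = [[complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(4)]
     for _ in range(2)]
Ha = equivalent_channel(H)

order, reverse = choose_layer_order(Ha)
other = [0, 1, 2, 3] if reverse else [2, 3, 0, 1]
diag_chosen = r_diagonal(reorder(Ha, order))
diag_other = r_diagonal(reorder(Ha, other))
r33_chosen, r33_other = diag_chosen[2], diag_other[2]
```

The flag `reverse` records whether the layers were swapped, so that the detected symbols can be returned to their original order after decoding.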
Candidate search
This is the main portion of the algorithm, where candidate solutions are explored. It is divided into two parts. The first part, presented in Algorithm (4), calculates required quantities and obtains an initial estimate. The following conventions are used:

1. Variables in bold represent either vectors (lowercase) or matrices (uppercase). \(\varvec{\Omega }\) is a vector whose elements are the constellation symbols.
2. Arithmetic on vectors is performed element by element.
3. Vector indexing is indicated using square brackets.
4. The function \(\text {findmin}(\mathbf {x})\) returns a tuple of the smallest element in vector \(\mathbf {x}\) and its corresponding index.
5. The function \(\text {sortperm}(\mathbf {x})\) returns a vector of indices to the elements of \(\mathbf {x}\) in increasing order.
6. The instruction “break” exits all nested loops.
The initialization process is very similar to the OSIC decoder: the initial symbol estimates are those that individually minimize the metrics \(D_i^2\) in Eq. (11). After initialization, the estimate is refined in three iterative processes that can run concurrently. Each iteration examines a different subset of the available candidates, as defined in Table 2. The purpose of this partition is to share work as equally as possible between the three iterations, while allowing each iteration to examine its assigned symbols from most to least promising. Furthermore, note that the search size is constrained by parameter \(N_c\). If \(N_c=|\Omega |\), then the search can occur over the entire symbol set.
Iteration 1 is described in detail in Algorithm (5). Note that the other two iterations are identical except for the different indexing order. Each iteration produces a symbol estimate \(\hat{\mathbf {s}}_i\) with distance \(d_{mi}\), for \(i=1,2,3\). Symbol pairs \(\hat{x}_3\) and \(\hat{x}_4\) whose distances \(D_3^2\) and \(D_4^2\) are not better than previous ones are skipped, as specified in line 9. Furthermore, the search is stopped early if the conditions in lines 9 and 19 are both met. This is the main reason for the algorithm’s reduced complexity.
Postprocessing
After all three iterations have finished, the estimate \(\hat{\mathbf {s}}_i\) with least distance \(d_{mi}\), \(i=1,2,3\), is selected. If the Boolean flag reverse is true, then elements 1 and 2 of \(\hat{\mathbf{s}}\) are swapped with elements 3 and 4. This concludes the decoding process.
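The division of the search into three concurrent iterations, followed by the selection step, can be sketched as follows. The partition used here (pair \((i,j)\) of the sorted candidate lists is assigned to worker \((i+j) \bmod 3\)) is a hypothetical stand-in for the partition of Table 2, and no skip or stop criterion is applied. The sketch assumes \(R_{12}=R_{34}=0\), made-up values for \(\mathbf{R}\), and hypothetical function names. Python threads are used only to show the structure; they stand in for parallel hardware or separate processors.

```python
import itertools
import math
from concurrent.futures import ThreadPoolExecutor

QPSK = [complex(re, im) for re in (-1, 1) for im in (-1, 1)]


def hard_decision(z, constellation):
    """Nearest constellation point to z."""
    return min(constellation, key=lambda c: abs(z - c))


def search_subset(R, y_rot, pairs, constellation):
    """One iteration: examine the assigned (x3, x4) pairs, keep the best."""
    best_dist, best_x = math.inf, None
    for x3, x4 in pairs:
        z1 = y_rot[0] - R[0][2] * x3 - R[0][3] * x4
        z2 = y_rot[1] - R[1][2] * x3 - R[1][3] * x4
        x1 = hard_decision(z1 / R[0][0], constellation)
        x2 = hard_decision(z2 / R[1][1], constellation)
        dist = (abs(z1 - R[0][0] * x1) ** 2 + abs(z2 - R[1][1] * x2) ** 2
                + abs(y_rot[2] - R[2][2] * x3) ** 2
                + abs(y_rot[3] - R[3][3] * x4) ** 2)
        if dist < best_dist:
            best_dist, best_x = dist, [x1, x2, x3, x4]
    return best_dist, best_x


def parallel_detect(R, y_rot, constellation, reverse=False, workers=3):
    """Run the partial searches concurrently, then select the best estimate."""
    d3 = {x: abs(y_rot[2] - R[2][2] * x) ** 2 for x in constellation}
    d4 = {x: abs(y_rot[3] - R[3][3] * x) ** 2 for x in constellation}
    x3_sorted = sorted(constellation, key=d3.get)
    x4_sorted = sorted(constellation, key=d4.get)
    n = len(constellation)
    # Most promising pairs first, dealt out to the workers by (i + j) mod workers.
    ranked = sorted(itertools.product(range(n), range(n)),
                    key=lambda ij: d3[x3_sorted[ij[0]]] + d4[x4_sorted[ij[1]]])
    subsets = [[] for _ in range(workers)]
    for i, j in ranked:
        subsets[(i + j) % workers].append((x3_sorted[i], x4_sorted[j]))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(
            lambda sub: search_subset(R, y_rot, sub, constellation), subsets))
    best_dist, best_x = min(results, key=lambda r: r[0])    # postprocessing
    s_hat = best_x[2:] + best_x[:2] if reverse else best_x
    return s_hat, best_dist


# Made-up R with R12 = R34 = 0; the received vector is noiseless.
r13, r14 = 0.4 - 0.2j, -0.3 + 0.5j
R = [
    [1.5, 0, r13, r14],
    [0, 1.5, -r14.conjugate(), r13.conjugate()],
    [0, 0, 1.1, 0],
    [0, 0, 0, 1.1],
]
x_true = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
y_rot = [sum(R[i][j] * x_true[j] for j in range(4)) for i in range(4)]

s_hat, dist = parallel_detect(R, y_rot, QPSK)
s_hat_reversed, _ = parallel_detect(R, y_rot, QPSK, reverse=True)
```

Each worker keeps its own best distance and estimate, which corresponds to the local metric storage mentioned earlier; only the final comparison needs the results of all three iterations.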
Simulation results and analysis
To demonstrate the advantages of the proposed detector, we compare the bit error rate and complexity of several different detectors for a \(4\times 2\) DSTTD system (4 transmit and 2 receive antennas). We present results with QPSK and 32-QAM modulation. In all cases, the channel block length was fixed to \(L=2\), and simulations were run until 1000 symbol errors were found. The BER is represented as a function of the per-bit signal-to-noise ratio \(E_b/N_0\).
The detectors used in the comparison are the following. The OSIC detector (Sect. 3.2, [15]) has the worst error rate, but it is included as the baseline for low complexity. The DSTTD-adapted QR M-algorithm with \(M=2\) (Sect. 3.4, [24]) is also included. Next is the near-ML, improved OSIC detector described in Sect. 3.3 [19]; we have chosen the metric function \(f()\) to be equal to \(\text {max}()\), since it offers the best error performance (with a slight computational complexity increase). Finally, for QPSK, we include the brute-force ML detector (Algorithm 1) to bracket the optimum error performance.
In Figs. 3 and 4, we compare the BER performance of our proposal to that of the detectors listed above, for QPSK and 32-QAM modulations. It can be seen that both our proposed algorithm and the improved OSIC detector essentially achieve the optimum error performance. For QPSK, the M-algorithm detector is approximately 1 dB worse at \(\text {BER}=10^{-3}\) and it diverges for increasing SNR; the OSIC detector is 4 dB worse at the same point. A similar margin exists for 32-QAM.
In Figs. 5 and 6, we compare the computational complexity of the different detectors. This complexity is measured in terms of the average number of \(\hat{x}_3, \hat{x}_4\) pairs examined per decoded space–time symbol. In particular, for the proposed algorithm this is equivalent to averaging how many times lines 10–18 of Algorithm (5) are executed in the decoding of a single space–time symbol. This complexity measure is useful because the improved OSIC detector, the M-algorithm and the proposed algorithm perform a similar number of arithmetic and logic operations when examining a candidate pair. Note that since Algorithm (4) is always executed, the minimum number of examined pairs is one.
It can be seen that the proposed algorithm shows significantly reduced complexity compared to the improved OSIC detector, at essentially the same BER performance. The difference between the algorithms increases as the constellation order increases. It is interesting to note that, for 32-QAM, the complexity of the improved OSIC detector does not always decrease smoothly with increasing SNR. We attribute this behavior to a large variance in the number of examined pairs at low SNR. The proposed algorithm does not seem to exhibit this variance and its complexity is more easily predictable.
One feature of the proposed algorithm is that the number of examined pairs can be limited by the parameter \(N_c\); this can be seen in lines 2 and 5 in Algorithm (5). Limiting the search like this has the effect of reducing the detector’s complexity, at the cost of possibly omitting the optimum solution from the search. In Figs. 7 and 8, we show the bit error rate for \(N_c=2,3,4\) in the case of QPSK, and \(N_c=4,6,8,32\) for 32-QAM. In Figs. 9 and 10, we present the complexity for different values of \(N_c\) (recall that \(N_c\le |\Omega |\)).
For QPSK modulation, limiting the search size has a measurable effect on the bit error rate. For low SNR, setting \(N_c=2\) or \(N_c=3\) results in a reduction in complexity; however, for high SNR, the complexity in all three cases is very similar and converges to the OSIC complexity. This result is interesting, because it implies that, for high SNR, the algorithm’s initialization phase almost always finds the optimum solution; however, in some cases, it needs to consider further candidates to reach the ML performance.
For 32-QAM, results indicate that setting \(N_c=8\) is enough to achieve ML performance. We again see that, as SNR increases, the complexity tends to 1, albeit more slowly than for QPSK.
Conclusions
We have presented a receiver algorithm for the DSTTD hybrid space–time code with near-optimal bit error rate and substantially less complexity than other, existing near-ML decoders. The presented decoder is an adaptation of the QRD-M tree search algorithm. After an initial estimate, three concurrent iterations search the tree of candidates and store the solution with least metric. The low complexity results from the decoder skipping over candidates that do not meet a specified criterion, as well as the search being stopped as soon as no promising candidates are left. The concurrent nature of the three iterations suggests the algorithm can be executed on multiple processors, increasing its operating speed.
References
 1.
Telatar, I. E. (1999). Capacity of multipleantenna Gaussian channels. European Transactions on Telecommunications, 10(6), 585–595.
 2.
Zheng, L., & Tse, D. (2003). Diversity and multiplexing: A fundamental tradeoff in multipleantenna channels. IEEE Transactions on Information Theory, 49(5), 1073–1096.
 3.
Foschini, G. J. (1996). Layered spacetime architecture for wireless communication in a fading environment when using multielement antennas. Bell Labs Technical Journal, 1(2), 41–59.
 4.
Alamouti, S. M. (1998). A simple transmit diversity technique for wireless communications. IEEE Journal on Selected Areas in Communications, 16(8), 1451–1458.
 5.
Tarokh, V., Jafarkhani, H., & Calderbank, A. R. (2002). Spacetime block codes from orthogonal designs. IEEE Transactions on Information Theory, 45(5), 1456–1467.
 6.
Hassibi, B., & Hochwald, B. M. (2002). Highrate codes that are linear in space and time. IEEE Transactions on Information Theory, 48(7), 1804–1824.
 7.
Cortez, J., Bazdresch, M., Torres, D. & ParraMichel, R. (2007). An efficient detector for nonorthogonal SpaceTime Block Codes with Receiver Antenna Selection. In: IEEE 18th international symposium on personal, indoor and mobile radio communications (PIMRC). https://doi.org/10.1109/PIMRC.2007.4394046.
 8.
Yu, S.J., Lee, E.Y., & Song, H.K. (2015). Combination of STBC and SM scheme with iterative detection in LTE systems. Wireless Personal Communications, 83(2), 1203–1211.
 9.
Soria, F., Garcia, J., & Barboza, F. (2015). Improved detection of SMSMux signals for MIMO channels. IEEE Latin America Transactions, 13(1), 43–47.
 10.
Bazdresch, M., Cortez, J., Longoria, O., & ParraMichel, R. (2012). A family of hybrid space–time codes for MIMO wireless communications. Journal of Applied Research and Technology, 19(2), 122–142.
 11.
Onggosanusi, E., Dabak, A., & Schmidl, T. (2002). High rate spacetime block coded scheme: Performance and improvement in correlated fading channels. IEEE Wireless Communications Networking Conference (WCNC),. https://doi.org/10.1109/WCNC.2002.993489.
 12.
IEEE Standard for Local and Metropolitan Area Networks, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 5: Enhancements for Higher Throughput, IEEE Std. 802.11n-2009 (2009).
 13.
IEEE Standard for Local and Metropolitan Area Networks, Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems, Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands and Corrigendum 1, IEEE Std. 802.16e-2005 (2006).
 14.
Kim, H., & Park, H. (2005). Efficient successive interference cancellation algorithms for the DSTTD system. In 16th IEEE international symposium on personal, indoor and mobile radio communications (PIMRC). https://doi.org/10.1109/PIMRC.2005.1651399.
 15.
Wübben, D., Böhnke, R., Rinas, J., Kühn, V., & Kammeyer, K. D. (2001). Efficient algorithm for decoding layered space–time codes. Electronics Letters, 37(22), 1348.
 16.
Wu, K., Sang, L., Xiong, C., Zhang, X., & Yang, D. (2009). Novel QRM-MLD algorithm for V-BLAST systems with permuted channel matrix. In 20th IEEE international symposium on personal, indoor and mobile radio communications (PIMRC). https://doi.org/10.1109/PIMRC.2009.5449768.
 17.
Hassibi, B., & Vikalo, H. (2005). On the sphere-decoding algorithm: I. Expected complexity. IEEE Transactions on Signal Processing, 53(8), 2806–2816.
 18.
de Jong, Y., & Willink, T. (2005). Iterative tree search detection for MIMO wireless systems. IEEE Transactions on Communications, 53(6), 930–935.
 19.
Kim, H., & Park, H. (2008). Approaching maximum-likelihood performance with reduced complexity for a double space–time transmit diversity system. IET Communications, 2(5), 682–689.
 20.
Kim, I., Park, Y. O., & Bang, Y. (2009). Very fast detection for rate-2 quasi-orthogonal STBCs. IEEE Transactions on Wireless Communications, 8(1), 95–101.
 21.
Lee, Y., & Shieh, H.-W. (2011). A simple layered space–time block nulling technique for DSTTD systems. IEEE Communications Letters, 15(12), 1323–1325.
 22.
Pai, H.-T. (2012). Adaptive detection for double STBCs based on QR decomposition. IEEE Transactions on Vehicular Technology, 61(3), 1182–1187.
 23.
Le, M.-T., Pham, V.-S., Mai, L., & Yoon, G. (2005). Low-complexity maximum-likelihood decoder for four-transmit-antenna quasi-orthogonal space–time block code. IEEE Transactions on Communications, 53(11), 1817–1821.
 24.
Cortez, J., Palacio, R., Ramirez-Pacheco, J. C., & Ruiz-Ibarra, E. (2015). A very low complexity near ML detector based on QRD-M algorithm for STBC-VBLAST architecture. In 7th IEEE Latin-American conference on communications (LATINCOM). https://doi.org/10.1109/LATINCOM.2015.7430123.
 25.
Kim, H., & Park, H. (2007). New DSTTD transceiver architecture for low-complexity maximum-likelihood detection. In IEEE wireless communications and networking conference (WCNC). https://doi.org/10.1109/WCNC.2007.144.
 26.
Cortez, J., Bazdresch, M., & Longoria-Gandara, O. (2017). A low-complexity near-ML detector for a \(3 \times n_R\) hybrid space-time code. In IEEE 9th Latin-American conference on communications (LATINCOM). https://doi.org/10.1109/LATINCOM.2017.8240182.
 27.
Longoria-Gandara, O., & Parra-Michel, R. (2011). Estimation of correlated MIMO channels using partial channel state information and DPSS. IEEE Transactions on Wireless Communications, 10(11), 3711–3719. https://doi.org/10.1109/TWC.2011.091411.101199.
 28.
Anderson, J., & Mohan, S. (1984). Sequential coding algorithms: a survey and cost analysis. IEEE Transactions on Communications, 32(2), 169–176.
Acknowledgements
The present article was jointly funded by PFCE Projects and PROFAPI 2018.
Ethics declarations
Conflicts of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Cortez, J., Bazdresch, M., Ramírez, J. et al. Low complexity maximum-likelihood detector for DSTTD architecture based on the QRD-M algorithm. Telecommun Syst 70, 55–66 (2019). https://doi.org/10.1007/s11235-018-0467-8
Keywords
 Space–time codes
 Double space–time transmit diversity
 Maximum-likelihood detection
 QRD-M algorithm