Implementation of a High Throughput Soft MIMO Detector on GPU

Journal of Signal Processing Systems

Abstract

Multiple-input multiple-output (MIMO) significantly increases the throughput of a communication system by employing multiple antennas at the transmitter and the receiver. To extract maximum performance from a MIMO system, a computationally intensive search-based detector is needed. To meet the computational challenge of MIMO detection, suboptimal MIMO detectors are typically implemented as ASIC or FPGA designs. We aim to show that a MIMO detector on a graphics processing unit (GPU), a low-cost parallel programmable co-processor, can achieve high throughput and can serve as an alternative to ASIC/FPGA designs. However, careful architecture-aware software design is needed to leverage the performance offered by the GPU. We propose a novel soft MIMO detection algorithm, multi-pass trellis traversal (MTT), and show that we can achieve ASIC/FPGA-like performance and handle different configurations in software on a GPU. The proposed design can be used to accelerate wireless physical layer simulations and to offload MIMO detection processing in wireless testbed platforms.
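To make concrete why search-based soft detection is computationally intensive, the following is a minimal sketch of an exhaustive max-log soft detector for the standard model y = Hx + n. It is not the MTT algorithm proposed in the article; it is the brute-force baseline that suboptimal detectors try to approximate. The QPSK bit mapping, the function names, and the 2x2 channel values are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical QPSK mapping (an assumption, not taken from the article):
# each bit pair selects one of four complex constellation points.
QPSK = {(0, 0): complex(1, 1), (0, 1): complex(1, -1),
        (1, 0): complex(-1, 1), (1, 1): complex(-1, -1)}
SCALE = 2 ** -0.5  # normalizes each symbol to unit energy


def distance(y, H, x):
    """Squared Euclidean distance ||y - Hx||^2 for lists of complex values."""
    total = 0.0
    for row, y_i in zip(H, y):
        residual = y_i - sum(h * s for h, s in zip(row, x))
        total += abs(residual) ** 2
    return total


def exhaustive_soft_detect(y, H, noise_var):
    """Max-log LLR per transmitted bit, searching every candidate vector.

    Sign convention: a positive LLR means bit value 1 is more likely.
    """
    n_tx = len(H[0])
    n_bits = 2 * n_tx  # two bits per QPSK symbol
    # best[i][b] is the smallest distance seen with bit i equal to b.
    best = [[float("inf"), float("inf")] for _ in range(n_bits)]
    for bits in product((0, 1), repeat=n_bits):
        x = [SCALE * QPSK[bits[2 * k], bits[2 * k + 1]] for k in range(n_tx)]
        d = distance(y, H, x)
        for i, b in enumerate(bits):
            if d < best[i][b]:
                best[i][b] = d
    return [(d0 - d1) / noise_var for d0, d1 in best]


# Noiseless 2x2 example with an arbitrary, well-conditioned channel.
H = [[1 + 0j, 0.5j], [0.3 + 0j, 1 + 0j]]
bits_tx = (0, 1, 1, 0)
x_tx = [SCALE * QPSK[bits_tx[2 * k], bits_tx[2 * k + 1]] for k in range(2)]
y = [sum(h * s for h, s in zip(row, x_tx)) for row in H]
llrs = exhaustive_soft_detect(y, H, noise_var=0.1)
hard_bits = [1 if llr > 0 else 0 for llr in llrs]
```

The loop visits 4^N candidates for N transmit antennas with QPSK, and M^N in general for an M-point constellation, which is the exponential growth that motivates suboptimal detectors and parallel hardware.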

Acknowledgements

This work was supported in part by Nokia, NSN, Texas Instruments, Xilinx, and by NSF under grants CCF-0541363, CNS-0551692, CNS-0619767, EECS-0925942 and CNS-0923479.

Author information

Corresponding author

Correspondence to Michael Wu.

About this article

Cite this article

Wu, M., Sun, Y., Gupta, S. et al. Implementation of a High Throughput Soft MIMO Detector on GPU. J Sign Process Syst 64, 123–136 (2011). https://doi.org/10.1007/s11265-010-0523-4

  • DOI: https://doi.org/10.1007/s11265-010-0523-4
