
Instruction Set Extensions for Matrix Decompositions on Software Defined Radio Architectures


Emerging wireless applications consistently demand higher data rates, yet the available frequency spectrum is limited. Enhanced spectral efficiency and link reliability within that spectrum are therefore of the utmost importance in current and next-generation wireless protocols. To attain them, wireless protocols employ increasingly complex two-dimensional techniques that involve computationally intensive matrix operations. Multiple-Input Multiple-Output (MIMO) communication is one such technique: it delivers higher data rates at the cost of increased algorithmic complexity. Application Specific Integrated Circuits (ASICs) have traditionally been used to implement compute-intensive wireless protocols. The wireless industry has been gradually moving towards an alternative programmable platform, Software Defined Radio (SDR), because of its significant benefits, such as reduced development costs and accelerated time-to-market. However, the computationally intensive matrix operations used in current and next-generation wireless protocols are extremely expensive to implement on SDR platforms with conventional Digital Signal Processor (DSP) instruction sets. Hence, there is a need for novel instructions, hardware designs, and algorithm enhancements to enable higher spectral efficiency on SDR platforms. In this paper, we propose Single Instruction Multiple Data (SIMD) CoOrdinate Rotation DIgital Computer (CORDIC) instruction set extensions with CORDIC hardware support to speed up computationally intensive matrix decomposition algorithms.
The CORDIC instruction set extensions have been implemented on the Sandbridge Sandblaster SB3000 SDR platform and evaluated on conventional algorithms used for decomposing a closed-loop 4-by-4 Worldwide Interoperability for Microwave Access (WiMAX) MIMO channel into independent Single-Input Single-Output (SISO) channels. Our experimental results on the closed-loop MIMO channel decomposition show that the CORDIC instructions achieve more than 6x speedup over a Sandblaster baseline implementation that uses state-of-the-art SIMD DSP instructions, with numerical accuracy similar to that of the baseline. The techniques we propose in this paper are also applicable to other SDR and embedded processor architectures.
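
As an illustration of why CORDIC suits matrix decompositions, the following minimal Python sketch shows the two classic CORDIC modes (vectoring and rotation) and uses them to apply Givens rotations that reduce a small real matrix to upper-triangular form, as in a QR decomposition. It is not the instruction set proposed in the paper and does not model the Sandblaster hardware: the function names (`cordic_vector`, `cordic_rotate`, `givens_triangularize`), the iteration count, and the use of real-valued floating-point data instead of SIMD fixed-point complex data are all assumptions made for brevity.

```python
import math

N_ITER = 16  # number of CORDIC micro-rotations (assumed; sets the precision)

# Elementary angles atan(2^-i) and the constant gain accumulated by the iterations.
ANGLES = [math.atan(2.0 ** -i) for i in range(N_ITER)]
GAIN = 1.0
for i in range(N_ITER):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_vector(x, y):
    """Vectoring mode: drive y to zero; return (magnitude, angle of (x, y))."""
    angle = 0.0
    if x < 0:
        # Pre-rotate by 180 degrees so the vector lies in the right half-plane.
        angle = math.pi if y >= 0 else -math.pi
        x, y = -x, -y
    for i in range(N_ITER):
        t = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        if y > 0:
            x, y = x + y * t, y - x * t  # rotate clockwise
            angle += ANGLES[i]
        else:
            x, y = x - y * t, y + x * t  # rotate counter-clockwise
            angle -= ANGLES[i]
    return x / GAIN, angle


def cordic_rotate(x, y, angle):
    """Rotation mode: rotate (x, y) counter-clockwise by angle (radians)."""
    # Bring the angle into [-pi/2, pi/2], where the iterations converge.
    if angle > math.pi / 2:
        x, y, angle = -x, -y, angle - math.pi
    elif angle < -math.pi / 2:
        x, y, angle = -x, -y, angle + math.pi
    for i in range(N_ITER):
        t = 2.0 ** -i
        if angle >= 0:
            x, y = x - y * t, y + x * t
            angle -= ANGLES[i]
        else:
            x, y = x + y * t, y - x * t
            angle += ANGLES[i]
    return x / GAIN, y / GAIN


def givens_triangularize(a):
    """Return the upper-triangular factor of a real square matrix,
    computed with CORDIC-based Givens rotations."""
    n = len(a)
    r = [row[:] for row in a]
    for col in range(n):
        for row in range(n - 1, col, -1):
            # Vectoring finds the angle that zeroes r[row][col] ...
            mag, theta = cordic_vector(r[col][col], r[row][col])
            r[col][col], r[row][col] = mag, 0.0
            # ... and rotation applies the same angle to the rest of both rows.
            for k in range(col + 1, n):
                r[col][k], r[row][k] = cordic_rotate(r[col][k], r[row][k], -theta)
    return r


if __name__ == "__main__":
    for row in givens_triangularize([[3.0, 1.0], [4.0, 2.0]]):
        print(row)
```

Each micro-rotation needs only shifts, additions, and a small angle table, which is the property that makes CORDIC attractive as a hardware-supported instruction. With 16 iterations the results agree with a floating-point reference to roughly four decimal places; a fixed-point version would trade iteration count against word length.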





Author information

Correspondence to Murugappan Senthilvelan.


About this article

Cite this article

Senthilvelan, M., Sima, M., Iancu, D. et al. Instruction Set Extensions for Matrix Decompositions on Software Defined Radio Architectures. J Sign Process Syst 70, 289–303 (2013).

Keywords


  • Instruction set extensions
  • Software defined radio
  • QR decomposition
  • Singular value decomposition