Journal of Signal Processing Systems

, Volume 90, Issue 11, pp 1551–1567 | Cite as

Multicore and Manycore Implementations of ADMM-based Decoders for LDPC Decoding

  • Imen DebbabiEmail author
  • Bertrand Le Gal
  • Nadia Khouja
  • Fethi Tlili
  • Christophe Jégo


The alternate direction method of multipliers (ADMM) algorithm has recently been proposed for LDPC decoding based on linear programming (LP) techniques. Even though it improves the error rate performance compared with usual message passing (MP) techniques, it shows a higher computation complexity. However, a significant step towards LP LDPC decoding scalability and optimization is made possible since the ADMM algorithm acts as an MP decoding one. In this paper, an overview of the ADMM approach and its error correction performances is provided. Then, its computation and memory complexities are evaluated. Finally, optimized software implementations of the decoder to take advantage of multi/many-core device features are described. Optimization choices are discussed and justified according to execution profiling figures and the algorithm’s parallelism levels. Experimentation results show that this LP based decoding technique can reach WiMAX and WRAN standards real time throughput requirements on mid-range devices.


Many-core Multi-core LDPC decoding Linear programming ADMM algorithm Software optimization SIMD SIMT GPU Throughput 


  1. 1.
    Gallager, R.G. (1962). Low-density parity-check codes. IRE Transactions on Information Theory, 8(1), 21–28.MathSciNetCrossRefGoogle Scholar
  2. 2.
    MacKay, D.J., & Neal, R.M. (1996). Near shannon limit performance of low density parity check codes. Electronics Letters, 32(18), 1645–1646.CrossRefGoogle Scholar
  3. 3.
    Le Gal, B., & Jégo, C. (2015). High-throughput multi-core LDPC decoders based on x86 processor. In IEEE transactions on parallel and distributed systems.Google Scholar
  4. 4.
    Andrade, J., Falcao, G., & Silva, V. (2014). Optimized fast Walsh-Hadamard transform on GPUs for non-binary LDPC decoding. Parallel Computing, Elsevier.Google Scholar
  5. 5.
    Wu, M., Sun, Y., Wang, G., & Cavallaro, J.R. (2011). Implementation of a high throughput 3GPP turbo decoder on GPU. Journal of Signal Processing Systems, Springer.Google Scholar
  6. 6.
    Falcão, G., Silva, V., Sousa, L., & Marinho, J. (2008). High coded data rate and multicodeword wiMAX LDPC decoding on the cell/BE. Electronics Letters, 44(24), 1415–1417.CrossRefGoogle Scholar
  7. 7.
    Zhao, J., Zhao, M., Yang, H., Chen, J., Chen, X., & Wang, J. (2011). High performance LDPC decoder on CELL BE for wiMAX system. In Proceedings of the CMC Conference (pp. 278–281).Google Scholar
  8. 8.
    Gronroos, S., Nybom, K., & Bjorkqvist, J. (2012). Efficient GPU and CPU-based LDPC decoders for long codewords. Journal of Analog Integrated Circuits and Signal Processing, 73, 583–595.CrossRefGoogle Scholar
  9. 9.
    Pan, X., Fan Lu, X., Qi Li, M., & Fang Song, R. (2013). A high throughput LDPC decoder in CMMB based on virtual radio. In Proceedings of the WCNC workshop (pp. 95–99).Google Scholar
  10. 10.
    Han, X., Niu, K., & He, Z. (2013). Implementation of IEEE 802.11n LDPC codes based on general purpose processors. In Proceedings of the ICCT conference (pp. 218–222).Google Scholar
  11. 11.
    Le Gal, B., & Jégo, C. (2015). High-Throughput LDPC decoder on Low-Power embedded processors. IEEE Communications Letters, 19, 1861–1864.CrossRefGoogle Scholar
  12. 12.
    Chang, Y.-L. (2010). High-throughput GPU-based LDPC. In Proceedings of the SPIE satellite data compression, communications and processing conference (Vol. 7810).Google Scholar
  13. 13.
    Falcao, G., Andrade, J., Silva, V., & Sousa, L. (2011). GPU-based DVB-s2 LDPC decoder with high throughput and fast error floor detection. Electronics Letters, 47(9), 542–543.CrossRefGoogle Scholar
  14. 14.
    Falcao, G, Sousa, L., & Silva, V. (2011). Massively LDPC decoding on multicore architectures. IEEE Transactions on Parallel and Distributed Systems, 22(2), 309–322.CrossRefGoogle Scholar
  15. 15.
    Wang, G., Wu, M., Sun, Y., & Cavallaro, J.R. (2011). GPU Accelerated scalable parallel decoding of LDPC codes. In Proceedings of the IEEE asilomar conference on signals, systems, and computers (pp. 2053–2057).Google Scholar
  16. 16.
    Ji, H., Cho, J., & Sung, W. (2011). Memory access optimized implementation of cyclic and quasi-cyclic LDPC codes on a GPGPU. Journal of Signal Processing Systems, 64, 149–159.CrossRefGoogle Scholar
  17. 17.
    Kang, S., & Moon, J. (2012). Parallel LDPC decoder implementation on GPU based on unbalanced memory coalescing. In Proceedings of the IEEE conference on communications (pp. 3692–3697).Google Scholar
  18. 18.
    Wang, G., Wu, M., Yin, B., & Cavallaro, J. R. (2013). High throughput low latency LDPC decoding on GPU for SDR systems. In Proceedings of the IEEE GlobalSIP conference (pp. 1258–1261).Google Scholar
  19. 19.
    Lin, Y., & Niu, W. (2014). High throughput LDPC decoder on GPU. IEEE Communications Letters, 18(2), 344–347.CrossRefGoogle Scholar
  20. 20.
    Le Gal, B., Jégo, C., & Crenne, J. (2014). A high throughput efficient approach for decoding LDPC codes onto GPU devices. IEEE Embedded Systems Letters, 6(2), 29–32.CrossRefGoogle Scholar
  21. 21.
    Feldman, J. (2003). Decoding error-correcting codes via linear programming. Phd thesis, Massachussets Institute of Technology.Google Scholar
  22. 22.
    Zhang, X., & Siegel, P.H. (2013). Efficient iterative LP decoding of ldpc codes with alternating direction method of multipliers. In IEEE international symposium on information theory (ISIT).Google Scholar
  23. 23.
    Feldman, J., Wainwright, M., & Karger, D. (2005). Using linear programming to decode binary linear codes. IEEE Transactions on Information Theory, 51, 954–972.MathSciNetCrossRefGoogle Scholar
  24. 24.
    Liu, X., Draper, S., & Recht, B. (2012). Suppressing pseudocodewords by penalizing the objective of LP decoding. In Proceedings of the IEEE information theory workshop (ITW) (pp. 367–371).Google Scholar
  25. 25.
    Zhang, X., & Siegel, P. (2012). Adaptive cut generation algorithm for improved linear programming decoding of binary linear codes. IEEE Transactions on Information Theory, 58, 6581–6594.MathSciNetCrossRefGoogle Scholar
  26. 26.
    Boyd, S.P., Parikh, N., Chu, E., Peleato, B., & Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1), 1–122.CrossRefGoogle Scholar
  27. 27.
    Barman, S., Liu, X., Draper, S.C., & Recht, B. (2013). Decomposition methods for large scale LP decoding. IEEE Transactions on Information Theory, 59(12), 7870–7886.MathSciNetCrossRefGoogle Scholar
  28. 28.
    Zhang, X. (2012). LDPC codes: structural analysis and decoding techniques. PhD thesis, San Diego: University of California.Google Scholar
  29. 29.
    C++ ADMM Decoder by X. Liu. sites.
  30. 30.
    Debbabi, I., Le Gal, B., Khouja, N., Tlili, F., & Jégo, C. (2016). Multicore implementation of LDPC decoders based on ADMM algorithm. In Proceedings of the 41st IEEE international conference on acoustics speech and signal processing.Google Scholar
  31. 31.
    Wei, H., Jiao, X., & Mu, J. (2015). Reduced-complexity linear programming decoding based on ADMM for LDPC codes. IEEE Communications Letters, 19, 909–912.CrossRefGoogle Scholar
  32. 32.
    Falcao, G., Yamagiwa, S., Silva, V., & Sousa, L. (2009). Parallel LDPC decoding on GPUs using a stream-based computing approach. Journal of Computer Science And Technology, 24, 913– 924.CrossRefGoogle Scholar
  33. 33.
    Wang, G., Wu, M., Sun, Y., & Cavallaro, J.R. (2011). A massively parallel implementation of QC-LDPC decoder on GPU. In Proceedings of the symposium on application specific processors (pp. 82–85).Google Scholar
  34. 34.
    Jiao, X., Wei, H., Mu, J., & Chen, C. (2015). Improved ADMM penalized decoder for irregular low-density parity-check codes. IEEE Communications Letters, 19, 913–916.CrossRefGoogle Scholar
  35. 35.
    Liu, X., & Draper, S.C. (2014). The ADMM penalized decoder for LDPC codes, CoRR, arXiv:1409.5140.
  36. 36.
    Barman, S., Liu, X., Draper, S.C., & Recht, B. (2012). Decomposition methods for large scale LP decoding, CoRR, arXiv:1204.0556.
  37. 37.
    Amdahl, G.M. (2007). Computer architecture and amdahl’s law. IEEE Solid-State Circuits Newsletter, 3(12), 4–9.CrossRefGoogle Scholar
  38. 38.
    Zekri, A.S. (2014). Enhancing the matrix transpose operation using intel AVX instruction set extension. International Journal of Computer Science & Information Technology (IJCSIT), 6, 68–78.Google Scholar
  39. 39.
    Le Gal, B., & Jégo, C. (2016). High-throughput multi-core LDPC decoders based on ×86 processor. IEEE Transactions on Parallel and Distributed Systems, 27, 1373–1386.CrossRefGoogle Scholar
  40. 40.
    Giard, P., Sarkis, G., Leroux, C., Thibeault, C., & Gross, W.J. (2016). Low-latency software polar decoders. Journal of Signal Processing Systems, Springer.Google Scholar
  41. 41.
    Gal, B.L., Leroux, C., & Jégo, C. (2014). Multi-gb/s software decoding of polar codes. In IEEE transactions on signal processing (pp. 349–359).Google Scholar
  42. 42.
    Cassagne, A., Tonnellier, T., Leroux, C., Gal, B.L., Aumage, O., & Barthou, D. (2016). Beyond Gbps turbo decoder on multi-core CPUs. In Proceedings of the 9th international symposium on turbo codes & iterative information processing (ISTC’16). France.Google Scholar
  43. 43.
    Chapman, B., Jost, G., & Van Der Pas, R. (2008). Using OpenMP: portable shared memory parallel programming. The MIT Press.Google Scholar
  44. 44.
    Li, R. (2013). A multi-standard efficient column-layered LDPC decoder for software defined radio on GPUs. In Proceedings of the SPAWC workshop (pp. 724–728).Google Scholar
  45. 45.
    Wasson, M., & Draper, S.C. (2015). Hardware based projection onto the parity polytope and probability simplex. In 49th asilomar conference on signals systems and computers.Google Scholar

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. 1.GRESCOM Laboratory, High School of CommunicationsCarthage UniversityArianaTunisia
  2. 2.CNRS Lab. IMSBordeaux INP, Bordeaux UniversityBordeauxFrance

Personalised recommendations