Efficient GPU and CPU-based LDPC decoders for long codewords


The next-generation DVB-T2, DVB-S2, and DVB-C2 standards for digital television broadcasting specify the use of low-density parity-check (LDPC) codes with codeword lengths of up to 64800 bits. Real-time decoding of these codes on general-purpose computing hardware is useful for fully software-defined receivers, as well as for testing and simulation. Modern graphics processing units (GPUs) are capable of massively parallel computation and, given carefully designed algorithms, can in some cases outperform general-purpose central processing units (CPUs) by an order of magnitude or more. The main problem in decoding LDPC codes on GPU hardware is that LDPC decoding generates irregular memory accesses, which tend to carry heavy performance penalties on GPUs. Memory accesses can be parallelized efficiently by decoding several codewords in parallel and by using appropriate data structures. In this article we present the algorithms and data structures that make log-domain decoding of the long LDPC codes specified by the DVB-T2 standard possible on a modern GPU at the high data rates required for television broadcasting. We also describe a similar decoder implemented on a general-purpose CPU, and show that high-performance LDPC decoders are also possible on modern multi-core CPUs.
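The idea of decoding several codewords in parallel can be illustrated with a small sketch. The Python example below is not the authors' implementation. It assumes an interleaved layout in which the messages of all codewords for a given edge are stored next to each other, so that neighbouring GPU threads (one per codeword) would read neighbouring memory locations. For brevity it uses the min-sum approximation of the log-domain check node update, which is not necessarily the update rule used in the article. The names `check_node_update`, `v2c`, `c2v`, and `batch` are hypothetical.

```python
def check_node_update(v2c, edges, batch):
    """Min-sum check node update for one check node over a batch of codewords.

    v2c   -- flat list of variable-to-check LLR messages, stored interleaved:
             the message on edge e of codeword cw is at index e * batch + cw
             (assumed layout, chosen so that consecutive codewords are adjacent)
    edges -- indices of the edges attached to this check node (at least two)
    batch -- number of codewords decoded together
    Returns a flat list of check-to-variable messages in the same layout.
    """
    c2v = [0.0] * len(v2c)
    # On a GPU each iteration of this loop would be one thread; because of the
    # interleaved layout, threads cw and cw + 1 touch adjacent addresses.
    for cw in range(batch):
        vals = [v2c[e * batch + cw] for e in edges]
        for i, e in enumerate(edges):
            # Exclude the message that arrived on the edge being updated.
            others = vals[:i] + vals[i + 1:]
            sign = 1.0
            for x in others:
                if x < 0:
                    sign = -sign
            c2v[e * batch + cw] = sign * min(abs(x) for x in others)
    return c2v


# Two codewords, one check node with three edges.
# Codeword 0 messages: 2.0, -1.0, 3.0; codeword 1 messages: -0.5, 4.0, 1.5.
v2c = [2.0, -0.5, -1.0, 4.0, 3.0, 1.5]
print(check_node_update(v2c, [0, 1, 2], 2))
# prints [-1.0, 1.5, 2.0, -0.5, -1.0, -0.5]
```

The edge indices themselves are still irregular, since they follow the structure of the parity-check matrix. With this layout, however, each irregular lookup is shared by the whole batch and becomes one contiguous read of `batch` values.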





Author information



Corresponding author

Correspondence to Stefan Grönroos.


About this article

Cite this article

Grönroos, S., Nybom, K. & Björkqvist, J. Efficient GPU and CPU-based LDPC decoders for long codewords. Analog Integr Circ Sig Process 73, 583–595 (2012). https://doi.org/10.1007/s10470-012-9895-7

Keywords


  • DVB-T2
  • LDPC
  • SDR
  • CUDA
  • SSE
  • SIMD