An optimized parallel order scheme of the deblocking filtering process for enhancing the performance of the HEVC standard using GPUs

Multimedia Tools and Applications

Abstract

In High Efficiency Video Coding (HEVC), deblocking filtering (DF) accounts for about 20% of the time spent on video compression. A typical parallel DF scheme processes a set of horizontal and vertical edges with deblocking filters. Conventional parallel DF schemes have two weaknesses: the filters may be applied to the same edge more than once, and some edges are assigned to cores for filtering even though they are not designated to be filtered. As a result, the parallel hardware architecture needs more on-chip memory modules, and the added computational complexity degrades HEVC performance. This paper proposes an optimized parallel DF scheme for HEVC using graphics processing units (GPUs). Compared with competing schemes, the proposed scheme reduces the decoding time of all frames of the video sequences by average speed-up factors of 2.83 and 2.45 in the all-intra and low-delay video coding configuration modes, respectively. The proposal does not change the rate-distortion performance between the decoded video sequences and their original sequences.
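The two weaknesses named in the abstract (edges filtered more than once, and cores given edges that need no filtering) can be illustrated with a small scheduling sketch. This is not the scheme proposed in the paper, whose details are not available in this preview; it is a minimal illustration that assumes each edge segment has a boundary-strength value and that a value of 0 means "do not filter". The `Edge` key layout, the `schedule` function, and the round-robin split are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical edge key: (direction, x, y), with direction "V" or "H"
# and x, y given in luma samples on the 8x8 deblocking grid.
Edge = Tuple[str, int, int]


def schedule(bs: Dict[Edge, int], n_workers: int) -> Dict[str, List[List[Edge]]]:
    """Build per-worker edge lists for a vertical pass and a horizontal pass."""
    passes: Dict[str, List[List[Edge]]] = {}
    for direction in ("V", "H"):  # vertical edges are filtered before horizontal ones
        # Keep only edges of this direction whose boundary strength is non-zero,
        # so no worker is handed an edge that does not need filtering.
        active = sorted(
            (e for e, strength in bs.items() if e[0] == direction and strength > 0),
            key=lambda e: (e[2], e[1]),  # raster order: by y, then by x
        )
        # Round-robin split; dictionary keys are unique, so no edge is issued twice.
        passes[direction] = [active[i::n_workers] for i in range(n_workers)]
    return passes
```

Calling `schedule(bs, n_workers)` on a boundary-strength map returns one list of edges per worker for each pass. On a GPU, each inner list would correspond to the work of one thread or thread block, and the two passes would run as two separate kernel launches.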

Author information

Corresponding author

Correspondence to Mohamed M. Fouad.

About this article

Cite this article

Fouad, M.M., Dansereau, R.M. An optimized parallel order scheme of the deblocking filtering process for enhancing the performance of the HEVC standard using GPUs. Multimed Tools Appl 76, 24609–24634 (2017). https://doi.org/10.1007/s11042-017-4876-6

  • DOI: https://doi.org/10.1007/s11042-017-4876-6
