A Highly Utilized Hardware-Based Merge Mode Estimation with Candidate Level Parallel Execution for High-Efficiency Video Coding

Kim, Tae Sung; Rhee, Chae Eun; Lee, Hyuk-Jae

doi:10.1007/s11265-017-1268-0

Published: 22 July 2017

Journal of Signal Processing Systems, Volume 90, pages 743–757 (2018)

Abstract

The merge mode is one of the new tools adopted in high-efficiency video coding (HEVC) to improve inter-frame coding efficiency. It saves the bits for the motion vector (MV) by sharing the MV with neighboring blocks. Merge mode estimation (MME) is the process of finding the merge mode candidate that achieves the highest compression efficiency, at the cost of extensive computation. This paper tackles the intrinsic inefficiency of hardware-based MME and proposes a new hardware-efficient MME scheme that regulates the number of fractional estimations for merge mode candidates that include a vertically fractional MV. The proposed MME hardware organization features two different data paths, only one of which includes a vertical interpolation filter. The proposed scheme reduces the computational complexity of MME by regulating the peak computational complexity of the vertical interpolation filter. As a result, the proposed hardware organization saves the hardware resources needed for the vertical interpolation filter. In addition, the proposed hardware maintains high utilization by adopting an adaptive candidate allocation scheme that balances workloads between the two independent data paths. Consequently, the proposed MME hardware processes 62,106 coding tree units (CTUs) of 64 × 64 pixels per second with a clock frequency of 400 MHz and a gate count of 460.8 K, which corresponds to 23% fewer hardware resources and 17% higher throughput than the conventional MME hardware.
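
To put the reported throughput in perspective, the figures in the abstract can be turned into a cycle budget per CTU and an approximate frame rate. The following is a minimal sketch, not taken from the paper: the clock frequency, CTU rate, and CTU size come from the abstract, while the frame resolutions, the function names, and the assumption that every CTU of a frame passes through MME at the reported rate are illustrative assumptions.

```python
import math

# Figures stated in the abstract.
CLOCK_HZ = 400_000_000     # 400 MHz clock frequency
CTUS_PER_SECOND = 62_106   # reported MME throughput
CTU_SIZE = 64              # CTUs are 64 x 64 pixels


def cycles_per_ctu(clock_hz=CLOCK_HZ, ctus_per_second=CTUS_PER_SECOND):
    """Average number of clock cycles available for one CTU."""
    return clock_hz / ctus_per_second


def ctus_per_frame(width, height, ctu_size=CTU_SIZE):
    """Number of CTUs covering a frame; partial CTUs at the right and
    bottom edges count as whole CTUs (hence the ceiling)."""
    return math.ceil(width / ctu_size) * math.ceil(height / ctu_size)


def frames_per_second(width, height, ctus_per_second=CTUS_PER_SECOND):
    """Frame rate sustained if every CTU is processed at the reported rate
    (an assumption; the abstract does not state a target resolution)."""
    return ctus_per_second / ctus_per_frame(width, height)


if __name__ == "__main__":
    # Hypothetical example resolutions, chosen only for illustration.
    for name, (w, h) in {"1920x1080": (1920, 1080),
                         "3840x2160": (3840, 2160)}.items():
        print(name, ctus_per_frame(w, h), "CTUs/frame,",
              round(frames_per_second(w, h), 2), "frames/s")
    print(round(cycles_per_ctu(), 1), "cycles per CTU")
```

Under these assumptions the budget is roughly 6,440 clock cycles per 64 × 64 CTU, and a 3840 × 2160 frame, which needs 60 × 34 = 2,040 CTUs, would be processed at a little over 30 frames per second.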

Acknowledgements

This work was supported by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korean government (MOTIE: Ministry of Trade, Industry & Energy, HRD Program for Software-SoC convergence) (No. N0001883). This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2015R1C1A1A02037625).

Author information

Authors and Affiliations

Inter-university Semiconductor Research Center, Department of Electrical Engineering, Seoul National University, Seoul, South Korea
Tae Sung Kim & Hyuk-Jae Lee
Department of Information and Communication Engineering, Inha University, Incheon, South Korea
Chae Eun Rhee

Corresponding author

Correspondence to Chae Eun Rhee.

About this article

Cite this article

Kim, T.S., Rhee, C.E. & Lee, HJ. A Highly Utilized Hardware-Based Merge Mode Estimation with Candidate Level Parallel Execution for High-Efficiency Video Coding. J Sign Process Syst 90, 743–757 (2018). https://doi.org/10.1007/s11265-017-1268-0

Received: 07 February 2016
Revised: 14 December 2016
Accepted: 12 July 2017
Published: 22 July 2017
Issue Date: May 2018
DOI: https://doi.org/10.1007/s11265-017-1268-0
