Abstract
Approximate computing techniques exploit the characteristics of error-tolerant applications either to provide faster implementations of their computational structures or to achieve substantial improvements in energy efficiency. In video encoding, the motion estimation (ME) stage, which comprises the Integer ME (IME) and Fractional ME (FME) steps, is the most computationally intensive task and is highly resilient to controlled losses of accuracy. Accordingly, this article proposes exploiting approximate computing techniques to implement energy-efficient dedicated hardware structures targeting the motion estimation stage of current video encoders. The designed ME architecture supports IME and FME and can process 4K UHD videos (3840 × 2160 pixels) at 30 frames per second in real time while dissipating 108.92 mW. When running at its maximum operating frequency, the architecture can process 8K UHD videos (7680 × 4320 pixels) at 120 frames per second. The solution described in this article presents the highest throughput and the highest energy efficiency among all the state-of-the-art works compared, showing that approximate computing is a promising approach for implementing video encoders in dedicated hardware.
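To put the two operating points quoted in the abstract in perspective, the short Python sketch below derives the pixel rates they imply and the energy spent per frame in the 4K case. It uses only the figures stated above; the function and variable names are illustrative and do not come from the article. The 108.92 mW figure is reported for the 4K at 30 frames per second case only, so no energy estimate is attempted for the 8K operating point.

```python
# Figures quoted in the abstract; all names below are illustrative.
UHD_4K = (3840, 2160)    # width, height in pixels
UHD_8K = (7680, 4320)
POWER_4K30_MW = 108.92   # dissipation reported for 4K at 30 frames per second


def pixel_rate(resolution, fps):
    """Pixels that must be processed per second at a given frame rate."""
    width, height = resolution
    return width * height * fps


def energy_per_frame_mj(power_mw, fps):
    """Average energy per frame: mW divided by frames/s gives mJ/frame."""
    return power_mw / fps


rate_4k30 = pixel_rate(UHD_4K, 30)     # real-time target
rate_8k120 = pixel_rate(UHD_8K, 120)   # maximum-frequency operating point
ratio = rate_8k120 / rate_4k30
energy_4k30 = energy_per_frame_mj(POWER_4K30_MW, 30)

print(f"4K at 30 fps:  {rate_4k30:,} pixels/s")
print(f"8K at 120 fps: {rate_8k120:,} pixels/s ({ratio:.0f}x the 4K rate)")
print(f"Energy per 4K frame: {energy_4k30:.2f} mJ")
```

The 8K operating point has four times the pixels per frame and four times the frame rate, which is why it corresponds to a sixteenfold increase in pixel rate over the real-time 4K target.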
Acknowledgements
This work is partly financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior–Brasil (CAPES) Finance Code 001, by FCT projects PTDC/EEI-HAC/30485/2017 and UIDB/50021/2020, and also by CNPq and FAPERGS Brazilian research support agencies.
About this article
Cite this article
Porto, R., Perleberg, M., Afonso, V. et al. Fast and energy-efficient approximate motion estimation architecture for real-time 4 K UHD processing. J Real-Time Image Proc 18, 723–737 (2021). https://doi.org/10.1007/s11554-020-01014-6
DOI: https://doi.org/10.1007/s11554-020-01014-6