Three-level pipelined multi-resolution integer motion estimation engine with optimized reference data sharing search for AVS

  • Xiaofeng Huang
  • Kaijin Wei
  • Haibing Yin
  • Chuang Zhu
  • Huizhu Jia
  • Don Xie
Special Issue Paper

Abstract

Integer motion estimation (IME), a key component of a video encoder, removes temporal redundancy by searching for the best integer motion vectors for the dynamic partition blocks in a macroblock (MB). Huge memory bandwidth requirements and heavy computational demands are the two key bottlenecks in IME engine design, especially for large search windows (SW). In this paper, a three-level pipelined VLSI architecture is proposed that efficiently integrates reference data sharing search (RDSS) into a multi-resolution motion estimation algorithm (MMEA). First, a hardware-friendly MMEA is mapped onto a three-level pipelined architecture with negligible loss in coding quality. Second, sub-sampled RDSS coupled with the Level C+ data reuse scheme is adopted to reduce on-chip memory and bandwidth at the coarsest and middle levels. At the finest level, data sharing between IME and fractional motion estimation (FME) is achieved by loading only a local predictive SW. Finally, the three levels are parallelized and pipelined to guarantee both the gradual refinement of MMEA and high hardware utilization. Experimental results show that the proposed architecture strikes a good balance among complexity, on-chip memory, bandwidth, and data flow regularity. Only 320 processing elements (PEs) within 550 cycles are required for the IME search with a 256 × 256 SW. The architecture achieves real-time 1080P@30 fps processing at a working frequency of 134.6 MHz, with 135 K gates and 8.93 KB of on-chip memory.
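
The quoted cycle budget and working frequency can be cross-checked with a short calculation. The sketch below rests on assumptions that the abstract does not state: that the 550 cycles are a per-MB budget, that an MB is 16 × 16 luma samples, and that the 1080-line frame is padded to a whole number of MB rows (1088 lines). The function names are illustrative only.

```python
import math

MB_SIZE = 16  # assumed macroblock size in luma samples (16 x 16)


def macroblocks_per_frame(width, height, mb_size=MB_SIZE):
    """Number of MBs per frame, padding each dimension up to a multiple of mb_size."""
    return math.ceil(width / mb_size) * math.ceil(height / mb_size)


def required_frequency_hz(width, height, fps, cycles_per_mb):
    """Clock rate needed if every MB must finish within cycles_per_mb cycles."""
    return macroblocks_per_frame(width, height) * fps * cycles_per_mb


# 1080P@30 fps with the 550-cycle budget quoted in the abstract
mbs = macroblocks_per_frame(1920, 1080)            # 120 columns * 68 rows of MBs
freq = required_frequency_hz(1920, 1080, 30, 550)  # cycles per second
print(f"{mbs} MBs/frame, {freq / 1e6:.2f} MHz")
```

Under these assumptions the result is 8160 MBs per frame and about 134.64 MHz, which is consistent with the 134.6 MHz working frequency reported above.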

Keywords

  • Integer motion estimation
  • Reference data sharing
  • Multi-resolution motion estimation
  • Bandwidth
  • Computational complexity

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Xiaofeng Huang (1)
  • Kaijin Wei (2)
  • Haibing Yin (1)
  • Chuang Zhu (3)
  • Huizhu Jia (2)
  • Don Xie (2)

  1. Hangzhou Dianzi University, Zhejiang, China
  2. National Engineering Laboratory for Video Technology, Peking University, Beijing, China
  3. Beijing University of Posts and Telecommunications, Beijing, China
