Abstract
Implementing video coding systems such as H.264/AVC and AVS on multi-core and many-core platforms is attracting much attention. Slice-level parallelism is popular in parallel video coding for its simplicity and flexibility. However, video quality degrades considerably because partitioning a frame into slices breaks the dependencies between macro-blocks, especially on multi-core or many-core platforms. To address this problem, we propose a Macro-Block Group (MBG) parallel scheme for parallel AVS coding. In the proposed scheme, video frames are divided equally into rectangular MBG regions; each MBG consists of more rows and fewer columns of macro-blocks than a slice in the slice-level scheme. Because MBG is not syntactically supported by AVS, a vertical partitioning scheme is introduced. Additionally, we use mode confining and motion vector difference adjusting techniques to remain consistent with the standard. Two MBG parallel schemes (5 × 9 MBG partition and 8 × 7 MBG partition) are developed on a TILE64 many-core platform, where P/B frames use the MBG parallel scheme and I frames use macro-block-level parallelism. Experimental results show that the 5 × 9 MBG partition reduces quality loss by 52% (IPPP) and 41% (IBBP) while keeping the same speed-up as slice-level parallelism. With more cores employed, the 8 × 7 MBG partition achieves a 23.9-fold speed-up over the single-core implementation and coding performance similar to that of the 5 × 9 scheme.
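The geometry behind the MBG idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the frame size of 68 × 120 macro-blocks (roughly a 1080p frame with 16 × 16 macro-blocks), and the reading of "5 × 9" as 5 rows by 9 columns of regions are all assumptions made for illustration. The count of region-boundary edges is only a rough geometric proxy for broken macro-block dependencies, not the quality metric used in the paper.

```python
from typing import List, Tuple

# (row_start, row_end, col_start, col_end) in macro-block units, end-exclusive.
Region = Tuple[int, int, int, int]


def split_evenly(total: int, parts: int) -> List[Tuple[int, int]]:
    """Split range(total) into `parts` contiguous spans whose sizes differ by at most 1."""
    if not 0 < parts <= total:
        raise ValueError("parts must be between 1 and total")
    base, extra = divmod(total, parts)
    spans, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # first `extra` spans get one more
        spans.append((start, start + size))
        start += size
    return spans


def mbg_partition(mb_rows: int, mb_cols: int,
                  grid_rows: int, grid_cols: int) -> List[Region]:
    """Divide an mb_rows x mb_cols frame into a grid of rectangular regions."""
    return [
        (r0, r1, c0, c1)
        for r0, r1 in split_evenly(mb_rows, grid_rows)
        for c0, c1 in split_evenly(mb_cols, grid_cols)
    ]


def internal_boundary_edges(mb_rows: int, mb_cols: int,
                            grid_rows: int, grid_cols: int) -> int:
    """Count macro-block edges lying on boundaries between regions."""
    horizontal = (grid_rows - 1) * mb_cols  # cuts between vertically adjacent regions
    vertical = (grid_cols - 1) * mb_rows    # cuts between horizontally adjacent regions
    return horizontal + vertical


if __name__ == "__main__":
    MB_ROWS, MB_COLS = 68, 120  # assumed frame size in macro-blocks

    regions = mbg_partition(MB_ROWS, MB_COLS, 5, 9)  # assumed 5 rows x 9 columns
    print("regions:", len(regions), "first:", regions[0])

    # 45 full-width slices correspond to a 45 x 1 grid of the same frame.
    print("slice boundary edges:", internal_boundary_edges(MB_ROWS, MB_COLS, 45, 1))
    print("MBG boundary edges:  ", internal_boundary_edges(MB_ROWS, MB_COLS, 5, 9))
```

Under these assumed dimensions, 45 full-width slices cut across 5280 macro-block edges, while the 5 × 9 grid cuts across 1024. This is one way to see why compact rectangular regions disturb fewer neighbouring macro-blocks than thin slices do.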
Notes
The AVS1-P2 Jizhun, Zengjiang and Jiaqiang profiles support only slices made up of complete MB lines, whereas the Shenzhan profile supports rectangular slices, which can be used directly to realize the proposed MBG.
This extra core is not taken into account when calculating the acceleration efficiency in the following sections.
Acknowledgements
This work was partially supported by the National Basic Research Program of China (973 Program) under contract No. 2009CB320902, the National Natural Science Foundation of China under contract No. 60832004, and the Beijing Municipal Natural Science Foundation under contract No. 4102025. We would especially like to express our deep gratitude to our colleagues Kaijin Wei and Qian Huang. This paper would be incomplete without their tremendous contributions.
About this article
Cite this article
Wang, Z., Liang, L., Yang, G. et al. A Novel Macro-Block Group Based AVS Coding Scheme for Many-Core Processor. J Sign Process Syst 65, 129–145 (2011). https://doi.org/10.1007/s11265-010-0543-0
DOI: https://doi.org/10.1007/s11265-010-0543-0