Journal of Real-Time Image Processing, Volume 16, Issue 6, pp 2173–2187

Software pipelining with CGA and proposed intrinsics on a reconfigurable processor for HEVC decoders

  • Yong-Jo Ahn
  • Jonghun Yoo
  • Hyun-Ho Jo
  • Donggyu Sim
Original Research Paper

Abstract

This work proposes several intrinsic instructions for a reconfigurable processor intended for HEVC decoding, together with software pipelining algorithms that use a coarse-grained array (CGA) architecture and the proposed intrinsics. The software pipelining algorithms are developed for CGA acceleration of the inverse transform, pixel reconstruction, de-blocking filter, and sample adaptive offset modules. To enable efficient software pipelining, several very-long-instruction-word (VLIW)-based intrinsics are designed to maximize parallelization rather than computational acceleration. The HEVC decoder with the proposed intrinsics runs 2.3 times faster, measured in clock cycles, than a decoder that does not use them. In addition, the HEVC decoder with the CGA pipelining algorithms executes 10.9 times faster than the decoder without the CGA mode.

Keywords

HEVC; Reconfigurable processor; Software pipelining; Coarse-grained array; Intrinsic

Notes

Acknowledgements

This research was partly supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (NRF-2014R1A2A1A11052210), and by the Ministry of Science, ICT and Future Planning (MSIP), Korea, under the Information Technology Research Center (ITRC) support program (IITP-2017-2016-0-00288) supervised by the Institute for Information & Communications Technology Promotion (IITP).

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  • Yong-Jo Ahn (1)
  • Jonghun Yoo (1)
  • Hyun-Ho Jo (1)
  • Donggyu Sim (1), corresponding author
  1. Department of Computer Engineering, Kwangwoon University, Seoul, Korea
