Journal of Signal Processing Systems, Volume 75, Issue 1, pp 65–84

Architectural Decomposition of Video Decoders by Means of an Intermediate Data Stream Format

  • Henryk Richter
  • Benno Stabernack
  • Volker Kühn


The microprocessor industry's trend towards many-core architectures has made it necessary to devise appropriately scalable applications. In software-based video decoding, the main challenges are the optimized partitioning of decoder operations, efficient tracking of dependencies, and resource synchronization across multiple parallel units. The same applies to hardware implementations of video decoders, where monolithic approaches hinder scalability of the design and reusability of already implemented core components.

In this paper, we propose an intermediate data stream format (Meta Format Stream) suited to the architectural decomposition of video decoding: it replaces the conventional monolithic decoder architecture with a pipelined structure. The Meta Format is forward-oriented, self-contained, and multistandard-capable, so that processing of Meta Streams is independent of the originating bit stream. Our approach does not require special coding settings and is applicable to accelerated decoding of any standards-compliant bit stream. An H.264/AVC multiprocessing proposal is presented as a case study of the potential of our concept. The case study combines coarse-grained frame-level parallel decoding of the bit stream with fine-grained macroblock-level parallelism in the image processing stage.

The proposed H.264 decoder achieved speedup factors of up to 7.6 on an 8-core machine with 2-way SMT. We report actual decoding speeds of up to 150 frames per second at 2160p resolution.
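The abstract does not say how macroblock-level parallelism is scheduled. As a rough illustration only, the sketch below uses the 2D wavefront ordering that is common in the H.264 parallelization literature, which is not necessarily the authors' scheme. It assumes each macroblock depends on its left, top-left, top, and top-right neighbours; the function name `wavefront_schedule` and the grid parameters are hypothetical.

```python
from collections import defaultdict


def wavefront_schedule(mb_width, mb_height):
    """Group macroblocks (x, y) into steps that could run in parallel.

    Assumption (not taken from the paper): a macroblock needs its left,
    top-left, top and top-right neighbours to be finished first. Under
    that assumption the earliest possible step for (x, y) is x + 2 * y,
    because the top-right neighbour (x + 1, y - 1) finishes one step earlier.
    """
    steps = defaultdict(list)
    for y in range(mb_height):
        for x in range(mb_width):
            steps[x + 2 * y].append((x, y))
    # Return the groups in execution order; every group is dependency-free
    # with respect to itself and only relies on earlier groups.
    return [steps[s] for s in sorted(steps)]


if __name__ == "__main__":
    # Small 4x3 macroblock grid, printed step by step.
    for step, group in enumerate(wavefront_schedule(4, 3)):
        print(step, group)
```

Under these assumptions, a 240×135 macroblock grid (3840×2160 luma samples with 16×16 macroblocks) needs 508 steps and never offers more than 120 independent macroblocks at once, so the available parallelism ramps up and down over each frame rather than staying constant.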


Keywords: Video CODEC, Parallelization, Decompression, H.264/AVC, Multi-core



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Henryk Richter (1)
  • Benno Stabernack (2)
  • Volker Kühn (1)
  1. Institute of Communications Engineering, University of Rostock, Rostock, Germany
  2. Department of Image Processing, Fraunhofer Institute for Telecommunications, Berlin, Germany
