Journal of Real-Time Image Processing, Volume 11, Issue 4, pp 633–644

High performance architecture for real-time HDTV broadcasting

  • Yasser Ismail (corresponding author)
  • Wael El-Medany
  • Hessa Al-Junaid
  • Ahmed Abdelgawad
Special Issue Paper


This paper presents a novel co-processor architecture for full search motion estimation. The architecture reuses search area data efficiently, minimizing memory I/O while keeping the hardware resources fully utilized. Its main components are a smart processing element (PE) and a simple, efficient internal memory. An efficient algorithm loads both the current block and the search area into the PE array: the search area data flow horizontally through the array while the current block data remain stationary. As a result, the co-processor improves on state-of-the-art techniques in both throughput and operating frequency. The local memory and PE design guarantees a simple, regular data flow. The local memory is implemented using only registers and a simple counter, which avoids complicated addressing for reads and writes. The proposed architecture is implemented using both FPGA and ASIC design flows. For a search range of 32 × 32 and a block size of 16 × 16, the architecture performs motion estimation for HDTV video at 30 fps when clocked at 350 MHz, and it outperforms many fast full search architectures.
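As a point of reference for what such a co-processor accelerates, the following is a minimal software sketch of full search block matching. It is not the authors' hardware design. It assumes the sum of absolute differences (SAD) as the matching cost, which the abstract does not state, and the names `sad`, `full_search`, `n`, and `r` are hypothetical. The displacement window `[-r, r)` is one possible reading of a 32 × 32 search range when `r = 16`.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n current block whose
    top-left corner is (bx, by) and the reference block displaced by (dx, dy).
    SAD is an assumed cost metric, not one confirmed by the abstract."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=16, r=16):
    """Test every displacement in [-r, r) x [-r, r) that keeps the candidate
    block inside the reference frame. Returns (best_cost, (dx, dy))."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-r, r):
        for dx in range(-r, r):
            # Skip candidates that would read outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            # Strict comparison keeps the first candidate found on ties.
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

With the defaults `n=16` and `r=16`, an interior block is compared against 1,024 candidate positions of 256 pixels each. Neighbouring candidates overlap heavily in the reference frame, which is why reusing search area data, as the abstract describes, reduces memory traffic.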


H.264/AVC, Motion estimation, Video coding



The authors acknowledge the financial support of the Deanship of Scientific Research, University of Bahrain, Bahrain, in finalizing this work.



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Yasser Ismail (1, corresponding author)
  • Wael El-Medany (1)
  • Hessa Al-Junaid (1)
  • Ahmed Abdelgawad (2)
  1. University of Bahrain, Sakhair, Bahrain
  2. Central Michigan University, Mt Pleasant, USA
