Real-time UHD video super-resolution and transcoding on heterogeneous hardware

  • Special Issue Paper
Journal of Real-Time Image Processing

Abstract

Video has become the dominant type of data produced and consumed every day. As screens grow larger, ultra-high-definition (UHD) video is becoming more popular because it provides a better visual experience. However, video content at UHD resolution is still scarce. High-performance video super-resolution (SR) techniques, which obtain high-resolution (HR) video from low-resolution (LR) sources, have recently been used in UHD video production. Deep learning (DL)-based SR methods can provide HR videos of appreciable objective and subjective quality, but their massive computational complexity makes processing far slower than real time, even on GPU servers, when producing UHD video. Moreover, transcoding and other video processing algorithms executed during enhancement are also time- and resource-consuming, and run relatively slowly on ordinary CPU and GPU servers. Hardware such as GPUs, field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs) has proven highly capable on image and video processing tasks in different respects, and there are also dedicated hardware accelerators for specific video processing tasks. In this paper, we focus on accelerating a UHD video enhancement workflow on a heterogeneous system with multiple hardware accelerators. First, we optimize the most time-consuming task, video SR, with the cuDNN and CUDA libraries to achieve real-time processing speed for a single UHD output frame on an ordinary GPU. Second, we design a GPU-friendly multi-thread scheduling algorithm for data and computation to better utilize GPU resources and achieve real-time performance when outputting UHD video clips. Third, targeting production environments, we build a UHD video enhancement application on selected heterogeneous hardware, with an integrated command-line tool for our proposed algorithm, and achieve a real-time end-to-end processing speed of 60 fps. Experiments show the high efficiency, robustness and compatibility of our approach.
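
The abstract does not spell out the scheduling algorithm, so the following is only a minimal sketch of the general idea of overlapping pipeline stages with threads and bounded queues. It is not the authors' method. All names (`run_pipeline`, `fake_sr`, `fake_encode`), the queue depth, and the 2x scale factor are hypothetical choices made for illustration. The only figure taken from the abstract is the 60 fps target, which leaves roughly 16.7 ms per output frame.

```python
import queue
import threading

_END = object()  # unique end-of-stream marker (hypothetical helper)


def frame_budget_ms(fps):
    """Time available per output frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def _stage(fn, inbox, outbox):
    """Apply fn to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _END:
            outbox.put(_END)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(frames, stages, depth=4):
    """Run frames through the stages, one thread per stage, preserving order.

    Bounded queues (maxsize=depth) keep a fast stage from running arbitrarily
    far ahead of a slow one. The depth of 4 is an arbitrary assumption.
    """
    queues = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=_stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    results = []

    def collect():
        # Drain the last queue so upstream stages never block forever.
        while True:
            item = queues[-1].get()
            if item is _END:
                return
            results.append(item)

    collector = threading.Thread(target=collect)
    for t in workers:
        t.start()
    collector.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(_END)
    for t in workers:
        t.join()
    collector.join()
    return results


def fake_sr(frame):
    """Placeholder for super-resolution: doubles the frame dimensions (assumed 2x)."""
    index, width, height = frame
    return (index, width * 2, height * 2)


def fake_encode(frame):
    """Placeholder for encoding: returns a text label instead of a bitstream."""
    index, width, height = frame
    return f"{index}:{width}x{height}"


if __name__ == "__main__":
    clip = [(i, 1920, 1080) for i in range(3)]
    print(run_pipeline(clip, [fake_sr, fake_encode]))
    print(f"budget at 60 fps: {frame_budget_ms(60):.2f} ms per frame")
```

Because each stage is a single thread reading from a FIFO queue, frames leave the pipeline in the order they entered. A real system would replace the placeholder functions with actual decode, SR and encode calls and would need to measure each stage against the per-frame budget.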

Acknowledgements

This work was supported by NSFC (61521062, U1611461, 61671296), MoE-China Mobile Research Fund Project (MCM20180702), the 111 Project (B07022 and Sheitc No. 150633) and the Shanghai Key Laboratory of Digital Media Processing and Transmissions.

Author information

Corresponding author

Correspondence to Li Song.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dong, Y., Song, L., Xie, R. et al. Real-time UHD video super-resolution and transcoding on heterogeneous hardware. J Real-Time Image Proc 17, 2029–2045 (2020). https://doi.org/10.1007/s11554-019-00913-7

  • DOI: https://doi.org/10.1007/s11554-019-00913-7
