Deep Kalman Filtering Network for Video Compression Artifact Reduction

  • Guo Lu
  • Wanli Ouyang
  • Dong Xu
  • Xiaoyun Zhang
  • Zhiyong Gao
  • Ming-Ting Sun
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11218)


When lossy video compression algorithms are applied, compression artifacts often appear in videos, making the decoded videos unpleasant to the human visual system. In this paper, we model the video artifact reduction task as a Kalman filtering procedure and restore decoded frames through a deep Kalman filtering network. Unlike existing works, which use the noisy previous decoded frames as temporal information in the restoration problem, we utilize the less noisy previous restored frame and build a recursive filtering scheme based on the Kalman model. This strategy provides more accurate and consistent temporal information, which produces higher quality restoration results. In addition, the strong prior information of the prediction residual is also exploited for restoration through a well-designed neural network. These two components are combined under the Kalman framework and optimized through the deep Kalman filtering network. Our approach bridges the gap between model-based methods and learning-based methods by integrating the recursive nature of the Kalman model with the highly non-linear transformation ability of deep neural networks. Experimental results on the benchmark dataset demonstrate the effectiveness of our proposed method.
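For context, the classical Kalman recursion that the paper's filtering scheme builds on (Kalman, 1960) alternates a predict step, which propagates the previous estimate, with an update step, which corrects it using the new noisy observation. The scalar sketch below illustrates only this generic predict/update structure, not the authors' network; the coefficients and noise variances are illustrative assumptions.

```python
import random

def kalman_step(x_prev, p_prev, z, a=1.0, h=1.0, q=1e-3, r=1e-1):
    """One scalar Kalman predict/update step.

    x_prev, p_prev: previous state estimate and its error variance
    z: new noisy measurement
    a, h: state-transition and observation coefficients
    q, r: process and measurement noise variances (assumed values)
    """
    # Predict: propagate the previous estimate through the state model
    x_pred = a * x_prev
    p_pred = a * p_prev * a + q
    # Update: blend prediction and measurement via the Kalman gain
    k = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + k * (z - h * x_pred)
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new

# Toy example: recursively denoise a constant signal observed with noise.
random.seed(0)
true_val = 5.0
x, p = 0.0, 1.0  # initial estimate and variance
for _ in range(200):
    z = true_val + random.gauss(0.0, 0.3)
    x, p = kalman_step(x, p, z)
print(x, p)  # estimate converges near the true value; variance shrinks
```

The key property the paper exploits is the recursion itself: each step filters the *current* noisy input against the *previously filtered* (less noisy) estimate, rather than against raw past observations.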


Keywords: Compression artifact reduction · Deep neural network · Kalman model · Recursive filtering · Video restoration



This work was supported by a research project from SenseTime. This work was also supported in part by the National Natural Science Foundation of China (61771306, 61521062), the Natural Science Foundation of Shanghai (18ZR1418100), the Chinese National Key S&T Special Program (2013ZX01033001-002-002), STCSM Grant 17DZ1205602, and the Shanghai Key Laboratory of Digital Media Processing and Transmissions (STCSM 18DZ2270700).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Guo Lu (1)
  • Wanli Ouyang (2, 3)
  • Dong Xu (2)
  • Xiaoyun Zhang (1)
  • Zhiyong Gao (1)
  • Ming-Ting Sun (4)

  1. Shanghai Jiao Tong University, Shanghai, China
  2. The University of Sydney, Sydney, Australia
  3. SenseTime Computer Vision Research Group, The University of Sydney, Sydney, Australia
  4. University of Washington, Seattle, USA
