CrossNet: An End-to-End Reference-Based Super Resolution Network Using Cross-Scale Warping

  • Haitian Zheng
  • Mengqi Ji
  • Haoqian Wang
  • Yebin Liu
  • Lu FangEmail author
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11210)


The Reference-based Super-resolution (RefSR) super-resolves a low-resolution (LR) image given an external high-resolution (HR) reference image, where the reference image and LR image share similar viewpoint but with significant resolution gap (\(8{\times }\)). Existing RefSR methods work in a cascaded way such as patch matching followed by synthesis pipeline with two independently defined objective functions, leading to the inter-patch misalignment, grid effect and inefficient optimization. To resolve these issues, we present CrossNet, an end-to-end and fully-convolutional deep neural network using cross-scale warping. Our network contains image encoders, cross-scale warping layers, and fusion decoder: the encoder serves to extract multi-scale features from both the LR and the reference images; the cross-scale warping layers spatially aligns the reference feature map with the LR feature map; the decoder finally aggregates feature maps from both domains to synthesize the HR output. Using cross-scale warping, our network is able to perform spatial alignment at pixel-level in an end-to-end fashion, which improves the existing schemes [1, 2] both in precision (around 2 dB–4 dB) and efficiency (more than 100 times faster).


Reference-based Super Resolution Light field imaging Image synthesis Encoder-decoder Optical flow 

Supplementary material

474211_1_En_6_MOESM1_ESM.pdf (23.5 mb)
Supplementary material 1 (pdf 24100 KB)


  1. 1.
    Boominathan, V., Mitra, K., Veeraraghavan, A.: Improving resolution and depth-of-field of light field cameras using a hybrid imaging system. In: ICCP, pp. 1–10. IEEE (2014)Google Scholar
  2. 2.
    Zheng, H., Ji, M., Wang, H., Liu, Y., Fang, L.: Learning cross-scale correspondence and patch-based synthesis for reference-based super-resolution. In: BMVC (2017)Google Scholar
  3. 3.
    Wang, Y., Liu, Y., Heidrich, W., Dai, Q.: The light field attachment: turning a DSLR into a light field camera using a low budget camera ring. IEEE Trans. Vis. Comput. Graph. (2016)Google Scholar
  4. 4.
    Yuan, X., Lu, F., Dai, Q., Brady, D., Yebin, L.: Multiscale gigapixel video: a cross resolution image matching and warping approach. In: IEEE International Conference on Computational Photography (2017)Google Scholar
  5. 5.
    Ledig, C., et al.: Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802 (2016)
  6. 6.
    Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. In: Advances in Neural Information Processing Systems, pp. 2017–2025 (2015)Google Scholar
  7. 7.
    Li, X., Orchard, M.T.: New edge-directed interpolation. IEEE Trans. Image Process. 10(10), 1521–1527 (2001)CrossRefGoogle Scholar
  8. 8.
    Zhang, L., Wu, X.: An edge-guided image interpolation algorithm via directional filtering and data fusion. IEEE Trans. Image Process. 15(8), 2226–2238 (2006)CrossRefGoogle Scholar
  9. 9.
    Tai, Y.W., Liu, S., Brown, M.S., Lin, S.: Super resolution using edge prior and single image detail synthesis. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2400–2407. IEEE (2010)Google Scholar
  10. 10.
    Babacan, S.D., Molina, R., Katsaggelos, A.K.: Total variation super resolution using a variational approach. In: 2008 15th IEEE International Conference on Image Processing, ICIP 2008, pp. 641–644. IEEE (2008)Google Scholar
  11. 11.
    Krishnan, D., Fergus, R.: Fast image deconvolution using hyper-Laplacian priors. In: Advances in Neural Information Processing Systems, pp. 1033–1041 (2009)Google Scholar
  12. 12.
    Yang, J., Wright, J., Huang, T., Ma, Y.: Image super-resolution as sparse representation of raw image patches. In: CVPR. IEEE (2008) 1–8Google Scholar
  13. 13.
    Yang, J., Wright, J., Huang, T.S., Ma, Y.: Image super-resolution via sparse representation. IEEE Trans. Image Process. 19(11), 2861–2873 (2010)MathSciNetCrossRefGoogle Scholar
  14. 14.
    Kim, K.I., Kwon, Y.: Single-image super-resolution using sparse regression and natural image prior. TPAMI 32(6), 1127–1133 (2010)CrossRefGoogle Scholar
  15. 15.
    Yang, J., Wang, Z., Lin, Z., Cohen, S., Huang, T.: Coupled dictionary training for image super-resolution. IEEE Trans. Image Process. 21(8), 3467–3478 (2012)MathSciNetCrossRefGoogle Scholar
  16. 16.
    Glasner, D., Bagon, S., Irani, M.: Super-resolution from a single image. In: ICCV, pp. 349–356. IEEE (2009)Google Scholar
  17. 17.
    Freeman, W.T., Jones, T.R., Pasztor, E.C.: Example-based super-resolution. IEEE Comput. Graph. Appl. 22(2), 56–65 (2002)CrossRefGoogle Scholar
  18. 18.
    Chang, H., Yeung, D.Y., Xiong, Y.: Super-resolution through neighbor embedding. In: CVPR, vol. 1, p. I. IEEE (2004)Google Scholar
  19. 19.
    Buades, A., Coll, B., Morel, J.M.: A non-local algorithm for image denoising. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 2, pp. 60–65. IEEE (2005)Google Scholar
  20. 20.
    Salvador, J., Pérez-Pellitero, E.: Naive bayes super-resolution forest. In: ICCV, pp. 325–333 (2015)Google Scholar
  21. 21.
    Schulter, S., Leistner, C., Bischof, H.: Fast and accurate image upscaling with super-resolution forests. In: CVPR, pp. 3791–3799 (2015)Google Scholar
  22. 22.
    Yang, C.Y., Yang, M.H.: Fast direct super-resolution by simple functions. In: ICCV, pp. 561–568 (2013)Google Scholar
  23. 23.
    Yang, J., Lin, Z., Cohen, S.: Fast image super-resolution based on in-place example regression. In: CVPR, pp. 1059–1066 (2013)Google Scholar
  24. 24.
    He, H., Siu, W.C.: Single image super-resolution using Gaussian process regression. In: CVPR, pp. 449–456. IEEE (2011)Google Scholar
  25. 25.
    Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 184–199. Springer, Cham (2014). Scholar
  26. 26.
    Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 391–407. Springer, Cham (2016). Scholar
  27. 27.
    Shi, W., et al.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: CVPR, pp. 1874–1883 (2016)Google Scholar
  28. 28.
    Kim, J., Kwon Lee, J., Mu Lee, K.: Accurate image super-resolution using very deep convolutional networks. In: CVPR, pp. 1646–1654 (2016)Google Scholar
  29. 29.
    Lai, W.S., Huang, J.B., Ahuja, N., Yang, M.H.: Deep Laplacian pyramid networks for fast and accurate super-resolution. In: CVPR, pp. 624–632 (2017)Google Scholar
  30. 30.
    Lim, B., Son, S., Kim, H., Nah, S., Lee, K.M.: Enhanced deep residual networks for single image super-resolution. In: CVPRW, vol. 1, p. 3 (2017)Google Scholar
  31. 31.
    Wanner, S., Goldluecke, B.: Spatial and angular variational super-resolution of 4D light fields. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7576, pp. 608–621. Springer, Heidelberg (2012). Scholar
  32. 32.
    Mitra, K., Veeraraghavan, A.: Light field denoising, light field superresolution and stereo camera based refocussing using a GMM light field patch prior. In: 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 22–28. IEEE (2012)Google Scholar
  33. 33.
    Wu, J., Wang, H., Wang, X., Zhang, Y.: A novel light field super-resolution framework based on hybrid imaging system. In: 2015 Visual Communications and Image Processing (VCIP), pp. 1–4. IEEE (2015)Google Scholar
  34. 34.
    Zheng, H., Guo, M., Wang, H., Liu, Y., Fang, L.: Combining exemplar-based approach and learning-based approach for light field super-resolution using a hybrid imaging system. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2481–2486 (2017)Google Scholar
  35. 35.
    Kalantari, N.K., Wang, T.C., Ramamoorthi, R.: Learning-based view synthesis for light field cameras. ACM Trans. Graph. (TOG) 35(6), 193 (2016)CrossRefGoogle Scholar
  36. 36.
    Ji, D., Kwon, J., McFarland, M., Savarese, S.: Deep view morphing. Technical report (2017)Google Scholar
  37. 37.
    Xue, T., Chen, B., Wu, J., Wei, D., Freeman, W.T.: Video enhancement with task-oriented flow. arXiv preprint arXiv:1711.09078 (2017)
  38. 38.
    Sajjadi, M.S., Vemulapalli, R., Brown, M.: Frame-recurrent video super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6626–6634 (2018)Google Scholar
  39. 39.
    Tao, X., Gao, H., Liao, R., Wang, J., Jia, J.: Detail-revealing deep video super-resolutionGoogle Scholar
  40. 40.
    Liu, Z., Yeh, R., Tang, X., Liu, Y., Agarwala, A.: Video frame synthesis using deep voxel flow. In: ICCV, vol. 2 (2017)Google Scholar
  41. 41.
    Wang, T.C., Zhu, J.Y., Kalantari, N.K., Efros, A.A., Ramamoorthi, R.: Light field video capture using a learning-based hybrid imaging system. ACM Trans. Graph. (TOG) 36(4), 133 (2017)Google Scholar
  42. 42.
    Fischer, P., et al.: FlowNet: learning optical flow with convolutional networks. arXiv preprint arXiv:1504.06852 (2015)
  43. 43.
    Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). Scholar
  44. 44.
    Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp. 807–814 (2010)Google Scholar
  45. 45.
    Bruhn, A., Weickert, J., Schnörr, C.: Lucas/Kanade meets Horn/Schunck: combining local and global optic flow methods. IJCV 61(3), 211–231 (2005)CrossRefGoogle Scholar
  46. 46.
    Srinivasan, P.P., Wang, T., Sreelal, A., Ramamoorthi, R., Ng, R.: Learning to synthesize a 4D RGBD light field from a single image. In: ICCV, vol. 2, p. 6 (2017)Google Scholar
  47. 47.
    The (new) stanford light field archive.
  48. 48.
    Kim, C., Zimmer, H., Pritch, Y., Sorkine-Hornung, A., Gross, M.H.: Scene reconstruction from high spatio-angular resolution light fields. ACM TOG (2013)Google Scholar
  49. 49.
    Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  50. 50.
    Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)CrossRefGoogle Scholar
  51. 51.
    Sheikh, H.R., Bovik, A.C., De Veciana, G.: An information fidelity criterion for image quality assessment using natural scene statistics. IEEE Trans. Image Process. 14(12), 2117–2128 (2005)CrossRefGoogle Scholar
  52. 52.
    Bergstra, J., et al.: Theano: a CPU and GPU math compiler in PythonGoogle Scholar

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Haitian Zheng
    • 1
  • Mengqi Ji
    • 1
    • 2
  • Haoqian Wang
    • 1
  • Yebin Liu
    • 3
  • Lu Fang
    • 1
    Email author
  1. 1.Tsinghua-Berkeley Shenzhen InstituteTsinghua UniversityBeijingChina
  2. 2.Hong Kong University of Science and TechnologyClear Water BayHong Kong
  3. 3.Department of AutomationTsinghua UniversityBeijingChina

Personalised recommendations