Joint learning of visual and spatial features for edit propagation from a single image


In this paper, we treat edit propagation as a multi-class classification problem and solve it with a deep neural network (DNN). We design a shallow, fully convolutional DNN that can be trained end-to-end. As input, the DNN takes combinations of low-level visual features, extracted from the input image, and spatial features, computed by transforming the user interactions, so that it learns visual and spatial features jointly and efficiently. We train the DNN on many such combinations to build a pixel-level classifier. The DNN is trained patch by patch and estimates the whole image at once, which speeds up both learning and inference. Finally, we improve the classification accuracy of the DNN with a fully connected conditional random field. Experimental results show that our method responds well to user interactions and generates precise results in comparison with state-of-the-art edit propagation approaches. Furthermore, we demonstrate our method on various applications.
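To make the idea of combining visual and spatial features concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the low-level visual features are simply normalized RGB values and that the "transform" of user interactions is a brute-force Euclidean distance from each pixel to the nearest stroke of each class; the abstract specifies neither choice. The function names `spatial_features`, `joint_features`, and `training_samples` are hypothetical.

```python
from math import hypot


def spatial_features(height, width, strokes):
    """Per pixel, the normalized distance to the nearest stroke pixel of each label.

    strokes maps a class label to a list of (row, col) stroke pixels.
    ASSUMPTION: a plain Euclidean distance stands in for the paper's transform.
    """
    labels = sorted(strokes)
    diag = hypot(height - 1, width - 1) or 1.0  # avoid dividing by zero on 1x1 images
    return [
        [
            [min(hypot(r - sr, c - sc) for sr, sc in strokes[k]) / diag for k in labels]
            for c in range(width)
        ]
        for r in range(height)
    ]


def joint_features(image, strokes):
    """Concatenate visual and spatial features for every pixel.

    image is an H x W nested list of (r, g, b) tuples with values in [0, 255].
    ASSUMPTION: normalized RGB stands in for the low-level visual features.
    """
    height, width = len(image), len(image[0])
    spatial = spatial_features(height, width, strokes)
    return [
        [[ch / 255.0 for ch in image[r][c]] + spatial[r][c] for c in range(width)]
        for r in range(height)
    ]


def training_samples(features, strokes):
    """Pair the feature vector of every stroke pixel with its class label."""
    return [
        (features[r][c], label)
        for label in sorted(strokes)
        for (r, c) in strokes[label]
    ]


# A 2 x 3 toy image with one stroke pixel for each of two edit classes.
image = [
    [(255, 0, 0), (250, 5, 5), (10, 10, 240)],
    [(245, 10, 0), (20, 0, 230), (0, 0, 255)],
]
strokes = {0: [(0, 0)], 1: [(1, 2)]}
features = joint_features(image, strokes)
samples = training_samples(features, strokes)
```

Each pixel ends up with a vector of three visual values followed by one spatial value per stroke class, and only the stroke pixels carry labels for training. A classifier fitted on `samples` would then be applied to every vector in `features` to label the whole image.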






We would like to thank Prof. Yiyu Cai and Dr. Zhifeng Xie for proofreading our paper. We would also like to thank the reviewers for their valuable comments. This study was funded by the National Natural Science Foundations of P. R. China (Grant Nos. 61402053; 61602059; 61772087; 61802031) and the Scientific Research Fund of Education Department of Hunan Province (Grant Nos. 16C0046; 16A008).

Author information



Corresponding author

Correspondence to Yan Gui.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.



Cite this article

Gui, Y., Zeng, G. Joint learning of visual and spatial features for edit propagation from a single image. Vis Comput 36, 469–482 (2020).



Keywords

  • Image editing
  • Edit propagation
  • Deep neural network
  • Fully connected conditional random field