Rethinking Image Inpainting via a Mutual Encoder-Decoder with Feature Equalizations

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12347)


Deep encoder-decoder based CNNs have advanced image inpainting methods for hole filling. While existing methods recover structures and textures step-by-step in the hole regions, they typically use two encoder-decoders for separate recovery. The CNN features of each encoder are learned to capture either missing structures or textures without considering them as a whole. The insufficient utilization of these encoder features hampers the performance of recovering both structures and textures. In this paper, we propose a mutual encoder-decoder CNN for joint recovery of both. We use CNN features from the deep and shallow layers of the encoder to represent structures and textures of an input image, respectively. The deep layer features are sent to a structure branch, while the shallow layer features are sent to a texture branch. In each branch, we fill holes in multiple scales of the CNN features. The filled CNN features from both branches are concatenated and then equalized. During feature equalization, we reweigh channel attentions first and propose a bilateral propagation activation function to enable spatial equalization. To this end, the filled CNN features of structure and texture mutually benefit each other to represent image content at all feature levels. We then use the equalized feature to supplement decoder features for output image generation through skip connections. Experiments on benchmark datasets show that the proposed method is effective to recover structures and textures and performs favorably against state-of-the-art approaches.


Deep image inpainting Feature equalizations 



This work is partially supported by the National Natural Science Foundation of China under Grant No. 61702176.

Supplementary material

504434_1_En_43_MOESM1_ESM.pdf (8.3 mb)
Supplementary material 1 (pdf 8542 KB)


  1. 1.
    Ballester, C., Bertalmio, M., Caselles, V., Sapiro, G., Verdera, J.: Filling-in by joint interpolation of vector fields and gray levels. TIP 10, 1200–1211 (2001)MathSciNetzbMATHGoogle Scholar
  2. 2.
    Barnes, C., Shechtman, E., Finkelstein, A., Goldman, D.: PatchMatch: a randomized correspondence algorithm for structural image editing. In: SIGGRAPH (2009)Google Scholar
  3. 3.
    Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C.: Image inpainting. In: SIGGRAPH (2000)Google Scholar
  4. 4.
    Criminisi, A., Pérez, P., Toyama, K.: Region filling and object removal by exemplar-based image inpainting. TIP 13, 1200–1212 (2004)Google Scholar
  5. 5.
    Darabi, S., Shechtman, E., Barnes, C., Goldman, D.B., Sen, P.: Image melding: combining inconsistent images using patch-based synthesis. ACM Trans. Graph. 31, 18 (2012)CrossRefGoogle Scholar
  6. 6.
    Doersch, C., Singh, S., Gupta, A., Sivic, J., Efros, A.A.: What makes Paris look like Paris? Commun. ACM 58, 103–110 (2015)CrossRefGoogle Scholar
  7. 7.
    Efros, A., Freeman, W.: Image quilting for texture synthesis and transfer. In: SIGGRAPH (2001)Google Scholar
  8. 8.
    Efros, A., Freeman, W.: Texture synthesis by nonparametric sampling. In: ICCV (2001)Google Scholar
  9. 9.
    Goodfellow, I., et al.: Generative adversarial nets. In: NIPS (2014)Google Scholar
  10. 10.
    He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)Google Scholar
  11. 11.
    Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: NIPS (2017)Google Scholar
  12. 12.
    Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: CVPR (2018)Google Scholar
  13. 13.
    Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. In: SIGGRAPH (2017)Google Scholar
  14. 14.
    Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR (2017)Google Scholar
  15. 15.
    Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). Scholar
  16. 16.
    Jolicoeur-Martineau, A.: The relativistic discriminator: a key element missing from standard GAN. In: ICLR (2018)Google Scholar
  17. 17.
    Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  18. 18.
    Levin, A., Zomet, A., Weiss, Y.: Learning how to inpaint from global image statistics. In: ICCV (2003)Google Scholar
  19. 19.
    Li, Y., Liu, S., Yang, J., Yang, M.H.: Generative face completion. In: CVPR (2017)Google Scholar
  20. 20.
    Liu, G., Reda, F.A., Shih, K.J., Wang, T.-C., Tao, A., Catanzaro, B.: Image inpainting for irregular holes using partial convolutions. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 89–105. Springer, Cham (2018). Scholar
  21. 21.
    Liu, H., Jiang, B., Xiao, Y., Yang, C.: Coherent semantic attention for image inpainting. In: ICCV (2019)Google Scholar
  22. 22.
    Liu, Z., LuoPi, n., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: ICCV (2015)Google Scholar
  23. 23.
    Miyato, T., Kataoka, T., Koyama, M., Yoshida, Y.: Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957 (2018)
  24. 24.
    Nazeri, K., Ng, E., Joseph, T., Qureshi, F.Z., Ebrahimi, M.: EdgeConnect: generative image inpainting with adversarial edge learning. In: ICCV Workshops (2019)Google Scholar
  25. 25.
    Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.: Context encoders: feature learning by inpainting. In: CVPR (2016)Google Scholar
  26. 26.
    Ren, Y., Yu, X., Zhang, R., Li, T.H., Liu, S., Li, G.: StructureFlow: image inpainting via structure-aware appearance flow. In: ICCV (2019)Google Scholar
  27. 27.
    Song, Y., Yang, C., Lin, Z., Liu, X., Huang, Q., Li, H., Kuo, C.-C.J.: Contextual-based image inpainting: infer, match, and translate. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11206, pp. 3–18. Springer, Cham (2018). Scholar
  28. 28.
    Song, Y., Bao, L., He, S., Yang, Q., Yang, M.H.: Stylizing face images via multiple exemplars. CVIU 162, 135–145 (2017)Google Scholar
  29. 29.
    Song, Y., Yang, C., Shen, Y., Wang, P., Huang, Q., Kuo, J.: SPG-Net: segmentation prediction and guidance network for image inpainting. arXiv preprint arXiv:1805.03356 (2018)
  30. 30.
    Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: CVPR (1998)Google Scholar
  31. 31.
    Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: CVPR (2018)Google Scholar
  32. 32.
    Wang, Y., Tao, X., Qi, X., Shen, X., Jia, J.: Image inpainting via generative multi-column convolutional neural networks. In: NIPS (2018)Google Scholar
  33. 33.
    Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. TIP 13, 600–612 (2004)Google Scholar
  34. 34.
    Xiong, W., Yu, J., Lin, Z., Yang, J., Lu, X., Barnes, C., Luo, J.: Foreground-aware image inpainting. In: CVPR (2019)Google Scholar
  35. 35.
    Xu, L., Yan, Q., Xia, Y., Jia, J.: Structure extraction from texture via relative total variation. SIGGRAPH 31, 139 (2012)Google Scholar
  36. 36.
    Xu, Z., Sun, J.: Image inpainting by patch propagation using patch sparsity. TIP 19, 1153–1165 (2010)MathSciNetzbMATHGoogle Scholar
  37. 37.
    Yan, Z., Li, X., Li, M., Zuo, W., Shan, S.: Shift-Net: image inpainting via deep feature rearrangement. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 3–19. Springer, Cham (2018). Scholar
  38. 38.
    Yeh, R., Chen, C., Lim, T., Johnson, M.H., Do, M.N.: Semantic image inpainting with perceptual and contextual losses. arXiv preprint arXiv:1607.07539 (2016)
  39. 39.
    Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)
  40. 40.
    Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: CVPR (2018)Google Scholar
  41. 41.
    Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Free-form image inpainting with gated convolution. In: ICCV (2019)Google Scholar
  42. 42.
    Zeng, Y., Fu, J., Chao, H., Guo, B.: Learning pyramid-context encoder network for high-quality image inpainting. In: CVPR (2019)Google Scholar
  43. 43.
    Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. PAMI 40, 1452–1464 (2017)CrossRefGoogle Scholar

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. 1.College of Computer Science and Electronic EngineeringHunan UniversityChangshaChina
  2. 2.Tencent AI LabShenzhenChina

Personalised recommendations