MTPFK: Multi-scale Transformer Joint Predictive Filter Kernel for Image Inpainting

  • Conference paper
Communications, Signal Processing, and Systems (CSPS 2023)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1033)


Abstract

In the task of image inpainting, it is common to use a CNN-based encoder-decoder architecture to extract feature information from the damaged image, achieving satisfactory restoration results. However, these methods often struggle to achieve high-quality restoration for images with varying degrees of damage. In this paper, we propose a two-stage inpainting model. First, we leverage the Transformer's powerful contextual capturing capability to form a coarse recovery network that roughly fills holes of different sizes. Second, we employ a predictive filtering-kernel network to perform fine restoration on top of the coarse result. We conducted qualitative and quantitative experiments on the CelebA and Places2 datasets, demonstrating the superiority of the proposed method.
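
The two-stage pipeline described above (a Transformer that coarsely fills holes, followed by a network that predicts per-pixel filtering kernels to refine the coarse output) can be illustrated with a minimal PyTorch sketch. This is only an assumption-laden toy: the single-scale patch tokenizer, plain Transformer encoder, 3×3 predicted kernel, and mask convention (1 = known pixel) are illustrative choices, not the paper's actual MTPFK architecture, which uses multi-scale attention.

```python
# Minimal sketch of a two-stage inpainting pipeline: coarse Transformer fill,
# then refinement by a predicted per-pixel filtering kernel.
# All module names and hyper-parameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CoarseTransformerFill(nn.Module):
    """Stage 1: roughly fill holes using global self-attention over image patches."""

    def __init__(self, patch=8, dim=256, depth=4, heads=8):
        super().__init__()
        self.patch = patch
        self.embed = nn.Conv2d(4, dim, kernel_size=patch, stride=patch)  # RGB + mask
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.to_rgb = nn.Conv2d(dim, 3 * patch * patch, kernel_size=1)

    def forward(self, img, mask):
        # img: (B, 3, H, W); mask: (B, 1, H, W) with 1 = known pixel, 0 = hole
        x = self.embed(torch.cat([img * mask, mask], dim=1))        # (B, dim, H/p, W/p)
        b, c, h, w = x.shape
        tokens = self.encoder(x.flatten(2).transpose(1, 2))          # (B, h*w, dim)
        feat = tokens.transpose(1, 2).reshape(b, c, h, w)
        coarse = F.pixel_shuffle(self.to_rgb(feat), self.patch)      # (B, 3, H, W)
        # keep known pixels, only fill the holes
        return img * mask + coarse * (1 - mask)


class PredictiveFilterRefine(nn.Module):
    """Stage 2: predict a per-pixel kernel and filter the coarse result with it."""

    def __init__(self, ksize=3, feat=32):
        super().__init__()
        self.ksize = ksize
        self.kernel_net = nn.Sequential(
            nn.Conv2d(4, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, ksize * ksize, 3, padding=1),
        )

    def forward(self, coarse, mask):
        k = F.softmax(self.kernel_net(torch.cat([coarse, mask], dim=1)), dim=1)  # (B, k*k, H, W)
        b, _, h, w = coarse.shape
        # gather the k*k neighbours of every pixel and apply the predicted kernel
        patches = F.unfold(coarse, self.ksize, padding=self.ksize // 2)          # (B, 3*k*k, H*W)
        patches = patches.view(b, 3, self.ksize * self.ksize, h, w)
        return (patches * k.unsqueeze(1)).sum(dim=2)                              # (B, 3, H, W)


if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64)
    mask = (torch.rand(1, 1, 64, 64) > 0.3).float()
    coarse = CoarseTransformerFill()(img, mask)
    refined = PredictiveFilterRefine()(coarse, mask)
    print(refined.shape)  # torch.Size([1, 3, 64, 64])
```

In this sketch the refinement stage is pure filtering: every output pixel is a learned, normalized weighted average of its neighbours in the coarse image, which is the general idea behind predictive filtering kernels for restoration.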



Author information

Corresponding author

Correspondence to Yongping Xie.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Wang, M., Xie, Y. (2024). MTPFK: Multi-scale Transformer Joint Predictive Filter Kernel for Image Inpainting. In: Wang, W., Liu, X., Na, Z., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2023. Lecture Notes in Electrical Engineering, vol 1033. Springer, Singapore. https://doi.org/10.1007/978-981-99-7502-0_5

  • DOI: https://doi.org/10.1007/978-981-99-7502-0_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7555-6

  • Online ISBN: 978-981-99-7502-0

  • eBook Packages: Engineering, Engineering (R0)
