Abstract
Data augmentation (DA) is an effective way to improve the performance of deep networks. Unfortunately, current methods are mostly developed for high-level vision tasks (e.g., image classification), and few have been studied for low-level vision tasks (e.g., image restoration). In this paper, we provide a comprehensive analysis of existing DA methods in the frequency domain. We find that methods that heavily manipulate spatial information can hinder the image restoration process and hurt performance. Based on our analysis, we propose CutBlur and mixture-of-augmentation (MoA). CutBlur cuts a low-quality patch and pastes it into the corresponding high-quality image region, or vice versa. The key intuition is to provide a sufficient DA effect while keeping the pixel distribution intact. This property of CutBlur enables a model to learn not only “how” but also “where” to reconstruct an image. Eventually, the model understands “how much” to restore given pixels, which allows it to generalize better to unseen data distributions. We further improve restoration performance with MoA, which draws from a curated list of DA methods. We demonstrate the effectiveness of our methods through extensive experiments on several low-level vision tasks, covering both single and mixed distortions. Our results show that CutBlur and MoA consistently and significantly improve performance, especially when the model is large and the data are collected in real-world environments. Our code is available at https://github.com/clovaai/cutblur.
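The cut-and-paste operation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the official implementation (see the repository linked above for that): the function name `cutblur` and the fixed `alpha` cut-ratio parameter are our simplifications, and it assumes the low-quality input has already been upsampled to the high-quality resolution so the two images share a spatial shape.

```python
import numpy as np

def cutblur(lr_up, hr, alpha=0.5, rng=np.random.default_rng()):
    """Swap a random rectangular region between the (upsampled)
    low-quality image `lr_up` and its high-quality counterpart `hr`.
    Because pixels only move between the two aligned versions of the
    same scene, the pixel distribution stays intact."""
    if lr_up.shape != hr.shape:
        raise ValueError("lr_up and hr must have the same shape")
    h, w = hr.shape[:2]
    ch, cw = int(h * alpha), int(w * alpha)     # size of the cut region
    cy = rng.integers(0, h - ch + 1)            # top-left corner
    cx = rng.integers(0, w - cw + 1)
    if rng.random() < 0.5:
        # paste a high-quality patch into the low-quality image ...
        aug = lr_up.copy()
        aug[cy:cy + ch, cx:cx + cw] = hr[cy:cy + ch, cx:cx + cw]
    else:
        # ... or vice versa: a low-quality patch inside the HQ content
        aug = hr.copy()
        aug[cy:cy + ch, cx:cx + cw] = lr_up[cy:cy + ch, cx:cx + cw]
    return aug, hr  # the training target remains the clean HQ image
```

The augmented input now contains regions of two different quality levels, so the model must infer "where" and "how much" restoration is needed rather than applying a uniform correction everywhere.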
Data Availability
The datasets generated during and/or analyzed during the current study are available in the following repository: https://github.com/clovaai/cutblur.
Notes
For every experiment, we used only the geometric DA methods flip and rotation, which are the default setting of EDSR. Here, to isolate the effect of the DA methods, we did not use the \(\times 2\) pre-trained model.
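The default geometric augmentation mentioned above amounts to applying the same random flip and 90-degree rotation to the LR and HR images of a training pair. A minimal sketch (the function name and interface are ours, not EDSR's API):

```python
import numpy as np

def paired_geometric_da(lr, hr, rng=np.random.default_rng()):
    """Apply an identical random flip and 90-degree rotation to an
    LR-HR training pair, so the two images stay spatially aligned."""
    if rng.random() < 0.5:                 # horizontal flip
        lr, hr = lr[:, ::-1], hr[:, ::-1]
    if rng.random() < 0.5:                 # vertical flip
        lr, hr = lr[::-1], hr[::-1]
    k = rng.integers(0, 4)                 # rotate by 0/90/180/270 degrees
    return np.rot90(lr, k), np.rot90(hr, k)
```

Because these transforms only permute pixel locations, they preserve the pixel distribution, which is why they are safe defaults for restoration tasks.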
Acknowledgements
This work was supported by the Korea Research Institute for Defence Technology Planning and Advancement (KRIT) grant funded by the Korea government (DAPA) in 2022 (KRIT-CT-22-037, SAR Image Super-Resolution for Improving of Target Identification Performance, 50%), National Research Foundation of Korea Grants funded by the Korea government (MSIT) (No. NRF-2019R1A2C1006608, 5%, No. 2.220574.01, 15%), Institute of Information & communications Technology Planning & Evaluation (IITP) Grants funded by the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01431, 5%) and MSIT No. 2020-0-01336, 5%, Artificial Intelligence Graduate School Program (UNIST), 5%, No. 2021-0-02068, Artificial Intelligence Innovation Hub, 5%, No. 2022-0-00959, (Part 2) Few-Shot Learning of Causal Inference in Vision and Language for Decision Making, 5%, No. 2022-0-00264, Comprehensive Video Understanding and Generation with Knowledge-based Deep Logic Neural Network, 5%).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Communicated by Oliver Zendel.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ahn, N., Yoo, J. & Sohn, KA. Data Augmentation for Low-Level Vision: CutBlur and Mixture-of-Augmentation. Int J Comput Vis 132, 2041–2059 (2024). https://doi.org/10.1007/s11263-023-01970-z