
Data Augmentation for Low-Level Vision: CutBlur and Mixture-of-Augmentation

Published in: International Journal of Computer Vision

Abstract

Data augmentation (DA) is an effective way to improve the performance of deep networks. Unfortunately, current methods are mostly developed for high-level vision tasks (e.g., image classification), and few have been studied for low-level tasks (e.g., image restoration). In this paper, we provide a comprehensive analysis of existing DA methods in the frequency domain. We find that methods that heavily manipulate spatial information can hinder the image restoration process and hurt performance. Based on our analyses, we propose CutBlur and mixture-of-augmentation (MoA). CutBlur cuts a low-quality patch and pastes it into the corresponding high-quality image region, or vice versa. The key intuition is to provide a sufficient DA effect while keeping the pixel distribution intact. This characteristic of CutBlur enables a model to learn not only “how” but also “where” to reconstruct an image. Eventually, the model understands “how much” to restore given pixels, which allows it to generalize better to unseen data distributions. We further improve restoration performance with MoA, which incorporates a curated list of DA methods. We demonstrate the effectiveness of our methods through extensive experiments on several low-level vision tasks, covering both single-distortion and mixed-distortion settings. Our results show that CutBlur and MoA consistently and significantly improve performance, especially when the model is large and the data are collected in real-world environments. Our code is available at https://github.com/clovaai/cutblur.
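The cut-and-paste operation described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `cutblur`, the `alpha` parameter, and the patch-sampling details are assumptions made for exposition (see the official repository for the reference code). Because both images in the pair are aligned, every pixel in the augmented input still comes from the true low- or high-quality distribution, which is the "pixel distribution intact" property the abstract refers to.

```python
import numpy as np

def cutblur(lr_up, hr, alpha=0.7, rng=None):
    """Illustrative CutBlur sketch (hypothetical API, not the authors').

    lr_up : low-quality image upsampled to the high-quality size, (H, W, C)
    hr    : aligned high-quality image, same shape
    alpha : rough side-length ratio of the pasted patch (assumed parameter)
    Returns the augmented input and the unchanged high-quality target.
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w = hr.shape[:2]
    ratio = rng.normal(alpha, 0.01)          # jitter the patch size slightly
    ch, cw = int(h * ratio), int(w * ratio)
    cy = rng.integers(0, h - ch + 1)         # top-left corner of the patch
    cx = rng.integers(0, w - cw + 1)

    if rng.random() < 0.5:
        # Paste a high-quality patch into the low-quality input ...
        aug = lr_up.copy()
        aug[cy:cy + ch, cx:cx + cw] = hr[cy:cy + ch, cx:cx + cw]
    else:
        # ... or vice versa: paste the low-quality patch into the HQ image.
        aug = hr.copy()
        aug[cy:cy + ch, cx:cx + cw] = lr_up[cy:cy + ch, cx:cx + cw]
    # The training target is always the untouched high-quality image.
    return aug, hr
```

In either direction, the model sees an input that is partly degraded and partly clean, so it must learn "where" and "how much" to restore rather than applying a uniform restoration everywhere.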



Data Availability

The datasets generated during and/or analyzed during the current study are available in the following repository: https://github.com/clovaai/cutblur.

Notes

  1. For every experiment, we used only the geometric DA methods flip and rotation, which is the default setting of EDSR. Here, to analyze the effect of the DA methods in isolation, we did not use the \(\times 2\) pre-trained model.
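The geometric baseline mentioned in the note can be sketched in a few lines. This is an illustrative NumPy version of the standard flip-and-rotate augmentation used as EDSR's default; the function name and signature are assumptions. The essential point for paired restoration data is that the same transform must be applied to both images so they stay aligned.

```python
import numpy as np

def geometric_augment(lr, hr, rng=None):
    """Apply an identical random horizontal flip and 90-degree rotation
    to a low-/high-quality image pair (illustrative, not the EDSR code)."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < 0.5:               # horizontal flip, same for both
        lr, hr = lr[:, ::-1], hr[:, ::-1]
    k = rng.integers(0, 4)               # rotate both by k * 90 degrees
    return np.rot90(lr, k), np.rot90(hr, k)
```

Because these transforms only permute pixel positions, they add diversity without altering the pixel distribution, which is why they are safe defaults for restoration tasks.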

References

  • Abdelhamed, A., Lin, S., & Brown, M. S. (2018). A high-quality denoising dataset for smartphone cameras. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Abdelhamed, A., Afifi, M., & Timofte, R., et al. (2020). Ntire 2020 challenge on real image denoising: Dataset, methods and results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops

  • Agustsson, E., & Timofte, R. (2017). Ntire 2017 challenge on single image super-resolution: Dataset and study. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops

  • Ahmad, W., Ali, H., Shah, Z., et al. (2022). A new generative adversarial network for medical images super resolution. Scientific Reports, 12(1), 9533.


  • Ahn, N., Kang, B., & Sohn, K. A. (2018). Fast, accurate, and lightweight super-resolution with cascading residual network. In Proceedings of the European Conference on Computer Vision (ECCV)

  • Baek, K., Bang, D., & Shim, H. (2021). Gridmix: Strong regularization through local context mapping. Pattern Recognition, 109, 107594.


  • Bevilacqua, M., Roumy, A., & Guillemot, C., et al. (2012). Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In Proceedings of the British Machine Vision Conference (BMVC)

  • Cai, J., Zeng, H., & Yong, H., et al. (2019). Toward real-world single image super-resolution: A new benchmark and a new model. In Proceedings of the IEEE/CVF International Conference on Computer Vision

  • Chen, C., Xiong, Z., & Tian, X., et al. (2019). Camera lens super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Cheng, K., & Wu, C. (2020). Self-calibrated attention neural network for real-world super resolution. In European Conference on Computer Vision. Springer, pp 453–467

  • Choe, J., & Shim, H. (2019). Attention-based dropout layer for weakly supervised object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Cubuk, E. D., Zoph, B., & Mane, D., et al. (2019). Autoaugment: Learning augmentation policies from data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Dabouei, A., Soleymani, S., & Taherkhani, F., et al. (2021). Supermix: Supervising the mixing data augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13794–13803

  • DeVries, T., & Taylor, G. W. (2017). Improved regularization of convolutional neural networks with cutout. arXiv:1708.04552

  • Dong, C., Loy, C. C., He, K., et al. (2015). Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2), 295–307.


  • El Helou, M., Zhou, R., & Süsstrunk, S. (2020). Stochastic frequency masking to improve super-resolution and denoising networks. In European Conference on Computer Vision. Springer, pp. 749–766

  • Feng, R., Gu, J., & Qiao, Y., et al. (2019). Suppressing model overfitting for image super-resolution networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops

  • Foi, A., Katkovnik, V., & Egiazarian, K. (2007). Pointwise shape-adaptive dct for high-quality denoising and deblocking of grayscale and color images. IEEE Transactions on Image Processing, 16(5), 1395–1411.


  • Gastaldi, X. (2017). Shake-shake regularization. arXiv:1705.07485

  • Ghiasi, G., Lin, T. Y., & Le, Q.V. (2018). Dropblock: A regularization method for convolutional networks. In Advances in Neural Information Processing Systems

  • Gu, S., Zuo, W., & Xie, Q., et al. (2015). Convolutional sparse coding for image super-resolution. In Proceedings of the IEEE International Conference on Computer Vision

  • Hendrycks, D., & Dietterich, T. (2019). Benchmarking neural network robustness to common corruptions and perturbations. In International Conference on Learning Representations

  • Hong, M., Choi, J., & Kim, G. (2021). Stylemix: Separating content and style for enhanced data augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14862–14870

  • Huang, J. B., Singh, A., & Ahuja, N. (2015). Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Ishii, Y., & Yamashita, T. (2021). Cutdepth: Edge-aware data augmentation in depth estimation. arXiv:2107.07684

  • Lai, W. S., Huang, J. B., Ahuja, N., et al. (2018). Fast and accurate image super-resolution with deep Laplacian pyramid networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(11), 2599–2613.


  • Leclerc, S., Smistad, E., Pedrosa, J., et al. (2019). Deep learning for segmentation using an open large-scale dataset in 2d echocardiography. IEEE Transactions on Medical Imaging, 38(9), 2198–2210.


  • Liang, J., Cao, J., & Sun, G., et al. (2021). Swinir: Image restoration using swin transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1833–1844

  • Liang, W., Liang, Y., & Jia, J. (2023). Miamix: Enhancing image classification through a multi-stage augmented mixed sample data augmentation method. arXiv:2308.02804

  • Lim, B., Son, S., & Kim, H., et al. (2017). Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops

  • Lim, S., Kim, I., & Kim, T., et al. (2019). Fast autoaugment. In Advances in Neural Information Processing Systems

  • Liu, Z., Lin, Y., & Cao, Y., et al. (2021). Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022

  • Liu, Z., Li, S., & Wu, D., et al. (2022). Automix: Unveiling the power of mixup for stronger classifiers. In European Conference on Computer Vision. Springer, pp. 441–458

  • Martin, D., Fowlkes, C., & Tal, D., et al. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the IEEE International Conference on Computer Vision

  • Matsui, Y., Ito, K., Aramaki, Y., et al. (2017). Sketch-based manga retrieval using manga109 dataset. Multimedia Tools and Applications, 76(20), 21811–21838.


  • Nakao, K., & Nobuhara, H. (2022). Controllable image super-resolution by som based data augmentation and its applications. In 2022 Joint 12th International Conference on Soft Computing and Intelligent Systems and 23rd International Symposium on Advanced Intelligent Systems (SCIS & ISIS). IEEE, pp. 1–7

  • Raghavan, J., & Ahmadi, M. (2022). Data augmentation methods for low resolution facial images. In: TENCON 2022–2022 IEEE Region 10 Conference (TENCON). IEEE, pp. 1–6

  • Sheikh, H. (2005). Live image quality assessment database release 2. http://live.ece.utexas.edu/research/quality

  • Srivastava, N., Hinton, G., Krizhevsky, A., et al. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929–1958.


  • Staal, J., Abràmoff, M. D., Niemeijer, M., et al. (2004). Ridge-based vessel segmentation in color images of the retina. IEEE Transactions on Medical Imaging, 23(4), 501–509.


  • Szegedy, C., Vanhoucke, V., & Ioffe, S., et al. (2016). Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Timofte, R., De Smet, V., & Van Gool, L. (2014). A+: Adjusted anchored neighborhood regression for fast super-resolution. In Asian Conference on Computer Vision. Springer

  • Timofte, R., Rothe, R., & Van Gool, L. (2016). Seven ways to improve example-based single image super resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Tompson, J., Goroshin, R., & Jain, A., et al. (2015). Efficient object localization using convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Vaswani, A., Shazeer, N., & Parmar, N., et al. (2017). Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp 6000-6010

  • Verma, V., Lamb, A., & Beckham, C., et al. (2019). Manifold mixup: Better representations by interpolating hidden states. In International Conference on Machine Learning, PMLR

  • Vu, T., Van Nguyen, C., & Pham, T. X., et al. (2018). Fast and efficient image quality enhancement via desubpixel convolutional neural networks. In Proceedings of the European Conference on Computer Vision (ECCV)

  • Wang, X., Xie, L., & Dong, C., et al. (2021). Real-esrgan: Training real-world blind super-resolution with pure synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1905–1914

  • Wang, Z., Bovik, A. C., Sheikh, H. R., et al. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612.


  • Wei, Y., Ma, J., & Jiang, Z., et al. (2022). Mixed color channels (mcc): A universal module for mixed sample data augmentation methods. In 2022 IEEE International Conference on Multimedia and Expo (ICME). IEEE, pp. 1–6

  • Yamada, Y., Iwamura, M., Akiba, T., et al. (2019). Shakedrop regularization for deep residual learning. IEEE Access, 7, 186126–186136.


  • Yang, J., Wright, J., Huang, T. S., et al. (2010). Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11), 2861–2873.


  • Yoo, J., Ahn, N., & Sohn, K. A. (2020). Rethinking data augmentation for image super-resolution: A comprehensive analysis and a new strategy. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8375–8384

  • Yu, K., Dong, C., & Lin, L., et al. (2018). Crafting a toolchain for image restoration by deep reinforcement learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Yun, S., Han, D., & Oh, S. J., et al. (2019). Cutmix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the IEEE/CVF International Conference on Computer Vision

  • Zhang, H., Cisse, M., & Dauphin, Y. N., et al. (2018a). mixup: Beyond empirical risk minimization. In International Conference on Learning Representations

  • Zhang, K., Zuo, W., & Gu, S., et al. (2017). Learning deep cnn denoiser prior for image restoration. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Zhang, K., Liang, J., & Van Gool, L., et al. (2021). Designing a practical degradation model for deep blind image super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4791–4800

  • Zhang, X., Ng, R., & Chen, Q. (2018b). Single image reflection separation with perceptual losses. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  • Zhang, Y., Li, K., & Li, K., et al. (2018c). Image super-resolution using very deep residual channel attention networks. In Proceedings of the European Conference on Computer Vision (ECCV)

  • Zhang, Y., Li, K., & Li, K., et al. (2019). Residual non-local attention networks for image restoration. In International Conference on Learning Representations

  • Zhang, Y., Tian, Y., Kong, Y., et al. (2020). Residual dense network for image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43, 2480–2495.


  • Zhong, Z., Zheng, L., & Kang, G., et al. (2017). Random erasing data augmentation. arXiv:1708.04896

  • Zhou, R., El Helou, M., & Sage, D., et al. (2020). W2S: Microscopy data with joint denoising and super-resolution for widefield to SIM mapping. In ECCVW


Acknowledgements

This work was supported by the Korea Research Institute for Defence Technology Planning and Advancement (KRIT) grant funded by the Korea government (DAPA) in 2022 (KRIT-CT-22-037, SAR Image Super-Resolution for Improving of Target Identification Performance, 50%), National Research Foundation of Korea Grants funded by the Korea government (MSIT) (No. NRF-2019R1A2C1006608, 5%; No. 2.220574.01, 15%), Institute of Information & communications Technology Planning & Evaluation (IITP) Grants funded by the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01431, 5%), and MSIT (No. 2020-0-01336, Artificial Intelligence Graduate School Program (UNIST), 5%; No. 2021-0-02068, Artificial Intelligence Innovation Hub, 5%; No. 2022-0-00959, (Part 2) Few-Shot Learning of Causal Inference in Vision and Language for Decision Making, 5%; No. 2022-0-00264, Comprehensive Video Understanding and Generation with Knowledge-based Deep Logic Neural Network, 5%).

Author information


Corresponding author

Correspondence to Jaejun Yoo.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Additional information

Communicated by Oliver Zendel.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 5762 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ahn, N., Yoo, J. & Sohn, KA. Data Augmentation for Low-Level Vision: CutBlur and Mixture-of-Augmentation. Int J Comput Vis 132, 2041–2059 (2024). https://doi.org/10.1007/s11263-023-01970-z


