Abstract
Data augmentation plays a crucial role in enhancing the robustness and performance of machine learning models across various domains. In this study, we introduce a novel mixed-sample data augmentation method called RandoMix. RandoMix is specifically designed to simultaneously address robustness and diversity challenges. It leverages a combination of linear and mask mixed modes, introducing flexibility in candidate selection and weight adjustments. We evaluate the effectiveness of RandoMix on diverse datasets, including CIFAR-10/100, Tiny-ImageNet, ImageNet, and Google Speech Commands. Our results demonstrate its superior performance compared to existing techniques such as Mixup, CutMix, Fmix, and ResizeMix. Notably, RandoMix excels in enhancing model robustness against adversarial noise, natural noise, and sample occlusion. The comprehensive experimental results and insights into parameter tuning underscore the potential of RandoMix as a versatile and effective data augmentation method. Moreover, it seamlessly integrates into the training pipeline.
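The abstract describes RandoMix as randomly drawing, per batch, one of several mixed modes (a linear, Mixup-style interpolation or a mask, CutMix-style patch replacement) from a weighted candidate set. The sketch below is a minimal NumPy illustration of that idea, not the authors' implementation; the function names `randomix` and `rand_bbox`, the two-mode candidate set, and the `weights` parameter are assumptions for illustration only.

```python
import numpy as np

def rand_bbox(h, w, lam):
    """Sample a rectangle covering roughly a (1 - lam) fraction of the image (CutMix-style)."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    return y1, y2, x1, x2

def randomix(x, y, modes=("linear", "mask"), weights=(0.5, 0.5), alpha=1.0):
    """Mix a batch x (N, H, W, C) with a randomly chosen mode.

    Returns the mixed batch, both label sets, and the mixing weight lam
    for computing the usual convex combination of the two losses.
    """
    lam = np.random.beta(alpha, alpha)            # mixing ratio
    idx = np.random.permutation(len(x))           # partner sample for each input
    mode = np.random.choice(modes, p=weights)     # weighted candidate selection
    if mode == "linear":                          # Mixup-style interpolation
        x_mix = lam * x + (1.0 - lam) * x[idx]
    else:                                         # CutMix-style rectangular mask
        h, w = x.shape[1], x.shape[2]
        y1, y2, x1, x2 = rand_bbox(h, w, lam)
        x_mix = x.copy()
        x_mix[:, y1:y2, x1:x2] = x[idx, y1:y2, x1:x2]
        lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)  # adjust label weight to actual area
    return x_mix, y, y[idx], lam
```

In a training loop the returned quadruple would feed a mixed loss of the form `lam * loss(pred, y_a) + (1 - lam) * loss(pred, y_b)`, which is the standard label-mixing recipe shared by Mixup and CutMix.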
Data Availability Statement
The data supporting the findings of this study include publicly available datasets. These datasets are CIFAR-10/100, Tiny-ImageNet, ImageNet and Google Speech Commands, which can be accessed through their respective online repositories. Additional data that support the findings of this study are available from the corresponding author upon reasonable request.
References
Voulodimos A, Doulamis N, Doulamis A, Protopapadakis E (2018) Deep learning for computer vision: a brief review. Comput Intell Neurosci
Kamath U, Liu J, Whitaker J (2019) Deep learning for NLP and speech recognition, vol 84. Springer
Vapnik V (1968) On the uniform convergence of relative frequencies of events to their probabilities. Dokl Akad Nauk USSR 181:781–787
Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2018) Mixup: beyond empirical risk minimization. In: International conference on learning representations (ICLR)
Yun S, Han D, Oh SJ, Chun S, Choe J, Yoo Y (2019) Cutmix: regularization strategy to train strong classifiers with localizable features. In: IEEE international conference on computer vision (ICCV)
Qin J, Fang J, Zhang Q, Liu W, Wang X, Wang X (2020) Resizemix: mixing data with preserved object information and true labels. arXiv:2012.11101
Harris E, Marcu A, Painter M, Niranjan M, Prügel-Bennett A, Hare J (2021) Fmix: enhancing mixed sample data augmentation. In: International conference on learning representations (ICLR)
Kim J-H, Choo W, Song HO (2020) Puzzle mix: exploiting saliency and local statistics for optimal mixup. In: International conference on machine learning (ICML)
Kim J, Choo W, Jeong H, Song HO (2021) Co-mixup: saliency guided joint mixup with supermodular diversity. In: International conference on learning representations (ICLR)
Uddin AFMS, Monira MS, Shin W, Chung T, Bae S-H (2021) Saliencymix: a saliency guided data augmentation strategy for better regularization. In: International conference on learning representations (ICLR)
Krizhevsky A, Hinton G et al (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
Chrabaszcz P, Loshchilov I, Hutter F (2017) A downsampled variant of imagenet as an alternative to the cifar datasets. arXiv:1707.08819
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, et al (2015) Imagenet large scale visual recognition challenge. International journal of computer vision (IJCV)
Warden P (2017) Speech commands: a public dataset for single-word speech recognition. Dataset available from http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz
Verma V, Lamb A, Beckham C, Najafi A, Mitliagkas I, Courville A, Lopez-Paz D, Bengio Y (2019) Manifold mixup: better representations by interpolating hidden states. In: International conference on machine learning (ICML)
Faramarzi M, Amini M, Badrinaaraayanan A, Verma V, Chandar S (2022) Patchup: a feature-space block-level regularization technique for convolutional neural networks. Proc AAAI Conf Artif Intell 36:589–597
DeVries T, Taylor GW (2017) Improved regularization of convolutional neural networks with cutout. arXiv:1708.04552
He K, Zhang X, Ren S, Sun J (2016) Identity mappings in deep residual networks. In: European conference on computer vision (ECCV)
Zagoruyko S, Komodakis N (2016) Wide residual networks. In: Proceedings of the British machine vision conference 2016. British machine vision association
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: IEEE conference on computer vision and pattern recognition (CVPR)
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: hierarchical vision transformer using shifted windows. In: IEEE international conference on computer vision (ICCV), pp 10012–10022
Loshchilov I, Hutter F (2017) Sgdr: stochastic gradient descent with warm restarts. In: International conference on learning representations (ICLR)
Wang W, Liang J, Liu D (2022) Learning equivariant segmentation with instance-unique querying. Adv Neural Inf Process Syst (NeurIPS) 35:12826–12840
Wang W, Han C, Zhou T, Liu D (2023) Visual recognition with deep nearest centroids. In: International conference on learning representations (ICLR)
Liu D, Cui Y, Tan W, Chen Y (2021) Sg-net: spatial granularity network for one-stage video instance segmentation. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 9816–9825
Liang J, Zhou T, Liu D, Wang W (2023) Clustseg: clustering for universal segmentation. In: international conference on machine learning (ICML)
Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-cam: visual explanations from deep networks via gradient-based localization. In: IEEE international conference on computer vision (ICCV), pp 618–626
Google Brain (2017) Tensorflow speech recognition challenge. https://www.kaggle.com/c/tensorflow-speech-recognition-challenge
Goodfellow IJ, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. In: International conference on learning representations (ICLR)
Hendrycks D, Dietterich T (2019) Benchmarking neural network robustness to common corruptions and perturbations. In: International conference on learning representations (ICLR)
Acknowledgements
This work was supported in part by the STI 2030-Major Projects of China under Grant 2021ZD0201300, and by the National Science Foundation of China under Grant 62276127.
Ethical concerns: Our research methodology and the nature of RandoMix do not directly engage with ethical dilemmas typically encountered in studies involving human or animal subjects, sensitive data, or environmental impacts.
Compliance with ethical standards: The authors have adhered to general ethical standards in research, including honesty in reporting results, transparency in methodology, and respect for intellectual property.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, X., Shen, F., Zhao, J. et al. RandoMix: a mixed sample data augmentation method with multiple mixed modes. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18868-8