1 Introduction

Neural networks are acknowledged to be learn-by-example techniques. Assuming the training set is representative, the problem becomes choosing a proper loss function that avoids poor local optima and a good backbone. We know that some convolutional neural networks (CNNs) are more accurate than others: in image classification tasks, ResNet [9] usually performs better than VGG [21], for it has a more complex architecture with residual connections that help generalization [10].

Changing the training protocol is another way to improve generalization. Notably, neural networks obtain better results when more (and properly labeled) training instances are available. One can also train a model on a larger dataset before fine-tuning it to the desired problem, i.e., transfer learning [16]. Results show it can boost performance significantly in a variety of problems. Google improved its results in the ImageNet challenge [4] by building a dataset with more than 300 million images to pre-train a model and then fine-tune it on the ImageNet set [23].

However, some scenarios do not allow us to collect additional training data, for the labeling cost is prohibitive. Regularization can help in such cases by making training harder or the loss function landscape smoother [7], and we expect improvements in the model’s generalization after applying such approaches. Classical regularization approaches in deep learning include data augmentation, which must preserve the semantics of the transformed samples. A model trained on the MNIST dataset [13], for instance, can rotate images only to some extent: rotating a “6” by \(180^{\circ}\) turns it into a “9”, generating a wrong instance-label pair.

This work introduces Random Label Smoothing (RLS), a new regularization approach that works on the output layer. RLS operates by randomly changing values in the label vector (ground truth). We demonstrate state-of-the-art results in image classification and super-resolution, evidencing it can be used in a broad range of application domains.

The manuscript is organized as follows. Sections 2 and 3 present some related works and the proposed approach, respectively. Section 4 introduces the methodology, and Sect. 5 demonstrates the robustness of the proposed approach in experiments under different scenarios. Section 6 provides a brief discussion about the outcomes, and Sect. 7 states conclusions.

2 Related Works

Several regularization techniques are available to help neural networks accomplish better results in different domains. Regularization based on data augmentation is classical, with many approaches in the literature. AutoAugment [3] performs data augmentation by first learning the best policy for creating synthetic samples. However, it may take too long to determine the best data augmentation strategy for a given dataset. Aiming to make this process faster, Fast AutoAugment [15] calculates the gradient from just one batch, decreasing the computational effort considerably. Other methods, such as Cutout [5] and RandomErasing [30], work by removing random areas of the image: the former removes a patch and leaves its content empty, while the latter fills it with random noise.

Other methods operate by changing the feature maps generated during training. Dropout [22] randomly drops neurons, while MaxDropout [18] eliminates the most active ones, i.e., the neurons with the highest activation values. An improved version, called MaxDropoutV2 [19], includes a more efficient approach to finding the most active neurons: instead of directly comparing values in the output feature map of a given layer, it first sums the values of each neuron along the depth axis before performing the comparison, thus carrying more semantic information than its original counterpart. Additional methods consider other internal aspects of CNN training. Shake–Shake [6] randomly rescales the forward and backward contributions of each branch during training in multi-branch models, such as ResNeXt [26], and results show it can improve accuracy significantly.

A recent analysis of regularization methods for CNNs [17] highlighted some drawbacks in the area. The first one is the shortage of algorithms that perform regularization at the label level. The other concerns the application domain, i.e., most regularization techniques designed for deep nets focus on image classification. Label Smoothing [24] changes the values of the output layer (label vector): it decreases the value at the position that represents the true label and increases the values of the inactive positions. Two-Stage Label Smoothing (TSLA) [27] applies label smoothing only up to a certain point of training, showing that turning it off at a late training stage helps the model generalize better.
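For reference, the standard (uniform) Label Smoothing transform can be written in a few lines of Python. The sketch below uses hypothetical names (`one_hot`, `epsilon`) and assumes the common formulation in which the removed mass \(\epsilon\) is spread uniformly over all K classes:

```python
import numpy as np

def uniform_label_smoothing(one_hot: np.ndarray, epsilon: float = 0.1) -> np.ndarray:
    """Standard Label Smoothing: every class receives epsilon / K, while the
    active class keeps 1 - epsilon + epsilon / K (K = number of classes)."""
    k = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / k

# Example with 5 classes and the true class at index 2:
# uniform_label_smoothing(np.eye(5)[2], epsilon=0.1) -> [0.02, 0.02, 0.92, 0.02, 0.02]
```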

3 Proposed Approach

According to [17], there are several issues related to recent regularization methods, and we address a few of them in this work. Regularization approaches are often evaluated in a single context only, primarily image classification; here, we also consider image super-resolution. Still according to [17], a good regularization technique should improve results even if the model already employs another regularization technique, a requirement RLS satisfies.

Traditional data augmentation usually changes features in the input data. A simple way to perform data augmentation is to rotate the image to the left or to the right by a random angle. Another way is to crop some areas of the input image, e.g., Cutout [5]. Following a similar logic, RLS augments the data by performing random but controlled changes in the label of every single instance during training. We explain how to do that for image classification and super-resolution tasks below.

3.1 Image Classification

Concerning image classification, we vary the output values that define the label within a controlled but random range. In the “active” position, i.e., the index holding the value “1” that encodes the label (one-hot representation), we randomly decrease the value by an amount between 0.05 and 0.49, guaranteeing that the active label always keeps the greatest value. For the inactive positions (set to “0”), we distribute the amount removed from the active position among them. For instance, if the problem has 10 classes and the removed value is 0.3, the active position is set to 0.7, and each of the other 9 positions receives a portion of the remaining 0.3.
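A minimal sketch of this procedure for a single label vector is given below. The function and parameter names are ours, and the exact sampling scheme is not fully specified in the text, so we assume a uniform draw of the removed amount and a random split among the inactive classes, as depicted in Fig. 1:

```python
import numpy as np

def random_label_smoothing(one_hot: np.ndarray,
                           low: float = 0.05, high: float = 0.49,
                           rng=None) -> np.ndarray:
    """Sketch of RLS for one one-hot label vector.

    The amount removed from the active class is drawn uniformly from
    [low, high]; splitting it among the inactive classes with random
    proportions is our assumption."""
    if rng is None:
        rng = np.random.default_rng()
    k = one_hot.shape[-1]
    active = int(one_hot.argmax())                 # index of the "1" entry
    removed = rng.uniform(low, high)               # e.g., 0.3
    shares = rng.random(k - 1)
    shares = removed * shares / shares.sum()       # random split of the removed mass
    smoothed = np.empty_like(one_hot, dtype=np.float64)
    smoothed[np.arange(k) != active] = shares
    smoothed[active] = 1.0 - removed               # active class keeps the largest value
    return smoothed

# Toy example: 10 classes, true class 3 -> active value in [0.51, 0.95],
# inactive values are random and sum to the removed amount.
print(random_label_smoothing(np.eye(10)[3]))
```

Since the active value is at least 0.51 and the removed amount is at most 0.49, no inactive position can ever exceed the active one, which matches the guarantee stated above.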

By doing these transformations on the labels during training, we create various acceptable labels for a given instance, working as a label augmentation algorithm. Our results show it is effective for image classification, helping the model generalize better and outperforming other methods, such as TargetDrop [31] and MaxDropout [18]. Figure 1 demonstrates how RLS works for a classification problem in a toy example.

Fig. 1

Simulation of Label Smoothing and Random Label Smoothing over a batch of labels during training for a classification model (active label in bold). Traditionally, the active label is set to “1” while all other classes’ indices are set to “0”. In Label Smoothing, the active class is set to a constant (and higher) value, while the inactive classes are set to a smaller constant value. In Random Label Smoothing, the active label receives a random (and higher) value (e.g., greater than 0.5), while the inactive labels receive random values that, summed with the active label value, reach 1

3.2 Image Super-Resolution

For image reconstruction, we first tried an approach similar to the classification task by randomly changing the label’s values (the pixel values of the ground truth) following a Gaussian distribution. However, it did not work as expected: changing the pixel values with a Gaussian distribution introduces a considerable amount of Gaussian noise into the new reference image (the modified ground truth), and the entire system then learns to reconstruct noisy images.

Changing the values pixel by pixel with different random amounts did not seem to be a good strategy, for we lose semantic details. We therefore decided to perturb all pixels by the same amount, i.e., a single random value that is either added to or subtracted from every pixel value in a given training iteration. The problem is now defining which range of values leads to the best results.

We achieved promising outcomes by reverse-engineering the results of the neural networks employed in this study. We used the results (i.e., PSNR—peak signal-to-noise ratio) of each architecture to set the range of values. For instance, PyNET [11] achieved a PSNR of 21.19 dB; converting this value to a gray-scale amount results in about 12 units. Therefore, all pixel values (for all dataset images) were either increased or decreased by that amount (in this example). For EDSR (Enhanced Deep Residual Networks for Single Image Super-Resolution) [14], the results are a PSNR of 29.21 dB on Div2K [1] and 28.89 dB on the RealSR dataset [2]; therefore, we set 4.5 units for both datasets.

In the last experiment, we verified whether the above methodology could be further improved. We found that we could achieve even better results by using half of the range instead of the full range, i.e., 6 units for PyNET and 2.25 for EDSR. Even though image regions may differ, smaller perturbations over continuous regions help the entire model understand that small deviations from the exact pixel values are acceptable.
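A minimal sketch of this ground-truth perturbation is shown below, assuming 8-bit images. The function and parameter names are hypothetical, and the sampling scheme is our assumption (a single offset drawn uniformly in [-max_offset, +max_offset]; the text can also be read as a fixed amount with a random sign). The reported half-range values (6 gray levels for PyNET, 2.25 for EDSR) would be passed as `max_offset`:

```python
import numpy as np

def perturb_ground_truth(hr_image: np.ndarray, max_offset: float = 6.0,
                         rng=None) -> np.ndarray:
    """Sketch of the RLS ground-truth perturbation for super-resolution.

    One offset is drawn per image and added to every pixel, so local
    structure (semantic detail) is preserved; only the overall intensity
    shifts at each training iteration."""
    if rng is None:
        rng = np.random.default_rng()
    offset = rng.uniform(-max_offset, max_offset)
    return np.clip(hr_image.astype(np.float32) + offset, 0.0, 255.0)

# Hypothetical usage inside a training loop:
# for lr_img, hr_img in loader:
#     target = perturb_ground_truth(hr_img, max_offset=2.25)  # EDSR, half-range setting
#     ...compute the reconstruction loss against `target`...
```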

4 Methodology

This section provides a complete description of the experimental protocol used to evaluate RLS. We first describe the evaluation scenarios and the neural architectures employed, then the training protocol, and finally the datasets.

4.1 Scenarios

We considered three different scenarios to evaluate RLS, each differing in the input data or at the label level. The first one is standard image classification. The second task concerns image super-resolution, in which the neural network must magnify the input image, producing an enlarged version of it. For example, for a magnification factor of four, if the input has a size of \(200\times 200\), the model must produce an output of size \(800\times 800\).

Last but not least, another challenge is to simulate an image signal processor (ISP), which essentially creates an RGB image from a CFA (color filter array) acquired by the camera’s sensor. Therefore, given a CFA input, the task is to learn a CNN that can generate its corresponding RGB output.

4.2 Neural Network Architecture

As mentioned earlier [17], a good regularization technique should improve results in different problems to show it can enhance a given CNN’s outcome. To provide a fair evaluation, we tested four neural backbones across different tasks: two for image classification, one for single image super-resolution, and one for software ISP.

The first CNN we use to evaluate RLS is ResNet [9], more precisely ResNet-18. We have chosen this architecture because it is widely used to evaluate regularization techniques, allowing a natural comparison. This backbone comprises a sequence of convolutional and pooling layers, with pooling applied after every two or three convolutional layers. The main innovation in its architecture is the residual connections, which improve effectiveness to a certain extent.

EDSR [14] is one of the few neural networks already used to evaluate regularization methods [29], making it another natural choice. It is a residual convolutional network with a sequence of convolution–ReLU–convolution operations in its residual blocks and pixel shuffle operations [20] at the end to perform image super-resolution. PyNET [11], a multi-branch CNN with several parallel layers that uses different measures for error calculation, is well suited to evaluate problems related to image and signal processing, specifically image reconstruction.

4.3 Training Protocol

For the image classification problem, we considered the protocol suggested by [17]. The images were resized to \(32\times 32\) pixels and then randomly cropped into \(28\times 28\) patches. Stochastic gradient descent with Nesterov momentum is used for gradient calculation. The learning rate starts at \(10^{-2}\) and is multiplied by \(10^{-1}\) at epochs 80, 120, and 160.
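A sketch of this optimization setup in PyTorch is given below. The momentum value, weight decay, and total number of epochs are not stated in the text, so typical values are shown as assumptions; the model and data pipeline are placeholders:

```python
import torch
from torch import nn, optim
from torchvision import models

model = models.resnet18(num_classes=100)   # CIFAR-100 setup used in the experiments
criterion = nn.CrossEntropyLoss()          # accepts soft (RLS-smoothed) targets in recent PyTorch versions

# SGD with Nesterov momentum; the learning rate starts at 1e-2 and is divided
# by 10 at epochs 80, 120, and 160, as described in the protocol above.
# Momentum 0.9, weight decay 5e-4, and 200 epochs are assumed, not stated.
optimizer = optim.SGD(model.parameters(), lr=1e-2, momentum=0.9,
                      weight_decay=5e-4, nesterov=True)
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 120, 160], gamma=0.1)

for epoch in range(200):
    # ... one training pass over the 28x28 random crops goes here ...
    scheduler.step()
```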

Concerning image super-resolution, we did not find any well-established protocol. We therefore followed the same parameters used in CutBlur [29] and PyNET [11], which allows a natural comparison with these previous works.

In all scenarios, five training runs were performed to avoid results achieved merely by chance. We report the mean and standard deviation for each performance measure.

4.4 Datasets

We used a different dataset for each task evaluated in this work to allow a fair comparison against other methods. In each case, we selected the datasets that, according to our research, are the most used in each application domain considered here, i.e., image classification and super-resolution.

For the image classification task, we selected CIFAR-100 [12], one of the most used datasets to evaluate regularization techniques [17]. It comprises 50,000 training images from 100 different classes and 10,000 images for validation.

For the single image super-resolution task, we considered two different datasets, inspired by [29]. The first one is the Div2K [1] dataset, with 800 pairs of low- and high-resolution RGB images for training and 100 for evaluation. The other is RealSR [2], which comprises 459 pairs of images for training and 100 for model validation. In both cases, we used a magnification factor of four for comparison purposes.

The last one is the Zurich RAW to RGB dataset [11], used to evaluate techniques for image reconstruction. It is divided into 46,839 pairs of Bayer filter data and RGB images for training and 1,204 similar pairs for testing.

5 Experimental Results

This section compares RLS against some state-of-the-art regularization approaches. RLS is first evaluated on image classification tasks (Sect. 5.1) and later on image super-resolution problems (Sect. 5.2). The best results are presented in bold.

5.1 Image Classification

Table 1 presents the error rate of ResNet-18 on the CIFAR-100 dataset. RLS alone achieved the best outcome, outperforming seven other techniques. The first row of the table is our baseline, i.e., ResNet-18 without any regularization.

Table 1 Image classification experiment on the CIFAR-100 dataset

5.1.1 Working Along with Other Regularizers

As mentioned by [17], it is vital to check how a particular regularization algorithm works along with other regularization methods. Here, we provide some interesting outcomes. Table 2 presents the results of ResNet-18 using Cutout combined with other methods. Combined with Cutout, the proposed approach outperforms MaxDropout and TargetDrop (also combined with Cutout) by more than 0.5% on average, which we consider a good improvement.

Table 2 Results on CIFAR-100 using ResNet-18 with one or more regularization methods

Combining PyramidNet with ShakeDrop and RLS also results in some improvement. Table 3 shows the outcomes of PyramidNet without any regularization, using ShakeDrop, and using ShakeDrop+RLS. On average, the latter combination yielded an improvement of \(0.1\%\).

Table 3 Results on CIFAR-100 using PyramidNet with one or more regularization methods

5.2 Image Super-Resolution

Table 4 shows the outcomes of the EDSR backbone using different regularization techniques from [29]. Some conclusions can be drawn in this scenario. The first one concerns the Div2K dataset, whose results show that RLS using half of the perturbation value (RLS-Half) outperforms all methods, including the situation in which all techniques (i.e., EDSR, Cutout, Cutmix, Mixup, RGB permutation, Blend, and CutBlur) are used together (All).

The second analysis concerns the RealSR dataset. Although RLS did not surpass the neural network trained with all methods combined, it still achieved the best individual result.

Table 4 PSNR results on the Div2K and RealSR datasets for EDSR using regularization methods

We considered an additional experiment related to image reconstruction. Table 5 shows the outcomes of PyNET using the RLS algorithm, considering the PSNR and the Multiscale Structural Similarity Index Measure (MS-SSIM) [25] quality measures. An improvement over the original results (i.e., standard PyNET) can be observed when RLS is applied. It is worth mentioning that, even though several regularization methods are available for CNNs, none of them tackles, or is at least evaluated in, the context of image reconstruction. As far as we are aware [17], this is the first regularization approach that improves the results of deep learning models in this task.

Table 5 Results on the Zurich RAW to RGB dataset for PyNET

6 Discussion

Providing new regularization algorithms is not straightforward, for it often requires specialist knowledge of the problem at hand. Deep learning by itself is already an area of research that demands plenty of work when improvements are required. This section discusses the outcomes presented in the previous section.

6.1 Lack of Label Regularization Methods

Achieving the best possible results with a neural network is always desired, and regularization methods should be encouraged in most cases, as long as they do not break the semantics of the dataset. For multi-class classification, the label level can be considered a safe place to intervene, regardless of the application domain: the output is usually one-hot encoded, so there are few possibilities of harming or losing semantics.

This work introduces a label-level regularization method for convolutional neural networks. Comparing it with other algorithms is fair and easy because we follow the same evaluation protocol. However, more label-level techniques besides TSLA would be desirable to allow a better and more direct comparison.

6.2 Lack of Comparison for General-Purpose Applications

There might be a bias toward creating deep learning regularization algorithms only for the image classification task. We could find several regularization methods [5, 18, 30, 31] for comparison purposes in that context; however, we found only one for a direct comparison in the context of image super-resolution [29]. Besides, as far as we are aware, no other in-depth study has compared regularization techniques for image reconstruction.

The scarcity of works comparing regularization techniques on problems other than image classification is worrying. Indeed, we found some works [17, 29] that also point out this lack of research on regularization algorithms for a broader set of problems. We encourage researchers to develop new methods for other image processing problems, for it might be a promising area of research.

7 Conclusions and Future Works

We presented RLS, a technique for label-level regularization of convolutional neural networks. Our results demonstrate that it can outperform other techniques when applied to different image processing problems. As such, we tackle not only the enhancement of neural networks but also the problem of generalizing regularization algorithms, as pointed out by [17]. RLS can be combined with other techniques and used within any backbone.

In future works, we intend to apply RLS to other problems, such as natural language processing. Another goal is to check whether other random distributions can improve our current results.