Abstract
In supervised machine learning, the choice of loss function implicitly assumes a particular noise distribution over the data. For example, the widely used mean squared error (MSE) loss assumes Gaussian noise. The choice of loss function during training and testing affects the performance of artificial neural networks (ANNs), and MSE is known to yield substandard performance in the presence of outliers. The Cauchy loss function (CLF) assumes Cauchy-distributed noise, and is therefore potentially better suited to data with outliers. This paper aims to determine the extent of the robustness and generalisability of the CLF compared to MSE. CLF and MSE are assessed on several handcrafted regression problems, and on a real-world regression problem with artificially simulated outliers, in the context of ANN training. CLF yielded results that were either comparable to or better than those of MSE, with a few notable exceptions.
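To illustrate the contrast between the two losses, the following is a minimal sketch (not the paper's experimental setup): MSE grows quadratically in the residual, while the Cauchy loss, derived from the Cauchy negative log-likelihood up to constants, grows only logarithmically. The scale parameter `c` and the toy residuals below are assumptions for this example.

```python
import numpy as np

def mse_loss(residuals):
    # Mean squared error: assumes Gaussian noise; grows quadratically,
    # so a single large residual can dominate the loss
    return np.mean(residuals ** 2)

def cauchy_loss(residuals, c=1.0):
    # Cauchy loss (Cauchy negative log-likelihood up to constants):
    # grows only logarithmically, so outliers contribute far less
    return np.mean(np.log(1.0 + (residuals / c) ** 2))

clean = np.array([0.1, -0.2, 0.15])          # small, well-behaved residuals
with_outlier = np.append(clean, 10.0)        # one simulated outlier

# The outlier inflates MSE by orders of magnitude,
# while the Cauchy loss is affected far less
print(mse_loss(with_outlier) / mse_loss(clean))
print(cauchy_loss(with_outlier) / cauchy_loss(clean))
```

The quadratic growth of MSE is what makes it sensitive to outliers; bounding the influence of large residuals, as the logarithmic Cauchy loss does, is the robustness property the paper investigates.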
Supported by the NRF Thuthuka Grant Number 13819413.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Mlotshwa, T., van Deventer, H., Bosman, A.S. (2022). Cauchy Loss Function: Robustness Under Gaussian and Cauchy Noise. In: Pillay, A., Jembere, E., Gerber, A. (eds) Artificial Intelligence Research. SACAIR 2022. Communications in Computer and Information Science, vol 1734. Springer, Cham. https://doi.org/10.1007/978-3-031-22321-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22320-4
Online ISBN: 978-3-031-22321-1
eBook Packages: Computer Science, Computer Science (R0)