Cauchy Loss Function: Robustness Under Gaussian and Cauchy Noise

  • Conference paper
  • In: Artificial Intelligence Research (SACAIR 2022)

Abstract

In supervised machine learning, the choice of loss function implicitly assumes a particular noise distribution over the data. For example, the frequently used mean squared error (MSE) loss assumes a Gaussian noise distribution. The choice of loss function during training and testing affects the performance of artificial neural networks (ANNs). It is known that MSE may yield substandard performance in the presence of outliers. The Cauchy loss function (CLF) assumes a Cauchy noise distribution, and is therefore potentially better suited for data with outliers. This paper aims to determine the extent of robustness and generalisability of the CLF as compared to MSE. CLF and MSE are assessed on several handcrafted regression problems and a real-world regression problem with artificially simulated outliers, in the context of ANN training. CLF yielded results that were comparable to or better than those yielded by MSE, with a few notable exceptions.
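
To make the contrast concrete, the following minimal sketch compares the two losses on residuals containing a single outlier. The log(1 + (r/c)^2) form of the Cauchy loss and the scale c = 1 are common conventions assumed here for illustration; the exact formulation and scale used in the paper may differ.

    import numpy as np

    def mse_loss(residuals):
        # Mean squared error: the quadratic term lets a single large
        # residual dominate the average.
        return np.mean(residuals ** 2)

    def cauchy_loss(residuals, c=1.0):
        # A common form of the Cauchy loss, log(1 + (r/c)^2).
        # Grows logarithmically, so large residuals are down-weighted.
        # The scale c = 1 is an illustrative choice, not taken from the paper.
        return np.mean(np.log1p((residuals / c) ** 2))

    # Small residuals plus one outlier (8.0).
    r = np.array([0.1, -0.2, 0.05, 8.0])
    print(mse_loss(r))     # ~16.01: dominated by the outlier's 64.0
    print(cauchy_loss(r))  # ~1.06: outlier contributes only log(65) ~ 4.17

Because CLF grows logarithmically rather than quadratically in the residual, a single outlier's contribution to the loss (and to its gradient) stays bounded, which is the intuition behind its potential robustness during ANN training.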

Supported by the NRF Thuthuka Grant Number 13819413.

Author information

Corresponding author

Correspondence to Anna Sergeevna Bosman.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mlotshwa, T., van Deventer, H., Bosman, A.S. (2022). Cauchy Loss Function: Robustness Under Gaussian and Cauchy Noise. In: Pillay, A., Jembere, E., Gerber, A. (eds) Artificial Intelligence Research. SACAIR 2022. Communications in Computer and Information Science, vol 1734. Springer, Cham. https://doi.org/10.1007/978-3-031-22321-1_9

  • DOI: https://doi.org/10.1007/978-3-031-22321-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22320-4

  • Online ISBN: 978-3-031-22321-1

  • eBook Packages: Computer Science, Computer Science (R0)
