Mathematical Programming, Volume 174, Issue 1–2, pp 145–166

Regression analysis: likelihood, error and entropy

  • Bogdan Grechuk
  • Michael Zabarankin
Full Length Paper Series B


In a regression with independent and identically distributed normal residuals, the log-likelihood function yields an empirical form of the \(\mathcal{L}^2\)-norm, whereas the normal distribution can be obtained as a solution of differential entropy maximization subject to a constraint on the \(\mathcal{L}^2\)-norm of a random variable. The \(\mathcal{L}^1\)-norm and the double exponential (Laplace) distribution are related in a similar way. These are examples of an “inter-regenerative” relationship. In fact, the \(\mathcal{L}^2\)-norm and the \(\mathcal{L}^1\)-norm are just particular cases of the general error measures introduced by Rockafellar et al. (Finance Stoch 10(1):51–74, 2006) on a space of random variables. General error measures are not necessarily symmetric with respect to ups and downs of a random variable, which is a desirable property in finance applications, where gains and losses should be treated differently. This work identifies the set of all error measures, denoted by \(\mathscr {E}\), and the set of all probability density functions (PDFs) that form “inter-regenerative” relationships (through log-likelihood and entropy maximization). It also shows that M-estimators, which arise in robust regression but, in general, are not error measures, form “inter-regenerative” relationships with all PDFs. In fact, the set of M-estimators that are error measures coincides with \(\mathscr {E}\). On the other hand, M-estimators are a particular case of L-estimators, which also arise in robust regression. The set of L-estimators that are error measures is identified; it contains \(\mathscr {E}\) and the so-called trimmed \(\mathcal{L}^p\)-norms.
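The likelihood-to-error direction of the relationship described above can be illustrated numerically. The sketch below is not taken from the paper: it assumes the simplest possible regression (a single constant fitted to a small made-up sample), fixes both scale parameters at 1, and uses hypothetical helper names (`normal_loglik`, `laplace_loglik`, `fit_location`). Under those assumptions, maximizing the normal log-likelihood amounts to minimizing the sum of squared residuals (an empirical \(\mathcal{L}^2\)-type error), and maximizing the Laplace log-likelihood amounts to minimizing the sum of absolute residuals (an empirical \(\mathcal{L}^1\)-type error).

```python
import math
import statistics


def normal_loglik(residuals, sigma=1.0):
    """Log-likelihood of i.i.d. N(0, sigma^2) residuals (sigma fixed by assumption)."""
    n = len(residuals)
    sum_sq = sum(r * r for r in residuals)
    return -n * math.log(sigma * math.sqrt(2.0 * math.pi)) - sum_sq / (2.0 * sigma ** 2)


def laplace_loglik(residuals, b=1.0):
    """Log-likelihood of i.i.d. Laplace(0, b) residuals (b fixed by assumption)."""
    n = len(residuals)
    sum_abs = sum(abs(r) for r in residuals)
    return -n * math.log(2.0 * b) - sum_abs / b


def fit_location(y, loglik, grid):
    """Return the grid point c that maximizes the log-likelihood of the residuals y - c."""
    return max(grid, key=lambda c: loglik([yi - c for yi in y]))


# Made-up sample with one large observation, so the two fits differ visibly.
y = [1.0, 2.0, 2.5, 4.0, 10.5]
grid = [i / 100 for i in range(0, 1201)]  # candidate constants 0.00, 0.01, ..., 12.00

c_normal = fit_location(y, normal_loglik, grid)    # least-squares fit
c_laplace = fit_location(y, laplace_loglik, grid)  # least-absolute-deviations fit

print("normal MLE :", c_normal, "| sample mean  :", statistics.mean(y))
print("Laplace MLE:", c_laplace, "| sample median:", statistics.median(y))
```

For this sample the normal fit coincides with the sample mean and the Laplace fit with the sample median. A grid search is used only to keep the example dependency-free; it does not illustrate the reverse (entropy maximization) direction.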


Regression, Likelihood, Entropy, Error measure, M-estimator, L-estimator

Mathematics Subject Classification

90C90 90C25 90C15 



We are grateful to the referees for the comments and suggestions, which helped to improve the quality of the paper. The first author thanks the University of Leicester for granting him the academic study leave to do this research.


References

  1. Alfons, A., Croux, C., Gelper, S.: Sparse least trimmed squares regression for analyzing high-dimensional large data sets. Ann. Appl. Stat. 7(1), 226–248 (2013)
  2. Bartolucci, F., Scaccia, L.: The use of mixtures for dealing with non-normal regression errors. Comput. Stat. Data Anal. 48(4), 821–834 (2005)
  3. Bernholt, T.: Computing the least median of squares estimator in time O(\(n^d\)). In: International Conference on Computational Science and Its Applications, pp. 697–706. Springer (2005)
  4. Boscovich, R.J.: De litteraria expeditione per pontificiam ditionem, et synopsis amplioris operis, ac habentur plura ejus ex exemplaria etiam sensorum impressa. Bononiensi Scientarum et Artum Instituto Atque Academia Commentarii 4, 353–396 (1757)
  5. Box, G.: Non-normality and tests on variances. Biometrika 40, 318–335 (1953)
  6. Cover, T., Thomas, J.: Elements of Information Theory. Wiley, New York (2012)
  7. Edgeworth, F.: On observations relating to several quantities. Hermathena 6(13), 279–285 (1887)
  8. Efron, B.: Regression percentiles using asymmetric squared error loss. Stat. Sin. 1(1), 93–125 (1991)
  9. Föllmer, H., Schied, A.: Stochastic Finance, 3rd edn. de Gruyter, Berlin (2011)
  10. Gauss, C.F.: Theoria motus corporum coelestium in sectionibus conicis solem ambientium. sumtibus Frid. Perthes et IH Besser (1809)
  11. Grechuk, B., Molyboha, A., Zabarankin, M.: Maximum entropy principle with general deviation measures. Math. Oper. Res. 34(2), 445–467 (2009)
  12. Grechuk, B., Molyboha, A., Zabarankin, M.: Chebyshev inequalities with law-invariant deviation measures. Probab. Eng. Inf. Sci. 24(1), 145–170 (2010)
  13. Grechuk, B., Zabarankin, M.: Schur convex functionals: Fatou property and representation. Math. Finance 22(2), 411–418 (2012)
  14. Grechuk, B., Zabarankin, M.: Inverse portfolio problem with mean-deviation model. Eur. J. Oper. Res. 234(2), 481–490 (2014)
  15. Grechuk, B., Zabarankin, M.: Sensitivity analysis in applications with deviation, risk, regret, and error measures. SIAM J. Optim. 27(4), 2481–2507 (2017)
  16. Gu, Y., Zou, H.: High-dimensional generalizations of asymmetric least squares regression and their applications. Ann. Stat. 44(6), 2661–2694 (2016)
  17. Harter, L.: The method of least squares and some alternatives: Part I. In: International Statistical Review/Revue Internationale de Statistique, pp. 147–174 (1974)
  18. Hosking, J., Balakrishnan, N.: A uniqueness result for L-estimators, with applications to L-moments. Stat. Methodol. 24, 69–80 (2015)
  19. Huber, P.: Robust estimation of a location parameter. Ann. Math. Stat. 35(1), 73–101 (1964)
  20. Huber, P.: Robust Statistics. Wiley, New York (1981)
  21. Jaynes, E.T.: Information theory and statistical mechanics (notes by the lecturer). Stat. Phys. 3 1, 181 (1963)
  22. Jouini, E., Schachermayer, W., Touzi, N.: Law invariant risk measures have the Fatou property. Adv. Math. Econ. 9, 49–71 (2006)
  23. Koenker, R., Bassett Jr., G.: Regression quantiles. Econometrica 46(1), 33–50 (1978)
  24. Krokhmal, P.: Higher moment coherent risk measures. Quant. Finance 7(4), 373–387 (2007)
  25. Kullback, S., Leibler, R.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
  26. Laplace, P.S.: Traité de mécanique céleste, vol. 2. J. B. M. Duprat, Paris (1799)
  27. Lee, W.M., Hsu, Y.C., Kuan, C.M.: Robust hypothesis tests for M-estimators with possibly non-differentiable estimating functions. Econom. J. 18(1), 95–116 (2015)
  28. Legendre, A.M.: Nouvelles méthodes pour la détermination des orbites des comètes. 1. F. Didot, Paris (1805)
  29. Lisman, J., Van Zuylen, M.: Note on the generation of most probable frequency distributions. Stat. Neerl. 26(1), 19–23 (1972)
  30. Loh, P.L.: Statistical consistency and asymptotic normality for high-dimensional robust M-estimators. Ann. Stat. 45(2), 866–896 (2017)
  31. Mafusalov, A., Uryasev, S.: CVaR (superquantile) norm: stochastic case. Eur. J. Oper. Res. 249(1), 200–208 (2016)
  32. Morales-Jimenez, D., Couillet, R., McKay, M.: Large dimensional analysis of robust M-estimators of covariance with outliers. IEEE Trans. Signal Process. 63(21), 5784–5797 (2015)
  33. Mount, D., Netanyahu, N., Piatko, C., Silverman, R., Wu, A.: On the least trimmed squares estimator. Algorithmica 69(1), 148–183 (2014)
  34. Rockafellar, R.T., Royset, J.: Measures of residual risk with connections to regression, risk tracking, surrogate models, and ambiguity. SIAM J. Optim. 25(2), 1179–1208 (2015)
  35. Rockafellar, R.T., Royset, J.: Random variables, monotone relations, and convex analysis. Math. Program. 148(1–2), 297–331 (2014)
  36. Rockafellar, R.T., Uryasev, S.: Conditional value-at-risk for general loss distributions. J. Bank. Finance 26(7), 1443–1471 (2002)
  37. Rockafellar, R.T., Uryasev, S.: The fundamental risk quadrangle in risk management, optimization and statistical estimation. Surv. Oper. Res. Manag. Sci. 18(1), 33–53 (2013)
  38. Rockafellar, R.T., Uryasev, S., Zabarankin, M.: Generalized deviations in risk analysis. Finance Stoch. 10(1), 51–74 (2006)
  39. Rockafellar, R.T., Uryasev, S., Zabarankin, M.: Risk tuning with generalized linear regression. Math. Oper. Res. 33(3), 712–729 (2008)
  40. Rousseeuw, P., Leroy, A.: Robust Regression and Outlier Detection, vol. 589. Wiley, New York (2005)
  41. Rousseeuw, P., Van Driessen, K.: Computing LTS regression for large data sets. Data Min. Knowl. Disc. 12(1), 29–45 (2006)
  42. Rousseeuw, P.J.: Least median of squares regression. J. Am. Stat. Assoc. 79, 871–880 (1984)
  43. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423, 623–656 (1948)
  44. Xie, S., Zhou, Y., Wan, A.: A varying-coefficient expectile model for estimating value at risk. J. Bus. Econ. Stat. 32(4), 576–592 (2014)
  45. Zabarankin, M., Uryasev, S.: Statistical Decision Problems: Selected Concepts and Portfolio Safeguard Case Studies. Springer, Berlin (2014)

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2018

Authors and Affiliations

  1. Department of Mathematics, University of Leicester, Leicester, UK
  2. Department of Mathematical Sciences, Stevens Institute of Technology, Hoboken, USA
