Three steps towards robust regression

Psychometrika

Abstract

The three most commonly used statistics, the arithmetic mean, variance, and the product-moment correlation, are most unfortunate choices when data are not strictly Gaussian. A new measure of correlation and a measure of scale are proposed which are substantially more robust than their least squares counterparts. An illustration shows how increased robustness can be obtained through the use of equal regression weights without severe loss in accuracy. The paper also shows how incorporating knowledge about the theoretical structure of the regression coefficients into their estimation can aid substantially in increasing their robustness.

References

  • Anderson, N. H. Scales and statistics: Parametric and non-parametric. Psychological Bulletin, 1961, 58, 305–316.

  • Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rogers, W. H., and Tukey, J. W. Robust estimates of location. Princeton, N. J.: Princeton University Press, 1972.

  • Bock, R. D. Multivariate statistical methods in behavioral research. New York: McGraw-Hill, 1975.

  • Bock, R. D., and Kolakowski, D. Further evidence of sex-linked major gene influence on human spatial visualizing ability. American Journal of Human Genetics, 1973, 25, 1–14.

  • Bock, R. D., Wainer, H., Thissen, D., Peterson, A., Murray, J., and Roche, A. F. A parameterization of individual human growth curves. Human Biology, 1973, 45, 63–80.

  • Box, G. E. P., and Tiao, G. C. A Bayesian approach to some outlier problems. Biometrika, 1968, 55, 119–129.

  • Czuber, E. Theorie der Beobachtungsfehler. Leipzig, 1891.

  • David, H. A. Gini's mean difference rediscovered. Biometrika, 1968, 55, 573–574.

  • Devlin, S. J., Gnanadesikan, R., and Kettenring, J. R. Robust estimation and outlier detection with correlation coefficients. Biometrika, 1975, in press. (a)

  • Devlin, S. J., Gnanadesikan, R., and Kettenring, J. R. Robust estimation of correlation and covariance matrices. Paper presented at the spring meeting of the Psychometric Society, Iowa City, April 26, 1975. (b)

  • Downton, F. Linear estimates with polynomial coefficients. Biometrika, 1966, 53, 129–141.

  • Gauss, C. F. Göttingische gelehrte Anzeigen, 1821.

  • Gini, C. Variabilita e mutabilita, contributo allo studio delle distribuzione e relazione statistiche. Studi Economico-Giuridici della R. Universita di Cagliari, 1912.

  • Gnanadesikan, R., and Kettenring, J. R. Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics, 1972, 28, 81–124.

  • Green, B. F. Parameter sensitivity in multivariate methods. Mimeographed manuscript. Baltimore: Department of Psychology, Johns Hopkins University, 1974.

  • Helmert, F. R. Die Berechnung des wahrscheinlichen Beobachtungsfehlers aus den ersten Potenzen der Differenzen gleichgenauer directer Beobachtungen. Astronomische Nachrichten, 1876, 88, 257–272.

  • Hogg, R. V. Adaptive robust procedures: A partial review and some suggestions for further applications and theory. Journal of the American Statistical Association, 1974, 69, 909–927.

  • Hogg, R. V., and Randles, R. Adaptive distribution-free regression methods. Technometrics, 1975, in press.

  • Hotelling, H., and Pabst, M. R. Rank correlation and tests of significance involving no assumption of normality. Annals of Mathematical Statistics, 1936, 7, 29–43.

  • Huber, P. J. Robust statistics: A review. Annals of Mathematical Statistics, 1972, 43, 1041–1067.

  • Knuth, D. E. The art of computer programming (Vol. 2). Reading, Mass.: Addison-Wesley, 1969, 1–112.

  • Mood, A. M. Introduction to the theory of statistics. New York: McGraw-Hill, 1950.

  • Roche, A. F., Wainer, H., and Thissen, D. Predicting adult stature for individuals. Basel, Switz.: Karger, 1975.

  • Samejima, F. Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 1969, No. 17.

  • Singleton, R. C. An efficient algorithm for sorting with minimal storage. Communications of the Association for Computing Machinery, 1969, 12, 185–187.

  • Tukey, J. W. Exploratory data analysis (limited preliminary edition, Vol. 3). Reading, Mass.: Addison-Wesley, 1970.

  • Tukey, J. W., and McLaughlin, D. H. Less vulnerable confidence and significance procedures for location based upon a single sample: Trimming/Winsorization 1. Sankhyā, 1963, A25, 331–352.

  • von Andrae. Uber die Bestimmung des wahrscheinlichen Fehlers durch die gegebenen Differenzen vom gleich genauen Beobachtungen einer Unbekannten. Astronomische Nachrichten, 1872, 79, 257–272.

  • Wainer, H. Predicting the outcome of the Senate trial of Richard M. Nixon. Behavioral Science, 1974, 19, 404–406.

  • Wainer, H. Estimating coefficients in linear models: It don't make no nevermind. Psychological Bulletin, 1975, in press.

  • Wainer, H., Gruvaeus, G., and Zill, N. Senatorial decision making: I. The determination of structure. Behavioral Science, 1973, 18, 7–19. (a)

  • Wainer, H., and Thissen, D. Multivariate semi-metric smoothing in multiple prediction. Journal of the American Statistical Association, 1975, 70. (a)

  • Wainer, H., and Thissen, D. When jackknifing fails (or does it?) Psychometrika, 1975, 40, 113–114. (b)

  • Wainer, H., Zill, N., and Gruvaeus, G. Senatorial decision making: II. Prediction. Behavioral Science, 1973, 18, 20–26. (b)

  • Wright, S. Evolution and the genetics of populations. Vol. 1, Genetic and biometric foundations. Chicago: University of Chicago Press, 1968.

Additional information

Various aspects of the research reported here were supported by: NICH and HD grant 1 R01 HD08896-01 AFY to The University of Chicago, Howard Wainer, Principal Investigator; The Social Sciences Divisional Research Grants of the University of Chicago; NIH grant HD-04660 and contract NICHD-72-2735 to the Fels Research Institute, A. F. Roche, Principal Investigator.

We wish to thank John W. Tukey for his initial help and continuing interest. The measure of correlation (r_t) and the resistant measure of scale herein proposed stem directly from a suggestion by Professor Tukey. Additionally, we wish to acknowledge the help and useful suggestions of W. Kruskal, A. F. Roche, J. Kettenring, and R. Gnanadesikan.

About this article

Cite this article

Wainer, H., Thissen, D. Three steps towards robust regression. Psychometrika 41, 9–34 (1976). https://doi.org/10.1007/BF02291695

  • DOI: https://doi.org/10.1007/BF02291695
