Psychometrika, Volume 35, Issue 2, pp 257–271

A comparison of three predictor selection techniques in multiple regression

  • Robert L. McCornack

Abstract

Three methods for selecting a few predictors from the many available are described and compared with respect to shrinkage in cross-validation. From 2 to 6 predictors were selected from the 15 available in 100 samples ranging in size from 25 to 200. An iterative method was found to select predictors with slightly, but consistently, higher cross-validities than the popularly used stepwise method. A gradient method was found to equal the performance of the stepwise method only in the larger samples and for the largest predictor subsets.
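The notion of shrinkage in cross-validation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic data, the sample sizes, the helper names (`solve`, `fit`, `predict`, `corr`, `forward_select`), and the use of plain forward selection, which is a simplification of the stepwise method and is not the article's procedure or data. It selects 3 of 15 predictors in a derivation sample, then compares the multiple correlation there with the cross-validity in a fresh sample.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit(X, y, cols):
    """Least-squares weights (intercept first) for the predictor columns in cols."""
    rows = [[1.0] + [x[j] for j in cols] for x in X]
    p = len(cols) + 1
    A = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    g = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    return solve(A, g)

def predict(X, cols, w):
    """Apply intercept w[0] and weights w[1:] to the columns in cols."""
    return [w[0] + sum(wj * x[j] for wj, j in zip(w[1:], cols)) for x in X]

def corr(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def forward_select(X, y, k):
    """Greedy forward selection: add the predictor that most raises multiple R."""
    chosen = []
    while len(chosen) < k:
        rest = [j for j in range(len(X[0])) if j not in chosen]
        best = max(rest, key=lambda j: corr(
            predict(X, chosen + [j], fit(X, y, chosen + [j])), y))
        chosen.append(best)
    return chosen

def make_sample(rng, n, n_pred=15):
    """Synthetic data (hypothetical): only the first three predictors matter."""
    X = [[rng.gauss(0, 1) for _ in range(n_pred)] for _ in range(n)]
    y = [0.6 * x[0] + 0.4 * x[1] + 0.3 * x[2] + rng.gauss(0, 1) for x in X]
    return X, y

rng = random.Random(0)
X_dev, y_dev = make_sample(rng, 50)   # derivation sample
X_new, y_new = make_sample(rng, 50)   # cross-validation sample

chosen = forward_select(X_dev, y_dev, 3)
weights = fit(X_dev, y_dev, chosen)
r_fit = corr(predict(X_dev, chosen, weights), y_dev)  # multiple R in derivation sample
r_cv = corr(predict(X_new, chosen, weights), y_new)   # cross-validity of the same weights
print("selected:", chosen, "shrinkage:", r_fit - r_cv)
```

The weights are estimated once in the derivation sample and applied unchanged to the second sample; the difference between the two correlations is the shrinkage that a comparison of selection methods would examine.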

Keywords

Shrinkage Public Policy Iterative Method Statistical Theory Gradient Method 

Copyright information

© Psychometric Society 1970

Authors and Affiliations

  • Robert L. McCornack, San Diego State College, USA
