Stacking regressions is a method for forming linear combinations of different predictors to give improved prediction accuracy. The idea is to use cross-validation data and least squares under nonnegativity constraints to determine the coefficients in the combination. Its effectiveness is demonstrated in stacking regression trees of different sizes and in a simulation stacking linear subset and ridge regressions. Reasons why this method works are explored. The idea of stacking originated with Wolpert (1992).
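The procedure summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the setup used in the paper: the three toy level-0 predictors (`fit_mean`, `fit_line`, `fit_nearest`), the helper names, the interleaved 5-fold split, and the synthetic data are all hypothetical. The nonnegative least squares step is solved here by simple coordinate descent, swapped in for a dedicated solver such as the one described by Lawson and Hanson.

```python
import random

# Hypothetical level-0 predictors: each takes training data, returns a predict function.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def cv_predictions(fitters, xs, ys, n_folds=5):
    """z[i][k] = prediction of predictor k at x_i, fitted with x_i's fold left out."""
    n = len(xs)
    z = [[0.0] * len(fitters) for _ in range(n)]
    for fold in range(n_folds):
        test = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for k, fit in enumerate(fitters):
            model = fit(tx, ty)
            for i in test:
                z[i][k] = model(xs[i])
    return z

def nnls_coordinate_descent(z, ys, n_sweeps=200):
    """Approximately minimize sum_i (y_i - sum_k a_k z_ik)^2 subject to a_k >= 0."""
    n_pred = len(z[0])
    a = [0.0] * n_pred
    resid = list(ys)  # residual y - z a, starting from a = 0
    for _ in range(n_sweeps):
        for k in range(n_pred):
            col = [row[k] for row in z]
            norm = sum(c * c for c in col)
            if norm == 0.0:
                continue
            # Best a[k] with the other coefficients fixed, clipped at zero.
            step = sum(c * r for c, r in zip(col, resid)) / norm
            new = max(0.0, a[k] + step)
            delta = new - a[k]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                a[k] = new
    return a

def stack(fitters, xs, ys, n_folds=5):
    """Return (weights, predict): weights from CV data, models refit on all data."""
    z = cv_predictions(fitters, xs, ys, n_folds)
    weights = nnls_coordinate_descent(z, ys)
    models = [fit(xs, ys) for fit in fitters]
    return weights, lambda x: sum(w * m(x) for w, m in zip(weights, models))

# Synthetic data, purely illustrative.
rng = random.Random(0)
xs = [rng.uniform(-2, 2) for _ in range(60)]
ys = [1.0 + 2.0 * x + 0.5 * x * x + rng.gauss(0, 0.3) for x in xs]
weights, predict = stack([fit_mean, fit_line, fit_nearest], xs, ys)
print("stacking weights:", weights)
```

The coefficients are estimated from out-of-fold predictions only, so a predictor that overfits its own training data is not rewarded for doing so. The constraint keeps every coefficient at or above zero.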
Belsley, D.A., Kuh, E. and Welsch, R., "Regression Diagnostics," 1980, John Wiley and Sons, New York.
Berger, J.O. and Bock, M.E., "Combining independent normal mean estimation problems with unknown variances," Ann. Statist., 4, 1976, pp. 642–648.
Breiman, L., Friedman, J., Olshen, R. and Stone, C.J., "Classification and Regression Trees," 1984, Wadsworth, California.
Breiman, L. and Friedman, J.H., "Estimating Optimal Transformations in Multiple Regression and Correlation (with discussion)," J. Amer. Statist. Assoc., 80, 1985, pp. 580–619.
Breiman, L. and Spector, P., "Submodel Selection and Evaluation in Regression: The X-Random Case," International Statistical Review, 3, 1992, pp. 291–319.
Efron, B. and Morris, C., "Combining possibly related estimation problems (with discussion)," J. Roy. Statist. Soc. Ser. B, 35, 1973, pp. 379–421.
Green, E.J. and Strawderman, W.E., "A James-Stein type estimator for combining unbiased and possibly biased estimators," J. Amer. Statist. Assoc., 86, 1991, pp. 1001–1006.
Hoerl, A.E. and Kennard, R.W., "Ridge regression: Biased estimation for nonorthogonal problems," Technometrics, 12, 1970, pp. 55–67.
Lawson, C. and Hanson, R., "Solving Least Squares Problems," 1974, Prentice-Hall, New Jersey.
Luenberger, D., "Linear and Nonlinear Programming," 1984, Addison-Wesley Publishing Co.
LeBlanc, M. and Tibshirani, R., "Combining Estimates in Regression and Classification," Technical Report 9318, 1993, Dept. of Statistics, University of Toronto.
Perrone, M.P., "General Averaging Result for Convex Optimization," Proceedings of the 1993 Connectionist Models Summer School, Erlbaum Associates, 1994, pp. 364–371.
Rao, J.N.K. and Subrahmaniam, K., "Combining independent estimators and estimation in linear regression with unequal variances," Biometrics, 27, 1971, pp. 971–990.
Rubin, D.B. and Weisberg, S., "The variance of a linear combination of independent estimators using estimated weights," Biometrika, 62, 1975, pp. 708–709.
Wolpert, D., "Stacked Generalization," Neural Networks, Vol. 5, 1992, pp. 241–259.