Statistics and Computing, Volume 8, Issue 1, pp 25–33

Coaching variables for regression and classification

  • ROBERT TIBSHIRANI
  • GEOFFREY HINTON

Abstract

In a regression or classification setting where we wish to predict Y from x1, x2, ..., xp, we suppose that an additional set of ‘coaching’ variables z1, z2, ..., zm is available in our training sample. These might be variables that are difficult to measure, and they will not be available when we predict Y from x1, x2, ..., xp in the future. We consider two methods of using the coaching variables to improve the prediction of Y from x1, x2, ..., xp. The relative merits of the two approaches are discussed, and the approaches are compared in a number of examples.

Keywords: regression; classification; missing data; mixtures of experts

Copyright information

© Chapman and Hall 1998

Authors and Affiliations

  • ROBERT TIBSHIRANI (1)
  • GEOFFREY HINTON (2)
  1. Department of Public Health Sciences and Department of Statistics, University of Toronto, Toronto, Canada
  2. Department of Computer Science, University of Toronto, Toronto, Canada
