Bellman, R.E. 1961. Adaptive Control Processes. Princeton University Press.

Breiman, L. 1995. Bagging predictors. Dept. of Statistics, University of California, Berkeley, Technical Report.

Breiman, L. 1996. Bias, variance, and arcing classifiers. Dept. of Statistics, University of California, Technical Report (revised).

Breiman, L., Friedman, J.H., Olshen, R.A., and Stone, C.J. 1984. Classification and Regression Trees. Wadsworth.

Chow, W.S. and Chen, Y.C. 1992. A new fast algorithm for effective training of neural classifiers. Pattern Recognition, 25:423–429.

Dietterich, T.G. and Kong, E.B. 1995. Machine learning bias, statistical bias, and statistical variance of decision tree algorithms. Dept. of Computer Science, Oregon State University Technical Report.

Efron, B. and Tibshirani, R. 1995. Cross-validation and the bootstrap: Estimating the error rate of a prediction rule. Dept. of Statistics, Stanford University Technical Report.

Fix, E. and Hodges, J.L. 1951. Discriminatory analysis-nonparametric discrimination: Consistency properties. Randolph Field, Texas: U.S. Air Force School of Aviation Medicine Technical Report No. 4.

Friedman, J.H. 1985. Classification and multiple response regression through projection pursuit. Dept. of Statistics, Stanford University Technical Report LCS012.

Geman, S., Bienenstock, E., and Doursat, R. 1992. Neural networks and the bias/variance dilemma. Neural Comp., 4:1–48.

Good, I.J. 1965. The Estimation of Probabilities: An Essay on Modern Bayesian Methods. M.I.T. Press.

Hand, D.J. 1982. Kernel discriminant analysis. Chichester: Research Studies Press.

Heckerman, D., Geiger, D., and Chickering, D. 1994. Learning Bayesian networks: the combination of knowledge and statistical data. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, pp. 293–301, AAAI Press and M.I.T. Press.

Henley, W.E. and Hand, D.J. 1996. A *k*-nearest neighbour classifier for assessing consumer credit risk. The Statistician, 45:77–95.

Holte, R.C. 1993. Very simple classification rules perform well on most commonly used data sets. Machine Learning, 11:63–90.

Kohavi, R. and Wolpert, D.H. 1996. Bias plus variance decomposition for zero-one loss functions. Dept. of Computer Science, Stanford University Technical Report.

Kohonen, T. 1990. The self-organizing map. Proceedings of the IEEE, 78:1464–1480.

Langley, P., Iba, W., and Thompson, K. 1992. An analysis of Bayesian classifiers. In Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 223–228, AAAI Press and M.I.T. Press.

Lippmann, R. 1989. Pattern classification using neural networks. IEEE Communications Magazine, 11:47–64.

McLachlan, G.J. 1992. Discriminant Analysis and Statistical Pattern Recognition. Wiley.

Quinlan, J.R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann.

Rosen, D.B., Burke, H.B., and Goodman, P.H. 1995. Local learning methods in high dimension: Beating the bias-variance dilemma via recalibration. Workshop Machines That Learn-Neural Networks for Computing, Snowbird, Utah.

Tibshirani, R. 1996. Bias, variance and prediction error for classification rules. Dept. of Statistics, University of Toronto Technical Report.

Titterington, D.M., Murray, G.D., Murray, L.S., Spiegelhalter, D.J., Skene, A.M., Habbema, J.D.F., and Gelpke, G.J. 1981. Comparison of discrimination techniques applied to a complex data set of head injured patients. J. Roy. Statist. Soc. A, 144:145–175.
