Bartlett, P.L. (1998). The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network. *IEEE Transactions on Information Theory*, *44*(2), 525–536.

Bauer, E., & Kohavi, R. (1999). An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. *Machine Learning*, *36*(1/2), 105–139.

Baum, E.B., & Haussler, D. (1989). What size net gives valid generalization? *Neural Computation*, *1*(1), 151–160.

Blum, A. (1997). Empirical support for winnow and weighted-majority based algorithms: Results on a calendar scheduling domain. *Machine Learning*, *26*, 5–23.

Breiman, L. (1998). Arcing classifiers. *The Annals of Statistics*, *26*(3), 801–849.

Csiszár, I., & Tusnády, G. (1984). Information geometry and alternating minimization procedures. *Statistics and Decisions, Supplement Issue*, *1*, 205–237.

Google ScholarDietterich, T.G. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. *Machine Learning*, to appear.

Dietterich, T.G., & Bakiri, G. (1995). Solving multiclass learning problems via error-correcting output codes. *Journal of Artificial Intelligence Research*, *2*, 263–286.

Google ScholarDrucker, H., & Cortes, C. (1996). Boosting decision trees. In *Advances in Neural Information Processing Systems*, *8*, MIT Press.

Fletcher, R. (1987). *Practical Methods of Optimization* (second edition), John Wiley.

Freund, Y., Iyer, R., Schapire, R.E., & Singer, Y. (1998). An efficient boosting algorithm for combining preferences. *Machine Learning: Proceedings of the Fifteenth International Conference*.

Freund, Y., & Schapire, R.E. (1996). Experiments with a new boosting algorithm. *Machine Learning: Proceedings of the Thirteenth International Conference* (pp. 148–156).

Freund, Y., & Schapire, R.E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. *Journal of Computer and System Sciences*, *55*(1), 119–139.

Freund, Y., Schapire, R.E., Singer, Y., & Warmuth, M.K. (1997). Using and combining predictors that specialize. *Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing* (pp. 334–343).

Friedman, J., Hastie, T., & Tibshirani, R. (1998). *Additive logistic regression: A statistical view of boosting*. Technical Report.

Haussler, D. (1992). Decision theoretic generalizations of the PAC model for neural net and other learning applications. *Information and Computation*, *100*(1), 78–150.

Haussler, D., & Long, P.M. (1995). A generalization of Sauer's lemma. *Journal of Combinatorial Theory, Series A*, *71*(2), 219–240.

Kearns, M., & Mansour, Y. (1996). On the boosting ability of top-down decision tree learning algorithms. *Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing*.

Maclin, R., & Opitz, D. (1997). An empirical evaluation of bagging and boosting. *Proceedings of the Fourteenth National Conference on Artificial Intelligence* (pp. 546–551).

Margineantu, D.D., & Dietterich, T.G. (1997). Pruning adaptive boosting. *Machine Learning: Proceedings of the Fourteenth International Conference* (pp. 211–218).

Merz, C.J., & Murphy, P.M. (1998). *UCI repository of machine learning databases*. http://www.ics.uci.edu/~mlearn/MLRepository.html.

Quinlan, J.R. (1996). Bagging, boosting, and C4.5. *Proceedings of the Thirteenth National Conference on Artificial Intelligence* (pp. 725–730).

Schapire, R.E. (1997). Using output codes to boost multiclass learning problems. *Machine Learning: Proceedings of the Fourteenth International Conference* (pp. 313–321).

Schapire, R.E., Freund, Y., Bartlett, P., & Lee, W.S. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. *The Annals of Statistics*, *26*(5), 1651–1686.

Schapire, R.E., & Singer, Y. BoosTexter: A boosting-based system for text categorization. *Machine Learning*, to appear.

Schwenk, H., & Bengio, Y. (1998). Training methods for adaptive boosting of neural networks. In *Advances in Neural Information Processing Systems 10*. MIT Press.