Learning with Randomized Majority Votes
We propose algorithms for producing weighted majority votes that learn by probing the empirical risk of a randomized (uniformly weighted) majority vote, instead of probing the zero-one loss, at some margin level, of the deterministic weighted majority vote, as is often proposed. The learning algorithms minimize a risk bound that is convex in the weights. Our numerical results indicate that learners producing a weighted majority vote based on the empirical risk of the randomized majority vote at some finite margin have no significant advantage over learners that achieve the same task based on the empirical risk at zero margin. We also find that it is sufficient for learners to minimize only the empirical risk of the randomized majority vote at a fixed number of voters, without explicitly considering the entropy of the distribution of voters. Finally, our extensive numerical results indicate that the proposed learning algorithms produce weighted majority votes that generally compare favorably to those produced by AdaBoost.
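To make the central quantity concrete, the following is a minimal sketch of the empirical risk of a randomized majority vote at zero margin. It is an illustration under stated assumptions, not the authors' algorithm (which minimizes a risk bound): it assumes the randomized vote draws `n_voters` voters independently according to the weights and takes their uniform majority, and it counts ties as errors. The names `weighted_error_rate`, `randomized_vote_loss`, and `empirical_risk` are hypothetical.

```python
from math import comb


def weighted_error_rate(weights, votes, label):
    """Total weight of the voters that misclassify a single example."""
    return sum(w for w, v in zip(weights, votes) if v != label)


def randomized_vote_loss(p_err, n_voters):
    """Probability that a uniform majority vote of n_voters voters is wrong.

    Assumption: each drawn voter errs independently with probability p_err,
    so the number of wrong voters is Binomial(n_voters, p_err).
    Assumption: ties (possible when n_voters is even) count as errors.
    """
    threshold = (n_voters + 1) // 2  # smallest k with k >= n_voters / 2
    return sum(
        comb(n_voters, k) * p_err**k * (1 - p_err) ** (n_voters - k)
        for k in range(threshold, n_voters + 1)
    )


def empirical_risk(weights, vote_matrix, labels, n_voters):
    """Average randomized-vote loss over a sample.

    vote_matrix[j][i] is the output of voter i on example j.
    """
    losses = [
        randomized_vote_loss(weighted_error_rate(weights, votes, y), n_voters)
        for votes, y in zip(vote_matrix, labels)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    weights = [0.5, 0.3, 0.2]            # distribution over three voters
    vote_matrix = [[1, 1, -1], [-1, 1, 1]]
    labels = [1, 1]
    print(empirical_risk(weights, vote_matrix, labels, n_voters=3))
```

With `n_voters=1` the loss reduces to the weighted error rate of a single randomly drawn voter; larger values of `n_voters` bring the randomized vote closer to the deterministic weighted majority vote.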