Abstract
In the standard agnostic multiclass model, <instance, label> pairs are sampled independently from some underlying distribution. This distribution induces a conditional probability over the labels given an instance, and our goal in this paper is to learn this conditional distribution. Since even unconditional densities are quite challenging to learn, we give our learner access to <instance, conditional distribution> pairs. Assuming a base learner oracle in this model, we might seek a boosting algorithm for constructing a strong learner. Unfortunately, without further assumptions, this is provably impossible. However, we give a new boosting algorithm that succeeds in the following sense: given a base learner guaranteed to achieve some average accuracy (i.e., risk), we efficiently construct a learner that achieves the same level of accuracy with arbitrarily high probability. We give generalization guarantees of several different kinds, including distribution-free accuracy and risk bounds. None of our estimates depend on the number of boosting rounds and some of them admit dimension-free formulations.
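To make the "average accuracy to high-probability accuracy" idea concrete, here is a minimal sketch of a generic confidence-amplification scheme: train several candidates on disjoint chunks of <instance, conditional distribution> pairs and keep the one with the lowest held-out risk. This is a textbook-style illustration and is not claimed to be the algorithm of the paper; in particular it only gets close to the base learner's average risk, with some slack. The names `select_best` and `make_toy_base_learner`, the use of total variation distance as the loss, and the toy two-label distribution are all assumptions made for the example.

```python
import random


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def empirical_risk(h, sample):
    """Average TV distance between h(x) and the observed conditional distribution."""
    return sum(total_variation(h(x), p) for x, p in sample) / len(sample)


def select_best(base_learner, sample, k):
    """Train k candidates on disjoint chunks; return the one with lowest held-out risk."""
    half = len(sample) // 2
    holdout, train = sample[:half], sample[half:]
    chunk = len(train) // k
    candidates = [base_learner(train[i * chunk:(i + 1) * chunk]) for i in range(k)]
    return min(candidates, key=lambda h: empirical_risk(h, holdout))


def make_toy_base_learner(rng, failure_prob=0.3, bins=10):
    """Hypothetical base learner: good on average, but some runs fail outright."""
    def learn(chunk):
        if rng.random() < failure_prob:
            return lambda x: (0.5, 0.5)  # a "bad run" that ignores the data
        sums = [[0.0, 0.0] for _ in range(bins)]
        counts = [0] * bins
        for x, p in chunk:
            b = min(int(x * bins), bins - 1)
            counts[b] += 1
            sums[b][0] += p[0]
            sums[b][1] += p[1]

        def h(x):
            # Histogram estimate: average the observed distributions in x's bin.
            b = min(int(x * bins), bins - 1)
            if counts[b] == 0:
                return (0.5, 0.5)
            return (sums[b][0] / counts[b], sums[b][1] / counts[b])
        return h
    return learn


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy model (assumed): instances x in [0, 1], two labels, p(. | x) = (x, 1 - x).
    sample = [(x, (x, 1.0 - x)) for x in (rng.random() for _ in range(2000))]
    h = select_best(make_toy_base_learner(rng), sample, k=5)
    print("empirical risk of the selected hypothesis:", empirical_risk(h, sample))
```

With `k` independent candidates, the chance that every run is a bad one shrinks geometrically in `k`, which is why selection on a held-out set raises confidence without improving the base learner's typical accuracy.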
Additional information
A preliminary version was invited to ISAIM 2014. A.K. was partially supported by the Israel Science Foundation (grant No. 1141/12) and a Yahoo Faculty award.
Cite this article
Gutfreund, D., Kontorovich, A., Levy, R. et al. Boosting conditional probability estimators. Ann Math Artif Intell 79, 129–144 (2017). https://doi.org/10.1007/s10472-015-9465-7
DOI: https://doi.org/10.1007/s10472-015-9465-7