Definition
Consider a given random variable \(\underline{F}\) and a random variable that we can modify, \(\hat{\underline{F}}\). We wish to use a sample of \(\hat{\underline{F}}\) as an estimate of a sample of \(\underline{F}\). The mean squared error (MSE) between such a pair of samples is a sum of four terms. The first term reflects the statistical coupling between \(\underline{F}\) and \(\hat{\underline{F}}\); it vanishes when the two are independent and is conventionally ignored in bias-variance analysis. The second term reflects the inherent noise in \(\underline{F}\) and is independent of the estimator \(\hat{\underline{F}}\); accordingly, we cannot affect this term. In contrast, the third and fourth terms depend on \(\hat{\underline{F}}\). The third term, called the bias, is independent of the precise samples of both \(\underline{F}\) and \(\hat{\underline{F}}\) and reflects the difference between the means of \(\underline{F}\) and \(\hat{\underline{F}}\). The fourth term, called the variance, is independent of \(\underline{F}\) and reflects how much \(\hat{\underline{F}}\) varies about its own mean.
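For concreteness, one way to write this four-term decomposition, consistent with the description above (it follows by expanding \(\mathbb{E}[(\hat{\underline{F}}-\underline{F})^2]\) and grouping terms), is

\[
\mathbb{E}\big[(\hat{\underline{F}}-\underline{F})^2\big] \;=\; -2\,\operatorname{Cov}\big(\hat{\underline{F}},\underline{F}\big) \;+\; \operatorname{Var}\big(\underline{F}\big) \;+\; \big(\mathbb{E}[\hat{\underline{F}}]-\mathbb{E}[\underline{F}]\big)^2 \;+\; \operatorname{Var}\big(\hat{\underline{F}}\big),
\]

with the terms appearing in the order discussed: coupling, noise, (squared) bias, and variance. The identity can be checked numerically; the following minimal sketch (not part of the original entry; the Gaussian target and the deliberately coupled estimator are hypothetical choices made purely for illustration) estimates each term by Monte Carlo and compares their sum against the directly estimated MSE:

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
F = rng.normal(loc=1.0, scale=0.5, size=n)                 # samples of the target F
Fhat = 0.3 * F + rng.normal(loc=1.2, scale=0.4, size=n)    # estimator, deliberately coupled to F

mse = np.mean((Fhat - F) ** 2)                             # direct MSE estimate
coupling = -2 * np.cov(Fhat, F, bias=True)[0, 1]           # -2 Cov(Fhat, F): statistical coupling
noise = np.var(F)                                          # Var(F): inherent noise in F
bias_sq = (np.mean(Fhat) - np.mean(F)) ** 2                # squared difference of the means
variance = np.var(Fhat)                                    # Var(Fhat): spread of the estimator

print(mse, coupling + noise + bias_sq + variance)          # the two values agree

Note that when \(\hat{\underline{F}}\) and \(\underline{F}\) are statistically independent, the coupling term vanishes, recovering the familiar noise-plus-squared-bias-plus-variance form.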