Definition
Regression is a fundamental problem in statistics and machine learning. In regression studies, we are typically interested in inferring a real-valued function (called a regression function) whose values correspond to the mean of a dependent (or response or output) variable conditioned on one or more independent (or input) variables. Many different techniques for estimating this regression function have been developed, including parametric, semi-parametric, and nonparametric methods.
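The regression function defined above is the conditional mean E[y | x]. As a minimal illustrative sketch (not from the entry), the toy model below uses a hypothetical linear relationship y = 2x + Gaussian noise, so the regression function at any fixed x is 2x; averaging many noisy samples drawn at that x recovers it.

```python
import numpy as np

# Toy model (assumed for illustration): y = 2x + Gaussian noise,
# so the regression function -- the conditional mean E[y | x] -- is 2x.
rng = np.random.default_rng(0)

def regression_function(x):
    return 2.0 * x  # the true conditional mean in this toy model

x0 = 1.5
# Draw many noisy responses at the fixed input x0 and average them.
samples = regression_function(x0) + rng.normal(0.0, 0.5, size=100_000)
empirical_mean = samples.mean()
# empirical_mean approaches regression_function(x0) = 3.0 as the
# sample size grows, by the law of large numbers.
```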
Motivation and Background
Assume that we are given a set of data points sampled from an underlying but unknown distribution, each consisting of an input x and an output y. An example is given in Fig. 1. The task of regression is to learn a hidden functional relationship between x and y from observed, and possibly noisy, data points. In Fig. 1, the input–output relationship is a Gaussian-corrupted sinusoidal relationship, that is, \(y = \sin(2\pi x) + \epsilon\), where \(\epsilon\) is Gaussian noise.
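The data-generating process just described can be sketched in a few lines. The snippet below is an assumed illustration, not the entry's own experiment: it draws noisy samples from y = sin(2πx) + ε and fits a cubic polynomial by least squares, one simple parametric choice of regression method (the noise level 0.1 and sample size are arbitrary).

```python
import numpy as np

# Generate noisy sinusoidal data as in Fig. 1: y = sin(2*pi*x) + eps,
# with eps zero-mean Gaussian (std 0.1 chosen arbitrarily here).
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, size=n)

# One simple parametric regression method: a cubic polynomial
# fitted by ordinary least squares.
coeffs = np.polyfit(x, y, deg=3)
y_hat = np.polyval(coeffs, x)
residual_var = np.mean((y - y_hat) ** 2)
# The residual variance is small but sits a bit above the noise
# variance (0.01), since a cubic only approximates the sine.
```

A more flexible (e.g., nonparametric) estimator could drive the residual variance closer to the noise floor, at the risk of overfitting; this bias–variance trade-off is discussed at length in the recommended reading below.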
Recommended Reading
Machine learning textbooks such as Bishop (2006), among others, introduce different regression models. For a more statistical introduction, including an extensive overview of the many semi-parametric and nonparametric methods such as kernel methods, see Hastie et al. (2003). For coverage of key statistical issues, including nonlinear regression, identifiability, measures of curvature, and autocorrelation, see Seber and Wild (1989). For a large variety of built-in regression techniques, refer to R (http://www.r-project.org/).
Bishop C (2006) Pattern recognition and machine learning. Springer, New York
Gaffney S, Smyth P (1999) Trajectory clustering with mixtures of regression models. In: ACM SIGKDD, vol 62. ACM, New York, pp 63–72
Geman S, Bienenstock E, Doursat R (1992) Neural networks and the bias/variance dilemma. Neural Comput 4:1–58
Goldberg P, Williams C, Bishop C (1998) Regression with input-dependent noise: a Gaussian process treatment. In: Neural information processing systems, vol 10. MIT Press, Cambridge
Hastie T, Tibshirani R, Friedman J (2003) The elements of statistical learning: data mining, inference, and prediction, corrected edn. Springer, New York
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Nelder JA, Wedderburn RWM (1972) Generalized linear models. J R Stat Soc Ser A 135:370–384
Seber G, Wild C (1989) Nonlinear regression. Wiley, New York
© 2017 Springer Science+Business Media New York
About this entry
Cite this entry
Quadrianto, N., Buntine, W.L. (2017). Regression. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_716
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1