Abstract
Linear regression is one of the most accessible machine learning methods and has strong roots in statistics. It addresses problems that concern the numerical relationship between input variables (or features) and output variables (or target variables). In this chapter we introduce regression analysis from a machine learning perspective, with the goal of inferring the functional relation between features and target variables and thereby understanding important aspects of the data. As a prerequisite, we explain how trained models are used to make predictions. We also introduce a number of more general concepts and notations that are important for later chapters.
If you don’t know where you are going, any road will get you there.
Lewis Carroll (1832–1898)
English author and mathematician
Notes
- 1.
Galton, probably for practical reasons, did not use the mean values, which would have been directly related to a least-squares fit.
- 2.
Also, the human eye is surprisingly good at finding a good fit of a line to given data—at least as long as the data is not too “unusual.” In any case, looking at a fitted line and the data is certainly a good “sanity check”!
- 3.
In Sect. 12.3.3, we will see that there are different types of “errors,” but for now we use error just as a synonym for any sort of deviation.
- 4.
Hint: To find the minimum, we have to take the derivative. Each squared residual then reduces to a multiple of the residual itself, which is a very simple function.
- 5.
If we are looking for a maximum instead, we can still use the whole “minimization machinery”; only the sign of the cost function needs to be changed.
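Notes 4 and 5 above can be illustrated with a minimal Python sketch using NumPy. This is not the chapter's own code: the toy data, the function names (`cost`, `gradient`, `f`, `neg_f_grad`), the step size, and the iteration count are all assumptions chosen for illustration. The first part checks that the derivative of the sum of squared residuals, which contains only the plain residuals, vanishes at the least-squares fit. The second part finds a maximum by minimizing the cost function with flipped sign.

```python
import numpy as np

# Hypothetical toy data (not from the chapter): a noisy line y = 2x + 1.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(x.size)


def cost(w, b):
    """Sum of squared residuals for the line w * x + b."""
    residuals = y - (w * x + b)
    return np.sum(residuals**2)


def gradient(w, b):
    """Partial derivatives of the cost with respect to w and b.

    Differentiating a squared residual leaves 2 * residual times the
    inner derivative (-x for w, -1 for b).
    """
    residuals = y - (w * x + b)
    return np.array([-2.0 * np.sum(residuals * x), -2.0 * np.sum(residuals)])


# Least-squares fit of a straight line; np.polyfit returns the
# coefficients with the highest degree first (slope, then intercept).
w_fit, b_fit = np.polyfit(x, y, deg=1)

# Note 4: at the minimum, both partial derivatives are (numerically) zero.
grad_at_min = gradient(w_fit, b_fit)


# Note 5: maximizing f is the same as minimizing -f.
def f(t):
    return -(t - 3.0) ** 2 + 5.0  # has its maximum at t = 3


def neg_f_grad(t):
    return 2.0 * (t - 3.0)  # derivative of -f(t)


# Plain gradient descent on -f (step size 0.1 is an arbitrary choice).
t = 0.0
for _ in range(200):
    t -= 0.1 * neg_f_grad(t)

print("gradient at the least-squares fit:", grad_at_min)
print("maximizer of f found by minimizing -f:", t)
```

Plotting the fitted line `w_fit * x + b_fit` over the data points is a quick visual sanity check in the spirit of note 2.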
References
NumPy. URL https://numpy.org/.
Website dedicated to Galton’s life and work, 2022. URL https://galton.org/.
H. Blockeel. Hypothesis Space, pages 511–513. Springer US, Boston, MA, 2010. ISBN 978-0-387-30164-8. DOI https://doi.org/10.1007/978-0-387-30164-8_373.
L. N. G. Filon, G. U. Yule, H. Westergaard, M. Greenwood, and K. Pearson. Speeches delivered at a dinner held in University College, London, in honour of Professor Karl Pearson, 23 April 1934. URL https://archive.org/details/filon-et-al-1934-speeches-delivered-at-a-dinner/.
K. Pearson. LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11):559–572, 1901. DOI https://doi.org/10.1080/14786440109462720.
F. Galton. Natural Inheritance. New York: Macmillan and Company, 5th edition, 1894.
W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes 3rd Edition: The Art of Scientific Computing. Cambridge University Press, 3rd edition, 2007. ISBN 0521880688.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Sandfeld, S. (2024). A First Approach to Machine Learning with Linear Regression. In: Materials Data Science. The Materials Research Society Series. Springer, Cham. https://doi.org/10.1007/978-3-031-46565-9_12
DOI: https://doi.org/10.1007/978-3-031-46565-9_12
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46564-2
Online ISBN: 978-3-031-46565-9
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)