A First Approach to Machine Learning with Linear Regression

Sandfeld, Stefan

doi:10.1007/978-3-031-46565-9_12

Part of the book series: The Materials Research Society Series (MRSS)

Abstract

Linear regression is one of the most accessible machine learning methods and has strong roots in the field of statistics. Problems of interest concern the numerical relationship between the input variables (or features) and the output variables (or target variables). In this chapter we introduce machine learning regression analysis with the goal of inferring the functional relation between features and target variables, which helps us to understand important aspects of the data. As a prerequisite, it is explained how trained models are used to make predictions. Furthermore, a number of more general concepts and notations are introduced that are also important for later chapters.

If you don’t know where you are going, any road will get you there.

Lewis Carroll (1832–1898)

English author and mathematician

Notes

1. Galton, probably for practical reasons, did not use the mean values, which would have been directly related to a least-squares fit.
2. Also, the human eye is surprisingly good at finding a good fit of a line to given data, at least as long as the data is not too “unusual.” In any case, looking at a fitted line together with the data is certainly a good “sanity check”!
3. In Sect. 12.3.3, we will see that there are different types of “errors,” but for now we use error simply as a synonym for any sort of deviation.
4. Hint: To find the minimum, we have to take the derivative. The derivative of each squared residual is then proportional to the residual itself, which is a very simple function.
5. If we are looking for a maximum, we can still use the whole “minimization machinery”; only the sign of the cost function needs to be changed.
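
The hints in notes 4 and 5 can be tried out numerically. The following is a minimal sketch, not the chapter's own code: the toy data, the function names `cost`, `cost_gradient`, and `g`, and the use of `np.polyfit` for the fit are all assumptions made for illustration. It checks that the derivative of the summed squared residuals vanishes at the least-squares fit, and that a maximum can be located by minimizing the sign-flipped function.

```python
import numpy as np

# Hypothetical toy data (an assumption): points scattered around y = 2x + 1.
rng = np.random.default_rng(seed=0)
x = np.linspace(0.0, 5.0, 20)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)


def cost(w, b):
    """Sum of squared residuals for the line y = w*x + b."""
    residuals = y - (w * x + b)
    return np.sum(residuals**2)


def cost_gradient(w, b):
    """Derivative of the cost with respect to w and b.

    Each squared residual contributes -2 * residual times the derivative
    of the model (x for the slope w, 1 for the intercept b).
    """
    residuals = y - (w * x + b)
    return np.array([-2.0 * np.sum(residuals * x), -2.0 * np.sum(residuals)])


# Least-squares fit of a degree-1 polynomial; returns [slope, intercept].
w_fit, b_fit = np.polyfit(x, y, deg=1)

# Note 4: at the minimum the gradient vanishes (up to floating-point error).
grad_at_fit = cost_gradient(w_fit, b_fit)


# Note 5: the maximum of g is found as the minimum of -g.
def g(t):
    """Concave toy function (an assumption) with its maximum at t = 1.5."""
    return -(t - 1.5) ** 2 + 4.0


t_grid = np.linspace(0.0, 3.0, 301)
t_max = t_grid[np.argmin(-g(t_grid))]

print("slope, intercept:", w_fit, b_fit)
print("gradient at fit:", grad_at_fit)
print("argmax of g via minimizing -g:", t_max)
```

The grid search for the maximum is deliberately crude; any minimization routine could be used in its place once the sign of the function is flipped.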

Author information

Authors and Affiliations

Institute for Advanced Simulation – Materials Data Science and Informatics (IAS-9), 52068 Aachen, Germany
Stefan Sandfeld

About this chapter

Cite this chapter

Sandfeld, S. (2024). A First Approach to Machine Learning with Linear Regression. In: Materials Data Science. The Materials Research Society Series. Springer, Cham. https://doi.org/10.1007/978-3-031-46565-9_12

DOI: https://doi.org/10.1007/978-3-031-46565-9_12
Published: 17 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46564-2
Online ISBN: 978-3-031-46565-9
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)