Notes
1. Often these observed variables are mathematically modeled by random variables, which can cause some confusion. We will not make this connection until Chap. 4. For this chapter, we will only have data and no model.
2. Do not take the word measurement too literally here. It does not necessarily involve a scientist reading numbers from a fancy instrument; it can be a person's age or sex, for example.
3. The numbers should be meaningful as numbers rather than as codes for something else. For example, zip codes are not quantitative variables.
4. Sometimes a categorical variable specifies groups that have a natural ordering (e.g. freshman, sophomore, junior, senior), in which case the variable is called ordinal.
5. For simplicity, we will talk as if there is always exactly one response variable. That will indeed be the case throughout this text. The alternative context is called multivariate analysis.
6. When the response variable is categorical, this selection is called classification.
7. The terms least-squares point and location regression are not common, but they are introduced in order to help clarify the coherence of this chapter's topics.
8. If n is left unspecified, do we have to worry about whether the drawings remain valid with n < 3? Not to worry: the drawing remains valid, as the relevant vectors span a subspace that can be embedded within the three-dimensional picture.
9. More precisely, these statistics are unbiased estimates of the variances and covariance of the random variables X and Y, assuming \((x_1, y_1), \ldots, (x_n, y_n)\) were iid draws from the distribution of \((X, Y)\); see Exercise 3.36.
10. Perhaps you would rather use the slope-intercept parameterization y = a + bx in the first place rather than subtracting \(\bar{x}\) from x. I have concluded that there is very good reason to subtract \(\bar{x}\), so bear with me. Either way you parameterize it, you will derive the same least-squares line.
11. Of course, individuals also have a chance to be more exceptional than their parents in any given characteristic, and as a result aggregate population characteristics remain relatively stable.
12. In that section, the centered data matrix was defined by subtracting the empirical mean vector from each row of the data matrix. Convince yourself that the expression given here is equivalent.
13. The least-squares coefficient of that column was identified as \(\hat{b}_0\) in Sect. 2.1.3, while the least-squares coefficients of \(x^{(1)}, \ldots, x^{(m)}\) are exactly the same as the coefficients calculated for their centered versions.
14. The notation is called an indicator function; it takes the value 1 if its condition is true and 0 otherwise.
15. Galton used the letter R for the "regression coefficient" in the least-squares equation for the standardized variables, which we now call the correlation. In simple linear regression, \(R^2\) is exactly the squared correlation, which is why the statistic is called \(R^2\). Note that with more than one explanatory variable, this interpretation no longer works.
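The claim in note 10 — that the slope-intercept parameterization and the centered parameterization \(y = a + b(x - \bar{x})\) produce the same least-squares line — is easy to check numerically. A minimal sketch with NumPy, using invented data (the arrays `x` and `y` are illustrative, not from the text):

```python
import numpy as np

# Hypothetical small data set for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 3.7, 4.2, 5.1])

xbar = x.mean()

# Slope-intercept fit: y ≈ b0 + b1 * x
# np.polyfit returns coefficients from highest degree down: (slope, intercept).
b1, b0 = np.polyfit(x, y, 1)

# Centered fit: y ≈ a + b * (x - xbar)
b, a = np.polyfit(x - xbar, y, 1)

# Both parameterizations give the same fitted line.
assert np.allclose(b0 + b1 * x, a + b * (x - xbar))
assert np.isclose(b1, b)

# A bonus of centering: the intercept a is simply the mean of y,
# because the least-squares line passes through (xbar, ybar).
assert np.isclose(a, y.mean())
```

The last assertion hints at why the book prefers the centered form: the two coefficients decouple, with the intercept estimating the center of y and the slope estimating the linear trend.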
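The centering operation described in note 12 — subtracting the empirical mean vector from each row of the data matrix — is a one-line broadcast in NumPy. A sketch with a made-up matrix of n = 4 observations on m = 2 variables:

```python
import numpy as np

# Hypothetical data matrix: rows are observations, columns are variables.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [6.0, 40.0]])

# Empirical mean vector: one mean per column (per variable).
mean_vec = X.mean(axis=0)

# Centered data matrix: broadcasting subtracts mean_vec from every row.
Xc = X - mean_vec

# Sanity check: every column of the centered matrix sums to zero.
assert np.allclose(Xc.sum(axis=0), 0.0)
```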
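Note 15's identity — that in simple linear regression \(R^2\) is exactly the squared correlation between x and y — can also be verified directly. A sketch with invented data, computing \(R^2\) as the proportion of variance explained:

```python
import numpy as np

# Hypothetical data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.4, 3.8, 5.1, 5.8])

# Least-squares simple linear regression fit.
slope, intercept = np.polyfit(x, y, 1)
yhat = intercept + slope * x

# R^2 as one minus the ratio of residual to total sum of squares.
ss_res = np.sum((y - yhat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Squared empirical correlation between x and y.
corr = np.corrcoef(x, y)[0, 1]
assert np.isclose(r_squared, corr ** 2)
```

As the note warns, this equivalence is special to the one-explanatory-variable case; with multiple explanatory variables, \(R^2\) no longer equals any single pairwise squared correlation.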
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Brinda, W.D. (2021). Least-Squares Linear Regression. In: Visualizing Linear Models. Springer, Cham. https://doi.org/10.1007/978-3-030-64167-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64166-5
Online ISBN: 978-3-030-64167-2
eBook Packages: Mathematics and Statistics (R0)