Abstract
In Chap. 1 we learned how to solve a system of linear equations. All of the systems were square (i.e., the number of equations equaled the number of unknowns) and each system had an exact solution. Systems like these do not characterize most statistical analyses. Instead, we deal with rectangular systems (more equations than unknowns) for which an exact solution does not exist. Such systems are called overdetermined linear systems, and in this chapter you will learn how to solve them and become familiar with their role in a linear regression model.
Notes
- 1.
The error term is also referred to as the residuals, and in this chapter we will use the terms interchangeably.
- 2.
Some textbooks refer to the QR decomposition as the QR factorization. These terms are interchangeable. Unfortunately, there is another technique known as the QR algorithm that is used in a different context. We will discuss this method in Chap. 4, but it should not be confused with the QR decomposition itself.
- 3.
Some textbooks use the term orthogonal to mean orthonormal, failing to distinguish the two. The practice is defensible because any matrix with orthogonal columns can be normalized to produce one with orthonormal columns.
- 4.
In the section that follows we will learn other ways of calculating R that offer greater numerical stability.
- 5.
Rearranging terms in Eq. (2.11) shows another way to calculate b from R: b = R−1Q′y. Because this method requires computing an inverse, it is both less accurate and less efficient than backward substitution.
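As a minimal sketch of this note (in NumPy rather than the book's R code; all names here are illustrative), the two routes to b can be compared: backward substitution on the triangular system Rb = Q′y versus the inverse-based formula b = R−1Q′y.

```python
import numpy as np

def back_substitute(R, v):
    """Solve R b = v for upper-triangular R by backward substitution."""
    n = len(v)
    b = np.zeros(n)
    for i in range(n - 1, -1, -1):
        b[i] = (v[i] - R[i, i + 1:] @ b[i + 1:]) / R[i, i]
    return b

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))   # overdetermined: 10 equations, 3 unknowns
y = rng.standard_normal(10)

Q, R = np.linalg.qr(X)             # thin QR decomposition: X = QR
b_back = back_substitute(R, Q.T @ y)         # preferred route
b_inv = np.linalg.inv(R) @ (Q.T @ y)         # the less stable alternative
b_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

print(np.allclose(b_back, b_lstsq))  # both agree with the least squares solution
```

On a small, well-conditioned problem like this the two routes agree; the accuracy advantage of backward substitution shows up with ill-conditioned R.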
- 6.
In Chap. 1 we noted that many statistical packages, including R, return an upper triangular matrix when performing the Cholesky decomposition. In part, this is because the Cholesky decomposition of A′A is equivalent to the R matrix from a QR decomposition of A.
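This equivalence can be checked numerically (a NumPy sketch, not the book's code). Since X = QR implies X′X = R′R, the upper Cholesky factor of X′X matches R once R's rows are scaled to have a positive diagonal.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 3))

Q, R = np.linalg.qr(X)
# np.linalg.cholesky returns the lower factor L with X'X = L L';
# its transpose is the upper triangular Cholesky factor.
U = np.linalg.cholesky(X.T @ X).T

# Flip row signs so R has a positive diagonal, matching Cholesky's convention
signs = np.sign(np.diag(R))
R_pos = signs[:, None] * R
print(np.allclose(R_pos, U))  # True
```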
- 7.
A more sophisticated implementation of this method, known as the Modified Gram-Schmidt Orthogonalization, can be found in Golub and van Loan (2013).
- 8.
We can also compute Q = XR−1.
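The classical procedure behind notes 7 and 8 can be sketched as follows (a NumPy stand-in for the book's R code; the modified version in Golub and van Loan (2013) reorders the projection updates for stability). The final check confirms the identity Q = XR−1 from note 8.

```python
import numpy as np

def gram_schmidt(X):
    """Classical Gram-Schmidt: return Q with orthonormal columns and
    upper-triangular R such that X = QR."""
    n, k = X.shape
    Q = np.zeros((n, k))
    R = np.zeros((k, k))
    for j in range(k):
        v = X[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ X[:, j]   # projection onto earlier basis vectors
            v -= R[i, j] * Q[:, i]        # remove that component
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

rng = np.random.default_rng(2)
X = rng.standard_normal((6, 3))
Q, R = gram_schmidt(X)

print(np.allclose(Q @ R, X))                    # reconstructs X
print(np.allclose(Q.T @ Q, np.eye(3)))          # columns are orthonormal
print(np.allclose(Q, X @ np.linalg.inv(R)))     # note 8: Q = X R^{-1}
```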
- 9.
Other ways to create the rotation have greater numerical stability than the one shown here, and the accompanying R code uses a method from Golub and van Loan (2013).
- 10.
The sign of the rotated vectors differs among decomposition methods, but this has no effect on the decomposition itself.
- 11.
Portions of this section are excerpted from Brown (2014).
- 12.
Nonlinear regression models are discussed later in this text.
- 13.
In practice, the predictors can be random variables as long as their values are generated by a mechanism that is unrelated to the error term.
- 14.
Bias and efficiency are known as finite-sample properties because they hold for any fixed sample size; in contrast, consistency is an asymptotic property that describes an estimator's behavior as the sample size grows without bound.
- 15.
This definition assumes that we have only one predictor. When we have multiple predictors, we need to further stipulate that they are not linearly dependent.
- 16.
Notice that as long as the errors have expectation 0, the least squares estimators are unbiased regardless of whether the errors are independent or identically distributed. These latter properties do, however, affect the efficiency of the least squares estimators.
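A hypothetical Monte Carlo sketch of this note (NumPy stand-in; design and parameter values are illustrative): even when the errors are heteroskedastic, the least squares estimates average out to the true coefficients as long as the errors have expectation 0.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 50, 2000
X = np.column_stack([np.ones(n), rng.uniform(0, 5, n)])  # fixed design
beta = np.array([1.0, 2.0])                              # true coefficients

est = np.zeros((reps, 2))
for r in range(reps):
    # mean-zero but heteroskedastic errors: variance grows with the predictor
    e = rng.standard_normal(n) * (0.5 + X[:, 1])
    y = X @ beta + e
    est[r] = np.linalg.lstsq(X, y, rcond=None)[0]

print(est.mean(axis=0))  # averages close to the true [1.0, 2.0]
```

Heteroskedasticity does not bias the estimates here, but it does inflate their sampling variance, which is the efficiency cost the note refers to.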
- 17.
A more thorough discussion of maximum likelihood estimation can be found in a variety of textbooks, including Brown (2014).
References
Brown, J. D. (2014). Linear models in matrix form: A hands-on approach for the behavioral sciences. New York: Springer.
Davis, C. H. (1857). Theory of the motion of the heavenly bodies moving about the sun in conic sections: A translation of Gauss’s “Theoria Motus”. Boston: Little, Brown and Company.
Gauss, C. F. (1809). Theoria motus corporum coelestium. In Carl Friedrich Gauss – Werke. Königliche Gesellschaft der Wissenschaften zu Göttingen (1906).
Golub, G. H., & van Loan, C. F. (2013). Matrix computations (4th ed.). Baltimore: Johns Hopkins University Press.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this chapter
Brown, J.D. (2018). Least Squares Estimation. In: Advanced Statistics for the Behavioral Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-93549-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93547-8
Online ISBN: 978-3-319-93549-2
eBook Packages: Mathematics and Statistics (R0)