Abstract
An alternative look at the linear regression model is taken by proposing an original treatment of a full column rank model (design) matrix. In such a situation, the Moore–Penrose inverse of the matrix can be obtained by utilizing a particular formula which is applicable solely when the matrix to be inverted can be columnwise partitioned into two matrices of disjoint ranges. It turns out that this approach, besides simplifying derivations, provides a novel insight into some of the notions involved in the model and reduces the computational costs needed to obtain the sought estimators. The paper also contains a numerical example based on astronomical observations of the position of Polaris, demonstrating the usefulness of the proposed approach.
1 Introduction
The problem of curve fitting on the basis of a finite number of observations arises in almost all areas where mathematics is applied, and one of the most powerful tools used for this purpose is the least squares method. Over the years, a rich body of results has appeared in the literature, providing indisputable evidence that matrix analysis concepts and techniques offer handy means of applying the method. The present paper constitutes a further contribution to this stream of considerations by demonstrating how an expression for the Moore–Penrose inverse of a columnwise partitioned matrix derived in Baksalary and Baksalary (2007, Theorem 1) may be advantageously utilized to deal with problems originating from linear regression. Among the benefits of the proposed approach one may mention: a simplification of derivations, a novel insight into the notions involved in the regression model, and a reduction of the computational costs necessary to obtain the sought estimators. Furthermore, by simplifying inevitable mathematical operations, the proposed approach offers an attractive alternative to researchers who prefer not to exploit more advanced matrix methods than necessary, or who wish to avoid software packages that do not provide comprehensive control over the processed data, as it enables one to perform calculations almost “by hand,” preserving insight into every step of the linear regression procedure.
The aforementioned representation of the Moore–Penrose inverse established in Baksalary and Baksalary (2007, Theorem 1) is recalled in the following lemma.
Lemma 1.1
Let \({\mathbf{A}}\) be an \(n \times m\), \(m\geqslant 2\), real matrix columnwise partitioned as \({\mathbf{A}} = ({\mathbf{A}}_1 : {\mathbf{A}}_2)\), with \({\mathbf{A}}_i\) denoting \(n \times m_i\), \(i = 1, 2\), matrices such that \(m_1 + m_2 = m\). Furthermore, let the ranges of \({\mathbf{A}}_1\) and \({\mathbf{A}}_2\) be disjoint. Then the Moore–Penrose inverse of \(\mathbf{A}\) is of the form
\[ {\mathbf{A}}^\dagger = \begin{pmatrix} ({\mathbf{Q}}_2 {\mathbf{A}}_1)^\dagger \\ ({\mathbf{Q}}_1 {\mathbf{A}}_2)^\dagger \end{pmatrix}, \tag{1} \]
where \({\mathbf{Q}}_i\), \(i = 1, 2\), is the orthogonal projector onto the null space of the transpose of \({\mathbf{A}}_i\) and \(({\mathbf{Q}}_i \mathbf{A}_j)^\dagger \), \(i = 1, 2\), \(i \ne j\), is the Moore–Penrose inverse of \({\mathbf{Q}}_i {\mathbf{A}}_j\).
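The representation in Lemma 1.1 can be checked numerically against a general-purpose `pinv` routine; the sketch below (not part of the original derivation) uses randomly generated blocks, whose ranges are disjoint with probability one.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m1, m2 = 6, 2, 3
A1 = rng.standard_normal((n, m1))
A2 = rng.standard_normal((n, m2))
A = np.hstack([A1, A2])  # A = (A1 : A2), ranges disjoint almost surely

# Orthogonal projectors onto the null spaces of A1' and A2'
Q1 = np.eye(n) - A1 @ np.linalg.pinv(A1)
Q2 = np.eye(n) - A2 @ np.linalg.pinv(A2)

# Moore-Penrose inverse assembled blockwise, as in Lemma 1.1
A_dagger = np.vstack([np.linalg.pinv(Q2 @ A1), np.linalg.pinv(Q1 @ A2)])

print(np.allclose(A_dagger, np.linalg.pinv(A)))  # True
```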
In the next section we briefly discuss particular linear regression models, shining a spotlight on issues in which the Moore–Penrose inverse of a columnwise partitioned matrix naturally emerges. These considerations are followed by Sect. 3, which contains an example demonstrating the applicability of the present approach. The data used in the example originate from the observations of the position of Polaris made on December 12, 1983, by S.G. Brewer, which were afterwards used by Pedler (1993) to develop a solution to an (as the author claims) “astronomical problem” aimed at fitting a circle to a set of points. Section 4 provides a number of remarks concerned with the proposed approach.
2 Particular linear regression models
All matrices occurring in what follows have real entries and the superscript \(^\prime \) stands for the matrix transpose. Let us consider the linear regression model
\[ {\mathbf{y}} = {\mathbf{X}} \varvec{\beta } + {\mathbf{u}}, \tag{2} \]
where \({\mathbf{y}}\) is an \(n \times 1\) random vector of observations, \({\mathbf{X}}\) is an \(n \times p\) known model (design) matrix of constants, \(\varvec{\beta }\) is a \(p \times 1\) vector of unknown parameters, and \({\mathbf{u}}\) is an \(n \times 1\) vector of unknown errors. The entries of the vector \({\mathbf{u}} = (u_1, u_2,\ldots , u_n)^\prime \) are assumed to have a mean of zero and (unknown) variance \(\sigma ^2\), and each pair \(u_i\), \(u_j\), \(i \ne j\), is assumed to be uncorrelated, i.e., the expectation vector and the covariance matrix of \({\mathbf{u}}\) are \({\mathsf {E}}({\mathbf{u}}) = {\mathbf{0}}\) and \(\mathsf {Cov}({\mathbf{u}}) = \sigma ^2 {\mathbf{I}}_n\), respectively. Customarily, the symbol \({\mathbf{I}}_n\) stands for the identity matrix of order n. We also assume that the matrix \({\mathbf{X}}\) is of full column rank. Then, the least squares estimator (LSE) of \(\varvec{\beta }\) is given by
\[ \hat{\varvec{\beta }} = ({\mathbf{X}}^\prime {\mathbf{X}})^{-1} {\mathbf{X}}^\prime {\mathbf{y}}. \]
It is worth emphasizing that the assumption that \({\mathbf{X}}\) is of full column rank plays a crucial role, and is most often made to assure uniqueness of the estimator of \(\varvec{\beta }\); see Puntanen et al. (2011, p. 34). It turns out that \(({\mathbf{X}}^\prime {\mathbf{X}})^{-1} {\mathbf{X}}^\prime = {\mathbf{X}}^\dagger \), i.e., the Moore–Penrose inverse of \({\mathbf{X}}\); see Appendix A.
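The coincidence of \(({\mathbf{X}}^\prime {\mathbf{X}})^{-1} {\mathbf{X}}^\prime \) with \({\mathbf{X}}^\dagger \) under full column rank is easy to confirm numerically; the matrix below is randomly generated, hence of full column rank with probability one.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 3))  # full column rank with probability one

# Least squares matrix (X'X)^{-1} X' versus the Moore-Penrose inverse
lse_matrix = np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(lse_matrix, np.linalg.pinv(X)))  # True
```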
To calculate \({\mathbf{X}}^\dagger \), instead of bothering with the inverse of \({\mathbf{X}}^\prime {\mathbf{X}}\), we may write the regressor matrix \({\mathbf{X}}\) in the columnwise partitioned form
\[ {\mathbf{X}} = ({\mathbf{X}}_1 : {\mathbf{X}}_2), \tag{3} \]
where \({\mathbf{X}}_i\), \(i = 1, 2\), denote \(n \times p_i\) matrices such that \(p_1 + p_2 = p\). Since \({\mathbf{X}}\) is of full column rank, it follows that
\[ {{\mathcal {R}}}({\mathbf{X}}_1) \cap {{\mathcal {R}}}({\mathbf{X}}_2) = \{ {\mathbf{0}} \}, \tag{4} \]
where \({{\mathcal {R}}}(\cdot )\) stands for the column space (range) of a matrix argument. According to Lemma 1.1, the Moore–Penrose inverse of a matrix of the form (3), such that (4) is satisfied, can be expressed as
\[ {\mathbf{X}}^\dagger = \begin{pmatrix} ({\mathbf{Q}}_2 {\mathbf{X}}_1)^\dagger \\ ({\mathbf{Q}}_1 {\mathbf{X}}_2)^\dagger \end{pmatrix}, \tag{5} \]
where \({\mathbf{Q}}_i = {\mathbf{I}}_n - {\mathbf{X}}_i {\mathbf{X}}_i^\dagger \) is the orthogonal projector onto \({{\mathcal {N}}}({\mathbf{X}}_i^\prime )\), the null space of \({\mathbf{X}}_i^\prime \), \(i = 1, 2\); see Appendix A. Note that the condition (4), under which the representation of the Moore–Penrose inverse (5) is valid, is weaker than the requirement that \({\mathbf{X}}\) specified in (3) is of full column rank (the assumption which will be extensively exploited in what follows).
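The closing remark, that range disjointness is weaker than full column rank, can be illustrated with a small hand-built example (the particular matrices are chosen only for illustration): the first block has two identical columns, so the partitioned matrix is rank deficient, yet the blockwise formula still reproduces the Moore–Penrose inverse.

```python
import numpy as np

# X1 has two identical columns (rank 1), so X = (X1 : X2) is NOT of
# full column rank, yet R(X1) and R(X2) are disjoint.
X1 = np.array([[1.0, 1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
X2 = np.array([[0.0], [1.0], [0.0], [0.0]])
X = np.hstack([X1, X2])

Q1 = np.eye(4) - X1 @ np.linalg.pinv(X1)
Q2 = np.eye(4) - X2 @ np.linalg.pinv(X2)

X_dagger = np.vstack([np.linalg.pinv(Q2 @ X1), np.linalg.pinv(Q1 @ X2)])
print(np.allclose(X_dagger, np.linalg.pinv(X)))  # True
```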
Let us consider the simple linear regression model
\[ {\mathbf{y}} = {\beta }_0 {\mathbf{1}} + {\beta }_1 {\mathbf{x}} + {\mathbf{u}}, \tag{6} \]
where \({\beta }_0, {\beta }_1 \in {\mathbb {R}}\), \({\mathbf {1}} = (1, 1,\ldots , 1)^\prime \) is the vector of n ones and \({\mathbf{x}} = (x_1, x_2,\ldots , x_n)^\prime \) is the vector of observations on one regressor variable. The vector \({\mathbf{u}} = (u_1, u_2,\ldots , u_n)^\prime \) consists of the unknown errors. To obtain the LSE of the parameter vector \(\varvec{\beta } = ({\beta }_0, {\beta }_1)^\prime \) we may partition the \(n \times 2\) matrix \({\mathbf{X}}\) as
\[ {\mathbf{X}} = ({\mathbf{1}} : {\mathbf{x}}), \tag{7} \]
where \({{\mathcal {R}}}({\mathbf{1}}) \cap {{\mathcal {R}}}({\mathbf{x}}) = \{ {\mathbf{0}} \}\) since we assume that \({\mathbf{X}}\) is of full column rank. By (5) it follows that
\[ {\mathbf{X}}^\dagger = \begin{pmatrix} ({\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}})^\dagger \\ ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}})^\dagger \end{pmatrix}, \tag{8} \]
where \({\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}} = ({\mathbf{I}}_n - {\mathbf{x}}{\mathbf{x}}^\dagger ) {\mathbf{1}}\) and \({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}} = ({\mathbf{I}}_n - {\mathbf{1}}{\mathbf{1}}^\dagger ) {\mathbf{x}}\) are both column vectors. Thus, by the identity (A1) given in Appendix A,
\[ ({\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}})^\dagger = ({\mathbf{1}}^\prime {\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}})^{-1} {\mathbf{1}}^\prime {\mathbf{Q}}_{\mathbf{x}} \quad \text{and} \quad ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}})^\dagger = ({\mathbf{x}}^\prime {\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}})^{-1} {\mathbf{x}}^\prime {\mathbf{Q}}_{{\mathbf{1}}}. \]
Consequently, the LSE of \(\varvec{\beta } = ({\beta }_0, {\beta }_1)^\prime \) is
\[ \hat{\varvec{\beta }} = {\mathbf{X}}^\dagger {\mathbf{y}} = \begin{pmatrix} ({\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}})^\dagger {\mathbf{y}} \\ ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}})^\dagger {\mathbf{y}} \end{pmatrix}. \tag{9} \]
From (8) we obtain
\[ \hat{\beta }_0 = ({\mathbf{1}}^\prime {\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}})^{-1} {\mathbf{1}}^\prime {\mathbf{Q}}_{\mathbf{x}} {\mathbf{y}} \quad \text{and} \quad \hat{\beta }_1 = ({\mathbf{x}}^\prime {\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}})^{-1} {\mathbf{x}}^\prime {\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{y}}, \]
whence
\[ \hat{\beta }_1 = \frac{\sum \nolimits _{i=1}^{n} (x_i - \overline{x})(y_i - \overline{y})}{\sum \nolimits _{i=1}^{n} (x_i - \overline{x})^2} \quad \text{and} \quad \hat{\beta }_0 = \overline{y} - \hat{\beta }_1 \overline{x}. \]
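That the projector-based estimates agree with the textbook slope and intercept formulas can be confirmed on synthetic data (the particular coefficients below are hypothetical, chosen only for the check).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10
x = rng.standard_normal(n)
y = 1.5 + 2.0 * x + 0.1 * rng.standard_normal(n)  # synthetic data
one = np.ones(n)

Qx = np.eye(n) - np.outer(x, x) / (x @ x)   # I - x x^+
Q1 = np.eye(n) - np.outer(one, one) / n     # I - 1 1^+

w = Qx @ one                                # Q_x 1
v = Q1 @ x                                  # Q_1 x (centered x)
b0 = (w @ y) / (w @ w)                      # (Q_x 1)^+ y
b1 = (v @ y) / (v @ v)                      # (Q_1 x)^+ y

# Classical least squares formulas for comparison
slope = ((x - x.mean()) @ (y - y.mean())) / ((x - x.mean()) @ (x - x.mean()))
intercept = y.mean() - slope * x.mean()
print(np.allclose([b0, b1], [intercept, slope]))  # True
```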
Let us use the symbol \({\mathbf{H}}\) to denote the so-called hat-matrix, which represents the orthogonal projector onto \({{\mathcal {R}}}(\mathbf{X})\), i.e., \({\mathbf{H}} = {\mathbf{X}}{\mathbf{X}}^\dagger \). Then, the identities (6)–(8) entail
\[ {\mathbf{H}} = {\mathbf{1}} ({\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}})^\dagger + {\mathbf{x}} ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}})^\dagger, \tag{10} \]
which means that the hat-matrix is a sum of two matrices of rank one. Furthermore, each of the summands involved in (10) is idempotent. This observation leads to the conclusion (see e.g., Rao and Mitra (1971, Theorem 5.1.2)) that the matrices necessarily commute and their product is equal to the zero matrix, i.e.,
\[ {\mathbf{1}} ({\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}})^\dagger \, {\mathbf{x}} ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}})^\dagger = {\mathbf{0}} = {\mathbf{x}} ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{x}})^\dagger \, {\mathbf{1}} ({\mathbf{Q}}_{\mathbf{x}} {\mathbf{1}})^\dagger. \]
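The decomposition of the hat-matrix into two rank-one idempotents with vanishing cross products can be verified numerically; the data are random, so the design matrix is of full column rank with probability one.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
x = rng.standard_normal(n)
one = np.ones(n)
X = np.column_stack([one, x])

w = (np.eye(n) - np.outer(x, x) / (x @ x)) @ one   # Q_x 1
v = (np.eye(n) - np.outer(one, one) / n) @ x       # Q_1 x

P1 = np.outer(one, w) / (w @ w)   # 1 (Q_x 1)^+, rank one
P2 = np.outer(x, v) / (v @ v)     # x (Q_1 x)^+, rank one

H = X @ np.linalg.pinv(X)
print(np.allclose(H, P1 + P2),                            # H is their sum
      np.allclose(P1 @ P1, P1), np.allclose(P2 @ P2, P2), # idempotent
      np.allclose(P1 @ P2, 0), np.allclose(P2 @ P1, 0))   # zero products
```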
Since \({\mathbf{1}} \in {{\mathcal {R}}}({\mathbf{X}})\), it follows that \({\mathbf{H}}{\mathbf{1}} = {\mathbf{1}}\). Hence, by denoting \(\hat{\mathbf{y}} = {\mathbf{H}}{\mathbf{y}}\), we see that \({\mathbf{1}}^\prime \hat{\mathbf{y}} = {\mathbf{1}}^\prime {\mathbf{H}}{\mathbf{y}} = {\mathbf{1}}^\prime {\mathbf{y}} \), which gives \(\sum \nolimits _{i=1}^{n} \hat{y}_i = \sum \nolimits _{i=1}^{n} y_i\). Furthermore, by putting \(\hat{\mathbf{u}} = ({\mathbf{I}}_n - {\mathbf{H}}){\mathbf{y}}\), we obtain \({\mathbf{1}}^\prime \hat{\mathbf{u}} = {\mathbf{1}}^\prime ({\mathbf{I}}_n - {\mathbf{H}}){\mathbf{y}} = 0\), so \( \sum \nolimits _{i=1}^{n} \hat{u}_i = 0\). Another consequence of \({\mathbf{1}} \in {{\mathcal {R}}}({\mathbf{X}})\) is the identity
\[ {\mathbf{y}}^\prime ({\mathbf{I}}_n - {\mathbf{J}}) {\mathbf{y}} = {\mathbf{y}}^\prime ({\mathbf{H}} - {\mathbf{J}}) {\mathbf{y}} + {\mathbf{y}}^\prime ({\mathbf{I}}_n - {\mathbf{H}}) {\mathbf{y}}, \tag{11} \]
with \({\mathbf{J}} = {\mathbf{1}}{\mathbf{1}}^\dagger \); see Puntanen et al. (2011, Proposition 8.5). Alternatively, the equality (11) can be expressed as \(SST = SSR + SSE\), where SST stands for the total sum of squares, SSR for the regression sum of squares, and SSE for the residual sum of squares. The coefficient of determination defined as
\[ R^2 = \frac{SSR}{SST} \tag{12} \]
turns out to be
\[ R^2 = \frac{{\mathbf{y}}^\prime ({\mathbf{H}} - {\mathbf{J}}) {\mathbf{y}}}{{\mathbf{y}}^\prime ({\mathbf{I}}_n - {\mathbf{J}}) {\mathbf{y}}}. \]
Clearly, \(R^2 \geqslant 0\). Another observation is that \({\mathbf{I}}_n - {\mathbf{J}} - ({\mathbf{H}} - {\mathbf{J}}) = {\mathbf{I}}_n - {\mathbf{H}}\) is nonnegative definite (as \({\mathbf{I}}_n - {\mathbf{H}}\) is the orthogonal projector onto \({{\mathcal {N}}}({\mathbf{X}}^\prime )\)). Hence, \({\mathbf{H}} - {\mathbf{J}} {\mathop {\leqslant }\limits ^{{\mathsf {L}}}} {\mathbf{I}}_n - {\mathbf{J}}\), where the symbol \({\mathop {\leqslant }\limits ^{{\mathsf {L}}}}\) denotes the Löwner partial ordering, whence we conclude that \(0 \leqslant R^2 \leqslant 1\). The fact that values of \(R^2\) are restricted to the interval [0, 1] is known in the literature (see e.g., Davidson and MacKinnon (1993, p. 14)), but it is usually demonstrated in a rather more involved way than in the present paper.
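The sum-of-squares decomposition and the bounds on \(R^2\) can be checked on synthetic data (the coefficients below are hypothetical, chosen only for the illustration).

```python
import numpy as np

rng = np.random.default_rng(4)
n = 12
x = rng.standard_normal(n)
y = 0.5 - 1.0 * x + 0.3 * rng.standard_normal(n)  # synthetic data
X = np.column_stack([np.ones(n), x])

H = X @ np.linalg.pinv(X)   # hat-matrix, projector onto R(X)
J = np.ones((n, n)) / n     # 1 1^+

SST = y @ (np.eye(n) - J) @ y
SSR = y @ (H - J) @ y
SSE = y @ (np.eye(n) - H) @ y
R2 = SSR / SST

print(np.isclose(SST, SSR + SSE), 0.0 <= R2 <= 1.0)  # True True
```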
Consider now the general linear model with intercept
\[ {\mathbf{y}} = {\beta }_0 {\mathbf{1}} + {\mathbf{X}} \varvec{\beta } + {\mathbf{u}}, \]
where \({\mathbf{X}}\) and \(\varvec{\beta }\) are of dimensions \(n \times (p -1)\) and \((p - 1) \times 1\), respectively, and \(({\mathbf{1}} : {\mathbf{X}})\) is assumed to be of full column rank. Then, by analogy to (9), the LSE of \(({\beta }_0, \varvec{\beta })^\prime \) is
\[ \begin{pmatrix} \hat{\beta }_0 \\ \hat{\varvec{\beta }} \end{pmatrix} = \begin{pmatrix} ({\mathbf{Q}}_{\mathbf{X}} {\mathbf{1}})^\dagger {\mathbf{y}} \\ ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{X}})^\dagger {\mathbf{y}} \end{pmatrix}. \tag{13} \]
On account of (A1), we obtain \(({\mathbf{Q}}_{\mathbf{X}} {\mathbf{1}})^\dagger = ({\mathbf{1}}^\prime {\mathbf{Q}}_{\mathbf{X}} {\mathbf{1}})^{-1}{\mathbf{1}}^\prime {\mathbf{Q}}_{\mathbf{X}}\). Hence, \((\mathbf{Q}_{\mathbf{X}} {\mathbf{1}})^\dagger = ({\mathbf{Q}}_{\mathbf{X}} {\mathbf{1}})^\dagger {\mathbf{Q}}_{\mathbf{X}}\). Similarly, we arrive at \((\mathbf{Q}_{{\mathbf{1}}} {\mathbf{X}})^\dagger = ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{X}})^\dagger {\mathbf{Q}}_{{\mathbf{1}}}\). In consequence, since \({\mathsf {E}}({\mathbf{y}}) = {\beta }_0 {\mathbf{1}} + {\mathbf{X}} \varvec{\beta }\), and since both \({\mathbf{Q}}_{{\mathbf{1}}} \mathbf{X}\) and \({\mathbf{Q}}_{\mathbf{X}} {\mathbf{1}}\) are of full column rank, we have
\[ {\mathsf {E}} \begin{pmatrix} \hat{\beta }_0 \\ \hat{\varvec{\beta }} \end{pmatrix} = \begin{pmatrix} ({\mathbf{Q}}_{\mathbf{X}} {\mathbf{1}})^\dagger \, {\mathsf {E}}({\mathbf{y}}) \\ ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{X}})^\dagger \, {\mathsf {E}}({\mathbf{y}}) \end{pmatrix} = \begin{pmatrix} {\beta }_0 \\ \varvec{\beta } \end{pmatrix}, \]
i.e., the estimator (13) is unbiased.
Note that the hat-matrix turns out to be
\[ {\mathbf{H}} = {\mathbf{1}} ({\mathbf{Q}}_{\mathbf{X}} {\mathbf{1}})^\dagger + {\mathbf{X}} ({\mathbf{Q}}_{{\mathbf{1}}} {\mathbf{X}})^\dagger, \tag{14} \]
a sum of two matrices of which the former is of rank one and the latter is of the same rank as the matrix \({\mathbf{X}}\), i.e., \(p - 1\). Similarly as above, both summands which determine the hat-matrix specified in (14) are commuting idempotents, whose product equals the zero matrix.
The formula (14) (as well as its particular case (10)) can be viewed as an alternative to the representation of \({\mathbf{H}}\) as a sum of two orthogonal projectors, which reads \({\mathbf{H}} = {\mathbf{J}} + {\mathbf{P}}_{{\mathbf{C}}{\mathbf{X}}}\), where \({\mathbf{P}}_{{\mathbf{C}}{\mathbf{X}}} = {\mathbf{C}}{\mathbf{X}} ({\mathbf{C}}{\mathbf{X}})^\dagger \), with \({\mathbf{C}}\) denoting the so-called centering matrix defined as \({\mathbf{C}} = {\mathbf{I}}_n - {\mathbf{J}}\); see Puntanen et al. (2011, formula (8.108)). It is clear that \({\mathbf{J}}{\mathbf{P}}_{{\mathbf{C}}{\mathbf{X}}} = {\mathbf{0}}\).
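Both decompositions of the hat-matrix, the partition-based one and the projector-based one via the centering matrix, can be compared numerically for a model with intercept and two randomly generated regressors.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 9
X = rng.standard_normal((n, 2))   # regressors without the intercept column
one = np.ones(n)
J = np.outer(one, one) / n
C = np.eye(n) - J                 # centering matrix

# Partition-based decomposition: H = 1 (Q_X 1)^+ + X (Q_1 X)^+
QX = np.eye(n) - X @ np.linalg.pinv(X)
H_partition = (np.outer(one, QX @ one) / (one @ QX @ one)
               + X @ np.linalg.pinv(C @ X))

# Projector-based decomposition: H = J + P_CX
P_CX = (C @ X) @ np.linalg.pinv(C @ X)
H_projector = J + P_CX

H = np.column_stack([one, X]) @ np.linalg.pinv(np.column_stack([one, X]))
print(np.allclose(H_partition, H), np.allclose(H_projector, H),
      np.allclose(J @ P_CX, 0))
```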
3 Applications
As in Pedler (1993, Sect. 6), we consider now the linear regression model with 4 regressors
Let \({\mathbf{x}} = (x_1, x_2,\ldots , x_n)^\prime \), \({\mathbf{y}} = (y_1, y_2, \ldots , y_n)^\prime \), \({\mathbf{a}} = (a_1, a_2,\ldots , a_n)^\prime \), \({\mathbf{b}} = (b_1, b_2,\ldots , b_n)^\prime \), \(\varvec{\beta } = (p, q, u, v)^\prime \), \({\mathbf{c}} = (c_1, c_2,\ldots , c_n)^\prime \), \({\mathbf{d}} = (d_1, d_2,\ldots , d_n)^\prime \). Furthermore, let \({\mathbf{X}}\) be a \(2n \times 4\) matrix of the form
Then (15) can be written as
As the matrix \({\mathbf{X}}\) is of full column rank, we can determine the LSE of \(\varvec{\beta }\) by applying the representation derived in Baksalary and Trenkler (2021, Example 1) as a consequence of Baksalary and Baksalary (2007, Theorem 1). In order to take advantage of this result, let
where \({\overline{a}} = \frac{1}{n} \sum \nolimits _{i=1}^n a_i\) and \({\overline{b}} = \frac{1}{n} \sum \nolimits _{i=1}^n b_i\). Then, on account of Baksalary and Trenkler (2021, formula (8)), we obtain a very handy representation of the Moore–Penrose inverse of the model matrix (16), namely
Hence, analogously to (13), the LSE of the parameter vector \(\varvec{\beta }\) is given by
Let us now demonstrate the usefulness of the expressions (20) and (21) by applying them to a set of real data.
Example 3.1
As mentioned in the Introduction, Pedler (1993) considered the “astronomical problem” of fitting a circle to a set of points and solved it from first principles. The considerations in Pedler (1993) also contain an example which exploits the data collected by S.G. Brewer from the observations of the position of Polaris made on December 12, 1983. The data are given in Table 1.
The data provided in Table 1 enable one to calculate the scalars A, B, M, N as well as the vectors \({\mathbf{e}}\), \({\mathbf{f}}\), \(\mathbf{g}\), \({\mathbf{h}}\) defined in (17)–(19). Hence, we obtain the inner products involving the four vectors and the vector \({\mathbf{z}}\). It should be emphasized that these straightforward calculations involve only scalars and vectors and incur a relatively low computational cost; for details on the advantages of the algorithm for calculating the Moore–Penrose inverse which takes into account columnwise partitioning into range disjoint matrices, see Baksalary and Trenkler (2021). The outcomes of these computations are provided in Table 2.
In the light of (21), we arrive at the components of the estimator \(\hat{\varvec{\beta }}\), which are given in Table 3. As expected, the values coincide with those given in Pedler (1993, Table 2).
The values of the total sum of squares SST, regression sum of squares SSR, residual sum of squares SSE, and the coefficient of determination \(R^2\) are provided in Table 4.
4 Supplementary remarks
In the linear regression models considered in Sect. 2 it was assumed that the model matrices are of full column rank and that the vector \({{\mathbf{1}}}\) is one of the columns. Such assumptions are well justified, as they correspond to several common situations. However, the present approach makes it possible to generalize the considerations by weakening the assumption that the model matrix is of full column rank to the requirement that it can be columnwise partitioned into two range disjoint matrices, and by relaxing the assumption that one of the columns is the vector \({{\mathbf{1}}}\). To demonstrate this fact, let us assume that the model matrix \(\mathbf{X}\) in (2) is partitioned in accordance with (3), i.e.,
\[ {\mathbf{y}} = {\mathbf{X}}_1 \varvec{\beta }_1 + {\mathbf{X}}_2 \varvec{\beta }_2 + {\mathbf{u}}, \tag{22} \]
where the vectors \(\varvec{\beta }_i\) are of orders \(p_i \times 1\), \(i = 1, 2\). Provided that the condition (4) holds, by (5) we conclude that the LSE of \({(\varvec{\beta }_1, \varvec{\beta }_2)}^\prime \) is given by
\[ \begin{pmatrix} \hat{\varvec{\beta }}_1 \\ \hat{\varvec{\beta }}_2 \end{pmatrix} = \begin{pmatrix} ({\mathbf{Q}}_2 {\mathbf{X}}_1)^\dagger {\mathbf{y}} \\ ({\mathbf{Q}}_1 {\mathbf{X}}_2)^\dagger {\mathbf{y}} \end{pmatrix}. \tag{23} \]
Clearly, the expression (13) is obtained from (23) by taking \({\mathbf{X}}_1 = {{\mathbf{1}}}\) and \(\mathbf{X}_2 = {{\mathbf{X}}}\). Furthermore, from (3) and (5) we obtain
\[ {\mathbf{H}} = {\mathbf{X}}{\mathbf{X}}^\dagger = {\mathbf{X}}_1 ({\mathbf{Q}}_2 {\mathbf{X}}_1)^\dagger + {\mathbf{X}}_2 ({\mathbf{Q}}_1 {\mathbf{X}}_2)^\dagger, \tag{24} \]
which under \({\mathbf{X}}_1 = {{\mathbf{1}}}\) and \({\mathbf{X}}_2 = {{\mathbf{X}}}\) leads to the formula (14). Note that the expression (24) is given in Puntanen et al. (2011, Proposition 16.1) along with its equivalent counterparts, one of which is (4).
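The two-block generalization can be checked numerically: the estimates computed block by block should coincide with the LSE obtained from the full model matrix. The true coefficients below are hypothetical, chosen only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p1, p2 = 10, 2, 3
X1 = rng.standard_normal((n, p1))
X2 = rng.standard_normal((n, p2))
X = np.hstack([X1, X2])
beta = np.arange(1.0, p1 + p2 + 1)          # hypothetical true coefficients
y = X @ beta + 0.05 * rng.standard_normal(n)

Q1 = np.eye(n) - X1 @ np.linalg.pinv(X1)
Q2 = np.eye(n) - X2 @ np.linalg.pinv(X2)

beta1_hat = np.linalg.pinv(Q2 @ X1) @ y     # estimate of the first block
beta2_hat = np.linalg.pinv(Q1 @ X2) @ y     # estimate of the second block

full_lse = np.linalg.pinv(X) @ y
print(np.allclose(np.concatenate([beta1_hat, beta2_hat]), full_lse))  # True
```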
Further evidence of the applicability of Lemma 1.1 in statistical estimation theory was provided in Baksalary and Trenkler (2021), where an original representation was derived for the best linear unbiased estimator (BLUE) of \({\mathbf{X}} \varvec{\beta }\) under the generalized version of the (consistent) linear model (2) with \(\mathsf {Cov}({\mathbf{u}}) = \sigma ^2 {\mathbf{V}}\), where \({\mathbf{V}}\) denotes a known \(n \times n\) positive semidefinite matrix. It was shown in Baksalary and Trenkler (2021, Example 4) that \({\mathbf{G}}{\mathbf{y}}\) with
is BLUE of \({\mathbf{X}} \varvec{\beta }\).
Analogously, we can derive representations for the BLUEs of both \(\mathbf{X}_1 \varvec{\beta }_1\) and \({\mathbf{X}}_2 \varvec{\beta }_2\) under the model (22) when \(\mathsf {Cov}({\mathbf{u}}) = \sigma ^2 {\mathbf{V}}\). From Puntanen et al. (2011, formula (10.5)) it follows that \({\mathbf{G}}_1{\mathbf{y}}\) is the BLUE of an (estimable) parametric function \({\mathbf{X}}_1 \varvec{\beta }_1\) if \({\mathbf{G}}_1\) satisfies the equation
On account of (4), which is a necessary and sufficient condition for \({\mathbf{X}}_1 \varvec{\beta }_1\) to be estimable, we arrive at \({{\mathcal {R}}} ({\mathbf{X}}_1) \cap {{\mathcal {R}}}(\mathbf{X}_2 : {\mathbf{V}}{\mathbf{Q}}_{\mathbf{X}}) = \{ {\mathbf{0}} \}\). In consequence, we can utilize the representation of the Moore–Penrose inverse provided in Lemma 1.1, which leads to the conclusion that one of the solutions of (26) is of the form
Hence,
Similarly, by interchanging the subscripts “1” and “2” in (26), we obtain
The paper is concluded with some remarks on the advantages of utilizing the representation of the Moore–Penrose inverse provided in Lemma 1.1 from the computational point of view. In comparison with the methods of determining the inverse based on the singular value decomposition (SVD), which are exploited in several popular software packages (e.g., Matlab, Mathematica, or R), an algorithm based on the representation (1) seems to have three main advantages, each leading to a reduction of computational costs. The first is that it reduces the sizes of the matrices to be Moore–Penrose inverted—instead of the inverse of an \(n \times m\) matrix \({\mathbf{A}}\), we need to compute two inverses of the matrices \({\mathbf{Q}}_2 {\mathbf{A}}_1\) and \(\mathbf{Q}_1 {\mathbf{A}}_2\), of orders \(n \times m_1\) and \(n \times m_2\), respectively, where \(m_1 + m_2 = m\); as several software tools impose limits on the sizes of matrices which can be stored, one can encounter a situation in which the inverse of \({\mathbf{A}}\) exceeds the limit, while the inverses of \({\mathbf{Q}}_2 {\mathbf{A}}_1\) and \(\mathbf{Q}_1 {\mathbf{A}}_2\) are still manageable. The second benefit is that the algorithm allows computing both block entries occurring in the inverse (almost) simultaneously—in the light of Baksalary and Trenkler (2021, formula (6)), the two entries involved in the representation (1) are linked by an identity
which means that one of the Moore–Penrose inverses involved in the representation can be derived from the knowledge of the other. The third advantage of the algorithm is that it can be executed iteratively, in each subsequent step applied to matrices of smaller order—from Baksalary and Trenkler (2021, Lemma 1) it follows that when \({\mathbf{A}}\) is of full column rank, then \({\mathbf{Q}}_2 {\mathbf{A}}_1\) and \({\mathbf{Q}}_1 {\mathbf{A}}_2\) are of full column rank as well, which means that the inverses \(({\mathbf{Q}}_2 {\mathbf{A}}_1)^\dagger \) and \(({\mathbf{Q}}_1 {\mathbf{A}}_2)^\dagger \) can be computed by applying the same algorithm; the procedure might be carried out iteratively until the matrices to be inverted are all reduced to (row) vectors.
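The iterative execution described above can be sketched as a small recursive routine (the function name is hypothetical, and the sketch assumes the input has full column rank, so that every split encountered is range disjoint, cf. Baksalary and Trenkler (2021, Lemma 1)).

```python
import numpy as np

def pinv_by_partition(A):
    """Sketch: split A columnwise, apply the representation of
    Lemma 1.1 to the two halves, and recurse until the blocks are
    single (nonzero) columns.  Assumes A has full column rank."""
    n, m = A.shape
    if m == 1:
        a = A[:, 0]
        return A.T / (a @ a)          # Moore-Penrose inverse of a column
    k = m // 2
    A1, A2 = A[:, :k], A[:, k:]
    Q1 = np.eye(n) - A1 @ pinv_by_partition(A1)
    Q2 = np.eye(n) - A2 @ pinv_by_partition(A2)
    return np.vstack([pinv_by_partition(Q2 @ A1),
                      pinv_by_partition(Q1 @ A2)])

rng = np.random.default_rng(7)
A = rng.standard_normal((7, 4))       # full column rank with probability one
print(np.allclose(pinv_by_partition(A), np.linalg.pinv(A)))  # True
```

This naive sketch recomputes block inverses and is not optimized; it only illustrates that the recursion terminates at column vectors, whose Moore–Penrose inverses are trivial.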
References
Baksalary JK, Baksalary OM (2007) Particular formulae for the Moore-Penrose inverse of a columnwise partitioned matrix. Linear Algebra Appl 421: 16–23
Baksalary OM, Trenkler G (2021) On formulae for the Moore-Penrose inverse of a columnwise partitioned matrix. Appl Math Comput 403: 125913
Davidson R, MacKinnon JG (1993) Estimation and inference in econometrics. Oxford University Press, New York
Pedler PJ (1993) Fitting a circle to numerical data: an illustration of statistical modelling. Int J Math Educ Sci Technol 24: 131–143
Puntanen S, Styan GPH, Isotalo J (2011) Matrix tricks for linear statistical models - our personal top twenty. Springer-Verlag, Berlin
Rao CR, Mitra SK (1971) Generalized inverse of matrices and its applications. Wiley, New York
Acknowledgements
The authors are thankful to the two referees for their pertinent comments and suggestions on the first version of the paper, which resulted in a noticeable improvement of its content. The authors are also grateful to the handling editor for highlighting several relevant facts, the mention of which distinctly enriched the paper.
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Appendix A: Moore–Penrose inverse of a matrix
Let \({\mathbf{S}}\) be an \(n \times m\) matrix. Then there exists a unique matrix \({\mathbf{S}}^\dagger \) such that
\[ {\mathbf{S}}{\mathbf{S}}^\dagger {\mathbf{S}} = {\mathbf{S}}, \quad {\mathbf{S}}^\dagger {\mathbf{S}}{\mathbf{S}}^\dagger = {\mathbf{S}}^\dagger, \quad ({\mathbf{S}}{\mathbf{S}}^\dagger )^\prime = {\mathbf{S}}{\mathbf{S}}^\dagger, \quad ({\mathbf{S}}^\dagger {\mathbf{S}})^\prime = {\mathbf{S}}^\dagger {\mathbf{S}}. \]
The matrix \({\mathbf{S}}^\dagger \) is called the Moore–Penrose inverse of \({\mathbf{S}}\).
It can be verified that
\[ {\mathbf{S}}^\dagger = ({\mathbf{S}}^\prime {\mathbf{S}})^\dagger {\mathbf{S}}^\prime, \tag{A1} \]
which takes the form \({\mathbf{S}}^\dagger = ({\mathbf{S}}^\prime \mathbf{S})^{-1} {\mathbf{S}}^\prime \), when \({\mathbf{S}}\) is of full column rank. Another relevant property of the Moore–Penrose inverse is that it offers a handy way to represent orthogonal projectors in \({\mathbb {R}}^n\) (symmetric idempotent matrices of order n). To be precise, an \( n \times n\) matrix \({\mathbf{P}}\) is an orthogonal projector if and only if it is expressible as \({\mathbf{SS}}^\dagger \) for some \(n \times m\) matrix \({\mathbf{S}}\). Then, \({\mathbf{SS}}^\dagger \) is the orthogonal projector onto \({{\mathcal {R}}}({\mathbf{S}})\) and, consequently, \({\mathbf{I}}_n - {\mathbf{SS}}^\dagger \) is the orthogonal projector onto the orthogonal complement of \({{\mathcal {R}}}(\mathbf{S})\), which coincides with \({{\mathcal {N}}}({\mathbf{S}}^\prime )\). Similarly, \({\mathbf{S}}^\dagger {\mathbf{S}}\) and \({\mathbf{I}}_m - \mathbf{S}^\dagger {\mathbf{S}}\) are the orthogonal projectors onto \({{\mathcal {R}}}({\mathbf{S}}^\prime )\) and \({{\mathcal {N}}}({\mathbf{S}})\), respectively, where \({{\mathcal {R}}}({\mathbf{S}}^\prime ) {\mathop {\oplus }\limits ^{\perp }} {{\mathcal {N}}}({\mathbf{S}}) = {\mathbb {R}}^m\). An important feature is that there is a one-to-one correspondence between the orthogonal projector and the subspace onto which it projects.
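The four defining (Penrose) conditions, as well as the reduction to \(({\mathbf{S}}^\prime {\mathbf{S}})^\dagger {\mathbf{S}}^\prime \), can be confirmed numerically for a randomly generated matrix.

```python
import numpy as np

rng = np.random.default_rng(8)
S = rng.standard_normal((5, 3))
Sd = np.linalg.pinv(S)

# The four Penrose conditions characterizing the Moore-Penrose inverse
print(np.allclose(S @ Sd @ S, S), np.allclose(Sd @ S @ Sd, Sd),
      np.allclose((S @ Sd).T, S @ Sd), np.allclose((Sd @ S).T, Sd @ S))

# S^+ = (S'S)^+ S', which reduces to (S'S)^{-1} S' under full column rank
print(np.allclose(Sd, np.linalg.pinv(S.T @ S) @ S.T))
```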
Baksalary, O.M., Trenkler, G. An alternative look at the linear regression model. Stat Papers 63, 1499–1509 (2022). https://doi.org/10.1007/s00362-021-01280-x