1 Introduction

Let us consider the general Gauss–Markov model

$$\begin{aligned} \mathbf{y } = \mathbf{X } \varvec{\beta } + \mathbf{u }, \end{aligned}$$
(1)

where \(\mathbf{y}\) is an \(n \times 1\) observable random vector, \(\mathbf{X}\) is a known \(n \times p\) model matrix, \(\varvec{\beta }\) is a \(p \times 1\) vector of unknown parameters, and \(\mathbf{u}\) is an \(n \times 1\) random error vector. The expectation vector and the covariance matrix of \(\mathbf{u}\) are \(\mathsf E (\mathbf{u}) = \mathbf{0}\) and \(\mathsf Cov (\mathbf{u}) = \sigma ^2 \mathbf{V}\), respectively, where \(\sigma ^2 > 0\) is an unknown constant and \(\mathbf{V}\) is a known \(n \times n\) nonnegative definite matrix. Both \(\mathbf{X}\) and \(\mathbf{V}\) may be rank deficient. Throughout, it is assumed that the model (1) is consistent, i.e., \(\mathbf{y} \in {\fancyscript{R}} ( \mathbf{X} : \mathbf{V} )\), where \({\fancyscript{R}}(\cdot )\) stands for the column space of its matrix argument and \((\mathbf{X} : \mathbf{V})\) denotes the \(n \times (p + n)\) columnwise partitioned matrix obtained by juxtaposing the matrices \(\mathbf{X}\) and \(\mathbf{V}\); cf. Rao (1973, p. 297) or Puntanen et al. (2011, pp. 43, 125).
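For concreteness, the following minimal NumPy sketch sets up a hypothetical toy model of this kind (a rank-deficient \(\mathbf{X}\) and a singular nonnegative definite \(\mathbf{V}\), chosen only for illustration) and checks the consistency condition numerically.

```python
import numpy as np

# Hypothetical toy model: n = 4, p = 3, with rank(X) = 2 < p and V singular.
X = np.array([[1., 0., 1.],
              [1., 0., 1.],
              [0., 1., 1.],
              [0., 1., 1.]])            # third column = sum of the first two
V = np.diag([1., 1., 2., 0.])            # nonnegative definite, rank 3

# Here R(X : V) is all of R^4, so every y is consistent and, moreover,
# the strengthened condition (4) introduced below is satisfied.
XV = np.hstack([X, V])
print(np.linalg.matrix_rank(XV) == 4)    # True

# For a general pair (X, V), consistency of a given y can be checked by
# projecting y onto R(X : V) and comparing with y itself.
P_XV = XV @ np.linalg.pinv(XV)           # orthogonal projector onto R(X : V)
rng = np.random.default_rng(0)
y = XV @ rng.standard_normal(XV.shape[1])  # consistent by construction
print(np.allclose(P_XV @ y, y))          # True
```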

In his paper, Krämer (1980, p. 130) posed the following problem: “Which are the \(\mathbf{y}\), given \(\mathbf{X}\) and \(\mathbf{V}\), such that ordinary least squares (OLS) and Gauss–Markov are equal?” In other words, the problem is to identify those vectors \(\mathbf{y}\) for which the OLS and Gauss–Markov estimates of the parameter vector \(\varvec{\beta }\) coincide. In a follow-up paper, Krämer et al. (1996) called this problem a “twist” to Kruskal’s Theorem (Kruskal 1968), which provides necessary and sufficient conditions for the OLS and Gauss–Markov estimates of \(\varvec{\beta }\) to be equal. The same paper deals with “another twist” to Kruskal’s Theorem: rather than asking when the OLS estimator equals the Gauss–Markov estimator of the full regression vector \(\varvec{\beta }\), it provides a condition for the equality of the OLS and Gauss–Markov estimators of a subparameter of \(\varvec{\beta }\). A more general “final twist” is considered in Jaeger and Krämer (1998), which characterizes the individual vectors \(\mathbf{y}\) that yield identical OLS and Gauss–Markov estimates of such a subparameter.

Inspired by Krämer (1980), Krämer et al. (1996), and Jaeger and Krämer (1998), in what follows we “do the twist again”. Unlike those three papers, however, we do not assume that \(\mathbf{X}\) and \(\mathbf{V}\) are of full (column) rank, which means that the vector \(\varvec{\beta }\) is not necessarily unbiasedly estimable. For this reason, instead of estimating \(\varvec{\beta }\), we consider the estimation of the systematic part \(\mathsf E (\mathbf{y}) = \mathbf{X} \varvec{\beta }\). Note that this parametric function always has a linear unbiased estimator, namely \(\mathbf{y}\) itself.

An important role in the subsequent considerations will be played by the notion of a projector. It is known that any \(n \times n\) idempotent matrix, say \(\mathbf{F} \in \mathbb{R }^{n \times n}\), is an oblique projector onto its column space \({\fancyscript{R}}(\mathbf{F})\) along its null space \({\fancyscript{N}}(\mathbf{F})\), where \({\fancyscript{R}}(\mathbf{F}) \oplus {\fancyscript{N}}(\mathbf{F}) = \mathbb{R }^{n,1}\). Among the many conditions characterizing idempotent matrices one finds, for instance, \(\mathbf{F}^2 = \mathbf{F} \Leftrightarrow {\fancyscript{R}}(\mathbf{F}) = {\fancyscript{N}}(\overline{\mathbf{F}}) \Leftrightarrow {\fancyscript{R}}(\overline{\mathbf{F}}) = {\fancyscript{N}}(\mathbf{F})\), where \(\overline{\mathbf{F}} = \mathbf{I}_n - \mathbf{F}\). When an idempotent \(\mathbf{F}\) projects onto \({\fancyscript{R}}(\mathbf{F})\) along the orthogonal complement of \({\fancyscript{R}}(\mathbf{F})\), it is called an orthogonal projector. It can be verified that \(\mathbf{F}\) is an orthogonal projector if and only if it is both idempotent and symmetric, i.e., \(\mathbf{F}^2 = \mathbf{F} = \mathbf{F}^\prime \). Projectors are widely used in statistics and econometrics as a basic tool for estimation and test procedures.
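The distinction between oblique and orthogonal projectors can be exercised numerically; the sketch below uses two small hypothetical \(2 \times 2\) matrices chosen only for illustration.

```python
import numpy as np

# An idempotent but nonsymmetric matrix: an oblique projector.
F = np.array([[1., 1.],
              [0., 0.]])
# An idempotent and symmetric matrix: an orthogonal projector.
G = np.array([[1., 0.],
              [0., 0.]])

print(np.allclose(F @ F, F), np.allclose(F, F.T))   # True False  (oblique)
print(np.allclose(G @ G, G), np.allclose(G, G.T))   # True True   (orthogonal)

# F^2 = F  <=>  R(F) = N(I - F)  <=>  R(I - F) = N(F):
I2 = np.eye(2)
print(np.allclose((I2 - F) @ F, 0))   # columns of F lie in N(I - F)
print(np.allclose(F @ (I2 - F), 0))   # columns of I - F lie in N(F)
```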

Let \(\mathbf{G}\) be an \(n \times n\) matrix. An estimator \(\mathbf{G}\mathbf{y}\) of \(\mathbf{X} \varvec{\beta }\) which is unbiased and has minimal covariance matrix in the Löwner sense fulfills the conditions

$$\begin{aligned} \mathbf{G }\mathbf{H } = \mathbf{H } \quad {\mathrm{and}} \quad \mathbf{G }\mathbf{V }\mathbf{M } = \mathbf{0 }, \end{aligned}$$
(2)

where \(\mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \) and \(\mathbf{M} = \mathbf{I}_n - \mathbf{H}\) are the orthogonal projectors onto \({\fancyscript{R}}(\mathbf{X})\), the column space of \(\mathbf{X}\), and onto its orthogonal complement, which coincides with \({\fancyscript{N}}(\mathbf{X}^\prime )\), the null space of \(\mathbf{X}^\prime \), respectively. The symbol \(\mathbf{X}^\dagger \) denotes the Moore–Penrose inverse of \(\mathbf{X}\). The conditions (2) can be rewritten as

$$\begin{aligned} \mathbf{G }(\mathbf{H } : \mathbf{P }_{\mathbf{V }\mathbf{M }}) = (\mathbf{H } : \mathbf{0 }), \end{aligned}$$
(3)

where \(\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{V}\mathbf{M}(\mathbf{V}\mathbf{M})^\dagger \) is the orthogonal projector onto \({\fancyscript{R}}(\mathbf{V}\mathbf{M})\). It was pointed out in Baksalary and Trenkler (2012, Remark 3.1) that Eq. (3) always has a solution \(\mathbf{G}\) and that each \(\mathbf{G}\) satisfying (3) yields a representation of the best linear unbiased estimator \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\) of \(\mathbf{X} \varvec{\beta }\). All these representations coincide; see Groß (2004, Corollary 3). In the Appendix given below it is demonstrated, however, that there may exist quite useless versions of \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\). To avoid this discrepancy, we subsequently strengthen the consistency condition \(\mathbf{y} \in {\fancyscript{R}} ( \mathbf{X} : \mathbf{V} )\) to

$$\begin{aligned} {\fancyscript{R}} ( \mathbf{X } : \mathbf{V } ) = \mathbb R ^{n,1}. \end{aligned}$$
(4)

Then, according to Groß (2004, Corollary 4), \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\) is unique.
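As an illustration on the hypothetical toy \(\mathbf{X}\) and \(\mathbf{V}\) introduced above, one solution \(\mathbf{G}\) of Eq. (3) can be computed with the Moore–Penrose inverse and checked against conditions (2). This is only a numerical sketch, not part of the formal development.

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 0., 1.], [0., 1., 1.], [0., 1., 1.]])
V = np.diag([1., 1., 2., 0.])
n = X.shape[0]

H = X @ np.linalg.pinv(X)            # orthogonal projector onto R(X)
M = np.eye(n) - H                     # orthogonal projector onto N(X')
VM = V @ M
P_VM = VM @ np.linalg.pinv(VM)        # orthogonal projector onto R(VM)

# Since (3) is solvable, G = (H : 0)(H : P_VM)^+ is one of its solutions.
A = np.hstack([H, P_VM])
B = np.hstack([H, np.zeros((n, n))])
G = B @ np.linalg.pinv(A)

print(np.allclose(G @ H, H))          # unbiasedness:            G H = H
print(np.allclose(G @ V @ M, 0))      # minimal covariance part: G V M = 0
```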

In the next section some representations of the best linear unbiased estimator (BLUE) and the ordinary least squares estimator (OLSE) are provided, whereas Sect. 3 deals with “another twist” to Kruskal’s Theorem, which was briefly mentioned above. Section 4 is concerned with bounds for the Euclidean distance between BLUE and OLSE of \(\mathbf{X} \varvec{\beta }\), and the last section of the paper revisits the problem of when BLUE equals OLSE.

2 Representations of BLUE and OLSE

Let \(\mathbf{P}\) be an orthogonal projector in \(\mathbb{R }^{n,1}\), i.e., an \(n \times n\) real symmetric idempotent matrix. Assume that the rank of \(\mathbf{P}\) is r. It is known that there exists an orthogonal matrix \(\mathbf{U}\) such that

$$\begin{aligned} \mathbf{P} = \mathbf{U}\left( \begin{array}{cc} \mathbf{I}_r & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array}\right) \mathbf{U}^\prime ; \end{aligned}$$
(5)

see Trenkler (1994, Theorem 13). Any other orthogonal projector of the same size, say \(\mathbf{Q} \in \mathbb R ^{n \times n}\), can be represented as

$$\begin{aligned} \mathbf{Q} = \mathbf{U}\left( \begin{array}{cc} \mathbf{A} & \mathbf{B} \\ \mathbf{B}^\prime & \mathbf{D} \end{array}\right) \mathbf{U}^\prime , \end{aligned}$$
(6)

with symmetric matrices \(\mathbf{A}\) and \(\mathbf{D}\) of orders \(r\) and \(n-r\), respectively.

In what follows we assume that \(\mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \) is represented by \(\mathbf{P}\) of the form (5) and \(\mathbf{P}_{\mathbf{V}\mathbf{M}}\) is represented by \(\mathbf{Q}\) defined in (6), i.e., \(\mathbf{P} = \mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \) and \(\mathbf{Q} = \mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{V}\mathbf{M}(\mathbf{V}\mathbf{M})^\dagger \). It can be verified that \(\mathbf{T} = (\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \), where \(\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}} = \mathbf{I}_n - \mathbf{P}_{\mathbf{V}\mathbf{M}}\), is an idempotent matrix; see Greville (1974, p. 830). From (5) and (6) we obtain

$$\begin{aligned} \mathbf{T} = \mathbf{U}\left( \begin{array}{cc} \mathbf{P}_{\overline{\mathbf{A}}} & -\mathbf{B}\mathbf{D}^\dagger \\ \mathbf{0} & \mathbf{0} \end{array}\right) \mathbf{U}^\prime , \end{aligned}$$

where \(\mathbf{P}_{\overline{\mathbf{A}}}\) is the orthogonal projector onto the column space of \(\overline{\mathbf{A}} = \mathbf{I}_r - \mathbf{A}\). It follows that \(\mathbf{T}\) is the oblique projector onto \({\fancyscript{R}}(\mathbf{H}) \cap [{\fancyscript{N}}(\mathbf{H}) + {\fancyscript{N}}(\mathbf{P}_{\mathbf{V}\mathbf{M}})]\) along \({\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) \stackrel{\perp }{\oplus } [{\fancyscript{N}}(\mathbf{H}) \cap {\fancyscript{N}}(\mathbf{P}_{\mathbf{V}\mathbf{M}})]\), where \(\stackrel{\perp }{\oplus }\) indicates that the two subspaces involved in the direct sum are orthogonal; see Baksalary and Trenkler (2010, Theorem 2). From

$$\begin{aligned} {\fancyscript{R}}(\mathbf{H }) \cap {\fancyscript{R}}(\mathbf{V }\mathbf{M }) = {\fancyscript{R}}(\mathbf{H }) \cap {\fancyscript{R}}(\mathbf{P }_{\mathbf{V }\mathbf{M }}) = \{\mathbf{0 }\} \end{aligned}$$
(7)

(see Baksalary and Trenkler 2009, Theorem 1), we arrive, by taking orthogonal complements, at \({\fancyscript{N}}(\mathbf{H}) + {\fancyscript{N}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) = \mathbb R ^{n,1}\), which leads to the conclusion that \(\mathbf{T}\) is the oblique projector onto \({\fancyscript{R}}(\mathbf{H})\) along \({\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) \stackrel{\perp }{\oplus } [{\fancyscript{N}}(\mathbf{H}) \cap {\fancyscript{N}}(\mathbf{P}_{\mathbf{V}\mathbf{M}})]\). Furthermore, it follows that \(\mathbf{T}\) takes the form

$$\begin{aligned} \mathbf{T} = \mathbf{U}\left( \begin{array}{cc} \mathbf{I}_r & -\mathbf{B}\mathbf{D}^\dagger \\ \mathbf{0} & \mathbf{0} \end{array}\right) \mathbf{U}^\prime ; \end{aligned}$$
(8)

see Baksalary and Trenkler (2010, Sect. 2).

It is well known that (7) ensures that Eq. (3) is solvable. One of the solutions, namely \(\mathbf{T} = (\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \), gives a representation of the BLUE of \(\mathbf{X} \varvec{\beta }\), i.e., \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta }) = \mathbf{T}\mathbf{y}\). There are a number of further expressions for the BLUE (see Baksalary and Trenkler 2009, Sect. 4), but under assumption (4) they all coincide. Observe that the OLSE of \(\mathbf{X} \varvec{\beta }\) is \(\mathsf{OLSE }(\mathbf{X} \varvec{\beta }) = \mathbf{H}\mathbf{y}\).
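To make the two representations concrete, the following sketch (toy \(\mathbf{X}\) and \(\mathbf{V}\) as before; purely illustrative) computes \(\mathbf{T} = (\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \), verifies that it solves (3), and compares \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta }) = \mathbf{T}\mathbf{y}\) with \(\mathsf{OLSE }(\mathbf{X} \varvec{\beta }) = \mathbf{H}\mathbf{y}\).

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 0., 1.], [0., 1., 1.], [0., 1., 1.]])
V = np.diag([1., 1., 2., 0.])
n = X.shape[0]

H = X @ np.linalg.pinv(X)
M = np.eye(n) - H
VM = V @ M
P_VM = VM @ np.linalg.pinv(VM)
T = np.linalg.pinv((np.eye(n) - P_VM) @ H)      # the oblique projector (8)

print(np.allclose(T @ T, T))                    # T is idempotent
print(np.allclose(T @ H, H))                    # T H = H, so T y is unbiased
print(np.allclose(T @ V @ M, 0))                # T V M = 0, so T solves (3)

rng = np.random.default_rng(1)
y = rng.standard_normal(n)                      # a generic observation vector
print(np.allclose(T @ y, H @ y))                # typically False: BLUE != OLSE here
```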

3 Another twist

As in Krämer (1980), we consider the problem of identifying those observation vectors \(\mathbf{y}\) for which \(\mathsf{OLSE }(\mathbf{X} \varvec{\beta })\) and \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\) take the same value. This amounts to an analysis of the subspace \({\fancyscript{L}}\) of \(\mathbb R ^{n,1}\) given by the null space of \(\mathbf{H} - \mathbf{T}\), i.e., \({\fancyscript{L}} = {\fancyscript{N}}(\mathbf{H} - \mathbf{T})\). For this purpose the following result is useful.

Lemma 1

Let \(\mathbf{R}\) and \(\mathbf{S}\) be idempotent matrices of the same size. Then:

  1. (i)

    \({\fancyscript{N}}(\mathbf{S} - \mathbf{R}) = {\fancyscript{N}}(\mathbf{S}\overline{\mathbf{R}}) \cap {\fancyscript{N}}(\overline{\mathbf{S}}\mathbf{R})\),

  2. (ii)

    \({\fancyscript{N}}(\mathbf{R}\mathbf{S}) = {\fancyscript{N}}(\mathbf{S}) \oplus [{\fancyscript{N}}(\mathbf{R}) \cap {\fancyscript{R}}(\mathbf{S})]\).

Proof

For a proof see Baksalary and Trenkler (2013, Theorems 1 and 9). \(\square \)

Lemma 1 leads to the following result.

Theorem 1

Under the model (1), let \(\mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \), \(\mathbf{M} = \mathbf{I}_n - \mathbf{H}\), and \(\mathbf{T} = (\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \). Then

$$\begin{aligned} {\fancyscript{L}} = {\fancyscript{N}}(\mathbf{H } - \mathbf{T }) = {\fancyscript{N}}(\mathbf{T }\mathbf{M }). \end{aligned}$$

Proof

Lemma 1 yields

$$\begin{aligned} {\fancyscript{N}}(\mathbf{H } - \mathbf{T }) = {\fancyscript{N}}(\mathbf{T }\overline{\mathbf{H }}) \cap {\fancyscript{N}}(\overline{\mathbf{T }}\mathbf{H }) = {\fancyscript{N}}(\mathbf{T }\mathbf{M }) \cap {\fancyscript{N}}(\overline{\mathbf{T }}\mathbf{H }). \end{aligned}$$

Another relevant fact is that, with \(\mathbf{H}\) of the form (5) and \(\mathbf{T}\) given in (8), we directly get \(\overline{\mathbf{T}}\mathbf{H} = \mathbf{0}\), so that \({\fancyscript{N}}(\overline{\mathbf{T}}\mathbf{H}) = \mathbb R ^{n,1}\) and the assertion follows. \(\square \)

The vectors belonging to the subspace \({\fancyscript{L}} = {\fancyscript{N}}(\mathbf{H} - \mathbf{T})\) can be explicitly written as

$$\begin{aligned} {\fancyscript{L}} = \{ [\mathbf{I }_n - (\mathbf{T }\mathbf{M })^\dagger \mathbf{T }\mathbf{M }] \mathbf{z }:\mathbf{z } \in \mathbb R ^{n,1}\}, \end{aligned}$$

that is, \({\fancyscript{L}}\) consists precisely of the solutions of the homogeneous equation \(\mathbf{T}\mathbf{M}\mathbf{y} = \mathbf{0}\). Observe also that \({\fancyscript{N}}(\mathbf{T}\mathbf{M}) \supseteq {\fancyscript{N}}(\mathbf{M}) = {\fancyscript{R}}(\mathbf{H})\). This means that, inter alia, all vectors belonging to \({\fancyscript{R}}(\mathbf{H}) = {\fancyscript{R}}(\mathbf{X})\), for example \(\hat{\mathbf{y}} = \mathbf{H}\mathbf{y}\), yield \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta }) = \mathsf OLSE (\mathbf{X} \varvec{\beta })\). This does not come as a surprise, since by (5) and (8) it follows that \(\mathbf{T}\mathbf{H} = \mathbf{H}\).
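The explicit description of \({\fancyscript{L}}\) can be exercised numerically: the sketch below (hypothetical toy model as before) draws a vector of \({\fancyscript{L}}\) via the projector \(\mathbf{I}_n - (\mathbf{T}\mathbf{M})^\dagger \mathbf{T}\mathbf{M}\) and confirms that OLSE and BLUE coincide exactly for such vectors.

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 0., 1.], [0., 1., 1.], [0., 1., 1.]])
V = np.diag([1., 1., 2., 0.])
n = X.shape[0]

H = X @ np.linalg.pinv(X)
M = np.eye(n) - H
VM = V @ M
P_VM = VM @ np.linalg.pinv(VM)
T = np.linalg.pinv((np.eye(n) - P_VM) @ H)

TM = T @ M
P_L = np.eye(n) - np.linalg.pinv(TM) @ TM   # orthogonal projector onto N(TM) = L

rng = np.random.default_rng(2)
z = rng.standard_normal(n)
y_L = P_L @ z                                # an observation vector lying in L
print(np.allclose(H @ y_L, T @ y_L))         # True: OLSE and BLUE coincide on L
print(np.allclose(H @ z, T @ z))             # typically False for a generic z
```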

Further characterization of the subspace \({\fancyscript{L}} = {\fancyscript{N}}(\mathbf{H} - \mathbf{T})\) is established in the theorem below.

Theorem 2

Under the model (1), let \(\mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \), \(\mathbf{M} = \mathbf{I}_n - \mathbf{H}\), and \(\mathbf{T} = (\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \). Then

$$\begin{aligned} {\fancyscript{L}} = {\fancyscript{R}}(\mathbf{H })\stackrel{\perp }{\oplus } [{\fancyscript{N}}(\mathbf{H }\overline{\mathbf{P }}_{\mathbf{V }\mathbf{M }}) \cap {\fancyscript{N}}(\mathbf{H })]. \end{aligned}$$

Proof

By Theorem 1 we have \({\fancyscript{L}} = {\fancyscript{N}}(\mathbf{T}\mathbf{M})\). Hence, Lemma 1 implies

$$\begin{aligned} {\fancyscript{N}}(\mathbf{T }\mathbf{M }) = {\fancyscript{N}}(\mathbf{M }) \oplus [{\fancyscript{N}}(\mathbf{T }) \cap {\fancyscript{R}}(\mathbf{M })] = {\fancyscript{R}}(\mathbf{H }) \oplus [{\fancyscript{N}}(\mathbf{T }) \cap {\fancyscript{N}}(\mathbf{H })]. \end{aligned}$$

Now

$$\begin{aligned} {\fancyscript{N}}(\mathbf{T }) = {\fancyscript{N}}[(\overline{\mathbf{P}}_{\mathbf{V }\mathbf{M }}\mathbf{H })^\dagger ] = {\fancyscript{N}}[(\overline{\mathbf{P }}_{\mathbf{V }\mathbf{M }}\mathbf{H })^\prime ] = {\fancyscript{N}}(\mathbf{H }\overline{\mathbf{P }}_{\mathbf{V }\mathbf{M }}), \end{aligned}$$

which completes the proof. \(\square \)

The result of Theorem 2 looks somewhat different from that of Groß et al. (2001, Theorem 9), for setting \(\mathbf{C} = \mathbf{I}_n\) there gives

$$\begin{aligned} {\fancyscript{L}} = {\fancyscript{R}}(\mathbf{H }) \oplus [{\fancyscript{R}}(\mathbf{V }\mathbf{M }) \cap {\fancyscript{N}}(\mathbf{H })]. \end{aligned}$$
(9)

This discrepancy can be explained on account of the identity

$$\begin{aligned} {\fancyscript{N}}(\mathbf{H }\overline{\mathbf{P }}_{\mathbf{V }\mathbf{M }}) = {\fancyscript{R}}(\mathbf{P }_{\mathbf{V }\mathbf{M }}) \stackrel{\perp }{\oplus } [{\fancyscript{N}}(\mathbf{H }) \cap {\fancyscript{N}}(\mathbf{P }_{\mathbf{V }\mathbf{M }})], \end{aligned}$$
(10)

following from Lemma 1. The subspace of Theorem 2 coincides with (9) when \({\fancyscript{N}}(\mathbf{H}) \cap {\fancyscript{N}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) = \{\mathbf{0}\}\), which is equivalent to \({\fancyscript{R}}(\mathbf{H}) + {\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) = \mathbb R ^{n,1}\), i.e., to \({\fancyscript{R}}(\mathbf{H}:\mathbf{V}\mathbf{M}) = {\fancyscript{R}}(\mathbf{X}:\mathbf{V}) = \mathbb R ^{n,1}\); see Puntanen et al. (2011, Proposition 5.1). The latter condition, however, was given above as (4) and is assumed to hold throughout the paper. Thus, we may state the following.

Corollary 1

Under the model (1), let \(\mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \) and \(\mathbf{M} = \mathbf{I}_n - \mathbf{H}\). Then

$$\begin{aligned} {\fancyscript{L}} = {\fancyscript{R}}(\mathbf{H }) \stackrel{\perp }{\oplus } [{\fancyscript{R}}(\mathbf{V }\mathbf{M })\cap {\fancyscript{N}}(\mathbf{H })]. \end{aligned}$$

Corollary 1 corresponds to the Theorem of Krämer (1980), where the identity \(\mathsf{BLUE }(\varvec{\beta }) = \mathsf{OLSE }(\varvec{\beta })\) is explored under the assumption that \(\mathbf{X}\) and \(\mathbf{V}\) are of full (column) rank.
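Corollary 1 can be checked on the hypothetical toy model by comparing dimensions: \(\dim {\fancyscript{L}}\) must equal \({\mathrm{rank}}(\mathbf{H})\) plus \(\dim [{\fancyscript{R}}(\mathbf{V}\mathbf{M}) \cap {\fancyscript{N}}(\mathbf{H})]\). The sketch below does this with rank computations only.

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 0., 1.], [0., 1., 1.], [0., 1., 1.]])
V = np.diag([1., 1., 2., 0.])
n = X.shape[0]

H = X @ np.linalg.pinv(X)
M = np.eye(n) - H
VM = V @ M
P_VM = VM @ np.linalg.pinv(VM)
T = np.linalg.pinv((np.eye(n) - P_VM) @ H)

dim_L = n - np.linalg.matrix_rank(T @ M)     # L = N(TM), so dim L = n - rank(TM)
# dim of R(VM) ∩ N(H) = R(VM) ∩ R(M), via dim(A ∩ B) = rk(A) + rk(B) - rk(A : B)
dim_cap = (np.linalg.matrix_rank(VM) + np.linalg.matrix_rank(M)
           - np.linalg.matrix_rank(np.hstack([VM, M])))
print(dim_L == np.linalg.matrix_rank(H) + dim_cap)   # True
```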

Recall that the projector \(\mathbf{P}\) introduced in (5) was determined by the model matrix \(\mathbf{X}\), for \(\mathbf{P} = \mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \). In consequence, the rank of \(\mathbf{P}\) coincides with the ranks of \(\mathbf{H}\) and \(\mathbf{X}\), i.e., \(r = {\mathrm{rank}}(\mathbf{H}) = {\mathrm{rank}}(\mathbf{X})\). Consider now an oblique projector of rank \(r\) having the form

$$\begin{aligned} \mathbf{L} = \mathbf{U}\left( \begin{array}{cc} \mathbf{I}_r & \mathbf{K} \\ \mathbf{0} & \mathbf{0} \end{array}\right) \mathbf{U}^\prime , \end{aligned}$$
(11)

with \(\mathbf{K} \in \mathbb R ^{r \times (n-r)}\). It was shown by Baksalary and Trenkler (2011, Sect. 3) that when \(\mathbf{K} = -\mathbf{W}_{12} (\mathbf{D} \mathbf{W}_{22}\mathbf{D})^\dagger \), where \(\mathbf{D} \in \mathbb R ^{(n-r) \times (n-r)}\) is a symmetric idempotent matrix and \(\mathbf{W}_{12} \in \mathbb R ^{r \times (n-r)}\) and \(\mathbf{W}_{22} \in \mathbb R ^{(n-r) \times (n-r)}\) originate from the representation of \(\mathbf{V}\) given by

$$\begin{aligned} \mathbf{V} = \mathbf{U} \left( \begin{array}{cc} \mathbf{W}_{11} & \mathbf{W}_{12} \\ \mathbf{W}_{12}^\prime & \mathbf{W}_{22} \end{array}\right) \mathbf{U}^\prime , \end{aligned}$$

then \(\mathbf{L}\mathbf{y}\) is an unbiased estimator of \(\mathbf{X} \varvec{\beta }\) whose efficiency lies between that of \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\) and that of \(\mathsf{OLSE }(\mathbf{X} \varvec{\beta })\). In what follows we identify those observation vectors \(\mathbf{y}\) for which \(\mathbf{L}\mathbf{y}\) coincides with \(\mathsf{OLSE }(\mathbf{X} \varvec{\beta })\) and with \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\), respectively. The resulting formulas give an impression of how close the three estimators can be.

Theorem 3

Under the model (1), let \(\mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \), \(\mathbf{M} = \mathbf{I}_n - \mathbf{H}\), and \(\mathbf{T} = (\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \). Moreover, let \(\mathbf{L}\) be of the form (11). Then:

  1. (i)

    \({\fancyscript{N}}(\mathbf{H} - \mathbf{L}) = {\fancyscript{N}}(\mathbf{L}\mathbf{M}) = {\fancyscript{R}}(\mathbf{L}) \oplus [{\fancyscript{N}}(\mathbf{H}) \cap {\fancyscript{N}}(\mathbf{L})]\),

  2. (ii)

    \({\fancyscript{N}}(\mathbf{T} - \mathbf{L}) = {\fancyscript{N}}(\mathbf{T}\overline{\mathbf{L}}) = {\fancyscript{R}}(\mathbf{T}) \oplus [{\fancyscript{N}}(\mathbf{L}) \cap {\fancyscript{N}}(\mathbf{T})]\).

Proof

From Lemma 1 we have \({\fancyscript{N}}(\mathbf{H} - \mathbf{L}) = {\fancyscript{N}}(\mathbf{H}\overline{\mathbf{L}}) \cap {\fancyscript{N}}(\overline{\mathbf{H}}\mathbf{L})\). Representations (5) and (11) yield \(\overline{\mathbf{H}}\mathbf{L} = \mathbf{0}\), whence, again by Lemma 1,

$$\begin{aligned} {\fancyscript{N}}(\mathbf{H } - \mathbf{L }) = {\fancyscript{N}}(\mathbf{H }\overline{\mathbf{L }}) = {\fancyscript{N}}(\overline{\mathbf{L }}) \oplus [{\fancyscript{N}}(\mathbf{H }) \cap {\fancyscript{R}}(\overline{\mathbf{L }})] = {\fancyscript{R}}(\mathbf{L }) \oplus [{\fancyscript{N}}(\mathbf{H }) \cap {\fancyscript{N}}(\mathbf{L })]. \end{aligned}$$

Since, by the same argument as in the proof of Theorem 1 (now using \(\overline{\mathbf{L}}\mathbf{H} = \mathbf{0}\), which follows from (5) and (11)), we also have \({\fancyscript{N}}(\mathbf{H} - \mathbf{L}) = {\fancyscript{N}}(\mathbf{L}\overline{\mathbf{H}}) = {\fancyscript{N}}(\mathbf{L}\mathbf{M})\), point (i) of the theorem is established.

To derive the second part of the theorem, note that Lemma 1 entails \({\fancyscript{N}}(\mathbf{T} - \mathbf{L}) = {\fancyscript{N}}(\mathbf{L}\overline{\mathbf{T}}) \cap {\fancyscript{N}}(\overline{\mathbf{L}}\mathbf{T})\). Similarly as in the proof of point (i), we obtain \(\overline{\mathbf{L}}\mathbf{T} = \mathbf{0}\) which implies \({\fancyscript{N}}(\mathbf{T} - \mathbf{L}) = {\fancyscript{N}}(\mathbf{L}\overline{\mathbf{T}})\), and thus, by Lemma 1,

$$\begin{aligned} {\fancyscript{N}}(\mathbf{T } - \mathbf{L }) = {\fancyscript{N}}(\mathbf{L }\overline{\mathbf{T }}) = {\fancyscript{N}}(\overline{\mathbf{T }}) \oplus [{\fancyscript{N}}(\mathbf{L }) \cap {\fancyscript{R}}(\overline{\mathbf{T }})] = {\fancyscript{R}}(\mathbf{T }) \oplus [{\fancyscript{N}}(\mathbf{L }) \cap {\fancyscript{N}}(\mathbf{T })]. \end{aligned}$$

On the other hand, applying Lemma 1 with the roles of \(\mathbf{T}\) and \(\mathbf{L}\) interchanged and noting that \(\overline{\mathbf{T}}\mathbf{L} = \mathbf{0}\) by (8) and (11), we also have \({\fancyscript{N}}(\mathbf{T} - \mathbf{L}) = {\fancyscript{N}}(\mathbf{T}\overline{\mathbf{L}})\), which completes the proof. \(\square \)
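Point (i) of Theorem 3 can be illustrated numerically by generating an arbitrary \(\mathbf{L}\) of the form (11); the specific \(\mathbf{K}\) of Baksalary and Trenkler (2011) is not needed for this identity. A sketch, using the hypothetical toy model again, compares the relevant subspace dimensions.

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 0., 1.], [0., 1., 1.], [0., 1., 1.]])
n = X.shape[0]
H = X @ np.linalg.pinv(X)
M = np.eye(n) - H
r = np.linalg.matrix_rank(H)

# Build U = (basis of R(H) : basis of N(H)) from the eigendecomposition of H.
w, Uh = np.linalg.eigh(H)
U = np.hstack([Uh[:, w > 0.5], Uh[:, w <= 0.5]])

# An arbitrary K gives an oblique projector L of the form (11).
rng = np.random.default_rng(3)
K = rng.standard_normal((r, n - r))
core = np.zeros((n, n))
core[:r, :r] = np.eye(r)
core[:r, r:] = K
L = U @ core @ U.T

lhs = n - np.linalg.matrix_rank(H - L)                      # dim N(H - L)
print(lhs == n - np.linalg.matrix_rank(L @ M))              # equals dim N(LM)
dim_cap = n - np.linalg.matrix_rank(np.vstack([H, L]))      # dim [N(H) ∩ N(L)]
print(lhs == np.linalg.matrix_rank(L) + dim_cap)            # equals rank(L) + dim[...]
```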

4 Bounds for the Euclidean distance

Baksalary and Kala (1980) derived a bound for \(|| \varvec{\mu }^* - \hat{\varvec{\mu }} ||\), where \(\varvec{\mu }^* = \mathsf{OLSE }(\mathbf{X} \varvec{\beta })\), \(\hat{\varvec{\mu }} = \mathsf{BLUE }(\mathbf{X} \varvec{\beta })\), and \(|| \cdot ||\) denotes the Euclidean norm. To be precise, their bound (Baksalary and Kala 1980, Theorem) reads

$$\begin{aligned} || \varvec{\mu }^*- \hat{\varvec{\mu }} || \leqslant (\gamma ^{1/2}/\delta ) || \mathbf{y } - \varvec{\mu }^*||, \end{aligned}$$

where \(\gamma \) is the largest eigenvalue of \(\mathbf{H}\mathbf{V}\mathbf{M}\mathbf{V}\mathbf{H}\) and \(\delta \) is the smallest nonzero eigenvalue of \(\mathbf{M}\mathbf{V}\mathbf{M}\). Subsequently, we provide an alternative bound, derived from the representation of \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\), using the oblique projector \(\mathbf{T} = (\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \) given in (8).

Theorem 4

Let \(\hat{\varvec{\mu }} = (\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \mathbf{y}\) with \((\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H})^\dagger \) of the form (8) be one of the representations of \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\). If \(\varvec{\mu }^*= \mathbf{H}\mathbf{y} = \mathsf{OLSE }(\mathbf{X} \varvec{\beta })\), then

$$\begin{aligned} || \varvec{\mu }^*- \hat{\varvec{\mu }} || \leqslant \tau _1( \mathbf{B}\mathbf{D }^\dagger ) || \mathbf{y } ||, \end{aligned}$$

where \(\tau _1( \mathbf{B}\mathbf{D}^\dagger )\) is the largest singular value of \(\mathbf{B}\mathbf{D}^\dagger \).

Proof

From

$$\begin{aligned} \varvec{\mu }^* = \mathbf{U}\left( \begin{array}{cc} \mathbf{I}_r & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array}\right) \mathbf{U}^\prime \mathbf{y} \quad {\mathrm{and}} \quad \hat{\varvec{\mu }} = \mathbf{U}\left( \begin{array}{cc} \mathbf{I}_r & -\mathbf{B}\mathbf{D}^\dagger \\ \mathbf{0} & \mathbf{0} \end{array}\right) \mathbf{U}^\prime \mathbf{y} \end{aligned}$$

we obtain

$$\begin{aligned} \varvec{\mu }^* - \hat{\varvec{\mu }} = \mathbf{U}\left( \begin{array}{cc} \mathbf{0} & \mathbf{B}\mathbf{D}^\dagger \\ \mathbf{0} & \mathbf{0} \end{array}\right) \mathbf{U}^\prime \mathbf{y}. \end{aligned}$$

Hence,

$$\begin{aligned} || \varvec{\mu }^* - \hat{\varvec{\mu }} ||^2 = \mathbf{y}^\prime \mathbf{U} \left( \begin{array}{cc} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{D}^\dagger \mathbf{B}^\prime \mathbf{B} \mathbf{D}^\dagger \end{array}\right) \mathbf{U}^\prime \mathbf{y}. \end{aligned}$$

In consequence, \(|| \varvec{\mu }^* - \hat{\varvec{\mu }} ||^2 \leqslant \lambda _1(\mathbf{D}^\dagger \mathbf{B}^\prime \mathbf{B} \mathbf{D}^\dagger ) ||\mathbf{y}||^2\), where \(\lambda _1(\mathbf{D}^\dagger \mathbf{B}^\prime \mathbf{B} \mathbf{D}^\dagger )\) is the largest eigenvalue of \(\mathbf{D}^\dagger \mathbf{B}^\prime \mathbf{B} \mathbf{D}^\dagger \). Since \(\mathbf{D}^\dagger \) is symmetric, this eigenvalue equals \(\tau _1^2(\mathbf{B}\mathbf{D}^\dagger )\), and the assertion follows by taking square roots. \(\square \)

It is seen from Theorem 4 that \(\varvec{\mu }^*= \hat{\varvec{\mu }}\) for all \(\mathbf{y}\) if and only if \(\mathbf{B} \mathbf{D}^\dagger = \mathbf{0}\), or, equivalently, \(\mathbf{B} = \mathbf{0}\), which means that \(\mathbf{H}\) and \(\mathbf{P}_{\mathbf{V}\mathbf{M}}\) commute.
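Theorem 4 can be checked numerically by recovering the blocks \(\mathbf{B}\) and \(\mathbf{D}\) of (6) from an explicit orthogonal \(\mathbf{U}\); the following sketch does so on the hypothetical toy model used throughout.

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 0., 1.], [0., 1., 1.], [0., 1., 1.]])
V = np.diag([1., 1., 2., 0.])
n = X.shape[0]

H = X @ np.linalg.pinv(X)
M = np.eye(n) - H
VM = V @ M
P_VM = VM @ np.linalg.pinv(VM)
T = np.linalg.pinv((np.eye(n) - P_VM) @ H)
r = np.linalg.matrix_rank(H)

# U = (basis of R(H) : basis of N(H)), so that U' H U has the form (5).
w, Uh = np.linalg.eigh(H)
U = np.hstack([Uh[:, w > 0.5], Uh[:, w <= 0.5]])

Q = U.T @ P_VM @ U                     # the blocks of (6)
B, D = Q[:r, r:], Q[r:, r:]

rng = np.random.default_rng(4)
y = rng.standard_normal(n)
dist = np.linalg.norm(H @ y - T @ y)                     # || mu* - mu_hat ||
bound = np.linalg.norm(B @ np.linalg.pinv(D), 2) * np.linalg.norm(y)
print(dist <= bound + 1e-12)           # True: the bound of Theorem 4 holds
print(np.allclose(H @ P_VM, P_VM @ H)) # False here, consistent with B != 0
```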

5 Equality of BLUE and OLSE

The commutativity \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{P}_{\mathbf{V}\mathbf{M}} \mathbf{H}\), just mentioned in the preceding section, is not contained in the standard catalogue of conditions necessary and sufficient for the equality \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta }) = \mathsf{OLSE }(\mathbf{X} \varvec{\beta })\). Among the most important conditions equivalent to the equality are:

  1. (i)

    \(\mathbf{H}\mathbf{V} = \mathbf{V}\mathbf{H}\),

  2. (ii)

    \(\mathbf{H}\mathbf{V} = \mathbf{H}\mathbf{V}\mathbf{H}\),

  3. (iii)

    \({\fancyscript{R}}(\mathbf{V}\mathbf{X}) = {\fancyscript{R}}(\mathbf{X}) \cap {\fancyscript{R}}(\mathbf{V})\),

  4. (iv)

    \(\mathbf{H}\mathbf{V}\mathbf{M} = \mathbf{0}\),

  5. (v)

    \({\fancyscript{R}}(\mathbf{V}\mathbf{X}) \subseteq {\fancyscript{R}}(\mathbf{X})\).

Note that the condition (v) can be rewritten as \({\fancyscript{R}}(\mathbf{V}\mathbf{X}) = {\fancyscript{R}}(\mathbf{X})\) when \(\mathbf{V}\) is nonsingular; see Krämer (1980) for a discussion related to Kruskal’s Theorem, and Puntanen et al. (2011, Proposition 10.1). Motivated by Theorem 4, we obtain the following result.

Theorem 5

Under the model (1), let \(\mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \) and \(\mathbf{M} = \mathbf{I}_n - \mathbf{H}\). Then, the following conditions are equivalent:

  1. (i)

    \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta }) = \mathsf{OLSE }(\mathbf{X} \varvec{\beta })\),

  2. (ii)

    \(\mathbf{H}\mathbf{V}\mathbf{M} = \mathbf{0}\),

  3. (iii)

    \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{0}\),

  4. (iv)

    \(\mathbf{H} + \mathbf{P}_{\mathbf{V}\mathbf{M}}\) is an orthogonal projector,

  5. (v)

    \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{P}_{\mathbf{V}\mathbf{M}}\mathbf{H}\),

  6. (vi)

    \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}\) is an orthogonal projector.

Proof

For the proof of the equivalence (i) \(\Leftrightarrow \) (ii) see Puntanen et al. (2011, Proposition 10.1).

To show that (ii) implies (iii) postmultiply \(\mathbf{H}\mathbf{V}\mathbf{M} = \mathbf{0}\) by \((\mathbf{V}\mathbf{M})^\dagger \) and refer to \(\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{V}\mathbf{M}(\mathbf{V}\mathbf{M})^\dagger \). To establish the reverse implication, note that the condition \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{0}\) entails \(\mathbf{H}\mathbf{V}\mathbf{M}(\mathbf{V}\mathbf{M})^\dagger = \mathbf{0}\). Postmultiplying this equality by \(\mathbf{V}\mathbf{M}\) leads to \(\mathbf{H}\mathbf{V}\mathbf{M} = \mathbf{0}\).

The equivalence (iii) \(\Leftrightarrow \) (iv) is well known; see e.g., Rao and Mitra (1971, Theorem 5.1.2).

The fact that (iii) \(\Rightarrow \) (v) is immediately seen by taking the transpose of \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{0}\). For the proof of the reverse implication, recall from (7) that \({\fancyscript{R}}(\mathbf{H})\cap {\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) = \{\mathbf{0}\}\). By condition (v) we have \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}\mathbf{x} = \mathbf{P}_{\mathbf{V}\mathbf{M}}\mathbf{H}\mathbf{x}\) for every vector \(\mathbf{x} \in \mathbb R ^{n,1}\). Since the left-hand side belongs to \({\fancyscript{R}}(\mathbf{H})\) and the right-hand side to \({\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}})\), this common vector lies in \({\fancyscript{R}}(\mathbf{H})\cap {\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) = \{\mathbf{0}\}\), whence \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}\mathbf{x} = \mathbf{0}\) for every \(\mathbf{x}\), i.e., \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{0}\).

The part (v) \(\Leftrightarrow \) (vi) is also known in the literature; see e.g., Baksalary et al. (2002, Theorem). \(\square \)
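The conditions of Theorem 5 can be contrasted numerically on two hypothetical examples: the toy \(\mathbf{V}\) used so far, which violates them, and \(\mathbf{V}_2 = \mathbf{X}\mathbf{X}^\prime + \mathbf{M}\), a positive definite choice that commutes with \(\mathbf{H}\) and hence satisfies all of them. This is only an illustrative sketch.

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 0., 1.], [0., 1., 1.], [0., 1., 1.]])
n = X.shape[0]
H = X @ np.linalg.pinv(X)
M = np.eye(n) - H

def check(V):
    VM = V @ M
    P = VM @ np.linalg.pinv(VM)                              # P_VM
    S, HP = H + P, H @ P
    return (np.allclose(H @ V @ M, 0),                       # (ii)
            np.allclose(HP, 0),                              # (iii)
            np.allclose(S @ S, S) and np.allclose(S, S.T),   # (iv)
            np.allclose(HP, P @ H),                          # (v)
            np.allclose(HP @ HP, HP) and np.allclose(HP, HP.T))  # (vi)

print(check(np.diag([1., 1., 2., 0.])))   # all False: BLUE(Xb) != OLSE(Xb)
print(check(X @ X.T + M))                 # all True:  BLUE(Xb) == OLSE(Xb)
```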

Krämer (1980) showed how his theorem characterizing the vectors \(\mathbf{y}\) that ensure the coincidence of \(\mathsf{BLUE }(\varvec{\beta })\) and \(\mathsf{OLSE }(\varvec{\beta })\) can be used to prove Kruskal’s Theorem. This can be done in a similar fashion in the present set-up.

Theorem 6

Under the model (1), let \(\mathbf{H} = \mathbf{X}\mathbf{X}^\dagger \) and \(\mathbf{M} = \mathbf{I}_n - \mathbf{H}\). Then, the following conditions are equivalent:

  1. (i)

    \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta }) = \mathsf{OLSE }(\mathbf{X} \varvec{\beta })\),

  2. (ii)

    \({\fancyscript{N}}(\mathbf{H}) = {\fancyscript{N}}(\mathbf{H}\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}) \cap {\fancyscript{N}}(\mathbf{H})\).

Proof

First we show that (ii) implies (i). Condition (ii) ensures that \({\fancyscript{N}}(\mathbf{H}) \subseteq {\fancyscript{N}}(\mathbf{H}\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}})\), which yields \({\fancyscript{R}}(\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H}) \subseteq {\fancyscript{R}}(\mathbf{H})\). Thus, \(\mathbf{H}\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H} = \overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}}\mathbf{H}\). In consequence, \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}\mathbf{H} = \mathbf{P}_{\mathbf{V}\mathbf{M}}\mathbf{H}\), and taking the transpose gives \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}\mathbf{H} = \mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}\), whence \(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{P}_{\mathbf{V}\mathbf{M}}\mathbf{H}\). The implication now follows on account of point (v) of Theorem 5.

The part (i) \(\Rightarrow \) (ii) is established in a similar fashion by reversing the preceding chain. \(\square \)

From the discussion preceding Corollary 1 it follows that \({\fancyscript{N}}(\mathbf{H}\overline{\mathbf{P}}_{\mathbf{V}\mathbf{M}})\), specified in (10), coincides with \({\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}})\) when (4) holds. In such a case, the condition (ii) of Theorem 6 reduces to \({\fancyscript{N}}(\mathbf{H}) = {\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) \cap {\fancyscript{N}}(\mathbf{H})\), or, equivalently, to \({\fancyscript{R}}(\mathbf{P}_{\mathbf{V}\mathbf{M}}) \subseteq {\fancyscript{N}}(\mathbf{H})\), i.e., \({\fancyscript{R}}(\mathbf{V}\mathbf{M}) \subseteq {\fancyscript{R}}(\mathbf{M})\); note that under (4) we have \({\mathrm{rank}}(\mathbf{V}\mathbf{M}) = {\mathrm{rank}}(\mathbf{M})\), so that either inclusion between \({\fancyscript{R}}(\mathbf{V}\mathbf{M})\) and \({\fancyscript{R}}(\mathbf{M})\) amounts to their equality. When \(\mathbf{V}\) is nonsingular, we get \({\fancyscript{R}}(\mathbf{V}\mathbf{M}) = {\fancyscript{R}}(\mathbf{M})\), which is the final condition of Kruskal’s Theorem in Krämer (1980).

Another observation is that the conditions of Theorems 5 and 6, unlike the customary conditions listed at the beginning of the present section, predominantly deal with orthogonal projectors. Thus, the equality of \(\mathsf{BLUE }(\mathbf{X} \varvec{\beta })\) and \(\mathsf{OLSE }(\mathbf{X} \varvec{\beta })\) is characterized in a more symmetric way.

When \(\mathbf{P}_{\mathbf{V}\mathbf{M}}\) has the representation (6), the following equivalences hold among the statements of Theorem 5:

\(\mathbf{H } + \mathbf{P }_{\mathbf{V }\mathbf{M }}\) is an orthogonal projector if and only if \(\mathbf{A } = \mathbf{0 }\),

\(\mathbf{H }\mathbf{P }_{\mathbf{V }\mathbf{M }} = \mathbf{P }_{\mathbf{V }\mathbf{M}}\mathbf{H}\) if and only if \(\mathbf{B } = \mathbf{0 }\).

Note that the condition \(\mathbf{A} = \mathbf{0}\) is in general stronger than \(\mathbf{B} = \mathbf{0}\), but in the present set-up the two are equivalent. There exist a large number of further conditions equivalent to condition (v) of Theorem 5, for instance:

$$\begin{aligned}&\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}} = \mathbf{P}_{{\fancyscript{R}}(\mathbf{H}) \cap {\fancyscript{R}}(\mathbf{V}\mathbf{M})},\\&{\fancyscript{R}}(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}) = {\fancyscript{R}}(\mathbf{H}) \cap {\fancyscript{R}}(\mathbf{V}\mathbf{M}),\\&{\mathrm{rank}}(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}) = \dim [{\fancyscript{R}}(\mathbf{H}) \cap {\fancyscript{R}}(\mathbf{V}\mathbf{M})],\\&{\mathrm{rank}}(\mathbf{H} - \mathbf{P}_{\mathbf{V}\mathbf{M}}) = {\mathrm{rank}}(\mathbf{H} + \mathbf{P}_{\mathbf{V}\mathbf{M}}) - {\mathrm{rank}}(\mathbf{H}\mathbf{P}_{\mathbf{V}\mathbf{M}}); \end{aligned}$$

see Baksalary and Trenkler (2008).
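To close, two of the rank conditions above can be verified directly in the commuting case; a last sketch, again using the hypothetical commuting example \(\mathbf{V}_2 = \mathbf{X}\mathbf{X}^\prime + \mathbf{M}\) introduced earlier.

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 0., 1.], [0., 1., 1.], [0., 1., 1.]])
n = X.shape[0]
H = X @ np.linalg.pinv(X)
M = np.eye(n) - H
V2 = X @ X.T + M            # H V2 = V2 H, so BLUE = OLSE and (v) of Theorem 5 holds

VM = V2 @ M
P_VM = VM @ np.linalg.pinv(VM)
rk = np.linalg.matrix_rank

dim_cap = rk(H) + rk(VM) - rk(np.hstack([H, VM]))      # dim[R(H) ∩ R(VM)]
print(rk(H @ P_VM) == dim_cap)                         # rank(H P_VM) = dim[...]
print(rk(H - P_VM) == rk(H + P_VM) - rk(H @ P_VM))     # the last rank identity
```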