1 Introduction

Although known earlier to Dodgson [8] and Jordan (see Durand [9]), the fraction-free method for exact matrix computations became well known through its application by Bareiss [1] to the solution of a linear system over \(\mathbb {Z}\), and later over an integral domain [2]. He implemented fraction-free Gaussian elimination of the augmented matrix \([A\ B]\), and kept all computations in \(\mathbb {Z}\) until a final division step. Since, in linear algebra, equation solving is related to the matrix factorizations LU and QR, it is natural that fraction-free methods would later be extended to those factorizations. The forms of the factorizations, however, had to be modified from their floating-point counterparts in order to retain purely integral data. The first proposed modifications were based on inflating the initial data until all divisions were guaranteed to be exact; see for example Lee and Saunders [17], Nakos et al. [21] and Corless and Jeffrey [7]. This strategy, however, led to the entries in the L and U matrices becoming very large, and an alternative form was presented in Zhou and Jeffrey [26], which is described below. Similarly, fraction-free Gram–Schmidt orthogonalization and QR factorization were studied in Erlingsson et al. [10] and Zhou and Jeffrey [26]. Further extensions have addressed fraction-free full-rank factoring of non-invertible matrices and fraction-free computation of the Moore–Penrose inverse [15]. More generally, applications exist in areas such as the Euclidean algorithm and the Berlekamp–Massey algorithm [16].

More general domains are possible, and here we consider matrices over a principal ideal domain \(\mathbb {D}\). For the purpose of giving illustrative examples and conducting computational experiments, matrices over \(\mathbb {Z}\) and \(\mathbb {Q}[x]\) are used, because these domains are well established and familiar to readers. We emphasize, however, that the methods here apply for all principal ideal domains, as opposed to methods that target specific domains, such as Giesbrecht and Storjohann [12] and Pauderis and Storjohann [24].

The shift from equation solving to matrix factorization has the effect of making visible the intermediate results, which are not displayed in the original Bareiss implementation. Because of this, it becomes apparent that the columns and rows of the L and U matrices frequently contain common factors, which otherwise pass unnoticed. We consider here how these factors arise, and what consequences there are for the computations.

Our starting point is a fraction-free form for LU decomposition [15]: given a matrix A over \(\mathbb {D}\),

$$\begin{aligned} A = P_r LD^{-1} U P_c, \end{aligned}$$

where L and U are lower and upper triangular matrices, respectively, D is a diagonal matrix, and the entries of L, D, and U are from \(\mathbb {D}\). The permutation matrices \(P_r\) and \(P_c\) ensure that the decomposition is always a full-rank decomposition, even if A is rectangular or rank deficient; see Sect. 2. The decomposition is computed by a variant of Bareiss’s algorithm [2]. In Sect. 6, the \(L D^{-1} U\) decomposition is also the basis of a fraction-free QR decomposition.

The key feature of Bareiss’s algorithm is that it creates factors which are common to every element in a row, but which can then be removed by exact divisions. We refer to such factors, which appear predictably owing to the decomposition algorithm, as “systematic factors”. There are, however, other common factors which occur with computable probability, but which depend upon the particular data present in the input matrix. We call such factors “statistical factors”. In this paper we discuss the origins of both kinds of common factors and show that we can predict a nontrivial proportion of them from simple considerations.

Once the existence of common factors is recognized, it is natural to consider what consequences, if any, there are for the computation, or application, of the factorizations. Some consequences we shall consider include a lack of uniqueness in the definition of the LU factorization, and whether the common factors add significantly to the sizes of the elements in the constituent factors. This in turn leads to questions regarding the benefits of removing common factors, and what computational cost is associated with such benefits.

A synopsis of the paper is as follows. After recalling Bareiss’s algorithm, the \(L D^{-1} U\) decomposition, and the algorithm from Jeffrey [15] in Sect. 2, we establish, in Sect. 3, a relation between the systematic common row factors of U and the entries in the Smith–Jacobson normal form of the same input matrix A. In Sect. 4 we propose an efficient way of identifying some of the systematic common row factors introduced by Bareiss’s algorithm; these factors can then be easily removed by exact division. In Sect. 5 we present a detailed analysis concerning the expected number of statistical common factors in the special case \(\mathbb {D}=\mathbb {Z}\), and we find perfect agreement with our experimental results. We conclude that the factors make a measurable contribution to the element size, but they do not impose a serious burden on calculations.

In Sect. 6 we investigate the QR factorization. In this context, the orthonormal Q matrix used in floating point calculations is replaced by a \(\Theta \) matrix, which is left-orthogonal, i.e. \(\Theta ^t\Theta \) is diagonal, but \(\Theta \Theta ^t\) is not. We show that, for a square matrix A, the last column of \(\Theta \), as calculated by existing algorithms, is subject to an exact division by the determinant of A, with a possibly significant reduction in size.

Throughout the paper, we employ the following notation. We assume, unless otherwise stated, that the ring \(\mathbb {D}\) is an arbitrary principal ideal domain. We denote the set of all m-by-n matrices over \(\mathbb {D}\) by \(\mathbb {D}^{m\times n}\). We write \({\mathbf {1}}_{n}\) for the n-by-n identity matrix and \(\mathbf {0}_{m\times n}\) for the m-by-n zero matrix. We shall usually omit the subscripts if no confusion is possible. For \(A \in \mathbb {D}^{m\times n}\) and \(1 \le i \le m\), \({A}_{i,*}\) is the ith row of A. Similarly, \({A}_{*,j}\) is the jth column of A for \(1 \le j \le n\). If \(1 \le i_1 < i_2 \le m\) and \(1 \le j_1 < j_2 \le n\), we use \({A}_{{i_{1}\ldots i_{2}},{j_{1}\ldots j_{2}}}\) to refer to the submatrix of A made up from the entries of the rows \(i_1\) to \(i_2\) and the columns \(j_1\) to \(j_2\). Given elements \(a_1,\ldots ,a_n \in \mathbb {D}\), with \({{\,\mathrm{diag}\,}}(a_1,\ldots ,a_n)\) we refer to the diagonal matrix that has \(a_j\) as the entry at position (j, j) for \(1 \le j \le n\). We will use the same notation for block diagonal matrices.

We denote the set of all column vectors of length m with entries in \(\mathbb {D}\) by \(\mathbb {D}^{m}\) and that of all row vectors of length n by \(\mathbb {D}^{1\times n}\). If \(\mathbb {D}\) is a unique factorization domain and \(v = (v_1,\ldots ,v_n) \in \mathbb {D}^{1\times n}\), then we set \(\gcd (v) = \gcd (v_1,\ldots ,v_n)\). Moreover, with \(d \in \mathbb {D}\) we write \(d \mid v\) if \(d \mid v_1 \wedge \cdots \wedge d \mid v_n\) (or, equivalently, if \(d \mid \gcd (v)\)). We also use the same notation for column vectors.

We will sometimes write column vectors \(w \in \mathbb {D}^{m}\) with an underline \(\underline{w}\) and row vectors \(v \in \mathbb {D}^{1\times n}\) with an overline \(\overline{v}\) if we want to emphasize the specific type of vector.

2 Bareiss’s Algorithm and the \(L D^{-1} U\) Decomposition

For the convenience of the reader, we start by recalling Bareiss’s algorithm [2]. Let \(\mathbb {D}\) be an integral domain, and let \(A \in \mathbb {D}^{n\times n}\) be a matrix and \(b \in \mathbb {D}^{n}\) be a vector. Bareiss modified the usual Gaussian elimination with the aim of keeping all calculations in \(\mathbb {D}\) until the final step. If this is done naïvely then the entries increase in size exponentially. Bareiss used results from Sylvester and Jordan to reduce this to linear growth. Bareiss defined the notation

$$\begin{aligned} A^{(k)}_{ij} = \det \begin{bmatrix} A_{1,1} &{}\quad \cdots &{}\quad A_{1,k} &{}\quad A_{1,j} \\ \vdots &{}\quad \ddots &{}\quad \vdots &{}\quad \vdots \\ A_{k,1} &{}\quad \cdots &{}\quad A_{k,k} &{}\quad A_{k,j} \\ A_{i,1} &{}\quad \cdots &{}\quad A_{i,k} &{}\quad A_{i,j} \end{bmatrix}\ , \end{aligned}$$

for \(i>k\) and \(j>k\), and with special cases \(A^{(0)}_{i,j}=A_{i,j}\) and \(A^{(-1)}_{0,0}=1\).

We start with division-free Gaussian elimination, which is a simple cross-multiplication scheme, and denote the result after k steps by \(A^{[k]}_{i,j}\). We assume that any pivoting permutations have been completed and need not be considered further. The result of one step is

$$\begin{aligned} A^{[1]}_{i,j}= A_{1,1}A_{i,j}-A_{i,1}A_{1,j} =\det \begin{bmatrix} A_{1,1} &{}\quad A_{1,j} \\ A_{i,1} &{}\quad A_{i,j} \end{bmatrix} = A^{(1)}_{i,j}\ , \end{aligned}$$

and the two quantities \(A^{[1]}_{i,j}\) and \(A^{(1)}_{i,j}\) are equal. A second step, however, leads to

$$\begin{aligned} A^{[2]}_{i,j}= A^{[1]}_{2,2}A^{[1]}_{i,j}-A^{[1]}_{i,2}A^{[1]}_{2,j} = A_{1,1}\det \begin{bmatrix} A_{1,1} &{}\quad A_{1,2} &{}\quad A_{1,j} \\ A_{2,1} &{}\quad A_{2,2} &{}\quad A_{2,j} \\ A_{i,1} &{}\quad A_{i,2} &{}\quad A_{i,j} \end{bmatrix} = A_{1,1} A^{(2)}_{i,j}\ . \end{aligned}$$

Thus, as stated in Sect. 1, simple cross-multiplication introduces a systematic common factor in all entries \(i,j>2\). This effect continues for general k (see [2]), and leads to exponential growth in the size of the terms. Since the systematic factor is known, it can be removed by an exact division, and then the terms grow linearly in size. Thus Bareiss’s algorithm is

$$\begin{aligned} A^{(k+1)}_{i,j} =\frac{1}{A^{(k-1)}_{k,k}}\left( A^{(k)}_{k+1,k+1} A^{(k)}_{i,j}-A^{(k)}_{i,k+1}A^{(k)}_{k+1,j} \right) \ , \end{aligned}$$

and the division is exact. The elements of the reduced matrix are thus minors of A. Bareiss’s main interest was to advocate a ‘two-step’ method, wherein one proceeds from step k to step \(k+2\) directly, rather than by repeated single steps. Improved efficiency is claimed for the two-step method, but the results obtained are the same, and we shall not consider it here.
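As an illustration, the recurrence can be implemented in a few lines. The following Python sketch (our own, not the paper's implementation) assumes \(\mathbb {D}=\mathbb {Z}\) and that no pivoting is needed:

```python
def bareiss(M):
    """One-step Bareiss elimination of a square integer matrix.

    Assumes all leading principal minors are non-zero, so that no
    pivoting is required.  On return, entry (i, j) with j >= i holds
    the minor A^{(i-1)}_{i,j}; in particular the last diagonal entry
    is det(M).
    """
    n = len(M)
    A = [row[:] for row in M]
    prev = 1  # A^{(k-1)}_{k,k}, with the convention A^{(-1)}_{0,0} = 1
    for k in range(n - 1):
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # cross-multiplication followed by an exact division
                A[i][j] = (A[k][k] * A[i][j] - A[i][k] * A[k][j]) // prev
            A[i][k] = 0
        prev = A[k][k]
    return A

print(bareiss([[2, 3, 5], [7, 11, 13], [17, 19, 23]])[2][2])  # -> -78, the determinant
```

Every intermediate entry is a minor of the input, so the growth of the entries is only linear in the bit size, as described above.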

In Jeffrey [15], Bareiss’s algorithm was used to obtain a fraction-free variant of the LU factorization of A. We quote the main result from that paper here as Theorem 1. The idea behind the factorization is that schemes which inflate the initial matrix A, such as Lee and Saunders [17], Nakos et al. [21], and Corless and Jeffrey [7], do not avoid the quotient field, but merely move the divisors to the other side of the defining equation, at the cost of significant inflation. In any subsequent application, the divisors will have to move back, and the inflation will have to be reversed. In contrast, the present factorization isolates the divisors in an explicit inverse matrix. The matrices \(P_r, L, D, U, P_c\) appearing in the decomposition below contain only elements from \(\mathbb {D}\), but the inverse of D, if it were evaluated, would have to contain elements from the quotient field. By expressing the factorization in a form containing \(D^{-1}\) unevaluated, all calculations can stay within \(\mathbb {D}\).

Theorem 1

(Jeffrey [15, Thm. 2]). A rectangular matrix A with elements from an integral domain \(\mathbb {D}\), having dimensions \(m\times n\) and rank r, may be factored into matrices containing only elements from \(\mathbb {D}\) in the form

$$\begin{aligned} A = P_r L D^{-1} U P_c = P_r \begin{pmatrix} \mathcal {L} \\ \mathcal {M} \end{pmatrix} D^{-1} \begin{pmatrix} \mathcal {U}&\quad \mathcal {V} \end{pmatrix} P_c \end{aligned}$$

where the permutation matrix \(P_r\) is \(m\times m\); the permutation matrix \(P_c\) is \(n\times n\); \(\mathcal {L}\) is \(r\times r\), lower triangular and has full rank:

$$\begin{aligned} \mathcal {L} = \begin{bmatrix} A^{(0)}_{1,1} \\ A^{(0)}_{2,1} &{}{}\quad A^{(1)}_{2,2} \\ \vdots &{}{}\quad \vdots &{}{}\quad \ddots \\ A^{(0)}_{r,1} &{}{}\quad A^{(1)}_{r,2} &{}{}\quad \cdots &{}{}\quad A^{(r-1)}_{r,r} \\ \end{bmatrix}\ ; \end{aligned}$$

\(\mathcal {M}\) is \((m-r)\times r\) and could be null; \(\mathcal {U}\) is \(r\times r\) and upper triangular, while \(\mathcal {V}\) is \(r\times (n-r)\) and could be null:

$$\begin{aligned} \mathcal {U} = \begin{bmatrix} A^{(0)}_{1,1} &{}{}\quad A^{(0)}_{1,2} &{}{}\quad \cdots &{}{}\quad A^{(0)}_{1,r} \\ &{}{}\quad A^{(1)}_{2,2} &{}{}\quad \cdots &{}{}\quad A^{(1)}_{2,r} \\ &{}{}\quad &{}{}\quad \ddots &{}{}\quad \vdots \\ &{}{}\quad &{}{}\quad &{}{}\quad A^{(r-1)}_{r,r} \\ \end{bmatrix}\ . \end{aligned}$$

Finally, the D matrix is

$$\begin{aligned} D^{-1} = \begin{bmatrix} A^{(-1)}_{0,0} A^{(0)}_{1,1} &{}\quad \\ &{}\quad A^{(0)}_{1,1} A^{(1)}_{2,2} &{} \\ &{}\quad &{}\quad \ddots \\ &{}\quad &{}\quad &{}\quad A^{(r-2)}_{r-1,r-1}A^{(r-1)}_{r,r} \\ \end{bmatrix}^{-1}\ . \end{aligned}$$

Remark 2

It is convenient to call the diagonal elements \(A^{(k-1)}_{k,k}\) pivots. They drive the pivoting strategy, which determines \(P_r\), and they are used for the exact-division step (2.4) in Bareiss’s algorithm.

Remark 3

As in numerical linear algebra, the \(LD^{-1}U\) decomposition can be stored in a single matrix, since the diagonal (pivot) elements need only be stored once.

The proof of Theorem 1 given in Jeffrey [15] outlines an algorithm for the computation of the \(L D^{-1} U\) decomposition. The algorithm is a variant of Bareiss’s algorithm [1], and yields the same U. The difference is that Jeffrey [15] also explains how to obtain L and D in a fraction-free way.

Algorithm 4

(\(LD^{-1}U\) decomposition)


Input: A matrix \(A \in \mathbb {D}^{m\times n}\).


Output: The \(LD^{-1}U\) decomposition of A as in Theorem 1.

1. Initialize \(p_0 = 1\), \(P_r = {\mathbf {1}}_{m}\), \(L = \mathbf {0}_{m\times m}\), \(U = A\) and \(P_c = {\mathbf {1}}_{n}\).

2. For each \(k = 1,\ldots ,\min \{m,n\}\):

    (a) Find a non-zero pivot \(p_k\) in \({U}_{{k\ldots m},{k\ldots n}}\) and bring it to position (k, k), recording the row and column swaps in \(P_r\) and \(P_c\). Also apply the row swaps to L accordingly. If no pivot is found, then set \(r = k-1\) and exit the loop.

    (b) Set \(L_{k,k} = p_k\) and \(L_{i,k} = U_{i,k}\) for \(i=k+1,\ldots ,m\). Eliminate the entries in the kth column of U below the kth row by cross-multiplication; that is, for \(i > k\) set \({U}_{i,*}\) to \(p_k {U}_{i,*} - U_{i,k} {U}_{k,*}\).

    (c) Perform division by \(p_{k-1}\) on the rows beneath the kth in U; that is, for \(i > k\) set \({U}_{i,*}\) to \({U}_{i,*} / p_{k-1}\). Note that the divisions will be exact.

3. If r has not been set yet, set \(r = \min \{m,n\}\).

4. If \(r < m\), then trim the last \(m-r\) columns from L as well as the last \(m-r\) rows from U.

5. Set \(D = {{\,\mathrm{diag}\,}}(p_1, p_1 p_2, \ldots , p_{r-1} p_r)\).

6. Return \(P_r\), L, D, U, and \(P_c\).

The algorithm does not specify the choice of pivot in step 2a. Conventional wisdom (see, for example, Geddes et al. [11]) is that in exact algorithms choosing the smallest possible pivot (measured in a way suitable for \(\mathbb {D}\)) will lead to the smallest output sizes. We have been able to confirm this experimentally in Middeke and Jeffrey [18] for \(\mathbb {D}= \mathbb {Z}\), where size was measured by absolute value. In step 2c the divisions are guaranteed to be exact. Thus, an implementation can use more efficient procedures for this step if available (for example, for big integers, mpz_divexact in the GMP library, which is based on Jebelean [14], instead of regular division).
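The steps above can be sketched concisely over \(\mathbb {Z}\). The following Python rendering is our own minimal sketch (it takes the first non-zero entry as pivot rather than the smallest one, and uses 0-based indices); the permutations are returned as lists, so that row i and column j of \(L D^{-1} U\) correspond to row Pr[i] and column Pc[j] of A:

```python
from fractions import Fraction

def ldu(A):
    """Sketch of Algorithm 4: fraction-free LD^{-1}U decomposition over Z."""
    m, n = len(A), len(A[0])
    U = [row[:] for row in A]
    L = [[0] * m for _ in range(m)]
    Pr, Pc = list(range(m)), list(range(n))
    piv = [1]                              # p_0 = 1
    r = min(m, n)
    for k in range(min(m, n)):
        # step 2a: find a non-zero pivot in U[k:, k:]
        pos = next(((i, j) for j in range(k, n) for i in range(k, m)
                    if U[i][j] != 0), None)
        if pos is None:
            r = k                          # only k pivots were found
            break
        i0, j0 = pos
        U[k], U[i0] = U[i0], U[k]          # row swap, applied to L as well
        L[k], L[i0] = L[i0], L[k]
        Pr[k], Pr[i0] = Pr[i0], Pr[k]
        for row in U:                      # column swap
            row[k], row[j0] = row[j0], row[k]
        Pc[k], Pc[j0] = Pc[j0], Pc[k]
        pk = U[k][k]
        L[k][k] = pk
        for i in range(k + 1, m):
            b = U[i][k]
            L[i][k] = b
            # step 2b (cross-multiplication) and 2c (exact division by p_{k-1})
            U[i] = [(pk * U[i][j] - b * U[k][j]) // piv[-1] for j in range(n)]
        piv.append(pk)
    L = [row[:r] for row in L]             # step 4: trim to an m-by-r matrix
    U = U[:r]
    D = [piv[i] * piv[i + 1] for i in range(r)]  # diag(p1, p1 p2, ...)
    return Pr, L, D, U, Pc

A = [[2, 3, 5], [7, 11, 13], [17, 19, 23]]
Pr, L, D, U, Pc = ldu(A)
# verify A = Pr L D^{-1} U Pc with exact rational arithmetic
assert all(sum(Fraction(L[i][k], D[k]) * U[k][j] for k in range(len(D)))
           == A[Pr[i]][Pc[j]] for i in range(3) for j in range(3))
```

Note that the pivots are stored only once, on the diagonals of both L and U, in line with Remark 3.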

One of the goals of the present paper is to discuss improvements to the decomposition explained above. Throughout this paper we shall use the term \(L D^{-1} U\) decomposition to mean exactly the decomposition from Theorem 1 as computed by Algorithm 4. For the variations of this decomposition we introduce the following term:

Definition 5

(Fraction-Free LU Decomposition). For a matrix \(A \in \mathbb {D}^{m\times n}\) of rank r we say that \(A = P_r L D^{-1} U P_c\) is a fraction-free LU decomposition if \(P_r \in \mathbb {D}^{m\times m}\) and \(P_c \in \mathbb {D}^{n\times n}\) are permutation matrices, \(L \in \mathbb {D}^{m\times r}\) has \(L_{ij} = 0\) for \(j > i\) and \(L_{ii} \ne 0\) for all i, \(U \in \mathbb {D}^{r\times n}\) has \(U_{ij} = 0\) for \(i > j\) and \(U_{ii} \ne 0\) for all i, and \(D \in \mathbb {D}^{r\times r}\) is a diagonal matrix (with full rank).

We will usually refer to matrices \(L \in \mathbb {D}^{m\times r}\) with \(L_{ij} = 0\) for \(j > i\) and \(L_{ii} \ne 0\) for all i as lower triangular and to matrices \(U \in \mathbb {D}^{r\times n}\) with \(U_{ij} = 0\) for \(i > j\) and \(U_{ii} \ne 0\) for all i as upper triangular even if they are not square.

As mentioned in the introduction, Algorithm 4 does result in common factors in the rows of the output U and the columns of L. In the following sections, we will explore methods to explain and predict those factors. The next result asserts that we can cancel from the final output all common factors which we find. This yields a fraction-free LU decomposition of A in which the entries of U (and L) are smaller than in the \(L D^{-1} U\) decomposition.

Corollary 6

Given a matrix \(A \in \mathbb {D}^{m\times n}\) with rank r and its standard \(L D^{-1} U\) decomposition \(A = P_r L D^{-1} U P_c\), if \(D_U = {{\,\mathrm{diag}\,}}(d_1,\ldots ,d_r)\) is a diagonal matrix with \(d_k \mid U_{k,*}\) for \(k = 1, \ldots , r\), then setting \(\hat{U} = D_U^{-1} U\) and \(\hat{D} = D D_U^{-1}\), both matrices are fraction-free and we have the decomposition \(A = P_r L \hat{D}^{-1} \hat{U} P_c\).


Proof

By Theorem 1, the diagonal entries of U are the pivots chosen during the decomposition, and they also divide the diagonal entries of D. Thus, any common divisor of \(U_{k,*}\) will also divide \(D_{kk}\), and therefore both \(\hat{U}\) and \(\hat{D}\) are fraction-free. We can easily check that \(A = P_r L D^{-1} D_U D_U^{-1} U P_c = P_r L \hat{D}^{-1} \hat{U} P_c\). \(\square \)
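A tiny hand-worked instance (our own, not from the paper) illustrates the cancellation: for \(A = \left({\begin{matrix}2&4\\6&8\end{matrix}}\right)\) no permutations are needed, the pivots are \(p_1 = 2\) and \(p_2 = -8\), and the second row of U has the common factor 8:

```python
from fractions import Fraction
from math import gcd

# LD^{-1}U decomposition of A = [[2, 4], [6, 8]] (no permutations needed)
L = [[2, 0], [6, -8]]
D = [2, -16]                  # diag(p1, p1*p2)
U = [[2, 4], [0, -8]]

# common factor of each row of U; Corollary 6 allows cancelling it
DU = [gcd(*row) for row in U]                    # [2, 8]
U_hat = [[e // d for e in row] for row, d in zip(U, DU)]
D_hat = [d // du for d, du in zip(D, DU)]        # divisions are exact

def reconstruct(L, D, U):
    """Evaluate L * D^{-1} * U with exact rationals."""
    r = len(D)
    return [[sum(Fraction(L[i][k], D[k]) * U[k][j] for k in range(r))
             for j in range(len(U[0]))] for i in range(len(L))]

assert reconstruct(L, D, U) == reconstruct(L, D_hat, U_hat) == [[2, 4], [6, 8]]
```

Here \(\hat{U} = \left({\begin{matrix}1&2\\0&-1\end{matrix}}\right)\) and \(\hat{D} = {{\,\mathrm{diag}\,}}(1,-2)\); both remain over \(\mathbb {Z}\) because each cancelled row factor also divides the corresponding diagonal entry of D.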

Remark 7

If we predict common column factors of L we can cancel them in the same way. However, if we have already canceled factors from U, then there is no guarantee that \(d \mid L_{*,k}\) implies \(d \mid \hat{D}_{kk}\). Thus, in general we can only cancel \(\gcd (d, \hat{D}_{kk})\) from \(L_{*,k}\) (if \(\mathbb {D}\) allows greatest common divisors). The same holds mutatis mutandis if we cancel the factors from L first.

It is an interesting question for future research whether it is better to cancel as many factors as possible from U or to cancel them from L.

3 LU and the Smith–Jacobson Normal Form

This section explains a connection between “systematic factors” (that is, common factors which appear in the decomposition due to the algorithm being used) and the Smith–Jacobson normal form. For Smith’s normal form, see [5, 20], and for Jacobson’s generalization, see [22]. Given a matrix A over a principal ideal domain \(\mathbb {D}\), we study the decomposition \(A=P_rLD^{-1}UP_c\). For simplicity, from now on we consider the decomposition in the form \(P_r^{-1} A P_c^{-1} = L D^{-1} U.\) The following theorem connecting the \(LD^{-1}U\) decomposition with the Smith–Jacobson normal form can essentially be found in [2].

Theorem 8

Let the matrix \(A \in \mathbb {D}^{n\times n}\) have the Smith–Jacobson normal form \(S = {{\,\mathrm{diag}\,}}(d_1,\ldots ,d_n)\) where \(d_1,\ldots ,d_n \in \mathbb {D}\). Moreover, let \(A = L D^{-1} U\) be an \(L D^{-1} U\) decomposition of A without permutations. Then for \(k=1,\ldots ,n\)

$$\begin{aligned} d_k^* = \prod _{j=1}^k d_j \mid U_{k,*} \quad \text {and}\quad d_k^* \mid L_{*,k}. \end{aligned}$$

Remark 9

The values \(d_1^*, \ldots , d_n^*\) are known in the literature as the determinantal divisors of A.


Proof

The diagonal entries of the Smith–Jacobson normal form are quotients of the determinantal divisors [20, II.15], i.e., \(d_1^* = d_1\) and \(d_k = d^*_k/d^*_{k-1}\) for \(k=2,\ldots ,n\). Moreover, \(d_k^*\) is the greatest common divisor of all \(k\times k\) minors of A for each \(k=1,\ldots ,n\). The entries in the kth row of U and in the kth column of L, however, are \(k\times k\) minors of A, as displayed in (2.5) and (2.6). \(\square \)

From Theorem 8, we obtain the following result.

Corollary 10

The kth determinantal divisor \(d_k^*\) can be removed from the kth row of U (since it divides \(D_{k,k}\), Corollary 6 applies), and \(d_{k-1}^*\) can also be removed from the kth column of L, because \(d_{k-1}^* \mid d_k^*\) and \(d_j^*\) divides the jth pivot for \(j = k-1,k\). Thus, \(d_{k-1}^* d_k^* \mid D_{k,k}\).

We illustrate this with an example over the domain \(\mathbb {Z}_3[t]\), the ring of polynomials over the finite field with three elements. Let \(A \in \mathbb {Z}_3[t]^{4\times 4}\) be the matrix

$$\begin{aligned} A = \begin{pmatrix} 2 t^{2} + t + 1 &{}\quad 0 &{}\quad t^{2} + 2 t &{}\quad 2 t^{3} + 2 t^{2} + 2 t + 2 \\ t^{3} + t^{2} + 2 t + 1 &{}\quad t^{2} &{}\quad 0 &{}\quad 2 t^{3} + t^{2} + 2 \\ t^{4} + t^{3} + t + 2 &{}\quad t^{3} + 2 t^{2} + t &{}\quad 2 t^{3} + t^{2} + t &{}\quad 2 t^{2} + t + 1 \\ 2 t &{}\quad t &{}\quad 2 t &{}\quad t^{2} + 2 t \end{pmatrix}. \end{aligned}$$

Computing the regular (that is, not fraction-free) LU decomposition yields \(A = L_0 U_0\) where

$$\begin{aligned} L_0 = \begin{pmatrix} 1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ \frac{- t^{3} - t^{2} + t - 1}{t^{2} - t - 1} &{}\quad 1 &{}\quad 0 &{}\quad 0 \\ \frac{- t^{4} - t^{3} - t + 1}{t^{2} - t - 1} &{}\quad \frac{t^{2} - t + 1}{t} &{}\quad 1 &{}\quad 0 \\ \frac{t}{t^{2} - t - 1} &{}\quad \frac{1}{t} &{}\quad \frac{t^{4} - t^{3} - t^{2} + t - 1}{t^{4} - t^{3} - t^{2} - 1} &{}\quad 1 \end{pmatrix} \end{aligned}$$


$$\begin{aligned} U_0 = \begin{pmatrix} - t^{2} + t + 1 &{}\quad 0 &{}\quad t^{2} - t &{}\quad -t^{3} - t^{2} - t - 1 \\ 0 &{}\quad t^{2} &{}\quad \frac{t^{5} + t^{3} - t^{2} - t}{t^{2} - t - 1} &{}\quad \frac{- t^{6} + t^{4} + t^{3} + t}{t^{2} - t - 1} \\ 0 &{}\quad 0 &{}\quad \frac{- t^{4} + t^{3} + t^{2} + 1}{t^{2} - t - 1} &{}\quad \frac{t^{5} - t^{4} + t^{3} - t^{2} - t - 1}{t^{2} - t - 1} \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad \frac{t^{2} - t}{t^{4} - t^{3} - t^{2} - 1} \end{pmatrix}. \end{aligned}$$

On the other hand, the \(L D^{-1} U\) decomposition for A is \(A = L D^{-1} U\) where

$$\begin{aligned} L = \begin{pmatrix} - (t^2 - t - 1) &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ t^3 + t^2 - t + 1 &{}\quad - t^2 (t^2 - t - 1) &{}\quad 0 &{}\quad 0 \\ (t^2 + 1) (t^2 + t - 1) &{}\quad - t (t + 1)^2 (t^2 - t - 1) &{}\quad (t + 1) t^2 (t^3 + t^2 + t - 1) &{}\quad 0 \\ - t &{}\quad - t (t^2 - t - 1) &{}\quad t^2 (t^4 - t^3 - t^2 + t - 1) &{}\quad (t - 1) t^3 \end{pmatrix}, \end{aligned}$$
$$\begin{aligned} D= & {} {{\,\mathrm{diag}\,}}\bigl (- (t^2 - t - 1),\; t^2 (t^2 - t - 1)^2,\\&\quad - (t + 1) t^4 (t^2 - t - 1) (t^3 + t^2 + t - 1),\;(t + 1) (t - 1) t^5 (t^3 + t^2 + t - 1)\bigr ) \end{aligned}$$


$$\begin{aligned} U = \begin{pmatrix} - (t^2 - t - 1) &{}\quad 0 &{}\quad t (t - 1) &{}\quad - (t + 1) (t^2 + 1) \\ 0 &{}\quad - t^2 (t^2 - t - 1) &{}\quad - t (t - 1) (t^3 + t^2 - t + 1) &{}\quad t (t^5 - t^3 - t^2 - 1) \\ 0 &{}\quad 0 &{}\quad (t + 1) t^2 (t^3 + t^2 + t - 1) &{}\quad - t^2 (t^5 - t^4 + t^3 - t^2 - t - 1) \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad (t - 1) t^3 \end{pmatrix} \end{aligned}$$

(showing the entries completely factorised). The Smith–Jacobson normal form of A is

$$\begin{aligned} {{\,\mathrm{diag}\,}}\bigl (1, t, t, t (t-1)\bigr ); \end{aligned}$$

and thus the determinantal divisors are \(d_1^* = 1\), \(d_2^* = t\), \(d_3^* = t^2\), and \(d_4^* = t^3 (t-1)\). As we can see, \(d_j^*\) does indeed divide the jth row of U and the jth column of L for \(j=1,2,3,4\). Moreover, \(d_1^* d_2^* = t\) divides \(D_{2,2}\), \(d_2^* d_3^* = t^3\) divides \(D_{3,3}\), and \(d_3^* d_4^* = t^5 (t-1)\) divides \(D_{4,4}\).

4 Efficient Detection of Factors

When considering the output of Algorithm 4, we find an interesting relation between the entries of L and U which can be exploited in order to find “systematic” common factors in the \(L D^{-1} U\) decomposition. Theorem 11 below predicts a divisor of the common factor in the kth row of U by looking at just three entries of L. Likewise, we obtain a divisor of the common factor of the kth column of L from three entries of U. As in the previous section, let \(\mathbb {D}\) be a principal ideal domain. We remark that for general principal ideal domains the theorem below is more of a theoretical result: depending on the specific domain \(\mathbb {D}\), actually computing greatest common divisors might not be easy (or even possible). The theorem becomes algorithmic if we restrict \(\mathbb {D}\) to be a (computable) Euclidean domain. For other domains, the statement is still valid, but it is left to the reader to check whether algorithms for computing greatest common divisors exist.

Theorem 11

Let \(A\in \mathbb {D}^{m\times n}\) and let \(P_r L D^{-1} U P_c\) be the \(L D^{-1} U\) decomposition of A. Then

$$\begin{aligned} \Bigl . \frac{ \gcd (L_{k-1,k-1}, L_{k,k-1}) }{ \gcd (L_{k-1,k-1}, L_{k,k-1}, L_{k-2,k-2}) } \;\Bigm |\; {U}_{k,*} \Bigr . \end{aligned}$$


$$\begin{aligned} \Bigl . \frac{ \gcd (U_{k-1,k-1}, U_{k-1,k}) }{ \gcd (U_{k-1,k-1}, U_{k-1,k}, U_{k-2,k-2}) } \;\Bigm |\; {L}_{*,k} \Bigr . \end{aligned}$$

for \(k=2,\ldots ,m-1\) (where we use \(L_{0,0} = U_{0,0} = 1\) for \(k = 2\)).


Proof

Suppose that after \(k-1\) iterations of Bareiss’s algorithm we have reached the following state

$$\begin{aligned} A^{(k-1)} = \begin{pmatrix} T &{}\quad \underline{*} &{}\quad \underline{*} &{}\quad {\varvec{*}}\\ \overline{0} &{}\quad p &{}\quad * &{}\quad \overline{*} \\ \overline{0} &{}\quad 0 &{}\quad a &{}\quad \overline{v} \\ \overline{0} &{}\quad 0 &{}\quad b &{}\quad \overline{w} \\ \mathbf {0} &{}\quad \underline{0} &{}\quad \underline{*} &{}\quad {\varvec{*}}\end{pmatrix}\ , \end{aligned}$$

where T is an upper triangular matrix, \(p,a,b \in \mathbb {D}\), \(\overline{v}, \overline{w} \in \mathbb {D}^{1\times (n-k-1)}\) and the other overlined quantities are row vectors and the underlined quantities are column vectors. Assume that \(a \ne 0\) and that we choose it as a pivot. Continuing the computations we now eliminate b (and the entries below) by cross-multiplication

$$\begin{aligned} A^{(k-1)} \leadsto \begin{pmatrix} T &{}{}\quad \underline{*} &{}{}\quad \underline{*} &{}{}\quad {\varvec{*}}\\ \overline{0} &{}{}\quad p &{}{}\quad * &{}{}\quad \overline{*} \\ \overline{0} &{}{}\quad 0 &{}{}\quad a &{}{}\quad \overline{v} \\ \overline{0} &{}{}\quad 0 &{}{}\quad 0 &{}{}\quad a\overline{w} - b\overline{v} \\ \mathbf {0} &{}{}\quad \underline{0} &{}{}\quad \underline{0} &{}{}\quad {\varvec{*}}\end{pmatrix}. \end{aligned}$$

Here, we can see that any common factor of a and b will be a factor of every entry in that row, i.e., \(\gcd (a,b) \mid a\overline{w} - b\overline{v}\). However, we still have to carry out the exact division step. This leads to

$$\begin{aligned} A^{(k-1)} \leadsto \begin{pmatrix} T &{}\quad \underline{*} &{}\quad \underline{*} &{}\quad {\varvec{*}}\\ \overline{0} &{}\quad p &{}\quad * &{}\quad \overline{*} \\ \overline{0} &{}\quad 0 &{}\quad a &{}\quad \overline{v} \\ \overline{0} &{}\quad 0 &{}\quad 0 &{}\quad \frac{1}{p}(a\overline{w} - b\overline{v}) \\ \mathbf {0} &{}\quad \underline{0} &{}\quad \underline{0} &{}\quad {\varvec{*}}\end{pmatrix} = A^{(k)}. \end{aligned}$$

The division by p is exact. Some of the factors in p might be factors of a or b while others are hidden in \(\overline{v}\) or \(\overline{w}\). However, every common factor of a and b which is not also a factor of p will still be a common factor of the resulting row. In other words,

$$\begin{aligned} \Bigl . \frac{\gcd (a,b)}{\gcd (a,b,p)} \;\Bigm |\; \frac{1}{p}(a\overline{w} - b\overline{v}) \Bigr .. \end{aligned}$$

In fact, the factors do not need to be tracked during the \(L D^{-1} U\) reduction but can be computed afterwards: All the necessary entries a, b and p of \(A^{(k-1)}\) will end up as entries of L. More precisely, we shall have \(p = L_{k-2,k-2}\), \(a = L_{k-1,k-1}\) and \(b = L_{k,k-1}\).

Similar reasoning can be used to predict common factors in the columns of L. Here, we have to take into account that the columns of L are made up from entries in U during each iteration of the computation. \(\square \)

As a typical example consider the matrix

$$\begin{aligned} A = \begin{pmatrix} 8 &{}\quad 49 &{}\quad 45 &{}\quad -77 &{}\quad 66 \\ -10 &{}\quad -77 &{}\quad -19 &{}\quad -52 &{}\quad 48 \\ 51 &{}\quad 18 &{}\quad -81 &{}\quad 31 &{}\quad 69 \\ -97 &{}\quad -58 &{}\quad 37 &{}\quad 41 &{}\quad 22 \\ -60 &{}\quad 0 &{}\quad -25 &{}\quad -18 &{}\quad -92 \end{pmatrix}. \end{aligned}$$

This matrix has an \(L D^{-1} U\) decomposition with

$$\begin{aligned} L = \begin{pmatrix} 8 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ -10 &{}\quad -126 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 51 &{}\quad -2355 &{}\quad 134076 &{}\quad 0 &{}\quad 0 \\ -97 &{}\quad 4289 &{}\quad -233176 &{}\quad -28490930 &{}\quad 0 \\ -60 &{}\quad 2940 &{}\quad -148890 &{}\quad -53377713 &{}\quad 11988124645 \end{pmatrix} \end{aligned}$$

and with

$$\begin{aligned} U = \begin{pmatrix} 8 &{}\quad 49 &{}\quad 45 &{}\quad -77 &{}\quad 66 \\ 0 &{}\quad -126 &{}\quad 298 &{}\quad -1186 &{}\quad 1044 \\ 0 &{}\quad 0 &{}\quad 134076 &{}\quad -414885 &{}\quad 351648 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad -28490930 &{}\quad 55072620 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 11988124645 \end{pmatrix}. \end{aligned}$$

Note that in this example pivoting is not needed, i.e., we have \(P_r = P_c = {\mathbf {1}}\). The method outlined in Theorem 11 correctly predicts the common factor 2 in the second row of U, the factor 3 in the third row and the factor 2 in the fourth row. However, it does not detect the additional factor 5 in the fourth row of U.
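The predictions of Theorem 11 can be read off directly from the L factor above; the following short Python sketch (function name ours) evaluates the formula for \(k = 2, 3, 4\):

```python
from math import gcd

# the L factor from the example above
L = [
    [8,       0,        0,          0,           0],
    [-10,  -126,        0,          0,           0],
    [51,  -2355,   134076,          0,           0],
    [-97,  4289,  -233176,  -28490930,           0],
    [-60,  2940,  -148890,  -53377713, 11988124645],
]

def predicted_row_factor(L, k):
    """Divisor of U_{k,*} predicted by Theorem 11 (1-based k >= 2)."""
    p = L[k - 3][k - 3] if k >= 3 else 1     # L_{k-2,k-2}, with L_{0,0} = 1
    a, b = L[k - 2][k - 2], L[k - 1][k - 2]  # L_{k-1,k-1} and L_{k,k-1}
    return gcd(a, b) // gcd(gcd(a, b), p)

print([predicted_row_factor(L, k) for k in (2, 3, 4)])  # -> [2, 3, 2]
```

The predicted divisors 2, 3 and 2 for rows two, three and four agree with the factors visible in U, while the extra factor 5 in the fourth row remains undetected, as noted above.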

The example also provides an illustration of the proof of Theorem 8: The entry \(-414885\) of U at position (3, 4) is given by the determinant of the submatrix

$$\begin{aligned} \begin{pmatrix} 8 &{}\quad 49 &{}\quad -77 \\ -10 &{}\quad -77 &{}\quad -52 \\ 51 &{}\quad 18 &{}\quad 31 \\ \end{pmatrix} \end{aligned}$$

consisting of the first three rows and columns 1, 2 and 4 of A. In this particular example, however, the Smith–Jacobson normal form of the matrix A is \({{\,\mathrm{diag}\,}}(1,1,1,1,11988124645)\) which does not yield any information about the common factors.
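The determinantal divisors of this matrix can be checked by brute force, taking the gcd of all \(k\times k\) minors; the following Python sketch (ours, adequate only for tiny matrices) does so for the example:

```python
from math import gcd
from itertools import combinations

A = [
    [8,   49,  45, -77,  66],
    [-10, -77, -19, -52,  48],
    [51,  18, -81,  31,  69],
    [-97, -58,  37,  41,  22],
    [-60,  0,  -25, -18, -92],
]

def det(M):
    """Integer determinant by cofactor expansion (fine for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def determinantal_divisor(A, k):
    """d_k^* = gcd of all k-by-k minors of A."""
    g = 0
    for rows in combinations(range(len(A)), k):
        for cols in combinations(range(len(A[0])), k):
            g = gcd(g, det([[A[i][j] for j in cols] for i in rows]))
    return g

print([determinantal_divisor(A, k) for k in range(1, 6)])
```

This reproduces the determinantal divisors \(1, 1, 1, 1, 11988124645\) corresponding to the Smith–Jacobson normal form stated above.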

Given Theorem 11, one can ask how good this prediction actually is. Concentrating on the case of integer matrices, the following Theorem 12 shows that with this prediction we do find a common factor in roughly a quarter of all rows. Experimental data suggest a similar behavior for matrices containing polynomials in \(\mathbb {F}_p[x]\) where p is prime. Moreover, these experiments also showed that the prediction was able to account for \(40.17\%\) of all the common prime factors (counted with multiplicity) in the rows of U.

Theorem 12

For random integers \(a,b,p \in \mathbb {Z}\) the probability that the formula in Theorem 11 predicts a non-trivial common factor is

$$\begin{aligned} \mathrm {P}\Bigl (\frac{\gcd (a,b)}{\gcd (p,a,b)} \ne 1\Bigr ) = 1 - 6 \frac{\zeta (3)}{\pi ^2} \approx 26.92\%. \end{aligned}$$


Proof

The following calculation is due to Hare [13] and Winterhof [25]. First note that the probability that \(\gcd (a,b) = n\) is \(1/n^2\) times the probability that \(\gcd (a,b) = 1\). Summing up all of these probabilities gives

$$\begin{aligned} \sum _{n=1}^\infty \mathrm {P}\bigl (\gcd (a,b) = n\bigr ) = \sum _{n=1}^\infty \frac{1}{n^2} \mathrm {P}\bigl (\gcd (a,b) = 1\bigr ) = \mathrm {P}\bigl (\gcd (a,b) = 1\bigr ) \frac{\pi ^2}{6}. \end{aligned}$$

As this sum must be 1, we obtain \(\mathrm {P}\bigl (\gcd (a,b) = 1\bigr ) = 6/\pi ^2\) and hence \(\mathrm {P}\bigl (\gcd (a,b) = n\bigr ) = 6/(\pi ^2 n^2)\). Given that \(\gcd (a,b) = n\), the probability that \(n \mid p\) is 1/n. So the probability that \(\gcd (a,b) = n\) and \(\gcd (p,a,b) = n\) is \(6/(\pi ^2 n^3)\). Thus \(\mathrm {P}\bigl (\gcd (a,b)/\gcd (p,a,b) = 1\bigr )\) is

$$\begin{aligned} \sum _{n=1}^\infty \mathrm {P}\bigl (\gcd (a,b) = n \text { and } \gcd (p,a,b) = n\bigr ) = \sum _{n=1}^\infty \frac{6}{\pi ^2 n^3} = 6 \frac{\zeta (3)}{\pi ^2}. \end{aligned}$$

\(\square \)
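This probability is easy to check empirically. The following sketch (the sampling bound, trial count and seed are arbitrary choices, not taken from the text) estimates it by Monte Carlo sampling; the result should be close to the predicted \(26.92\%\).

```python
import random
from math import gcd

def predicted_factor_rate(trials=200_000, bound=10**6, seed=1):
    """Estimate P(gcd(a,b)/gcd(p,a,b) != 1) for random integers a, b, p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a, b, p = (rng.randint(1, bound) for _ in range(3))
        g = gcd(a, b)
        # gcd(p, a, b) = gcd(p, gcd(a, b)), so the predicted factor is nontrivial iff:
        if g // gcd(p, g) != 1:
            hits += 1
    return hits / trials
```

With the fixed seed the estimate is reproducible and agrees with \(1-6\zeta (3)/\pi ^2\) to about two decimal places.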

There is another way in which common factors in integer matrices can arise. Let d be any positive integer. Then for random a, b the probability that \(d \mid a+b\) is 1/d. That means that if \(v,w \in \mathbb {Z}^{1\times n}\) are vectors, then \(d \mid v + w\) with probability \(1/d^n\). This effect is noticeable in particular for small numbers like \(d = 2,3\) and in the last iterations of the \(L D^{-1} U\) decomposition, when the number of non-zero entries in the rows has shrunk. For instance, in the second-to-last iteration we only have three rows with at most three non-zero entries each. Moreover, we know that the first non-zero entries of the rows cancel during cross-multiplication. Thus, a factor of 2 appears with a probability of \(25\%\) in one of those rows, and a factor of 3 with a probability of \(11.11\%\). In the example above, the probability for the factor 5 to appear in the fourth row was \(4\%\).

5 Expected Number of Factors

In this section, we provide a detailed analysis of the expected number of common “statistical” factors in the rows of U, in the case when the input matrix A has integer entries, that is, \(\mathbb {D}=\mathbb {Z}\). We base our considerations on a “uniform” distribution on \(\mathbb {Z}\), e.g., by imposing a uniform distribution on \(\{-n,\dots ,n\}\) for very large n. However, the only relevant property that we use is the assumption that the probability that a randomly chosen integer is divisible by p is 1/p.

We consider a matrix \(A=(A_{i,j})_{1\le i,j\le n}\in \mathbb {Z}^{n\times n}\) of full rank. The assumption that A be square is made for the sake of simplicity; the results shown below immediately generalize to rectangular matrices. As before, let U be the upper triangular matrix from the \(LD^{-1}U\) decomposition of A:

$$\begin{aligned} U = \begin{pmatrix} U_{1,1} &{}\quad U_{1,2} &{}\quad \dots &{}\quad U_{1,n} \\ 0 &{}\quad U_{2,2} &{}\quad \dots &{}\quad U_{2,n} \\ \vdots &{}\quad &{}\quad \ddots &{}\quad \vdots \\ 0 &{}\quad \dots &{}\quad &{}\quad U_{n,n} \end{pmatrix}. \end{aligned}$$


For \(k = 1,\ldots ,n\), define

$$\begin{aligned} g_k := \gcd (U_{k,k},U_{k,k+1},\dots ,U_{k,n}) \end{aligned}$$

to be the greatest common divisor of all entries in the kth row of U. Counting (with multiplicities) all the prime factors of \(g_1,\dots ,g_{n-1}\), one gets the plot that is shown in Fig. 1; \(g_n\) is omitted as it contains only the single nonzero entry \(U_{n,n}=\det (A)\). Our goal is to give a probabilistic explanation for the occurrence of these common factors, whose number seems to grow linearly with the dimension of the matrix.
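The quantities \(g_k\) are easy to compute experimentally with fraction-free (Bareiss) elimination; a minimal sketch over \(\mathbb {Z}\), assuming that no pivoting is needed (which holds with high probability for random matrices; the function names are ours):

```python
from math import gcd

def bareiss_upper(A):
    """Fraction-free (Bareiss) elimination of an integer matrix; returns the
    upper triangular U of the LD^{-1}U decomposition.  All divisions below
    are exact; this sketch does not handle zero pivots."""
    n = len(A)
    M = [row[:] for row in A]
    prev = 1  # pivot of the previous step
    for k in range(n - 1):
        assert M[k][k] != 0, "pivoting not handled in this sketch"
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                M[i][j] = (M[k][k] * M[i][j] - M[i][k] * M[k][j]) // prev
            M[i][k] = 0
        prev = M[k][k]
    return M

def row_gcds(U):
    """g_1, ..., g_{n-1}: gcd of the entries in each row of U (g_n omitted)."""
    return [gcd(*(abs(x) for x in U[k][k:])) for k in range(len(U) - 1)]
```

Counting the prime factors of the returned gcds over many random matrices reproduces the kind of data plotted in Fig. 1; the last pivot \(U_{n,n}\) equals \(\det A\).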

As we have seen in the proof of Theorem 8, the entries \(U_{k,\ell }\) can be expressed as minors of the original matrix A:

$$\begin{aligned} U_{k,\ell } = \det \begin{pmatrix} A_{1,1} &{}\quad A_{1,2} &{}\quad \dots &{}\quad A_{1,k-1} &{}\quad A_{1,\ell } \\ A_{2,1} &{}\quad A_{2,2} &{}\quad \dots &{}\quad A_{2,k-1} &{}\quad A_{2,\ell } \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots &{}\quad \vdots \\ A_{k,1} &{}\quad A_{k,2} &{}\quad \dots &{}\quad A_{k,k-1} &{}\quad A_{k,\ell } \end{pmatrix}. \end{aligned}$$

Observe that the entries \(U_{k,\ell }\) in the kth row of U are all given as determinants of the same matrix, where only the last column varies. For any integer \(q\ge 2\) we have that \(q\mid g_k\) if q divides all these determinants. A sufficient condition for the latter to happen is that the determinant

$$\begin{aligned} h_k := \det \begin{pmatrix} A_{1,1} &{}\quad \dots &{}\quad A_{1,k-1} &{}\quad 1 \\ A_{2,1} &{}\quad \dots &{}\quad A_{2,k-1} &{}\quad x \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots \\ A_{k,1} &{}\quad \dots &{}\quad A_{k,k-1} &{}\quad x^{k-1} \end{pmatrix} \end{aligned}$$

is divisible by q as a polynomial in \(\mathbb {Z}[x]\), i.e., if q divides the content of the polynomial \(h_k\). We now aim at computing how likely it is that \(q\mid h_k\) when q is fixed and when the matrix entries \(A_{1,1},\dots ,A_{k,k-1}\) are chosen randomly. Since q is now fixed, we can equivalently study this problem over the finite ring \(\mathbb {Z}_q\), which means that the matrix entries are picked randomly and uniformly from the finite set \(\{0,\dots ,q-1\}\). Moreover, it turns out that it suffices to answer this question for prime powers \(q=p^j\).

The probability that all \(k\times k\)-minors of a randomly chosen \(k\times (k+1)\)-matrix are divisible by \(p^j\), where p is a prime number and \(j\ge 1\) is an integer, is given by

$$\begin{aligned} P_{p,j,k} := 1-\Bigl (1+p^{1-j-k}\,\frac{p^k-1}{p-1}\Bigr )\prod _{i=0}^{k-1}\bigl (1-p^{-j-i}\bigr ), \end{aligned}$$

which is a special case of Brent and McKay [3, Thm. 2.1]. Note that this is exactly the probability that \(h_{k+1}\) is divisible by \(p^j\). Recalling the definition of the q-Pochhammer symbol

$$\begin{aligned} (a;q)_k := \prod _{i=0}^{k-1} (1-aq^i),\quad (a;q)_0 := 1, \end{aligned}$$

the above formula can be written more succinctly as

$$\begin{aligned} P_{p,j,k} := 1-\Bigl (1+p^{1-j-k}\,\frac{p^k-1}{p-1}\Bigr )\Bigl (\frac{1}{p^j};\frac{1}{p}\Bigr )_k. \end{aligned}$$
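For small parameters, this probability can be cross-checked by exhaustive enumeration over \(\mathbb {Z}_{p^j}\); a sketch using exact rational arithmetic (function names are ours):

```python
from fractions import Fraction
from itertools import product

def P(p, j, k):
    """P_{p,j,k} from Brent and McKay, as an exact rational number."""
    poch = Fraction(1)
    for i in range(k):
        poch *= 1 - Fraction(1, p**(j + i))
    return 1 - (1 + Fraction(p**k - 1, p**(j + k - 1) * (p - 1))) * poch

def det(M):
    """Determinant by Laplace expansion (fine for the tiny matrices here)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1)**c * M[0][c] * det([r[:c] + r[c+1:] for r in M[1:]])
               for c in range(len(M)))

def brute_force(p, j, k):
    """Exact probability that all k x k minors of a random k x (k+1) matrix
    over Z_{p^j} vanish modulo p^j, by enumerating all such matrices."""
    q, cols, hits = p**j, k + 1, 0
    for entries in product(range(q), repeat=k * cols):
        M = [list(entries[r*cols:(r+1)*cols]) for r in range(k)]
        if all(det([r[:c] + r[c+1:] for r in M]) % q == 0 for c in range(cols)):
            hits += 1
    return Fraction(hits, q**(k * cols))
```

For instance, \(P_{2,1,2} = 11/32\): of the 64 binary \(2\times 3\) matrices, exactly the 22 of rank at most one over \(\mathbb {F}_2\) have all \(2\times 2\) minors even.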

Now, an interesting observation is that this probability does not, as one might expect, tend to zero as k goes to infinity. Instead, it approaches a nonzero constant that depends on p and j (see Table 1):

$$\begin{aligned} P_{p,j,\infty } := \lim _{k\rightarrow \infty } P_{p,j,k} = 1-\Bigl (1+\frac{p^{1-j}}{p-1}\Bigr )\Bigl (\frac{1}{p^j};\frac{1}{p}\Bigr )_\infty \end{aligned}$$
Table 1 Behavior of the sequence \(\bigl (P_{p,j,k}\bigr ){}_{k\in \mathbb {N}}\) for some small values of \(p^j\)

Using the probability \(P_{p,j,k}\), one can write down the expected number of factors in the determinant \(h_{k+1}\), i.e., the number of prime factors in the content of the polynomial \(h_{k+1}\), counted with multiplicities:

$$\begin{aligned} \sum _{p\in \mathbb {P} }\sum _{j=1}^\infty P_{p,j,k}, \end{aligned}$$

where \(\mathbb {P} =\{2,3,5,\dots \}\) denotes the set of prime numbers. The inner sum can be simplified as follows, yielding the expected multiplicity \(M_{p,k}\) of a prime factor p in \(h_{k+1}\):

$$\begin{aligned} M_{p,k} := \sum _{j=1}^\infty P_{p,j,k} &= \sum _{j=1}^\infty \biggl (1-\Bigl (1+p^{1-j-k}\,\frac{p^k-1}{p-1}\Bigr )\Bigl (\frac{1}{p^j};\frac{1}{p}\Bigr )_{\!k}\biggr ) \\ &= -\sum _{j=1}^\infty \biggl (\Bigl (\frac{1}{p^j};\frac{1}{p}\Bigr )_{\!k}-1\biggr )-p^{1-k}\frac{p^k-1}{p-1}\sum _{j=1}^\infty \frac{1}{p^j}\Bigl (\frac{1}{p^j};\frac{1}{p}\Bigr )_{\!k} \\ &= -\sum _{j=1}^\infty \sum _{i=1}^k (-1)^i p^{-ij-i(i-1)/2} \left[ \begin{array}{l} {k} \\ {i} \end{array} \right] _{1/p}\! -p^{1-k}\,\frac{p^k-1}{p-1}\, \frac{p^k}{p^{k+1}-1} \\ &= \sum _{i=1}^k \frac{(-1)^{i-1}}{p^{i(i-1)/2}(p^i-1)} \left[ \begin{array}{l} {k} \\ {i} \end{array} \right] _{1/p} \! + \frac{1}{p^{k+1}-1} - \frac{1}{p-1}. \end{aligned}$$

In this derivation we have used the expansion formula of the q-Pochhammer symbol in terms of the q-binomial coefficient

$$\begin{aligned} \left[ \begin{array}{l} {n}\\ {k} \end{array} \right] _{q} := \frac{\bigl (1-q^n\bigr )\bigl (1-q^{n-1}\bigr )\cdots \bigl (1-q^{n-k+1}\bigr )}{\bigl (1-q^k\bigr )\bigl (1-q^{k-1}\bigr )\cdots \bigl (1-q\bigr )}, \end{aligned}$$

evaluated at \(q=1/p\). Moreover, the identity that is used in the third step,

$$\begin{aligned} \sum _{j=1}^\infty \frac{1}{p^j}\Bigl (\frac{1}{p^j};\frac{1}{p}\Bigr )_{\!k} = \frac{p^k}{p^{k+1}-1}, \end{aligned}$$

is certified by rewriting the summand as

$$\begin{aligned} \frac{1}{p^j}\Bigl (\frac{1}{p^j};\frac{1}{p}\Bigr )_{\!k} = t_{j+1} - t_j \quad \text {with}\quad t_j = \frac{p^k(p^{1-j}-1)}{p^{k+1}-1}\Bigl (\frac{1}{p^j};\frac{1}{p}\Bigr )_{\!k} \end{aligned}$$

and by applying a telescoping argument.
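The derivation above can be sanity-checked numerically by computing \(M_{p,k}\) in both ways: summing \(P_{p,j,k}\) directly over j (truncated, since the terms decay like \(p^{-2j}\)) and evaluating the closed form. A sketch with exact rational arithmetic (function names are ours):

```python
from fractions import Fraction

def P(p, j, k):
    """P_{p,j,k} as an exact rational number."""
    poch = Fraction(1)
    for i in range(k):
        poch *= 1 - Fraction(1, p**(j + i))
    return 1 - (1 + Fraction(p**k - 1, p**(j + k - 1) * (p - 1))) * poch

def qbinom(n, k, q):
    """Gaussian binomial coefficient [n; k]_q."""
    num = den = Fraction(1)
    for i in range(k):
        num *= 1 - q**(n - i)
        den *= 1 - q**(i + 1)
    return num / den

def M_direct(p, k, jmax=50):
    """Truncated j-sum of P_{p,j,k}; the tail beyond jmax is negligible."""
    return sum(P(p, j, k) for j in range(1, jmax + 1))

def M_closed(p, k):
    """Closed form for M_{p,k} derived in the text."""
    q = Fraction(1, p)
    s = sum(Fraction((-1)**(i-1), p**(i*(i-1)//2) * (p**i - 1)) * qbinom(k, i, q)
            for i in range(1, k + 1))
    return s + Fraction(1, p**(k+1) - 1) - Fraction(1, p - 1)
```

For example, \(M_{2,1} = 1/3\), since \(P_{2,j,1} = 2^{-2j}\) sums to \(1/(2^2-1)\), matching the closed form.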

Hence, when we let k go to infinity, we obtain

$$\begin{aligned} M_{p,\infty } = \lim _{k\rightarrow \infty } \sum _{j=1}^\infty P_{p,j,k} = \sum _{i=1}^\infty \frac{(-1)^{i-1}}{p^{i(i-1)/2}(p^i-1)} \frac{\bigl (p^{-i-1};p^{-1}\bigr )_\infty }{\bigl (p^{-1};p^{-1}\bigr )_\infty } - \frac{1}{p-1}. \end{aligned}$$

Note that the sum converges quickly, so that one can use the above formula to compute an approximation for the expected number of factors in \(h_{k+1}\) when k tends to infinity:

$$\begin{aligned} \sum _{p\in \mathbb {P} } M_{p,\infty } \approx 0.89764, \end{aligned}$$

which gives the asymptotic slope of the function plotted in Fig. 1.
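This numerical value can be reproduced with straightforward floating-point truncations; in the sketch below, the prime limit \(10^4\) and the truncation thresholds are our own arbitrary choices, made so that the truncation errors stay well below the quoted precision.

```python
def poch_inf(a, q, eps=1e-18):
    """(a; q)_infinity for 0 < a, q < 1, truncated once the factors are ~1."""
    prod = 1.0
    while a > eps:
        prod *= 1.0 - a
        a *= q
    return prod

def M_inf(p):
    """M_{p,infinity} = sum over j of P_{p,j,infinity}, truncated when the
    terms (which decay roughly like p^(-2j)) become negligible."""
    total, j = 0.0, 1
    while True:
        term = 1 - (1 + p**(1 - j) / (p - 1)) * poch_inf(p**(-j), 1.0 / p)
        total += term
        if abs(term) < 1e-15:
            return total
        j += 1

def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = bytearray(len(sieve[i*i::i]))
    return [i for i in range(n + 1) if sieve[i]]

# Asymptotic slope: the tail over primes beyond 10^4 is of order 1e-5.
slope = sum(M_inf(p) for p in primes_up_to(10_000))
```

The dominant contributions come from the smallest primes (\(M_{2,\infty }\approx 0.607\), \(M_{3,\infty }\approx 0.182\)), and the total agrees with 0.89764.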

As discussed before, the divisibility of \(h_k\) by some number \(q\ge 2\) implies that the greatest common divisor \(g_k\) of the kth row is divisible by q, but this is not a necessary condition. It may happen that \(h_k\) is not divisible by q, but nevertheless q divides each \(U_{k,\ell }\) for \(k\le \ell \le n\). The probability for this to happen is the same as the probability that the greatest common divisor of \(n-k+1\) randomly chosen integers is divisible by q. The latter obviously is \(q^{-(n-k+1)}\). Thus, in addition to the factors coming from \(h_k\), one can expect

$$\begin{aligned} \sum _{p\in \mathbb {P} }\sum _{j=1}^\infty \frac{1}{p^{j(n-k+1)}} = \sum _{p\in \mathbb {P}}\frac{1}{p^{n-k+1}-1} \end{aligned}$$

many prime factors in \(g_k\).
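The underlying claim, that q divides the greatest common divisor of m random integers with probability \(q^{-m}\), is easy to confirm numerically; a Monte Carlo sketch (bound, trial count and seed are arbitrary choices):

```python
import random
from math import gcd

def gcd_divisible_rate(q, m, trials=200_000, bound=10**6, seed=7):
    """Monte Carlo estimate of P(q | gcd(a_1, ..., a_m)) for random integers."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if gcd(*(rng.randint(1, bound) for _ in range(m))) % q == 0)
    return hits / trials
```

For \(q=2, m=2\) the estimate is close to \(1/4\), matching the \(25\%\) quoted above for a factor of 2 in the second-to-last iteration.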

Fig. 1 Number of factors depending on the size n of the matrix. The curve shows the function F(n), while the dots represent experimental data: for each dimension n, 1000 matrices were generated with random integer entries between 0 and \(10^9\)

Summarizing, the expected number of prime factors in the rows of the matrix U is

$$\begin{aligned} F(n) &= \sum _{k=2}^{n-1} \sum _{p\in \mathbb {P} } M_{p,k-1} + \sum _{k=1}^{n-1} \sum _{p\in \mathbb {P} } \frac{1}{p^{n-k+1}-1} \\ &= \sum _{p\in \mathbb {P} } \biggl (\sum _{k=0}^{n-2}M_{p,k} + \sum _{k=0}^{n-2} \frac{1}{p^{k+2}-1} \biggr ) \\ &= \sum _{p\in \mathbb {P} } \sum _{k=0}^{n-2} \biggl (\sum _{i=1}^k \frac{(-1)^{i-1}}{p^{i(i-1)/2}(p^i-1)} \left[ \begin{array}{l} {k} \\ {i} \end{array} \right] _{1/p}\!+ \frac{1}{p^{k+2}-1} + \frac{1}{p^{k+1}-1} - \frac{1}{p-1}\biggr ). \end{aligned}$$

From the discussion above, it follows that for large n this expected number can be approximated by a linear function as follows:

$$\begin{aligned} F(n) \approx 0.89764\,n - 1.53206. \end{aligned}$$

6 QR Decomposition

The QR decomposition of a matrix A is defined by \(A=QR\), where Q is an orthonormal matrix and R is an upper triangular matrix. In its standard form, this decomposition requires algebraic extensions to the domain of A, but a fraction-free form is possible. The modified form given in [26] is \(QD^{-1}R\), and is proved below in Theorem 15. In [10], an exact-division algorithm for a fraction-free Gram–Schmidt orthogonal basis for the columns of a matrix A was given, but a complete fraction-free decomposition was not considered. We now show that the algorithms in [10] and in [26] both lead to a systematic common factor in their results. We begin by considering a fraction-free form of the Cholesky decomposition of a symmetric matrix. See [23, Eqn (3.70)] for a description of the standard form, which requires algebraic extensions to allow for square roots; these are avoided here.

This section assumes that \(\mathbb {D}\) has characteristic 0; this assumption is needed in order to ensure that \(A^t A\) has full rank.

Lemma 13

Let \(A \in \mathbb {D}^{n\times n}\) be a symmetric matrix such that its \(L D^{-1} U\) decomposition can be computed without permutations; then we have \(U = L^t\), that is,

$$\begin{aligned} A = L D^{-1} L^t. \end{aligned}$$


Compute the decomposition \(A = L D^{-1} U\) as in Theorem 1. If we do not execute item 4 of Algorithm 4, we obtain the decomposition

$$\begin{aligned} A = \tilde{L} \tilde{D}^{-1} \tilde{U} = \begin{pmatrix} \mathcal {L} &{}\quad {\mathbf {0}} \\ \mathcal {M} &{}\quad {\mathbf {1}} \end{pmatrix} \begin{pmatrix} D &{}\quad {\mathbf {0}} \\ {\mathbf {0}} &{}\quad {\mathbf {1}} \end{pmatrix}^{-1} \begin{pmatrix} \mathcal {U} &{}\quad \mathcal {V} \\ {\mathbf {0}} &{}\quad {\mathbf {0}} \end{pmatrix}. \end{aligned}$$

Then because A is symmetric, we obtain

$$\begin{aligned} \tilde{L} \tilde{D}^{-1} \tilde{U} = A = A^t = \tilde{U}^t \tilde{D}^{-1} \tilde{L}^t \end{aligned}$$

The matrices \(\tilde{L}\) and \(\tilde{D}\) have full rank, which implies

$$\begin{aligned} \tilde{U} (\tilde{L}^t)^{-1} \tilde{D} = \tilde{D} \tilde{L}^{-1} \tilde{U}^t. \end{aligned}$$

Examination of the matrices on the left hand side reveals that they are all upper triangular. Therefore also their product is an upper triangular matrix. Similarly, the right hand side is a lower triangular matrix and the equality of the two implies that they must both be diagonal. Cancelling \(\tilde{D}\) and rearranging the equation yields \(\tilde{U} = (\tilde{L}^{-1} \tilde{U}^t) \tilde{L}^t\) where \(\tilde{L}^{-1} \tilde{U}^t\) is diagonal. This shows that the rows of \(\tilde{U}\) are just multiples of the rows of \(\tilde{L}^t\). However, we know that the first r diagonal entries of \(\tilde{U}\) and \(\tilde{L}\) are the same, where r is the rank of \(\tilde{U}\). This yields

$$\begin{aligned} \tilde{L}^{-1} \tilde{U}^t = \begin{pmatrix} {\mathbf {1}}_{r} &{}\quad {\mathbf {0}} \\ {\mathbf {0}} &{}\quad {\mathbf {0}} \end{pmatrix}, \end{aligned}$$

and hence, when we remove the unnecessary last \(n-r\) rows of \(\tilde{U}\) and the last \(n-r\) columns of \(\tilde{L}\) (as suggested in Jeffrey [15]), we remain with \(U = L^t\). \(\square \)

As another preliminary to the main theorem, we need to delve briefly into matrices over ordered rings. Following, for example, the definition in [6, Sect. 8.6], an ordered ring is a (commutative) ring \(\mathbb {D}\) with a strict total order > such that \(x > x'\) together with \(y > y'\) implies \(x + y > x' + y'\) and also \(x > 0\) together with \(y > 0\) implies \(x y > 0\) for all \(x, x', y, y' \in \mathbb {D}\). As Cohn [6, Prop. 8.6.1] shows, such a ring must always be a domain, and squares of non-zero elements are always positive. Thus, the inner product of two vectors \(a, b \in \mathbb {D}^{m}\) defined by \((a,b) \mapsto a^t \,b\) must be positive definite. This implies that given a matrix \(A \in \mathbb {D}^{m\times n}\) the Gram matrix \(A^t A\) is positive semi-definite. If we additionally require the columns of A to be linearly independent, then \(A^t A\) becomes positive definite.

Lemma 14

Let \(\mathbb {D}\) be an ordered domain and let \(A \in \mathbb {D}^{n\times n}\) be a symmetric and positive definite matrix. Then the \(L D^{-1} U\) decomposition of A can be computed without using permutations.


By Sylvester’s criterion (see Theorem 22 in the “Appendix”) a symmetric matrix is positive definite if and only if its leading principal minors are positive. However, by Remark 2 and Equation 2.1, these are precisely the pivots that are used during Bareiss’s algorithm. Hence, permutations are not necessary. \(\square \)

If we consider domains which are not ordered, then the \(L D^{-1} U\) decomposition of \(A^t A\) will usually require permutations: Consider, for example, the Gaussian integers \(\mathbb {D}= \mathbb {Z}[i]\) and the matrix

$$\begin{aligned} A = \begin{pmatrix} 1 &{}\quad i \\ i &{}\quad 0 \end{pmatrix}. \end{aligned}$$


Then

$$\begin{aligned} A^t A = \begin{pmatrix} 0 &{}\quad i \\ i &{}\quad -1 \end{pmatrix}; \end{aligned}$$

and Bareiss’s algorithm must begin with a row or column permutation.Footnote 5

We are now ready to discuss the fraction-free QR decomposition. The theorem below makes two major changes to Zhou and Jeffrey [26, Thm. 8]: first, we add that \(\Theta ^t \Theta \) is not just any diagonal matrix but actually equal to D. Second, the original theorem did not require the domain \(\mathbb {D}\) to be ordered, an assumption without which the proof does not go through.

Theorem 15

Let \(A \in \mathbb {D}^{m\times n}\) with \(n\le m\) and with full column rank where \(\mathbb {D}\) is an ordered domain. Then the partitioned matrix \((A^t A \mid A^t)\) has \(LD^{-1}U\) decomposition

$$\begin{aligned} (A^t A \mid A^t) = R^t D^{-1} (R \mid \Theta ^t), \end{aligned}$$

where \(\Theta ^t \Theta = D\) and \(A = \Theta D^{-1} R\).


By Lemma 14, we can compute an \(L D^{-1} U\) decomposition of \(A^t A\) without using permutations; and by Lemma 13, the decomposition must have the shape

$$\begin{aligned} A^t A = R^t D^{-1} R. \end{aligned}$$

Applying the same row transformations to \(A^t\) yields a matrix \(\Theta ^t\), that is, we obtain \((A^t A \mid A^t) = R^t D^{-1} (R \mid \Theta ^t).\) As in the proof of Zhou and Jeffrey [26, Thm. 8], we easily compute that \(A = \Theta D^{-1} R\) and that \(\Theta ^t \Theta = D^t (R^{-1})^t A^t A R^{-1} D = D^t (R^{-1})^t R^t D^{-1} R R^{-1} D = D.\) \(\square \)
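Over \(\mathbb {Z}\), Theorem 15 translates directly into an algorithm: apply fraction-free elimination to the augmented matrix \((A^t A \mid A^t)\) and read off R and \(\Theta ^t\). The following sketch uses exact integer arithmetic (Fractions only when verifying \(A = \Theta D^{-1} R\)); the function names are ours, and pivot-free elimination is assumed, as Lemma 14 guarantees for \(A^t A\) over an ordered domain.

```python
from fractions import Fraction

def bareiss_rows(M, n):
    """Fraction-free elimination of an n x 2n matrix M, pivoting on the
    left n x n block; all divisions are exact integer divisions."""
    M = [row[:] for row in M]
    prev = 1
    for k in range(n - 1):
        assert M[k][k] != 0, "pivoting not handled in this sketch"
        for i in range(k + 1, n):
            for j in range(k + 1, len(M[0])):
                M[i][j] = (M[k][k] * M[i][j] - M[i][k] * M[k][j]) // prev
            M[i][k] = 0
        prev = M[k][k]
    return M

def ffqr(A):
    """Fraction-free QR in the shape of Theorem 15: returns (Theta, D, R)
    with A = Theta D^{-1} R and Theta^t Theta = D (D stored as a list of
    diagonal entries, D_{k,k} = R_{k-1,k-1} R_{k,k})."""
    n = len(A)
    At = [list(row) for row in zip(*A)]
    AtA = [[sum(At[i][l] * A[l][j] for l in range(n)) for j in range(n)]
           for i in range(n)]
    U = bareiss_rows([AtA[i] + At[i] for i in range(n)], n)
    R = [row[:n] for row in U]
    Theta = [list(row) for row in zip(*(row[n:] for row in U))]  # (Theta^t)^t
    D = [R[0][0]] + [R[k-1][k-1] * R[k][k] for k in range(1, n)]
    return Theta, D, R
```

For the integer matrix with rows (1, 2) and (3, 4), this yields \(\Theta \) with rows (1, 6) and (3, −2), \(D = {{\,\mathrm{diag}\,}}(10, 40)\), and one checks \(\Theta ^t \Theta = D\) and \(A = \Theta D^{-1} R\).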

For example, let \(A\in \mathbb {Z}[x]^{3\times 3}\) be the matrix

$$\begin{aligned} A=\begin{pmatrix} x &{}\quad 1 &{}\quad 2 \\ 2 &{}\quad 0 &{}\quad -x \\ x &{}\quad 1 &{}\quad x + 1 \end{pmatrix}. \end{aligned}$$

Then the \(LD^{-1}U\) decomposition of \(A^tA=R^tD^{-1}R\) is given by

$$\begin{aligned} R= & {} \begin{pmatrix} 2 (x^2+2) &{}\quad 2 x &{}\quad x (x+1) \\ 0 &{}\quad 8 &{}\quad 4 (x^2+x+3) \\ 0 &{}\quad 0 &{}\quad 4 (x-1)^2 \end{pmatrix},\\ D= & {} \begin{pmatrix} 2 (x^2+2) &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 16 (x^2+2) &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 32 (x-1)^2 \end{pmatrix}, \end{aligned}$$

and we obtain for the QR decomposition \(A = \Theta D^{-1} R\):

$$\begin{aligned} \Theta = \begin{pmatrix} x &{}\quad 4 &{}\quad -4 (x-1) \\ 2 &{}\quad -4 x &{}\quad 0 \\ x &{}\quad 4 &{}\quad 4 (x-1) \end{pmatrix}. \end{aligned}$$

We see that the \(\Theta D^{-1} R\) decomposition has a common factor in the last column of \(\Theta \). This observation is explained by the following theorem.

Theorem 16

With full-rank \(A \in \mathbb {D}^{n\times n}\) and \(\Theta \) as in Theorem 15, we have for all \(i=1,\ldots ,n\) that

$$\begin{aligned} \Theta _{in} = (-1)^{n+i} \det \limits _{i,n} A \cdot \det A \end{aligned}$$

where \(\det _{i,n}A\) is the (i, n) minor of A.


We use the notation from the proof of Theorem 15. From \(\Theta D^{-1} R = A\) and \(\Theta ^t \Theta =D\) we obtain

$$\begin{aligned} \Theta ^t A = \Theta ^t \Theta D^{-1} R = R. \end{aligned}$$

Thus, since A has full rank, \(\Theta ^t = R A^{-1}\) or, equivalently,

$$\begin{aligned} \Theta = (R A^{-1})^t = (A^{-1})^t R^t = (\det A)^{-1} ({\text {adj}} A)^t R^t \end{aligned}$$

where \({\text {adj}} A\) is the adjoint matrix of A. Since \(R^t\) is a lower triangular matrix with \(\det A^t A = (\det A)^2\) at position (n, n), the claim follows. \(\square \)

For the other columns of \(\Theta \) we can state the following.

Theorem 17

The kth determinantal divisor \(d_k^*\) of A divides the kth column of \(\Theta \) and the kth row of R. Moreover, \(d_{k-1}^* d_k^*\) divides \(D_{k,k}\) for \(k \ge 2\).


We first show that the kth determinantal divisor \(\delta _k^*\) of \((A^t A \mid A^t)\) is the same as \(d_k^*\). Obviously, \(\delta _k^* \mid d_k^*\) since all minors of A are also minors of the right block \(A^t\) of \((A^t A \mid A^t)\). Consider now the left block \(A^t A\). We have by the Cauchy–Binet theorem [4, § 4.6]

$$\begin{aligned} \det \limits _{I,J} (A^t A) = \sum _{\genfrac{}{}{0.0pt}{}{K \subseteq \{1,\ldots ,n\}}{|K| = q}} (\det \limits _{K,I} A) (\det \limits _{K,J} A) \end{aligned}$$

where \(I, J \subseteq \{1,\ldots ,n\}\) with \(|I| = |J| = q \ge 1\) are two index sets and \(\det _{I,J} M\) denotes the minor for these index sets of a matrix M. Thus, taking \(q = k\), we see that \((d_k^*)^2\) divides any \(k\times k\) minor of \(A^t A\), since it divides every summand on the right hand side; in particular, \(d_k^* \mid \delta _k^*\).

Now, we use Theorems 15 and 8 to conclude that \(d_k^*\) divides the kth row of \((R \mid \Theta ^t)\) and hence the kth row of R and the kth column of \(\Theta \). Moreover, \(D_{k,k} = R_{k-1,k-1} R_{k,k}\) for \(k \ge 2\) by Theorem 1 which implies \(d_{k-1}^* d_k^* \mid D_{k,k}\). \(\square \)

Knowing that there is always a common factor, we can cancel it, which leads to a fraction-free QR decomposition of smaller size.

Theorem 18

For a square matrix A, a reduced fraction-free QR decomposition is \(A=\hat{\Theta }\hat{D}^{-1}\hat{R}\), where \(S={\text {diag}}(1,\ldots ,1,\det A)\), \(\hat{\Theta }= \Theta S^{-1}\), and \(\hat{R}=S^{-1}R\). In addition, \(\hat{D}=S^{-1}DS^{-1}=\hat{\Theta }^t \hat{\Theta }\).


By Theorem 16, every entry in the last column of \(\Theta \) is divisible by \(\det A\), so the division \(\Theta S^{-1}\) is exact. The statement of the theorem then follows from \(A=\Theta S^{-1} S D^{-1} S S^{-1} R\). \(\square \)

If we apply Theorem 18 to our previous example, we obtain the simpler QR decomposition below, from which the factor \(\det A=-2(x-1)\) has been removed:

$$\begin{aligned} \begin{pmatrix} x &{}\quad 4 &{}\quad 2 \\ 2 &{}\quad -4 x &{}\quad 0 \\ x &{}\quad 4 &{}\quad -2 \end{pmatrix}\; \begin{pmatrix} 2 (x^2+2) &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 16 (x^2+2) &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 8 \end{pmatrix}^{\!-1} \begin{pmatrix} 2 (x^2+2) &{}\quad 2 x &{}\quad x (x+1) \\ 0 &{}\quad 8 &{}\quad 4 (x^2+x+3) \\ 0 &{}\quad 0 &{}\quad -2 (x-1) \end{pmatrix}. \end{aligned}$$

The properties of the QR-decomposition are strong enough to guarantee a certain uniqueness of the output.

Theorem 19

Let \(A \in \mathbb {D}^{n\times n}\) have full rank. Let \(A = \Theta D^{-1} R\) be the decomposition from Theorem 15; and let \(A = \tilde{\Theta } \tilde{D}^{-1} \tilde{R}\) be another decomposition where \(\tilde{\Theta }, \tilde{D}, \tilde{R} \in \mathbb {D}^{n\times n}\) are such that \(\tilde{D}\) is a diagonal matrix, \(\tilde{R}\) is an upper triangular matrix and \(\Delta = \tilde{\Theta }^t \tilde{\Theta }\) is a diagonal matrix. Then \(\Theta ^t \tilde{\Theta }\) is also a diagonal matrix and \(\tilde{R} = (\Theta ^t \tilde{\Theta })^{-1} \tilde{D} R\).


We have

$$\begin{aligned} \tilde{\Theta } \tilde{D}^{-1} \tilde{R} = \Theta D^{-1} R \qquad \text {and thus}\qquad \Theta ^t \tilde{\Theta } \tilde{D}^{-1} \tilde{R} = \Theta ^t \Theta D^{-1} R = R. \end{aligned}$$

Since R and \(\tilde{R}\) have full rank, this is equivalent to

$$\begin{aligned} \Theta ^t \tilde{\Theta } = R \tilde{R}^{-1} \tilde{D}. \end{aligned}$$

Note that all the matrices on the right hand side are upper triangular. Similarly, we can compute that

$$\begin{aligned} \tilde{\Theta }^t \Theta D^{-1} R = \tilde{\Theta }^t \tilde{\Theta } \tilde{D}^{-1} \tilde{R} = \Delta \tilde{D}^{-1} \tilde{R} \end{aligned}$$

which implies \(\tilde{\Theta }^t \Theta = \Delta \tilde{D}^{-1} \tilde{R} R^{-1} D.\) Hence, also \(\tilde{\Theta }^t \Theta = (\Theta ^t \tilde{\Theta })^t\) is upper triangular and consequently \(\tilde{\Theta }^t \Theta = T\) for some diagonal matrix T with entries from \(\mathbb {D}\). We obtain \(R = T \tilde{D}^{-1} \tilde{R}\) and thus \(\tilde{R} = T^{-1} \tilde{D} R\). \(\square \)