Common Factors in Fraction-Free Matrix Decompositions

We consider LU and QR matrix decompositions using exact computations. We show that fraction-free Gauss--Bareiss reduction leads to triangular matrices having a non-trivial number of common row factors. We identify two types of common factors: systematic and statistical. Systematic factors depend on the reduction process, independent of the data, while statistical factors depend on the specific data. We relate the existence of row factors in the LU decomposition to factors appearing in the Smith--Jacobson normal form of the matrix. For statistical factors, we identify some of the mechanisms that create them and give estimates of the frequency of their occurrence. Similar observations apply to the common factors in a fraction-free QR decomposition. Our conclusions are tested experimentally.


Introduction
Although known earlier, the fraction-free method for exact matrix computations became well known because of its application by Bareiss [1] to the solution of a system of integer equations AX = B. He implemented fraction-free Gaussian elimination of the augmented matrix [A B], and retained integer computations until a final division step. Since, in linear algebra, equation solving is related to the matrix factorizations LU and QR, it was natural that fraction-free methods would later be extended to those topics. The extensions required that the forms of the factorizations be modified from their floating-point counterparts in order to retain purely integer data. The first proposals were based on inflating the initial data until all divisions were guaranteed exact, see for example Lee and Saunders [12]; Nakos et al. [14]; Corless and Jeffrey [4]. This strategy, however, led to the entries in the L and U matrices becoming very large, and an alternative form was presented in Zhou and Jeffrey [18]. Fraction-free Gram-Schmidt orthogonalization and QR factorization were similarly studied by Erlingsson et al. [5]; Zhou and Jeffrey [18]. Further extensions have addressed fraction-free full-rank factoring of noninvertible matrices and fraction-free computation of the Moore-Penrose inverse [10]. More generally, applications exist in areas such as the Euclidean algorithm and the Berlekamp-Massey algorithm [11].
More general domains are possible, and here we consider matrices over a principal ideal domain D. For the purpose of giving illustrative examples and conducting computational experiments, matrices over Z and Q[x] are used, because these domains are well established and familiar to readers. We emphasize, however, that the methods here apply for all principal ideal domains, as opposed to methods that target specific domains, such as Giesbrecht and Storjohann [7]; Pauderis and Storjohann [16].
The shift from equation solving to matrix factorization has the effect of making visible the intermediate results that are deleted in the original Bareiss implementation. Because of this, it becomes possible to examine those intermediate results, and one observes that they frequently contain common factors; identifying and explaining these factors is the subject of this paper.

Throughout the paper, we employ the following notation. We assume, unless otherwise stated, that the ring D is an arbitrary principal ideal domain. We denote the set of all m-by-n matrices over D by D^{m×n}. We write 1_n for the n-by-n identity matrix and 0_{m×n} for the m-by-n zero matrix. We shall usually omit the subscripts if no confusion is possible. For A ∈ D^{m×n} and 1 ≤ i ≤ m, A_{i,*} is the i-th row of A. Similarly, A_{*,j} is the j-th column of A for 1 ≤ j ≤ n. If 1 ≤ i_1 < i_2 ≤ m and 1 ≤ j_1 < j_2 ≤ n, we use A_{i_1..i_2, j_1..j_2} to refer to the submatrix of A made up from the entries of the rows i_1 to i_2 and the columns j_1 to j_2. Given elements a_1, ..., a_n ∈ D, we write diag(a_1, ..., a_n) for the diagonal matrix that has a_j as the entry at position (j, j) for 1 ≤ j ≤ n. We will use the same notation for block diagonal matrices.
We denote the set of all column vectors of length m with entries in D by D^m and that of all row vectors of length n by D^{1×n}. If D is a unique factorization domain and v = (v_1, ..., v_n) ∈ D^{1×n}, then we set gcd(v) = gcd(v_1, ..., v_n); we use the same notation for column vectors. We will sometimes write column vectors w ∈ D^m with an underline and row vectors v ∈ D^{1×n} with an overline if we want to emphasize the specific type of vector.

Bareiss's Algorithm and the LD^{-1}U Decomposition
For the convenience of the reader, we start by recalling Bareiss's algorithm from Bareiss [1] as well as the LD^{-1}U decomposition from Jeffrey [10].
Let D be an integral domain, and let A ∈ D^{m×n} be a matrix and b ∈ D^m be a vector. We are interested in computing the exact solution(s) of the system Ax = b over the quotient field of D. Following standard methods from linear algebra, we compute a (variation of the) LU decomposition of A. Suppose that the first column of A has the entries u and v in its first two rows, where we assume that u and v are not zero. Choosing u as our pivot in the usual Gaussian elimination, we would subtract v/u times the first row from the second as seen in, for example, Geddes et al. [6, Section 9.2]. The division, however, forces us to work in the quotient field of D, which is computationally expensive. One way to avoid this is to use cross-multiplication (see Geddes et al. [6, Section 9.2], where it is called division-free elimination): in order to eliminate v from the second row, first multiply the second row by u and then subtract v times the first row. While this avoids dealing with fractions, when the process is repeated for the entire elimination the matrix entries grow exponentially in size with the number of rows, which increases both computation time and memory consumption. If D is a unique factorization domain, it is possible to keep the sizes of the entries as small as possible by dividing each row by its greatest common divisor after the cross-multiplication. The drawback is that computing these greatest common divisors in all steps is again computationally expensive. Bareiss [1] observed that after carrying out cross-multiplication for two iterations (that is, eliminating the entries of the first column except the first and eliminating the entries of the second column except the first two) every entry in the third row and below is divisible by u. More generally, if we denote the pivots during the elimination by p_1, ..., p_r (where r is the rank of A), then after k ≥ 2 iterations every entry below the k-th row is divisible by p_{k-1}. (In actual implementations one often additionally defines p_0 = 1 in order to prevent k = 1 from being a special case.) Bareiss's observation gives us the possibility of removing the factor p_{k-1} after the k-th iteration from the lower rows of the matrix by (relatively) cheap exact division, while at the same time avoiding the cost of computing the greatest common divisors. This makes Bareiss's algorithm a valuable choice for exact matrix computations (see, for example, Geddes et al. [6, Section 9.3], where it is called the single-step fraction-free elimination scheme). We will show below that we can extract even more factors at a moderate additional cost.
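To make the elimination scheme concrete, here is a minimal sketch of single-step fraction-free (Bareiss) elimination over Z; it is not taken verbatim from [1], it assumes that no pivoting is needed (all leading principal minors nonzero), and the function name and the example matrix are our own choices for illustration.

```python
def bareiss(A):
    """Single-step fraction-free (Bareiss) elimination of an integer matrix.

    Returns an upper triangular matrix; all divisions below are exact.
    Assumes that every pivot encountered on the diagonal is nonzero.
    """
    A = [row[:] for row in A]              # work on a copy
    m, n = len(A), len(A[0])
    prev_pivot = 1                         # the convention p_0 = 1
    for k in range(min(m, n)):
        pivot = A[k][k]
        assert pivot != 0, "this sketch assumes no pivoting is necessary"
        for i in range(k + 1, m):
            for j in range(k + 1, n):
                # cross-multiplication followed by exact division by p_{k-1}
                A[i][j] = (pivot * A[i][j] - A[i][k] * A[k][j]) // prev_pivot
            A[i][k] = 0
        prev_pivot = pivot
    return A

if __name__ == "__main__":
    for row in bareiss([[2, 3, 5], [7, 11, 13], [17, 19, 23]]):
        print(row)                         # last diagonal entry equals det(A)
```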
In Jeffrey [10] the idea of Bareiss's algorithm was used in order to obtain a fraction-free variant of the LU factorization. We repeat the main result from that paper here.
Note that in this section we do not require D to be a principal ideal domain, but it suffices to assume that D is an integral domain.
Theorem 1 (LD^{-1}U decomposition). A matrix A ∈ D^{m×n} with rank r admits a factorization A = P_r [L ; M] D^{-1} [U V] P_c, where the permutation matrix P_r is m × m; the permutation matrix P_c is n × n; L is r × r, lower triangular and has full rank, with the pivots p_i ≠ 0 of a Gaussian elimination on its diagonal; M is (m − r) × r and could be null; D is r × r and diagonal, namely D = diag(p_1, p_1 p_2, p_2 p_3, ..., p_{r−2} p_{r−1}, p_{r−1} p_r); U is r × r and upper triangular with the pivots on its diagonal, while V is r × (n − r) and could be null. Here [L ; M] denotes the m × r matrix obtained by stacking L on top of M, and [U V] the r × n matrix obtained by placing U and V side by side.

Inspecting the proof given in Jeffrey [10], it is possible to extract an algorithm for the computation of the LD^{-1}U decomposition. Note that the algorithm can be seen as a variant of Bareiss's algorithm [1] which will yield the same U. The difference is that Jeffrey [10] also explains how to obtain L and D in a fraction-free way.
Algorithm 2 (LD^{-1}U decomposition).
Input: A matrix A ∈ D^{m×n}.
Output: The LD^{-1}U decomposition of A as in Theorem 1.

1. Initialize U = A, L = 1_m, P_r = 1_m and P_c = 1_n.
2. For each k = 1, ..., min{m, n}:
   (a) Find a non-zero pivot p_k in U_{k..m, k..n} and bring it to position (k, k), recording the row and column swaps in P_r and P_c. Also apply the row swaps to L accordingly. If no pivot is found, then set r = k − 1 and exit the loop.
   (b) Set L_{k,k} = p_k and L_{i,k} = U_{i,k} for i = k + 1, ..., m.
   (c) Eliminate the entries in the k-th column below the k-th row of U by cross-multiplication and exact division; that is, for i = k + 1, ..., m and j = k + 1, ..., n set U_{i,j} = (p_k U_{i,j} − U_{i,k} U_{k,j}) / p_{k−1} (with p_0 = 1), and then set U_{i,k} = 0. Note that the divisions will be exact.
3. If r is not set yet, set r = min{m, n}.
4. If r < m, then trim the last m − r columns from L as well as the last m − r rows from U.
5. Set D = diag(p_1, p_1 p_2, ..., p_{r−1} p_r).
6. Return P_r, L, D, U, and P_c.
The algorithm does not specify the choice of pivot in step 2a. Conventional wisdom (see, for example, Geddes et al. [6]) is that in exact algorithms choosing the smallest possible pivot (measured in a way suitable for D) will lead to the smallest output sizes. We have been able to confirm this experimentally in Middeke and Jeffrey [13] for D = Z, where size was measured as the absolute value. In step 2c the divisions are guaranteed to be exact. Thus, an implementation can use more efficient procedures for this step if available (for example, for big integers mpz_divexact in the GMP library, which is based on Jebelean [9], instead of regular division).
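The following sketch turns the steps above into code for the simplest case D = Z, with a full-rank square matrix and no pivoting (so P_r = P_c = 1); the function names ldu and check are our own. It returns L, D and U and verifies A = L D^{-1} U with exact rational arithmetic.

```python
from fractions import Fraction

def ldu(A):
    """LD^{-1}U decomposition of a full-rank square integer matrix.

    Assumes that no row or column permutations are needed.
    """
    m = len(A)
    U = [row[:] for row in A]
    L = [[0] * m for _ in range(m)]
    pivots, prev = [], 1
    for k in range(m):
        p = U[k][k]
        assert p != 0, "this sketch assumes the leading minors are nonzero"
        pivots.append(p)
        L[k][k] = p                                   # step 2(b)
        for i in range(k + 1, m):
            L[i][k] = U[i][k]                         # step 2(b)
            for j in range(k + 1, m):                 # step 2(c): cross-multiply,
                U[i][j] = (p * U[i][j] - U[i][k] * U[k][j]) // prev   # exact division
            U[i][k] = 0
        prev = p
    D = [pivots[0]] + [pivots[k - 1] * pivots[k] for k in range(1, m)]   # step 5
    return L, D, U

def check(A):
    """Verify A = L D^{-1} U using exact rational arithmetic."""
    L, D, U = ldu(A)
    n = len(A)
    B = [[sum(Fraction(L[i][k], D[k]) * U[k][j] for k in range(n))
          for j in range(n)] for i in range(n)]
    return B == A

if __name__ == "__main__":
    print(check([[2, 3, 5], [7, 11, 13], [17, 19, 23]]))   # True
```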
One of the goals of the present paper is to discuss improvements to the decomposition explained above. Throughout this paper we shall use the term LD^{-1}U decomposition to mean exactly the decomposition from Theorem 1 as computed by Algorithm 2. For the variations of this decomposition we introduce the following term:

Definition 3 (Fraction-Free LU Decomposition). For a matrix A ∈ D^{m×n} of rank r we say that A = P_r L D^{-1} U P_c is a fraction-free LU decomposition if P_r ∈ D^{m×m} and P_c ∈ D^{n×n} are permutation matrices, L ∈ D^{m×r} has L_{ij} = 0 for j > i and L_{ii} ≠ 0 for all i, U ∈ D^{r×n} has U_{ij} = 0 for i > j and U_{ii} ≠ 0 for all i, and D ∈ D^{r×r} is a diagonal matrix (with full rank).

We will usually refer to matrices L ∈ D^{m×r} with L_{ij} = 0 for j > i and L_{ii} ≠ 0 for all i as lower triangular and to matrices U ∈ D^{r×n} with U_{ij} = 0 for i > j and U_{ii} ≠ 0 for all i as upper triangular, even if they are not square.
As mentioned in the introduction, Algorithm 2 does result in common factors in the rows of the output U and the columns of L. In the following sections, we will explore methods to explain and predict those factors. The next result asserts that we can cancel all common factors which we find from the final output. This yields a fraction-free LU decomposition of A where the size of the entries of U (and L) are smaller than in the LD −1 U decomposition.
Corollary 4. Given a matrix A ∈ D^{m×n} with rank r and its standard LD^{-1}U decomposition A = P_r L D^{-1} U P_c, let g_k be a common divisor of the entries of U_{k,*} for 1 ≤ k ≤ r and set G = diag(g_1, ..., g_r). Then Û = G^{-1} U and D̂ = G^{-1} D have entries in D, and A = P_r L D̂^{-1} Û P_c is a fraction-free LU decomposition of A.

Proof. By Theorem 1, the diagonal entries of U are the pivots chosen during the decomposition and they also divide the diagonal entries of D. Thus, any common divisor of U_{k,*} will also divide D_{k,k}, and therefore both Û and D̂ are fraction-free. We can easily check that P_r L D̂^{-1} Û P_c = P_r L D^{-1} G G^{-1} U P_c = A.

Remark 5. If we predict common column factors of L we can cancel them in the same way. However, if we have already canceled factors from U, then there is no guarantee that d | L_{*,k} implies d | D̂_{k,k}. Thus, in general we can only cancel gcd(d, D̂_{k,k}) from L_{*,k}. The same holds mutatis mutandis if we cancel the factors from L first.
It will be an interesting discussion for future research whether it is better to cancel as many factors as possible from U or to cancel them from L.
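A small sketch of this cancellation for D = Z, operating on the D and U produced by the ldu sketch above: the gcd of each row of U is divided out of that row and out of the corresponding diagonal entry of D, which is exact by Corollary 4.

```python
from math import gcd
from functools import reduce

def cancel_row_factors(D, U):
    """Divide each row of U by its gcd and divide the same factor out of D."""
    Dhat, Uhat = D[:], [row[:] for row in U]
    for k, row in enumerate(Uhat):
        g = reduce(gcd, row)               # gcd of the k-th row (nonnegative)
        if g > 1:
            Uhat[k] = [e // g for e in row]
            Dhat[k] //= g                  # exact: g | U_{k,k} and U_{k,k} | D_{k,k}
    return Dhat, Uhat
```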

LU and the Smith-Jacobson Normal Form
This section explains a connection between "systematic factors" (that is, common factors which appear in the decomposition due to the algorithm being used) and the Smith-Jacobson normal form. Given a matrix A over a principal ideal domain D, we study the decomposition A = P_r L D^{-1} U P_c. For simplicity, from now on we consider the decomposition in the form A = L D^{-1} U, that is, we assume that no permutations are needed. The following theorem connecting the LD^{-1}U decomposition with the Smith-Jacobson normal form can essentially be found in [1].

Theorem 6. Let A ∈ D^{n×n} have the Smith-Jacobson normal form diag(d_1, ..., d_n) and let d*_k = d_1 ⋯ d_k denote the k-th determinantal divisor of A. Then d*_k divides every entry of the k-th row of U and every entry of the k-th column of L.

Proof. According to [15, II.15], the diagonal entries of the Smith-Jacobson normal form are quotients of the determinantal divisors, i.e., d_1 = d*_1 and d_k = d*_k / d*_{k−1} for k = 2, ..., n. Moreover, d*_k is the greatest common divisor of all k-by-k minors of A for each k = 1, ..., n. Thus, we only have to prove that the entries of the k-th row of U are k-by-k minors of A. However, this follows from [6, Eqns (9.8), (9.12)], since the k-th row of U consists of the elements a^{(k−1)}_{k,k}, ..., a^{(k−1)}_{k,n} in the notation of Geddes et al. [6]. Similarly, following the algorithm in Jeffrey [10], we see that the columns of L are just made up by copying entries from the columns of U during the reduction. More precisely, the k-th column of L consists of the elements a^{(k−1)}_{k,k}, ..., a^{(k−1)}_{n,k}, and these are again just k-by-k minors of A.

From Theorem 6, we obtain the following result.
Corollary 8. The k-th determinantal divisor d*_k can be removed from the k-th row of U (since it divides D_{k,k} by Corollary 4), and also d*_{k−1} can be removed from the k-th column of L.

As an example, consider a matrix A ∈ Q[x]^{4×4} whose Smith-Jacobson normal form is S = diag(1, x, x(x + 1), x(x + 1)(x − 1)); its determinantal divisors are thus d*_1 = 1, d*_2 = x, d*_3 = x^2(x + 1) and d*_4 = x^3(x + 1)^2(x − 1). Computing the column factors of L and the row factors of U yields the list 1, x, x^2(x + 1) and x^3(x − 1)(x + 1)^2, i.e., exactly the determinantal divisors. In general, there could be other factors as well.
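To experiment with this connection over D = Z, the following sketch computes the determinantal divisors d*_k by brute-force enumeration of all k-by-k minors (practical only for very small matrices); the row gcds of U from the earlier ldu sketch can then be compared against these values. All names here are our own.

```python
from itertools import combinations
from math import gcd
from functools import reduce

def det(M):
    """Determinant by cofactor expansion; fine for the tiny minors used here."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def determinantal_divisor(A, k):
    """gcd of all k-by-k minors of A (the k-th determinantal divisor d*_k)."""
    rows, cols = range(len(A)), range(len(A[0]))
    minors = [det([[A[i][j] for j in J] for i in I])
              for I in combinations(rows, k) for J in combinations(cols, k)]
    return reduce(gcd, minors)

if __name__ == "__main__":
    A = [[2, 4, 6], [4, 10, 14], [6, 14, 22]]
    print([determinantal_divisor(A, k) for k in (1, 2, 3)])
```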

Efficient Detection of Factors
When considering the output of Algorithm 2, we find an interesting relation between the entries of L and U which can be exploited in order to find "systematic" common factors in the LD −1 U decomposition. Theorem 9 below predicts a divisor of the common factor in the k th row of U , by looking at just three entries of L. Likewise, we obtain a divisor of the common factor of the k th column of L from three entries of U . As in the previous section, let D be a principal ideal domain.
Theorem 9. Let A ∈ D^{m×n} and let P_r L D^{-1} U P_c be the LD^{-1}U decomposition of A. Then, for 2 ≤ k ≤ min{m, n}, the element gcd(L_{k−1,k−1}, L_{k,k−1}) / gcd(L_{k−2,k−2}, L_{k−1,k−1}, L_{k,k−1}) divides every entry of U_{k,*} (where we set L_{0,0} = 1).

Proof. Suppose that during Bareiss's algorithm after k − 1 iterations we have reached a state in which the rows processed so far form an upper triangular matrix T whose last pivot is p, while the next two rows contain, in the current pivot column, the entries a and b, followed by the row vectors v and w, respectively, with p, a, b ∈ D and v, w ∈ D^{1×(n−k−1)}. Assume that a ≠ 0 and that we choose it as a pivot. Continuing the computations, we now eliminate b (and the entries below it) by cross-multiplication, which replaces the row (b, w) by (0, aw − bv). Here, we can see that any common factor of a and b will be a factor of every entry in that row, i.e., gcd(a, b) | aw − bv. However, we still have to carry out the exact division step. This leads to the row (0, (aw − bv)/p), and the division by p is exact. Some of the factors in p might be factors of a or b while others are hidden in v or w. However, every common factor of a and b which is not also a factor of p will still be a common factor of the resulting row. In other words, gcd(a, b)/gcd(p, a, b) divides every entry of (aw − bv)/p. In fact, the factors do not need to be tracked during the LD^{-1}U reduction but can be computed afterwards: all the necessary entries a, b and p will end up as entries of L. More precisely, we shall have p = L_{k−2,k−2}, a = L_{k−1,k−1} and b = L_{k,k−1}.
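A direct implementation of this prediction for D = Z might look as follows (the function name and the 1-based index convention are ours); it reads the three entries from L and returns the predicted divisor of the k-th row of U.

```python
from math import gcd

def predicted_row_factor(L, k):
    """Predicted divisor of the common factor of row k of U (k is 1-based, k >= 2)."""
    p = L[k - 3][k - 3] if k >= 3 else 1      # convention L_{0,0} = 1
    a = L[k - 2][k - 2]
    b = L[k - 1][k - 2]
    g = gcd(a, b)
    return g // gcd(p, g)                     # gcd(a, b) / gcd(p, a, b)
```

Applied to the L returned by the ldu sketch above, predicted_row_factor(L, k) divides the gcd of the k-th row of U by Theorem 9.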
Similar reasoning can be used to predict common factors in the columns of L. Here, we have to take into account that the columns of L are made up from entries in U during each iteration of the computation. Note that in this example pivoting is not needed, that is, we have P r = P c = 1. The method outlined in Theorem 9 correctly predicts the common factor 2 in the second row, the factor 3 in the third row and the factor 2 in the fourth row. However, it does not detect the additional factor 5 in the fourth row.
The example also provides an illustration of the proof of Theorem 9. Given Theorem 9, one can ask how good this prediction actually is. Concentrating on the case of integer matrices, the following Theorem 10 shows that with this prediction we do find a common factor in roughly a quarter of all rows. Experimental data suggest a similar behavior for matrices containing polynomials in F_p[x] where p is prime. Moreover, these experiments also showed that the prediction was able to account for 40.17% of all the common prime factors (counted with multiplicity) in the rows of U.

Theorem 10. For random integers a, b, p ∈ Z the probability that the formula in Theorem 9 predicts a non-trivial common factor is 1 − 6ζ(3)/π² ≈ 26.92%, where ζ denotes the Riemann zeta function.

The following calculation is due to Hare [8]; Winterhof [17]: First note that the probability that gcd(a, b) = n is 1/n² times the probability that gcd(a, b) = 1. Summing up all of these probabilities gives ∑_{n≥1} P(gcd(a, b) = n) = P(gcd(a, b) = 1) ∑_{n≥1} 1/n² = P(gcd(a, b) = 1) π²/6. As this sum must be 1, this gives that P(gcd(a, b) = 1) = 6/π² and P(gcd(a, b) = n) = 6/(π²n²). Given that gcd(a, b) = n, the probability that n | p is 1/n. So the probability that gcd(a, b) = n and gcd(p, a, b) = n is 6/(π²n³). Hence P(gcd(a, b)/gcd(p, a, b) = 1) = ∑_{n≥1} P(gcd(a, b) = n and gcd(p, a, b) = n) = ∑_{n≥1} 6/(π²n³) = 6ζ(3)/π² ≈ 73.08%, and the complementary probability of a non-trivial prediction is 1 − 6ζ(3)/π² ≈ 26.92%.

There is another way in which common factors in integer matrices can arise. Let d be any number. Then for random a, b the probability that d | a + b is 1/d. That means that if v, w ∈ Z^{1×n} are vectors, then d | v + w with a probability of 1/d^n. This effect is noticeable in particular for small numbers like d = 2, 3 and in the last iterations of the LD^{-1}U decomposition when the number of non-zero entries in the rows has shrunk. For instance, in the second-to-last iteration we only have three rows with at most three non-zero entries each. Moreover, we know that the first non-zero entries of the rows cancel during cross-multiplication. Thus, a factor of 2 appears with a probability of 25% in one of those rows, and a factor of 3 with a probability of 11.11%. In the example above, the probability for the factor 5 to appear in the fourth row was 4%.
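The 26.92% figure is easy to check empirically; the following Monte Carlo sketch (our own, with an arbitrary sampling range standing in for "uniformly random integers") estimates the probability that gcd(a, b)/gcd(p, a, b) is non-trivial and compares it with 1 − 6ζ(3)/π².

```python
import random
from math import gcd, pi

def estimate(trials=200_000, N=10**9):
    """Fraction of random triples (a, b, p) with gcd(a, b)/gcd(p, a, b) > 1."""
    hits = 0
    for _ in range(trials):
        a, b, p = (random.randint(1, N) for _ in range(3))
        g = gcd(a, b)
        if g // gcd(p, g) > 1:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    zeta3 = sum(1 / n**3 for n in range(1, 10_000))    # crude approximation of zeta(3)
    print("theory :", 1 - 6 * zeta3 / pi**2)           # about 0.2692
    print("sampled:", estimate())
```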

Expected Number of Factors
In this section, we provide a detailed analysis of the expected number of common "statistical" factors in the rows of U , in the case when the input matrix A has integer entries, that is, D = Z. We base our considerations on a "uniform" distribution on Z, e.g., by imposing a uniform distribution on {−n, . . . , n} for very large n. However, the only relevant property that we use is the assumption that the probability that a randomly chosen integer is divisible by p is 1/p.
We consider a matrix A = (A_{i,j})_{1≤i,j≤n} ∈ Z^{n×n} of full rank. The assumption that A be square is made for the sake of simplicity; the results shown below immediately generalize to rectangular matrices. As before, let U be the upper triangular matrix from the LD^{-1}U decomposition of A. Define g_k := gcd(U_{k,k}, U_{k,k+1}, ..., U_{k,n}) to be the greatest common divisor of all entries in the k-th row of U. Counting (with multiplicities) all the prime factors of g_1, ..., g_{n−1}, one gets the picture shown in Figure 1; g_n is omitted as it contains only the single nonzero entry U_{n,n} = det(A). Our goal is to give a probabilistic explanation for the occurrence of these common factors, whose number seems to grow linearly with the dimension of the matrix.
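The experiment behind Figure 1 can be reproduced with a short script; the sketch below (entry range, matrix sizes and sample count are arbitrary choices of ours) reduces random integer matrices with Bareiss's algorithm, takes the gcd g_k of each of the first n − 1 rows of U, and counts the prime factors with multiplicity.

```python
import random
from math import gcd
from functools import reduce

def bareiss_U(A):
    """Upper triangular U from fraction-free elimination, with simple row pivoting."""
    U = [row[:] for row in A]
    n, prev = len(U), 1
    for k in range(n - 1):
        if U[k][k] == 0:                              # swap in a nonzero pivot
            piv = next(i for i in range(k + 1, n) if U[i][k] != 0)
            U[k], U[piv] = U[piv], U[k]
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                U[i][j] = (U[k][k] * U[i][j] - U[i][k] * U[k][j]) // prev
            U[i][k] = 0
        prev = U[k][k]
    return U

def num_prime_factors(m):
    """Number of prime factors of m, counted with multiplicity (trial division)."""
    m, count, d = abs(m), 0, 2
    while d * d <= m:
        while m % d == 0:
            m //= d
            count += 1
        d += 1
    return count + (1 if m > 1 else 0)

def row_factor_count(n, bound=100):
    """Total number of prime factors of g_1, ..., g_{n-1} for one random matrix."""
    A = [[random.randint(-bound, bound) for _ in range(n)] for _ in range(n)]
    U = bareiss_U(A)
    return sum(num_prime_factors(reduce(gcd, U[k][k:])) for k in range(n - 1))

if __name__ == "__main__":
    for n in (5, 10, 15):
        samples = [row_factor_count(n) for _ in range(20)]
        print(n, sum(samples) / len(samples))
```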
As we have seen in the proof of Theorem 6, the entries U_{k,ℓ} can be expressed as minors of the original matrix A. Observe that the entries U_{k,ℓ} in the k-th row of U are all given as determinants of the same matrix, where only the last column varies. For any integer q ≥ 2 we have that q | g_k if q divides all these determinants. A sufficient condition for the latter to happen is that the determinant is divisible by q as a polynomial, i.e., if q divides the content of the polynomial h_k obtained by treating the varying last column as indeterminates. We now aim at computing how likely it is that q | h_k when q is fixed and when the matrix entries A_{1,1}, ..., A_{k,k−1} are chosen randomly. Since q is now fixed, we can equivalently study this problem over the finite ring Z_q, which means that the matrix entries are picked randomly and uniformly from the finite set {0, ..., q − 1}. Moreover, it turns out that it suffices to answer this question for prime powers q = p^j. The probability P_{p,j,k} that all k × k minors of a randomly chosen k × (k + 1) matrix are divisible by p^j, where p is a prime number and j ≥ 1 is an integer, can be written succinctly in terms of the q-Pochhammer symbol (a; q)_k := ∏_{i=0}^{k−1} (1 − a q^i), with (a; q)_0 := 1. Now, an interesting observation is that this probability does not, as one could expect, tend to zero as k goes to infinity. Instead, it approaches a nonzero constant that depends on p and j (see Table 1).

Table 1. Behavior of the sequence (P_{p,j,k})_{k∈N} for some small values of p^j.
Using the probability P_{p,j,k}, one can write down the expected number of factors in the determinant h_{k+1}, i.e., the number of prime factors in the content of the polynomial h_{k+1}, counted with multiplicities. Summing over all rows, it follows from the discussion above that for large n the expected total number F(n) of such factors can be approximated by a linear function: F(n) ≈ 0.89764 n − 1.53206.

QR Decomposition
A fraction-free QR decomposition, based on the LD^{-1}U decomposition, was given in Zhou and Jeffrey [18]. In this section, we present a refined version of this algorithm (see Theorem 12). As a first step towards its proof, we need a fraction-free analogue of the Cholesky decomposition, which is given in Theorem 11 below. This section assumes that D has characteristic 0, which is needed in order to ensure that A^t A has full rank.
Theorem 11. Let A ∈ D^{n×n} be a symmetric matrix such that its LD^{-1}U decomposition can be computed without permutations; then we have U = L^t, that is, A = L D^{-1} L^t.

Proof. Compute the decomposition A = L D^{-1} U as in Theorem 1. If we do not execute item 4 of Algorithm 2, we obtain a decomposition A = L̃ D̃^{-1} Ũ with square matrices L̃, D̃ and Ũ. Then, because A is symmetric, we obtain L̃ D̃^{-1} Ũ = A = A^t = Ũ^t D̃^{-1} L̃^t. The matrices L̃ and D̃ have full rank, which implies D̃^{-1} Ũ (L̃^t)^{-1} = L̃^{-1} Ũ^t D̃^{-1}. Examination of the matrices on the left hand side reveals that they are all upper triangular. Therefore also their product is an upper triangular matrix. Similarly, the right hand side is a lower triangular matrix, and the equality of the two implies that they must both be diagonal. Canceling D̃ and rearranging the equation yields Ũ = (L̃^{-1} Ũ^t) L̃^t where L̃^{-1} Ũ^t is diagonal. This shows that the rows of Ũ are just multiples of the rows of L̃^t. However, we know that the first r diagonal entries of Ũ and L̃ are the same, where r is the rank of Ũ. This yields that the first r rows of Ũ and of L̃^t coincide, and hence, when we remove the unnecessary last n − r rows of Ũ and the last n − r columns of L̃ (as suggested in Jeffrey [10]), we are left with U = L^t.
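The statement is easy to test numerically over D = Z by feeding a Gram matrix B^t B (symmetric, and positive definite when B has full column rank, so no pivoting is required) into the LD^{-1}U routine; the self-contained sketch below only checks U = L^t for a random B.

```python
import random

def ldu(A):
    """LD^{-1}U decomposition of a square matrix, assuming no pivoting is needed."""
    m = len(A)
    U = [row[:] for row in A]
    L = [[0] * m for _ in range(m)]
    pivots, prev = [], 1
    for k in range(m):
        p = U[k][k]
        assert p != 0
        pivots.append(p)
        L[k][k] = p
        for i in range(k + 1, m):
            L[i][k] = U[i][k]
            for j in range(k + 1, m):
                U[i][j] = (p * U[i][j] - U[i][k] * U[k][j]) // prev
            U[i][k] = 0
        prev = p
    D = [pivots[0]] + [pivots[k - 1] * pivots[k] for k in range(1, m)]
    return L, D, U

if __name__ == "__main__":
    n = 4
    B = [[random.randint(-9, 9) for _ in range(n)] for _ in range(6)]
    G = [[sum(B[k][i] * B[k][j] for k in range(6)) for j in range(n)]
         for i in range(n)]                      # the Gram matrix B^t B
    L, D, U = ldu(G)
    print(all(U[i][j] == L[j][i] for i in range(n) for j in range(n)))   # True
```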
The following theorem is a variant of Zhou and Jeffrey [18, Thm. 8], where we exploit the symmetry of A^t A by invoking Theorem 11. This leads to a nicer representation of the decomposition, and we obtain more information about Θ^t Θ.
Theorem 12. Let A ∈ D^{m×n} with n ≤ m and with full column rank. Then the partitioned matrix (A^t A | A^t) has an LD^{-1}U decomposition of the form (A^t A | A^t) = R^t D^{-1} (R | Θ^t), where Θ^t Θ = D and A = Θ D^{-1} R.
Proof. Since A has full column rank, the Gramian matrix A^t A will have full rank, too. By taking the first k columns of A (and the first k rows of A^t), it follows that also the k-th principal minor of A^t A is nonzero. Consequently, when we compute the LD^{-1}U decomposition, we do not need any permutations.
Hence, by Theorem 11, we can decompose the symmetric matrix A^t A as A^t A = R^t D^{-1} R. Applying the same row transformations to A^t yields a matrix Θ^t, that is, we obtain (A^t A | A^t) = R^t D^{-1} (R | Θ^t). As in the proof of Zhou and Jeffrey [18, Thm. 8], we easily compute that Θ^t Θ = D.

As an example, computing this decomposition for a 3-by-3 matrix A over Z[x] with determinant −2(x − 1), we see that the resulting ΘD^{-1}R decomposition has a common factor in the last column of Θ. This observation is explained by the following theorem.
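For D = Z the construction in Theorem 12 can be carried out directly by running the Bareiss reduction on the partitioned matrix (A^t A | A^t); the sketch below (names are ours) does this for a full-column-rank A and verifies A = Θ D^{-1} R and Θ^t Θ = D with exact rational arithmetic.

```python
from fractions import Fraction

def fraction_free_qr(A):
    """Fraction-free QR via the LD^{-1}U decomposition of (A^t A | A^t)."""
    m, n = len(A), len(A[0])
    At = [[A[i][j] for i in range(m)] for j in range(n)]
    AtA = [[sum(At[i][k] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    M = [AtA[i] + At[i] for i in range(n)]           # the partitioned matrix
    pivots, prev = [], 1
    for k in range(n):
        p = M[k][k]
        assert p != 0, "nonzero because A has full column rank"
        pivots.append(p)
        for i in range(k + 1, n):
            for j in range(k + 1, n + m):
                M[i][j] = (p * M[i][j] - M[i][k] * M[k][j]) // prev
            M[i][k] = 0
        prev = p
    R = [row[:n] for row in M]                        # reduced M is (R | Theta^t)
    Theta = [[M[j][n + i] for j in range(n)] for i in range(m)]
    D = [pivots[0]] + [pivots[k - 1] * pivots[k] for k in range(1, n)]
    return Theta, D, R

if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 7]]
    Theta, D, R = fraction_free_qr(A)
    m, n = len(A), len(A[0])
    prod = [[sum(Fraction(Theta[i][k], D[k]) * R[k][j] for k in range(n))
             for j in range(n)] for i in range(m)]
    print(prod == A)                                   # A = Theta D^{-1} R
    gram = [[sum(Theta[k][i] * Theta[k][j] for k in range(m)) for j in range(n)]
            for i in range(n)]
    print(all(gram[i][j] == (D[i] if i == j else 0)
              for i in range(n) for j in range(n)))    # Theta^t Theta = D
```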
Theorem 13. With full-rank A ∈ D^{n×n} and Θ as in Theorem 12, we have det(A) | Θ_{i,n} for all i = 1, ..., n.

Proof. We use the notation from the proof of Theorem 12. From Θ D^{-1} R = A and Θ^t Θ = D we obtain Θ^t A = Θ^t Θ D^{-1} R = R. Thus, since A has full rank, Θ^t = R A^{-1} or, equivalently, (det A) Θ^t = R adj A, where adj A is the adjoint matrix of A. Since R^t is a lower triangular matrix with det A^t A = (det A)^2 at position (n, n), the claim follows.
For the other columns of Θ we can state the following.
Theorem 14. The k-th determinantal divisor d*_k of A divides the k-th column of Θ and the k-th row of R. Moreover, d*_{k−1} d*_k divides D_{k,k} for k ≥ 2.

Proof. We first show that the k-th determinantal divisor δ*_k of (A^t A | A^t) is the same as d*_k. Obviously, δ*_k | d*_k since all minors of A are also minors of the right block A^t of (A^t A | A^t). Consider now the left block A^t A. We have by the Cauchy-Binet theorem [3, § 4.6] that det_{I,J}(A^t A) = ∑_K det_{K,I}(A) det_{K,J}(A), where I, J ⊆ {1, ..., n} with |I| = |J| = q ≥ 1 are two index sets, the sum ranges over all K ⊆ {1, ..., m} with |K| = q, and det_{I,J} M denotes the minor of a matrix M for these index sets. Thus, (d*_k)^2 divides any k × k minor of A^t A since it divides every summand on the right hand side; and we see that d*_k | δ*_k. Now, we use Theorem 12 and Theorem 6 to conclude that d*_k divides the k-th row of (R | Θ^t) and hence the k-th row of R and the k-th column of Θ. Moreover, D_{k,k} = R_{k−1,k−1} R_{k,k} for k ≥ 2 by Theorem 1, which implies d*_{k−1} d*_k | D_{k,k}.

Knowing that there is always a common factor, we can cancel it, which leads to a fraction-free QR decomposition of smaller size.

Theorem 15. Let A ∈ D^{n×n} have full rank, let Θ, D and R be as in Theorem 12, and set S = diag(1, ..., 1, det A). Then Θ̂ = Θ S^{-1}, D̂ = S^{-1} D S^{-1} and R̂ = S^{-1} R are matrices over D, and A = Θ̂ D̂^{-1} R̂ with Θ̂^t Θ̂ = D̂.

Proof. By Theorem 13, Θ S^{-1} is an exact division. The statement of the theorem follows from A = Θ S^{-1} (S D^{-1} S) S^{-1} R.
If we apply Theorem 15 to our previous example, we obtain a simpler QR decomposition in which the factor det A = −2(x − 1) has been removed.
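In code, this size reduction amounts to dividing the last column of Θ and the last row of R by det A and the entry D_{n,n} by (det A)^2; the helper below (our own naming, following the cancellation described above) performs these exact divisions on the output of the fraction_free_qr sketch.

```python
def cancel_det_factor(Theta, D, R, detA):
    """Cancel det(A) from the last column of Theta and the last row of R."""
    Theta = [row[:-1] + [row[-1] // detA] for row in Theta]   # exact by Theorem 13
    R = R[:-1] + [[e // detA for e in R[-1]]]                  # exact by Theorem 14
    D = D[:-1] + [D[-1] // detA**2]                            # (det A)^2 | D_{n,n}
    return Theta, D, R
```

For instance, for A = [[1, 2], [3, 4]] (so det A = −2) the decomposition Θ = [[1, 6], [3, −2]], D = diag(10, 40), R = [[10, 14], [0, 4]] becomes Θ̂ = [[1, −3], [3, 1]], D̂ = diag(10, 10), R̂ = [[10, 14], [0, −2]], and one checks that A = Θ̂ D̂^{-1} R̂ still holds.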
The properties of the QR-decomposition are strong enough to guarantee a certain uniqueness of the output.
Theorem 16. Let A ∈ D^{n×n} have full rank. Let A = Θ D^{-1} R be the decomposition from Theorem 12, and let A = Θ̃ D̃^{-1} R̃ be another decomposition where Θ̃, D̃, R̃ ∈ D^{n×n} are such that D̃ is a diagonal matrix, R̃ is an upper triangular matrix and Θ̃^t Θ̃ is a diagonal matrix. Then Θ̃^t Θ is also a diagonal matrix and R̃ = (Θ̃^t Θ)^{-1} D̃ R.

Proof. Multiplying Θ D^{-1} R = Θ̃ D̃^{-1} R̃ from the left by Θ^t and using Θ^t Θ = D gives R = Θ^t Θ̃ D̃^{-1} R̃. If R and R̃ have full rank, this is equivalent to Θ^t Θ̃ = R R̃^{-1} D̃. Note that all the matrices on the right hand side are upper triangular. Similarly, we can compute that Θ̃^t Θ D^{-1} R = Θ̃^t Θ̃ D̃^{-1} R̃ = ∆ D̃^{-1} R̃ with ∆ := Θ̃^t Θ̃, which implies Θ̃^t Θ = ∆ D̃^{-1} R̃ R^{-1} D. Hence Θ̃^t Θ is upper triangular; since it also equals (Θ^t Θ̃)^t, the transpose of an upper triangular matrix, it must consequently be diagonal, say Θ̃^t Θ = T for some diagonal matrix T with entries from D. We obtain R = T D̃^{-1} R̃ and thus R̃ = T^{-1} D̃ R.