Entries of the inverses of large positive definite Toeplitz matrices

This is an expository paper embarking on the asymptotic behavior of the entries of the inverses of positive definite symmetric Toeplitz matrices as the matrix dimension goes to infinity. We consider the behavior of the entries in neighborhoods of the four corners as well as the density of the distribution of the entries over all of the inverse matrix.


Introduction
Gábor Szegő's two papers [20] of 1920 and 1921 contain the roots of an enormous development that is lasting until the present. This development includes Toeplitz determinants and orthogonal polynomials, and now there are many recent works which present these topics in expository style; see, for example, [4,7,18]. Another aspect of Szegő's two papers concerns the analysis of positive definite Toeplitz matrices, and as I am not aware of an expository article devoted exclusively to this issue, I decided to write this paper when receiving the honorable invitation to submit a contribution to the 100th anniversary of the journal Acta Scientiarum Mathematicarum, which was founded by Szegő's contemporaries Alfréd Haar and Frigyes Riesz in 1922 Given a real-valued function a in L 1 over the complex unit circle T, we denote by a k = π −π a(e iθ )e −ikθ dθ/(2π) (k ∈ Z) its Fourier coefficients and by T n (a) the n × n Toeplitz matrix Since a is real-valued, this matrix is Hermitian. We confine ourselves to the case where the matrix is real and thus symmetric. This is equivalent to the 1 3

88
A. Böttcher requirement that the function θ → a(e iθ ) is even on (−π, π). If a ≥ 0 almost everywhere on T and a is not identically zero, then T n (a) is positive definite. Indeed, if x = (x 0 , . . . , x n−1 ) ∈ C n is nonzero, then a(e iθ )|x(e iθ )| 2 dθ (1) with x(e iθ ) = x 0 + x 1 e iθ + · · · + x n−1 e i(n−1)θ , and the integral is clearly strictly positive. Throughout this paper we assume that a ∈ L 1 is real-valued and not identically zero, that a k = a −k for all k, and that a ≥ 0 a.e. on T, so that T n (a) is (symmetric) positive definite and hence invertible for all n ≥ 1. We are interested in the entries of the inverse matrix T −1 n (a) := [T n (a)] −1 . The interest in the entries of the inverse is, for example, motivated by the following result of Spitzer and Stone [19] on random walks with deadly barriers. Let {h k } ∞ k=−∞ be a sequence of non-negative numbers such that Clearly, a ≥ 0 on T. Let X be a random variable that assumes the value k with probability h k , that is, P(X = k) = h k . With an unspecified integer S 0 , consider the stochastic process where the X i are identically distributed, independent copies of X. Then Equivalently, [T −1 n (a)] j,k is the expected number of visits of the stochastic process S n to the point k starting with S 0 = j and before leaving the closed interval [1, n] for the first time. In the case where h 1 = h −1 = 1/2 and h k = 0 for all other k, this insight was already gained by Courant, Friedrichs, and Lewy in [6]. In this special case the matrix T n (a) is the tridiagonal Toeplitz matrix This is 1/2 times the matrix we will examine in Section 2.
There are clever fast and stable algorithms for solving systems with positive definite Toeplitz matrices and thus for inverting such matrices. These give the desired entries within seconds for n up to some thousands. However, if a depends on a parameter or if one wants to know the behavior of the entries for n → ∞, one cannot invoke numerical algorithms. The purpose of this paper is to record a couple of (mostly known) tools that can be employed to tackle such problems.
One well-known and very useful tool is the Gohberg-Sementsul-Trench formula [10,22]. Full proofs can also be found in [9, Chap. III.6.1] and [11, p. 21]. In the case of positive definite symmetric matrices, it says that if (c 1 c 2 . . . c n ) is the first column of T −1 n (a) or, equivalently, the solution of the system T n (a)c = e 1 with e 1 = (1 0 . . . 0) , then c 1 = 0 and Thus, the entire inverse matrix is captured in its first column. Note also that Cramer's rule gives c 1 = det T n−1 (a)/ det T n (a). for n = 8 and n = 9 in Matlab to see that these two inverses are particular beauties. It is a standard exercise to show that det T n (a) = n + 1 for n ≥ 1, 1 3

An illuminating example
and as c (n) 1 is known, these equations successively yield Consequently, c as n → ∞. In other terms, for each fixed m, the upper-left m × m corner of T −1 n (a) converges to (min(j, k)) m j,k=1 as n → ∞. By symmetry, the limit of the lower-right corner is the double flip of this matrix. Analogously one obtains that the upper-right and lower-left corners of T −1 n (a) approach the zero matrix as n → ∞.
Once c n are available, formula (3) allows us to determine the j, k entry of the inverse by elementary computations. For example, the k, k entry turns out to be k(n + 1 − k)/(n + 1), which is about n/4 for k around n/2. The left picture of Figure 1 shows a pseudocolor plot of T −1 100 (a). One suspects that after shrinking the inverse to the square Q = [0, 1] 2 , the appropriately scaled values of the entries will eventually form a nice surface about Q. To be more precise, the expectation is that after changing the enumeration of the entries from 1, 2, . . . , n to 0, 1, . . . , n−1 and denoting by [ ] the integral part of ∈ R, one has . This can indeed be proved by simply computing the j, k entry using (3) and (4). It results that G is symmetric about the diagonals x = y and x = 1− y and that G( In the right picture of Figure 1 we see a pseudocolor plot of G. The two pictures fit after a flip about the horizontal axis (because the rows of a matrix are numbered downwards, whereas the y-axis of Q is oriented upwards). Now let α > 2. In that case things change drastically.
. If even f ∈ L ∞ , this matrix induces a bounded linear operator on 2 (N) in the natural fashion. If α = 2, the operator T (a) is not (boundedly) invertible: its formal inverse is given by the extension of (5) to an infinite matrix, and this matrix does clearly not generate a bounded operator. However, if α > 2, then T (a) has a bounded inverse, denoted by T −1 (a) := [T (a)] −1 . This inverse can be found by what is called Wiener-Hopf factorization. Recall that t = e iθ ∈ T. Simple computation shows that we get the Wiener-Hopf factorization a = a − a + , and the inverse T −1 (a) can be shown to be T (a −1 + )T (a −1 − ); see, for example, [4, Section 1.5]. We have the Fourier expansions The expansions for a −1 ± give 1 3

A. Böttcher
A formula by Widom [24] (which is as Theorem 2.14 also in [4]) reads T −1 n (a) = P n T −1 (a)P n + W n K(a)W n + C n (7) = T n (a −1 ) + P n K(a)P n + W n K(a)W n + C n (8) where P n is projection onto the first n coordinates, W n is P n followed by reversal of the coordinates, K(a) = T −1 (a) − T (a −1 ), and C n → 0 as n → ∞.
Here P n AP n : 2 (N) → 2 (N) is identified with an n × n matrix in the obvious manner. From (6) and the Fourier expansion of a −1 we obtain The (Hankel) operator K(a) is compact, W n converges weakly to zero, and P n converges strongly to the identity operator. It follows in particular that W n K(a)W n converges strongly to zero and thus (7)

Convergence of corner entries
We now turn to a fairly large class of functions. Let a ∈ L 1 := L 1 (T), suppose the function θ → a(e iθ ) is real-valued, non-negative, and even on (−π, π), and assume log a is also in L 1 . The latter assumption admits polynomial zeros on T, but it rules out functions with strong zeros, in particular functions vanishing identically on entire intervals. Let log a(t) = Various versions of this theorem are known for a long time; see, for example, [19,Theorem 1.6]. In the form presented, that is, under the sole assumption that a ∈ L 1 , a ≥ 0, log a ∈ L 1 , it was stated with a full proof in [3], but it may well be that this is also known. The polynomials are called the Szegő polynomials of a. They are the monic orthogonal polynomials on T with respect to the measure a(e iθ )dθ/(2π). Theorem 1 is therefore a result on the limiting behavior of the coefficients of these polynomials and its proof is based on the classical results in this field, which can be found in [18,20,21]. Notice that (a −1 where G(a) = exp(log a) 0 is the so-called geometric mean of a, and that the convergence is a classical result of Szegő. We also note that a + (z) is sometimes defined as Then the (a −1 + ) j−1 in Theorem 1 must be replaced by (1/G(a))(a −1 , the factor 1/G(b) is correctly contained in the formula for c j but incorrectly missing in the formulas for c 1 , c 2 , c 3 .) Example 2. Let a(t) = |1 − t| 2p where p ∈ (−1/2, ∞). For p = 1, this is the function 2 − t −1 − t examined in Section 2. If p = 2, we get the pentadiagonal symmetric Toeplitz matrix whose first row is (6 −4 1 0 . . .). The matrices T n (a) are no longer banded if p / ∈ Z. In the case at hand, which shows that a −1 Consequently, the limit of c (n) j is p+j−2 j−1 . Theorem 4.1 of [3] contains the more precise result To include a more general case, let a(t) = |1−t| 2p b(t) where p ∈ (−1/2, ∞) and b is bounded and bounded away from zero in an open neighborhood of t = 1. Our assumptions on a imply that b and log b are also in L 1 . We now have and eventually we obtain c (n) Example 3. Let T n (a) be the pentadiagonal symmetric Toeplitz matrix whose first row is .
What results is Inserting γ = (−3 − √ 5)/2 and using (3) we see after some computations that the upper-left 3 × 3 corner of T −1 n (a) converges to ⎛ In the case where a is not only non-negative on T but even bounded and bounded away from zero (which is the case α > 2 in Section 2), a little more can be said. So suppose a is actually a real-valued and even function in L ∞ and there is an ε > 0 such that a ≥ ε a.e. on T. Then the infinite Toeplitz matrix T (a) induces a bounded and invertible operator on 2 (N). We may the matrices T n (a) identify with the operator P n T (a)P n , and this operator converges strongly to T (a). From (1) we infer that the eigenvalues of T n (a) are all greater than or equal to ε > 0, which reveals that the norms of the inverses T −1 n (a) are uniformly bounded for n ≥ 1. All this together implies that the so-called finite section method is applicable to T (a), which means that T −1 n (a)P n = (P n T (a)P n ) −1 P n converges strongly to T −1 (a); see, e.g., [4,Section 2.1] or [9,Chap. II.2]. It follows in particular from Theorem 1 that {c j } ∞ j=1 with c j = (a −1 + ) j−1 is the sequence of entries of the first column of T −1 (a) and that not only c

Global limiting distribution
The subject of this section is the behavior of [T −1 n (a)] [nx], [ny] for (x, y) ∈ Q = [0, 1] 2 as n → ∞. Recall that in this context we number rows and columns from 0 to n − 1 and not from 1 to n.
If a takes positive values, is continuous, and bounded away from zero on T, then T (a) is invertible and Widom's formulas (7) and (8)  Things become more interesting if a has zeros on T. Throughout this section we assume that a(t) = |1 − t| 2p b(t) with an integer p ≥ 1 and with a continuous function b : T → [ε, ∞) (ε > 0) whose Fourier coefficients satisfy b k = b −k for all k and ∞ k=−∞ |k||b k | < ∞. Paper [16] treats the case where p is no longer assumed to be an integer. The following result is the Theorema Egregium in the business.
Above we already mentioned Green's function. The connection is as follows. Consider the boundary value problem The Green function for this problem is the function G on [0, 1] 2 which gives the solution to this boundary value problem by the formula u(x) = 1 0 G(x, y)v(y) dy. Discretization of problem (10), (11) leads to the Toeplitz matrices T n (ω p ), and it is therefore not too much a surprise that the Green function for the problem is connected with the inverse T −1 n (ω p ). This was rigorously proved in [6] for p = 1 and by Parter [12] and Widom [23] for general p. In the latter two papers, the authors relate the minimal eigenvalues of truncated discrete and continuous convolutions to the reciprocals of the maximal eigenvalues of an integral operator with some (quite complicated) kernel K(x, y) which is identified as Green's function of problem (10), (11). However, the relation between the entries of T −1 n (ω p b) and Green's kernel lies deeper than the relation between the eigenvalues of these two and was uncovered only by Rambour and Seghier [14,15]. Anyway, eventually we know that G(x, y) = K(x, y) = G p (x, y). A direct proof (avoiding the detour through Toeplitz matrices) that Green's function is given by (9) can be found in [1]. Paper [5] contains more pieces of information about this and related topics.

Declarations
Conflict of interest. The author states that there is no conflict of interest. No datasets were generated or analysed during the current study.
Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit (http://creativecommons.org/ licenses/by/4.0/).