Stable polefinding and rational least-squares fitting via eigenvalues

A common way of finding the poles of a meromorphic function f in a domain, where an explicit expression of f is unknown but f can be evaluated at any given z, is to interpolate f by a rational function $p/q$ such that $r(\gamma_i) = f(\gamma_i)$ at prescribed sample points $\{\gamma_i\}_{i=1}^L$, and then find the roots of q. This is a two-step process, and the type of the rational interpolant needs to be specified by the user. Many other algorithms for polefinding and rational interpolation (or least-squares fitting) have been proposed, but their numerical stability has remained largely unexplored.
In this work we describe an algorithm with the following three features: (1) it automatically finds an appropriate type for a rational approximant, thereby allowing the user to input just the function f; (2) it finds the poles via a generalized eigenvalue problem of matrices constructed directly from the sampled values $f(\gamma_i)$ in a one-step fashion; and (3) it computes rational approximants $\hat p, \hat q$ in a numerically stable manner, in that $(\hat p + \Delta p)/(\hat q + \Delta q) = f$ with small $\Delta p, \Delta q$ at the sample points, making it the first rational interpolation (or approximation) algorithm with guaranteed numerical stability. Our algorithm executes an implicit change of polynomial basis by the QR factorization, and allows for oversampling combined with least-squares fitting. Through experiments we illustrate the resulting accuracy and stability, which can significantly outperform those of existing algorithms.

Mathematics Subject Classification 65D05 (Interpolation) · 65D15 (Algorithms for functional approximation)

Introduction
Let f be a meromorphic function in a domain Ω, whose explicit expression is unknown but which can be evaluated at any set of sample points $\{\gamma_i\}_{i=1}^L$. This paper investigates numerical algorithms for finding the poles of f, along with the associated problem of finding a rational approximant r = p/q ≈ f in Ω. Finding the poles of a meromorphic or rational function f is required in many situations, including resolvent-based eigensolvers [4,36,40] and the analysis of transfer functions [30,38].
One natural way of finding the poles is to first approximate f in Ω by a rational function r = p/q, then find the poles of r, i.e., the roots of q. A common approach to obtain $r \in R_{m,n}$ (a rational function of type (m, n), i.e., r = p/q for polynomials p, q of degree at most m and n respectively) is to interpolate f at m + n + 1 points in Ω (such as the unit disk), a code for which is available in the Chebfun command ratinterp [21,35]. However, this is a two-step process; when the poles are of primary interest, explicitly forming r is unnecessary and can be a cause of numerical errors. Moreover, the type of the rational function is usually required as input.
In this paper we develop a polefinding algorithm ratfun that essentially involves just solving one generalized eigenvalue problem, thereby bypassing the need to form r. ratfun starts by finding an appropriate type for the rational approximant from the function values: roughly, it finds the pair (m, n) with the smallest possible n (without taking m excessively large) such that r = p/q ≈ f holds with $r \in R_{m,n}$; in Sect. 3 we make this precise. This allows the user to input just the function f to obtain the poles. The rational approximant can also be obtained if necessary.
Since polefinding for r = p/q boils down to rootfinding for q, it is inevitable that the algorithm involves an iterative process (as opposed to processes requiring finitely many operations in exact arithmetic such as a linear system), and hence it is perhaps unsurprising that we arrive at an eigenvalue problem. Our algorithm has runtime that scales cubically with the type of the rational approximant, which is comparable to some of the state-of-the-art algorithms.
A key property of ratfun is its numerical stability. To our knowledge, no previous polefinding algorithm has been proven to be numerically stable. Numerical stability here means backward stability in the sense that $(\hat p + \Delta p)/(\hat q + \Delta q) = f$ holds exactly at the sample points, where $\hat p/\hat q$ is the computed rational approximant and $\|\Delta p\|_L/\|\hat p\|_L$, $\|\Delta q\|_L/\|\hat q\|_L$ are $O(u)$, where u is the unit roundoff (throughout we write x = O(y) to mean x ≤ My for a moderate constant M > 0), and $\|\cdot\|_L$ is the vector norm of function values at the sample points $\{\gamma_i\}_{i=1}^L$; see Notation below. Classical algorithms such as Cauchy's [13], Jacobi's [26], Thiele's continued fractions [44, Sect. 2.2.2] and Neville-type algorithms [44, Sect. 2.2.3] are known to be of little practical use due to instability [47, Ch. 26]. The more recent ratinterp in Chebfun is based on the SVD, and combined with a degree-reduction technique, ratinterp is reliable in many situations. However, as we shall see, the algorithm is still unstable when a sample point lies near a pole. Once the numerical degree is determined, the way our algorithm ratfun finds the poles is mathematically equivalent to ratinterp (and some other algorithms), but it overcomes this instability by avoiding the use of the FFT and employing a diagonal scaling to attenuate the effect of an excessively large sample value $|f(\gamma_i)|$.
Another practical method is a naive SVD-based interpolation algorithm (described in Sect. 2.1), and despite its simplicity and straightforward derivation, it works surprisingly well; indeed we prove stability in the above sense for obtaining r when an appropriate diagonal scaling is employed. Nonetheless, it is still based on a two-step approach, and the detour of forming the coefficients of p, q before computing the poles incurs unnecessary inaccuracy. As is well known, in rootfinding problems the choice of the polynomial basis is critical for accurate computation [47,App. 6], as Wilkinson famously illustrated in [52]. ratfun, by contrast, bypasses the coefficients and implicitly performs an appropriate change of polynomial basis.
Also worth mentioning are polefinding algorithms based on a Hankel generalized eigenvalue problem constructed by evaluating discretized contour integrals of the form $s_j := \frac{1}{2\pi i}\oint_\Gamma z^j f(z)\,dz$ [29,40]. This algorithm still has a two-step flavor (computing integrals, then solving an eigenproblem), and it was recently shown [3] to be mathematically equivalent to rational interpolation followed by polefinding, as in Chebfun's ratinterp. We shall see that this algorithm is also unstable.
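To make the Hankel approach concrete, here is a minimal sketch (our own illustration, not the algorithm of [29,40] verbatim; the name `hankel_poles` is ours). The moments $s_j$ are discretized by the trapezoid rule on the unit circle, and the poles come out as the generalized eigenvalues of a shifted/unshifted Hankel pencil; we assume f is analytic on the circle with n simple poles strictly inside it.

```python
import numpy as np
from scipy.linalg import eig

def hankel_poles(f, n, L=64):
    """Estimate n simple poles of f inside the unit circle via the Hankel
    generalized eigenvalue problem built from the contour-integral moments
    s_j = (1/(2*pi*i)) * oint z^j f(z) dz, discretized by the trapezoid rule."""
    g = np.exp(2j*np.pi*np.arange(L)/L)     # sample points on the unit circle
    fv = f(g)
    # trapezoid rule: s_j ~ (1/L) * sum_l g_l^(j+1) f(g_l)
    s = np.array([(g**(j+1)*fv).mean() for j in range(2*n)])
    H  = np.array([[s[i+j]   for j in range(n)] for i in range(n)])
    Hs = np.array([[s[i+j+1] for j in range(n)] for i in range(n)])
    return eig(Hs, H)[0]                    # generalized eigenvalues = poles

# example: two simple poles at 0.5 and -0.3 inside the unit circle
poles = np.sort_complex(hankel_poles(lambda z: 1/(z - 0.5) + 2/(z + 0.3), 2))
print(poles)                                # approximately [-0.3, 0.5]
```

The instability discussed in the text does not show up in such benign examples; it appears when sample points fall near a pole or when the moments vary wildly in magnitude.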
We shall see that ratfun is also equivalent mathematically to these two algorithms, in that our eigenproblem can be reduced to the Hankel eigenproblem by a left equivalence transformation. However, numerically they are very different, and we explain why ratfun is stable while others are not.
The contributions of this paper can be summarized as follows.
1. Polefinding (and rootfinding if necessary) by a one-step eigenvalue problem.
2. Automatic determination of a type (m, n) for the rational approximant. This allows the user to obtain p, q from the input f alone. In previous algorithms the type (m, n) has been a required input.
3. Stability analysis. We introduce a natural measure of numerical stability for rational interpolation, and establish that our algorithm ratfun is numerically stable.

Table 1 compares algorithms for polefinding and indicates the stability and complexity of each method, along with the dominant computational operation. Here, RKFIT refers to the recent algorithm by Berljafa and Güttel [8], Hankel is the algorithm based on contour integration, resulting in a generalized eigenvalue problem involving Hankel matrices (summarized in Sect. 4.8), and naive is the naive method presented in Sect. 2.1. By "avoid roots(q)" we mean that the algorithm can compute the poles without forming the polynomial q and then finding its roots.

This paper is organized as follows. In Sect. 2.1 we review some previous algorithms, which also leads naturally to our proposed algorithm. In Sect. 3 we discuss the process of finding an appropriate type for the rational approximation. Section 4 is the main part where our eigenvalue-based algorithm is derived, and we prove its numerical stability in Sect. 5. We present numerical experiments in Sect. 6.
Notation. $P_n$ is the set of polynomials of degree at most n, and $R_{m,n}$ is the set of rational functions of type at most (m, n). Unless mentioned otherwise, f is assumed to be meromorphic in a region Ω in the complex plane, and (m, n) denotes the type of the rational approximant r = p/q ≈ f that our algorithm finds: $r \in R_{m,n}$, that is, $p \in P_m$ and $q \in P_n$. When necessary, when f is rational we denote by (M, N) its exact type, that is, $f = p/q$ where p, q are coprime polynomials of degree M, N respectively. We define $c_p = [c_{p,0}, c_{p,1}, c_{p,2}, \ldots, c_{p,m}]^\top$ and $c_q = [c_{q,0}, c_{q,1}, c_{q,2}, \ldots, c_{q,n}]^\top$ to be the vectors of their coefficients, such that $p(z) = \sum_{i=0}^m c_{p,i}\phi_i(z)$ and $q(z) = \sum_{i=0}^n c_{q,i}\phi_i(z)$, where $\{\phi_i(z)\}$ is a polynomial basis, which we take to be the monomials $\phi_i(z) = z^i$ unless otherwise mentioned. When other bases are taken we state the choice explicitly. L is the number of sample points, denoted by $\{\gamma_i\}_{i=1}^L$, which we assume to be distinct points in $\Omega \subseteq \mathbb{C}$. $F = \mathrm{diag}(f(\gamma_1), \ldots, f(\gamma_L))$ is the diagonal matrix of function values at the sample points. We also let $\Gamma = \mathrm{diag}(\gamma_1, \ldots, \gamma_L)$. $\|\cdot\|_L$ denotes a norm of a function, defined via the function values at the sample points: $\|g\|_L = \sqrt{\sum_{i=1}^L |g(\gamma_i)|^2}$. Computed approximants wear a hat, so for example $\hat\xi_i$ is a computed pole. V is the Vandermonde matrix generated from the L sample points,

$$V = \begin{bmatrix} 1 & \gamma_1 & \cdots & \gamma_1^{L-1} \\ \vdots & \vdots & & \vdots \\ 1 & \gamma_L & \cdots & \gamma_L^{L-1} \end{bmatrix}. \qquad (1.1)$$

The Vandermonde matrix and its inverse play the important role of mapping between coefficient space and value space. When a non-monomial basis $\{\phi_i(z)\}$ is used, $(V)_{ij} = \phi_{j-1}(\gamma_i)$. We denote by $V_i$ the matrix of the first i columns of V. u denotes the unit roundoff, $u \approx 10^{-16}$ in IEEE double precision arithmetic. We write $x = \Theta(y)$ to mean x = O(y) and y = O(x).

Existing methods for rational interpolation and least-squares fitting
Rational interpolation is a classical problem in numerical analysis and many algorithms have been proposed, such as those by Jacobi [26]. We first clarify what is meant by rational interpolation and least-squares fitting.
Rational interpolation. With sample points $\gamma_i$ for i = 1, ..., L and L = m + n + 1, the goal of rational interpolation is to find polynomials $p \in P_m$, $q \in P_n$ satisfying the set of L equations

$$\frac{p(\gamma_i)}{q(\gamma_i)} = f(\gamma_i), \quad i = 1, \ldots, L. \qquad (2.1)$$

However, as is well known [12, Ch. 5], [47, Ch. 26], (2.1) does not always have a solution $p \in P_m$, $q \in P_n$. To avoid difficulties associated with nonexistence, a numerical algorithm often starts with the linearized equation

$$f(\gamma_i)q(\gamma_i) = p(\gamma_i), \quad i = 1, \ldots, L, \qquad (2.2)$$

which always has solution(s), all of which correspond to the same rational function p/q. Most methods discussed in this paper work with (2.2).
Rational least-squares fitting. When we sample f at more than m + n + 1 points, (2.2) has more equations than unknowns, and a natural approach is to find p, q such that the residual is minimized:

$$\min_{p \in P_m,\, q \in P_n} \sum_{i=1}^L \big|f(\gamma_i)q(\gamma_i) - p(\gamma_i)\big|^2. \qquad (2.3)$$

This leads to a least-squares problem. Least-squares fitting is used throughout scientific computing, and it often leads to more robust algorithms than interpolation. For example, when function values contain random errors, polynomial least-squares fitting has the benefit of reducing the variance in the outcome as compared with interpolation [14, Sect. 4.5.5].
One main message of this paper is that the precise formulation of the least-squares problem (2.3) is crucial for numerical stability. For example, the minimizers of the plain residual $\|fq - p\|_L$ and of a diagonally scaled residual $\|D(fq - p)\|_L$ (under the same normalization) are clearly different. As we describe below, our method works with the scaled residual, taking

$$D = \mathrm{diag}(d_1, \ldots, d_L), \qquad d_i = \frac{1}{\sqrt{1 + |f(\gamma_i)|^2}}. \qquad (2.4)$$

This choice is crucial for establishing numerical stability.

Naive method
Perhaps the most straightforward, "naive" method for rational interpolation is to find the coefficients of $p(z) = \sum_{i=0}^m c_{p,i}z^i$ and $q(z) = \sum_{i=0}^n c_{q,i}z^i$ by writing out (2.2) as a matrix equation

$$F V_{n+1} c_q = V_{m+1} c_p, \qquad (2.5)$$

where $F = \mathrm{diag}(f(\gamma_1), \ldots, f(\gamma_L))$ for L = m + n + 1 and $V_{m+1}$, $V_{n+1}$ are the first (m + 1) and (n + 1) columns of V, the Vandermonde matrix of size L as in (1.1). To obtain (2.5), note that the (partial) Vandermonde matrices map the coefficients $c_p$, $c_q$ to value space, in which "multiplication by $f(\gamma_i)$" corresponds simply to "multiplication by F". Equation (2.5) is thus a matrix formulation of rational interpolation (2.2) in value space. Solving (2.5) for $c_p$, $c_q$ amounts to finding a null vector of the $L \times (L+1)$ matrix $C := [F V_{n+1}\;\; V_{m+1}]$:

$$C \begin{bmatrix} c_q \\ -c_p \end{bmatrix} = 0. \qquad (2.6)$$

Sometimes the matrix C has a null space of dimension larger than 1; in this case all the null vectors of C give the same rational function p/q [12, Sect. V.3.A].
To find the poles of f once $c_q$ is obtained, we find the roots of the polynomial $q = \sum_{i=0}^n c_{q,i}z^i$, for example by the companion linearization [22]. When a non-monomial polynomial basis $\{\phi_i(z)\}$ is chosen, other linearizations such as comrade and confederate matrices are available [5,22].
The above process (2.6) can be easily extended to the oversampled case, in which L > m + n + 1 and the matrix C above is of size $L \times (m+n+2)$. In this case the matrix in (2.6) has at least as many rows as columns, and does not necessarily have a null vector. Then the task is to perform a least-squares fitting, which we do by finding the right singular vector corresponding to the smallest singular value of the matrix C, which for later use we state as an optimization problem:

$$\min_{\|c_p\|_2^2 + \|c_q\|_2^2 = 1} \left\| [F V_{n+1}\;\; V_{m+1}] \begin{bmatrix} c_q \\ -c_p \end{bmatrix} \right\|_2. \qquad (2.7)$$

Here the normalization $\|c_p\|_2^2 + \|c_q\|_2^2 = 1$ is imposed to rule out the trivial solution $c_p = 0$, $c_q = 0$.
We shall consider a scaled formulation of (2.7), which left-multiplies the matrix in the objective function by a suitably chosen diagonal matrix D:

$$\min_{\|c_p\|_2^2 + \|c_q\|_2^2 = 1} \left\| D[F V_{n+1}\;\; V_{m+1}] \begin{bmatrix} c_q \\ -c_p \end{bmatrix} \right\|_2. \qquad (2.8)$$

Note that (2.7) and (2.8) have the same solution when the optimal objective value is zero, but otherwise they differ, and in the oversampled case L > m + n + 1 this is usually the case. Numerically, they are vastly different even when L = m + n + 1.
The dominant cost is in the SVD (more precisely, computing the right singular vector corresponding to the smallest singular value) of $[FV_{n+1}\;\; V_{m+1}]$ or its scaled counterpart $D[FV_{n+1}\;\; V_{m+1}]$. The naive method (2.5) is mentioned for example in [10], but seems to be rarely used in practice, and we are unaware of previous work that explicitly investigates the least-squares formulation (2.7) or its scaled variant (2.8). Nonetheless, in Sect. 5 we shall show that the scaled formulation (2.8) is numerically stable for rational interpolation (i.e., computing p, q) for a suitable choice of D. In this paper we refer to (2.8) as the scaled naive method (or just the naive method).
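As a concrete illustration, the scaled naive method takes only a few lines of numerical linear algebra. The following is our own simplified sketch (the function name `naive_ratfit` is ours, and the scaling uses the row-equalizing D of (2.4)), not the authors' implementation:

```python
import numpy as np

def naive_ratfit(f_vals, pts, m, n):
    """Scaled 'naive' rational fit: the smallest right singular vector of
    D[F V_{n+1}, V_{m+1}] yields [c_q; -c_p] in the monomial basis."""
    L = len(pts)
    V = np.vander(pts, L, increasing=True)       # Vandermonde, columns 1, z, z^2, ...
    d = 1/np.sqrt(1 + np.abs(f_vals)**2)         # row scaling D
    C = np.column_stack([(d*f_vals)[:, None]*V[:, :n+1],   # D F V_{n+1}
                         d[:, None]*V[:, :m+1]])           # D V_{m+1}
    x = np.linalg.svd(C)[2][-1].conj()           # right sing. vec., smallest sigma
    return -x[n+1:], x[:n+1]                     # c_p, c_q

pts = np.exp(2j*np.pi*np.arange(8)/8)            # 8th roots of unity (oversampled)
f = lambda z: (z + 2)/(z - 0.5)                  # exact type (1, 1), pole at 0.5
cp, cq = naive_ratfit(f(pts), pts, 1, 1)
print(np.roots(cq[::-1]))                        # pole: approximately 0.5
```

The `np.roots(cq[::-1])` step is the "detour" criticized in the text: the coefficients of q are formed explicitly in the monomial basis before rootfinding.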
Another method that relies on finding a null vector of a matrix is described in [41], whose matrix elements are defined via the divided differences. Analyzing stability for this method appears to be complicated and is an open problem.

Chebfun's ratinterp
Chebfun [18] is a MATLAB package for working with functions based primarily on polynomial interpolation, but it also provides basic routines for rational functions. In particular, the ratinterp command runs a rational interpolation or least-squares fitting algorithm for the linearized equation (2.3), as outlined below.
We start again with the matrix equation in the naive method (2.6), which we rewrite as $FV_{n+1}c_q = V_{m+1}c_p$. Expanding the matrices $V_{m+1}$, $V_{n+1}$ to form the full Vandermonde matrix V (and padding $c_q$, $c_p$ with zeros to length L, denoted $c_q^0$, $c_p^0$), the equation becomes

$$FVc_q^0 = Vc_p^0. \qquad (2.9)$$

Now when the sample points are the roots of unity $\gamma_j = \exp(\frac{2\pi i j}{L})$ for j = 1, ..., L, and the monomial basis $\{z^i\}_{i=0}^{L-1}$ is used, we can use the FFT to efficiently multiply by V or $V^{-1} = \frac{1}{L}V^*$ ($V^* = \bar V^\top$ denotes the Hermitian conjugate). Left-multiplying (2.9) by $V^{-1}$ gives

$$\frac{1}{L}V^*FVc_q^0 = c_p^0. \qquad (2.10)$$

The multiplication by $V^{-1}$ brings the equation back to coefficient space, and so unlike the naive method (2.5), given in value space, (2.10) is a formulation of rational interpolation in coefficient space. Note that the matrix $\frac{1}{L}V^*FV$ can be formed in $O(L^2\log L)$ operations using the FFT. An analogous result holds for Chebyshev points $\gamma_j = \cos(\frac{\pi(j-1)}{L-1})$ using the Chebyshev polynomial basis [6], [46, Ch. 8]. By (2.10), $c_q$ is a null vector of the bottom-left $(L-m-1)\times(n+1)$ block of $V^*FV$, which has one more column than rows in the interpolation case L = m + n + 1. Then the task is to find $c_q$ such that

$$\widetilde V^* F V_{n+1} c_q = 0, \qquad (2.11)$$

where $\widetilde V$ denotes the last L − m − 1 columns of V (as before, $V_{n+1}$ is the first n + 1 columns). Again, in the oversampled case a least-squares fitting can be done by finding the smallest singular value and its right singular vector of the matrix $\widetilde V^* F V_{n+1}$. As in the naive method, ratinterp finds the poles by finding the roots of q via the eigenvalues of the companion (when sampled at roots of unity) or colleague (Chebyshev points) matrix.
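The coefficient-space formulation can be sketched as follows (our own simplified illustration of the idea, not Chebfun's ratinterp code; `ratinterp_cq` is a hypothetical name). It exploits the fact that at roots of unity the entries of $\frac{1}{L}V^*FV$ depend only on the difference of their indices, so the whole matrix is determined by one FFT of the sample values:

```python
import numpy as np

def ratinterp_cq(f_vals, m, n):
    """Coefficient-space rational interpolation at roots of unity: the entries
    of (1/L) V* F V are the FFT coefficients of the sample values, and c_q
    spans the null space of its bottom-left (L-m-1) x (n+1) block."""
    L = len(f_vals)
    fhat = np.fft.fft(f_vals)/L          # fhat[r] = (1/L) sum_l f(g_l) g_l^{-r}
    # block entries a_{jk} = fhat[(j-k) mod L], rows j = m+1..L-1, cols k = 0..n
    B = np.array([[fhat[(j - k) % L] for k in range(n+1)]
                  for j in range(m+1, L)])
    return np.linalg.svd(B)[2][-1].conj()    # c_q (monomial coefficients)

L = 8
pts = np.exp(2j*np.pi*np.arange(L)/L)
f = lambda z: (z + 2)/(z - 0.5)
cq = ratinterp_cq(f(pts), 1, 1)
print(np.roots(cq[::-1]))                # pole: approximately 0.5
```

The FFT is what makes this formulation fast; as the text explains, it is also implicated in the instability when some $|f(\gamma_i)|$ is huge, since the FFT mixes all sample values into every coefficient.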

RKFIT
The recent work by Berljafa and Güttel [7,8] introduces RKFIT, a toolbox for working with matrices and rational functions based on rational Krylov decompositions. Given matrices $F, A \in \mathbb{C}^{L\times L}$ and a vector $b \in \mathbb{C}^L$, RKFIT is designed to find a rational function $r = p/q \in R_{m,n}$ minimizing the weighted misfit

$$\|\widetilde D (Fb - r(A)b)\|_2, \qquad (2.12)$$

where $\widetilde D \in \mathbb{C}^{L\times L}$ is a weight matrix, which the user can specify. The objective function in (2.12) is called the absolute misfit in [8]. In the special case where $F = \mathrm{diag}(f(\gamma_i))$, $A = \mathrm{diag}(\gamma_i)$, $b = [1, \ldots, 1]^\top$ and $\widetilde D = \mathrm{diag}(\tilde d_i)$, RKFIT seeks to solve the optimization problem

$$\min_{p\in P_m,\, q\in P_n} \sqrt{\sum_{i=1}^L |\tilde d_i|^2\, \left|f(\gamma_i) - \frac{p(\gamma_i)}{q(\gamma_i)}\right|^2}. \qquad (2.13)$$

RKFIT solves (2.13) by an iterative process: starting with an initial guess for the poles (e.g. ∞) that determines a temporary $q = q_0$, it forms a rational Krylov decomposition and solves (2.13) over $p \in P_m$ via computing an SVD. Using the obtained solution, RKFIT then updates the pole estimates and $q_0$, and repeats the process until convergence is achieved. See [8] for details, which shows that RKFIT can deal with more general problems, for example with multiple vectors b and matrices F. Note that (2.13) has the flavor of dealing with the original rational approximation problem (2.1) rather than the linearized version (2.2). We observe, nonetheless, that (2.13) becomes very close to (2.8) (the same except for the normalization) if we take $\widetilde D = D\,\mathrm{diag}(q(\gamma_i))$. As we discuss in Sect. 5, the choice of D (and hence $\widetilde D$) is crucial for numerical stability. Indeed, RKFIT is not stable with the default parameters when used for scalar rational approximation, but the user can input an appropriate $\widetilde D$ (which depends on f) to achieve stability.
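For intuition, the flavor of such pole-updating iterations in the scalar case can be conveyed by a Sanathanan-Koerner-style reweighting: repeatedly solve the linearized problem (2.2) with weights $1/|q_k(\gamma_i)|$ from the previous denominator. This is only an illustrative caricature we supply here, not RKFIT itself, which works with rational Krylov decompositions; `sk_iteration` is our own name.

```python
import numpy as np

def sk_iteration(fv, pts, m, n, iters=10):
    """Sanathanan-Koerner-style reweighting for min ||f - p/q||: solve the
    linearized least-squares problem weighted by 1/|q_k|, then update q_k.
    A scalar caricature of pole-updating iterations, not RKFIT proper."""
    L = len(pts)
    V = np.vander(pts, L, increasing=True)
    qv = np.ones(L)                        # q_0 = 1, i.e. all poles at infinity
    for _ in range(iters):
        w = 1/np.abs(qv)                   # reweight by previous denominator
        C = np.column_stack([(w*fv)[:, None]*V[:, :n+1],
                             -w[:, None]*V[:, :m+1]])
        x = np.linalg.svd(C)[2][-1].conj() # minimizes ||w (f q - p)||
        cq, cp = x[:n+1], x[n+1:]
        qv = V[:, :n+1] @ cq               # denominator values for next sweep
    return cp, cq

pts = np.exp(2j*np.pi*np.arange(24)/24)
f = lambda z: np.exp(z)/(z - 0.5)          # one pole at 0.5 in the unit disk
cp, cq = sk_iteration(f(pts), pts, 10, 1)
print(np.roots(cq[::-1]))                  # pole estimate near 0.5
```

At a fixed point with small residual, the weighted objective approximates the nonlinear misfit (2.13), which is exactly the connection $\widetilde D = D\,\mathrm{diag}(q(\gamma_i))$ noted above.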

Automatic type determination via oversampling
A significant feature of Chebfun's polynomial approximation process for a continuous function f is that the numerical degree can be obtained automatically by oversampling. This allows the user to obtain the polynomial approximant by taking just the function f as input, without prior information on the (numerical) degree of f .
Specifically, when the user inputs chebfun(f) for a function handle f, an adaptive process is executed to find the appropriate degree: Chebfun first samples f at $2^s + 1$ Chebyshev points $\{\cos\frac{j\pi}{2^s}\}_{j=0}^{2^s}$ for a modest integer s, examines the leading Chebyshev coefficients of the interpolant, and if they have not decayed sufficiently, increments s by 1 to sample at twice as many points, repeating until the leading Chebyshev coefficients decay to O(u). For details see [2]. We emphasize the important role that oversampling plays in determining the degree; the coefficient decay is observed only after f is sampled at more points than necessary to obtain the polynomial interpolant.
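The doubling strategy can be sketched as follows. This is a simplified caricature of Chebfun's constructor with our own function names and a crude stopping rule (Chebfun's actual chopping criterion [2] is considerably more refined):

```python
import numpy as np
from numpy.polynomial.chebyshev import chebval

def chebcoeffs(v):
    """Chebyshev coefficients of the interpolant through the values v at the
    Chebyshev points cos(j*pi/N), j = 0..N, via an FFT of the even extension."""
    N = len(v) - 1
    c = np.real(np.fft.fft(np.concatenate([v, v[-2:0:-1]])))/N
    c[0] /= 2; c[N] /= 2
    return c[:N+1]

def adaptive_chebfit(f, tol=1e-13, smax=16):
    """Chebfun-style adaptation: double the grid until the trailing
    Chebyshev coefficients fall below tol relative to the largest one."""
    for s in range(3, smax):
        N = 2**s
        c = chebcoeffs(f(np.cos(np.pi*np.arange(N+1)/N)))
        if np.max(np.abs(c[-2:])) < tol*np.max(np.abs(c)):
            return c
    raise RuntimeError("coefficients did not decay")

c = adaptive_chebfit(np.exp)
print(len(c) - 1)        # degree of the accepted interpolant for exp on [-1,1]
```

For the exponential this settles at the first grid whose trailing coefficients hit the roundoff plateau; the degree is found without any prior input from the user, which is the behavior the rational-type determination below seeks to emulate.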
For rational interpolation or approximation, we argue that it is possible to determine an appropriate type for a rational approximant by oversampling, just as in the polynomial case, although the process is based not on inspecting coefficients but on the SVD of a certain matrix. Related studies exist: Antoulas and Anderson [1] find a minimum-degree interpolant in the barycentric representation by examining a so-called Löwner matrix, given a set of sample points. Similar techniques have been used in Chebfun's ratinterp [21] and padeapprox [20], and in RKFIT [8], but these are designed for removing spurious root-pole pairs rather than for finding the type of a rational approximant.

Type determination by oversampling and examining singular values
Suppose that we sample a rational function f = p/q at sufficiently many points $\{\gamma_j\}_{j=1}^L$, so that L/2 is larger than both deg(p) and deg(q). We initially take $m = n = \lfloor (L-1)/2 \rfloor$ as tentative upper bounds for the degrees. Then, as in the naive method (2.6), we compute the null space of C (which is square or tall, corresponding to the oversampled case). In "Appendix B" we examine the rank of the matrix C as the integers m, n, L vary, which shows that, assuming L is taken large enough that $L \ge \max\{M + n, m + N\} + 1$ (recalling that (M, N) is the exact type of f):

1. If m < M or n < N, then dim null(C) = 0. (3.1)
2. If m ≥ M and n ≥ N, then dim null(C) = min(m − M, n − N) + 1. (3.2)

Equation (3.2) shows how to reduce n so that there is no redundancy: n := n − dim null(C) + 1 should give us the correct n (= N) provided that m was set large enough. Even if m − M < n − N, if $\ell > 1$ singular values of C are negligible then we reduce n by $\ell - 1$ and repeat the process, which will eventually give us the correct n = N provided that m > M. Once n = N is determined, we can find m as the smallest integer such that the $L \times (N + 1 + m + 1)$ matrix C has a null vector. m can be obtained also by looking at the leading coefficients of the computed $c_p$, but we have found this SVD-based approach to be more reliable. We emphasize the important role played by oversampling, which is necessary for (3.1) and (3.2) to hold.
The above process would find the exact type in exact arithmetic if f is rational. In practice f may not be rational, and we compute dim null(C) numerically as the number of singular values smaller than a tolerance tol = O(u), to find a "numerical type" of f, that is, the type (m, n) of a rational function $r \in R_{m,n}$ such that r ≈ f in Ω. It is worth noting that "numerical type" is an ambiguous notion: for example, (1) $r_1 \in R_{20,5}$ and $r_2 \in R_{5,20}$ may be equally good as approximants to f in the domain Ω, and (2) if f is analytic in Ω, polynomials would suffice if the degree is taken large enough, but rational functions give much better approximants if singularities lie near Ω; see [3, Sect. 6]. Point (1) suggests that the "smallest" (m, n) is not uniquely defined without further restriction. A natural approach is to find an approximant with the smallest possible n (since we do not want unnecessary poles), but (2) suggests that this may lead to an approximant p/q of excessively high deg(p).
Given f, we attempt to find a rational approximant with as few poles as possible, within a controlled amount of computational effort. Specifically, our Algorithm 3.1 below finds a rational approximant p/q of type (m, n) with the following properties:

1. There exists $p/q \in R_{m,n}$ such that $\|fq - p\|_L \le \mathrm{tol}$ and $\|[c_p^\top, c_q^\top]^\top\|_2 = 1$, and
2. No rational function $\tilde r = \tilde p/\tilde q \in R_{\tilde m,\tilde n}$ with $\tilde n < n$ and $\tilde m \le (L-1)/2\ (\in [\max(m,n), 2\max(m,n)])$ satisfies $\|f\tilde q - \tilde p\|_L \le \mathrm{tol}$ with $\|[c_{\tilde p}^\top, c_{\tilde q}^\top]^\top\|_2 = 1$.
In other words, no rational function with lower denominator degree is a good approximant unless the numerator degree is more than doubled. In what follows, we set tol = $10^{-14}$ unless otherwise mentioned.

Numerically, in practice, we shall show in Sect. 4.2 that it is important that a preprocessing step is carried out before examining the singular values of C. Specifically, we first scale f as $f \leftarrow f/\mathrm{median}_L |f(\gamma)|$, so that the median of the scaled f is 1 in absolute value, and left-multiply by a diagonal matrix so that each row of C has roughly the same norm:

$$D = \mathrm{diag}(d_1, \ldots, d_L), \qquad d_i = \frac{1}{\sqrt{1 + |f(\gamma_i)|^2}}. \qquad (3.3)$$

This choice of D is the same as the one we use in the scaled naive method (2.8) for stability, to be justified in Sect. 5. Diagonal scaling has the effect of reducing the condition number (when ill-conditioning is caused by the entries having widely varying magnitudes rather than the rows being nearly linearly dependent), and a simple scaling that scales the rows to have identical norms is known to be nearly optimal [16,50]; the scaling in (3.3) achieves this approximately. For further stability, we orthogonalize the two block columns by "thin" QR factorizations $DFV_{n+1} = Q_{n+1}R_{n+1}$, $DV_{m+1} = Q_{m+1}R_{m+1}$, giving

$$\widetilde C = [Q_{n+1} \quad Q_{m+1}], \qquad (3.4)$$

and determine the rational function type from the singular values of $\widetilde C$. Note that (3.2) continues to hold with C replaced by $\widetilde C$ in exact arithmetic. Summarizing, Algorithm 3.1 is the pseudocode for our type determination algorithm. In Algorithm 3.1 we increase the number of sample points by a factor of 2 until $\widetilde C$ has a nontrivial null vector. Doubling the points allows us to reuse the previously sampled values $f(\gamma_i)$ when the $\gamma_i$ are roots of unity; for the same reason, when sampling at Chebyshev points on [−1, 1] (this variant replaces the Lth roots of unity in step 2 by L Chebyshev points), we sample at $2^s + 1$ points as in Chebfun.
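The core of the type-determination loop can be sketched as follows. This is our own simplified illustration of the idea (only the n-reduction loop; the final minimization of m and the doubling of L are omitted, so the returned m is merely an upper bound, and the singular-value tolerance is looser than the tol above for robustness of the illustration):

```python
import numpy as np

def typefind(f, L=32, tol=1e-12):
    """Sketch of type determination: oversample at L roots of unity, form the
    scaled-and-orthogonalized matrix [Q_{n+1} Q_{m+1}], and shrink n by
    (dim null - 1) until the numerical null space is one-dimensional."""
    pts = np.exp(2j*np.pi*np.arange(L)/L)        # roots of unity
    fv = f(pts)
    fv = fv/np.median(np.abs(fv))                # scale so median |f| is 1
    d = 1/np.sqrt(1 + np.abs(fv)**2)             # row scaling D, cf. (3.3)
    V = np.vander(pts, L, increasing=True)
    m = (L - 1)//2
    n = L - m - 3                                # keep the matrix tall
    while n >= 0:
        Qf = np.linalg.qr((d*fv)[:, None]*V[:, :n+1])[0]   # QR of D F V_{n+1}
        Qp = np.linalg.qr(d[:, None]*V[:, :m+1])[0]        # QR of D V_{m+1}
        sv = np.linalg.svd(np.column_stack([Qf, Qp]), compute_uv=False)
        k = int(np.sum(sv < tol))                # numerical null-space dimension
        if k <= 1:
            return (m, n) if k == 1 else None
        n -= k - 1                               # reduce n, cf. (3.2)
    return None

f = lambda z: (z + 2)/((z - 0.5)*(z + 0.3j))     # exact type (1, 2)
print(typefind(f))                               # finds n = 2 (m is an upper bound)
```

One pass through the loop already exposes the redundancy: with the initial oversized (m, n) the null-space dimension reveals how much n can shrink, exactly as (3.2) predicts.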
We note that (3.1) and (3.2) assume that sufficiently many sample points are taken so that $L \ge \max\{M + n, m + N\} + 1$. If this does not hold, it is possible that dim null(C) > 0 although m < M or n < N, causing Algorithm 3.1 to wrongly conclude that f is of lower type. Fortunately, even if $L < \max\{M + n, m + N\} + 1$, it is unlikely that dim null(C) > 0, as this requires that fq ≈ p at L points, where p, q together have fewer than L degrees of freedom. Similarly, a tall rectangular matrix is unlikely to have nontrivial null vectors: a random rectangular matrix is full rank with probability one, and well conditioned with high probability if the aspect ratio is safely above 1 [39]. The default value n = L − m − 3 was chosen to ensure that $\widetilde C$ is always tall rectangular.

(Footnote 1: These QR factorizations can be computed exploiting the Vandermonde-like structure of $DFV_{n+1}$, $DV_{m+1}$. Namely, when the basis is degree-graded, i.e., $\deg\phi_i(z) = i$, the column space of $DV_{m+1}$ is equal to the Krylov subspace $\mathcal{K}_{m+1}(\Gamma, d)$ with $d = D[1, \ldots, 1]^\top$.)
The description of Step 3(b) is not necessarily the most efficient: we can instead take $m \leftarrow m - \ell$ for some $\ell$ (> dim null(C) − 1), if this results in dim null(C) > 0. In step 4, we can use bisection to determine the smallest integer m. The worst-case cost is thus computing $O(\log_2 m)$ SVDs.
When the evaluation of f incurs nonnegligible (relative) error, tol should be adjusted accordingly. The output sigma indicates the error bound; a successful degree detection implies sigma < tol.
Mathematically in exact arithmetic, the matrix C orC having null space of dimension greater than 1 indicates the presence of a spurious root-pole pair, and in fact the coefficients of p, q obtained from any null vector of C are known to result in the same rational function p/q. In finite precision arithmetic, however, this property gets lost and a numerical null vector gives a function p/q that may be far from the function f . Furthermore, the accuracy of a computed singular vector is known to be inversely proportional to the gap between the corresponding singular value and the rest [43,Ch. 5]. Besides making the solution unique, finding the smallest possible degree has the additional benefit of widening the distance between the smallest and the second smallest singular values.

Experiments with oversampling for degree determination
Here we illustrate typefind through some numerical experiments. For illustration purposes, instead of doubling $L = 2^3, 2^4, 2^5, \ldots$ as in Algorithm 3.1, we formed C for each integer L = 2, 3, ... with $\gamma_j = \exp(\frac{2\pi i j}{L})$, and examined the resulting output type without doubling L. For convenience, below we refer to this process as typefind(f,tol,L), where the number of sample points L is an input.

When f is a rational function. We first examine the simplest case where f is a rational function. Figure 1 shows the types obtained by typefind(f,tol,L) as we increase the number of sample points L.
Observe that with 13 sample points or more, the algorithm correctly finds the type (4, 5) of the rational function f . With five sample points, however, typefind(f,tol,L) erroneously concludes that the function is of lower type; this is an artifact of the symmetry of the function (which disappears e.g. by changing the location of one pole), and illustrates the importance of oversampling. We will come back to this issue in Fig. 5.
Algorithm 3.1 samples at $2^4 = 16$ points to determine the type of the rational approximant. Although 16 is larger than the smallest number M + N + 1 = 10 of sample points needed to theoretically obtain the rational interpolant p/q if the degree were known, we believe this is a small price to pay for an automated degree-finding algorithm.

When f is a meromorphic function. The situation becomes more complicated when f is not rational but merely meromorphic; we take an example again with N = 5. Here f can be regarded as being of numerical type $(\widetilde M, 5)$ where $\widetilde M \approx 20$, because the exponential function can be resolved to O(u) accuracy by a polynomial of degree ≈ 15 in the unit disk. Moreover, we expect that by increasing the denominator degree one can reduce the numerator degree for the same approximation quality, so we could also approximate f in the unit disk by a type $(20 - \delta_M, 5 + \delta_N)$ rational function where $\delta_M, \delta_N$ are modest integers such as 1, 2. Figure 2 shows the numerical degrees obtained by typefind(f,tol,L), which confirms this observation. Algorithm 3.1 (i.e., typefind(f,tol)) outputs the type (m, n) = (14, 9) by sampling at $2^5 = 32$ points. Our polefinder ratfun (described in Sect. 4) computes nine poles, five of which approximate the correct poles $\xi_i$ to within $10^{-14}$ and four of which have absolute value larger than 10. The same is true of all the types found by typefind(f,tol,L) for L ≥ 25; this example suggests they are all appropriate types, illustrating the nonunique nature of the numerical type.

When f is an analytic function with poles near Ω. Finally, we consider a function (3.7) which is analytic in the unit disk Ω, so that a polynomial p exists such that $\|f - p\|_\Omega < \epsilon$ for any $\epsilon > 0$. However, as described in [3, Sect. 6], rational functions do a much better job of approximating analytic functions with a singularity lying near Ω, and (3.7) is such an example.
Indeed, to achieve $\|f - p\|_\Omega < \epsilon$ with a polynomial p we need deg(p) ≥ 386, whereas with rational functions $\|f - p/q\|_\Omega < \epsilon$ is achieved for some $p/q \in R_{16,1}$. Figure 3 shows the types obtained by typefind(f,tol,L), which outputs the type (16, 1) for L > 36. The output would become (deg(p), 0) for deg(p) ≈ 400 if we took L ≥ 800, but typefind(f,tol) stops doubling the sample points once dim null(C) ≥ 1, here with L = 32, giving type (13, 3). Again, the two extra poles are far outside the unit disk. See Fig. 7 for a function with exact poles far from the unit disk, along with other experiments in Sect. 6.

Interpretations as optimization problems
We have emphasized the role of the diagonal scaling D in the discussion in Sect. 3.1.
Here we reassess its significance from the viewpoint of optimization problems. Let us consider the meaning of the smallest singular value σ_{m+n+2}(C) of C in (2.6), allowing for the oversampled case. As discussed in (2.7), it has the characterization (3.8), and the solution [c_q; −c_p] is obtained by the corresponding right singular vector. By the definition of C, its smallest singular value σ_{m+n+2} is equal to the optimal value of the optimization problem (3.9). Note that the constraint in (3.9) changes depending on the choice of the polynomial basis, and the norm ‖·‖_2 in the constraint differs from the norm ‖·‖_L in the objective function.
Recall that for stability, instead of C, we work with the scaled-orthogonalized matrix C̃ in (3.4). We claim that the smallest singular value σ_{m+n+2}(C̃) of C̃ is equal to the optimal value of the optimization problem (3.10). To verify the claim, we express σ_{m+n+2}(C̃) as in (3.11), where the last equality comes from the orthonormality of the columns of Q_{n+1} and Q_{m+1}. From the definitions of Q_{n+1} and Q_{m+1}, it follows that σ_{m+n+2}(C̃) in (3.11) is equal to the optimal value of the problem (3.10). We note that, if the optimal value σ_{m+n+2}(C̃) is sufficiently small, then the optimal solutions p and q come normalized in the scaled norm. A similar (and easier) argument applies to the scaled (but not orthogonalized) matrix, giving (3.12). The optimization problems (3.9), (3.10) and (3.12) differ in the following respects: the objective functions in (3.10) and (3.12) involve the diagonal scaling, which is crucial for numerical stability as we show in Sect. 5. The independence of C̃ from the polynomial basis is due to the QR factorization, which "automatically" chooses polynomial bases {φ_{p,i}}_{i=0}^{L−1} and {φ_{q,i}}_{i=0}^{L−1} for p and q respectively, for which discrete orthonormality is achieved. Note that the two bases for p and q are different, and they depend on the function f and the sample points {γ_i}_{i=1}^L. Working with orthonormal matrices has numerical benefits, as we shall illustrate in Sect. 5.3. Together with the fact that the objective function and constraint are defined with respect to the same norm ‖·‖_L, this "scaled and QR'd" approach results in a natural and numerically stable interpolation. For these reasons, we argue that (3.10) is a natural way to formulate our problem.
Note, however, that the scaled naive method (2.8) works with (3.12), not (3.10). No QR factorization is performed in (2.8), because with it the null vector of C̃ would no longer give the coefficients c_p, c_q as in (2.8). Although we could retrieve c_p, c_q by applying the inverse transformation with respect to the R factors of the QR factorizations, this leads to numerical instability when FV_{n+1}, V_{m+1} are ill-conditioned.
In the next section we shall overcome this difficulty by formulating an algorithm that directly computes the poles, bypassing the coefficient vector c q . The resulting algorithm ratfun essentially works with (3.10), but is immune to the difficulty associated with the change of polynomial basis.

Polefinding via a generalized eigenvalue problem
We now describe our eigenvalue-based algorithm for finding the poles of f. Here we take m, n as given, either obtained by Algorithm 3.1 or supplied as inputs.

Formulating polefinding as an eigenproblem
We consider finding the poles of f(z) = p(z)/q(z) ∈ R_{m,n}, i.e., the roots of q(z). Denote the desired poles by ξ_i for i = 1, …, n.
As before we start with the linearized interpolation equation (2.2). Here we consider interpolation where L = m + n + 1; we treat the oversampled case later in Sect. 4.3. The key idea is to make a pole ξ_k, the sought quantity, appear explicitly in the equation to be solved. To this end we rewrite q(z) using q̃(z) := q(z)/(z − ξ_k), which is also a polynomial, as in (4.1). We can then express (2.2) as (4.2), which is the crucial guiding equation for our algorithm. The equations (4.2) can be written as a matrix equation (4.3) using the Vandermonde matrix, where Γ = diag(γ_i), c_q̃ is the vector of coefficients of the polynomial q̃, and, as before, F = diag(f(γ_i)) and V_i is the first i columns of the Vandermonde matrix. Just as in the naive method (2.5), we obtain (4.3) by mapping into value space using the Vandermonde matrix, then noting that in value space, "f(γ_i)-multiplication" is "F-multiplication" and "γ_i-multiplication" is "Γ-multiplication". Thus (4.3) formulates rational interpolation again in value space, but now with ξ_k appearing explicitly. Of course, ξ_k is unknown in (4.3); setting it as an unknown λ we arrive at the generalized eigenvalue problem (4.4), where A_1 = FΓV_n, A_2 = V_{m+1}, B_1 = FV_n, and O is the zero matrix of size L × (m + 1). Since the matrix [B_1 O] clearly has a null space of dimension m + 1, the eigenproblem (4.4) has m + 1 eigenvalues at infinity. By construction, we expect the finite eigenvalues to contain information about the poles. The next result shows that the finite eigenvalues of (4.4) are indeed the poles of f.
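To make the construction concrete, here is a sketch in Python of the square pencil (4.4) for a hypothetical type-(1,2) function of our choosing. It forms A_1 = FΓV_n, A_2 = V_{m+1}, B_1 = FV_n from the sampled values and extracts the finite eigenvalues; the sign placed on A_2 matches the convention fq − p = 0 and does not affect the eigenvalues.

```python
import numpy as np
from scipy.linalg import eig

# Illustrative f of type (m, n) = (1, 2) with poles at +-0.5.
m, n = 1, 2
f = lambda z: (z - 3) / ((z - 0.5) * (z + 0.5))

L = m + n + 1                                   # square case
gam = np.exp(2j * np.pi * np.arange(L) / L)     # sample at roots of unity
fv = f(gam)
Vn  = np.vander(gam, n, increasing=True)        # first n Vandermonde columns
Vm1 = np.vander(gam, m + 1, increasing=True)    # first m+1 columns

A1 = (fv * gam)[:, None] * Vn                   # F Gamma V_n
B1 = fv[:, None] * Vn                           # F V_n
A  = np.hstack([A1, -Vm1])                      # [A1, -A2]
B  = np.hstack([B1, np.zeros((L, m + 1))])      # [B1, O]

lam = eig(A, B, right=False)
# The n finite eigenvalues are the poles; m+1 eigenvalues lie at infinity.
poles_est = np.sort_complex(lam[np.argsort(np.abs(lam))[:n]])
```

Sorting by magnitude is a crude but convenient way to separate the n finite eigenvalues from the infinite ones in this well-conditioned toy example.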
Proposition 1 Suppose f = p/q with p(z) = a ∏_{j=1}^{d}(z − η_j) for some d ≤ m, where a ≠ 0 and η_j does not coincide with any element of {ξ_i}_{i=1}^n. Then the pencil in (4.4) is singular if and only if λ is one of the roots ξ_1, …, ξ_n of q. We can easily confirm the "if" part as follows. Let λ = ξ_k for a fixed integer k.
Then the pencil evaluated at λ = ξ_k has a nontrivial kernel, and hence is singular. Next, for the "only if" part, suppose (4.4) holds for a nonzero x = [x_1^T, x_2^T]^T. It suffices to show that λ is one of the roots ξ_1, …, ξ_n of q. Define polynomials r_{x_1}(z) and r_{x_2}(z) with coefficient vectors x_1 and x_2; then (4.5) holds for z = γ_1, …, γ_{m+n+1}. Since the left-hand side of (4.5) is a polynomial of degree at most m + n and takes the value 0 at m + n + 1 distinct points, it must be the zero polynomial, i.e., (4.5) holds for arbitrary z ∈ C. Hence the polynomial (z − λ)r_{x_1}(z)p(z) is equal to the polynomial −r_{x_2}(z)q(z). Note that these two polynomials are not the zero polynomial since x ≠ 0. Let α_1, …, α_{d_1} be the roots of r_{x_1} and β_1, …, β_{d_2} the roots of r_{x_2}. Since (z − λ)r_{x_1}(z)p(z) has the same roots as −r_{x_2}(z)q(z), comparing the roots on both sides shows that λ must be one of the roots of q. We have thus shown that for every λ and x ≠ 0 such that (4.4) holds, λ must be a pole of f. It hence follows that for any λ that is not a pole the pencil is nonsingular, showing the matrix pencil is regular.
As shown in the proof, the eigenvectors of (4.4) have a special structure: the eigenvector corresponding to ξ_i is given in (4.6). In the appendix we give further analysis of the eigenproblem, revealing its Kronecker canonical form. It shows in particular that the orders of the poles are equal to the multiplicities of the eigenvalues.

Techniques for efficient and stable solution of eigenproblem
We now discuss how to solve (4.4) in practice. We employ techniques to remove undesired eigenvalues at ∞, and to achieve numerical stability.
Projecting out eigenvalues at infinity The generalized eigenvalue problem (4.4) has the n eigenvalues ξ_i along with m + 1 eigenvalues at infinity. These eigenvalues at infinity can be projected out easily. Let A_2^⊥ ∈ C^{L×(L−m−1)} be a matrix with orthonormal columns spanning the orthogonal complement of range(A_2), so that A_2^*A_2^⊥ = 0. Then (4.7), the pencil (A_2^⊥)^*A_1 − λ(A_2^⊥)^*B_1, is an n × n eigenvalue problem whose eigenvalues are {ξ_i}_{i=1}^n with corresponding eigenvectors c_q̃_i. To see this, recall (4.6) and note that (4.4) is equivalent to (4.8); taking the QR factorization of A_2 we can deflate the m + 1 eigenvalues corresponding to the top-left corner and solve for the lower-right part, arriving at (4.7) with eigenvector c_q̃. Alternatively, (4.8) shows that the "residual" A_1x_1 − λB_1x_1 lies in Span(A_2), which means it is orthogonal to A_2^⊥; (4.7) is its representation.
Diagonal scaling Generally, given an eigenvalue problem A − λB, a well-known technique for balancing entries of widely varying magnitude is diagonal scaling. As with the scaled naive method (2.8), we left-multiply by a diagonal matrix D and work with the pencil D(A − λB), choosing D so that each row of D[A B] has about the same norm. In Sect. 5 we show that this scaling makes our approach numerically stable.
Orthogonalization As alluded to at the end of Sect. 3.3, the final technique that we use for improved stability, which is inapplicable in the naive method, is orthogonalization. As in (3.4), we take the thin QR factorizations of DA_2 and DB_1, where Q_{A_2}, Q_{B_1} have orthonormal columns and are of the same size as DA_2, DB_1. The rationale is that numerical errors are reduced by working with orthogonal matrices. These factorizations can be computed exploiting the Vandermonde structure, as explained after (3.4).
Applying scaling and orthogonalization to (4.9), the eigenvalue problem we solve becomes (4.10), Ãx = λB̃x. This is an n × n eigenproblem; its n eigenvalues are precisely the n sought poles.
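The projection, scaling, and orthogonalization steps can be sketched as follows. This is a simplified stand-in: row equilibration replaces the particular D derived in Sect. 5.2, the identity A_1 = ΓB_1 is used to avoid forming R_{B_1}^{−1} explicitly, and the function name is ours.

```python
import numpy as np
from scipy.linalg import qr, eig

def ratfun_poles_sketch(f, m, n):
    # Square case L = m+n+1: scale, orthogonalize, and project out the
    # m+1 infinite eigenvalues with the orthogonal complement of D*A2.
    L = m + n + 1
    gam = np.exp(2j * np.pi * np.arange(L) / L)
    fv = f(gam)
    Vn  = np.vander(gam, n, increasing=True)
    Vm1 = np.vander(gam, m + 1, increasing=True)
    A1 = (fv * gam)[:, None] * Vn        # F Gamma V_n
    A2 = Vm1                             # V_{m+1}
    B1 = fv[:, None] * Vn                # F V_n
    # Diagonal scaling: equilibrate the rows of [A1 A2 B1]
    # (a plausible choice standing in for the D of Sect. 5.2).
    d = 1.0 / np.linalg.norm(np.hstack([A1, A2, B1]), axis=1)
    Q2, _ = qr(d[:, None] * A2)          # full QR; columns m+1..L-1 span
    Qperp = Q2[:, m + 1:]                #   the complement of range(D*A2)
    Qb, _ = qr(d[:, None] * B1, mode='economic')
    # Since A1 = Gamma B1, we have D A1 R_B1^{-1} = Gamma Q_B1; the change
    # of basis by R_B1 affects only the eigenvectors, not the eigenvalues.
    Atil = Qperp.conj().T @ (gam[:, None] * Qb)
    Btil = Qperp.conj().T @ Qb
    return eig(Atil, Btil, right=False)  # the n poles
```

The returned n × n pencil has no infinite eigenvalues, so a standard QZ solve yields exactly the n sought poles.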
Recall that as a consequence of this orthogonalization, the eigenvector of (4.10) goes through a change of basis with respect to R_{B_1}^{−1}. This severely affects the naive method (for which the singular vector is the sought quantity), but not our algorithm (for which the eigenvalues are sought). Use of the FFT? In the practically important cases where the sample points are roots of unity or Chebyshev points, we can use the FFT to efficiently form the matrices in (4.7), as discussed in Sect. 2.2.
However, we shall not use the FFT in this work, for two reasons. First, while the FFT significantly speeds up the matrix–matrix multiplication, from O(L^3) to O(L^2 log L), this is not essential to the overall algorithm, which inevitably invokes an eigensolver (or an SVD) requiring O(L(m + n)^2) operations. Indeed [35] designs the algorithm to facilitate the use of the FFT, but again the saving is attenuated by the O(Ln^2) SVD step.
The second, more fundamental, reason is stability. We shall see in Sect. 5 and through numerical experiments that diagonal scaling is crucial for stability. Unfortunately, using the FFT makes diagonal scaling inapplicable. Pole exactly at a sample point When a pole happens to coincide exactly with a sample point, f(γ_i) = ∞ and the eigenvalue problem breaks down due to infinite elements in the matrices. However, this should be a "happy" breakdown rather than a difficulty. In this case we can simply take γ_i to be a computed pole, and work with the function f := (z − γ_i)f, taking n := n − 1. An alternative and equally valid approach is to set f(γ_i) = (1/u) median_L |f(γ)| and proceed as usual.

Oversampling and least-squares fitting
As with previous algorithms, it is often recommended to take advantage of the oversampled values f (γ i ) at more than m + n + 1 points γ i , and perform a least-squares fitting. This is true especially in our context, where the degree-finding process in Algorithm 3.1 has oversampled f to find the type, and it is natural to try to reuse the computed quantities f (γ i ).
Consider finding the poles of f(z) = p(z)/q(z) ∈ R_{m,n} with L > m + n + 1 sample points. We form the matrices as in (4.4), with A_1 = ΓFV_n ∈ C^{L×n}, A_2 = V_{m+1} ∈ C^{L×(m+1)} and B_1 = FV_n ∈ C^{L×n}, and apply projection, scaling, and orthogonalization as in (4.10). The resulting matrices are now nonsquare, of size (L − m − 1) × n: they have more rows than columns since L > m + n + 1. Under the assumption that f has n poles, there exists a nonzero x ∈ C^n with (Ã − λB̃)x = 0 if and only if λ is one of the poles of f; this can be shown as in Proposition 1. Hence, in theory, we can compute the poles of f by solving the rectangular eigenvalue problem (4.11). However, traditional methods for generalized eigenvalue problems, such as the QZ algorithm [19, Ch. 7], [31], are not applicable to (4.11), since the pencil Ã − λB̃ is rectangular.
To solve the rectangular eigenvalue problem (4.11), we use the recent algorithm by Ito and Murota [25]. The idea is to find perturbations ΔA, ΔB with smallest ‖[ΔA, ΔB]‖_F such that the pencil (Ã + ΔA) − λ(B̃ + ΔB) has n eigenpairs.
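We do not reproduce the Ito–Murota algorithm here. As a simple surrogate for illustration only: when f is exactly rational, the rectangular pencil can be collapsed with a pseudoinverse, since B̃ has full column rank and Ãx = λB̃x implies pinv(B̃)Ãx = λx. The sketch below uses an illustrative type-(1,2) function with L = 24 oversampled points; the construction of Ã, B̃ follows the simplified recipe of the previous sketches.

```python
import numpy as np
from scipy.linalg import qr

m, n, L = 1, 2, 24                           # oversampled: L > m+n+1
f = lambda z: (z - 3) / (z**2 - 0.25)        # poles at +-0.5
gam = np.exp(2j * np.pi * np.arange(L) / L)
fv = f(gam)
Vn  = np.vander(gam, n, increasing=True)
Vm1 = np.vander(gam, m + 1, increasing=True)
d = 1.0 / np.linalg.norm(np.hstack([(fv * gam)[:, None] * Vn, Vm1,
                                    fv[:, None] * Vn]), axis=1)
Q2, _ = qr(d[:, None] * Vm1)                 # full QR of D*A2
Qperp = Q2[:, m + 1:]                        # L-m-1 orthonormal columns
Qb, _ = qr(d[:, None] * (fv[:, None] * Vn), mode='economic')
Atil = Qperp.conj().T @ (gam[:, None] * Qb)  # (L-m-1) x n, rectangular
Btil = Qperp.conj().T @ Qb                   # (L-m-1) x n, rectangular
# Collapse the rectangular pencil: pinv(Btil) @ Btil = I since Btil has
# full column rank, so the eigenvalues of pinv(Btil) @ Atil are the poles.
poles_est = np.sort_complex(np.linalg.eigvals(np.linalg.pinv(Btil) @ Atil))
```

For noisy (not exactly rational) data this pseudoinverse shortcut is not the least-squares-optimal perturbation that the Ito–Murota approach computes; it merely shows that the rectangular pencil carries the pole information.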

Pseudocode
Summarizing, the following is the pseudocode for our polefinding algorithm ratfun.
By default, the sample points are the roots of unity γ_j = exp(2πi j/L) (once L is specified); other choices are allowed, such as Chebyshev points on [−1, 1]. We justify the scaling in step 2 and the choice of the diagonal matrix D in Sect. 5.2.

Algorithm 4.1 ratfun: Polefinding and interpolation
1: Determine m, n by Algorithm 3.1, if not given as inputs.

5: Compute the full QR factorization (4.14).
8: Solve the square generalized eigenvalue problem (4.15). The computed eigenvalues λ are approximants to the poles ξ_i of f.
When the domain of interest is far from the origin, it is recommended to work with a shifted function f(z − z_0) so that the domain lies near the origin (this affects the Vandermonde matrix, in particular its condition number).

Efficiency
The dominant costs of Algorithm 4.1 are the QR factorizations, forming Q_{A_2}^⊥, computing the SVD (4.14), and solving the eigenproblem (4.15). These are all O(L(m + n)^2) operations or fewer, using standard algorithms in numerical linear algebra. This is comparable in complexity with other approaches, as summarized in Table 1.

Input/output parameters
Combined with the degree-determination process of Algorithm 3.1, ratfun lets us find poles and rational interpolants with the minimum input requirement: just the function f. Our algorithm ratfun (described in detail in Sect. 4) adapts as necessary when the user supplies more information, such as the location of the sample points and the type of the rational approximant. Below we detail the process for the three types of inputs.
Minimum input requirement poles = ratfun(f) When the function f is the only input, the algorithm first determines the numerical type of the rational approximant by Algorithm 3.1, then runs the polefinding algorithm described in Sect. 4. By default, we take the sample points to be roots of unity; Chebyshev points can be chosen by invoking ratfun(f,'c').
Inputs are function and sample points poles = ratfun(f,gam) When the sample points are specified by gam, the algorithm first runs the degree finder typefind(f,tol,L) with L = length(gam), and gives a warning if the number of sample points appears insufficient, L < max{M + n, m + N} + 1, as indicated by σ_{m+n+2}(C) > tol. Regardless, the algorithm proceeds to solve the generalized eigenvalue problem to obtain approximate poles and p, q with deg(p) + deg(q) + 2 ≤ L. We note that the backward errors ‖Δp‖_L and ‖Δq‖_L have magnitudes O(σ_{m+n+2}(C)), which is not necessarily O(tol) in this case (see Sect. 5 for details on backward errors).
Full input: function, sample points and degrees poles = ratfun(f,gam,m,n) When the degrees are further specified, the algorithm directly solves the (rectangular, or square when L = m + n + 1) generalized eigenvalue problem to obtain the poles and p, q.

Outputs
The full output information is [poles,cp,cq,type,roots]=ratfun(f), in which poles are the computed poles, cp,cq are the vectors of coefficients c p , c q of the polynomials p, q in the monomial basis, type is a 2-dimensional vector [m,n] of the computed type, and roots are the computed roots.
We next discuss how to compute the roots and how to find c_p, c_q.

Computing the roots
One situation that Algorithm 4.1 does not deal with is when the roots of f are sought. We suggest two approaches for rootfinding, depending on whether the poles are also sought.
Finding roots only First, when only the roots are of interest, we can invoke Algorithm 4.1 to find the poles of 1/f. Alternatively, the roots can be computed from f by defining p̃(z) = p(z)/(z − r_k) and starting from the guiding equation [recall (4.2)], which, as before, can be rewritten as a generalized eigenvalue problem with λ := r_k. For brevity we omit the details, as the formulation is analogous to that for (4.4) and (4.11).
Finding poles and roots When both the poles and roots are required, we suggest the following. First compute the poles as in Algorithm 4.1. Then we find the roots by solving (4.17) for λ. Here p̃(z) is as above, and we form q(γ_i) from the expression q(z) = ∏_{i=1}^n(z − ξ̂_i) using the poles ξ̂_i computed previously. Equation (4.17) can be rearranged to γ_i p̃(γ_i) − f(γ_i)q(γ_i) = λ p̃(γ_i), which we write in matrix form as (4.18). This is again a rectangular generalized eigenvalue problem; it has one irrelevant eigenvalue at infinity, and can again be solved via an SVD. Since the matrices involved are smaller than those in (4.4) and (4.11), this process is cheaper than finding the poles of 1/f.
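As a minimal illustration of the first approach (roots as poles of 1/f), the sketch below uses the plain square pencil of the formulation above, without the stabilizing transformations, on an illustrative type-(2,1) function of our choosing.

```python
import numpy as np
from scipy.linalg import eig

def pencil_poles(f, m, n):
    # Poles of f = p/q of type (m, n) via the square pencil (4.4),
    # sampling at the L = m+n+1 roots of unity (no scaling, for brevity).
    L = m + n + 1
    gam = np.exp(2j * np.pi * np.arange(L) / L)
    fv = f(gam)
    Vn = np.vander(gam, n, increasing=True)
    A = np.hstack([(fv * gam)[:, None] * Vn,
                   -np.vander(gam, m + 1, increasing=True)])
    B = np.hstack([fv[:, None] * Vn, np.zeros((L, m + 1))])
    lam = eig(A, B, right=False)
    return np.sort_complex(lam[np.argsort(np.abs(lam))[:n]])

# Roots of f = roots of p = poles of 1/f, and 1/f is of type (n, m):
m, n = 2, 1
f = lambda z: (z - 0.3) * (z + 0.7) / (z - 2)   # roots 0.3, -0.7; pole 2
roots_est = pencil_poles(lambda z: 1.0 / f(z), n, m)
```

Note the swapped degree arguments: if f ∈ R_{m,n} then 1/f ∈ R_{n,m}, so its poles (the roots of p) number deg(p) ≤ m.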

Finding c p , c q
To find the coefficient vectors c_p and c_q, we can take advantage of the eigenvector structure (4.6) to extract c_p, along with c_q̃, from any eigenvector; from c_q̃ we obtain c_q via (4.1). Note that to do this we need to give up the QR factorization in step 4 of Algorithm 4.1. Equally effective and stable is to invoke the scaled naive method (2.8), which gives c_p, c_q directly (our current code adopts this approach). A word of caution is that eigenvectors are sensitive to perturbation if (but not only if) the corresponding eigenvalues are nearly multiple.
We note that there are many other ways of representing a rational function f = p/q. Since ratfun can compute the poles and roots as described above, one effective representation is r(z) = c ∏_i(z − r̂_i)/∏_i(z − ξ̂_i) (4.19), in which we store the constant c together with the roots r̂_i and poles ξ̂_i.
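Evaluating this root-pole representation is straightforward; the helper name below is ours, added for illustration.

```python
import numpy as np

def eval_ratrep(z, c, roots, poles):
    # Evaluate the root-pole representation (4.19):
    # r(z) = c * prod_i (z - r_i) / prod_i (z - xi_i).
    z = np.asarray(z, dtype=complex)
    num = np.prod([z - r for r in roots], axis=0) if roots else 1.0
    den = np.prod([z - s for s in poles], axis=0) if poles else 1.0
    return c * num / den
```

Compared with storing coefficient vectors c_p, c_q, this form evaluates stably near (and reveals) the poles, at the cost of fixing the factored representation.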

Mathematical equivalence with previous algorithms: interpolation-based and Hankel eigenproblem
Here we briefly discuss the connection between our eigenvalue problem and existing ones. We shall show that the eigenproblem (4.7), when L = m + n + 1, is equivalent in exact arithmetic to the generalized eigenvalue problem of Hankel matrices derived in [29,40], which is in turn equivalent to Chebfun's ratinterp as shown in [3]. Essentially, both our algorithm and ratinterp find the roots of q such that p/q interpolates f at the sample points. We shall show that the eigenvalues and right eigenvectors of (4.7) and those of the Hankel matrix pencil are the same. Before proving this claim, we briefly review the Hankel eigenproblem approach, which originates in work of Delves and Lyness [15] and Kravanja et al. [27,28]; see also [3]. In this algorithm, one computes the discretized moments and solves the resulting Hankel pencil, which we write as H_1x = λH_0x for simplicity. The pencil H_1 − λH_0 can be written using a contour integral as in (4.22). If f is meromorphic in the unit disk {z ∈ C : |z| ≤ 1} and has n poles ξ_1, …, ξ_n ∈ {z ∈ C : |z| < 1}, then the poles are eigenvalues of H_1x = λH_0x. Indeed, defining q̃ = ∏_{l≠k, 1≤l≤n}(z − ξ_l) and letting c_q̃ be its coefficient vector as in (4.3), we obtain (H_1 − ξ_kH_0)c_q̃ = 0, since the remaining integrand is analytic in the unit disk. The contour integral (4.22) needs to be discretized in a practical computation. If we use the standard trapezoidal rule, evaluating at the roots of unity γ_j = exp(2πi j/L) for j = 1, …, L, the computed pencil Ĥ_1 − λĤ_0 becomes (4.23), where F = diag(f(γ_1), …, f(γ_L)) and Γ = diag(γ_1, …, γ_L) as before. Hence if f is a rational function f = p/q, the ith element of the final vector is ∑_{j=1}^L γ_j^{i+1}p(γ_j) for i = 1, …, n, which is (up to scaling) the L-point trapezoidal approximation of (1/2πi)∮_{|z|=1} z^i p(z) dz. Now the L-point trapezoidal rule is exact if the integrand is a polynomial of degree L − 1 or below [49, Cor. 2.3], and the contour integral of a polynomial vanishes. Therefore, if L ≥ m + n + 1 then (Ĥ_1 − ξ_iĤ_0)c_q̃ = 0.
Thus for the discretized pencil Ĥ_1 − ξ_iĤ_0 as well, c_q̃ is again an eigenvector if f = p/q with p ∈ P_m, q ∈ P_n and L ≥ m + n + 1. This shows that the eigenproblem Ĥ_1x = λĤ_0x and the one in (4.9) have the same eigenvalues and eigenvectors, and are thus equivalent: there exists a nonsingular matrix W such that WĤ_1 = (A_2^⊥)^*A_1 and WĤ_0 = (A_2^⊥)^*B_1. Despite the mathematical equivalence, we reiterate that the numerical behavior of the algorithms is vastly different. Crucially, the left-multiplication by V_n in (4.23) mixes up the magnitudes of f(γ_i), resulting in the instability due to near-pole sampling. This will be made precise in the next section.
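The moment-based Hankel approach is easy to state in code. The following sketch is our own illustration, with a hypothetical f having two poles inside the unit disk: the moments are discretized by the L-point trapezoidal rule on the unit circle and the Hankel pencil is solved directly.

```python
import numpy as np
from scipy.linalg import eig

n = 2                                        # number of poles in the disk
f = lambda z: 1.0 / ((z - 0.5) * (z + 0.3))  # poles at 0.5 and -0.3
L = 64
gam = np.exp(2j * np.pi * np.arange(L) / L)
# Discretized moments mu_k ~ (1/2 pi i) \oint z^k f(z) dz via the
# L-point trapezoidal rule: mu_k = (1/L) sum_j gam_j^{k+1} f(gam_j).
mu = np.array([np.mean(gam**(k + 1) * f(gam)) for k in range(2 * n)])
H0 = np.array([[mu[i + j] for j in range(n)] for i in range(n)])
H1 = np.array([[mu[i + j + 1] for j in range(n)] for i in range(n)])
poles_est = np.sort_complex(eig(H1, H0, right=False))
```

With all poles well inside the unit circle, the trapezoidal error decays geometrically in L, so the computed poles here are accurate to near machine precision; a pole approaching a sample point is exactly the regime where this formulation loses stability, as discussed above.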

Numerical stability
A crucial aspect of any numerical algorithm is stability [23]. It is common, and often inevitable for problems that are potentially ill-conditioned, to investigate backward stability (as opposed to analyzing the forward error in the outcome itself), in which we ask whether a computed output is guaranteed to be the exact solution of a slightly perturbed input.
The great success of polynomial interpolation of a continuous function f at roots of unity (for approximation in the unit disk) or Chebyshev points (on an interval [−1, 1]) is due to its combined efficiency and stability: a degree-n polynomial interpolation can be done in O(n log n) operations employing the Chebyshev polynomials and the FFT [6]. Moreover, since the FFT matrix has condition number 1, the process is numerically stable, and we obtain an interpolant p̂ with residual O(u‖f‖_∞) at every sample point γ_i; this holds regardless of f. Suppose further that the interpolation is successful (with smooth f, good points γ_i and basis φ_i(z)), so that the error remains of this order throughout the domain. Then with a stable rootfinding algorithm for p̂, one obtains stability in the computed roots: p̂(r̂_i) = O(u‖f‖_∞). This shows that the r̂_i are the exact roots of a slightly perturbed input f. Rootfinding algorithms with proven stability include the companion [51] (for monomials) and colleague [32] (for Chebyshev) linearizations.
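The O(n log n) Chebyshev interpolation alluded to here can be sketched via the FFT of the even extension of the sampled values. This is a standard construction, not taken from this paper, and the function name is ours.

```python
import numpy as np

def cheb_coeffs(f, N):
    # Interpolate f at the N+1 Chebyshev points cos(pi*j/N) and return
    # the Chebyshev coefficients c with f(x) ~ sum_k c[k] T_k(x),
    # computed in O(N log N) via the FFT of the even extension.
    x = np.cos(np.pi * np.arange(N + 1) / N)
    v = f(x)
    V = np.concatenate([v, v[N - 1:0:-1]])   # even extension, length 2N
    a = np.real(np.fft.fft(V)) / N
    c = a[:N + 1].copy()
    c[0] /= 2                                # halve the endpoint modes
    c[N] /= 2
    return c
```

Because the underlying transform is unitary up to scaling, the coefficients are obtained with essentially no loss of accuracy, which is the stability property invoked in the text.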
For rational interpolation and polefinding, to our knowledge, stability has rarely been discussed; [35], which connects the inaccuracies with the presence of ill-conditioned matrices, is one of the few exceptions, but its argument does not treat the backward stability of the rational interpolant p̂/q̂. Here we attempt to make a step forward and analyze backward stability for rational interpolation algorithms.
First we need to elucidate our goal. The presence of poles complicates the situation because, for example, ‖f − p̂/q̂‖_∞ is infinite unless we compute the poles exactly, and this is true even for the linearized version ‖fq̂ − p̂‖_∞. For this reason, rational interpolation is sometimes thought to be inherently ill-posed for stable computation.
There is a natural workaround: we allow for perturbations in both the numerator and denominator polynomials p̂ and q̂. We then analyze whether the rational interpolation is satisfied with small backward errors, that is, whether (5.3) holds for i = 1, …, L. As before, we work with the linearized formulation.
Definition 1 Let f be a meromorphic function. Given sample points {γ_i}_{i=1}^L and computed polynomials p̂, q̂, we say that p̂/q̂ is a stable rational interpolant of f if there exist small perturbations Δp, Δq for which (5.3) holds. We note that this requirement is a rather weak condition: for example, it does not require that p̂, q̂ be close to the correct p, q when f = p/q. Nonetheless, we shall see that many previous algorithms fail to satisfy it. We now give a necessary and sufficient condition for stability that is easy to work with.
Lemma 1 Condition (5.4) is necessary and sufficient for p̂/q̂ to be a stable rational interpolant at γ_i satisfying (5.3), for i = 1, …, L.
(Proof) Suppose (5.4) is satisfied. Then, defining Δp and Δq accordingly, (5.3) holds, which proves the claim. Below we analyze the stability of algorithms based on Lemma 1. In Sects. 5.1 and 5.2, to avoid the complications due to the ill-conditioning of the Vandermonde matrix, we discuss the case where the sample points are the roots of unity and the polynomial basis is the monomials φ_i(z) = z^i. Essentially the same argument carries over to other sets of sample points employed with an appropriate polynomial basis {φ_i(z)}, such as Chebyshev-point sampling employing the Chebyshev polynomial basis.

Instability of previous algorithms
Here we illustrate, with the example of Chebfun's ratinterp, that previous algorithms can be numerically unstable, i.e., they do not necessarily satisfy (5.4) in Lemma 1. Recall that ratinterp computes c_q in (2.11) as the null vector of V^*FV_{N+1}. Let us explain the numerical issue. Let ĉ_q be the computed null vector. Consider Eq. (2.10) left-multiplied by the Vandermonde matrix V, which is unitary times √L. Taking numerical errors into account, the equation can be written with residual vectors ε_{L−(m+1)} and ε_L; these vectors are zero when ĉ_q is equal to the exact c_q, but nonzero due to numerical errors. Indeed, the ith element of ε_L is f(γ_i)q̂(γ_i) − p̂(γ_i), which is precisely the linearized interpolation residual in (5.4). Now, the computed null vector ĉ_q of the matrix V^*FV_{N+1} in (2.11), obtained by a stable algorithm such as the SVD, generally satisfies only a normwise condition, which violates the pointwise condition (5.4) for stability.
Although we do not present the details, such instability is present in most algorithms, including the unscaled naive method and RKFIT (with its default weight D).

Diagonal scaling and stability of ratfun and the scaled naive method
Let us reconsider the eigenvalue problem (4.4) from a similar viewpoint. We shall show that our approach of solving (4.4) with diagonal scaling is immune to the instability just discussed, and that ratfun gives a stable rational interpolation.
For simplicity we rewrite the eigenvalue problem (4.4) with diagonal scaling. By the backward stability of the standard QZ algorithm, each computed eigenpair (ξ̂_i, x̂) satisfies (5.10), in which ε denotes a constant of magnitude O(u).
To establish stability we need two preparations. First, we use an appropriate scaling of f. We can clearly scale f ← (1/κ)f for any κ > 0 without changing the poles and roots, and the analysis below will show that a good choice is one for which ‖c_p‖_2 ≈ ‖c_q‖_2. To be precise, it suffices to have (5.11). This means we expect |f| = Θ(1) to hold at most of the sample points. In practice, we achieve (5.11) by sampling at sufficiently many points and taking κ to be the median value median_L |f(γ)|; this is adopted in Step 2 of the pseudocode of ratfun, Algorithm 4.1.

Theorem 1 Let A, B be as defined in (4.4), scaled by the diagonal matrix D of (5.12), and suppose (5.10) and (5.11) hold. Then p̂/q̂ is a stable rational interpolant of f.
(Proof) By (5.10) we have (5.13), where we used the fact that ‖DA‖_2, ‖DB‖_2 and |ξ̂_k| are all O(1). Now recalling (4.2), the ith element of DAx̂ − ξ̂_kDBx̂ represents a scaled interpolation error, which is O(u‖x̂‖_2) by (5.13). Since the γ_i are roots of unity we have ‖p̂‖_L = √L‖ĉ_p‖_2 and ‖q̂‖_L = √L‖ĉ_q‖_2, so in view of Lemma 1 it suffices to bound the residual relative to these quantities. Since x̂ collects ĉ_q̃ and ĉ_p, using the assumption ‖ĉ_q‖_2/‖ĉ_p‖_2 = Θ(1) in (5.14) divided by ‖x̂‖_2, the required bound holds due to the choice of diagonal scaling (5.12).
Since the above argument is valid for every i = 1, . . . , L, we conclude from Lemma 1 thatp/q is a stable rational interpolant of f .
We emphasize the crucial role that diagonal scaling plays in the stability analysis. We also note that the scaling f ← cf is actually unnecessary for the reduced eigenproblem (4.7) (without the diagonal scaling D), which is invariant under the scaling f ← cf.

Stability of scaled naive method
The scaled naive method can also be proven stable. In this case the analysis is even simpler, as the jth row of DC[c_q; −c_p], where C is as in (2.6), represents exactly the scaled interpolation error. Thus a null vector [c_q; −c_p] computed in a stable manner under the same assumptions as above [(5.10) and (5.11)] automatically yields a stable rational interpolant. However, for finding the poles, the additional process of finding the roots of q is necessary, and this can be a cause of further numerical instability. We discuss this further in Sect. 5.3.
Barycentric formula Finally, we mention rational interpolation based on the barycentric formula [9–11,41] (5.16), where w = [w_1, …, w_n] are the barycentric weights. For a general w (e.g. for randomly chosen w) the rational function r(z) in (5.16) is of type (n, n). However, by choosing appropriate weights w one obtains an interpolant of the desired type; Berrut and Mittelmann [10] show how such w can be found as a null vector of a matrix related to FV_{n+1}, V_{m+1} as in (2.6). Antoulas and Anderson [1] introduce an algorithm for computing w in which the interpolation points are taken to be half of the sample points, so that interpolation is achieved at 2n points. The recent AAA algorithm [33] chooses the support points in a greedy manner to reduce the linearized error in the rational approximant.
As noted in [10], at the sample points γ_k the barycentric formula (5.16) gives essentially exact interpolation, in that r(γ_k) = f(γ_k) at all γ_k (this holds regardless of the choice of w as long as w_k ≠ 0). However, this is due to the representation of the rational function; finding the poles and obtaining p, q from (5.16) would incur further numerical errors. Below we focus on algorithms that work with the coefficients of the rational interpolant.
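The interpolation property of (5.16) is easy to verify numerically. The sketch below (function name ours) also uses Lagrange-type weights, for which the second barycentric form reproduces the polynomial interpolant exactly — a standard fact, included here only to give a concrete choice of w.

```python
import numpy as np

def bary_eval(z, nodes, fvals, w):
    # Barycentric formula (5.16):
    # r(z) = sum_k w_k f_k/(z - g_k) / sum_k w_k/(z - g_k).
    diff = z - nodes
    hit = np.nonzero(diff == 0)[0]
    if hit.size:                 # z is a sample point: r(gamma_k) = f(gamma_k)
        return fvals[hit[0]]
    c = w / diff
    return np.sum(c * fvals) / np.sum(c)
```

The interpolation property at the nodes holds for any nonzero weights, in line with the remark above; only the behavior away from the nodes depends on w.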

Accuracy of polefinder and effect of orthogonalization
For rational interpolation, we have described and identified two stable methods: ratfun and the scaled naive method.
Let us now turn to polefinding and focus on the accuracy of the computed poles. ratfun finds the poles while simultaneously computing the rational approximant. By contrast, the scaled naive method (like practically any other existing method for rational approximation) first finds the denominator polynomial q, then computes its roots. Intuitively this two-step approach should be more susceptible to numerical instability, and here we illustrate that this is indeed the case.
We compare the two algorithms: the scaled naive method and ratfun. In ratfun, the QR factorization DB_1 = Q_{B_1}R_{B_1} in step 3 of Algorithm 4.1 implicitly performs a change of basis for the polynomial q, so that discrete weighted orthogonality is achieved in the new basis. This has the possible disadvantage that the computed eigenvector in (4.15) contains the coefficients in the changed basis. Crucially, however, for polefinding this is not an issue at all, because the eigenvalues are unaffected. The change of basis is rather a benefit, because in the new basis the matrices A_1, B_1 are well-conditioned, which reduces numerical errors.
By contrast, this change-of-basis cannot be done for the naive method, because it requires the coefficients of q in a polynomial basis that is easy to work with for computing the roots.
Numerical example We illustrate the above discussion with an example. Let f be a rational function of the form (3.5) with N = 20 poles at equispaced points on [−1 + δ, 1 − δ] with δ = 10^{−3}. We use the monomial basis but sample at Chebyshev points, whose number we vary. We are therefore employing a "wrong" polynomial basis for the sample points; the "correct" one is the Chebyshev polynomials. Figure 4 shows the results for the two algorithms, plotting the errors in the computed poles min_j |ξ_i − ξ̂_j| for i = 1, …, n, as the number of sample points is varied. (Fig. 4: error of the 20 computed poles for ratfun and the naive method using the monomial basis; ratfun implicitly and automatically uses the appropriate polynomial basis to obtain accurate poles.) The diagonal scaling (5.12) is used for both algorithms; without it, the accuracy is significantly worse than in the figures.
ratfun clearly gives significantly more accurate results. The inaccuracy of the naive method is mainly due to the wrong choice of basis used to represent q; for example, with the "correct" Chebyshev basis, the red plots become only slightly worse than ratfun. Indeed, it is well known that when looking for real roots of a polynomial, one is often advised to use Chebyshev polynomials instead of monomials.
The point here is that ratfun automatically finds an appropriate basis for the particular problem given: once a function f and a set of sample points are given, the QR factorization finds the appropriate basis. Indeed, if the QR factorization is not used in ratfun, the accuracy deteriorates significantly.
All the experiments were conducted in MATLAB R2014a using IEEE double precision arithmetic with u ≈ 1.1 × 10^{−16}, on a desktop machine with an Intel Core i7 processor with four cores and 16 GB RAM.
For "easy" problems like those in Sect. 3.2, all the algorithms compute the poles and approximants reliably. Below we thus focus on more challenging problems.

High-degree examples
We consider a moderately high-degree example, where we take f as in (3.5) with N = 50. The results are in Fig. 5.
With a sufficient number of sample points L ≥ 103, ratfun finds the type of the rational approximant and computes the poles and roots stably. Here and below, the roots are computed simply by finding the poles of 1/f by ratfun; the other processes described in Sect. 4.6 performed similarly.
It is worth observing that most computed roots turned out to lie on a circle of radius about 0.5. This may look bizarre at first sight, as one easily sees that the only zero of f is at z = 0. This can be explained by eigenvalue perturbation theory: the zero of f has multiplicity N − 1 = 49, so the eigenvalue problem for computing it attempts to find the eigenvalue of algebraic multiplicity 49. Now it is well known [43, Ch. 5] that eigenvalues of algebraic multiplicity k and geometric multiplicity 1, which are essentially eigenvalues of a k × k Jordan block, can be perturbed by O(ε^{1/k}) by a perturbation in the matrix of norm O(ε). The QZ algorithm computed the roots in a backward stable manner, but the small backward error is enough to perturb them by u^{1/49} ≈ 0.4725. The reason the roots appear to lie systematically on a circle is that the eigenvalues of a Jordan block are extremely sensitive to perturbation in the bottom-left element, but much less so in other positions.
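This circle-of-eigenvalues effect is easy to reproduce (a NumPy sketch under the perturbation model just described, not the paper's experiment): perturbing the bottom-left entry of a k × k Jordan block by ε moves the zero eigenvalue of multiplicity k onto a circle of radius ε^{1/k}.

```python
import numpy as np

k, eps = 49, 1e-16                     # multiplicity and perturbation size
J = np.diag(np.ones(k - 1), 1)         # k x k Jordan block with eigenvalue 0
J[-1, 0] = eps                         # perturb the bottom-left entry
radii = np.abs(np.linalg.eigvals(J))   # exact eigenvalues: eps**(1/k) times
                                       # the k-th roots of unity
expected = eps ** (1.0 / k)            # about 0.472, matching the figure
```

The computed eigenvalues land on a circle of radius close to 0.472, in line with the u^{1/49} ≈ 0.4725 observed for the computed roots of f.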
Note in the left figure that with insufficient sample points L < 100 the type finder gives an incorrect output. In view of Lemma 4, this happens when the number of sample points L is less than the necessary max{M + n, m + N} + 1, but the function f and the sample points γ_i happen (e.g. by symmetry) to make the matrix C rank-deficient, so that at the γ_i the function behaves as if it were a lower-type rational function. The same was observed in Fig. 1, and the problem is pronounced here: the degree determination Algorithm 3.1 indicated a numerical degree of (0, 2) after sampling initially at the eighth roots of unity. We have not overcome this issue completely; indeed such difficulty is present even in the polynomial case [48], for example when a highly oscillatory function f happens to be 0 at all the initial sample points. Perhaps some further insurance policy is needed to ensure that the type obtained by Algorithm 3.1 is appropriate, such as sampling f at a few more random points [2]. One could also try typefind(f,tol,L) for neighboring values of L and accept only if the results agree. These remedies, while effective, cannot be proven to be fool-proof. Nonetheless, this is a rather contrived example with high symmetry, which generically would not occur. For example, if we take the residues of each term in (3.6) to be random numbers, we obtain Fig. 6, for which an appropriate type is chosen. In both cases, once a sufficient number of sample points is taken, ratfun finds the correct poles and rational interpolant.
[Figure caption: The pole that eventually gets lost by ratfun and naive corresponds to ξ_1 = 10, the pole far from the sample points. Right: sample points, poles and roots.]
When f has poles far away from sample points Another possible difficulty is when f has a pole far away from the sample points.
To examine the behavior of our algorithm in such cases we take f as in (3.5) of type (4, 5), but we now set one pole to be far by taking ξ 1 = 10. Figure 7 shows the results.
Again, with a sufficient number of sample points we obtain a rational approximant of the correct type (4, 5). In the middle error plot of Fig. 7, the poles inside the unit disk are computed accurately to machine precision. By contrast, the pole ξ_1 = 10 is computed with poorer accuracy. Loss of accuracy for poles outside the unit disk is a typical phenomenon, and the accuracy worsens rapidly if we take |ξ_1| larger or let f be of higher type. This observation can be explained via eigenvalue conditioning analysis, which shows that the condition numbers of the eigenvalues of (4.4) corresponding to poles outside the unit disk grow exponentially with base |ξ_i| and exponent m + n, whereas those of eigenvalues inside the unit circle decrease (slowly) with m + n. The analysis is presented in "Appendix C".
Recall that ratfun finds a numerical type by Algorithm 3.1. As explained in Sect. 3.1, there can be other numerical types for f that may be appropriate: indeed, if we sample at many more points than necessary (i.e., typefind(f,tol,L) with L ≫ m + n + 1), ratfun eventually ignores the outlying pole and converges to a rational function of type (m, 4) where m is large. That is, the computed outcome has lower denominator degree than the exact value 5; recall the similar discussion for the experiment with (3.7). This can be explained as follows. By a standard result in complex analysis [34, Ch. 9], inside the unit disk a meromorphic function f can be written as
f(z) = Σ_{|ξ_i| ≤ 1} Σ_j a_{i,j} / (z − ξ_i)^j + p_f(z),
where the sum is taken over the poles inside the unit disk. Here p_f(z) is a power series, obtained e.g. as the Taylor series of f(z) − Σ_{|ξ_i| ≤ 1} Σ_j a_{i,j}/(z − ξ_i)^j, which converges inside a disk centered at the origin and of radius |ξ_0|, where ξ_0 is the pole closest to the origin besides those with |ξ_i| ≤ 1; here |ξ_0| = 10. Therefore, near the sample points (the unit circle), f behaves as if it were a sum of terms 1/(z − ξ_i)^j with |ξ_i| ≤ 1, plus an analytic function.
From a practical viewpoint, this example suggests that we should locate the sample points near the poles of interest. For example, we can find the pole ξ 1 = 10 accurately in the above example by taking the sample points to lie on a circle centered around 10.
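A sketch of this recentering idea (hypothetical NumPy code, not the paper's implementation; the helper name `poles_on_circle` is ours): sampling on a circle around z = 10 and working in the shifted variable w = z − 10 keeps the basis well-conditioned on the sample points and recovers the far pole to high accuracy.

```python
import numpy as np

def poles_on_circle(f, m, n, L, center, radius=1.0):
    """Linearized rational fit of type (m, n), sampled on the circle
    |z - center| = radius; the shifted variable w = z - center keeps
    the monomial basis well-conditioned at the sample points."""
    w = radius * np.exp(2j * np.pi * np.arange(L) / L)
    C = np.hstack([f(center + w)[:, None] * np.vander(w, n + 1, increasing=True),
                   -np.vander(w, m + 1, increasing=True)])
    c = np.linalg.svd(C)[2][-1].conj()          # null vector of C
    return center + np.roots(c[:n + 1][::-1])   # poles in the z variable

f = lambda z: 1 / (z - 10) + 1 / (z - 0.5)      # one pole far from the unit disk
poles = poles_on_circle(f, 1, 2, 32, center=10.0)
err_far = np.min(np.abs(poles - 10))            # the far pole, now found accurately
```

Sampling on the unit circle instead would compute the pole at 10 with much poorer accuracy, as in Fig. 7.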
When a sample point is near a pole This example illustrates how existing algorithms lose accuracy when a sample point is near a pole.
We form a rational function f(z) = Π_{i=1}^M (z − r_i) / Π_{i=1}^N (z − ξ_i), where the roots r_i and poles ξ_i are generated randomly to lie in the unit disk. Here we take M = 4, N = 5, and let the sample points be equispaced points on the unit circle. We then reset one pole to be 1 + 10^{−13}, forcing it to lie close to a sample point.
ratfun and the naive method compute the poles much more accurately than the other methods. This is largely due to the diagonal scaling discussed in Sect. 5; indeed, if we turn off the diagonal scaling and take D = I, the accuracy of both ratfun and the naive method deteriorates to about the same level as Chebfun's ratinterp and RKFIT.
We believe that with RKFIT, which allows for tuning various inputs and parameters, it is possible to obtain accurate results if appropriate parameters are provided, such as D; recall the discussion in Sect. 2.3. The point here is that our analysis revealed an appropriate choice (Fig. 8).
Rational functions with poles of order > 1 When f is meromorphic but has poles ξ_i of order d_i > 1, the generalized eigenvalue problems (4.4) and (4.15) have an eigenvalue ξ_i of the same multiplicity d_i. Here we examine the behavior of our algorithm in such cases.
We generate the function f simply by squaring the function in (3.5), that is, f(z) = (Σ_{j=1}^{5} 1/(z − ξ_j))^2 with ξ_j = 0.9 exp(2πi j/5). Then f has 5 poles, all of which are of order 2.
Observe that all the algorithms, including ratfun, find the poles with accuracy O(√u), which is what one would expect from a backward stable algorithm: the poles are eigenvalues of multiplicity 2, so a backward error of order u perturbs them by O(u^{1/2}) (Fig. 9).
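The √u accuracy for double poles can be reproduced even with a naive linearized fit (an illustrative NumPy sketch, not the paper's code; `naive_poles` is our name): for f(z) = 1/(z − 0.5)^2, the computed double pole is correct only to roughly the square root of the working precision.

```python
import numpy as np

def naive_poles(f, m, n, L):
    """Linearized fit f*q - p = 0 at the L-th roots of unity;
    the poles are returned as the roots of q."""
    g = np.exp(2j * np.pi * np.arange(L) / L)
    C = np.hstack([f(g)[:, None] * np.vander(g, n + 1, increasing=True),
                   -np.vander(g, m + 1, increasing=True)])
    c = np.linalg.svd(C)[2][-1].conj()
    return np.roots(c[:n + 1][::-1])

f = lambda z: 1 / (z - 0.5)**2           # double pole at 0.5
poles = naive_poles(f, 0, 2, 16)
err = np.max(np.abs(poles - 0.5))        # roughly sqrt(u), not u
```

Both computed poles agree with 0.5 to only about 8 digits, far from full machine precision, consistent with the O(u^{1/2}) perturbation bound for a double eigenvalue.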
Non-meromorphic functions Although we have started our discussion assuming f is a meromorphic function in the unit disk, our algorithm can be applied to f with singularities other than poles, as long as f can be evaluated at the sample points. We now explore such cases by examining functions with a branch cut, or an essential singularity. First let f be a log-type function with a branch cut on the ray i/10 − R_+. Figure 10 shows the results. Observe that spurious poles and roots appear along the branch cut; we suspect this is related to a similar phenomenon known for Padé approximants of functions with branch cuts [42].
For a function with an essential singularity, we examine the standard example exp(1/z). The results are in Fig. 11. Again, spurious poles and roots appear near the singularity at 0, but away from the singularity f is bounded and analytic, and is well approximated by the rational interpolant. This is no surprise, as exp(1/z) behaves like an analytic function on the unit circle.
Sample points at Chebyshev points In this example the sample points are taken to be the Chebyshev points γ_j = cos(π(j − 1)/(L − 1)) and the polynomial basis is the Chebyshev polynomials φ_i(x) = T_i(x). This is numerically recommended when most poles lie on the real axis. Here f is again as in (3.5), with 6 equispaced poles on [−1 + 10^{−2}, 1 − 10^{−2}], along with complex poles at 0.2i and 2i. The results are in Fig. 12. For such functions, sampling at Chebyshev points gives better accuracy than roots of unity.
Although not shown in the figure, the accuracy of poles far from [−1, 1] worsens rapidly as the poles lie farther away, or the function type increases. This is analogous to the observation made in Fig. 7: the poles far from the sample points will eventually get ignored (here the poles that converge are those within a narrow ellipse that covers the real interval [−1, 1]).
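A Chebyshev-basis version of the linearized fit can be sketched as follows (hypothetical NumPy code, not the paper's implementation; `cheb_poles` is our name): sample at Chebyshev points, represent p and q in the Chebyshev basis via `chebvander`, and read off the poles with `chebroots`.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

def cheb_poles(f, m, n, L):
    """Linearized rational fit in the Chebyshev basis at Chebyshev
    points; suited to functions with poles near [-1, 1]."""
    x = np.cos(np.pi * np.arange(L) / (L - 1))    # Chebyshev points
    M = np.hstack([f(x)[:, None] * cheb.chebvander(x, n),   # T_0..T_n at x
                   -cheb.chebvander(x, m)])
    c = np.linalg.svd(M)[2][-1]                   # null vector (real data)
    return cheb.chebroots(c[:n + 1])              # poles = roots of q

f = lambda x: 1 / (x - 0.3) + 1 / (x + 0.6)       # real poles inside (-1, 1)
poles = cheb_poles(f, 1, 2, 24)
```

With the "correct" Chebyshev basis at Chebyshev points, the real poles 0.3 and −0.6 are recovered to near machine precision; the same fit in the monomial basis degrades as the degree grows.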
Speed illustration Here we examine the speed and accuracy of ratfun for high-degree rational functions. We take f to be as in (3.5) with poles ξ at the Chebyshev points, scaled by 1 − 10^{−5}, and vary the number of poles (i.e., the degree of q) from 100 to 1000. We sample at the Chebyshev points. In order to examine the breakdown of the runtime we present the runtime for (1) ratfun(f), which inputs only the function (hence starts by finding the type), and (2) ratfun(f,m,n), which inputs the correct type (and hence bypasses the type determination). The results are in Fig. 13. This example illustrates that ratfun can work with rational functions of quite high degree, and that the degree determination step often takes up a dominant part of the runtime.
[Fig. 12: Sampled at Chebyshev points. ratfun(f) samples at 2^5 points and determines the correct type (8, 7). The pole at 2i loses accuracy as we sample more and increase m.]
[Fig. 13: High-degree example, accuracy of computed poles max_j |ξ_j − ξ̂_j| (left) and runtime (right).]
Eigenvalues of a matrix via the resolvent One use of rational approximation and polefinding that has been attracting recent interest [4] is in finding eigenvalues of a matrix A or matrix pencil A − λB via the poles of the projected resolvent u^*(A − zB)^{−1}v, where u, v are some vectors (usually random). We have applied our algorithm to this problem, and observed that it works. However, it is usually not superior to the algorithm presented in [4], which combines a rational filter function with a block subspace whose dimension is proportional to the estimated number of eigenvalues in the region of interest. The distinct feature of [4] (and also the FEAST eigensolver [36]) is that the algorithm works with the subspaces (A − zB)^{−1}V instead of the scalar function u^*(A − zB)^{−1}v, and this is crucial to overcome the difficulty associated with a nearly multiple eigenvalue. We suspect that an extension of our algorithm to work with block subspaces would be possible; we leave this for future work.
Acknowledgements … orthogonalization for the QR factorizations, Stefan Güttel for discussions on RKFIT, and Anthony Austin and Olivier Sète for their comments on an early draft. We thank the referees for their valuable suggestions.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

A The Kronecker canonical form
Here we analyze in detail the generalized eigenvalue problem (4.4) and derive its Kronecker canonical form [17,Ch. 4.5.2]. It shows in particular that multiple poles (if any) can be computed along with their multiplicities, at least in exact arithmetic.
Here, let f be a rational function f(z) = p(z)/q(z) ∈ R_{m,n}, where p(z) and q(z) have no common divisors except for constants (i.e., f = p/q is an irreducible expression). Then p(z) and q(z) are of the form
p(z) = α Π_{j=1}^{J} (z − η_j)^{c_j},  q(z) = β Π_{k=1}^{K} (z − ξ_k)^{d_k},
where η_1, . . . , η_J, ξ_1, . . . , ξ_K ∈ C are distinct and α, β ∈ C \ {0}. Each η_j is a root of f of multiplicity c_j and each ξ_k is a pole of order d_k. Note that M = deg p = Σ_{j=1}^{J} c_j ≤ m and N = deg q = Σ_{k=1}^{K} d_k ≤ n since f ∈ R_{m,n}. For simplicity we analyze the case where L = m + n + 1.

Proposition 2 The matrix pencil
(Proof) It suffices to show that there exist nonsingular constant matrices P, Q ∈ C^{L×L} satisfying (A − λB)P = Q F(λ). We construct such P and Q in two steps: (1) construct P_k, Q_k ∈ C^{L×d_k} satisfying (A.5) for k = 1, . . . , K, and (2) construct S ∈ C^{L×(L−N)}, T_1 ∈ C^{L×(m+1)} and T_2 ∈ C^{L×(n−N)} satisfying (A.6) (using MATLAB notation). Similarly, for p_1(z) = Σ_{j=0}^{n−1} a_j z^j ∈ P_{n−1} and p_2(z) = Σ_{j=0}^{m} b_j z^j ∈ P_m, define the "vector of coefficients" c(p_1(z), p_2(z)). Then the identity (A.11) holds for arbitrary λ. (1) We construct P_k, Q_k ∈ C^{L×d_k} satisfying (A.5) for each k ∈ {1, . . . , K}.
where we define q_k^{(0)} = 0. Then, for d = 1, . . . , d_k, the polynomial q_k^{(d)} satisfies the relation displayed above. From this equation and (A.11), we see that P_k, Q_k ∈ C^{L×d_k} so defined satisfy (A.5). By substituting p_1 = 0 into (A.11), we obtain (A − λB)c(0, p_2(z)) = v(p_2(z)) for an arbitrary p_2 ∈ P_m, which implies (A.6). It remains to show that P and Q are nonsingular. This is proven in the next lemma.

Lemma 2 The matrices P and Q defined above are nonsingular.

(Proof) Since T_1(1 : n, :) is a zero matrix and T_1(n + 1 : m + n + 1, :) is an upper triangular matrix with nonzero diagonal entries, P = [P_1, . . . , P_K, T_1, T_2] is nonsingular if and only if W = [P_1, . . . , P_K, T_2](1 : n, :) is nonsingular. We prove that W is nonsingular by showing that W x = 0 implies x = 0. Write x = [. . . , y_1, . . . , y_{n−N}] ∈ C^n. If W x = 0, then from the definitions (A.13), (A.15) of P_k and T_2, all the coefficients of the polynomial p_x are equal to zero, and hence p_x is the zero polynomial. Therefore the corresponding rational function is also the zero function, which implies that all the elements of x are zero. It follows from the above argument that P is nonsingular. We prove that Q is nonsingular similarly, by showing that Qx = 0 forces h_x(z) = 0 for z = γ_1, . . . , γ_{m+n+1} after multiplying both sides by q(z). Since h_x is a polynomial of degree at most m + n that takes the value 0 at m + n + 1 distinct points, h_x must be the zero polynomial. Using this fact we obtain x = 0 as in the case of P, and hence Q is nonsingular.

Corollary 1
The Kronecker canonical form of the pencil A − λB is as follows: (Proof) We can transform the (m + n − N ) × (m + n − N ) lower-right block of L(λ) to obtain the Kronecker canonical form as follows:
In this section we focus on the case where r is a rational function, and assume that r does not have poles coinciding with the sample points γ_1, . . . , γ_L. When r = p/q is an irreducible expression, the degrees M, N of p, q are uniquely determined, depending only on r. For these M and N, we say that r is of exact type (M, N).
Below we summarize the properties of the matrix C_{mn}(r) when r is a rational function of exact type (M, N) with irreducible expression r = p/q. Note that r is not necessarily in R_{mn}.
From (B.1), we have range(C_{mn}(r)) = {v((p p_1 + q p_2)/q) | p_1 ∈ P_n, p_2 ∈ P_m}.
(Proof) This follows from (B.5) and the fact that p P_n ∩ q P_m = {0}.
We now prove results on the rank of the matrix C_{mn}(r) that we use for finding the type of the rational approximant r in our algorithm. The first result shows that if we take m, n large enough so that m ≥ M, n ≥ N, then the rank of C_{mn}(r) reveals M and N: as long as L ≥ D + 1, which always holds in our algorithm since L ≥ m + n + 1 ≥ D + 1, we obtain dim null(C) = m + n + 2 − (D + 1) = min(m − M, n − N) + 1, which is (3.2).
The next result shows that if we do not take m, n large enough, then this is indicated by C_{mn}(r) not having a null vector, provided we still sample at sufficiently many points.

Proposition 4 Suppose that r of exact type (M, N) does not belong to R_{mn}, i.e., M > m or N > n. If L ≥ max{M + n, m + N} + 1, then the rank of C_{mn}(r) is equal to m + n + 2, i.e., C_{mn}(r) has full column rank.

(Proof) This follows from (B.4) and Lemma 4.
The above two propositions indicate that we can obtain the type (M, N ) of r by combining (1) sufficient sample points, and (2) adjusting m, n so that C mn (r ) has null space of exactly dimension 1. This is the crux of the degree determination process described in Sect. 3.1.

B.1 Analysis of our generalized eigenvalue problem
The above results provide information on the building-block eigenvalue problem (4.4), in terms of its regularity and eigenvalues. We say that a (possibly rectangular) matrix pencil A − λB is regular if the matrix A − λB has full column rank for some value of λ. Conversely, if λ is not a pole of f, then (z − λ)f(z) is of exact type (M + 1, N), and hence by Proposition 4, C_{m,n−1}((z − λ)f(z)) has full column rank.
We are thus able to correctly compute the poles of f provided that we take at least one of m, n equal to its correct value M or N.

C Condition number of eigenvalues
Here we analyze the condition number of the eigenvalues of the matrix pencil [A_1, A_2] − λ[B_1, O], which we write simply as A − λB. For simplicity we focus on the case where the sample points are roots of unity, and examine the conditioning as n is fixed and the number of sample points grows along with m. We shall show that the eigenvalues outside the unit disk become increasingly ill-conditioned, which explains the observation in Fig. 7.
Assume that f = p/q is irreducible and f has N simple poles ξ_1, . . . , ξ_N. We consider the square generalized eigenvalue problem (4.4) with n = N (i.e., the denominator degree is fixed to the correct value), L = m + n + 1 and {γ_j}_{j=1}^{L} = {exp(2πi j/L)}_{j=1}^{L}. For m ≥ M, A − λB has an eigenvalue equal to ξ_k for each k. We investigate how the condition number of each eigenvalue ξ_k changes as we increase m (and hence also L = m + n + 1).
The absolute condition number of a simple eigenvalue of a matrix pencil A − λB is known [45] to be proportional to ‖x‖‖y‖/|y^* B x|, where y and x are the corresponding left and right eigenvectors. Thus we need to identify the eigenvectors for ξ_k.
The right eigenvector x_k such that [A_1, A_2]x_k = ξ_k [B_1, O]x_k is given by x_k = [c_{q̃}^T, c_p^T]^T, where q̃(z) = q(z)/(z − ξ_k). This x_k satisfies Bx_k = v(f q̃).
In fact, the trapezoidal rule approximation to the integral I in (C.2) with sample points {γ_1, γ_2, . . . , γ_L} is given by y_k^* B x_k / L. Now suppose that |ξ_k| > 1, that is, ξ_k lies outside the unit disk. Then the integrand is analytic in the disk |z| < |ξ_k| and has a simple pole on the circle |z| = |ξ_k|; hence I = 0, and the trapezoidal rule approximation satisfies |y_k^* B x_k|/L = O(|ξ_k|^{−L}) (see [49] for the trapezoidal rule and its properties). Furthermore, since ‖y_k‖ = Θ(√L) and ‖x_k‖ = Ω(1), the condition number κ(ξ_k) of the eigenvalue ξ_k satisfies
κ(ξ_k) = ‖y_k‖‖x_k‖ / |y_k^* B x_k| = Ω(L^{−1/2} |ξ_k|^L) for |ξ_k| > 1. (C.6)
This means that the condition numbers of eigenvalues outside the unit circle grow exponentially as we use more sample points.
On the other hand, if the eigenvalue ξ_k is inside the unit disk, i.e., |ξ_k| < 1, then the condition number of ξ_k behaves differently. Indeed, since we have
lim_{L→∞} y_k^* B x_k / L = I ≠ 0, (C.7)
we see that the condition number of the eigenvalue ξ_k satisfies (C.8), in sharp contrast to (C.6). Equations (C.6) and (C.8) imply that, when the number of sample points grows, the computed eigenvalues outside the unit disk lose accuracy exponentially, while those inside do not.
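The geometric behavior underlying (C.6) and (C.7) can be checked numerically (an illustrative NumPy sketch; the helper name `trap_cauchy` is ours): the trapezoidal rule on the unit circle applied to the Cauchy integral of 1/(z − ξ) decays like |ξ|^{−L} when ξ lies outside the circle, and converges geometrically to 1 when ξ lies inside.

```python
import numpy as np

def trap_cauchy(xi, L):
    """L-point trapezoidal approximation on the unit circle to
    (1/(2*pi*i)) * contour integral of dz/(z - xi), which equals
    1 for |xi| < 1 and 0 for |xi| > 1."""
    g = np.exp(2j * np.pi * np.arange(L) / L)
    return np.mean(g / (g - xi))      # from the substitution z = e^{i*theta}

# pole outside the unit circle: the approximation decays like |xi|**(-L)
errs_out = [abs(trap_cauchy(2.0, L)) for L in (8, 16, 32)]
# pole inside: the approximation converges geometrically to the true value 1
err_in = abs(trap_cauchy(0.5, 16) - 1)
```

For the outside pole the quantity playing the role of |y_k^* B x_k|/L shrinks geometrically as L grows, which is exactly what drives the exponential growth of κ(ξ_k) in (C.6).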
We confirmed this in our numerical experiments as the figures in Sect. 6 show.