Krylov subspace methods for estimating operator-vector multiplications in Hilbert spaces

The Krylov subspace method has been investigated and refined for approximating the behavior of finite- or infinite-dimensional linear operators. It has been used for approximating eigenvalues, solutions of linear equations, and operator functions acting on vectors. Recently, in time-series data analysis, much attention has been paid to the Krylov subspace method as a viable means of estimating the multiplication of a vector by an unknown linear operator referred to as a transfer operator. In this paper, we investigate a convergence analysis for Krylov subspace methods for estimating operator-vector multiplications.


Introduction
Linear operators are used in various tasks in engineering and scientific research, such as simulations and data analysis. A classical example of a linear operator is a differential operator used for describing various natural phenomena. Krylov subspace methods have been actively researched in numerical linear algebra for approximating the behavior of such given operators, for example, approximating eigenvalues, solutions of linear equations, and operator functions acting on vectors, which provides approximations of, or information about, the solutions of differential equations [16, 19, 24, 27, 31-33, 35, 36, 43, 45]. In many cases, problems in infinite-dimensional spaces such as differential equations are discretized by, for example, a finite difference method [8] or a finite element method [1], and are thereby transformed into finite-dimensional problems with matrices, after which Krylov subspace methods are applied to the matrices. On the other hand, Krylov subspace methods for operators in infinite-dimensional Hilbert spaces without discretization have also been investigated, and results more general than those for matrices have been developed [6, 9-12, 15, 26, 28, 30].
Meanwhile, linear operators that represent time evolutions in dynamical systems, called transfer operators, are being investigated in relation to various fields such as machine learning, physics, molecular dynamics, and control engineering [18, 21, 25, 40-42]. Unknown transfer operators are estimated through data generated by dynamical systems. Since transfer operators are linear even if the underlying dynamical systems are nonlinear, Krylov subspace methods can be used to understand nonlinear dynamical systems [3, 4, 14, 20]. To make the algorithms computable only with data, transfer operators are often discussed in relation to RKHSs (reproducing kernel Hilbert spaces). The Arnoldi and shift-invert Arnoldi methods have been proposed as Krylov subspace methods for transfer operators in RKHSs. The Arnoldi method is a standard Krylov subspace method [3, 35], but its convergence requires the operator to be bounded. However, not all transfer operators are bounded [17]. For example, transfer operators defined in the RKHS associated with the Gaussian kernel are unbounded if the dynamical system is nonlinear and deterministic. Thus, the shift-invert Arnoldi method was also considered [14]. When the shift-invert Arnoldi method is applied, a shifted and inverted operator (γI − K)^{−1}, for some γ not contained in the spectrum of K, is considered instead of the possibly unbounded K.
The main difference between the classical settings assumed for the Krylov subspace methods mentioned in the first paragraph and those for transfer operators is whether the information of the model is given or not. In the classical setting, a differential operator is given, and a model-driven approach with the operator is applied. On the other hand, in the above setting for transfer operators, neither a dynamical system nor a transfer operator is given. Instead, data generated by the system is given, and a data-driven approach is applied. The purpose of applying the Krylov subspace method also differs in some situations. When we apply it to a transfer operator, denoted as K, one important task is to estimate K^n v for a given vector v and some n ∈ ℕ, because K is unknown. On the other hand, in the classical setting for a given operator such as a differential operator, denoted as A, the vector Av for a given vector v is already known, because both A and v are known. The main task of the Krylov subspace method in such a setting is to estimate f(A)v for a given vector v and a function f such as f(z) = z^{−1} or f(z) = e^z, but not f(z) = z^n. For this reason, Krylov subspace methods for estimating operator-vector multiplications have not been discussed in the classical setting of numerical linear algebra. Meanwhile, although Krylov subspace methods for estimating operator-vector multiplications have been proposed in machine learning, their convergence analysis has not been fully investigated.
The objective of this paper is to analyze the convergence of such Krylov subspace methods for estimating operator-vector multiplications. We define a "residual" for approximating K^n v for a vector v and analyze the convergence of the residuals of the Krylov approximations. The classical Krylov subspace methods for estimating f(A)v are frequently associated with residuals. For f(z) = z^{−1}, for example, the GMRES (generalized minimal residual method) approximation minimizes the residual in a Krylov subspace, and the convergence of the residual is superlinear [9, 28, 44]. Moreover, in BiCG (biconjugate gradient) type approximations, the residual, or a value relevant to the residual, is orthogonal to the Krylov subspace [43]. For a more general f, a generalized residual has been proposed for evaluating the convergence of approximations [2, 13, 16, 34].
In our case, we show that the Arnoldi approximation converges to the minimizer of the residual. To show this, an error bound for a Krylov approximation of an operator function acting on a vector is used [10-12, 15, 27]. For the shift-invert Arnoldi method, the convergence analysis is not straightforward. At first glance, the problem of estimating K^n v seems to be the same as that of estimating the operator function f(K) acting on the vector v, where f(z) = z^n. However, the situation is different from that of the classical Krylov subspace methods in terms of operator functions. The existing error bound for the Krylov approximation of f(K)v requires the holomorphicity of f on the spectrum of K. On the other hand, the function f(z) = (γ − z^{−1})^n, for which "f((γI − K)^{−1}) = K^n" holds formally, is not holomorphic at 0, but 0 is contained in the spectrum of (γI − K)^{−1} if K is unbounded. We resolve this problem through the factor K^{−n} that appears in the residual.
This paper is structured in the following manner. In Sect. 2, to explain why operator-vector multiplications need to be estimated for data analysis, we review the definition of a transfer operator and the Krylov subspace methods for it. In Sect. 3, we generalize the problem to Krylov approximations for estimating operator-vector multiplications for linear operators in a Hilbert space and investigate a convergence analysis. In Sect. 4, we empirically confirm the results of Sect. 3. Section 5 concludes the paper.

Notations
Linear operators are denoted with standard capital letters, except for m × m matrices, which are denoted in bold. Calligraphic capital letters and Italicized Greek capital letters denote sets. The inner product and norm are denoted as ⟨⋅, ⋅⟩ and ‖ ⋅ ‖ , respectively.

Background
In this section, we briefly review the definition of Perron-Frobenius operators and Krylov subspace methods for Perron-Frobenius operators [14,20]. Perron-Frobenius operators are transfer operators often discussed in relation to RKHSs, and their Krylov subspaces naturally appear [18,20,22]. The adjoint operators of Perron-Frobenius operators are referred to as the Koopman operators [23], which are also transfer operators and have been researched for data-driven approaches [3,4,21,40,42].

Perron-Frobenius operator in RKHS
Consider the following dynamical system with random noise [14]:

X_{t+1} = h(X_t) + ξ_t,   (1)

where t ∈ ℤ_{≥0}, (Ω, F) is a measurable space, (X, B) is a Borel measurable and locally compact Hausdorff vector space, X_t and ξ_t are random variables from Ω to X, and h: X → X is a generally nonlinear map. Assume {ξ_t}_{t∈ℤ} is an i.i.d. stochastic process and that ξ_t is independent of X_t. The random variable ξ_t corresponds to random noise in X.
Let P be a probability measure on Ω. The nonlinear time evolution of X_t in the dynamical system (1) is regarded as a linear time evolution of the push-forward measure X_t∗P, defined by X_t∗P(B) = P(X_t^{−1}(B)) for B ∈ B. To describe this time evolution in a Hilbert space, an RKHS [37] is used. An RKHS is a Hilbert space constructed from a map k: X × X → ℂ called a positive definite kernel. For x ∈ X, the map φ: X → ℂ^X defined as φ(x) = k(x, ⋅) is called a feature map. Let H_{k,0} be the vector space defined as H_{k,0} = Span{φ(x) | x ∈ X}. In H_{k,0}, the inner product associated with k is defined, and the completion of H_{k,0}, denoted as H_k, is called an RKHS. An observation z ∈ X is regarded as a vector φ(z) in H_k through φ. Moreover, if k is bounded, continuous, and c_0-universal, then the space of all complex-valued finite regular Borel measures on X, denoted as M(X), is densely embedded into H_k. That is, the map Φ: M(X) → H_k defined as μ ↦ ∫_{x∈X} φ(x) dμ(x) is injective [39] and Φ(M(X)) is dense in H_k [14]. Here, c_0-universal means that H_k is dense in the space of all continuous functions that vanish at infinity. The map Φ is called a kernel mean embedding [29]. For example, the Gaussian kernel e^{−c‖x−y‖_2^2} and the Laplacian kernel e^{−c‖x−y‖_1} with c > 0 for x, y ∈ ℝ^d are bounded and continuous c_0-universal kernels. Therefore, a complex-valued finite regular Borel measure μ is regarded as a vector Φ(μ) in a dense subset of the Hilbert space H_k. Since the map Φ: M(X) → H_k is linear, it is possible to define a linear operator K: Φ(M(X)) → H_k in H_k, called a Perron-Frobenius operator, as follows:

K Φ(μ) = Φ(h_t∗(μ ⊗ P)),   (2)

where h_t: X × Ω → X is defined as (x, ω) ↦ h(x) + ξ_t(ω). Since ξ_t and X_t are independent, Φ(h_t∗(X_t∗P ⊗ P)) = Φ((h(X_t) + ξ_t)∗P) holds, and K maps Φ(X_t∗P) to

Φ(X_{t+1}∗P). In addition, since {ξ_t}_{t∈ℤ} is an i.i.d. process, it can be shown that K does not depend on t.
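Concretely, the kernel mean embedding Φ of an empirical measure and inner products between embedded measures are computable from the kernel alone, via the reproducing property. The following is a minimal Python sketch for X = ℝ with the Gaussian kernel; the function names are ours, not from the paper.

```python
import numpy as np

def gaussian_kernel(x, y, c=1.0):
    """Gaussian kernel k(x, y) = exp(-c * |x - y|^2) on X = R."""
    return np.exp(-c * np.abs(x - y) ** 2)

def embed_empirical_measure(samples, c=1.0):
    """Kernel mean embedding of the empirical measure
    mu = (1/N) * sum_i delta_{x_i}: the RKHS function
    Phi(mu) = (1/N) * sum_i k(x_i, .)."""
    samples = np.asarray(samples, dtype=float)
    return lambda z: float(np.mean(gaussian_kernel(samples, z, c)))

def rkhs_inner(samples_a, samples_b, c=1.0):
    """Inner product <Phi(mu), Phi(nu)> of two embedded empirical
    measures, computable from the kernel alone by the reproducing
    property: (1/(N*M)) * sum_{i,j} k(x_i, y_j)."""
    a = np.asarray(samples_a, dtype=float)[:, None]
    b = np.asarray(samples_b, dtype=float)[None, :]
    return float(gaussian_kernel(a, b, c).mean())
```

Such kernel evaluations are all that the algorithms of the next subsections need: Gram-Schmidt steps and projections in H_k reduce to sums of kernel values.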

Krylov subspace methods for Perron-Frobenius operators
Let {x_0, x_1, …} ⊆ X be observed time-series data from the dynamical system (1), i.e., x_t = X_t(ω_0) for some ω_0 ∈ Ω. By using Krylov subspace methods, we estimate K^n φ(x_t) for x_t ∈ X to predict φ(x_{t+n}) through the available data. In the mth Krylov step, the data is split into S datasets. Examples of the choice of S are S = m + 1 and S = M for a sufficiently large natural number M. Let μ^S_{t,N} = (1/N) Σ_{i=0}^{N−1} δ_{x_{t+iS}} (t = 0, …, m) be empirical measures constructed from the datasets, where N ∈ ℕ and δ_x denotes the Dirac measure at x ∈ X. It is assumed that μ^S_{t,N} converges weakly to a finite regular Borel measure μ^S_t as N → ∞. To construct the Krylov subspace only with the observed data {x_0, x_1, …}, the following equality for the average of the noise ξ_t is assumed for any measurable and integrable function f:

∫_Ω f(ξ_t(ω)) dP(ω) = lim_{N→∞} (1/N) Σ_{i=0}^{N−1} f(ξ_{t+iS}(ω_0)).   (3)

The left- and right-hand sides of Eq. (3) represent the space average and the time average of ξ_t, respectively. The same types of assumptions as Eq. (3) have also been considered in other studies [4, 42]. By applying the above settings and assumptions, the Arnoldi and shift-invert Arnoldi approximations for the Perron-Frobenius operator K, defined as in Eq. (2), are computed as explained in the following subsections.
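The S-fold splitting of the time series into the samples that define the empirical measures μ^S_{t,N} can be realized, for instance, as follows; the function name is our choice.

```python
import numpy as np

def split_empirical_samples(data, m, N, S):
    """For t = 0, ..., m, collect the samples x_{t+iS} (i = 0, ..., N-1)
    defining the empirical measure mu^S_{t,N} = (1/N) * sum_i delta_{x_{t+iS}}."""
    data = np.asarray(data, dtype=float)
    assert len(data) >= m + (N - 1) * S + 1, "not enough observations"
    return [data[t + S * np.arange(N)] for t in range(m + 1)]
```

Each returned array holds the N samples of one empirical measure; consecutive measures are shifted by one time step, while samples within one measure are S steps apart.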

The Arnoldi method
In this section, the Perron-Frobenius operator K is assumed to be bounded. Under the assumption (3), since Φ is continuous, the equality K^t Φ(μ^S_0) = Φ(μ^S_t) is derived. Thus, the space spanned by the embedded measures, Span{Φ(μ^S_0), …, Φ(μ^S_{m−1})}, is an m-dimensional Krylov subspace of the operator K and the vector Φ(μ^S_0):

K_m(K, Φ(μ^S_0)) = Span{Φ(μ^S_0), KΦ(μ^S_0), …, K^{m−1}Φ(μ^S_0)}.

Remark 2.1
If S depends on m, the initial vector Φ(μ^S_0) depends on m. In this case, the inclusion K_{m−1}(K, Φ(μ^S_0)) ⊆ K_m(K, Φ(μ^S_0)) does not always hold. On the other hand, if S does not depend on m, the inclusion holds.

Let q_1, …, q_m be an orthonormal basis of the Krylov subspace K_m(K, Φ(μ^S_0)) obtained through Gram-Schmidt orthonormalization, and let Q_m = [q_1, …, q_m]. Then Q_m Q_m^*, where * means adjoint, is a projection operator onto the Krylov subspace. There exists an invertible matrix transforming the Krylov vectors Φ(μ^S_0), …, Φ(μ^S_{m−1}) into q_1, …, q_m, so the projected matrix K̃_m = Q_m^* K Q_m is computable only with inner products of embedded empirical measures. This makes it possible to compute the following Arnoldi approximation of K^n φ(z) for an observable z ∈ X only with the observed data {x_0, x_1, …}:

a^Arnoldi_m = Q_m K̃_m^n Q_m^* φ(z).
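Since only the Krylov vectors v_0, Kv_0, …, K^m v_0 and the target vector are available, while K itself is unknown, the projected matrix and the Arnoldi approximation must be formed from those vectors alone. A minimal finite-dimensional Python sketch of this construction, with ℂ^d standing in for the RKHS (function and variable names are ours, not the paper's):

```python
import numpy as np

def arnoldi_approximation(krylov_vecs, v, n):
    """Arnoldi approximation of K^n v built ONLY from the Krylov vectors
    v0, K v0, ..., K^m v0 (columns of krylov_vecs) and v, as in the
    data-driven setting where K itself is unknown."""
    V = np.asarray(krylov_vecs, dtype=complex)  # shape (d, m+1)
    m = V.shape[1] - 1
    # Orthonormal basis of K_m(K, v0) = span{v0, ..., K^{m-1} v0}.
    Q, _ = np.linalg.qr(V[:, :m])
    # K Q is computable without K: express Q in the Krylov basis and
    # shift the coefficients, since K maps span{v0..K^{m-1}v0} into
    # span{Kv0..K^m v0}.
    coeffs, *_ = np.linalg.lstsq(V[:, :m], Q, rcond=None)
    KQ = V[:, 1:] @ coeffs
    K_tilde = Q.conj().T @ KQ          # projected operator Q* K Q
    # Project v, apply K_tilde^n, and map back to the ambient space.
    return Q @ (np.linalg.matrix_power(K_tilde, n) @ (Q.conj().T @ v))
```

When the Krylov subspace exhausts the whole space, the approximation reproduces K^n v exactly; in general it is accurate only up to the projection error of v onto the subspace, which is the point analyzed in Sect. 3.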

The shift-invert Arnoldi method
The convergence of the Arnoldi method along m is not always attained if K is unbounded [14]. According to Ikeda et al. [17], not all Perron-Frobenius operators are bounded. For this reason, the shift-invert Arnoldi method is also considered. Let γ ∉ σ(K) be fixed, where σ(K) is the spectrum of K, under the assumption σ(K) ≠ ℂ, and consider the bounded bijective operator (γI − K)^{−1}. Under the assumption (3), an equality analogous to that of the Arnoldi method is derived, so that a Krylov subspace of (γI − K)^{−1} can be constructed only with the observed data. Similar to the Arnoldi method, let q_1, …, q_m be an orthonormal basis of the Krylov subspace K_m((γI − K)^{−1}, u_m) obtained through Gram-Schmidt orthonormalization, and let Q_m = [q_1, …, q_m]. On the basis of this observation, the following shift-invert Arnoldi approximation of K^n φ(z) for an observable z ∈ X is computed:

a^SIA_m = Q_m K̃_m^n Q_m^* φ(z),

where K̃_m is obtained from the projected operator Q_m^* (γI − K)^{−1} Q_m.
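A finite-dimensional sketch of the shift-invert Arnoldi approximation follows, assuming a known matrix stand-in for K so the construction can be checked end to end. Recovering the projected operator via K̃ = γI − L̃^{−1} is one natural choice for this sketch, not necessarily the paper's exact formula.

```python
import numpy as np

def shift_invert_arnoldi(K_mat, v0, v, n, m, gamma):
    """Finite-dimensional sketch of the shift-invert Arnoldi
    approximation of K^n v. The Krylov subspace is built for
    L = (gamma*I - K)^{-1}, which stays bounded even when K does not;
    gamma must lie outside the spectrum of K."""
    d = K_mat.shape[0]
    L = np.linalg.inv(gamma * np.eye(d) - K_mat)
    # Krylov subspace K_m(L, v0) = span{v0, L v0, ..., L^{m-1} v0}.
    V = np.column_stack([np.linalg.matrix_power(L, i) @ v0 for i in range(m)])
    Q, _ = np.linalg.qr(V)
    L_tilde = Q.conj().T @ L @ Q              # projected (gamma*I - K)^{-1}
    # Recover a projected K from L_tilde: K_tilde = gamma*I - L_tilde^{-1}.
    K_tilde = gamma * np.eye(m) - np.linalg.inv(L_tilde)
    return Q @ (np.linalg.matrix_power(K_tilde, n) @ (Q.conj().T @ v))
```

As with the Arnoldi sketch, the approximation is exact when the subspace spans the whole space; the benefit is that only the bounded operator (γI − K)^{−1} is ever iterated.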

Convergence analysis
In this section, we provide a convergence analysis of the Arnoldi method and shift-invert Arnoldi method described in Sect. 2. The problem is generalized to a separable complex Hilbert space H and a linear operator K on H. In Sect. 3.1, we describe the generalized problem. In Sect. 3.2, we define a residual of an approximation of K^n v. Then, we investigate the relationship between the two methods and the residuals in Sects. 3.3 and 3.4.

The general setting for Krylov subspace methods for estimating operator-vector multiplications
Let H be a separable complex Hilbert space, let K: D → H be an unknown linear map, where D is a dense subspace of H, and let v and v_0 be given vectors in H. We assume K^i v_0 ∈ D for any natural number i; this is reasonable since, by the definition of the Perron-Frobenius operator K, Kv ∈ D holds for D = Φ(M(X)). The purpose of the Krylov subspace method is to estimate K^n v for a given n ∈ ℕ by using only v and the Krylov vectors v_0, Kv_0, …, K^m v_0, which we denote as v_0, …, v_m. For the shift-invert Arnoldi method, we give an expression of K̃_m different from that for the Arnoldi method.

A residual of an approximation of operator-vector multiplication
Assume 0 ∉ σ(K). We define a residual of an approximation a_m of K^n v as follows:

res(a_m) = v − K^{−n} a_m.   (6)

Although the approximation error K^n v − a_m is generally not available since K^n v is unknown, K^{−n} a_m is available in some cases. For example, if K is a Perron-Frobenius operator and we know the past observations x_{−1}, …, x_{−n}, then we can calculate K^{−n} applied to the Krylov vectors, since they correspond to empirical measures shifted backward in time by n steps. Then, we can also calculate K^{−n} a^Arnoldi_m. In fact, the residual (6) is a reasonable criterion for evaluating the convergence of the approximation for two reasons. First, v − K^{−n} a_m is exactly the classical residual of a_m regarded as an approximate solution of the linear equation K^{−n} x = v, whose solution is x = K^n v. Second, the following proposition shows that the value v − K^{−n} a_m can be decomposed into a generalized residual of the Krylov approximation proposed by Saad [34] and Hochbruck et al. [16] and the error with respect to projecting v into a Krylov subspace.
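As a quick illustration of why the residual is computable while the error is not, consider the following finite-dimensional sketch (names ours): forming res(a_m) only requires applying K^{−n} to the candidate a_m, never forming the unknown target K^n v.

```python
import numpy as np

def residual(K_mat, v, a_m, n):
    """Residual res(a_m) = v - K^{-n} a_m of an approximation a_m of
    K^n v. It vanishes exactly when a_m = K^n v, and it requires only
    the action of K^{-n}, available from past observations, even though
    K^n v itself is unknown."""
    K_inv_n = np.linalg.matrix_power(np.linalg.inv(K_mat), n)
    return v - K_inv_n @ a_m
```

When a_m equals K^n v exactly, the residual is the zero vector; any perturbation of a_m shows up in the residual after damping by K^{−n}.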

Convergence analysis for the Arnoldi method
In this section, we assume K is bounded and 0 ∉ σ(K). The Arnoldi approximation a^Arnoldi_m = Q_m K̃_m^n Q_m^* v is obtained through two kinds of projections. First, the vector v ∈ H is projected onto the Krylov subspace K_m(K, v_0). Then, K acts on the projected vector in K_m(K, v_0), and the result is projected back onto the Krylov subspace; this is repeated n times. Note that we do not need the first projection in the classical Krylov subspace method for approximating f(A)v for a given linear operator A, vector v, and function f, since we can compute A^i v for i = 1, …, m − 1 and construct the Krylov subspace of A and v. On the other hand, we cannot construct the Krylov subspace of K and v in our current case, since K is unknown and only K^i v_0 for i = 1, …, m − 1, not K^i v, and a vector v_0 are given. This prevents us from evaluating the convergence speed of the approximation error or residual directly, since the convergence speed of the approximation depends on that of the projected vector Q_m Q_m^* v to the original vector v. Therefore, we first consider the minimizer of the residual in a Krylov subspace and evaluate the difference between the Arnoldi approximation and this minimizer.
In fact, since the projection Q_m Q_m^* is orthogonal, the projected vector Q_m Q_m^* v minimizes the difference from the original vector v, i.e., ‖v − Q_m Q_m^* v‖ = min_{u∈K_m(K,v_0)} ‖v − u‖. Let ã_m = K^n Q_m Q_m^* v. Since K^n K_m(K, v_0) = K_m(K, v_n), the vector ã_m minimizes ‖v − K^{−n} u‖ for all u ∈ K_m(K, v_n), i.e., ‖res(ã_m)‖ = min_{u∈K_m(K,v_n)} ‖v − K^{−n} u‖. However, in practice, K_m(K, v_n) is unavailable with only v, v_0, …, v_m. Therefore, ã_m is also unavailable. Thus, a^Arnoldi_m, instead of ã_m, is used for estimating K^n v, and we evaluate the difference between a^Arnoldi_m and ã_m. Let D_ρ = {z ∈ ℂ | |z| ≤ ρ} be the closed disk of radius ρ > 0, let W(K) = {⟨v, Kv⟩ | v ∈ D, ‖v‖ = 1} be the numerical range of K, and let ℂ̄ = ℂ ∪ {∞} be the extended complex plane. Moreover, let ψ be a conformal map from ℂ̄⧵cl(W(K)) to ℂ̄⧵D_ρ that satisfies ψ(∞) = ∞ and lim_{z→∞} ψ(z)/z = 1, and let Ω_r be the region enclosed by the contour {z ∈ ℂ | |ψ(z)| = r} for r > ρ. Here, cl(W(K)) is the closure of W(K), and by the Riemann mapping theorem the map ψ exists. The following theorem is deduced.

Theorem 3.2 Assume the function f_m associated with the residual difference is holomorphic in Ω_r. Then ‖res(a^Arnoldi_m) − res(ã_m)‖ ≤ C_1 C_2(m) (ρ/r)^m, where C_1 > 0 is a constant and C_2(m) > 0 depends on m.

Proof (Proof of Theorem 3.2) Let P_{m−1} be the set of all polynomials of order at most m − 1. By Crouzeix et al. [5], the following bound is deduced for any p ∈ P_{m−1}, with the constant 1 + √2:

‖res(a^Arnoldi_m) − res(ã_m)‖ ≤ (1 + √2) ‖f_m − p‖_{∞,cl(W(K))}.

In addition, for a linear operator K and a map f: ℂ → ℂ that is holomorphic in the interior of cl(W(K)) and continuous on cl(W(K)), the norm ‖f‖_{∞,cl(W(K))} is defined as ‖f‖_{∞,cl(W(K))} = sup_{z∈cl(W(K))} |f(z)|. By taking the infimum among p ∈ P_{m−1}, we obtain

‖res(a^Arnoldi_m) − res(ã_m)‖ ≤ (1 + √2) inf_{p∈P_{m−1}} ‖f_m − p‖_{∞,cl(W(K))}.   (9)

Krylov subspace methods for operator-vector multiplications
In fact, the infimum in the inequality (9) can be taken among p ∈ {p ∈ P_{m−1} | ‖p‖_{∞,cl(W(K))} ≤ 2‖f_m‖_{∞,cl(W(K))}}, which is a compact set. Indeed, for a polynomial p ∈ P_{m−1} satisfying ‖p‖_{∞,cl(W(K))} > 2‖f_m‖_{∞,cl(W(K))}, we have ‖f_m − p‖_{∞,cl(W(K))} ≥ ‖p‖_{∞,cl(W(K))} − ‖f_m‖_{∞,cl(W(K))} > ‖f_m‖_{∞,cl(W(K))} = ‖f_m − 0‖_{∞,cl(W(K))}, and 0 ∈ P_{m−1}. Therefore, the infimum in the inequality (9) can be replaced with the minimum. By Ellacott [7, Corollary 2.2], this factor is bounded as

min_{p∈P_{m−1}} ‖f_m − p‖_{∞,cl(W(K))} ≤ C (ρ/r)^m C_2(m),

where C > 0 is a constant and C_2(m) = max_{z∈Ω_r} |f_m(z)|, which completes the proof of the theorem. ∎

As a result, the inequality of Theorem 3.2 is derived. We now evaluate C_2(m). By the inequality (10) and the holomorphicity of f_m, a bound on C_2(m) is obtained, and applying the inequality (11) yields an explicit estimate of the decay of ‖res(a^Arnoldi_m) − res(ã_m)‖ along m.

Convergence analysis for the shift-invert Arnoldi method
The convergence of the Arnoldi method is not guaranteed when K is unbounded. Moreover, although Theorem 3.2 requires an assumption about the numerical range of K, it is generally hard to calculate the numerical range of a linear operator in an infinite-dimensional space. Therefore, we also consider the shift-invert Arnoldi method.
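Although W(K) cannot be computed exactly in infinite dimensions, it can at least be probed in a finite-dimensional surrogate by sampling random unit vectors, in the same spirit as the disk estimate used in Sect. 4. A sketch (function name ours):

```python
import numpy as np

def sample_numerical_range(K_mat, trials=2000, seed=0):
    """Sample the numerical range W(K) = {<v, K v> : ||v|| = 1} with
    random complex unit vectors. The samples form a cheap inner
    estimate of W(K), which can then be compared against a disk D_rho
    around the origin."""
    rng = np.random.default_rng(seed)
    d = K_mat.shape[0]
    pts = np.empty(trials, dtype=complex)
    for i in range(trials):
        z = rng.normal(size=d) + 1j * rng.normal(size=d)
        z /= np.linalg.norm(z)
        pts[i] = np.vdot(z, K_mat @ z)   # <z, K z>, vdot conjugates z
    return pts
```

For a Hermitian matrix the sampled points fall on the real segment between the extreme eigenvalues, which is a quick sanity check of the sampler.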
The shift-invert Arnoldi approximation a^SIA_m = Q_m K̃_m^n Q_m^* v can also be obtained through two kinds of projections, similar to the Arnoldi method. However, in this case, instead of K itself, a polynomial of (γI − K)^{−1} that approximates K acts on the projection of v onto K_m((γI − K)^{−1}, u_m).
Let n < m. To address the factor K^{−n} in the residual, we slightly modify the Krylov subspace K_m((γI − K)^{−1}, u_m) and consider the space K_m((γI − K)^{−1}, w_{m−n}), spanned by vectors w_1, …, w_m. Since the projection Q_m Q_m^* is orthogonal, the projected vector Q_m Q_m^* v minimizes the difference from the original vector v. Since (γI − K)^i K^{−n} w_{m−n} can be represented as a linear combination of K^{−n+i} w_{m−n}, …, K^{−n} w_{m−n} for i = 1, …, n, the minimization can be carried out over the modified subspace. As a result, ã_m minimizes ‖v − K^{−n} u‖ for all u ∈ Span{w_1, …, w_m}, i.e., ‖res(ã_m)‖ = min_{u∈Span{w_1,…,w_m}} ‖v − K^{−n} u‖. However, in practice, Span{w_1, …, w_m} is unavailable with only v, v_0, …, v_m. Therefore, ã_m is also unavailable. Thus, a^SIA_m, instead of ã_m, is used for estimating K^n v. Concerning the difference between a^SIA_m and ã_m, the following theorem (Theorem 3.5) is deduced, in which r, ψ, ρ, and Ω_r are defined in the same manner as those for the Arnoldi approximation by replacing W(K) with W((γI − K)^{−1}). We use the following lemma for deriving Theorem 3.5. The vector Q_m g_γ(L̃_m) Q_m^* p_n(K^{−1}) w_{m−n} on the right-hand side of Eq. (13), where L̃_m denotes the projection of (γI − K)^{−1} onto the Krylov subspace, is equivalent to the shift-invert Arnoldi approximation of the operator function g_γ((γI − K)^{−1}) acting on the vector p_n(K^{−1}) w_{m−n}. In the same manner as for the Arnoldi method, Theorem 3.5 is proved as follows.

Remark 3.7
If K is unbounded, 0 is contained in σ((γI − K)^{−1}), and hence 0 is contained in Ω_r. This is why g_γ, the function satisfying g_γ((γI − K)^{−1}) = K^{−n} (explicitly, g_γ(z) = z^n (γz − 1)^{−n}), which is always holomorphic at 0 and is holomorphic in all of ℂ when γ = 0, is used instead of f_γ, which is not holomorphic at 0, for evaluating Eq. (13).

Remark 3.8
We can choose r arbitrarily as long as g_γ is holomorphic in Ω_r. The choice of r corresponds to a trade-off between the decay rate ρ/r of ‖res(a^SIA_m) − res(ã_m)‖ and the magnitude of the constant C_2. Indeed, the decay rate ρ/r is small if r is large. On the other hand, the larger r becomes, the smaller the distance between Ω_r and 1/γ, the singular point of g_γ, becomes. Thus, the larger r becomes, the larger C_2 = max_{z∈Ω_r} |g_γ(z)| becomes.
As a result, if γ is chosen so that g_γ is holomorphic in Ω_r, and if the factor ‖p_n(K^{−1}) w_{m−n}‖ is bounded by some constant, then the difference between the residuals of a^SIA_m and ã_m, the minimizer of the residual, decays exponentially as m becomes larger. Unfortunately, evaluating ‖p_n(K^{−1}) w_{m−n}‖ theoretically is a challenging task. Thus, we numerically confirm that the factor ‖p_n(K^{−1}) w_{m−n}‖ is bounded by a constant in Sect. 4.2.

Numerical experiments
Several numerical experiments are presented in this section. They are designed to illustrate that the shift-invert Arnoldi method performs better than the Arnoldi method and to confirm the results of Theorems 3.2 and 3.5 numerically. All numerical computations were executed with Python 3.6.
All the experiments are under the setting described in Sect. 2. Therefore, the Krylov subspaces are subspaces of RKHSs, and discrepancies are measured by norms in the RKHSs. In practical computations, all the measures μ^S_t in the algorithms are replaced by μ^S_{t,N} for N ∈ ℕ. The convergence of the approximation constructed with μ^S_{t,N} to the one constructed with μ^S_t is shown in [14, Section 4.3].

Comparison between the Arnoldi and shift-invert Arnoldi methods
To illustrate that the shift-invert Arnoldi method performs better than the Arnoldi method, a dynamical system of the form (1) on X ⊆ ℝ is considered; the results are shown in Fig. 1. If K is bounded, and if the Krylov subspace converges to a dense subset of H_k, then Q_m K̃_m Q_m^* converges strongly to K. Therefore, a_m converges to Kv in H_k, and ‖a_m − a_{m−1}‖ decreases as m becomes larger in this case. However, in Fig. 1, ‖a_m − a_{m−1}‖ does not decrease as m becomes larger with the Arnoldi method. This is due to the unboundedness of K. On the other hand, with the shift-invert Arnoldi method, ‖a_m − a_{m−1}‖ decreases as m becomes larger. The results indicate that the shift-invert Arnoldi method can address the unboundedness of K and is the better choice than the Arnoldi method in this case.

Confirmation of Theorems 3.2 and 3.5
A deterministic dynamical system of the form (1) on X = ℝ is considered, with X_0 = 1. In this example, we set n = 1, N = 10, S = 15, v_0 = Φ(μ^S_{0,N}), and v = φ(x_300). Next, we consider the decay rate of ‖res(a^SIA_m) − res(ã_m)‖. We can set ρ ≈ 1 and ρ < r ≲ 1/|γ|. However, as mentioned in Remark 3.8, choosing a larger r makes the constant C_2 in Theorem 3.5 larger. Figure 3 illustrates the value ‖res(a^SIA_m) − res(ã_m)‖ along m. We can see the decay rate is around 9/10. Thus, we set ρ ≈ 1 and r = 10/9, and computed a theoretical upper bound proportional to C_2(r)(ρ/r)^m in accordance with the proof of Theorem 3.5, where C_2(r) = max_{z∈Ω_r} |g_γ(z)|. We can see that the theoretical upper bound describes the decay of ‖res(a^SIA_m) − res(ã_m)‖ correctly. Finally, we confirm Theorem 3.2. In the same manner as for the numerical range of K^{−1}, the numerical range of K is shown to be contained in the ball D_{0.99}. Unfortunately, this evaluation is not sufficient for checking the holomorphicity of f_m defined in Theorem 3.2 or for determining ρ and r, since f_m has a singular point at 0. Figure 4 illustrates the bound b(ρ, r, m) from the proof of Theorem 3.2. It decays for small m but grows for larger m, which implies that the assumption of the holomorphicity of f_m in Theorem 3.2 would not be satisfied for this example.

Confirmation of the decrease in the residuals
To confirm the decrease in the residuals of the Arnoldi and shift-invert Arnoldi approximations, experiments with synthetic data generated by the Landau equation and real-world data are implemented.

Experiments with data generated by the Landau equation
The

Experiments with real-world data
For the last experiment, real-world teletraffic data, shown in Fig. 6, is used. For t = 0, 1, …, x_t represents the amount of teletraffic (Gbps) that passes through a certain node (ID 12) in a network composed of 23 nodes and 227 links at time t. The time width is 15 seconds. To extract the relationship between x_{t+1} and x_{t−p+1}, …, x_t, the value x_t is considered to be generated by a random variable Y_t, and X_t = [Y_t, …, Y_{t−p+1}] is set to obtain a relation of the form (1). The results are shown in Fig. 7. For p = 1, the residual ‖res(a_m)‖ does not always decrease when m becomes larger with either the Arnoldi or the shift-invert Arnoldi method. This implies that the relationship between x_{t+1} and x_t does not fully describe the behavior of the data. For p = 3, the residual of the shift-invert Arnoldi method decreases as m becomes larger, whereas that of the Arnoldi method does not decrease even as m becomes larger. This implies that the relationship between x_{t+1} and x_t, …, x_{t−2} describes the behavior of the data, but either the unboundedness of the Perron-Frobenius operator prevents the Arnoldi method from constructing a proper approximation from the data, or the assumption about the holomorphicity of f_m in Theorem 3.2 is not satisfied. The advantage of the shift-invert Arnoldi method over the Arnoldi method is also underlined in this example.

Conclusion
In this paper, we investigated a convergence analysis for Krylov subspace methods for estimating operator-vector multiplications K^n v, where K is a linear operator and v is a vector in a Hilbert space. The Arnoldi method and the shift-invert Arnoldi method were considered. Although these methods have been proposed for time-series data analysis in machine learning, their convergence had not been thoroughly analyzed. We proved that the Arnoldi approximation converges to the minimizer of the residual. For the shift-invert Arnoldi method, the derivation was not straightforward, since the function that appears in the estimated operator-vector multiplication is not holomorphic. This problem was addressed by applying the factor K^{−n} that appears in the residual. As a result, we showed that the shift-invert Arnoldi approximation converges to the minimizer of the residual under the assumption that a factor related to the initial vector and K^{−1} is bounded by some constant. The aforementioned results were also confirmed numerically with synthetic and real-world data.
As future work, we will theoretically evaluate the aforementioned factor and give sufficient conditions for the convergence of the shift-invert Arnoldi approximation.