Inertial Krasnoselskii-Mann Iterations

We establish the weak convergence of inertial Krasnoselskii-Mann iterations towards a common fixed point of a family of quasi-nonexpansive operators, along with estimates for the non-asymptotic rate at which the residuals vanish. Strong and linear convergence are obtained in the quasi-contractive setting. In both cases, we highlight the relationship with the non-inertial case, and show that passing from one regime to the other is a continuous process in terms of the hypotheses on the parameters. Numerical illustrations are provided for an inertial primal-dual method and an inertial three-operator splitting algorithm, whose performance is superior to that of their non-inertial counterparts.


Introduction
Krasnoselskii-Mann (KM) iterations [34,39] are at the core of numerical methods used in optimization, fixed point theory and variational analysis, since they include many fundamental splitting algorithms whose convergence can be analyzed in a unified manner. These include the forward-backward method [36,46] to approximate a zero of the sum of two maximally monotone operators, and its various particular instances: on the one hand, we have the gradient projection algorithm [30,35], the gradient method [14] and the proximal point algorithm [40,50,11,31], to cite some abstract methods, as well as the Iterative Shrinkage-Thresholding Algorithm (ISTA) [22,20], to speak more concretely. KM iterations also encompass other splitting methods like Douglas-Rachford [28], primal-dual methods [17,3,18,53,21] and the three-operator splitting [23].
In convex optimization, first-order methods can be enhanced by adding an inertial substep, motivated by physical considerations [48,43,1]. To our knowledge, the first extensions beyond the optimization setting were developed in [2], followed by [38,37,42] some years later. The main drawback of those results is that they require an implicit hypothesis on the sequence generated by the algorithm (the summability of a certain series) to ensure its convergence. In [2], however, this difficulty is overcome in some special cases and for small values of the inertial parameters. These ideas were also used in [10], and then improved in [27], by adapting the inertial factors to the relaxation ones (see below). A similar principle had been used in [4], whose analysis was based on [7]. Nonasymptotic convergence rates for the residuals have been given in [51,33]. Strong and linear convergence can be found in [52], for strictly contractive forward-projection operators. Other extensions have been considered in [26,19,41,25]. See also [24] for a more thorough account of KM iterations, with and without inertia. Interest in this type of method increased remarkably in the past decade, in view of theoretical advances in the convergence theory for the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) [9], obtained in [16,5,6].
The purpose of this work is to develop further insight into the convergence properties of inertial Krasnoselskii-Mann iterations in their general form (1), where (T_k) is a family of operators defined on a real Hilbert space H, and the positive sequences (α_k) and (λ_k) are the inertial and relaxation (or averaging) parameters, respectively.
Remark 1. To fix the ideas, suppose inf_{k≥1} λ_k > 0, (α_k) is bounded, and T_k ≡ T, where T is continuous. If x_k happens to converge to a point x, then the residual T x_k − x_k goes to zero, and x is a fixed point of T.
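In its standard form, iteration (1) reads y_k = x_k + α_k(x_k − x_{k−1}), x_{k+1} = (1 − λ_k) y_k + λ_k T_k y_k. The following minimal sketch runs this recursion for a single operator T; the projection onto a ball and the constant parameter values are illustrative choices, not taken from the text:

```python
import numpy as np

def project_ball(y, center=np.zeros(2), radius=1.0):
    # Projection onto a closed ball: a simple (quasi-)nonexpansive operator.
    d = y - center
    n = np.linalg.norm(d)
    return y.copy() if n <= radius else center + radius * d / n

def inertial_km(T, x0, alpha=0.3, lam=0.5, iters=200):
    # y_k = x_k + alpha*(x_k - x_{k-1});  x_{k+1} = (1 - lam)*y_k + lam*T(y_k)
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + alpha * (x - x_prev)
        x_prev, x = x, (1 - lam) * y + lam * T(y)
    return x

x = inertial_km(project_ball, np.array([5.0, 5.0]))
# the iterates approach a fixed point of T, i.e. a point of the ball
```

With a small constant α and a moderate λ as above, the residual ‖T y_k − y_k‖ vanishes and the iterates converge, in line with the results of Section 2.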
Our general aim is to provide conditions on the parameter sequences and the family of operators to ensure that the sequences generated by (1) converge (weakly or strongly) to a common fixed point of the T_k's, provided one exists.
More specifically, we mean to establish a setting which is as general as possible, but such that (1) the hypotheses are interpretable and verifiable; (2) the proofs are transparent and mostly elementary; and (3) the convergence results are quantifiable in terms of appropriate sequences. We shall also see that adding the inertial term does not always make algorithms faster (this is reflected in the worst-case convergence rates), but may boost their convergence in some relevant instances. Another interesting line of research consists in identifying the combination of parameters for which the algorithm has its best numerical performance. Although we consider this highly relevant, we shall not pursue that direction here.
The paper is organized as follows: in Section 2 we establish the weak convergence of the iterations towards a common fixed point of the family of operators in the quasi-nonexpansive case, along with a non-asymptotic rate at which the residuals vanish. Section 3 is devoted to the strong and linear convergence in the quasi-contractive setting. In both cases, we highlight the relationship with the non-inertial case, and show that passing from one regime to the other is a continuous process in terms of parameter hypotheses and convergence rates. In Section 4, we discuss several instances of KM iterations, which are relevant to the numerical illustrations provided in Section 5, concerning an inertial primal-dual method and an inertial three-operator splitting algorithm.

Vanishing residuals and weak convergence
An operator T : H → H is quasi-nonexpansive if Fix(T) ≠ ∅ and ‖T y − p‖ ≤ ‖y − p‖ for all y ∈ H and p ∈ Fix(T). This implies, in particular, that 2⟨y − p, T y − y⟩ ≤ −‖T y − y‖² for all y ∈ H and p ∈ Fix(T).
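For instance, the metric projection onto a closed convex set is nonexpansive, hence quasi-nonexpansive. The following sketch checks both the definition and the derived inner-product inequality numerically; the box and the sampling are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def proj_box(y):
    # Projection onto the box [-1, 1]^3, a nonexpansive (hence quasi-nonexpansive) map.
    return np.clip(y, -1.0, 1.0)

p = np.zeros(3)  # a fixed point of the projection: proj_box(p) == p
for _ in range(1000):
    y = 5 * rng.normal(size=3)
    Ty = proj_box(y)
    # quasi-nonexpansiveness: ||Ty - p|| <= ||y - p||
    assert np.linalg.norm(Ty - p) <= np.linalg.norm(y - p) + 1e-12
    # the equivalent inner-product form: 2<y - p, Ty - y> <= -||Ty - y||^2
    assert 2 * np.dot(y - p, Ty - y) <= -np.linalg.norm(Ty - y) ** 2 + 1e-9
```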
To simplify the notation, given p ∈ F, we set (3). At different points, and in order to simplify the computations, we shall make use of a basic property of the norm in H: for every x, y ∈ H and α ∈ [0, 1], we have

‖αx + (1 − α)y‖² = α‖x‖² + (1 − α)‖y‖² − α(1 − α)‖x − y‖². (4)

The following auxiliary result will be useful in the sequel:

Lemma 2. Let (T_k) be a family of quasi-nonexpansive operators on H, with F := ∩_{k≥1} Fix(T_k) ≠ ∅, and let (x_k, y_k) satisfy (1). For each k ≥ 1 and p ∈ F, we have (5).

Proof. Take p ∈ F. From (1), it follows that (6), where the inequality is given by (2). Using (4), we get (7). By combining expressions (6) and (7), we obtain (8). Using (4) once more, multiplying by ν_k = (1 − λ_k)/λ_k, and invoking the definition of δ_k in (3), we obtain (11). Summing (8) and (11), we obtain (5).

¹ This is just to simplify the proof and is sufficiently general for practical purposes.
We are now in a position to show that the sequence (x_k) remains anchored to the set F, while both the residuals y_k − T_k y_k and the speed x_k − x_{k−1} tend to 0. We shall make some assumptions on the parameter sequences (α_k) and (λ_k).
A reinforced version, with strict inequality, is given by (12) for all k ≥ k_0 (if Hypothesis B holds, then ε > 0; otherwise, ε = 0). Also, under Hypothesis B, α := sup_{k≥1} α_k < 1 and λ := inf_{k≥1} λ_k > 0.

ii) If Hypothesis B holds, the series are convergent, and there is a constant M > 0, depending only on (α_k) and (λ_k), such that (13) holds. Moreover, for each p ∈ F, lim_{k→∞} ‖x_k − p‖ exists.
Proof. Without any loss of generality, we may assume that (12) holds with k_0 = 1. Take any p ∈ F, and combine (12) with (5) to obtain (14). On the one hand, (14) immediately gives (15). On the other hand, since (α_k) is nondecreasing, we have (16). Therefore, (14) implies that C_k(p) is nonincreasing. To show that it is nonnegative, suppose that C_{k₁}(p) < 0 for some k₁, so that the same would hold for all k ≥ k₁, which is impossible. As a consequence, C_k(p) is nonnegative, and lim_{k→∞} C_k(p) exists.
For ii), inequality (12) holds with ε > 0. The summability of the first two series follows from (16); in particular, (17) holds. The third one is a consequence of the second, since λ := inf_{k≥1} λ_k > 0. For the last one, use (10); in view of (17), this gives the summability of the fourth series. Since this holds for each p ∈ F, we obtain (13) with M = (1 + α)²/(ελ²). Now, denoting the positive part of d ∈ R by [d]₊, we obtain the final estimate from (15), summing for k ≥ 1.

Remark 5. Hypotheses A and B are closely related to, but different from, the hypotheses used in [4] for forward-backward iterations. In the non-inertial case α = 0, Hypothesis A is just lim sup_{k→∞} λ_k < 1. On the other hand, since (α_k) is nondecreasing and bounded, for each α ∈ [0, 1) there is λ_α > 0 such that (18) holds for all λ < λ_α.
In order to prove the weak convergence of the sequences generated by Algorithm (1), we shall use the following nonautonomous extension of the concept of demiclosedness.
The family of operators (I − T_k) is asymptotically demiclosed at 0 if, for every sequence (z_k) such that z_k ⇀ z and z_k − T_k z_k → 0, we have z ∈ F. Of course, if T : H → H is nonexpansive and T_k ≡ T, then I − T_k is asymptotically demiclosed at 0. We shall discuss other examples in the next section.

Proof. Recall that, by Theorem 4, lim_{k→∞} ‖x_k − p‖ exists for each p ∈ F. From (1), we deduce that (y_k) and (x_k) have the same (weak and strong) limit points. Suppose x_{n_k} ⇀ x. Then y_{n_k} ⇀ x as well. Since y_{n_k} − T_{n_k} y_{n_k} → 0, the asymptotic demiclosedness implies x ∈ F. Opial's Lemma [45] (see, for instance, [47, Lemma 5.2]) yields the conclusion.

Strong and linear convergence
We now focus on the strong convergence of the sequences generated by (1), and on their convergence rate. As before, we assume that (α_k) is nondecreasing, but we do not assume, in principle, that inf_{k≥1} λ_k > 0.
Given q ∈ (0, 1), an operator T : H → H is q-quasi-contractive if Fix(T) ≠ ∅ and ‖T y − p‖ ≤ q‖y − p‖ for all y ∈ H and p ∈ Fix(T); such an operator has exactly one fixed point. Given λ, q ∈ (0, 1) and ξ ∈ [0, 1], we define Q(λ, q, ξ) by (19). Notice that Q(λ, q, ξ) ∈ (0, 1), and that it decreases as λ increases, or as either q or ξ decreases. The quantity Q(λ, q, ξ) will play a crucial role in the linear convergence rate of the sequences satisfying (1). The inclusion of the auxiliary parameter ξ will also allow us to establish convergence rates, with and without inertia, in a unified manner (see the discussion in Subsection 3.3).
The following result establishes a bound on the distance to a solution after performing a standard KM step:

Lemma 7. Let T : H → H be q-quasi-contractive with fixed point p*, and let x, y ∈ H and λ > 0 be such that x = (1 − λ)y + λT y. Then, for each ξ ∈ [0, 1], we have (20).

Proof. Using (4), we obtain (21). On the other hand, we have (22). Then, inequality (20) is just a convex combination of (21) and the square of (22).

Convergence analysis
We now turn to the convergence of the sequences verifying (1). To simplify the notation, for each k ∈ N, we set Q_k := Q(λ_k, q_k, ξ), where Q is defined in (19). We have the following: for each k ∈ N, (23) holds; if, moreover, (24) holds for all k ∈ N, then so do (25) and (26).

Proof. We use (1) and (20) to obtain the first estimate. Now, by (7), we deduce the second. On the other hand, from (11), we get the third, and the last two inequalities together imply (23). For the second part, inequalities (23) and (24) together give (25), where the second inequality comes from (α_k) being nondecreasing and Q_k ≤ 1, which we then iterate to obtain (26).
The preceding estimations allow us to establish the main result of this section (Theorem 9), where we assume that (24) holds for all k ∈ N. We have the following: ii) If λ_k ≥ λ > 0 and q_k ≤ q < 1 for all k ∈ N, then x_k converges linearly to p*, as k → ∞; more precisely, (27) holds.

Proof. For part i), by (25), lim_{k→∞} C̃_k(p*) = 0. As in the proof of Theorem 4, we can show that the sum of the first two terms in C̃_k(p*), namely

and the conclusion follows.
For ii), we know that Q(λ_k, q_k, ξ) ≤ Q(λ, q, ξ), because Q increases if λ decreases or q increases. Gathering the common factors in the second and third terms on the left-hand side of inequality (24), we deduce that Q ≥ α (strictly if α > 0). Using (26), and observing that the case Q(λ, q, ξ) = α is incompatible with inequality (24), we deduce (27), as claimed.

Behavior with and without inertia
In the non-inertial case α_k ≡ 0, (24) holds if either ξ = 0 or λ_k ≤ 1 for all k, as in Hypothesis A. This is less restrictive than Hypothesis B (see Remark 5). To simplify the explanation, suppose q_k ≡ q ∈ (0, 1). The best convergence rate is obtained from Theorem 9 with λ_k ≡ 1 and ξ = 0. If α_k > 0 for at least one k, the case ξ = 0 is ruled out, and all inequalities are strict if λ_k ∈ (0, 1). This suggests that there may be operators for which the inertial step actually deteriorates the convergence, so inertia should be handled with caution. Actually, it is possible to find a wide variety of behaviors, even for some of the simplest operators, as shown by the following case study:

Example 10. Let λ_k ≡ λ ∈ (0, 1) and α_k ≡ α ∈ [0, 1). Take q ∈ (0, 1], and consider the operator T : R → R defined by T y = −qy, whose unique fixed point is the origin. If α = 0, for each k ≥ 0 we have x_{k+1} = L x_k, where we have written L = 1 − λ(1 + q). Iterating from x_0 = 1, we get x_k = L^k, and convergence occurs in one step if L = 0. Now, let α ∈ (0, 1), so that (1) reads

x_{k+1} = L(x_k + α(x_k − x_{k−1})) = (1 + α)L x_k − αL x_{k−1}. (28)

Here, we take x_1 = x_0 = 1. We can rewrite (28) in matrix form with the companion matrix M. As before, convergence occurs in one step if L = 0. The eigenvalues of M are

µ± = ((1 + α)L ± √((1 + α)²L² − 4αL)) / 2.

Let us consider the case L > 0 first.
If (1 + α)²L² < 4αL, the eigenvalues are complex conjugates, both with modulus √(αL). Since √(αL) < L whenever L > α, the inertial iterations converge strictly faster than the noninertial ones if λ(1 + q) < 1 − α. If L = α, the convergence rate is the same. Else, if (1 + α)²L² ≥ 4αL, then M has two real eigenvalues (counting multiplicities), with 0 < µ− ≤ µ+. But since L ∈ (0, 1) implies −L < −L², we always have µ+ < L. Therefore, the inertial iterations also converge strictly faster in this case. When L < 0 (that is, λ(1 + q) > 1), the matrix M always has two real eigenvalues, one of each sign. It is easy to verify that |µ+| < |µ−|, which implies that |µ−| determines the convergence (the initial condition is not an eigenvector of M, so both eigenvalues intervene). But |µ−| > |L|, so in this case the inertial algorithm performs worse than the noninertial one. Moreover, the inertial iterations do not converge if µ− ≤ −1, which is equivalent to λ(1 + q) ≥ 2(1 + α)/(1 + 2α). A few comments are in order:

• For 0 < λ(1 + q) < 1 − α, the inertial iterations converge at a strictly faster linear rate than the noninertial ones, even in the noncontracting case q = 1.
• At the transition point λ(1 + q) = 1 − α the convergence rate is the same.
• For λ(1 + q) ≥ 2(1 + α)/(1 + 2α), the inertial iterations do not converge, while the noninertial ones do. This combination of parameters is not feasible if q ≤ 1/3. Notice that picking λ and α satisfying (18) can be read as picking λ < S(α), for the corresponding threshold function S(α); then λ(1 + q) < 2(1 + α)/(1 + 2α) for all q ∈ (0, 1]. Therefore, this last case is incompatible with Hypotheses A or B.

Now, the convergence rate results given by Theorem 9 correspond to worst-case scenarios, which certainly must include cases like the one discussed in Example 10. However, this situation need not be representative of other concrete instances found in practice, in which inertia improves either the theoretical convergence rate guarantees (see Subsection 4.2 below, and the commented references), or the actual behavior when the algorithm is implemented. In fact, the numerical tests reported below show noticeable improvements in the performance of the selected algorithms upon adding the inertial substep.
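The case analysis of Example 10 can be replayed numerically: with L = 1 − λ(1 + q), the inertial iterates satisfy x_{k+1} = (1 + α)L x_k − αL x_{k−1}, and the inertial linear rate is the spectral radius of the associated 2 × 2 companion matrix. A sketch, with illustrative parameter values:

```python
import numpy as np

def rates(lam, q, alpha):
    # Returns (noninertial rate |L|, inertial rate = spectral radius of M)
    # for the scalar operator T y = -q y, where L = 1 - lam*(1 + q).
    L = 1 - lam * (1 + q)
    M = np.array([[(1 + alpha) * L, -alpha * L],
                  [1.0, 0.0]])
    return abs(L), max(abs(np.linalg.eigvals(M)))

# Regime lam*(1 + q) < 1 - alpha (so L > alpha): inertia is strictly faster.
plain, inertial = rates(lam=0.2, q=1.0, alpha=0.3)   # L = 0.6
assert inertial < plain

# Regime L < 0 (lam*(1 + q) > 1): inertia is strictly slower.
plain, inertial = rates(lam=0.8, q=1.0, alpha=0.3)   # L = -0.6
assert inertial > plain
```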

Averaged Operators
An operator T : H → H is γ-averaged, with γ ∈ (0, 1), if there is a nonexpansive operator R : H → H such that T = (1 − γ)I + γR.
Let R : H → H be nonexpansive and let (γ_k) be a sequence in (0, 1). Setting T_k = (1 − γ_k)I + γ_k R, (1) can be rewritten in terms of R, and Hypothesis B adapts accordingly. It is not necessary to implement the algorithm using the operator R explicitly. However, the interval for the relaxation parameters is enlarged, and it may be convenient to over-relax. We shall come back to this point in the numerical illustrations.

Euler Iterations and Gradient Descent
An operator B is β-cocoercive, with β > 0, if ⟨Bx − By, x − y⟩ ≥ β‖Bx − By‖² for all x, y ∈ H.
Now, let f : H → R be convex and differentiable, and assume ∇f is Lipschitz-continuous with constant L. Then B = ∇f is cocoercive with constant β = 1/L. If, moreover, f is strongly convex with parameter µ and ρ_k ≤ 2/(L + µ), then T_k is q_k-quasi-contractive; therefore, (T_k) is q-quasi-contractive. Considering the non-inertial case (α_k ≡ 0), λ_k ≡ 1 and the fixed-step choice ρ_k = 2/(µ + L), the algorithm exhibits a linear rate of convergence (Q − 1)/(Q + 1), where Q = L/µ is the condition number ([44, Theorem 2.1.15]). Introducing the inertial term, it turns into Nesterov's constant step scheme, III [44], which has a rate of convergence of roughly 1 − 1/√Q. Here, Hypothesis B can be written explicitly, which gives the condition for the convergence of Nesterov's constant step scheme with constant relaxation λ.
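The two rates can be compared on a toy strongly convex quadratic. The sketch below assumes the standard forms of gradient descent with step 2/(µ + L), and of Nesterov's constant step scheme, III, with momentum (√Q − 1)/(√Q + 1); the problem data are illustrative:

```python
import numpy as np

mu, L = 1.0, 100.0            # strong convexity and Lipschitz constants, Q = 100
D = np.array([mu, L])         # f(x) = 0.5 * x^T diag(D) x, so grad f(x) = D * x
grad = lambda x: D * x

def gd(x, iters):
    # Fixed-step gradient descent: linear rate (Q - 1)/(Q + 1) per iteration.
    rho = 2.0 / (mu + L)
    for _ in range(iters):
        x = x - rho * grad(x)
    return x

def nesterov(x, iters):
    # Constant step scheme, III: momentum (sqrt(Q) - 1)/(sqrt(Q) + 1) and
    # gradient step 1/L, giving a roughly 1 - 1/sqrt(Q) rate.
    Q = L / mu
    alpha = (np.sqrt(Q) - 1) / (np.sqrt(Q) + 1)
    x_prev = x.copy()
    for _ in range(iters):
        y = x + alpha * (x - x_prev)
        x_prev, x = x, y - grad(y) / L
    return x

x0 = np.ones(2)
assert np.linalg.norm(nesterov(x0, 100)) < np.linalg.norm(gd(x0, 100))
```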

Proximal and Forward-Backward Methods
Let M : H → 2^H be maximally monotone and let (ρ_k) be a positive sequence. The proximal method consists in iterating z_{k+1} = T_k z_k for k ≥ 1, where the operator T_k = J_{ρ_k M} := (I + ρ_k M)^{−1} is the resolvent of ρ_k M, so that Fix(T_k) = Zer(M). As before, the family (I − T_k) is asymptotically demiclosed at 0 if inf_{k≥1} ρ_k > 0. To see this, let (z_k) be a sequence in H such that z_k ⇀ z and z_k − T_k z_k → 0. We must show that 0 ∈ M z. By the definition of T_k, we have (z_k − T_k z_k)/ρ_k ∈ M(T_k z_k). The left-hand side converges strongly to zero, while T_k z_k ⇀ z. We conclude by the weak-strong closedness of the graph of M.
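As a concrete illustration (the choice M = ∂|·| on R is an assumption beyond the text), the resolvent J_{ρM} is the soft-thresholding map, and the proximal point method drives the iterates to the unique zero of M:

```python
import numpy as np

def soft_threshold(z, rho):
    # Resolvent J_{rho M} = (I + rho M)^{-1} for M = subdifferential of |.| on R.
    return np.sign(z) * np.maximum(np.abs(z) - rho, 0.0)

# Proximal point method z_{k+1} = J_{rho M} z_k; here Zer(M) = {0}.
z = 5.0
for _ in range(20):
    z = soft_threshold(z, rho=0.5)
assert z == 0.0  # each step shrinks |z| by rho until it reaches the zero of M
```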
Let A : H → 2^H be maximally monotone, let B : H → H be cocoercive with parameter β, and let (ρ_k) be a sequence in (0, 2β). For each k ≥ 1, set T_k = J_{ρ_k A}(I − ρ_k B). Then Fix(T_k) = Zer(A + B). As in the proximal case, the family (I − T_k) is asymptotically demiclosed at 0 if inf_{k≥1} ρ_k > 0.
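A sketch of the resulting forward-backward (ISTA-type) iteration for the lasso problem min_x ½‖Kx − b‖² + w‖x‖₁, taking A = ∂(w‖·‖₁), so that J_{ρA} is soft-thresholding, and B the gradient of the quadratic term; the matrix K and the data are synthetic assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.normal(size=(20, 10))
b = K @ np.concatenate([np.ones(3), np.zeros(7)])
w = 0.1

beta = 1.0 / np.linalg.norm(K.T @ K, 2)   # B = grad of 0.5||Kx - b||^2 is beta-cocoercive
rho = beta                                 # step size in (0, 2*beta)

def T(x):
    # Forward-backward operator J_{rho A}(I - rho B):
    z = x - rho * K.T @ (K @ x - b)                            # forward (gradient) step
    return np.sign(z) * np.maximum(np.abs(z) - rho * w, 0.0)   # backward (prox) step

x = np.zeros(10)
for _ in range(2000):
    x = T(x)
assert np.linalg.norm(T(x) - x) < 1e-8   # the residual ||Tx - x|| vanishes at a solution
```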

Douglas-Rachford and primal-dual splitting
Let A, B : H → 2^H be maximally monotone, and let (r_k) be a positive sequence. The Douglas-Rachford splitting method consists in iterating z_{k+1} = T_{r_k} z_k, for k ≥ 1, where T_r admits two equivalent expressions; the second shows that T_r is averaged. Using the weak-strong closedness of the graphs of A and B, and a little algebra, one proves that the family I − T_{r_k} is asymptotically demiclosed if inf_{k≥0} r_k > 0. Finally, observe that Zer(A + B) = J_{rB} Fix(T_r).
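A sketch on a two-set feasibility problem, taking A and B to be the normal cones of the horizontal axis and of a line in R², so that both resolvents are projections. The DR operator is written in one of its standard equivalent forms (conventions differ in the order of the resolvents), and the sets are illustrative assumptions:

```python
import numpy as np

def P_C(z):
    # Projection onto C = the x-axis; this is J_{rA} for A = N_C.
    return np.array([z[0], 0.0])

def P_D(z):
    # Projection onto D = the line x + y = 2; this is J_{rB} for B = N_D.
    return z - (z.sum() - 2.0) / 2.0

def T(z):
    # One standard form of the Douglas-Rachford operator:
    # T z = z + J_{rA}(2 J_{rB} z - z) - J_{rB} z
    u = P_D(z)
    return z + P_C(2 * u - z) - u

z = np.array([10.0, -3.0])
for _ in range(200):
    z = T(z)
x = P_D(z)   # Zer(A + B) = J_{rB} Fix(T_r); here C ∩ D = {(2, 0)}
assert np.allclose(x, [2.0, 0.0], atol=1e-8)
```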
More generally, let X and Y be Hilbert spaces, and consider the primal problem, which is to find x̂ ∈ X such that 0 ∈ Ax̂ + L*B(Lx̂), where A : X → 2^X and B : Y → 2^Y are maximally monotone operators, and L : X → Y is linear and bounded. The primal and dual solutions, namely x̂ and ŷ, are linked by the corresponding inclusions. Douglas-Rachford splitting applied to A = ∂g* and B = ∂f* ∘ (−L*) yields the alternating direction method of multipliers (see [29]).
An inertial version of the primal-dual iterations is given by (37), with appropriate sequences (α_k) and (λ_k).
In [13], the authors propose the Split Douglas-Rachford algorithm, where Υ and Σ are elliptic linear operators that induce an ad hoc metric and account for preconditioning.

Three Operator Splitting
Given three maximally monotone operators A, B, C defined on the Hilbert space H, we wish to find x ∈ H such that 0 ∈ Ax + Bx + Cx. If C is β-cocoercive, the three-operator splitting method [23] generates a sequence (z_k) by starting from a point z_0 ∈ H, with ρ ∈ (0, 2β) and λ_k ∈ (0, 1/γ). This recurrence is generated by iterating the γ-averaged operator T, and we have Zer(A + B + C) = J_{ρB}(Fix T). It reduces to the forward-backward method if B = 0, and to the Douglas-Rachford method if C = 0. An inertial version, (42), is obtained for appropriate choices of α_k, λ_k. One particular instance is given by an optimization problem involving functions f, g, h and a linear mapping L, where f, g, h are closed and convex, h has a (1/β)-Lipschitz-continuous gradient, and L is bounded.
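A scalar sketch of the (Davis-Yin) three-operator recurrence, on the illustrative problem min_x ½(x − 3)² + ½|x| + ι_{[−2,2]}(x), whose solution is x = 2; the operators and parameter values are assumptions made for the example:

```python
import numpy as np

# A = d(0.5|.|): its resolvent is soft-thresholding with level 0.5*rho.
prox_A = lambda z, rho: np.sign(z) * max(abs(z) - 0.5 * rho, 0.0)
# B = normal cone of [-2, 2]: its resolvent is the projection onto [-2, 2].
J_B = lambda z: min(max(z, -2.0), 2.0)
# C = gradient of 0.5*(x - 3)^2, which is 1-cocoercive.
C = lambda x: x - 3.0

rho, lam = 1.0, 0.9        # rho in (0, 2*beta) with beta = 1
z = 0.0
for _ in range(300):
    x_B = J_B(z)
    x_A = prox_A(2 * x_B - z - rho * C(x_B), rho)
    z = z + lam * (x_A - x_B)

x = J_B(z)                 # the solution is read off through J_{rho B}
assert abs(x - 2.0) < 1e-6
```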

Numerical Illustrations
In this section, we test the performance of the algorithm given by iterations (1) in two of the settings described in Section 4. More precisely, we apply an inertial primal-dual splitting method to solve a TV-based denoising problem, and an inertial three-operator splitting algorithm to in-paint a corrupted image.

Primal-Dual Splitting and TV-based Denoising
The algorithm will be tested in an image processing framework. Consider a total-variation regularized deblurring problem, where x ∈ R^{N₁×N₂} is an image to recover from a noisy observation b ∈ R^{M₁×M₂}, R : R^{N₁×N₂} → R^{M₁×M₂} is a blur operator, w is a positive parameter, and ∇ : x ↦ ∇x = (D₁x, D₂x) is the classical discrete gradient, whose adjoint ∇* is the discrete divergence. A formulation for the gradient and divergence operators can be found in [15]. In these experiments, R is a Gaussian blur of size 9 × 9, standard deviation 4 and reflexive boundary conditions (see [32] for details on the construction of the operator), and w = 10⁻⁴. Considering the original image x in Figure 3a, composed of 256 × 256 pixels, the observation b is generated as b = Rx + e, where e is an additive zero-mean white Gaussian noise with standard deviation 10⁻³ (Figure 3b).
The algorithm is tested for 17 combinations of τ and σ satisfying the critical condition τσ‖L‖² = 1 (according to [13], this tends to yield the best performance). The number ‖L‖ is computed using an adaptation of [49, Algorithm 12].
Comparison in terms of the parameters τ and σ. In a first stage, we compare the performance of the primal-dual splitting algorithm given by (36) (that is, Algorithm 1 with α_k ≡ 0) and its inertial counterpart (37), with α = 1/(3 + 0.0001) (condition (32) with η = λ/2 gives the constraint α < 1/3). Table 1 shows the execution time, the number of iterations, and the objective value reached, using a tolerance ε = 10⁻⁵. These results are depicted graphically, along with the percentage of reduction, in Figure 2. The recovered images are collected in Figures 3c and 3d.

Following [23], we consider the following formulation of the in-painting problem, where H is the set of 3-D tensors, Z⁽¹⁾ is the matrix [Z(:, :, 1) Z(:, :, 2) Z(:, :, 3)], Z⁽²⁾ is the matrix [Z(:, :, 1)ᵀ Z(:, :, 2)ᵀ Z(:, :, 3)ᵀ]ᵀ, ‖·‖_* denotes the matrix nuclear norm, and w is a penalty parameter, which we take equal to 1 here, for simplicity. This problem fits in the context of (43). In this case, the operator ∇(h ∘ A) is cocoercive with constant 1. With the error function R defined in (46), the iterations defined by (42) lead to Algorithm 2.

Algorithm 2:
Choose Z_0, Z_1 ∈ R^{m×n}, (λ_k)_{k∈N} and (α_k)_{k∈N} such that the hypotheses of Theorem 4 are fulfilled, ρ ∈ (0, 2), ε > 0 and r_0 > ε; while r_k > ε do ... As in the previous section, Algorithm 2 will be tested in the case α_k ≡ 0 (the algorithm studied in [23]) and, for the inertial version, with α satisfying condition (32). The corresponding algorithms will be referred to as original and inertial, respectively. Algorithm 2 returns both the value of Z_k and X^g_k, since the latter represents the image solution of the problem. Throughout this section, the initial points are both set to zero.
Comparison in terms of the number of erased pixels. Between 10000 and 250000 pixels are randomly erased from the image in Figure 10a to obtain the one in Figure 10b. We compare the number of iterations and the execution time needed by both methods with step size ρ = 1 and λ_k ≡ 1, for a tolerance of 10⁻³. The results are shown in Figure 6. The reduction stands between 12% and 22% in most cases, and the improvement seems to increase with the number of erased pixels.

Comparison in terms of the step size. Both algorithms are tested on the same image with 250000 randomly erased pixels, for λ_k ≡ 1 and different values of the step size ρ. For the inertial version, the constant α in (49) is adapted accordingly. The results are reported in Table 3 and depicted graphically in Figure 7. The percentage of reduction is noticeably higher for lower values of ρ (always above 20% when ρ ≤ 1). This is to be expected, since larger values of ρ require lower values of α, which limits the effect of inertia.

Comparison in terms of the relaxation parameter. Finally, we fix the value ρ = 1, and compare the performance of the two methods for different values of the relaxation parameter λ, which, as before, limits the possible range for the inertial parameter α in view of condition (32). The results are presented in Table 4, and shown graphically in Figure 8. As with the step size, the reduction is greater for lower values of λ, which is consistent with the loss of the inertial character imposed by condition (32). Nevertheless, observe that over-relaxing with λ = 1.2 or λ = 1.4 gives better results (both in number of iterations and execution time) than keeping λ in a neighborhood of 1.

The evolution of the function values, the distance to the limit and the residuals is shown (in logarithmic scale) in Figure 9 for 250000 erased pixels, using ρ = 1 and λ_k ≡ 1. As in the previous example, the sequence k‖z_k − T z_k‖² tends to zero, allowing us to conjecture again an asymptotic rate of o(1/k). Finally, Figure 10 shows the original, corrupted (with 250000 erased pixels) and recovered images.

The datasets generated and/or analysed during the current study are available from the corresponding author upon request.

Theorem 6. Let (T_k) be a family of quasi-nonexpansive operators on H, with F = ∩_{k≥1} Fix(T_k) ≠ ∅. Let (x_k, y_k) satisfy (1), and assume Hypothesis B holds. If (I − T_k) is asymptotically demiclosed at 0, then both (x_k) and (y_k) converge weakly, as k → ∞, to a point in F.
Let f : X → R ∪ {+∞} and g : Y → R ∪ {+∞} be closed and convex, and set A = ∂f and B = ∂g. The inclusions above are the optimality conditions for the primal and dual (in the sense of Fenchel-Rockafellar) optimization problems min_{x∈X} {f(x) + g(Lx)} and min_{y∈Y} {g*(y) + f*(−L*y)}.

Figure 4: Average number of iterations performed by the original (left) and inertial (right) algorithms, with tolerance ε = 10⁻⁵, for each value of λ, and each case of τ and σ, from Table 2.

Figure 5:

Figure 6: Number of iterations (left), execution time (center) and percentage of reduction (right) in terms of the number of erased pixels, with step size ρ = 1 and relaxation parameter λ_k ≡ 1, for a tolerance of 10⁻³.

Figure 7: Number of iterations (left), execution time (center) and percentage of reduction (right) in terms of the step size ρ.

Figure 8: Number of iterations (left), execution time (center) and percentage of reduction (right) in terms of the relaxation parameter λ.

Figure 9: Evolution of the distance to the computed solution (top left), objective function values (top right), residuals ‖z_k − T z_k‖² (bottom left) and k‖z_k − T z_k‖² (bottom right), for 250000 erased pixels, using ρ = 1 and λ_k ≡ 1.
Figure 10: (a) Original image, (b) corrupted image, (c) recovered without inertia, (d) recovered with inertia.

Table 2: Execution time, number of iterations, final function value and reduction percentage for the original primal-dual algorithm and the inertial version (case 14), with tolerance ε = 10⁻⁵.

Table 3: Execution time and number of iterations in terms of the step size ρ.

Table 4: Execution time and number of iterations for different values of λ.