Lower estimation of the difference between quasi-arithmetic means

In the 1960s Cargo and Shisha introduced a metric in a family of quasi-arithmetic means defined on a common interval as the maximal possible difference between these means taken over all admissible vectors with corresponding weights. During the years 2013–2016 we proved that, having two quasi-arithmetic means, we can majorize the distance between them in terms of the Arrow–Pratt index. In this paper we are going to prove that this operator can also be used to establish certain lower bounds of this distance.


Introduction
One of the most popular families of means encountered in the literature is the family of quasi-arithmetic means. These means are defined for any continuous, strictly monotone function f : U → R, where U is an arbitrary interval. When a = (a_1, …, a_n) is an arbitrary sequence of points in U and w = (w_1, …, w_n) is a sequence of corresponding weights (w_i > 0, Σ w_i = 1), then the mean A^[f](a, w) is defined by the equality

A^[f](a, w) := f^{-1}( Σ_{i=1}^{n} w_i f(a_i) ).

This family of means was first mentioned in the pioneering paper by Knopp [5]; shortly after that it was formally introduced in a series of nearly simultaneous papers [4,6,8] at the beginning of the 1930s. In fact quasi-arithmetic means have been considered as a generalization of the well-known family of Power Means ever since their first appearance in [5]. Indeed, upon putting U = (0, +∞), p_s(x) := x^s for s ≠ 0, completed by p_0(x) := ln x, we can easily identify power means as a subfamily of quasi-arithmetic means (this was first mentioned in [5]). Let us specify one more classical family of quasi-arithmetic means that will be used in this paper: let {e_s : R → R}_{s∈R} be given by e_s(x) := exp(s·x) for s ≠ 0, completed by e_0(x) := x. Denote by E_s the quasi-arithmetic mean generated by e_s, that is E_s := A^[e_s] for every s ∈ R. In other words, for every vector a = (a_1, …, a_n) of real numbers and corresponding weights w we get

E_s(a, w) = (1/s) ln( Σ_{i=1}^{n} w_i exp(s·a_i) ) for s ≠ 0,  E_0(a, w) = Σ_{i=1}^{n} w_i a_i.

These means are sometimes called log-exp means (cf. [1, p. 269]). The family (E_s)_{s∈R} is closely related to the family of Power Means (denoted here as P_s). For n ∈ N, a ∈ R^n, and corresponding weights w there holds

P_s((e^{a_1}, e^{a_2}, …, e^{a_n}), w) = exp(E_s(a, w)) for every s ∈ R.
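For illustration (this is not part of the original argument), the definitions above can be checked numerically. The Python sketch below implements A^[f], the log-exp means E_s, and the power means, and verifies the identity P_s((e^{a_1}, …, e^{a_n}), w) = exp(E_s(a, w)); all function names are ad hoc choices, not notation from the paper.

```python
import math

def quasi_arithmetic_mean(f, f_inv, a, w):
    """A^[f](a, w) = f^{-1}(sum_i w_i f(a_i)) for normalized weights w."""
    assert abs(sum(w) - 1.0) < 1e-12
    return f_inv(sum(wi * f(ai) for ai, wi in zip(a, w)))

def E(s, a, w):
    """Log-exp mean E_s = A^[e_s]: e_s(x) = exp(s*x) for s != 0, e_0(x) = x."""
    if s == 0:
        return sum(wi * ai for ai, wi in zip(a, w))
    return math.log(sum(wi * math.exp(s * ai) for ai, wi in zip(a, w))) / s

def power_mean(s, a, w):
    """Power mean P_s = A^[p_s]: p_s(x) = x**s for s != 0, p_0(x) = ln x."""
    if s == 0:
        return math.exp(sum(wi * math.log(ai) for ai, wi in zip(a, w)))
    return sum(wi * ai ** s for ai, wi in zip(a, w)) ** (1.0 / s)

# Check P_s((e^{a_1}, ..., e^{a_n}), w) = exp(E_s(a, w)) on sample data.
a, w, s = [0.2, 1.5, 3.0], [0.5, 0.3, 0.2], 2.0
lhs = power_mean(s, [math.exp(x) for x in a], w)
rhs = math.exp(E(s, a, w))
assert abs(lhs - rhs) < 1e-9
```

The identity holds exactly in exact arithmetic; the tolerance only absorbs floating-point rounding.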
In fact this relation is even closer: (P_s)_{s∈R} and (E_s)_{s∈R} are two examples of so-called invariant scales (cf. [11] for details).
Let us return to (general) quasi-arithmetic means. These means have been developed in many ways (see e.g. [1, chap. 4]). In particular, in the 1960s Cargo and Shisha [3] introduced a metric among them. Namely, if f and g are both continuous, strictly monotone and have the same domain U, then one can define a distance

ρ(A^[f], A^[g]) := sup { |A^[f](a, w) − A^[g](a, w)| : n ∈ N, a ∈ U^n, w a corresponding sequence of weights }.

The original definition was established in a different way; nevertheless this wording is equivalent. The main goal of the present paper is to establish some lower bounds for the distance ρ. A number of estimates of ρ were presented in [3]; in particular, only a few of them concern lower bounds.
Remark. It is easy to verify that the assumption that f and g share a common domain can be dropped by an appropriate scaling of the functions f and g (cf. [13, section 4.1]).
If means are defined on an unbounded interval, then the distance between them could be infinite (for example in the case of power means). Therefore from now on we will assume that means considered in the present paper are defined on a bounded interval U unless otherwise stated.
By [3, Theorem 6.1] the maximal value of the distance between quasi-arithmetic means is attained on a vector of length two with some weights (which obviously does not mean that it cannot also be attained on other vectors). More precisely, in the definition above we can assume, without loss of generality, that the vector a has length two. Thus it is reasonable to restrict our consideration to 2-entry vectors only. In this setting, using the notation introduced previously, we obtain the equality

ρ(A^[f], A^[g]) = sup { |A^[f]((x, z), (t, 1−t)) − A^[g]((x, z), (t, 1−t))| : (x, z, t) ∈ U² × (0, 1) }.

The essence of this equality is that the problem of finding the distance between two quasi-arithmetic means can be reduced to finding the supremum of a function on the space U² × (0, 1).
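Since the supremum may be taken over two-entry vectors, ρ can be approximated from below by a grid search over U² × (0, 1). The sketch below is an illustration only; the grid size and the concrete pair of means (the arithmetic mean, generated by the identity, and the geometric mean, generated by ln) are arbitrary choices.

```python
import math

def two_entry_mean(f, f_inv, x, z, t):
    # A^[f]((x, z), (t, 1 - t))
    return f_inv(t * f(x) + (1 - t) * f(z))

def rho_approx(f, f_inv, g, g_inv, lo, hi, n=40):
    """Grid approximation (from below) of the Cargo-Shisha distance on [lo, hi]."""
    best = 0.0
    pts = [lo + (hi - lo) * i / n for i in range(n + 1)]
    ts = [(j + 1) / (n + 2) for j in range(n + 1)]  # weights strictly inside (0, 1)
    for x in pts:
        for z in pts:
            for t in ts:
                d = abs(two_entry_mean(f, f_inv, x, z, t)
                        - two_entry_mean(g, g_inv, x, z, t))
                best = max(best, d)
    return best

# Distance between the arithmetic mean (f = id) and the geometric mean (g = ln) on [1, 2].
d = rho_approx(lambda x: x, lambda y: y, math.log, math.exp, 1.0, 2.0)
assert 0.0 < d < 1.0  # the distance is trivially bounded by |U| = 1
```

Being a maximum over finitely many admissible points, every value produced this way is a genuine lower bound for ρ, which is exactly the perspective of the present paper.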
Note that the problems of finding a lower and an upper estimate for the difference are rather different. To establish an upper bound we are looking for an inequality that is valid for all entries (and weights); this is a very common method whenever the sup operator appears. Contrary to this, when it comes to finding a lower bound, we need to prove that there exists a vector (with corresponding weights) such that the difference between the two quasi-arithmetic means evaluated at it can be bounded from below. Additionally, it is natural to look for estimates which are invariant under affine transformations of the generating functions (cf. Remark 1 below).
The first possible solution comes from Mikusiński [7] and, independently, Łojasiewicz (cf. [7, footnote 2]). For a twice differentiable function f : U → R (U an interval) having a nowhere vanishing first derivative [hereafter we denote this family of functions by S(U)] they defined the operator P_f := f''/f'. In mathematical economics, the negative of this operator came to be called the Arrow–Pratt measure of risk aversion; in dynamical systems it is called the nonlinearity of the function. Moreover, Mikusiński and Łojasiewicz proved that the comparability of quasi-arithmetic means can be easily expressed in terms of the operator P.
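The operator P_f = f''/f' is easy to evaluate numerically by central finite differences. As an illustration (the step size is an arbitrary choice), the sketch below confirms two values that follow directly from the definitions: P_{e_s} is identically s, and P_{ln}(x) = −1/x.

```python
import math

def arrow_pratt(f, x, h=1e-5):
    """Finite-difference approximation of P_f(x) = f''(x) / f'(x)."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)          # central first derivative
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)  # central second derivative
    return d2 / d1

s = 1.7
# P_{e_s} = s for the exponential generator e_s(x) = exp(s x)
assert abs(arrow_pratt(lambda x: math.exp(s * x), 0.3) - s) < 1e-3
# P_{ln}(x) = -1/x for the generator of the geometric mean
assert abs(arrow_pratt(math.log, 2.0) - (-0.5)) < 1e-3
```

The constancy of P_{e_s} is what makes the family (E_s) a convenient comparison scale for general quasi-arithmetic means.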

Proposition 2. (Basic comparison) Let U be an interval, f, g ∈ S(U). Then the following conditions are equivalent:
(i) A^[f](a, w) ≤ A^[g](a, w) for all vectors a ∈ U^n and weights w, with both sides equal only when a is a constant vector;
(ii) g ∘ f^{-1} is convex if g is increasing, and concave if g is decreasing;
(iii) P_f ≤ P_g on U.
Mikusiński [7] proved the equivalence (i) ⟺ (iii), while the (i) ⟺ (ii) part is simply implied by Jensen's inequality (cf. e.g. [1, p. 276]). As an immediate corollary (cf. [1, p. 278], [7]),

P_f ≤ P_g on U ⟹ A^[f](a, w) ≤ A^[g](a, w) for all admissible a and w.  (1.1)

The second solution uses the three-parameter operator introduced by Páles in [14]. He proved that the pointwise convergence of quasi-arithmetic means can be easily expressed in terms of this operator. There were, however, no results estimating the distance ρ using this mapping. Some approach was recently given in
Proposition 3. ([10]) Let U be an interval, f, g : U → R be two continuous, strictly monotone functions and α > 0. If

There is one crucial reason making the operator P and the mapping above very natural for describing quasi-arithmetic means. Namely, Proposition 2 has its equality-type counterpart:
Remark 1. Let U be an interval, f, g : U → R be continuous and strictly monotone functions. Then the following conditions are equivalent:
(i) g = αf + β for some α, β ∈ R with α ≠ 0;
(ii) A^[f](a, w) = A^[g](a, w) for all vectors a ∈ U^n and corresponding weights w;
(iii) A^[f]((x, z), (t, 1−t)) = A^[g]((x, z), (t, 1−t)) for some x, z ∈ U, x ≠ z, and all t ∈ (0, 1);
(iv) P_f = P_g on U (provided f, g ∈ S(U)).
The equivalence (i) ⟺ (ii) is a folk result announced for example in [1, p. 271] and the references therein. The implication (ii) ⇒ (iii) is trivial; to obtain the opposite we need to fix x, z ∈ U, x ≠ z. The equivalence (i) ⟺ (iv) is a simple implication of the result by Mikusiński [7].
Having established this, whenever the final result is stated in terms of the operator P, we do not have to make any extra assumptions involving affine transformations of the generating functions, contrary to Proposition 1.
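The affine invariance behind Remark 1 — replacing f by αf + β with α ≠ 0 leaves A^[f] unchanged — can be confirmed numerically. The sketch below is an illustration only; the concrete generator (ln) and the constants are arbitrary choices.

```python
import math

def qa_mean(f, f_inv, a, w):
    # A^[f](a, w) = f^{-1}(sum_i w_i f(a_i))
    return f_inv(sum(wi * f(ai) for ai, wi in zip(a, w)))

a, w = [1.0, 2.0, 4.0], [0.2, 0.3, 0.5]
alpha, beta = -3.0, 7.0

# f = ln and its affine image alpha*ln + beta generate the same mean.
m1 = qa_mean(math.log, math.exp, a, w)
m2 = qa_mean(lambda x: alpha * math.log(x) + beta,
             lambda y: math.exp((y - beta) / alpha), a, w)
assert abs(m1 - m2) < 1e-9
```

This is why any estimate phrased in terms of P_f — which, by Remark 1, does not distinguish between affinely equivalent generators — needs no normalization assumptions.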

Vol. 92 (2018)
In what follows we are going to present a number of results majorizing the difference between two quasi-arithmetic means in terms of the operator P. Next, we are going to present certain lower bounds for the distance ρ, in the general setting (Sect. 2) and under a stronger assumption (Sect. 3).

Operator P in estimating differences among quasi-arithmetic means
The first result majorizing differences among quasi-arithmetic means using the Arrow–Pratt index was established in [9]. We do not recall this result here, because it was strengthened in [10] in terms of a special norm ‖·‖_* defined by

‖u‖_* := sup { |∫_x^z u(t) dt| : x, z ∈ dom u }.

For a subinterval U ⊂ dom f we will also define ‖f‖_{*,U} := ‖f|_U‖_*. Our result from [10], inspired by [9] and the earlier result by Páles [14], reads as follows.

Proposition 4. Let U be a closed, bounded interval and f, g ∈ S(U ).
Then
Notice that the assumption that the set U is closed can be relaxed. Indeed, consider a sequence U_1 ⊂ U_2 ⊂ … of closed intervals such that ⋃_n U_n = U. Then every vector having entries in U also has all its entries in U_n for a certain natural number n. Therefore we may apply the relevant result to each set, and finally pass to the limit. On the other hand, if U is not closed, it could happen that ‖P_f‖_* = +∞ or ‖P_f − P_g‖_* = +∞. In this case the right hand side equals +∞ (unless A^[f] and A^[g] are equal) and, consequently, we obtain ρ(A^[f], A^[g]) ≤ +∞, which is a trivial estimate.
Moreover, the left hand side is symmetric with respect to f and g, while the right one is not. One could clearly symmetrize this inequality using the min function. Nevertheless, this operation will be omitted to keep the notation compact. The same remark applies to most results in the present paper. Finally, in view of the trivial inequality ρ(A^[f], A^[g]) ≤ |U|, the only significant case of Proposition 4 is when ‖P_f‖_* is finite and ‖P_g − P_f‖_* < ln 2 (by the triangle inequality this also implies ‖P_g‖_* < +∞).
Very often we will use a global estimate of the operator P and, as it is useful, for K > 0 we denote

S_K(U) := { f ∈ S(U) : |P_f(x)| ≤ K for every x ∈ U }.

By virtue of Proposition 2 (recall that P_{e_s} ≡ s), we may rewrite the definition of this family in the following way:

S_K(U) = { f ∈ S(U) : E_{−K}(a, w) ≤ A^[f](a, w) ≤ E_K(a, w) for all admissible a and w }.

By [13], there exists a universal majorization of the difference between two means generated by functions from S_K.
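Membership in S_K(U), that is, |P_f| ≤ K on U, can be screened numerically on a grid. The sketch below is an illustration only (grid density, step size, and tolerance are arbitrary choices); since P_{ln}(x) = −1/x, it checks that ln belongs to S_1([1, 2]) but not to S_{1/2}([1, 2]).

```python
import math

def arrow_pratt(f, x, h=1e-5):
    # Finite-difference approximation of P_f(x) = f''(x) / f'(x).
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    return d2 / d1

def in_S_K(f, lo, hi, K, n=200, tol=1e-3):
    """Numerically test whether |P_f| <= K on a grid over [lo, hi]."""
    return all(abs(arrow_pratt(f, lo + (hi - lo) * i / n)) <= K + tol
               for i in range(n + 1))

assert in_S_K(math.log, 1.0, 2.0, 1.0)       # |P_ln(x)| = 1/x <= 1 on [1, 2]
assert not in_S_K(math.log, 1.0, 2.0, 0.5)   # but 1/x > 1/2 near x = 1
```

Such a grid test can only falsify membership up to the chosen tolerance; it is a sanity check, not a proof.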

Proposition 5.
Let U be a closed, bounded interval, K > 0, and f, g ∈ S_K(U). Then
The two parts of this proposition are not comparable with each other; if the length |U| is large, then the first inequality is better, while for small |U| the second one is. Let us note that in the mentioned paper this proposition was stated for K = 1; to remove this restriction we can apply the machinery described in [13, section 4.1].

Main result
We have already presented a number of upper bounds for the distance between two quasi-arithmetic means generated by functions from S(U). In this section we are going to present some lower bounds for this number. Throughout, K is a positive real (recall that U is a bounded interval). Our main tool is the following
The relevant proof is postponed until Sect. 4. Having this in hand we are ready to prove the main theorem of the present note. Observe that the right hand side of (2.1) is a function of five variables: K, |U|, ‖P_f‖_*, ‖P_g‖_*, and ‖P_f − P_g‖_* (‖P_f‖_* and ‖P_g‖_* are going to be ruled out shortly; cf. Corollary 2). In this setting the whole description of A^[f], A^[g], and their interaction is compressed into a few parameters only. Such a situation naturally leads to a huge disproportion between the true value of the distance ρ(A^[f], A^[g]) and its lower bound. In fact this is the essential difference between all results appearing in the present note and the one announced in Proposition 1. This idea has already been used to obtain some upper bounds in [13] (cf. Proposition 5 above). For fixed c and δ, Proposition 6 implies
Proof. Upon putting c = 2/3 and δ = ε/(8K) in Proposition 6, we get
At this moment it is sufficient to prove that
or, equivalently,
By the inequality (e^x − 1)(e^y − 1) ≤ e^{x+y} − 1, valid for all positive reals x and y, it suffices to prove that
But considering that f, g ∈ S_K(U), we have ε ∈ (0, 2K|U|]. Therefore
This theorem, combined with Proposition 4, immediately implies
The problem of convergence of quasi-arithmetic means was already discussed among the examples in [10] and characterized for arbitrary (not necessarily differentiable) functions in [14].
Theorem 1 has an important disadvantage. The definition of S_K(U) implies ε ≤ 2K|U| (in fact this inequality was already used in the proof of the theorem), whence the right hand side of the inequality stated as the main result is always smaller than (1/8)|U| e^{−K|U|/2} (this technical estimate is omitted). To avoid this drawback we will use the simple fact that the distance between means is the least upper bound of the differences between the means taken over all vectors and weights. Hence, if we restrict the set of admissible vectors or weights, then the distance does not increase. In particular, we may take only vectors with entries in some subinterval V ⊂ U. More precisely, for all continuous, monotone functions f, g : U → R the following inequality holds:
When passing to a subinterval V of U we need to control the norm ‖P_f − P_g‖_{*,V}. Luckily we have the following Lemma 1: if U is split into subintervals V_1, …, V_n, then, by the triangle inequality,

‖u‖_{*,U} ≤ Σ_{j=1}^{n} ‖u‖_{*,V_j}.

In particular, ‖u‖_{*,V_j} ≥ (1/n)‖u‖_{*,U} for some j ∈ {1, …, n}. At this point we divide our consideration into two cases. The first possibility is that the factor e^{K|U|} − 1 appearing in the denominator of the inequality in Theorem 1 is majorized by a given numerical constant. Otherwise, having Lemma 1 in hand, we split the set U, obtaining a subinterval V ⊂ U of length comparable to 1/K. In this setting e^{K|V|} becomes bounded from both sides by constants. This idea is quite simple, but a number of artificial values appear both in its wording and in the proof.
Proof. Let us denote, as usual, ε := ‖P_f − P_g‖_*. We are going to prove each part of this corollary separately. Part (i). By max(‖P_f‖_*, ‖P_g‖_*) ≤ K|U| ≤ C_0/2 and the elementary inequality e^x − 1 ≥ x one obtains

Box distance
In the previous section the distance between means generated by f, g ∈ S_K(U) was expressed in terms of ‖P_f − P_g‖_*; the main theorem stated that ρ(A^[f], A^[g]) may be estimated from below by a term involving this value, the length of the interval, and the number K.
In this section we will define the distance between generators in another way. More precisely, we say that f, g ∈ S(U) are (φ, K, δ)-separated if there exists a closed interval V ⊂ U, |V| = φ, such that for all x ∈ V the following inequalities are satisfied:

|P_f(x)| ≤ K,  |P_g(x)| ≤ K,  |P_f(x) − P_g(x)| ≥ δ.

Let us note that (φ, K, δ)-separation does not imply that the functions belong to S_K(U), because the bound on the Arrow–Pratt index holds only on some subinterval (denoted above by V). However, both f|_V and g|_V belong to S_K(V), which makes the use of the letter K absolutely natural in this context. As we will see, the results from the previous section are useless here. We are going to prove the following statement:
Before we begin the proof let us notice that by the convexity of e_{−φ/2}(x) = e^{−φx/2} we have
which, by rearranging the terms, simplifies to α > 0 (see for example [1, p. 26]).
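Under the assumption that (φ, K, δ)-separation amounts to the three pointwise conditions |P_f| ≤ K, |P_g| ≤ K, and |P_f − P_g| ≥ δ on V, a candidate subinterval can be screened numerically. The sketch below is an illustration of that reading only; the generators e_1 and e_{0.2} (with constant Arrow–Pratt indices 1 and 0.2) and all numeric parameters are arbitrary choices.

```python
import math

def arrow_pratt(f, x, h=1e-5):
    # Finite-difference approximation of P_f(x) = f''(x) / f'(x).
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    return d2 / d1

def separated(f, g, lo, hi, K, delta, n=100, tol=1e-3):
    """Check |P_f| <= K, |P_g| <= K and |P_f - P_g| >= delta on a grid over V = [lo, hi]."""
    for i in range(n + 1):
        x = lo + (hi - lo) * i / n
        pf, pg = arrow_pratt(f, x), arrow_pratt(g, x)
        if abs(pf) > K + tol or abs(pg) > K + tol or abs(pf - pg) < delta - tol:
            return False
    return True

# P_{e_1} = 1 and P_{e_0.2} = 0.2, so the pair is (1, 1, 0.5)-separated on V = [0, 1].
assert separated(lambda x: math.exp(x), lambda x: math.exp(0.2 * x),
                 0.0, 1.0, K=1.0, delta=0.5)
```

As with the S_K test earlier, this only checks the conditions up to grid resolution and tolerance.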
Proof. First, let us establish a proper setting. Consider V from the definition of (φ, K, δ)-separation. Denote its endpoints and its midpoint by l, u, and m, respectively; that is, V = [l, u] and m = (l + u)/2. Suppose (swapping f and g if necessary) that
As P is invertible (or, more precisely, invertible up to an affine transformation; cf. [9] for details) there exists a function h : V → R such that
By (1.1) we immediately obtain
for every vector a having all its entries in V and corresponding weights w. Therefore
We are going to bound the value of the rightmost side of this inequality. Furthermore, by virtue of Remark 1 we may apply any affine transformation to f and h, as it has no influence on the values of A^[f] and A^[h]. Whence we apply the transformations
Notice that the statements of Theorem 2 applied to the pairs (f, h) and (f̃, h̃) are equivalent, as they are expressed in terms of P_f and P_h (resp. P_{f̃} and P_{h̃}). Therefore, just to avoid this awkward notation, we identify f, h with f̃, h̃, respectively. In this setting (3.1)
Moreover, since P_f(x) ∈ [−K, K − δ] for x ∈ V, we immediately obtain two inequalities

These inequalities allow us to estimate from below the values of h(l) and h(u). Indeed, (3.1) and (3.2) yield
On the other hand, h(m) = 0 and |P_h(x)| ≤ K for x ∈ V. Thus, for s ≥ m, we simply get
Combining this inequality with (3.6), we obtain
Let θ := θ(u, l). Then one can rewrite the inequality above in the alternative form
The number α appearing in Theorem 2 is rather a complicated one. Nevertheless, we can observe that α is the difference of one simple function evaluated at two different points. We can use this fact to simplify the right hand side.
Proof. Notice that the function ω(x) := (e^{−x} − 1)/x is increasing and concave (the elementary proof is omitted here), and therefore ω′ is decreasing. By the mean value theorem we get
Applying this, Theorem 2, and the algebraic identity
At the end of this section applications of all the results are presented in a simple example. Let me note that the order of these numbers may vary depending on the means. We take all additional parameters appearing in each result so as to obtain the best possible bounds. The exact value follows from the result of Cargo and Shisha [2].
In what follows, this proof will be split into two cases. In each of them we will prove that ρ(A^[f], A^[g]) can be bounded from below by one of the terms appearing on the right hand side of (4.1). Since we are not able to predict which case is valid, we have to use a min function in the final result. Before we begin the real work let us introduce two technical, however important, constants:
Let us briefly describe the idea of the proof. We have plenty of free parameters (μ, c, δ, θ, α, x, z). The meaning of μ is just to provide (by the definition of ‖·‖_*) the existence of x and z such that

∫_x^z (P_f − P_g)(t) dt = μ,

while c and δ are free parameters that appear in this proof; we are going to take the supremum over these variables in the final result to obtain the best estimate. θ is taken such that
The description of F and G is very naive and it could happen that F = G; it could also happen that F̃ = G̃. However, as we will prove shortly, these equalities cannot be simultaneously satisfied (this fact is implied by (4.16)). Then, using the elementary inequality max(|p|, |q|) ≥ |p − q|/2, we get
Notice that this step is in fact the main reason for the huge disproportion between our estimate and the optimal one. Assume for example that G − F ≈ ρ(A^[f], A^[g]). As F̃ and G̃ are close to F and G respectively, we obtain that G̃ − F̃ ≈ ρ(A^[f], A^[g]) too. However, in this case the right hand side of (4.5) is
By Remark 1, we may assume that f(x) = g(x) = 0 and f′(x) = g′(x) = 1 (as was already done in the proof of Theorem 2). Assume without loss of generality, switching f and g if necessary, that
(4.6)
Then either
These cases are analogous; the mapping
is a transition between them (the terms appearing on the right hand side are functions of t); cf. e.g. [9,10,12]. Therefore, from now on we may assume that the first inequality holds. By (4.6) there exists y ∈ [(x+z)/2, z) such that
for all u ∈ (y, z).
(4.9) Equality (4.8) can be expressed equivalently as
(4.10)
Moreover, by (4.9) and the identity
we have
On the other hand, since f, g ∈ S_K(U), we get |P_f(t) − P_g(t)| ≤ 2K for every t ∈ U. Therefore (4.10) implies z − y ≥ μ/(4K). The same estimate applied to (4.11) yields

f′(u) ≤ e^{2K·|x−u|} g′(u), u ∈ U. (4.13)

The definition of S_K and of θ, expressed in (1.2) and (4.2) respectively, implies
(4.14)
Whence F ≥ y ≥ (x+z)/2 and G ≥ y ≥ (x+z)/2. Furthermore we have the simple equalities
(4.15)
These equalities combined with (4.13) and (4.12) imply
In fact the inequality above is crucial. We know that α < 1 and f(F̃) − f(F) is positive. Therefore this inequality alone implies that the equalities F = G and F̃ = G̃ cannot be simultaneously satisfied. This simple idea allows us to estimate from below the difference between these values. To do this, let us express the inequality above in integral form
We know that F̃ > F, G̃ > G, and f′(x) > 0 for every x ∈ U. Thus either
(this is the place where the parameter c is used)
This naturally splits our proof into two cases, depending on which of the inequalities holds. It could happen that both of them hold, but that does not affect the proof.

Case (i)
By the mean value theorem, there exists t_0 ∈ (0, 1) such that
On the other hand, since f ∈ S_K(U), we get
But t_0 ∈ (0, 1), so we simply obtain
Finally, in this case we have the inequality

Case (ii)
By the mean value theorem, there exist ξ ∈ (F, F̃) and ν ∈ (x, x + δ) such that
Equality (4.15) can be rewritten as
At this point we are going to use the elementary inequality from [10]:
Thus we immediately obtain
(4.17)
Lastly, combining (4.5), (ii), (4.17), (4.3), and (4.2) we get
Finally, it could happen that f and g were switched when applying (4.7). Therefore we need to symmetrize the right hand side by using the min operator in the final result. As the only nonsymmetric term on the right hand side is exp(‖P_f‖_*), appearing in the denominator, we have to replace it by exp(max(‖P_f‖_*, ‖P_g‖_*)) in order to decrease the right hand side of this inequality.