1 Introduction

Let \(I\subset {\mathbb {R}}\) be an interval. Assume that \(n\geqslant 2\) and \(x=(x_1,\dots ,x_n)\), \(y=(y_1,\dots ,y_n)\in I^{n}\). Following Schur (cf. e.g. [7, 12]) we say that x is majorized by y, and write \(x \preccurlyeq y\), if there exists a doubly stochastic \(n\times n\) matrix P (i.e. a matrix with nonnegative entries whose rows and columns all sum to 1) such that \(x=y\cdot P\). The notion of majorization arises in a variety of contexts. It has a deep background in economics: issues concerning the measurement of income inequality, the distribution of wealth in a population (the Lorenz curve) and the Dalton principle of income transfer are examples of its historical origins. Majorization arises naturally in many mathematical problems, especially in group theory and algebra. It is also useful in certain physical and chemical contexts. Statements such as "x is more mixed (chaotic, disordered) than y" are directly related to majorization and can be expressed as \(x \preccurlyeq y\) (see [7] for details).
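To illustrate the definition outside the formal text: majorization is equivalently characterized by comparing sorted partial sums (a standard equivalence, see [7]), which is convenient to check numerically. The helper below is our own illustrative sketch, not part of the original material.

```python
def majorizes(y, x):
    """Return True if x is majorized by y (x ≼ y): equal totals, and the
    decreasing partial sums of x never exceed those of y.  This classical
    partial-sum test is equivalent to the existence of a doubly stochastic
    matrix P with x = y·P."""
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    if abs(sum(xs) - sum(ys)) > 1e-12:   # totals must agree
        return False
    px = py = 0.0
    for a, b in zip(xs, ys):
        px += a
        py += b
        if px > py + 1e-12:              # partial sums of x must stay below
            return False
    return True

# Averaging two coordinates with P = [[1/2, 1/2], [1/2, 1/2]] gives x = y·P:
print(majorizes((1.0, 3.0), (2.0, 2.0)))  # True: (2, 2) ≼ (1, 3)
print(majorizes((2.0, 2.0), (1.0, 3.0)))  # False
```

The test reflects the intuition that the averaged vector \((2,2)\) is "more mixed" than \((1,3)\).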

A function \(F:I^{n}\rightarrow {\mathbb {R}}\) is said to be Schur-convex if \(F(x)\leqslant F(y)\) whenever \(x \preccurlyeq y\), for \(x,y\in I^{n}\).

It is known, by the classical Schur theorem [12], that if a function \(f:I\rightarrow {\mathbb {R}}\) is convex, then it generates Schur-convex sums, that is, the function \(F:I^{n}\rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} F(x)=F(x_1,\dots ,x_n)=f(x_1)+\dots +f(x_n) \end{aligned}$$

is Schur-convex. It is also known that the convexity of f is a sufficient, but not necessary, condition for F to be Schur-convex. A full characterization of functions generating Schur-convex sums was given by Ng [8]. Namely, he proved that F is Schur-convex if and only if the generating function f is Wright-convex. Similar results connected with strong convexity were obtained by Nikodem, Rajba and Wa̧sowicz in [11] (cf. also [1]).

In this note we introduce the notion of Schur-convex stochastic processes and we present some counterparts of the results mentioned above for stochastic processes.

Let \((\Omega , {\mathcal {A}}, P)\) be an arbitrary probability space and \(I\subset {\mathbb {R}}\) be an interval. A function \(A:\Omega \rightarrow {\mathbb {R}}\) is called a random variable if it is \({\mathcal {A}}\)-measurable. A function \(X:I\times \Omega \rightarrow {\mathbb {R}}\) is called a stochastic process if for every \(t\in I\) the function \(X(t,\cdot )\) is a random variable.

Let \(I\subset {\mathbb {R}}\) be an interval and \(n\geqslant 2\). We say that a stochastic process \(S:I^n \times \Omega \rightarrow {\mathbb {R}}\) is Schur-convex if

$$\begin{aligned} x\preccurlyeq y\ \ \implies \ \ S(x,\cdot )\leqslant S(y,\cdot ) \quad \text {(a.e.)}\end{aligned}$$

for all \(x,y\in I^n\).

Recall also that a stochastic process \(X:I\times \Omega \rightarrow {\mathbb {R}}\) is called:

  • convex if for all \( x,y \in I \) and \(t \in [0,1]\) the following inequality holds

    $$\begin{aligned} X(tx + (1 -t) y,\cdot ) \leqslant t X(x,\cdot ) + (1 - t)X(y,\cdot ) \quad \text {(a.e.)}; \end{aligned}$$
    (1)
  • Wright-convex if for all \( x,y \in I \) and \(t \in [0,1]\) the following condition holds

    $$\begin{aligned} X(tx + (1 -t)y,\cdot ) + X ((1-t)x + ty,\cdot ) \leqslant X(x,\cdot )+X(y,\cdot ) \quad \text {(a.e.)}; \end{aligned}$$
    (2)
  • Jensen-convex if (1) is assumed only for \(t = \frac{1}{2}\), that is, for all \( x,y \in I\)

    $$\begin{aligned} X \Bigl (\frac{x + y}{2},\cdot \Bigr )\leqslant \frac{X(x,\cdot ) + X(y,\cdot )}{2} \quad \text {(a.e.)}. \end{aligned}$$
    (3)

For some properties of the above classes of processes we refer to [2, 4, 5, 10, 13, 14, 15].

2 Main results

We first prove that convex stochastic processes generate Schur-convex sums.

Theorem 1

If a stochastic process \(X:I \times \Omega \rightarrow {\mathbb {R}}\) is convex, then for every \(n \geqslant 2\) the stochastic process \(S_X:I^n \times \Omega \rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} S_X(x_1,\dots ,x_n, \omega )=X(x_1, \omega )+\dots +X(x_n, \omega ), \quad (x_1,\dots ,x_n, \omega )\in I^n \times \Omega , \end{aligned}$$

is Schur-convex.

Proof

Assume that X is a convex stochastic process. Let \(x=(x_1,\dots ,x_n)\), \(y=(y_1,\dots ,y_n)\) be elements of \(I^n\) such that \(x \preccurlyeq y\). Then there exists a doubly stochastic \(n\times n\) matrix \(P=[t_{ij}]\) satisfying the condition \(x=yP\). Hence

$$\begin{aligned} x_j = \sum _{i=1}^{n} t_{ij}y_i, \quad j=1,\dots ,n, \end{aligned}$$

and by the convexity of X, we have

$$\begin{aligned}&X(x_1, \cdot )+ \dots +X(x_n, \cdot ) \\&\quad = \sum _{j=1}^{n}X(x_j,\cdot )=\sum _{j=1}^n X\Bigl (\sum _{i=1}^n t_{ij}y_i,\cdot \Bigr )\\&\quad \leqslant \sum _{j=1}^n\sum _{i=1}^n t_{ij}X(y_i,\cdot )=\sum _{i=1}^n\sum _{j=1}^n t_{ij}X(y_i,\cdot )\\&\quad =\sum _{i=1}^n X(y_i,\cdot )\sum _{j=1}^n t_{ij}=X(y_1,\cdot ) + \dots + X(y_n,\cdot ) \quad \text {(a.e.)}. \end{aligned}$$

The above calculation shows that \(S_X(x_1,\dots ,x_n, \cdot )\leqslant S_X(y_1,\dots ,y_n, \cdot )\), which means that \(S_X\) is a Schur-convex stochastic process. \(\square \)
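The computation in the proof can be illustrated numerically. The sketch below (our own, not part of the argument) takes the deterministic slice \(f(t)=t^2\), i.e. a convex process independent of \(\omega\), applies a concrete doubly stochastic matrix P, and checks that the sum decreases under majorization, as Theorem 1 predicts.

```python
# Deterministic convex slice f(t) = t**2 of a convex stochastic process.
f = lambda t: t * t

def apply_ds(y, P):
    """Compute x = y·P, i.e. x_j = sum_i P[i][j] * y_i."""
    n = len(y)
    return [sum(P[i][j] * y[i] for i in range(n)) for j in range(n)]

# A doubly stochastic matrix: all rows and columns sum to 1.
P = [[0.5, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]
y = [1.0, 4.0, 7.0]
x = apply_ds(y, P)               # x ≼ y by the definition of majorization

S_x = sum(f(t) for t in x)       # S_X(x) for the slice f
S_y = sum(f(t) for t in y)       # S_X(y)
print(S_x <= S_y)                # True
```

Note that the totals of x and y agree, as they must for any doubly stochastic P.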

Many inequalities connected with convex stochastic processes can be obtained from the above theorem. The corollaries below give some examples.

Corollary 2

(Jensen inequality [10]). If a stochastic process \(X:I \times \Omega \rightarrow {\mathbb {R}}\) is convex, then for every \(n\in {\mathbb {N}}\) and \(x_1,\dots ,x_n \in I\),

$$\begin{aligned} X\Big ( \frac{x_1+ \cdots +x_n}{n},\cdot \Big ) \leqslant \frac{X(x_1,\cdot )+ \cdots + X(x_n,\cdot )}{n} \quad \text {(a.e.)}. \end{aligned}$$

Proof

Put \({\bar{x}}=\frac{1}{n}(x_1+ \cdots +x_n).\) Then \(({\bar{x}},\dots ,{\bar{x}}) \preccurlyeq (x_1, \dots , x_n)\) and hence

$$\begin{aligned} X({\bar{x}},\cdot )+\cdots + X({\bar{x}},\cdot ) \leqslant X(x_1,\cdot )+ \cdots + X(x_n,\cdot ) \quad \text {(a.e.)}. \end{aligned}$$

\(\square \)

The next result is a counterpart of the classical Hardy–Littlewood–Pólya majorization theorem [3] (see also [7, 9]).

Corollary 3

Let \(I\subset {\mathbb {R}}\) be an interval and \(n\geqslant 2\). Assume also that \(x=(x_1,\ldots ,x_n), \ y=(y_1,\ldots ,y_n) \in I^n\) satisfy:

  1. (a)

    \(x_n\leqslant \cdots \leqslant x_1, \ y_n\leqslant \cdots \leqslant y_1\);

  2. (b)

    \(x_1 + \cdots + x_k \leqslant y_1 + \cdots + y_k\), \(k=1,\ldots ,n-1\);

  3. (c)

    \(x_1 + \cdots + x_n = y_1 + \cdots + y_n\).

If a stochastic process \(X:I \times \Omega \rightarrow {\mathbb {R}}\) is convex, then

$$\begin{aligned} X(x_1,\cdot )+ \cdots +X(x_n,\cdot ) \leqslant X(y_1,\cdot )+ \cdots + X(y_n,\cdot ) \quad \text {(a.e.)}. \end{aligned}$$

Proof

Note that assumptions (a)–(c) imply \(x \preccurlyeq y\) (see e.g. [7]) and apply Theorem 1. \(\square \)
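As an illustrative check of Corollary 3 (names and numbers here are our own, outside the formal proof), the conditions (a)–(c) and the resulting inequality for a deterministic convex slice can be verified directly:

```python
import math

def hlp_conditions(x, y):
    """Check conditions (a)-(c) of Corollary 3 for the tuples x, y."""
    n = len(x)
    dec = all(x[k] >= x[k + 1] for k in range(n - 1)) and \
          all(y[k] >= y[k + 1] for k in range(n - 1))              # (a)
    partial = all(sum(x[:k]) <= sum(y[:k]) for k in range(1, n))   # (b)
    total = abs(sum(x) - sum(y)) < 1e-12                           # (c)
    return dec and partial and total

x = (3.0, 3.0, 2.0)
y = (5.0, 2.0, 1.0)
assert hlp_conditions(x, y)

# The majorization inequality for the convex slice f = exp:
lhs = sum(map(math.exp, x))
rhs = sum(map(math.exp, y))
print(lhs <= rhs)  # True
```

Swapping the roles of x and y makes condition (b) fail, so the test is not symmetric, exactly as majorization is not.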

By Corollary 3 we get a counterpart of the Lim inequality (cf. [6]) for convex stochastic processes.

Corollary 4

Let \(a\geqslant 0\), \(b\geqslant 0\), \(c\geqslant a+b\) be real numbers, and let \(X:[0,+\infty )\times \Omega \rightarrow {\mathbb {R}}\) be a convex stochastic process. Then

$$\begin{aligned} X(a,\cdot )+X(b+c,\cdot )\geqslant X(a+b,\cdot )+X(c,\cdot )\quad \text {(a.e.)}. \end{aligned}$$
(4)

Proof

We use Corollary 3 for \(n=2\), \(x_1=c\), \(x_2=a+b\), \(y_1=b+c\), \(y_2=a\), and the inequality (4) follows. \(\square \)
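A quick numerical sanity check of inequality (4), using the deterministic convex slice \(f(t)=t^2\) and arbitrary values satisfying the hypotheses \(a,b\geqslant 0\), \(c\geqslant a+b\) (an illustration of ours, not part of the proof):

```python
# Lim-type inequality f(a) + f(b+c) >= f(a+b) + f(c) for convex f.
f = lambda t: t * t
a, b, c = 1.0, 2.0, 5.0          # hypotheses: a, b >= 0 and c >= a + b
lhs = f(a) + f(b + c)
rhs = f(a + b) + f(c)
print(lhs >= rhs)                # True
```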

The next result is connected with the Shannon information entropy.

Corollary 5

Let \(Y:\Omega \rightarrow {\mathbb {R}}\) be a random variable and \(H:(0,\infty )^n\times \Omega \rightarrow {\mathbb {R}}\) be a stochastic process defined by

$$\begin{aligned} H(p_1,\dots ,p_n,\omega )=-(p_1 \ln p_1 +\cdots +p_n\ln p_n)Y(\omega ). \end{aligned}$$

Then, for every \(p_1,\dots , p_n > 0\) such that \(p_1+\cdots +p_n=1\), we have

$$\begin{aligned} H\Big (\frac{1}{n},\dots ,\frac{1}{n},\cdot \Big )\geqslant H(p_1,\dots ,p_n, \cdot )\geqslant H(1,0,\dots ,0,\cdot ) \ \quad \text {(a.e.)}. \end{aligned}$$
(5)

Proof

The stochastic process \(X:(0,\infty )\times \Omega \rightarrow {\mathbb {R}}\) defined by \(X(p,\omega )=(p\ln p)Y(\omega )\) is convex (with the usual convention \(0\ln 0=0\) for the right-hand side of (5)). Since for every \(p_1,\dots , p_n > 0\) such that \(p_1+\cdots +p_n=1\)

$$\begin{aligned} \Big (\frac{1}{n},\dots ,\frac{1}{n}\Big ) \preccurlyeq (p_1,\dots ,p_n) \preccurlyeq (1,0,\dots ,0), \end{aligned}$$

we get (5) by Theorem 1. \(\square \)
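The deterministic slice of Corollary 5 (take \(Y\equiv 1\)) is the familiar fact that Shannon entropy is maximal at the uniform distribution and minimal at a point mass. A small numerical illustration of ours, not part of the proof:

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum p_i ln p_i, with the convention 0·ln 0 = 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

p = (0.5, 0.3, 0.2)
n = len(p)
uniform = (1.0 / n,) * n
# (1/n, ..., 1/n) ≼ p ≼ (1, 0, ..., 0), so the entropies are ordered:
print(entropy(uniform) >= entropy(p) >= entropy((1.0, 0.0, 0.0)))  # True
```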

We now show that processes generating Schur-convex sums must be Jensen-convex.

Theorem 6

If for some \(n\geqslant 2\) the process \(S_X:I^n \times \Omega \rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} S_X(x_1,\dots ,x_n, \omega )=X(x_1, \omega )+\dots +X(x_n, \omega ), \quad (x_1,\dots ,x_n, \omega )\in I^n \times \Omega , \end{aligned}$$

is Schur-convex, then the process  \(X:I \times \Omega \rightarrow {\mathbb {R}}\) is Jensen-convex.

Proof

Take \(y_1, y_2 \in I\) and put \(x_1=x_2=\frac{1}{2}(y_1+y_2)\). Let us consider the points \(y=(y_1,y_2,y_2,\dots ,y_2)\) and \( x=(x_1,x_2,y_2,\dots ,y_2)\) (if \(n=2\), we take \(y=(y_1,y_2)\), \(x=(x_1,x_2)\)). Then \(x=y\cdot P\) with

$$\begin{aligned} P= \begin{bmatrix} \frac{1}{2}&\quad \frac{1}{2}&\quad 0&\quad \cdots&\quad 0 \\ \frac{1}{2}&\quad \frac{1}{2}&\quad 0&\quad \cdots&\quad 0 \\ 0&\quad 0&\quad 1&\quad \cdots&\quad 0 \\ \vdots&\quad \vdots&\quad \vdots&\quad \ddots&\quad \vdots \\ 0&\quad 0&\quad 0&\quad \cdots&\quad 1 \\ \end{bmatrix} \end{aligned}$$

which means that \(x \preccurlyeq y\). Therefore, by the Schur-convexity of \(S_X\), we obtain

$$\begin{aligned} S_X (x_1,x_2,y_2,\dots ,y_2,\cdot )\leqslant S_X (y_1,y_2,y_2,\dots ,y_2,\cdot ) \quad \text {(a.e.)}, \end{aligned}$$

and finally

$$\begin{aligned} 2X\Bigl (\frac{y_1 + y_2}{2},\cdot \Bigr )\leqslant X(y_1,\cdot )+X(y_2,\cdot ) \quad \text {(a.e.)}. \end{aligned}$$

This shows that X is Jensen-convex, which was to be proved. \(\square \)
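The block-diagonal matrix P in the proof averages the first two coordinates and leaves the remaining ones unchanged. A small check (ours, for illustration) with \(n=4\):

```python
# The matrix P from the proof of Theorem 6, for n = 4.
n = 4
P = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
y = [1.0, 3.0, 5.0, 7.0]
x = [sum(P[i][j] * y[i] for i in range(n)) for j in range(n)]  # x = y·P
print(x)  # [2.0, 2.0, 5.0, 7.0] — the first two entries become (y1 + y2)/2
```

Since P is doubly stochastic, this exhibits \(x \preccurlyeq y\) directly from the definition.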

As shown in Theorems 1 and 6, if a stochastic process \(X:I \times \Omega \rightarrow {\mathbb {R}}\) is convex, then for every \(n\geqslant 2\) the corresponding process \(S_X\) is Schur-convex; and if for some \(n\geqslant 2\) the process \(S_X\) is Schur-convex, then X is Jensen-convex. The next theorem characterizes all processes X for which \(S_X\) is Schur-convex. It is a counterpart of the result of Ng [8] on functions generating Schur-convex sums.

Theorem 7

The following conditions are equivalent:

  1. (i)

    For every \(n\geqslant 2\) the stochastic process \(S_X:I^n \times \Omega \rightarrow {\mathbb {R}}\) defined by

    $$\begin{aligned} S_X (x_1,\dots ,x_n, \omega )=X(x_1, \omega )+\dots +X(x_n, \omega ), \quad (x_1,\dots ,x_n, \omega )\in I^n \times \Omega , \end{aligned}$$
    (6)

    is Schur-convex.

  2. (ii)

    For some \(n\geqslant 2\) the stochastic process  \(S_X\) given by (6) is Schur-convex.

  3. (iii)

    The stochastic process \(X:I\times \Omega \rightarrow {\mathbb {R}}\) is Wright-convex.

  4. (iv)

    There exist a convex stochastic process \(Y:I\times \Omega \rightarrow {\mathbb {R}}\) and an additive stochastic process \(A:I\times \Omega \rightarrow {\mathbb {R}}\) such that

    $$\begin{aligned} X(x,\cdot )=Y(x, \cdot )+A(x, \cdot ) \quad \text {(a.e.)}, \quad x\in I. \end{aligned}$$
    (7)

Proof

The implication \((i)\Rightarrow (ii)\) is obvious.

To see that \((ii)\Rightarrow (iii)\) holds, fix \(y_1, y_2\in I\) and \(t\in (0,1)\). Put \( x_1=ty_1+(1-t)y_2\), \(x_2=(1-t)y_1+ty_2\) and, if \(n>2\), take additionally \(x_i=y_i=z\in I\) for \(i=3,\dots ,n\). Then, by a similar argument to that in the proof of Theorem 6, we get

$$\begin{aligned} (x_1,\dots ,x_n)\preccurlyeq (y_1,\dots ,y_n) . \end{aligned}$$

By the Schur-convexity of \(S_X\) we have

$$\begin{aligned} S_X(x_1,\dots ,x_n,\cdot )\leqslant S_X(y_1,\dots ,y_n,\cdot ) \quad \text {(a.e.)}. \end{aligned}$$

Therefore

$$\begin{aligned} X\bigl (ty_1+(1-t)y_2,\cdot \bigr )+X\bigl ((1-t)y_1+ty_2,\cdot \bigr )\leqslant X(y_1,\cdot )+X(y_2,\cdot )\quad \text {(a.e.)}, \end{aligned}$$

which shows that the process X is Wright-convex.

The implication \((iii)\Rightarrow (iv)\) follows from Skowroński’s theorem giving the representation of Wright-convex stochastic processes (cf. [15], Theorem on page 31).

To prove \((iv)\Rightarrow (i)\) assume that X has the representation of the form (7). By the convexity of Y and by Theorem 1 the process Y generates Schur-convex sums \(S_Y\), that is, for any \( x=(x_1,\dots ,x_n)\preccurlyeq y=(y_1,\dots ,y_n) \), the following condition holds

$$\begin{aligned} Y(x_1,\cdot )+\cdots +Y(x_n,\cdot ) \leqslant Y(y_1,\cdot )+\cdots +Y(y_n,\cdot )\quad \text {(a.e.)}. \end{aligned}$$

Note also that, by the additivity of A and arguing similarly to the proof of Theorem 1, we get

$$\begin{aligned} A(x_1,\cdot )+\cdots +A(x_n,\cdot )&= A(x_1+\cdots +x_n, \cdot )\\&= A(y_1+\cdots +y_n, \cdot )=A(y_1,\cdot )+\cdots +A(y_n,\cdot )\quad \text {(a.e.)}. \end{aligned}$$

Using the above inequalities we obtain

$$\begin{aligned} X(x_1, \cdot )+ \cdots + X(x_n, \cdot )&= Y(x_1,\cdot )+\cdots + Y(x_n,\cdot )+ A(x_1,\cdot )+\cdots +A(x_n,\cdot )\\&\leqslant Y(y_1,\cdot )+\cdots +Y(y_n,\cdot )+ A(y_1,\cdot )+\cdots +A(y_n,\cdot )\\&=X(y_1, \cdot )+ \dots +X(y_n, \cdot ) \quad \text {(a.e.)}, \end{aligned}$$

which shows that \(S_X\) is Schur-convex. This finishes the proof. \(\square \)