1 Introduction

Concentration inequalities are useful technical tools for studying limit theory in classical probability and statistics; they bound the probability that a random variable deviates from some fixed value. The law of large numbers, the central limit theorem, and the law of the iterated logarithm can all be regarded as consequences of concentration inequalities. Among them, the Bernstein-type inequality plays an especially important role: it bounds the deviation of a sum of independent random variables from its mean, while McDiarmid’s inequality bounds the deviations of a Doob martingale in a probability space. More precisely, let Ω be a sample space, \(\mathcal{F}\) be a Borel field of subsets of Ω, and P be a probability measure on \(\mathcal{F}\). Suppose that \(X_{1}, X_{2}, \ldots , X_{n}, \ldots \) is a sequence of independent, zero-mean random variables defined on the probability space \((\varOmega , \mathcal{F}, \mathrm{P})\). Denote by \(S_{n}\) the partial sum of this sequence, namely \(S_{n} \triangleq \sum_{k=1}^{n} X_{k}\), and set \(v_{n}^{2} \triangleq \frac{1}{n} \sum_{k=1}^{n} \mathrm{E}[X_{k}^{2}]\). For any \(x > 0\), Bernstein proved in [2] that

$$ \mathrm{P}(S_{n} \geq nx) \leq e^{- \frac{nx^{2}}{2(v_{n}^{2}+cx)}} $$
(1.1)

under the standard Bernstein condition, namely that there exists a positive constant c such that, for every \(1 \leq k \leq n\) and every integer \(p \geq 3\),

$$ \mathrm{E} \bigl[ \vert X_{k} \vert ^{p} \bigr] \leq \frac{p!c^{p-2}}{2} \mathrm{E} \bigl[X_{k}^{2} \bigr]. $$

Since then, several related inequalities have been established, such as Hoeffding’s inequality [1] and Bennett’s inequality [1]. In particular, Rio [21] improved (1.1) under the weaker condition

$$ \sum_{k=1}^{n} \mathrm{E} \bigl[ \bigl(X_{k}^{+} \bigr)^{p} \bigr] \leq \frac{p!c^{p-2}}{2}\sum_{k=1}^{n} \mathrm{E} \bigl[X_{k}^{2} \bigr]. $$
(1.2)
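To illustrate (1.1) numerically, the following short Python sketch (purely illustrative; the uniform distribution, sample size, and threshold are our own choices and not part of the discussion above) compares a Monte Carlo estimate of \(\mathrm{P}(S_{n} \geq nx)\) for centered variables uniform on \([-c,c]\), which satisfy the Bernstein condition with constant c, against the right-hand side of (1.1).

```python
import numpy as np

# Minimal Monte Carlo sketch of (1.1), assuming centered Uniform[-c, c] variables.
# They satisfy the Bernstein condition with constant c, since
# E|X_k|^p <= c^(p-2) E[X_k^2] <= (p!/2) c^(p-2) E[X_k^2] for every integer p >= 3.
rng = np.random.default_rng(0)
n, c, x, trials = 200, 1.0, 0.1, 50_000

samples = rng.uniform(-c, c, size=(trials, n))
S_n = samples.sum(axis=1)
v_n2 = c**2 / 3.0                     # E[X_k^2] for Uniform[-c, c]

empirical = np.mean(S_n >= n * x)     # Monte Carlo estimate of P(S_n >= n x)
bernstein = np.exp(-n * x**2 / (2.0 * (v_n2 + c * x)))
print(f"Monte Carlo tail: {empirical:.4f}   Bernstein bound (1.1): {bernstein:.4f}")
```

The simulated tail probability should always lie below the exponential bound, although the bound is in general not tight.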

McDiarmid’s inequality was first proved in [17] by using martingale theory and was reproved by Ying in [23]. This inequality shows that, for independent random variables \(X_{1},\ldots ,X_{n}\), any \(\varepsilon >0\), and any \(f:\mathbb{R}^{n} \rightarrow \mathbb{R}\) with bounded differences \(\{c_{k}\}_{k=1}^{n}\),

$$ \mathrm{P} \bigl(f(X_{1},\ldots ,X_{n})-\mathrm{E} \bigl[f(X_{1}, \ldots ,X_{n}) \bigr] \geq \varepsilon \bigr) \leq e^{- \frac{2\varepsilon ^{2}}{\sum _{k=1}^{n} c_{k}^{2}}}. $$
(1.3)

A function f with bounded differences \(\{c_{k}\}_{k=1}^{n}\) means that

$$ \sup_{x_{1},\ldots ,x_{k-1},x_{k},x_{k}',x_{k+1},\ldots ,x_{n}} \bigl\vert f(x_{1}, \ldots ,x_{k-1},x_{k},x_{k+1},\ldots ,x_{n}) - f \bigl(x_{1},\ldots ,x_{k-1},x_{k}',x_{k+1}, \ldots ,x_{n} \bigr) \bigr\vert \leq c_{k} $$

for all \(1 \leq k \leq n\).

Motivated by model uncertainty, the theory of nonlinear expectations and nonadditive probabilities has emerged. Nonlinear expectations have been used in a wide range of realistic situations, such as risk measures in finance and statistical uncertainty in decisions. The general framework of the nonlinear expectation space was proposed by Peng [20], who established the fundamental theory of sublinear expectation spaces and redefined the concepts of independence, identical distribution, the law of large numbers, and the central limit theorem in this setting. Recently, some researchers have obtained large deviation principles under the nonlinear framework. Chen and Xiong [8] studied the large deviation principle for diffusion processes under the g-expectation; Hu [10] obtained the upper bound of Cramér’s theorem for capacities; Gao and Xu [11] proved the large deviation principle for independent random variables in a sublinear expectation space; Chen and Feng [24] established the large deviation principle for negatively dependent random variables under sublinear expectations.

Many authors have also begun to study the corresponding inequalities in the sublinear framework, including upper expectation spaces, where upper expectations are typical sublinear expectations generated by a family of probabilities. For instance, Chen et al. [7] presented several elementary inequalities in an upper expectation space, including Hölder’s inequality, Chebyshev’s inequality, and Jensen’s inequality; Wu [22] proved maximal inequalities, exponential inequalities, and the Marcinkiewicz–Zygmund inequality for partial sums of independent random variables in an upper expectation space; Zhang [25] showed Rosenthal’s inequalities for negatively dependent random variables; Zhang [26] proved Kolmogorov-type exponential inequalities for the partial sums of independent random variables as well as negatively dependent random variables under sublinear expectations; Huang and Wu [14] obtained the equivalence between the Kolmogorov maximal inequality and the Hájek–Rényi maximal inequality, both in the moment and capacity forms, in sublinear expectation spaces. All of these inequalities can be used to investigate limit theory. More detailed information about these results under sublinear expectations can be found in [3–5, 7–9, 12, 16, 18–20] and the references therein.

As is well known, concentration inequalities play an important role in limit theory, especially for the large deviation principle and for convergence rates. However, there are few results on the convergence rates of limit theorems in sublinear expectation spaces. Our motivation is to study the Bernstein-type inequality (1.1) under the Rio–Bernstein condition (1.2) and McDiarmid’s inequality (1.3) in upper expectation spaces, and then to apply them to improve several types of laws of large numbers and to obtain convergence rates.

The organization of this paper is as follows. We first recall some preliminary definitions and notation about upper probabilities and sublinear expectation spaces in Sect. 2. Section 3 studies the Bernstein-type inequality under the Rio–Bernstein condition and McDiarmid’s inequality for the upper probability; these results are then used to discuss the convergence rates of laws of large numbers in Sect. 4.

2 Preliminaries

Let \((\varOmega , \mathcal{F})\) be a measurable space and \(\mathcal{H}\) be a linear space of random variables defined on Ω. In this paper, we suppose that \(\mathcal{H}\) satisfies \(c \in \mathcal{H}\) for each constant c and \(|X| \in \mathcal{H}\) if \(X \in \mathcal{H}\).

Definition 2.1

([20, Definition 1.1.1])

A sublinear expectation \(\mathbb{E}\) is a functional \(\mathbb{E}: \mathcal{H} \rightarrow \mathbb{R}\) satisfying

  (1) Monotonicity: \(X \geq Y\) implies \(\mathbb{E}[X] \geq \mathbb{E}[Y]\).

  (2) Constant preserving: \(\mathbb{E}[c]=c\) for \(c \in \mathbb{R}\).

  (3) Subadditivity: For each \(X, Y \in \mathcal{H}\), \(\mathbb{E}[X+Y] \leq \mathbb{E}[X] + \mathbb{E}[Y]\).

  (4) Positive homogeneity: \(\mathbb{E}[\lambda X] = \lambda \mathbb{E}[X] \) for \(\lambda \geq 0\).

The triplet \((\varOmega ,\mathcal{H},\mathbb{E})\) is called a sublinear expectation space. Generally, we consider the following sublinear expectation space \((\varOmega ,\mathcal{H},\mathbb{E})\): if \(X_{1}, \ldots , X_{n} \in \mathcal{H}\), then \(\varphi (X_{1},\ldots ,X_{n}) \in \mathcal{H}\) for each \(\varphi \in C_{l,\mathrm{Lip}}(\mathbb{R}^{n})\), where \(C_{l,\mathrm{Lip}}(\mathbb{R}^{n})\) denotes the linear space of functions φ satisfying the following local Lipschitz condition:

$$ \bigl\vert \varphi (x)-\varphi (y) \bigr\vert \leq C \bigl(1+ \vert x \vert ^{m} + \vert y \vert ^{m} \bigr) \vert x-y \vert \quad \text{for } x,y \in \mathbb{R}^{n}, \text{ some } C >0, m \in \mathbb{N} \text{ depending on } \varphi . $$

In this case \(X=(X_{1},\ldots ,X_{n})\) is called an n-dimensional random vector, denoted by \(X \in \mathcal{H}^{n}\).

Definition 2.2

([20, Definition 1.3.1, Proposition 1.3.2])

Let \(X_{1}\) and \(X_{2}\) be two n-dimensional random vectors defined on nonlinear expectation spaces \((\varOmega _{1},\mathcal{H}_{1},\mathbb{E}_{1})\) and \((\varOmega _{2},\mathcal{H}_{2},\mathbb{E}_{2})\), respectively. They are called identically distributed, denoted by \(X_{1} \overset{d}{=} X_{2}\), if

$$ \mathbb{E}_{1} \bigl[\varphi (X_{1}) \bigr] = \mathbb{E}_{2} \bigl[\varphi (X_{2}) \bigr] \quad \text{for all } \varphi \in C_{l,\mathrm{Lip}} \bigl(\mathbb{R}^{n} \bigr). $$

Definition 2.3

([20, Definition 1.3.11])

In a sublinear expectation space \((\varOmega ,\mathcal{H},\mathbb{E})\), a random vector \(Y \in \mathcal{H}^{n}\) is said to be independent of another random vector \(X \in \mathcal{H}^{m}\) under \(\mathbb{E}[\cdot ]\) if, for each function \(\varphi \in C_{l,\mathrm{Lip}}(\mathbb{R}^{m+n})\), we have

$$ \mathbb{E} \bigl[\varphi (X,Y) \bigr] = \mathbb{E} \bigl[ \mathbb{E} \bigl[ \varphi (x,Y) \bigr]|_{x = X} \bigr]. $$

Remark 2.1

It is important to observe that, under a nonlinear expectation, the statement that Y is independent of X does not in general imply that X is independent of Y. An example constructed in [20, Example 1.3.15] illustrates this asymmetry. In fact, Hu and Li [13, Theorem 15] showed that, for two nontrivial random variables X and Y in a sublinear expectation space, if X is independent of Y and Y is independent of X, then X and Y must be maximally distributed.

Let \(\mathcal{M}\) be the collection of all probability measures on \((\varOmega ,\mathcal{F})\). Given any nonempty subset \(\mathcal{P} \subseteq \mathcal{M}\), define

$$\begin{aligned}& \mathbb{V}(A):= \sup_{\mathrm{P} \in \mathcal{P}} \mathrm{P}(A), \quad A \in \mathcal{F}, \\& \nu (A):= \inf_{\mathrm{P} \in \mathcal{P}}\mathrm{P}(A), \quad A \in \mathcal{F}, \end{aligned}$$

as the upper probability and lower probability, respectively. Obviously, \(\mathbb{V}\) and ν are conjugate to each other, that is,

$$ \mathbb{V}(A)+ \nu \bigl(A^{c} \bigr)=1, $$

where \(A^{c}\) is the complement set of A.

The upper expectation \(\mathbb{E}[\cdot ]\) and the lower expectation \(\mathcal{E}[\cdot ]\) generated by \(\mathcal{P}\) can be defined respectively as follows [15]:

$$\begin{aligned}& \mathbb{E}[X] = \mathbb{E}^{\mathcal{P}}[X]:= \sup_{\mathrm{P} \in \mathcal{P}} \mathrm{E}_{\mathrm{P}}[X], \\& \mathcal{E}[X] = \mathcal{E}^{\mathcal{P}}[X]:= \inf_{ \mathrm{P} \in \mathcal{P}} \mathrm{E}_{\mathrm{P}}[X], \end{aligned}$$

for each \(X \in L^{0}(\varOmega )\), where \(L^{0}(\varOmega )\) is the space of all \(X \in \mathcal{H}\) such that \(\mathrm{E}_{\mathrm{P}}[X]\) exists for each \(\mathrm{P} \in \mathcal{P}\). In this case, \((\varOmega ,\mathcal{H},\mathbb{E})\) is called an upper expectation space. It is easy to check that the upper expectation \(\mathbb{E}[\cdot ]\) is also a sublinear expectation. We also consider the random variables in the following spaces:

$$\begin{aligned}& \mathcal{L}^{p}:= \bigl\{ X \in L^{0}(\varOmega ):\mathbb{E} \bigl[ \vert X \vert ^{p} \bigr]< \infty \bigr\} ; \qquad \mathcal{L}:= \bigcap_{p\geq 1}\mathcal{L}^{p}; \\& \mathcal{L}^{\infty }:= \bigl\{ X \in L^{0}(\varOmega ): \text{there exists a constant } M \text{ such that } \vert X \vert \leq M \bigr\} \subset \mathcal{L}. \end{aligned}$$
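The next sketch (a toy three-point sample space with a hypothetical finite family \(\mathcal{P}\); the names and numbers are ours and purely illustrative) shows how \(\mathbb{V}\), ν, and the upper expectation are generated by \(\mathcal{P}\), and checks numerically the conjugacy relation \(\mathbb{V}(A)+\nu (A^{c})=1\) together with the subadditivity of \(\mathbb{E}\) from Definition 2.1.

```python
import itertools
import numpy as np

# Toy sample space Omega = {0, 1, 2} and a hypothetical finite family P of
# probability vectors; all numbers are illustrative.
omega = [0, 1, 2]
family = [np.array([0.2, 0.3, 0.5]),
          np.array([0.4, 0.4, 0.2]),
          np.array([0.1, 0.6, 0.3])]

def indicator(A):
    return np.array([1.0 if w in A else 0.0 for w in omega])

def upper_prob(A):                      # V(A) = sup_P P(A)
    return max(p @ indicator(A) for p in family)

def lower_prob(A):                      # nu(A) = inf_P P(A)
    return min(p @ indicator(A) for p in family)

def upper_exp(X):                       # E[X] = sup_P E_P[X], X a vector on Omega
    return max(p @ X for p in family)

# Conjugacy: V(A) + nu(A^c) = 1 for every event A.
for r in range(len(omega) + 1):
    for A in itertools.combinations(omega, r):
        Ac = [w for w in omega if w not in A]
        assert abs(upper_prob(A) + lower_prob(Ac) - 1.0) < 1e-12

# Subadditivity of the upper expectation: E[X + Y] <= E[X] + E[Y].
X, Y = np.array([1.0, -2.0, 3.0]), np.array([0.5, 4.0, -1.0])
assert upper_exp(X + Y) <= upper_exp(X) + upper_exp(Y) + 1e-12
print("conjugacy and subadditivity checks passed")
```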

At the end of this section, we give an example of a sublinear expectation [20, Example 1.1.5].

Example

In a game a gambler randomly picks a ball from an urn containing W white, B black, and Y yellow balls. The owner of the urn, who is the banker of the game, does not tell the gambler the exact numbers of W, B, and Y. He/She only ensures that \(W + B + Y = 100\) and \(W = B \in [20, 25]\). Let ξ be a random variable defined by

$$ \xi = \textstyle\begin{cases} 1, & \text{if the picked ball is white;} \\ 0, & \text{if the picked ball is yellow;} \\ -1, & \text{if the picked ball is black.} \end{cases} $$

We know that the distribution of ξ is

$$ \begin{Bmatrix} -1&0&1 \\ \frac{p}{2}&1-p&\frac{p}{2} \end{Bmatrix} \text{with uncertainty: } p \in [0.4,0.5]. $$

Thus the robust expectation of ξ is

$$\begin{aligned} \mathbb{E} \bigl[\varphi (\xi ) \bigr] \triangleq \sup_{p \in [0.4,0.5]} \biggl[ \frac{p}{2} \bigl(\varphi (1)+\varphi (-1) \bigr)+(1-p)\varphi (0) \biggr], \quad \forall \varphi \in C_{l,\mathrm{Lip}}(\mathbb{R}). \end{aligned}$$

In particular, \(\mathbb{E}[\xi ]=0\).
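As a numerical check of this example, the sketch below (illustrative code; since the expression is affine in p, the supremum over \([0.4,0.5]\) is attained at an endpoint) evaluates the robust expectation for a few test functions and confirms \(\mathbb{E}[\xi ]=0\), while \(\mathbb{E}[\xi ^{2}]\) and \(-\mathbb{E}[-\xi ^{2}]\) differ, reflecting the sublinearity.

```python
# Robust expectation for the urn example: E[phi(xi)] is the supremum over
# p in [0.4, 0.5] of (p/2)(phi(1) + phi(-1)) + (1 - p) phi(0).  The expression
# is affine in p, so the supremum is attained at an endpoint of the interval.
def robust_exp(phi, p_low=0.4, p_high=0.5):
    return max(p / 2 * (phi(1) + phi(-1)) + (1 - p) * phi(0)
               for p in (p_low, p_high))

print(robust_exp(lambda t: t))         # E[xi]     = 0.0
print(robust_exp(lambda t: -t))        # E[-xi]    = 0.0, so E[xi] = -E[-xi] = 0
print(robust_exp(lambda t: t * t))     # E[xi^2]   = 0.5  (attained at p = 0.5)
print(-robust_exp(lambda t: -t * t))   # -E[-xi^2] = 0.4, the lower expectation of xi^2
```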

3 Concentration inequalities for upper probabilities

The following theorem gives the Bernstein-type inequality in an upper probability space.

Theorem 3.1

Let \(\{X_{i}\}_{i=1}^{\infty } \subset \mathcal{L}\) be a sequence of random variables in an upper expectation space \((\varOmega ,\mathcal{H},\mathbb{E})\). Assume that \(e^{X} \in \mathcal{H}\) whenever \(X \in \mathcal{H}\) and that, for any given integer \(n \geq 1\) and all \(t \geq 0\),

$$ \mathbb{E} \Biggl[\prod_{i=1}^{n+1}e^{tX_{i}} \Biggr]= \mathbb{E} \Biggl[\prod_{i=1}^{n}e^{tX_{i}} \Biggr] \cdot \mathbb{E} \bigl[e^{tX_{n+1}} \bigr]. $$
(3.1)

Denote

$$\begin{aligned}& \bar{S}_{n} = \sum_{i=1}^{n} \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr), \\& B_{n}^{2}=\sum_{i=1}^{n} \mathbb{E} \bigl[ \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr)^{2} \bigr], \\& b_{n}^{2}=\frac{1}{n}\sum _{i=1}^{n} \mathbb{E} \bigl[ \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr)^{2} \bigr]= \frac{B_{n}^{2}}{n}. \end{aligned}$$

Suppose that there exists a constant \(c > 0\) such that, for any integer \(p \geq 3\),

$$ \sum_{i=1}^{n}\mathbb{E} \bigl[ \bigl( \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr)^{+} \bigr)^{p} \bigr] \leq \frac{c^{p-2} p!}{2}B_{n}^{2}. \quad (\textit{Rio--Bernstein condition}). $$
(3.2)

Then, for any \(x > 0\),

$$\begin{aligned} \begin{aligned}[b] \mathbb{V}(\bar{S}_{n} \geq nx) & \leq \biggl(1+ \frac{x^{2}}{2(b_{n}^{2}+cx)} \biggr)^{n} e^{- \frac{nx^{2}}{b_{n}^{2}+cx}} \\ & \leq e^{-\frac{nx^{2}}{2(b^{2}_{n}+cx)}}. \end{aligned} \end{aligned}$$
(3.3)

Proof

Notice that, by the concavity of the logarithm,

$$\begin{aligned} \begin{aligned}[b] \frac{1}{n}\log \prod_{i=1}^{n} \mathbb{E} \bigl[ e^{t(X_{i}- \mathbb{E}[X_{i}])} \bigr] &=\frac{1}{n} \sum _{i=1}^{n} \log \mathbb{E} \bigl[ e^{t(X_{i}-\mathbb{E}[X_{i}])} \bigr] \\ &\leq \log \Biggl(\frac{1}{n}\sum_{i=1}^{n} \mathbb{E} \bigl[ e^{t(X_{i}- \mathbb{E}[X_{i}])} \bigr] \Biggr). \end{aligned} \end{aligned}$$
(3.4)

Meanwhile, for any \(x \in \mathbb{R}\), note the fact that

$$ e^{x} \leq 1+x+\frac{x^{2}}{2}+\sum_{p=3}^{\infty } \frac{(x^{+})^{p}}{p!}, $$

where \(x^{+} \) denotes \(0 \vee x\). That is,

$$ e^{t(X_{i}-\mathbb{E}[X_{i}])}\leq 1+t \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr)+ \frac{t^{2}(X_{i}-\mathbb{E}[X_{i}])^{2}}{2} + \sum_{p=3}^{\infty } \frac{t^{p}}{p!} \bigl( \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr)^{+} \bigr)^{p}. $$

Because of the monotonicity, constant-preserving property, positive homogeneity, and countable subadditivity of \(\mathbb{E}\), we have

$$ \begin{aligned} &\mathbb{E} \bigl[e^{t(X_{i}-\mathbb{E}[X_{i}])} \bigr] \\ &\quad \leq 1+t \mathbb{E} \bigl[ \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr) \bigr]+ \frac{t^{2}}{2}\mathbb{E} \bigl[ \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr)^{2} \bigr] +\sum _{p=3}^{\infty } \frac{t^{p}}{p!}\mathbb{E} \bigl[ \bigl( \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr)^{+} \bigr)^{p} \bigr] \\ &\quad =1+\frac{t^{2}}{2}\mathbb{E} \bigl[ \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr)^{2} \bigr]+ \sum _{p=3}^{\infty }\frac{t^{p}}{p!}\mathbb{E} \bigl[ \bigl( \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr)^{+} \bigr)^{p} \bigr]. \end{aligned} $$

It follows from (3.2) that

$$\begin{aligned} \begin{aligned}[b] & \log \Biggl(\frac{1}{n}\sum _{i=1}^{n}\mathbb{E} \bigl[ e^{t(X_{i}- \mathbb{E}[X_{i}])} \bigr] \Biggr) \\ &\quad\leq \log \Biggl(1+ \frac{1}{n} \sum_{i=1}^{n} \Biggl( \frac{t^{2}}{2} \mathbb{E} \bigl[ \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr)^{2} \bigr] + \sum _{p=3}^{ \infty }\frac{t^{p}}{p!}\mathbb{E} \bigl[ \bigl( \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr)^{+} \bigr)^{p} \bigr] \Biggr) \Biggr) \\ &\quad = \log \Biggl(1+ \frac{1}{n} \Biggl(\frac{t^{2}}{2}\sum _{i=1}^{n} \mathbb{E} \bigl[ \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr)^{2} \bigr] + \sum _{p=3}^{ \infty } \Biggl( \frac{t^{p}}{p!} \cdot \sum _{i=1}^{n} \mathbb{E} \bigl[ \bigl( \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr)^{+} \bigr)^{p} \bigr] \Biggr) \Biggr) \Biggr) \\ &\quad \leq \log \Biggl( 1+ \sum_{p=2}^{\infty } \frac{c^{p-2}t^{p}}{2}b_{n}^{2} \Biggr) \\ &\quad = \log \biggl( 1+ \frac{t^{2}b_{n}^{2}}{2(1-ct)} \biggr). \end{aligned} \end{aligned}$$
(3.5)

In addition, according to Jensen’s inequality [7, Proposition 2.1],

$$ \mathbb{E} \bigl[ e^{t(X_{i}-\mathbb{E}[X_{i}])} \bigr] \geq e^{ \mathbb{E}[t(X_{i}-\mathbb{E}[X_{i}])]}=1, $$

hence the left-hand side of (3.5) is nonnegative; moreover, the geometric series in (3.5) converges only when \(0< ct<1\), which is satisfied by the choice of t made below.

For all \(t > 0\), by (3.1),

$$\begin{aligned} \mathbb{V}(\bar{S}_{n} \geq nx) &= \mathbb{V} \Biggl(\sum _{i=1}^{n} \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr) \geq nx \Biggr) \\ &\leq \mathbb{E} \biggl[ \frac{e^{t \sum _{i=1}^{n}(X_{i}-\mathbb{E}[X_{i}])}}{e^{tnx}} \biggr] \\ &=e^{-tnx} \mathbb{E} \bigl[ e^{t \sum _{i=1}^{n}(X_{i}- \mathbb{E}[X_{i}])} \bigr] \\ &=e^{-tnx} \prod_{i=1}^{n} \mathbb{E} \bigl[ e^{t(X_{i}- \mathbb{E}[X_{i}])} \bigr]. \end{aligned}$$
(3.6)

Thus, taking \(t= \frac{x}{b_{n}^{2}+cx}\) in (3.5) and (3.6), together with (3.4) and the elementary inequality \(1+u \leq e^{u}\), we get

$$ \begin{aligned} \mathbb{V}(\bar{S}_{n} \geq nx) & \leq \biggl(1+ \frac{x^{2}}{2(b_{n}^{2}+cx)} \biggr)^{n} \cdot e^{- \frac{nx^{2}}{b_{n}^{2}+cx}} \\ & \leq e^{-\frac{nx^{2}}{2(b^{2}_{n}+cx)}}. \end{aligned} $$

The proof is completed. □

Remark 3.1

The exponential independence in [6, Definition 2.3], which is similar to (3.1), is defined for all bounded Lipschitz functions.

Remark 3.2

In particular, when \(X_{i} \leq M\) uniformly, the Rio–Bernstein condition is satisfied with \(c = M\), and the conclusion of Theorem 3.1 holds. A similar result for \(\{X_{i}\}_{i=1}^{\infty } \subset \mathcal{L}^{\infty }\) with possibly different upper bounds was proved by Wu [22].

Remark 3.3

Suppose that \(\{-X_{i}\}_{i=1}^{\infty }\) satisfies the conditions in Theorem 3.1. More precisely, for any fixed n, there exists \(c > 0\) such that, for all integers \(p \geq 3\),

$$ \sum_{i=1}^{n} \mathbb{E} \bigl[ \bigl( \bigl(X_{i}-\mathcal{E}[X_{i}] \bigr)^{-} \bigr)^{p} \bigr] \leq \frac{c^{p-2}p!}{2} \sum _{i=1}^{n} \mathbb{E} \bigl[ \bigl(X_{i}- \mathcal{E}[X_{i}] \bigr)^{2} \bigr]. $$

Then we have

$$ \mathbb{V} \Biggl(\sum_{i=1}^{n} \bigl(X_{i}-\mathcal{E}[X_{i}] \bigr) \leq -nx \Biggr) \leq e^{-\frac{nx^{2}}{2(\tilde{b}^{2}_{n}+cx)}}, $$

where \(\tilde{b}^{2}_{n} \triangleq \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[(X_{i}-\mathcal{E}[X_{i}])^{2}]\). This inequality could be regarded as the other side of the Bernstein-type inequality.

In particular, if \(\mathbb{E}[X_{i}] = \mathcal{E}[X_{i}] = 0\) for every \(i \geq 1\) and

$$ \sum_{i=1}^{n} \mathbb{E} \bigl[ \vert X_{i} \vert ^{p} \bigr] \leq \frac{c^{p-2}p!}{2}\sum _{i=1}^{n} \mathbb{E} \bigl[X_{i}^{2} \bigr], $$

the result reduces to the two-sided inequality, i.e.,

$$ \mathbb{V} \bigl( \vert S_{n} \vert > nx \bigr) \leq 2e^{-\frac{nx^{2}}{2({b}^{2}_{n}+cx)}}, $$

where \(b^{2}_{n} = \tilde{b}^{2}_{n} = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[X_{i}^{2}]\).

Next we provide an example (with \(\mathbb{E}[X_{i}] \neq 0\)) in which assumption (3.2) holds.

Example

Reconsider the example in Sect. 2. Let \(W + B = 90\) and \(W \in [30, 60]\). Let ξ be a random variable defined by

$$ \xi = \textstyle\begin{cases} 1, & \text{if the picked ball is white;} \\ 0, & \text{if the picked ball is black.} \end{cases} $$

We know that the distribution of ξ is

$$ \begin{Bmatrix} 0&1 \\ 1-q&q \end{Bmatrix} \text{with uncertainty: } q \in \biggl[ \frac{1}{3},\frac{2}{3} \biggr]. $$

Thus the robust expectation of ξ is

$$\begin{aligned} \mathbb{E} \bigl[\varphi (\xi ) \bigr] \triangleq \sup_{q \in [1/3,2/3]} \bigl[q \varphi (1)+(1-q)\varphi (0) \bigr], \quad \forall \varphi \in C_{l,\mathrm{Lip}}( \mathbb{R}), \end{aligned}$$

and \(\mathbb{E}[\xi ]=2/3\), \(\mathbb{E}[(\xi -\mathbb{E}[\xi ])^{2}]=1/3\), \(\mathbb{E} [ ((\xi -\mathbb{E}[\xi ])^{+} )^{p} ] = \frac{2}{3^{p+1}}\) for every \(p \geq 3\). Now consider a sequence of random variables each identically distributed as ξ. By a simple computation, Rio–Bernstein condition (3.2) holds with \(c=1/3\).
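The moment values above and the claim that (3.2) holds with \(c=1/3\) can be checked numerically. The sketch below is illustrative: the finite grid of Bernoulli product measures is only a convenient sub-family, and the Monte Carlo part does not verify the independence assumption (3.1); it confirms the moment computations, verifies (3.2) for \(p=3,\ldots ,20\), and compares an estimate of \(\sup_{q} \mathrm{P}_{q}(\bar{S}_{n} \geq nx)\) with the exponential bound in (3.3).

```python
import math
import numpy as np

# Moments of xi in {0, 1} with P(xi = 1) = q, q in [1/3, 2/3]: each expectation
# below is affine in q, so its supremum is attained at an endpoint.
q_ends = (1/3, 2/3)
E_xi = max(q_ends)                                                    # E[xi] = 2/3
var_up = max(q * (1 - E_xi)**2 + (1 - q) * E_xi**2 for q in q_ends)   # = 1/3

# Rio-Bernstein condition (3.2) with c = 1/3 (identically distributed X_i, so the
# factor n cancels on both sides): 2/3^(p+1) <= c^(p-2) p!/2 * 1/3 for p >= 3.
c = 1/3
for p in range(3, 21):
    lhs = max(q * (1 - E_xi)**p for q in q_ends)    # E[((xi - E[xi])^+)^p]
    assert lhs <= c**(p - 2) * math.factorial(p) / 2 * var_up

# Monte Carlo illustration of (3.3): estimate sup_q P_q(bar_S_n >= n x) over a grid
# of i.i.d. Bernoulli(q) product measures and compare with the exponential bound.
rng = np.random.default_rng(1)
n, x, trials = 200, 0.1, 200_000
tail = max(np.mean(rng.binomial(n, q, size=trials) - n * E_xi >= n * x)
           for q in np.linspace(1/3, 2/3, 7))
bound = math.exp(-n * x**2 / (2 * (var_up + c * x)))
print(f"sup_q Monte Carlo tail: {tail:.4f}   Bernstein-type bound (3.3): {bound:.4f}")
```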

Before presenting McDiarmid’s inequality in the upper expectation space, we recall the definition of a function with bounded differences and a crucial lemma whose proof can be found in [12, Lemma 3.1].

Definition 3.1

We say \(f:\mathbb{R}^{n} \rightarrow \mathbb{R}\) is a function with bounded differences \(\{c_{k}\}_{k=1}^{n}\) if

$$ \sup_{x_{1},\ldots ,x_{k-1},x_{k},x_{k}',x_{k+1},\ldots ,x_{n}} \bigl\vert f(x_{1},\ldots ,x_{k-1},x_{k},x_{k+1},\ldots ,x_{n}) - f \bigl(x_{1},\ldots ,x_{k-1},x_{k}',x_{k+1}, \ldots ,x_{n} \bigr) \bigr\vert \leq c_{k} $$

for all \(1 \leq k \leq n\), where \(\{c_{k}\}_{k=1}^{n}\) is a finite sequence of nonnegative constants.

Lemma 3.1

([12, Lemma 3.1])

If a random variable X satisfies \(\mathbb{E}[X] \leq 0\) and \(m \leq X \leq M\), \(m, M \in \mathbb{R}\), then, for all \(h > 0\),

$$ \mathbb{E} \bigl[e^{hX} \bigr] \leq e^{\frac{1}{8}h^{2}(M-m)^{2}}. $$
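A quick numerical sanity check of Lemma 3.1 is given below; it assumes a toy family of two distributions on \(\{-1,1\}\) generating the upper expectation (an illustrative choice of ours), for which \(\mathbb{E}[X]=0 \leq 0\), \(m=-1\), and \(M=1\).

```python
import numpy as np

# Toy upper expectation generated by two distributions of X on {-1, 1}:
# P(X = 1) in {0.3, 0.5}, so E[X] = sup of the two means = 0 <= 0, m = -1, M = 1.
probs_of_one = [0.3, 0.5]
m, M = -1.0, 1.0

def upper_mgf(h):
    # E[e^{hX}] = sup_P (p e^{hM} + (1 - p) e^{hm})
    return max(p * np.exp(h * M) + (1 - p) * np.exp(h * m) for p in probs_of_one)

for h in np.linspace(0.01, 5.0, 100):
    assert upper_mgf(h) <= np.exp(h**2 * (M - m)**2 / 8) + 1e-12
print("bound of Lemma 3.1 verified on the grid of h")
```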

The following result gives McDiarmid’s inequality in upper expectation spaces.

Theorem 3.2

Let \(\{X_{i}\}_{i=1}^{\infty }\) be a sequence of random variables in an upper expectation space \((\varOmega ,\mathcal{H},\mathbb{E})\) such that, for any given integer \(n \geq 1\), \(X_{n+1}\) is independent of \((X_{1},\ldots ,X_{n})\). Suppose that \(\varphi : \mathbb{R}^{n} \rightarrow \mathbb{R}\) is a local Lipschitz function with bounded differences \(\{c_{k}\}_{k=1}^{n}\). Then, for any \(\varepsilon > 0\), it holds that

$$ \mathbb{V} \bigl( \varphi (X_{1},\ldots ,X_{n})-\mathbb{E} \bigl[\varphi (X_{1}, \ldots ,X_{n}) \bigr] \geq \varepsilon \bigr) \leq e^{- \frac{2\varepsilon ^{2}}{\sum _{k=1}^{n}c_{k}^{2}}}. $$
(3.7)

Proof

Note that

$$\begin{aligned} \begin{aligned}[b] & \varphi (X_{1},\ldots ,X_{n})- \mathbb{E} \bigl[\varphi (X_{1}, \ldots ,X_{n}) \bigr] \\ &\quad=\varphi (X_{1},\ldots ,X_{n})-\mathbb{E} \bigl[ \varphi (x_{1},\ldots ,x_{n-1},X_{n}) \bigr] |_{\substack{x_{i}=X_{i},\\ 1 \leq i \leq n-1}} \\ &\qquad{} + \mathbb{E} \bigl[\varphi (x_{1},\ldots ,x_{n-1},X_{n}) \bigr]|_{ \substack{x_{i}=X_{i},\\1 \leq i \leq n-1}} - \mathbb{E} \bigl[ \varphi (x_{1}, \ldots ,x_{n-2},X_{n-1},X_{n}) \bigr]|_{ \substack{x_{i}=X_{i},\\ 1 \leq i \leq n-2}} \\ &\qquad {}+ \mathbb{E} \bigl[\varphi (x_{1},\ldots ,x_{n-2},X_{n-1},X_{n}) \bigr] |_{\substack{x_{i}=X_{i},\\ 1 \leq i \leq n-2}} - \mathbb{E} \bigl[ \varphi (x_{1},\ldots ,x_{n-3},X_{n-2}, \ldots ,X_{n}) \bigr]|_{ \substack{x_{i}=X_{i},\\ 1 \leq i \leq n-3}} \\ &\quad \vdots \\ &\qquad{} + \mathbb{E} \bigl[\varphi (x_{1},X_{2},\ldots ,X_{n}) \bigr]|_{x_{1}=X_{1}} - \mathbb{E} \bigl[\varphi (X_{1},\ldots ,X_{n}) \bigr] \\ &\quad=\sum_{k=1}^{n} g_{k}(X_{1}, \ldots ,X_{k}), \end{aligned} \end{aligned}$$
(3.8)

where

$$\begin{aligned} \begin{aligned} &g_{n}(X_{1},\ldots ,X_{n})=g_{n}(x_{1}, \ldots ,x_{n})|_{ \substack{x_{i}=X_{i},\\ 1 \leq i \leq n}} \\ &\quad =\varphi (X_{1},\ldots ,X_{n})-\mathbb{E} \bigl[ \varphi (x_{1},\ldots ,x_{n-1},X_{n}) \bigr]|_{ \substack{x_{i}=X_{i},\\ 1 \leq i \leq n-1}}, \\ &g_{n-1}(X_{1},\ldots ,X_{n-1})=g_{n-1}(x_{1}, \ldots ,x_{n-1})|_{ \substack{x_{i}=X_{i},\\ 1 \leq i \leq n-1}} \\ &\quad = \mathbb{E} \bigl[\varphi (x_{1}, \ldots ,x_{n-1},X_{n}) \bigr]|_{ \substack{x_{i}=X_{i},\\ 1 \leq i \leq n-1}} - \mathbb{E} \bigl[\varphi (x_{1}, \ldots ,x_{n-2},X_{n-1},X_{n}) \bigr]|_{ \substack{x_{i}=X_{i},\\1 \leq i \leq n-2}}, \\ &\vdots \\ &g_{k}(X_{1},\ldots ,X_{k})=g_{k}(x_{1}, \ldots ,x_{k})|_{ \substack{x_{i}=X_{i},\\ 1 \leq i \leq k}} \\ &\quad = \mathbb{E} \bigl[\varphi (x_{1},\ldots ,x_{k},X_{k+1}, \ldots ,X_{n}) \bigr]|_{\substack{x_{i}=X_{i},\\ 1 \leq i \leq k}} - \mathbb{E} \bigl[\varphi (x_{1},\ldots ,x_{k-1},X_{k},\ldots ,X_{n}) \bigr]|_{ \substack{x_{i}=X_{i},\\1 \leq i \leq k-1}}, \\ &\vdots \\ &g_{1}(X_{1})= g_{1}(x_{1})|_{x_{1}=X_{1}} = \mathbb{E} \bigl[\varphi (x_{1},X_{2}, \ldots ,X_{n}) \bigr]|_{x_{1}=X_{1}} - \mathbb{E} \bigl[\varphi (X_{1}, \ldots ,X_{n}) \bigr]. \end{aligned} \end{aligned}$$
(3.9)

Applying Chebyshev’s inequality [7, Proposition 2.3], we obtain, for any \(h > 0\),

$$\begin{aligned} \begin{aligned}[b] & \mathbb{V} \bigl( \varphi (X_{1},\ldots ,X_{n})-\mathbb{E} \bigl[ \varphi (X_{1},\ldots ,X_{n}) \bigr] \geq \varepsilon \bigr) \\ &\quad\leq e^{-h\varepsilon }\mathbb{E} \bigl[e^{h (\varphi (X_{1}, \ldots ,X_{n}) - \mathbb{E}[\varphi (X_{1},\ldots ,X_{n})] )} \bigr] \\ &\quad=e^{-h\varepsilon }\mathbb{E} \bigl[e^{h\sum _{k=1}^{n} g_{k}(X_{1}, \ldots ,X_{k})} \bigr]. \end{aligned} \end{aligned}$$
(3.10)

Denote

$$ M_{k} \triangleq \sup_{x_{k}}g_{k}(x_{1},x_{2}, \ldots ,x_{k}) \quad \text{and} \quad m_{k} \triangleq \inf _{x_{k}}g_{k}(x_{1},x_{2}, \ldots ,x_{k}). $$

By the definition of \(g_{k}(x_{1},\ldots ,x_{k})\), we have

$$ m_{k} \leq g_{k}(x_{1},\ldots ,x_{k}) \leq M_{k} \quad \text{and} \quad 0\leq M_{k}-m_{k} \leq c_{k}. $$

In fact, by the monotonicity and subadditivity of \(\mathbb{E}[\cdot ]\),

$$\begin{aligned} & M_{k}-m_{k} \\ &\quad = \sup_{x_{k}}g_{k}(x_{1},x_{2}, \ldots ,x_{k}) - \inf_{x_{k}}g_{k}(x_{1},x_{2}, \ldots ,x_{k}) \\ &\quad = \sup_{x_{k}} \mathbb{E} \bigl[\varphi (x_{1},\ldots ,x_{k},X_{k+1}, \ldots ,X_{n}) \bigr] - \inf_{x_{k}} \mathbb{E} \bigl[\varphi (x_{1},\ldots ,x_{k},X_{k+1}, \ldots ,X_{n}) \bigr] \\ &\quad =\sup_{x_{k},x_{k'}} \bigl( \mathbb{E} \bigl[\varphi (x_{1}, \ldots ,x_{k-1},x_{k},X_{k+1}, \ldots ,X_{n}) \bigr] - \mathbb{E} \bigl[\varphi (x_{1}, \ldots ,x_{k-1},x_{k'},X_{k+1}, \ldots ,X_{n}) \bigr] \bigr) \\ &\quad \leq \sup_{x_{k},x_{k'}} \mathbb{E} \bigl[ \bigl\vert \varphi (x_{1},\ldots ,x_{k-1},x_{k},X_{k+1}, \ldots ,X_{n}) - \varphi (x_{1},\ldots ,x_{k-1},x_{k'},X_{k+1}, \ldots ,X_{n}) \bigr\vert \bigr] \\ &\quad \leq c_{k}. \end{aligned}$$

Then due to the independence, it follows that

$$ \begin{aligned} & \mathbb{V} \bigl( \varphi (X_{1},\ldots ,X_{n})-\mathbb{E} \bigl[ \varphi (X_{1},\ldots ,X_{n}) \bigr] \geq \varepsilon \bigr) \\ &\quad\leq e^{-h\varepsilon }\mathbb{E} \bigl[e^{h\sum _{k=1}^{n-1} g_{k}(X_{1},\ldots ,X_{k})}\mathbb{E} \bigl[e^{hg_{n}(x_{1},\ldots ,x_{n-1}, X_{n})} \bigr]|_{\substack{x_{i}=X_{i},\\1 \leq i \leq n-1}} \bigr] \\ &\quad \leq e^{-h\varepsilon }\mathbb{E} \bigl[e^{h\sum _{k=1}^{n-1} g_{k}(X_{1},\ldots ,X_{k})}e^{\frac{h^{2}(M_{n}-m_{n})^{2}}{8}} \bigr] \\ &\quad \leq e^{-h\varepsilon }\mathbb{E} \bigl[e^{h\sum _{k=1}^{n-1} g_{k}(X_{1},\ldots ,X_{k})}e^{\frac{h^{2}c_{n}^{2}}{8}} \bigr] \\ & \qquad \vdots \\ &\quad \leq e^{-h \varepsilon + \frac{h^{2}\sum _{k=1}^{n} c_{k}^{2}}{8}}. \end{aligned} $$

Choosing \(h=\frac{4\varepsilon }{\sum_{k=1}^{n} c_{k}^{2}}\), McDiarmid’s inequality (3.7) holds. □
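To illustrate (3.7), the following sketch (again with an illustrative family of i.i.d. Bernoulli product measures, and with \(\varphi \) equal to the sample mean, which has bounded differences \(c_{k}=1/n\)) compares a Monte Carlo estimate of the left-hand side of (3.7) with the exponential bound on the right-hand side.

```python
import numpy as np

# Monte Carlo illustration of (3.7) with phi(x_1, ..., x_n) = (x_1 + ... + x_n)/n,
# which has bounded differences c_k = 1/n.  The upper probability is generated,
# purely for illustration, by i.i.d. Bernoulli(q) product measures with q on a grid.
rng = np.random.default_rng(2)
n, eps, trials = 200, 0.1, 200_000
q_grid = np.linspace(1/3, 2/3, 7)

E_phi = q_grid.max()                    # upper expectation of phi over this family
tail = max(np.mean(rng.binomial(n, q, size=trials) / n - E_phi >= eps)
           for q in q_grid)             # estimate of sup_q P_q(phi - E[phi] >= eps)
bound = np.exp(-2 * eps**2 * n)         # exp(-2 eps^2 / sum c_k^2) with c_k = 1/n
print(f"sup_q Monte Carlo tail: {tail:.4f}   McDiarmid bound (3.7): {bound:.4f}")
```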

4 Applications

As applications of the Bernstein-type inequality, we obtain the following strong laws of large numbers.

Corollary 4.1

Let \(\{X_{i}\}_{i=1}^{\infty }\) and \(\{-X_{i}\}_{i=1}^{\infty }\) both satisfy the conditions in Theorem 3.1. Suppose that there exists some constant M such that \(b_{n}^{2} \leq M\) and \(\tilde{b}_{n}^{2} \leq M\) uniformly in n. For every \(i \geq 1\), \(\mathbb{E}[X_{i}] = \overline{\mu }\), \(\mathcal{E}[X_{i}] =- \mathbb{E}[-X_{i}] = \underline{\mu }\), and set \(S_{n}= \sum_{i=1}^{n} X_{i}\). Then we have

$$ \mathbb{V} \Bigl( \Bigl\{ \liminf_{n \rightarrow \infty } S_{n}/n < \underline{\mu } \Bigr\} \cup \Bigl\{ \limsup _{n \rightarrow \infty } S_{n}/n > \overline{\mu } \Bigr\} \Bigr) = 0. $$
(4.1)

Moreover,

$$\begin{aligned}& \limsup_{n \rightarrow \infty }\frac{1}{n}\log \mathbb{V} \bigl( \{S_{n}/n \geq \overline{\mu } + \varepsilon \} \cup \{S_{n}/n \leq \underline{\mu } - \varepsilon \} \bigr) \leq {- \frac{\varepsilon ^{2}}{2(M+c \varepsilon )}}, \end{aligned}$$
(4.2)
$$\begin{aligned}& \limsup_{\varepsilon \rightarrow 0} \varepsilon ^{2} \sum _{n=1}^{ \infty } \mathbb{V} \bigl( \{S_{n}/n \geq \overline{\mu } + \varepsilon \} \cup \{S_{n}/n \leq \underline{\mu } - \varepsilon \} \bigr) \leq 2M. \end{aligned}$$
(4.3)

Proof

By the lower continuity and subadditivity of \(\mathbb{V}\), it is obvious that result (4.1) is equivalent to the conjunction of

$$ \mathbb{V} \Bigl( \limsup_{n \rightarrow \infty } S_{n}/n \geq \overline{\mu } + \varepsilon \Bigr)=0 $$

and

$$ \mathbb{V} \Bigl( \liminf_{n \rightarrow \infty } S_{n}/n \leq \underline{\mu } - \varepsilon \Bigr)=0 $$

for any \(\varepsilon > 0\). Given any \(\varepsilon > 0\), a direct consequence of Bernstein-type inequality (3.3) is

$$ \mathbb{V}(S_{n}/n - \overline{\mu } \geq \varepsilon ) \leq e^{- \frac{n\varepsilon ^{2}}{2(b_{n}^{2}+c \varepsilon )}} \leq e^{- \frac{n\varepsilon ^{2}}{2(M+c \varepsilon )}}. $$
(4.4)

Thus, we obtain

$$ \sum_{n=1}^{\infty } \mathbb{V}(S_{n}/n - \overline{\mu } \geq \varepsilon ) < \infty . $$

It follows from the Borel–Cantelli lemma [7, Lemma 2.2] that

$$ \mathbb{V} \Biggl( \bigcap_{n=1}^{\infty } \bigcup_{k=n}^{\infty } \{S_{k}/k \geq \overline{\mu } + \varepsilon \} \Biggr)=0. $$

Since the above holds for every \(\varepsilon > 0\) and \(\{\limsup_{n \rightarrow \infty } S_{n}/n \geq \overline{\mu } + \varepsilon \} \subseteq \bigcap_{n=1}^{\infty } \bigcup_{k=n}^{\infty } \{S_{k}/k \geq \overline{\mu } + \varepsilon /2 \}\), it follows that

$$ \mathbb{V} \Bigl( \limsup_{n \rightarrow \infty } S_{n}/n \geq \overline{\mu } + \varepsilon \Bigr) =0. $$

More precisely, by (4.4),

$$ \frac{1}{n} \log \mathbb{V}( S_{n}/n \geq \overline{\mu } + \varepsilon ) \leq - \frac{{\varepsilon }^{2}}{2 (M + c \varepsilon ) } $$

and, since \(\sum_{n=1}^{\infty } e^{-na} = \frac{1}{e^{a}-1} \leq \frac{1}{a}\) for every \(a>0\),

$$ \limsup_{\varepsilon \rightarrow 0} \varepsilon ^{2} \sum _{n=1}^{ \infty } \mathbb{V}( S_{n}/n \geq \overline{\mu } + \varepsilon ) \leq 2M. $$

Similarly, by applying the other side of the Bernstein-type inequality, we can get

$$\begin{aligned}& \mathbb{V} \Bigl( \liminf_{n \rightarrow \infty } S_{n}/n \leq \underline{\mu } - \varepsilon \Bigr)=0, \\& \frac{1}{n} \log \mathbb{V}( S_{n}/n \leq \underline{\mu } - \varepsilon ) \leq - \frac{{\varepsilon }^{2}}{2 (\tilde{b}_{n}^{2} + c \varepsilon ) } \leq - \frac{{\varepsilon }^{2}}{2 (M + c \varepsilon ) }, \end{aligned}$$

and

$$ \limsup_{\varepsilon \rightarrow 0} \varepsilon ^{2} \sum _{n=1}^{ \infty } \mathbb{V}( S_{n}/n \leq \underline{\mu } - \varepsilon ) \leq 2M. $$

Together with the subadditivity of \(\mathbb{V}\), (4.2) and (4.3) follow. □

Remark 4.1

The exponential moment condition is stronger than necessary for result (4.1) alone. In fact, Chen et al. proved the strong law of large numbers (4.1) in upper expectation spaces in [7, Theorem 3.1] and [4, Theorem 3.1] under different conditions without the exponential moment condition. We strengthen the conditions here in order to apply the Bernstein-type inequality and obtain results (4.2) and (4.3). Formula (4.2) describes the convergence rate in (4.1), while, as in classical probability theory, (4.3) characterizes complete convergence and its precise asymptotics.
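The following sketch illustrates (4.1) and the rate in (4.2) for the Bernoulli example of Sect. 3, assuming (only for illustration) that the success probabilities \(q_{i}\) drift arbitrarily within \([1/3,2/3]\) and that the conditions of Corollary 4.1 are in force with \(M=c=1/3\); the running mean \(S_{n}/n\) settles inside \([\underline{\mu },\overline{\mu }]=[1/3,2/3]\), while the exponential bound from (4.4) decays.

```python
import numpy as np

# Illustration of (4.1)-(4.2) for Bernoulli variables whose success probabilities
# q_i drift arbitrarily inside [1/3, 2/3] (one admissible scenario, chosen only
# for illustration).  The running mean S_n/n settles inside [1/3, 2/3], and the
# right-hand side of (4.4), with M = c = 1/3, bounds V(S_n/n >= 2/3 + eps).
rng = np.random.default_rng(3)
N, eps, M, c = 100_000, 0.05, 1/3, 1/3

q = rng.uniform(1/3, 2/3, size=N)               # drifting means
X = (rng.uniform(size=N) < q).astype(float)
running_mean = np.cumsum(X) / np.arange(1, N + 1)

for n in (100, 1_000, 10_000, 100_000):
    rate = np.exp(-n * eps**2 / (2 * (M + c * eps)))
    print(f"n={n:6d}  S_n/n={running_mean[n - 1]:.4f}  bound from (4.4): {rate:.2e}")
```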

Corollary 4.2

(Marcinkiewicz–Zygmund-type law of large numbers)

Let \(\{X_{i}\}_{i=1}^{\infty }\) be the sequence in Corollary 4.1. For any \(1< r< 2\), we have

$$\begin{aligned}& \mathbb{V} \Bigl( \Bigl\{ \liminf_{n \rightarrow \infty } \tilde{S}_{n}/{n^{1/r}} < 0 \Bigr\} \cup \Bigl\{ \limsup _{n \rightarrow \infty } \bar{S}_{n}/{n^{1/r}} > 0 \Bigr\} \Bigr) = 0, \end{aligned}$$
(4.5)
$$\begin{aligned}& \limsup_{n \rightarrow \infty }\frac{1}{n^{\frac{2}{r}-1}}\log \mathbb{V} \bigl( \bigl\{ \bar{S}_{n}/n^{1/r} \geq \varepsilon \bigr\} \cup \bigl\{ \tilde{S}_{n}/n^{1/r} \leq - \varepsilon \bigr\} \bigr) \leq {- \frac{\varepsilon ^{2}}{4M}}, \end{aligned}$$
(4.6)

where \(\bar{S}_{n} = \sum_{i=1}^{n} (X_{i}-\mathbb{E}[X_{i}])\) and \(\tilde{S}_{n} = \sum_{i=1}^{n} (X_{i}-\mathcal{E}[X_{i}])\).

Proof

For any given \(1 < r< 2\) and \(\varepsilon > 0\), taking \(x = n^{1/r-1}\varepsilon \) in Theorem 3.1, we have

$$ \mathbb{V} \biggl(\frac{\bar{S}_{n}}{n^{1/r}} \geq \varepsilon \biggr) \leq e^{- \frac{n^{1/r}\varepsilon ^{2}}{2(b_{n}^{2} \cdot n^{1-1/r}+c\varepsilon )}} \leq e^{- \frac{n^{1/r}\varepsilon ^{2}}{2(M\cdot n^{1-1/r}+c\varepsilon )}}. $$

Thus, we get

$$ \sum_{n=1}^{\infty }\mathbb{V} \biggl( \frac{\bar{S}_{n}}{n^{1/r}} \geq \varepsilon \biggr) < \infty . $$

Arguing as in Corollary 4.1, we get

$$ \mathbb{V} \Bigl( \Bigl\{ \liminf_{n \rightarrow \infty } \tilde{S}_{n}/{n^{1/r}} < 0 \Bigr\} \cup \Bigl\{ \limsup_{n \rightarrow \infty } \bar{S}_{n}/{n^{1/r}} > 0 \Bigr\} \Bigr) = 0. $$

Moreover, for n large enough (so that \(c\varepsilon n^{1/r-1} \leq M\)), by the above estimate and the analogous one for \(\tilde{S}_{n}\),

$$\begin{aligned}& \frac{1}{n^{\frac{2}{r}-1}} \log \mathbb{V} \biggl( \frac{\bar{S}_{n}}{n^{1/r}} \geq \varepsilon \biggr) \leq - \frac{\varepsilon ^{2}}{2(M + c\varepsilon n^{1/r-1})} \leq - \frac{\varepsilon ^{2}}{4M}, \\& \frac{1}{n^{\frac{2}{r}-1}} \log \mathbb{V} \biggl( \frac{\tilde{S}_{n}}{n^{1/r}} \leq - \varepsilon \biggr) \leq - \frac{\varepsilon ^{2}}{2(M + c\varepsilon n^{1/r-1})} \leq - \frac{\varepsilon ^{2}}{4M}. \end{aligned}$$

Together with the subadditivity of \(\mathbb{V}\), result (4.6) follows. □

Remark 4.2

A result similar to (4.5) was obtained by Lan [16, Theorem 4.4] under a weaker moment condition. Result (4.6), which gives the convergence rate, is our main contribution here.

Corollary 4.3

Suppose that \(\{X_{i}\}_{i=1}^{\infty }\) satisfies the conditions in Corollary 4.1. Then, for any \(\varepsilon > 0\), we have

$$ \mathbb{V} \Biggl(\frac{1}{\sqrt{n}}\sum_{i=1}^{n} \bigl(X_{i}-\mathbb{E}[X_{i}] \bigr) \geq \varepsilon \Biggr) \leq e^{- \frac{\varepsilon ^{2}}{2({b}_{n}^{2} + c \cdot \frac{\varepsilon }{\sqrt{n}})}} \leq e^{- \frac{\varepsilon ^{2}}{2(M + c \cdot \frac{\varepsilon }{\sqrt{n}})}} $$

and

$$ \mathbb{V} \Biggl(\frac{1}{\sqrt{n}}\sum_{i=1}^{n} \bigl(X_{i}-\mathcal{E}[X_{i}] \bigr) \leq - \varepsilon \Biggr) \leq e^{- \frac{\varepsilon ^{2}}{2(\tilde{b}_{n}^{2} + c \cdot \frac{\varepsilon }{\sqrt{n}})}} \leq e^{- \frac{\varepsilon ^{2}}{2(M + c \cdot \frac{\varepsilon }{\sqrt{n}})}}. $$

Moreover,

$$ \limsup_{n \rightarrow \infty }\mathbb{V} \Biggl(\frac{1}{\sqrt{n}} \sum _{i=1}^{n} \bigl(X_{i}- \mathbb{E}[X_{i}] \bigr) \geq \varepsilon \Biggr) \leq e^{-\frac{\varepsilon ^{2}}{2M}} $$

and

$$ \limsup_{n \rightarrow \infty }\mathbb{V} \Biggl(\frac{1}{\sqrt{n}} \sum _{i=1}^{n} \bigl(X_{i}- \mathcal{E}[X_{i}] \bigr) \leq - \varepsilon \Biggr) \leq e^{-\frac{\varepsilon ^{2}}{2M}}. $$

Proof

It is a straightforward consequence of Theorem 3.1 and Remark 3.3, obtained by taking \(x=\frac{\varepsilon }{\sqrt{n}}\); we omit the details. □