1 Introduction

The study of backward stochastic differential equations (BSDEs) has important applications in stochastic optimal control, stochastic differential games, probabilistic formulas for the solutions of quasilinear partial differential equations, and financial markets. The adapted solution of a linear BSDE arising as the adjoint process of a stochastic control problem was first studied by Bismut [1] in 1973 and later by Bensoussan [2], while Pardoux and Peng [3] first established the existence and uniqueness of an adapted solution for a general nonlinear continuous BSDE, a final value problem for a stochastic differential equation of Itô type under uniform Lipschitz conditions, of the following form:

$$\begin{aligned} {\left\{ \begin{array}{ll} \textrm{d}Y(t)=h(t,Y(t),Z(t))\textrm{d}t + Z(t)\textrm{d}W(t), t \in [0,T],\\ Y(T)=\xi . \end{array}\right. } \end{aligned}$$

They proved the existence and uniqueness of an adapted solution by means of Bihari’s inequality, which is one of the most important generalizations of the Gronwall-Bellman inequality. Since then, the theory of BSDEs has become a powerful tool in many fields, such as financial mathematics, optimal control, and semilinear and quasilinear partial differential equations. Many subsequent works have been devoted to the study of BSDEs and their applications [3,4,5,6,7,8,9,10,11,12,13,14] under the assumption that the coefficients satisfy Lipschitz conditions. Moreover, Mao [15] obtained a more general result than that of Pardoux and Peng [3], proving existence and uniqueness under milder assumptions; Bihari’s inequality was the key tool in his proof.

A few years later, Lin [16] considered the following backward stochastic nonlinear Volterra integral equation:

$$\begin{aligned} X(t)+\int _{t}^{T}f(t,s,X(s),Z(t,s))\textrm{d}s+\int _{t}^{T}\left[ g(t,s,X(s))+Z(t,s)\right] \textrm{d}W(s)=X. \end{aligned}$$

His goal in [16] was to find a pair \(\left\{ X(s),Z(t,s)\right\} \) that is \(\left\{ \mathscr {F}_{t \vee s}\right\} \)-adapted, where \(Z(t,s)\) depends on t. This is the point of intersection between our result on linear singular BSVIEs and [16]. The author also defines \(Z(t,s)=\tilde{Z}(t,s)-g(t,s)\), \((t,s)\in \mathcal {D}=\left\{ (t,s)\in \mathbb {R}^{2}_{+}; 0\le t\le s \le T\right\} \), just as we do for linear singular BSVIEs; see Sect. 3. Another important point of that paper is the use of the well-known extended martingale representation theorem to obtain an adapted solution \(\left\{ x(t), y(t,s) \right\} \), \((t,s)\in \mathcal {D}\); we extend the martingale representation in the same way for linear singular BSVIEs.

BSVIEs find their origin in the realm of optimal control problems linked with forward SVIEs. The initial examination of these equations was undertaken by Yong [17]. Building upon Yong’s pioneering work, subsequent advancements were made: Yong [18] introduced singular Lipschitz functions and proposed an innovative solution concept termed the adapted M-solution. Yong’s contributions sparked heightened interest in BSVIEs, prompting diverse extensions and variations by numerous researchers. These expansions encompass a broad spectrum of BSVIE forms, including backward doubly SVIEs (Shi et al. [19]), reflected BSVIEs (Agram-Djehiche [20]), BSVIEs featuring diagonal-solution generators (Hernandez-Possamai [21], Hernandez [22], Wang-Yong [23]), backward stochastic Volterra integro-differential equations (Wang [24]), mean-field BSVIEs (Shi et al. [25]), path-dependent BSVIEs (Overbeck and Röder [26]), infinite horizon BSVIEs (Hamaguchi [27]), BSVIEs within general filtration frameworks accommodating jumps (Popier [28]), and fractional-order BSDEs [29].

Now let us briefly introduce some notation used throughout the article. First, we recall some spaces. Let \((\mathcal {H}, \Vert \cdot \Vert )\) and \((U, \Vert \cdot \Vert )\) be real separable Hilbert spaces with inner product \(\langle \cdot , \cdot \rangle \). Let \(\mathcal {L}(U,\mathcal {H})\) be the space of bounded linear operators from U to \(\mathcal {H}\), and let \((\Omega ,\mathscr {F}, \mathbb {F}, \mathbb {P})\) be a complete probability space with a filtration \(\mathbb {F} :=\left\{ \mathscr {F}_{t}\right\} _{t \ge 0}\) satisfying the usual conditions. \((w(t))_{t\ge 0} \) is a \(\mathcal {Q}\)-Wiener process on \((\Omega ,\mathscr {F}, \mathbb {F}, \mathbb {P})\) with a bounded linear covariance operator \(\mathcal {Q} \in \mathcal {L}(U)\) such that \(\text {tr}\,\mathcal {Q}<\infty \). Furthermore, suppose that there exist a complete orthonormal system \(\left\{ e_{k}\right\} _{k\ge 1} \) in U, a bounded sequence of nonnegative real numbers \(\left\{ \lambda _{k}\right\} \) such that \(\mathcal {Q}e_{k}=\lambda _{k}e_{k}\), \(k=1,2,\ldots \), and a sequence \(\left\{ \beta _{k}\right\} _{k\ge 1}\) of independent Brownian motions such that

$$\begin{aligned} \langle w(t), e\rangle _{U}= \sum _{k=1}^{\infty }\sqrt{\lambda _{k}}\langle e_{k},e \rangle _{U}\beta _{k}(t), \quad e \in U, \quad t \ge 0. \end{aligned}$$

In addition, let \(\mathcal {L}^{0}_{2}=\mathcal {L}_{2}(\mathcal {Q}^{1/2}U,\mathcal {H})\) be the space of Hilbert-Schmidt operators from \(\mathcal {Q}^{1/2}U\) to \(\mathcal {H}\), equipped with the norm \(\Vert \varphi \Vert ^{2}_{\mathcal {L}^{0}_{2}}= \text {tr}[\varphi \mathcal {Q} \varphi ^{*}] <\infty \), \(\varphi \in \mathcal {L}(U,\mathcal {H})\). We also let \(\mathcal {L}_{2}^{\mathscr {F}_{t}}(\Omega ,\mathcal {H})\) denote the Hilbert space of \(\mathcal {H}\)-valued, \(\mathscr {F}_{t}\)-measurable and square-integrable random variables \(\xi \), i.e. \({\textbf {E}}\Vert \xi \Vert ^{2}_{\mathcal {H}} <\infty \).
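To make the series representation above concrete, the following minimal Python sketch (our own illustration, not part of the paper’s analysis) samples a truncated \(\mathcal {Q}\)-Wiener process at time T through its coordinates \(\langle w(T), e_{k}\rangle _{U}=\sqrt{\lambda _{k}}\,\beta _{k}(T)\); the eigenvalue choice \(\lambda _{k}=k^{-2}\) is an assumption made only for this illustration, with \(\text {tr}\,\mathcal {Q}=\sum _{k}\lambda _{k}<\infty \) as required.

```python
import numpy as np

# Minimal sketch (our illustration): sample a truncated Q-Wiener process
# at time T via its coordinates <w(T), e_k> = sqrt(lambda_k) * beta_k(T),
# where beta_k(T) ~ N(0, T) are independent. The eigenvalues
# lambda_k = k**(-2) are an assumed choice with tr(Q) < infinity.

rng = np.random.default_rng(0)
T, n_modes, n_samples = 1.0, 200, 20_000
lam = 1.0 / np.arange(1, n_modes + 1) ** 2          # lambda_k

# Each row holds the basis coordinates of one sample of w(T).
coords = np.sqrt(lam) * rng.normal(0.0, np.sqrt(T), size=(n_samples, n_modes))

# Sanity check: E||w(T)||^2 = tr(Q) * T for the truncated expansion.
print("MC estimate of E||w(T)||^2:", (coords**2).sum(axis=1).mean())
print("tr(Q) * T                :", lam.sum() * T)
```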

We study the following singular backward stochastic nonlinear Volterra integral equation (singular BSVIE, for short) of order \(\alpha \in (\frac{1}{2},1)\) on [0, T], a topic unaddressed in the previous literature:

$$\begin{aligned} x(t)=&\xi +\int _{t}^{T}(s-t)^{\alpha -1}f(t,s,x(s),y(t,s))\textrm{d}s\nonumber \\&+\int _{t}^{T}(s-t)^{\alpha -1}\left[ g(t,s,x(s))+y(t,s) \right] \textrm{d}w(s), \qquad \text {P-a.s}. \end{aligned}$$
(1.1)

where \(f: \mathcal {D}\times \mathcal {H} \times \mathcal {L}^{0}_{2}\rightarrow \mathcal {H}\) and \(g: \mathcal {D}\times \mathcal {H} \rightarrow \mathcal {L}^{0}_{2}\) are measurable mappings and the terminal value \(\xi \in \mathcal {L}_{2}^{\mathscr {F}_{T}}(\Omega ,\mathcal {H})\) is an \(\mathscr {F}_{T}\)-measurable square-integrable random variable with values in \(\mathcal {H}\), i.e. \({\textbf {E}}\Vert \xi \Vert ^{2}< \infty \). We write \(x(t)=x(t,\omega )\) for a stochastic process on [0, T], \(\omega \in \Omega \).

The aim of the current paper is to introduce and establish the theory of singular backward stochastic nonlinear Volterra integral equations in an infinite-dimensional framework. This concept appears to offer a fresh perspective within the literature. Our goal is to find a pair of stochastic processes \(\left\{ x(t),y(t,s); (t,s)\in \mathcal {D}\right\} \), required to be \(\mathscr {F}_{t\vee s}\)-adapted, that satisfies (1.1) in the usual Itô sense. Such a pair is called an adapted solution of equation (1.1). In contrast to the prevailing discourse on BSVIEs, our study emphasizes two fundamental elements: singularity and the infinite-dimensional setting. We first derive representations of the adapted solution and then study existence and uniqueness under a condition weaker than the Lipschitz condition; the solution is constructed as the limit of an approximating sequence built by Picard-type iteration. Unlike the research discussed above, we apply a Carathéodory-type condition to prove the existence and uniqueness of the adapted solution of Eq. (1.1).

The motivation behind investigating singular BSVIEs stems from two primary reasons. Firstly, Volterra integral equations have a historical association with singularities, and our study continues this research strand in the backward stochastic setting. Secondly, investigations have revealed a close relationship between (time-)fractional differential equations and singular Volterra integral equations (e.g., [29]).

Hence the plan of this work is as follows. Section 2 introduces some important inequalities. In Sect. 3, we establish a fundamental lemma that plays a key role in this paper. Section 4 is devoted to the construction of a Picard-type approximation of the adapted solution, showing existence and uniqueness under non-Lipschitz conditions via the Bihari inequality, and Sect. 5 is devoted to the conclusion.

2 Preliminaries

We begin this section with some fundamental inequalities of stochastic calculus. First, we state Doob’s maximal inequality for martingales, which plays an important role in the theory of stochastic processes and appears in almost every introductory text in the field. Jensen’s inequality in its probabilistic (conditional) form is also used frequently below.

Theorem 2.1

(Doob’s martingale maximal inequality, [30]) Let \(\left\{ \mathscr {F}_t\right\} _{t \ge 0}\) be a filtration on a probability space \((\Omega , \mathscr {F},\mathbb {P})\) and let \((M_t)_{t \ge 0}\) be a continuous martingale with respect to \(\left\{ \mathscr {F}_t\right\} _{t \ge 0}\). Let \(p > 1\) and \(T>0\). If \({\textbf {E}}(\Vert M_T\Vert ^p)<+\infty \), then we have

$$\begin{aligned} {\textbf {E}}\left( \left( \sup _{0 \le t \le T} \Vert M_t \Vert \right) ^p \right) \le \left( \frac{p}{p-1} \right) ^p {\textbf {E}} (\Vert M_T\Vert ^p). \end{aligned}$$
(2.1)
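As a quick numerical sanity check (ours, not from [30]), the following Monte Carlo sketch tests (2.1) with \(p=2\) for the martingale \(M_t=W_t\), a standard scalar Brownian motion, for which the right-hand side equals \(4\,{\textbf {E}}|W_T|^{2}=4T\); all parameter choices are illustrative.

```python
import numpy as np

# Hedged Monte Carlo check of (2.1) with p = 2 for M_t = W_t, a standard
# scalar Brownian motion: E[(sup_{0<=t<=T} |W_t|)^2] <= 4 * E|W_T|^2 = 4T.
# All parameter choices below are illustrative.

rng = np.random.default_rng(1)
T, n_steps, n_paths = 1.0, 1_000, 5_000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1)                  # discretized Brownian paths

lhs = (np.abs(W).max(axis=1) ** 2).mean()  # E[(sup_t |W_t|)^2]
rhs = 4.0 * (W[:, -1] ** 2).mean()         # (p/(p-1))^p * E|W_T|^p with p = 2
print(f"lhs = {lhs:.3f} <= rhs = {rhs:.3f}")
```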

Within the manuscript we use the extended martingale representation. The extended martingale representation, as described by Jacod and Shiryaev [31], concerns the representation of an arbitrary semimartingale on an abstract filtered probability space. In stochastic calculus and probability theory, semimartingales form a class of stochastic processes that generalizes both martingales and processes with independent increments. The extended martingale representation theorem expresses such semimartingales in terms of a local martingale and a predictable process.

The representation states that any semimartingale X on a filtered probability space can be decomposed into the sum of two components:

  1. A local martingale M: a special type of martingale that is a stochastic process adapted to a filtration and, at every stopping time, is a martingale when restricted to the interval up to that stopping time.

  2. A predictable process A: a process that can be predicted using information available at earlier times.

The representation typically takes the form:

$$\begin{aligned} X_t= X_0+M_t+A_t, \end{aligned}$$

where

  • \(X_t\) represents the value of the semimartingale at time t.

  • \(X_0\) is the initial value of the semimartingale.

  • \(M_t\) is a local martingale.

  • \(A_t\) is a predictable process.

This representation is a fundamental result in stochastic calculus as it allows decomposing a wide class of stochastic processes into two components, facilitating analysis and providing insights into the underlying structure of these processes. Jacod and Shiryaev’s work [31] delves into the theory of stochastic processes, including martingale theory and semimartingales, providing rigorous proofs and foundational results, including the extended martingale representation theorem for semimartingales. Their book is considered a standard reference in the field of stochastic analysis and probability theory.

The following Theorem 2.2 states the Bihari-type inequality, which will be a key tool in the proof of Theorem 4.1.

Theorem 2.2

(Bihari type inequality, [15]) Let \(T>0\) and \(u_{0}\ge 0\). Let u(t) and h(t) be nonnegative continuous functions on [0, T], and let \(\omega \) be a continuous, nondecreasing function with \(\omega (0)=0\) and \(\omega (r)>0\) for \(r> 0\). If

$$\begin{aligned} u(t)\le u_{0}+\int _{0}^{t}h(s)\omega (u(s))\textrm{d}s,\quad \text {for all}\quad t \in [0,T], \end{aligned}$$

then

$$\begin{aligned} u(t) \le B^{-1}\left( B(u_{0})+\int _{0}^{t}h(s)\textrm{d}s \right) ,\quad t\in [0,T], \end{aligned}$$

provided that

$$\begin{aligned} B(u_{0})+\int _{0}^{t}h(s)\textrm{d}s \in \text {Dom}(B^{-1}), \end{aligned}$$

where

$$\begin{aligned} B(r)=\int _{1}^{r}\frac{\textrm{d}s}{\omega (s)},\quad r> 0, \end{aligned}$$

and \(B^{-1}\) is the inverse of B. In particular, if, moreover \(u_{0}=0\) and

$$\begin{aligned} \int \limits _{0+}\frac{\textrm{d}s}{\omega (s)}=\infty , \end{aligned}$$

then \(u(t)=0\) for all \(t \in [0,T]\).
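To see how Theorem 2.2 generalizes the Gronwall-Bellman inequality, consider the special case \(\omega (r)=r\) (a standard illustration of ours, not spelled out in [15]): then \(B(r)=\log r\) and \(B^{-1}(v)=e^{v}\), so for \(u_{0}>0\) the conclusion reduces to the classical Gronwall-Bellman estimate

$$\begin{aligned} u(t) \le \exp \left( \log u_{0}+\int _{0}^{t}h(s)\textrm{d}s \right) =u_{0}\exp \left( \int _{0}^{t}h(s)\textrm{d}s \right) ,\quad t\in [0,T]. \end{aligned}$$

Moreover, \(\int _{0+}\textrm{d}s/s=\infty \), so the final assertion recovers the familiar fact that \(u_{0}=0\) forces \(u\equiv 0\) on [0, T].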

To conclude the preliminary section, we introduce the following definition, which will be used throughout the paper.

Definition 2.1

For any \(t \in [0,T]\), we define M[t, T] to be the Banach space

$$\begin{aligned} M[t,T]:=\mathcal {L}_{2}^{\mathscr {F}}(\Omega ,C([t,T],\mathcal {H}))\times \mathcal {L}_{2}^{\mathscr {F}}(\mathcal {D},\mathcal {L}_{2}^{0}) \end{aligned}$$

endowed with the norm

$$\begin{aligned} \Vert (x,y)\Vert ^{2}_{t}={\textbf {E}}\sup _{t\le s\le T}\Vert x(s)\Vert ^{2}+{\textbf {E}}\int _{t}^{T}\int _{s}^{T}\Vert y(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s< \infty . \end{aligned}$$

3 Fundamental lemma

In this section, we establish a fundamental lemma, which we use to prove the existence and uniqueness result via Picard-type iteration in Sect. 4.

Definition 3.1

A pair of adapted processes \((x,y)\in M[t,T]\) is called a mild solution of (1.1) if it satisfies the backward stochastic nonlinear Volterra integral equation (1.1) for all \(t \in [0,T]\).

We now introduce a fundamental lemma, which plays an essential role throughout this paper. To this end, we consider the following linear singular backward stochastic Volterra integral equation.

Lemma 3.1

For any given terminal value \(\xi \in \mathcal {L}_{2}^{\mathscr {F}_{T}}(\Omega ,\mathcal {H})\) and free terms f and g, the linear singular BSVIE

$$\begin{aligned} x(t)=\xi&+\int _{t}^{T}(s-t)^{\alpha -1}f(t,s)\textrm{d}s\nonumber \\&+\int _{t}^{T}(s-t)^{\alpha -1}\left[ g(t,s)+y(t,s) \right] \textrm{d}w(s), \quad \text {P-a.s.} \end{aligned}$$
(3.1)

has a unique solution (x, y) in M[0, T]; moreover, we have

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x(s)\Vert ^{2}+{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\nonumber \\&\quad \le 8{\textbf {E}}\Vert \xi \Vert ^{2}+16T{\textbf {E}}\Vert \xi \Vert ^{2} + \frac{16(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r \nonumber \\&\qquad +2{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert g(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s. \end{aligned}$$
(3.2)

Proof

Uniqueness: Let \((x_{1},y_{1})\) and \((x_{2},y_{2})\) be two solutions of (3.1). Then their difference satisfies

$$\begin{aligned} x_{1}(t)-x_{2}(t)=\int _{t}^{T}(s-t)^{\alpha -1}\left[ y_{1}(t,s)-y_{2}(t,s) \right] \textrm{d}w(s). \end{aligned}$$

Taking the conditional expectation \({\textbf {E}}\left\{ \cdot \mid \mathscr {F}_{t} \right\} \) of both sides, we deduce that

$$\begin{aligned} {\textbf {E}}\left\{ x_{1}(t)-x_{2}(t)\mid \mathscr {F}_{t} \right\} =0, \quad \forall t\in [0,T]. \end{aligned}$$

Since \(x_{1}(t)-x_{2}(t)\) is \(\mathscr {F}_{t}\)-measurable, it follows that \(x_{1}(t)=x_{2}(t)\), and consequently \(y_{1}(t,s)=y_{2}(t,s)\).

Existence: Taking the conditional expectation in (3.1), we have

$$\begin{aligned} x(t)={\textbf {E}}\left\{ \xi \mid \mathscr {F}_{t}\right\} +\int _{t}^{T}(s-t)^{\alpha -1}{\textbf {E}}\left\{ f(t,s) \mid \mathscr {F}_{t}\right\} \textrm{d}s. \end{aligned}$$
(3.3)

By the extended martingale representation theorem ([31]), there exist unique \(L(\cdot )\in \mathcal {L}_{2}^{\mathscr {F}}([0,T],\mathcal {L}^{0}_{2})\) and \(K(s,\cdot )\in \mathcal {L}_{2}^{\mathscr {F}}(\mathcal {D};\mathcal {L}^{0}_{2})\) which satisfy the following relations:

$$\begin{aligned} {\textbf {E}}\left\{ \xi \mid \mathscr {F}_{t}\right\} ={\textbf {E}}\xi +\int _{0}^{t}L(u)\textrm{d}w(u), \end{aligned}$$
(3.4)
$$\begin{aligned} {\textbf {E}}\left\{ f(t,s) \mid \mathscr {F}_{t}\right\} ={\textbf {E}}f(t,s)+\int _{0}^{t}K(s,u)\textrm{d}w(u). \end{aligned}$$
(3.5)

Note also that from (3.5) we can easily deduce that, for all \(s \in [0,T]\),

$$\begin{aligned} K(s,u)=0, \quad \text {a.e.}, \quad u\in [s,T], \quad \text {a.s.} \end{aligned}$$

and that

$$\begin{aligned} {\textbf {E}}\int _{0}^{T}\int _{0}^{s}\Vert K(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\le 4{\textbf {E}}\int _{0}^{T}\Vert f(t,s)\Vert ^{2}\textrm{d}s. \end{aligned}$$
(3.6)

Since \(t\in [0,T]\), it is obvious that

$$\begin{aligned} \xi&={\textbf {E}}\xi +\int _{0}^{T}L(u)\textrm{d}w(u)\\&={\textbf {E}}\xi +\int _{0}^{t}L(u)\textrm{d}w(u)+\int _{t}^{T}L(u)\textrm{d}w(u)\\&={\textbf {E}}\left\{ \xi \mid \mathscr {F}_{t}\right\} +\int _{t}^{T}L(u)\textrm{d}w(u), \end{aligned}$$

and since \(s\ge t\), we have

$$\begin{aligned} f(t,s)&={\textbf {E}}f(t,s)+\int _{0}^{s}K(s,u)\textrm{d}w(u)\\&={\textbf {E}}f(t,s)+\int _{0}^{t}K(s,u)\textrm{d}w(u)+\int _{t}^{s}K(s,u)\textrm{d}w(u)\\&={\textbf {E}}\left\{ f(t,s) \mid \mathscr {F}_{t}\right\} +\int _{t}^{s}K(s,u)\textrm{d}w(u). \end{aligned}$$

Therefore, we obtain

$$\begin{aligned} {\textbf {E}}\left\{ \xi \mid \mathscr {F}_{t}\right\} =\xi -\int _{t}^{T}L(u)\textrm{d}w(u), \end{aligned}$$
(3.7)

and

$$\begin{aligned} {\textbf {E}}\left\{ f(t,s) \mid \mathscr {F}_{t}\right\} = f(t,s)-\int _{t}^{s}K(s,u)\textrm{d}w(u). \end{aligned}$$
(3.8)

Substituting (3.7) and (3.8) into (3.3) and using the stochastic Fubini theorem, we have

$$\begin{aligned} x(t)&=\left( \xi -\int _{t}^{T}L(u)\textrm{d}w(u) \right) +\int _{t}^{T}(s-t)^{\alpha -1} \left( f(t,s)-\int _{t}^{s}K(s,u)\textrm{d}w(u) \right) \textrm{d}s \\&=\xi +\int _{t}^{T}(s-t)^{\alpha -1} f(t,s)\textrm{d}s -\int _{t}^{T}L(u)\textrm{d}w(u)\\&\quad -\int _{t}^{T}(s-t)^{\alpha -1}\int _{t}^{s}K(s,u)\textrm{d}w(u)\textrm{d}s\\&=\xi +\int _{t}^{T}(s-t)^{\alpha -1} f(t,s)\textrm{d}s -\int _{t}^{T}L(u)\textrm{d}w(u)\\&\quad -\int _{t}^{T}\int _{u}^{T}(s-t)^{\alpha -1}K(s,u)\textrm{d}s\textrm{d}w(u). \end{aligned}$$

Thus, with \(\tilde{y}\) as defined in (3.10) below, we get

$$\begin{aligned} x(t)=\xi +\int _{t}^{T}(s-t)^{\alpha -1} f(t,s)\textrm{d}s+\int _{t}^{T}\tilde{y}(t,u)\textrm{d}w(u). \end{aligned}$$

Hence there exists a mild solution \((x,y)\in M[0,T]\) of (3.1), where

$$\begin{aligned} x(t)={\textbf {E}}\left\{ \xi \mid \mathscr {F}_{t}\right\} +\int _{t}^{T}(s-t)^{\alpha -1}{\textbf {E}}\left\{ f(t,s) \mid \mathscr {F}_{t}\right\} \textrm{d}s, \end{aligned}$$
(3.9)

and

$$\begin{aligned} \tilde{y}(t,u)=-L(u)-\int _{u}^{T}(s-t)^{\alpha -1}K(s,u)\textrm{d}s. \end{aligned}$$
(3.10)

We finally define \(y(t,u)=\tilde{y}(t,u)-g(t,u)\), \((t,u)\in \mathcal {D}=\left\{ (t,u)\in \mathbb {R}^{2}_{+};0\le t\le u\le T\right\} \). It is easily seen that the pair (x, y) solves (3.1). Therefore, the existence is proved.

From (3.7) and (3.8), we deduce the following inequalities for \(0\le t\le s\le T\):

$$\begin{aligned} {\textbf {E}}\int _{t}^{T}\Vert L(u)\Vert ^{2}\textrm{d}u\le 4{\textbf {E}}\Vert \xi \Vert ^{2}, \end{aligned}$$

and

$$\begin{aligned} {\textbf {E}}\int _{t}^{s}\Vert K(s,u)\Vert ^{2}\textrm{d}u\le 4{\textbf {E}}\Vert f(t,s)\Vert ^{2}. \end{aligned}$$

Now we estimate the solution (xy) given by (3.9) and (3.10) in [0, T]. From (3.9) it follows that

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x(s)\Vert ^{2}&\le 2{\textbf {E}}\sup _{t\le s\le T}\Vert {\textbf {E}}\left\{ \xi \mid \mathscr {F}_{s} \right\} \Vert ^{2}\\&\quad +2{\textbf {E}}\sup _{t\le s\le T}\Big (\int _{s}^{T} (r-s)^{\alpha -1}{\textbf {E}}\left\{ \Vert f(t,r)\Vert \mid \mathscr {F}_{s}\right\} \textrm{d}r\Big )^{2}=:\mathcal {I}_{1}+\mathcal {I}_{2}. \end{aligned}$$

From Doob’s inequality and the law of total expectation it follows that

$$\begin{aligned} \mathcal {I}_{1}\le 2{\textbf {E}}\sup _{t\le s\le T}\Vert {\textbf {E}}\left\{ \xi \mid \mathscr {F}_{s} \right\} \Vert ^{2}\le 8{\textbf {E}}\left( {\textbf {E}}\left\{ \Vert \xi \Vert ^{2} \mid \mathscr {F}_{T} \right\} \right) =8{\textbf {E}}\Vert \xi \Vert ^{2}. \end{aligned}$$

Doob’s inequality and Jensen’s inequality in the probabilistic setting imply that

$$\begin{aligned} \mathcal {I}_{2}&\le 2{\textbf {E}}\sup _{t\le s\le T}\left( {\textbf {E}}\Biggl \{ \int _{s}^{T}(r-s)^{\alpha -1}\Vert f(t,r)\Vert \textrm{d}r \mid \mathscr {F}_{s} \Biggr \} \right) ^{2}\\&\le 2{\textbf {E}}\sup _{t\le s\le T}\left( {\textbf {E}}\Biggl \{\sup _{t\le \tau \le T} \int _{\tau }^{T}(r-\tau )^{\alpha -1}\Vert f(t,r)\Vert \textrm{d}r \mid \mathscr {F}_{s} \Biggr \} \right) ^{2}\\&\le 8{\textbf {E}}\left( \sup _{t\le \tau \le T} \int _{\tau }^{T}(r-\tau )^{\alpha -1}\Vert f(t,r)\Vert \textrm{d}r \right) ^{2}\\&\le 8{\textbf {E}}\sup _{t\le \tau \le T} \int _{\tau }^{T}(r-\tau )^{2\alpha -2}\textrm{d}r \int _{\tau }^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r \\&\le 8\frac{(T-t)^{2\alpha -1}}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r\\&\le 8\frac{(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r. \end{aligned}$$

Eventually, we have

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x(s)\Vert ^{2}\le 8{\textbf {E}}\Vert \xi \Vert ^{2}+\frac{8(2T)^{2\alpha }}{2\alpha -1}{\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r. \end{aligned}$$
(3.11)

Next we estimate \(\tilde{y}\) using Hölder’s inequality. We obtain

$$\begin{aligned} \Vert \tilde{y}(s,u)\Vert ^{2}&\le 2\Vert L(u)\Vert ^{2}+2\left\Vert \int _{u}^{T}(r-s)^{\alpha -1}K(r,u)\textrm{d}r\right\Vert ^{2}\\&\le 2\Vert L(u)\Vert ^{2}+2\left( \frac{(T-s)^{2\alpha -1}}{2\alpha -1}-\frac{(u-s)^{2\alpha -1}}{2\alpha -1} \right) \int _{u}^{T}\Vert K(r,u)\Vert ^{2}\textrm{d}r\\&\le 2\Vert L(u)\Vert ^{2}+2\frac{(T+u)^{2\alpha -1}}{2\alpha -1} \int _{u}^{T}\Vert K(r,u)\Vert ^{2}\textrm{d}r\\&\le 2\Vert L(u)\Vert ^{2}+2\frac{(2T)^{2\alpha -1}}{2\alpha -1} \int _{u}^{T}\Vert K(r,u)\Vert ^{2}\textrm{d}r. \end{aligned}$$

Taking the double integral of the above inequality and applying Fubini’s theorem twice yields

$$\begin{aligned}&{\textbf {E}}\sup _{t \le \tau \le T} \int _{\tau }^{T}\int _{s}^{T}\Vert \tilde{y}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s \le 2{\textbf {E}}\sup _{t \le \tau \le T}\int _{\tau }^{T}\int _{s}^{T}\Vert L(u)\Vert ^{2}\textrm{d}u\textrm{d}s\nonumber \\&\quad +2\frac{(2T)^{2\alpha -1}}{2\alpha -1} {\textbf {E}}\sup _{t \le \tau \le T}\int _{\tau }^{T}\int _{s}^{T}\int _{u}^{T}\Vert K(r,u)\Vert ^{2}\textrm{d}r\textrm{d}u\textrm{d}s\nonumber \\&\le 8(T-t){\textbf {E}}\Vert \xi \Vert ^{2}+2\frac{(2T)^{2\alpha -1}}{2\alpha -1} {\textbf {E}}\sup _{t \le \tau \le T}\int _{\tau }^{T}\int _{s}^{T}\int _{s}^{r}\Vert K(r,u)\Vert ^{2}\textrm{d}u\textrm{d}r\textrm{d}s\nonumber \\&\le 8T{\textbf {E}}\Vert \xi \Vert ^{2}+8\frac{(2T)^{2\alpha -1}}{2\alpha -1}{\textbf {E}}\sup _{t \le \tau \le T}\int _{\tau }^{T}\int _{s}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r \textrm{d}s\nonumber \\&\le 8T{\textbf {E}}\Vert \xi \Vert ^{2}+8\frac{(2T)^{2\alpha -1}}{2\alpha -1}{\textbf {E}}\sup _{t \le \tau \le T}\int _{\tau }^{T}\int _{\tau }^{r}\Vert f(t,r)\Vert ^{2}\textrm{d}s \textrm{d}r\nonumber \\&\le 8T{\textbf {E}}\Vert \xi \Vert ^{2}+8\frac{(2T)^{2\alpha -1}(T-t)}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r\\&\le 8T{\textbf {E}}\Vert \xi \Vert ^{2}+8\frac{(2T)^{2\alpha -1}T}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r\\&\le 8T{\textbf {E}}\Vert \xi \Vert ^{2}+\frac{4(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r. \end{aligned}$$

Thus, we get

$$\begin{aligned} {\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert \tilde{y}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s \le 8T{\textbf {E}}\Vert \xi \Vert ^{2}+\frac{4(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r. \end{aligned}$$
(3.12)

Since we know \(y(t,u)=\tilde{y}(t,u)-g(t,u)\), we also have

$$\begin{aligned} \int _{t}^{T}\int _{s}^{T}\Vert y(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\le 2\int _{t}^{T}\int _{s}^{T}\Vert \tilde{y}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s+2\int _{t}^{T}\!\int _{s}^{T}\!\Vert g(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s.\nonumber \\ \end{aligned}$$
(3.13)

Taking into account (3.13) and summing (3.11) and (3.12) yields

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x(s)\Vert ^{2}+{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\\&\quad \le 8{\textbf {E}}\Vert \xi \Vert ^{2}+16T{\textbf {E}}\Vert \xi \Vert ^{2} +\frac{16(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\Vert f(t,r)\Vert ^{2}\textrm{d}r \\&\qquad +2{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert g(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s. \end{aligned}$$

Therefore, the proof is complete. \(\square \)

4 Picard approximation

In this section, we study the existence and uniqueness of the solution to the singular BSVIE (1.1) in a more general form, that is, when the function \(f(t,s,x,y)\) is a non-Lipschitzian function. The solution can be constructed as the limit of an approximating sequence via Picard-type iteration. Let \(\left\{ x_{j}, y_{j}\right\} \) be a sequence in M[0, T] defined recursively by

$$\begin{aligned} {\left\{ \begin{array}{ll} (x_{0}(t),y_{0}(t,s))=(x(T),0)=(\xi ,0),\\ x_{j}(t)=\xi +\int _{t}^{T}(s-t)^{\alpha -1}f(t,s,x_{j-1}(s),y_{j}(t,s))\textrm{d}s\\ \hspace{1.2cm}+\int _{t}^{T}(s-t)^{\alpha -1}\left[ g(t,s,x_{j-1}(s))+y_{j}(t,s) \right] \textrm{d}w(s), \quad j\ge 1. \end{array}\right. } \end{aligned}$$
(4.1)
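To illustrate how the recursion (4.1) operates despite the weakly singular kernel, the following Python sketch runs a Picard iteration for the drift-only scalar simplification \(x(t)=\xi +\int _{t}^{T}(s-t)^{\alpha -1}f(s,x(s))\textrm{d}s\) (our own toy model: no noise, \(\mathcal {H}=\mathbb {R}\), and the illustrative generator \(f(s,x)=\sin x\), none of which appear in the paper). The kernel is integrated exactly on each grid cell, so the singularity at \(s=t\) never has to be evaluated.

```python
import numpy as np

# A drift-only toy version of the Picard scheme (4.1): iterate
#   x_j(t) = xi + int_t^T (s - t)^(alpha - 1) f(s, x_{j-1}(s)) ds
# with H = R, no noise, and the illustrative generator f(s, x) = sin(x)
# (our own assumptions). The weakly singular kernel is integrated exactly
# on each grid cell (product integration), so s = t is never evaluated.

alpha, T, xi, n = 0.75, 1.0, 1.0, 400
t = np.linspace(0.0, T, n + 1)

def picard_step(x):
    """One Picard iteration on the grid, left-point rule in f."""
    fx = np.sin(x)
    x_new = np.full(n + 1, xi)             # terminal value x(T) = xi
    for i in range(n):
        right = t[i + 1:] - t[i]           # t_{k+1} - t_i, k = i..n-1
        left = t[i:-1] - t[i]              # t_k     - t_i, k = i..n-1
        w = (right**alpha - left**alpha) / alpha   # exact kernel weights
        x_new[i] += w @ fx[i:-1]
    return x_new

x = np.full(n + 1, xi)                     # x_0(t) = xi, as in (4.1)
for j in range(1, 30):
    x_prev, x = x, picard_step(x)
    if np.max(np.abs(x - x_prev)) < 1e-10:
        break
print(f"converged after {j} iterations, x(0) = {x[0]:.6f}")
```

As for Volterra equations generally, the successive approximations converge on the whole interval because each iteration gains a factor of order \((T-t)^{\alpha }\).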

To state our main results, we impose the following assumptions on the functions f and g.

Assumption 4.1

\(f(\cdot ,\cdot ,0,0) \in \mathcal {L}_{2}(0,T,\mathcal {H};\mathcal {L}^{0}_{2})\) and \(g(\cdot ,\cdot ,0) \in \mathcal {L}_{2}(0,T;\mathcal {L}^{0}_{2})\).

Assumption 4.2

Let c be a positive constant and let \(\varpi :=1- 8\frac{(2T)^{2\alpha }}{2\alpha -1}c>0\).

Assumption 4.3

For all \(x,\bar{x} \in \mathcal {H}\), \(y,\bar{y}\in \mathcal {L}^{0}_{2}\) and \(0\le t\le T\),

$$\begin{aligned}&\Vert f(t,s,x,y)-f(t,s,\bar{x},\bar{y})\Vert ^{2} \le \rho (\Vert x-\bar{x}\Vert ^{2}) +c\Vert y-\bar{y}\Vert ^{2},\quad \text {a.s.},\\&\Vert g(t,s,x)-g(t,s,\bar{x})\Vert ^{2}\le \rho (\Vert x-\bar{x}\Vert ^{2}), \end{aligned}$$

where \(\rho (u)\) satisfies the following conditions (a concrete example is given after this list):

  • \(\rho (\cdot )\) is a concave nondecreasing function from \(\mathbb {R}_{+}\) to \(\mathbb {R}_{+}\) such that \(\rho (0)=0\), \(\rho (u)>0\) for \(u>0\) and

    $$\begin{aligned} \int _{0+}\frac{du}{\rho (u)}=\infty . \end{aligned}$$
  • there exist \(a\ge 0\) and \(b\ge 0\) such that

    $$\begin{aligned} \rho (u)\le a+bu, \end{aligned}$$

    for all \(u\ge 0\).
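A standard function satisfying all of the conditions above (our own example, in the spirit of [15]) is, for a fixed \(\delta \in (0,e^{-1})\),

$$\begin{aligned} \rho (u)= {\left\{ \begin{array}{ll} u\log (1/u), &{} 0\le u\le \delta ,\\ \delta \log (1/\delta )+\left( \log (1/\delta )-1\right) (u-\delta ), &{} u>\delta , \end{array}\right. } \end{aligned}$$

which is concave and nondecreasing, grows at most linearly, and satisfies \(\int _{0+}\textrm{d}u/\rho (u)=\infty \) since an antiderivative of \(1/(u\log (1/u))\) is \(-\log \log (1/u)\); yet \(\rho \) is not Lipschitz at \(u=0\).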

Now we introduce some important constants which are used throughout this work.

$$\begin{aligned} C_{1}=&16\frac{(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\int _{s}^{T}\left( 2\Vert f(s,u,0,0)\Vert ^{2}+2a\right) \textrm{d}u \textrm{d}s\nonumber \\&+2{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\left( 2a+2\Vert g(s,u,0)\Vert ^{2}\right) \textrm{d}u\textrm{d}s, \nonumber \\ C_{2}&=4bT\left( 1+ 36\frac{T^{2\alpha }}{2\alpha -1}\right) ,\nonumber \\ C_{3}&=\left( 16 \frac{(2T)^{2\alpha }}{2\alpha -1}+2(T-t)\right) ,\nonumber \\ C_{4}&=C_{3}\rho \left( 4C_{1}\exp (C_{2}T)\right) . \end{aligned}$$
(4.2)

Lemma 4.1

Under Assumptions 4.1 and 4.3, we have, for all \(t\in [0,T]\) and \(j\ge 1\),

$$\begin{aligned}{} & {} {\textbf {E}}\left( \sup _{t\le s\le T}\Vert x_{j}(s)\Vert ^{2}\right) \le C_{1}\exp (C_{2}(T-t)), \end{aligned}$$
(4.3)
$$\begin{aligned}{} & {} {\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\le \varpi ^{-1} C_{1}\left( 1+C_{2}(T-t)\exp (C_{2}(T-t)) \right) . \end{aligned}$$
(4.4)

Proof

It follows from Lemma 3.1 that

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x_{j}(s)\Vert ^{2}+{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\nonumber \\&\quad \le 8{\textbf {E}}\Vert \xi \Vert ^{2}+16T{\textbf {E}}\Vert \xi \Vert ^{2} +16\frac{(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\!\int _{r}^{T}\!\Vert f(r,u,x_{j-1}(u),y_{j}(r,u))\Vert ^{2}\textrm{d}u\textrm{d}r\nonumber \\&\qquad +2{\textbf {E}} \int _{t}^{T}\int _{r}^{T}\Vert g(r,u,x_{j-1}(u))\Vert ^{2}\textrm{d}u\textrm{d}r. \end{aligned}$$
(4.5)

Using Assumptions 4.1 and 4.3, we have

$$\begin{aligned}&\Vert f(t,s,x_{j-1}(s),y_{j}(s,u))\Vert ^{2}\\&\quad =\Vert f(t,s,x_{j-1}(s),y_{j}(s,u))-f(t,s,0,0)+f(t,s,0,0)\Vert ^{2}\\&\quad \le 2\Vert f(t,s,0,0)\Vert ^{2}+2a+2b\Vert x_{j-1}(s)\Vert ^{2}+2c\Vert y_{j}(s,u)\Vert ^{2},\\&\Vert g(t,s,x_{j-1}(s))\Vert ^{2}=\Vert g(t,s,x_{j-1}(s))-g(t,s,0)+g(t,s,0)\Vert ^{2}\\&\qquad \qquad \qquad \qquad \qquad \le 2a+2\Vert g(t,s,0)\Vert ^{2}+2b\Vert x_{j-1}(s)\Vert ^{2}. \end{aligned}$$

Substituting these into (4.5) yields that

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x_{j}(s)\Vert ^{2}+{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\nonumber \\&\quad \le 8{\textbf {E}}\Vert \xi \Vert ^{2}+16T{\textbf {E}}\Vert \xi \Vert ^{2}+16 \frac{(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\nonumber \\&\qquad \times \int _{t}^{T}\int _{s}^{T}\left( 2\Vert f(s,u,0,0)\Vert ^{2}+2a+2b\Vert x_{j-1}(u)\Vert ^{2}+2c\Vert y_{j}(s,u)\Vert ^{2} \right) \textrm{d}u\textrm{d}s\nonumber \\&\qquad +2{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\left( 2a+2\Vert g(s,u,0)\Vert ^{2}+2b\Vert x_{j-1}(u)\Vert ^{2} \right) \textrm{d}u\textrm{d}s. \end{aligned}$$

Thus, we get

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x_{j}(s)\Vert ^{2}+\left( 1-32\frac{(2T)^{2\alpha }}{2\alpha -1}c \right) {\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\nonumber \\&\quad \le C_{1}+C_{2}{} {\textbf {E}}\int _{t}^{T}\sup _{s \le r \le T}\left( \Vert x_{j-1}(r)\Vert ^{2}\right) \textrm{d}s, \end{aligned}$$
(4.6)

where \(C_{1}\) and \(C_{2}\) are defined in (4.2).

Then, we have

$$\begin{aligned} \sup \limits _{1\le j\le k}{} {\textbf {E}}\left( \sup _{t\le s\le T}\Vert x_{j}(s)\Vert ^{2}\right)&\le C_{1}+C_{2}\int _{t}^{T}\sup \limits _{1\le j\le k}{} {\textbf {E}}\sup _{t\le r\le T}\left( \Vert x_{j-1}(r)\Vert ^{2}\right) \textrm{d}r. \end{aligned}$$

Applying Gronwall’s inequality yields

$$\begin{aligned} \sup \limits _{1\le j\le k}{} {\textbf {E}}\left( \sup _{t\le s\le T}\Vert x_{j}(s)\Vert ^{2}\right)&\le C_{1}\exp (C_{2}(T-t)). \end{aligned}$$

Since k was arbitrary, the inequality (4.3) follows. Finally, from (4.6) we have

$$\begin{aligned} {\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s&\le \varpi ^{-1}\left( C_{1}+C_{2}\int _{t}^{T} C_{1}\exp (C_{2}(T-s))\textrm{d}s\right) \\&\le \varpi ^{-1} C_{1}\left( 1+C_{2}(T-t)\exp (C_{2}(T-t)) \right) . \end{aligned}$$

\(\square \)

Lemma 4.2

Under Assumptions 4.1 and 4.3, there exists a constant \(C_{3}>0\) defined in (4.2) such that

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x_{j+k}(s)-x_{j}(s)\Vert ^{2}\le C_{3}\int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s\le r\le T}\Vert x_{j+k-1}(r)-x_{j-1}(r)\Vert ^{2}\right) \textrm{d}s, \end{aligned}$$
(4.7)

for all \(0\le t \le T\) and \(j,k\ge 1\).

Proof

Applying Lemma 3.1, we have

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x_{j+k}(s)-x_{j}(s)\Vert ^{2}+{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y_{j+k}(s,u)-y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\\&\quad \le 16\frac{(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\int _{s}^{T}\Vert f(s,u,x_{j+k-1}(u),y_{j+k}(s,u))\\&\qquad -f(s,u,x_{j-1}(u),y_{j}(s,u))\Vert ^{2}\textrm{d}u\textrm{d}s\nonumber \\&\qquad +2{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert g(s,u,x_{j+k-1}(u))-g(s,u,x_{j-1}(u))\Vert ^{2}\textrm{d}u\textrm{d}s\\&\quad \le 16\frac{(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\int _{s}^{T}\left[ \rho \left( \Vert x_{j+k-1}(u)-x_{j-1}(u)\Vert ^{2}\right) \right. \\&\qquad \left. +c\Vert y_{j+k}(s,u)-y_{j}(s,u)\Vert ^{2}\right] \textrm{d}u\textrm{d}s\\&\qquad +2{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\rho \left( \Vert x_{j+k-1}(u)-x_{j-1}(u)\Vert ^{2}\right) \textrm{d}u\textrm{d}s\\&\quad \le 16 \frac{(2T)^{2\alpha }}{2\alpha -1}(T-t)\int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s \le r \le T}\Vert x_{j+k-1}(r)-x_{j-1}(r)\Vert ^{2}\right) \textrm{d}s\\&\qquad + 16 \frac{(2T)^{2\alpha }}{2\alpha -1}c\int _{t}^{T}\int _{s}^{T}\Vert y_{j+k}(s,u)-y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\\&\qquad +2(T-t){\textbf {E}} \int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s \le r \le T}\Vert x_{j+k-1}(r)-x_{j-1}(r)\Vert ^{2}\right) \textrm{d}s\\&\quad \le \left( 16 \frac{(2T)^{2\alpha }}{2\alpha -1}+2(T-t)\right) \int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s \le r \le T}\Vert x_{j+k-1}(r)-x_{j-1}(r)\Vert ^{2}\right) \textrm{d}s\\&\qquad +32 \frac{(2T)^{2\alpha }}{2\alpha -1}c\int _{t}^{T}\int _{s}^{T}\Vert y_{j+k}(s,u)-y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s. \end{aligned}$$

Therefore, we have

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x_{j+k}(s)-x_{j}(s)\Vert ^{2}+\left( 1- 32 \frac{(2T)^{2\alpha }}{2\alpha -1}c\right) {\textbf {E}}\\&\qquad \times \int _{t}^{T}\int _{s}^{T}\Vert y_{j+k}(s,u)-y_{j}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\\&\quad \le C_{3}\int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s \le r \le T}\Vert x_{j+k-1}(r)-x_{j-1}(r)\Vert ^{2}\right) \textrm{d}s, \end{aligned}$$

which completes the proof. \(\square \)

Lemma 4.3

Under Assumptions 4.1–4.3, there exists a constant \(C_{4}>0\), defined in (4.2), such that

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x_{j+k}(s)-x_{j}(s)\Vert ^{2}\le C_{4}(T-t), \end{aligned}$$

for all \(0\le t \le T\) and for all \(j,k \ge 1\).

Proof

By Lemmas 4.1 and 4.2, together with the elementary inequality \(\Vert a-b\Vert ^{2}\le 2\Vert a\Vert ^{2}+2\Vert b\Vert ^{2}\), we have

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x_{j+k}(s)-x_{j}(s)\Vert ^{2}&\le C_{3}\int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s \le r \le T}\Vert x_{j+k-1}(r)-x_{j-1}(r)\Vert ^{2}\right) \textrm{d}s\\&\le C_{3}\int _{t}^{T}\rho \left( 4C_{1}\exp (C_{2}(T-s))\right) \textrm{d}s\\&\le C_{3}\rho \left( 4C_{1}\exp (C_{2}T)\right) (T-t)=C_{4}(T-t). \end{aligned}$$

Therefore, the proof is complete. \(\square \)

Let us define the following sequences:

$$\begin{aligned}&\varphi _{1}(t)=C_{4}(T-t),\\&\varphi _{j+1}(t)=C_{3}\int _{t}^{T}\rho ( \varphi _{j}(s))\textrm{d}s, \quad j\ge 1,\\&\tilde{\varphi }_{j,k}(t)={\textbf {E}}\sup _{t\le s\le T}\Vert x_{j+k}(s)-x_{j}(s)\Vert ^{2}, \quad j\ge 1, k\ge 1. \end{aligned}$$
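The monotone decay of the sequence \(\varphi _{j}\), which drives the Cauchy argument below, can be observed numerically. The following hedged Python sketch iterates \(\varphi _{j+1}(t)=C_{3}\int _{t}^{T}\rho (\varphi _{j}(s))\textrm{d}s\) on a grid; the values of \(C_{3}\), \(C_{4}\) and the Lipschitz special case \(\rho (u)=u\) are our own illustrative stand-ins for the constants in (4.2).

```python
import numpy as np

# Hedged sketch: iterate phi_{j+1}(t) = C3 * int_t^T rho(phi_j(s)) ds on a
# grid, starting from phi_1(t) = C4 * (T - t). The constants C3, C4 and the
# Lipschitz special case rho(u) = u are illustrative stand-ins for (4.2).

T, C3, C4, n = 1.0, 0.8, 1.0, 1_000
t = np.linspace(0.0, T, n + 1)
dt = T / n
rho = lambda u: u

phi = C4 * (T - t)                                  # phi_1
for j in range(2, 12):
    f = rho(phi)
    head = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * dt)))
    phi = C3 * (head[-1] - head)                    # C3 * int_t^T rho(phi_j)
    print(f"sup phi_{j} = {phi.max():.3e}")         # monotone decay to 0
```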

Lemma 4.4

There exists \(0\le T_{0}\le T\) such that for all \(j,k\ge 1\)

$$\begin{aligned} 0\le \tilde{\varphi }_{j,k}(t)\le \varphi _{j}(t)\le \varphi _{j-1}(t)\le \ldots \le \varphi _{1}(t), \quad \text {for all} \quad t\in [T_{0},T]. \end{aligned}$$

Proof

We prove this lemma by the principle of mathematical induction on j.

By Lemma 4.3, we have

$$\begin{aligned} \tilde{\varphi }_{1,k}(t)={\textbf {E}}\sup _{t\le s\le T}\Vert x_{1+k}(s)-x_{1}(s)\Vert ^{2}\le C_{4}(T-t)=\varphi _{1}(t). \end{aligned}$$

By Lemma 4.2, we get

$$\begin{aligned} \tilde{\varphi }_{2,k}(t)&={\textbf {E}}\sup _{t\le s\le T}\Vert x_{2+k}(s)-x_{2}(s)\Vert ^{2}\\&\le C_{3}\int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s\le r\le T}\Vert x_{1+k}(r)-x_{1}(r)\Vert ^{2}\right) \textrm{d}s\\&=C_{3}\int _{t}^{T}\rho \left( \tilde{\varphi }_{1,k}(s)\right) \textrm{d}s\le C_{3}\int _{t}^{T}\rho \left( \varphi _{1}(s)\right) \textrm{d}s=\varphi _{2}(t). \end{aligned}$$

We have to prove that there exists \(T_{0}>0\) such that for all \(t \in [T_{0},T]\) the following inequality holds:

$$\begin{aligned} \varphi _{2}(t)=C_{3}\int _{t}^{T}\rho \left( C_{4}(T-s)\right) \textrm{d}s\le C_{4}(T-t)=\varphi _{1}(t). \end{aligned}$$
(4.8)

To this end, note that this inequality holds provided that

$$\begin{aligned} C_{3}\rho \left( C_{4}(T-t)\right) \le C_{4}=C_{3}\rho \left( 4C_{1}\exp (C_{2}T)\right) . \end{aligned}$$

Since \(\rho \) is nondecreasing, this holds if

$$\begin{aligned} C_{4}(T-t)\le 4C_{1}\exp (C_{2}T)=4u. \end{aligned}$$

On the other hand, since \(C_{4}\le C_{3}(a+4bu)\) by Assumption 4.3, this holds if

$$\begin{aligned} C_{3}(a+4bu)(T-t)\le 4u. \end{aligned}$$

Since \(u=C_{1}\exp (C_{2}T)\ge C_{1}\), the above inequality holds if

$$\begin{aligned} T-t\le \frac{4}{C_{3}(\frac{a}{u}+4b)}. \end{aligned}$$
(4.9)

Thus, (4.8) holds true for any t satisfying (4.9). Obviously, such a t does not depend on the value of \(\xi \). Hence, there exists \(T_{0}>0\) such that

$$\begin{aligned} \varphi _{2}(t)\le \varphi _{1}(t) \end{aligned}$$

for all \(t \in [T_{0},T]\). Now we assume that the assertion holds for some \(j\ge 2\). Then using the same inequalities as above yields

$$\begin{aligned} \tilde{\varphi }_{j+1,k}(t)&\le C_{3}\int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s\le r\le T}\Vert x_{j+k}(r)-x_{j}(r)\Vert ^{2}\right) \textrm{d}s\\&\le C_{3}\int _{t}^{T}\rho \left( \tilde{\varphi }_{j,k}(s)\right) \textrm{d}s\le C_{3}\int _{t}^{T}\rho \left( \varphi _{j}(s)\right) \textrm{d}s=\varphi _{j+1}(t) \end{aligned}$$

for all \(t\in [T_{0},T]\). On the other hand, we have

$$\begin{aligned} \varphi _{j+1}(t)=C_{3}\int _{t}^{T}\rho \left( \varphi _{j}(s)\right) \textrm{d}s\le C_{3}\int _{t}^{T}\!\rho \left( \varphi _{j-1}(s)\right) \textrm{d}s=\varphi _{j}(t) \text { for all } t\in [T_{0},T]. \end{aligned}$$

This completes the proof. \(\square \)

Theorem 4.1

Assume that Assumptions 4.1–4.3 hold. Then there exists a unique mild solution (x, y) of (1.1).

Proof

Uniqueness: To show the uniqueness, let \((x,y)\) and \((\bar{x},\bar{y})\) be two solutions of (1.1). Then Lemma 3.1 implies that

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x(s)-\bar{x}(s)\Vert ^{2}+{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y(s,u)-\bar{y}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\\&\quad \le 16\frac{(2T)^{2\alpha }}{2\alpha -1} {\textbf {E}}\int _{t}^{T}\int _{s}^{T}\Vert f(s,u,x(u),y(s,u))-f(s,u,\bar{x}(u),\bar{y}(s,u))\Vert ^{2}\textrm{d}u\textrm{d}s\nonumber \\&\qquad +2{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert g(s,u,x(u))-g(s,u,\bar{x}(u))\Vert ^{2}\textrm{d}u\textrm{d}s\\&\quad \le \left( 16 \frac{(2T)^{2\alpha }}{2\alpha -1}+2(T-t)\right) \int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s \le r \le T}\Vert x(r)-\bar{x}(r)\Vert ^{2}\right) \textrm{d}s\\&\qquad +16\frac{(2T)^{2\alpha }}{2\alpha -1}c{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y(s,u)-\bar{y}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s. \end{aligned}$$

For \(t\le s\le T\), we have

$$\begin{aligned}&{\textbf {E}}\sup _{t\le s\le T}\Vert x(s)-\bar{x}(s)\Vert ^{2}+{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y(s,u)-\bar{y}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s\\&\quad \le C\int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s\le r\le T}\Vert x(r)-\bar{x}(r)\Vert ^{2} \right) \textrm{d}s\\&\quad +8\frac{(2T)^{2\alpha }}{2\alpha -1}c{\textbf {E}} \int _{t}^{T}\int _{s}^{T}\Vert y(s,u)-\bar{y}(s,u)\Vert ^{2}\textrm{d}u\textrm{d}s. \end{aligned}$$

By Assumption 4.2, \(1-8\frac{(2T)^{2\alpha }}{2\alpha -1}c>0\); then for any \(0\le t \le T\), we have

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x(s)-\bar{x}(s)\Vert ^{2}\le C\int _{t}^{T}\rho \left( {\textbf {E}}\sup _{s\le r\le T}\Vert x(r)-\bar{x}(r)\Vert ^{2} \right) \textrm{d}s. \end{aligned}$$

Therefore, by Bihari’s inequality we obtain

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x(s)-\bar{x}(s)\Vert ^{2}=0. \end{aligned}$$

So \(x(t)=\bar{x}(t)\) for all \(0\le t\le T\) almost surely. It then follows from (3.2) that \(y(t,s)=\bar{y}(t,s)\) for all \((t,s)\in \mathcal {D}\) almost surely as well. This establishes the uniqueness.

Existence: We claim that

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x_{j+k}(s)-x_{j}(s)\Vert ^{2}\rightarrow 0,\quad \text {for all} \quad T_{0}\le t\le T, \quad \text {as} \quad j,k\rightarrow \infty .\nonumber \\ \end{aligned}$$
(4.10)

Note that, by definition, \(\varphi _{j}\) is continuous on \([T_{0},T]\); moreover, for each \(j\ge 1\), \(\varphi _{j}(\cdot )\) is decreasing on \([T_{0},T]\), and for each fixed t, \(\varphi _{j}(t)\) is a nonincreasing sequence in j. Therefore, we can define the function \(\varphi (t)\) by \(\varphi _{j}(t)\downarrow \varphi (t)\). It is easy to verify that \(\varphi (t)\) is continuous and nonincreasing on \([T_{0},T]\). By the definition of \(\varphi _{j}(t)\) and \(\varphi (t)\), we get

$$\begin{aligned} \varphi (t)= \lim \limits _{j\rightarrow \infty }C_{3}\int _{t}^{T}\rho (\varphi _{j}(s))\textrm{d}s=C_{3}\int _{t}^{T}\rho (\varphi (s))\textrm{d}s \end{aligned}$$

for each \(t\in [T_{0},T]\). Since

$$\begin{aligned} \int _{0+}\frac{\textrm{d}u}{\rho (u)}=\infty , \end{aligned}$$

by virtue of Assumption 4.3 and Bihari’s inequality (Theorem 2.2), \(\varphi (t)=0\) for all \(t\in [T_{0},T]\). As a consequence, \(\lim \limits _{j\rightarrow \infty }\varphi _{j}(T_{0})=0\). By Lemma 4.4,

$$\begin{aligned} {\textbf {E}}\sup _{t\le s\le T}\Vert x_{j+k}(s)-x_{j}(s)\Vert ^{2}&\le \sup _{T_{0}\le t\le T}\tilde{\varphi }_{j,k}(t)\\&\le \sup _{T_{0}\le t\le T}\varphi _{j}(t)=\varphi _{j}(T_{0})\rightarrow 0, \quad \text {as} \quad j\rightarrow \infty . \end{aligned}$$

So (4.10) must hold. Applying (4.10) to (4.7), we see that \(\left\{ x_{j},y_{j} \right\} \) is a Cauchy sequence (hence convergent) in \(M[T_{0},T]\); we denote its limit by (x, y). Now letting \(j\rightarrow \infty \) in (4.1), we obtain

$$\begin{aligned} x(t)&=\xi +\int _{t}^{T}(s-t)^{\alpha -1}f(t,s,x(s),y(t,s))\textrm{d}s\\&\quad +\int _{t}^{T}(s-t)^{\alpha -1}\left[ g(t,s,x(s))+y(t,s) \right] \textrm{d}w(s), \end{aligned}$$

on \([T_{0},T]\). Since the value of \(T_{0}\) depends only on the function \(\rho \), one can deduce by iteration the existence on \([T-q(T-T_{0}),T]\) for each integer \(q\ge 1\), and therefore the existence on the entire interval [0, T]. \(\square \)

5 Conclusions and future works

In this paper, the singular backward stochastic nonlinear Volterra integral equation was introduced, a topic untreated in the recent literature. To derive an adapted pair of stochastic processes, we first formulated a fundamental lemma, which plays a crucial role in the theory of singular BSVIEs. The main results of the paper establish the existence and uniqueness of an adapted solution to (1.1) in an infinite-dimensional setting under a non-Lipschitz condition. In doing so, we constructed a Picard-type approximation. The key points in the proofs of the main results were the extended martingale representation theorem and Bihari’s inequality.

Since our results are new in the theory of BSDEs, there remain open problems concerning their applications to finance and optimal control theory via the stochastic maximum principle. In future work, we aim to solve high-dimensional nonlinear backward stochastic differential equations using deep learning algorithms.