1 Introduction

Cubature on Wiener space [19] is a family of numerical formulas for approximating the expectation of functionals of diffusion processes, which are important in mathematical finance and related fields. The existence of cubature formulas of general dimension and degree is known, but the constructions given in the literature rely on the algebraic structure shared by continuous paths and Brownian motion and are limited to low degree and dimension. Our aim in this paper is to obtain a method of constructing general cubature formulas on Wiener space through mathematical optimization.

More concretely, we are interested in approximating values of the form \(\mathrm {E}\left[ f(X_T)\right]\), where \((X_t)_{0\le t\le T}\) is the solution of a stochastic differential equation (SDE) driven by a multidimensional Brownian motion, and f is a function with some regularity (e.g., Lipschitz continuity). The standard approach consists of two steps:

  (1) approximate the distribution of \(X_{t+\varDelta t}\) from the information of (approximated) \(X_t\), where \(\varDelta t \ll 1\);

  (2) split the time interval [0, T] by \(0 = t_0< t_1< \cdots < t_k = T\) and sequentially apply (1) over each \([t_\ell , t_{\ell +1}]\).

For example, the stochastic process \(\tilde{X}^{\mathrm {EM}, (k)}_T\) given by the Euler–Maruyama method with equal partition attains \(\left|\mathrm {E}\,[f(\tilde{X}^{\mathrm {EM}, (k)}_T)] - \mathrm {E}\left[ f(X_T)\right] \right|= \mathrm {O}(1/k)\) with respect to k, the number of partitions [12].

To achieve higher precision of approximation such as \(\mathrm {O}(1/k^2)\), Lyons and Victoir [19] introduced cubature on Wiener space, which uses higher-order terms of the stochastic Taylor expansion in step (1). The basic scheme for (1) in cubature on Wiener space is as follows:

  (c1) approximate the distribution of Brownian motion \((B_s)_{t\le s\le t+\varDelta t}\) by a weighted discrete set of deterministic paths;

  (c2) locally (over \([t, t+\varDelta t]\)) solve ordinary differential equations (ODEs) driven by the deterministic paths in (c1) instead of the stochastic differential equations for approximating the distribution of \(X_{t+\varDelta t}\).

Among the steps (c1), (c2), (2) in cubature on Wiener space, this paper primarily focuses on finding a good weighted set of deterministic paths described in (c1).

In the field of high-order approximation of SDEs, there is also a relative of cubature on Wiener space called Kusuoka approximation [15], which uses random paths in (c1) and is followed by the papers [20, 22] presenting concrete second-order schemes. The main challenge shared by these approaches (cubature on Wiener space, Kusuoka approximation) is that constructing such a (random) set of paths has been performed by actually solving Lie-algebraic equations and is limited to low dimension and degree of precision. Therefore, our objective is to find a way of generally constructing such a formula in arbitrary dimension and degree.

Contribution of this study Broadly speaking, the contribution of this study comprises the following two items:

  • (main contribution) We show that one can construct a cubature formula on Wiener space of general dimension and degree with a randomized algorithm.

  • (technical contribution) To apply the technique of [10] to the problem of cubature on Wiener space, we characterize the affine hull of the distribution of iterated Stratonovich integrals and prove stochastic Tchakaloff’s theorem in a stronger way.

The main result with a simple Monte Carlo construction is given in Proposition 19. It asserts that a certain random generation of piecewise linear paths combined with a linear programming yields a cubature formula on Wiener space.

As a technical contribution, we extend stochastic Tchakaloff’s theorem [19], which assures the existence of cubature formulas on Wiener space. Although the original statement was just that there exists a cubature formula, we show that the expectation of iterated Stratonovich integrals of a Brownian motion (\(\mathrm {E}\left[ \varvec{\varphi }_\mathrm {W}(B)\right]\) in (4)) is contained in the relative interior of \({{\,\mathrm{conv}\,}}\{\varvec{\varphi }_\mathrm {W}(w)\}\) with a valid range of bounded variation (BV) paths w (Theorem 17). This stronger statement with “relative interior” follows from our characterization (Proposition 15) of \(\mathrm{aff}{{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }_\mathrm {W}(B)}\) in terms of (4) and is essential in exploiting the existing construction of general cubature formulas [10].

We only treat the part (c1) in this study, but it is important to consider combining our construction with ODE solvers (c2) and some techniques reducing computational complexity (2) such as recombination [17], which needs further investigation and is deferred to future research.


Outline We give a brief overview of the following sections.

Section 2 describes the idea and background of this study in a more mathematical way. We briefly explain the concept of cubature on Wiener space in Sect. 2.1 and the overview of a preceding result recently given by one of the authors [10] in Sect. 2.2.

Section 3 is devoted to a theoretical review of cubature on Wiener space by [19]. We introduce the facts around vector fields and relevant notations in Sect. 3.1 used throughout the paper. The precise definition and error estimate of cubature on Wiener space is given in Sect. 3.2, and Sect. 3.3 provides information about known constructions based on algebraic arguments.

In Sect. 4, we give the extended statement of stochastic Tchakaloff’s theorem and its proof. Sections 4.1 and 4.2 provide algebraic background of cubature construction. Section 4.3 is devoted to the proof of our version of stochastic Tchakaloff’s theorem, and it also includes the characterization of the distribution of iterated Stratonovich integrals from the viewpoint of the affine hull of the support.

We discuss a way to obtain piecewise linear cubature formulas on Wiener space in Sect. 5. After giving some general properties of continuous BV paths in our context in Sect. 5.1, we prove our main result in Sect. 5.2 that we can generally construct cubature formulas on Wiener space with simple randomized algorithms. Section 5.3 presents the numerical verification of our result in a small range of parameters.

Finally we summarize our conclusion in Sect. 6.

2 Preliminaries

In this section, we shall briefly explain the background and motivation of this study. Section 2.1 gives a brief explanation of the cubature theory on Wiener space [19] and identifies the problem to be solved. Section 2.2 introduces the recent Monte Carlo approach by [10] to general cubature construction, on which our method in this paper is based. We explain the relation between cubature on Wiener space and a generalized one, as well as the particular difficulty that arises on the Wiener space.

2.1 Cubature on Wiener space

Let \(B=(B^1, \ldots , B^d)\) be a d-dimensional standard Brownian motion. Let \(C_b^\infty (\mathbb {R}^N;\, \mathbb {R}^N)\) be the space of infinitely differentiable \(\mathbb {R}^N\)-valued functions defined on \(\mathbb {R}^N\) whose every order of derivative is bounded. Let us consider the following N-dimensional Stratonovich SDE:

$$\begin{aligned} \mathrm {d}X_t = \sum _{i=1}^d V_i(X_t) \circ \mathrm {d}B_t^i + V_0(X_t)\,\mathrm {d}t, \qquad X_0=x, \end{aligned}$$
(1)

where \(x\in \mathbb {R}^N\), \(V_i\in C_b^\infty (\mathbb {R}^N;\mathbb {R}^N)\) for \(i=0,\ldots ,d\). As the process \(X_t\) is dependent on the initial value x, we denote it by \(X_t(x)\) if necessary. We may assume the solution \(X_t(x)\) is continuous with respect to t and x. Our aim is to efficiently compute or approximate the expectation \(\mathrm {E}\left[ f(X_t)\right]\) with \(t>0\) and some smooth or Lipschitz f. This sort of approximation is called a weak approximation of the SDE and is well studied in the literature [12, 13, 15, 19, 20, 22, 23].

We here focus on the approach introduced in [19] called cubature on Wiener space. Broadly speaking, a cubature formula on Wiener space (of the time interval [0, T]) is the approximation

$$\begin{aligned} \mathrm {P}\simeq \sum _{j=1}^n \lambda _j \delta _{w_j}, \end{aligned}$$

where \(\mathrm {P}\) is the Wiener measure on the Wiener space \(C_0^0([0, T]; \mathbb {R}^d)\) (the space of \(\mathbb {R}^d\)-valued continuous functions on [0, T] starting at the origin),

$$\begin{aligned} w_j=(w_j^0, w_j^1,\ldots , w_j^d)\in C_0^0([0, T]; \mathbb {R}\oplus \mathbb {R}^d) \end{aligned}$$

for \(j=1,\ldots , n\), and \(\lambda _1,\ldots ,\lambda _n\) are positive real weights whose sum equals one. Instead of polynomials in conventional cubature formulas, we adopt iterated integrals as the test functionals, that is, we want to find paths \(w_j\) satisfying

$$\begin{aligned} \mathrm {E}\left[ \int_{0<t_1<\cdots<t_k<T}\circ \, \mathrm {d}B_{t_1}^{i_1} \cdots \circ \mathrm {d}B_{t_k}^{i_k}\right] =\sum _{j=1}^n \lambda _j\int_{0<t_1<\cdots<t_k<T}\mathrm {d}w^{i_1}_j(t_1) \cdots \mathrm {d}w_j^{i_k}(t_k) \end{aligned}$$
(2)

over some set of multiindices \((i_1,\ldots ,i_k)\), where the iterated Stratonovich integral appears in the left-hand side. Precisely speaking, we formally set \(B^0_t=t\) for \(t\ge 0\) and assume \(w_j\) is a path of BV for each \(j=1,\ldots ,n\). Although \(w_j^0(t)=t\) is also assumed in [19], here we may generalize and remove this condition.
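For a piecewise linear BV path, the iterated integrals on the right-hand side of (2) can be evaluated exactly, segment by segment. The following is a minimal sketch under stated assumptions: it uses the standard facts that a straight segment with increment \(v\) has iterated integral \(v_{i_1}\cdots v_{i_k}/k!\) and that the integrals of a concatenation follow from Chen's identity. The function names are hypothetical, letter 0 is the time component, and truncation is by word length rather than by any weighted degree.

```python
from itertools import product
from math import factorial


def words_up_to(n_letters, max_len):
    """Yield all multi-indices (i_1, ..., i_k) with 0 <= k <= max_len."""
    for k in range(max_len + 1):
        yield from product(range(n_letters), repeat=k)


def segment_signature(v, max_len):
    """Iterated integrals of the straight segment t -> t * v on [0, 1]."""
    sig = {}
    for word in words_up_to(len(v), max_len):
        value = 1.0
        for i in word:
            value *= v[i]
        sig[word] = value / factorial(len(word))
    return sig


def chen_product(a, b, n_letters, max_len):
    """Iterated integrals of 'path a followed by path b' (Chen's identity)."""
    out = {}
    for word in words_up_to(n_letters, max_len):
        # Earlier integration variables see the first path, later ones the second.
        out[word] = sum(a[word[:i]] * b[word[i:]] for i in range(len(word) + 1))
    return out


def piecewise_linear_signature(increments, max_len):
    """Iterated integrals of the path whose linear pieces have these increments."""
    n_letters = len(increments[0])
    sig = segment_signature(increments[0], max_len)
    for v in increments[1:]:
        sig = chen_product(sig, segment_signature(v, max_len), n_letters, max_len)
    return sig


# A two-piece path in R (+) R^1: time runs 0 -> 1 while w^1 goes up, then down.
example = piecewise_linear_signature([(0.5, 1.0), (0.5, -1.0)], max_len=2)
print(example[(0,)], example[(1,)], example[(0, 1)], example[(1, 0)])
```

Splitting a straight line into two halves and recombining them with `chen_product` reproduces `segment_signature` of the whole line, which is a convenient sanity check.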

The iterated integrals appearing in (2) have a rich algebraic structure (see Sect. 4.1), so algebraic approaches have been adopted in the literature [8, 19, 21, 22, 24, 29]. However, solving complicated equations of Lie algebra is required in those approaches, and constructions of the formula are limited to a small range (see Sect. 3.3). Our objective is to give a construction method for a general setting where there are no limitations on the number of iterations of the integral (k in (2)). For this purpose, we adopt an optimization-based viewpoint instead of algebraic ones and extend to our situation the result of [10], which gives a randomized construction of general cubature formula (Tables 1, 2, 3, 4, 5, and 6).

We should mention the Kusuoka approximation [13, 15, 29], which is closely related to cubature on Wiener space. However, our objective in this paper is limited to the construction of the cubature on Wiener space, and the optimization-based approach to the general Kusuoka approximation is deferred for future work.

2.2 Monte Carlo approach to generalized cubature

A cubature formula is originally a numerical integration formula on some Euclidean space that exactly integrates polynomials up to a certain degree [30], the theory of which underlies cubature on Wiener space introduced in the previous section. Then, we shall explain it in a generalized setting and briefly explain the idea of [10] for constructing general cubature formulas.

Let \((\varOmega , \mathscr {G})\) be some measurable space and X be a random variable on it. A generalized cubature formula with respect to X and integrable functions \(\varphi _1, \ldots , \varphi _D:\varOmega \rightarrow \mathbb {R}\) is a set of points \(x_1,\ldots ,x_n\in \varOmega\) and positive weights \(\lambda _1,\ldots ,\lambda _n\) such that

$$\begin{aligned} \mathrm {E}\left[ \varphi _i(X)\right] =\sum _{j=1}^n\lambda _j\varphi _i(x_j), \quad i=1,\ldots ,D. \end{aligned}$$

For simplicity, we then assume \(\varphi _1\equiv 1\). In this setting, \(\lambda _1+\cdots +\lambda _n=1\) must hold. We can also regard the above condition as one vector-valued equality \(\mathrm {E}\left[ \varvec{\varphi }(X)\right] =\sum _{j=1}^n\lambda _j\varvec{\varphi }(x_j)\), where \(\varvec{\varphi }:\varOmega \rightarrow \mathbb {R}^D\) is defined as \(\varvec{\varphi }=(\varphi _1,\ldots ,\varphi _D)^\top\). The existence of such a formula is assured by the following theorem [2, 28, 31]:

Theorem 1

(Generalized Tchakaloff’s theorem) Under the above setting, there exists a cubature formula whose number of points satisfies \(n\le D\). Moreover, we can take points \(x_1,\ldots ,x_n\) so as to satisfy \(\varvec{\varphi }(x_j)\in {{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }(X)}\) for each \(j=1,\ldots ,n\).

In the above statement, \({{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }(X)}\) is the support of the distribution of the vector-valued random variable \(\varvec{\varphi }(X)\). Equivalently, this is the smallest closed set \(A\subset \mathbb {R}^D\) satisfying \(\mathrm {P}(\varvec{\varphi }(X)\in A) = 1\). Generalized Tchakaloff’s theorem can be understood as an immediate consequence of a discrete-geometric argument. Indeed, \(\mathrm {E}\left[ \varvec{\varphi }(X)\right]\) is contained in the convex hull of \({{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }(X)}\) (the convex hull of an \(A\subset \mathbb {R}^D\) is defined as \({{\,\mathrm{conv}\,}}A :=\{\sum _{i=1}^m\lambda _i x_i \mid m\ge 1,\ \lambda _i\ge 0, \ \sum _{i=1}^m\lambda _i=1,\ x_i\in A\}\)). Therefore, the generalized Tchakaloff’s theorem follows from Carathéodory’s theorem (note that we assume \(\varphi _1\equiv 1\)):

Theorem 2

(Carathéodory) For an arbitrary \(A\subset \mathbb {R}^D\) and \(x\in {{\,\mathrm{conv}\,}}A\), there exist \(D+1\) points \(x_1, \ldots , x_{D+1}\in A\) such that \(x\in {{\,\mathrm{conv}\,}}\{x_1,\ldots ,x_{D+1}\}\).

Although the above argument cannot directly be used in the construction of cubature, we can use its nature by introducing the concept of relative interior. For a set \(A\subset \mathbb {R}^D\), its affine hull is defined by

$$\begin{aligned} \mathrm{aff}A:=\left\{ \sum _{i=1}^m \lambda _ix_i\, \bigg {|}\, m\ge 1,\ \lambda _i\in \mathbb {R},\ \sum _{i=1}^m \lambda _i=1,\ x_i\in A \right\} \end{aligned}$$

Then, the relative interior of A is the interior of A regarding the subspace topology on \(\mathrm{aff}A\) and denoted by \({{\,\mathrm{ri}\,}}A\). In terms of this relative interior, the following generalization of Carathéodory’s theorem holds:

Theorem 3

([3, 30]) Suppose an \(A\subset \mathbb {R}^D\) satisfies \(\mathrm{aff}A\) is a k-dimensional affine subspace of \(\mathbb {R}^D\). Then, for each \(x\in {{\,\mathrm{ri}\,}}{{\,\mathrm{conv}\,}}A\), there exists some subset \(B\subset A\) composed of at most 2k points such that \(\mathrm{aff}B=\mathrm{aff}A\) and \(x\in {{\,\mathrm{ri}\,}}{{\,\mathrm{conv}\,}}B\).

From this generalization and the fact that

$$\begin{aligned} \mathrm {E}\left[ \varvec{\varphi }(X)\right] \in {{\,\mathrm{ri}\,}}{{\,\mathrm{conv}\,}}{{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }(X)} \end{aligned}$$
(3)

holds (see, e.g., [10] or essentially [2] for proof), we obtain the following randomized construction of cubature formulas from i.i.d. copies of X:

Theorem 4

([10]) Let \(X_1, X_2, \ldots\) be i.i.d. copies of X. Then there exists almost surely a positive integer n satisfying \(\mathrm {E}\left[ \varvec{\varphi }(X)\right] \in {{\,\mathrm{conv}\,}}\{\varvec{\varphi }(X_1), \ldots , \varvec{\varphi }(X_n)\}\).

Though the weights remain undetermined, for a sufficiently large n, it suffices to take a basic feasible solution of the linear programming problem

$$\begin{aligned} \text {minimize}\ \ 0 \; \text {subject to}\ \ \sum _{j=1}^n\lambda _j \varvec{\varphi }(X_j) = \mathrm {E}\left[ \varvec{\varphi }(X)\right] ,\ \lambda _j\ge 0. \end{aligned}$$

Indeed, its basic feasible solution satisfies the bound of points used in a cubature given in Tchakaloff’s theorem (Theorem 1). This sort of technique reducing the number of points in a discrete measure is called Carathéodory-Tchakaloff subsampling [25].
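The randomized construction can be illustrated on a toy problem. The sketch below is not the implementation of [10]: it assumes \(X\sim N(0,1)\) with test functions \((1, x, x^2)\), so that \(\mathrm {E}[\varvec{\varphi }(X)]=(1,0,1)^\top\), draws samples one at a time, and stands in for an LP solver by brute-force enumeration of candidate bases (subsets of D sample points), which is only sensible for very small D. All function names are hypothetical.

```python
import random
from itertools import combinations


def features(x):
    """Test functions phi = (1, x, x^2); phi_1 = 1 forces the weights to sum to one."""
    return [1.0, x, x * x]


TARGET = [1.0, 0.0, 1.0]  # E[phi(X)] for X ~ N(0, 1)


def solve_linear(A, b):
    """Solve A z = b by Gauss-Jordan elimination; return None if A is (nearly) singular."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def random_cubature(seed=0, max_points=200):
    """Sample until E[phi(X)] lies in the convex hull of the sampled feature vectors."""
    rng = random.Random(seed)
    D = len(TARGET)
    samples = []
    while len(samples) < max_points:
        samples.append(rng.gauss(0.0, 1.0))
        new = len(samples) - 1
        if new < D - 1:
            continue
        # Only subsets containing the newest sample have not been tried yet.
        for rest in combinations(range(new), D - 1):
            idx = rest + (new,)
            cols = [features(samples[j]) for j in idx]
            A = [[cols[j][i] for j in range(D)] for i in range(D)]
            lam = solve_linear(A, TARGET)
            if lam is not None and min(lam) >= 0.0:
                return [samples[j] for j in idx], lam
    raise RuntimeError("no feasible basis found; increase max_points")


points, weights = random_cubature()
print(points, weights)
```

Each feasible subset found this way is a basic feasible solution of the linear program above, so the resulting formula uses at most D points. In practice one would hand the constraint matrix to an LP solver instead of enumerating subsets.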

Here, if we formally write the iterated integral \(\int _{0<t_1<\cdots<t_k<T}\mathrm {d}w^{i_1}(t_1) \cdots \mathrm {d}w^{i_k}(t_k)\) appearing in (2) as \(\varphi _{(i_1, \ldots ,i_k)}(w)\) for a valid w (and the Brownian motion B), then the cubature on Wiener space is a set of paths \(w_j\) and weights \(\lambda _j\) formally satisfying

$$\begin{aligned} \mathrm {E}\left[ \varvec{\varphi }_\mathrm {W}(B)\right] = \sum _{j=1}^n \lambda _j \varvec{\varphi }_\mathrm {W}(w_j), \end{aligned}$$
(4)

where \(\varvec{\varphi }_\mathrm {W}\) denotes a vector of some functions of the form \(\varphi _{(i_1,\ldots ,i_k)}\). Therefore, if we could directly generate sample paths of the Brownian motion, then Theorem 4 should be applicable. In reality, it is impossible to generate a Brownian motion on a computer, and it is not even a BV path. However, we assume the following variant, supporting our arguments.

Remark 1

The assumption that \(X_1, X_2, \ldots\) possess the same distribution as X in Theorem 4 can be relaxed; the same conclusion follows from the following condition for the i.i.d. sequence:

$$\begin{aligned} \mathrm{aff}{{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }(X_1)}=\mathrm{aff}{{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }(X)}, \qquad {{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }(X_1)} \supset {{\,\mathrm{supp}\,}}\mathrm {P}_{\varvec{\varphi }(X)}. \end{aligned}$$
(5)

From this fact, it is sufficient to investigate the distribution of iterated integrals, and we indeed show the Wiener space counterpart of the condition (5) in Proposition 15.

3 Theoretical background of cubature on Wiener space

In this section, we provide a theoretical review of the cubature theory on Wiener space, first introduced by [19]. We quickly introduce basic notions concerning multidimensional stochastic flows, and give the error estimate of the cubature formula. Moreover, we will present a few examples of concrete constructions of cubature formulas on Wiener space in Sect. 3.3.

3.1 Vector fields

In this section, we define vector fields on \(\mathbb {R}^N\) and show the correspondence between vector fields and vector-valued functions. Let \(C^\infty (\mathbb {R}^N)\) be the set of real-valued smooth functions over \(\mathbb {R}^N\).

Definition 5

A vector field on \(\mathbb {R}^N\) is a (\(\mathbb {R}\)-)linear mapping \(V:C^\infty (\mathbb {R}^N)\rightarrow C^\infty (\mathbb {R}^N)\) such that \(V(fg)=(Vf)g + fVg\) holds for arbitrary \(f, g\in C^\infty (\mathbb {R}^N)\).

Due to this condition, a vector field on \(\mathbb {R}^N\) has to be a differential operator \(\sum _{i=1}^NV^i\partial _i\) where \(V^i\in C^\infty (\mathbb {R}^N)\) and \(\partial _i\) denotes the i-th partial derivative for \(i=1,\ldots ,N\). Therefore, a vector field corresponds to the vector-valued smooth function \((V^1, \ldots , V^N)^\top : \mathbb {R}^N \rightarrow \mathbb {R}^N\). By abuse of notation, we also denote this vector-valued function by V.

If A and B are vector fields on \(\mathbb {R}^N\), we define the Lie bracket \([A, B]:=AB-BA\). This \([A, B]\) is also a vector field because the second-order derivative terms cancel. Note that \([A, B]\) corresponds to the vector \((\partial B)A -(\partial A)B\), where A, B are regarded as functions and \(\partial C\) denotes the Jacobian matrix of C (see, e.g., [9]).
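The coordinate form of the bracket can be checked by a short computation. Writing \(A=\sum _{i}A^i\partial _i\) and \(B=\sum _{j}B^j\partial _j\) (a worked expansion added for illustration, in the notation above), the terms containing \(\partial _i\partial _jf\) are symmetric in the two operators and drop out of the commutator:

$$\begin{aligned} [A, B]f&=\sum _{i,j=1}^N A^i\partial _i\bigl (B^j\partial _jf\bigr ) - \sum _{i,j=1}^N B^i\partial _i\bigl (A^j\partial _jf\bigr )\\&=\sum _{j=1}^N\left( \sum _{i=1}^N \bigl (A^i\partial _iB^j - B^i\partial _iA^j\bigr )\right) \partial _jf. \end{aligned}$$

The j-th coefficient is exactly the j-th entry of \((\partial B)A-(\partial A)B\). For instance, with \(N=2\), \(A=\partial _1\) and \(B=x_1\partial _2\), one gets \([A, B]=\partial _2\), which corresponds to the constant vector \((0, 1)^\top\).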

Reciprocally, the coefficients of the SDE (1) can be regarded as vector fields. These vector fields are closely related to the behavior of \(X_t\).

Let \(V_0, \ldots , V_d\) be the vector fields (operators) induced by the coefficients of (1), and define the operator \(L:=V_0+\frac{1}{2}(V_1^2+\cdots +V_d^2)\). Let us consider the parabolic partial differential equation (PDE)

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \frac{\partial }{\partial t}u(t, x) = Lu(t, x),\\ u(0, x) = f(x), \end{array}\right. } \end{aligned}$$
(6)

with a Lipschitz function \(f:\mathbb {R}^N\rightarrow \mathbb {R}\). Because \(u(T, x)=\mathrm {E}\left[ f(X_T(x))\right]\) holds [11], we can exploit the numerical schemes in PDE theory to get \(\mathrm {E}\left[ f(X_T(x))\right]\) and vice versa.

We also introduce several conditions on the vector fields. They are assumed in order to obtain the estimate given in Proposition 6. Before we state them, we introduce some notation based on [13].

Let \(\mathscr {A}:=\{\emptyset \}\cup \bigcup _{k=1}^\infty \{0, 1,\ldots ,d\}^k\). For \(\alpha \in \mathscr {A}\), define \(|\alpha |:=0\) if \(\alpha =\emptyset\) and \(|\alpha |:=k\) if \(\alpha =(\alpha _1,\ldots ,\alpha _k)\in \{0,\ldots ,d\}^k\). We also define \(\Vert \alpha \Vert :=|\alpha |+|\{1\le j\le |\alpha | \mid \alpha _j=0\}|\). For \(\alpha ,\beta \in \mathscr {A}\), define \(\alpha *\beta := (\alpha _1,\ldots ,\alpha _{|\alpha |},\beta _1,\ldots ,\beta _{|\beta |})\). Let \(\mathscr {A}_0:=\mathscr {A}\setminus \{\emptyset \}\) and \(\mathscr {A}_1:=\mathscr {A}\setminus \{\emptyset , (0)\}\). We also define, for each integer \(m\ge 1\),

$$\begin{aligned} \mathscr {A}(m):=\{\alpha \in \mathscr {A}\mid \Vert \alpha \Vert \le m\}, \quad \mathscr {A}_0(m):=\mathscr {A}(m)\cap \mathscr {A}_0, \quad \mathscr {A}_1(m):=\mathscr {A}(m)\cap \mathscr {A}_1. \end{aligned}$$
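The index sets above are easy to enumerate, which helps when counting the number of moment conditions a degree-m formula must satisfy. A minimal sketch follows; the function names are hypothetical, and words are represented as Python tuples with the empty tuple playing the role of \(\emptyset\).

```python
from itertools import product


def weighted_degree(alpha):
    """||alpha|| = |alpha| + (number of entries equal to 0)."""
    return len(alpha) + sum(1 for a in alpha if a == 0)


def multi_indices(d, m, exclude=()):
    """List all alpha in A(m) over the letters {0, ..., d}, minus the words in `exclude`."""
    result = []
    for k in range(m + 1):  # ||alpha|| >= |alpha|, so words longer than m never qualify
        for alpha in product(range(d + 1), repeat=k):
            if weighted_degree(alpha) <= m and alpha not in exclude:
                result.append(alpha)
    return result


A_3 = multi_indices(d=2, m=3)                          # A(3)
A1_3 = multi_indices(d=2, m=3, exclude={(), (0,)})     # A_1(3)
print(len(A_3), len(A1_3))
```

For \(d=1\) and \(m=3\) the list consists of \(\emptyset\), (1), (0), (1, 1), (0, 1), (1, 0) and (1, 1, 1); the word (0, 0) is excluded because \(\Vert (0,0)\Vert =4\).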

Define a vector field \(V_{[\alpha ]}\) for each \(\alpha \in \mathscr {A}\) inductively by \(V_{[\emptyset ]}=0\) and

$$\begin{aligned}&V_{[i]}(=V_{[(i)]}) := V_i\quad (i=0,\ldots ,d),\\&V_{[\alpha *(i)]} := [V_{[\alpha ]}, V_i] \quad (|\alpha |\ge 1, i=0,\ldots ,d). \end{aligned}$$

We can now state the uniformly finitely generated (UFG) condition [13, 16]:


(UFG) There exists a positive integer \(L\ge 1\) such that, for an arbitrary \(\alpha \in \mathscr {A}_1\), there exists \(\varphi _{\alpha ,\beta }\in C_b^\infty (\mathbb {R}^N)\) for each \(\beta \in \mathscr {A}_1(L)\) satisfying

$$\begin{aligned} V_{[\alpha ]}=\sum _{\beta \in \mathscr {A}_1(L)}\varphi _{\alpha ,\beta }V_{[\beta ]}. \end{aligned}$$

This is equivalent to the statement that the \(C_b^\infty (\mathbb {R}^N)\)-module generated by \(\{V_{[\alpha ]}\mid \alpha \in \mathscr {A}_1\}\) is finitely generated. Note that (UFG) is known to be strictly weaker than the uniform Hörmander condition (see, e.g., Example 2 in [14]), which is one of the typical assumptions on vector fields.

Although only the condition (UFG) was assumed in [19], it was pointed out in [5] that the following condition is also essential:

(V0) There exists \(\varphi _\beta \in C_b^\infty (\mathbb {R}^N)\) for each \(\beta \in \mathscr {A}_1(2)\) such that

$$\begin{aligned} V_0=\sum _{\beta \in \mathscr {A}_1(2)} \varphi _\beta V_{[\beta ]}. \end{aligned}$$

For a function \(f\in C_b^\infty (\mathbb {R}^N)\), define \((P_tf)(x):=\mathrm {E}\left[ f(X_t(x))\right]\). The following estimate is essential.

Proposition 6

([5, 16]) Assume that both (UFG) and (V0) hold. Then, for any positive integer r and \(\alpha _1,\ldots , \alpha _r\in \mathscr {A}\), there exists a constant \(C>0\) such that

$$\begin{aligned} \Vert V_{[\alpha _1]}\cdots V_{[\alpha _r]}P_tf\Vert _\infty \le \frac{Ct^{1/2}}{t^{(\Vert \alpha _1\Vert +\cdots +\Vert \alpha _r\Vert )/2}} \Vert \nabla f\Vert _\infty \end{aligned}$$
(7)

Although we can obtain a weaker bound without assuming (V0), we later exploit this bound assuming both (UFG) and (V0) for simplicity.

We finally state the stochastic Taylor formula in terms of the vector-field notation introduced above. By Itô’s formula, we obtain for any \(f\in C_b^\infty (\mathbb {R}^N)\)

$$\begin{aligned} f(X_t)&= f(x)+ \sum _{i=0}^d \sum _{j=1}^N \int _0^t V_i^j(X_s) \partial _jf(X_s) \circ \mathrm {d}B^i_s\\&=f(x)+\sum _{i=0}^d\int _0^t (V_if)(X_s)\circ \mathrm {d}B^i_s, \end{aligned}$$

where we denote \(\mathrm {d}s\) by \(\circ \,\mathrm {d}B^0_s\). Therefore, the repetition of Itô’s formula yields

$$\begin{aligned} f(X_t)=f(x)+\sum _{i=0}^d (V_if)(x) \int _{0<s<t} \circ \,\mathrm {d}B_s^i + \sum _{i,j=0}^d (V_iV_jf)(x) \int _{0<t_1<t_2<t} \circ \,\mathrm {d}B^i_{t_1}\circ \mathrm {d}B^j_{t_2} + \cdots . \end{aligned}$$

This is the stochastic Taylor formula, which is rigorously stated as follows. For a multiindex \(\alpha =(\alpha ^1,\ldots ,\alpha ^k)\in \mathscr {A}\), we denote by \(V_\alpha\) the operator \(V_{\alpha ^1}\cdots V_{\alpha ^k}\).

Proposition 7

([19, Proposition 2.1]; [12]) Let \(f\in C_b^\infty (\mathbb {R}^N)\) and m be a positive integer. Then, we have

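$$\begin{aligned} f(X_t(x))=\sum _{\alpha \in \mathscr {A}(m)} (V_\alpha f)(x) \int _{0<t_1<\cdots<t_k<t} \circ \,\mathrm {d}B^{\alpha ^1}_{t_1}\cdots \circ \mathrm {d}B^{\alpha ^k}_{t_k} +R_m(t, x, f), \end{aligned}$$

where \(k=|\alpha |\), the term for \(\alpha =\emptyset\) is understood as \(f(x)\), and the remainder term satisfies, for some constant \(C>0\),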

$$\begin{aligned} \sup _{x\in \mathbb {R}^N}\sqrt{\mathrm {E}\left[ R_m(t, x, f)^2\right] } \le C t^{\frac{m+1}{2}}\sup _{\beta \in \mathscr {A}(m+2)\setminus \mathscr {A}(m)} \Vert V_\beta f\Vert _\infty . \end{aligned}$$

3.2 Formulation and evaluation of cubature on Wiener space

We can now precisely define the cubature formula [19].

Definition 8

Let \(T>0\), and let m be a positive integer. BV paths \(w_1,\ldots ,w_n\in C_0^0([0, T]; \mathbb {R}\oplus \mathbb {R}^d)\) and weights \(\lambda _1,\ldots ,\lambda _n\ge 0\) with \(\sum _{j=1}^n\lambda _j=1\) are said to define a cubature formula on Wiener space of degree m at time T if

$$\begin{aligned} \mathrm {E}\left[ \int_{0<t_1<\cdots<t_k<T} \circ \,\mathrm {d}B^{i_1}_{t_1}\cdots \circ \mathrm {d}B^{i_k}_{t_k}\right] =\sum _{j=1}^n \lambda _j \int_{0<t_1<\cdots<t_k<T} \mathrm {d}w_{j}^{i_1}(t_1)\cdots \mathrm {d}w_{j}^{i_k}(t_k) \end{aligned}$$

holds for all \((i_1,\ldots ,i_k)\in \mathscr {A}(m)\). Here, \(\circ \,\mathrm {d}B^0_s\) represents \(\mathrm {d}s\).

To construct a cubature formula, it suffices to find it over [0, 1]. Indeed, when \(w_1,\ldots ,w_n\in C_0^0([0, 1]; \mathbb {R}\oplus \mathbb {R}^d)\) form the cubature over [0, 1],

$$\begin{aligned} w_{T, j}^i(t):={\left\{ \begin{array}{ll} Tw_j^i(t/T) &{} (i=0)\\ \sqrt{T}w_j^i(t/T) &{} (i=1,\ldots ,d) \end{array}\right. } \end{aligned}$$

with the same weights define the cubature over [0, T]. This is an immediate consequence of the scaling property of the Brownian motion.
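The scaling argument can be spelled out as follows (a short worked check added for illustration). For \(\alpha =(i_1,\ldots ,i_k)\), each letter 0 contributes a factor T and each letter in \(\{1,\ldots ,d\}\) a factor \(\sqrt{T}\), so the substitution \(t=Ts\) gives

$$\begin{aligned} \int _{0<t_1<\cdots<t_k<T}\mathrm {d}w_{T,j}^{i_1}(t_1)\cdots \mathrm {d}w_{T,j}^{i_k}(t_k) = T^{\Vert \alpha \Vert /2}\int _{0<s_1<\cdots<s_k<1}\mathrm {d}w_{j}^{i_1}(s_1)\cdots \mathrm {d}w_{j}^{i_k}(s_k). \end{aligned}$$

Since \((B_{Ts})_{0\le s\le 1}\) has the same distribution as \((\sqrt{T}B_s)_{0\le s\le 1}\), the expectation of the iterated Stratonovich integral over [0, T] picks up the same factor \(T^{\Vert \alpha \Vert /2}\) relative to the one over [0, 1]. Both sides of the cubature condition therefore scale identically.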

Once such paths are given, we can easily compute each evolution driven by \(w_j\) as it is just an ODE. For a BV path \(w\in C_0^0([0, T]; \mathbb {R}\oplus \mathbb {R}^d)\), define \(\tilde{X}_{t}(x, w)\) as the solution of the ODE

$$\begin{aligned} \mathrm {d}\tilde{X}_t(x, w)=\sum _{i=0}^d V_i(\tilde{X}_t(x, w))\,\mathrm {d}w^i (t), \quad \tilde{X}_0(x, w)=x. \end{aligned}$$

Then, \(\sum _{j=1}^n\lambda _jf(\tilde{X}_T(x, w_{T,j}))\) should approximate \(\mathrm {E}\left[ f(X_T(x))\right]\) well. Indeed, the estimate in Proposition 7 holds for \(t=T\) if we replace the Wiener measure by the discrete measure \(\sum _{j=1}^n \lambda _j \delta _{w_{T,j}}\). Therefore, by applying the Cauchy-Schwarz inequality we obtain the evaluation:

$$\begin{aligned} \sup _{x\in \mathbb {R}^N}\left|\mathrm {E}\left[ f(X_T(x))\right] - \sum _{j=1}^n\lambda _j f(\tilde{X}_T(x, w_{T,j}))\right|\le CT^{\frac{m+1}{2}}\sup _{\beta \in \mathscr {A}(m+2)\setminus \mathscr {A}(m)} \Vert V_\beta f\Vert _\infty \end{aligned}$$
(8)

with the constant \(C>0\) depending only on \(w_1,\ldots ,w_n\).
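A one-step application can be tried on a case with a closed-form answer. The sketch below makes several assumptions that are not in the text: \(N=d=1\), \(V_0(x)=ax\), \(V_1(x)=\sigma x\) with hypothetical coefficients, \(f(x)=x\) (Lipschitz but unbounded), the degree-3 formula with the two paths \(t\mapsto (t, \pm t)\) and weights 1/2, and a classical RK4 integrator for the driven ODE. For this Stratonovich SDE, \(\mathrm {E}[X_T]=x\exp (aT+\sigma ^2T/2)\).

```python
import math

A_DRIFT, SIGMA = 0.1, 0.5  # hypothetical coefficients of V_0(x) = a x, V_1(x) = sigma x


def V0(x):
    return A_DRIFT * x


def V1(x):
    return SIGMA * x


def flow(x, T, z, steps=200):
    """RK4 solution at time T of the ODE driven by w_T(t) = (t, z * sqrt(T) * t / T)."""
    dw0, dw1 = 1.0, z / math.sqrt(T)  # constant velocities of the scaled linear path

    def rhs(y):
        return V0(y) * dw0 + V1(y) * dw1

    h = T / steps
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


def cubature_step(f, x, T):
    """Degree-3 formula for d = 1: paths (t, +t) and (t, -t) on [0, 1], weights 1/2."""
    return 0.5 * f(flow(x, T, +1.0)) + 0.5 * f(flow(x, T, -1.0))


def exact_mean(x, T):
    """E[X_T] for the Stratonovich SDE dX = sigma X o dB + a X dt."""
    return x * math.exp(A_DRIFT * T + 0.5 * SIGMA ** 2 * T)


for T in (0.1, 0.05):
    err = abs(cubature_step(lambda y: y, 1.0, T) - exact_mean(1.0, T))
    print(T, err)
```

Halving T should reduce the error by roughly a factor of four, in line with the exponent \((m+1)/2=2\) in (8) for \(m=3\).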

The above formula does not work as a good approximation unless T is small. Therefore, we divide [0, T] into smaller time intervals as \(0=t_0<t_1<\cdots <t_k=T\). If we consider the repeated application of the cubature formula over each subinterval \([t_{\ell -1}, t_\ell ]\), we can use

$$\begin{aligned} \sum _{j_1,\ldots ,j_k=1}^n \lambda _{j_1}\cdots \lambda _{j_k} f(\tilde{X}_T(x, w_{s_1, j_1}*\cdots *w_{s_k, j_k})), \end{aligned}$$
(9)

where \(w*v\) denotes the concatenation of two paths and \(s_\ell :=t_\ell -t_{\ell -1}\) for each \(\ell =1,\ldots , k\), as an approximation of the expectation \(\mathrm {E}\left[ f(X_T(x))\right]\). If we define discrete Markov random variables \(Y_0,\ldots ,Y_k\) independent of the Brownian motion as

$$\begin{aligned} Y_0=x, \qquad \mathrm {P}(Y_\ell =\tilde{X}_{s_\ell }(y, w_{s_\ell , j}) \mid Y_{\ell -1}=y) =\lambda _j \quad (\ell =1,\ldots ,k,\ j=1,\ldots , n), \end{aligned}$$

then \(\mathrm {E}\left[ f(Y_k)\right]\) coincides with the approximation (9). Combining an estimate for

$$\begin{aligned} \sup _{x\in \mathbb {R}^N}\left|\mathrm {E}\left[ f(Y_k)\mid Y_0=x\right] -\mathrm {E}\left[ f(X_T(x))\right] \right|\end{aligned}$$

with Proposition 6, we can prove the following assertion.

Proposition 9

([19, Proposition 3.6]) Let f be a bounded Lipschitz function on \(\mathbb {R}^N\). Then, under (UFG) and (V0), we have

$$\begin{aligned} \sup _{x\in \mathbb {R}^N}\left|\mathrm {E}\left[ f(Y_k)\mid Y_0=x\right] -\mathrm {E}\left[ f(X_T(x))\right] \right|\le C\Vert \nabla f\Vert _\infty \left( s_k^{1/2}+\sum _{\ell =1}^{k-1} \frac{s_\ell ^{(m+1)/2}}{(T-t_\ell )^{m/2}}\right) \end{aligned}$$

for some constant \(C>0\), which is dependent only on m and \(w_1,\ldots ,w_n\).

The equally spaced partition \(t_\ell =\ell T/k\) for \(\ell =0,\ldots , k\) is not the optimal one in terms of asymptotic error bound with \(k\rightarrow \infty\). Consider taking \(t_\ell = T\left( 1-\left( 1 -\frac{\ell }{k}\right) ^\gamma \right)\) with a constant \(\gamma >0\) independent of k (\(\gamma =1\) corresponds to the equally spaced partition). By taking \(\gamma > m-1\), we have the following estimate [13]:

$$\begin{aligned} \sup _{x\in \mathbb {R}^N}\left|\mathrm {E}\left[ f(Y_k)\mid Y_0=x\right] -\mathrm {E}\left[ f(X_T(x))\right] \right|\le C k^{-(m-1)/2}\Vert \nabla f\Vert _\infty . \end{aligned}$$

Therefore, a cubature formula of degree m with an appropriate time partition achieves the error rate \(\mathrm {O}(k^{-(m-1)/2})\) where k is the number of partitions.
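The effect of the graded partition can be seen by evaluating the bracket in Proposition 9 numerically. The sketch below uses hypothetical function names and ignores the constant C and the factor \(\Vert \nabla f\Vert _\infty\); it compares the equally spaced partition (\(\gamma =1\)) with a graded one (\(\gamma =3>m-1\) for \(m=3\)).

```python
def graded_partition(T, k, gamma):
    """t_l = T * (1 - (1 - l/k)^gamma) for l = 0, ..., k."""
    return [T * (1.0 - (1.0 - l / k) ** gamma) for l in range(k + 1)]


def prop9_bracket(t, m):
    """s_k^{1/2} + sum_{l=1}^{k-1} s_l^{(m+1)/2} / (T - t_l)^{m/2} for a partition t."""
    T, k = t[-1], len(t) - 1
    s = [t[l] - t[l - 1] for l in range(1, k + 1)]  # s[l - 1] holds s_l
    total = s[-1] ** 0.5
    for l in range(1, k):
        total += s[l - 1] ** ((m + 1) / 2) / (T - t[l]) ** (m / 2)
    return total


m, T = 3, 1.0
for k in (10, 100, 1000):
    equal = prop9_bracket(graded_partition(T, k, 1.0), m)
    graded = prop9_bracket(graded_partition(T, k, 3.0), m)
    print(k, equal, graded)
```

With the equally spaced partition the bracket decays only like \(k^{-1/2}\), because the last term \(s_k^{1/2}\) and the terms close to T dominate. The graded partition shrinks the steps near T and recovers the rate \(k^{-(m-1)/2}\).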

Remark 2

If we have to compute all the k-fold concatenations of a cubature formula composed of n sample paths, we have to solve \(\frac{n^{k+1}-1}{n-1}\) ODEs in total [19]. When the number of ODEs is too large, we can reduce the computational complexity through Monte Carlo simulation or a subsampling method [17, 25, 32]. In this paper, we do not consider efficient implementation of the concatenation of cubature formulas, but only their construction.

3.3 Known constructions of cubature on Wiener space

We should note that some concrete examples of cubature formulas on Wiener space are already known. The simplest case treated in [19] is \(m=3\), where we have a cubature formula composed of linear paths (i.e., with only one linear segment).

Let \(n:=2^d\) and \(z_1, \ldots , z_n \in \mathbb {R}^d\) be all the elements of \(\{-1, 1\}^d\). Then, paths

$$\begin{aligned} w_i(t) := t(1, z_i) = (t, tz_i^1, \ldots , tz_i^d),\qquad 0\le t\le 1, \quad i=1,\ldots ,n \end{aligned}$$

with weights \(\lambda _1=\cdots =\lambda _n=2^{-d}\) form a cubature formula with \(m=3\). Although \(2^d\) is much larger than \(|\mathscr {A}(3)| = \mathrm {O}(d^3)\), we can reduce the number of paths, e.g., using Carathéodory-Tchakaloff subsampling.
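The degree-3 property of these linear paths can be verified directly. The sketch below relies on two standard facts that are stated here as assumptions rather than taken from the text: a linear path \(t\mapsto tv\) on [0, 1] has iterated integral \(v_{i_1}\cdots v_{i_k}/k!\), and for \(\Vert \alpha \Vert \le 3\) the expected iterated Stratonovich integral over [0, 1] equals 1 for \(\alpha \in \{\emptyset , (0)\}\), 1/2 for \(\alpha =(i,i)\) with \(i\ge 1\), and 0 otherwise. Function names are hypothetical.

```python
from itertools import product
from math import factorial


def weighted_degree(alpha):
    """||alpha|| = |alpha| + (number of entries equal to 0)."""
    return len(alpha) + sum(1 for a in alpha if a == 0)


def brownian_expectation(alpha):
    """Expected iterated Stratonovich integral over [0, 1]; valid for ||alpha|| <= 3 only."""
    if alpha == () or alpha == (0,):
        return 1.0
    if len(alpha) == 2 and alpha[0] == alpha[1] and alpha[0] != 0:
        return 0.5
    return 0.0


def check_degree3(d):
    """Largest violation of the cubature condition over A(3) for the 2^d linear paths."""
    nodes = list(product((-1.0, 1.0), repeat=d))
    weight = 1.0 / len(nodes)
    worst = 0.0
    for k in range(4):
        for alpha in product(range(d + 1), repeat=k):
            if weighted_degree(alpha) > 3:
                continue
            cubature_value = 0.0
            for z in nodes:
                v = (1.0,) + z  # velocity of the path t -> t * (1, z)
                term = 1.0
                for i in alpha:
                    term *= v[i]
                cubature_value += weight * term / factorial(k)
            worst = max(worst, abs(cubature_value - brownian_expectation(alpha)))
    return worst


for d in (1, 2, 3):
    print(d, check_degree3(d))
```

The formula does not extend to degree 4: for the word (1, 1, 1, 1) the linear paths give 1/24, whereas the Brownian expectation is \(\mathrm {E}[B_1^4]/4!=1/8\).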

Constructions for general d and \(m=5\) are also given in [19], where the authors give cubature formulas using only \(\mathrm {O}(d^3)\) paths. Moreover, [8] constructed higher-order cubature formulas up to \(m=11\), but the construction is limited to one-dimensional space-time (\(d=1\)). Ref. [21] presents other concrete examples for \(m=5\) with general d and for the case \((d, m)=(2, 7)\).

All the aforementioned examples are derived by solving equations in terms of Lie algebra (see Sect. 4.1), which can be directly written using the Campbell-Baker-Hausdorff formula. However, as a different approach, we address an optimization-based construction in the next section.

4 Stochastic Tchakaloff’s theorem

In the previous section, we have demonstrated the theoretical support of cubature on Wiener space. However, it is important to know if such formulas can actually be constructed. In this section, we shall state stochastic Tchakaloff’s theorem, which assures the existence of cubature formula on Wiener space. Though the stochastic Tchakaloff’s theorem is originally given in [19], we state it in a stronger way by using the concept of relative interior.

Before doing so, we shall introduce the rich algebraic structures behind the theory of cubature on Wiener space in the following two sections, which are also essential in our proof of stochastic Tchakaloff’s theorem.

4.1 Tensor and Lie algebra

We introduce a tensor algebra which is suitable to our case [15, 18, 19]. Denote \(\mathbb {R}\oplus \mathbb {R}^d\) by E. Define \(U_0(E):=\mathbb {R}\ (=: E^{\otimes 0})\). Let \(A_0:=\mathbb {R}\) and \(A_1:=\mathbb {R}^d\), and define

$$\begin{aligned} U_n(E):=\bigoplus _{ \begin{array}{c} (i_1,\ldots ,i_k)\in \{0, 1\}^k,\\ 2k-(i_1+\cdots +i_k)=n \end{array} } A_{i_1}\otimes \cdots \otimes A_{i_k} \end{aligned}$$

for each positive integer n. Here, the condition for \((i_1,\ldots ,i_k)\) means that \(A_{i_1}\otimes \cdots \otimes A_{i_k}\) takes all the arrangement of \(\mathbb {R}\) and \(\mathbb {R}^d\) such that \(2(\#\text { of }\mathbb {R})+(\#\text { of } \mathbb {R}^d)=n\). Then, we consider the tensor algebra of formal series

$$\begin{aligned} T((E)):= \overline{\bigoplus _{n=0}^\infty U_n} \left( \simeq \overline{\bigoplus _{n=0}^\infty E^{\otimes n}}\, \right) , \end{aligned}$$

where the completion (overline) is taken so that T((E)) consists of series, i.e., T((E)) is the set of all the infinite sequences \((a_n)_{n=0}^\infty\) where \(a_n\in U_n\) for each \(n\ge 0\). Let \(T^{(n)}(E):=\bigoplus _{k=0}^nU_k\), and let \(\pi _n:T((E))\rightarrow T^{(n)}(E)\) be the canonical projection for each \(n\ge 0\). As we are only interested in these projections in practice, we do not need to differentiate the usual tensor algebra from that of series (i.e., its completion) treated here [27].

It might be easier to understand T as the ring of formal power series \(\mathbb {R}[[Z_0, Z_1,\ldots ,Z_d]]\) with noncommutative variables \(Z_0,\ldots ,Z_d\) (see, e.g., [1]). In that case, we redefine the degree of some monomial Y by \(\deg Y:=2\deg _{Z_0}Y + (\deg _{Z_1}Y+\cdots +\deg _{Z_d}Y)\) (where each \(\deg _{Z_i}Y\) denotes the number of \(Z_i\) appearing in Y) and regard \(U_n(E)\) as the subspace spanned by monomials of degree n for each \(n\ge 0\).
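With this grading, the dimension of \(U_n(E)\) is the number of noncommutative monomials of weighted degree n, which satisfies a simple recurrence on the last letter of the monomial. The following Python sketch (function names are ours) counts these monomials and sums them to obtain \(\dim T^{(m)}(E)=|\mathscr {A}(m)|\).

```python
def count_by_degree(d, n_max):
    """a[n] = number of monomials in Z_0, ..., Z_d of weighted degree exactly n."""
    a = [1]  # the empty monomial has degree 0
    for n in range(1, n_max + 1):
        # The last letter is one of Z_1..Z_d (degree 1) or Z_0 (degree 2).
        a.append(d * a[n - 1] + (a[n - 2] if n >= 2 else 0))
    return a


def dim_truncated(d, m):
    """Dimension of T^(m)(E), i.e., the number of multi-indices of degree <= m."""
    return sum(count_by_degree(d, m))
```

For example, `dim_truncated(2, 3)` returns 20 and `dim_truncated(2, 5)` returns 119.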

For any elements \(a=(a_n)_{n=0}^\infty , b=(b_n)_{n=0}^\infty \in T((E))\), we define the sum and product as follows:

$$\begin{aligned} a+b:=(a_n+b_n)_{n=0}^\infty ,\qquad a\otimes b:=\left( \sum _{i=0}^n a_i\otimes b_{n-i}\right) _{n=0}^\infty . \end{aligned}$$

The action by scalar is element-wise. These definitions are straightforward if we consider \(\mathbb {R}[[Z_0,Z_1,\ldots ,Z_d]]\). Moreover, we define the exponential, inverse, and logarithm:

$$\begin{aligned} \exp (a):=\sum _{k=0}^\infty \frac{a^{\otimes k}}{k!}, \quad a^{-1}:=\frac{1}{a_0}\sum _{k=0}^\infty \left( 1-\frac{a}{a_0} \right) ^{\otimes k}, \quad \log a:=\log a_0 - \sum _{k=1}^\infty \frac{1}{k}\left( 1-\frac{a}{a_0} \right) ^{\otimes k}, \end{aligned}$$

where the latter two operations are limited for \(a\in T((E))\) with \(a_0\ne 0\). Note that these operations commute with each projection homomorphism \(\pi _n\).
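To make these operations concrete, here is a minimal Python sketch of the truncated algebra \(T^{(n)}(E)\), under the assumption that an element is stored as a dictionary from words (tuples over \(\{0,\ldots ,d\}\)) to rational coefficients. Only the cases \(a_0=0\) for the exponential and \(a_0=1\) for the logarithm are covered, and the helper names are illustrative.

```python
from fractions import Fraction


def degree(word):
    """Weighted degree: Z_0 counts twice, Z_1, ..., Z_d count once."""
    return len(word) + word.count(0)


def add(a, b, scale=1):
    """Return a + scale * b, dropping zero coefficients."""
    out = dict(a)
    for w, c in b.items():
        out[w] = out.get(w, 0) + scale * c
    return {w: c for w, c in out.items() if c != 0}


def mul(a, b, n):
    """pi_n(a (x) b): concatenate words and discard degrees above n."""
    out = {}
    for u, cu in a.items():
        for v, cv in b.items():
            if degree(u) + degree(v) <= n:
                out[u + v] = out.get(u + v, 0) + cu * cv
    return {w: c for w, c in out.items() if c != 0}


def texp(a, n):
    """pi_n(exp(a)) for a with zero constant term."""
    result = {(): Fraction(1)}
    term = {(): Fraction(1)}
    for k in range(1, n + 1):  # a^{(x)k} has degree >= k, so k <= n suffices
        term = {w: c / k for w, c in mul(term, a, n).items()}
        result = add(result, term)
    return result


def tlog(a, n):
    """pi_n(log(a)) for a with constant term 1."""
    x = add(a, {(): Fraction(1)}, scale=-1)  # x = a - 1
    result, power = {}, {(): Fraction(1)}
    for k in range(1, n + 1):
        power = mul(power, x, n)
        result = add(result, power, scale=Fraction((-1) ** (k + 1), k))
    return result
```

With exact rationals, `tlog(texp(a, n), n)` reproduces `a` whenever all words of `a` have degree at most n, and discarding the terms of degree above 3 from `texp(a, 5)` gives `texp(a, 3)`, which illustrates the compatibility with \(\pi _n\).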

Let us introduce the space of Lie series. Define

$$\begin{aligned} L((E)):= 0\oplus E \oplus [E, E] \oplus [E, [E, E]] \oplus \cdots \subset T((E)) \simeq \overline{\bigoplus _{n=0}^\infty E^{\otimes n}}, \end{aligned}$$

where, for linear subspaces \(A, B\subset T((E))\), \([A, B]\) is the linear subspace of T((E)) spanned by the Lie brackets \([a, b]:=a\otimes b-b\otimes a\) (\(a\in A\), \(b\in B\)). L((E)) is the so-called free Lie algebra generated by E [26]. The elements of L((E)) are called Lie series. We also define \(L^{(n)}(E):=\pi _n(L((E)))\), the elements of which are called Lie polynomials.

4.2 Signature of a path

We shall introduce the signature (or Chen series [4]) of a path, which summarizes the algebraic structure of iterated integrals. Let \(w=(w^0,\ldots ,w^d)\in C_0^0([0, T]; \mathbb {R}\oplus \mathbb {R}^d)\) be a BV path and we define its signature.

Definition 10

For \(0\le s\le t\le T\), define \(S(w)_{s,t}\in T((E)) \simeq \mathbb {R}[[Z_0, Z_1,\ldots ,Z_d]]\) by

$$\begin{aligned} S(w)_{s, t}&:=\sum _{n=0}^\infty \int_{s<t_1<\cdots<t_n<t}\mathrm {d}w(t_1) \otimes \cdots \otimes \mathrm {d}w(t_n)\\&:=\sum _{n=0}^\infty \sum _{(i_1,\ldots ,i_k)\in \mathscr {A}(n)\setminus \mathscr {A}(n-1)} \left( \int_{s<t_1<\cdots<t_k<t}\mathrm {d}w^{i_1}(t_1) \cdots \mathrm {d}w^{i_k}(t_k)\right) Z_{i_1}\cdots Z_{i_k}, \end{aligned}$$

where the integration by \(\mathrm {d}w\) means the Lebesgue–Stieltjes integration. In both presentations, we think of the 0-th (or constant) term of \(S(w)_{s, t}\) as 1. We call \(S(w)_{s, t}\) the signature of w over [s, t].

The following is Chen’s theorem.

Theorem 11

([4, 18]) The process S(w) satisfies \(S(w)_{s,t}\otimes S(w)_{t,u}=S(w)_{s,u}\) for arbitrary \(0\le s\le t\le u\le T\). It also holds that \(\log S(w)_{s,t}\in L((E))\) and therefore \(\pi _n\left( \log S(w)_{s,t}\right) \in L^{(n)}(E)\).

Moreover, the inverse of this correspondence holds, i.e., for an arbitrary Lie polynomial \(\mathscr {L}\in L^{(n)}(E)\subset T(E)\) and arbitrary \(0\le s < t \le T\), there exists a bounded-variation path \(w\in C_0^0([0,T]; \mathbb {R}\oplus \mathbb {R}^d)\) such that \(\pi _n\left( \log S(w)_{s,t}\right) =\mathscr {L}\).

Remark 3

Regarding the latter part, a stronger result is known [7, Theorem 7.28]. Every Lie polynomial can be exactly (not approximately) represented as a (truncated) logarithm of some continuous piecewise linear path with a finite number of linear intervals.

By virtue of these assertions, we see that the problem of finding paths constructing a cubature formula is equivalent to the problem of finding the corresponding Lie polynomials. The following Brownian-motion version of this result is also important.

Proposition 12

([15, 19]) Define the (Stratonovich) signature of the Brownian motion as an element of \(T((E))=\mathbb {R}[[Z_0,Z_1,\ldots ,Z_d]]\) by

$$\begin{aligned} S(B)_{s, t}=\sum _{n=0}^\infty \sum _{(i_1,\ldots ,i_k)\in \mathscr {A}(n)\setminus \mathscr {A}(n-1)} \left(\int_{s<t_1<\cdots<t_k<t}\circ \,\mathrm {d}B^{i_1}_{t_1} \cdots \circ \mathrm {d}B^{i_k}_{t_k}\right) Z_{i_1}\cdots Z_{i_k} \end{aligned}$$

for each \(0\le s\le t\). Then, \(\log S(B)_{s, t}\) is almost surely a Lie series.

As we mainly deal with the signature over [0, 1], hereafter let S(w) and S(B) represent \(S(w)_{0, 1}\) and \(S(B)_{0, 1}\), respectively. We also define for each \(\alpha =(i_1,\ldots ,i_k)\in \mathscr {A}\),

$$\begin{aligned} I^\alpha (w):=\int_{0<t_1<\cdots<t_k<1}\mathrm {d}w^{i_1}(t_1) \cdots \mathrm {d}w^{i_k}(t_k),\quad I^\alpha (B):=\int_{0<t_1<\cdots<t_k<1}\circ \,\mathrm {d}B^{i_1}_{t_1} \cdots \circ \mathrm {d}B^{i_k}_{t_k}. \end{aligned}$$

Note that we set \(I^\alpha (w)=I^\alpha (B)=1\) if \(\alpha =\emptyset\).

Indeed, if we define \(\mathscr {L}:=\pi _n\left( \log S(B)\right)\), we obtain the expression

$$\begin{aligned} \mathrm {E}\left[ \pi _n(S(B))\right] =\mathrm {E}\left[ \pi _n(\exp \mathscr {L})\right] . \end{aligned}$$

As \(\mathscr {L}\) is a random Lie polynomial from the previous assertion, roughly speaking, the generalized Tchakaloff’s theorem (Theorem 1) and the inverse statement in Theorem 11 yield the existence of a cubature formula on Wiener space. However, we should point out that the surjectivity stated in Theorem 11 fails if we require that \(w^0(t)\) is monotone (so the original proof of stochastic Tchakaloff’s theorem in [19] should be modified).

Proposition 13

Let \(n\ge 4\). Then, there exists a Lie polynomial \(\mathscr {L}\in L^{(n)}(\mathbb {R}\oplus \mathbb {R})\) such that \(\pi _n(\exp \mathscr {L})\) cannot be expressed as \(\pi _n(S(w)_{s, t})\) for any BV path \(w\in C_0^0([0, T];\, \mathbb {R}\oplus \mathbb {R})\) with strictly monotone \(w^0\).

Proof

Consider an \(\mathbb {R}\oplus \mathbb {R}\)-valued continuous BV path \(w=(w^0, w^1)\) on [0, T] that starts at the origin with \(w^0\) strictly increasing. We have

$$\begin{aligned} S(w)^{(1,1,0)}_{0, T}=\int_{0<t_1<t_2<t_3<T}\,\mathrm {d}w^1(t_1) \,\mathrm {d}w^1(t_2) \,\mathrm {d}w^0(t_3) =\int_0^T \frac{w^1(t)^2}{2} \,\mathrm {d}w^0(t). \end{aligned}$$

Because \(w^0\) is strictly increasing, there exists a derivative \(\frac{\mathrm {d}w^0(t)}{\mathrm {d}t} \in L^1([0, T])\) that is positive almost everywhere on [0, T]. Therefore, if \(S(w)^{(1,1,0)}_{0, T}=0\) holds, then \(w^1\) is zero almost everywhere and so \(S(w)^{(1,0)}_{0, T}=0\) holds in particular. We have the same conclusion for strictly decreasing \(w^0\), so we have

$$\begin{aligned} S(w)^{(1,1,0)}_{0, T} = 0 \quad \Longrightarrow \quad S(w)^{(1, 0)}_{0, T} = 0 \end{aligned}$$

for each \(w=(w^0, w^1)\) with strictly monotone \(w^0\).

Let \(e_0, e_1 \in \mathbb {R}\oplus \mathbb {R}\) be the standard basis. If we consider

$$\begin{aligned} Y:=\exp (e_0+[e_0, e_1]) = \sum _{k=0}^\infty \frac{1}{k!}(e_0 + e_0\otimes e_1 - e_1\otimes e_0)^{\otimes k}\in T((\mathbb {R}\oplus \mathbb {R})), \end{aligned}$$

then its coefficient of \(e_1\otimes e_1\otimes e_0\) is obviously zero, whereas that of \(e_1\otimes e_0\) is \(-1\). As \(e_0+[e_0, e_1]\) is clearly a Lie polynomial, the proof is complete. \(\square\)
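The two coefficients used at the end of this proof can be checked mechanically. The following Python sketch expands the exponential of \(e_0+[e_0, e_1]\) up to tensor length 3, assuming that tensors are stored as dictionaries from words to rational coefficients (with 0 and 1 standing for \(e_0\) and \(e_1\)); the helper names are illustrative.

```python
from fractions import Fraction


def mul(a, b, max_len):
    """Tensor product of a and b, keeping words of length at most max_len."""
    out = {}
    for u, cu in a.items():
        for v, cv in b.items():
            if len(u) + len(v) <= max_len:
                out[u + v] = out.get(u + v, 0) + cu * cv
    return out


def texp(a, max_len):
    """Exponential of a (zero constant term), truncated at word length max_len."""
    result = {(): Fraction(1)}
    term = {(): Fraction(1)}
    for k in range(1, max_len + 1):  # the k-th power only has words of length >= k
        term = {w: c / k for w, c in mul(term, a, max_len).items()}
        for w, c in term.items():
            result[w] = result.get(w, 0) + c
    return result


# e_0 + [e_0, e_1] = e_0 + e_0 (x) e_1 - e_1 (x) e_0
lie = {(0,): Fraction(1), (0, 1): Fraction(1), (1, 0): Fraction(-1)}
Y = texp(lie, 3)
coef_110 = Y.get((1, 1, 0), Fraction(0))  # coefficient of e_1 (x) e_1 (x) e_0
coef_10 = Y.get((1, 0), Fraction(0))      # coefficient of e_1 (x) e_0
```

Here `coef_110` is 0 and `coef_10` is -1, as claimed in the proof.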

Although the surjectivity fails, we can actually prove the existence of a cubature formula with \(w^0(t) = t\) in the following section. We use the following well-known approximation statement for the Brownian motion.

Proposition 14

Let n be a positive integer and B be a d-dimensional Brownian motion. Then, with probability one, the sequence of piecewise linear paths \(w_1, w_2, \cdots \in C_0^0([0, 1]; \mathbb {R}\oplus \mathbb {R}^d)\) given by linearly interpolating \(w_k(j/2^k)= B(j/2^k)\) for \(j = 0, 1, \ldots , 2^k\) satisfies

$$\begin{aligned} \pi _n(S(w_k)) \rightarrow \pi _n(S(B)),\qquad k \rightarrow \infty . \end{aligned}$$

Proof

We only give a sketch as we use arguments based on rough paths. If there is no time term (the zero-th entry of the path) and \(n=2\), then the result follows from the well-known dyadic piecewise-linear approximation for the Brownian rough path (see [6, Proposition 3.6] or [7, Proposition 13.18]). Adding the time (zero-th entry) is not difficult as it is sufficiently smooth and does not affect the regularity of the rough path. To generalize n from \(n=2\), it suffices to observe the continuity of the “Lyons lift” (also see [7, Chapter 9]). \(\square\)

4.3 Proof of stochastic Tchakaloff’s theorem

Throughout the section, we fix a positive integer m and consider elements in \(T^{(m)}(E)\). Note that \(T^{(m)}(E)\) can naturally be regarded in the same light as \(F:=\mathbb {R}^{\mathscr {A}(m)}\). Define a set G (as a subset of F) by

$$\begin{aligned} G:= \{S(w)\mid w\in C_0^0([0, 1];\,\mathbb {R}\oplus \mathbb {R}^d)\ \text {is a BV path}, \ w^0(1)=1\} \end{aligned}$$

From Theorem 11, this coincides with the set of \(\exp (\mathscr {L})\), where \(\mathscr {L}\) is a Lie polynomial such that the coefficient of \(Z_0\) is 1.

We denote the distribution of S(B) over F by \(\mathrm {P}_{S(B)}\). We shall argue the relation of G and \({{\,\mathrm{supp}\,}}\mathrm {P}_{S(B)}\) in the following.

Proposition 15

It holds that \(\mathrm{aff}{{\,\mathrm{supp}\,}}\mathrm {P}_{S(B)}=\mathrm{aff}G\).

Proof

From Proposition 12, \({{\,\mathrm{supp}\,}}\mathrm {P}_{S(B)}\subset \mathrm{aff}G\) holds (as \(\mathrm{aff}G\) includes the closure of G). Therefore, it is sufficient to show \(G\subset \mathrm{aff}{{\,\mathrm{supp}\,}}\mathrm {P}_{S(B)}\).

As \(\mathrm{aff}{{\,\mathrm{supp}\,}}\mathrm {P}_{S(B)}\) is the intersection of all the hyperplanes that include \({{\,\mathrm{supp}\,}}\mathrm {P}_{S(B)}\), it can be represented as

$$\begin{aligned} \mathrm{aff}{{\,\mathrm{supp}\,}}\mathrm {P}_{S(B)} =\bigcap _{({\varvec{c}}, d)\in H}\{ \varvec{v}\in F \mid \varvec{c}^\top \varvec{v}=d \}, \end{aligned}$$

where H is the family of all \((\varvec{c}, d)\in F\times \mathbb {R}\) such that \(\varvec{c}^\top S(B)=d\) holds almost surely. The problem is now reduced to the statement

$$\begin{aligned} \varvec{c}^\top S(B)=d\ \ \text {a.s.}\quad \Longrightarrow \quad \varvec{c}^\top S(w)=d \end{aligned}$$

for every w appearing in the definition of G. This results from the following lemma as \(\pi _0(S(B))=\pi _0(S(w))=1\) always holds. \(\square\)

The following is the key lemma in the above proof. We give its proof in the appendix as it is elementary.

Lemma 16

Let \((c_\alpha )_{\alpha \in \mathscr {A}} \in \mathbb {R}^{\mathscr {A}}\) be a vector all but finitely many of whose entries are zero. Then, if \(\sum _{\alpha \in \mathscr {A}}c_\alpha I^\alpha (B)=0\) holds almost surely,

$$\begin{aligned} \sum _{\alpha \in \mathscr {A}}c_\alpha I^\alpha (w)=0 \end{aligned}$$

holds for every bounded-variation path \(w\in C_0^0([0, 1]; \mathbb {R}\oplus \mathbb {R}^d)\) with \(w^0(1)=1\).

The following is a stochastic version of Tchakaloff’s theorem, which assures the existence of cubature formulas on Wiener space. It is stated in a slightly stronger form than [19, Theorem 2.4], wherein the “relative interior” did not appear. Note that the time interval considered hereafter is [0, 1].

Theorem 17

Let m be a positive integer. There exist n BV paths \(w_1,\ldots ,w_n \in C_0^0([0, 1]; \mathbb {R}\oplus \mathbb {R}^d)\) and n positive weights \(\lambda _1,\ldots ,\lambda _n\) whose sum is 1 that satisfy \(n\le |\mathscr {A}(m)|\) and

$$\begin{aligned} \mathrm {E}\left[ \pi _m(S(B))\right] =\sum _{i=1}^n \lambda _i \pi _m(S(w_i)). \end{aligned}$$

Moreover, if we loosen the condition to be \(n\le 2|\mathscr {A}(m)|\), \(w_1, \ldots , w_n\) can be taken such that \(\pi _m(S(w_i))\) is contained in G for each i, \(\mathrm{aff}\{ \pi _m(S(w_1)), \ldots , \pi _m(S(w_n))\}=\mathrm{aff}G\), and

$$\begin{aligned} \mathrm {E}\left[ \pi _m(S(B))\right] \in {{\,\mathrm{ri}\,}}{{\,\mathrm{conv}\,}}\{\pi _m(S(w_1)), \ldots , \pi _m(S(w_n))\}. \end{aligned}$$

Proof

By virtue of Carathéodory’s theorem, the former part follows from the latter part. We here show the latter part.

From Theorem 3, (3) and Proposition 15, we can find n Lie polynomials \(\mathscr {L}_1,\ldots , \mathscr {L}_n\) such that each \(\pi _m(\exp \mathscr {L}_i)\) is contained in G, and \(\mathrm {E}\left[ \pi _m(S(B))\right]\) is contained in the relative interior of their convex hull. Here, n can actually be taken such that \(n\le 2\dim G \ (\le 2 |\mathscr {A}(m)|)\) because of Theorem 3. From the correspondence stated in Chen’s theorem (Theorem 11), we can find a desired set of paths in \(C_0^0([0, 1]; \mathbb {R}\oplus \mathbb {R}^d)\). Note that the condition \(\pi _m(\exp \mathscr {L}_i)\subset G\) implies that the corresponding path satisfies \(w_i^0(1)=1\). \(\square\)

Remark 4

By exploiting Proposition 14, we can also prove the same result even if we require \(w_i^0(t)=t\) for each \(i=1,\ldots ,n\) and \(0\le t\le 1\). Although Proposition 14 only asserts an approximation result, the “relative interior” argument fills the gap.

5 Monte Carlo approach to cubature on Wiener space

In this section, we investigate a way to construct cubature formulas, which is based on mathematical optimization instead of Lie algebra. We limit the arguments to cubature formula composed of continuous piecewise linear paths, and propose a construction based on Monte Carlo sampling, which is the application of [10] to our case. We also carry out numerical experiments in concrete cases.

Although existing constructions of cubature formulas on Wiener space are based on Lie-algebraic equations, we can simply regard the cubature construction as an optimization problem. One such way is to consider an LP problem, which is analogous to ordinary cubature problems treated in Sect. 2.2. From this viewpoint, we can naively generate many sample paths and then reduce their number by using Carathéodory-Tchakaloff subsampling. We later see that this approach is applicable at least theoretically (Sect. 5.2).

5.1 Signature of continuous BV paths

In this section, we see the properties of BV paths and their signature. We also see that the truncated signature of continuous BV paths can be approximated with any accuracy by that of piecewise linear paths.

Let \(w=(w^0, w^1, \ldots , w^d)\in C_0^0([0, 1]; \mathbb {R}\oplus \mathbb {R}^d)\) be a path. We define the total variation of w as

$$\begin{aligned} \Vert w\Vert _1 :=\sup _{\varDelta } \sum _{i=1}^k\max _{0\le j\le d} |w^j(t_i)-w^j(t_{i-1})| \left( =\sup _{\varDelta }\sum _{i=1}^k\Vert w(t_i)-w(t_{i-1})\Vert _{\infty } \right) , \end{aligned}$$
(10)

where \(\varDelta\) is the partition of [0, 1] by \(0=t_0<t_1<\cdots <t_k=1\) and k varies in \(\sup _\varDelta\). We call w a BV path if \(\Vert w\Vert _1<\infty\) holds. Note that, although we use the sup-norm \(\Vert \cdot \Vert _\infty\) of \(\mathbb {R}\oplus \mathbb {R}^d\), any other norm gives an equivalent definition as the space \(\mathbb {R}\oplus \mathbb {R}^d\) is finite-dimensional.

We can reparameterize w so that it becomes Lipschitz continuous if necessary. Indeed, if we let \(\Vert w|_{[s, t]}\Vert _1\) be the total variation of w over [s, t] and

$$\begin{aligned} \tau (t):=\frac{\Vert w|_{[0, t]}\Vert _1}{\Vert w\Vert _1} \end{aligned}$$

for a nonconstant w, then \(w\circ \tau\) is a well-defined Lipschitz path (\(\tau\) becomes a nondecreasing function onto [0, 1]). It is also important that the signature is invariant under this reparametrization (see, e.g., [7, Proposition 1.42 and 7.10]; note that \(w^0(t)=t\) might be lost then even if the original path satisfies it).

Hereafter, we may assume that there exist \(d+1\) derivative functions \(f^0, f^1, \ldots , f^d \in L^\infty ([0, 1]; \mathbb {R})\) such that

$$\begin{aligned} w^j(t)=\int_0^t f^j(s) \,\mathrm {d}s, \qquad t\in [0, 1],\ j=0,1,\ldots ,d. \end{aligned}$$

In this case, the total variation of w can be written as

$$\begin{aligned} \Vert w\Vert _1=\int_0^1 \max _{0\le j\le d}|f^j(s)|\,\mathrm {d}s. \end{aligned}$$
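For a piecewise linear path, the derivatives \(f^j\) are constant on each linear segment, so the integral above reduces to a finite sum of the sup-norms of the segment increments. A minimal Python sketch, assuming that the path is given as a list of increment vectors in \(\mathbb {R}\oplus \mathbb {R}^d\) (one per segment; the function name is ours):

```python
def total_variation(increments):
    """Total variation (sup-norm version) of a piecewise linear path.

    increments[j] is the vector w(s_{j+1}) - w(s_j) of the j-th linear segment.
    """
    return sum(max(abs(x) for x in inc) for inc in increments)
```

For instance, two segments with increments (0.5, 1.0) and (0.5, -2.0) give a total variation of 3.0.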

Signature can also be represented by the derivatives as

$$\begin{aligned} I^\alpha (w)&=\int_{0<t_1<\cdots<t_k<1}\mathrm {d}w^{i_1}(t_1)\cdots \mathrm {d}w^{i_k}(t_k)\\&=\int_{0<t_1<\cdots<t_k<1}f^{i_1}(t_1)\cdots f^{i_k}(t_k) \,\mathrm {d}t_1\cdots \mathrm {d}t_k \end{aligned}$$

for each multiindex \(\alpha =(i_1,\ldots ,i_k)\in \mathscr {A}\).

As a special case of BV paths, we are interested in (continuous) piecewise linear paths, which are easy to implement on computers. Let \(0=s_0<s_1<\cdots <s_n=1\) be a partition of [0, 1]. Then, we can define a path \(w\in C_0^0([0, 1]; \mathbb {R}\oplus \mathbb {R}^d)\) which is linear on each interval \([s_{j-1}, s_j]\) (\(j=1, \ldots , n\)) by determining the slope vector in \(\mathbb {R}\oplus \mathbb {R}^d\) at each interval. For the sake of computation, here we give the calculation of the signature explicitly. Let \(g_j=(g^0_j, g^1_j, \ldots , g^d_j)\) be the slope of w over \((s_{j-1}, s_j)\). Then, for \(\alpha =(i_1,\ldots ,i_k)\in \mathscr {A}\),

$$\begin{aligned} I^\alpha (w) =\sum _{\begin{array}{c} 1\le \ell _1\le \cdots \le \ell _{n-1}\le k+1\\ \ell _0=1,\ \ell _n=k+1 \end{array}} \prod _{j=1}^n \frac{(s_j-s_{j-1})^{\ell _j-\ell _{j-1}}}{(\ell _j-\ell _{j-1})!} \prod _{\ell =\ell _{j-1}}^{\ell _j-1} g_j^{i_\ell } \end{aligned}$$
(11)

holds. We can derive this by dividing the integral domain into disjoint segments, which are compatible with the partition \(0=s_0<s_1<\cdots <s_n=1\). If we adopt the notation \(g_j^\alpha :=g_j^{i_1}\cdots g_j^{i_k}\) for each \(\alpha \in \mathscr {A}\), \(I^\alpha (w)\) can also be written as

$$\begin{aligned} I^\alpha (w) =\sum _{\alpha _1*\cdots *\alpha _n=\alpha } \prod _{j=1}^n \frac{(s_j-s_{j-1})^{|\alpha _j|}}{|\alpha _j|!} g_j^{\alpha _j}. \end{aligned}$$

The latter expression can also be easily derived from Chen’s theorem (Theorem 11).
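As a worked instance, take \(n=2\) segments and \(\alpha =(i_1, i_2)\), and write \(\varDelta _j:=s_j-s_{j-1}\) (a shorthand introduced only for this example). The multi-index \(\alpha\) splits into two consecutive pieces in three ways, namely \((\alpha , \emptyset )\), \(((i_1), (i_2))\), and \((\emptyset , \alpha )\), so that

$$\begin{aligned} I^{(i_1, i_2)}(w) =\frac{\varDelta _1^2}{2}\,g_1^{i_1}g_1^{i_2} +\varDelta _1\varDelta _2\,g_1^{i_1}g_2^{i_2} +\frac{\varDelta _2^2}{2}\,g_2^{i_1}g_2^{i_2}. \end{aligned}$$

The first and third terms correspond to both integration variables lying in the same segment, and the middle term to \(t_1\) lying in the first segment and \(t_2\) in the second.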

5.2 Piecewise linear cubature

The following theorem assures the existence of a cubature formula on Wiener space composed of continuous piecewise linear paths.

Theorem 18

For each positive integer m, there exist n piecewise linear paths \(w_1,\ldots , w_n\in C_0^0([0, 1]; \mathbb {R}\oplus \mathbb {R}^d)\) and n positive weights \(\lambda _1, \ldots , \lambda _n\) whose sum is 1 that satisfy \(n\le |\mathscr {A}(m)|\) and

$$\begin{aligned} \mathrm {E}\left[ \pi _m(S(B))\right] =\sum _{i=1}^n \lambda _i\pi _m(S(w_i)). \end{aligned}$$

The statement still holds even if we require \(w_i^0(t)=t\) for \(i=1,\ldots ,n\) and \(0\le t\le 1\).

Proof

From Theorem 17 and Remark 3 (also from Proposition 14 if we want \(w^0(t) = t\)), we can easily deduce that there exists a set of at most \(2|\mathscr {A}(m)|\) continuous piecewise linear paths such that the convex hull of their truncated (by \(\pi _m\)) signatures contains \(\mathrm {E}\left[ \pi _m(S(B))\right]\) (in its relative interior). Rigorously, we can apply the same argument as in the proof of Theorem 4. Finally, by Carathéodory’s theorem, \(n\le |\mathscr {A}(m)|\) can actually be achieved. \(\square\)

Based on this theorem, it is sufficient for us to look for cubature formulas among piecewise linear paths. Our approach to constructing a piecewise linear cubature is an application of “Monte Carlo cubature construction” [10]. Of course, we are not able to generate a Brownian motion and use it as a candidate for sample points of cubature formulas, because it is not a BV path and it cannot be implemented on computers anyway. However, the methods in [10] are still applicable here, as we see in the following proposition.

Proposition 19

Let m be a positive integer. Then, for a sufficiently large M (the lower bound of M depends on m), the following statement holds:

Let \(w_1, w_2, \ldots\) be a sequence of independent and identically distributed continuous piecewise linear paths. Assume also that each \(w_i\) satisfies the following:

  • \(w_i\) is a path that linearly connects points \(w_i(k/M)\) \((0\le k\le M)\);

  • one of (a) and (b) holds:

    1. (a)

      \(w_i^0(t)=t\) \((0\le t\le 1)\) holds;

    2. (b)

      \(w_i^0(0)=0\) and \(w_i^0(1)=1\) hold, and \(w_i^0(\frac{k}{M})-w_i^0(\frac{k-1}{M})\) \((1\le k\le M-1)\) are independent random variables and have a density on \(\mathbb {R}\) which is positive almost everywhere;

  • random variables \(w_i^j(\frac{k}{M})-w_i^j(\frac{k-1}{M})\) \((1\le j \le d,\ 1\le k\le M)\) are independent (also from the ones of zero-th coordinate) and have a density on \(\mathbb {R}\), which is positive almost everywhere.

Then, with probability one, there exists an N such that a subset of \(\{w_1, w_2,\ldots ,w_N\}\) can construct a cubature on Wiener space of degree m.

Proof

Take M so large that we can find (at most) \(2|\mathscr {A}(m)|\) piecewise linear paths (denoted by \(\tilde{w}_\ell\)) with at most M linear segments such that \({{\,\mathrm{conv}\,}}\{\pi _m(S(\tilde{w}_\ell ))\}_\ell\) contains \(\mathrm {E}\left[ \pi _m(S(B))\right]\) in its relative interior. The existence of such an M is assured by the proof of Theorem 18 (or see Remark 3, Theorem 17, and Remark 4).

From (11), the truncated signature of each \(w_i\) is a polynomial of random variables \(w_i^j(\frac{k}{M})-w_i^j(\frac{k-1}{M})\) \((0\le j \le d,\ 1\le k\le M,\ (j, k)\ne (0, M))\). Our assumption assures that these variables take values in every neighborhood of some point with a positive probability. In particular, it implies that

$$\begin{aligned} \mathrm {P}\left( \Vert \pi _m(S(\tilde{w}_\ell ))-\pi _m(S(w_i))\Vert <\varepsilon \right) >0, \end{aligned}$$

where \(\Vert \cdot \Vert\) is the Euclidean norm on \(T^{(m)}(E)\simeq F\), holds for each \(i, \ell\) and \(\varepsilon >0\). Note that the left-hand side probability does not depend on i by the i.i.d. assumption. Therefore, the argument in the proof of Theorem 4 holds here again and we obtain the desired assertion. \(\square\)

Remark 5

For the scheme (b), the condition \(w_i^0(1)=1\) is necessary to assure that each \(\pi _m(S(w_i))\) is contained in G defined in Sect. 4.3.

Note that the generation rule of sample paths in this proposition is just one of infinitely many possible examples. We may alternatively, for instance, directly generate \(w_i^j(k/M)\) independently.

Proposition 19 assures only the existence of “some large” M and N, so it might be numerically hard to find cubature formulas with this approach. However, it is beneficial to know that the construction can be reduced, at least in principle, to a matter of computational power.

5.3 Numerical experiments

To explain our simple numerical method based on a Monte Carlo approach, we first describe the algorithm for computing the signature of piecewise linear paths, and then present our Monte Carlo approach and its results for some pairs (d, m).

Calculation of signature. Note that hereafter d and m are regarded as given parameters. For positive integers M and N, we generate N piecewise linear paths (denoted by \(w_1, \ldots , w_N\)) with M intervals of time (see Proposition 19). The time complexity of this path generation is \(\mathrm {O}(NMd)\). Then, we address each component of the generated paths by

$$\begin{aligned} \text {PATH}[i, j, k]: = w_i^k\left( \frac{j}{M}\right) - w_i^k\left( \frac{j-1}{M}\right) \qquad (1\le i\le N,\ 1\le j\le M,\ 0\le k\le d) . \end{aligned}$$

In all the experiments, we generated \(\text {PATH}[i,j,k]\) (with \(k\ne 0\)) so that it follows the centered normal distribution of variance 1/M. We set \(\text {PATH}[i,j,0]=1/M\) for each i, j and denote \(w_i\) by just writing \(\text {PATH}[i]\) for each i.
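The sampling rule just described can be sketched in Python as follows. This is only an illustration using the standard library: the function name, the `seed` argument, and the 0-based indexing (so that `paths[i][j][k]` plays the role of \(\text {PATH}[i+1, j+1, k]\)) are our own choices.

```python
import random


def generate_paths(N, M, d, seed=None):
    """Sample N piecewise linear paths with M segments, stored as increments.

    paths[i][j][0] = 1/M is the time increment, and paths[i][j][k] for k >= 1
    is a centered normal increment with variance 1/M.
    """
    rng = random.Random(seed)
    sigma = (1.0 / M) ** 0.5  # standard deviation of each spatial increment
    return [[[1.0 / M] + [rng.gauss(0.0, sigma) for _ in range(d)]
             for _ in range(M)]
            for _ in range(N)]
```

The cost is one normal variate per spatial component, in line with the \(\mathrm {O}(NMd)\) complexity stated above.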

As we consider not so large M in this study, we calculate the signature of generated paths by a simple dynamic programming (Algorithm 1). In the algorithm, we calculate the signature of \(w_i\) over [0, k/M] for \(k=1,\ldots , M\), by using the expression (11). The time complexity of this algorithm is \(\mathrm {O}(M|\mathscr {A}(m)|^2)\), though a pruning of possible multiindices \((\alpha , \beta )\) helps a bit.

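[Algorithm 1: dynamic programming for the truncated signature of a piecewise linear path (listing not reproduced here)]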
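Since the listing is not reproduced here, the following Python sketch shows one possible dynamic programming of this kind. It is not necessarily identical to Algorithm 1: it extends the signature over [0, (k-1)/M] to [0, k/M] through Chen's relation (Theorem 11), splitting each multi-index into a prefix integrated over the past and a suffix integrated over the new linear segment. The data layout (a dictionary indexed by tuples) and all names are our assumptions.

```python
from math import factorial


def words_up_to(d, m):
    """Multi-indices over {0, ..., d} with len + (#zeros) <= m."""
    words, frontier = [()], [()]
    while frontier:
        frontier = [w + (i,) for w in frontier for i in range(d + 1)
                    if len(w) + 1 + w.count(0) + (i == 0) <= m]
        words += frontier
    return words


def segment_signature(inc, words):
    """Signature of one linear segment with increment inc: inc^beta / |beta|!."""
    out = {}
    for w in words:
        c = 1.0 / factorial(len(w))
        for i in w:
            c *= inc[i]
        out[w] = c
    return out


def signature(increments, d, m):
    """Truncated signature (degree <= m) of a piecewise linear path."""
    words = words_up_to(d, m)
    sig = {w: (1.0 if w == () else 0.0) for w in words}  # trivial path
    for inc in increments:
        seg = segment_signature(inc, words)
        # Chen's relation: every word w is cut into prefix w[:p] and suffix w[p:].
        sig = {w: sum(sig[w[:p]] * seg[w[p:]] for p in range(len(w) + 1))
               for w in words}
    return sig
```

Calling `signature(PATH_i, d, m)`, where `PATH_i` is the list of the M increment vectors of one path, returns all \(I^\alpha (w_i)\) with \(\alpha \in \mathscr {A}(m)\). Prefixes and suffixes of an admissible multi-index are again admissible, so every dictionary lookup is well defined.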

Monte Carlo approach. In the approach based on Monte Carlo sampling, we simply generate many paths and determine, by solving an LP problem, whether or not we can construct a cubature formula of the desired degree from the generated paths.

The part of solving an LP problem was performed using IBM ILOG CPLEX Optimization Studio (https://www.ibm.com/analytics/cplex-optimizer, version 12.10; CPLEX hereafter). From Proposition 19, for a sufficiently large N we can construct a cubature formula using a subset of paths \(\{w_1, \ldots , w_N\}\).
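The feasibility question behind this LP can be sketched as follows. Note that this illustration swaps in SciPy's `linprog` with the HiGHS backend instead of CPLEX, uses a zero objective, and assumes that the truncated signatures of the candidate paths and the target vector \(\mathrm {E}\left[ \pi _m(S(B))\right]\) have already been computed; the function name and array layout are ours.

```python
import numpy as np
from scipy.optimize import linprog


def cubature_weights(signatures, target):
    """Look for nonnegative weights reproducing the target signature.

    signatures: array of shape (N, |A(m)|); row i is pi_m(S(w_i)).
    target:     array of shape (|A(m)|,), the expected truncated signature.
    Returns the weight vector, or None if the LP is infeasible.
    """
    n_paths = signatures.shape[0]
    res = linprog(c=np.zeros(n_paths),       # pure feasibility problem
                  A_eq=signatures.T, b_eq=target,
                  bounds=(0, None), method="highs")
    return res.x if res.status == 0 else None
```

The constraint that the weights sum to 1 does not need to be added separately: the coordinate of the empty multi-index equals 1 for every path and for the target. Paths with a strictly positive weight in the returned vector then form the cubature formula.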

We conducted experiments for six cases \((d, m) = (2, 3), (3, 3), (4, 3), (2, 5), (3, 5), (2, 7)\) and for each (d, m), set \(N=2|\mathscr {A}(m)|, 4|\mathscr {A}(m)|, 8|\mathscr {A}(m)|\), \(M=2, 4, 8, 16, 32\) and examined if we could construct a cubature formula by using CPLEX (note that \(|\mathscr {A}(m)|\) depends on d). The following tables show how many times out of 10 trials we successfully obtained a cubature formula. Blanks in the tables mean that the corresponding experiments were not performed because we already got 10 successes out of 10 with a smaller N.

Table 1 \((d, m)=(2, 3)\), \(|\mathscr {A}(m)|=20\)
Table 2 \((d, m)=(3, 3)\), \(|\mathscr {A}(m)|=47\)
Table 3 \((d, m)=(4, 3)\), \(|\mathscr {A}(m)|=94\)
Table 4 \((d, m)=(2, 5)\), \(|\mathscr {A}(m)|=119\)
Table 5 \((d, m)=(3, 5)\), \(|\mathscr {A}(m)|=516\)
Table 6 \((d, m)=(2, 7)\), \(|\mathscr {A}(m)|=696\)

From these results, one may expect that a change in m is more essential than a change in d, in that we need a larger number of partitions and a larger ratio \(N/|\mathscr {A}(m)|\) as m gets larger, though more experiments are necessary.

6 Concluding remarks

In this paper, we have demonstrated that piecewise linear cubature formulas can be constructed on Wiener space through Monte Carlo sampling and an LP problem. Our construction is supported by our technical contribution, which extends stochastic Tchakaloff’s theorem using our characterization of the distribution of Stratonovich iterated integrals. We confirmed in numerical experiments that our algorithm actually works for small pairs (d, m).

Although we have shown that one can theoretically construct cubature formulas of any dimension and degree, the number of paths used in our construction only attains the Tchakaloff bound, and therefore it requires too much computational cost for large (d, m) in practice. Therefore, we may consider reducing the number of paths by using additional optimization techniques.