
Polynomial diffusions and applications in finance

Published in: Finance and Stochastics

Abstract

This paper provides the mathematical foundation for polynomial diffusions. They play an important role in a growing range of applications in finance, including financial market models for interest rates, credit risk, stochastic volatility, commodities and electricity. Uniqueness of polynomial diffusions is established via moment determinacy in combination with pathwise uniqueness. Existence boils down to a stochastic invariance problem that we solve for semialgebraic state spaces. Examples include the unit ball, the product of the unit cube and nonnegative orthant, and the unit simplex.


Notes

  1. We thank Mykhaylo Shkolnikov for suggesting a way to improve an earlier version of this result.

  2. For geometric Brownian motion, there is a more fundamental reason to expect that uniqueness cannot be proved via the moment problem: it is well known that the lognormal distribution is not determined by its moments; see Heyde [29]. It thus becomes natural to pose the following question: Can one find a process \(Y\), essentially different from geometric Brownian motion, such that all joint moments of all finite-dimensional marginal distributions,

    $$ {\mathbb {E}}[Y_{t_{1}}^{\alpha_{1}} \cdots Y_{t_{m}}^{\alpha_{m}}], \qquad m\in{\mathbb {N}}, (\alpha _{1},\ldots,\alpha_{m})\in{\mathbb {N}}^{m}, 0\le t_{1}< \cdots< t_{m}< \infty, $$

    coincide with those of geometric Brownian motion? We have not been able to exhibit such a process. Note that any such \(Y\) must possess a continuous version. Indeed, the known formulas for the moments of the lognormal distribution imply that for each \(T\ge0\), there is a constant \(c=c(T)\) such that \({\mathbb {E}}[(Y_{t}-Y_{s})^{4}] \le c(t-s)^{2}\) for all \(s\le t\le T, |t-s|\le1\), whence Kolmogorov’s continuity lemma implies that \(Y\) has a continuous version; see Rogers and Williams [42, Theorem I.25.2].
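
    As a numerical aside (ours, not part of the original footnote), Heyde's construction can be checked directly: the perturbed densities \(f(x)(1+\varepsilon\sin(2\pi\log x))\), \(|\varepsilon|\le1\), share all moments with the standard lognormal density \(f\). A minimal sketch, assuming NumPy and SciPy are available:

        # Check Heyde's example [29]: perturbing the lognormal density by
        # eps*sin(2*pi*log x) leaves every moment E[X^n] unchanged.
        import numpy as np
        from scipy.integrate import quad

        def moment(n, eps):
            # Substitute y = log x, so E[X^n] = int e^{n y} phi(y) (1 + eps*sin(2 pi y)) dy.
            phi = lambda y: np.exp(-0.5 * y ** 2) / np.sqrt(2 * np.pi)
            val, _ = quad(lambda y: np.exp(n * y) * phi(y) * (1 + eps * np.sin(2 * np.pi * y)),
                          -np.inf, np.inf)
            return val

        for n in range(5):
            print(n, np.exp(n ** 2 / 2), moment(n, eps=0.5))  # the two columns agree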

  3. Note that unlike many other results in that paper, Proposition 2 in Bakry and Émery [4] does not require \(\widehat{\mathcal {G}}\) to leave \(C^{\infty}_{c}(E_{0})\) invariant, and is thus applicable in our setting.

  4. Details regarding stochastic calculus on stochastic intervals are available in Maisonneuve [36]; see also Mayerhofer et al. [37], Carr et al. [7], Larsson and Ruf [34].

  5. A matrix \(A\) is called strictly diagonally dominant if \(|A_{ii}|>\sum_{j\ne i}|A_{ij}|\) for all \(i\); see Horn and Johnson [30, Definition 6.1.9].
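
    For concreteness, a direct numerical rendering of this definition (our sketch, not from the paper):

        # Strict diagonal dominance: |A_ii| > sum_{j != i} |A_ij| for every row i.
        import numpy as np

        def is_strictly_diagonally_dominant(A):
            A = np.abs(np.asarray(A, dtype=float))
            diag = np.diag(A)
            return bool(np.all(diag > A.sum(axis=1) - diag))

        print(is_strictly_diagonally_dominant([[3, 1, 1], [0, 2, 1], [1, 1, 4]]))  # True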

References

  1. Ackerer, D., Filipović, D.: Linear credit risk models. Swiss Finance Institute Research Paper No. 16-34 (2016). Available online at http://ssrn.com/abstract=2782455

  2. Ackerer, D., Filipović, D., Pulido, S.: The Jacobi stochastic volatility model. Swiss Finance Institute Research Paper No. 16-35 (2016). Available online at http://ssrn.com/abstract=2782486

  3. Akhiezer, N.I.: The Classical Moment Problem and Some Related Questions in Analysis. Oliver & Boyd, Edinburgh (1965)

  4. Bakry, D., Émery, M.: Diffusions hypercontractives. In: Yor, M., Azéma, J. (eds.) Séminaire de Probabilités XIX. Lecture Notes in Mathematics, vol. 1123, pp. 177–206. Springer, Berlin (1985)

  5. Berg, C., Christensen, J.P.R., Jensen, C.U.: A remark on the multidimensional moment problem. Math. Ann. 243, 163–169 (1979)

  6. Bochnak, J., Coste, M., Roy, M.-F.: Real Algebraic Geometry. Springer, Berlin (1998)

  7. Carr, P., Fisher, T., Ruf, J.: On the hedging of options on exploding exchange rates. Finance Stoch. 18, 115–144 (2014)

  8. Cherny, A.: On the uniqueness in law and the pathwise uniqueness for stochastic differential equations. Theory Probab. Appl. 46, 406–419 (2002)

  9. Cuchiero, C.: Affine and polynomial processes. Ph.D. thesis, ETH Zurich (2011). Available online at http://e-collection.library.ethz.ch/eserv/eth:4629/eth-4629-02.pdf

  10. Cuchiero, C., Keller-Ressel, M., Teichmann, J.: Polynomial processes and their applications to mathematical finance. Finance Stoch. 16, 711–740 (2012)

  11. Curtiss, J.H.: A note on the theory of moment generating functions. Ann. Math. Stat. 13, 430–433 (1942)

  12. Da Prato, G., Frankowska, H.: Invariance of stochastic control systems with deterministic arguments. J. Differ. Equ. 200, 18–52 (2004)

  13. Da Prato, G., Frankowska, H.: Stochastic viability of convex sets. J. Math. Anal. Appl. 333, 151–163 (2007)

  14. Delbaen, F., Schachermayer, W.: A general version of the fundamental theorem of asset pricing. Math. Ann. 300, 463–520 (1994)

  15. Delbaen, F., Shirakawa, H.: An interest rate model with upper and lower bounds. Asia-Pac. Financ. Mark. 9, 191–209 (2002)

  16. Dummit, D.S., Foote, R.M.: Abstract Algebra, 3rd edn. Wiley, Hoboken (2004)

  17. Dunkl, C.F.: Hankel transforms associated to finite reflection groups. Contemp. Math. 138, 123–138 (1992)

  18. Ethier, S.N.: A class of degenerate diffusion processes occurring in population genetics. Commun. Pure Appl. Math. 29, 483–493 (1976)

  19. Ethier, S.N., Kurtz, T.G.: Markov Processes: Characterization and Convergence. Wiley, Hoboken (2005)

  20. Filipović, D., Mayerhofer, E., Schneider, P.: Density approximations for multivariate affine jump-diffusion processes. J. Econom. 176, 93–111 (2013)

  21. Filipović, D., Larsson, M., Trolle, A.: Linear-rational term structure models. J. Finance, forthcoming. Available online at http://ssrn.com/abstract=2397898

  22. Filipović, D., Tappe, S., Teichmann, J.: Invariant manifolds with boundary for jump-diffusions. Electron. J. Probab. 19, 1–28 (2014)

  23. Filipović, D., Gourier, E., Mancini, L.: Quadratic variance swap models. J. Financ. Econ. 119, 44–68 (2016)

  24. Forman, J.L., Sørensen, M.: The Pearson diffusions: a class of statistically tractable diffusion processes. Scand. J. Stat. 35, 438–465 (2008)

  25. Gallardo, L., Yor, M.: A chaotic representation property of the multidimensional Dunkl processes. Ann. Probab. 34, 1530–1549 (2006)

  26. Göing-Jaeschke, A., Yor, M.: A survey and some generalizations of Bessel processes. Bernoulli 9, 313–349 (2003)

  27. Gouriéroux, C., Jasiak, J.: Multivariate Jacobi process with application to smooth transitions. J. Econom. 131, 475–505 (2006)

  28. Hajek, B.: Mean stochastic comparison of diffusions. Z. Wahrscheinlichkeitstheor. Verw. Geb. 68, 315–329 (1985)

  29. Heyde, C.C.: On a property of the lognormal distribution. J. R. Stat. Soc., Ser. B, Stat. Methodol. 25, 392–393 (1963)

  30. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1985)

  31. Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. North-Holland, Amsterdam (1981)

  32. Kleiber, C., Stoyanov, J.: Multivariate distributions and the moment problem. J. Multivar. Anal. 113, 7–18 (2013)

  33. Larsen, K.S., Sørensen, M.: Diffusion models for exchange rates in a target zone. Math. Finance 17, 285–306 (2007)

  34. Larsson, M., Ruf, J.: Convergence of local supermartingales and Novikov–Kazamaki type conditions for processes with jumps (2014). arXiv:1411.6229

  35. Lord, R., Koekkoek, R., van Dijk, D.: A comparison of biased simulation schemes for stochastic volatility models. Quant. Finance 10, 177–194 (2010)

  36. Maisonneuve, B.: Une mise au point sur les martingales locales continues définies sur un intervalle stochastique. In: Dellacherie, C., et al. (eds.) Séminaire de Probabilités XI. Lecture Notes in Mathematics, vol. 581, pp. 435–445. Springer, Berlin (1977)

  37. Mayerhofer, E., Pfaffel, O., Stelzer, R.: On strong solutions for positive definite jump diffusions. Stoch. Process. Appl. 121, 2072–2086 (2011)

  38. Mazet, O.: Classification des semi-groupes de diffusion sur ℝ associés à une famille de polynômes orthogonaux. In: Azéma, J., et al. (eds.) Séminaire de Probabilités XXXI. Lecture Notes in Mathematics, vol. 1655, pp. 40–53. Springer, Berlin (1997)

  39. Penrose, R.: A generalized inverse for matrices. Math. Proc. Camb. Philos. Soc. 51, 406–413 (1955)

  40. Petersen, L.C.: On the relation between the multidimensional moment problem and the one-dimensional moment problem. Math. Scand. 51, 361–366 (1982)

  41. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, 3rd edn. Springer, Berlin (1999)

  42. Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes and Martingales. Cambridge University Press, Cambridge (1994)

  43. Schmüdgen, K.: The \(K\)-moment problem for compact semi-algebraic sets. Math. Ann. 289, 203–206 (1991)

  44. Spreij, P., Veerman, E.: Affine diffusions with non-canonical state space. Stoch. Anal. Appl. 30, 605–641 (2012)

  45. Stieltjes, T.J.: Recherches sur les fractions continues. Ann. Fac. Sci. Toulouse 8(4), 1–122 (1894)

  46. Stoyanov, J.: Krein condition in probabilistic moment problems. Bernoulli 6, 939–949 (2000)

  47. Willard, S.: General Topology. Courier Corporation, North Chelmsford (2004)

  48. Wong, E.: The construction of a class of stationary Markoff processes. In: Bellman, R. (ed.) Stochastic Processes in Mathematical Physics and Engineering, pp. 264–276. Am. Math. Soc., Providence (1964)

  49. Zhou, H.: Itô conditional moment generator and the estimation of short-rate processes. J. Financ. Econom. 1, 250–271 (2003)


Acknowledgements

The authors wish to thank Damien Ackerer, Peter Glynn, Kostas Kardaras, Guillermo Mantilla-Soler, Sergio Pulido, Mykhaylo Shkolnikov, Jordan Stoyanov and Josef Teichmann for useful comments and stimulating discussions. Thanks are also due to the referees, co-editor, and editor for their valuable remarks.

Author information

Correspondence to Martin Larsson.

Additional information

The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement n. 307465-POLYTE.

Appendices

Appendix A: Nonnegative Itô processes

The following auxiliary result forms the basis of the proof of Theorem 5.3. It gives necessary and sufficient conditions for nonnegativity of certain Itô processes.

Lemma A.1

Let \(Z\) be a continuous semimartingale of the form

$$ Z_{t}=Z_{0}+\int_{0}^{t}\mu_{s}{\,\mathrm{d}} s+\int_{0}^{t}\nu_{s}{\,\mathrm{d}} B_{s}, $$

where \(Z_{0}\ge0\), \(\mu\) and \(\nu\) are continuous processes, and \(B\) is a Brownian motion. Let \(L^{0}\) be the local time of \(Z\) at level zero.

(i) If \(\mu>0\) on \(\{Z=0\}\) and \(L^{0}=0\), then \(Z\ge0\) and \(\int _{0}^{t} {\boldsymbol{1}_{\{Z_{s}=0\}}}{\,\mathrm{d}} s=0\).

(ii) If \(Z\ge0\), then on \(\{Z=0\}\), we have \(\mu\ge0\) and \(\nu=0\).

Proof

By stopping, we may assume throughout the proof that \(Z_{t}\), \(\int_{0}^{t}\mu_{s}{\,\mathrm{d}} s\) and \(\int _{0}^{t}\nu_{s}{\,\mathrm{d}} B_{s}\) are uniformly bounded.

We first prove (i). By [41, Theorem VI.1.7] and using that \(\mu>0\) on \(\{Z=0\}\) and \(L^{0}=0\), we obtain \(0 = L^{0}_{t} =L^{0-}_{t} + 2\int_{0}^{t} {\boldsymbol {1}_{\{Z_{s}=0\}}}\mu _{s}{\,\mathrm{d}} s \ge0\). In particular, \(\int_{0}^{t}{\boldsymbol{1}_{\{Z_{s}=0\} }}{\,\mathrm{d}} s=0\), as claimed. Furthermore, Tanaka’s formula [41, Theorem VI.1.2] yields

$$ Z_{t}^{-} = -\int_{0}^{t} {\boldsymbol{1}_{\{Z_{s}\le0\}}}{\,\mathrm{d}} Z_{s} - \frac {1}{2}L^{0}_{t} = -\int_{0}^{t}{\boldsymbol{1}_{\{Z_{s}\le0\}}}\mu_{s} {\,\mathrm{d}} s - \int_{0}^{t}{\boldsymbol{1}_{\{Z_{s}\le0\}}}\nu_{s} {\,\mathrm{d}} B_{s}. $$
(A.1)

Define \(\rho=\inf\left\{ t\ge0: Z_{t}<0\right\}\) and \(\tau=\inf \left\{ t\ge\rho: \mu_{t}=0 \right\} \wedge(\rho+1)\). Using that \(Z^{-}=0\) on \(\{\rho=\infty\}\) as well as dominated convergence, we obtain

$$ {\mathbb {E}}[Z^{-}_{\tau\wedge n}] = {\mathbb {E}}\big[Z^{-}_{\tau\wedge n}{\boldsymbol{1}_{\{\rho< \infty\}}}\big] \longrightarrow{\mathbb {E}}\big[ Z^{-}_{\tau}{\boldsymbol{1}_{\{\rho < \infty\}}}\big] \qquad(n\to\infty). $$

Here \(Z_{\tau}\) is well defined on \(\{\rho<\infty\}\) since \(\tau <\infty\) on this set. On the other hand, by (A.1), the fact that \(\int_{0}^{t}{\boldsymbol{1}_{\{Z_{s}\le0\}}}\mu_{s}{\,\mathrm{d}} s=\int _{0}^{t}{\boldsymbol{1}_{\{Z_{s}=0\}}}\mu_{s}{\,\mathrm{d}} s=0\) on \(\{ \rho =\infty\}\) and monotone convergence, we get

$$\begin{aligned} {\mathbb {E}}[Z^{-}_{\tau\wedge n}] &= {\mathbb {E}}\left[ - \int_{0}^{\tau\wedge n}{\boldsymbol{1}_{\{Z_{s}\le 0\}}}\mu_{s}{\,\mathrm{d}} s\right] = {\mathbb {E}} \left[ - \int_{0}^{\tau\wedge n}{\boldsymbol{1}_{\{Z_{s}\le0\}}}\mu_{s}{\,\mathrm{d}} s {\boldsymbol{1}_{\{\rho< \infty\}}}\right] \\ &\!\!\longrightarrow{\mathbb {E}}\left[ - \int_{0}^{\tau}{\boldsymbol {1}_{\{Z_{s}\le0\}}}\mu_{s}{\,\mathrm{d}} s {\boldsymbol{1}_{\{\rho< \infty\}}}\right ] \qquad\text{as $n\to\infty$.} \end{aligned}$$

Consequently,

$$ {\mathbb {E}}\left[ Z^{-}_{\tau}{\boldsymbol{1}_{\{\rho< \infty\}}}\right] = {\mathbb {E}}\left[ - \int _{0}^{\tau}{\boldsymbol{1}_{\{Z_{s}\le0\}}}\mu_{s}{\,\mathrm{d}} s {\boldsymbol{1}_{\{\rho < \infty\}}}\right]. $$
(A.2)

The following hold on \(\{\rho<\infty\}\): \(\tau>\rho\); \(Z_{t}\ge0\) on \([0,\rho]\); \(\mu_{t}>0\) on \([\rho,\tau)\); and \(Z_{t}<0\) on some nonempty open subset of \((\rho,\tau)\). Therefore, the random variable inside the expectation on the right-hand side of (A.2) is strictly negative on \(\{\rho<\infty\}\). The left-hand side, however, is nonnegative; so we deduce \({\mathbb {P}}[\rho<\infty]=0\). Part (i) is proved.

The proof of Part (ii) involves the same ideas as used for instance in Spreij and Veerman [44, Proposition 3.1]. We first assume \(Z_{0}=0\) and prove \(\mu_{0}\ge0\) and \(\nu_{0}=0\). Assume for contradiction that \({\mathbb {P}} [\mu_{0}<0]>0\), and define \(\tau=\inf\{t\ge0:\mu_{t}\ge0\}\wedge1\). Then \(0\le{\mathbb {E}}[Z_{\tau}] = {\mathbb {E}}[\int_{0}^{\tau}\mu_{s}{\,\mathrm{d}} s]<0\), a contradiction, whence \(\mu_{0}\ge0\) as desired. Next, pick any \(\phi\in{\mathbb {R}}\) and consider an equivalent measure \({\mathrm{d}}{\mathbb {Q}}={\mathcal {E}}(-\phi B)_{1}{\,\mathrm{d}} {\mathbb {P}}\). Then \(B^{\mathbb {Q}}_{t} = B_{t} + \phi t\) is a ℚ-Brownian motion on \([0,1]\), and we have

$$ Z_{t}=\int_{0}^{t}(\mu_{s}-\phi\nu_{s}){\,\mathrm{d}} s+\int_{0}^{t}\nu_{s}{\,\mathrm{d}} B^{\mathbb {Q}}_{s}. $$

Pick any \(\varepsilon>0\) and define \(\sigma=\inf\{t\ge0:|\nu_{t}|\le \varepsilon\}\wedge1\). The first part of the proof applied to the stopped process \(Z^{\sigma}\) under ℚ yields \((\mu_{0}-\phi \nu_{0}){\boldsymbol{1}_{\{\sigma>0\}}}\ge0\) for all \(\phi\in {\mathbb {R}}\). But this forces \(\sigma=0\) and hence \(|\nu_{0}|\le\varepsilon\). Since \(\varepsilon>0\) was arbitrary, we get \(\nu_{0}=0\) as desired.

Now consider any stopping time \(\rho\) such that \(Z_{\rho}=0\) on \(\{\rho <\infty\}\). Applying the result we have already proved to the process \((Z_{\rho+t}{\boldsymbol{1}_{\{\rho<\infty\}}})_{t\ge0}\) with filtration \(({\mathcal {F}} _{\rho+t}\cap\{\rho<\infty\})_{t\ge0}\) then yields \(\mu_{\rho}\ge0\) and \(\nu_{\rho}=0\) on \(\{\rho<\infty\}\). Finally, let \(\{\rho_{n}:n\in{\mathbb {N}}\}\) be a countable collection of such stopping times that are dense in \(\{t:Z_{t}=0\}\). Applying the above result to each \(\rho_{n}\) and using the continuity of \(\mu\) and \(\nu\), we obtain (ii). □

The following two examples show that the assumptions of Lemma A.1 are tight in the sense that the gap between (i) and (ii) cannot be closed.

Example A.2

The strict inequality appearing in Lemma A.1(i) cannot be relaxed to a weak inequality: just consider the deterministic process \(Z_{t}=(1-t)^{3}\).

Example A.3

The assumption of vanishing local time at zero in Lemma A.1(i) cannot be replaced by the zero volatility condition \(\nu =0\) on \(\{Z=0\}\), even if the strictly positive drift condition is retained. This is demonstrated by a construction that is closely related to the so-called Girsanov SDE; see Rogers and Williams [42, Sect. V.26]. Let \(Y\) be a one-dimensional Brownian motion, and define \(\rho(y)=|y|^{-2\alpha }\vee1\) for some \(0<\alpha<1/4\). The occupation density formula implies that

$$ \int_{0}^{t}\rho(Y_{s})^{2}{\,\mathrm{d}} s=\int_{-\infty}^{\infty}(|y|^{-4\alpha}\vee 1)L^{y}_{t}(Y){\,\mathrm{d}} y< \infty $$

for all \(t\ge0\); so we may define a positive local martingale by

$$ R_{t} = \exp\left( \int_{0}^{t} \rho(Y_{s}){\,\mathrm{d}} Y_{s} - \frac{1}{2}\int_{0}^{t} \rho (Y_{s})^{2}{\,\mathrm{d}} s\right). $$

Let \(\tau\) be a strictly positive stopping time such that the stopped process \(R^{\tau}\) is a uniformly integrable martingale. Then define the equivalent probability measure \({\mathrm{d}}{\mathbb {Q}}=R_{\tau}{\,\mathrm{d}}{\mathbb {P}}\), under which the process \(B_{t}=Y_{t}-\int_{0}^{t\wedge\tau}\rho(Y_{s}){\,\mathrm{d}} s\) is a Brownian motion. We now change time via

$$ \varphi_{t} = \int_{0}^{t} \rho(Y_{s}){\,\mathrm{d}} s, \qquad A_{u} = \inf\{t\ge0: \varphi _{t} > u\}, $$

and define \(Z_{u} = Y_{A_{u}}\). This process satisfies \(Z_{u} = B_{A_{u}} + u\wedge\sigma\), where \(\sigma=\varphi_{\tau}\). Define then \(\beta _{u}=\int _{0}^{u} \rho(Z_{v})^{1/2}{\,\mathrm{d}} B_{A_{v}}\), which is a Brownian motion because we have \(\langle\beta,\beta\rangle_{u}=\int_{0}^{u}\rho(Z_{v}){\,\mathrm{d}} A_{v}=u\). This finally gives

$$ Z_{u} = \int_{0}^{u} (|Z_{v}|^{\alpha}\wedge1) {\,\mathrm{d}}\beta_{v} + u\wedge\sigma. $$

This process starts at zero, has zero volatility whenever \(Z_{u}=0\), and has strictly positive drift prior to the stopping time \(\sigma\), which is strictly positive. Nonetheless, its sign changes infinitely often on any time interval \([0,t)\), since it is a time-changed Brownian motion viewed under an equivalent measure.

Appendix B: Proof of Theorem 3.1

We first establish a lemma.

Lemma B.1

For any \(k\in{\mathbb {N}}\) such that \({\mathbb {E}}[\|X_{0}\|^{2k}]<\infty \), there is a constant \(C\) such that

$$ {\mathbb {E}}\big[ 1 + \|X_{t}\|^{2k} \,\big|\, {\mathcal {F}}_{0}\big] \le \big(1+\|X_{0}\| ^{2k}\big)\mathrm{e}^{Ct}, \qquad t\ge0. $$

Proof

This is done as in the proof of Theorem 2.10 in Cuchiero et al. [10] via Gronwall’s inequality. Specifically, let \(f\in {\mathrm{Pol}}_{2k}(E)\) be given by \(f(x)=1+\|x\|^{2k}\), and note that the polynomial property implies that there exists a constant \(C\) such that \(|{\mathcal {G}}f(x)| \le Cf(x)\) for all \(x\in E\). For each \(m\), let \(\tau_{m}\) be the first exit time of \(X\) from the ball \(\{x\in E:\|x\|< m\}\). We can always choose a continuous version of \(t\mapsto{\mathbb {E}}[f(X_{t\wedge \tau_{m}})\,|\,{\mathcal {F}}_{0}]\), so let us fix such a version. Then by Itô’s formula and the martingale property of \(\int_{0}^{t\wedge\tau_{m}}\nabla f(X_{s})^{\top}\sigma(X_{s}){\,\mathrm{d}} W_{s}\),

$$\begin{aligned} {\mathbb {E}}[f(X_{t\wedge\tau_{m}})\,|\,{\mathcal {F}}_{0}] &= f(X_{0}) + {\mathbb {E}}\left[\int_{0}^{t\wedge\tau_{m}}{\mathcal {G}}f(X_{s}) {\,\mathrm{d}} s\,\bigg|\, {\mathcal {F}}_{0} \right] \\ &\le f(X_{0}) + C {\mathbb {E}}\left[\int_{0}^{t\wedge\tau_{m}} f(X_{s}) {\,\mathrm{d}} s\,\bigg|\, {\mathcal {F}}_{0} \right] \\ &\le f(X_{0}) + C\int_{0}^{t}{\mathbb {E}}[ f(X_{s\wedge\tau_{m}})\,|\, {\mathcal {F}}_{0} ] {\,\mathrm{d}} s. \end{aligned}$$

Gronwall’s inequality now yields \({\mathbb {E}}[f(X_{t\wedge\tau_{m}})\, |\,{\mathcal {F}} _{0}]\le f(X_{0}) \mathrm{e}^{Ct}\). Sending \(m\) to infinity and applying Fatou’s lemma gives the result. □

We can now prove Theorem 3.1. For any \(p\in{\mathrm{Pol}}_{n}(E)\), Itô’s formula yields

$$ p(X_{u}) = p(X_{t}) + \int_{t}^{u} {\mathcal {G}}p(X_{s}) {\,\mathrm{d}} s + \int_{t}^{u} \nabla p(X_{s})^{\top}\sigma(X_{s}){\,\mathrm{d}} W_{s}. $$

The quadratic variation of the right-hand side satisfies

$$ \int_{0}^{T}\nabla p^{\top}a \nabla p(X_{s}){\,\mathrm{d}} s\le C \int_{0}^{T} (1+\|X_{s}\| ^{2n}){\,\mathrm{d}} s $$

for some constant \(C\). This right-hand side has finite expectation by Lemma B.1, so the stochastic integral above is a martingale. Let \(\vec{p}\in{\mathbb {R}}^{{N}}\) be the coordinate representation of \(p\). Then (3.1) and (3.2) in conjunction with the linearity of the expectation and integration operators yield

$$\begin{aligned} \vec{p}^{\top}{\mathbb {E}}[H(X_{u}) \,|\, {\mathcal {F}}_{t} ] &= {\mathbb {E}}[p(X_{u}) \,|\, {\mathcal {F}}_{t} ] = p(X_{t}) + {\mathbb {E}}\bigg[\int_{t}^{u} {\mathcal {G}}p(X_{s}) {\,\mathrm{d}} s\,\bigg|\,{\mathcal {F}}_{t}\bigg] \\ &={ \vec{p} }^{\top}H(X_{t}) + (G \vec{p} )^{\top}{\mathbb {E}}\bigg[ \int_{t}^{u} H(X_{s}){\,\mathrm{d}} s \,\bigg|\,{\mathcal {F}}_{t} \bigg]. \end{aligned}$$

Fubini’s theorem, justified by Lemma B.1, yields

$$ { \vec{p} }^{\top}F(u) = { \vec{p} }^{\top}H(X_{t}) + { \vec{p} }^{\top}G^{\top}\int_{t}^{u} F(s) {\,\mathrm{d}} s, \qquad t\le u\le T, $$

where we define \(F(u) = {\mathbb {E}}[H(X_{u}) \,|\,{\mathcal {F}}_{t}]\). By choosing unit vectors for \(\vec{p}\), this gives a system of linear integral equations for \(F(u)\), whose unique solution is given by \(F(u)=\mathrm{e}^{(u-t)G^{\top}}H(X_{t})\). Hence

$$ {\mathbb {E}}[p(X_{T}) \,|\, {\mathcal {F}}_{t} ] = F(T)^{\top}\vec{p} = H(X_{t})^{\top}\mathrm{e} ^{(T-t)G} \vec{p}, $$

as claimed. This completes the proof of the theorem.  □
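
To illustrate the formula just proved, here is a minimal numerical sketch (ours, not from the paper). As an assumed example we take the Jacobi-type diffusion \({\mathrm{d}} X_{t}=\kappa(\theta-X_{t}){\,\mathrm{d}} t+\sigma\sqrt{X_{t}(1-X_{t})}{\,\mathrm{d}} W_{t}\) on \(E=[0,1]\) with the monomial basis \(H(x)=(1,x,x^{2})^{\top}\) of \({\mathrm{Pol}}_{2}(E)\); the matrix \(G\) below represents \({\mathcal {G}}\) in this basis, so that conditional moments follow from \({\mathbb {E}}[p(X_{T})\,|\,X_{0}]=H(X_{0})^{\top}\mathrm{e}^{TG}\vec{p}\).

    # Conditional moments of a polynomial diffusion via the matrix
    # exponential formula of Theorem 3.1 (sketch; Jacobi-type example).
    import numpy as np
    from scipy.linalg import expm

    kappa, theta, sigma = 2.0, 0.4, 0.5

    # Action of G on the basis H(x) = (1, x, x^2):
    #   G 1 = 0,  G x = kappa*theta - kappa*x,
    #   G x^2 = (2*kappa*theta + sigma^2)*x - (2*kappa + sigma^2)*x^2.
    G = np.array([
        [0.0, kappa * theta, 0.0],
        [0.0, -kappa, 2 * kappa * theta + sigma ** 2],
        [0.0, 0.0, -(2 * kappa + sigma ** 2)],
    ])

    def moment(x0, T, pvec):
        # E[p(X_T) | X_0 = x0] for p with coordinates pvec in (1, x, x^2)
        return np.array([1.0, x0, x0 ** 2]) @ expm(T * G) @ pvec

    print(moment(0.3, 1.0, np.array([0.0, 0.0, 1.0])))  # second moment E[X_1^2]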

Appendix C: Proof of Theorem 3.3

Theorem 3.3 is an immediate corollary of the following result.

Lemma C.1

Consider the \(d\)-dimensional Itô process \(X\) with representation

$$ dX_{t} = (b+\beta X_{t})dt + \sigma(X_{t}) dW_{t}, $$

where \(\sigma\) satisfies a square-root growth condition

$$ \|\sigma(X_{t})\|^{2} \le C(1+\|X_{t}\|) \qquad \textit{for all }t\ge0 $$
(C.1)

for some constant \(C\). If

$$ {\mathbb {E}}\big[ \mathrm{e}^{\delta\|X_{0}\|}\big]< \infty \qquad \textit{for some } \delta>0, $$
(C.2)

then for each \(T\ge0\), there exists \(\varepsilon>0\) with

$$ {\mathbb {E}}\big[\mathrm{e}^{\varepsilon\|X_{T}\|}\big]< \infty. $$
(C.3)

Proof

Fix \(T\ge0\). Variation of constants lets us rewrite \(X_{t} = A_{t} + \mathrm{e} ^{-\beta(T-t)}Y_{t} \) with

$$ A_{t} = \mathrm{e}^{\beta t} X_{0}+\int_{0}^{t} \mathrm{e}^{\beta(t- s)}b ds $$

and

$$ Y_{t}= \int_{0}^{t} \mathrm{e}^{\beta(T- s)}\sigma(X_{s}) dW_{s} = \int_{0}^{t} \sigma^{Y}_{s} dW_{s}, $$

where we write \(\sigma^{Y}_{t} = \mathrm{e}^{\beta(T- t)}\sigma(A_{t} + \mathrm{e}^{-\beta (T-t)}Y_{t} )\). By (C.1), the dispersion process \(\sigma^{Y}\) satisfies

$$ \|\sigma^{Y}_{t}\|^{2} \le C_{Y}(1+\| Y_{t}\|) $$
(C.4)

for some constant \(C_{Y}\).

Now let \(f(y)\) be a real-valued and positive smooth function on \({\mathbb {R}}^{d}\) satisfying \(f(y)=\sqrt{1+\|y\|}\) for \(\|y\|>1\). Some differential calculus gives, for \(y\neq0\),

$$ \nabla\|y\| = \frac{y}{\|y\|} \qquad\text{and}\qquad\frac {\partial^{2} \|y\|}{\partial y_{i}\partial y_{j}}= \textstyle\begin{cases} \frac{1}{\|y\|}-\frac{y_{i}^{2}}{\|y\|^{3}}, & i=j,\\ -\frac{y_{i} y_{j}}{\|y\|^{3}},& i\neq j. \end{cases} $$

Hence

$$ \nabla f(y)= \frac{1}{2\sqrt{1+\|y\|}}\frac{ y}{\|y\|} $$

and

$$ \frac{\partial^{2} f(y)}{\partial y_{i}\partial y_{j}}=-\frac{1}{4\sqrt {1+\| y\|}^{3}}\frac{ y_{i}}{\|y\|}\frac{ y_{j}}{\|y\|}+\frac{1}{2\sqrt{1+\|y\| }}\times \textstyle\begin{cases} \frac{1}{\|y\|}-\frac{y_{i}^{2}}{\|y\|^{3}}, & i=j\\ -\frac{y_{i} y_{j}}{\|y\|^{3}},& i\neq j \end{cases} $$

for \(\|y\|>1\), while the first and second order derivatives of \(f(y)\) are uniformly bounded for \(\|y\|\le1\). Itô’s formula for \(Z_{t}=f(Y_{t})\) gives

$$ dZ_{t} = \mu^{Z}_{t} dt +\sigma^{Z}_{t} dW_{t} $$

with drift and dispersion processes

$$ \mu^{Z}_{t} = \frac{1}{2}\sum_{i,j=1}^{d} \frac{\partial^{2} f(Y_{t})}{\partial y_{i}\partial y_{j}} (\sigma^{Y}_{t}{\sigma^{Y}_{t}}^{\top})_{ij},\qquad\sigma ^{Z}_{t}= \nabla f(Y_{t})^{\top}\sigma^{Y}_{t}. $$

In view of (C.4) and the above expressions for \(\nabla f(y)\) and \(\frac{\partial^{2} f(y)}{\partial y_{i}\partial y_{j}}\), these are bounded,

$$ \mu^{Z}_{t} \le m\qquad\text{and}\qquad\| \sigma^{Z}_{t} \|\le\rho, $$

for some constants \(m\) and \(\rho\). Hajek [28, Theorem 1.3] now implies that

$$ {\mathbb {E}}\left[\varPhi(Z_{T})\right] \le{\mathbb {E}}\left[\varPhi (V)\right] $$

for any nondecreasing convex function \(\varPhi\) on ℝ, where \(V\) is a Gaussian random variable with mean \(f(0)+m T\) and variance \(\rho^{2} T\). Hence, for any \(0<\varepsilon' <1/(2\rho^{2} T)\), we have \({\mathbb {E}}[\mathrm{e} ^{\varepsilon' V^{2}}] <\infty\). We now let \(\varPhi\) be a nondecreasing convex function on ℝ with \(\varPhi (z) = \mathrm{e}^{\varepsilon' z^{2}}\) for \(z\ge0\). Noting that \(Z_{T}\) is positive, we obtain \({\mathbb {E}}[ \mathrm{e}^{\varepsilon' Z_{T}^{2}}]<\infty\). As \(f^{2}(y)=1+\|y\|\) for \(\|y\|>1\), this implies \({\mathbb {E}}[ \mathrm{e}^{\varepsilon' \| Y_{T}\|}]<\infty\). Combining this with the fact that \(\|X_{T}\| \le\|A_{T}\| + \|Y_{T}\| \) and (C.2), we obtain using Hölder’s inequality the existence of some \(\varepsilon>0\) with (C.3). □
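
As a sanity check on the gradient and Hessian formulas used above, they can be verified symbolically (our verification sketch, assuming SymPy; here in dimension three):

    # Symbolic verification of the derivatives of f(y) = sqrt(1 + ||y||).
    import sympy as sp

    y = sp.Matrix(sp.symbols('y1 y2 y3', positive=True))
    r = sp.sqrt(sum(c ** 2 for c in y))  # ||y||
    f = sp.sqrt(1 + r)

    for i in range(3):
        for j in range(3):
            lhs = sp.diff(f, y[i], y[j])
            first = -sp.Rational(1, 4) * (1 + r) ** sp.Rational(-3, 2) * y[i] * y[j] / r ** 2
            second = (sp.KroneckerDelta(i, j) / r - y[i] * y[j] / r ** 3) / (2 * sp.sqrt(1 + r))
            assert sp.simplify(lhs - first - second) == 0
    print("Hessian formula verified")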

Appendix D: Proof of Theorem 4.4

We first provide a lemma.

Lemma D.1

Assume uniqueness in law holds for \(E_{Y}\)-valued solutions to (4.1). Let \(Y^{1}\), \(Y^{2}\) be two \(E_{Y}\)-valued solutions to (4.1) with driving Brownian motions \(W^{1}\), \(W^{2}\) and with \(Y^{1}_{0}=Y^{2}_{0}=y\) for some \(y\in E_{Y}\). Then \((Y^{1},W^{1})\) and \((Y^{2},W^{2})\) have the same law.

Proof

Consider the equation

$$ {\mathrm{d}} Y_{t} = \widehat{b}_{Y}(Y_{t}) {\,\mathrm{d}} t + \widehat{\sigma}_{Y}(Y_{t}) {\,\mathrm{d}} W_{t}, $$

where \(\widehat{b}_{Y}(y)=b_{Y}(y){\mathbf{1}}_{E_{Y}}(y)\) and \(\widehat{\sigma}_{Y}(y)=\sigma_{Y}(y){\mathbf{1}}_{E_{Y}}(y)\). Since \(E_{Y}\) is closed, any solution \(Y\) to this equation with \(Y_{0}\in E_{Y}\) must remain inside \(E_{Y}\). To see this, let \(\tau=\inf\{t:Y_{t}\notin E_{Y}\}\). Then there exists \(\varepsilon >0\), depending on \(\omega\), such that \(Y_{t}\notin E_{Y}\) for all \(\tau < t<\tau+\varepsilon\). However, since \(\widehat{b}_{Y}\) and \(\widehat{\sigma}_{Y}\) vanish outside \(E_{Y}\), \(Y_{t}\) is constant on \((\tau,\tau +\varepsilon )\). Since \(E_{Y}\) is closed this is only possible if \(\tau=\infty\).

The hypothesis of the lemma now implies that uniqueness in law for \({\mathbb {R}}^{d}\)-valued solutions holds for \({\mathrm{d}} Y_{t} = \widehat{b}_{Y}(Y_{t}) {\,\mathrm{d}} t + \widehat{\sigma}_{Y}(Y_{t}) {\,\mathrm{d}} W_{t}\). Since \((Y^{i},W^{i})\), \(i=1,2\), are two solutions with \(Y^{1}_{0}=Y^{2}_{0}=y\), Cherny [8, Theorem 3.1] shows that \((W^{1},Y^{1})\) and \((W^{2},Y^{2})\) have the same law. □

The proof of Theorem 4.4 follows along the lines of the proof of the Yamada–Watanabe theorem that pathwise uniqueness implies uniqueness in law; see Rogers and Williams [42, Theorem V.17.1]. Let \((W^{i},Y^{i},Z^{i})\), \(i=1,2\), be \(E\)-valued weak solutions to (4.1), (4.2) starting from \((y_{0},z_{0})\in E\subseteq{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\). We need to show that \((Y^{1},Z^{1})\) and \((Y^{2},Z^{2})\) have the same law. Since uniqueness in law holds for \(E_{Y}\)-valued solutions to (4.1), Lemma D.1 implies that \((W^{1},Y^{1})\) and \((W^{2},Y^{2})\) have the same law, which we denote by \(\pi({\mathrm{d}} w,{\,\mathrm{d}} y)\). Let \(Q^{i}({\mathrm{d}} z;w,y)\), \(i=1,2\), denote a regular conditional distribution of \(Z^{i}\) given \((W^{i},Y^{i})\). We equip the path space \(C({\mathbb {R}}_{+},{\mathbb {R}}^{d}\times{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\times{\mathbb {R}}^{n})\) with the probability measure

$$ \overline{\mathbb {P}}({\mathrm{d}} w,{\,\mathrm{d}} y,{\,\mathrm{d}} z,{\,\mathrm{d}} z') = \pi({\mathrm{d}} w, {\,\mathrm{d}} y)Q^{1}({\mathrm{d}} z; w,y)Q^{2}({\mathrm{d}} z'; w,y). $$

Let \((W,Y,Z,Z')\) denote the coordinate process on \(C({\mathbb {R}}_{+},{\mathbb {R}}^{d}\times{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\times{\mathbb {R}}^{n})\). Then the law under \(\overline{\mathbb {P}}\) of \((W,Y,Z)\) equals the law of \((W^{1},Y^{1},Z^{1})\), and the law under \(\overline{\mathbb {P}}\) of \((W,Y,Z')\) equals the law of \((W^{2},Y^{2},Z^{2})\). By well-known arguments, see for instance Rogers and Williams [42, Lemma V.10.1 and Theorems V.10.4 and V.17.1], it follows that

$$\begin{aligned} Y_{t} &= y_{0} + \int_{0}^{t} b_{Y}(Y_{s}){\,\mathrm{d}} s + \int_{0}^{t} \sigma_{Y}(Y_{s}){\,\mathrm{d}} W_{s}, \\ Z_{t} &= z_{0} + \int_{0}^{t} b_{Z}(Y_{s},Z_{s}){\,\mathrm{d}} s + \int_{0}^{t} \sigma _{Z}(Y_{s},Z_{s}){\,\mathrm{d}} W_{s}, \\ Z'_{t} &= z_{0} + \int_{0}^{t} b_{Z}(Y_{s},Z'_{s}){\,\mathrm{d}} s + \int_{0}^{t} \sigma _{Z}(Y_{s},Z'_{s}){\,\mathrm{d}} W_{s}. \end{aligned}$$

By localization, we may assume that \(b_{Z}\) and \(\sigma_{Z}\) are Lipschitz in \(z\), uniformly in \(y\). A standard argument based on the BDG inequalities and Jensen’s inequality (see Rogers and Williams [42, Corollary V.11.7]) together with Gronwall’s inequality yields \(\overline{\mathbb {P}}[Z'=Z]=1\). Hence

$$ \mathrm{Law}(Y^{1},Z^{1}) = \mathrm{Law}(Y,Z) = \mathrm{Law}(Y,Z') = \mathrm{Law}(Y^{2},Z^{2}), $$

as was to be shown.  □

Remark D.2

Theorem 4.4 carries over, and its proof literally goes through, to the case where \((Y,Z)\) is an arbitrary \(E\)-valued diffusion that solves (4.1), (4.2) and where uniqueness in law for \(E_{Y}\)-valued solutions to (4.1) holds, provided (4.3) is replaced by the assumption that both \(b_{Z}\) and \(\sigma_{Z}\) are locally Lipschitz in \(z\), locally in \(y\), on \(E\). That is, for each compact subset \(K\subseteq E\), there exists a constant \(\kappa\) such that for all \((y,z,y',z')\in K\times K\),

$$ \|b_{Z}(y,z) - b_{Z}(y',z')\| + \| \sigma_{Z}(y,z) - \sigma_{Z}(y',z') \| \le \kappa\|z-z'\|. $$

Appendix E: Proof of Theorem 5.3

The proof of Theorem 5.3 consists of two main parts. First, we construct coefficients \(\widehat{a}=\widehat{\sigma}\widehat{\sigma}^{\top}\) and \(\widehat{b}\) that coincide with \(a\) and \(b\) on \(E\), such that a local solution to (2.2), with \(b\) and \(\sigma\) replaced by \(\widehat{b}\) and \(\widehat{\sigma}\), can be obtained with values in a neighborhood of \(E\) in \(M\). This relies on (G1) and (A2), and occupies this section up to and including Lemma E.4. Second, we complete the proof by showing that this solution in fact stays inside \(E\) and spends zero time in the sets \(\{p=0\}\), \(p\in{\mathcal {P}}\). This relies on (G2) and (A1).

Let \(\pi:{\mathbb {S}}^{d}\to{\mathbb {S}}^{d}_{+}\) be the Euclidean metric projection onto the positive semidefinite cone. It has the following well-known property.

Lemma E.1

For any symmetric matrix \(A\in{\mathbb {S}}^{d}\) with the spectral decomposition \(A=S\varLambda S^{\top}\), we have \(\pi(A)=S\varLambda^{+} S^{\top}\), where \(\varLambda^{+}\) is the element-wise positive part of \(\varLambda\).

Proof

This result follows from the fact that the map \(\lambda:{\mathbb {S}}^{d}\to{\mathbb {R}}^{d}\) taking a symmetric matrix to its ordered eigenvalues is 1-Lipschitz; see Horn and Johnson [30, Theorem 7.4.51]. Indeed, for any \(B\in{\mathbb {S}}^{d}_{+}\), we have

$$ \|A-S\varLambda^{+}S^{\top}\| = \|\lambda(A)-\lambda(A)^{+}\| \le\|\lambda (A)-\lambda(B)\| \le\|A-B\|. $$

Here the first inequality uses that the projection of an ordered vector \(x\in{\mathbb {R}}^{d}\) onto the set of ordered vectors with nonnegative entries is simply \(x^{+}\). □
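
In coordinates, Lemma E.1 says that the projection is an eigenvalue truncation; a minimal NumPy sketch (ours, not from the paper):

    # Frobenius projection of a symmetric matrix onto the PSD cone:
    # diagonalize and clip the eigenvalues at zero (Lemma E.1).
    import numpy as np

    def project_psd(A):
        lam, S = np.linalg.eigh(A)               # A = S diag(lam) S^T
        return (S * np.maximum(lam, 0.0)) @ S.T  # S diag(lam^+) S^T

    A = np.array([[1.0, 2.0], [2.0, -3.0]])
    print(np.linalg.eigvalsh(project_psd(A)))    # all eigenvalues >= 0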

We use the projection \(\pi\) to modify the given coefficients \(a\) and \(b\) outside \(E\) in order to obtain candidate coefficients for the stochastic differential equation (2.2). The diffusion coefficients are defined by

$$ \widehat{a}(x) = \pi\circ a(x), \qquad\widehat{\sigma}(x) = \widehat{a}(x)^{1/2}. $$

In order to construct the drift coefficient \(\widehat{b}\), we need the following lemma.

Lemma E.2

There exists a continuous map \(\widehat{b} :{\mathbb {R}}^{d}\to{\mathbb {R}}^{d}\) with \(\widehat{b}=b\) on \(E\) and such that the operator \(\widehat{\mathcal {G}}\) given by

$$ \widehat{\mathcal {G}}f = \frac{1}{2}\operatorname{Tr}( \widehat{a} \nabla^{2} f) + \widehat{b} ^{\top} \nabla f $$

satisfies \(\widehat{\mathcal {G}}f={\mathcal {G}}f\) on \(E\) and \(\widehat {\mathcal {G}}q = 0 \) on \(M\) for all \(q\in{\mathcal {Q}}\).

Proof

We first prove that there exists a continuous map \(c:{\mathbb {R}}^{d}\to {\mathbb {R}}^{d}\) such that

$$ c=0\mbox{ on }E \qquad \mbox{and}\qquad\nabla q^{\top}c = - \frac {1}{2}\operatorname{Tr}\big( (\widehat{a}-a) \nabla^{2} q \big) \mbox{ on } M\mbox{, for all }q\in {\mathcal {Q}}. $$
(E.1)

Indeed, let \(a=S\varLambda S^{\top}\) be the spectral decomposition of \(a\), so that the columns \(S_{i}\) of \(S\) constitute an orthonormal basis of eigenvectors of \(a\) and the diagonal elements \(\lambda_{i}\) of \(\varLambda\) are the corresponding eigenvalues. These quantities depend on \(x\) in a possibly discontinuous way. For each \(q\in{\mathcal {Q}}\),

$$ \operatorname{Tr}\big((\widehat{a}-a) \nabla^{2} q \big) = \operatorname{Tr}( S\varLambda^{-} S^{\top}\nabla ^{2} q) = \sum_{i=1}^{d} \lambda_{i}^{-} S_{i}^{\top}\nabla^{2}q S_{i}. $$
(E.2)

Consider now any fixed \(x\in M\). For each \(i\) such that \(\lambda _{i}(x)^{-}\ne0\), \(S_{i}(x)\) lies in the tangent space of \(M\) at \(x\). Thus we may find a smooth path \(\gamma_{i}:(-1,1)\to M\) such that \(\gamma _{i}(0)=x\) and \(\gamma_{i}'(0)=S_{i}(x)\). For any \(q\in{\mathcal {Q}}\), we have \(q=0\) on \(M\) by definition, whence

$$ 0 = \frac{{\,\mathrm{d}}^{2}}{{\,\mathrm{d}} s^{2}} (q \circ\gamma_{i})(0) = \operatorname {Tr}\big( \nabla^{2} q(x) \gamma_{i}'(0) \gamma_{i}'(0)^{\top}\big) + \nabla q(x)^{\top}\gamma_{i}''(0), $$

or equivalently, \(S_{i}(x)^{\top}\nabla^{2} q(x) S_{i}(x) = -\nabla q(x)^{\top}\gamma_{i}''(0)\). In view of (E.2), this yields

$$ \operatorname{Tr}\Big(\big(\widehat{a}(x)- a(x)\big) \nabla^{2} q(x) \Big) = -\nabla q(x)^{\top}\sum_{i=1}^{d} \lambda_{i}(x)^{-}\gamma_{i}''(0) \qquad\text{for all } q\in{\mathcal {Q}}. $$

Let \(q_{1},\ldots,q_{m}\) be an enumeration of the elements of \({\mathcal {Q}}\), and write the above equation in vector form as

$$ \begin{pmatrix} \operatorname{Tr}((\widehat{a}(x)- a(x)) \nabla^{2} q_{1}(x) ) \\ \vdots\\ \operatorname{Tr}((\widehat{a}(x)- a(x)) \nabla^{2} q_{m}(x) ) \end{pmatrix} = - \begin{pmatrix} \nabla q_{1}(x)^{\top}\\ \vdots\\ \nabla q_{m}(x)^{\top}\end{pmatrix} \sum_{i=1}^{d} \lambda_{i}(x)^{-}\gamma_{i}''(0). $$

The left-hand side thus lies in the range of \([\nabla q_{1}(x) \cdots \nabla q_{m}(x)]^{\top}\) for each \(x\in M\). Since linear independence is an open condition, (G1) implies that the latter matrix has full rank for all \(x\) in a whole neighborhood \(U\) of \(M\). It thus has a Moore–Penrose inverse which is a continuous function of \(x\); see Penrose [39, page 408]. The desired map \(c\) is now obtained on \(U\) by

$$ c(x) = - \frac{1}{2} \begin{pmatrix} \nabla q_{1}(x)^{\top}\\ \vdots\\ \nabla q_{m}(x)^{\top}\end{pmatrix} ^{-1} \begin{pmatrix} \operatorname{Tr}((\widehat{a}(x)- a(x)) \nabla^{2} q_{1}(x) ) \\ \vdots\\ \operatorname{Tr}((\widehat{a}(x)- a(x)) \nabla^{2} q_{m}(x) ) \end{pmatrix}, $$

where the Moore–Penrose inverse is understood. Finally, after shrinking \(U\) while maintaining \(M\subseteq U\), \(c\) is continuous on the closure \(\overline{U}\), and can then be extended to a continuous map on \({\mathbb {R}}^{d}\) by the Tietze extension theorem; see Willard [47, Theorem 15.8]. This proves (E.1).

The extended drift coefficient is now defined by \(\widehat{b} = b + c\), and the operator \(\widehat{\mathcal {G}}\) by

$$ \widehat{\mathcal {G}}f = \frac{1}{2}\operatorname{Tr}( \widehat{a} \nabla^{2} f) + \widehat{b} ^{\top} \nabla f. $$

In view of (E.1), it satisfies \(\widehat{\mathcal {G}}f={\mathcal {G}}f\) on \(E\) and

$$ \widehat{\mathcal {G}}q = {\mathcal {G}}q + \frac{1}{2}\operatorname {Tr}\big( (\widehat{a}- a) \nabla ^{2} q \big) + c^{\top}\nabla q = 0 $$

on \(M\) for all \(q\in{\mathcal {Q}}\), as desired. □
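
Numerically, evaluating \(c(x)\) from the display in the proof above is a least-squares solve. A schematic NumPy fragment (ours; the inputs N and t are hypothetical placeholders for the stacked gradients and traces at a given \(x\)):

    # c(x) = -0.5 * pinv(N) @ t, where row k of N is nabla q_k(x)^T and
    # t_k = Tr((a_hat(x) - a(x)) Hess q_k(x)); N and t are placeholder inputs.
    import numpy as np

    def c_of_x(N, t):
        return -0.5 * np.linalg.pinv(N) @ t

    N = np.array([[1.0, 0.0, 2.0]])  # a single constraint q, with d = 3
    t = np.array([0.8])
    print(c_of_x(N, t))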

We now define the set

$$ E_{0} = M \cap\{\|\widehat{b}-b\|< 1\}. $$

Note that \(E\subseteq E_{0}\) since \(\widehat{b}=b\) on \(E\). Furthermore, the linear growth condition

$$ \|\widehat{a}(x)\|^{1/2} + \|\widehat{b}(x)\| \le\|a(x)\|^{1/2} + \| b(x)\| + 1 \le C(1+\|x\|),\qquad x\in E_{0}, $$
(E.3)

is satisfied for some constant \(C\). This uses that the component functions of \(a\) and \(b\) lie in \({\mathrm{Pol}}_{2}({\mathbb {R}}^{d})\) and \({\mathrm{Pol}} _{1}({\mathbb {R}}^{d})\), respectively.

An \(E_{0}\)-valued local solution to (2.2), with \(b\) and \(\sigma\) replaced by \(\widehat{b}\) and \(\widehat{\sigma}\), can now be constructed by solving the martingale problem for the operator \(\widehat{\mathcal {G}}\) and state space \(E_{0}\). We first prove an auxiliary lemma.

Lemma E.3

Let \(f\in C^{\infty}({\mathbb {R}}^{d})\) and assume the support \(K\) of \(f\) satisfies \(K\cap M\subseteq E_{0}\). Let \(x_{0}\) be a maximizer of \(f\) over \(E_{0}\). Then \(\widehat{\mathcal {G}} f(x_{0})\le0\).

Proof

Let \(\gamma:(-1,1)\to M\) be any smooth curve in \(M\) with \(\gamma (0)=x_{0}\). Optimality of \(x_{0}\) and the chain rule yield

$$ 0 = \frac{{\,\mathrm{d}}}{{\,\mathrm{d}} s} (f \circ\gamma)(0) = \nabla f(x_{0})^{\top}\gamma'(0), $$

from which it follows that \(\nabla f(x_{0})\) is orthogonal to the tangent space of \(M\) at \(x_{0}\). Thus

$$ \nabla f(x_{0})=\sum_{q\in{\mathcal {Q}}} c_{q} \nabla q(x_{0}) $$
(E.4)

for some coefficients \(c_{q}\). Next, differentiating once more yields

$$ 0 \ge\frac{{\,\mathrm{d}}^{2}}{{\,\mathrm{d}} s^{2}} (f \circ\gamma)(0) = \operatorname {Tr}\big( \nabla^{2} f(x_{0}) \gamma'(0) \gamma'(0)^{\top}\big) + \nabla f(x_{0})^{\top}\gamma''(0). $$

Similarly, for any \(q\in{\mathcal {Q}}\),

$$ 0 = \frac{{\,\mathrm{d}}^{2}}{{\,\mathrm{d}} s^{2}} (q \circ\gamma)(0) = \operatorname{Tr}\big( \nabla^{2} q(x_{0}) \gamma'(0) \gamma'(0)^{\top}\big) + \nabla q(x_{0})^{\top}\gamma''(0). $$

In view of (E.4), this implies

$$ \operatorname{Tr}\bigg( \Big(\nabla^{2} f(x_{0}) - \sum_{q\in {\mathcal {Q}}} c_{q} \nabla^{2} q(x_{0})\Big) \gamma'(0) \gamma'(0)^{\top}\bigg) \le0. $$
(E.5)

Observe that Lemma E.1 implies that \(\ker A\subseteq\ker\pi (A)\) for any symmetric matrix \(A\). Thus \(\widehat{a}(x_{0})\nabla q(x_{0})=0\) for all \(q\in{\mathcal {Q}}\) by (A2), which implies that \(\widehat{a}(x_{0})=\sum_{i} u_{i} u_{i}^{\top}\) for some vectors \(u_{i}\) in the tangent space of \(M\) at \(x_{0}\). Thus, choosing curves \(\gamma\) with \(\gamma'(0)=u_{i}\), (E.5) yields

$$ \operatorname{Tr}\bigg( \Big(\nabla^{2} f(x_{0}) - \sum_{q\in {\mathcal {Q}}} c_{q} \nabla^{2} q(x_{0})\Big) \widehat{a}(x_{0}) \bigg) \le0. $$
(E.6)

Combining (E.4), (E.6) and Lemma E.2, we obtain

$$ \widehat{\mathcal {G}}f(x_{0}) = \frac{1}{2} \operatorname{Tr}\big( \widehat{a}(x_{0}) \nabla^{2} f(x_{0}) \big) + \widehat{b}(x_{0})^{\top}\nabla f(x_{0}) \le\sum_{q\in {\mathcal {Q}}} c_{q} \widehat{\mathcal {G}}q(x_{0})=0, $$

as desired. □

Let \(C_{0}(E_{0})\) denote the space of continuous functions on \(E_{0}\) vanishing at infinity. Lemma E.3 implies that \(\widehat {\mathcal {G}} \) is a well-defined linear operator on \(C_{0}(E_{0})\) with domain \(C^{\infty}_{c}(E_{0})\). It also implies that \(\widehat{\mathcal {G}}\) satisfies the positive maximum principle as a linear operator on \(C_{0}(E_{0})\). Hence the following local existence result can be proved.

Lemma E.4

Let \(\mu\) be a probability measure on \(E\). There exists an \({\mathbb {R}} ^{d}\)-valued càdlàg process \(X\) with initial distribution  \(\mu\) that satisfies

$$ X_{t} = X_{0} + \int_{0}^{t} \widehat{b}(X_{s}) {\,\mathrm{d}} s + \int_{0}^{t} \widehat{\sigma}(X_{s}) {\,\mathrm{d}} W_{s} $$
(E.7)

for all \(t<\tau\), where \(\tau= \inf\{t \ge0: X_{t} \notin E_{0}\}>0\), and some \(d\)-dimensional Brownian motion \(W\).

Proof

The conditions of Ethier and Kurtz [19, Theorem 4.5.4] are satisfied, so there exists an \(E_{0}^{\Delta}\)-valued càdlàg process \(X\) such that \(N^{f}_{t} {=} f(X_{t}) {-} f(X_{0}) {-} \int_{0}^{t} \widehat{\mathcal {G}}f(X_{s}) {\,\mathrm{d}} s\) is a martingale for any \(f\in C^{\infty}_{c}(E_{0})\). Here \(E_{0}^{\Delta}\) denotes the one-point compactification of \(E_{0}\) with some \(\Delta \notin E_{0}\), and we set \(f(\Delta)=\widehat{\mathcal {G}}f(\Delta)=0\). Bakry and Émery [4, Proposition 2] then yields that \(f(X)\) and \(N^{f}\) are continuous (see footnote 3). In particular, \(X\) cannot jump to \(\Delta\) from any point in \(E_{0}\), whence \(\tau\) is a strictly positive predictable time.

A localized version of the argument in Ethier and Kurtz [19, Theorem 5.3.3] now shows that on an extended probability space, \(X\) satisfies (E.7) for all \(t<\tau\) and some Brownian motion \(W\). It remains to show that \(X\) is non-explosive in the sense that \(\sup_{t<\tau}\|X_{t}\|<\infty\) on \(\{\tau<\infty\}\). Indeed, non-explosion implies that either \(\tau=\infty\), or \({\mathbb {R}}^{d}\setminus E_{0}\neq\emptyset\) in which case we can take \(\Delta\in{\mathbb {R}}^{d}\setminus E_{0}\). In either case, \(X\) is \({\mathbb {R}}^{d}\)-valued. To prove that \(X\) is non-explosive, let \(Z_{t}=1+\|X_{t}\|^{2}\) for \(t<\tau\), and observe that the linear growth condition (E.3) in conjunction with Itô’s formula yields \(Z_{t} \le Z_{0} + C\int_{0}^{t} Z_{s}{\,\mathrm{d}} s + N_{t}\) for all \(t<\tau\), where \(C>0\) is a constant and \(N\) a local martingale on \([0,\tau)\). Let \(Y_{t}\) denote the right-hand side. Then

$$\begin{aligned} e^{-tC}Z_{t}\le e^{-tC}Y_{t} &= Z_{0}+C \int_{0}^{t} e^{-sC}(Z_{s}-Y_{s}){\,\mathrm{d}} s + \int _{0}^{t} e^{-sC} {\,\mathrm{d}} N_{s} \\ &\le Z_{0} + \int_{0}^{t} e^{-s C}{\,\mathrm{d}} N_{s} \end{aligned}$$

for all \(t<\tau\). The right-hand side is a nonnegative supermartingale on \([0,\tau)\), and we deduce \(\sup_{t<\tau}Z_{t}<\infty\) on \(\{\tau <\infty \}\), as required. □

Let \(X\) and \(\tau\) be the process and stopping time provided by Lemma E.4. We now show that \(\tau=\infty\) and that \(X_{t}\) remains in \(E\) for all \(t\ge0\) and spends zero time in each of the sets \(\{p=0\}\), \(p\in{\mathcal {P}}\). This will complete the proof of Theorem 5.3, since \(\widehat{a}\) and \(\widehat{b}\) coincide with \(a\) and \(b\) on \(E\).

We need to prove that \(p(X_{t})\ge0\) for all \(0\le t<\tau\) and all \(p\in{\mathcal {P}}\). Fix \(p\in{\mathcal {P}}\) and let \(L^{y}\) denote the local time of \(p(X)\) at level \(y\), where we choose a modification that is càdlàg in \(y\); see Revuz and Yor [41, Theorem VI.1.7]. Itô’s formula yields

$$ p(X_{t}) = p(X_{0}) + \int_{0}^{t} \widehat{\mathcal {G}}p(X_{s}) {\,\mathrm{d}} s + \int_{0}^{t} \nabla p(X_{s})^{\top}\widehat{\sigma}(X_{s}){\,\mathrm{d}} W_{s}, \qquad t< \tau. $$

We first claim that \(L^{0}_{t}=0\) for \(t<\tau\). The occupation density formula [41, Corollary VI.1.6] yields

$$ \int_{-\infty}^{\infty}\frac{1}{y}{\boldsymbol{1}_{\{y>0\}}}L^{y}_{t}{\,\mathrm{d}} y = \int_{0}^{t} \frac {\nabla p^{\top}\widehat{a} \nabla p(X_{s})}{p(X_{s})}{\boldsymbol{1}_{\{ p(X_{s})>0\}}}{\,\mathrm{d}} s. $$

By right-continuity of \(L^{y}_{t}\) in \(y\), it suffices to show that the right-hand side is finite. For this, in turn, it is enough to prove that \((\nabla p^{\top}\widehat{a} \nabla p)/p\) is locally bounded on \(M\). To this end, let \(a=S\varLambda S^{\top}\) be the spectral decomposition of \(a\), so that the columns \(S_{i}\) of \(S\) constitute an orthonormal basis of eigenvectors of \(a\) and the diagonal elements \(\lambda_{i}\) of \(\varLambda \) are the corresponding eigenvalues. Note that these quantities depend on \(x\) in general. Since \(a \nabla p=0\) on \(M\cap\{p=0\}\) by (A1), condition (G2) implies that there exists a vector \(h=(h_{1},\ldots ,h_{d})^{\top}\) of polynomials such that

$$ a \nabla p = h p \qquad\text{on } M. $$

Thus \(\lambda_{i} S_{i}^{\top}\nabla p = S_{i}^{\top}a \nabla p = S_{i}^{\top}h p\), and hence \(\lambda_{i}(S_{i}^{\top}\nabla p)^{2} = S_{i}^{\top}\nabla p S_{i}^{\top}h p\). In conjunction with Lemma E.1, this yields

$$ \nabla p^{\top}\widehat{a} \nabla p = \nabla p^{\top}S\varLambda^{+} S^{\top}\nabla p = \sum_{i} \lambda_{i}{\boldsymbol{1}_{\{\lambda_{i}>0\}}}(S_{i}^{\top}\nabla p)^{2} = \sum_{i} {\boldsymbol{1}_{\{\lambda_{i}>0\}}}S_{i}^{\top}\nabla p S_{i}^{\top}h p. $$

Consequently,

$$ \nabla p^{\top}\widehat{a} \nabla p \le|p| \sum_{i} \|S_{i}\|^{2} \|\nabla p\| \|h\|. $$

Since \(\|S_{i}\|=1\) and \(\nabla p\) and \(h\) are locally bounded, we deduce that \((\nabla p^{\top}\widehat{a} \nabla p)/p\) is locally bounded, as required. Thus \(L^{0}=0\) as claimed.

Next, since \(\widehat{\mathcal {G}}p= {\mathcal {G}}p\) on \(E\), the hypothesis (A1) implies that \(\widehat{\mathcal {G}}p>0\) on a neighborhood \(U_{p}\) of \(E\cap\{ p=0\}\). Shrinking \(E_{0}\) if necessary, we may assume that \(E_{0}\subseteq E\cup\bigcup_{p\in{\mathcal {P}}} U_{p}\) and thus

$$ \widehat{\mathcal {G}}p > 0\qquad \mbox{on } E_{0}\cap\{p=0\}. $$

Since \(L^{0}=0\) before \(\tau\), Lemma A.1 implies

$$ p(X_{t})\ge0\qquad \mbox{for all }t< \tau. $$

Thus the stopping time \(\tau_{E}=\inf\{t\colon X_{t}\notin E\}\le\tau\) actually satisfies \(\tau_{E}=\tau\). This implies \(\tau=\infty\). Indeed, \(X\) has left limits on \(\{\tau<\infty\}\) by Lemma E.4, and \(E_{0}\) is a neighborhood in \(M\) of the closed set \(E\). Thus \(\tau _{E}<\tau\) on \(\{\tau<\infty\}\), whence this set is empty. Finally, Lemma A.1 also gives \(\int_{0}^{t}{\boldsymbol{1}_{\{p(X_{s})=0\} }}{\,\mathrm{d}} s=0\). The proof of Theorem 5.3 is complete.  □

Appendix F: Proof of Theorem 5.7

The proof of Theorem 5.7 is divided into three parts.

Proof of Theorem 5.7(i)

The following argument is a version of what is sometimes called “McKean’s argument”; see Mayerhofer et al. [37, Sect. 4.1] for an overview and further references. Suppose first \(p(X_{0})>0\) almost surely. Itô’s formula and the identity \(a \nabla p=h p\) on \(M\) yield

$$ \begin{aligned} \log& p(X_{t}) - \log p(X_{0}) \\ &= \int_{0}^{t} \left(\frac{{\mathcal {G}}p(X_{s})}{p(X_{s})} - \frac {1}{2}\frac {\nabla p^{\top}a \nabla p(X_{s})}{p(X_{s})^{2}}\right) {\,\mathrm{d}} s + \int_{0}^{t} \frac {\nabla p^{\top}\sigma(X_{s})}{p(X_{s})}{\,\mathrm{d}} W_{s} \\ &= \int_{0}^{t} \frac{2 {\mathcal {G}}p(X_{s}) - h^{\top}\nabla p(X_{s})}{2p(X_{s})} {\,\mathrm{d}} s + \int_{0}^{t} \frac{\nabla p^{\top}\sigma(X_{s})}{p(X_{s})}{\,\mathrm{d}} W_{s} \end{aligned} $$
(F.1)

for \(t<\tau=\inf\{s\ge0:p(X_{s})=0\}\). We now modify \(\log p(X)\) to turn it into a local submartingale. To this end, define

$$ V_{t} = \int_{0}^{t} {\boldsymbol{1}_{\{X_{s}\notin U\}}} \frac{1}{2p(X_{s})}|2 {\mathcal {G}}p(X_{s}) - h^{\top}\nabla p(X_{s})| {\,\mathrm{d}} s. $$

We claim that \(V_{t}<\infty\) for all \(t\ge0\). To see this, note that the set \(E {\cap} U^{c} {\cap} \{x:\|x\| {\le} n\}\) is compact and disjoint from \(\{ p=0\}\cap E\) for each \(n\). Thus

$$ \varepsilon_{n}=\min\{p(x):x\in E\cap U^{c}, \|x\|\le n\} $$

is strictly positive. Defining \(\sigma_{n}=\inf\{t:\|X_{t}\|\ge n\}\), this yields

$$ V_{t\wedge\sigma_{n}} \le\frac{t}{2\varepsilon_{n}} \max_{\|x\|\le n} |2 {\mathcal {G}}p(x) - h^{\top}\nabla p(x)| < \infty. $$

Since \(\sigma_{n}\to\infty\) due to the fact that \(X\) does not explode, we have \(V_{t}<\infty\) for all \(t\ge0\) as claimed. It follows that the process

$$ A_{t} = \int_{0}^{t} {\boldsymbol{1}_{\{X_{s}\notin U\}}} \frac{1}{2p(X_{s})}\big(2 {\mathcal {G}}p(X_{s}) - h^{\top}\nabla p(X_{s})\big) {\,\mathrm{d}} s $$

is well defined and finite for all \(t\ge0\), with total variation process \(V\).

Now define stopping times \(\rho_{n}=\inf\{t\ge0: |A_{t}|+p(X_{t}) \ge n\}\) and note that \(\rho_{n}\to\infty\) since neither \(A\) nor \(X\) explodes. Consider the process \(Z = \log p(X) - A\), which satisfies

$$\begin{aligned} Z_{t} &= \log p(X_{0}) + \int_{0}^{t} {\boldsymbol{1}_{\{X_{s}\in U\}}} \frac {1}{2p(X_{s})}\big(2 {\mathcal {G}}p(X_{s}) - h^{\top}\nabla p(X_{s})\big) {\,\mathrm{d}} s \\ &\phantom{=:}{}+ \int_{0}^{t} \frac{\nabla p^{\top}\sigma(X_{s})}{p(X_{s})}{\,\mathrm{d}} W_{s}. \end{aligned}$$

Then \(-Z^{\rho_{n}}\) is a supermartingale on the stochastic interval \([0,\tau)\), bounded from below (see footnote 4). Thus by the supermartingale convergence theorem, \(\lim_{t\uparrow\tau}Z_{t\wedge\rho_{n}}\) exists in ℝ, which implies \(\tau\ge\rho_{n}\). Since \(\rho_{n}\to \infty\), we deduce \(\tau=\infty\), as desired.

Finally, suppose \({\mathbb {P}}[p(X_{0})=0]>0\). The above proof shows that \(p(X)\) cannot return to zero once it becomes positive. But due to (5.2), we have \(p(X_{t})>0\) for arbitrarily small \(t>0\), and this completes the proof. □

Proof of Theorem 5.7(ii)

As in the proof of (i), it is enough to consider the case where \(p(X_{0})>0\). By (G2), we deduce \(2 {\mathcal {G}}p - h^{\top}\nabla p = \alpha p\) on \(M\) for some \(\alpha\in{\mathrm{Pol}}({\mathbb {R}}^{d})\). However, we have \(\deg {\mathcal {G}}p\le\deg p\) and \(\deg a\nabla p \le1+\deg p\), which yields \(\deg h\le1\). Consequently \(\deg\alpha p \le\deg p\), implying that \(\alpha\) is constant. Inserting this into (F.1) yields

$$ \log p(X_{t}) = \log p(X_{0}) + \frac{\alpha}{2}t + \int_{0}^{t} \frac {\nabla p^{\top}\sigma(X_{s})}{p(X_{s})}{\,\mathrm{d}} W_{s} $$

for \(t<\tau=\inf\{t: p(X_{t})=0\}\). The process \(\log p(X_{t})-\alpha t/2\) is thus locally a martingale bounded from above, and hence nonexplosive by the same “McKean’s argument” as in the proof of part (i). This proves the result. □

Proof of Theorem 5.7(iii)

The proof relies on the following two lemmas.

Lemma F.1

Let \(b:{\mathbb {R}}^{d}\to{\mathbb {R}}^{d}\) and \(\sigma:{\mathbb {R}}^{d}\to {\mathbb {R}}^{d\times d}\) be continuous functions with \(\|b(x)\|^{2}+\|\sigma(x)\|^{2}\le\kappa(1+\|x\|^{2})\) for some \(\kappa>0\), and fix \(\rho>0\). Let \(Y\) be a \(d\)-dimensional Itô process satisfying \(Y_{t} = Y_{0} + \int_{0}^{t} b(Y_{s}){\,\mathrm{d}} s + \int_{0}^{t} \sigma(Y_{s}){\,\mathrm{d}} W_{s}\). Then there exist constants \(c_{1},c_{2}>0\) that only depend on \(\kappa\) and \(\rho\), but not on \(Y_{0}\), such that

$$ {\mathbb {P}}\bigg[ \sup_{s\le t}\|Y_{s}-Y_{0}\| < \rho\bigg] \ge1 - t c_{1} (1+{\mathbb {E}} [\| Y_{0}\|^{2}]), \qquad t\le c_{2}. $$

Proof

By Markov’s inequality,

$$ {\mathbb {P}}\bigg[ \sup_{s\le t}\|Y_{s}-Y_{0}\| < \rho\bigg]\ge 1-\rho ^{-2}{\mathbb {E}}\bigg[\sup_{s\le t}\|Y_{s}-Y_{0}\|^{2}\bigg]. $$

Let \(\tau_{n}\) be the first time \(\|Y_{t}\|\) reaches level \(n\). A standard argument using the BDG inequality and Jensen’s inequality yields

$$ {\mathbb {E}}\bigg[ \sup_{s\le t\wedge\tau_{n}}\|Y_{s}-Y_{0}\|^{2}\bigg] \le 2c_{2} {\mathbb {E}} \bigg[\int_{0}^{t\wedge\tau_{n}}\big( \|\sigma(Y_{s})\|^{2} + \|b(Y_{s})\|^{2}\big){\,\mathrm{d}} s \bigg] $$

for \(t\le c_{2}\), where \(c_{2}\) is the constant in the BDG inequality. The growth condition yields

$$\begin{aligned} {\mathbb {E}}\bigg[ \sup_{s\le t\wedge\tau_{n}}\!\|Y_{s}-Y_{0}\|^{2}\bigg] &\le2c_{2}\kappa{\mathbb {E}}\bigg[\int_{0}^{t\wedge\tau_{n}}( 1 + \|Y_{s}\| ^{2} ){\,\mathrm{d}} s \bigg] \\ &\le4c_{2}\kappa(1+{\mathbb {E}}[\|Y_{0}\|^{2}])t + 4c_{2}\kappa\! \int_{0}^{t}\! {\mathbb {E}}\bigg[\sup _{u\le s\wedge\tau_{n}}\!\|Y_{u}-Y_{0}\|^{2} \bigg]{\,\mathrm{d}} s, \end{aligned}$$

for \(t\le c_{2}\), and Gronwall’s lemma then gives \({\mathbb {E}}[ \sup _{s\le t\wedge \tau_{n}}\|Y_{s}-Y_{0}\|^{2}] \le c_{3}t \mathrm{e}^{4c_{2}\kappa t}\), where \(c_{3}=4c_{2}\kappa(1+{\mathbb {E}}[\|Y_{0}\|^{2}])\). Sending \(n\) to infinity and applying Fatou’s lemma concludes the proof, upon setting \(c_{1}=4c_{2}\kappa\rho^{-2}\mathrm{e}^{4c_{2}^{2}\kappa}\). □

Lemma F.2

Let \(0<\alpha<2\), and let \(Z\) be a \(\mathrm{BESQ}(\alpha)\) process starting from \(z\ge0\), with law denoted by \({\mathbb {P}}_{z}\). Let \(\tau _{0}=\inf\{t\ge0:Z_{t}=0\}\) be the first time \(Z\) hits zero. Then for any \(\varepsilon>0\),

$$ \lim_{z\to0}{\mathbb {P}}_{z}[\tau_{0}>\varepsilon] = 0. $$

Proof

By Göing-Jaeschke and Yor [26, Eq. (15)], we have

$$ {\mathbb {P}}_{z}[\tau_{0}>\varepsilon] = \int_{\varepsilon}^{\infty}\frac {1}{t\varGamma (\widehat{\nu})}\left(\frac{z}{2t}\right)^{\widehat{\nu}} \mathrm{e}^{-z/(2t)}{\,\mathrm{d}} t, $$

where \(\varGamma(\cdot)\) is the Gamma function and \(\widehat{\nu}=1-\alpha /2\in(0,1)\). Changing variables to \(s=z/(2t)\) yields \({\mathbb {P}}_{z}[\tau _{0}>\varepsilon]=\frac{1}{\varGamma(\widehat{\nu})}\int _{0}^{z/(2\varepsilon )}s^{\widehat{\nu}-1}\mathrm{e}^{-s}{\,\mathrm{d}} s\), which converges to zero as \(z\to0\) by dominated convergence. □
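
In closed form, the substituted integral is a regularized lower incomplete gamma function, so the limit in Lemma F.2 can be inspected directly (our sketch, assuming SciPy):

    # P_z[tau_0 > eps] = gammainc(nu_hat, z/(2*eps)), with nu_hat = 1 - alpha/2,
    # where gammainc is the regularized lower incomplete gamma function.
    from scipy.special import gammainc

    alpha, eps = 1.0, 0.25
    nu_hat = 1 - alpha / 2
    for z in [1.0, 0.1, 0.01, 0.001]:
        print(z, gammainc(nu_hat, z / (2 * eps)))  # decreases to 0 as z -> 0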

We may now complete the proof of Theorem 5.7(iii). The hypotheses yield

$$ 0 \le2 {\mathcal {G}}p({\overline{x}}) < h({\overline{x}})^{\top}\nabla p({\overline{x}}). $$

Hence there exist some \(\delta>0\) such that \(2 {\mathcal {G}}p({\overline{x}}) < (1-2\delta) h({\overline{x}})^{\top}\nabla p({\overline{x}})\) and an open ball \(U\) in \({\mathbb {R}}^{d}\) of radius \(\rho>0\), centered at \({\overline{x}}\), such that

$$ 2 {\mathcal {G}}p \le\left(1-\delta\right) h^{\top}\nabla p \quad\text{and}\quad h^{\top}\nabla p >0 \qquad\text{on } E\cap U. $$

Note that the radius \(\rho\) does not depend on the starting point \(X_{0}\).

For all \(t<\tau(U)=\inf\{s\ge0:X_{s}\notin U\}\wedge T\), we have

$$\begin{aligned} p(X_{t}) - p(X_{0}) - \int_{0}^{t}{\mathcal {G}}p(X_{s}){\,\mathrm{d}} s &= \int_{0}^{t} \nabla p^{\top}\sigma(X_{s}){\,\mathrm{d}} W_{s} \\ &= \int_{0}^{t} \sqrt{\nabla p^{\top}a\nabla p(X_{s})}{\,\mathrm{d}} B_{s}\\ &= 2\int_{0}^{t} \sqrt{p(X_{s})}\, \frac{1}{2}\sqrt{h^{\top}\nabla p(X_{s})}{\,\mathrm{d}} B_{s} \end{aligned}$$

for some one-dimensional Brownian motion \(B\), possibly defined on an enlargement of the original probability space. Here the equality \(a\nabla p =hp\) on \(E\) was used in the last step. Define an increasing process \(A_{t}=\int_{0}^{t}\frac{1}{4}h^{\top}\nabla p(X_{s}){\,\mathrm{d}} s\). Since \(h^{\top}\nabla p(X_{t})>0\) on \([0,\tau(U))\), the process \(A\) is strictly increasing there. It follows that the time-change \(\gamma_{u}=\inf\{ t\ge 0:A_{t}>u\}\) is continuous and strictly increasing on \([0,A_{\tau(U)})\). The time-changed process \(Y_{u}=p(X_{\gamma_{u}})\) thus satisfies

$$ Y_{u} = p(X_{0}) + \int_{0}^{u} \frac{4 {\mathcal {G}}p(X_{\gamma_{v}})}{h^{\top}\nabla p(X_{\gamma_{v}})}{\,\mathrm{d}} v + 2\int_{0}^{u} \sqrt{Y_{v}}{\,\mathrm{d}}\beta_{v}, \qquad u< A_{\tau(U)}. $$

Consider now the \(\mathrm{BESQ}(2-2\delta)\) process \(Z\) defined as the unique strong solution to the equation

$$ Z_{u} = p(X_{0}) + (2-2\delta)u + 2\int_{0}^{u} \sqrt{Z_{v}}{\,\mathrm{d}}\beta_{v}. $$

Since \(4 {\mathcal {G}}p(X_{t}) / h^{\top}\nabla p(X_{t}) \le2-2\delta\) for \(t<\tau(U)\), a standard comparison theorem implies that \(Y_{u}\le Z_{u}\) for \(u< A_{\tau(U)}\); see for instance Rogers and Williams [42, Theorem V.43.1]. It is well known that a BESQ\((\alpha)\) process hits zero if and only if \(\alpha<2\); see Revuz and Yor [41, page 442]. It thus remains to exhibit \(\varepsilon>0\) such that if \(\|X_{0}-\overline{x}\|<\varepsilon\) almost surely, there is a positive probability that \(Z_{u}\) hits zero before \(X_{\gamma_{u}}\) leaves \(U\), or equivalently, that \(Z_{u}=0\) for some \(u< A_{\tau(U)}\). To this end, set \(C=\inf_{x\in E\cap U} h(x)^{\top}\nabla p(x)/4\), which is strictly positive after shrinking \(\rho\) if necessary, since \(h^{\top}\nabla p\) is continuous and \(h({\overline{x}})^{\top}\nabla p({\overline{x}})>0\); then \(A_{\tau(U)}\ge C\tau(U)\). Let \(\eta>0\) be a number to be determined later. We have

$$ \begin{aligned} &{\mathbb {P}}\Big[ \eta< A_{\tau(U)} \text{ and } \inf_{u\le\eta} Z_{u} = 0\Big] \\ &\ge{\mathbb {P}}\big[ \eta< A_{\tau(U)} \big] - {\mathbb {P}}\Big[ \inf_{u\le\eta } Z_{u} > 0\Big] \\ &\ge{\mathbb {P}}\big[ \eta C^{-1} < \tau(U) \big] - {\mathbb {P}}\Big[ \inf_{u\le \eta} Z_{u} > 0\Big] \\ &= {\mathbb {P}}\bigg[ \sup_{t\le\eta C^{-1}} \|X_{t} - {\overline{x}}\| < \rho \bigg] - {\mathbb {P}}\Big[ \inf_{u\le\eta} Z_{u} > 0\Big] \\ &\ge{\mathbb {P}}\bigg[ \sup_{t\le\eta C^{-1}} \|X_{t} - X_{0}\| < \rho/2 \bigg] - {\mathbb {P}} \Big[ \inf_{u\le\eta} Z_{u} > 0\Big], \end{aligned} $$
(F.2)

where we recall that \(\rho\) is the radius of the open ball \(U\), and where the last inequality follows from the triangle inequality provided \(\|X_{0}-{\overline{x}}\|\le\rho/2\). By Lemma F.1, we can choose \(\eta>0\) independently of \(X_{0}\) so that \({\mathbb {P}}[ \sup _{t\le\eta C^{-1}} \|X_{t} - X_{0}\| <\rho/2 ]>1/2\). Then by Lemma F.2, we have \({\mathbb {P}}[ \inf_{u\le\eta} Z_{u} > 0]<1/3\) whenever \(Z_{0}=p(X_{0})\) is sufficiently close to zero. This happens if \(X_{0}\) is sufficiently close to \({\overline{x}}\), say within a distance \(\rho'>0\). Thus, setting \(\varepsilon=\rho'\wedge(\rho/2)\), the condition \(\|X_{0}-{\overline{x}}\| <\rho'\wedge(\rho/2)\) implies that (F.2) is valid, with the right-hand side strictly positive. The theorem is proved.  □
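The comparison step lends itself to a simple Monte Carlo illustration (a simulation sketch only; all numerical parameters below are arbitrary and not taken from the paper). An Euler scheme for the \(\mathrm{BESQ}(\alpha)\) equation with \(\alpha<2\), started near zero and absorbed at zero, shows that the path is absorbed quickly with high probability, which is the mechanism exploited in the proof.

```python
# Monte Carlo sketch: a BESQ(alpha) path with alpha < 2 started near zero
# is absorbed at zero quickly (cf. Lemma F.2). Euler scheme with absorption;
# all parameters are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
alpha, z0, eta = 1.0, 0.01, 0.5       # dimension, starting point, time horizon
n_paths, n_steps = 10_000, 2_000
dt = eta / n_steps

z = np.full(n_paths, z0)
alive = np.ones(n_paths, dtype=bool)  # paths that have not hit zero yet
for _ in range(n_steps):
    dw = rng.normal(0.0, np.sqrt(dt), n_paths)
    z = np.where(alive, z + alpha * dt + 2.0 * np.sqrt(np.maximum(z, 0.0)) * dw, 0.0)
    alive &= z > 0                    # absorb at the first crossing of zero

print("estimated P[inf_{u<=eta} Z_u > 0] =", alive.mean())
```

For \(z_{0}\) close to zero the estimate is small, consistent with Lemma F.2.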

Appendix G: Proof of Proposition 6.1

Condition (G1) is vacuously true, so we prove (G2). If \(d=1\), then \(\{p=0\}=\{-1,1\}\), and it is clear that any univariate polynomial vanishing on this set has \(p(x)=1-x^{2}\) as a factor. Thus (G2) holds. If \(d\ge2\), then \(p(x)=1-x^{\top}Qx\) is irreducible and changes sign, so (G2) follows from Lemma 5.4.

Next, it is straightforward to verify that (6.1), (6.2) imply (A0)–(A2), so we focus on the converse direction and assume (A0)–(A2) hold. We first prove that \(a(x)\) has the stated form. Write \(a(x)=\alpha+ L(x) + A(x)\), where \(\alpha=a(0)\in{\mathbb {S}}^{d}_{+}\), \(L(x)\in{\mathbb {S}}^{d}\) is linear in \(x\), and \(A(x)\in{\mathbb {S}}^{d}\) is homogeneous of degree two in \(x\). Since \(a(x)Qx=-a(x)\nabla p(x)/2=0\) on \(\{p=0\}\), we have for any \(x\in\{p=0\}\) and \(\epsilon\in\{-1,1\}\) that

$$ 0 = \epsilon a(\epsilon x) Q x = \epsilon\big( \alpha Qx + A(x)Qx \big) + L(x)Qx. $$

This implies \(L(x)Qx=0\) for all \(x\in\{p=0\}\), and thus, by scaling, for all \(x\in{\mathbb {R}}^{d}\). We now argue that this implies \(L=0\). To this end, consider the linear map \(T: {\mathcal {X}}\to{\mathcal {Y}}\) where

$$\begin{aligned} {\mathcal {X}}&=\{\text{all linear maps ${\mathbb {R}}^{d}\to{\mathbb {S}}^{d}$}\}, \\ {\mathcal {Y}}&=\{\text{all second degree homogeneous maps ${\mathbb {R}}^{d}\to{\mathbb {R}}^{d}$}\}, \end{aligned}$$

and \(TK\in{\mathcal {Y}}\) is given by \((TK)(x) = K(x)Qx\). One readily checks that \(\dim{\mathcal {X}}=\dim{\mathcal {Y}}=d^{2}(d+1)/2\). Thus if we can show that \(T\) is surjective, the rank-nullity theorem \(\dim(\ker T) + \dim(\operatorname{range} T) = \dim{\mathcal {X}}\) implies that \(\ker T\) is trivial. But the identity \(L(x)Qx\equiv0\) states precisely that \(L\in\ker T\), yielding \(L=0\) as desired. To see that \(T\) is surjective, note that \({\mathcal {Y}}\) is spanned by elements of the form

$$ (0,\ldots,0,x_{i}x_{j},0,\ldots,0)^{\top}$$

with the \(k\)th component being nonzero. But all these elements can be realized as \((TK)(x)=K(x)Qx\) as follows: If \(i,j,k\) are all distinct, one may take

$$ \begin{pmatrix} K_{ii} & K_{ij} &K_{ik} \\ K_{ji} & K_{jj} &K_{jk} \\ K_{ki} & K_{kj} &K_{kk} \end{pmatrix} \!(x) = \frac{1}{2} \begin{pmatrix} 0 &-x_{k} &x_{j} \\ -x_{k} &0 &x_{i} \\ x_{j} &x_{i} &0 \end{pmatrix} \begin{pmatrix} Q_{ii}& 0 &0 \\ 0 & Q_{jj} &0 \\ 0 & 0 &Q_{kk} \end{pmatrix}, $$

and all remaining entries of \(K(x)\) equal to zero. If \(i=k\), one takes \(K_{ii}(x)=x_{j}\) and the remaining entries zero, and similarly if \(j=k\). If \(i=j\ne k\), one sets

$$ \begin{pmatrix} K_{ii} & K_{ik} \\ K_{ki} & K_{kk} \end{pmatrix} \!(x) = \begin{pmatrix} -x_{k} &x_{i} \\ x_{i} &0 \end{pmatrix} \begin{pmatrix} Q_{ii}& 0 \\ 0 & Q_{kk} \end{pmatrix}, $$

and the remaining entries zero. This covers all possible cases, and shows that \(T\) is surjective. Thus \(L=0\) as claimed.
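Independently of the explicit construction above, the surjectivity of \(T\) can be sanity-checked numerically for small \(d\) by assembling the matrix of \(T\) in the monomial bases of \({\mathcal {X}}\) and \({\mathcal {Y}}\) and computing its rank. The sketch below does this for \(d=3\) with a randomly generated positive definite \(Q\) (an arbitrary illustrative choice).

```python
# Numerical sanity check: the map T(K)(x) = K(x)Qx from linear maps
# R^d -> S^d to homogeneous quadratic maps R^d -> R^d has full rank
# d^2(d+1)/2, hence is bijective. Illustrative sketch with d = 3.
import numpy as np
from itertools import combinations_with_replacement

d = 3
rng = np.random.default_rng(1)
M = rng.normal(size=(d, d))
Q = M @ M.T + d * np.eye(d)       # a generic positive definite Q

pairs = list(combinations_with_replacement(range(d), 2))  # monomials x_p x_q, p <= q
cols = []
for a, b in pairs:                # basis of X: K(x) = x_c * E^{(ab)}
    E = np.zeros((d, d))
    E[a, b] = E[b, a] = 1.0       # symmetric elementary matrix E^{(ab)}
    EQ = E @ Q
    for c in range(d):
        col = np.zeros((d, len(pairs)))   # coefficients of (TK)_i in x_p x_q
        for j in range(d):
            col[:, pairs.index((min(c, j), max(c, j)))] += EQ[:, j]
        cols.append(col.ravel())

T = np.column_stack(cols)
print(T.shape, np.linalg.matrix_rank(T))  # rank 18 = d^2 (d+1) / 2
```

The computed rank equals \(d^{2}(d+1)/2\), confirming bijectivity of \(T\) for this \(Q\).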

At this point, we have shown that \(a(x)=\alpha+A(x)\) with \(A\) homogeneous of degree two. Next, since \(a \nabla p=0\) on \(\{p=0\}\), there exists a vector \(h\) of polynomials such that \(-a \nabla p/2=h p\), that is, \(a(x)Qx=h(x)p(x)\). By counting degrees, \(h\) is of the form \(h(x)=f+Fx\) for some \(f\in {\mathbb {R}}^{d}\), \(F\in{\mathbb {R}}^{d\times d}\). For any \(s>0\) and \(x\in{\mathbb {R}}^{d}\) such that \(sx\in E\),

$$ \alpha Qx + s^{2} A(x)Qx = -\frac{1}{2s}a(sx)\nabla p(sx) = (1-s^{2}x^{\top}Qx)(s^{-1}f + Fx). $$

By sending \(s\) to zero, we deduce \(f=0\) and \(\alpha Qx=Fx\) for all \(x\) in some open set, hence \(F=\alpha Q\). Thus \(a(x)Qx=(1-x^{\top}Qx)\alpha Qx\), first for all \(x\in E\) and then, both sides being polynomials, for all \(x\in{\mathbb {R}}^{d}\). Defining \(c(x)=a(x) - (1-x^{\top}Qx)\alpha\), this shows that \(c(x)Qx=0\) for all \(x\in{\mathbb {R}}^{d}\), that \(c(0)=0\), and that \(c(x)\) has no linear part. In particular, \(c\) is homogeneous of degree two. To prove that \(c\in{\mathcal {C}}^{Q}_{+}\), it only remains to show that \(c(x)\) is positive semidefinite for all \(x\). For this we observe that for any \(u\in{\mathbb {R}}^{d}\) and any \(x\in\{p=0\}\),

$$ u^{\top}c(x) u = u^{\top}a(x) u \ge0. $$

In view of the homogeneity of \(c\), positive semidefiniteness follows for all \(x\). Thus \(c\in{\mathcal {C}}^{Q}_{+}\), and hence \(a(x)\) has the stated form. Furthermore, the drift vector is always of the form \(b(x)=\beta+Bx\), and a brief calculation using the expressions for \(a(x)\) and \(b(x)\) shows that the condition \({\mathcal {G}}p> 0\) on \(\{p=0\}\) is equivalent to (6.2).  □

Appendix H: Proof of Proposition 6.4

Condition (G1) is vacuously true, and it is not hard to check that (G2) holds.

Next, it is straightforward to verify that (i) and (ii) imply (A0)–(A2), so we focus on the converse direction and assume (A0)–(A2) hold.

We first deduce (i) from the condition \(a \nabla p=0\) on \(\{p=0\}\) for all \(p\in{\mathcal {P}}\), together with the requirement that \(a(x)\) be positive semidefinite. Taking \(p(x)=x_{i}\), \(i=1,\ldots,d\), we obtain \(a(x)\nabla p(x) = a(x) e_{i} = 0\) on \(\{x_{i}=0\}\). Hence the \(i\)th column of \(a(x)\) is a polynomial multiple of \(x_{i}\). Similarly, with \(p(x)=1-x_{i}\), \(i\in I\), it follows that \(a(x)e_{i}\) is a polynomial multiple of \(1-x_{i}\) for \(i\in I\). Hence, by symmetry of \(a\), we get

$$ \gamma_{ji}x_{i}(1-x_{i}) = a_{ji}(x) = a_{ij}(x) = h_{ij}(x)x_{j}\qquad (i\in I,\ j\in I\cup J) $$

for some constants \(\gamma_{ji}\) and polynomials \(h_{ij}\in{\mathrm{Pol}}_{1}(E)\) (using also that \(\deg a_{ij}\le2\)). For \(i\ne j\), this is possible only if \(a_{ij}(x)=0\), and for \(i=j\in I\) it implies that \(a_{ii}(x)=\gamma_{i}x_{i}(1-x_{i})\) with \(\gamma_{i}=\gamma_{ii}\), as desired. Positive semidefiniteness forces \(\gamma_{i}\ge0\).

Now consider \(i,j\in J\). By the above, we have \(a_{ij}(x)=h_{ij}(x)x_{j}\) for some \(h_{ij}\in{\mathrm{Pol}}_{1}(E)\). Similarly as before, symmetry of \(a(x)\) yields

$$ h_{ij}(x)x_{j} = a_{ij}(x) = a_{ji}(x) = h_{ji}(x)x_{i}, $$

so that for \(i\ne j\), \(h_{ij}\) has \(x_{i}\) as a factor. It follows that \(a_{ij}(x)=\alpha_{ij}x_{i}x_{j}\) for some \(\alpha_{ij}\in{\mathbb {R}}\). If \(i=j\), we get \(a_{jj}(x)=\alpha_{jj}x_{j}^{2}+x_{j}(\phi_{j}+\psi_{(j)}^{\top}x_{I} + \pi _{(j)}^{\top}x_{J})\) for some \(\alpha_{jj}\in{\mathbb {R}}\), \(\phi_{j}\in {\mathbb {R}}\), \(\psi _{(j)}\in{\mathbb {R}}^{m}\), \(\pi_{(j)}\in{\mathbb {R}}^{n}\) with \(\pi _{(j),j}=0\). Positive semidefiniteness requires \(a_{jj}(x)\ge0\) for all \(x\in E\). This directly yields \(\pi_{(j)}\in{\mathbb {R}}^{n}_{+}\). Further, by setting \(x_{i}=0\) for \(i\in J\setminus\{j\}\) and making \(x_{j}>0\) sufficiently small, we see that \(\phi_{j}+\psi_{(j)}^{\top}x_{I}\ge0\) is required for all \(x_{I}\in [0,1]^{m}\), which forces \(\phi_{j}\ge(\psi_{(j)}^{-})^{\top}{\mathbf{1}}\). Finally, let \(\alpha\in{\mathbb {S}}^{n}\) be the matrix with elements \(\alpha_{ij}\) for \(i,j\in J\), let \(\varPsi\in{\mathbb {R}}^{m\times n}\) have columns \(\psi_{(j)}\), and \(\varPi \in{\mathbb {R}} ^{n\times n}\) columns \(\pi_{(j)}\). We then have

$$\begin{aligned} s^{-2} a_{JJ}(x_{I},s x_{J}) &= \operatorname{Diag}(x_{J})\alpha \operatorname{Diag}(x_{J}) \\ &\phantom{=:}{} + \operatorname{Diag}(x_{J})\operatorname{Diag}\big(s^{-1}(\phi+\varPsi^{\top}x_{I}) + \varPi ^{\top}x_{J}\big), \end{aligned}$$

so by sending \(s\) to infinity we see that \(\alpha+ \operatorname {Diag}(\varPi^{\top}x_{J})\operatorname{Diag}(x_{J})^{-1}\) must lie in \({\mathbb {S}}^{n}_{+}\) for all \(x_{J}\in {\mathbb {R}}^{n}_{++}\). This proves (i).
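For concrete model parameters, the limiting condition just derived can be tested by sampling. The sketch below checks positive semidefiniteness of \(\alpha+\operatorname{Diag}(\varPi^{\top}x_{J})\operatorname{Diag}(x_{J})^{-1}\) at random points of \({\mathbb {R}}^{n}_{++}\); the inputs \(\alpha\) and \(\varPi\) are arbitrary illustrative choices (any positive semidefinite \(\alpha\) together with an entrywise nonnegative \(\varPi\) satisfies the condition trivially).

```python
# Sampling check of the condition: alpha + Diag(Pi^T x_J) Diag(x_J)^{-1}
# must be positive semidefinite for all x_J in R^n_{++}. The inputs below
# are arbitrary choices that satisfy the condition.
import numpy as np

rng = np.random.default_rng(3)
n = 3
alpha = np.eye(n)                               # any PSD alpha works here
Pi = rng.uniform(0.0, 1.0, (n, n))
np.fill_diagonal(Pi, 0.0)                       # pi_(j),j = 0

ok = True
for _ in range(1000):
    xJ = rng.uniform(0.01, 10.0, n)             # a random point of R^n_{++}
    S = alpha + np.diag(Pi.T @ xJ) @ np.diag(1.0 / xJ)
    ok &= np.linalg.eigvalsh(S).min() > -1e-10  # PSD up to rounding
print("condition holds on all samples:", bool(ok))
```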

For (ii), note that \({\mathcal {G}}p(x) = b_{i}(x)\) for \(p(x)=x_{i}\), and \({\mathcal {G}}p(x)=-b_{i}(x)\) for \(p(x)=1-x_{i}\). In particular, if \(i\in I\), then \(b_{i}(x)>0\) on \(\{x_{i}=0\}\cap E\) and \(b_{i}(x)<0\) on \(\{x_{i}=1\}\cap E\); since \(x_{J}\) ranges over the unbounded set \({\mathbb {R}}^{n}_{+}\) on both faces, the coefficient of \(x_{J}\) in \(b_{i}\) must vanish, so \(b_{i}(x)\) cannot depend on \(x_{J}\). This establishes (6.4). Next, for \(i\in I\), we have \(\beta_{i}+B_{iI}x_{I}> 0\) for all \(x_{I}\in[0,1]^{m}\) with \(x_{i}=0\), and this yields \(\beta_{i} - (B^{-}_{i,I\setminus\{i\}}){\mathbf{1}}> 0\). Similarly, \(\beta_{i}+B_{iI}x_{I}<0\) for all \(x_{I}\in[0,1]^{m}\) with \(x_{i}=1\), so that \(\beta_{i} + (B^{+}_{i,I\setminus\{i\}}){\mathbf{1}}+ B_{ii}< 0\). For \(j\in J\), we may set \(x_{J}=0\) to see that \(\beta_{J}+B_{JI}x_{I}\in{\mathbb {R}}^{n}_{++}\) for all \(x_{I}\in[0,1]^{m}\). Hence \(\beta_{j}> (B^{-}_{jI}){\mathbf{1}}\) for all \(j\in J\). Moreover, fixing \(j\in J\), setting \(x_{j}=0\) and letting \(x_{i}\to\infty\) for \(i\in J\setminus\{j\}\) forces \(B_{ji}\ge0\). The proof of (ii) is complete.  □

Appendix I: Proof of Proposition 6.6

Since \({\mathcal {Q}}\) consists of the single polynomial \(q(x)=1-{\mathbf{1}} ^{\top}x\), it is clear that (G1) holds. To prove (G2), it suffices by Lemma 5.5 to prove for each \(i\) that the ideal \((x_{i}, 1-{\mathbf {1}}^{\top}x)\) is prime and has dimension \(d-2\). But an affine change of coordinates shows that this is equivalent to the same statement for \((x_{1},x_{2})\), which is well known to be true.

Next, the only nontrivial aspect of verifying that (i) and (ii) imply (A0)–(A2) is to check that \(a(x)\) is positive semidefinite for each \(x\in E\). To do this, fix any \(x\in E\) and let \(\varLambda\) denote the diagonal matrix with \(a_{ii}(x)\), \(i=1,\ldots,d\), on the diagonal. Then for each \(s\in[0,1)\), the matrix \(A(s)=(1-s)(\varLambda+{\mathrm{Id}})+sa(x)\) is strictly diagonally dominant (see footnote 5) with positive diagonal elements. Hence by Horn and Johnson [30, Theorem 6.1.10], it is positive definite. But since \({\mathbb {S}}^{d}_{+}\) is closed and \(\lim_{s\to1}A(s)=a(x)\), we get \(a(x)\in{\mathbb {S}}^{d}_{+}\).
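The diagonal-dominance argument is easy to illustrate numerically. The following sketch builds \(a(x)\) at a random point of the unit simplex from the form derived below in the proof of (i) (on the simplex, \(a_{ij}=-\alpha_{ij}x_{i}x_{j}\) for \(i\ne j\) and \(a_{ii}=\sum_{j\ne i}\alpha_{ij}x_{i}x_{j}\)), with randomly chosen nonnegative \(\alpha_{ij}\) as arbitrary illustrative inputs, and verifies both the strict diagonal dominance of \(A(s)\) and the positive semidefiniteness of \(a(x)\).

```python
# Sketch of the diagonal-dominance argument on the unit simplex:
# off-diagonal a_ij = -alpha_ij x_i x_j, diagonal a_ii = sum_{j != i}
# alpha_ij x_i x_j, with alpha_ij >= 0 chosen at random for illustration.
import numpy as np

rng = np.random.default_rng(2)
d = 4
alpha = rng.uniform(0.0, 1.0, (d, d))
alpha = (alpha + alpha.T) / 2.0           # symmetric, nonnegative entries
x = rng.dirichlet(np.ones(d))             # a random point of the unit simplex

a = -alpha * np.outer(x, x)               # off-diagonal entries
for i in range(d):                        # diagonal entries
    a[i, i] = sum(alpha[i, j] * x[i] * x[j] for j in range(d) if j != i)

for s in [0.5, 0.9, 0.99]:
    A = (1 - s) * (np.diag(np.diag(a)) + np.eye(d)) + s * a
    row_off = np.abs(A).sum(axis=1) - np.abs(np.diag(A))
    print(f"s = {s}: strictly diagonally dominant:",
          bool(np.all(np.abs(np.diag(A)) > row_off)))

print("min eigenvalue of a(x):", np.linalg.eigvalsh(a).min())  # >= 0 up to rounding
```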

We now focus on the converse direction and assume (A0)–(A2) hold. We first prove (i). As the ideal \((x_{i},1-{\mathbf{1}}^{\top}x)\) satisfies (G2) for each \(i\), the condition \(a(x)e_{i}=0\) on \(M\cap\{x_{i}=0\}\) implies that

$$ a_{ji}(x) = x_{i} h_{ji}(x) + (1-{\mathbf{1}}^{\top}x) g_{ji}(x) $$
(I.1)

for some polynomials \(h_{ji}\) and \(g_{ji}\) in \({\mathrm {Pol}}_{1}({\mathbb {R}}^{d})\). Suppose \(j\ne i\). By symmetry of \(a(x)\), we get

$$ x_{j}h_{ij}(x) = x_{i}h_{ji}(x) + (1-{\mathbf{1}}^{\top}x) \big(g_{ji}(x) - g_{ij}(x)\big). $$

Thus \(h_{ij}=0\) on \(M\cap\{x_{i}=0\}\cap\{x_{j}\ne0\}\), and, by continuity, on \(M\cap\{x_{i}=0\}\). Another application of (G2) and counting degrees gives \(h_{ij}(x)=-\alpha_{ij}x_{i}+(1-{\mathbf{1}}^{\top}x)\gamma_{ij}\) for some constants \(\alpha_{ij}\) and \(\gamma_{ij}\). This proves \(a_{ij}(x)=-\alpha_{ij}x_{i}x_{j}\) on \(E\) for \(i\ne j\), as claimed. For \(i=j\), note that (I.1) can be written as

$$ a_{ii}(x) = -\alpha_{ii}x_{i}^{2} + x_{i}(\phi_{i} + \psi_{(i)}^{\top}x) + (1-{\mathbf{1}} ^{\top}x) g_{ii}(x) $$

for some constants \(\alpha_{ii}\), \(\phi_{i}\) and vectors \(\psi_{(i)}\in{\mathbb {R}}^{d}\) with \(\psi_{(i),i}=0\). We need to identify \(\phi_{i}\) and \(\psi_{(i)}\). To this end, note that the condition \(a(x){\mathbf{1}}=0\) on \(\{1-{\mathbf{1}}^{\top}x=0\}\) yields \(a(x){\mathbf{1}}=(1-{\mathbf{1}}^{\top}x)f(x)\) for all \(x\in{\mathbb {R}}^{d}\), where \(f\) is some vector of polynomials \(f_{i}\in{\mathrm{Pol}}_{1}({\mathbb {R}}^{d})\). Writing the \(i\)th component of \(a(x){\mathbf{1}}\) in two ways then yields

$$ \begin{aligned} x_{i}\bigg( -\sum_{j=1}^{d} \alpha_{ij}x_{j} + \phi_{i} + \psi_{(i)}^{\top}x\bigg) &= (1 - {\mathbf{1}}^{\top}x)\big(f_{i}(x) - g_{ii}(x)\big) \\ &= (1 - {\mathbf{1}}^{\top}x)\big(\eta_{i} + ({\mathrm {H}}x)_{i}\big) \end{aligned} $$
(I.2)

for all \(x\in{\mathbb {R}}^{d}\) and some \(\eta\in{\mathbb {R}}^{d}\), \({\mathrm {H}} \in{\mathbb {R}}^{d\times d}\). Replacing \(x\) by \(sx\), dividing by \(s\) and sending \(s\) to zero gives \(x_{i}\phi_{i} = \lim_{s\to0} s^{-1}\eta_{i} + ({\mathrm {H}}x)_{i}\), which forces \(\eta _{i}=0\), \({\mathrm {H}}_{ij}=0\) for \(j\ne i\) and \({\mathrm {H}}_{ii}=\phi _{i}\). Substituting into (I.2) and rearranging yields

$$ x_{i}\bigg(- \sum_{j=1}^{d} \alpha_{ij}x_{j} + \psi_{(i)}^{\top}x + \phi _{i} {\mathbf{1}} ^{\top}x\bigg) = 0 $$
(I.3)

for all \(x\in{\mathbb {R}}^{d}\). The coefficient in front of \(x_{i}^{2}\) on the left-hand side is \(-\alpha_{ii}+\phi_{i}\) (recall that \(\psi_{(i),i}=0\)), which therefore is zero. That is, \(\phi_{i}=\alpha_{ii}\). With this in mind, (I.3) becomes \(x_{i} \sum_{j\ne i} (-\alpha _{ij}+\psi _{(i),j}+\alpha_{ii})x_{j} = 0\) for all \(x\in{\mathbb {R}}^{d}\), which implies \(\psi _{(i),j}=\alpha_{ij}-\alpha_{ii}\). At this point, we have proved

$$ a_{ii}(x) = -\alpha_{ii}x_{i}^{2} + x_{i}\bigg(\alpha_{ii} + \sum_{j\ne i}(\alpha_{ij}-\alpha_{ii})x_{j}\bigg) = \alpha_{ii}x_{i}(1-{\mathbf {1}}^{\top}x) + \sum_{j\ne i}\alpha_{ij}x_{i}x_{j} $$

on \(E\), which yields the stated form of \(a_{ii}(x)\). It remains to show that \(\alpha_{ij}\ge0\) for all \(i\ne j\). To see this, suppose for contradiction that \(\alpha_{ik}<0\) for some \((i,k)\). Pick \(s\in(0,1)\) and set \(x_{k}=s\), \(x_{j}=(1-s)/(d-1)\) for \(j\ne k\). Then

$$ a_{ii}(x) = x_{i} \sum_{j\ne i}\alpha_{ij}x_{j} = x_{i}\bigg(\alpha_{ik}s + \frac{1-s}{d-1}\sum_{j\ne i,k}\alpha_{ij}\bigg). $$

For \(s\) sufficiently close to 1, the right-hand side becomes negative, which contradicts positive semidefiniteness of \(a\) on \(E\). This proves (i).

For (ii), first note that we always have \(b(x)=\beta+Bx\) for some \(\beta \in{\mathbb {R}}^{d}\) and \(B\in{\mathbb {R}}^{d\times d}\). The condition \({\mathcal {G}}q=0\) on \(M\) for \(q(x)=1-{\mathbf{1}}^{\top}x\) yields \(\beta^{\top}{\mathbf{1}}+ x^{\top}B^{\top}{\mathbf{1}}= 0\) on \(M\). Hence by Lemma 5.4, \(\beta^{\top}{\mathbf{1}}+ x^{\top}B^{\top}{\mathbf{1}} =\kappa(1-{\mathbf{1}}^{\top}x)\) for all \(x\in{\mathbb {R}}^{d}\) and some constant \(\kappa\). This yields \(\beta^{\top}{\mathbf{1}}=\kappa\) and then \(B^{\top}{\mathbf {1}}=-\kappa {\mathbf{1}} =-(\beta^{\top}{\mathbf{1}}){\mathbf{1}}\). Next, the condition \({\mathcal {G}}p_{i} \ge0\) on \(M\cap\{ p_{i}=0\}\) for \(p_{i}(x)=x_{i}\) can be written as

$$ \min\Bigg\{ \beta_{i} + {\sum_{j=1}^{d}} B_{ji}x_{j}: x\in{\mathbb {R}}^{d}_{+}, {\mathbf{1}} ^{\top}x = {\mathbf{1}}, x_{i}=0\Bigg\} \ge0, $$

which in turn is equivalent to

$$ \min\Biggl\{ \beta_{i} + {\sum_{j\ne i}} B_{ji}x_{j}: x\in{\mathbb {R}}^{d}_{+}, {\sum_{j\ne i}} x_{j}=1\Biggr\} \ge0. $$

The feasible region of this optimization problem is the convex hull of \(\{e_{j}:j\ne i\}\), and the linear objective function achieves its minimum at one of the extreme points. Thus we obtain \(\beta_{i}+B_{ji} \ge0\) for all \(j\ne i\) and all \(i\), as required.  □
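The extreme-point reduction in the last step admits a quick sampling illustration (a sketch with arbitrary random \(\beta\) and \(B\), not parameters from the paper): the affine objective, evaluated at random points of the face, never falls below its minimum over the vertices \(e_{j}\), \(j\ne i\).

```python
# Sketch: an affine objective over the face {x >= 0, 1^T x = 1, x_i = 0}
# attains its minimum at a vertex e_j, j != i. Random beta, B for illustration.
import numpy as np

rng = np.random.default_rng(4)
d, i = 4, 0
beta = rng.normal(size=d)
B = rng.normal(size=(d, d))

# objective beta_i + sum_j B_ji x_j at the vertices e_j, j != i
vertex_min = min(beta[i] + B[j, i] for j in range(d) if j != i)

samples = []
for _ in range(10_000):
    w = rng.dirichlet(np.ones(d - 1))     # random point of the face
    xv = np.insert(w, i, 0.0)             # x_i = 0, x >= 0, sum x = 1
    samples.append(beta[i] + B[:, i] @ xv)

print(vertex_min, min(samples), min(samples) >= vertex_min - 1e-12)
```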

Appendix J: Some notions from algebraic geometry

In this appendix, we briefly review some well-known concepts and results from algebra and algebraic geometry. The reader is referred to Dummit and Foote [16, Chaps. 7 and 15] and Bochnak et al. [6, Chap. 4] for more details.

An ideal \(I\) of \({\mathrm{Pol}}({\mathbb {R}}^{d})\) is a subset of \({\mathrm{Pol}} ({\mathbb {R}}^{d})\) closed under addition and such that \(f\in I\) and \(g\in{\mathrm {Pol}}({\mathbb {R}}^{d})\) implies \(fg\in I\). Given a finite family \({\mathcal {R}}=\{r_{1},\ldots,r_{m}\}\) of polynomials, the ideal generated by ℛ, denoted by \(({\mathcal {R}})\) or \((r_{1},\ldots,r_{m})\), is the ideal consisting of all polynomials of the form \(f_{1} r_{1}+\cdots+f_{m}r_{m}\), with \(f_{i}\in{\mathrm {Pol}}({\mathbb {R}}^{d})\). Given any set of polynomials \(S\), its zero set is the set

$$ {\mathcal {V}}(S)=\{x\in{\mathbb {R}}^{d}:f(x)=0 \text{ for all }f\in S\}. $$

The zero set of the family ℛ coincides with the zero set of the ideal \(I=({\mathcal {R}})\), that is, \({\mathcal {V}}( {\mathcal {R}})={\mathcal {V}}(I)\). For example, the set \(M\) in (5.1) is the zero set of the ideal \(({\mathcal {Q}})\). Given a set \(V\subseteq{\mathbb {R}}^{d}\), the ideal generated by  \(V\), denoted by \({\mathcal {I}}(V)\), is the set of all polynomials that vanish on \(V\). It follows from the definition that \(S\subseteq{\mathcal {I}}({\mathcal {V}}(S))\) for any set \(S\) of polynomials. A basic problem in algebraic geometry is to establish when an ideal \(I\) is equal to the ideal generated by the zero set of \(I\),

$$ I = {\mathcal {I}}\big({\mathcal {V}}(I)\big). $$
(J.1)

If the ideal \(I=({\mathcal {R}})\) satisfies (J.1), then any polynomial \(f\) that vanishes on the zero set \({\mathcal {V}}(I)\) has a representation \(f=f_{1}r_{1}+\cdots+f_{m}r_{m}\) for some polynomials \(f_{1},\ldots,f_{m}\).
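As a concrete illustration (a sketch, not taken from the paper), consider the ideal \(I=(x, 1-x-y-z)\) in three variables, which is the ideal \((x_{i},1-{\mathbf{1}}^{\top}x)\) from Appendix I for \(d=3\). Dividing a polynomial by a Gröbner basis of \(I\) certifies membership and produces the cofactors of such a representation explicitly; the polynomial \(f\) below is an arbitrary element of the ideal.

```python
# Sketch: certifying ideal membership and extracting a representation
# f = f_1 r_1 + ... + f_m r_m by division by a Groebner basis (sympy).
from sympy import symbols, groebner, reduced, expand

x, y, z = symbols('x y z')
gens = [x, 1 - x - y - z]                  # generators of the ideal I
f = expand(x*(y + 1) + (1 - x - y - z)*z)  # manifestly an element of I

G = groebner(gens, x, y, z, order='lex')
quotients, remainder = reduced(f, list(G.exprs), x, y, z, order='lex')
print(list(G.exprs))          # Groebner basis, here [x, y + z - 1]
print(quotients, remainder)   # zero remainder certifies f in I
```

The zero remainder is exactly the certificate that \(f\) lies in \(I={\mathcal {I}}({\mathcal {V}}(I))\) in this example.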

An ideal \(I\) of \({\mathrm{Pol}}({\mathbb {R}}^{d})\) is said to be prime if it is not all of \({\mathrm{Pol}}({\mathbb {R}}^{d})\) and if the conditions \(f,g\in {\mathrm{Pol}}({\mathbb {R}}^{d})\) and \(fg\in I\) imply \(f\in I\) or \(g\in I\). The dimension of an ideal \(I\) of \({\mathrm{Pol}} ({\mathbb {R}}^{d})\) is the dimension of the quotient ring \({\mathrm {Pol}}({\mathbb {R}}^{d})/I\); for a definition of the latter, see Dummit and Foote [16, Sect. 16.1].
