
Concentration inequalities for matrix martingales in continuous time

Abstract

This paper gives new concentration inequalities for the spectral norm of a wide class of matrix martingales in continuous time. These results extend previously established Freedman and Bernstein inequalities for series of random matrices to the class of continuous-time processes. Our analysis relies on a new supermartingale property of the trace exponential, proved within the framework of stochastic calculus. We also provide several examples illustrating that our results easily recover several previously obtained sharp bounds for discrete-time matrix martingales.


Notes

  1. Let us note that this definition does not imply that a purely discontinuous martingale is the sum of its jumps: for example, the compensated Poisson process \(N_t-\lambda t\) is a purely discontinuous martingale whose paths have a continuous component, namely the finite-variation drift \(-\lambda t\).

References

  1. Ahlswede, R., Winter, A.: Strong converse for identification via quantum channels. IEEE Trans. Inf. Theory 48(3), 569–579 (2002)


  2. Bandeira, A.S.: Concentration inequalities, scalar and matrix versions. Lecture Notes. Available at http://math.mit.edu/~bandeira (2015)

  3. Bhatia, R.: Matrix Analysis. Springer, New York (1997)


  4. Brémaud, P.: Point Processes and Queues: Martingale Dynamics. Springer, New York (1981)


  5. Bunea, F., She, Y., Wegkamp, M.H.: Optimal selection of reduced rank estimators of high-dimensional matrices. Ann. Stat. 39(2), 1282–1309 (2011)


  6. Candès, E.J., Li, X., Ma, Y., Wright, J.: Robust principal component analysis? J. ACM 58(3), 11:1–11:37 (2011)


  7. Christofides, D., Markström, K.: Expansion properties of random Cayley graphs and vertex transitive graphs via matrix martingales. Random Struct. Algorithms 32(1), 88–100 (2008)


  8. Gaïffas, S., Guilloux, A.: High-dimensional additive hazards models and the lasso. Electron. J. Stat. 6, 522–546 (2012)


  9. Gittens, A.: The spectral norm error of the naive Nyström extension. arXiv preprint arXiv:1110.5305 (2011)

  10. Golub, G.H., Van Loan, C.F.: Matrix Computations. JHU Press, Baltimore (2013)


  11. Gross, D.: Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inf. Theory 57, 1548–1566 (2011)


  12. Hansen, N.R., Reynaud-Bouret, P., Rivoirard, V.: Lasso and probabilistic inequalities for multivariate point processes. Bernoulli 21(1), 83–143 (2015)


  13. Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes. Springer, New York (1987)


  14. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)


  15. Koltchinskii, V.: Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems: Saint-Flour XXXVIII-2008. Lecture Notes in Mathematics, vol. 2033. Springer, Berlin (2011)


  16. Koltchinskii, V., Lounici, K., Tsybakov, A.B.: Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Stat. 39(5), 2302–2329 (2011)


  17. Latala, R.: Some estimates of norms of random matrices. Proc. Am. Math. Soc. 133, 1273–1282 (2005)


  18. Lieb, E.H.: Convex trace functions and the Wigner–Yanase–Dyson conjecture. Adv. Math. 11(3), 267–288 (1973)


  19. Liptser, R.S., Shiryayev, A.N.: Theory of Martingales. Springer, New York (1989)


  20. Mackey, L., Jordan, M.I., Chen, R.Y., Farrell, B., Tropp, J.A.: Matrix concentration inequalities via the method of exchangeable pairs. Ann. Probab. 42(3), 906–945 (2014)


  21. Mackey, L.W., Jordan, M.I., Talwalkar, A.: Divide-and-conquer matrix factorization. In: NIPS, pp. 1134–1142 (2011)

  22. Massart, P.: Concentration Inequalities and Model Selection. Lecture Notes in Mathematics, vol. 1896. Springer, Berlin (2007)


  23. Minsker, S.: On some extensions of Bernstein’s inequality for self-adjoint operators. arXiv preprint arXiv:1112.5448 (2011)

  24. Negahban, S., Wainwright, M.J.: Restricted strong convexity and weighted matrix completion: optimal bounds with noise. J. Mach. Learn. Res. 13(1), 1665–1697 (2012)


  25. Oliveira, R.I.: Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. arXiv preprint arXiv:0911.0600 (2009)

  26. Oliveira, R.I.: Sums of random Hermitian matrices and an inequality by Rudelson. Electron. Commun. Probab. 15, 203–212 (2010)


  27. Paulsen, V.I.: Completely Bounded Maps and Operator Algebras. Cambridge University Press, Cambridge (2002)


  28. Petz, D.: A survey of certain trace inequalities. Funct. Anal. Oper. Theory 30, 287–298 (1994)


  29. Recht, B.: A simpler approach to matrix completion. J. Mach. Learn. Res. 12, 3413–3430 (2011)


  30. Reynaud-Bouret, P.: Compensator and exponential inequalities for some suprema of counting processes. Stat. Probab. Lett. 76(14), 1514–1521 (2006)


  31. Rohde, A., Tsybakov, A.B.: Estimation of high-dimensional low-rank matrices. Ann. Stat. 39(2), 887–930 (2011)


  32. Seginer, Y.: The expected norm of random matrices. Combin. Probab. Comput. 9, 149–166 (2000)


  33. Tropp, J.A.: Freedman’s inequality for matrix martingales. Electron. Commun. Probab. 16, 262–270 (2011)


  34. Tropp, J.A.: User-friendly tail bounds for sums of random matrices. Found. Comput. Math. 12(4), 389–434 (2012)


  35. van de Geer, S.: Exponential inequalities for martingales, with application to maximum likelihood estimation for counting processes. Ann. Stat. 23(5), 1779–1801 (1995)



Acknowledgements

We gratefully acknowledge the anonymous reviewers of the first version of this paper for their helpful comments and suggestions. The authors would like to thank Carl Graham and Peter Tankov for various comments on our paper. This research benefited from the support of the Chair “Markets in Transition”, under the aegis of the “Louis Bachelier Finance and Sustainable Growth” laboratory, a joint initiative of École Polytechnique, Université d’Évry Val d’Essonne and Fédération Bancaire Française, and of the Data Initiative of École Polytechnique.


Corresponding author

Correspondence to Stéphane Gaïffas.

Appendices

Appendix A: Tools for the study of matrix martingales in continuous time

In this section we give tools for the study of matrix martingales in continuous time, proceeding in steps. The main result of this section, Proposition 1, proves that the trace exponential of a matrix martingale is a supermartingale once it is properly corrected by terms involving quadratic covariations.

1.1 Appendix A.1: A first tool

We first give a simple lemma that links the largest eigenvalues of random matrices to the trace exponential of their difference.

Lemma A.1

Let \(\varvec{X}\) and \(\varvec{Y}\) be two symmetric random matrices such that

$$\begin{aligned} {{\mathrm{tr}}}\mathbbm {E}[ e^{{\varvec{X}}- {\varvec{Y}}}] \le k \end{aligned}$$

for some \(k > 0\). Then, we have

$$\begin{aligned} \mathbbm {P}[ \lambda _{\max }({\varvec{X}}) \ge \lambda _{\max }({\varvec{Y}}) + x ] \le k e^{-x} \end{aligned}$$

for any \(x > 0\).

Proof

Using the fact that [28]

$$\begin{aligned} {\varvec{A}}\preccurlyeq {\varvec{B}}\Rightarrow {{\mathrm{tr}}}\exp ({\varvec{A}}) \le {{\mathrm{tr}}}\exp ({\varvec{B}}) \quad \text{ for } \text{ any } \text{ symmetric }~{\varvec{A}},{\varvec{B}}, \end{aligned}$$
(23)

along with the fact that \({\varvec{Y}}\preccurlyeq \lambda _{\max }({\varvec{Y}}) {\varvec{I}}\), one has

$$\begin{aligned} {{\mathrm{tr}}}\exp (\varvec{X} - \varvec{Y}) \mathbf {1}_{E} \ge {{\mathrm{tr}}}\exp (\varvec{X} - \lambda _{\max }({\varvec{Y}}) \varvec{I}) \mathbf {1}_{E}, \end{aligned}$$

where we set \(E = \{ \lambda _{\max }({\varvec{X}}) \ge \lambda _{\max }({\varvec{Y}}) + x \}\). Now, since \(\lambda _{\max }({\varvec{M}}) \le {{\mathrm{tr}}}{\varvec{M}}\) for any symmetric positive definite matrix \({\varvec{M}}\), we obtain

$$\begin{aligned} {{\mathrm{tr}}}\exp (\varvec{X} - \varvec{Y}) \mathbf {1}_{E}&\ge \lambda _{\max }(\exp (\varvec{X} - \lambda _{\max }({\varvec{Y}}) \varvec{I})) \mathbf {1}_{E} \\&= \exp (\lambda _{\max }(\varvec{X}) - \lambda _{\max }(\varvec{Y})) \mathbf {1}_{E} \\&\ge e^x \mathbf {1}_{E}, \end{aligned}$$

so that taking the expectation on both sides proves Lemma A.1. \(\square \)
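As an illustrative aside (ours, not part of the original argument), the two matrix facts used in this proof, the monotonicity (23) of the trace exponential and the comparison \({\varvec{Y}}\preccurlyeq \lambda _{\max }({\varvec{Y}}) {\varvec{I}}\), can be checked numerically on random symmetric matrices; numpy is assumed available.

```python
import numpy as np

rng = np.random.default_rng(0)

def tr_exp(M):
    # trace exponential of a symmetric matrix via its eigenvalues
    return np.sum(np.exp(np.linalg.eigvalsh(M)))

d = 5
X = rng.standard_normal((d, d)); X = (X + X.T) / 2
P = rng.standard_normal((d, d)); P = P @ P.T          # PSD perturbation
B = X + P                                             # so that X ≼ B

# monotonicity (23): X ≼ B implies tr exp(X) <= tr exp(B)
assert tr_exp(X) <= tr_exp(B) + 1e-10

# Y ≼ lambda_max(Y) I implies tr exp(X - Y) >= tr exp(X - lambda_max(Y) I)
Y = rng.standard_normal((d, d)); Y = (Y + Y.T) / 2
lam = np.max(np.linalg.eigvalsh(Y))
assert tr_exp(X - Y) >= tr_exp(X - lam * np.eye(d)) - 1e-10
```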

1.2 Appendix A.2: Various definitions and Itô’s Lemma for functions of matrices

In this section we describe some classical notions from stochastic calculus [13, 19] and extend them to matrix semimartingales. Let us recall that the quadratic covariation of two scalar semimartingales \(X_t\) and \(Y_t\) is defined as

$$\begin{aligned}{}[X,Y]_t = X_t Y_t - \int _0^t Y_{s^-} \, dX_s - \int _0^t X_{s^-} \, dY_s - X_0 Y_0 \; . \end{aligned}$$

It can be proven (see e.g. [13]) that the non-decreasing process \([X, X]_t\), often denoted \([X]_t\), indeed corresponds to the quadratic variation of \(X_t\): it is the limit (in probability) of \(\sum _i (X_{t_i}-X_{t_{i-1}})^2\) as the mesh size of the partition \(\{t_i \}_i\) of the interval [0, t] goes to zero.
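As a numerical illustration of ours (not part of the paper), this characterization can be observed on a compensated Poisson process \(X_t = N_t - \lambda t\): every jump has size one, so \([X]_t = N_t\), and the sum of squared increments over a fine partition should approach \(N_t\). The sketch below assumes numpy is available.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, T, n = 4.0, 10.0, 200_000          # intensity, horizon, partition size
grid = np.linspace(0.0, T, n + 1)

# simulate Poisson jump times on [0, T]
n_jumps = rng.poisson(lam * T)
jump_times = np.sort(rng.uniform(0.0, T, size=n_jumps))

# compensated Poisson process X_t = N_t - lam * t sampled on the grid
N = np.searchsorted(jump_times, grid, side="right")
X = N - lam * grid

# sum of squared increments over the partition approximates [X]_T = N_T
qv = np.sum(np.diff(X) ** 2)
assert abs(qv - N[-1]) < 0.05 * max(N[-1], 1)
```

The continuous drift \(-\lambda t\) contributes only \(\lambda ^2 T^2 / n\) to the sum, which vanishes as the mesh is refined, so the limit is driven by the jumps alone.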

If \(X_t\) is a square integrable scalar martingale, then its predictable quadratic variation \(\langle X \rangle _t\) is defined as the unique predictable increasing process such that \(X_t^2 - \langle X \rangle _t\) is a martingale. The predictable quadratic covariation between two square integrable scalar martingales \(X_t\) and \(Y_t\) is then defined from the polarization identity:

$$\begin{aligned} \langle X ,Y \rangle = \frac{1}{4} \big ( \langle X + Y, X + Y \rangle - \langle X - Y, X - Y \rangle \big ). \end{aligned}$$

A martingale \(X_t\) is said to be continuous if its sample paths \(t\mapsto X_t\) are a.s. continuous, and purely discontinuous (see Note 1) if \(X_0 = 0\) and \(\langle X, Y \rangle _t = 0\) for any continuous martingale \(Y_t\).

The notion of predictable quadratic variation can be extended to semimartingales. Indeed, any semimartingale \(X_t\) can be represented as a sum:

$$\begin{aligned} X_t = X_0 + X^{c}_t + X^{d}_t + A_t, \end{aligned}$$
(24)

where \(X^{c}_t\) is a continuous local martingale, \(X^{d}_t\) is a purely discontinuous local martingale and \(A_t\) is a process of bounded variations. Since in the decomposition (24), \(X^{c}_t\) is unambiguously determined, \(\langle X^c \rangle _t\) is therefore well defined [13]. Within this framework, one can prove (see e.g. [13]) that if \(X_t\) and \(Y_t\) are two semimartingales, then:

$$\begin{aligned} {[}X,Y]_t = \langle X^c,Y^c \rangle _t + \sum _{0\le s \le t} \Delta X_s \Delta Y_s. \end{aligned}$$
(25)

All these definitions can be naturally extended to matrix-valued semimartingales. Let \({\varvec{X}}_t\) be a \(p \times q\) matrix whose entries are real-valued square-integrable semimartingales. We denote by \(\langle {\varvec{X}} \rangle _t\) the matrix of entry-wise predictable quadratic variations. The predictable quadratic covariation of \({\varvec{X}}_t\) is defined with the help of the vectorization operator \(\mathrm {vec}: \mathbb {R}^{p \times q} \rightarrow \mathbb {R}^{pq}\), which stacks the columns of \({\varvec{X}}\) vertically, namely if \({\varvec{X}}\in \mathbb {R}^{p \times q}\) then

$$\begin{aligned} \mathrm {vec}({\varvec{X}}) = \begin{bmatrix} {\varvec{X}}_{1, 1}&\cdots&{\varvec{X}}_{p, 1}&{\varvec{X}}_{1, 2}&\cdots&{\varvec{X}}_{p, 2}&\cdots&{\varvec{X}}_{1, q}&\cdots&{\varvec{X}}_{p, q} \end{bmatrix}^\top . \end{aligned}$$

We then define the predictable quadratic covariation matrix \(\langle \mathrm {vec}{\varvec{X}} \rangle _t\) of \({\varvec{X}}_t\) as the \(pq \times pq\) matrix with entries

$$\begin{aligned} (\langle \mathrm {vec}{\varvec{X}} \rangle _t)_{i,j} = \langle (\mathrm {vec}{\varvec{X}}_t)_i, (\mathrm {vec}{\varvec{X}}_t)_j \rangle \end{aligned}$$
(26)

for \(1 \le i, j \le pq\), namely such that \(\mathrm {vec}({\varvec{X}}_t) \mathrm {vec}({\varvec{X}}_t)^\top - \langle \mathrm {vec}{\varvec{X}} \rangle _t\) is a martingale. The matrices of quadratic variations \([{\varvec{X}}]_t\) and quadratic covariations \([\mathrm {vec}{\varvec{X}}]_t\) are defined along the same lines.
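The column-stacking convention of \(\mathrm {vec}\) and its pairing with the trace, which are used repeatedly below, can be sketched numerically as follows (an illustration of ours, not part of the paper; numpy is assumed).

```python
import numpy as np

rng = np.random.default_rng(2)

def vec(X):
    # stack the columns of X vertically (column-major order)
    return X.reshape(-1, order="F")

p, q = 3, 4
X = rng.standard_normal((p, q))

# vec stacks X_{1,1}, ..., X_{p,1}, then X_{1,2}, ..., and so on
assert vec(X)[0] == X[0, 0] and vec(X)[p - 1] == X[p - 1, 0]
assert vec(X)[p] == X[0, 1]

# vec(Y)^T vec(Z) = tr(Y^T Z); for symmetric Y this equals tr(Y Z)
Y = rng.standard_normal((p, p))
Z = rng.standard_normal((p, p))
assert np.isclose(vec(Y) @ vec(Z), np.trace(Y.T @ Z))
```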

Then according to Eq. (25), we have:

$$\begin{aligned} {[}{\varvec{X}}]_t = \langle {\varvec{X}}^c \rangle _t + \sum _{0 \le s \le t} (\Delta {\varvec{X}}_s)^2, \end{aligned}$$
(27)

and

$$\begin{aligned} {[}\mathrm {vec}{\varvec{X}}]_t = \langle \mathrm {vec}{\varvec{X}}^c \rangle _t + \sum _{0 \le s \le t} \mathrm {vec}(\Delta {\varvec{X}}_{s}) \mathrm {vec}(\Delta {\varvec{X}}_{s})^\top . \end{aligned}$$

An important tool for our proofs is Itô's lemma, which allows one to compute the stochastic differential \(d F({\varvec{X}}_t)\), where \(F : \mathbb {R}^{p \times q} \rightarrow \mathbb {R}\) is a twice differentiable function. We denote by \(\frac{d F}{d \mathrm {vec}({\varvec{X}})}\) the pq-dimensional vector such that

$$\begin{aligned} \Big [ \frac{d F}{d \mathrm {vec}({\varvec{X}})} \Big ]_i = \frac{\partial F}{\partial (\mathrm {vec}{\varvec{X}})_i} \; \text { for } \; 1 \le i \le p q. \end{aligned}$$

The second order derivative is the \(pq \times pq\) symmetric matrix given by

$$\begin{aligned} \Big [ \frac{d^2 F}{d \mathrm {vec}({\varvec{X}})d \mathrm {vec}({\varvec{X}})^\top }\Big ]_{i, j} = \frac{\partial ^2 F}{\partial (\mathrm {vec}{\varvec{X}})_i \partial (\mathrm {vec}{\varvec{X}})_j} \; \text {for} \; 1 \le i, j \le pq. \end{aligned}$$

A direct application of the multivariate Itô lemma ([19], Theorem 1, p. 118) reads as follows for matrix semimartingales.

Lemma A.2

(Itô’s Lemma) Let \(\{{\varvec{X}}_t\}_{t \ge 0}\) be a \(p \times q\) matrix semimartingale and \(F : \mathbb {R}^{p \times q} \rightarrow \mathbb {R}\) be a twice continuously differentiable function. Then

$$\begin{aligned} dF({\varvec{X}}_t)&= \Big (\frac{d F}{d \mathrm {vec}({\varvec{X}})}({\varvec{X}}_{t^-}) \Big )^\top \mathrm {vec}(d {\varvec{X}}_t) + \Delta F({\varvec{X}}_t) - \Big (\frac{d F}{d \mathrm {vec}{\varvec{X}}}({\varvec{X}}_{t^-}) \Big )^\top \mathrm {vec}(\Delta {\varvec{X}}_t) \\&\quad + \frac{1}{2} {{\mathrm{tr}}}\bigg (\Big (\frac{d^2 F}{d \mathrm {vec}({\varvec{X}}) d \mathrm {vec}({\varvec{X}})^\top } ({\varvec{X}}_{t^-}) \Big )^\top d\langle \mathrm {vec}{\varvec{X}}^c \rangle _t \bigg ). \end{aligned}$$

As an application, let us apply Lemma A.2 to the function \(F({\varvec{X}})= {{\mathrm{tr}}}\exp ({\varvec{X}})\) that acts on the set of symmetric matrices. This result will be of importance for the proof of our results.

Lemma A.3

(Itô’s Lemma for the trace exponential) Let \(\{{\varvec{X}}_t\}\) be a \(d \times d\) symmetric matrix semimartingale. The Itô formula for \(F({\varvec{X}}_t) = {{\mathrm{tr}}}\exp ({\varvec{X}}_t)\) gives

$$\begin{aligned} d({{\mathrm{tr}}}e^{{\varvec{X}}_t}) = {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d {\varvec{X}}_t) + \Delta ({{\mathrm{tr}}}e^{{\varvec{X}}_t}) - {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} \Delta {\varvec{X}}_t) + \frac{1}{2} \sum _{i=1}^d {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d\langle {\varvec{X}}_{\bullet , i}^c \rangle _t), \end{aligned}$$
(28)

where \(\langle {\varvec{X}}_{\bullet , i}^c \rangle _t\) denotes the \(d \times d\) predictable quadratic variation of the continuous part of the i-th column \(({\varvec{X}}_t)_{\bullet , i}\) of \({\varvec{X}}_t\).

Proof

An easy computation gives

$$\begin{aligned} {{\mathrm{tr}}}e^{{\varvec{X}}+ {\varvec{H}}} = {{\mathrm{tr}}}e^{{\varvec{X}}} + {{\mathrm{tr}}}(e^{{\varvec{X}}} {\varvec{H}}) + \frac{1}{2} {{\mathrm{tr}}}(e^{{\varvec{X}}} {\varvec{H}}^2) + \text { higher order terms in } {\varvec{H}}\end{aligned}$$

for any symmetric matrices \({\varvec{X}}\) and \({\varvec{H}}\). Note that \({{\mathrm{tr}}}(e^{\varvec{X}}{\varvec{H}}) = (\mathrm {vec}{\varvec{H}})^\top \mathrm {vec}(e^{\varvec{X}}) \), and we have from [14] Exercise 25 p. 252 that

$$\begin{aligned} {{\mathrm{tr}}}(e^{{\varvec{X}}} {\varvec{H}}^2) = {{\mathrm{tr}}}({\varvec{H}}e^{{\varvec{X}}} {\varvec{H}}) = (\mathrm {vec}{\varvec{H}})^\top ({\varvec{I}}\otimes e^{\varvec{X}}) (\mathrm {vec}{\varvec{H}}), \end{aligned}$$

where the Kronecker product \({\varvec{I}}\otimes e^{\varvec{X}}\) stands for the block matrix

$$\begin{aligned} {\varvec{I}}\otimes e^{\varvec{X}}= \begin{bmatrix} e^{\varvec{X}}&{\varvec{0}}&\cdots&{\varvec{0}} \\ {\varvec{0}}&e^{\varvec{X}}&\ddots&\vdots \\ \vdots&\ddots&\ddots&{\varvec{0}} \\ {\varvec{0}}&\cdots&{\varvec{0}}&e^{\varvec{X}}\end{bmatrix}. \end{aligned}$$

This entails that

$$\begin{aligned} \frac{d({{\mathrm{tr}}}e^{{\varvec{X}}})}{d \mathrm {vec}({\varvec{X}})} = \mathrm {vec}(e^{{\varvec{X}}}) \;\; \text { and } \;\; \frac{d^2({{\mathrm{tr}}}e^{{\varvec{X}}})}{d \mathrm {vec}({\varvec{X}}) d \mathrm {vec}({\varvec{X}})^\top } = {\varvec{I}}\otimes e^{\varvec{X}}. \end{aligned}$$

Hence, using Lemma A.2 with \(F({\varvec{X}}) = {{\mathrm{tr}}}e^{{\varvec{X}}}\) we obtain

$$\begin{aligned} d({{\mathrm{tr}}}e^{{\varvec{X}}_t})&= \mathrm {vec}(e^{{\varvec{X}}_{t^-}})^\top \mathrm {vec}(d {\varvec{X}}_t) + \Delta ({{\mathrm{tr}}}e^{{\varvec{X}}_t}) - \mathrm {vec}(e^{{\varvec{X}}_{t^-}})^\top \mathrm {vec}(\Delta {\varvec{X}}_t) \\&\quad + \frac{1}{2} {{\mathrm{tr}}}\big ( ({\varvec{I}}\otimes e^{{\varvec{X}}_{t^-}}) d\langle \mathrm {vec}{\varvec{X}}^c \rangle _t \big ). \end{aligned}$$

Since \(\mathrm {vec}({\varvec{Y}})^\top \mathrm {vec}({\varvec{Z}}) = {{\mathrm{tr}}}({\varvec{Y}}^\top {\varvec{Z}})\), which equals \({{\mathrm{tr}}}({\varvec{Y}}{\varvec{Z}})\) for symmetric \({\varvec{Y}}\), one gets

$$\begin{aligned} d({{\mathrm{tr}}}e^{{\varvec{X}}_t})&= {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d {\varvec{X}}_t) + \Delta ({{\mathrm{tr}}}e^{{\varvec{X}}_t}) \\&\quad - {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} \Delta {\varvec{X}}_t) + \frac{1}{2} {{\mathrm{tr}}}\big ( ({\varvec{I}}\otimes e^{{\varvec{X}}_{t^-}}) d\langle \mathrm {vec}{\varvec{X}}^c \rangle _t \big ). \end{aligned}$$

To conclude the proof of Lemma A.3, it remains to prove that

$$\begin{aligned} {{\mathrm{tr}}}\big ( ({\varvec{I}}\otimes e^{{\varvec{X}}_{t^-}}) d \langle \mathrm {vec}{\varvec{X}}^c \rangle _t \big ) = \sum _{i=1}^d {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d\langle {\varvec{X}}_{\bullet , i}^c \rangle _t). \end{aligned}$$

First, let us write

$$\begin{aligned} d\langle \mathrm {vec}{\varvec{X}}^c \rangle _t = \sum _{1 \le i, j \le d} {\varvec{E}}^{i, j} \otimes d \langle {\varvec{X}}_{\bullet , i}^c, {\varvec{X}}_{\bullet , j}^c \rangle _t, \end{aligned}$$

where \({\varvec{E}}^{i, j}\) is the \(d \times d\) matrix with all entries equal to zero except for the (i, j)-entry, which is equal to one. Since

$$\begin{aligned} ({\varvec{A}}\otimes {\varvec{B}})({\varvec{C}}\otimes {\varvec{D}}) = ({\varvec{A}}{\varvec{C}}) \otimes ({\varvec{B}}{\varvec{D}}) \quad \text {and} \quad {{\mathrm{tr}}}({\varvec{A}}\otimes {\varvec{B}}) = {{\mathrm{tr}}}({\varvec{A}}) {{\mathrm{tr}}}({\varvec{B}}) \end{aligned}$$

for any matrices \({\varvec{A}}, {\varvec{B}}, {\varvec{C}}, {\varvec{D}}\) with matching dimensions (see for instance [14]), we have

$$\begin{aligned} {{\mathrm{tr}}}\big (({\varvec{I}}\otimes e^{{\varvec{X}}_{t^-}}) d\langle \mathrm {vec}{\varvec{X}}^c \rangle _t \big )= & {} \sum _{1 \le i, j \le d} {{\mathrm{tr}}}({\varvec{E}}^{i, j}) {{\mathrm{tr}}}(e^{{\varvec{X}}_{t-}} d \langle {\varvec{X}}_{\bullet , i}^c,{\varvec{X}}_{\bullet , j}^c \rangle _t) \\= & {} \sum _{i=1}^d {{\mathrm{tr}}}(e^{{\varvec{X}}_{t-}} d\langle {\varvec{X}}_{\bullet , i}^c, {\varvec{X}}_{\bullet , i}^c \rangle _t) \end{aligned}$$

since \({{\mathrm{tr}}}{\varvec{E}}^{i, j} = 1\) if \(i = j\) and 0 otherwise. This concludes the proof of Lemma A.3. \(\square \)
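The identities used in this proof, namely \({{\mathrm{tr}}}(e^{{\varvec{X}}} {\varvec{H}}^2) = (\mathrm {vec}{\varvec{H}})^\top ({\varvec{I}}\otimes e^{\varvec{X}}) (\mathrm {vec}{\varvec{H}})\) and the two Kronecker-product rules, admit a quick numerical sanity check. The sketch below is our illustration (assuming numpy), not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

def expm_sym(M):
    # matrix exponential of a symmetric matrix via eigendecomposition
    w, V = np.linalg.eigh(M)
    return (V * np.exp(w)) @ V.T

d = 4
X = rng.standard_normal((d, d)); X = (X + X.T) / 2
H = rng.standard_normal((d, d)); H = (H + H.T) / 2
eX = expm_sym(X)

# tr(e^X H^2) = tr(H e^X H) = vec(H)^T (I kron e^X) vec(H)
vH = H.reshape(-1, order="F")
lhs = np.trace(eX @ H @ H)
rhs = vH @ np.kron(np.eye(d), eX) @ vH
assert np.isclose(lhs, rhs)

# (A kron B)(C kron D) = (AC) kron (BD) and tr(A kron B) = tr(A) tr(B)
A, B, C, D = (rng.standard_normal((d, d)) for _ in range(4))
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
assert np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))
```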

1.3 Appendix A.3: Proof of Proposition 1

Define for short

$$\begin{aligned} {\varvec{X}}_t = {\varvec{Y}}_t - {\varvec{A}}_t - \frac{1}{2} \sum _{j=1}^d \langle {\varvec{Y}}_{\bullet ,j}^c \rangle _t. \end{aligned}$$

Since \({\varvec{A}}_t\) and \(\langle {\varvec{Y}}_{\bullet , j}^c \rangle _t\), \(j=1, \ldots , d\), are finite-variation (FV) processes, we have

$$\begin{aligned} \langle \mathrm {vec}{\varvec{X}}^c \rangle = \langle \mathrm {vec}{\varvec{Y}}^c \rangle \end{aligned}$$
(29)

and in particular \(\langle {\varvec{Y}}_{\bullet , j}^c \rangle = \langle {\varvec{X}}_{\bullet , j}^c \rangle \) for any \(j=1, \ldots , d\). Using Lemma A.3, one has that for all \(t_1 < t_2\):

$$\begin{aligned} L_{t_2} - L_{t_1}&= \int _{t_1}^{t_2} {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d {\varvec{X}}_t) + \sum _{t_1 \le t \le t_2} \big (\Delta ({{\mathrm{tr}}}e^{{\varvec{X}}_{t}}) - {{\mathrm{tr}}}(e^{{\varvec{X}}_{{t}^-}} \Delta {\varvec{X}}_{t}) \big ) \\&\quad + \frac{1}{2} \int _{t_1}^{t_2} \sum _{j=1}^d {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d\langle {\varvec{X}}_{\bullet ,j}^c \rangle _t) \\&= \int _{t_1}^{t_2} {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d {\varvec{Y}}_t) - \int _{t_1}^{t_2} {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d {\varvec{A}}_t) \\&\quad + \sum _{t_1 \le t \le t_2} \big ( {{\mathrm{tr}}}( e^{{\varvec{X}}_{{t}^-} + \Delta {\varvec{Y}}_{t}}) - {{\mathrm{tr}}}(e^{{\varvec{X}}_{{t}^-}}) -{{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} \Delta {\varvec{Y}}_{t}) \big ), \end{aligned}$$

where we used (29) together with the fact that \(\Delta {\varvec{X}}_t = \Delta {\varvec{Y}}_t\), since \({\varvec{A}}_t\) and \(\langle {\varvec{Y}}_{\bullet , j}^c \rangle _t\) are both continuous.

The Golden–Thompson inequality (see [3]) states that \({{\mathrm{tr}}}e^{{\varvec{A}}+ {\varvec{B}}} \le {{\mathrm{tr}}}( e^{\varvec{A}}e^{\varvec{B}})\) for any symmetric matrices \({\varvec{A}}\) and \({\varvec{B}}\). Using this inequality, we get

$$\begin{aligned} L_{t_2} - L_{t_1}&\le \int _{t_1}^{t_2} {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d {\varvec{Y}}_t) - \int _{t_1}^{t_2} {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d {\varvec{A}}_t) \\&\quad + \sum _{t_1 \le t \le t_2} {{\mathrm{tr}}}\big ( e^{{\varvec{X}}_{{t}^-}}(e^{\Delta {\varvec{Y}}_{t}} -\Delta {\varvec{Y}}_{t} - {\varvec{I}}) \big ) \\&= \int _{t_1}^{t_2} {{\mathrm{tr}}}(e^{{\varvec{X}}_{t^-}} d {\varvec{Y}}_t) + \int _{t_1}^{t_2} {{\mathrm{tr}}}\big (e^{{\varvec{X}}_{t^-}}d({\varvec{U}}_t - {\varvec{A}}_t) \big ). \end{aligned}$$

Since \({\varvec{Y}}_t\) and \({\varvec{U}}_t - {\varvec{A}}_t\) are matrix martingales, \(e^{{\varvec{X}}_{t^-}}\) is a predictable process with locally bounded entries and \(L_t \ge 0\), the right-hand side of the last display corresponds to the increment between \(t_1\) and \(t_2\) of a non-negative local martingale, i.e., of a supermartingale. It follows that \(\mathbbm {E}[L_{t_2} - L_{t_1}| \mathscr {F}_{t_1}] \le 0\), which proves that \(L_t\) is a supermartingale. Using this last inequality with \(t_1 = 0\) and \(t_2 = t\) gives \(\mathbbm {E}[L_{t}] \le \mathbbm {E}[L_{0}] = d\). This concludes the proof of Proposition 1.
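The Golden–Thompson inequality, the key step in the proof above, can be probed numerically on random symmetric matrices. This is an illustrative sketch of ours (assuming numpy), not part of the paper's argument.

```python
import numpy as np

rng = np.random.default_rng(4)

def expm_sym(M):
    # matrix exponential of a symmetric matrix via eigendecomposition
    w, V = np.linalg.eigh(M)
    return (V * np.exp(w)) @ V.T

# Golden-Thompson: tr e^{A+B} <= tr(e^A e^B) for symmetric A, B
for _ in range(100):
    d = int(rng.integers(2, 6))
    A = rng.standard_normal((d, d)); A = (A + A.T) / 2
    B = rng.standard_normal((d, d)); B = (B + B.T) / 2
    lhs = np.trace(expm_sym(A + B))
    rhs = np.trace(expm_sym(A) @ expm_sym(B))
    assert lhs <= rhs + 1e-8
```

Equality holds when A and B commute; for generic draws the inequality is strict.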

1.4 Appendix A.4: Bounding the odd powers of the dilation operator

The process \(\{ {\varvec{Z}}_t \}\) is not symmetric; hence, following [34], we force symmetry in our proofs by embedding it in a larger dimension, using the symmetric dilation operator [27] given, for a matrix \({\varvec{X}}\), by

$$\begin{aligned} \mathscr {S}(\varvec{X}) = \begin{bmatrix} \varvec{0}&\varvec{X} \\ \varvec{X}^\top&\varvec{0} \end{bmatrix}. \end{aligned}$$
(30)

The following lemma will prove useful:

Lemma A.4

Let \({\varvec{X}}\) be some \(n \times m\) matrix and \(k \in \mathbb {N}\). Then

$$\begin{aligned} \mathscr {S}({\varvec{X}})^{2k+1} = \begin{bmatrix} \varvec{0}&{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{k} \\ {\varvec{X}}^\top ({\varvec{X}}{\varvec{X}}^\top )^{k}&0 \end{bmatrix} \preccurlyeq \begin{bmatrix} ({\varvec{X}}{\varvec{X}}^\top )^{k + 1/2}&\varvec{0} \\ \varvec{0}&({\varvec{X}}^\top {\varvec{X}})^{k + 1/2} \end{bmatrix}. \end{aligned}$$

Proof

The first equality follows from straightforward algebra. It can be rewritten as:

$$\begin{aligned} \mathscr {S}({\varvec{X}})^{2k+1} = \begin{bmatrix} \varvec{0}&({\varvec{X}}{\varvec{X}}^\top )^{k} {\varvec{X}}\\ {\varvec{X}}^\top ({\varvec{X}}{\varvec{X}}^\top )^{k}&0 \end{bmatrix} = {\varvec{C}}\begin{bmatrix} \varvec{0}&({\varvec{X}}{\varvec{X}}^\top )^{k} \\ ({\varvec{X}}{\varvec{X}}^\top )^{k}&0 \end{bmatrix} {\varvec{C}}^\top \end{aligned}$$
(31)

where

$$\begin{aligned} {\varvec{C}}= \begin{bmatrix} \varvec{0}&{\varvec{I}}_n \\ {\varvec{X}}^\top&0 \end{bmatrix}. \end{aligned}$$
(32)

Since \(({\varvec{X}}{\varvec{X}}^\top )^{k} \succcurlyeq \varvec{0} \) and

$$\begin{aligned} {\varvec{A}}= \begin{bmatrix} 1&-1 \\ -1&1 \end{bmatrix} \succcurlyeq 0, \end{aligned}$$

we obtain that \({\varvec{A}}\otimes ({\varvec{X}}{\varvec{X}}^\top )^{k} \succcurlyeq \varvec{0}\), since the eigenvalues of a Kronecker product \({\varvec{A}}\otimes {\varvec{B}}\) are given by the products of the eigenvalues of \({\varvec{A}}\) and \({\varvec{B}}\), see [10]. This leads to:

$$\begin{aligned} \begin{bmatrix} \varvec{0}&({\varvec{X}}{\varvec{X}}^\top )^{k} \\ ({\varvec{X}}{\varvec{X}}^\top )^{k}&0 \end{bmatrix} \preccurlyeq \begin{bmatrix} ({\varvec{X}}{\varvec{X}}^\top )^{k}&\varvec{0} \\ \varvec{0}&({\varvec{X}}{\varvec{X}}^\top )^{k} \end{bmatrix}. \end{aligned}$$

Using the fact that [28]

$$\begin{aligned} {\varvec{A}}\preccurlyeq {\varvec{B}}\;\; \Rightarrow \;\; {\varvec{C}}{\varvec{A}}{\varvec{C}}^\top \preccurlyeq {\varvec{C}}{\varvec{B}}{\varvec{C}}^\top \end{aligned}$$
(33)

for any real matrices \({\varvec{A}}, {\varvec{B}}, {\varvec{C}}\) (with compatible dimensions), we have:

$$\begin{aligned} \mathscr {S}({\varvec{X}})^{2k+1} \preccurlyeq {\varvec{C}}\begin{bmatrix} ({\varvec{X}}{\varvec{X}}^\top )^{k}&\varvec{0} \\ \varvec{0}&({\varvec{X}}{\varvec{X}}^\top )^{k} \end{bmatrix} {\varvec{C}}^\top = \begin{bmatrix} ({\varvec{X}}{\varvec{X}}^\top )^{k}&\varvec{0} \\ \varvec{0}&({\varvec{X}}^\top {\varvec{X}})^{k+1} \end{bmatrix}. \end{aligned}$$

Along the same lines, one can establish that:

$$\begin{aligned} \mathscr {S}({\varvec{X}})^{2k+1} \preccurlyeq \begin{bmatrix} ({\varvec{X}}{\varvec{X}}^\top )^{k+1}&\varvec{0} \\ \varvec{0}&({\varvec{X}}^\top {\varvec{X}})^{k} \end{bmatrix}. \end{aligned}$$

The square root of the product of the two inequalities provides the desired result. \(\square \)
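The basic properties of the dilation operator, the identity \(\lambda _{\max }(\mathscr {S}({\varvec{X}})) = \Vert {\varvec{X}}\Vert _{{{\mathrm{op}}}}\), the odd-power formula and the semi-definite bound of Lemma A.4, can all be checked numerically. The sketch below is our illustration (assuming numpy), not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, k = 3, 5, 2
X = rng.standard_normal((n, m))

# symmetric dilation (30)
S = np.block([[np.zeros((n, n)), X], [X.T, np.zeros((m, m))]])
assert np.allclose(S, S.T)

# lambda_max(S(X)) = ||X||_op (the largest singular value of X)
assert np.isclose(np.max(np.linalg.eigvalsh(S)),
                  np.linalg.svd(X, compute_uv=False)[0])

# odd powers: the top-right block of S^{2k+1} equals X (X^T X)^k
S_pow = np.linalg.matrix_power(S, 2 * k + 1)
assert np.allclose(S_pow[:n, n:], X @ np.linalg.matrix_power(X.T @ X, k))

def frac_pow(M, p):
    # fractional power of a PSD matrix via eigendecomposition
    w, V = np.linalg.eigh(M)
    return (V * np.clip(w, 0.0, None) ** p) @ V.T

# Lemma A.4 bound: diag((XX^T)^{k+1/2}, (X^T X)^{k+1/2}) - S^{2k+1} is PSD
D = np.block([
    [frac_pow(X @ X.T, k + 0.5), np.zeros((n, m))],
    [np.zeros((m, n)), frac_pow(X.T @ X, k + 0.5)],
])
assert np.min(np.linalg.eigvalsh(D - S_pow)) >= -1e-8
```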

Appendix B: Proof of Theorem 1

Recall the definition (30) of the dilation operator, and note that \(\mathscr {S}({\varvec{X}})\) is symmetric and satisfies \(\lambda _{\max }(\mathscr {S}({\varvec{X}})) = \Vert \mathscr {S}({\varvec{X}})\Vert _{{{\mathrm{op}}}} = \Vert {\varvec{X}}\Vert _{{{\mathrm{op}}}}\). Note also that \(\mathscr {S}({\varvec{Z}}_t)\) is purely discontinuous, so that \(\langle \mathscr {S}({\varvec{Z}})_{\bullet , j}^c \rangle _t = {\varvec{0}}\) for any j. Recall that we work on the events \(\{ \lambda _{\max }({\varvec{V}}_t) \le v \}\) and \(\{ b_t \le b\}\).

We want to apply Proposition 1 (see “Appendix A” above) to \({\varvec{Y}}_t = \xi \mathscr {S}({\varvec{Z}}_t) / b\). In order to do so, we need the following Proposition.

Proposition B.1

Let \({\varvec{W}}_t\) be the matrix defined in Eq. (7). Fix any \(\xi \ge 0\) and consider \(\phi (x) = e^x - x - 1\) for \(x \in \mathbb {R}\). Assume that

$$\begin{aligned} \mathbbm {E}\bigg [ \int _0^t \frac{\phi \big ( \xi J_{\max } \Vert {\varvec{C}}_s\Vert _\infty \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty })\big )}{J_{\max }^2 \Vert {\varvec{C}}_s\Vert _\infty ^2 \max (\Vert \mathbb {T}_s\Vert ^2_{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert ^2_{{{\mathrm{op}}}; \infty })} ({\varvec{W}}_s)_{i, j} ds \bigg ] < +\infty , \end{aligned}$$
(34)

for any \(1 \le i, j \le m + n\) and grant also Assumption 1 from Sect. 2.3. Then, the process

$$\begin{aligned} {\varvec{U}}_t = \sum _{0 \le s \le t} \Big ( e^{\xi \Delta \mathscr {S}({\varvec{Z}}_s)} - \xi \Delta \mathscr {S}({\varvec{Z}}_{s}) - {\varvec{I}}\Big ), \end{aligned}$$
(35)

admits a predictable, continuous and FV compensator \({\varvec{\Lambda }}_t\), given by Eq. (39) below. Moreover, the following upper bound in the semi-definite order

$$\begin{aligned} {\varvec{\Lambda }}_t \preccurlyeq \int _0^t \frac{\phi \big (\xi J_{\max } \Vert {\varvec{C}}_s\Vert _\infty \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty })\big )}{J_{\max }^2 \Vert {\varvec{C}}_s\Vert _\infty ^2 \max (\Vert \mathbb {T}_s\Vert ^2_{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert ^2_{{{\mathrm{op}}}; \infty })} {\varvec{W}}_s ds \end{aligned}$$
(36)

is satisfied for any \(t > 0\).

This proposition is proved in “Appendix C” below. We use Proposition 1, Eq. (36) and Eq. (23) together with (9) to obtain

$$\begin{aligned}&\mathbbm {E}\Big [{{\mathrm{tr}}}\exp \Big (\frac{\xi }{b} \mathscr {S}({\varvec{Z}}_t) - \int _0^t \frac{\phi \big (\xi J_{\max } \Vert {\varvec{C}}_s\Vert _\infty \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty }) b^{-1} \big )}{J_{\max }^2 \Vert {\varvec{C}}_s\Vert _\infty ^2 \max (\Vert \mathbb {T}_s\Vert ^2_{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert ^2_{{{\mathrm{op}}}; \infty })} {\varvec{W}}_s ds \Big ) \Big ]\\&\quad \le m + n \end{aligned}$$

for any \(\xi \in [0, 3]\). Using this with Lemma A.1 entails

$$\begin{aligned}&\mathbbm {P}\bigg [ \frac{\lambda _{\max }(\mathscr {S}({\varvec{Z}}_t))}{b} \\&\quad \ge \frac{1}{\xi }\lambda _{\max }\Big (\int _0^t \frac{\phi \big (\xi J_{\max } \Vert {\varvec{C}}_s\Vert _\infty \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty }) b^{-1}\big )}{J_{\max }^2 \Vert {\varvec{C}}_s\Vert _\infty ^2 \max (\Vert \mathbb {T}_s\Vert ^2_{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert ^2_{{{\mathrm{op}}}; \infty })} {\varvec{W}}_s ds \Big )+ \frac{x}{\xi }\bigg ] \\&\quad \le (m + n) e^{-x}. \end{aligned}$$

Note that on \(\{ b_t \le b \}\) we have \(J_{\max } \Vert {\varvec{C}}_s\Vert _\infty \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty })b^{-1} \le 1\) for any \(s \in [0, t]\). The following facts on the function \(\phi (x)\) hold true (cf. [12, 22]):

  1. (i)

    \(\phi (x h)\le h^2 \phi (x)\) for any \(h \in [0,1]\) and \(x > 0\)

  2. (ii)

    \(\phi (\xi ) \le \frac{\xi ^2}{2(1 - \xi / 3)}\) for any \(\xi \in (0, 3)\)

  3. (iii)

    \(\min _{\xi \in (0, 1/c)} \big ( \frac{a \xi }{1 - c \xi } + \frac{x}{\xi }\big ) = 2 \sqrt{ax} + c x\) for any \(a, c, x > 0\).
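Facts (i)–(iii) are elementary and can be verified numerically; the sketch below is an illustration of ours (assuming numpy), not part of the paper.

```python
import numpy as np

phi = lambda x: np.exp(x) - x - 1.0

# (i) phi(x h) <= h^2 phi(x) for h in [0, 1] and x > 0
xs = np.linspace(1e-3, 5.0, 50)
hs = np.linspace(0.0, 1.0, 50)
Xg, Hg = np.meshgrid(xs, hs)
assert np.all(phi(Xg * Hg) <= Hg ** 2 * phi(Xg) + 1e-12)

# (ii) phi(xi) <= xi^2 / (2 (1 - xi / 3)) for xi in (0, 3)
xi = np.linspace(1e-3, 3.0 - 1e-3, 200)
assert np.all(phi(xi) <= xi ** 2 / (2 * (1 - xi / 3)) + 1e-12)

# (iii) min over xi in (0, 1/c) of a xi / (1 - c xi) + x / xi
#       equals 2 sqrt(a x) + c x (checked on a fine grid)
a, c, x = 0.7, 1.0 / 3.0, 2.5
xi = np.linspace(1e-4, 1.0 / c - 1e-4, 200_000)
val = np.min(a * xi / (1 - c * xi) + x / xi)
assert np.isclose(val, 2 * np.sqrt(a * x) + c * x, rtol=1e-4)
```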

Using successively (i) and (ii), one gets, on \(\{ b_t \le b \} \cap \{ \lambda _{\max }({\varvec{V}}_t) \le v \}\), that for \(\xi \in (0,3)\):

$$\begin{aligned}&\frac{1}{\xi }\lambda _{\max }\Big ( \int _0^t \frac{\phi \big (\xi J_{\max } \Vert {\varvec{C}}_s\Vert _\infty \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty }) b^{-1}\big )}{J_{\max }^2 \Vert {\varvec{C}}_s\Vert _\infty ^2 \max (\Vert \mathbb {T}_s\Vert ^2_{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert ^2_{{{\mathrm{op}}}; \infty })} {\varvec{W}}_s ds \Big ) + \frac{x}{\xi }\\&\quad \le \frac{\phi (\xi )}{\xi b^2} \lambda _{\max }\Big ( \int _0^t {\varvec{W}}_s ds \Big ) + \frac{x}{\xi }\\&\quad = \frac{\phi (\xi )}{\xi b^2} \lambda _{\max }({\varvec{V}}_t) + \frac{x}{\xi } \\&\quad \le \frac{\xi v}{2 b^2 (1 - \xi / 3)} + \frac{x}{\xi }, \end{aligned}$$

where we recall that \({\varvec{V}}_t\) is given by (10). This gives

$$\begin{aligned} \mathbbm {P}\bigg [ \frac{\lambda _{\max }(\mathscr {S}({\varvec{Z}}_t))}{b} \ge \frac{\xi v}{2 b^2 (1 - \xi / 3)} + \frac{x}{\xi }, \quad b_t \le b, \quad \lambda _{\max }({\varvec{V}}_t) \le v \bigg ] \le (m + n) e^{-x}, \end{aligned}$$

for any \( \xi \in (0,3)\). Now, by optimizing over \(\xi \) using (iii) (with \(a = v/2b^2\) and \(c = 1/3\)), one obtains

$$\begin{aligned} \mathbbm {P}\bigg [ \frac{\lambda _{\max }(\mathscr {S}({\varvec{Z}}_t))}{b} \ge \frac{\sqrt{2 v x}}{b} + \frac{x}{3}, \quad b_t \le b, \quad \lambda _{\max }({\varvec{V}}_t) \le v \bigg ] \le (m + n) e^{-x}. \end{aligned}$$
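For completeness, the optimization step behind this bound can be spelled out: fact (iii) with \(a = v/2b^2\) and \(c = 1/3\) gives

```latex
\min_{\xi \in (0,3)}
  \Big( \frac{(v/2b^2)\,\xi}{1 - \xi/3} + \frac{x}{\xi} \Big)
  = 2\sqrt{\frac{v x}{2 b^2}} + \frac{x}{3}
  = \frac{\sqrt{2 v x}}{b} + \frac{x}{3},
```

the minimizer being \(\xi ^* = \sqrt{x/a}/(1 + c \sqrt{x/a})\), which always lies in \((0, 3)\).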

Since \(\lambda _{\max }(\mathscr {S}({\varvec{Z}}_t)) = \Vert \mathscr {S}({\varvec{Z}}_t)\Vert _{{{\mathrm{op}}}}\), this concludes the proof of Theorem 1 when the variance term is expressed using Eq. (10). It remains only to show that

$$\begin{aligned} \sigma ^2({\varvec{Z}}_t) = \lambda _{\max }({\varvec{V}}_t). \end{aligned}$$

Since \({\varvec{W}}_s\) is block-diagonal, we clearly have:

$$\begin{aligned} \lambda _{\max }({\varvec{V}}_t)&= \max \bigg ( \Big \Vert \int _0^t \mathbb {T}_s \mathbb {T}_s^\top \circ \big (\mathbbm {E}({\varvec{J}}_1^{\odot 2}) \odot {\varvec{C}}^{\odot 2}_s \odot {\varvec{\lambda }}_s \big ) ds \Big \Vert _{{{\mathrm{op}}}}, \\&\quad \quad \Big \Vert \int _0^t \mathbb {T}_s^\top \mathbb {T}_s \circ \big (\mathbbm {E}({\varvec{J}}_1^{\odot 2}) \odot {\varvec{C}}^{\odot 2}_s \odot {\varvec{\lambda }}_s \big ) ds \Big \Vert _{{{\mathrm{op}}}} \bigg ) \; . \end{aligned}$$
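This step uses only the fact that the spectrum of a block-diagonal matrix is the union of the spectra of its blocks, so its largest eigenvalue is the maximum over blocks. A minimal numeric illustration (hypothetical \(2 \times 2\) and \(1 \times 1\) blocks, plain Python power iteration):

```python
import math

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def lam_max(M, iters=300):
    # power iteration; adequate for symmetric matrices with a spectral gap
    v = [1.0] * len(M)
    for _ in range(iters):
        w = matvec(M, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    w = matvec(M, v)
    return sum(vi * wi for vi, wi in zip(v, w))  # Rayleigh quotient

# hypothetical blocks: A has eigenvalues {1, 3}, B has eigenvalue {5}
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[5.0]]
BD = [[2.0, 1.0, 0.0],  # block-diagonal assembly of A and B
      [1.0, 2.0, 0.0],
      [0.0, 0.0, 5.0]]

assert abs(lam_max(BD) - max(lam_max(A), lam_max(B))) < 1e-8
```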

From the definition of \({\varvec{Z}}_t\), since the entries of \(\Delta {\varvec{M}}_t\) do not jump at the same time, the predictable quadratic covariation of \(({\varvec{Z}}_t)_{k,j}\) and \(({\varvec{Z}}_t)_{l,j}\) is simply the predictable compensator of \(\sum _{a,b} \sum _{s \le t} (\mathbb {T}_s)_{k,j;a,b} (\mathbb {T}_s)_{l,j;a,b} ({\varvec{C}}_s)^2_{a,b} ({\varvec{J}}_{N_s})^2_{a,b} (\Delta {\varvec{N}}_s)_{a,b}\). It follows that

$$\begin{aligned} \sum _{j=1}^n (d \langle {\varvec{Z}}_{\bullet ,j} \rangle _t)_{k,l} =&\sum _{j=1}^n \sum _{a=1}^p \sum _{b=1}^q (\mathbb {T}_t)_{k,j;a,b}(\mathbb {T}_t)_{l,j;a,b} \mathbbm {E}(({\varvec{J}}_1)_{a,b}^2) ({\varvec{\lambda }}_t)_{a,b} ({\varvec{C}}_t)_{a,b}^2 dt \\ =&\left( \mathbb {T}_t \mathbb {T}_t^\top \circ \mathbbm {E}({\varvec{J}}_1^{\odot 2}) \odot {\varvec{C}}_t^{\odot 2} \odot {\varvec{\lambda }}_t \right) _{k,l} dt. \end{aligned}$$

An analogous computation for \(\langle {\varvec{Z}}_{j,\bullet } \rangle _t\) leads to the expected result, and concludes the proof of Theorem 1. \(\square \)

Appendix C: Proof of Proposition B.1

Let us first remark that:

$$\begin{aligned} \exp (\mathscr {S}({\varvec{X}}))= & {} \sum _{k=0}^{\infty } \frac{1}{(2k) !} \begin{bmatrix} ({\varvec{X}}{\varvec{X}}^\top )^k&\varvec{0} \\ \varvec{0}&({\varvec{X}}^\top {\varvec{X}})^k \end{bmatrix} \\&+ \,\frac{1}{(2k+1)!} \begin{bmatrix} \varvec{0}&{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{k} \\ {\varvec{X}}^\top ({\varvec{X}}{\varvec{X}}^\top )^{k}&\varvec{0} \end{bmatrix}. \end{aligned}$$
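This series identity can be checked numerically on a toy example. The sketch below (plain Python, with a hypothetical \(2 \times 1\) matrix \({\varvec{X}}\)) compares the top-left block of \(\exp (\mathscr {S}({\varvec{X}}))\), computed via a truncated Taylor series for the matrix exponential, with \(\sum _k ({\varvec{X}}{\varvec{X}}^\top )^k / (2k)!\):

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

def expm(A, terms=40):
    # Taylor series exp(A) = sum_k A^k / k!, adequate here since ||A|| < 1
    d = len(A)
    result = [[float(i == j) for j in range(d)] for i in range(d)]
    power = [[float(i == j) for j in range(d)] for i in range(d)]
    for k in range(1, terms):
        power = matmul(power, A)
        for i in range(d):
            for j in range(d):
                result[i][j] += power[i][j] / math.factorial(k)
    return result

# hypothetical 2x1 matrix X (m = 2, n = 1)
X = [[0.4], [-0.7]]
m, n = 2, 1
Xt = transpose(X)

# symmetric dilation S(X) = [[0, X], [X^T, 0]]
S = [[0.0] * m + X[i] for i in range(m)] + [Xt[i] + [0.0] * n for i in range(n)]
E = expm(S)

# top-left m x m block should equal sum_k (X X^T)^k / (2k)!
XXt = matmul(X, Xt)
block = [[float(i == j) for j in range(m)] for i in range(m)]
power = [[float(i == j) for j in range(m)] for i in range(m)]
for k in range(1, 20):
    power = matmul(power, XXt)
    for i in range(m):
        for j in range(m):
            block[i][j] += power[i][j] / math.factorial(2 * k)

for i in range(m):
    for j in range(m):
        assert abs(E[i][j] - block[i][j]) < 1e-10
```

For \(n = 1\) the blocks have closed forms: the bottom-right entry is \(\cosh (\Vert {\varvec{X}}\Vert )\) and the top-right block is \({\varvec{X}}\sinh (\Vert {\varvec{X}}\Vert )/\Vert {\varvec{X}}\Vert \).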

Then, from the definition of \({\varvec{U}}_t\) in Eq. (35), we have:

$$\begin{aligned} {\varvec{U}}_t&= \sum _{0 \le s\le t} \sum _{k \ge 2} \frac{\xi ^k \mathscr {S}(\Delta {\varvec{Z}}_s)^k}{k!} \\&= \sum _{0 \le s\le t} \sum _{k \ge 1} \begin{bmatrix} \frac{\xi ^{2k}}{(2k)!} (\Delta {\varvec{Z}}_s \Delta {\varvec{Z}}_s^\top )^k&\frac{\xi ^{2k+1}}{(2k+1)!} \Delta {\varvec{Z}}_s (\Delta {\varvec{Z}}_s^\top \Delta {\varvec{Z}}_s)^{k} \\ \frac{\xi ^{2k+1}}{(2k+1)!} \Delta {\varvec{Z}}_s^\top (\Delta {\varvec{Z}}_s \Delta {\varvec{Z}}_s^\top )^{k}&\frac{\xi ^{2k}}{(2k)!}(\Delta {\varvec{Z}}_s^\top \Delta {\varvec{Z}}_s)^k \end{bmatrix}. \end{aligned}$$

Since \((\Delta {\varvec{Z}}_s (\Delta {\varvec{Z}}_s^\top \Delta {\varvec{Z}}_s)^k)^\top = \Delta {\varvec{Z}}_s^\top (\Delta {\varvec{Z}}_s \Delta {\varvec{Z}}_s^\top )^k\), we need to compute three terms: \((\Delta {\varvec{Z}}_s \Delta {\varvec{Z}}_s^\top )^k\), \((\Delta {\varvec{Z}}_s^\top \Delta {\varvec{Z}}_s)^k\) and \(\Delta {\varvec{Z}}_s^\top (\Delta {\varvec{Z}}_s \Delta {\varvec{Z}}_s^\top )^k\).

From Assumption 1, the entries of \({\varvec{M}}_t\) a.s. cannot jump at the same time, hence

$$\begin{aligned} \begin{aligned} (\Delta {\varvec{M}}_t)_{i_1, j_1}&\times \cdots \times (\Delta {\varvec{M}}_t)_{i_m, j_m} \\&= {\left\{ \begin{array}{ll} ((\Delta {\varvec{M}}_t)_{i_1, j_1})^m &{}\text { if } i_1 = \cdots = i_m \text { and } j_1 = \cdots = j_m \\ 0 &{}\text { otherwise} \end{array}\right. } \end{aligned} \end{aligned}$$
(37)

a.s. for any t, \(m \ge 2\) and any indices \(i_k \in \{ 1, \ldots , p \}\) and \(j_k \in \{ 1, \ldots , q\}\). This entails, with the definition (1) of \(\Delta {\varvec{Z}}_s\), that \((\Delta {\varvec{Z}}_s \Delta {\varvec{Z}}_s^\top )^k\) is given, a.s., by

$$\begin{aligned} \sum _{a=1}^{p} \sum _{b=1}^{q} ( (\mathbb {T}_s)_{\bullet ;a,b} (\mathbb {T}_s)_{\bullet ;a,b}^{\top })^k (({\varvec{C}}_s)_{a,b} (\Delta {\varvec{M}}_s)_{a,b})^{2k} = (\mathbb {T}_s\mathbb {T}_s^\top )^k \circ ({\varvec{C}}_s \odot \Delta {\varvec{M}}_s)^{\odot 2k}. \end{aligned}$$

Let us remark that Eq. (34) entails

$$\begin{aligned}&\mathbbm {E}\int _0^t \sum _{k \ge 1} \frac{\xi ^{2k}}{(2k)!} \sum _{a=1}^{p} \sum _{b=1}^{q} \big (( (\mathbb {T}_s)_{\bullet ;a,b} (\mathbb {T}_s)_{\bullet ;a,b}^{\top })^k \big )_{i,j} (({\varvec{C}}_s)_{a,b})^{2k} \; \mathbbm {E}[|{\varvec{J}}_1|_{a, b}^{2k}] \; ({\varvec{\lambda }}_s)_{a,b} \; ds\\&\quad < + \infty \end{aligned}$$

for any \(i, j\), so that, together with Assumption 1, it is easily seen that the compensator of

$$\begin{aligned} \sum _{0 \le s\le t} \sum _{k \ge 1} \frac{\xi ^{2k}}{(2k)!} (\Delta {\varvec{Z}}_s \Delta {\varvec{Z}}_s^\top )^k \end{aligned}$$
(38)

is a.s. given by

$$\begin{aligned} \int _0^t \sum _{k \ge 1} \frac{\xi ^{2k}}{(2k)!} \sum _{a=1}^{p} \sum _{b=1}^{q} ((\mathbb {T}_s)_{\bullet ;a,b} (\mathbb {T}_s)_{\bullet ;a,b}^{\top })^k ({\varvec{C}}_s)_{a,b}^{2k} \mathbbm {E}[({\varvec{J}}_1)_{a,b}^{2k}] ({\varvec{\lambda }}_s)_{a, b} ds. \end{aligned}$$

Following the same arguments as for (38), we obtain that the compensator of

$$\begin{aligned} \sum _{0 \le s\le t} \sum _{k \ge 1} \frac{\xi ^{2k}}{(2k)!} (\Delta {\varvec{Z}}_s^\top \Delta {\varvec{Z}}_s)^k \end{aligned}$$

is a.s. given by

$$\begin{aligned} \int _0^t \sum _{k \ge 1} \frac{\xi ^{2k}}{(2k)!} \sum _{a=1}^{p} \sum _{b=1}^{q} ((\mathbb {T}_s)_{\bullet ;a,b}^{\top } (\mathbb {T}_s)_{\bullet ;a,b} )^k ({\varvec{C}}_s)_{a,b}^{2k} \mathbbm {E}[({\varvec{J}}_1)_{a,b}^{2k}] ({\varvec{\lambda }}_s)_{a, b} ds. \end{aligned}$$

Along the same lines, one can easily show that the compensator of

$$\begin{aligned} \sum _{0 \le s\le t} \sum _{k \ge 1} \frac{\xi ^{2k + 1}}{(2k + 1)!} \Delta {\varvec{Z}}_s^\top (\Delta {\varvec{Z}}_s \Delta {\varvec{Z}}_s^\top )^k, \end{aligned}$$

is a.s. given by:

$$\begin{aligned} \int _0^t \sum _{k \ge 1} \frac{\xi ^{2k+1}}{(2k+1)!} \sum _{a=1}^{p} \sum _{b=1}^{q} (\mathbb {T}_s)_{\bullet ;a,b}^{\top } ((\mathbb {T}_s)_{\bullet ;a,b} (\mathbb {T}_s)_{\bullet ;a,b}^{\top } )^k ({\varvec{C}}_s)_{a,b}^{2k+1} \mathbbm {E}[({\varvec{J}}_1)_{a,b}^{2k+1}] ({\varvec{\lambda }}_s)_{a, b} ds. \end{aligned}$$

Finally, we can write, a.s., the compensator of \({\varvec{U}}_t\) as

$$\begin{aligned} {\varvec{\Lambda }}_t = \int _0^t \sum _{k \ge 1} {\varvec{R}}_s^{(k)} ds \end{aligned}$$
(39)

where

$$\begin{aligned} {\varvec{R}}_s^{(k)} = \begin{bmatrix} \frac{\xi ^{2k}}{(2k)!} {\varvec{D}}_{1, s}^{(k)}&\frac{\xi ^{2k+1}}{(2k+1)!} ({\varvec{H}}_{s}^{(k+1)})^\top \\ \frac{\xi ^{2k+1}}{(2k+1)!} {{\varvec{H}}_{s}^{(k+1)}}&\frac{\xi ^{2k}}{(2k)!} {\varvec{D}}^{(k)}_{2, s} \end{bmatrix} \end{aligned}$$
(40)

with

$$\begin{aligned} {\varvec{D}}_{1, s}^{(k)}&= \sum _{a=1}^{p} \sum _{b=1}^{q} ((\mathbb {T}_s)_{\bullet ;a,b} (\mathbb {T}_s)_{\bullet ;a,b}^{\top })^k ({\varvec{C}}_s)_{a,b}^{2k} \mathbbm {E}[({\varvec{J}}_1)_{a,b}^{2k}] ({\varvec{\lambda }}_s)_{a, b} \\ {\varvec{D}}_{2, s}^{(k)}&= \sum _{a=1}^{p} \sum _{b=1}^{q} ((\mathbb {T}_s)_{\bullet ;a,b}^\top (\mathbb {T}_s)_{\bullet ;a,b} )^k ({\varvec{C}}_s)_{a,b}^{2k} \mathbbm {E}[({\varvec{J}}_1)_{a,b}^{2k}] ({\varvec{\lambda }}_s)_{a, b} \\ {\varvec{H}}_{s}^{(k+1)}&= \sum _{a=1}^{p} \sum _{b=1}^{q} (\mathbb {T}_s)_{\bullet ;a,b}^{\top } ((\mathbb {T}_s)_{\bullet ;a,b} (\mathbb {T}_s)_{\bullet ;a,b}^{\top } )^k ({\varvec{C}}_s)_{a,b}^{2k+1} \mathbbm {E}[({\varvec{J}}_1)_{a,b}^{2k+1}] ({\varvec{\lambda }}_s)_{a, b}. \end{aligned}$$

One can now directly use Lemma A.4 with \({\varvec{X}}= (\mathbb {T}_s)_{\bullet ;a,b} \mathbbm {E}[({\varvec{J}}_1)_{a,b}^{2k+1}]^{1/(2k+1)} ({\varvec{C}}_s)_{a,b}\) to obtain:

$$\begin{aligned} {\varvec{\Lambda }}_t \preccurlyeq&\int _0^t \sum _{k \ge 2} \sum _{a=1}^p \sum _{b=1}^{q} \frac{\xi ^k J_{\max }^{k-2}}{k!} \begin{bmatrix} ((\mathbb {T}_s)_{\bullet ;a,b} (\mathbb {T}_s)_{\bullet ;a,b}^{\top } )^{k/2}&\varvec{0} \\ \varvec{0}&((\mathbb {T}_s)_{\bullet ;a,b}^\top (\mathbb {T}_s)_{\bullet ;a,b})^{k/2} \end{bmatrix}\\&\times ({\varvec{C}}_s)_{a,b}^{k} \mathbbm {E}[({\varvec{J}}_1)_{a,b}^2] ({\varvec{\lambda }}_s)_{a, b} ds \\ =&\int _0^t \sum _{k \ge 2} \frac{\xi ^k J_{\max }^{k-2}}{k!} \begin{bmatrix} (\mathbb {T}_s \circ \mathbb {T}_s^\top )^{k/2}&\varvec{0} \\ \varvec{0}&(\mathbb {T}_s^{\top } \circ \mathbb {T}_s)^{k/2} \end{bmatrix} \circ \big ( {\varvec{C}}_s^{\odot k} \odot \mathbbm {E}({\varvec{J}}_1^{\odot 2}) \odot {\varvec{\lambda }}_s\big ) ds \end{aligned}$$

where we used the fact that \(|({\varvec{J}}_1)_{i, j}| \le J_{\max }\) a.s. for any ij under Assumption 1. Given the fact that

$$\begin{aligned} ((\mathbb {T}_s)_{\bullet ; a,b} (\mathbb {T}_s)_{\bullet ;a,b}^\top )^{1/2} \preccurlyeq \Vert (\mathbb {T}_s)_{\bullet ; a,b}\Vert _{{{\mathrm{op}}}} {\varvec{I}}_m \preccurlyeq \Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty } {\varvec{I}}_m \end{aligned}$$

for any ab, where we used the notations and definitions from Sect. 2.1, we have:

$$\begin{aligned} {\varvec{\Lambda }}_t \preccurlyeq&\int _0^t \begin{bmatrix} \mathbb {T}_s \mathbb {T}_s^{\top }&\varvec{0} \\ \varvec{0}&\mathbb {T}_s^\top \mathbb {T}_s \end{bmatrix} \circ ({\varvec{C}}_s^{\odot 2} \odot \mathbbm {E}({\varvec{J}}_1^{\odot 2}) \odot {\varvec{\lambda }}_s)\\&\times \sum _{k \ge 2} \frac{\xi ^k}{k!} J_{\max }^{k-2} \Vert {\varvec{C}}_s\Vert _\infty ^{k-2} \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty })^{k-2} ds \\ =&\int _0^t \begin{bmatrix} \mathbb {T}_s \mathbb {T}_s^{\top }&\varvec{0} \\ \varvec{0}&\mathbb {T}_s^\top \mathbb {T}_s \end{bmatrix} \circ ({\varvec{C}}_s^{\odot 2} \odot \mathbbm {E}({\varvec{J}}_1^{\odot 2}) \odot {\varvec{\lambda }}_s)\\&\times \frac{\phi \big (\xi J_{\max } \Vert {\varvec{C}}_s\Vert _\infty \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty })\big )}{J_{\max }^2 \Vert {\varvec{C}}_s\Vert _\infty ^2 \max (\Vert \mathbb {T}_s\Vert ^2_{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert ^2_{{{\mathrm{op}}}; \infty })} ds \end{aligned}$$

where we recall that \(\phi (x) = e^x - 1 - x\). Hence, we finally get

$$\begin{aligned} {\varvec{\Lambda }}_t \preccurlyeq \int _0^t \frac{\phi \big (\xi J_{\max } \Vert {\varvec{C}}_s\Vert _\infty \max (\Vert \mathbb {T}_s\Vert _{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert _{{{\mathrm{op}}}; \infty })\big )}{J_{\max }^2 \Vert {\varvec{C}}_s\Vert _\infty ^2 \max (\Vert \mathbb {T}_s\Vert ^2_{{{\mathrm{op}}}; \infty },\Vert \mathbb {T}_s^\top \Vert ^2_{{{\mathrm{op}}}; \infty })} {\varvec{W}}_s ds, \end{aligned}$$

where \({\varvec{W}}_t\) is given by (7). This concludes the proof of Proposition B.1. \(\square \)

Appendix D: Proof of Theorem 2

The proof follows the same lines as the proof of Theorem 1. We consider as before the symmetric dilation \(\mathscr {S}({\varvec{Z}}_t)\) of \({\varvec{Z}}_t\) (see Eq. (30)) and apply Proposition 1 with \({\varvec{Y}}_t = \xi \mathscr {S}({\varvec{Z}}_t)\) and \(d = m + n\). Since \({\varvec{Z}}_t\) is a continuous martingale, we have \({\varvec{U}}_t = {\varvec{0}}\) (cf. (3)), so that \(\langle {\varvec{U}} \rangle _t = {\varvec{0}}\) and \(\langle {\varvec{Z}}^c \rangle _t = \langle {\varvec{Z}} \rangle _t\). Proposition 1 then gives

$$\begin{aligned} \mathbbm {E}\Big [ {{\mathrm{tr}}}\exp \Big ( \xi \mathscr {S}({\varvec{Z}}_t) - \frac{1}{2} \sum _{j=1}^{m+n} \xi ^2 \langle \mathscr {S}({\varvec{Z}})_{\bullet ,j} \rangle _t \Big ) \Big ] \le m + n. \end{aligned}$$
(41)

From the definition of the dilation operator \(\mathscr {S}\), it can be directly shown that:

$$\begin{aligned} \sum _{j=1}^{m+n} \langle \mathscr {S}({\varvec{Z}})_{\bullet ,j} \rangle _t = \begin{bmatrix} \sum _{j=1}^n \langle {\varvec{Z}}_{\bullet ,j} \rangle _t&{\varvec{0}}_{m,n} \\ {\varvec{0}}_{n,m}&\sum _{j=1}^m \langle {\varvec{Z}}_{j,\bullet } \rangle _t \end{bmatrix} \end{aligned}$$

where \(\langle {\varvec{Z}}_{\bullet ,j} \rangle _t\) (resp. \(\langle {\varvec{Z}}_{j,\bullet } \rangle _t\)) is the \(m \times m\) (resp. \(n \times n\)) matrix of the quadratic variation of the j-th column (resp. row) of \({\varvec{Z}}_t\). Since \([{\varvec{M}}^\mathrm {con}]_t = \langle {\varvec{M}}^\mathrm {con} \rangle _t = t {\varvec{I}}\), we have (for the sake of clarity, we omit the subscript t in the matrices):

$$\begin{aligned} \sum _{j=1}^n (d\langle {\varvec{Z}}_{\bullet ,j} \rangle _t)_{kl}&= \sum _{j=1}^n d[{\varvec{Z}}_{k,j},{\varvec{Z}}_{l,j}] \\&= \sum _{j=1}^n \sum _{a=1}^p \sum _{b=1}^q \mathbb {T}_{k,j;a,b} \mathbb {T}_{l,j;a,b} {\varvec{C}}_{a,b}^2 dt \\&= \big ( \mathbb {T}_t \mathbb {T}_t^\top \circ {\varvec{C}}_t^{\odot 2} \big )_{k,l} dt \end{aligned}$$

which gives in a matrix form

$$\begin{aligned} \sum _{j=1}^n d\langle {\varvec{Z}}_{\bullet ,j} \rangle _t = \mathbb {T}_t \mathbb {T}_t^\top \circ {\varvec{C}}_t^{\odot 2} dt. \end{aligned}$$

One can easily prove in the same way that

$$\begin{aligned} \sum _{j=1}^m d\langle {\varvec{Z}}_{j,\bullet } \rangle _t = \mathbb {T}_t^\top \mathbb {T}_t \circ {\varvec{C}}_t^{\odot 2} dt. \end{aligned}$$

Thus,

$$\begin{aligned} \sum _{j=1}^{m+n} \langle \mathscr {S}({\varvec{Z}})_{\bullet ,j}^c \rangle _t = {\varvec{V}}_t, \end{aligned}$$

where \({\varvec{V}}_t\) is given by (13). From (41), it follows that

$$\begin{aligned} \mathbbm {E}\Big [ {{\mathrm{tr}}}\exp \Big ( \xi \mathscr {S}({\varvec{Z}}_t)-\frac{\xi ^2}{2} {\varvec{V}}_t \Big ) \Big ] \le m + n. \end{aligned}$$

Then, using Lemma A.1, one gets

$$\begin{aligned} \mathbbm {P}\bigg [ \lambda _{\max }(\mathscr {S}({\varvec{Z}}_t)) \ge \frac{\xi }{2} \lambda _{\max }({\varvec{V}}_t) + \frac{x}{\xi }\bigg ] \le (m+n) e^{-x}. \end{aligned}$$
(42)

On the event \(\{\lambda _{\max }( {\varvec{V}}_t ) \le v\}\), one gets

$$\begin{aligned} \mathbbm {P}\bigg [ \lambda _{\max }(\mathscr {S}({\varvec{Z}}_t)) \ge \frac{\xi }{2} v + \frac{x}{\xi }~,~~~ \lambda _{\max }( {\varvec{V}}_t ) \le v \bigg ] \le (m+n) e^{-x}. \end{aligned}$$
(43)

Optimizing over \(\xi \), we apply this last bound with \(\xi = \sqrt{2x / v}\) and get

$$\begin{aligned} \mathbbm {P}\bigg [ \lambda _{\max }(\mathscr {S}({\varvec{Z}}_t)) \ge \sqrt{2xv}, \quad \lambda _{\max }( {\varvec{V}}_t ) \le v \bigg ] \le (m+n) e^{-x}. \end{aligned}$$
(44)
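The choice \(\xi = \sqrt{2x/v}\) is indeed the minimizer of the bound in (43); plugging it in gives

```latex
\frac{\xi v}{2} + \frac{x}{\xi}
  \,\Big|_{\xi = \sqrt{2x/v}}
  = \sqrt{\frac{2x}{v}}\,\frac{v}{2} + x \sqrt{\frac{v}{2x}}
  = \sqrt{\frac{x v}{2}} + \sqrt{\frac{x v}{2}}
  = \sqrt{2 x v}.
```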

Since \( \lambda _{\max }(\mathscr {S}({\varvec{Z}}_t)) = \Vert \mathscr {S}({\varvec{Z}}_t)\Vert _{{{\mathrm{op}}}}\), this concludes the proof of Theorem 2. \(\square \)


Cite this article

Bacry, E., Gaïffas, S. & Muzy, JF. Concentration inequalities for matrix martingales in continuous time. Probab. Theory Relat. Fields 170, 525–553 (2018). https://doi.org/10.1007/s00440-017-0786-9


Mathematics Subject Classification

  • 60B20
  • 60G44
  • 60H05
  • 60G48