1 Introduction

Given a probability space \((\Omega , {\mathcal {F}}, P)\) with a right-continuous filtration \(\textbf{F}=\{{\mathcal {F}}_t\}_{t\ge 0}\), let \(X^{(\alpha )}=\{X_t^{(\alpha )}\}_{t\ge 0}=\{(X^{(\alpha ),1}_t,X^{(\alpha ),2}_t)\}_{t\ge 0}\) be a two-dimensional \(\textbf{F}\)-adapted process satisfying the following stochastic differential equation:

$$\begin{aligned} {\textrm{d}}X_t^{(\alpha )}=\mu _t(\theta ){\textrm{d}}t+b_t(\sigma ) {\textrm{d}}W_t, \quad X_0=x_0, \end{aligned}$$
(1.1)

where \(x_0\in \mathbb {R}^2\), \(\{W_t\}_{t\ge 0}\) is a two-dimensional standard \(\textbf{F}\)-Wiener process, \(\{\mu _t(\theta )\}_{t\ge 0}\) and \(\{b_t(\sigma )\}_{t\ge 0}\) are deterministic functions with values in \({\mathbb {R}}^{2}\) and \({\mathbb {R}}^{2\times 2}\), respectively, \(\alpha =(\sigma ,\theta )\), \(\sigma \in \Theta _1\), \(\theta \in \Theta _2\), and \(\Theta _1\) and \(\Theta _2\) are bounded open subsets of \({\mathbb {R}}^{d_1}\) and \(\mathbb {R}^{d_2}\), respectively. Let \(\alpha _0=(\sigma _0,\theta _0)\in \Theta _1\times \Theta _2\) be the true value, and let \(X_t=(X_t^1,X_t^2)=X_t^{(\alpha _0)}\). We consider estimation of \(\alpha _0\) when X is observed in a nonsynchronous manner, that is, when the observation times of \(X^1\) and \(X^2\) are different from each other.
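To fix ideas, the following is a minimal simulation sketch of model (1.1) and of nonsynchronous observation, assuming, purely for illustration, constant coefficients \(\mu _t(\theta )\equiv \mu \) and \(b_t(\sigma )\equiv b\) and independent Poisson observation times (cf. Example 2.6 below); all function names are hypothetical and not from the paper.

```python
# A minimal sketch, NOT from the paper: Euler scheme for (1.1) with constant
# coefficients and independent Poisson observation times (cf. Example 2.6).
import numpy as np

rng = np.random.default_rng(0)

def simulate_paths(T, dt, mu, b, x0):
    """Euler scheme for dX_t = mu dt + b dW_t on a fine grid."""
    m = int(T / dt)
    t = np.linspace(0.0, T, m + 1)
    dW = rng.normal(scale=np.sqrt(dt), size=(m, 2))
    X = np.empty((m + 1, 2))
    X[0] = x0
    for k in range(m):
        X[k + 1] = X[k] + mu * dt + b @ dW[k]
    return t, X

def poisson_times(T, lam):
    """Observation times 0 = S_0 < S_1 < ... with the last one pinned to T."""
    times = [0.0]
    while times[-1] < T:
        times.append(times[-1] + rng.exponential(1.0 / lam))
    times[-1] = T  # S_{M_l} = n h_n, as in Sect. 2.1
    return np.array(times)

T, dt = 10.0, 1e-4
mu = np.array([0.5, -0.3])                  # mu(theta_0), hypothetical values
b = np.array([[1.0, 0.0], [0.5, 1.0]])      # b(sigma_0), hypothetical values
t, X = simulate_paths(T, dt, mu, b, x0=np.zeros(2))
S1, S2 = poisson_times(T, lam=30.0), poisson_times(T, lam=20.0)
# nonsynchronous data: each component is read off at its own times
X1_obs = X[np.searchsorted(t, S1), 0]
X2_obs = X[np.searchsorted(t, S2), 1]
```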

The problem of nonsynchronous observations appears in the analysis of high-frequency financial data. When we analyze intra-day stock price data, we observe a price only when a new transaction or a new order arrives. The observation times are therefore different for different stocks, and we cannot avoid the problem of nonsynchronous observations. Statistical analysis with such data is much more complicated than with synchronous data. Parametric estimation for diffusion processes with synchronous, equidistant observations has been analyzed through quasi-maximum-likelihood methods in Florens-Zmirou (1989), Yoshida (1992, 2011), Kessler (1997), and Uchida and Yoshida (2012). Related to the estimation problem for nonsynchronously observed diffusion processes, estimators for the quadratic covariation have been actively studied. Hayashi and Yoshida (2005, 2008, 2011) and Malliavin and Mancino (2002, 2009) have independently constructed consistent estimators under nonsynchronous observations. There are also studies of covariation estimation under the simultaneous presence of microstructure noise and nonsynchronous observations (Barndorff-Nielsen et al., 2011; Bibinger et al., 2014; Christensen et al., 2010, and so on). For parametric estimation with nonsynchronous observations, Ogihara and Yoshida (2014) constructed maximum-likelihood-type and Bayes-type estimators and showed the consistency and the asymptotic mixed normality of the estimators when the terminal time point \(T_n\) is fixed and the observation frequency goes to infinity. Ogihara (2015) showed local asymptotic mixed normality for the model in Ogihara and Yoshida (2014), and the maximum-likelihood-type and Bayes-type estimators were shown to be asymptotically efficient. On the other hand, to consistently estimate the parameter \(\theta \) in the drift term, we need to consider asymptotic theory in which the terminal time point \(T_n\) goes to infinity. To the best of the author’s knowledge, there are no studies of the asymptotic theory of parametric estimation for nonsynchronously observed diffusion processes when \(T_n\rightarrow \infty \).

In this work, we consider the asymptotic theory for nonsynchronously observed diffusion processes when \(T_n\rightarrow \infty \), and construct maximum-likelihood-type estimators for the parameter \(\sigma \) in the diffusion part and the parameter \(\theta \) in the drift part. We show the consistency and the asymptotic normality of the estimators. Moreover, we show local asymptotic normality of the statistical model, and we obtain asymptotic efficiency of our estimators as a consequence. Our estimator is constructed from a quasi-likelihood function defined similarly to that in Ogihara and Yoshida (2014), though we need some modification to deal with the drift part. To investigate the asymptotic theory for the maximum-likelihood-type estimator, we need to specify the limit of the quasi-likelihood function, and to this end, we must assume some conditions on the asymptotic behavior of the sampling scheme. In Ogihara and Yoshida (2014), for the matrix

$$\begin{aligned} G=\left\{ \frac{(S_i^{n,1}\wedge S_j^{n,2}-S_{j-1}^{n,2} \vee S_{i-1}^{n,1})\vee 0}{|S_i^{n,1}-S_{i-1}^{n,1}|^{1/2}|S_j^{n,2}-S_{j-1}^{n,2}|^{1/2}}\right\} _{i,j} \end{aligned}$$

generated by the sampling scheme, the existence of the probability limit of \(n^{-1}\textrm{tr}((GG^\top )^p) \ (p\in \mathbb {Z}_+)\) is required, where \((S_i^{n,l})_i\) are the observation times of \(X^l\) and \(\top \) denotes the transpose of a vector or a matrix. Since we work under a different asymptotic regime, the asymptotic behavior of the quasi-likelihood function differs from that in Ogihara and Yoshida (2014). We also need to consider estimation of the drift parameter \(\theta \), which requires further assumptions on the asymptotic behavior of the sampling scheme [Assumption (A5)]. Though these conditions on the sampling scheme are difficult to check directly, we study tractable sufficient conditions in Sect. 2.4.
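The following sketch (hypothetical helper names, building on the simulation sketch above) computes the matrix G from two observation grids and the normalized traces \(n^{-1}\textrm{tr}((GG^\top )^p)\) whose limits are assumed in Ogihara and Yoshida (2014) and, in localized form, in (A4) below.

```python
# Hypothetical helper: the matrix G and the traces n^{-1} tr((G G^T)^p).
import numpy as np

def overlap_matrix(S1, S2):
    """G[i, j] = |I_i^1 ∩ I_j^2| / (|I_i^1|^{1/2} |I_j^2|^{1/2})."""
    L1, R1 = S1[:-1], S1[1:]   # I_i^1 = (S_{i-1}^{n,1}, S_i^{n,1}]
    L2, R2 = S2[:-1], S2[1:]
    inter = np.maximum(
        np.minimum(R1[:, None], R2[None, :])
        - np.maximum(L1[:, None], L2[None, :]),
        0.0,
    )
    return inter / np.sqrt((R1 - L1)[:, None] * (R2 - L2)[None, :])

G = overlap_matrix(S1, S2)      # S1, S2 from the previous sketch
GGt = G @ G.T
n = len(S1) + len(S2) - 2       # proxy for the total number of increments
Mp = np.eye(GGt.shape[0])
for p in range(1, 4):
    Mp = Mp @ GGt
    print(p, np.trace(Mp) / n)  # should stabilize as n grows
```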

The rest of this paper is organized as follows. In Sect. 2, we introduce our model settings and the assumptions for main results. Our estimator is constructed in Sect. 2.1, and the asymptotic normality of the estimator is given in Sect. 2.2. Section 2.3 deals with local asymptotic normality of our model and asymptotic efficiency of the estimator. Tractable sufficient conditions for the assumptions of the sampling scheme are given in Sect. 2.4. Section 3 contains the proofs of main results. Preliminary results are collected in Sect. 3.1. Section 3.2 is for the consistency of the estimator for \(\sigma \), Sect. 3.3 is for the asymptotic normality of the estimator for \(\sigma \), Sect. 3.4 is for the consistency of the estimator for \(\theta \), and Sect. 3.5 is for the asymptotic normality of the estimator for \(\theta \). Other proofs are collected in Sect. 3.6.

2 Main results

2.1 Setting and parameter estimation

Let \(\mathbb {N}\) be the set of all positive integers. For \(l\in \{1,2\}\), let the observation times \(\{S_i^{n,l}\}_{i=0}^{M_l}\) be strictly increasing random times with respect to i, and satisfy \(S_0^{n,l}=0\) and \(S_{M_l}^{n,l}=nh_n\), where \(M_l\) is a random positive integer depending on n and \((h_n)_{n=1}^\infty \) is a sequence of positive numbers satisfying

$$\begin{aligned} h_n\rightarrow 0, \quad {n^{1-\epsilon _0}}h_n\rightarrow \infty , \quad nh_n^2\rightarrow 0 \end{aligned}$$
(2.1)

as \(n\rightarrow \infty \) for some \(\epsilon _0>0\). For example, \(h_n=n^{-2/3}\) satisfies (2.1) for any \(\epsilon _0<1/3\). Intuitively, n is of the order of the number of observations, and \(h_n\) is of the order of the length of the observation intervals. More precise assumptions on the observation times are given in (A2), (A4), and (A5) below. We assume that \(\{S_i^{n,l}\}_{0\le i\le M_l, l=1,2}\) is independent of \({\mathcal {F}}_T\), and that its distribution does not depend on \(\alpha \). We consider nonsynchronous observations of X; that is, we observe \(\{S_i^{n,l}\}_{0\le i\le M_l, l=1,2}\) and \(\{X^l_{S^{n,l}_i}\}_{0\le i\le M_l, l=1,2}\). In particular, the observation times are nonendogenous.

We denote by \(\Vert \cdot \Vert \) the operator norm with respect to the Euclidean norm for a matrix. We often regard a p-dimensional vector v as a \(p\times 1\) matrix. For \(j\in \mathbb {N}\), we denote \(\partial _z=\frac{\partial }{\partial z}\) for a variable \(z\in \mathbb {R}^j\), and denote \(\partial _z^l=(\partial _{z_{i_1}}\cdots \partial _{z_{i_l}})_{i_1,\ldots , i_l=1}^j\) for \(l\in \mathbb {N}\). For functions f and g, we often use the shorthand notation \(\partial _zf\partial _zg=(\partial _zf(\partial _zg)^\top +\partial _zg (\partial _zf)^\top )/2\). For a set A in a topological space, let \(\textrm{clos}(A)\) denote the closure of A. For a matrix A, \([A]_{ij}\) denotes its (i, j)th element. For a vector \(v=(v_j)_{j=1}^K\), we denote \([v]_j=v_j\), and \(\textrm{diag}(v)\) denotes a \(K\times K\) diagonal matrix with elements \([{\textrm{diag}}(v)]_{jj}=v_j\).

Let \(M=M_1+M_2\). For \(1\le i\le M\), let

$$\begin{aligned} \varphi (i)=\left\{ \begin{array}{ll} i, &{}\quad \textrm{if} \ i\le M_1,\\ i-M_1, &{} \quad \textrm{if} \ i>M_1, \end{array} \right. \quad \psi (i)=\left\{ \begin{array}{ll} 1, &{}\quad \textrm{if} \ i\le M_1,\\ 2, &{}\quad \textrm{if} \ i>M_1. \end{array} \right. \end{aligned}$$

For a two-dimensional stochastic process \((U_t)_{t\ge 0}=((U_t^1,U_t^2))_{t\ge 0}\), let \(\Delta _i^l U=U^l_{S^{n,l}_i}-U^l_{S^{n,l}_{i-1}}\), and let \(\Delta ^l U=(\Delta _i^l U)_{1\le i\le M_l}\) and \(\Delta _i U=\Delta _{\varphi (i)}^{\psi (i)} U\) for \(1\le i\le M\). Let \(\Delta U=((\Delta ^1 U)^\top , (\Delta ^2 U)^\top )^\top \). Let \(|K|=b-a\) for an interval \(K=(a,b]\). Let \(I_i^l=(S_{i-1}^{n,l},S_i^{n,l}]\) for \(1\le i\le M_l\), and let \(I_i=I_{\varphi (i)}^{\psi (i)}\) for \(1\le i\le M\). We denote a unit matrix of size k by \({\mathcal {E}}_k\).

Let \(\tilde{\Sigma }_i^l(\sigma )=\int _{I_i^l}[b_tb_t^\top (\sigma )]_{ll}{\textrm{d}}t\) and \(\tilde{\Sigma }_{i,j}^{1,2}(\sigma )=\int _{I_i^1\cap I_j^2}[b_tb_t^\top (\sigma )]_{12}{\textrm{d}}t\), and let \(\tilde{\Sigma }_i=\tilde{\Sigma }_{\varphi (i)}^{\psi (i)}\) for \(1\le i\le M\). Setting \(\tilde{\mathcal {D}}=\textrm{diag}((\tilde{\Sigma }_i)_{1\le i\le M})\) and

$$\begin{aligned} \tilde{G}(\sigma )= \bigg \{\frac{\tilde{\Sigma }_{i,j}^{1,2}}{\sqrt{\tilde{\Sigma }_i^1} \sqrt{\tilde{\Sigma }_j^2}}(\sigma )\bigg \}_{1\le i\le M_1, 1\le j\le M_2}, \end{aligned}$$

we can calculate the covariance matrix of \(\Delta X\) as

$$\begin{aligned} S_n(\sigma )=\tilde{\mathcal {D}}^{1/2}\left( \begin{array}{cc} \mathcal {E}_{M_1} &{} \tilde{G}(\sigma ) \\ \tilde{G}^\top (\sigma ) &{} \mathcal {E}_{M_2} \end{array} \right) \tilde{\mathcal {D}}^{1/2}. \end{aligned}$$
(2.2)
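As an illustration, the covariance matrix (2.2) can be assembled as follows. The sketch assumes a constant diffusion coefficient, so that the integrals defining \(\tilde{\Sigma }_i^l\) and \(\tilde{\Sigma }_{i,j}^{1,2}\) reduce to weighted interval lengths, and it reuses the hypothetical helper overlap_matrix from the sketch above.

```python
# A sketch of S_n(sigma) in (2.2), assuming a constant diffusion coefficient
# so that tilde-Sigma reduces to Sigma-weighted interval lengths.
import numpy as np

def covariance_S_n(S1, S2, b):
    Sigma = b @ b.T
    d1, d2 = np.diff(S1), np.diff(S2)          # |I_i^1|, |I_j^2|
    rho = Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])
    Gt = rho * overlap_matrix(S1, S2)          # tilde G, constant-coefficient case
    M1, M2 = len(d1), len(d2)
    R = np.block([[np.eye(M1), Gt], [Gt.T, np.eye(M2)]])
    D = np.concatenate([Sigma[0, 0] * d1, Sigma[1, 1] * d2])  # diag of tilde D
    return np.sqrt(D)[:, None] * R * np.sqrt(D)[None, :]      # D^{1/2} R D^{1/2}
```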

As we will see later, we can ignore the term related to \(\mu _t(\theta )\) (drift term) when we consider estimation of \(\sigma \), because this term converges to zero very fast. Therefore, we first construct an estimator for \(\sigma \), and then construct an estimator for \(\theta \). Such adaptive estimation can speed up the calculation.

We define the quasi-likelihood function \(H_n^1(\sigma )\) for \(\sigma \) as follows:

$$\begin{aligned} H_n^1(\sigma )=-\frac{1}{2}\Delta X^\top S_n^{-1}(\sigma )\Delta X-\frac{1}{2}\log \det S_n(\sigma ). \end{aligned}$$

Then, the maximum-likelihood-type estimator for \(\sigma \) is defined by

$$\begin{aligned} \hat{\sigma }_n\in \mathop {\textrm{argmax}}\limits _{\sigma \in \textrm{clos}(\Theta _1)} H_n^1(\sigma ). \end{aligned}$$
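A minimal sketch of the computation of \(H_n^1\) and \(\hat{\sigma }_n\), under the illustrative assumptions of the previous sketches; the one-parameter family \(b(\sigma )\) below is hypothetical and only serves as an example of a parametrization satisfying (A1) on a bounded parameter set.

```python
# A sketch of H_n^1 and sigma_hat; b(sigma) is a hypothetical family.
import numpy as np
from scipy.optimize import minimize_scalar

dX = np.concatenate([np.diff(X1_obs), np.diff(X2_obs)])   # Delta X

def H1(sigma):
    b_sig = np.array([[1.0, 0.0], [sigma, 1.0]])
    S = covariance_S_n(S1, S2, b_sig)
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * dX @ np.linalg.solve(S, dX) - 0.5 * logdet

res = minimize_scalar(lambda s: -H1(s), bounds=(0.0, 2.0), method="bounded")
sigma_hat = res.x   # maximum-likelihood-type estimator for sigma
```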

We consider estimation for \(\theta \) next. Let \(V(\theta )=(V_t(\theta ))_{t\ge 0}\) be a two-dimensional stochastic process defined by \(V_t(\theta )=(\int _0^t\mu ^1_s(\theta )^\top {\textrm{d}}s,\int _0^t\mu ^2_s(\theta )^\top {\textrm{d}}s)^\top \). Let \(\bar{X}(\theta )=\Delta X-\Delta V(\theta )\). We define the quasi-likelihood function \(H_n^2(\theta )\) for \(\theta \) as follows:

$$\begin{aligned} H_n^2(\theta )=-\frac{1}{2}\bar{X}(\theta )^\top S_n^{-1}(\hat{\sigma }_n)\bar{X}(\theta ). \end{aligned}$$

Then, the maximum-likelihood-type estimator for \(\theta \) is defined by

$$\begin{aligned} \hat{\theta }_n\in \mathop {\textrm{argmax}}\limits _{\theta \in \textrm{clos}(\Theta _2)} H_n^2(\theta ). \end{aligned}$$
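Continuing the sketch, \(\hat{\theta }_n\) is obtained by plugging \(\hat{\sigma }_n\) into \(S_n\); the constant drift parametrization \(\mu (\theta )=\theta \in \mathbb {R}^2\) below is again a hypothetical choice made for illustration.

```python
# A sketch of H_n^2 with sigma_hat plugged in; mu(theta) = theta (constant
# drift) is a hypothetical parametrization.
import numpy as np
from scipy.optimize import minimize

b_hat = np.array([[1.0, 0.0], [sigma_hat, 1.0]])
S_hat = covariance_S_n(S1, S2, b_hat)

def H2(theta):
    # Delta_i^l V(theta) = mu^l(theta) |I_i^l| for constant drift
    dV = np.concatenate([theta[0] * np.diff(S1), theta[1] * np.diff(S2)])
    r = dX - dV                                # bar X(theta)
    return -0.5 * r @ np.linalg.solve(S_hat, r)

theta_hat = minimize(lambda th: -H2(th), x0=np.zeros(2)).x
```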

The quasi-(log-)likelihood function \(H_n^1\) is defined in the same way as that in Ogihara and Yoshida (2014). Since \(\Delta X\) follows a normal distribution, we can construct such a Gaussian quasi-likelihood function even for nonsynchronous data. When the coefficients are random, the distribution of \(\Delta X\) is no longer Gaussian, but such a Gaussian-type quasi-likelihood function is still valid due to the local Gaussian property of diffusion processes. The Gaussian mean coming from the drift part is ignored when we construct the quasi-likelihood \(H_n^1\). When we estimate the parameter \(\theta \) of the drift part, we subtract the mean in \(\bar{X}(\theta )\) to construct the quasi-likelihood function \(H_n^2\). Since the effect of the drift term on the estimation of \(\sigma \) is small, it works well to estimate \(\sigma \) first in this way and then plug \(\hat{\sigma }_n\) into \(S_n\) to construct the estimator for \(\theta \). Thus, we can speed up the calculation by separating the estimation of \(\sigma \) and \(\theta \).

Remark 2.1

\(H_n^1(\sigma )\) and \(H_n^2(\theta )\) are well defined only if \(\det S_n(\sigma )>0\) and \(\det S_n(\hat{\sigma }_n)>0\), respectively. For the covariance matrix \(S_n\) of nonsynchronous observations \(\Delta X\), it is not trivial to check these conditions. Proposition 1 in Section 2 of Ogihara and Yoshida (2014) shows that these conditions are satisfied if \(b_t(\sigma )\) is continuous on \([0,\infty )\times \textrm{clos}(\Theta _1)\) and \(\inf _{t,\sigma } \det (b_tb_t^\top (\sigma ))>0\). We assume such conditions in our setting (Assumption (A1) in Sect. 2.2).

Remark 2.2

As seen in Ogihara and Yoshida (2014), the quasi-likelihood analysis for nonsynchronously observed diffusion processes is much more complicated than that for synchronous observations. In this work, estimation of the drift parameter \(\theta \) is added, and hence, we consider nonrandom drift and diffusion coefficients to avoid overcomplication. For general diffusion processes with random drift and diffusion coefficients, we need predictable coefficients in order to use martingale theory. However, the quasi-likelihood function loses the Markov property under nonsynchronous observations, and the coefficients in the quasi-likelihood function involve randomness from future times. We then need to approximate the coefficients by predictable functions, and this operation is particularly complicated. Moreover, approximating the true likelihood function by the quasi-likelihood function is a much more difficult problem when we show local asymptotic normality and asymptotic efficiency of the estimators. Therefore, we leave the asymptotic theory under general random drift and diffusion coefficients as future work.

2.2 Asymptotic normality of the estimator

In this section, we state the assumptions of our main results, and state the asymptotic normality of the estimator.

For \(m\in \mathbb {N}\), an open subset \(U\subset \mathbb {R}^m\) is said to admit Sobolev’s inequality if, for any \(p>m\), there exists a positive constant C depending on U and p, such that \(\sup _{x\in U}|u(x)|\le C\sum _{k=0,1}(\int |\partial _x^k u(x)|^p{{\textrm{d}}x})^{1/p}\) for any \(u\in C^1(U)\). This is the case when U has a Lipschitz boundary. We assume that \(\Theta \), \(\Theta _1\), and \(\Theta _2\) admit Sobolev’s inequality.

Let \(\Sigma _t(\sigma )=b_tb_t^\top (\sigma )\), and let

$$\begin{aligned} \rho _t(\sigma )=\frac{[\Sigma _t]_{12}}{[\Sigma _t]_{11}^{1/2} [\Sigma _t]_{22}^{1/2}}(\sigma ), \quad B_{l,t}(\sigma )=\frac{[\Sigma _t(\sigma _0)]_{ll}}{[\Sigma _t(\sigma )]_{ll}}. \end{aligned}$$

Let \(\rho _{t,0}=\rho _t(\sigma _0)\).

Assumption (A1). There exists a positive constant \(c_1\), such that \(c_1\mathcal {E}_2 \le \Sigma _t(\sigma )\) for any \(t\in [0,\infty )\) and \(\sigma \in \Theta _1\). For \(k\in \{0,1,2,3,4\}\), \(\partial _\theta ^k\mu _t(\theta )\) and \(\partial _\sigma ^kb_t(\sigma )\) exist and are continuous with respect to \((t,\sigma ,\theta )\) on \([0,\infty )\times \textrm{clos}(\Theta _1)\times \textrm{clos}(\Theta _2)\). For any \(\epsilon >0\), there exist \(\delta >0\) and \(K>0\), such that

$$\begin{aligned} |\partial _\theta ^k\mu _t(\theta )|+|\partial _\sigma ^kb_t(\sigma )|&\le K, \\ |\partial _\theta ^k\mu _t(\theta )-\partial _\theta ^k\mu _s(\theta )|+|\partial _\sigma ^kb_t(\sigma )-\partial _\sigma ^kb_s(\sigma )|&\le \epsilon \end{aligned}$$

for any \(k\in \{0,1,2,3,4\}\), \(\sigma \in \Theta _1\), \(\theta \in \Theta _2\), and \(t,s\ge 0\) satisfying \(|t-s|<\delta \). Let \(r_n=\max _{i,l}|I_i^l|\).

Assumption (A2). \(r_n\overset{P}{\rightarrow }0\) as \(n\rightarrow \infty \).

Assumption (A3). For any \(l\in \{1,2\}\), \(i_1\in \mathbb {Z}_+\), \(i_2\in \{0,1\}\), \(i_3\in \{0,1,2,3,4\}\), \(k_1,k_2\in \{0,1,2\}\) satisfying \(k_1+k_2=2\), and any polynomial function \(F(x_1,\ldots , x_{14})\) of degree equal to or less than 6, there exist continuous functions \(\Phi _{i_1,i_2}^{1,F}(\sigma )\), \(\Phi _{l,i_3}^2(\sigma )\) and \(\Phi ^{3,k_1,k_2}_{i_1,i_3}(\theta )\) on \(\textrm{clos}(\Theta _1)\) and \(\textrm{clos}(\Theta _2)\), such that

$$\begin{aligned} \frac{1}{T}\int _0^TF((\partial _\sigma ^k B_{l,t}(\sigma ))_{0\le k\le 4, l=1,2},(\partial _\sigma ^{k'} \rho _t(\sigma ))_{k'=1}^4)\rho _t(\sigma )^{i_1}\rho _{t,0}^{i_2} {\textrm{d}}t&\rightarrow \Phi _{i_1,i_2}^{1,F}(\sigma ),\\ \frac{1}{T}\int _0^T \partial _\sigma ^{i_3}\log B_{l,t}(\sigma ){\textrm{d}}t&\rightarrow \Phi _{l,i_3}^2(\sigma ), \\ \frac{1}{T}\int _0^T \partial _\theta ^{i_3}(\phi _{1,t}^{k_1}\phi _{2,t}^{k_2})(\theta )\rho _{t,0}^{i_1}{\textrm{d}}t&\rightarrow \Phi ^{3,k_1,k_2}_{i_1,i_3}(\theta ) \end{aligned}$$

as \(T\rightarrow \infty \) for \(\sigma \in \textrm{clos}(\Theta _1)\), \(\theta \in \textrm{clos}(\Theta _2)\), where \(\phi _{l,t}(\theta )=[\Sigma _t(\sigma _0)]_{ll}^{-1/2}(\mu _t^l(\theta )-\mu _t^l(\theta _0))\).

Assumption (A1) and the Ascoli–Arzelà theorem yield that the convergences in (A3) can be replaced by uniform convergence with respect to \(\sigma \) and \(\theta \) (the left-hand sides of the above equations form relatively compact families, and any uniformly convergent subsequence converges to the right-hand sides owing to the pointwise convergence assumptions). Assumption (A3) is satisfied if \(\mu _t(\theta )\) and \(b_t(\sigma )\) do not depend on t, or are periodic in t with a common period (provided the period depends on neither \(\sigma \) nor \(\theta \)). Let \(\mathfrak {S}\) be the set of all partitions \((s_k)_{k=0}^\infty \) of \([0,\infty )\) satisfying \(\sup _{k\ge 1}|s_k-s_{k-1}|\le 1\) and \(\inf _{k\ge 1}|s_k-s_{k-1}|>0\). For \((s_k)_{k=0}^\infty \in \mathfrak {S}\), let \(M_{l,k}=\#\{i;\sup I_i^l\in (s_{k-1},s_k]\}\) and \(q_n=\max \{k;s_k\le nh_n\}\), and let \(\mathcal {E}_{(k)}^l\) be the \(M_l\times M_l\) matrix satisfying \([\mathcal {E}_{(k)}^l]_{ij}=1\) if \(i=j\) and \(\sup I_i^l\in (s_{k-1},s_k]\), and \([\mathcal {E}_{(k)}^l]_{ij}=0\) otherwise. Let

$$\begin{aligned} G=\bigg \{\frac{|I_i^1\cap I_j^2|}{|I_i^1|^{1/2}|I_j^2|^{1/2}}\bigg \}_{1\le i\le M_1, 1\le j\le M_2}. \end{aligned}$$

Assumption (A4). There exist positive constants \(a_0^1\) and \(a_0^2\), such that \(\{h_nM_{l,q_n+1}\}_{n=1}^\infty \) is P-tight and

$$\begin{aligned} \max _{1\le k\le q_n} |h_nM_{l,k}-a_0^l(s_k-s_{k-1})|\overset{P}{\rightarrow }0 \end{aligned}$$

for \(l\in \{1,2\}\) and any partition \((s_k)_{k=0}^\infty \in \mathfrak {S}\). Moreover, for any \(p\in \mathbb {N}\), there exists a nonnegative constant \(a_p^1\), such that

$$\begin{aligned} \max _{1\le k\le q_n}|h_n\textrm{tr}(\mathcal {E}_{(k)}^1(GG^\top )^p)-a_p^1(s_k-s_{k-1})|\overset{P}{\rightarrow }0 \end{aligned}$$

as \(n\rightarrow \infty \) for any partition \((s_k)_{k=0}^\infty \in \mathfrak {S}\). Let \(\mathfrak {I}_l=(|I_i^l|^{1/2})_{i=1}^{M_l}\).

Assumption (A5). For \(p\in \mathbb {Z}_+\), there exist nonnegative constants \(f_p^{1,1}\), \(f_p^{1,2}\), and \(f_p^{2,2}\), such that \(\{|\mathcal {E}_{(q_n+1)}^l\mathfrak {I}_l|\}_{n=1}^\infty \) is P-tight for \(l\in \{1,2\}\), and

$$\begin{aligned} \max _{1\le k\le q_n}|\mathfrak {I}_1^\top \mathcal {E}_{(k)}^1(GG^\top )^p\mathfrak {I}_1-f_p^{1,1}(s_k-s_{k-1})|&\overset{P}{\rightarrow }0,\\ \max _{1\le k\le q_n}|\mathfrak {I}_1^\top \mathcal {E}_{(k)}^1(GG^\top )^pG\mathfrak {I}_2-f_p^{1,2}(s_k-s_{k-1})|&\overset{P}{\rightarrow }0,\\ \max _{1\le k\le q_n}|\mathfrak {I}_2^\top \mathcal {E}_{(k)}^2(G^\top G)^p\mathfrak {I}_2-f_p^{2,2}(s_k-s_{k-1})|&\overset{P}{\rightarrow }0 \end{aligned}$$

as \(n\rightarrow \infty \) for any partition \((s_k)_{k=0}^\infty \in \mathfrak {S}\).
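For a concrete sampling scheme, the localized functionals appearing in (A4) and (A5) can be evaluated numerically. The following sketch (illustrative helper names, reusing overlap_matrix from the earlier sketch) computes \(h_n\textrm{tr}(\mathcal {E}_{(k)}^1(GG^\top )^p)\) and \(\mathfrak {I}_1^\top \mathcal {E}_{(k)}^1(GG^\top )^p\mathfrak {I}_1\) over a partition \((s_k)\).

```python
# Illustrative computation of the functionals localized to (s_{k-1}, s_k].
import numpy as np

def localized_functionals(S1, S2, s_grid, h_n, p=1):
    G = overlap_matrix(S1, S2)
    A = np.linalg.matrix_power(G @ G.T, p)
    frakI1 = np.sqrt(np.diff(S1))              # frak I_1 = (|I_i^1|^{1/2})_i
    sup_I1 = S1[1:]                            # sup I_i^1
    out = []
    for k in range(1, len(s_grid)):
        mask = (sup_I1 > s_grid[k - 1]) & (sup_I1 <= s_grid[k])
        tr_k = np.diag(A)[mask].sum()          # tr(E_(k)^1 (GG^T)^p)
        quad_k = frakI1[mask] @ A[mask] @ frakI1
        out.append((h_n * tr_k, quad_k))       # compare with a_p^1, f_p^{1,1}
    return out
```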

Assumption (A4) corresponds to [A3\('\)] in Ogihara and Yoshida (2014). The functionals in (A4) and (A5) appear in the limits of \(H_n^1\) and \(H_n^2\), and hence, we cannot specify these limits unless we assume the existence of the limits of the functionals. It is difficult to check (A4) and (A5) directly for concrete statistical experiments with general sampling schemes; we therefore study tractable sufficient conditions in Sect. 2.4.

Assumption (A6). The constant \(a_1^1\) in (A4) is positive, and there exist positive constants \(c_2\) and \(c_3\), such that

$$\begin{aligned} \limsup _{T\rightarrow \infty }\bigg (\frac{1}{T}\int _0^T\Vert \Sigma _t(\sigma ) -\Sigma _t(\sigma _0)\Vert ^2{\textrm{d}}t\bigg )&\ge c_2|\sigma -\sigma _0|^2,\\ \limsup _{T\rightarrow \infty }\bigg (\frac{1}{T}\int _0^T|\mu _t(\theta )-\mu _t(\theta _0)|^2{\textrm{d}}t\bigg )&\ge c_3|\theta -\theta _0|^2 \end{aligned}$$

for any \(\sigma \in \textrm{clos}(\Theta _1)\) and \(\theta \in \textrm{clos}(\Theta _2)\).

Assumption (A6) is necessary to identify the parameters \(\sigma \) and \(\theta \) from the data. For \(p<q\), we have

$$\begin{aligned} \textrm{tr}(\mathcal {E}_{(k)}^1(GG^\top )^q)\le \textrm{tr}(\mathcal {E}_{(k)}^1(GG^\top )^p)\Vert (GG^\top )^{q-p}\Vert \le \textrm{tr}(\mathcal {E}_{(k)}^1(GG^\top )^p) \end{aligned}$$
(2.3)

by Lemma 3.3 below and Lemma A.1 in Ogihara (2018). Hence, \(a_p^1\) is nonincreasing in p, which implies that \(a_p^1=0\) for every \(p\in \mathbb {N}\) if \(a_1^1=0\). In this case, the off-diagonal components of the covariance matrix \(S_n\) are negligible in the limit, and we cannot consistently estimate the parameters appearing in \(\rho _t(\sigma )\). This is why we need the assumption \(a_1^1>0\) (see Proposition 3.9 and the discussion following it for the consistency).

Let \(\mathcal {A}(\rho )=\sum _{p=1}^\infty a_p^1\rho ^{2p}\) for \(\rho \in (-1,1)\). Then, (2.3) implies that \(\mathcal {A}(\rho )\) is finite; indeed, \(a_p^1\le a_1^1\) for all p, so the series is dominated by a convergent geometric series. Moreover, (A5) yields

$$\begin{aligned} f_p^{1,1}&= (nh_n)^{-1}\sum _{k=1}^{q_n}\mathfrak {I}_1^\top \mathcal {E}_{(k)}^1(GG^\top )^p\mathfrak {I}_1+o_p(1) \\&= (nh_n)^{-1}\mathfrak {I}_1^\top (GG^\top )^p\mathfrak {I}_1+o_p(1) \\&\le \Vert (GG^\top )^p\Vert (nh_n)^{-1}|\mathfrak {I}_1|^2+o_p(1) \\&\le 1+o_p(1), \end{aligned}$$

which implies \(f_p^{1,1}\le 1\). Similarly, we have \(f_p^{1,2}\le 1\) and \(f_p^{2,2}\le 1\). Let \(\partial _\sigma ^kB_{l,t,0}=\partial _\sigma ^kB_{l,t}(\sigma _0)\), and let

$$\begin{aligned} \gamma _{1,t}&= \mathcal {A}(\rho _{t,0})\bigg (\frac{\partial _\sigma \rho _{t,0}}{\rho _{t,0}}-\partial _\sigma B_{1,t,0}-\partial _\sigma B_{2,t,0}\bigg )^2-\partial _\rho \mathcal {A}(\rho _{t,0})\frac{(\partial _\sigma \rho _{t,0})^2}{\rho _{t,0}}\\&\quad -2\sum _{l=1}^2(a_0^l+\mathcal {A}(\rho _{t,0}))(\partial _\sigma B_{l,t,0})^2, \end{aligned}$$

and let \(\Gamma _1=\lim _{T\rightarrow \infty }T^{-1}\int _0^T\gamma _{1,t}{\textrm{d}}t\), which exists under (A1), (A3), and (A4). Let

$$\begin{aligned} \Gamma _2&= \lim _{T\rightarrow \infty }\frac{1}{T}\int _0^T\sum _{p=0}^\infty \rho _{t,0}^{2p}\bigg \{\sum _{l=1}^2f_p^{l,l}(\partial _\theta \phi _{l,t})^2(\theta _0) -2\rho _{t,0}f_p^{1,2}\partial _\theta \phi _{1,t}\partial _\theta \phi _{2,t}(\theta _0)\bigg \}{\textrm{d}}t, \end{aligned}$$

which exists under (A1), (A3), and (A5). Let \(T_n=nh_n\) and

$$\begin{aligned} \Gamma =\left( \begin{array}{cc} \Gamma _1 &{} 0 \\ 0 &{} \Gamma _2 \end{array} \right) . \end{aligned}$$

Theorem 2.3

Assume (A1)–(A6). Then,  \(\Gamma \) is positive definite,  and

$$\begin{aligned} (\sqrt{n}(\hat{\sigma }_n-\sigma _0),\sqrt{T_n}(\hat{\theta }_n-\theta _0))\overset{d}{\rightarrow }N(0,\Gamma ^{-1}) \end{aligned}$$

as \(n\rightarrow \infty \).
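Theorem 2.3 can be explored by a small Monte Carlo experiment. The following sketch is not from the paper; it reuses the hypothetical helpers of the previous sketches, with a coarse simulation grid and few replications to keep the run time modest, and standardizes the errors \(\sqrt{n}(\hat{\sigma }_n-\sigma _0)\), whose empirical distribution should be approximately \(N(0,\Gamma _1^{-1})\).

```python
# Monte Carlo sketch for Theorem 2.3 (coarse grid, few replications).
import numpy as np
from scipy.optimize import minimize_scalar

errs = []
for _ in range(100):
    t, X = simulate_paths(T, 1e-3, mu, b, x0=np.zeros(2))
    T1 = poisson_times(T, lam=30.0)
    T2 = poisson_times(T, lam=20.0)
    dXr = np.concatenate([np.diff(X[np.searchsorted(t, T1), 0]),
                          np.diff(X[np.searchsorted(t, T2), 1])])

    def H1r(s):
        Sm = covariance_S_n(T1, T2, np.array([[1.0, 0.0], [s, 1.0]]))
        _, ld = np.linalg.slogdet(Sm)
        return -0.5 * dXr @ np.linalg.solve(Sm, dXr) - 0.5 * ld

    s_hat = minimize_scalar(lambda s: -H1r(s), bounds=(0.0, 2.0),
                            method="bounded").x
    n_eff = len(T1) + len(T2) - 2
    errs.append(np.sqrt(n_eff) * (s_hat - 0.5))   # sigma_0 = 0.5 here
print(np.mean(errs), np.var(errs))  # variance to be compared with Gamma_1^{-1}
```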

2.3 Local asymptotic normality

Let \(\Theta \) be an open subset of \(\mathbb {R}^d\), let \(\alpha _0\in \Theta \), and let \(\{P_{\alpha ,n}\}_{\alpha \in \Theta }\) be a family of probability measures defined on a measurable space \(({\mathcal {X}}_n,\mathcal {A}_n)\) for \(n\in \mathbb {N}\). As usual, we refer to \({\textrm{d}}P_{\alpha _2,n}/{\textrm{d}}P_{\alpha _1,n}\), the derivative of the absolutely continuous component of the measure \(P_{\alpha _2,n}\) with respect to the measure \(P_{\alpha _1,n}\) at the observation x, as the likelihood ratio. The following definition of local asymptotic normality is Definition 2.1 in Chapter II of Ibragimov and Has’minskiĭ (1981).

Definition 2.4

A family \(P_{\alpha ,n}\) is called locally asymptotically normal (LAN) at a point \(\alpha _0\in \Theta \) as \(n\rightarrow \infty \) if, for some nondegenerate \(d \times d\) matrices \(\epsilon _n\) and any \(u\in \mathbb {R}^d\), we have

$$\begin{aligned} \log \frac{{\textrm{d}}P_{\alpha _0+\epsilon _n u,n}}{{\textrm{d}}P_{\alpha _0,n}}-(u^\top \Delta _n-|u|^2/2)\rightarrow 0 \end{aligned}$$

in \(P_{\alpha _0,n}\)-probability as \(n\rightarrow \infty \), where

$$\begin{aligned}\mathcal {L}(\Delta _n|P_{\alpha _0,n})\rightarrow N(0,\mathcal {E}_d) \end{aligned}$$

as \(n\rightarrow \infty \), and \(\mathcal {L}(\cdot |P_{\alpha ,n})\) denotes the distribution with respect to \(P_{\alpha ,n}\).

Let \(\Theta =\Theta _1\times \Theta _2\). For \(\alpha \in \Theta \), let \(P_{\alpha ,n}\) be the probability measure generated by the observations \(\{S_i^{n,l}\}_{i,l}\) and \(\{X_{S_i^{n,l}}^{(\alpha ),l}\}_{i,l}\).

Theorem 2.5

Assume (A1)–(A6). Then,  \(\{P_{\alpha ,n}\}_{\alpha ,n}\) satisfies the LAN property at \(\alpha =\alpha _0\) with

$$\begin{aligned} \epsilon _n=\left( \begin{array}{cc} n^{-1/2}\Gamma _1^{-1/2} &{} 0 \\ 0 &{} T_n^{-1/2}\Gamma _2^{-1/2} \end{array} \right) . \end{aligned}$$

The proof is left to Sect. 3.6. Theorem 11.2 in Chapter II of Ibragimov and Has’minskiĭ (1981) gives lower bounds for the estimation errors of regular estimators under the LAN property. In particular, the optimal asymptotic variance of \(\epsilon _n^{-1}(U_n-\alpha _0)\) for a regular estimator \(U_n\) is \(\mathcal {E}_d\). We will show that \((\hat{\sigma }_n, \hat{\theta }_n)\) is regular in Remark 3.18. Therefore, Theorem 2.5 ensures that our estimator \((\hat{\sigma }_n,\hat{\theta }_n)\) is asymptotically efficient in this sense under the assumptions of the theorem.

2.4 Sufficient conditions for the assumptions

It is not easy to directly check Assumptions (A4) and (A5) for general random sampling schemes (even for a sampling scheme generated by simple Poisson processes given in Example 2.6). In this section, we study tractable sufficient conditions for these assumptions. The proofs of the results in this section are left to Sect. 3.6.

Let \(q>0\) and \(\mathcal {N}^{n,l}_t=\sum _{i=1}^{M_l}1_{\{S_i^{n,l}\le t\}}\). We consider the following conditions for the point process \(\mathcal {N}_t^{n,l}\).

Assumption (B1-q).

$$\begin{aligned} \sup _{n\ge 1}\max _{l\in \{1,2\}}\sup _{0\le t\le (n-1)h_n}E[(\mathcal {N}^{n,l}_{t+h_n}-\mathcal {N}^{n,l}_t)^q]<\infty . \end{aligned}$$

Assumption (B2-q).

$$\begin{aligned} \limsup _{u\rightarrow \infty } \sup _{n\ge 1}\max _{l\in \{1,2\}}\sup _{0\le t\le nh_n-uh_n} u^qP(\mathcal {N}^{n,l}_{t+uh_n}-\mathcal {N}^{n,l}_t=0)<\infty . \end{aligned}$$

Example 2.6

Let \((\bar{\mathcal {N}}_t^1, \bar{\mathcal {N}}_t^2)\) be two independent homogeneous Poisson processes with positive intensities \(\lambda _1\) and \(\lambda _2\), respectively, and \(\mathcal {N}^{n,l}_t=\bar{\mathcal {N}}_{h_n^{-1}t}^l\), that is, \(S_i^{n,l}=\inf \{t\ge 0; \bar{\mathcal {N}}_{h_n^{-1}t}^l\ge i\}\). Even in this simple case, it is not trivial to directly check (A4) and (A5). On the other hand, (B1-q) obviously holds for any \(q > 0\). Moreover, (B2-q) holds for any \(q > 0\), since

$$\begin{aligned} \limsup _{u\rightarrow \infty } \sup _{n\ge 1}\max _{l\in \{1,2\}}\sup _{0\le t\le nh_n-uh_n} u^qP(\mathcal {N}^{n,l}_{t+uh_n}-\mathcal {N}^{n,l}_t=0)=\lim _{u\rightarrow \infty }u^q{\textrm{e}}^{-(\lambda _1\wedge \lambda _2)u}=0. \end{aligned}$$

Then, by Corollary 2.12, we can check Assumptions (A2), (A4), and (A5) for this sampling scheme.
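A numerical sketch of Corollary 2.12 for this Poisson scheme (under the same illustrative assumptions as in the earlier sketches): as n grows, the normalized traces \(n^{-1}\textrm{tr}((GG^\top )^p)\) should stabilize, which is the behavior postulated in (A4).

```python
# Checking stabilization of n^{-1} tr((G G^T)^p) for the Poisson scheme.
import numpy as np

for n in [200, 800, 3200]:
    h_n = n ** (-2.0 / 3.0)      # satisfies (2.1) for epsilon_0 < 1/3
    T_n = n * h_n
    S1_n = poisson_times(T_n, lam=1.0 / h_n)   # N^{n,1}_t = bar N^1_{t/h_n}
    S2_n = poisson_times(T_n, lam=0.7 / h_n)
    Gn = overlap_matrix(S1_n, S2_n)
    GGt_n = Gn @ Gn.T
    print(n, np.trace(GGt_n) / n, np.trace(GGt_n @ GGt_n) / n)
```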

To give sufficient conditions for (A4) and (A5), we consider mixing properties of \(\mathcal {N}^{n,l}\). That is, we assume conditions for the following mixing coefficient \(\alpha _k^n\). Let

$$\begin{aligned} \mathcal {G}_{i,j}^n=\sigma (\mathcal {N}^{n,l}_t-\mathcal {N}^{n,l}_s; ih_n\le s<t\le jh_n, l=1,2) \quad (0\le i,j\le n), \end{aligned}$$

and let

$$\begin{aligned} \alpha _k^n=0\vee \sup _{1\le i,j\le n-1, j-i\ge k}\sup _{A\in \mathcal {G}_{0,i}^n}\sup _{B\in \mathcal {G}_{j,n}^n}|P(A\cap B)-P(A)P(B)|. \end{aligned}$$

Proposition 2.7

Assume that (B1-q) and (B2-q) hold and that

$$\begin{aligned} \sup _{n\in \mathbb {N}} \sum _{k=0}^\infty (k+1)^q\alpha _k^n<\infty \end{aligned}$$
(2.4)

for any \(q>0\). Moreover, assume that there exist positive constants \(a_0^1\) and \(a_0^2\), and a nonnegative constant \(a_p^1\) for each \(p\in \mathbb {N}\), such that \(\{E[h_nM_{l,q_n+1}]\}_{n=1}^\infty \) is bounded and

$$\begin{aligned} \max _{1\le k\le q_n}|h_nE[M_{l,k}]-a_0^l(s_k-s_{k-1})|&\rightarrow 0,\nonumber \\ \max _{1\le k\le q_n}|h_nE[\textrm{tr}(\mathcal {E}_{(k)}^1(GG^\top )^p)]-a_p^1(s_k-s_{k-1})|&\rightarrow 0 \end{aligned}$$
(2.5)

as \(n\rightarrow \infty \) for \(p\in \mathbb {Z}_+\), \(l\in \{1,2\}\), and any partition \((s_k)_{k=0}^\infty \in \mathfrak {S}\). Then, (A4) holds.

Proposition 2.8

Assume that (B1-q) and (B2-q) hold and that (2.4) is satisfied for any \(q>0\). Moreover, assume that there exist nonnegative constants \(f_p^{1,1}\), \(f_p^{1,2}\), and \(f_p^{2,2}\) for \(p\in \mathbb {Z}_+\), such that \(\{E[|\mathcal {E}_{(q_n+1)}^l\mathfrak {I}_l|]\}_{n=1}^\infty \) is bounded and

$$\begin{aligned} \max _{1\le k\le q_n}|E[\mathfrak {I}_1^\top \mathcal {E}_{(k)}^1(GG^\top )^p\mathfrak {I}_1]-f_p^{1,1}(s_k-s_{k-1})|&\rightarrow 0,\nonumber \\ \max _{1\le k\le q_n}|E[\mathfrak {I}_1^\top \mathcal {E}_{(k)}^1(GG^\top )^pG\mathfrak {I}_2]-f_p^{1,2}(s_k-s_{k-1})|&\rightarrow 0, \nonumber \\ \max _{1\le k\le q_n}|E[\mathfrak {I}_2^\top \mathcal {E}_{(k)}^2(G^\top G)^p\mathfrak {I}_2]-f_p^{2,2}(s_k-s_{k-1})|&\rightarrow 0 \end{aligned}$$
(2.6)

as \(n\rightarrow \infty \) for \(l\in \{1,2\}\), \(p\in \mathbb {Z}_+\), and any partition \((s_k)_{k=0}^\infty \in \mathfrak {S}\). Then, (A5) holds.

Proposition 2.9

Assume that there exists \(q>0\), such that (A4) and (B2-q) hold, \(\{\mathcal {N}_{t+h_n}^{n,l}-\mathcal {N}_t^{n,l}\}_{0\le t\le T_n-h_n, l\in \{1,2\}, n\in \mathbb {N}}\) is P-tight, and \(\sup _{n\in \mathbb {N}}\sum _{k=1}^\infty k\alpha _k^n<\infty \). Then, \(a_1^1>0\).

In the following, let \((\bar{\mathcal {N}}_t^l)_{t\ge 0}\) be an exponentially \(\alpha \)-mixing point process for \(l\in \{1,2\}\). Assume that the distribution of \((\bar{\mathcal {N}}_{t+t_k}^l-\bar{\mathcal {N}}_{t+t_{k-1}}^l)_{1\le k\le K, l=1,2}\) does not depend on \(t\ge 0\) for any \(K\in \mathbb {N}\) and \(0\le t_0<t_1<\cdots < t_K\); that is, the increments are jointly stationary.

Lemma 2.10

Let \(\mathcal {N}^{n,l}_t=\bar{\mathcal {N}}_{h_n^{-1}t}^l\) for \(0\le t\le nh_n\) and \(l\in \{1,2\}\). Then, (2.4) is satisfied for any \(q>2\), and there exist constants \(a_0^1\), \(a_0^2\), and \(a_p^1=a_p^2\) for \(p\in \mathbb {N}\), such that (2.5) holds and \(\{E[h_nM_{l,q_n+1}]\}_{n=1}^\infty \) is bounded for any \((s_k)_{k=0}^\infty \in \mathfrak {S}\). Moreover, there exist nonnegative constants \(f_p^{1,1}\), \(f_p^{1,2}\), and \(f_p^{2,2}\) for \(p\in \mathbb {Z}_+\), such that (2.6) holds and \(\{E[|\mathcal {E}_{(q_n+1)}^l\mathfrak {I}_l|]\}_{n=1}^\infty \) is bounded for \(l\in \{1,2\}\) and any \((s_k)_{k=0}^\infty \in \mathfrak {S}\).

Proposition 2.11

(Proposition 8 in Ogihara & Yoshida, 2014) Let \(q\in \mathbb {N}\). Assume (B2-\((q+1)\)). Then, \(\sup _nE[h_n^{-q+1}r_n^q]<\infty \). In particular, (A2) holds under (B2-2).

By the above results, we obtain simple tractable sufficient conditions for the assumptions of the sampling scheme when the observation times are generated by the exponential \(\alpha \)-mixing point process \(\bar{\mathcal {N}}_t^l\) defined above.

Corollary 2.12

Let \(\mathcal {N}_t^{n,l}=\bar{\mathcal {N}}_{h_n^{-1}t}^l\) for \(0\le t\le T_n\) and \(l\in \{1,2\}\). Assume that (B1-q) and (B2-q) hold for any \(q>0\). Then,  (A2), (A4), and (A5) hold,  and \(a_1^1>0\).

3 Proofs

3.1 Preliminary results

For a real number a, [a] denotes the largest integer not greater than a. Let \(\Pi =\Pi _n=\{S_i^{n,l}\}_{1\le i\le M_l, l\in \{1,2\}}\). We denote \(|x|^2=\sum _{i_1,\ldots , i_k}|x_{i_1,\ldots , i_k}|^2\) for \(x=\{x_{i_1,\ldots , i_k}\}_{i_1,\ldots , i_k}\) with \(k\in \mathbb {N}\) and \(x_{i_1,\ldots ,i_k}\in \mathbb {R}\). For a matrix \(A=(A_{ij})_{ij}\), \(\textrm{Abs}(A)\) denotes the matrix \((|A_{ij}|)_{ij}\). C denotes a generic positive constant whose value may vary from line to line. We often omit the parameters \(\sigma \) and \(\theta \) in general functions \(f(\sigma )\) and \(g(\theta )\).

For a sequence \(p_n\) of positive numbers, let us denote by \(\{\bar{R}_n(p_n)\}_{n\in {\mathbb {N}}}\) a sequence of random variables (which may also depend on \(1\le i\le M\) and \(\alpha \in \Theta \)) satisfying that \(\{\sup _{\alpha ,i}E_\Pi [|p_n^{-1}\bar{R}_n(p_n)|^q]\}_{n\in \mathbb {N}}\) is P-tight for any \(q\ge 1\), where \(E_\Pi [\textbf{X}]=E[\textbf{X}|\sigma (\Pi _n)]\) for a random variable \(\textbf{X}\).

For a matrix A and vectors v and w of suitable sizes, we repeatedly use the following inequality:

$$\begin{aligned} |w^\top Av|\le |w||Av|\le \Vert A\Vert |v||w|. \end{aligned}$$

Lemma 3.1

(A special case of Lemma 3.1 in Ogihara and Uehara, 2022) Let \((Z_n)_{n\in \mathbb {N}}\) be nonnegative-valued random variables. Then

  1. \(E_\Pi [Z_n]\overset{P}{\rightarrow }0\) as \(n\rightarrow \infty \) implies that \(Z_n\overset{P}{\rightarrow }0\) as \(n\rightarrow \infty \).

  2. P-tightness of \((E_\Pi [Z_n])_{n\in \mathbb {N}}\) implies P-tightness of \((Z_n)_{n\in \mathbb {N}}\).

Let \(\bar{V}=V(\theta _0)\), and let

$$\begin{aligned} \rho _{ij}(\sigma )=\left\{ \begin{array}{ll} \frac{\tilde{\Sigma }_{i,j}^{1,2}}{\sqrt{\tilde{\Sigma }_i^1}\sqrt{\tilde{\Sigma }_j^2}[G]_{ij}}, &{}\quad \textrm{if}~ I_i^1\cap I_j^2\ne \emptyset , \\ 0, &{}\quad \textrm{otherwise}. \end{array} \right. \end{aligned}$$

Let \(\bar{\rho }_n=\sup _\sigma (\max _{i,j}|\rho _{ij}(\sigma )| \vee \sup _t|\rho _t(\sigma )|)\), and let

$$\begin{aligned} {\dot{S}}_n=\left( \begin{array}{cc} \mathcal {E}_{M_1} &{} -\bar{\rho }_nG \\ -\bar{\rho }_n G^\top &{} \mathcal {E}_{M_2} \end{array} \right) . \end{aligned}$$
(3.1)

Let \(\Delta _{i,t}^l U=U^l_{t\wedge S^{n,l}_i}-U^l_{t\wedge S^{n,l}_{i-1}}\), and let \(\Delta _{i,t} U=\Delta _{\varphi (i),t}^{\psi (i)} U\) for \(t\ge 0\) and a two-dimensional stochastic process \((U_t)_{t\ge 0}=((U_t^1,U_t^2))_{t\ge 0}\).

Under (A4), we have

$$\begin{aligned} h_nM_l=h_n\sum _{k=1}^{{q_n+1}}M_{l,k}=a_0^lnh_n+o_p(nh_n). \end{aligned}$$

Then, we obtain

$$\begin{aligned} M_l=a_0^ln+o_p(n). \end{aligned}$$
(3.2)

Lemma 3.2

Assume (A1). Then,  for any \(p\ge 1,\) there exist positive constants \(C_p\) (depending on p) and C,  such that

$$\begin{aligned} \sup _{\theta }|\Delta _i^lV(\theta )|\le C|I_i^l|, \quad E_\Pi [|\Delta _i^l X|^p]^{1/p}\le C_p(|I_i^l|+\sqrt{|I_i^l|}) \end{aligned}$$

for \(l\in \{1,2\}\) and \(1\le i\le M_l\).

Proof

Since \(\mu _t^l(\theta )\) and \([b_tb_t^\top (\sigma _0)]_{ll}\) are bounded by (A1), the Burkholder–Davis–Gundy inequality yields

$$\begin{aligned} \sup _{\theta }|\Delta _i^lV(\theta )|&=\sup _{\theta }\bigg |\int _{I_i^l}\mu _t^l(\theta ){\textrm{d}}t\bigg |\le C|I_i^l|,\\ E_\Pi [|\Delta _i^l X|^p]^{1/p}&=E_\Pi \bigg [\bigg |\int _{I_i^l}\mu _t^l(\theta _0){\textrm{d}}t+\int _{I_i^l}[b_t(\sigma _0){\textrm{d}}W_t]_l\bigg |^p\bigg ]^{1/p}\\&\le C_p|I_i^l|+C_pE_\Pi \bigg [\bigg |\int _{I_i^l}[b_tb_t^\top (\sigma _0)]_{ll}{\textrm{d}}t\bigg |^{p/2}\bigg ]^{1/p} \\&\le C_p(|I_i^l|+\sqrt{|I_i^l|}). \end{aligned}$$

\(\square \)

Lemma 3.3

(Lemma 2 in Ogihara & Yoshida, 2014) \(\Vert G\Vert \vee \Vert G^\top \Vert \le 1\).

Lemma 3.4

\(\Vert \tilde{G}\Vert \vee \Vert \tilde{G}^\top \Vert \le \bar{\rho }_n\).

Proof

Since all the elements of G are nonnegative, we have

$$\begin{aligned} \Vert \tilde{G}\Vert ^2&= \sup _{|x|=1}|\tilde{G}x|^2 =\sup _{|x|=1}\sum _i\bigg (\sum _j \rho _{ij}G_{ij}x_j\bigg )^2 \\&\le \bar{\rho }_n^2 \sup _{|x|=1}\sum _i\bigg (\sum _j G_{ij}|x_j|\bigg )^2 \le \bar{\rho }_n^2\Vert G\Vert ^2\le \bar{\rho }_n^2. \end{aligned}$$

Since \(\Vert \tilde{G}^\top \Vert =\Vert \tilde{G}\Vert \), we obtain the conclusion. \(\square \)

Let \(\mathcal {D}=\textrm{diag}(\{|I_i|\}_{i=1}^M)\). It is difficult to deduce the orders of upper bounds of the operator norms \(\Vert S_n(\sigma )\Vert \) and \(\Vert S_n^{-1}\Vert \), because they depend on the maximum and minimum lengths of observation intervals. However, we can deduce the orders of upper bounds for \(\tilde{\mathcal {D}}^{-1/2}S_n(\sigma )\tilde{\mathcal {D}}^{-1/2}\) and its inverse. Indeed, we obtain the following estimates, which are repeatedly used in the following sections (we use \(\mathcal {D}\) instead of \(\tilde{\mathcal {D}}\) to avoid parameter dependence).

Lemma 3.5

Assume (A1). Then,  there exists a positive constant C,  such that \(\Vert \mathcal {D}^{1/2} \partial _\sigma ^kS_n^{-1}(\sigma )\mathcal {D}^{1/2} \Vert \le C(1-\bar{\rho }_n)^{-k-1}\) and \(|[S_n^{-1}(\sigma )]_{ij}|\le C[\mathcal {D}^{-1/2}{\dot{S}}_n^{-1}\mathcal {D}^{-1/2}]_{ij}\) if \(\bar{\rho }_n<1,\) and \(\Vert \mathcal {D}^{-1/2} \partial _\sigma ^kS_n(\sigma )\mathcal {D}^{-1/2}\Vert \le C\) for any \(\sigma \in \Theta _1,\) \(1\le i,j\le M,\) and \(k\in \{0,1,2,3,4\}\).

Proof

By (A1) and Lemma 3.3, we have

$$\begin{aligned} \Vert \mathcal {D}^{-1/2}\partial _\sigma ^k S_n(\sigma )\mathcal {D}^{-1/2}\Vert \le C\sum _{j=0}^k\bigg \Vert \partial _\sigma ^j\bigg \{\mathcal {E}_{M}+ \left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) \bigg \}\bigg \Vert \le C. \end{aligned}$$

Moreover, by (A1) and Lemma 3.4, we have

$$\begin{aligned} \Vert \mathcal {D}^{1/2}S_n^{-1}\mathcal {D}^{1/2}\Vert \le C\bigg \Vert \bigg (\mathcal {E}_M+\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) \bigg )^{-1}\bigg \Vert \le C(1-\bar{\rho }_n)^{-1} \end{aligned}$$

if \(\bar{\rho }_n<1\).

Using the equation \(\partial _\sigma S_n^{-1}=-S_n^{-1}\partial _\sigma S_n S_n^{-1}\), we obtain

$$\begin{aligned} \Vert \mathcal {D}^{1/2}\partial _\sigma S_n^{-1}\mathcal {D}^{1/2}\Vert&= \Vert \mathcal {D}^{1/2}S_n^{-1}\partial _\sigma S_nS_n^{-1}\mathcal {D}^{1/2}\Vert \\&\le \Vert \mathcal {D}^{1/2}S_n^{-1}\mathcal {D}^{1/2}\Vert ^2\Vert \mathcal {D}^{-1/2}\partial _\sigma S_n \mathcal {D}^{-1/2}\Vert \le C(1-\bar{\rho }_n)^{-2} \end{aligned}$$

if \(\bar{\rho }_n<1\). Similarly, we obtain

$$\begin{aligned} \Vert \mathcal {D}^{1/2}\partial _\sigma ^k S_n^{-1}\mathcal {D}^{1/2}\Vert \le C(1-\bar{\rho }_n)^{-k-1} \end{aligned}$$

if \(\bar{\rho }_n<1\) for \(k\in \{0,1,2,3,4\}\).

If \(\bar{\rho }_n<1\), since Lemma 3.4 yields

$$\begin{aligned} S_n^{-1}=\tilde{\mathcal {D}}^{-1/2}\bigg (\left( \begin{array}{cc} \mathcal {E}_{M_1} &{} 0 \\ 0 &{} \mathcal {E}_{M_2} \end{array} \right) +\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) \bigg )^{-1}\tilde{\mathcal {D}}^{-1/2} =\tilde{\mathcal {D}}^{-1/2}\sum _{p=0}^\infty (-1)^p\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^p\tilde{\mathcal {D}}^{-1/2}, \end{aligned}$$

we obtain

$$\begin{aligned} |[S_n^{-1}]_{ij}|\le C\bigg [\mathcal {D}^{-1/2} \sum _{p=0}^\infty \bar{\rho }_n^p\left( \begin{array}{cc} 0 &{} G \\ G^\top &{} 0 \end{array} \right) ^p\mathcal {D}^{-1/2}\bigg ]_{ij} =C[\mathcal {D}^{-1/2} {\dot{S}}_n^{-1}\mathcal {D}^{-1/2}]_{ij}. \end{aligned}$$

\(\square \)

Under (A1), we have \(\Sigma _t(\sigma )\ge c_1\mathcal {E}_2\), which implies that \(\sup _{t,\sigma }|\rho _t(\sigma )|<1\). Then, by (A2) and uniform continuity of \(b_t\), for some fixed \(\delta >0\) and any \(\epsilon >0\), there exists \(N\in \mathbb {N}\), such that \(P(1-\bar{\rho }_n<\delta )<\epsilon \) for \(n\ge N\). Therefore, we have

$$\begin{aligned} P(\bar{\rho }_n<1-\delta )\rightarrow 1 \end{aligned}$$
(3.3)

as \(n\rightarrow \infty \), and we have

$$\begin{aligned} P((1-\bar{\rho }_n)^{-q}>\delta ^{-q})<\epsilon \end{aligned}$$

for any \(q>0\) and \(n\ge N\), which implies that

$$\begin{aligned} (1-\bar{\rho }_n)^{-q}=O_p(1). \end{aligned}$$
(3.4)

Moreover, Lemma 3.4 yields

$$\begin{aligned} S_n^{-1}(\sigma )&= \tilde{\mathcal {D}}^{-1/2}\sum _{p=0}^\infty (-1)^p\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^p\tilde{\mathcal {D}}^{-1/2}\nonumber \\&= \tilde{\mathcal {D}}^{-1/2}\sum _{p=0}^\infty \left( \begin{array}{cc} (\tilde{G}\tilde{G}^\top )^p &{} -(\tilde{G}\tilde{G}^\top )^p\tilde{G}\\ -(\tilde{G}^\top \tilde{G})^p\tilde{G}^\top &{} (\tilde{G}^\top \tilde{G})^p \end{array} \right) \tilde{\mathcal {D}}^{-1/2} \end{aligned}$$
(3.5)

if \(\bar{\rho }_n<1\).
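The expansion (3.5) can also be checked numerically. The following sketch (illustrative, reusing the hypothetical helpers of the sketches in Sect. 2.1) compares partial sums of the Neumann series with \(S_n^{-1}(\sigma )\) when \(\bar{\rho }_n<1\).

```python
# Numeric check of (3.5): partial Neumann sums vs. the exact inverse.
import numpy as np

S = covariance_S_n(S1, S2, np.array([[1.0, 0.0], [0.5, 1.0]]))
Dh = np.sqrt(np.diag(S))            # tilde D^{1/2} (diagonal entries of S_n)
R = S / np.outer(Dh, Dh)            # E_M plus the off-diagonal tilde-G blocks
K = np.eye(len(Dh)) - R             # equals minus the off-diagonal block
partial = np.zeros_like(S)
term = np.eye(len(Dh))
for p in range(60):                 # sum_p (-1)^p (block)^p = sum_p K^p
    partial += term
    term = term @ K
approx_inv = partial / np.outer(Dh, Dh)   # tilde D^{-1/2} (...) tilde D^{-1/2}
print(np.max(np.abs(approx_inv - np.linalg.inv(S))))  # small when rho_bar < 1
```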

3.2 Consistency of \(\hat{\sigma }_n\)

In this section, we show consistency: \(\hat{\sigma }_n\overset{P}{\rightarrow }\sigma _0\) as \(n\rightarrow \infty \). For this purpose, we specify the limit of \(H_n^1(\sigma )-H_n^1(\sigma _0)\).

Lemma 3.6

Assume (A1) and (A2). Then

$$\begin{aligned} \frac{1}{n}\sup _{\sigma \in \Theta _1}\bigg |\partial _\sigma ^k(H_n^1(\sigma )-H_n^1(\sigma _0)) +\frac{1}{2}\partial _\sigma ^k\textrm{tr}(S_n^{-1}(\sigma )(S_n(\sigma _0)-S_n(\sigma )))+\frac{1}{2}\partial _\sigma ^k\log \frac{\det S_n(\sigma )}{\det S_n(\sigma _0)}\bigg | \overset{P}{\rightarrow }0 \qquad \end{aligned}$$
(3.6)

as \(n\rightarrow \infty \) for \(k\in \{0,1,2,3\}\).

Proof

Let \(X_t^c=\int _0^tb_s(\sigma _0){\textrm{d}}W_s\). By the definition of \(H_n^1\), we have

$$\begin{aligned} H_n^1(\sigma )-H_n^1(\sigma _0)=-\frac{1}{2}\Delta X^\top (S_n^{-1}(\sigma )-S_n^{-1}(\sigma _0))\Delta X -\frac{1}{2}\log \frac{\det S_n(\sigma )}{\det S_n(\sigma _0)}. \end{aligned}$$

We first show that

$$\begin{aligned} H_n^1(\sigma )-H_n^1(\sigma _0)=-\frac{1}{2}(\Delta X^c)^\top (S_n^{-1}(\sigma )-S_n^{-1}(\sigma _0))\Delta X^c-\frac{1}{2}\log \frac{\det S_n(\sigma )}{\det S_n(\sigma _0)}+\sqrt{n}{\dot{e}}_n(\sigma ), \end{aligned}$$
(3.7)

where \(({\dot{e}}_n(\sigma ))_{n=1}^\infty \) denotes a general sequence of random variables, such that \(\sup _\sigma |{\dot{e}}_n(\sigma )|\overset{P}{\rightarrow }0\) as \(n\rightarrow \infty \).

Since

$$\begin{aligned} \Delta X^\top S_n^{-1}(\sigma )\Delta X-(\Delta X^c)^\top S_n^{-1}(\sigma )\Delta X^c =2(\Delta \bar{V})^\top S_n^{-1}(\sigma )\Delta X^c +(\Delta \bar{V})^\top S_n^{-1}(\sigma )\Delta \bar{V}=:\Psi _1+\Psi _2, \end{aligned}$$
(3.8)

it suffices to show that \(\Psi _i=\sqrt{n}{\dot{e}}_n\) for \(i\in \{1,2\}\).

Lemma 3.5 and (3.4) yield

$$\begin{aligned} |\Psi _2|\le \Vert \mathcal {D}^{1/2}S_n^{-1}(\sigma )\mathcal {D}^{1/2} \Vert |\mathcal {D}^{-1/2}\Delta \bar{V}|^2=O_p(1)\times |\mathcal {D}^{-1/2}\Delta \bar{V}|^2. \end{aligned}$$
(3.9)

Moreover, Lemma 3.2 yields

$$\begin{aligned} |\mathcal {D}^{-1/2}\Delta V(\theta )|^2=\sum _{i,l}|I_i^l|^{-1}|\Delta _i^l V(\theta )|^2 \le C\sum _{i,l}|I_i^l|^{-1}|I_i^l|^2=C\sum _{i,l}|I_i^l|\le Cnh_n. \end{aligned}$$
(3.10)

Furthermore, Lemma 3.5, (3.4), (3.10), and the equation \(E_\Pi [\Delta X^c(\Delta X^c)^\top ]=S_n(\sigma _0)\) yield

$$\begin{aligned} E_\Pi [|\Psi _1|^2] =4(\Delta \bar{V})^\top S_n^{-1}(\sigma )E_\Pi [\Delta X^c(\Delta X^c)^\top ] S_n^{-1}(\sigma )\Delta \bar{V}= O_p(nh_n)=o_p(n). \end{aligned}$$
(3.11)

Then, we obtain (3.7) by (3.8)–(3.11) and Lemma 3.1.

Next, we show that

$$\begin{aligned} (\Delta X^c)^\top S_n^{-1}(\sigma )\Delta X^c - \textrm{tr}(S_n^{-1}(\sigma )S_n(\sigma _0))=\bar{R}_n(\sqrt{n}). \end{aligned}$$
(3.12)

Itô’s formula yields

$$\begin{aligned} &(\Delta X^c)^\top S_n^{-1}(\sigma )\Delta X^c - \textrm{tr}(S_n^{-1}(\sigma )S_n(\sigma _0)) \nonumber \\&\quad =\sum _{i,j}[S_n^{-1}(\sigma )]_{ij}(\Delta _i X^c\Delta _j X^c-[S_n(\sigma _0)]_{ij}) \nonumber \\&\quad =\sum _{i,j}[S_n^{-1}(\sigma )]_{ij}\bigg \{\int _{I_i}\Delta _{j,t} X^c{\textrm{d}}X^{c,\psi (i)}_t+\int _{I_j}\Delta _{i,t} X^c{\textrm{d}}X_t^{c,\psi (j)}\bigg \} \nonumber \\&\quad =2\sum _{i,j}[S_n^{-1}(\sigma )]_{ij}\int _{I_i}\Delta _{j,t} X^c{\textrm{d}}X_t^{c,\psi (i)}, \end{aligned}$$
(3.13)

where \(X^{c,l}_t\) is the l-th component of \(X_t^c\).

Since \(\langle \Delta _{i,\cdot } X^c, \Delta _{j,\cdot } X^c\rangle _t=\int _{[0,t)\cap I_i\cap I_j}[\Sigma _s]_{\psi (i),\psi (j)}{\textrm{d}}s\), together with the Burkholder–Davis–Gundy inequality, we have

$$\begin{aligned} &E_\Pi \bigg [\bigg (\sum _{i,j}[S_n^{-1}(\sigma )]_{ij}\int _{I_i}\Delta _{j,t} X^c {\textrm{d}}X_t^{c,\psi (i)}\bigg )^q\bigg ] \\&\quad \le C_q\sum _{l=1}^2E_\Pi \bigg [\bigg (\sum _{\begin{array}{c} i,j_1,j_2 \\ \psi (i)=l \end{array}}[S_n^{-1}(\sigma )]_{i,j_1}[S_n^{-1}(\sigma )]_{i,j_2} \int _{I_i} \Delta _{j_1,t} X^c \Delta _{j_2,t} X^c[\Sigma _t]_{\psi (i),\psi (i)}{\textrm{d}}t\bigg )^{q/2}\bigg ] \\&\qquad +C_qE_\Pi \bigg [\bigg (\sum _{\begin{array}{c} i_1,i_2,j_1,j_2 \\ \psi (i_1)=1, \psi (i_2)=2 \end{array}}[S_n^{-1}(\sigma )]_{i_1,j_1}[S_n^{-1}(\sigma )]_{i_2,j_2} \int _{I_{i_1}\cap I_{i_2}} \Delta _{j_1,t} X^c \Delta _{j_2,t} X^c[\Sigma _t]_{\psi (i_1),\psi (i_2)}{\textrm{d}}t\bigg )^{q/2}\bigg ] \\&\quad \le C_qE_\Pi \bigg [\bigg (\sum _{i_1,i_2,j_1,j_2}|[S_n^{-1}(\sigma )]_{i_1,j_1}[S_n^{-1}(\sigma )]_{i_2,j_2}|\sup _t|[\Sigma _t]_{\psi (i_1),\psi (i_2)}||I_{i_1}\cap I_{i_2}|\sup _t|\Delta _{j_1,t}X^c| \sup _t|\Delta _{j_2,t}X^c|\bigg )^{q/2}\bigg ] \\&\quad \le C_qE_\Pi \bigg [\bigg (\Vert \mathcal {D}^{1/2}\textrm{Abs}(S_n^{-1})\{|I_i\cap I_j|\}_{ij}\textrm{Abs}(S_n^{-1})\mathcal {D}^{1/2}\Vert \sum _i\frac{\sup _t|\Delta _{i,t}X^c|^2}{|I_i|}\bigg )^{q/2}\bigg ]. \end{aligned}$$

Together with Lemmas 3.3 and 3.5, the triangle inequality for the \(L^{q/2}\)-norm, the identity

$$\begin{aligned} |I_i\cap I_j|=\bigg [\mathcal {D}^{1/2}\left( \begin{array}{cc} \mathcal {E}_{M_1} &{} G \\ G^\top &{} \mathcal {E}_{M_2} \end{array} \right) \mathcal {D}^{1/2}\bigg ]_{ij}, \end{aligned}$$

and the bound

$$\begin{aligned} \Vert \textrm{Abs}(S_n^{-1})\Vert ^2&= \sup _{|x|=1}|\textrm{Abs}(S_n^{-1})x|^2 \\&= \sup _{|x|=1}\sum _i\bigg (\sum _j|[S_n^{-1}]_{ij}|x_j\bigg )^2 \\&\le C\sup _{|x|=1}\sum _i\bigg (\sum _j[\mathcal {D}^{-1/2}{\dot{S}}_n^{-1}\mathcal {D}^{-1/2}]_{ij}|x_j|\bigg )^2 \\&\le C\Vert \mathcal {D}^{-1/2}{\dot{S}}_n^{-1}\mathcal {D}^{-1/2}\Vert ^2 \end{aligned}$$

by Lemma 3.5, we have

$$\begin{aligned} &E_\Pi \bigg [\bigg (\sum _{i,j}[S_n^{-1}(\sigma )]_{ij}\int _{I_i}\Delta _{j,t} X^c {\textrm{d}}X_t^{c,\psi (i)}\bigg )^q\bigg ]\\&\quad \le C_q(1-\bar{\rho }_n)^{-q}E_\Pi \bigg [\bigg (\sum _i\frac{\sup _t|\Delta _{i,t}X^c|^2}{|I_i|}\bigg )^{q/2}\bigg ] \\&\quad \le C_q(1-\bar{\rho }_n)^{-q}\bigg (\sum _i\frac{E_\Pi [\sup _t|\Delta _{i,t}X^c|^q]^{2/q}}{|I_i|}\bigg )^{q/2} \\&\quad \le C_qM^{q/2}(1-\bar{\rho }_n)^{-q} \end{aligned}$$

on \(\{\bar{\rho }_n<1\}\) for \(q\ge 1\). Then, thanks to (3.2), (3.4), (3.13) and Lemma 3.1, we obtain (3.12).

(3.12), (3.7), Sobolev’s inequality, and similar estimates for \(\partial _\sigma ^k(H_n^1(\sigma )-H_n^1(\sigma _0))\) yield

$$\begin{aligned} &\partial _\sigma ^k(H_n^1(\sigma )-H_n^1(\sigma _0)) \\&\quad =-\frac{1}{2}\partial _\sigma ^k\textrm{tr}(S_n(\sigma _0)(S_n^{-1}(\sigma )-S_n^{-1}(\sigma _0)))-\frac{1}{2}\partial _\sigma ^k\log \frac{\det S_n(\sigma )}{\det S_n(\sigma _0)}+\sqrt{n}{\dot{e}}_n(\sigma ) \\&\quad =-\frac{1}{2}\partial _\sigma ^k\textrm{tr}(S_n^{-1}(\sigma )(S_n(\sigma _0)-S_n(\sigma )))-\frac{1}{2}\partial _\sigma ^k\log \frac{\det S_n(\sigma )}{\det S_n(\sigma _0)}+\sqrt{n}{\dot{e}}_n(\sigma ) \end{aligned}$$

for \(k\in \{0,1,2,3\}\). \(\square \)

For \((s_k)_{k=0}^\infty \in \mathfrak {S}\), let \(\dot{\mathcal {A}}_{k,p}^1=\mathcal {E}_{(k)}^1(GG^\top )^p\) and \(\dot{\mathcal {A}}_{k,p}^2=\mathcal {E}_{(k)}^2(G^\top G)^p\) for \(p\in \mathbb {Z}_+\) and \(1\le k\le q_n\). The following lemma is used when we specify the limit of \(n^{-1}(H_n^1(\sigma )-H_n^1(\sigma _0))\) in the next proposition.

Lemma 3.7

Assume (A2) and (A4). Then,  for any \(p\ge 1\)

$$\begin{aligned} n^{-1}\max _{1\le k\le q_n}|\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1)-\textrm{tr}(\dot{\mathcal {A}}_{k,p}^2)| \overset{P}{\rightarrow }0 \end{aligned}$$

as \(n\rightarrow \infty \).

Proof

By the definition of \(\dot{\mathcal {A}}_{k,p}^l\), we obtain

$$\begin{aligned} &|\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1)-\textrm{tr}(\dot{\mathcal {A}}_{k,p}^2)| \\&\quad =\bigg |\sum _{i;~\sup I_i^1\in (s_{k-1},s_k]}[(GG^\top )^p]_{ii}-\sum _{j;~\sup I_j^2\in (s_{k-1},s_k]}[(G^\top G)^p]_{jj}\bigg | \\&\quad =\bigg |\sum _{i;~\sup I_i^1\in (s_{k-1},s_k]}\sum _{i',j}[(GG^\top )^{p-1}]_{ii'}[G]_{i'j}[G^\top ]_{ji} -\sum _{j;~\sup I_j^2\in (s_{k-1},s_k]}\sum _{i,i'}[G^\top ]_{ji}[(GG^\top )^{p-1}]_{ii'}[G]_{i'j}\bigg |. \end{aligned}$$

The two summands on the right-hand side coincide for the pairs (i, j) such that \(\sup I_i^1\) and \(\sup I_j^2\) are either both included in or both excluded from \((s_{k-1},s_k]\). In the remaining cases, we have \(\min _{u=0,1}|\sup I_i^1-s_{k-u}|\le r_n\) whenever \([G^\top ]_{ji}>0\). Therefore, we obtain

$$\begin{aligned} |\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1)-\textrm{tr}(\dot{\mathcal {A}}_{k,p}^2)|&\le \bigg (\sum _{\begin{array}{c} i,j ;~\sup I_i^1\not \in (s_{k-1},s_k] \\ \sup I_j^2\in (s_{k-1},s_k] \end{array}}+\sum _{\begin{array}{c} i,j ;~\sup I_i^1\in (s_{k-1},s_k] \\ \sup I_j^2\not \in (s_{k-1},s_k] \end{array}}\bigg ) \sum _{i'} [(GG^\top )^{p-1}]_{ii'}[G]_{i'j}[G^\top ]_{ji} \\&\le \sum _{i;~\min _{u=0,1}|\sup I_i^1-s_{k-u}|\le r_n} [(GG^\top )^p]_{ii}. \end{aligned}$$

Thanks to (A2) and (A4), the right-hand side of the above inequality is equal to \(O_p(h_n^{-1})=o_p(n)\). \(\square \)

Let \(\mathcal {Y}_1(\sigma )=\lim _{T\rightarrow \infty }(T^{-1}\int _0^T y_{1,t}(\sigma ){\textrm{d}}t)\), where

$$\begin{aligned} y_{1,t}(\sigma )&= -\frac{1}{2}\mathcal {A}(\rho _t)\sum _{l=1}^2B_{l,t}^2+\mathcal {A}(\rho _t)\frac{B_{1,t}B_{2,t}\rho _{t,0}}{\rho _t} +\sum _{l=1}^2a_0^l\bigg (\frac{1}{2}-\frac{1}{2}B_{l,t}^2+\log B_{l,t}\bigg )\\&\quad +\int _{\rho _{t,0}}^{\rho _t}\frac{\mathcal {A}(\rho )}{\rho }{\textrm{d}}\rho . \end{aligned}$$

The limit \(\mathcal {Y}_1(\sigma )\) exists under (A1), (A3), and (A4).

Proposition 3.8

Assume (A1)–(A4). Then

$$\begin{aligned} \sup _{\sigma \in \Theta _1} |n^{-1}\partial _\sigma ^k(H_n^1(\sigma )-H_n^1(\sigma _0))-\partial _\sigma ^k\mathcal {Y}_1(\sigma )|\overset{P}{\rightarrow }0 \end{aligned}$$
(3.14)

as \(n\rightarrow \infty \) for \(k\in \{0,1,2,3\}\).

Proof

Let \(\mathcal {A}_p^1=(\tilde{G}\tilde{G}^\top )^p\), \(\mathcal {A}_p^2=(\tilde{G}^\top \tilde{G})^p\), \(\tilde{\Sigma }_{i,0}^l=\tilde{\Sigma }_i^l(\sigma _0)\), and \(\tilde{\Sigma }_{i,j,0}^{1,2}=\tilde{\Sigma }_{i,j}^{1,2}(\sigma _0)\). Thanks to (A1), for any \(\epsilon >0\), there exists \(\delta >0\), such that \(|t-s|<\delta \) implies

$$\begin{aligned} |\rho _t-\rho _s|\vee |\Sigma _t-\Sigma _s|\vee |\mu _t-\mu _s|<\epsilon \end{aligned}$$
(3.15)

for any \(\sigma \) and \(\theta \). We fix such \(\delta >0\), and fix a partition \(s_k= k\delta /2\). Then, (3.5) and (A4) yield

$$\begin{aligned} &n^{-1}\textrm{tr}(S_n^{-1}(\sigma )(S_n(\sigma _0)-S_n(\sigma ))) \nonumber \\&\quad =\frac{1}{n}\textrm{tr}\bigg (S_n^{-1}(\sigma )\left( \begin{array}{cc} \textrm{diag}((\tilde{\Sigma }_{i,0}^1-\tilde{\Sigma }_i^1)_i) &{} \{\tilde{\Sigma }_{i,j,0}^{1,2}-\tilde{\Sigma }_{i,j}^{1,2}\}_{ij} \\ \{\tilde{\Sigma }_{i,j,0}^{1,2}-\tilde{\Sigma }_{i,j}^{1,2}\}_{ji} &{} \textrm{diag}((\tilde{\Sigma }_{j,0}^2-\tilde{\Sigma }_j^2)_j) \end{array} \right) \bigg ) \nonumber \\&\quad =\frac{1}{n}\sum _{p=0}^\infty \bigg \{\sum _{l=1}^2\textrm{tr}\bigg (\textrm{diag}\bigg (\bigg (\frac{\tilde{\Sigma }_{i,0}^l}{\tilde{\Sigma }_i^l}-1\bigg )_i\bigg )\mathcal {A}_p^l\bigg ) -2\textrm{tr}\bigg (\mathcal {A}_p^1\tilde{G}\bigg \{\frac{\tilde{\Sigma }_{i,j,0}^{1,2}-\tilde{\Sigma }_{i,j}^{1,2}}{(\tilde{\Sigma }_i^1)^{1/2}(\tilde{\Sigma }_j^2)^{1/2}}\bigg \}_{ij}\bigg ) \bigg \} \nonumber \\&\quad =\frac{1}{n}\sum _{p=0}^\infty \sum _{k=1}^{q_n+1} \bigg \{\sum _{l=1}^2\textrm{tr}\bigg (\textrm{diag}\bigg (\bigg (\frac{\tilde{\Sigma }_{i,0}^l}{\tilde{\Sigma }_i^l}-1\bigg )_i\bigg )\mathcal {E}_{(k)}^l\mathcal {A}_p^l\bigg ) -2\textrm{tr}\bigg (\mathcal {E}_{(k)}^1\mathcal {A}_p^1\tilde{G}\bigg \{\frac{\tilde{\Sigma }_{i,j,0}^{1,2}-\tilde{\Sigma }_{i,j}^{1,2}}{(\tilde{\Sigma }_i^1)^{1/2}(\tilde{\Sigma }_j^2)^{1/2}}\bigg \}_{ij}\bigg ) \bigg \} \end{aligned}$$
(3.16)

if \(\bar{\rho }_n<1\).

Let \(\dot{\rho }_k=\rho _{s_{k-1}}\) and \({\dot{B}}_{k,l}=([\Sigma _{s_{k-1}}(\sigma _0)]_{ll}/[\Sigma _{s_{k-1}}(\sigma )]_{ll})^{1/2}\). Then, (3.15) yields that for any \(p\in \mathbb {Z}_+\), we have

$$\begin{aligned} |[\mathcal {E}_{(k)}^l\mathcal {A}_p^l]_{ij}-\dot{\rho }_k^{2p}[\dot{\mathcal {A}}_{k,p}^l]_{ij}|\le Cp\bar{\rho }_n^{2p-1}\epsilon \end{aligned}$$
(3.17)

on \(\{2pr_n<\delta /2\}\). Here, the factor p on the right-hand side appears because we compare a product of 2p factors \(\rho _{i'j'}\) with \(\dot{\rho }_k^{2p}\), replacing one factor at a time. Moreover, Lemma 3.4 and (3.4) yield

$$\begin{aligned} \limsup _{n\rightarrow \infty } \max _{1\le k\le q_n+1} \sum _{p=0}^\infty \Vert \mathcal {E}_{(k)}^l\mathcal {A}_p^l\Vert \le C\limsup _{n\rightarrow \infty }\sum _{p=0}^\infty \bar{\rho }_n^{2p}<\infty \end{aligned}$$
(3.18)

almost surely.

Then, together with (A2) and Lemma 3.7, we obtain

$$\begin{aligned} &n^{-1}\textrm{tr}(S_n^{-1}(\sigma )(S_n(\sigma _0)-S_n(\sigma ))) \nonumber \\&\quad =\frac{1}{n}\sum _{p=0}^\infty \sum _{k=1}^{q_n}\bigg \{\dot{\rho }_k^{2p}\sum _{l=1}^2({\dot{B}}_{k,l}^2-1)\textrm{tr}(\dot{\mathcal {A}}_{k,p}^l) -2\dot{\rho }_k^{2p+1}({\dot{B}}_{k,1}{\dot{B}}_{k,2}\dot{\rho }_{k,0}-\dot{\rho }_k)\textrm{tr}(\dot{\mathcal {A}}_{k,p+1}^1)\bigg \}+e_n, \end{aligned}$$
(3.19)

where \(\dot{\rho }_{k,0}=\rho _{s_{k-1}}(\sigma _0)\), and \((e_n)_{n=1}^\infty \) denotes a general sequence of random variables such that \(\limsup _{n\rightarrow \infty }|e_n|\rightarrow 0\) as \(\delta \rightarrow 0\).

Moreover, by (3.3) and Lemma 3.4, we can apply Lemma A.3 in Ogihara (2018) to \(S_n\). Then, we have

$$\begin{aligned} \log \det S_n(\sigma )&= \log \det \tilde{\mathcal {D}}+ \log \det \bigg (\mathcal {E}_M+\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) \bigg ) \\&= \sum _{l=1}^2\sum _{i=1}^{M_l} \log \tilde{\Sigma }_i^l+ \sum _{p=1}^\infty \frac{(-1)^{p-1}}{p}\textrm{tr}\bigg (\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^p\bigg ) \\&= \sum _{l=1}^2\sum _{i=1}^{M_l} \log \tilde{\Sigma }_i^l - \sum _{p=1}^\infty \frac{1}{p}\textrm{tr}((\tilde{G}\tilde{G}^\top )^p) \end{aligned}$$

if \(\bar{\rho }_n<1\). Therefore, thanks to (3.2) and (3.17), we obtain

$$\begin{aligned} n^{-1}\log \frac{\det S_n(\sigma )}{\det S_n(\sigma _0)}&= n^{-1}\sum _{l=1}^2\sum _{i=1}^{M_l}\log \frac{\tilde{\Sigma }_i^l}{\tilde{\Sigma }_{i,0}^l}-n^{-1}\sum _{p=1}^\infty \frac{1}{p}\textrm{tr}((\tilde{G}\tilde{G}^\top )^p- (\tilde{G}\tilde{G}^\top )^p(\sigma _0)) \nonumber \\&= -n^{-1}\sum _{k=1}^{q_n}\bigg \{\sum _{l=1}^2M_{l,k}\log {\dot{B}}_{k,l}^2+\sum _{p=1}^\infty \frac{\dot{\rho }_k^{2p}-\dot{\rho }_{k,0}^{2p}}{p}\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1)\bigg \}+e_n. \end{aligned}$$
(3.20)

Lemma 3.7, (3.6), (3.19), and (3.20) yield

$$\begin{aligned} H_n^1(\sigma )-H_n^1(\sigma _0)= & {} -\frac{1}{2}\sum _{p=0}^\infty \sum _{k=1}^{q_n}\bigg \{\dot{\rho }_k^{2p}\sum _{l=1}^2({\dot{B}}_{k,l}^2-1)\textrm{tr}(\dot{\mathcal {A}}_{k,p}^l)\nonumber \\{} & {} -2\dot{\rho }_k^{2p+1}({\dot{B}}_{k,1}{\dot{B}}_{k,2}\dot{\rho }_{k,0}-\dot{\rho }_k)\textrm{tr}(\dot{\mathcal {A}}_{k,p+1}^1)\bigg \} \nonumber \\{} & {} +\frac{1}{2}\sum _{k=1}^{q_n}\bigg \{\sum _{l=1}^2M_{l,k}\log {\dot{B}}_{k,l}^2+\sum _{p=1}^\infty \frac{\dot{\rho }_k^{2p}-\dot{\rho }_{k,0}^{2p}}{p}\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1)\bigg \}+ne_n \nonumber \\= & {} \sum _{k=1}^{q_n}\bigg \{-\frac{1}{2}\sum _{p=0}^\infty \dot{\rho }_k^{2p}\sum _{l=1}^2{\dot{B}}_{k,l}^2\textrm{tr}(\dot{\mathcal {A}}_{k,p}^l) +\sum _{p=1}^\infty \dot{\rho }_k^{2p-1}\dot{\rho }_{k,0}{\dot{B}}_{k,1}{\dot{B}}_{k,2}\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1)\nonumber \\{} & {} +\frac{1}{2}\sum _{l=1}^2\textrm{tr}(\dot{\mathcal {A}}_{k,0}^l)\nonumber \\{} & {} +\frac{1}{2}\sum _{l=1}^2M_{l,k}\log {\dot{B}}_{k,l}^2+\sum _{p=1}^\infty \frac{\dot{\rho }_k^{2p}-\dot{\rho }_{k,0}^{2p}}{2p}\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1) \bigg \}+ne_n \nonumber \\= & {} \sum _{k=1}^{q_n}\bigg [\sum _{p=1}^\infty \dot{\rho }_k^{2p}\bigg \{{-\frac{1}{2}}\sum _{l=1}^2{\dot{B}}_{k,l}^2\textrm{tr}(\dot{\mathcal {A}}_{k,p}^l) +\frac{\dot{\rho }_{k,0}}{\dot{\rho }_k}{\dot{B}}_{k,1}{\dot{B}}_{k,2}\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1)\bigg \} \nonumber \\{} & {} +\frac{1}{2}\sum _{l=1}^2M_{l,k}\big \{-{\dot{B}}_{k,l}^2+1+\log {\dot{B}}_{k,l}^2\big \}\nonumber \\{} & {} +\sum _{p=1}^\infty \frac{\dot{\rho }_k^{2p}-\dot{\rho }_{k,0}^{2p}}{2p}\textrm{tr}(\dot{\mathcal {A}}_{k,p}^1) \bigg ]+ne_n. \end{aligned}$$
(3.21)

Here, we used that \(\textrm{tr}(\dot{\mathcal {A}}_{k,0}^l)=\textrm{tr}(\mathcal {E}_{(k)}^l)=M_{l,k}\).

Moreover, (A4) and (3.15) yield

$$\begin{aligned}{} & {} \bigg |\sum _{k=1}^{q_n}f(s_{k-1})\textrm{tr}(\dot{\mathcal {A}}_{k,p}^l)-h_n^{-1}\int _0^{nh_n}a_p^1f(t){\textrm{d}}t\bigg | \nonumber \\{} & {} \quad \le \bigg |\sum _{k=1}^{q_n}f(s_{k-1})(\textrm{tr}(\dot{\mathcal {A}}_{k,p}^l)-h_n^{-1}a_p^1(s_k-s_{k-1}))\bigg |\nonumber \\{} & {} \qquad +\bigg |h_n^{-1}a_p^1\sum _{k=1}^{q_n}\int _{s_{k-1}}^{s_k}(f(t)-f(s_{k-1})){\textrm{d}}t\bigg | {+O_p(h_n^{-1})} \nonumber \\{} & {} \quad \le o_p(h_n^{-1}) \cdot q_n + {C_p\epsilon n+O_p(h_n^{-1})} = o_p(n)+ne_n \end{aligned}$$
(3.22)

for \(p\ge 1\) and any choice of \(f(t)=\rho _t^{2p}B_{l,t}^2\), \(\rho _t^{2p-1}\rho _{t,0}B_{1,t}B_{2,t}\) and \((\rho _t^{2p}-\rho _{t,0}^{2p})/(2p)\).

Here, we used that \(q_n=O(nh_n)\) by the definition of \((s_k)_{k=0}^\infty \in \mathfrak {S}\). Similarly, we obtain

$$\begin{aligned} \sum _{k=1}^{q_n}M_{l,k}(1-{\dot{B}}_{k,l}^2+\log {\dot{B}}_{k,l}^2)=h_n^{-1}\int _0^{nh_n}a_0^l(1-B_{l,t}^2+\log B_{l,t}^2){\textrm{d}}t+ne_n. \end{aligned}$$

Together with (3.21), (A3) and the equation

$$\begin{aligned} \sum _{p=1}^\infty a_p^1\frac{\rho _t^{2p}-\rho _{t,0}^{2p}}{2p}=\sum _{p=1}^\infty a_p^1\int _{\rho _{t,0}}^{\rho _t}\rho ^{2p-1}{\textrm{d}}\rho =\int _{\rho _{t,0}}^{\rho _t}\frac{\mathcal {A}(\rho )}{\rho }{\textrm{d}}\rho , \end{aligned}$$

we obtain

$$\begin{aligned} H_n^1(\sigma )-H_n^1(\sigma _0)=n\mathcal {Y}_1(\sigma )+ne_n. \end{aligned}$$

The above arguments show that the supremum with respect to \(\sigma \) of the residual term in the above equation is also equal to \(ne_n\), and consequently, we obtain (3.14) with \(k=0\). Similarly, we obtain (3.14) with \(k\in \{1,2,3\}\). \(\square \)

Proposition 3.9

Assume (A1)–(A4). Then,  there exists a positive constant \(\chi ,\) such that

$$\begin{aligned} \mathcal {Y}_1\le & {} \liminf _{T\rightarrow \infty }\frac{1}{T}\int _0^T\bigg \{-\frac{1}{2}(a_0^1\wedge a_0^2)(B_{1,t}-B_{2,t})^2-\chi \big \{a_1^1(\rho _t-\rho _{t,0})^2\\{} & {} +\,a_0^1\wedge a_0^2 (B_{1,t}B_{2,t}-1)^2\big \}\bigg \}{\textrm{d}}t. \end{aligned}$$

Proof

The proof is based on the ideas of the proof of Lemma 5 in Ogihara and Yoshida (2014). Let

$$\begin{aligned} G_k=\{[G]_{ij}1_{\{\sup I_i^1, \sup I_j^2 \in (s_{k-1}, s_k]\}}\}_{ij}, \end{aligned}$$

and let \(\tilde{\mathcal {A}}_{k,p}^1=(G_kG_k^\top )^p\) and \(\tilde{\mathcal {A}}_{k,p}^2=(G_k^\top G_k)^p\). Let \(\tilde{\mathcal {A}}_k=\sum _{p=1}^\infty \dot{\rho }_k^{2p}\textrm{tr}(\tilde{\mathcal {A}}_{k,p}^1)\) and \(\tilde{\mathcal {B}}_k=\sum _{p=1}^\infty (2p)^{-1}(\dot{\rho }_k^{2p}-\dot{\rho }_{k,0}^{2p})\textrm{tr}(\tilde{\mathcal {A}}_{k,p}^1)\). Similarly to the proof of Lemma 3.7, the difference between \(\textrm{tr}(\dot{\mathcal {A}}_{k,p}^l)\) and \(\textrm{tr}(\tilde{\mathcal {A}}_{k,p}^l)\) comes from terms with \(\sup I_i^1\) close to \(s_{k-1}\) or \(s_k\), and hence, we obtain

$$\begin{aligned} {\max _{1\le k\le q_n}|\textrm{tr}(\dot{\mathcal {A}}_{k,p}^l)-\textrm{tr}(\tilde{\mathcal {A}}_{k,p}^l)|=o_p(n).} \end{aligned}$$

Therefore, (3.21) yields

$$\begin{aligned} \mathcal {Y}_1= & {} \frac{1}{n}\sum _{k=1}^{q_n}\bigg \{-\frac{1}{2}({\dot{B}}_{k,1}^2+{\dot{B}}_{k,2}^2)\tilde{\mathcal {A}}_k +\frac{\dot{\rho }_{k,0}}{\dot{\rho }_k}{\dot{B}}_{k,1}{\dot{B}}_{k,2}\tilde{\mathcal {A}}_k\\{} & {} +\frac{1}{2}\sum _{l=1}^2M_{l,k}(1-{\dot{B}}_{k,l}^2+\log {\dot{B}}_{k,l}^2)+\tilde{\mathcal {B}}_k\bigg \}+e_n \\= & {} \frac{1}{n}\sum _{k=1}^{q_n}\bigg \{-\frac{1}{2}({\dot{B}}_{k,1}-{\dot{B}}_{k,2})^2\tilde{\mathcal {A}}_k +{\dot{B}}_{k,1}{\dot{B}}_{k,2}\bigg (\tilde{\mathcal {A}}_k\frac{\dot{\rho }_{k,0}}{\dot{\rho }_k}-\tilde{\mathcal {A}}_k\bigg )\\{} & {} +\frac{1}{2}\sum _{l=1}^2M_{l,k}(1-{\dot{B}}_{k,l}^2+\log {\dot{B}}_{k,l}^2)+\tilde{\mathcal {B}}_k\bigg \}+e_n. \end{aligned}$$

Then, since

$$\begin{aligned}{} & {} \frac{1}{2}\sum _{l=1}^2M_{l,k}(1-{\dot{B}}_{k,l}^2+\log {\dot{B}}_{k,l}^2) \\{} & {} \quad =M_{1,k}\bigg (1-\frac{{\dot{B}}_{k,1}^2}{2}-\frac{{\dot{B}}_{k,2}^2}{2}+\log ({\dot{B}}_{k,1}{\dot{B}}_{k,2})\bigg ) +\frac{M_{2,k}-M_{1,k}}{2}(1-{\dot{B}}_{k,2}^2+\log ({\dot{B}}_{k,2}^2)) \\{} & {} \quad =-\frac{1}{2}M_{1,k}({\dot{B}}_{k,1}-{\dot{B}}_{k,2})^2-M_{1,k}{\dot{B}}_{k,1}{\dot{B}}_{k,2} +M_{1,k}\bigg (1+\log ({\dot{B}}_{k,1}{\dot{B}}_{k,2})\bigg ) \\{} & {} \qquad +\frac{M_{2,k}-M_{1,k}}{2}(1-{\dot{B}}_{k,2}^2+\log ({\dot{B}}_{k,2}^2)), \end{aligned}$$

and a similar estimate holds by switching the roles of \(M_{1,k}\) and \(M_{2,k}\), we have

$$\begin{aligned} \mathcal {Y}_1= & {} n^{-1}\sum _{k=1}^{q_n}\bigg \{-\frac{1}{2}(M_{1,k}+\tilde{\mathcal {A}}_k)({\dot{B}}_{k,1}-{\dot{B}}_{k,2})^2+M_{1,k}(1+\log ({\dot{B}}_{k,1}{\dot{B}}_{k,2})) \\{} & {} +\tilde{\mathcal {B}}_k+\frac{M_{2,k}-M_{1,k}}{2}(1-{\dot{B}}_{k,2}^2+\log ({\dot{B}}_{k,2}^2)) +{\dot{B}}_{k,1}{\dot{B}}_{k,2}\bigg (\tilde{\mathcal {A}}_k\frac{\dot{\rho }_{k,0}}{\dot{\rho }_k}-\tilde{\mathcal {A}}_k-M_{1,k}\bigg )\bigg \}+e_n \\= & {} n^{-1}\sum _{k=1}^{q_n}\bigg \{-\frac{1}{2}(M_{2,k}+\tilde{\mathcal {A}}_k)({\dot{B}}_{k,1}-{\dot{B}}_{k,2})^2+M_{2,k}(1+\log ({\dot{B}}_{k,1}{\dot{B}}_{k,2})) \\{} & {} +\tilde{\mathcal {B}}_k+\frac{M_{1,k}-M_{2,k}}{2}(1-{\dot{B}}_{k,1}^2+\log ({\dot{B}}_{k,1}^2)) +{\dot{B}}_{k,1}{\dot{B}}_{k,2}\bigg (\tilde{\mathcal {A}}_k\frac{\dot{\rho }_{k,0}}{\dot{\rho }_k}-\tilde{\mathcal {A}}_k-M_{2,k}\bigg )\bigg \}+e_n. \end{aligned}$$

For \(l\in \{1,2\}\), let

$$\begin{aligned} F_{l,k}=M_{l,k}(1+\log ({\dot{B}}_{k,1}{\dot{B}}_{k,2}))+\tilde{\mathcal {B}}_k+{\dot{B}}_{k,1}{\dot{B}}_{k,2}\bigg (\tilde{\mathcal {A}}_k\frac{\dot{\rho }_{k,0}}{\dot{\rho }_k}-\tilde{\mathcal {A}}_k-M_{l,k}\bigg ), \end{aligned}$$

then, since \(1-x+\log x \le 0\) for \(x>0\), we obtain

$$\begin{aligned} \mathcal {Y}_1\le & {} n^{-1}\sum _{k=1}^{q_n}\bigg [\bigg \{-\frac{1}{2}(M_{1,k}+\tilde{\mathcal {A}}_k)({\dot{B}}_{k,1}-{\dot{B}}_{k,2})^2+F_{1,k}\bigg \}1_{\{M_{2,k}\ge M_{1,k}\}} \\{} & {} +\bigg \{-\frac{1}{2}(M_{2,k}+\tilde{\mathcal {A}}_k)({\dot{B}}_{k,1}-{\dot{B}}_{k,2})^2+F_{2,k}\bigg \}1_{\{M_{2,k}< M_{1,k}\}}\bigg ] +e_n, \end{aligned}$$

and therefore, we have

$$\begin{aligned} \mathcal {Y}_1\le n^{-1}\sum _{k=1}^{q_n}\bigg \{-\frac{1}{2}(M_{1,k}\wedge M_{2,k}+\tilde{\mathcal {A}}_k)({\dot{B}}_{k,1}-{\dot{B}}_{k,2})^2+F_{1,k}\vee F_{2,k}\bigg \}+e_n. \end{aligned}$$
(3.23)

Let \((\lambda _i^k)_{i=1}^{M_{1,k}}\) be all the eigenvalues of \(G_kG_k^\top \). Similarly to Lemma 3.3, we have \(0\le \lambda _i^k\le 1\). Then, we have

$$\begin{aligned} F_{1,k}= & {} \sum _{i=1}^{M_{1,k}}\bigg \{1+\log ({\dot{B}}_{k,1}{\dot{B}}_{k,2})+{\dot{B}}_{k,1}{\dot{B}}_{k,2}\sum _{p=0}^\infty \big \{(\lambda _i^k)^{p+1}\dot{\rho }_k^{2p+1}\dot{\rho }_{k,0}-(\lambda _i^k)^p\dot{\rho }_k^{2p}\big \}\\{} & {} +\sum _{p=1}^\infty \frac{(\lambda _i^k)^p}{2p}(\dot{\rho }_k^{2p}-\dot{\rho }_{k,0}^{2p})\bigg \}. \end{aligned}$$

Moreover, by setting \(g_i^k=\sqrt{1-\lambda _i^k\dot{\rho }_k^2}\), \(g_{i,0}^k=\sqrt{1-\lambda _i^k\dot{\rho }_{k,0}^2}\), and \(F(x)=1-x+\log x\), we have

$$\begin{aligned} F_{1,k}= & {} \sum _{i=1}^{M_{1,k}}\Big \{1+{\dot{B}}_{k,1}{\dot{B}}_{k,2}(g_i^k)^{-2}(\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}-1)+\log ({\dot{B}}_{k,1}{\dot{B}}_{k,2}g_{i,0}^k(g_i^k)^{-1})\Big \} \\= & {} \sum _{i=1}^{M_{1,k}}\Big \{{\dot{B}}_{k,1}{\dot{B}}_{k,2}(g_i^k)^{-2}(\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}-1)+{\dot{B}}_{k,1}{\dot{B}}_{k,2}g_{i,0}^k(g_i^k)^{-1}+F({\dot{B}}_{k,1}{\dot{B}}_{k,2}g_{i,0}^k(g_i^k)^{-1})\Big \}. \end{aligned}$$

Here, we also used the expansion formulas \((1-x)^{-1}=\sum _{p=0}^\infty x^p\) and \(-\log (1-x)=\sum _{p=1}^\infty x^p/p\) for \(|x|<1\).
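For the reader's convenience, the two resummations used in the last display can be spelled out; with \(\lambda =\lambda _i^k\),

$$\begin{aligned} \sum _{p=0}^\infty \big \{\lambda ^{p+1}\dot{\rho }_k^{2p+1}\dot{\rho }_{k,0}-\lambda ^p\dot{\rho }_k^{2p}\big \}=\frac{\lambda \dot{\rho }_k\dot{\rho }_{k,0}-1}{1-\lambda \dot{\rho }_k^2}=(g_i^k)^{-2}(\lambda \dot{\rho }_k\dot{\rho }_{k,0}-1) \end{aligned}$$

and

$$\begin{aligned} \sum _{p=1}^\infty \frac{\lambda ^p}{2p}(\dot{\rho }_k^{2p}-\dot{\rho }_{k,0}^{2p})=-\frac{1}{2}\log (1-\lambda \dot{\rho }_k^2)+\frac{1}{2}\log (1-\lambda \dot{\rho }_{k,0}^2)=\log \big (g_{i,0}^k(g_i^k)^{-1}\big ). \end{aligned}$$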

Let

$$\begin{aligned} \mathcal {R}=\sup _{t,\sigma ,0\le l\le 4}{(|\partial _\sigma ^l \Sigma _t|^{1/2}\vee |\partial _\sigma ^l \Sigma _t^{-1}|^{1/2}).} \end{aligned}$$

Since \(g_i^k\le 1\), \(0\le \lambda _i^k\le 1\), and \(|\dot{\rho }_k|< 1\), we have

$$\begin{aligned} (g_i^k)^{-2}(\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}-1) {+g_{i,0}^k(g_i^k)^{-1}}= & {} {\frac{(\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}-1+g_{i,0}^kg_i^k)(1-\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}+g_{i,0}^kg_i^k)}{(g_i^k)^2(1-\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}+g_{i,0}^kg_i^k)}} \\= & {} -\frac{(\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}-1)^2-(g_{i,0}^k)^2(g_i^k)^2}{(g_i^k)^2(1-\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}+g_{i,0}^kg_i^k)} \\= & {} -\frac{\lambda _i^k(\dot{\rho }_k-\dot{\rho }_{k,0})^2}{(g_i^k)^2(1-\lambda _i^k\dot{\rho }_k\dot{\rho }_{k,0}+g_{i,0}^kg_i^k)} \\\le & {} -\frac{\lambda _i^k}{3}(\dot{\rho }_k-\dot{\rho }_{k,0})^2. \end{aligned}$$

Together with Lemma 11 in Ogihara and Yoshida (2014) and

$$\begin{aligned} {\dot{B}}_{k,1}{\dot{B}}_{k,2}g_{i,0}^k(g_i^k)^{-1}-1=\frac{{\dot{B}}_{k,1}{\dot{B}}_{k,2}\sqrt{1-\lambda _i^k\dot{\rho }_{k,0}^2}-\sqrt{1-\lambda _i^k\dot{\rho }_k^2}}{\sqrt{1-\lambda _i^k\dot{\rho }_k^2}} \le \frac{{\dot{B}}_{k,1}{\dot{B}}_{k,2}}{\sqrt{1-\bar{\rho }_n^2}} \le \frac{\mathcal {R}^4}{\sqrt{1-\bar{\rho }_n^2}}, \end{aligned}$$

we have

$$\begin{aligned} F_{1,k}\le \sum _{i=1}^{M_{1,k}}\bigg \{-\frac{{\dot{B}}_{k,1}{\dot{B}}_{k,2}}{3}\lambda _i^k(\dot{\rho }_k-\dot{\rho }_{k,0})^2-\frac{1-\bar{\rho }_n^2}{4\mathcal {R}^8}({\dot{B}}_{k,1}{\dot{B}}_{k,2}g_{i,0}^k(g_i^k)^{-1}-1)^2\bigg \}. \end{aligned}$$

Moreover, the inequality \(a^2\ge (a+b)^2/2-b^2\) with \(a={\dot{B}}_{k,1}{\dot{B}}_{k,2}g_{i,0}^k-g_i^k\) and \(b=g_i^k-g_{i,0}^k\) yields

$$\begin{aligned} ({\dot{B}}_{k,1}{\dot{B}}_{k,2}g_{i,0}^k(g_i^k)^{-1}-1)^2\ge & {} ({\dot{B}}_{k,1}{\dot{B}}_{k,2}g_{i,0}^k-g_i^k)^2 \\\ge & {} \frac{(g_{i,0}^k)^2}{2}({\dot{B}}_{k,1}{\dot{B}}_{k,2}-1)^2 - (g_i^k-g_{i,0}^k)^2 \\= & {} \frac{1-\lambda _i^k\dot{\rho }_{k,0}^2}{2}({\dot{B}}_{k,1}{\dot{B}}_{k,2}-1)^2-\frac{(\lambda _i^k)^2(\dot{\rho }_k-\dot{\rho }_{k,0})^2}{(g_i^k+g_{i,0}^k)^2} \\\ge & {} \frac{1-\bar{\rho }_n^2}{2}({\dot{B}}_{k,1}{\dot{B}}_{k,2}-1)^2 - \frac{\lambda _i^k}{{4(1-\bar{\rho }_n^2)}}(\dot{\rho }_k-\dot{\rho }_{k,0})^2, \end{aligned}$$

and hence, we have

$$\begin{aligned} F_{1,k}\le & {} \sum _{i=1}^{M_{1,k}}\bigg \{-\frac{{\dot{B}}_{k,1}{\dot{B}}_{k,2}}{3}\lambda _i^k(\dot{\rho }_k-\dot{\rho }_{k,0})^2 -\frac{(1-\bar{\rho }_n^2)^2}{8\mathcal {R}^8}({\dot{B}}_{k,1}{\dot{B}}_{k,2}-1)^2 +\frac{\lambda _i^k}{{16}\mathcal {R}^8}(\dot{\rho }_k-\dot{\rho }_{k,0})^2\bigg \} \\= & {} -\bigg (\frac{{\dot{B}}_{k,1}{\dot{B}}_{k,2}}{3}-\frac{1}{{16}\mathcal {R}^8}\bigg ){\textrm{tr}(\tilde{\mathcal {A}}_{k,1}^1)}(\dot{\rho }_k-\dot{\rho }_{k,0})^2 -\frac{(1-\bar{\rho }_n^2)^2}{8\mathcal {R}^8}M_{1,k}({\dot{B}}_{k,1}{\dot{B}}_{k,2}-1)^2. \end{aligned}$$

By a similar argument for \(F_{2,k}\), there exists a positive constant \(\tilde{\chi }\), depending on neither k nor n, such that

$$\begin{aligned} F_{1,k}\vee F_{2,k} \le {-\tilde{\chi }(1-\bar{\rho }_n^2)^2 \big \{\textrm{tr}(\tilde{\mathcal {A}}_{k,1}^1)}(\dot{\rho }_k-\dot{\rho }_{k,0})^2 + M_{1,k}\wedge M_{2,k} ({\dot{B}}_{k,1}{\dot{B}}_{k,2}-1)^2\big \}. \end{aligned}$$

Together with (3.23), we have

$$\begin{aligned} \mathcal {Y}_1\le & {} n^{-1}\sum _{k=1}^{q_n}\bigg \{-\frac{1}{2}(M_{1,k}\wedge M_{2,k})({\dot{B}}_{k,1}-{\dot{B}}_{k,2})^2 \\{} & {} - \tilde{\chi }(1-\bar{\rho }_n^2)^2 \big \{\textrm{tr}(\tilde{\mathcal {A}}_{k,1}^1)(\dot{\rho }_k-\dot{\rho }_{k,0})^2 + M_{1,k}\wedge M_{2,k} ({\dot{B}}_{k,1}{\dot{B}}_{k,2}-1)^2\big \}\bigg \}+e_n. \end{aligned}$$

By letting \(n\rightarrow \infty \), (A4) and (3.3) yield the conclusion. \(\square \)

(A6) and Remark 4 in Ogihara and Yoshida (2014) yield that

$$\begin{aligned} \limsup _{T\rightarrow \infty }\frac{1}{T}\int _0^T\big \{|B_{1,t}-B_{2,t}|^2+|B_{1,t}B_{2,t}-1|^2+| \rho _t-\rho _{t,0}|^2\big \} {\textrm{d}}t>0, \end{aligned}$$

when \(\sigma \ne \sigma _0\).

Then, by Proposition 3.9, we have \(\mathcal {Y}_1(\sigma )<0\) (note that \(a_0^1\wedge a_0^2\ge a_1^1\) by (2.3) and a similar argument). Therefore, for any \(\delta >0\), there exists \(\eta >0\), such that

$$\begin{aligned} {\inf _{|\sigma -\sigma _0|\ge \delta }(-\mathcal {Y}_1(\sigma ))\ge \eta .} \end{aligned}$$

Then, since \(H_n^1(\hat{\sigma }_n)-H_n^1(\sigma _0)\ge 0\) by the definition of \(\hat{\sigma }_n\), for any \(\epsilon >0\), we have

$$\begin{aligned} P(|\hat{\sigma }_n-\sigma _0|\ge \delta )\le & {} P\bigg (\sup _{|\sigma -\sigma _0|\ge \delta }(H_n^1(\sigma )-H_n^1(\sigma _0))\ge 0\bigg ) \nonumber \\\le & {} P\bigg (\sup _{|\sigma -\sigma _0|\ge \delta }\Big (n^{-1}(H_n^1(\sigma )-H_n^1(\sigma _0))-\mathcal {Y}_1(\sigma )\Big )\ge \eta \bigg ) \nonumber \\\le & {} P\bigg (\sup _\sigma |n^{-1}(H_n^1(\sigma )-H_n^1(\sigma _0))-{\mathcal {Y}_1}(\sigma )|\ge \eta \bigg )<\epsilon \qquad \end{aligned}$$
(3.24)

for sufficiently large n by Proposition 3.8, which implies \(\hat{\sigma }_n\overset{P}{\rightarrow }\sigma _0\) as \(n\rightarrow \infty \).

3.3 Asymptotic normality of \(\hat{\sigma }_n\)

Let \(S_{n,0}=S_n(\sigma _0)\) and \(\Sigma _{t,0}=\Sigma _t(\sigma _0)\). (3.7) and the equation \(\partial _\sigma S_{n,0}^{-1}=-S_{n,0}^{-1}\partial _\sigma S_{n,0} S_{n,0}^{-1}\) imply

$$\begin{aligned} \partial _\sigma H_n^1(\sigma _0)= & {} -\frac{1}{2}(\Delta X^c)^\top \partial _\sigma S_{n,0}^{-1}\Delta X^c-\frac{1}{2}\textrm{tr}(\partial _\sigma S_{n,0} S_{n,0}^{-1})+o_p(\sqrt{n}) \nonumber \\= & {} -\frac{1}{2}\textrm{tr}(\partial _\sigma S_{n,0}^{-1}(\Delta X^c(\Delta X^c)^\top - S_{n,0}))+o_p(\sqrt{n}). \end{aligned}$$
(3.25)

Let \((L_n)_{n\in \mathbb {N}}\) be a sequence of positive integers such that \(L_n\rightarrow \infty \) and \(L_nn^\eta (nh_n)^{-1}\rightarrow 0\) as \(n\rightarrow \infty \) for some \(\eta >0\). Let \(\check{s}_k=kT_n/L_n\) for \(0\le k\le L_n\), let \(J^k=(\check{s}_{k-1},\check{s}_k]\), and let \(S_{n,0}^{(k)}\) be an \(M\times M\) matrix satisfying

$$\begin{aligned}{}[S_{n,0}^{(k)}]_{ij}=\int _{I_i\cap I_j\cap J^k}[\Sigma _{t,0}]_{{\psi (i),\psi (j)}}{\textrm{d}}t. \end{aligned}$$

For a two-dimensional stochastic process \((U_t)_{t\ge 0}=((U_t^1,U_t^2))_{t\ge 0}\), let \(\Delta _{i,t}^{l,(k)} U=U^l_{(S^{n,l}_i\vee \check{s}_{k-1})\wedge \check{s}_k\wedge t}-U^l_{(S^{n,l}_{i-1}\vee \check{s}_{k-1})\wedge \check{s}_k\wedge t}\), and let \(\Delta _{i,t}^{(k)} U=\Delta _{\varphi (i),t}^{\psi (i),(k)} U\) for \(1\le i\le M\). Let \(\Delta _i^{(k)} U=\Delta _{i,T_n}^{(k)} U\), and let \(\Delta ^{(k)} U=(\Delta _i^{(k)} U)_{1\le i\le M}\).

Let

$$\begin{aligned} \mathcal {X}_k=-\frac{1}{2\sqrt{n}}\big \{(\Delta ^{(k)} X^c)^\top \partial _\sigma S_{n,0}^{-1}\Delta ^{(k)}X^c-\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)})\big \} -\frac{1}{\sqrt{n}}\sum _{k'<k}(\Delta ^{(k)} X^c)^\top \partial _\sigma S_{n,0}^{-1}\Delta ^{(k')}X^c. \end{aligned}$$

Then, since \(\Delta X^c=\sum _{k=1}^{L_n}\Delta ^{(k)} X^c\) and \(S_{n,0}=\sum _{k=1}^{L_n}S_{n,0}^{(k)}\), (3.25) yields

$$\begin{aligned} n^{-1/2}\partial _\sigma H_n^1(\sigma _0)=\sum _{k=1}^{L_n}\mathcal {X}_k+o_p(1). \end{aligned}$$
(3.26)
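Behind (3.26) is the elementary bilinear decomposition (spelled out here for clarity; it uses only the symmetry of \(A=\partial _{\sigma _i}S_{n,0}^{-1}\) for each component \(\sigma _i\)):

$$\begin{aligned} (\Delta X^c)^\top A\Delta X^c=\sum _{k=1}^{L_n}\bigg \{(\Delta ^{(k)}X^c)^\top A\Delta ^{(k)}X^c+2\sum _{k'<k}(\Delta ^{(k)}X^c)^\top A\Delta ^{(k')}X^c\bigg \}, \end{aligned}$$

together with \(\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0})=\sum _{k=1}^{L_n}\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)})\).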

Moreover, Itô’s formula yields

$$\begin{aligned} \sqrt{n}\mathcal {X}_k= & {} -\frac{1}{2}\sum _{i,j}[\partial _\sigma S_{n,0}^{-1}]_{ij}\bigg \{2\int _{I_i\cap J^k}\Delta _{j,t}^{(k)} X^c{\textrm{d}}X_t^{c,\psi (i)} +2\sum _{k'<k}\int _{I_i\cap J^k}\Delta _j^{(k')} X^c{\textrm{d}}X_t^{c,\psi (i)}\bigg \} \nonumber \\= & {} -\sum _{i,j}[\partial _\sigma S_{n,0}^{-1}]_{ij}\int _{I_i\cap J^k}\Delta _{j,t} X^c{\textrm{d}}X_t^{c,\psi (i)}. \end{aligned}$$
(3.27)

Let \(\mathcal {G}_t=\mathcal {F}_t\bigvee \sigma (\{\Pi _n\}_n)\) for \(t\ge 0\). We will show

$$\begin{aligned} n^{-1/2}\partial _\sigma H_n^1(\sigma _0) \overset{d}{\rightarrow }N(0,\Gamma _1), \end{aligned}$$
(3.28)

using Corollary 3.1 of Hall and Heyde (1980) and the remark following it. For this purpose, it is sufficient to show

$$\begin{aligned} \sum _{k=1}^{L_n}E_k[\mathcal {X}_k^2]\overset{P}{\rightarrow }\Gamma _1, \end{aligned}$$
(3.29)

and

$$\begin{aligned} \sum _{k=1}^{L_n}E_k[\mathcal {X}_k^4]\overset{P}{\rightarrow }0, \end{aligned}$$
(3.30)

by (3.26), where \(E_k\) denotes the conditional expectation with respect to \(\mathcal {G}_{\check{s}_{k-1}}\).
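We also note why (3.30) suffices for the conditional Lindeberg condition required there: for any \(\varepsilon >0\),

$$\begin{aligned} \sum _{k=1}^{L_n}E_k\big [\mathcal {X}_k^21_{\{|\mathcal {X}_k|>\varepsilon \}}\big ]\le \varepsilon ^{-2}\sum _{k=1}^{L_n}E_k[\mathcal {X}_k^4]\overset{P}{\rightarrow }0. \end{aligned}$$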

We first show four auxiliary lemmas. Let \(\tilde{M}_k=\#\{i; 1\le i\le M, \sup I_i\in J^k\}\).

Lemma 3.10

Assume (A1). Then,  there exists a positive constant C,  such that \(\Vert \mathcal {D}^{-1/2}S_{n,0}^{(k)}\mathcal {D}^{-1/2}\Vert \le C\) and \(\textrm{tr}(\mathcal {D}^{-1/2}S_{n,0}^{(k)}\mathcal {D}^{-1/2})\le C(\tilde{M}_k+1)\) for any \(1\le k\le L_n\).

Proof

Since

$$\begin{aligned} |[S_{n,0}^{(k)}]_{ij}|\le C\bigg [\mathcal {D}^{1/2}\left( \begin{array}{cc} \mathcal {E}_{M_1} &{} G \\ G^\top &{} \mathcal {E}_{M_2} \end{array} \right) \mathcal {D}^{1/2}\bigg ]_{ij}, \end{aligned}$$

Lemma 3.3 yields

$$\begin{aligned} \Vert \mathcal {D}^{-1/2}S_{n,0}^{(k)}\mathcal {D}^{-1/2}\Vert \le C\bigg \Vert \left( \begin{array}{cc} \mathcal {E}_{M_1} &{} G \\ G^\top &{} \mathcal {E}_{M_2} \end{array} \right) \bigg \Vert \le C. \end{aligned}$$

Moreover, we have

$$\begin{aligned} \textrm{tr}(\mathcal {D}^{-1/2}S_{n,0}^{(k)}\mathcal {D}^{-1/2})=\sum _{i=1}^M\frac{\int _{I_i\cap J^k}[\Sigma _{t,0}]_{\psi (i),\psi (i)}{\textrm{d}}t}{|I_i|}\le C \sum _{i=1}^M1_{\{i; I_i\cap J^k\ne \emptyset \}} \le C(\tilde{M}_k+1). \end{aligned}$$

\(\square \)

Lemma 3.11

Assume (A4) and that \(nh_nL_n^{-1}\rightarrow \infty \) as \(n\rightarrow \infty \). Then,  \(\{L_nn^{-1}\max _{1\le k\le L_n}\tilde{M}_k\}_{n=1}^\infty \) is P-tight.

Proof

Let \(\mathcal {M}_n=[nh_nL_n^{-1}]\). We define a partition of \([0,\infty )\) by

$$\begin{aligned} s_j=\frac{nh_nj}{2L_n\mathcal {M}_n} \quad (j\ge 0). \end{aligned}$$

Then, \((s_j)_{j=0}^\infty \in \mathfrak {S}\) when \(nh_nL_n^{-1}\ge 1\), and \((s_j)_{j=0}^{2L_n\mathcal {M}_n}\) is a refinement of \((\check{s}_k)_{k=0}^{L_n}\).

For \(M_{l,j}\) defined with respect to this finer partition (while \(\tilde{M}_k\) is still defined using \((\check{s}_k)\)), we have

$$\begin{aligned} {\tilde{M}_k= \sum _{l=1}^2\sum _{j=2\mathcal {M}_n(k-1)+1}^{2\mathcal {M}_nk} M_{l,j},} \end{aligned}$$

since \({\check{s}_k=nh_nkL_n^{-1}=s_{2\mathcal {M}_nk}}\). Therefore, (A4) yields

$$\begin{aligned} \max _{1\le k\le L_n}\tilde{M}_k\le 4\mathcal {M}_n \max _{l,j}M_{l,j} \le C\mathcal {M}_n\{h_n^{-1}(a_0^1\vee a_0^2)+o_p(h_n^{-1})\}=O_p(nL_n^{-1}). \end{aligned}$$

\(\square \)

Lemma 3.12

Assume (A1). Then

$$\begin{aligned} \Vert \tilde{\mathcal {D}}^{-1/2}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k')}\tilde{\mathcal {D}}^{-1/2}\Vert \le C\frac{{(\mathcal {Q}_n+1)\bar{\rho }_n^{\mathcal {Q}_n}}}{(1-\bar{\rho }_n)^2} \end{aligned}$$

on \(\{\bar{\rho }_n<1\}\) for \(|k-k'|>1,\) where \(\mathcal {Q}_n=[r_n^{-1}(T_n/L_n-2r_n)]\).

Proof

Using the expansion formula (3.5), we have

$$\begin{aligned} S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k')}= & {} -S_{n,0}^{(k)}S_{n,0}^{-1} \partial _\sigma S_{n,0}S_{n,0}^{-1} S_{n,0}^{(k')} \nonumber \\= & {} -S_{n,0}^{(k)}\tilde{\mathcal {D}}^{-1/2}\sum _{p=0}^\infty (-1)^p\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^p \tilde{\mathcal {D}}^{-1/2}\partial _\sigma S_{n,0}\tilde{\mathcal {D}}^{-1/2}\nonumber \\{} & {} \times \sum _{q=0}^\infty (-1)^q\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^q \tilde{\mathcal {D}}^{-1/2}S_{n,0}^{(k')} \nonumber \\= & {} \sum _{p,q=0}^\infty (-1)^{p+q+1}S_{n,0}^{(k)}{{\mathfrak {C}}_{p,q}^n} S_{n,0}^{(k')} \end{aligned}$$
(3.31)

if \(\bar{\rho }_n<1\), where

$$\begin{aligned} {\mathfrak {C}}^n_{p,q}=\tilde{\mathcal {D}}^{-1/2}\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^p \tilde{\mathcal {D}}^{-1/2}\partial _\sigma S_{n,0}\tilde{\mathcal {D}}^{-1/2}\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^q \tilde{\mathcal {D}}^{-1/2}. \end{aligned}$$

We consider a necessary condition for

$$\begin{aligned}{}[S_{n,0}^{(k)}{\mathfrak {C}}_{p,q}^n S_{n,0}^{(k')}]_{i',j'}=\sum _{ij}[S_{n,0}^{(k)}]_{i'i} [{\mathfrak {C}}_{p,q}^n]_{ij} [S_{n,0}^{(k')}]_{j,j'} \end{aligned}$$
(3.32)

to be zero for any \(i'\) and \(j'\). We first observe that the element \([{\mathfrak {C}}^n_{p,q}]_{ij}\) is equal to zero if \([\bar{S}^{p+q+1}]_{ij}=0\), where

$$\begin{aligned} \bar{S}=\left( \begin{array}{cc} \mathcal {E}_{M_1} &{} G \\ G^\top &{} \mathcal {E}_{M_2} \end{array} \right) . \end{aligned}$$

Moreover, \([S_{n,0}^{(k)}]_{i'i}\ne 0\) only if \(I_i\cap J^k \ne \emptyset \), and \([S_{n,0}^{(k')}]_{jj'}\ne 0\) only if \(I_j\cap J^{k'}\ne \emptyset \). Since \(\inf _{x\in I_i, y\in I_j}|x-y|> T_n/L_n-2r_n\) if \(I_i\cap J^k \ne \emptyset \) and \(I_j\cap J^{k'}\ne \emptyset \), we have \([\bar{S}^r]_{ij}=0\) for \(r\le \mathcal {Q}_n\) when \([S_{n,0}^{(k)}]_{i'i}\ne 0\) and \([S_{n,0}^{(k')}]_{jj'}\ne 0\). Therefore, \([S_{n,0}^{(k)}{\mathfrak {C}}_{p,q}^n S_{n,0}^{(k')}]_{i',j'}=0\) for any \(i'\) and \(j'\) if \(p+q+1\le \mathcal {Q}_n\).

Then, (3.31) and Lemmas 3.4, 3.5, and 3.10 yield

$$\begin{aligned}{} & {} \Vert \tilde{\mathcal {D}}^{-1/2}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k')}\tilde{\mathcal {D}}^{-1/2}\Vert \\{} & {} \quad \le \sum _{p=0}^\infty \sum _{q=(\mathcal {Q}_n-p)\vee 0}^\infty \bigg \Vert \tilde{\mathcal {D}}^{-1/2}S_{n,0}^{(k)}\tilde{\mathcal {D}}^{-1/2}\left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^p \tilde{\mathcal {D}}^{-1/2}\partial _\sigma S_{n,0}\tilde{\mathcal {D}}^{-1/2}\\{} & {} \qquad \times \left( \begin{array}{cc} 0 &{} \tilde{G}\\ \tilde{G}^\top &{} 0 \end{array} \right) ^q \tilde{\mathcal {D}}^{-1/2}S_{n,0}^{(k')} \tilde{\mathcal {D}}^{-1/2}\bigg \Vert \\{} & {} \quad \le C\sum _{p=0}^\infty \sum _{q=(\mathcal {Q}_n-p)\vee 0}^\infty \bar{\rho }_n^{p+q}=C\frac{\mathcal {Q}_n\bar{\rho }_n^{\mathcal {Q}_n}+\bar{\rho }_n^{\mathcal {Q}_n}(1-\bar{\rho }_n)^{-1}}{1-\bar{\rho }_n} \\{} & {} \quad \le C\frac{{(\mathcal {Q}_n+1)\bar{\rho }_n^{\mathcal {Q}_n}}}{(1-\bar{\rho }_n)^2} \end{aligned}$$

on \(\{\bar{\rho }_n<1\}\). \(\square \)

Lemma 3.13

Let \(m\in \mathbb {N}\). Let V be an \(m\times m\) symmetric, positive definite matrix and A be an \(m\times m\) symmetric matrix. Let X be a random vector following N(0, V). Then

$$\begin{aligned} E[(X^\top AX)^2]= & {} \textrm{tr}(AV)^2+2\textrm{tr}((AV)^2), \\ E[(X^\top AX)^3]= & {} \textrm{tr}(AV)^3+6\textrm{tr}(AV)\textrm{tr}((AV)^2)+8\textrm{tr}((AV)^3), \\ E[(X^\top AX)^4]= & {} \textrm{tr}(AV)^4+12\textrm{tr}(AV)^2\textrm{tr}((AV)^2)+12\textrm{tr}((AV)^2)^2\\{} & {} +32\textrm{tr}(AV)\textrm{tr}((AV)^3)+48\textrm{tr}((AV)^4). \end{aligned}$$

Proof

We only show the result for \(E[(X^\top AX)^4]\). Let U be an orthogonal matrix and \(\Lambda \) be a diagonal matrix satisfying \(UVU^\top =\Lambda \). Then, we have \(UX\sim N(0,\Lambda )\), and

$$\begin{aligned} E\bigg [\prod _{i=1}^8[UX]_{j_i}\bigg ]=\sum _{(l_{2q-1},l_{2q})_{q=1}^4}\prod _{q=1}^4[\Lambda ]_{l_{2q-1},l_{2q}}, \end{aligned}$$

where the summation over \((l_{2q-1},l_{2q})_{q=1}^4\) is taken over all partitions of \(\{j_1,\ldots ,j_8\}\) into disjoint pairs. Then, by setting \(B=UAU^\top \), we have

$$\begin{aligned} E[(X^\top AX)^4]=\sum _{j_1,\ldots , j_8}\sum _{(l_{2q-1},l_{2q})_{q=1}^4}\prod _{p=1}^4[B]_{j_{2p-1},j_{2p}}\prod _{q=1}^4[\Lambda ]_{l_{2q-1},l_{2q}}. \end{aligned}$$

Let \(_nC_k=\frac{n!}{k!(n-k)!}\). Out of \(j_1,\ldots , j_8\), we connect \(j_{2p-1}\) to \(j_{2p}\) and \(l_{2q-1}\) to \(l_{2q}\) (\(1\le p,q\le 4\)). Then, the pattern of the connected components gives five different cases.

  1. Four connected components (four components of size 2): only one case of the pairs \((l_{2q-1},l_{2q})_{q=1}^4\) appears, which corresponds to \(\textrm{tr}(B\Lambda )^4\).

  2. Three connected components (a component of size 4 and two components of size 2): the choice of elements for the component of size 4 gives \(_4C_2\) ways, and the choice of the pair \((l_{2q-1},l_{2q})\) for this component gives two ways; hence, \(_4C_2\times 2=12\) ways in total. This case corresponds to \(\textrm{tr}(B\Lambda )^2\textrm{tr}((B\Lambda )^2)\).

  3. Two connected components (two components of size 4): the choice of elements for each component gives \(\frac{_4C_2}{2}\) ways, excluding duplicates, and the choice of the pair \((l_{2q-1},l_{2q})\) for each component gives two ways; hence, \(\frac{_4C_2}{2}\times 2\times 2=12\) ways in total. This case corresponds to \(\textrm{tr}((B\Lambda )^2)^2\).

  4. Two connected components (a component of size 6 and a component of size 2): the choice of elements for the component of size 6 gives \(_4C_1\) ways, and the choice of the pair \((l_{2q-1},l_{2q})\) for this component gives \(4\times 2=8\) ways; hence, \(_4C_1\times 8=32\) ways in total. This case corresponds to \(\textrm{tr}(B\Lambda )\textrm{tr}((B\Lambda )^3)\).

  5. One connected component (a component of size 8): the choice of the pair \((l_{2q-1},l_{2q})\) gives \(6\times 4\times 2=48\) ways. This case corresponds to \(\textrm{tr}((B\Lambda )^4)\).

Then, we obtain the conclusion. \(\square \)
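Though not part of the proof, the moment identities of Lemma 3.13 are easy to check numerically. The following is a minimal Monte Carlo sketch in Python with NumPy; the matrices V and A and the sample size are arbitrary illustrative choices, and A is taken symmetric, which is the case used in this section since \(\partial _\sigma S_{n,0}^{-1}\) is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4

# Arbitrary symmetric positive definite V and symmetric A (test inputs)
B0 = rng.standard_normal((m, m))
V = B0 @ B0.T + m * np.eye(m)
A0 = rng.standard_normal((m, m))
A = (A0 + A0.T) / 2

# Sample X ~ N(0, V) and form the quadratic form X^T A X for each draw
X = rng.multivariate_normal(np.zeros(m), V, size=1_000_000)
q = np.einsum("ni,ij,nj->n", X, A, X)

AV = A @ V
t1 = np.trace(AV)
t2 = np.trace(AV @ AV)
t3 = np.trace(AV @ AV @ AV)
t4 = np.trace(AV @ AV @ AV @ AV)

# Empirical moments versus the closed forms of Lemma 3.13
print((q**2).mean(), t1**2 + 2 * t2)
print((q**3).mean(), t1**3 + 6 * t1 * t2 + 8 * t3)
print((q**4).mean(), t1**4 + 12 * t1**2 * t2 + 12 * t2**2 + 32 * t1 * t3 + 48 * t4)
```

Each printed pair should agree up to Monte Carlo error.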

Proposition 3.14

Assume (A1)–(A4) and (A6). Then

$$\begin{aligned} n^{-1/2}\partial _\sigma H_n^1(\sigma _0) \overset{d}{\rightarrow }N(0,\Gamma _1) \end{aligned}$$

as \(n\rightarrow \infty \).

Proof

It is sufficient to show (3.29) and (3.30). Let \(\mathfrak {A}_k=(\Delta ^{(k)} X^c)^\top \partial _\sigma S_{n,0}^{-1}\Delta ^{(k)}X^c\) and \(\mathfrak {B}_k=\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\). By the definition of \(\mathcal {X}_k\), we have

$$\begin{aligned}{} & {} \sum _{k=1}^{L_n}E_k[\mathcal {X}_k^4] \nonumber \\{} & {} \quad \le \frac{C}{n^2}\sum _{k=1}^{L_n}\bigg \{E_k\big [\big \{(\Delta ^{(k)} X^c)^\top \partial _\sigma S_{n,0}^{-1}\Delta ^{(k)}X^c-\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)})\big \}^4\big ]\nonumber \\{} & {} \qquad +E_k\bigg [\bigg (\sum _{k'<k}(\Delta ^{(k)} X^c)^\top \partial _\sigma S_{n,0}^{-1}\Delta ^{(k')}X^c\bigg )^4\bigg ]\bigg \} \nonumber \\{} & {} \quad =\frac{C}{n^2}\sum _{k=1}^{L_n}\Big \{E_k[\mathfrak {A}_k^4]-4E_k[\mathfrak {A}_k^3]\textrm{tr}(\mathfrak {B}_k)+6E_k[\mathfrak {A}_k^2]\textrm{tr}(\mathfrak {B}_k)^2-4\textrm{tr}(\mathfrak {B}_k)^4+\textrm{tr}(\mathfrak {B}_k)^4\Big \} \nonumber \\{} & {} \qquad +\frac{C}{n^2}\sum _{k=1}^{L_n}\bigg \{\bigg (\sum _{k'<k}\Delta ^{(k')}X^c\bigg )^\top \partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}\bigg (\sum _{k'<k}\Delta ^{(k')}X^c\bigg )\bigg \}^2. \end{aligned}$$
(3.33)

Thanks to Lemmas 3.13, 3.10, 3.11, and 3.5, (3.4), and Lemma A.1 in Ogihara (2018), the first term on the right-hand side is calculated as

$$\begin{aligned}{} & {} \frac{C}{n^2}\sum _{k=1}^{L_n}\Big \{\textrm{tr}(\mathfrak {B}_k)^4+12\textrm{tr}(\mathfrak {B}_k)^2\textrm{tr}(\mathfrak {B}_k^2)+12\textrm{tr}(\mathfrak {B}_k^2)^2+32\textrm{tr}(\mathfrak {B}_k)\textrm{tr}(\mathfrak {B}_k^3)+48\textrm{tr}(\mathfrak {B}_k^4) \nonumber \\{} & {} \qquad -4\textrm{tr}(\mathfrak {B}_k)\big \{\textrm{tr}(\mathfrak {B}_k)^3+6\textrm{tr}(\mathfrak {B}_k)\textrm{tr}(\mathfrak {B}_k^2)+8\textrm{tr}(\mathfrak {B}_k^3)\big \} +6\textrm{tr}(\mathfrak {B}_k)^2\big \{\textrm{tr}(\mathfrak {B}_k)^2\nonumber \\{} & {} \qquad +2\textrm{tr}(\mathfrak {B}_k^2)\big \}-3\textrm{tr}(\mathfrak {B}_k)^4\Big \} \nonumber \\{} & {} \quad = \frac{C}{n^2}\sum _{k=1}^{L_n}\big \{48\textrm{tr}(\mathfrak {B}_k^4)+12\textrm{tr}(\mathfrak {B}_k^2)^2\big \} \nonumber \\{} & {} \quad \le \frac{C}{n^2}(\max _k\tilde{M}_k+1)^2L_n(1-\bar{\rho }_n)^{-8}1_{\{\bar{\rho }_n<1\}}+o_p(1)\overset{P}{\rightarrow }0. \end{aligned}$$
(3.34)

Moreover, Lemma 3.13 yields

$$\begin{aligned}{} & {} E_\Pi \bigg [\frac{C}{n^2}\sum _{k=1}^{L_n}\bigg \{\bigg (\sum _{k'<k}\Delta ^{(k')}X^c\bigg )^\top \partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}\bigg (\sum _{k'<k}\Delta ^{(k')}X^c\bigg )\bigg \}^2\bigg ] \nonumber \\{} & {} \quad \le \frac{C}{n^2}\sum _{k=1}^{L_n}\sum _{k'_1,k'_2<k}\big \{|\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_1)})\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_2)})| \nonumber \\{} & {} \qquad +|\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_1)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_2)})|\big \}. \end{aligned}$$
(3.35)

If \(k'_1<k-1\), Lemmas 3.5 and 3.12, Lemma A.1 in Ogihara (2018) and the equation \(\partial _\sigma S_{n,0}^{-1}=-S_{n,0}^{-1}\partial _\sigma S_{n,0} S_{n,0}^{-1}\) yield

$$\begin{aligned}{} & {} |\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)} \partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_1)})| \\{} & {} \quad =|\textrm{tr}(\tilde{\mathcal {D}}^{1/2}S_{n,0}^{-1}\tilde{\mathcal {D}}^{1/2}\tilde{\mathcal {D}}^{-1/2} \partial _\sigma S_{n,0} \tilde{\mathcal {D}}^{-1/2}\tilde{\mathcal {D}}^{1/2}S_{n,0}^{-1}\tilde{\mathcal {D}}^{1/2}\tilde{\mathcal {D}}^{-1/2}S_{n,0}^{(k)} \partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_1)}\tilde{\mathcal {D}}^{-1/2})| \\{} & {} \quad \le \textrm{tr}(\tilde{\mathcal {D}}^{1/2}S_{n,0}^{-1}\tilde{\mathcal {D}}^{1/2})\Vert \tilde{\mathcal {D}}^{-1/2} \partial _\sigma S_{n,0} \tilde{\mathcal {D}}^{-1/2}\Vert \Vert \tilde{\mathcal {D}}^{1/2}S_{n,0}^{-1}\tilde{\mathcal {D}}^{1/2}\Vert \Vert \tilde{\mathcal {D}}^{-1/2}S_{n,0}^{(k)} \partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_1)}\tilde{\mathcal {D}}^{-1/2})\Vert \\{} & {} \quad \le CM\mathcal {Q}_n\bar{\rho }_n^{\mathcal {Q}_n}(1-\bar{\rho }_n)^{-4} \end{aligned}$$

on \(\{\bar{\rho }_n<1\}\). Here, we used that \(\textrm{tr}(\tilde{\mathcal {D}}^{1/2}S_{n,0}^{-1}\tilde{\mathcal {D}}^{1/2})\le M\cdot \Vert \tilde{\mathcal {D}}^{1/2}S_{n,0}^{-1}\tilde{\mathcal {D}}^{1/2}\Vert \le CM(1-\bar{\rho }_n)^{-1}\). Similarly, we obtain

$$\begin{aligned} |\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_1)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k'_2)})| \le CM\mathcal {Q}_n\bar{\rho }_n^{\mathcal {Q}_n}(1-\bar{\rho }_n)^{-8}. \end{aligned}$$

Since \(\bar{\rho }_n^{\mathcal {Q}_n}\) converges to zero very fast if \(\bar{\rho }_n<1\) and \(r_n\le 1\), together with (A2) and (3.3), the summation of the terms with \(k'_1<k-1\) or \(k'_2<k-1\) on the right-hand side of (3.35) is equal to \(o_p(1)\).

Then, together with Lemmas 3.5, 3.10, and 3.11, and Lemma A.1 in Ogihara (2018), we obtain

$$\begin{aligned}{} & {} E_\Pi \bigg [\frac{C}{n^2}\sum _{k=1}^{L_n}\bigg \{\bigg (\sum _{k'<k}\Delta ^{(k')}X^c\bigg )^\top \partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}\bigg (\sum _{k'<k}\Delta ^{(k')}X^c\bigg )\bigg \}^2\bigg ] \nonumber \\{} & {} \quad \le \frac{C}{n^2}\sum _{k=1}^{L_n}\big \{|\textrm{tr}(\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k-1)})^2|+|\textrm{tr}((\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k)}\partial _\sigma S_{n,0}^{-1}S_{n,0}^{(k-1)})^2)|\big \} +o_p(1) \nonumber \\{} & {} \quad =O_p\bigg (\frac{L_n}{n^2}\Big \{\max _k\tilde{M}_k+1\Big \}^2\bigg )+o_p(1)\overset{P}{\rightarrow }0 \end{aligned}$$
(3.36)

as \(n\rightarrow \infty \). Then, (3.33), (3.34), and (3.36) yield (3.30).

Next, we show (3.29). Let \(\mathcal {I}_{i,j}^k=I_i\cap I_j\cap J^k\). Then, (3.27) yields

$$\begin{aligned} \sum _{k=1}^{L_n}E_k[\mathcal {X}_k^2]= & {} \frac{1}{n}\sum _{k=1}^{L_n}\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2}\nonumber \\{} & {} \times \int _{\mathcal {I}_{i_1,i_2}^k}[\Sigma _{t,0}]_{\psi (i_1),\psi (i_2)}E_k[\Delta _{j_1,t} X^c \Delta _{j_2,t} X^c]{\textrm{d}}t \nonumber \\= & {} \frac{1}{n}\sum _{k=1}^{L_n}\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2}\nonumber \\{} & {} \times \int _{\mathcal {I}_{i_1,i_2}^k}[\Sigma _{t,0}]_{\psi (i_1),\psi (i_2)}\int _{I_{j_1}\cap I_{j_2}\cap [0,t)}[\Sigma _{s,0}]_{\psi (j_1),\psi (j_2)}{\textrm{d}}s{\textrm{d}}t. \qquad \end{aligned}$$
(3.37)

We can decompose

$$\begin{aligned}{} & {} \int _{\mathcal {I}_{i_1,i_2}^k}[\Sigma _{t,0}]_{\psi (i_1),\psi (i_2)}\int _{I_{j_1}\cap I_{j_2}\cap [0,t)}[\Sigma _{s,0}]_{\psi (j_1),\psi (j_2)}{\textrm{d}}s{\textrm{d}}t\\{} & {} \quad =\int _0^{T_n}F_{i_1,i_2}^k(t)\int _0^t F_{j_1,j_2}^k(s){\textrm{d}}s{\textrm{d}}t + \sum _{k'<k}\mathcal {F}_{i_1,i_2}^k\mathcal {F}_{j_1,j_2}^{k'}, \end{aligned}$$

where \(F_{ij}^k(t)=[\Sigma _{t,0}]_{\psi (i),\psi (j)}1_{\mathcal {I}_{i,j}^k}(t)\), and \(\mathcal {F}_{i,j}^k=\int _0^{T_n}F_{i,j}^k(t){\textrm{d}}t\). Moreover, switching the roles of \(i_1,i_2\) and \(j_1,j_2\), we obtain

$$\begin{aligned}{} & {} \sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2}\int _0^{T_n}F_{i_1,i_2}^k(t)\int _0^tF_{j_1,j_2}^k(s){\textrm{d}}s{\textrm{d}}t \\{} & {} \quad =\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2} \times \frac{1}{2}\bigg \{\int _0^{T_n}F_{i_1,i_2}^k(t)\int _0^tF_{j_1,j_2}^k(s){\textrm{d}}s{\textrm{d}}t\\{} & {} \qquad +\int _0^{T_n}F_{j_1,j_2}^k(t)\int _0^tF_{i_1,i_2}^k(s){\textrm{d}}s{\textrm{d}}t\bigg \} \\{} & {} \quad =\frac{1}{2}\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2} \bigg \{\int _0^{T_n}F_{i_1,i_2}^k(t)\int _0^tF_{j_1,j_2}^k(s){\textrm{d}}s{\textrm{d}}t\\{} & {} \qquad +\int _0^{T_n}F_{i_1,i_2}^k(s)\int _s^{T_n}F_{j_1,j_2}^k(t){\textrm{d}}t{\textrm{d}}s\bigg \} \\{} & {} \quad =\frac{1}{2}\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2}\mathcal {F}_{i_1,i_2}^k\mathcal {F}_{j_1,j_2}^k. \end{aligned}$$

Therefore, we have

$$\begin{aligned} \sum _{k=1}^{L_n}E_k[\mathcal {X}_k^2]= & {} \frac{1}{2n}\sum _{k=1}^{L_n}\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2} \bigg \{\mathcal {F}_{i_1,i_2}^k\mathcal {F}_{j_1,j_2}^k+2\sum _{k'<k}\mathcal {F}_{i_1,i_2}^k\mathcal {F}_{j_1,j_2}^{k'}\bigg \} \nonumber \\= & {} \frac{1}{2n}\sum _{k,k'=1}^{L_n} \sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2}\mathcal {F}_{i_1,i_2}^k\mathcal {F}_{j_1,j_2}^{k'} \nonumber \\= & {} \frac{1}{2n}\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma S_{n,0}^{-1}]_{i_2,j_2}\int _{I_{i_1}\cap I_{i_2}}[\Sigma _{t,0}]_{\psi (i_1),\psi (i_2)}{\textrm{d}}t\nonumber \\{} & {} \times \int _{I_{j_1}\cap I_{j_2}}[\Sigma _{s,0}]_{\psi (j_1),\psi (j_2)}{\textrm{d}}s \nonumber \\= & {} \frac{1}{2n}\textrm{tr}((\partial _\sigma S_{n,0}^{-1}S_{n,0})^2). \end{aligned}$$
(3.38)

\(\partial _\sigma S_{n,0}^{-1}S_{n,0}\) corresponds to \(\hat{\mathcal {D}}(t)\) in the proof (p. 2993) of Proposition 10 of Ogihara and Yoshida (2014). Then, by steps similar to those in that proof, we obtain (3.29). \(\square \)

Proposition 3.15

Assume (A1)–(A4) and (A6). Then,  \(\Gamma _1\) is positive definite and

$$\begin{aligned} \sqrt{n}(\hat{\sigma }_n-\sigma _0)\overset{d}{\rightarrow }N(0,\Gamma _1^{-1}) \end{aligned}$$

as \(n\rightarrow \infty \).

Proof

Proposition 3.9, (A6), and Remark 4 in Ogihara and Yoshida (2014) yield

$$\begin{aligned} \mathcal {Y}_1(\sigma )\le -c |\sigma -\sigma _0|^2 \end{aligned}$$
(3.39)

for some positive constant c. Moreover, \(\mathcal {Y}_1(\sigma _0)=0\) by \(B_{l,t,0}=1\), and \(\partial _\sigma \mathcal {Y}_1(\sigma _0)=0\) by

$$\begin{aligned} \partial _\sigma y_{1,t}(\sigma _0)= & {} -\partial _\rho \mathcal {A}(\rho _{t,0})\partial _\sigma \rho _{t,0}-\frac{1}{2}\mathcal {A}(\rho _{t,0})\sum _{l=1}^2 2\partial _\sigma B_{l,t,0}\\{} & {} +\partial _\rho \mathcal {A}(\rho _{t,0})\partial _\sigma \rho _{t,0} -\mathcal {A}(\rho _{t,0})\frac{\partial _\sigma \rho _{t,0}}{\rho _{t,0}} \\{} & {} +(\partial _\sigma B_{1,t,0}+\partial _\sigma B_{2,t,0})\mathcal {A}(\rho _{t,0})+\sum _{l=1}^2a_0^l(-\partial _\sigma B_{l,t,0}+\partial _\sigma B_{l,t,0})\\{} & {} +\frac{\mathcal {A}(\rho _{t,0})}{\rho _{t,0}}\partial _\sigma \rho _{t,0} \\= & {} 0. \end{aligned}$$

Then, Taylor’s formula yields

$$\begin{aligned} \mathcal {Y}_1(\sigma )=\frac{1}{2}(\sigma -\sigma _0)^\top \partial _\sigma ^2\mathcal {Y}_1(\sigma _0)(\sigma -\sigma _0)+o(|\sigma -\sigma _0|^2). \end{aligned}$$

Therefore, taking \(\sigma \) sufficiently close to \(\sigma _0\) in (3.39), we see that \(\Gamma _1=-\partial _\sigma ^2 \mathcal {Y}_1(\sigma _0)\) is positive definite.

By Taylor’s formula and the equation \(\partial _\sigma H_n^1(\hat{\sigma }_n)=0\), we have

$$\begin{aligned} -\partial _\sigma H_n^1(\sigma _0)= & {} \partial _\sigma H_n^1(\hat{\sigma }_n)-\partial _\sigma H_n^1(\sigma _0) \\= & {} \int _0^1\partial _\sigma ^2H_n^1(\sigma _t)dt(\hat{\sigma }_n-\sigma _0) \\= & {} \partial _\sigma ^2H_n^1(\sigma _0)(\hat{\sigma }_n-\sigma _0)+(\hat{\sigma }_n-\sigma _0)^\top \int _0^1(1-t)\partial _\sigma ^3H_n^1(\sigma _t){\textrm{d}}t(\hat{\sigma }_n-\sigma _0), \end{aligned}$$

where \(\sigma _t=t\hat{\sigma }_n+(1-t)\sigma _0\).

Therefore, we obtain

$$\begin{aligned} \sqrt{n}(\hat{\sigma }_n-\sigma _0)=\bigg \{-\frac{1}{n}\partial _\sigma ^2H_n^1(\sigma _0)-\frac{1}{n}\int _0^1(1-t)\partial _\sigma ^3H_n^1(\sigma _t){\textrm{d}}t(\hat{\sigma }_n-\sigma _0)\bigg \}^{-1}\cdot \frac{1}{\sqrt{n}}\partial _\sigma H_n^1(\sigma _0). \end{aligned}$$
(3.40)

Since Proposition 3.8 yields

$$\begin{aligned} -\frac{1}{n}\partial _\sigma ^2 H_n^1(\sigma _0)\overset{P}{\rightarrow }-\partial _\sigma ^2\mathcal {Y}_1(\sigma _0)=\Gamma _1, \end{aligned}$$

and

$$\begin{aligned} \bigg \{\sup _\sigma \bigg |\frac{1}{n}\partial _\sigma ^3H_n^1(\sigma )\bigg |\bigg \}_{n\in \mathbb {N}} \end{aligned}$$

is P-tight, together with Proposition 3.14, we conclude

$$\begin{aligned} \sqrt{n}(\hat{\sigma }_n-\sigma _0)\overset{d}{\rightarrow }N(0,\Gamma _1^{-1}). \end{aligned}$$
(3.41)

\(\square \)

3.4 Consistency of \(\hat{\theta }_n\)

Let

$$\begin{aligned} \mathcal {Y}_2(\theta )=\lim _{T\rightarrow \infty }\frac{1}{T}\int _0^T\sum _{p=0}^\infty \bigg \{-\frac{1}{2}\sum _{l=1}^2f_p^{ll}\rho _{t,0}^{2p}\phi _{l,t}^2+f_p^{12}\rho _{t,0}^{2p+1}\phi _{1,t}\phi _{2,t}\bigg \}{\textrm{d}}t, \end{aligned}$$

which exists under (A1), (A3), and (A5).

Proposition 3.16

Assume (A1)–(A6). Then

$$\begin{aligned} \sup _{\theta \in \Theta _2}\big |(nh_n)^{-1}\partial _\theta ^k(H_n^2(\theta )-H_n^2(\theta _0))-\partial _\theta ^k\mathcal {Y}_2(\theta )\big |\overset{P}{\rightarrow }0 \end{aligned}$$
(3.42)

as \(n\rightarrow \infty \) for \(k\in \{0,1,2,3\}\).

Proof

We first show that

$$\begin{aligned}{} & {} \bar{X}(\theta )^\top S_n^{-1}(\hat{\sigma }_n)\bar{X}(\theta ) \nonumber \\{} & {} \quad =\Delta X^\top S_n^{-1}(\hat{\sigma }_n)\Delta X -2\Delta V(\theta )^\top S_{n,0}^{-1}\Delta X^c -\Delta V(\theta )^\top S_{n,0}^{-1}(2\Delta V(\theta _0) \nonumber \\{} & {} \quad - \Delta V(\theta ))+\sqrt{nh_n}{\dot{e}}_n(\theta ), \end{aligned}$$
(3.43)

where \(({\dot{e}}_n(\theta ))_{n=1}^\infty \) denotes a general sequence of random variables, such that \(\sup _\theta |{\dot{e}}_n(\theta )|\overset{P}{\rightarrow }0\) as \(n\rightarrow \infty \).

Lemma 3.5 and (3.10) yield

$$\begin{aligned}{} & {} E_\Pi \big [(\Delta V(\theta )^\top \partial _\sigma ^k S_{n,0}^{-1}\Delta X^c)^2\big ] \nonumber \\{} & {} \quad =\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma ^k S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma ^k S_{n,0}^{-1}]_{i_2,j_2}\Delta _{i_1} V(\theta ) \Delta _{i_2} V(\theta )E_\Pi [\Delta _{j_1}X^c \Delta _{j_2}X^c] \nonumber \\{} & {} \quad =\sum _{i_1,j_1}\sum _{i_2,j_2}[\partial _\sigma ^k S_{n,0}^{-1}]_{i_1,j_1}[\partial _\sigma ^k S_{n,0}^{-1}]_{i_2,j_2}\Delta _{i_1} V(\theta ) \Delta _{i_2} V(\theta )[S_{n,0}]_{j_1,j_2} \nonumber \\{} & {} \quad \le C|\mathcal {D}^{-1/2}\Delta V(\theta )|^2\Vert \mathcal {D}^{1/2}\partial _\sigma ^k S_{n,0}^{-1}\mathcal {D}^{1/2}\Vert ^2\Vert \mathcal {D}^{-1/2}S_{n,0}\mathcal {D}^{-1/2}\Vert \nonumber \\{} & {} \quad \le {Cnh_n(1-\bar{\rho }_n)^{-2k-2}} \end{aligned}$$
(3.44)

on \(\{\bar{\rho }_n<1\}\).

Since (3.2), Lemma 3.2 and Taylor’s formula yield

$$\begin{aligned}{} & {} E_\Pi [|\mathcal {D}^{-1/2}\Delta X|^2]=\sum _{i=1}^M \frac{E_\Pi [|\Delta _i X|^2]}{|I_i|}\le C{(M+nh_n)}={O_p(n),}\nonumber \\{} & {} \bar{X}(\theta )^\top S_n^{-1}(\hat{\sigma }_n)\bar{X}(\theta ) -\Delta X^\top S_n^{-1}(\hat{\sigma }_n)\Delta X =-\Delta V(\theta )^\top S_n^{-1}(\hat{\sigma }_n)(2\Delta X - \Delta V(\theta )),\nonumber \\ \end{aligned}$$
(3.45)

and

$$\begin{aligned} S_n^{-1}(\hat{\sigma }_n)=S_{n,0}^{-1}+(\hat{\sigma }_n-\sigma _0)\partial _\sigma S_{n,0}^{-1}+{(\hat{\sigma }_n-\sigma _0)^\top }\int _0^1(1-u)\partial _\sigma ^2S_n^{-1}(u\hat{\sigma }_n+(1-u)\sigma _0)du (\hat{\sigma }_n-\sigma _0) \end{aligned}$$
(3.46)

(3.4), (3.10), (3.41), (3.45), and Lemma 3.5 imply

$$\begin{aligned}{} & {} \sup _\theta \big |\bar{X}(\theta )^\top S_n^{-1}(\hat{\sigma }_n)\bar{X}(\theta ) -\Delta X^\top S_n^{-1}(\hat{\sigma }_n)\Delta X\nonumber \\{} & {} \qquad +\Delta V(\theta )^\top \big \{S_{n,0}^{-1}+(\hat{\sigma }_n-\sigma _0)\partial _\sigma S_{n,0}^{-1}\big \}(2\Delta X - \Delta V(\theta ))\big | \nonumber \\{} & {} \quad =\sup _\theta \bigg |\Delta V(\theta )^\top \int _0^1(1-u)\sum _{i,j}\partial _{\sigma _i}\partial _{\sigma _j}S_n^{-1}(u\hat{\sigma }_n\nonumber \\{} & {} \qquad +(1-u)\sigma _0) [\hat{\sigma }_n-\sigma _0]_i[\hat{\sigma }_n-\sigma _0]_j{\textrm{d}}u(2\Delta X-\Delta V(\theta ))\bigg | \nonumber \\{} & {} \quad {\le \sup _\theta |\mathcal {D}^{-1/2}\Delta V(\theta )|\cdot \sup _\theta |\mathcal {D}^{-1/2}(2\Delta X-\Delta V(\theta ))|\cdot |\hat{\sigma }_n-\sigma _0|^2} \nonumber \\{} & {} \qquad {\times \sum _{ij}\bigg \Vert \int _0^1(1-u)\mathcal {D}^{1/2}\partial _{\sigma _i}\partial _{\sigma _j}S_n^{-1}(u\hat{\sigma }_n+(1-u)\sigma _0)\mathcal {D}^{1/2}{\textrm{d}}u\bigg \Vert } \nonumber \\{} & {} \quad {=O_p(\sqrt{nh_n} \cdot \sqrt{n}\cdot (n^{-1/2})^2\cdot 1)=o_p(\sqrt{nh_n}).} \end{aligned}$$
(3.47)

Thanks to (3.4), (3.10), (3.41), and Lemma 3.5, we have

$$\begin{aligned}{} & {} \sup _\theta |\Delta V(\theta )^\top \big \{(\hat{\sigma }_n-\sigma _0)\partial _\sigma S_{n,0}^{-1}\big \}(2\Delta X - \Delta V(\theta ))| \nonumber \\{} & {} \quad =\sup _\theta |\Delta V(\theta )^\top \big \{(\hat{\sigma }_n-\sigma _0)\partial _\sigma S_{n,0}^{-1}\big \}(2\Delta X^c + 2\Delta V(\theta _0) - \Delta V(\theta ))| \nonumber \\{} & {} \quad \le \sup _\theta |2\Delta V(\theta )^\top \big \{(\hat{\sigma }_n-\sigma _0)\partial _\sigma S_{n,0}^{-1}\big \}\Delta X^c|\nonumber \\{} & {} \qquad +C\sup _\theta |\mathcal {D}^{-1/2}\Delta V(\theta )|^2\Vert \mathcal {D}^{1/2}\partial _\sigma S_{n,0}^{-1}\mathcal {D}^{1/2}\Vert |\hat{\sigma }_n-\sigma _0| \nonumber \\{} & {} \quad \le {|\hat{\sigma }_n-\sigma _0|\sup _\theta |2\Delta V(\theta )^\top \partial _\sigma S_{n,0}^{-1}\Delta X^c|+O_p(nh_n)\cdot O_p(n^{-1/2}).} \end{aligned}$$
(3.48)

For \(k\in \{0,1\}\) and \(q\ge 1\), the Burkholder–Davis–Gundy inequality, Lemma 3.5 and a similar estimate to (3.10) yield

$$\begin{aligned}{} & {} \sup _\theta E_\Pi [|\partial _\theta ^k \Delta V(\theta )^\top \partial _\sigma S_{n,0}^{-1} \Delta X^c|^q]^{1/q} \nonumber \\{} & {} \quad \le C_q\sup _\theta \sum _{l=1}^2E_\Pi \bigg [\bigg |\sum _i[\partial _\sigma S_{n,0}^{-1}\partial _\theta ^k\Delta V(\theta )]_{i+(l-1)M_1}\Delta _i^l X^c\bigg |^q\bigg ]^{1/q} \nonumber \\{} & {} \quad \le C_q \sup _\theta \sum _{l=1}^2\bigg (\sum _i [\partial _\sigma S_{n,0}^{-1}\partial _\theta ^k\Delta V(\theta )]_{i+(l-1)M_1}^2|I_i^l|\bigg )^{1/2} \nonumber \\{} & {} \quad = C_q\sup _\theta \big (\partial _\theta ^k\Delta V(\theta )^\top \partial _\sigma S_{n,0}^{-1}\mathcal {D}\partial _\sigma S_{n,0}^{-1}\partial _\theta ^k\Delta V(\theta )\big )^{1/2} \nonumber \\{} & {} \quad \le C_q\sqrt{nh_n}(1-\bar{\rho }_n)^{-2}. \end{aligned}$$
(3.49)

Together with (3.4), (3.41), (3.48), and Sobolev’s inequality, we have

$$\begin{aligned} \sup _\theta |\Delta V(\theta )^\top \big \{(\hat{\sigma }_n-\sigma _0)\partial _\sigma S_{n,0}^{-1}\big \}(2\Delta X - \Delta V(\theta ))|=o_p(\sqrt{nh_n}). \end{aligned}$$
(3.50)

Then, (3.47) and (3.50) yield (3.43).

Applying (3.43) with a general \(\theta \) and with \(\theta =\theta _0\), we have

$$\begin{aligned}{} & {} H_n^2(\theta )-H_n^2(\theta _0) \nonumber \\{} & {} \quad =\Delta (V(\theta )-V(\theta _0))^\top S_{n,0}^{-1}\Delta X^c+\frac{1}{2}\Delta V(\theta )^\top S_{n,0}^{-1}(2\Delta V(\theta _0) - \Delta V(\theta )) \nonumber \\{} & {} \qquad -\frac{1}{2}\Delta V(\theta _0)^\top S_{n,0}^{-1}\Delta V(\theta _0)+\sqrt{nh_n}{\dot{e}}_n(\theta ) \nonumber \\{} & {} \quad =\Delta (V(\theta )-V(\theta _0))^\top S_{n,0}^{-1}\Delta X^c -\frac{1}{2}\Delta (V(\theta )-V(\theta _0))^\top S_{n,0}^{-1}\Delta (V(\theta )-V(\theta _0)) +\sqrt{nh_n}{\dot{e}}_n(\theta ),\nonumber \\ \end{aligned}$$
(3.51)

and hence, by similar estimates to (3.49), we have

$$\begin{aligned} \sup _\theta \bigg |H_n^2(\theta )-H_n^2(\theta _0)+\frac{1}{2}\Delta (V(\theta ) - V(\theta _0))^\top S_{n,0}^{-1}\Delta (V(\theta ) - V(\theta _0))\bigg |=O_p(\sqrt{nh_n}). \end{aligned}$$
(3.52)

Then, (3.5), (3.17), and a similar argument to (3.16) yield

$$\begin{aligned}{} & {} \Delta (V(\theta ) - V(\theta _0))^\top S_{n,0}^{-1}\Delta (V(\theta ) - V(\theta _0)) \\{} & {} \quad =\Delta (V(\theta ) - V(\theta _0))^\top \tilde{\mathcal {D}}^{-1/2}(\sigma _0)\sum _{p=0}^\infty \left( \begin{array}{cc} (\tilde{G}\tilde{G}^\top )^p &{} -(\tilde{G}\tilde{G}^\top )^p\tilde{G}\\ -(\tilde{G}^\top \tilde{G})^p\tilde{G}^\top &{} (\tilde{G}^\top \tilde{G})^p \end{array} \right) \tilde{\mathcal {D}}^{-1/2}(\sigma _0)\Delta (V(\theta ) - V(\theta _0)) \\{} & {} \quad =\sum _{p=0}^\infty \sum _{k=1}^{q_n}\dot{\rho }_{k,0}^{2p}\bigg \{\sum _{l=1}^2(\phi _{l,s_{k-1}})^2\dot{\mathfrak {I}}_{k,l}^\top \dot{\mathcal {A}}_{k,p}^l\dot{\mathfrak {I}}_{k,l} -2\dot{\rho }_{k,0}\phi _{1,s_{k-1}}\phi _{2,s_{k-1}}\dot{\mathfrak {I}}_{k,1}^\top \dot{\mathcal {A}}_{k,p}^1 G_k\dot{\mathfrak {I}}_{k,2}\bigg \}+nh_ne_n, \end{aligned}$$

where \(\dot{\mathfrak {I}}_{k,l}=\mathcal {E}_{(k)}^l\dot{\mathfrak {I}}_l\). Together with (A3), (A5), (3.52), and a similar argument to (3.22), we obtain

$$\begin{aligned} \sup _\theta \big |(nh_n)^{-1}(H_n^2(\theta )-H_n^2(\theta _0)) - \mathcal {Y}_2(\theta )\big |\overset{P}{\rightarrow }0 \end{aligned}$$
(3.53)

as \(n\rightarrow \infty \). Similar estimates for \((nh_n)^{-1}\partial _\theta ^k(H_n^2(\theta )-H_n^2(\theta _0))\) \((k\in \{1,2,3\})\) yield the conclusion. \(\square \)

Proposition 3.17

Assume (A1)–(A6). Then,  \(\hat{\theta }_n\overset{P}{\rightarrow }\theta _0\) as \(n\rightarrow \infty .\)

Proof

By Lemma 3.5, we have

$$\begin{aligned} \mathcal {D}^{1/2}S_{n,0}^{-1}\mathcal {D}^{1/2}\ge \Vert \mathcal {D}^{-1/2}S_{n,0}\mathcal {D}^{-1/2}\Vert ^{-1} \mathcal {E}_M\ge C\mathcal {E}_M. \end{aligned}$$
(3.54)
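The first inequality in (3.54) is an instance of the elementary fact that, for a symmetric positive definite matrix B,

$$\begin{aligned} B\le \Vert B\Vert \mathcal {E}_M \quad \Longrightarrow \quad B^{-1}\ge \Vert B\Vert ^{-1}\mathcal {E}_M, \end{aligned}$$

applied to \(B=\mathcal {D}^{-1/2}S_{n,0}\mathcal {D}^{-1/2}\).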

Therefore, together with (3.15) and (3.16), we obtain

$$\begin{aligned}{} & {} -\frac{1}{2}\Delta (V(\theta ) - V(\theta _0))^\top S_{n,0}^{-1}\Delta (V(\theta ) - V(\theta _0)) \nonumber \\{} & {} \quad \le -C\Delta (V(\theta ) - V(\theta _0))^\top \mathcal {D}^{-1}\Delta (V(\theta ) - V(\theta _0)) \nonumber \\{} & {} \quad =-C\sum _i|I_i|^{-1}\bigg (\int _{I_i}(\mu _t^{\psi (i)}(\theta )-\mu _t^{\psi (i)}(\theta _0)){\textrm{d}}t\bigg )^2 \nonumber \\{} & {} \quad = -C\sum _{k=1}^{q_n}\sum _i (\mu _{s_{k-1}}^{\psi (i)}(\theta )-\mu _{s_{k-1}}^{\psi (i)}(\theta _0))^2|I_i\cap J^k|+nh_ne_n \nonumber \\{} & {} \quad = -C\int _0^{T_n}|\mu _t(\theta )-\mu _t(\theta _0)|^2{\textrm{d}}t+nh_ne_n. \end{aligned}$$
(3.55)

Hence, we have

$$\begin{aligned} \mathcal {Y}_2(\theta )\le -C\limsup _{T\rightarrow \infty }\bigg (\frac{1}{T}\int _0^T{|\mu _t(\theta )-\mu _t(\theta _0)|^2{\textrm{d}}t}\bigg ). \end{aligned}$$
(3.56)

Assumption (A6) yields that for any \(\theta \in \Theta _2\)

$$\begin{aligned} \mathcal {Y}_2(\theta )\le 0, \quad \textrm{and} \quad \mathcal {Y}_2(\theta )=0 \quad \mathrm{if~and~only~if} \quad \theta =\theta _0; \end{aligned}$$
(3.57)

Then, (3.42) and (3.57), together with an estimate similar to (3.24), yield the conclusion. \(\square \)

3.5 Asymptotic normality of \(\hat{\theta }_n\)

Proof of Theorem 2.3

By the definition of \(H_n^2(\theta )\), we obtain

$$\begin{aligned} \partial _\theta H_n^2(\theta _0)=\partial _\theta \Delta V(\theta _0)^\top S_n^{-1}(\hat{\sigma }_n)\bar{X}(\theta _0) =\partial _\theta \Delta V(\theta _0)^\top S_n^{-1}(\hat{\sigma }_n)\Delta X^c. \end{aligned}$$

By a similar argument to the derivation of (3.43), we can replace \(S_n^{-1}(\hat{\sigma }_n)\) on the right-hand side of the above equation by \(S_{n,0}^{-1}\), with an approximation error of \(o_p(\sqrt{nh_n})\). Then, we have

$$\begin{aligned} \partial _\theta H_n^2(\theta _0) =\partial _\theta \Delta V(\theta _0)^\top S_{n,0}^{-1}\Delta X^c+o_p(\sqrt{nh_n}). \end{aligned}$$

Let

$$\begin{aligned} \dot{\mathcal {X}}_k=\frac{1}{\sqrt{nh_n}} \partial _\theta \Delta V(\theta _0)^\top S_{n,0}^{-1}\Delta ^{(k)} X^c \end{aligned}$$

for \(1\le k\le L_n\). Then, we have

$$\begin{aligned} (nh_n)^{-1/2}\partial _\theta H_n^2(\theta _0)= \sum _{k=1}^{L_n}\dot{\mathcal {X}}_k +o_p(1). \end{aligned}$$
(3.58)

Lemma 3.5 and a similar argument to (3.10) yield

$$\begin{aligned} \sum _{k=1}^{L_n}E_k[\dot{\mathcal {X}}_k^4]= & {} \frac{3}{n^2h_n^2}\sum _{k=1}^{L_n}\big \{\partial _\theta \Delta V(\theta _0)^\top S_{n,0}^{-1}S_{n,0}^{(k)}S_{n,0}^{-1}\partial _\theta \Delta V(\theta _0)\big \}^2 \\\le & {} \frac{C}{n^2h_n^2}|\mathcal {D}^{-1/2}\Delta \partial _\theta V(\theta _0)|^2\Vert \mathcal {D}^{1/2}S_{n,0}^{-1}\mathcal {D}^{1/2}\Vert ^2 \sum _{k=1}^{L_n}\Vert \mathcal {D}^{-1/2}S_{n,0}^{(k)}\mathcal {D}^{-1/2}\Vert \\\le & {} \frac{CL_n}{nh_n}(1-\bar{\rho }_n)^{-2} \overset{P}{\rightarrow }0. \end{aligned}$$

Moreover, (3.5), (A5), and a similar argument to the proof of Proposition 3.8 yield

$$\begin{aligned} \sum _{k=1}^{L_n}E_k[\dot{\mathcal {X}}_k^2]= & {} \frac{1}{nh_n}\sum _{k=1}^{L_n}\sum _{i_1,j_1}\sum _{i_2,j_2}[S_{n,0}^{-1}]_{i_1,j_1}[S_{n,0}^{-1}]_{i_2,j_2} \Delta _{i_1} \partial _\theta V(\theta _0)\Delta _{i_2} \partial _\theta V(\theta _0)[S_{n,0}^{(k)}]_{j_1,j_2} \\= & {} \frac{1}{nh_n}\Delta \partial _\theta V(\theta _0)^\top S_{n,0}^{-1}S_{n,0}S_{n,0}^{-1}\Delta \partial _\theta V(\theta _0) \\= & {} \frac{1}{nh_n}\sum _{p=0}^\infty \sum _{k=1}^{q_n}\dot{\rho }_{k,0}^{2p}\bigg \{\sum _{l=1}^2\partial _\theta \phi _{l,s_{k-1}}^2(\theta _0)\mathfrak {I}_l^\top \dot{\mathcal {A}}_{k,p}^l\mathfrak {I}_l\\{} & {} -2\dot{\rho }_{k,0}\partial _\theta \phi _{1,s_{k-1}}\partial _\theta \phi _{2,s_{k-1}}(\theta _0)\mathfrak {I}_1^\top \dot{\mathcal {A}}_{k,p}^1G\mathfrak {I}_2\bigg \}+e_n \\{} & {} \overset{P}{\rightarrow }\Gamma _2. \end{aligned}$$

Therefore, (3.58) and the martingale central limit theorem (Corollary 3.1 of Hall and Heyde (1980) and the remark following it) yield

$$\begin{aligned} (nh_n)^{-1/2}\partial _\theta H_n^2(\theta _0)=\sum _{k=1}^{L_n}\dot{\mathcal {X}}_k+o_p(1)\overset{d}{\rightarrow }N(0,\Gamma _2). \end{aligned}$$
(3.59)

By (3.56) and (A6), there exists a positive constant c, such that \(\mathcal {Y}_2(\theta )\le -c|\theta -\theta _0|^2\). Then, \(\Gamma _2=-\partial _\theta ^2\mathcal {Y}_2(\theta _0)\) is positive definite, since \(\mathcal {Y}_2(\theta _0)=0\) and \(\partial _\theta \mathcal {Y}_2(\theta _0)=0\).

Therefore, estimates similar to those in Sect. 3.3, the P-tightness of \(\{(nh_n)^{-1}\sup _\theta |\partial _\theta ^3H_n^2(\theta )|\}_n\), and the convergence \(-(nh_n)^{-1}\partial _\theta ^2H_n^2(\theta _0)\overset{P}{\rightarrow }\Gamma _2\) yield

$$\begin{aligned} \sqrt{T_n}(\hat{\theta }_n-\theta _0)\overset{d}{\rightarrow }N(0,\Gamma _2^{-1}). \end{aligned}$$

(3.40) and a similar equation for \(\sqrt{nh_n}(\hat{\theta }_n-\theta _0)\) yield

$$\begin{aligned} (\sqrt{n}(\hat{\sigma }_n-\sigma _0),\sqrt{T_n}(\hat{\theta }_n-\theta _0))= & {} (n^{-1/2}\Gamma _1^{-1}\partial _\sigma H_n^1(\sigma _0),T_n^{-1/2}\Gamma _2^{-1}\partial _\theta H_n^2(\theta _0))+o_p(1) \nonumber \\= & {} \sum _{k=1}^{L_n}(\Gamma _1^{-1}\mathcal {X}_k,\Gamma _2^{-1}\dot{\mathcal {X}}_k)+o_p(1). \end{aligned}$$
(3.60)

Then, since \(\sum _{k=1}^{L_n}E_k[\mathcal {X}_k\dot{\mathcal {X}}_k]=0\), we obtain

$$\begin{aligned} (\sqrt{n}(\hat{\sigma }_n-\sigma _0), \sqrt{nh_n}(\hat{\theta }_n-\theta _0))\overset{d}{\rightarrow }N(0,\Gamma ^{-1}). \end{aligned}$$

\(\square \)

3.6 Proofs of the results in Sects. 2.3 and 2.4

Proof of Theorem 2.5

Let \((\sigma _{tu},\theta _{tu})=(\sigma _0,\theta _0)+t\epsilon _n u\) for \(u\in \mathbb {R}^d\) and \(t\in [0,1]\), and let

$$\begin{aligned} H_n(\sigma ,\theta )=-\frac{1}{2}\bar{X}(\theta )^\top S_n^{-1}(\sigma )\bar{X}(\theta ) -\frac{1}{2}\log \det S_n(\sigma ). \end{aligned}$$

Then, we have

$$\begin{aligned}{} & {} H_n(\sigma _u,\theta _u)-H_n(\sigma _0,\theta _0) \\{} & {} \quad ={u^\top \epsilon _n}\int _0^1 \partial _\alpha H_n(\sigma _{tu},\theta _{tu}){\textrm{d}}t \\{} & {} \quad =u^\top \epsilon _n \partial _\alpha H_n(\sigma _0,\theta _0) +\frac{1}{2}u^\top \epsilon _n \partial _\alpha ^2 H_n(\sigma _0,\theta _0)\epsilon _n u \\{} & {} \qquad +\sum _{i,j,k}\int _0^1\frac{(1-s)^2}{2}\partial _{\alpha _i}\partial _{\alpha _j}\partial _{\alpha _k} H_n(\sigma _{su},\theta _{su}){\textrm{d}}s [\epsilon _n u]_i[\epsilon _n u]_j[\epsilon _n u]_k, \end{aligned}$$

where \((\sigma _u,\theta _u)=(\sigma _{1u},\theta _{1u})\).

By similar arguments to Propositions 3.8 and 3.14, and Sects. 3.4 and 3.5, we obtain

$$\begin{aligned} \sum _{i,j,k}\int _0^1\frac{(1-s)^2}{2}\partial _{\alpha _i}\partial _{\alpha _j}\partial _{\alpha _k} H_n(\sigma _{su},\theta _{su}){\textrm{d}}s [\epsilon _n u]_i[\epsilon _n u]_j[\epsilon _n u]_k\overset{P}{\rightarrow } & {} 0, \\ \Delta _n:=\epsilon _n \partial _\alpha H_n(\sigma _0,\theta _0)\overset{d}{\rightarrow } & {} N(0,\mathcal {E}_d), \\ -\epsilon _n \partial _\alpha ^2 H_n(\sigma _0,\theta _0) \epsilon _n\overset{P}{\rightarrow } & {} \mathcal {E}_d. \end{aligned}$$
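Combining the three convergences above, the expansion reads

$$\begin{aligned} H_n(\sigma _u,\theta _u)-H_n(\sigma _0,\theta _0)=u^\top \Delta _n-\frac{1}{2}u^\top \mathcal {E}_du+o_p(1), \end{aligned}$$

which is the local asymptotic normality property of Theorem 2.5.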

Therefore, we have the desired conclusion. \(\square \)

Remark 3.18

We can show that \((\hat{\sigma }_n,\hat{\theta }_n)\) is a regular estimator by the proof of Theorem 2.5, (3.60), and Theorem 2 in Jeganathan (1982).

Outline of the proof of Proposition 2.7

The proof is similar to the proof of Proposition 6 in Ogihara and Yoshida (2014). P-tightness of \(\{h_nM_{l,q_n+1}\}_{n=1}^\infty \) immediately follows from (B1-1). Fix \(1\le j\le q_n\). Then, using the mixing property (2.4) for \(\mathcal {N}_t^{n,l}\), we obtain the following result: there exists \(\eta >0\), such that for any \(q\ge 4\), there exists \(C_q>0\) that does not depend on j, such that

$$\begin{aligned} E\big [\big |h_n\textrm{tr}(\mathcal {E}_{(j)}^1(GG^\top )^p)-E[h_n\textrm{tr}(\mathcal {E}_{(j)}^1(GG^\top )^p)]\big |^q\big ] \le C_q(p+1)^{q-1}h_n^{q\eta }. \end{aligned}$$

(The above inequality corresponds to (31) in Ogihara and Yoshida (2014). It is obtained by setting \(b_n=h_n^{-1}\), \(t_k=s_{j-1}+k[h_n^{-1}]^{-1}(s_j-s_{j-1})\) for \(0\le k\le [h_n^{-1}]\), and \(X'_k=\textrm{tr}(\mathcal {E}_{(j,k)}^1(GG^\top )^p)1_{A_{k,b_n^{\delta '}}^p}-E[\textrm{tr}(\mathcal {E}_{(j,k)}^1(GG^\top )^p)1_{A_{k,b_n^{\delta '}}^p}]\) in the proof of Proposition 6 in Ogihara and Yoshida (2014), where \(\mathcal {E}_{(j,k)}^l\) is the \(M_l\times M_l\) matrix satisfying \([\mathcal {E}_{(j,k)}^l]_{ii'}=1\) if \(i=i'\) and \(\sup I_i^l\in (t_{k-1},t_k]\), and \([\mathcal {E}_{(j,k)}^l]_{ii'}=0\) otherwise.)

Therefore, by taking q sufficiently large that \(nh_n^{1+q\eta }\rightarrow 0\), we have

$$\begin{aligned}{} & {} E\bigg [\max _{1\le j\le q_n}\big |h_n\textrm{tr}(\mathcal {E}_{(j)}^1(GG^\top )^p)-E[h_n\textrm{tr}(\mathcal {E}_{(j)}^1(GG^\top )^p)]\big |^q\bigg ] \\{} & {} \quad \le E\bigg [\sum _{j=1}^{q_n}\big |h_n\textrm{tr}(\mathcal {E}_{(j)}^1(GG^\top )^p)-E[h_n\textrm{tr}(\mathcal {E}_{(j)}^1(GG^\top )^p)]\big |^q\bigg ] \\{} & {} \quad =O({q_n} \cdot h_n^{q\eta })\rightarrow 0. \end{aligned}$$

Here, we used that for any partition \((s_k)_{k=0}^\infty \in \mathfrak {S}\), we have \(q_n\le nh_n/\epsilon +1\) with \(\epsilon =\inf _{k\ge 1}|s_k-s_{k-1}|>0\), which implies \(q_n=O(nh_n)\), as the following worked bound shows.
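Indeed, assuming \(s_0=0\) and \(s_{q_n-1}\le T_n\) (this is how the index \(q_n\) is tied to the horizon \(T_n=nh_n\)),

$$\begin{aligned} \epsilon (q_n-1)\le \sum _{k=1}^{q_n-1}(s_k-s_{k-1})=s_{q_n-1}\le nh_n, \end{aligned}$$

that is, \(q_n\le nh_n/\epsilon +1\). Together with the assumptions, we obtain the conclusion.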

Outline of the proof of Proposition 2.8

Similarly to the previous proposition, using the idea of the proof of Proposition 6 in Ogihara and Yoshida (2014) and the mixing property (2.4) for \(\mathcal {N}_t^{n,l}\), we find that there exists \(\eta >0\) such that for any \(q\ge 4\), there exists \(C_q>0\) such that

$$\begin{aligned} E\Big [\big |\mathfrak {I}_1^\top \mathcal {E}_{(j)}^1(GG^\top )^p\mathfrak {I}_1-E[\mathfrak {I}_1^\top \mathcal {E}_{(j)}^1(GG^\top )^p\mathfrak {I}_1]\big |^q\Big ]\le C_q(p+1)^{q-1}h_n^{q\eta } \end{aligned}$$

for \(1\le j\le q_n\). Here, \(b_n\) and \(t_k\) are defined as in the previous proposition, and

$$\begin{aligned} X'_k=[h_n]^{-1}\mathfrak {I}_1^\top \mathcal {E}_{(j,k)}^1(GG^\top )^p\mathfrak {I}_1 1_{A_{k,b_n^{\delta '}}^p}-E\big [[h_n]^{-1}\mathfrak {I}_1^\top \mathcal {E}_{(j,k)}^1(GG^\top )^p\mathfrak {I}_1 1_{A_{k,b_n^{\delta '}}^p}\big ]. \end{aligned}$$

Together with the assumptions and similar estimates for \(\mathfrak {I}_1^\top \mathcal {E}_{(j)}^1(GG^\top )^pG\mathfrak {I}_2\) and \(\mathfrak {I}_2^\top \mathcal {E}_{(j)}^2(G^\top G)^p\mathfrak {I}_2\), we obtain the conclusion.

Outline of the proof of Proposition 2.9

We can show the results by an approach similar to the proof of Proposition 9 in Ogihara and Yoshida (2014). Roughly speaking, under (B2-q), the probability \(P(\mathcal {N}_{t+Nh_n}^{n,l}-\mathcal {N}_t^{n,l}=0)\) is small enough to control the denominators in

$$\begin{aligned} \sum _{i,j}\frac{|I_i^1\cap I_j^2|^2}{|I_i^1||I_j^2|} \end{aligned}$$

for sufficiently large N. Then, we obtain estimates for the numerators using the elementary inequality \(x_1^2+\cdots +x_n^2\ge R^2/n\), valid whenever \(x_1+\cdots +x_n=R\).
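The latter inequality is a direct consequence of the Cauchy–Schwarz inequality:

$$\begin{aligned} R^2=\bigg (\sum _{i=1}^nx_i\bigg )^2\le n\sum _{i=1}^nx_i^2. \end{aligned}$$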

Proof of Lemma 2.10

We only show

$$\begin{aligned} \max _{1\le k\le q_n}|h_nE[\textrm{tr}(\mathcal {E}_{(k)}^1(GG^\top )^p)]-a_p^1(s_k-s_{k-1})|\rightarrow 0. \end{aligned}$$

The other results are similarly obtained.

Condition (2.4) is satisfied because \(\alpha _k^n\le c_1e^{-c_2k}\) for some positive constants \(c_1\) and \(c_2\).

Let \(\bar{\tau }_i^l\) be the i-th jump time of \(\bar{\mathcal {N}}^l\). Then, we have \(S_i^{n,l}=h_n\bar{\tau }_i^l\). Let \(\bar{G}\) be the matrix of infinite size defined by

$$\begin{aligned}{}[\bar{G}]_{ij}=\frac{|(\bar{\tau }_{i-1}^1,\bar{\tau }_i^1]\cap (\bar{\tau }_{j-1}^2,\bar{\tau }_j^2]|}{\sqrt{\bar{\tau }_i^1-\bar{\tau }_{i-1}^1}\sqrt{\bar{\tau }_j^2-\bar{\tau }_{j-1}^2}} \end{aligned}$$

for \(i,j\ge 1\).
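For intuition, the entries of \(\bar{G}\) can be computed directly from the two jump-time sequences. The following Python sketch is an illustrative helper, not part of the paper's procedure: the infinite matrix is truncated to its leading \(m\times m\) block, and the arrays tau1 and tau2 are assumed to contain at least \(m+1\) jump times with tau[0] = 0.

import numpy as np

def gbar_matrix(tau1, tau2, m):
    """Leading m-by-m block of the infinite matrix bar-G.

    tau1, tau2: increasing arrays of jump times of bar-N^1, bar-N^2,
    with tau[0] = 0, so that the i-th interval is (tau[i-1], tau[i]].
    """
    G = np.zeros((m, m))
    for i in range(1, m + 1):
        for j in range(1, m + 1):
            # Lebesgue measure of (tau1[i-1], tau1[i]] intersected
            # with (tau2[j-1], tau2[j]]
            overlap = max(0.0, min(tau1[i], tau2[j]) - max(tau1[i - 1], tau2[j - 1]))
            G[i - 1, j - 1] = overlap / np.sqrt(
                (tau1[i] - tau1[i - 1]) * (tau2[j] - tau2[j - 1])
            )
    return G

For example, with homogeneous Poisson sampling one may take tau1 = np.concatenate(([0.0], np.cumsum(np.random.default_rng(0).exponential(1.0, 200)))), similarly for tau2, and call gbar_matrix(tau1, tau2, 50).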

For \(k\in \mathbb {N}\), let

$$\begin{aligned} \mathfrak {G}_k^p=\sum _{i;\bar{\tau }_{i-1}^1\in [k-1,k)}[(\bar{G}\bar{G}^\top )^p]_{ii}, \quad \mathfrak {G}_k^{n,p}=\sum _{i;S_{i-1}^{n,1}\in [(k-1)h_n,kh_n)}[(GG^\top )^p]_{ii}. \end{aligned}$$
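Continuing the sketch above (same hypothetical helper and import; for rows near the truncation boundary, the diagonal entries of the truncated \((\bar{G}\bar{G}^\top )^p\) only approximate their infinite-matrix counterparts), \(\mathfrak {G}_k^p\) can be evaluated as follows.

def frak_g(G, tau1, k, p):
    # Sum of the diagonal entries of (G G^T)^p over rows i whose
    # interval starts in [k-1, k), i.e., tau1[i-1] in [k-1, k).
    M = np.linalg.matrix_power(G @ G.T, p)
    return sum(
        M[i - 1, i - 1]
        for i in range(1, G.shape[0] + 1)
        if k - 1 <= tau1[i - 1] < k
    )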

The following idea is based on Section 7.5 of Ogihara and Yoshida (2014). Roughly speaking, if there are sufficiently many observations around the interval \([k-1,k)\), we can apply the mixing property of \(\bar{\mathcal {N}}_t^{l}\) to \(\mathfrak {G}_k^p\). On the following sets \(A_{k,r}^p\) and \(\bar{A}_{k,r}^p\), we have sufficiently many observations of \(\mathcal {N}^{n,l}\) and \(\bar{\mathcal {N}}^l\), respectively. Let \(\bar{\Delta }_{j,t}^r U=U_{t+rj}-U_{t+r(j-1)}\) for a stochastic process \((U_t)_{t\ge 0}\), and let

$$\begin{aligned} A_{k,r}^p= & {} \bigcap _{l=1,2}\bigg \{\bigcap _{\begin{array}{c} 1\le j\le 2p+1 \\ t_k+rjh_n\le T_n \end{array}}\{\bar{\Delta }_{j,t_k}^{rh_n}\mathcal {N}^{n,l}>0\} \cap \bigcap _{\begin{array}{c} -2p\le j\le 0 \\ t_{k-1}+r(j-1)h_n\ge 0 \end{array}}\{\bar{\Delta }_{j,t_{k-1}}^{rh_n}\mathcal {N}^{n,l}>0\}\bigg \}, \nonumber \\ \bar{A}_{k,r}^p= & {} \bigcap _{l=1,2}\bigg \{\bigcap _{1\le j\le 2p+1}\{\bar{\Delta }_{j,k}^{r}\bar{\mathcal {N}}^l>0\} \cap \bigcap _{\begin{array}{c} -2p\le j\le 0 \\ k-1+r(j-1)\ge 0 \end{array}}\{\bar{\Delta }_{j,k-1}^r\bar{\mathcal {N}}^l>0\}\bigg \}. \end{aligned}$$
(3.61)

Then, we obtain

$$\begin{aligned} E[\mathfrak {G}_k^p1_{\bar{A}_{k,r}^p}]= & {} E[\mathfrak {G}_{k'}^p1_{\bar{A}_{k',r}^p}] \quad \textrm{if}\ k\wedge k'\ge rp+1, \\ E[\mathfrak {G}_k^{n,p}1_{A_{k,r}^p}]= & {} E[\mathfrak {G}_{k'}^{n,p}1_{A_{k',r}^p}] \quad \textrm{if}\ rp+1\le k,k'\le n-rp. \end{aligned}$$

We also have \(P((\bar{A}_{k,r}^p)^c)\le C(p+1)r^{-q}\) by (B2-q). Hence, for any \(\epsilon >0\), there exists \(r>0\) such that

$$\begin{aligned} P((\bar{A}_{k,r}^p)^c)<\epsilon /2. \end{aligned}$$
(3.62)
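Combining (3.62) with the stationarity relations above and the Cauchy–Schwarz inequality, we obtain, for \(k\wedge k'\ge rp+1\),

$$\begin{aligned} |E[\mathfrak {G}_k^p]-E[\mathfrak {G}_{k'}^p]|\le E[\mathfrak {G}_k^p1_{(\bar{A}_{k,r}^p)^c}]+E[\mathfrak {G}_{k'}^p1_{(\bar{A}_{k',r}^p)^c}]\le 2\sup _kE[(\mathfrak {G}_k^p)^2]^{1/2}\sqrt{\epsilon /2}, \end{aligned}$$

provided \(\sup _kE[(\mathfrak {G}_k^p)^2]<\infty \), which the moment assumptions guarantee.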

Therefore, \(\{E[\mathfrak {G}_k^p]\}_k\) is a Cauchy sequence, and hence, the limit \(a_p^1=\lim _{k\rightarrow \infty }E[\mathfrak {G}_k^p]\) exists for \(p\in \mathbb {N}\). Moreover, we see the existence of

$$\begin{aligned} a_0^l=\lim _{k\rightarrow \infty }E[\bar{\mathcal {N}}_k^l-\bar{\mathcal {N}}_{k-1}^l]=E[\bar{\mathcal {N}}_1^l-\bar{\mathcal {N}}_0^l] \end{aligned}$$

for \(l\in \{1,2\}\).

Furthermore, for any \(\epsilon >0\), there exists \(r>0\), such that

$$\begin{aligned} P((\bar{A}_{k,r}^p)^c)<\epsilon \quad \textrm{and} \quad |E[\mathfrak {G}_k^p]-a_p^1|<\epsilon \end{aligned}$$
(3.63)

for \(k\ge [rp]\). We also have

$$\begin{aligned} E[\mathfrak {G}_k^p1_{\bar{A}_{k,r}^p}]=E[\mathfrak {G}_k^{n,p}1_{A_{k,r}^p}] \end{aligned}$$
(3.64)

for \(rp+1\le k\le n-rp\), since \(\sup I_i^l=S_i^{n,l}=h_n\bar{\tau }_i^l\), and hence

$$\begin{aligned} \sup I_i^l\in (s_{j-1},s_j] \quad \Longleftrightarrow \quad \bar{\tau }_i^l\in (h_n^{-1}s_{j-1},h_n^{-1}s_j]. \end{aligned}$$

Let \(r_j=[h_n^{-1}s_j]\). Then, since \(|\mathfrak {G}_k^{n,p}|\le \sum _{i;S_{i-1}^{n,1}\in [(k-1)h_n,kh_n)}1\), whose expectation is at most \(E[\bar{\mathcal {N}}_1^1]\) by stationarity, (3.63), (3.64), and the Cauchy–Schwarz inequality yield

$$\begin{aligned}{} & {} |h_n(s_j-s_{j-1})^{-1}E[\textrm{tr}(\mathcal {E}_{(j)}^1(GG^\top )^p)]-a_p^1| \\{} & {} \quad \le \bigg |h_n(s_j-s_{j-1})^{-1}E\bigg [\sum _{k=r_{j-1}+1}^{r_j}\mathfrak {G}_k^{n,p}\bigg ]-a_p^1\bigg |+2h_n(s_j-s_{j-1})^{-1}E[\bar{\mathcal {N}}_1^1] \\{} & {} \quad \le \bigg |\frac{1}{r_j-r_{j-1}}E\bigg [\sum _{k=r_{j-1}+1}^{r_j}\mathfrak {G}_k^{n,p}\bigg ]-a_p^1\bigg |+Ch_n(s_j-s_{j-1})^{-1} \\{} & {} \quad \le \frac{1}{r_j-r_{j-1}}\sum _{k=r_{j-1}+1}^{r_j}\big |E[\mathfrak {G}_k^{n,p}1_{A_{k,r}^p}]+E[\mathfrak {G}_k^{n,p}1_{(A_{k,r}^p)^c}]-a_p^1\big |+Ch_n(s_j-s_{j-1})^{-1} \\{} & {} \quad \le \frac{1}{r_j-r_{j-1}}\sum _{k=r_{j-1}+1}^{r_j}\big (\big |E[\mathfrak {G}_k^p]-a_p^1\big |+2E[(\bar{\mathcal {N}}_1^1)^2]^{1/2}\sqrt{\epsilon }\big )+Ch_n(s_j-s_{j-1})^{-1} \\{} & {} \quad \le \epsilon +2E[(\bar{\mathcal {N}}_1^1)^2]^{1/2}\sqrt{\epsilon }+Ch_n(s_j-s_{j-1})^{-1} \end{aligned}$$

for \(1<j<q_n\). To obtain the corresponding inequality for \(j=1\) and \(j=q_n\), we replace the summation range of k in the above inequality with the range from \(r_{j-1}+[rp]+2\) to \(r_j\) when \(j=1\), and with the range from \(r_{j-1}+1\) to \(r_j-[rp]-1\) when \(j=q_n\). The boundedness of \(\{E[h_nM_{l,q_n+1}]\}_{n\in \mathbb {N}}\) is shown using the same techniques. Then, we have the conclusion. \(\square \)