# A note on error estimation for hypothesis testing problems for some linear SPDEs

## Abstract

The aim of the present paper is to estimate and control the Type I and Type II errors of a simple hypothesis testing problem for the drift/viscosity coefficient of the stochastic fractional heat equation driven by additive noise. Assuming that one path of the first \(N\) Fourier modes of the solution is observed continuously over a finite time interval \([0,T]\), we propose a new class of rejection regions and provide computable thresholds for \(T\) and \(N\) that guarantee that the statistical errors are smaller than a given upper bound. The considered tests are of likelihood ratio type. The main ideas, and the proofs, are based on sharp large deviation bounds. Finally, we illustrate the theoretical results by numerical simulations.

## Keywords

Hypothesis testing for SPDE, Likelihood ratio, Maximum likelihood estimator, Error estimates, Sharp large deviation, Fractional heat equation, Additive space-time white noise

## Mathematics Subject Classification

60H15, 35Q30, 65L09

## 1 Introduction

Under the assumption that one path of the first \(N\) Fourier modes of the solution of a Stochastic Partial Differential Equation (SPDE) is observed continuously over a finite time interval, the parameter estimation problem for the drift coefficient has been studied by several authors, starting with the seminal paper [5]. Consistency and asymptotic normality of the MLE type estimators are well understood, at least for equations driven by additive noise; see for instance the survey paper [9] for linear SPDEs, and [3] for nonlinear equations, and references therein. Generally speaking, the statistical inference theory for SPDEs has not gone far beyond the fundamental properties of MLEs, although important and interesting classes of SPDEs driven by various noises have been studied. The first attempt to study the hypothesis testing problem for SPDEs is due to [4], where we investigated the simple hypothesis for the drift/viscosity coefficient for the stochastic fractional heat equation driven by additive noise, white in time and colored in space. Therein, we established ‘the proper asymptotic classes’ of tests for which one can find ‘asymptotically the most powerful tests’, that is, tests with the fastest speed of error convergence. Moreover, we provided explicit forms of such tests in two asymptotic regimes: large time asymptotics \(T\rightarrow \infty \), and increasing number of Fourier modes \(N\rightarrow \infty \). By its very nature, the theory developed in [4] is based on asymptotic behavior, \(T,N\rightarrow \infty \), and a follow-up question is how large \(T\) or \(N\) should be taken so that the Type I and Type II errors of these tests are smaller than a given threshold. The main goal of this paper is to develop feasible methods to estimate and control the Type I and Type II errors when \(T\) and \(N\) are finite.
Similar to [4], we are interested in Likelihood Ratio type rejection regions \(R_{T}=\{U_T^N: \ln L(\theta _0,\theta _1,U_T^N)\ge \eta T\}\) and \(R_{N}=\{U_T^N: \ln L(\theta _0,\theta _1,U_T^N)\ge \zeta M_N\}\), where \(U_T^N\) is the projected solution on the space generated by the first \(N\) Fourier modes, \(L\) is the likelihood ratio, \(M_N\) is a constant that depends on the first \(N\) eigenvalues of the Laplacian, and \(\eta ,\zeta \) are some constants that depend on \(T\) and \(N\). We will derive explicit expressions for \(\eta \) and \(\zeta \), and thresholds for \(T\), and respectively for \(N\), that will guarantee that the corresponding statistical errors are smaller than a given upper bound. However, this comes at the cost that these tests are no longer the most powerful in the class of tests proposed in [4]. The key ideas, and the proofs of main results, are based on sharp large deviation principles (both in time and spectral spatial component) developed in [4]. On top of the theoretical part, we also present some numerical experiments as a coarse verification of the main theorems. We find some bounds for the numerical approximation errors, which will also serve as a preliminary effort in studying the statistical inference problems for SPDEs under discrete observations. Finally, we want to mention that the case of large \(T\) and \(N=1\) corresponds to the classical one-dimensional Ornstein–Uhlenbeck process, and even in this case, to the best of our knowledge, the obtained results are novel.

The paper is organized as follows. In Sect. 1.1 we set up the problem, introduce some necessary notations, and discuss why for the tests proposed in [4] it is hard to find explicit expressions for \(T\) and \(N\) in order to control the statistical errors. Since the sharp large deviation principles from [4] play a fundamental role in the derivation of the main results, in Sect. 1.2 we briefly present them here too. Section 2 is devoted to the case of large time asymptotics, with the number of observable Fourier modes \(N\) being fixed. We show how to choose \(T\) and \(\eta \) such that both Type I and Type II errors, associated with the rejection region \(R_T\), are bounded by a given threshold. Similarly, in Sect. 3 we study the case of large \(N\) while keeping the time horizon \(T\) fixed. In Sect. 4 we illustrate the theoretical results by means of numerical simulations. We start with the description of the numerical methods, and derive some error bounds for the numerical approximations. Subsequently, we show that while the thresholds for \(T,N\) derived in Sects. 2 and 3 are conservative, as one may expect, they still provide a robust practical framework for controlling the statistical errors. Finally, in Sect. 5 we discuss the advantages and drawbacks of the current results and briefly elaborate on possible theoretical and practical methods of solving some of the open problems.

### 1.1 Setup of the problem and some auxiliary results

In this section we will set up the main equation, briefly recall the problem settings of hypothesis testing for the drift coefficient, and present some needed results from [4]. Also here we give the motivations that lead to the proposed problems.

However, by their very nature of being asymptotic type results, one cannot assess how large \(T\) (or \(N\)) shall be taken to guarantee that the error is smaller than a desired tolerance. *The main goal of this manuscript is to investigate the corresponding error estimates for fixed values of* \(T\) *and* \(N\).

Due to lack of knowledge of the behavior of higher order terms in the above asymptotics, practically speaking, the above constants \(C_1\) and \(C_2\) cannot be easily determined. The case of large Fourier modes is especially intricate, since the asymptotic expansion of Type I error is done in terms of \(M\) rather than \(N\). *To overcome this technical problem, we propose a new test, which may not be asymptotically the most powerful, but which is convenient for the errors’ estimation. Moreover, we validate the obtained results by numerical simulations.*

### 1.2 Sharp large deviation principle

The main results presented in this paper, and the ideas behind them, rely on some results on sharp large deviation bounds obtained in [4]. While the sharp large deviation results for large time asymptotics \(T\rightarrow \infty \) are comparable in certain respects with those from Stochastic ODEs (cf. [2, 7, 8]), the results for a large number of Fourier modes \(N\rightarrow \infty \) are new, and by analogy we refer to them also as a sharp large deviation principle. For convenience, we will briefly present some of the needed results here too.

## 2 The case of large times

Next we present the first main result of this paper that shows how large \(T\) has to be so that the Type I error is smaller than a given tolerance level.

### **Theorem 2.1**

^{1}.

### *Proof*

Next we will study the estimation of Type II error, as time \(T\) goes to infinity.

### **Theorem 2.2**

### *Proof*

## 3 The case of large number of Fourier modes

### **Theorem 3.1**

- (i) If
  $$\begin{aligned} M&\ge -\frac{16\ln \alpha }{\theta _0T}\max \left\{ \frac{16\theta _0^2}{(\theta _1 -\theta _0)^2},1\right\} \quad \hbox {and}\\ \frac{M}{(N+1)^2}&\ge -\frac{4(1+\varrho )^2(\theta _1-\theta _0)^2\ln \alpha }{\varrho ^2(\theta _1+\theta _0)^2T\theta _{0}}, \end{aligned}\tag{3.3}$$
  where \(\varrho \) denotes a given threshold of error tolerance, then the Type I error has the following upper bound estimate
  $$\begin{aligned} \mathbb {P}^{N,T}_{\theta _0}\left( R^0_N\right) \le (1+\varrho )\alpha . \end{aligned}\tag{3.4}$$
- (ii) If
  $$\begin{aligned} M&\ge -\frac{16\ln \alpha }{\theta _0 T} \max \left\{ \frac{(\theta _1^2+16\theta _0^2)}{(\theta _1-\theta _0)^2},1\right\} \quad \hbox {and}\\ \frac{M}{(N+1)^2}&\ge -\frac{4(1+\varrho )^2(\theta _1-\theta _0)^2\ln \alpha }{\varrho ^2(\theta _1+\theta _0)^2T\theta _{0}}, \end{aligned}\tag{3.5}$$
  we have the following estimate for Type II error
  $$\begin{aligned} 1-\mathbb {P}^{N,T}_{\theta _1}\left( R^0_N\right) \le (1+\varrho )\exp \left( -\frac{(\theta _1-\theta _0)^2}{16\theta _0^2}MT\right) . \end{aligned}\tag{3.6}$$

The proof is similar^{2} to the proofs of Theorem 2.1 and Theorem 2.2, and we omit it here^{3}.
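The two inequalities in (3.3) can be checked numerically for a candidate \(N\). The following is a minimal sketch in Python (not the authors' MATLAB code), under the assumption that \(M=\sum _{k=1}^N\lambda _k^{2\beta }\) with \(\lambda _k=k\); the precise definition of \(M\) is not restated in this section, so this form is illustrative only. The function names `conditions_3_3` and `smallest_N` are hypothetical.

```python
import math


def conditions_3_3(M, N, T, alpha, rho, theta0, theta1):
    """Return True if both inequalities in (3.3) hold for the given M and N."""
    log_alpha = math.log(alpha)  # negative for 0 < alpha < 1
    # First inequality in (3.3): lower bound on M.
    bound_M = -16.0 * log_alpha / (theta0 * T) * max(
        16.0 * theta0 ** 2 / (theta1 - theta0) ** 2, 1.0)
    # Second inequality in (3.3): lower bound on M / (N + 1)^2.
    bound_ratio = (-4.0 * (1.0 + rho) ** 2 * (theta1 - theta0) ** 2 * log_alpha
                   / (rho ** 2 * (theta1 + theta0) ** 2 * T * theta0))
    return M >= bound_M and M / (N + 1) ** 2 >= bound_ratio


def smallest_N(T, alpha, rho, theta0, theta1, beta=1.0, N_max=20000):
    """Search for the smallest N <= N_max satisfying (3.3); None if there is none.

    ASSUMPTION: M = sum_{k=1}^N lambda_k^(2*beta) with lambda_k = k.
    """
    M = 0.0
    for N in range(1, N_max + 1):
        M += float(N) ** (2.0 * beta)  # running value of the assumed M
        if conditions_3_3(M, N, T, alpha, rho, theta0, theta1):
            return N
    return None
```

With \(\beta =1\) the ratio \(M/(N+1)^2\) grows with \(N\), so the search terminates once \(N\) is large enough. With \(\beta =1/2\) and \(d=1\) the ratio stays bounded, and `smallest_N` may return `None`, which is the situation described in footnote 3.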

## 4 Numerical experiments

In this section we give a simple illustration of the theoretical results from the previous sections by means of numerical simulations. Besides showing the behavior of Type I and Type II errors for the test \(R^0\) proposed in this paper, we will also display the simulation results for the \(R^\sharp \) test mentioned in Sect. 1.1 and discussed in [4]. We start with a description of the numerical scheme used for simulation of trajectories of the solution (more precisely of the Fourier modes), and provide a brief argument on the error estimates of the corresponding Monte Carlo experiments associated with this scheme. In the second part of the section, we focus on the numerical interpretation of the theoretical results obtained in Sects. 2 and 3.

^{4} to numerically approximate the trajectories of the Fourier modes \(u_k(t)\) given by Eq. (1.2), and we apply the Monte Carlo method to estimate the Type I and Type II errors. We partition the time interval \([0,T]\) into \(n\) equally spaced time intervals \(0=t_0<t_1<\cdots <t_n=T\), with \(\Delta T=T/n = t_{i}-t_{i-1}\), for \(1\le i\le n\). Let \(m\) denote the number of trials in the Monte Carlo experiment of each Fourier mode. Assume that \(u_k^j(t_i)\) is the true value of the \(k\)-th Fourier mode at time \(t_i\) of the \(j\)-th trial in the Monte Carlo simulation. Then, for every \(1\le k\le N\), \(1\le j \le m\), we approximate \(u_k^j(t_i)\) according to the following recursion formula

Throughout this section we consider Eq. (1.1), and consequently (4.1), with \(\beta =1\), in one dimensional space \(d=1\), with the random forcing term being the space-time white noise \(\gamma =0, \ \sigma =1\). We also assume that the spatial domain is \(G=[0,\pi ]\) and the initial value is \(U_0=0\). In this case \(\lambda _k=k, \ k\in \mathbb {N}\). We fix the parameter of interest to be \(\theta _0=0.1\) and \(\theta _1=0.2\). The general case is treated analogously; the authors feel that a complete and detailed analysis of the numerical results is beyond the scope of the current publication. The numerical simulations presented here are intended to show a simple analysis of the proposed methods. We performed simulations for other sets of parameters, and the numerical results were in concordance with the theoretical ones. For example, for the case of large times, if one increases \(N\), then the statistical errors reach the threshold for smaller values of \(T\); that is, more information improves the rate of convergence. Similarly, when increasing \(T\) for the case of asymptotics in \(N\), one needs fewer Fourier modes to bypass the threshold of the statistical errors. Different ranges and magnitudes of the parameter of interest \(\theta \) were considered, and the outcomes are similar to those presented below. All simulations and computations are done in MATLAB and the source code is available from the authors upon request.
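To make the simulation setup concrete, here is a minimal sketch in Python (the authors' own code is in MATLAB and is not reproduced here). It rests on two assumptions that are not confirmed by the text above: that each Fourier mode follows the Ornstein–Uhlenbeck dynamics \(du_k=-\theta \lambda _k^{2\beta }u_k\,dt+\sigma \lambda _k^{-\gamma }\,dw_k\), and that an explicit Euler–Maruyama step is an acceptable stand-in for the recursion (4.1), which may differ. The function name `simulate_modes` is hypothetical.

```python
import math
import random


def simulate_modes(theta, N, T, n, m, beta=1.0, gamma=0.0, sigma=1.0, seed=0):
    """Simulate m trials of the first N Fourier modes on a grid of n time steps.

    ASSUMPTIONS (not taken from the paper's Eq. (4.1)):
      * mode dynamics du_k = -theta*lambda_k^(2*beta)*u_k dt + sigma*lambda_k^(-gamma) dw_k,
      * lambda_k = k and zero initial value, as in the setup of this section,
      * explicit Euler-Maruyama time stepping.
    Returns paths with paths[j][k][i] ~ u_{k+1}(t_i) in trial j.
    """
    rng = random.Random(seed)
    dt = T / n
    sqrt_dt = math.sqrt(dt)
    lam = [float(k) for k in range(1, N + 1)]
    drift = [theta * l ** (2.0 * beta) for l in lam]
    noise = [sigma * l ** (-gamma) for l in lam]
    paths = []
    for _ in range(m):
        trial = []
        for k in range(N):
            u = [0.0]  # U_0 = 0
            for _step in range(n):
                dw = rng.gauss(0.0, sqrt_dt)  # Brownian increment over one step
                u.append(u[-1] - drift[k] * u[-1] * dt + noise[k] * dw)
            trial.append(u)
        paths.append(trial)
    return paths
```

The explicit step is only stable when \(\theta \lambda _k^{2\beta }\Delta T\) is small, so for large \(N\) the time step has to shrink accordingly or an exact Ornstein–Uhlenbeck transition should be used instead.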

### 4.1 Description and analysis of the numerical experiments

Table 1: Type I error for various time steps \(\Delta T\) (or number of time steps \(n\))

\(\Delta T\) | 1 | 0.9 | 0.8 | 0.7 | 0.6 | 0.5 | 0.4 | 0.3 | 0.2 |
---|---|---|---|---|---|---|---|---|---|
\(n\) | 100 | 111 | 125 | 143 | 167 | 200 | 250 | 333 | 500 |
\(\widetilde{\mathcal {P}}_{\theta _0}^{m,n,N,T}(R_T^0)\) | 0.0475 | 0.0375 | 0.0342 | 0.0283 | 0.0239 | 0.0202 | 0.0165 | 0.0157 | 0.0129 |
\(\widetilde{\mathcal {P}}_{\theta _0}^{m,n,N,T}(R_T^\sharp )\) | 0.0975 | 0.0897 | 0.0802 | 0.0746 | 0.0686 | 0.0620 | 0.0566 | 0.0515 | 0.0503 |

\(\Delta T\) | 0.1 | 0.09 | 0.08 | 0.07 | 0.06 | 0.05 | 0.04 | 0.03 | 0.02 |
---|---|---|---|---|---|---|---|---|---|
\(n\) | 1,000 | 1,111 | 1,250 | 1,429 | 1,667 | 2,000 | 2,500 | 3,333 | 5,000 |
\(\widetilde{\mathcal {P}}_{\theta _0}^{m,n,N,T}(R_T^0)\) | 0.0102 | 0.0111 | 0.0099 | 0.0101 | 0.0096 | 0.0108 | 0.0089 | 0.0078 | 0.0088 |
\(\widetilde{\mathcal {P}}_{\theta _0}^{m,n,N,T}(R_T^\sharp )\) | 0.0453 | 0.0416 | 0.0443 | 0.0413 | 0.0428 | 0.0401 | 0.0421 | 0.0400 | 0.0385 |

As shown in Fig. 1, the value of \(\widetilde{\mathcal {P}}_{\theta _0}^{m,n,N,T}(R_T^0)\), and respectively \(\widetilde{\mathcal {P}}_{\theta _0}^{m,n,N,T}(R_T^\sharp )\), rapidly decays (approximately up to the point when \(n=1000\) or \(\Delta T = 0.1\)), and then it steadily approaches a certain ‘asymptotic level’, which, as suggested by (4.6), should be the true value of \(\mathbb {P}_{\theta _0}^{N,T}(R_{T}^0)\) (or \(\mathbb {P}_{\theta _0}^{N,T}(R_{T}^\sharp )\)). This assumes a reasonably large value of \(m\), in our case \(m=20,000\). When \(\Delta T\) gets smaller, we notice small fluctuations around that ‘asymptotic level’, which are errors induced by the Monte Carlo method, and one can increase the number of trials to locate that true value more precisely. In our case the fluctuations are negligible compared to the order of \(\alpha \).
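The size of the Monte Carlo fluctuations can be gauged with the textbook binomial standard error of an estimated proportion. The sketch below is only a rough check under the assumption of independent trials; it is not the bound (4.6), and the function names are hypothetical.

```python
import math


def mc_standard_error(p_hat, m):
    """Binomial standard error of a proportion estimated from m independent trials."""
    return math.sqrt(p_hat * (1.0 - p_hat) / m)


def trials_needed(p_hat, half_width, z=1.96):
    """Number of trials so that z standard errors fit within half_width.

    z = 1.96 corresponds to an approximate 95% normal confidence interval.
    """
    return math.ceil(z ** 2 * p_hat * (1.0 - p_hat) / half_width ** 2)


# Example: an estimated Type I error near 0.01 with m = 20,000 trials.
se = mc_standard_error(0.01, 20000)
```

For an error probability near \(0.01\) and \(m=20,000\), this gives a standard error of roughly \(7\times 10^{-4}\), which is small relative to \(\alpha \).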

^{5}that for some \(\nu \ge 0\),

Similar results are obtained for the approximation of \(\mathbb {P}^{N,T}_{\theta _0} (R_N^\sharp )\) and the Type II errors \(\mathbb {P}^{N,T}_{\theta _1} (R_N^0)\), \(\mathbb {P}^{N,T}_{\theta _1} (R_N^\sharp )\), \(\mathbb {P}^{N,T}_{\theta _1} (R_T^0)\) and \(\mathbb {P}^{N,T}_{\theta _1} (R_T^\sharp )\), and for brevity we will omit them here.

We conclude that the errors due to the numerical approximations considered above are negligible. Hence, the numerical methods we propose are suitable for our purposes of computing the statistical errors of \(R_T^0\), \(R_T^\sharp \), \(R_N^0\) and \(R_N^\sharp \) tests, and we will use them for derivation of all numerical results from the next sections.

### 4.2 Numerical tests for large times

Table 2: \(T=T_b^1\) given by Theorem 2.1 and Type I error for various \(\alpha \)

\(\alpha \) | 0.1 | 0.05 | 0.01 | 0.005 |
---|---|---|---|---|
\(T_b^1\) | 629 | 818 | 1258 | 1447 |
\(\mathbb {P}_{\theta _0}^{N,T}\left( R_{T}^0\right) \) | 0.021 | 0.010 | 0.0025 | 0.0015 |

Table 3: Type I error for various \(T\ge T_b^1\), with \(T_b^1\) as in Theorem 2.1

\(T\) | \(T_b^1\) | \(T_b^1+ T_\delta \) | \(T_b^1+ 2T_\delta \) | \(T_b^1+ 3T_\delta \) | \(T_b^1+ 4T_\delta \) | \(T_b^1+ 5T_\delta \) |
---|---|---|---|---|---|---|
\(\mathbb {P}_{\theta _0}^{N,T}\left( R_{T}^0\right) \) | 0.0100 | 0.0097 | 0.0105 | 0.0100 | 0.0105 | 0.0102 |
\(\mathbb {P}_{\theta _0}^{N,T}\left( R_{T}^\sharp \right) \) | 0.0540 | 0.0525 | 0.0505 | 0.0526 | 0.0512 | 0.0505 |

As already mentioned, the statistical test \(R^\sharp _T\) derived in [4], although asymptotically the most powerful in \(\mathcal {K}^\sharp _\alpha \), does not guarantee that the statistical errors are below the threshold for a fixed finite \(T\); they are smaller than \(\alpha \) only asymptotically. Indeed, as Table 3 shows, the Type I error for \(R^\sharp _T\) fluctuates around \(\alpha =0.05\), with no pattern. That was the very reason we proposed the tests \(R^0\).

Table 4: Type II errors for various \(T\); illustration of Theorem 2.2

\(T\) | 10 | 20 | 30 | 40 | 50 | 60 |
---|---|---|---|---|---|---|
\(\exp \left( -\frac{(\theta _1-\theta _0)^2}{16\theta _0^2}MT\right) \) | 0.4169 | 0.1738 | 0.0724 | 0.0302 | 0.0126 | 0.0052 |
\(1-\mathbb {P}_{\theta _1}^{N,T}\left( R_{T}^0\right) \) | 0.7155 | 0.3329 | 0.1148 | 0.0293 | 0.0070 | 0.0012 |
\(1-\mathbb {P}_{\theta _1}^{N,T}\left( R_{T}^\sharp \right) \) | 0.7946 | 0.2402 | 0.0457 | 0.0060 | 0.0006 | 0.0002 |

### 4.3 Numerical tests for large number of Fourier modes

Table 5: Type I errors for various \(N\); illustration of Theorem 3.1

\(N\) | \(10\) | \(20\) | \(30\) | \(40\) | \(50\) | \(60\) | \(70\) | \(80\) |
---|---|---|---|---|---|---|---|---|
\(\mathbb {P}_{\theta _0}^{N,T}\left( R_{N}^0\right) \) | 0.007 | 0.012 | 0.010 | 0.017 | 0.012 | 0.014 | 0.010 | 0.013 |
\(\mathbb {P}_{\theta _0}^{N,T}\left( R_{N}^\sharp \right) \) | 0.006 | 0.037 | 0.039 | 0.053 | 0.040 | 0.039 | 0.054 | 0.046 |

## 5 Concluding remarks

On discrete sampling. Eventually, in real life experiments, the random field would be measured/sampled on a discrete grid, both in time and in the spatial domain. It is true that the main results are based on continuous time sampling, and may appear as being mostly of theoretical interest. However, as argued in Sect. 4, the main ideas of this paper and [4] have a good prospect of being applied to the case of discrete sampling too. The error bounds for the numerical results presented herein contribute to the preliminary effort of studying the statistical inference problems for SPDEs in the discrete sampling framework. To the best of our knowledge, there are no results on statistical inference for SPDEs with fully discretely observed data (both in time and space). We outline here how to apply our results to discrete sampling, with rigorous proofs deferred to our future studies. If we assume that the first \(N\) Fourier modes are observed at some discrete time points, then, to apply the theory presented here, one essentially has to approximate some integrals, including some stochastic integrals, the convergence of which is well understood. Of course, the exact rates of convergence still need to be established. The connection between discrete observation in space and the approximation of Fourier coefficients is more intricate. A natural way is to use the discrete Fourier transform for such approximations. While it is intuitively clear that increasing the number of observed spatial points will allow the computation of a larger number of Fourier coefficients, it is less obvious, in our opinion, how to prove consistency of the estimators, asymptotic normality, and the corresponding properties for the hypothesis testing problem.
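The approximation of the integrals mentioned above can be sketched as follows, in Python. The sketch assumes that the relevant quantities are \(\int _0^Tu_k^2\,dt\) and the Itô integral \(\int _0^Tu_k\,du_k\), approximated by left-point sums on an equally spaced grid. The estimator in `mle_theta` is the standard MLE-type expression for the assumed mode dynamics \(du_k=-\theta \lambda _k^{2\beta }u_k\,dt+\sigma \lambda _k^{-\gamma }\,dw_k\) with \(\lambda _k=k\); it is an illustration, not a formula quoted from this paper, and all function names are hypothetical.

```python
def riemann_sum(u, dt):
    """Left-point Riemann sum approximating the integral of u(t)^2 over [0, T]."""
    return sum(x * x for x in u[:-1]) * dt


def ito_sum(u):
    """Left-point (Ito) sum approximating the stochastic integral of u du."""
    return sum(u[i] * (u[i + 1] - u[i]) for i in range(len(u) - 1))


def mle_theta(paths, dt, beta=1.0, gamma=0.0):
    """MLE-type estimate of theta from discretely observed Fourier modes.

    paths[k] holds the observations of mode k + 1 at t_0, ..., t_n.
    ASSUMPTION: lambda_k = k and the mode dynamics stated in the lead-in.
    """
    num = 0.0
    den = 0.0
    for k, u in enumerate(paths, start=1):
        lam = float(k)
        num += lam ** (2.0 * beta + 2.0 * gamma) * ito_sum(u)
        den += lam ** (4.0 * beta + 2.0 * gamma) * riemann_sum(u, dt)
    return -num / den
```

The left-point evaluation matters for the stochastic integral: a midpoint or right-point sum would converge to a different limit, so the Itô convention has to be kept when the sums replace the continuous-time integrals.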

On composite hypothesis. Despite the fact that simple hypothesis testing problems are rarely used in practice, the efforts of this work, as well as those from [4], should be seen as a starting point of a systematic study of general hypothesis testing problems and goodness of fit tests for stochastic evolution equations in infinite dimensional spaces. As pointed out in [4], the development of an ‘asymptotic theory’ for the composite hypothesis testing problem will follow naturally, and consequently one can extend the results of this paper to the case of composite tests.

## Footnotes

- 1.
Generally expected to be small, say less than 10 %. Smaller \(\varrho \) will yield larger \(T\), and the final choice is left to the observer.

- 2.
For most of the derivations one just needs to ‘exchange \(T\) with \(M\).’ The results are, in a sense, symmetric with respect to \(T\) and \(M\). In (3.3) and (3.5) we separate the conditions for \(N\) into two inequalities, since we want to place all the terms related to \(N\) on the left side of the inequalities.

- 3.
We need to point out that sometimes we may not be able to find \(N\) such that the conditions (3.3) and (3.5) are satisfied. For example, if \(\beta /d\le 1/2\) then \(M/(N+1)^2\) is bounded for all \(N\in \mathbb {N}\), and if its bound is smaller than the right hand side of the second inequality in (3.3) and (3.5), then the conditions (3.3) and (3.5) fail for all \(N\). However, for \(\beta /d\le 1/2\) we might still be able to control the Type I and Type II errors by finite \(N\), which requires a more technical proof and is deferred to future study.

- 4.
- 5.
As usual, the case of large \(N\) is more delicate and technically challenging compared to the case of large times. Apparently, (4.9) holds true for some positive \(\nu \). The sharpest value of \(\nu \) is not relevant for this paper, and we defer the derivation of it to future study.

## Notes

### Acknowledgments

We would like to thank the anonymous referees, the associate editor and the editor for their helpful comments and suggestions which improved greatly the final manuscript. Igor Cialenco acknowledges support from the NSF grant DMS-1211256.

## References

- 1. Bishwal, J.P.N.: Parameter Estimation in Stochastic Differential Equations. Lecture Notes in Mathematics, vol. 1923. Springer, Berlin (2008)
- 2. Bercu, B., Rouault, A.: Sharp large deviations for the Ornstein–Uhlenbeck process. Teor. Veroyatnost. i Primenen. **46**(1), 74–93 (2001)
- 3. Cialenco, I., Glatt-Holtz, N.: Parameter estimation for the stochastically perturbed Navier–Stokes equations. Stoch. Process. Appl. **121**(4), 701–724 (2011)
- 4. Cialenco, I., Xu, L.: Hypothesis testing for stochastic PDEs driven by additive noise. Preprint arXiv:1308.1900 (2013)
- 5. Huebner, M., Khasminskii, R., Rozovskii, B.L.: Two examples of parameter estimation for stochastic partial differential equations. Stochastic Processes, pp. 149–160. Springer, New York (1993)
- 6. Jentzen, A., Kloeden, P.E.: Taylor approximations for stochastic partial differential equations, vol. 83 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (2011)
- 7. Kutoyants, Yu.A.: Statistical Inference for Ergodic Diffusion Processes. Springer Series in Statistics. Springer, London (2004)
- 8. Lin'kov, Y.N.: Large deviation theorems for extended random variables and some applications. In: Proceedings of the 18th Seminar on Stability Problems for Stochastic Models, Part III (Hajdúszoboszló, 1997), vol. 93, pp. 563–573 (1999)
- 9. Lototsky, S.V.: Statistical inference for stochastic parabolic equations: a spectral approach. Publ. Mat. **53**(1), 3–45 (2009)