Abstract
The nonparametric regression model with correlated errors is a powerful tool for time series forecasting. We are interested in the estimation of such a model, where the errors follow an autoregressive and moving average (ARMA) process, and the covariates can also be correlated. Instead of estimating the constituent parts of the model in a sequential fashion, we propose a spline-based method to estimate the mean function and the parameters of the ARMA process jointly. We establish the desirable asymptotic properties of the proposed approach under mild regularity conditions. Extensive simulation studies demonstrate that our proposed method performs well and generates strong evidence supporting the established theoretical results. Our method provides a new addition to the arsenal of tools for analyzing serially correlated data. We further illustrate the practical usefulness of our method by modeling and forecasting the weekly natural gas scraping data for the state of Iowa.
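The joint-estimation idea summarized above can be sketched numerically. The toy example below is a hypothetical illustration, not the paper's implementation: it assumes ARMA(1,1) errors, an illustrative cubic truncated-power spline basis with arbitrary knot placement, a sine mean function, and a generic Nelder-Mead optimizer; it then minimizes the sum of squared innovations over the spline coefficients and the ARMA parameters jointly, rather than estimating them sequentially.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0.0, 1.0, n)

# ARMA(1,1) errors e_t = phi*e_{t-1} + z_t + theta*z_{t-1} (illustrative values)
phi0, theta0 = 0.5, 0.3
z = rng.normal(0.0, 0.2, n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = phi0 * e[t - 1] + z[t] + theta0 * z[t - 1]

y = np.sin(2 * np.pi * x) + e  # y_t = g0(X_t) + e_t with g0 illustrative

# hypothetical cubic truncated-power spline basis (not the paper's B-splines)
knots = np.linspace(0.1, 0.9, 5)
B = np.column_stack([x**k for k in range(4)] +
                    [np.clip(x - kn, 0.0, None) ** 3 for kn in knots])
J = B.shape[1]

def loss(xi):
    """Sum of squared innovations zeta_t(xi), computed recursively."""
    beta, phi, theta = xi[:J], xi[J], xi[J + 1]
    eps = y - B @ beta  # epsilon_t(beta)
    zeta, prev_eps, prev_zeta = np.zeros(n), 0.0, 0.0
    for t in range(n):
        zeta[t] = eps[t] - phi * prev_eps - theta * prev_zeta
        prev_eps, prev_zeta = eps[t], zeta[t]
    return float(np.sum(zeta**2))

# start from the ordinary least-squares spline fit, ARMA parameters at zero
xi0 = np.concatenate([np.linalg.lstsq(B, y, rcond=None)[0], [0.0, 0.0]])
fit = minimize(loss, xi0, method="Nelder-Mead",
               options={"maxiter": 2000, "fatol": 1e-10})
```

The point of the sketch is only the structure of the objective: a single criterion in which the mean function (through the spline coefficients) and the ARMA parameters enter jointly through the recursively computed innovations.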
References
Armstrong J.S., Collopy F. (1992) Error measures for generalizing about forecasting methods: empirical comparisons. International Journal of Forecasting, 8(1), 69–80
Bowerman B.L., O’Connell R.T., Koehler A.B. (2005) Forecasting, time series, and regression: an applied approach, 4th ed., Boston, MA: Brooks/Cole, Cengage Learning
Box G.E., Jenkins G.M., Reinsel G.C., Ljung G.M. (2016) Time series analysis: forecasting and control, 5th ed., Hoboken, New Jersey: John Wiley and Sons Inc.
Brockwell P.J., Davis R.A. (1991) Time series: theory and methods, 2nd ed., Springer Series in Statistics, New York: Springer
Carroll R.J., Fan J., Gijbels I., Wand M.P. (1997) Generalized partially linear single-index models. Journal of the American Statistical Association, 92(438), 477–489
Chernozhukov V., Chetverikov D., Kato K. (2013) Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. The Annals of Statistics, 41(6), 2786–2819
Davis R.A., Dunsmuir W.T. (1997) Least absolute deviation estimation for regression with ARMA errors. Journal of Theoretical Probability, 10(2), 481–497
Davis R.A., Knight K., Liu J. (1992) M-estimation for autoregressions with infinite variance. Stochastic Processes and Their Applications, 40(1), 145–180
De Boor C. (1978) A practical guide to splines, Vol. 27, New York: Springer
Durbán M., Currie I.D. (2003) A note on p-spline additive models with correlated errors. Computational Statistics, 18(2), 251–262
Fan J. (1993) Local linear regression smoothers and their minimax efficiencies. The Annals of Statistics, 21(1), 196–216
Ganesh E., Rajendran V., Ravikumar D., Kumar P.S., Revathy G., Harivardhan P. (2021) Detection and route estimation of ship vessels using linear filtering and ARMA model from AIS data. International Journal of Oceans and Oceanography 15(1), 1–10
Greenhouse J.B., Kass R.E., Tsay R.S. (1987) Fitting nonlinear models with ARMA errors to biological rhythm data. Statistics in Medicine 6(2), 167–183
Hall P., Heyde C.C. (2014) Martingale limit theory and its application, New York: Academic Press Inc
Hall P., Keilegom I. V. (2003) Using difference-based methods for inference in nonparametric regression with time series errors. Journal of the Royal Statistical Society. Series B, 65(2), 443–456
Hart J.D. (1994) Automated kernel smoothing of dependent data by using time series cross-validation. Journal of the Royal Statistical Society, Series B, 56(3), 529–542
Hart J.D., Wehrly T.E. (1986) Kernel regression estimation using repeated measurements data. Journal of the American Statistical Association, 81(396), 1080–1088
Hastie T.J., Tibshirani R.J. (1990) Generalized additive models, Boca Raton: Routledge
Huang J.Z. (2003) Local asymptotics for polynomial spline regression. The Annals of Statistics, 31(5), 1600–1635
Hyndman R.J., Koehler A.B., Ord J.K., Snyder R.D. (2008) Forecasting with exponential smoothing: the state space approach, Berlin: Springer-Verlag
Kohn R., Ansley C.F., Wong C.-M. (1992) Nonparametric spline regression with autoregressive moving average errors. Biometrika, 79(2), 335–346
Krivobokova T., Kauermann G. (2007) A note on penalized spline smoothing with correlated errors. Journal of the American Statistical Association, 102(480), 1328–1337
Lee Y.K., Mammen E., Park B.U. (2010) Bandwidth selection for kernel regression with correlated errors. Statistics, 44(4), 327–340
Liang H.-Y., Jing B.-Y. (2009) Asymptotic normality in partial linear models based on dependent errors. Journal of Statistical Planning and Inference, 139(4), 1357–1371
Merlevède F., Peligrad M., Rio E. (2011) A Bernstein type inequality and moderate deviations for weakly dependent sequences. Probability Theory and Related Fields, 151(3–4), 435–474
Miaou S.-P. (1990) A stepwise time series regression procedure for water demand model identification. Water Resources Research, 26(9), 1887–1897
Mokkadem A. (1988) Mixing properties of ARMA processes. Stochastic Processes and Their Applications, 29(2), 309–315
Opsomer J., Wang Y., Yang Y. (2001) Nonparametric regression with correlated errors. Statistical Science, 16(2), 134–153
Petropoulos F., Apiletti F., Assimakopoulos V., Babai M.Z., Barrow D.K., Ben Taieb S., Ziel F. et al. (2022) Forecasting: Theory and practice. International Journal of Forecasting, 38(3), 705–871
Qiu D., Shao Q., Yang L. (2013) Efficient inference for autoregressive coefficients in the presence of trends. Journal of Multivariate Analysis, 114, 40–53
Roussas G.G., Tran L.T. (1992) Asymptotic normality of the recursive kernel regression estimate under dependence conditions. The Annals of Statistics 20(1), 98–120
Roussas G.G., Tran L.T., Ioannides D.A. (1992) Fixed design regression for time series: Asymptotic normality. Journal of Multivariate Analysis 40(2), 262–291
Serra P., Krivobokova T., Rosales F. (2018) Adaptive non-parametric estimation of mean and autocovariance in regression with dependent errors. arXiv preprint arXiv:1812.06948
Shao Q., Yang L. (2011) Autoregressive coefficient estimation in nonparametric analysis. Journal of Time Series Analysis 32(2), 587–597
Shao Q., Yang L. (2017) Oracally efficient estimation and consistent model selection for auto-regressive moving average time series with trend. Journal of the Royal Statistical Society Series B 79(2), 507–524
Stone C.J. (1980) Optimal rates of convergence for nonparametric estimators. The Annals of Statistics, 8(6), 1348–1360
Stone C.J. (1986) The dimensionality reduction principle for generalized additive models. The Annals of Statistics 14(2), 590–606
Straumann D., Mikosch T. (2006) Quasi-maximum-likelihood estimation in conditionally heteroscedastic time series: A stochastic recurrence equations approach. The Annals of Statistics 34(5), 2449–2495
Tibshirani R. (1996) Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288
Tran L., Roussas G., Yakowitz S., Van B.T. (1996) Fixed-design regression for linear time series. The Annals of Statistics, 24(3), 975–991
Truong Y.K. (1991) Nonparametric curve estimation with time series errors. Journal of Statistical Planning and Inference, 28(2), 167–183
Truong-Van B., Bru N. (2001) Asymptotic normality of spline estimator when the errors are a linear stationary process. Journal of Nonparametric Statistics, 13(5), 741–761
Van de Geer S., Bühlmann P., Ritov Y., Dezeure R. (2014) On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics, 42(3), 1166–1202
Volkonskii V., Rozanov Y.A. (1959) Some limit theorems for random functions. I. Theory of Probability & Its Applications, 4(2), 178–197
Wu R., Wang Q. (2012) Shrinkage estimation for linear regression with ARMA errors. Journal of Statistical Planning and Inference, 142(7), 2136–2148
Zhou S., Shen X., Wolfe D. (1998) Local asymptotics for regression splines and confidence regions. The Annals of Statistics, 26(5), 1760–1782
Zinde-Walsh V., Galbraith J.W. (1991) Estimation of a linear regression model with stationary ARMA(p, q) errors. Journal of Econometrics, 47(2–3), 333–357
Acknowledgements
The authors thank the two anonymous referees for their invaluable comments and suggestions, which have significantly improved the quality of the paper. The natural gas scrape data were obtained by the second author through the research contract (Grant 5040224) between the Applied Mathematics Laboratory of Towson University and Exelon Generation Company LLC. This work was partially supported by the National Institutes of Health grants R03AG067611 and R21AG070659, and the National Science Foundation grant DMS-1952486.
The online version of this article contains supplementary material.
Appendix
1.1 A. The main results and proofs
We present the main results in this section. The proofs for Theorem 1 and Theorem 2 are provided. The remaining proofs are all relegated to the supplementary materials.
Because \(\mathcal {L}_{n}({\varvec{\xi }})=\sum _{t=1}^{n} \zeta _{t}^{2}({\varvec{\xi }})\) is not convex in \({\varvec{\xi }}\), owing to the MA component \({\varvec{\theta }}\), we study the asymptotic properties of \({\hat{{\varvec{\xi }}}}\) via a second-order Taylor expansion of \(\zeta _{t}({\varvec{\xi }})\) around \({\varvec{\xi }}_{*}\) (Davis and Dunsmuir, 1997): \(\zeta _{t}({\varvec{\xi }})\approx \zeta _{t}({\varvec{\xi }}_{*})-\textbf{D}_{t}^{{\top }}({\varvec{\xi }}_{*})({\varvec{\xi }}-{\varvec{\xi }}_{*})-({\varvec{\xi }}-{\varvec{\xi }}_{*})^{{\top }}\textbf{H}_{t}({\varvec{\xi }}_{*})({\varvec{\xi }}-{\varvec{\xi }}_{*})/2\), where \(\textbf{D}_{t}({\varvec{\xi }})=-\partial \zeta _{t}({\varvec{\xi }})/\partial {\varvec{\xi }}\) and \(\textbf{H}_{t}({\varvec{\xi }})=-\partial ^{2}\zeta _{t}({\varvec{\xi }})/(\partial {\varvec{\xi }}\partial {\varvec{\xi }}^{{\top }})\).
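The innovations and their derivatives admit simple recursions; as an illustrative sanity check (assuming an ARMA(1,1) error specification, with an arbitrary stand-in series for \(\epsilon _{t}({\varvec{\beta }})\) and an assumed finite-difference step), the sketch below computes \(\zeta _{t}\) via \(\zeta _{t}=\epsilon _{t}-\phi \epsilon _{t-1}-\theta \zeta _{t-1}\) and verifies numerically that \(-\partial \zeta _{t}/\partial \phi\) satisfies the recursion \(d_{t}=\epsilon _{t-1}-\theta d_{t-1}\), i.e., \({\varvec{\theta }}^{-1}(B)\epsilon _{t-1}\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
eps = rng.normal(size=n)   # stand-in for epsilon_t(beta); purely illustrative
phi, theta = 0.4, 0.25     # illustrative ARMA(1,1) parameters

def innovations(phi, theta):
    """zeta_t via zeta_t = eps_t - phi*eps_{t-1} - theta*zeta_{t-1}, zeta_0 = eps_0."""
    zeta = np.zeros(n)
    for t in range(n):
        zeta[t] = eps[t] - (phi * eps[t - 1] + theta * zeta[t - 1] if t > 0 else 0.0)
    return zeta

# analytic derivative D_{t2} = -d zeta_t / d phi, via d_t = eps_{t-1} - theta*d_{t-1}
d = np.zeros(n)
for t in range(1, n):
    d[t] = eps[t - 1] - theta * d[t - 1]

# central finite difference of zeta_t in phi (step size assumed)
h = 1e-6
fd = -(innovations(phi + h, theta) - innovations(phi - h, theta)) / (2 * h)
err = np.max(np.abs(fd - d))
```

Since \(\zeta _{t}\) is affine in \(\phi\) for fixed \(\theta\), the central difference here is exact up to floating-point roundoff, so `err` is tiny.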
We decompose \(\textbf{D}_{t}({\varvec{\xi }})\) as \((\textbf{D}_{t1}({\varvec{\xi }}), \textbf{D}_{t2}({\varvec{\xi }}), \textbf{D}_{t3}({\varvec{\xi }}))^{{\top }}\), such that \(\textbf{D}_{t1}({\varvec{\xi }})=-\partial \zeta _{t}({\varvec{\xi }})/\partial {\varvec{\beta }}\), \(\textbf{D}_{t2}({\varvec{\xi }})=-\partial \zeta _{t}({\varvec{\xi }})/\partial {\varvec{\phi }}\), and \(\textbf{D}_{t3}({\varvec{\xi }})=-\partial \zeta _{t}({\varvec{\xi }})/\partial {\varvec{\theta }}\), and partition \(\textbf{H}_{t}({\varvec{\xi }})\) as follows:
where \(\textbf{H}_{t,11}({\varvec{\xi }})=-\partial ^{2}\zeta _{t}({\varvec{\xi }})/\partial {\varvec{\beta }}\partial {\varvec{\beta }}^{{\top }}\) is a zero \(J\times J\) matrix, \(\textbf{H}_{t,12}({\varvec{\xi }})=-\partial ^{2}\zeta _{t}({\varvec{\xi }})/\partial {\varvec{\beta }}\partial {\varvec{\phi }}^{{\top }}\) is a \(J\times p\) matrix, \(\textbf{H}_{t,13}({\varvec{\xi }})=-\partial ^{2}\zeta _{t}({\varvec{\xi }})/\partial {\varvec{\beta }}\partial {\varvec{\theta }}^{{\top }}\) is a \(J\times q\) matrix, \(\textbf{H}_{t,21}({\varvec{\xi }})=\textbf{H}_{t,12}^{{\top }}({\varvec{\xi }})\), \(\textbf{H}_{t,22}({\varvec{\xi }})=-\partial ^{2}\zeta _{t}({\varvec{\xi }})/\partial {\varvec{\phi }}\partial {\varvec{\phi }}^{{\top }}\) is a zero \(p\times p\) matrix, \(\textbf{H}_{t,23}({\varvec{\xi }})=-\partial ^{2}\zeta _{t}({\varvec{\xi }})/\partial {\varvec{\phi }}\partial {\varvec{\theta }}^{{\top }}\) is a \(p\times q\) matrix, \(\textbf{H}_{t,31}({\varvec{\xi }})=\textbf{H}_{t,13}^{{\top }}({\varvec{\xi }})\), \(\textbf{H}_{t,32}({\varvec{\xi }})=\textbf{H}_{t,23}^{{\top }}({\varvec{\xi }})\), and \(\textbf{H}_{t,33}({\varvec{\xi }})=-\partial ^{2}\zeta _{t}({\varvec{\xi }})/\partial {\varvec{\theta }}\partial {\varvec{\theta }}^{{\top }}\) is a \(q\times q\) matrix.
Let \([\textbf{A}]_{l}\) denote the \(l^{\text {th}}\) element of the vector \(\textbf{A}\). By simple algebra, we obtain that \(\textbf{D}_{t1}({\varvec{\xi }})={\varvec{\theta }}^{-1}(B){\varvec{\phi }}(B)\textbf{W}_{t}\), \(\left[ \textbf{D}_{t2}({\varvec{\xi }})\right] _{l}={\varvec{\theta }}^{-1}(B)\epsilon _{t-l}({\varvec{\beta }}), 1\le l\le p\), \(\left[ \textbf{D}_{t3}({\varvec{\xi }})\right] _{l}={\varvec{\theta }}^{-1}(B)\zeta _{t-l}({\varvec{\xi }})\), \(1\le l\le q\),
Furthermore, let \(\textbf{V}_{t}\) be a symmetric matrix of dimension \((J+p+q)\times (J+p+q)\), whose upper triangular elements are given as
We partition \(\textbf{V}_{t}\) as follows:
where \(\textbf{V}_{t,11}\) is a \(J\times J\) matrix, \(\textbf{V}_{t,12}\) is a \(J\times p\) matrix, \(\textbf{V}_{t,13}\) is a \(J\times q\) matrix, \(\textbf{V}_{t,22}\) is a \(p\times p\) matrix, \(\textbf{V}_{t,23}\) is a \(p\times q\) matrix, and \(\textbf{V}_{t,33}\) is a \(q\times q\) matrix. By the definition, \(\textbf{V}_{t,11}=\textbf{0}\) and \(\textbf{V}_{t,22}=\textbf{0}\).
In addition, let \(R_{t}=(g_{0}(X_{t})-{\varvec{\beta }}_{*}^{{\top }}\textbf{B}(X_t))1\{t>0\}=(\epsilon _{t}({\varvec{\beta }}_{*})-\epsilon _{t})1\{t>0\}\) be the spline approximation error at time t. In the following Proposition 1, we show that \(\textbf{D}_{t}({\varvec{\xi }}_{*})\) and \(\textbf{H}_{t}({\varvec{\xi }}_{*})\) are well approximated by \(\textbf{Q}_{t}\) and \(\textbf{V}_{t}\), respectively.
Proposition 1
Suppose Conditions (C1)–(C4) hold. There exist constants \(\delta _1\) and \(\delta _2\) such that, for all \(\Vert {\varvec{\beta }}-{\varvec{\beta }}_{*}\Vert \le \delta _{1}\) and \(\Vert ({\varvec{\phi }}^{{\top }},{\varvec{\theta }}^{{\top }})-({\varvec{\phi }}_{*}^{{\top }},{\varvec{\theta }}_{*}^{{\top }})\Vert \le \delta _{2}\),
-
(i)
\(\left| \zeta _{t}\right| \le \eta _{t}\), \(|\zeta _{t}({\varvec{\xi }}_{*})-{\varvec{\phi }}_{*}(B){\varvec{\theta }}_{*}^{-1}(B)R_{t}-\zeta _{t}|\le r^{t}\eta _{0}\), \(|\zeta _{t}({\varvec{\xi }})|\le \eta _{t}+C_{2}(\varDelta +\delta _{1})\), and \(\left| \zeta _{t}({\varvec{\xi }})-\zeta _{t}({\varvec{\xi }}_{*})\right| \le C_{3}\delta _{2}\eta _{t}+C_{2}C_{3}\delta _{2}(\delta _{1}+\varDelta )+C_{2}\delta _{1}\),
-
(ii)
\(\left\| \textbf{D}_{t}({\varvec{\xi }}) \right\| _{\infty }\le \omega _{t}\), \(\textbf{D}_{t1}({\varvec{\xi }}_{*})-\textbf{Q}_{t1}=\textbf{0}\), and \(\left\| \left( \textbf{D}_{t2}^{{\top }}({\varvec{\xi }}_{*}),\textbf{D}_{t3}^{{\top }}({\varvec{\xi }}_{*})\right) -(\textbf{Q}_{t2}^{{\top }},\textbf{Q}_{t3}^{{\top }}) \right\| _{\infty }\le r^{t}\eta _{0}+C_{2}\varDelta\),
-
(iii)
\(\left\| \textbf{H}_{t}({\varvec{\xi }})\right\| _{\max }\le \omega _{t}\), \(\textbf{H}_{t,11}({\varvec{\xi }}_{*})-\textbf{V}_{t,11}=\textbf{0}\), and \(\left\| \textbf{H}_{t}({\varvec{\xi }}_{*})-\textbf{V}_{t}\right\| _{\max } \le r^{t}\eta _{0}+C_{2}\varDelta\),
where \(\eta _{t}=C_{1}\sum _{j=0}^{\infty }r^{j}\left| \epsilon _{t-j}\right|\), \(\omega _{t}= \max \left\{ C_{2}, r^{-(p+q)}\eta _{t}+C_{2}\left( \varDelta +\delta _{1}\right) \right\}\), and \(C_{3}\) is defined in Lemma 7.
Proposition 1 indicates that \(\textbf{D}_{t}({\varvec{\xi }}_{*})\) and \(\textbf{H}_{t}({\varvec{\xi }}_{*})\) can be approximated by \(\textbf{Q}_{t}\) and \(\textbf{V}_{t}\), respectively. Moreover, if \({\varvec{\xi }}\) is sufficiently close to the true parameter vector \({\varvec{\xi }}_{*}\), then \(\Vert \textbf{D}_{t}({\varvec{\xi }})\Vert _{\infty }\) and \(\Vert \textbf{H}_{t}({\varvec{\xi }})\Vert _{\max }\) are bounded, and the difference between \(\zeta _{t}({\varvec{\xi }})\) and \(\zeta _{t}({\varvec{\xi }}_{*})\) is also well controlled.
To circumvent the non-convexity of \(T(\textbf{h})\) with respect to \(\textbf{h}\), we study a convex objective function
To facilitate the investigation of the property of \(T_{1}(\textbf{h})\), two extra terms, \(T_{2}(\textbf{h})\) and \(T_{3}(\textbf{h})\), are introduced for the theoretical development
which are investigated in Lemmas 4–6 to bridge the gap between \(T_{1}(\textbf{h})\) and \(T(\textbf{h})\). Note that, because these terms involve unknown quantities such as \(\textbf{Q}_{t}\) and \(R_{t}\), they cannot be computed in practice.
In light of Lemmas 4–6, we first establish that \(T_{1}(\textbf{h})\) approximates \(T(\textbf{h})\) well. Define \(\varOmega (C):=\{\textbf{h}: \Vert \textbf{h}_{1}\Vert \le CJn^{-1/2}, \Vert \left( \textbf{h}_{2}^{{\top }},\textbf{h}_{3}^{{\top }} \right) \Vert \le CJ^{1/2}n^{-1/2} \}\) for any \(C>0\). We use \({\bar{\varOmega }}(C)\) and \(\varOmega ^{c}(C)\) to denote the boundary and the complement of \(\varOmega (C)\), respectively.
Proposition 2
Suppose Conditions (C1)–(C4) hold. If \(J=n^{1/(2\alpha +1)}\), for any \(C>0\),
Proposition 2 is inspired by Davis and Dunsmuir (1997). It shows that \(T(\textbf{h})\) is well approximated by \(T_{1}(\textbf{h})\) locally, so we can study the properties of the minimizer of \(T_{1}(\textbf{h})\) and infer those of the minimizer of \(T(\textbf{h})\); see Davis and Dunsmuir (1997) for a detailed discussion. In the following proposition, we show that \(T_{1}(\textbf{h})\) achieves its minimum in a ball around \(\textbf{0}\).
Proposition 3
Under the same conditions as in Proposition 2, given any \(0<\varepsilon <1\), there exists some \(C_{\varepsilon }>0\), such that
Propositions 2 and 3 together enable us to establish the consistency of \({\hat{\textbf{h}}}\), and hence of \({\hat{{\varvec{\xi }}}}\). We now present the proofs of Theorems 1 and 2.
Proof of Theorem 1:
By Proposition 3, given any \(0<\varepsilon <1\), there exists some \(C_{\varepsilon }\), such that
Under the event \(\{\inf _{\textbf{h}\in {\bar{\varOmega }}(C_{\varepsilon })\bigcup \varOmega ^{c}(C_{\varepsilon })} T_{1}(\textbf{h})>1\}\), we claim that there exists a local minimizer \({\widehat{\textbf{h}}}\) of \(T(\textbf{h})\) that satisfies \({\widehat{\textbf{h}}}\in \varOmega (C_{\varepsilon })\) but \({\widehat{\textbf{h}}}\notin {\bar{\varOmega }}(C_{\varepsilon })\). Suppose the claim is false. Then we can find an \(\textbf{h}_{a} \in {\bar{\varOmega }}(C_{\varepsilon })\) such that \(T(\textbf{h}_{a})= \min _{\textbf{h}\in \varOmega (C_{\varepsilon })}T(\textbf{h}).\)
By Proposition 2, for any \(C>0\), \(\sup _{\textbf{h}\in \varOmega (C)}\left| T_{1}(\textbf{h})-T(\textbf{h})\right| \rightarrow _{p} 0.\) Choosing \(C=C_{\varepsilon }\), we obtain \(0\ge T(\textbf{h}_{a})-T(\textbf{0})\rightarrow _{p}T_{1}(\textbf{h}_{a})-T_{1}(\textbf{0})= T_{1}(\textbf{h}_{a})>1\), a contradiction. Therefore, for any \(0<\varepsilon <1\), there exists \(C_{\varepsilon }\), such that \({\widehat{\textbf{h}}}\in \varOmega (C_{\varepsilon })\) with probability at least \(1-\varepsilon.\)
Given any \(\textbf{h}\in \varOmega (C_{\varepsilon })\), \(E\left[ \textbf{h}_{1}^{{\top }}\textbf{W}_{t}\textbf{W}_{t}^{{\top }}\textbf{h}_{1} \right] \le \lambda _{\max }J^{-1}\left( C_{\varepsilon }^2J^{2}n^{-1}\right) =\lambda _{\max }C_{\varepsilon }^{2}Jn^{-1}\). Noting that \({\hat{{\varvec{\xi }}}}={\varvec{\xi }}_{*}+{\widehat{\textbf{h}}}\), with probability at least \(1-\varepsilon\),
Thus, \(E\left[ \big ({\hat{g}}(X_{t})-g_{0}(X_{t})\big )^{2}\right] =O_{p}(Jn^{-1}+J^{-2\alpha })=O_{p}\left( n^{-2\alpha /(2\alpha +1)}\right)\). This completes the proof of Theorem 1. \(\square\)
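The bias-variance trade-off behind this rate can be illustrated numerically. Treating \(Jn^{-1}+J^{-2\alpha }\) as a risk proxy with all constants set to one (purely for illustration; the values of \(n\) and \(\alpha\) below are arbitrary), the integer minimizer scales like \(n^{1/(2\alpha +1)}\) and the minimum like \(n^{-2\alpha /(2\alpha +1)}\):

```python
import numpy as np

n, alpha = 10**6, 2.0
J = np.arange(1, 200)
risk = J / n + J ** (-2 * alpha)   # variance proxy + squared-bias proxy
J_star = J[np.argmin(risk)]        # minimizing spline dimension

rate_J = n ** (1 / (2 * alpha + 1))              # theoretical J scaling
rate_risk = n ** (-2 * alpha / (2 * alpha + 1))  # theoretical risk scaling
```

With these values, `J_star` lies within a constant factor of `rate_J`, and `risk.min()` within a constant factor of `rate_risk`, matching the balance of the two terms at \(J=n^{1/(2\alpha +1)}\).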
Proof of Theorem 2:
In the proof of Theorem 1, we have shown that for any \(0<\varepsilon <1\), there exists \(C_{\varepsilon }\), such that \({\widehat{\textbf{h}}}\in \varOmega (C_{\varepsilon })\) with probability at least \(1-\varepsilon\). Thus, we restrict our attention to the event that \({\widehat{\textbf{h}}}\in \varOmega (C_{\varepsilon })\).
Consider \(S(\textbf{b}_{2},\textbf{b}_{3}):= T_{1}(({\widehat{\textbf{h}}}_{1}^{{\top }}, \textbf{b}_{2}^{{\top }}/\sqrt{n},\textbf{b}_{3}^{{\top }}/\sqrt{n})^{{\top }})- T_{1}(({\widehat{\textbf{h}}}_{1}^{{\top }}, \textbf{0}^{{\top }}, \textbf{0}^{{\top }})^{{\top }})\). It is easily seen that
By Lemma 3, we obtain that
over \(\Vert (\textbf{b}_{2}^{{\top }},\textbf{b}_{3}^{{\top }})\Vert \le C\) for any \(C>0\).
According to Condition (C3), \(\{\zeta _{t}\}\) and \(\{X_{t}\}\) are independent. Hence, \(\{R_{t}\}\) and \(\{(\textbf{Q}_{t2}, \textbf{Q}_{t3})\}\) are independent. As \(|R_{t}|\le \varDelta \le C_{0}J^{-\alpha }\) and hence \(|{\varvec{\phi }}_{*}(B){\varvec{\theta }}_{*}^{-1}(B)R_{t}|\le C_{0}C_{2}J^{-\alpha }\rightarrow 0\), by the same arguments as used for Lemma 3, we can show that
The independence between \(\{\zeta _{t}\}\) and \(\{X_{t}\}\) again implies that \(\textbf{Q}_{t1}\) is independent of \((\textbf{Q}_{t2},\textbf{Q}_{t3})\). Thus, \(E\left[ \textbf{h}_{1}^{{\top }}\textbf{Q}_{t1}\left( \textbf{b}_{2}^{{\top }}\textbf{Q}_{t2}+ \textbf{b}_{3}^{{\top }}\textbf{Q}_{t3} \right) \right] =0\), as \(E\left[ \textbf{Q}_{t2}\right] =E\left[ \textbf{Q}_{t3}\right] =0\). Noting that \(\Vert {\widehat{\textbf{h}}}_{1}\Vert \le C_{\varepsilon }Jn^{-1/2}\), it follows from Lemma 2 that
Combining (4), (5), and (6) yields that
over \(\Vert (\textbf{b}_{2}^{{\top }},\textbf{b}_{3}^{{\top }})\Vert \le C\) for any \(C>0\).
By Lemmas 4–6, we have, uniformly over \(\Vert (\textbf{b}_{2}^{{\top }},\textbf{b}_{3}^{{\top }})\Vert \le C\) for any \(C>0\),
Noting that \(N(0, \sigma ^2{\varvec{\varSigma }}^{-1})\) is the minimizer of the random process to which \(S(\textbf{b}_{2},\textbf{b}_{3})\) converges, by Lemma 2.2 and Remark 1 in Davis et al. (1992), there exists \(\left( {\widehat{\textbf{b}}}_{2}^{{\top }},{\widehat{\textbf{b}}}_{3}^{{\top }}\right)\), a local minimizer of \(T\left( \left( {\widehat{\textbf{h}}}_{1}^{{\top }}, \textbf{b}_{2}^{{\top }}/\sqrt{n}, \textbf{b}_{3}^{{\top }}/\sqrt{n}\right) ^{{\top }}\right) -T\left( \left( {\widehat{\textbf{h}}}_{1}^{{\top }}, \textbf{0}^{{\top }}, \textbf{0}^{{\top }}\right) ^{{\top }}\right)\), such that \(\left( {\widehat{\textbf{b}}}_{2}^{{\top }},{\widehat{\textbf{b}}}_{3}^{{\top }}\right) ^{{\top }}\rightarrow _{d} N(0, \sigma ^2{\varvec{\varSigma }}^{-1})\).
Since \({\widehat{\textbf{h}}}\) is the minimizer of \(T(\textbf{h})\), \(\left( {\widehat{\textbf{h}}}_{2}^{{\top }},{\widehat{\textbf{h}}}_{3}^{{\top }}\right)\) must also be the minimizer of
We thus have \(\sqrt{n}\left( {\widehat{\textbf{h}}}_{2}^{{\top }},{\widehat{\textbf{h}}}_{3}^{{\top }}\right) =\left( {\widehat{\textbf{b}}}_{2}^{{\top }},{\widehat{\textbf{b}}}_{3}^{{\top }}\right)\) and \(\sqrt{n}\left( {\widehat{\textbf{h}}}_{2}^{{\top }},{\widehat{\textbf{h}}}_{3}^{{\top }}\right) ^{{\top }}\rightarrow _{d} N\left( 0, \sigma ^2{\varvec{\varSigma }}^{-1}\right)\). This completes the proof of Theorem 2. \(\square\)
1.2 B. Preliminary proposition and lemmas
Next, we present the technical proposition and lemmas used in the proofs of our theorems and corollaries. Their proofs are relegated to the supplementary materials.
Proposition 4
If Condition (C4) is satisfied,
where \(\delta _{2}\) is chosen as in Proposition 1.
Lemma 1
Suppose Condition (C3) holds. Then
-
(i)
\(P\left( |\zeta _{t}|>v\right) \le 2\exp \left( \frac{-v^{2}}{2(C_{B}^{2}+C_{B}v)}\right)\) and
-
(ii)
\(E[\left| \sum _{i=0}^{\infty }a_{i}\zeta _{t-i}\right| ^{k} ]\le \left( \sum _{i=0}^{\infty }|a_{i}|\right) ^{k}k!C_{B}^{k}/2\), for any sequence \(\{a_{t}, t\ge 0\}\) and \(k\ge 1\).
Lemma 2
Suppose Conditions (C1)–(C4) hold. There exists some constant \(C_{4}>0\) that does not depend on n, such that if \(J=O(n^{1/(2\alpha +1)})\),
-
(i)
$$\begin{aligned} P\left( \sup _{\Vert \textbf{h}_{1}\Vert \le 1} \left| \mathbb {G}_{n} \left[ \left( \textbf{h}_{1}^{{\top }}\textbf{Q}_{t1}\right) ^{2}\zeta _{t}^{2}\right] \right| >7C_{2}\sqrt{C_{4}J\log n} \right) \le 2\exp (-6J\log n). \end{aligned}$$
-
(ii)
$$\begin{aligned} \sup _{\Vert \textbf{h}_{1}\Vert \le 1, \textbf{h}_{1}\ne 0} \left| \left( \sigma ^{2}E\left[ \left( \textbf{h}_{1}^{{\top }}\textbf{Q}_{t1}\right) ^{2}\right] \right) ^{-1} \mathbb {E}_{n}\left[ \left( \textbf{h}_{1}^{{\top }}\textbf{Q}_{t1}\right) ^{2}\zeta _{t}^{2}\right] \right| =1+o_{p}(1). \end{aligned}$$
-
(iii)
$$\begin{aligned}&P\left( \sup _{\Vert \textbf{h}_{1}\Vert \le 1, \Vert \textbf{h}_{2}\Vert \le 1} n^{-1/2}\left| \mathbb {G}_{n} \left[ \textbf{h}_{1}^{{\top }}\textbf{Q}_{t1}\textbf{Q}_{t2}^{{\top }}\textbf{h}_{2}\right] \right|>7p^{1/2}\sqrt{C_{4}Jn^{-1}\log n} \right) \\ \le&2p\exp (-6J\log n).\\&P\left( \sup _{\Vert \textbf{h}_{1}\Vert \le 1, \Vert \textbf{h}_{3}\Vert \le 1} n^{-1/2}\left| \mathbb {G}_{n} \left[ \textbf{h}_{1}^{{\top }}\textbf{Q}_{t1}\textbf{Q}_{t3}^{{\top }}\textbf{h}_{3}\right] \right| >7q^{1/2}\sqrt{C_{4}Jn^{-1}\log n} \right) \\ \le&2q\exp (-6J\log n). \end{aligned}$$
Lemma 3
Suppose Conditions (C1)–(C4) hold. Then,
-
(i)
\(\sup _{\Vert (\textbf{h}_{2}^{{\top }}, \textbf{h}_{3}^{{\top }}) \Vert \le 1} \left| \mathbb {E}_{n}\left[ (\textbf{h}_{2}^{{\top }}\textbf{Q}_{t2}+\textbf{h}_{3}^{{\top }}\textbf{Q}_{t3})^{2}\zeta _{t}^{2}\right] - \sigma ^{2}\left( \textbf{h}_{2}^{{\top }}, \textbf{h}_{3}^{{\top }}\right) {\varvec{\varSigma }}\left( \textbf{h}_{2}^{{\top }},\textbf{h}_{3}^{{\top }}\right) ^{{\top }} \right| \rightarrow _{a.s.}0\),
-
(ii)
\(\mathbb {G}_{n}\left[ (\textbf{h}_{2}^{{\top }}\textbf{Q}_{t2}+\textbf{h}_{3}^{{\top }}\textbf{Q}_{t3})\zeta _{t}\right] \rightarrow _{d} \left( \textbf{h}_{2}^{{\top }}, \textbf{h}_{3}^{{\top }}\right) N(0,\sigma ^2{\varvec{\varSigma }})\), given any \((\textbf{h}_{2}^{{\top }}, \textbf{h}_{3}^{{\top }})\) such that \(\Vert (\textbf{h}_{2}^{{\top }}, \textbf{h}_{3}^{{\top }})\Vert \le C\), for any \(C>0\).
-
(iii)
\(\mathbb {G}_{n}\left[ (\textbf{h}_{2}^{{\top }}\textbf{Q}_{t2}+\textbf{h}_{3}^{{\top }}\textbf{Q}_{t3})\zeta _{t}\right] \rightarrow _{d} \left( \textbf{h}_{2}^{{\top }}, \textbf{h}_{3}^{{\top }}\right) N(0,\sigma ^2{\varvec{\varSigma }})\) on \(\Vert (\textbf{h}_{2}^{{\top }}, \textbf{h}_{3}^{{\top }})\Vert \le C\), for any \(C>0\).
Lemmas 4–6 follow from the steps in Davis and Dunsmuir (1997) and Brockwell and Davis (1991).
According to Proposition 1, \(\left| \zeta _{t}\right| \le \eta _{t}\), \(\Vert \textbf{Q}_{t}\Vert _{\infty }\le \Vert \textbf{Q}_{t}-\textbf{D}_{t}({\varvec{\xi }}_{*})\Vert _{\infty }+\Vert \textbf{D}_{t}({\varvec{\xi }}_{*})\Vert _{\infty }\le r^{t}\eta _{0}+C_{2}\varDelta +\omega _{t}=: \chi _t\), and similarly \(\Vert \textbf{V}_{t}\Vert _{\max }\le \chi _t\). Thus,
Lemma 4
Suppose Conditions (C1) – (C4) hold. If \(J^{2}\log n=o(n^{1/2})\), then for any \(C>0\), \(\sup _{\textbf{h}\in \varOmega (C)}\left| T_{1}(\textbf{h})-T_{2}(\textbf{h})\right| \rightarrow _{p} 0\).
Lemma 5
Suppose Conditions (C1) – (C4) hold. If \(J^{-2\alpha +1/2}=o(n^{-1/2})\), then for any \(C>0\), \(\sup _{\textbf{h}\in \varOmega (C)}\left| T_{2}(\textbf{h})-T_{3}(\textbf{h})\right| \rightarrow _{p} 0\).
Lemma 6
Suppose Conditions (C1) – (C4) hold. If \(J^{2}\log n=o(n^{1/2})\), then for any \(C>0\), \(\sup _{\textbf{h}\in \varOmega (C)}\left| T_{3}(\textbf{h})-T(\textbf{h})\right| \rightarrow _{p} 0\).
Lemma 7
Under the same conditions as in Proposition 1, for any sequence \(\{a_{t}\}, t\ge 1\), there exists some constant \(C_{3}\) such that
where \(\delta _{2}\) and r are defined in Proposition 1.
Cite this article
Zheng, Q., Cui, Y. & Wu, R. On estimation of nonparametric regression models with autoregressive and moving average errors. Ann Inst Stat Math 76, 235–262 (2024). https://doi.org/10.1007/s10463-023-00882-6